Multi‐omics analysis identifies a CYP9K1 haplotype conferring pyrethroid resistance in the malaria vector Anopheles funestus in East Africa

Abstract Metabolic resistance to pyrethroids is a menace to the continued effectiveness of malaria vector controls. Its molecular basis is complex and varies geographically across Africa. Here, we used a multi‐omics approach, followed‐up with functional validation to show that a directionally selected haplotype of a cytochrome P450, CYP9K1 is a major driver of resistance in Anopheles funestus. A PoolSeq GWAS using mosquitoes alive and dead after permethrin exposure, from Malawi and Cameroon, detected candidate genomic regions, but lacked consistency across replicates. Targeted sequencing of candidate resistance genes detected several SNPs associated with known pyrethroid resistance QTLs. The most significant SNPs were in the cytochrome P450 CYP304B1 (Cameroon), CYP315A1 (Uganda) and the ABC transporter gene ABCG4 (Malawi). However, when comparing field resistant mosquitoes to laboratory susceptible, the pyrethroid resistance locus rp1 and SNPs around the ABC transporter ABCG4 were consistently significant, except for Uganda where SNPs in the P450 CYP9K1 was markedly significant. In vitro heterologous metabolism assays with recombinant CYP9K1 revealed that it metabolises type II pyrethroid (deltamethrin; 64% depletion) but not type I (permethrin; 0%), while moderately metabolising DDT (17%). CYP9K1 exhibited reduced genetic diversity in Uganda underlying an extensive selective sweep. Furthermore, a glycine to alanine (G454A) amino acid change in CYP9K1 was fixed in Ugandan mosquitoes but not in other An. funestus populations. This study sheds further light on the evolution of metabolic resistance in a major malaria vector by implicating more genes and variants that can be used to design field‐applicable markers to better track resistance Africa‐wide.


| INTRODUC TI ON
Malaria control relies heavily on insecticide-based interventions, notably long-lasting insecticidal nets (LLINs) incorporating pyrethroid insecticides, and indoor residual spraying (IRS). Together, these interventions are credited with a greater than 70% decrease in malaria burdens since their introduction (Bhatt et al., 2015). However, unless resistance to insecticides is managed, the recent gains in reducing malaria transmission could be lost (Hemingway, 2017).
Worryingly, several mosquito populations are developing multiple and cross-resistance to a broad range of insecticides, increasing the risks that such populations could be better equipped to rapidly develop resistance to novel classes of insecticides. Therefore, elucidating the genetic basis and evolution of resistance is crucial to design resistance management strategies and prevent malaria resurgence (Hemingway, 2017).
In the major malaria vector Anopheles funestus, metabolic resistance mechanisms are driving resistance to most insecticides, including pyrethroids (Amenya et al., 2008;Riveron et al., 2013;Weedall et al., 2019). The molecular basis of this resistance is diverse and complex across Africa, with different resistance mechanisms spreading, and potentially intermixing, from independent origins Djuicy et al., 2020;Riveron, Yunta, et al., 2014;Weedall et al., 2020). These mechanisms are driven by extensive genetic variation between regions, preventing the use of existing findings to inform control efforts across the continent. Progress was recently made in this area through the detection of a DNA-marker in the cis-regulatory region of the cytochrome P450s CYP6P9a and CYP6P9b allowing the design of PCR assays for detecting and tracking pyrethroid resistance in the field Weedall et al., 2019). This resistance marker, however, only explains resistance in southern Africa (Weedall et al., 2019(Weedall et al., , 2020. This is a major obstacle in designing effective resistance management strategies across the continent, to better control this major malaria vector. Transcriptomic analyses have successfully been used to detect key genes conferring resistance to insecticides in the principal malaria vectors Riveron et al., 2013;Weedall et al., 2019). Despite large-scale whole genome sequencing, it has proven difficult to conclusively associate variants with resistance. This indicates a need for a combination of sequencing methods followed by functional validation to detect metabolic resistance markers.
Genome-wide association of pooled individuals (GWAS-PoolSeq) has successfully detected candidate genomic regions of specific phenotypes, including variation in pigmentation in Drosophila (Bastide et al., 2013). In An. funestus, we recently discovered a duplication of the X chromosome cytochrome P450 CYP9K1 associated with increased gene expression using this method (Weedall et al., 2020). Deep sequencing of target-enriched data has successfully been implemented to elucidate mechanisms of insecticide resistance in the dengue mosquito vector, Aedes aegypti (Faucon et al., 2015). Therefore, a GWAS-PoolSeq approach in tandem with targeted enrichment of candidate genomics regions could offer further opportunities to elucidate the complexities of metabolic resistance in An. funestus, while also helping to detect causative resistance alleles.
Here, we used a multi-omics approach combining GWAS-PoolSeq and target enrichment with deep sequencing experiments to elucidate the molecular basis of pyrethroid resistance in the major malaria vector An. funestus. The genome-wide association study used pooled mosquitoes with binary "resistant" or "putatively susceptible" phenotypes from two locations representing Southern and Central Africa, respectively. The fine-scale targeted sequencing approach was used to enrich a portion of the genome of individual mosquitoes Southern, Central and East Africa. The set of genes targeted represent many candidate metabolic resistance loci and previously identified resistance-associated loci. In the target enrichment data, we identified an allele of the X-linked cytochrome P450 gene CYP9K1 probably driving pyrethroid resistance in East Africa. In vitro heterologous expression of CYP9K1 in E. coli revealed this P450 capable of efficiently metabolising the type II pyrethroids deltamethrin.

| Design of SureSelect baits
The sequence capture array was designed prior to the release of the An. funestus genome assembly, using a mix of de novo assembled An. funestus transcripts (Crawford et al., 2010;Gregory et al., 2011) selected from previous pyrethroid resistance microarray experiments Riveron et al., 2013). Among these were heat shock proteins (HSPs), Odorant Binding Proteins and immune response genes such as serine peptidases, Anopheles gambiae detoxification genes sequences (282 genes) and all target-site resistance genes sequences from An. funestus. We also included the entire genomic regions of the major quantitative trait locus (QTLs) associated with pyrethroid resistance which are the 120 kb BAC clone of the rp1 containing the major CYP6 P450 cluster on chromosome 2 (right arm), as well as the 113 kb BAC clone sequence for the rp2 chromosome 2 (left arm). A total of 1302 target sequences were included (with redundancy). Baits were designed using the SureSelect DNA Advanced Design Wizard in the eArray program of Agilent. The bait size was 120 bp for paired-end sequencing using the "centred" option with a bait tiling frequency (indicating the amount of bait overlap) of "x3".

| Collection, rearing and sequencing of mosquitoes
Two An. funestus laboratory colonies (the FANG and FUMOZ) and field mosquitoes from Cameroon, Malawi and Uganda were utilised in this study. The FANG colony is a fully insecticide susceptible colony derived from Angola (Hunt et al., 2005). The FUMOZ colony is a multi-insecticide resistant colony derived from southern Mozambique (Hunt et al., 2005). Field populations of mosquitoes representative of Central, East and southern Africa were sampled from Mibellon (6°46′N, 11°70′E), Cameroon in February 2015; in March 2014 from Tororo (0°45′N, 34°5′E), Uganda  in January 2014 from Chikwawa (16°1′S, 34°47′E), southern Malawi . Mosquitoes were kept until fully gravid and forced to lay eggs using the forced-egg laying method (Morgan et al., 2010). All F 0 females/parents that laid eggs were morphologically and molecularly identified as belonging to the An. funestus group according to a morphological key and cocktail PCR, respectively (Gillies & Coetzee, 1987;Koekemoer et al., 2002). Egg batches were transported to the Liverpool School of Tropical Medicine under a DEFRA licence (PATH/125/2012). Eggs were allowed to hatch in cups and mosquitoes reared to adulthood in the insectaries under conditions described previously (Morgan et al., 2010). Insecticide resistance bioassays on these samples have been previously described Riveron et al., 2015Riveron et al., , 2016. In summary, twoto-five-day old F 1 females were exposed to permethrin for differing lengths of time to define a set of putatively susceptible (dead after 60 min permethrin exposure for Malawi and Uganda populations, and 20 min for Cameroon) and resistant (alive after 180 min permethrin exposure; 60 min in Cameroon) mosquitoes. The variation of exposure time was associated with the level of resistance in the population.
For the PoolSeq experiment, there were sufficient individuals for two likely "susceptible" and three "resistant" replicates of 40 individuals each from Malawi and one "susceptible" and one "resistant" replicate also of 40 individuals from Cameroon. Genomic DNA was extracted per individual using the DNeasy Blood and Tissue kit (Qiagen) and individuals were species ID molecularly (Koekemoer et al., 2002) and pooled per replicate in equal amounts. Library preparation and whole-genome sequencing by Illumina HiSeq2500 (2 × 150 bp paired-end) was carried out by the Centre for Genomic Research (CGR), University of Liverpool, UK. The SureSelect experiment consisted of 10 putatively permethrin susceptible and 10 resistant mosquitoes from Malawi (Southern), Cameroon (Central) and Uganda (Eastern) Africa from the set used for the PoolSeq, above.
An additional 10 mosquitoes from the completely susceptible FANG strain were also included. The library construction and capture were performed by the CGR using the SureSelect target enrichment custom kit with the 41,082 probes. Libraries were pooled in equimolar amounts and paired-end sequenced (2 × 150 bp) with 20 samples per run of an Illumina MiSeq by CGR, using v4 chemistry.

| Analysis of PoolSeq data
The PoolSeq data was analysed in the R package poolfstat (Gautier et al., 2021) and popoolation2 (Kofler et al., 2011). For poolfstat, PoolSeq R1/R2 read pairs were aligned to the VectorBase version 52 An. funestus reference sequence using bwa (Li & Durbin, 2009). Output BAM alignment files were coordinate sorted and duplicates marked in picard (http://broad insti tute.github.io/picard). For F ST analyses, variant calling was carried out using varscan (2.4.4) (Koboldt et al., 2012), with a minimum variant frequency of 0.01 and p-value of .05 and filtered in bcftools (1.9) (Danecek et al., 2021) to retain only SNPs greater than 3 bp away from predicted indels and the resulting the VCF file input to poolfstat. For intra-Malawi and Cameroon "resistant versus susceptible" comparisons average F ST was calculated pairwise between replicates and summarised into nonoverlapping 1000 bp windows using windowscanr (https://github.com/tavar eshug o/Windo wScan R/). pl". The "Cameroon versus Malawi" comparison served as a positive control for this method as we expected to see a strong peak around the rp1 locus according to prior research (Weedall et al., 2019(Weedall et al., , 2020. Only sites with total coverage greater than 10-fold and lower than the 95th coverage percentile for each sample were considered. This test uses multiple independent pairwise comparisons to identify the signals common to all. Here, independent exposure assays were used to generate the dead and the alive mosquitoes, therefore any pair of samples used to generate a 2 × 2 contingency table is arbitrary. Using all six possible pairwise combinations of the two Dead and three Alive samples means that the 2 × 2 tables are not independent of one another and violates the assumptions of the test. This test was run, however, to compare the results with those of combined and pairwise F ST tests and with the intention of functionally validating inferences. We also ran a two-way independent CMH test of "Alive 1" versus "Dead 1" and "Alive 2" versus "Dead 2" replicates for comparison with the CMH test incorporating all replicates. There were six runs of the test made each with two different, independent pairwise combinations of dead and alive samples. Genome-wide F ST and -log 10 p-value plots were created in R using ggplot2 (Wickham, 2016) for poolfstat and Popoolation results, respectively.

| Analysis of SureSelect data
Initial processing and quality assessment of the sequenced data was performed as for the PoolSeq data and analysed using strandngs 3.4 (Strand Life Sciences). Alignment and mapping were performed using the "DNA alignment" option against the whole genome (version AfunF1) which was constructed into three chromosomes using synteny from An. gambiae (Weedall et al., 2019(Weedall et al., , 2020. To summarise Weedall et al. (2020), the 1392 AfunF1 Anopheles funestus genome assembly contigs were ordered relative to Anopheles gambiae (AgamP4) chromosomes using nucmer, from mummer v3.0. Nucmer alignment placed 46% (644 of 1392), totalling 217,255,185 bp or 96% (225,223,604 bp) of the AfunF1 assembly (96%). For reference, the AfunF3 chromosomal assembly is slightly shorter in length at 210,975,222 bp as of VectorBase release 56. The 2L to 3R transposition between An. gambiae and An. funestus was accounted for by renaming of the ordered contigs and unplaced scaffolds were placed at the end of the ordered contigs. Aligned and mapped reads were used to create a DNA variant analysis experiment.
Before variant detection, SNP preprocessing was performed to reduce false positive calls: (i) split read realignment of partially aligned split reads and full-length reads with gaps introduced by initial alignment, (ii) local realignment to reduce alignment artefacts around indels, and (iii) base quality score recalibration to reduce errors and systematic bias.
All variant types (SNPs, MNPs [multiple nucleotide polymorphisms] and indels) were detected by comparing against the FUMOZ AfunF1 genome using the MAQ independent model implemented in strandngs 3.4 and default parameters. A SNP multi sample report was generated for each sample. For each variant, its effect was predicted using the transcript annotation (version AfunF1.4). To identify SNPs significantly associated with permethrin resistance, two approaches were used. First, we used a differential allele frequency-based approach where a variant was significant in relation to permethrin resistance if the supporting read of the SNP is found in 35%-100% of the alive mosquitoes (R) after permethrin exposure but present at low frequency in dead mosquitoes (1%-35%) (C) (R-C comparison). Both sets of mosquitoes were also compared to the fully susceptible laboratory colony, FANG (S), with significant SNPs having frequency >35% but <35% in FANG (S) in R-S and C-S comparisons. This assumes that SNPs associated with resistance will be present at higher frequency in alive mosquitoes (>35%) but lower in dead (<35%). Additionally, a minimum threshold of five out of 10 individuals of the same category was required to include a SNP.
The second approach assessed the significant association between each variant and permethrin resistance by estimating the unpaired t-test unpaired of each variant between each comparison (R-C, R-S and C-S) and a Manhattan plot of -Log 10 of p-value created. A SNP frequency cutoff of three or more samples was applied for this approach.
Finally, the polymorphism pattern of the CYP9K1 gene was analysed across Africa using the sureselect data. CYP9K1 polymorphisms were retrieved from the SNP Multisample report file generated through Strand NGS 3.4 for each population. Bioedit (Hall, 1999) was used to input various polymorphisms in the VectorBase reference sequence using ambiguity codes to indicate heterozygote positions. Haplotype reconstruction and polymorphism analyses were made using dnaspv5.10 (Librado & Rozas, 2009). mega x (Kumar et al., 2018) was used to construct the maximum likelihood phylogenetic tree for CYP9K1.

| Heterologous expression of recombinant CYP9K1 and metabolic assays
2.4.1 | Amplification and cloning of full-length cDNA of An. funestus CYP9K1 RNA was extracted using the PicoPure RNA isolation Kit (Arcturus, Applied Biosystems, USA) from three pools each of 10 permethrinresistant females from Tororo in Uganda. The RNA was used to synthesize cDNA using SuperScript III (Invitrogen) with oligo-dT20 and RNAse H (New England Biolabs). Full length coding sequences of CYP9K1 were amplified separately from cDNA of 10 mosquitoes using the Phusion HotStart II Polymerase (Thermo Fisher) (primers sequences: Table S1). The PCR mixes comprised of 5× Phusion HF Buffer (containing 1.5 mM MgCl 2 ), 85.7 µM dNTP mixes, 0.34 µM each of forward and reverse primers, 0.015 U of Phusion HotStart II DNA polymerase (Fermentas) and 10.71 µl of dH 2 0, 1 µl cDNA to a total volume of 14 µl. Amplification was carried out using the following conditions: one cycle at 98°C for 1 min; 35 cycles each of 98°C for 20 s (denaturation), 60°C for 30 s (annealing), and extension at 72°C for 2 min; and one cycle at 72°C for 5 min (final elongation). PCR products were cleaned individually with QIAquick PCR Purification Kit (Qiagen) and cloned into pJET1.2/blunt cloning vector using the CloneJET PCR Cloning Kit (Fermentas). These were used to transform cloned E. coli DH5α, plasmids miniprepped with the QIAprep Spin Miniprep Kit (Qiagen) and sequenced on both strands using the pJET1.2F and R primers provided in the cloning kit.

| Cloning and heterologous expression of An. funestus CYP9K1 in E. coli
The pJET1.2 plasmid bearing the full-length coding sequence of CYP9K1 was used to prepare the P450 for expression by fusing it to a bacterial ompA+2 leader sequence allowing translocation to the membrane following previously established protocols Pritchard et al., 1997). This fusion was achieved in a PCR reaction using the primers given in Table S1. Details of these PCRs are provided in previous publications . The PCR product was cleaned, digested with NdeI and XbaI restriction enzymes and ligated into the expression vector pCWori+already linearized with the same restriction enzymes to produce the expression plasmid, pB13::ompA+2-CYP9K1. This plasmid was cotransformed together with An. gambiae cytochrome P450 reductase (in a pACYC-AgCPR) into E. coli JM109. Membrane expression and preparation was performed as for (Pritchard et al., 2006).
Recombinant CYP9K1 was expressed at 21°C and 150 rpm, 48 h after induction with 1 mM IPTG and 0.5 mM δ-ALA to the final concentrations. Membrane content of the P450 and P450 reductase activity were determined as previously established (Omura & Sato, 1964;Strobel & Dignam, 1978).

| In vitro metabolism assays with insecticides
Metabolism assays were conducted with permethrin (a type I pyrethroid insecticide), deltamethrin (a type II) and the organochlorine DDT. Assay protocols have been described previously (Ibrahim et al., 2018;. Then, 0.2 M Tris-HCl and NADPH-regeneration components were added to the bottom of chilled 1.5 ml tubes. Membranes containing recombinant CYP9K1 and AgCPR were added to the side of the tube to which cytochrome b 5 was already added in a ratio 1:4 to the concentration of the CYP9K1 membrane. These were preincubated for 5 min at 30°C, with shaking at 1200 rpm and then 20 µM of test insecticide was added into the final volume of 0.2 ml (~2.5% v/v methanol), and the reaction started by vortexing at 1200 rpm and 30°C for 90 min. Reactions were quenched with 0.1 ml ice-cold methanol and incubated for 5 min to precipitate protein. Tubes were centrifuged at 16,000 rpm and 4°C for 15 min, and 100 µl of supernatant and transferred into HPLC vials for analysis. All reactions were carried out in triplicate with experimental samples (+NADPH) and negative controls (−NADPH). Per sample volumes of 100 µl were loaded onto isocratic mobile phase (90:10 v/v methanol to water) with a flow rate of 1 ml/min, a wavelength of 226 nm and peaks separated with a 250 mm C18 column (Acclaim 120, Dionex) on an Agilent 1260 Infinity at 23°C. For DDT, a solubilizing agent sodium cholate (1 mM) was added as described in Mitchell et al. (2012) and absorption monitored at 232 nm. Enzyme activity was calculated as percentage depletion (difference in the amount of insecticide remaining in the +NADPH tubes compared with the -NADPH) and a t-test used to assess significance.

| Genome-wide association study with pooled mosquitoes to identify allelic variants putatively associated with permethrin resistance
Sequence data obtained for each F 1 pool were first quality controlled (trimming, pair-end) and aligned to the An. funestus F3 FUMOZ reference genome (Table S2)  Outliers were few and did not overlap with those for Malawi.
Finally, an intercountry comparison was made by poolfstat global F ST and Popoolation2 CMH test. In contrast to intracountry comparisons a well-defined peak of differentiation was observed across the rp1 locus for both analyses ( Figure S2). In addition, the X chromosome was of elevated background F ST versus autosomes with an average F ST of 0.165 versus 0.062 and 0.056 for chromosomes 2 and 3 respectively. Overall, because of the lack of strong candidate resistance variants detected with this PoolSeq GWAS approach, it was not pursued in other countries, but a fine-scale approach was employed instead. Detection of SNPs significantly associated with permethrin resistance was performed first using the differential SNP frequency analysis implemented in Strand NGS (Strand Life Sciences, Bangalore, India).

| Cameroon
Using the frequency-based filtering approach, 92 SNPs out of the 75,980 polymorphic sites were found to be significant between resistant and putatively susceptible field mosquitoes (R-C), 73 between resistant and FANG strain (R-S), and 64 between Cameroon susceptible and FANG (Figure 2a). Most of these SNPs were silent substitutions followed by intronic and nonsynonymous ones ( Table   F I (Table S6) and other immune response genes were present in this comparison.

F I G U R E 2
Variants significantly associated with permethrin resistance using SureSelect target enrichment sequencing of specific candidate resistance genomic regions. Using a frequency-based filtering approach implemented in StrandNGS, (a) sets of SNPs significantly associated with resistance were detected in various comparisons between field Permethrin Alive (R) and dead (C) and the susceptible laboratory strain FANG (S) in Cameroon. (b) Distribution of the significant SNPs between nonsynonymous (Nsyn), splice sites (SpS), synonymous (Syn), intron (Int), 5'untranslated region (5'UTR) and 3' untranslated regions (3'UTR) in Cameroon. (c) List of genes with variants significantly associated with permethrin resistance in Cameroon. (d), (e), (f) are equivalent of (a), (b) and (c) for Uganda, respectively, as are (g), (h) and (i) for Malawi

| Malawi
Despite the lower overall number of polymorphisms, 74 significant SNPs were detected between Malawian permethrin resistant and susceptible field mosquitoes (R-C) (Table S6), with 521 between Malawi resistant and the (R-S) and 519 between Malawian susceptible and FANG (Table S7; Figure 2g). Due to the similarity of Malawi data to the reference sequence in contrast to Cameroon and Uganda, significant SNPs have been detected assuming a higher frequency in the fieldsusceptible than the resistant. Intergenic SNPs common to all three comparisons belong to seven genes (Figure 2i) including three P450s from the rp1 (CYP6P4a) and rp2 (CYP6M1b and CYP6N2) QTLs. The gene with the most significant SNPs is the P450 CYP6M1b. The gene with most nonsynonymous substitutions in the R-C comparison alone was the P450 CYP6AK1 with three sites followed by the cuticle protein (AFUN009936) and cytochrome P450 CYP4H19 (AFUN001746) with two such sites each (Table S6). As in Cameroon and Uganda, SNPs in immune response genes were also found in all comparisons.
A second approach consisted of detecting significant SNPs by t-test in each country which provided the following results.

| Cameroon
For SNPs present in three or more mosquitoes, the most highly significant SNP was in the cytochrome P450 CYP304B1 on chromosome 2 (p = 1.7 × 10 −5 ). Analysis of the 29 SNPs with Bonferroni-corrected p < .001 revealed seven SNPs that were also detected with the frequency-based filtering approach above (Table S8; Figure 3a) including a SNP located in chorion peroxidase (AFUN00618) and the cytochrome P450 CYP6M1c (AFUN010919) on the rp2 QTL. Some of these 29 SNPs also belong to genes known to be significantly overexpressed in resistant mosquitoes such the P450 CYP315A1 and the glutathione Stransferase GSTe3 (Weedall et al., 2019). Three nonsynonymous SNPs were detected belonging to the P450 CYP304B1 (amino acid change: I504V), the chymotrypsin-like protease (AFUN015111) (D476G) and the decarboxylase, AFUN007527 (V169L). A comparison of the 10 resistant Cameroon mosquitoes to FANG detected a lowest p-value of 7.8 x 10 −48 in a cuticular protein gene (AFUN004689). Regions with most SNPs between Cameroon and FANG were found in the rp1 QTL, a Zinc finger protein (AFUN015873) and a cluster of ABC transporter genes around ABCG4 (Table S9; Figure 3b). This cluster of ABC transporter genes were also detected in the t test R-C comparison.
A comparison of the resistant mosquitoes from Uganda to FANG determined an intergenic substitution between the P450 gene CYP6P9a and a carboxylesterase gene (AFUN015793) in the rp1 QTL region as most significant p-value of 2.28 × 10 −50 (Table S11; Figure 3d).
Overall, most significant SNPs between Uganda and FANG were found around the pyrethroid resistant QTL rp1 and a cluster of ABC transporter genes around ABCG4 (Figure 3d). Interestingly, we saw a peak of significant SNPs around the P450 CYP9K1 on the X chromosome gene, a gene that has undergone a selective sweep and is highly expressed in Uganda (Weedall et al., 2020).

| Malawi
Among the 59 significant SNPs between resistant and susceptible (R-C) Malawi mosquitoes the top significant was a synonymous substitution in the ABC transporter gene (ABCG4/AFUN007162) on chromosome 3 (p-value of 3.0 × 10 −8 ) (Table S12). Nonsynonymous SNPs were detected in the detoxification gene xanthine dehydrogenase (AFUN002567) (Q799E) and immune response Toll-like receptor (AFUN002942) (V104M). A comparison of resistant Malawi mosquitoes to FANG with a lowest p-value of 1.7 × 10 −45 corresponding to a synonymous substitution in the P450 gene CYP6P2 in the rp1 QTL region, alongside a cluster of significant hits (Table S13; Figure 3f). Another cluster of significant hits is also detected around the ABCG4 gene which is also significant between the R-C comparisons (Figure 3f).

| Expression pattern of recombinant CYP9K1
A standard P450 carbon monoxide (CO) -difference spectrum was obtained when CYP9K1 was coexpressed with cytochrome p450 reductase (CPR) in E. coli, as expected from a good-quality functional enzyme with a predominant expression at 450 nm and low P420 content (Figure 4a) (Figure 4b,c). For DDT, a depletion of only 17% was observed, with no peak for either dicofol (kelthane) or DDE.  (Table S14). Similar patterns for Uganda were observed for other parameters including nucleotide diversity (π), this is well illustrated in the plot of haplotype diversity and nucleotide diversity (Figure 5a). Furthermore, Uganda samples exhibited low diversity when compared to FANG and FUMOZ. Both dead and alive mosquitoes exhibited this low diversity in Uganda (Figure 5a). A similar pattern of reduced polymorphism was seen when considering only the coding region (1614 bp) (Table S15) or the non-coding (introns plus UTRs; 1093 bp) ( Table S16). Analysis of the coding region detected a nonsynonymous polymorphism, substituting glycine for alanine at position 454, a mutation which is present in all individuals from Uganda. This G454A change was detected at lower frequencies in Malawi (14/40) and in Cameroon (9/40). An analysis using the Cytochrome P450 Engineering Database (CYPED) (Fischer et al., 2007) reveals that this G454A mutation is between the meander and cysteine

F I G U R E 3 Variants significantly associated with permethrin resistance using an unpaired t-test between the resistant mosquitoes (alive) and susceptible (Dead). (a) Significant variants between permethrin resistant and susceptible mosquitoes in Cameroon (unpaired t-test); whereas (b) is between Cameroon resistant and FANG susceptible strain. (c) Is for Uganda
Alive and Dead mosquitoes after permethrin exposure while (d) are the significant SNPs between the Uganda Alive and the susceptible laboratory strain (FANG) and (e and f) are for significant SNPs between Malawi Alive and Dead mosquitoes and versus FANG, respectively. SNPs located in the rp1 QTL resistance regions on chromosome 2 are consistently associated with pyrethroid resistance. Similarly, a cluster of ABC transporter genes including ABCG4. The black dotted line indicates multiple testing significance level (p = 5 × 10 −5 ) for R-C comparisons and (p = 5 × 10 −22 ) for comparisons with FANG susceptible strain. Chr stands for chromosome and the legend indicates genes underlying highly-significant p-values pocket, which should impact on activity/catalysis, as amino acids in this region stabilizes the heme structural core and supposed to be involved in interaction with P450 reductase.

| Phylogenetic tree
A maximum likelihood tree of CYP9K1 sequences supported the high genetic diversity of this gene across the continent with several haplotypes clustering, mostly by their geographical origin (Figure 5b).
While mosquitoes from other countries cluster randomly, the majority of those from Uganda belong to a major predominant haplotype (36 out of 40 sequences).

| CYP9K1 haplotype network
Analysis of the Templeton, Crandall and Sing (TCS) haplotype tree further highlighted the high polymorphism of CYP9K1 across Africa with many singleton haplotypes separated by many mutational steps (>30 steps) ( Figure S4a-b). The predominant haplotype "H1" was nearly fixed in Uganda (32/40) when considering the fulllength or also only the coding region (36/40). The fact that this

| DISCUSS ION
As malaria prevention still relies heavily on insecticide-based interventions, it is essential to improve our understanding of the mechanisms driving resistance in malaria vectors to prolong the effectiveness of these tools by implementing suitable resistance management strategies. The present study used a multi-omics approach, and one of these approaches detected that the cytochrome P450

| Genome-wide association study with the PoolSeq approach probably needs more replications
The replicated PoolSeq-based genome-wide association study did not detect significant variants associated with resistance. This is contrary to the usefulness of this method previously in detecting variants associated with natural pigmentation variation in Drosophila (Bastide et al., 2013). Among possible reasons for the lack of sensitivity of this is the poor phenotype segregation in our samples from Malawi. Resistance to insecticide was already relatively high in this population reducing the ability to differentiate between resistant and field-susceptible mosquitoes. Additionally, increasing the number of replicates could have increased power of detection unfortunately the high resistance level made it difficult to generate F I G U R E 5 Polymorphisms patterns of CYP9K1 in Africa. (a) Plot of genetic diversity parameters of CYP9K1 across Africa showing the signature of a strong directional selection of CYP9K1 in Uganda. Hd is for haplotype diversity while Pi is nucleotide diversity. (b) Phylogenetic tree for CYP9K1 full-length (2707 bp) between Fang, FUMOZ and resistant strains of Uganda, Cameroon, Malawi and FUMOZ using sureselect data sufficient susceptible individuals per location. This was the case for the Drosophila pigmentation experiment where more replicates of larger pools of flies were analysed (Bastide et al., 2013), not available to us here as stated above. Although no significant resistanceassociated variants were detected by PoolSeq GWAS, the elevated region of F ST on chromosome 3 segregating in Malawi aligns precisely with the known 3Rb inversion of An. funestus (Dia et al., 2013;Green & Hunt, 1980;Sharakhov et al., 2001), a region which contains six cuticular protein genes belonging to the RR-2 family. Unfortunately, it is not possible to estimate inversion frequency with confidence from poolseq data. RR-2 family genes were previously associated with the reduced penetration resistance mechanism . This could indicate that the reduce penetration resistance mechanism through cuticle thickening is playing a role in resistance to pyrethroid in Malawi. It is notable that no hit was detected on chromosome 2 spanning the resistance to pyrethroid 1 QTL (rp1), which was observed between Malawi and Cameroon. This is probably due to the fixation of selected alleles at these two P450 genes (Riveron et al., 2013), and highlights a drawback in our binary alive versus dead phenotypes as a proxy for resistant and susceptible genotypes. This is similar to the case of knockdown resistance allele L1014F which being fixed in many populations of An. gambiae does not correlate with phenotype when using field samples mainly due to the high selection in these samples (Antonio-Nkondjio et al., 2015;Kwiatkowska et al., 2013). The validity of the poolfstat and Popoolation 2 approaches was nevertheless confirmed by the southern (Malawi) versus Central Africa (Cameroon) analysis which detected differentiation at the rp1 locus. This locus contains the CYP6P9a and CYP6P9b cytochrome P450 genes, which confer pyrethroid-resistance and are under strong directional selection in southern African populations of An. funestus (Mugenzi et al., , 2020Weedall et al., 2019). Although statistically attractive, the replicated PoolSeq offers us little extra over intercountry comparisons of pooled-sequencing as demonstrated by detection of the rp1 locus here and prior studies (Weedall et al., 2019(Weedall et al., , 2020. Perhaps, a PoolSeq approach using a crossing of resistant strains to FANG could provide a more productive platform to detect genetic variants associated with related resistance as implemented in Aedes aegypti (Cattel et al., 2020).

| Deep targeted sequencing of genomic regions spanning detoxification genes detects genetic variants of interest
A fine-scale approach combining targeted enrichment and deep sequencing successfully detected variants associated with pyrethroid resistance. This was most evident when comparing resistant mosquitoes to the fully susceptible laboratory FANG strain than when alive and dead mosquitoes from the same location were compared. This low power of detection when comparing samples from the same locality is probably due to high level of resistance inducing a poor segregation between samples. If the high number of significant variants detected between resistant and susceptible strain could be due to a difference in genetic background, the fact that key genomic regions previously associated with resistance were clearly and consistently detected such as rp1, revealed the ability of this approach to detect resistance mutations. Indeed, the rp1 QTL region harbouring a cluster of P450s involved in resistance such as CYP6P9a/b, CYP6P4a/b, CYP6P5 was one of the major loci detected. This could explain why this region was significantly associated with resistance in all regions since at least one gene from this region is over-expressed in each region with CYP6P5 in Cameroon and Uganda, CYP6P9a/b in Malawi Weedall et al., 2019). Furthermore, a consistent resistance locus in all three countries when compared to FANG was associated with the ABC transporter gene ABCG4 (AFUN016161-RA) located in the vicinity of two other ABC genes (ABCC4 and ABCC6 as in An. gambiae). This highlights the potential important role played by ABC transporters in the resistance to insecticides in general as reported recently (Dermauw & Van Leeuwen, 2014;Pignatelli et al., 2018) and particularly in An. funestus. Results should be interpreted with caution however as comparing resistant mosquitoes of each region with FANG (R-S) may reflect resistance or underlying differences in population structure. Despite this potentially confounding factor, we consider the significant signal at CYP9K1 between Uganda resistant and FANG as a signature of resistance. This is because country-specific PoolSeq results and RNAseq that show a high overexpression of this gene only in Uganda (Weedall et al., 2019), and our results provide further support for the likely key role that this P450 gene plays in the pyrethroid resistance in this country (Weedall et al., 2020). CYP9K1 has also been implicated in pyrethroid resistant in other mosquito species such as An.
parensis  and An. coluzzii . This correlation between RNAseq and targeted sequencing for CYP9K1 shows that if the phenotypic segregation is wide enough then target enrichment and sequencing could be sufficiently robust to detect variants associated with resistance. Nevertheless, despite narrowing the genomic region associated with resistance to the gene level confirmation of the causative variant requires further fine-scale sequencing of candidate gene and regulatory regions in tandem with functional genomic dissection of promoter activity. Without taking such an approach whole genome studies do not yield the variants needed to design simple molecular diagnostic for resistance tracking of metabolic resistance. An approach we have taken for other metabolic resistance-conferring loci: GSTe2 (Riveron, Yunta, et al., 2014), CYP6P9a (Weedall et al., 2019) and CYP6P9b .

| An. funestus CYP9K1 is a metaboliser of type II pyrethroids
The heterologous expression of An. funestus CYP9K1 (AfCYP9K1) in E. coli followed by metabolism assays revealed that CYP9K1 metabolises the type II pyrethroid, deltamethrin. Recombinant CYP9K1 had a depletion rate similar to those observed for other cytochrome P450s genes in An. funestus including CYP6P9b (Riveron et al., 2013), CYP6P9a and CYP6M7 , CYP9J11 (CYP9J5) (Riveron et al., 2017) and CYP6AA1 (Ibrahim et al., 2018) or in other malaria vectors such as CYP6M2 in An. gambiae (Stevenson et al., 2011) or CYP6P3 (Muller et al., 2008). However, the observed An. funestus CYP9K1 depletion rate of deltamethrin was twice that for An. coluzzii CYP9K1 (64% vs. 32%), shown to be conferring pyrethroid resistance in the An. coluzzii population of Bioko Island  after scale-up of both LLINs and IRS . Notably, AfCYP9K1 did not metabolise the type I pyrethroid permethrin, with no substrate depletion observed after 90 min suggesting that AfCYP9K1 metabolism is specific to type II pyrethroid. This is similar to previous observations where some P450s could only metabolise one type of pyrethroids. Notably, the CYP6P4 of the malaria vector An. arabiensis sampled from Chad was shown not to metabolise type II pyrethroid, deltamethrin, which correlated with susceptibility to this insecticide in this mosquito population . However, we cannot rule out that AfCYP9K1 also contributes to type I resistance either through metabolism of secondary metabolites generated by other P450s such as CYP6P9a/b or CYP6P5 also shown to be over-expressed in Uganda (Riveron et al., 2017;Weedall et al., 2019). AfCYP9K1 could also act through other mechanisms such as sequestration. For example, the An. gambiae CYP6Z2 (AGAP008218) known to metabolize carbaryl (Chiu et al., 2008), insects juvenile hormone analogue insecticide pyriproxyfen (Yunta et al., 2016) (Barnes, Weedall, et al., 2017;Riveron et al., 2019;Weedall et al., 2020).

| A directionally selected CYP9K1 allele is driving resistance in Uganda
Furthermore, a single haplotype is predominant for CYP9K1 in Uganda in line with directional selection. Fixation of strongly directionally selected alleles was also observed in the CYP6P9a/b P450 genes in An. funestus from southern African populations (Riveron et al., 2013(Riveron et al., , 2019Weedall et al., 2019). This is also the case for CYP9K1 in An. coluzzii in Mali (Main et al., 2015) where an allele has been positively selected in populations post-2006. Similar selective sweeps on P450s have been also reported in Drosophila melanogaster, where a single CYP6G1 allele conferring DDT resistance containing a partial Accord transposable element in the 5' UTR has spread worldwide (Daborn et al., 2002;Schlenke & Begun, 2004). Previous analysis has also shown that the high selection of CYP9K1 occurs alongside a high level of overexpression related to duplication of the locus of this gene in Uganda (Weedall et al., 2020).
Further supporting selection of an allele with enhanced metabolic efficiency in breaking down pyrethroids. This is supported by the fixation of the amino acid substitution of glycine for alanine at position 454 (G454A). This position is located close to the substrate binding pocket, and we hypothesise that increase the affinity and metabolism of this enzyme for deltamethrin. A similar scenario was seen for An. funestus CYP6P9a/b for which both in vivo and in vitro studies revealed that key amino acid changes (N384S) were able to increase the catalytic efficiency of these enzymes .
Further evidence comes from humans for which amino acid changes in CYP2D6, CYP2C9, CYP2C19 and CYP2A6 have been shown to affect drug metabolism a low drug metabolism conferred by some alleles while others confer a fast metabolism rate (Ingelman-Sundberg et al., 2007). Similarly, other amino acid changes in the glutathione S-transferase GSTe2 enzyme in An. funestus (L119F) (Riveron, Yunta, et al., 2014) and in An. gambiae (I114T) (Mitchell et al., 2014) were also shown to drive pyrethroid/DDT resistance in these vectors.

ACK N OWLED G EM ENTS
Pooled-template whole genome sequencing and SureSelect Target enrichment libraries were made and sequenced by the Centre for Genomic Research, University of Liverpool.

CO N FLI C T O F I NTE R E S T
The authors declare that they have no competing interests.

O PE N R E S E A RCH BA D G E S
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the and PRJEB24506 (FANG SureSelect) (Hearn et al., 2021).