Comparative analysis of complete mitochondrial genomes of Panorpidae (Insecta: Mecoptera) and new perspectives on the phylogenetic position of Furcatopanorpa

Comparative analysis of complete mitochondrial genomes of Panorpidae (Insecta: Mecoptera) and new perspectives on the phylogenetic position of Furcatopanorpa


Introduction
The scorpionfly genus Furcatopanorpa Ma & Hua, 2011 is a monotypic taxon of Panorpidae (Insecta: Mecoptera), with Panorpa longihypovalva Hua & Cai, 2009 as its type species.The genus is distinguishable from other confamilial genera by a suite of unique characters, especially the absence of notal organ on male tergum 3, and atypical O-shaped mating pattern (Hua and Cai 2009;Ma and Hua 2011;Zhong et al. 2015).The seventh and eighth abdominal segments of males are shortened and not constricted basally; and the hypovalvae of male genitalia are extremely elongated and parameres are extraordinarily developed with complicated lobes.The peculiar feature of the male reproductive system lies in the position of the epididymis, which is separated from the base of the testis within a peritoneal sheath, not pressed against the lateral base of the testis as in other genera of Panorpidae (Zhang et al. 2016).The ejaculatory ducts comprise a median duct and an accessory sac (Lyu et al. 2022).The axis of female medigynium is forked distally.
The genus Furcatopanorpa has a peculiar mating pattern.The male maintains copulation by continuous provision of salivary secretion to the female (Zhong et al. 2015), instead of by seizing the female with grasping devices as in other Panorpidae (Thornhill 1981).During copulation, the well-developed multi-branched male salivary glands continually provide liquid secretion through a mouth-to-mouth mode to the female (Zhong et al. 2015).Cytogenetically, Furcatopanorpa is characterized by large heterochromatic blocks, a chromosome number of n = 21, with the sex determination mechanism as XX/ XO type (Miao et al. 2019).
Based on phylogenetic analyses from molecular and morphological data (Hu et al. 2015;Miao et al. 2019, Wang andHua 2021), the Panorpidae are grouped into two subfamilies.The subfamily Neopanorpinae consists of Neopanorpa van der Weele, 1909, Leptopanorpa Mc-Lachlan, 1875, and two newly erected genera Lulilan Willmann, 2022 andPhine Willmann, 2022 (Willmann 2022), while the subfamily Panorpinae comprises all the other genera.However, the phylogenetic position of Furcatopanorpa remains controversial.Furcatopanorpa was considered a sister taxon to all other genera of Panorpidae by Ma et al. (2012), but was regarded to form a sister taxon to some species of Panorpa (Hu et al. 2015;Miao et al. 2019;Wang and Hua, 2021).
The mitochondrial genome (or mitogenome) of insects is a double-stranded circular molecule, varying in length from 14 to 20 kb (Cameron 2014).The mitogenome is characterized by simple genetic structure, small size, maternal inheritance, high copy numbers, less recombination, and fast evolutionary rate (Boore 1999), thus being regarded as a valuable tool for population genetics, species delimitation, and phylogenetic analyses in numerous groups of insects (Dowton et al. 2002;Wang et al. 2013Wang et al. , 2019;;Choudhary et al. 2015;Song et al. 2016).Mitochondrial genomes may provide further evidence for the phylogenetic analysis of Panorpidae.
In this study, we sequenced 43 mitochondrial genomes of Panorpidae in order to decipher the phylogenetic position of Furcatopanorpa in Panorpidae.

Taxon sampling and DNA extraction
Adults were captured from various mountain regions in China from 2019 to 2021 (Table 1).All specimens were preserved in 100% ethanol at −20°C and identified to species through morphological characters (Wang and Hua 2018).Total genomic DNA was extracted individually from one-side legs using DNeasy DNA Extraction Kit (Qiagen) according to the manufacturer's protocol.
Voucher specimens are kept at the Entomological Museum, Northwest A&F University.

Sequence analyses
The whole mitochondrial genome sequences were generated using Illumina HiSeq™2500 with paired reads of 2 × 150 bp by the Biomarker Technologies Co., LTD (Beijing, China).The raw data was subjected to fastp quality control filtering to obtain Clean Data (Chen et al. 2018).
Assembly and annotation were conducted using MitoZ v2.3 (Meng et al. 2019) and then checked by manual proofreading according to its relative species from NCBI.All the 13 PCGs (protein coding genes) were determined by the ORF Finder employing codon table 5 and compared with the homologous sequence of the reference mitogenome.Two rRNA genes were predicted by comparing with the homologous sequence of other Panorpidae mitogenomes and the locations of adjacent genes.Twenty-two tRNA genes were identified using the MITOS Web Server (http://mitos.bioinf.uni-leipzig.de/index.py)employing codon table 5 (Bernt et al. 2013).The control region was determined by the locations of adjacent genes.Tandem repeat units of the control regions were identified by the Tandem Repeats Finder server (http://tandem.bu.edu/trf/trf.html)(Benson 1999).Mitogenomic circular maps were generated using Organellar Genome DRAW (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html)(Lohse et al. 2013).
Analyses of the sequenced mitogenomes were calculated using PhyloSuite 1.2.2 (Zhang et al. 2020), including the base composition, mitogenomic organization tables, and relative synonymous codon usage (RSCU) values.The sliding window analysis (a sliding window of 200 bp and step size of 25 bp), the nucleotide diversity (π) of 13 PCGs and two rRNAs among 48 mitogenomes of Panorpidae were conducted using DnaSP 6.0 (Rozas et al. 2003).We analyzed the genetic distances based on Kimura-2-parameter and the ratios between non-synonymous (Ka) and synonymous substitutions rates (Ks) of 13 PCGs among the 48 mitogenomes using MEGA X (Kumar et al. 2018) and DnaSP 6.0 (Rozas et al. 2003), respectively.AT-and GC-skews were used to measure the strand bias of the nucleotide composition of mitogenomes (Hassanin 2006).

Phylogenetic analyses
A total of 50 mitogenomes were used in the phylogenetic analyses, including 48 mitogenomes of Panorpidae as the ingroup and two mitogenomes of Panorpodidae as the outgroup (Table 1).The extractions of 13 PCGs, 22 tRNAs, and two rRNAs were conducted with Phylo-Suite 1.2.2 (Zhang et al. 2020).The nucleotide sequences were aligned in batches with MAFFT (Katoh and Standley 2013) integrated into PhyloSuite 1.2.2 and the ambiguous sites were removed using Gblocks (Talavera and Castresana 2007).The concatenations of genes were conducted using PhyloSuite 1.2.2.Phylogenetic trees were reconstructed for six genera of Panorpidae using Bayesian inference (BI) and maximum likelihood (ML) analyses.In order to reduce the impact of long-branch attraction and compositional heterogeneity, a dataset with third codon position removed was included, and the site-heterogeneous mixture CAT-GTR model was used in the phylogenetic analyses (Bergsten 2005;Song et al. 2016;Nie et al. 2018).Four datasets were generated: (1) PCG: 13 PCGs (11,178 bp); (2) PCG + R: 13 PCGs and 2 rRNAs (13,380 bp); (3) PCG + R + T: 13 PCGs, 2 rRNAs, and 22 tRNAs (14,906 bp); and (4) PCG12 + R: 13 PCGs excluding third codon position + 2 rRNAs (9,654 bp).The nucleotide substitution models and partitioning strategies for Bayesian inference were chosen by PartitionFinder 2 (Lanfear et al. 2017) (Table S2).The Bayesian inference was conducted using MrBayes 3.2.6 (Ronquist et al. 2012) and performed two Markov chain Monte Carlo (MCMC) runs of 200 million generations with sampling every 100 generations.The first 25% were discarded as burn-in, and the remaining trees were used to generate the majority consensus tree and to estimate the posterior probabilities (PP).The substitution models for ML analyses were chosen using ModelFinder (Kalyaanamoorthy et al. 2017) (Table S3).ML analyses were performed by IQ-TREE integrated into PhyloSuite 1.2.2 with Ultrafast bootstrap (Nguyen et al. 2015;Zhang et al. 2020).Bootstrap support (BS) values were calculated with 1000 replicates.Bayesian analyses with a site-heterogeneous model were performed using PhyloBayes-MPI 1.9 base on the CAT-GTR model (Lartillot et al. 2013).Two independent MCMC chains would continue to run until satisfactory convergence was reached (maxdiff < 0.1).The initial 25% trees of each run were discarded as burn-in, then the consensus tree was constructed from the remaining trees combined from two runs.

Divergence time estimation
Divergence time estimates were performed based on the dataset PCG + R in BEAST 1.10.4(Drummond et al. 2012).The substitution models of each locus for BEAST analyses were calculated in ModelFinder.The BEAST analysis was based on a Yule speciation process.Fossil evidence was used to calibrate the Bayesian estimates of divergence times (Parham et al. 2012).Based on the Ypresian fossil specimen, the fossil-calibrated node of panorpoids can be constrained to a normal distribution of 52.90 ± 0.83 Ma (Archibald et al. 2010(Archibald et al. , 2013)).Two MCMC runs were conducted with a chain length of 100 million generations, sampling every 1000 generations.The sampling of posterior distribution adequate was indicated by effective sample size (ESS) > 200 in Tracer 1.7 (Rambaut et al. 2018).The first 25% of resulting trees were ignored as burn-in, and the remaining trees were combined in LogCombiner 1.8.0 (Drummond et al. 2012) and summarized as maximum clade credibility (MCC) tree using TreeAnnotator 1.8.0 (Drummond et al. 2012).

Protein-coding genes and codon usage
Four PCGs (nad1, nad4, nad4L, and nad5) are encoded on the minority strand (N-strand), and the remaining nine PCGs on the majority strand (J-strand) in all the mitogenomes sequenced (Fig. 1).The mitogenomes have a variety of start codon usages.Besides the canonical start codons ATN (ATA, ATT, ATG, and ATC), TTG start codon is also used.The most frequently used start codon is ATG, which is utilized in seven PCGs across all species.
The non-canonical start codons TTG and TCG for cox1 and nad1 were found in part of the newly sequenced mitogenomes.These two kinds of unusual initiation codons also exist in C. obtusa.In addition to the complete stop codons TAA and TAG, partial stop codons (T or TA) are also a common feature in all panorpids studied.TAA occurs more frequently than TAG.TA is usually present as the stop codon for cox3, nad4, and nad5, and T− is usually used as the stop codon for cox2.
The amino acid compositions of PCGs and the relative synonymous codon usage (RSCU) are summarized in Figs S1 and S2.The RSCU in all Panorpidae mitogenomes is generally similar to each other.The three most frequently used amino acids -UUA (Leu2), AUU (Ile), and UUU (Phe) -are composed exclusively of U or U and A. The frequency of A and U in the third position was

Transfer and ribosomal RNA genes
The mitogenomes of Panorpidae have 22 tRNA genes, which are scattered discontinuously over the entire mitogenome with eight transcribed from the N-strand and 14 from the J-strand (Fig. 1 and Table S1).The total length of 22 tRNAs ranges from 1458 to 1484 bp.Two rRNA genes (rrnL and rrnS) are encoded on the N-strand in the mitogenomes of Panorpidae.The gene rrnS is located between trnV and the control region, and the gene rrnL is situated between trnL1 and trnV.The average A+T content of rrnL (79.9%) is slightly higher than that of rrnS (77.1%).

Control region
The control region is the largest non-coding region located between rrnS and trnI in the mitochondrial genomes.The size of the control region ranges from 1,434 bp in N. nielseni to 2,252 bp in F. longihypovalva HSD (Fig. 2 and Table S1).The control region has the highest A+T content (84.3%−87.6%)compared with other three regions (PCGs, tRNAs, and rRNAs).
The poly-adenine (A) and [TA(A)] n -like stretches were found in the control region of Mecoptera for the first time.The poly-A is randomly scattered in the control region.Most mitogenomes sequenced of the Panorpidae have tandem repeat units except for some species of Cerapanorpa, Dicerapanorpa and Panorpa (Fig. 2).The analyses of the control regions indicate that the length and copy number of tandem repeat units are dramatically divergent among panorpids (Fig. 2).Congeneric species may have similar tandem repeat units, e. g. S. digitiformis and S. nangongshana.

Comparative analyses of nucleotide diversity and evolutionary rate
A total of 48 mitogenomes were used in comparative analyses, including 43 newly sequenced mitogenomes together with five mitogenomes of Panorpidae downloaded from NCBI (Table 1).A sliding window analysis reveals a highly variable nucleotide diversity among the 13 PCGs and two rRNAs of the sequenced mitogenomes (Fig. S3).

Phylogenetic analyses
The ML and BI analyses from four datasets (PCG, PCG + R, PCG + R + T, and PCG12 + R) generated trees with similar topology.The results show that the species of Panorpidae form a monophyletic group.The topologies of these trees are consistent at the genus level, but incongruent for the interspecific relationship of some species in In clade A, Sinopanorpa, Dicerapanorpa, and Cerapanorpa are all monophyletic.Sinopanorpa forms a sister group with Cerapanorpa in all trees, although the support values were relatively low in some cases.The North American Panorpa debilis is usually present as the sister taxon of Sinopanorpa + (Cerapanorpa + other species of Panorpa), reconfirming the paraphyly of Panorpa.Dicerapanorpa is a sister taxon to Panorpa debilis + Panorpa spp + (Cerapanorpa + Sinopanorpa) (Fig. 3).
Furcatopanorpa forms a sister group relationship with Neopanorpa in all trees with strong support (BS =100, PP = 1) (Figs 3 and S5−S11).In turn, Neopanorpa + Furcatopanorpa form a sister group to clade A (all the other genera studied of Panorpidae).

Divergence time
The chronogram shows that the estimated divergence time between Panorpidae and Panorpodidae is approximately at 115.09 Ma (Fig. 4).The Panorpidae began to diverge approximately at 95. 35     including Panorpa, Cerapanorpa and Sinopanorpa have a common ancestor at ca. 52.89 Ma, consistent with the divergence time of the North American Panorpa debilis and other Eastern-Asian Panorpa species.The estimated divergence time between Dicerapanorpa and other genera in clade A is at 59.93 Ma.

Mitogenome architecture
The mitogenome sequences of Panorpidae are highly conserved in the gene content, gene order, gene length, and nucleotide composition.The pattern of nucleotide skewness in whole mitogenomes is coincident with that of other mecopterans and most other insects (Wei et al. 2010).The AT-skew of whole mitogenomes of Panorpidae is slightly positive or negative, while the GC-skew is usually negative.This result is consistent with that of a recent mitochondrial genomic study in Mecoptera (Li et al. 2019).The control region is responsible for regulating the transcription and replication of mtDNA in insects (Zhang and Hewitt 1997;Li and Liang 2018).It exhibits remarkable divergence of primary nucleotide sequences, with relatively high rates of nucleotide substitution and dramatic variation in fragment length among species or even individuals (Zhang and Hewitt 1997;Li and Liang 2018), thus being regarded as the most variable region of the mitochondrial genome.The size of the control region ranges from 1,434 to 2,252 bp in Panorpidae, but is only 898 bp in the nannochoristid Microchorista philpotti (Beckenbach 2011).The control region of Furcatopanorpa (~2,200 bp) is prominently longer than that of other confamilial genera (~1,500 to 1,600 bp).The control region structures of congeneric species are more comparable in Cerapanorpa, Dicerapanorpa, and Sinopanorpa.However, the structure of the control region varied considerably among individuals in Panorpa and Neopanorpa, indicating the existence of potential species groups within these two genera, consistent with a recent phylogenetic study (Wang and Hua 2021).Tandem repeat units are one of the most common structures in the control region (Li and Liang 2018).Different copy number and length of tandem repeat units are responsible for varying sizes of the control region in Panorpidae, leading to different mitogenome sizes, the socalled length heteroplasmy (Monforte et al. 1993;Zhang and Hewitt 1997;Li and Liang 2018).
Nucleotide diversity analyses are useful for identifying highly divergent nucleotide regions, which are crucial for designing species-specific markers (Jia et al. 2010;Ma et al. 2020), especially in the taxa of highly variable morphological characters.Although a fragment of 658 bp of the gene cox1 is frequently used as a universal barcode for species delimitation in animals (Cooper et al. 2007), this gene is the least variable in the Panorpidae and has a relatively lower ratio of Ka/Ks among the PCGs in these sequenced mitogenomes.Therefore, cox1 is difficult to afford the task of DNA barcoding in Panorpidae.Other genes with rapid evolutionary rates, alternatively, should be evaluated as potential barcode candidates (Lobry 1995;Demari-Silva et al. 2015;Zhang et al. 2018).

Phylogenetic status of Furcatopanorpa
Furcatopanorpa is unique in Panorpidae in that the male adult lacks a notal organ on the posterior margin of the third tergum, and assumes an unusual O-shaped mouth-to-mouth nuptial feeding position during copulation (Zhong et al. 2015).The wings are much longer than the abdomen.The median axis of the female medigynium is bifurcated distally (Ma and Hua 2011).The male genitalia bear a pair of elongate hypovalves, which extend well beyond the apex of gonocoxites (Hua and Cai 2009;Zhong et al. 2015).In the male internal reproductive system, the epididymis is far apart from the testis, not appressed against the lateral base of the testis as in other genera (Zhang et al. 2016).These characters make Furcatopanorpa easily distinguished from the other genera of Panorpidae.Furcatopanorpa was previously regarded as a sister group with all the other genera of Panorpidae based on a morphological phylogenetic analysis (Ma et al. 2012).A molecular phylogenetic analysis, however, indicates that Furcatopanorpa forms the sister group to Panorpa species from Northeastern Asia (Hu et al. 2015;Miao et al. 2019).Hu et al. (2015) suggested that Furcatopanorpa diverged from Panorpa, but here we confirm that Furcatopanorpa is the sister taxon to Neopanorpa based on phylogenetic analyses from mitogenomes.Although some features of the mitochondrial genome may generate misleading phylogenetic signals to cause problems such as long branch attraction, recent studies have found ways to avoid non-phylogenetic signal, such as using the site-heterogeneous mixture model, the inclusion of ribosomal RNA genes, and removal of fast-evolving sites (Song et al. 2016;Feuda et al. 2017;Liu et al. 2018).Based on the present analysis, the phylogenetic topologies of Panorpidae are generally very similar under standard models and the site-heterogeneous mixture model, indicating that the phylogenetic trees are considerably robust at the genus level.
Furcatopanorpa had unique cytogenetic features by large heterochromatic blocks occupying most of the chromosome length, suggesting that multiplied chromosome rearrangements might lead to considerable divergence between Furcatopanorpa and other genera of Panorpidae (Miao et al. 2019).Furcatopanorpa and most species of Neopanorpa have a similar number of chromosomes (n = 21) (Miao et al. 2019), also implying that Furcatopanorpa and Neopanorpa have a closer evolutionary relationship.In contrast, several species of Panorpa, such as P. japonica, P. kunmingensis, P. liui, and P. macrostyla, have different numbers of chromosomes (n = 23 or 24) (Miao et al. 2019), indicating a comparatively remote relationship.
Neopanorpa is regarded paraphyletic with Leptopanorpa based on a molecular (Miao et al. 2019) and a morphological phylogenetic analysis in Panorpidae (Wang and Hua 2020).Based on the present study, Neopanorpa forms a sister taxon to Furcatopanorpa.Nevertheless, since the genus Leptopanorpa is unfortunately not available in this study, and the two newly erected genera Lulilan and Phine are also not included in the analysis, the precise phylogenetic position of Neopanorpa awaits further research.
Panorpa Linnaeus, 1758 was considered paraphyletic with Neopanorpa according to a phylogenetic analysis from mitochondrial gene fragments (Misof et al. 2000).The paraphyly of Panorpa was confirmed with Sinopanorpa, Dicerapanorpa, and Cerapanorpa based on recent morphological and molecular phylogenetic studies (Ma et al. 2012;Hu et al. 2015;Miao et al. 2019).Our present phylogenetic analysis from mitogenomes further confirms that Panorpa is a paraphyletic group, which definitely needs a comprehensive taxonomic revision.The monophylies of Cerapanorpa, Sinopanorpa, and Dicerapanorpa are all confirmed, consistent with previous studies (Miao et al. 2019;Wang and Hua 2021).
Admittedly, mitogenomes are not available yet for the Indonesian genus Leptopanorpa MacLachlan, 1875 and recently erected genera Megapanorpa Wang & Hua, 2019, Lulilan Willmann, 2022and Phine Willmann, 2022, and even some species groups of Panorpa, such as the P. guttata group, the Japanese P. pryeri group, and western Indian species.This study can only provide some new insights into the putative phylogenetic position of Furcatopanorpa.

Divergence time
Based on the present study, Furcatopanorpa is likely one of the earliest genera diversified in Panorpidae.The divergence time to the most recent common ancestor of Neopanorpa was approximately at 49.07 Ma, slightly earlier than the results of a previous study (ca.42.1 Ma) (Miao et al. 2019).
The Panorpidae was supposed to originated from East Asia (Byers 1988;Miao et al. 2019), and migrated to North America via the Bering land bridge from early Paleocene to Pliocene (Sanmartín et al. 2001;Tiffney and Manchester 2001).Nevertheless, practically all the samples in this study were collected from China, only P. debilis was from North America.The migration route of Panorpidae can be better explained provided more specimens from Europe and North America are included in future studies.

Conclusions
In this paper, we used mitochondrial genomes to analyze the sequence architecture and to reconstruct the phylogeny of Panorpidae for the first time.Furcatopanorpa is the sister taxon to Neopanorpa, and is unsuitable to be assigned into the subfamily Panorpinae.We putatively conclude that Furcatopanorpa may deserve a subfamily status from the mitogenomic study. 6.

Figure 1 .
Figure 1.Circular maps of mitogenomes from representative species of Panorpidae.The J-strand is visualized on the outer circle and the N-strand on the inner circle.
much higher than C and G, reflecting AT nucleotide bias in the mitochondrial PCGs among the Panorpidae.
Cerapanorpa and Neopanorpa (Figs 3, S5−S7).Phylogenetic analyses based on site-heterogeneous models show essentially the same results, with only the position of Panorpa debilis slightly different (Figs S8−S11).Most phylogenetic analyses indicate that the Panorpidae can be categorized into three main clades.Clade A comprises Panorpa, Sinopanorpa, Dicerapanorpa, and Cerapanorpa, while clade B consists of Furcatopanorpa only, and clade C is composed of Neopanorpa (Fig. 3).

Figure 2 .
Figure 2. Organization of the control region in Panorpidae mitogenomes.The size of geometric drawings is proportional to the sequence length.The colored ovals indicate the tandem repeats; the remaining regions are shown with blue boxes.

Figure 3 .
Figure 3. BI and ML trees based on the dataset of PCG + R. Numerals at nodes are the Bayesian posterior probabilities and ML bootstrap values, respectively.

Figure 4 .
Figure 4. Chronogram of divergence time estimated from the BEAST analysis.Node numbers indicate the mean estimated divergence ages.Blue bars at nodes represent 95% highest posterior density date ranges.

Table 1 .
Information of the species and mitogenomes used in this study.
Note:The capital letter markers indicate the collection locations; the numeric marks indicate different samples from the same location.
Ma. Neopanorpa and Furcatopanorpa separated at ca. 82.07 Ma, while species of Neopanorpa shared the most recent common ancestor (TMRCA) at 49.07 Ma.Cerapanorpa and Sinopanorpa split from Panorpa at ca. 48.58 Ma, and diverged from each other approximately at 46.41 Ma.The whole clade