The pine processionary moth Thaumetopoea pityocampa (Notodontidae) species complex: a phylogeny-based revision

The pine processionary moth Thaumetopoea


Introduction
The pine processionary moth, Thaumetopoea pityocampa (Denis and Schiffermüller, 1775), is an iconic insect in the Mediterranean culture.Known as pityocampe (πιτυοκάμπη) in ancient Greek and as pityocampa in Latin, which means the pine (pityo) caterpillar (campa), has fascinated naturalists for its gregarious behaviour (e.g., Fabre 1899).This is probably one of the reasons why the lepidopterologist Jacob Hübner created the genus Thaumetopoea (Hübner, 1820) (literally, to make wonderful things) to allocate pityocampa and other related species (Hübner 1820).Besides the unique behaviour, the species attracted medical doctors since the times of Dioscorides (40-90 AD) because of the urtication caused by the larvae on human skin and because larval extracts have been used to poison people since the ancient times (Theophrastus of Eresos, 371-287 BC) (Roques and Battisti 2015).
A total evidence phylogeny of the genus Thaumetopoea suggested that the species associated with conifer hosts stemmed from an ancestor associated with broadleaved trees in the eastern Mediterranean area (Basso et al. 2017a).Thaumetopoea pityocampa is the best studied species in the clade of conifer-feeding Thaumetopoea.
According to Salvato et al. (2002), two species were identified based on molecular markers, i.e., T. pityocampa in the west and T. wilkinsoni Tams, 1925 in the east Mediterranean.The latter was originally described from Cyprus Island, and the divergence of the two species was estimated to be at least 5 million years ago (Simonato et al. 2007).Kerdelhué et al. (2009) explored the role of the Mediterranean Sea desiccation in the Miocene-Pliocene transition and of the Quaternary periodic glaciations on the genetic structure of the species, confirming the role of the Messinian Salinity Crisis in shaping the origin of the species and identifying the role of glacial refugia.In addition, the same authors identified a new clade called ENA (populations of Eastern-North Africa) being genetically different from T. pityocampa and T. wilkinsoni based on mitochondrial data.The occurrence of the ENA clade was further corroborated by El Mokhefi et al. (2016), based on a clear separation from the pityocampa populations of NW Africa found in mitochondrial but not in nuclear markers.Recently, the potential hybridization between T. pityocampa and T. wilkinsoni was documented both through laboratory crossing (Petrucco-Toffolo et al. 2018) and in the field, as a natural hybrid zone was identified in western Turkey (İpekdal et al. 2020).Interestingly, the viable F1 and F2 hybrids pityocampa x wilkinsoni obtained under controlled conditions were intermediate in all the morphological and genetic traits considered (Petrucco-Toffolo et al. 2018).
A large variation in the morphological traits of T. pityocampa was already pointed out in the first revision of the genus done by Agenjo (1941), where several synonymous varieties were identified based mainly on wing colour (Table 1).The same author casted doubts about the possibility to morphologically distinguish pityocampa from wilkinsoni, although the second was retained because of its distribution and life history traits described by Tams (1925) and Wilkinson (1926).In the revision by Kiriakoff (1970), the two species were considered valid.Moreover, two of the pityocampa varieties considered by Agenjo (1941) were retained, namely T. p. orana (Staudinger, 1901) and T. p. ceballosi (Agenjo, 1941), likely because of their restricted geographic distribution, in North Africa (Algeria and Morocco) and in Turkey (Anatolia), respectively.Later, de Freina and Witt (1987) and Schintlmeister (2013) considered pityocampa and wilkinsoni as separate species and synonymized most varieties described by Agenjo as well as T. galaica Palanca Soler, Castan Lanaspa and Calle Pascual, 1982, a further described species in NW Spain (Palanca Soler et al. 1982), with pityocampa.Only T. pityocampa orana was retained at subspecies level.In addition, de Freina and Witt (1987) proposed to assign the species to the genus Traumatocampa Wallengren, 1871.However, the phylogenies by Simonato et al. (2013) and Basso et al. (2017a) suggested that all the species in the group should be included in the genus Thaumetopoea.In the same work, the individuals included in the ENA clade were potentially identified as different species based on molecular data, confirming the previous results of Kerdelhué et al. (2009).A species named T. mediterranea Trematerra and Scalercio, 2017 was later described by Trematerra et al. (2017) based on the material from Pantelleria Island, which falls within the range of the ENA clade.In the same paper, another species, T. hellenica Trematerra and Scalercio, 2017, was described from continental Greece.Both species were described using morphological and molecular traits.
The aim of this paper is to reconsider all available morphological and molecular traits of the species in the T. pityocampa complex, including populations for which both types of data are available, even if they are not from the same individuals.The original material on which T. pityocampa was described is not available and a thorough analysis of the old literature has allowed us to clearly identify the type locality of the species and to propose a neotype for the nominal taxon.The expectation is to delimit the species in the group and to provide traits for their identification.A clear definition of taxa within T. pityocampa species complex is important because of the consequences it could have on the study of evolution of the genus and on the management of these important pests that threaten trees, human and animal health.

Taxa description
The taxa included in the T. pityocampa complex are presented in Table 1.Four are considered as valid species (in chronological order of their description, T. pityocampa, T. wilkinsoni, T. hellenica, and T. mediterranea), one as a subspecies (T.pityocampa orana), and 15 are synonyms of T. pityocampa.Types of all species, subspecies and synonyms were examined, either from museum collections (listed in Table S1a) or from publications, spanning over 120 locations in Europe and the Mediterranean Basin (Fig. 1).Whenever possible, populations close to the type locality of taxa were used for both morphological and molecular analyses.
The type material was available for all taxa except the nominal one, which was probably lost in the fire of the Vienna Hofburg on 31 October 1848 during the Vienna Uprising or October Revolution (personal communication of Sabine Gaal-Haszler, curator of the Naturhistorisches Museum, Vienna, Austria).In this regard, Denis and Schiffermüller (1776) merely reported the species from Dioscorides (40-90 AD) as pityocampa and from Réaumur (1734) as la chenille du pin (translation of pityocampa) in France, considering the latter as reliable based on the careful description of the larval, pupal, and adult stages on material received from the Bordeaux region (see Fig. S1).Denis and Schiffermüller (1776) mentioned that they did not see any adults and were hoping to get them from the rearing of larvae provided by the naturalist Dr. Popowitsch, who received them from the baron (Freiherr) von Sperges from Tyrol, without precise indication of the locality.At that time there was a lawyer and historian named Joseph Freiherr von Sperges (1725-1791), who was active in South Tyrol (http://worldcat.org/identities/lccn-n80040576/) so the larvae were most likely collected in that area.About two hundred years earlier, Matthioli (1562) described the bruchi de pini (translation in Italian of pityocampa) as abundant in today's Trento district (Fiemme and Non valleys), formerly part of South Tyrol.Denis and Schiffermüller (1776) formally described the larvae as Bombyx pityocampa and added the German name Fichtenspinner (translated as the spruce caterpillar, with spruce being Picea abies), based on the observation that the captive larvae were feeding on both Norway spruce and Scots pine (Pinus sylvestris).However, while it is known that under captive conditions the mature larvae may feed on spruce (Hellrigl 1995), this is not observed in the field (Battisti et al. 2015).In the world revision of Notodontidae types, Schintlmeister (2013) attributed the taxon to Denis and Schiffermüller (1776) and indicated both France and Tyrol (Austria) as terra typica of T. pityocampa, reporting that reference material was not available for any of these locations.After a thorough discussion with Alexander Schintlmeister in April 2022, it was concluded that the best solution was to define a neotype from the area where von Sperges collected the larvae, identified as today's South Tyrol in Italy.This is supported by the evidence that the moth was well documented in this area by Kitschelt (1925) and Hellrigl (1995).

Outgroup definition
To investigate the evolutionary relationships among taxa, it is crucial to select the appropriate outgroup set unambiguously outside the studied clade (Grimaldi and Engel 2005).Therefore, when considering different datasets belonging to the same specimens, it is essential to verify the phylogenetic signal associated with plesiomorphic and apomorphic characters (Sereno 2007).When molecular data are analysed, the richness of the dataset is often adequate to discern separate clades, even when they are closely related.Conversely, the use of morphological traits is challenging because they result from the expression of mutations accumulated over evolutionary  (Sereno 2007).Based on previous results (Simonato et al. 2013, Basso et al. 2017a), ten species were selected as outgroups to provide a representative sample of taxonomic diversity within the genus Thaumetopoea.Basso et al. (2017a) characterized this genus with 165 morphological traits, although most of them were fixed within the T. pityocampa complex.Therefore, to avoid computational noise from excessive plesiomorphic traits widespread in the ingroup spec-imens, the morphological matrix was reduced to the variable traits only, and some of the traits were re-coded (see File S1) in a straightforward perspective (Basso et al. 2017a).On the contrary, the reduction of the dataset, considering only the hyper-variable traits, may directly affect the conveyed phylogenetic signal and mislead the results.Preliminary analyses of the moth morphological traits revealed that the inclusion of the broadleaved-feeding species produced unstable tree topologies due to the divergent apomorphic traits of these moths (Basso et al. 2017a).Thus, these species were removed and the trees rooted on conifer-feeding Thaumetopoea, such as T. pinivora (Treitschke, 1834)

Morphological data
Finally, for male and female specimens, 44 and 31 traits, respectively, showed polymorphism and were used to explore morphological differentiation (Table S1).The moth samples used in the study were collected with different methods (e.g., rearing from larvae, light/pheromone trapping) following all applicable international and local permitting requirements.There was some variation in the specimen quality and suitability for a comparative analysis of several traits.For this reason, only individuals in a reasonably good preservation condition were used.
Male genitalia preparation and female scale extraction from anal tuft followed the method described in Basso et al. (2017a).Geographic coordinates and name of the collection site were recorded for each specimen, to which a unique alphanumeric code was assigned indicating country, locality, and number of specimens.Countries were indicated with NATO digram codes (https://en.wikipedia.org/wiki/List_of_NATO_country_codes), localities were scored using capital letters (A to Z), and the number of specimens was marked with two-digit progressive numbers (starting from 01) (see Table S1a).

Molecular data
So far, a few DNA fragments have been used to explore the genetic diversity in the T. pityocampa complex at both mitochondrial and nuclear levels (Kerdelhué et al. 2015).The available information from different laboratories and periods of time could be merged to obtain a unique picture, although this required the identification of a sequence shared among the datasets.A preliminary exploration of the sequences available for most taxa described so far identified the mitochondrial region 5p of the gene cox1 (primers LCO1490 -HCO2198, Folmer et al. 1994) as suitable for the study.Even though the choice involved discarding several studies and samples, it allowed a satisfactory and sufficient coverage of the distribution range and a good matching of both morphological and molecular datasets.
For those locations lacking the target sequence, specimens were retrieved from collections starting in the 1990s that were preserved either frozen or in ethanol.Individuals consisted of adult moths obtained from light/ pheromone trapping, larvae from either rearing, colonies on trees, or pupation processions, from a total of 92 locations in Europe and the Mediterranean Basin (Fig. 1).Total DNA extraction was performed from each specimen individually using Invisorb® Spin Tissue Mini Kit, following manufacturer's instructions to extract from muscle.The extractions from larvae were performed from the head (except for the first instar larvae, where extractions were performed on a bulk of 20-25 siblings), while the extractions from adults were carried out from the thorax muscles or legs to keep the head and abdomen for morphological analyses (Basso et al. 2017a).Quality and quantity of extracted DNA were evaluated using both NanoDrop ND-1000 (Thermo Fisher Scientific, USA) and Qubit 4 (Invitrogen, USA) while DNA integrity was assessed on agarose gel (0.8%).The barcode sequences (Ratnasingham et al. 2007) were amplified using universal primers (LCO 1490 and HCO 2198) and following the PCR conditions used in Salvato et al. (2002).Obtained amplicons were purified with Exo-Sap enzymes protocol and sequenced using both the forward and reverse primers previously used in the amplification step (Basso et al. 2017b).Sequencing was performed at BMR Genomics (https://www.bmr-genomics.it/) and INRAE France.The quality of the chromatograms was assessed using Chromas Lite software (http://technelysium.com.au/wp/chromas).Consensus sequences (385 new sequences with lengths spanning from 530 to 710 nucleotides) were assembled using the DNASTAR software (Lasergene, Madison, WI).In addition, 59 sequences belonging to the genus Thaumetopoea were retrieved in February 2021 (see Table S1b) from NCBI and BOLD databases (Ratnasingham et al. 2007;Mutanen et al. 2010;Huemer et al. 2012;Simonato et al. 2013;Mutanen et al. 2016;Basso et al. 2017;Trematerra et al. 2017a).Among these, a total of 45 sequences (voucher specimens) were used as references to assign species identification in the phylogenetic analyses (see Table S1b).

Data analyses
A full set morphological matrix was built in Mesquite software (Maddison et al. 2018) and traits were encoded as described in File S1 following the encoding used in Basso et al. (2017a).The traits were divided by sex, creating two matrices (File S2a, File S2b) of 187 × 44 and 137 × 31 (specimens x traits), respectively.The outgroup specimens were scored together based on Basso et al. (2017a).Furthermore, the "inapplicable characters (-)" for these specimens were coded as "absent" into the matrices.Principal Component Analysis (PCA) was carried out for both matrices to explore the variance among the multidimensional datasets and to investigate the traits that mainly contribute to outlining the groups (Legendre and Legendre 1998;Loko 2015;Zimisuhara 2015).The analyses were performed with PAST 4.11 (Hammer et al. 2001) under multivariate ordination setting, using a correlation matrix and scaling principal component (PC) by eigenvalue.The relevant PCs were detected by screeplot, cutting off the point equal or under the broken-stick distributions.The contribution of single variables (traits) in each PC was deemed significant when ≥ |0.4|, while the signs of the values were interpreted for positive (+) or negative (-) associations.Concurrently, the matrices were analysed separately using a phylogenetic approach under both Maximum Parsimony (MP) and Maximum Likelihood (ML) criteria (see File S2a, File S2b) rooting trees in the T. pinivora clade.The MP approach was carried out using TNT v1.5 software (Goloboff et al. 2016).The software was set according to Basso et al. (2017a), and the analyses were performed both on equal (EW) and implied weighting (IW).IW parameters (k) were calculated using setk.runscript (Santos et al. 2015;Badano et al. 2018).Each MP analysis produced more than 100 trees.The consensus trees obtained for both datasets resulted in polytomies, assuming the absence of enough discriminating signals among the datasets, these results were discarded and not further discussed.The ML approach was carried out using Iqtree v1.6.12 software (Nguyen et al. 2015) using the nucleotide substitution model determined by ModelFinder (Kalyaanamoorthy et al. 2017).
A total of 430 sequences belonging to T. pityocampa group and 14 sequences belonging to another ten Thaumetopoea species were first translated into putative proteins to identify the proper reading frame.Then multiple alignments were carried out "in-frame" using MAFFT tools implemented in the TranslatorX (http://www.translatorx.co.uk/) pipeline (Abascal et al. 2010) with default settings.To study the variability conveyed by the barcode marker and due to the concordance in the sequences between siblings and specimens from the same geographic zone, we decided to use data-supported even by single chromatogram.Hence, the longer sequences were trimmed, while the shorter sequences were expanded in both 5' and 3' ends, manually curating each chromatogram and reporting the nucleotide called in small letters.Sequences retrieved from databases were used as found.The unique haplotypes and the presence of identical sequences in the dataset were assessed with DNAsp v.6.12.03 x64 (Rozas et al. 2017).Empty positions were treated as missing data and considered in the analysis; ambiguous data were not detected.In the case of identical sequences, the one with the longest length was chosen to represent the haplotype.The final multiple alignments set spanned 657 positions for 94 unique (65 presented in this work) sequences representing the haplotypes' variability of the T. pityocampa complex around the Mediterranean Basin.Initially, a double approach for the phylogenetic investigation was carried out, performing the analyses with NJ and ML methods.
According to the BOLD method (Ratnasingham et al. 2007), the NJ analysis was carried out in MEGA X and used as a reference for the specimens' identification.The ML analysis was carried out using Iqtree v1.6.12 software (Nguyen et al. 2015), applying K2P+G4 evolutionary model.The NJ tree was discarded during the preliminary evaluation due to inconsistencies between BOLD assign-ment and NJ topology obtained from MEGA X software.Indeed, the sequence belonging to the T. mediterranea clade stands apart from the main clade encompassing the other T. mediterranea and T. hellenica sequences.This choice was corroborated by genetic distances evaluation as well.Thus, we performed the following analyses with the ML approach, which was consistent with the clade designation of BOLD.Fifty independent runs were carried out with the settings described above, and each Log-likelihood score was assessed to identify the most likely tree describing the relationship between the clades.This tree was used as a starting point in the calculation of supports.Node supports were evaluated by standard Bootstrap (BT) (Felsenstein 1985) and Ultrafast bootstrap (UFB) (Minh et al. 2013;Hoang et al. 2018), while supports to the branches were estimated by SH-like approximate likelihood ratio test (SH-aLRT) (Guindon et al. 2010).Each test was performed for 10,000 iterations, and values only > 90 in both nodes and branches were reported.Tree was rooted on the clade of broadleaved-feeding species T. herculeana, T. processionea, and T. solitaria.Genetic distances on unique haplotypes between specimens of the detected genetic clades were estimated in MEGA X using the K2P substitution model (Kimura 1980) for the BIN evaluation (Ratnasingham et al. 2007).Finally, the outcomes of distance analyses were compared with results obtained via two typical web applications to detect the species gap.The dataset comprising only T. pityocampa complex was submitted to the Automatic Barcode Gap Discovery (ABGD) (Puillandre et al. 2012) (https://bioinfo.mnhn.fr/abi/public/abgd)using 1000 iteration and running the analyses with default settings for JC69, K2P, and SD models.The relative gap width was set to X = 1.5.The ML tree was evaluated with bPTP server (https:// species.h-its.org),using 100,000 MCMC generation and with a thinning of 1000 and a burn-in of 0.3 (Zhang et al. 2013).Haplotype networks were carried out separately by species, using TCS setting under popART 1.7 software (Leigh et al. 2015).

Morphological analysis
PCAs were carried out separately on male and female matrices to summarize the information carried by morphological data.The three most relevant PCs explained 34.59% of the total variation when analysing the male dataset and this value increased to 39.59% when analysing the female dataset.The relative variance of the first three PCs in both datasets spanned from 17.25% to 9.78%, with single variables positively or negatively correlated to each PC, as summarized in Table S2.
The score plots of PC1-3, conveyed by the male matrix, showed a clear separation between the outgroup and the T. pityocampa complex.On the contrary, in the female matrix, the separation between species groups was less defined.Conversely, in both cases, a complete overlap of the specimens was noticed within the T. pityocampa complex without chances to determine a defined pattern to distinguish taxa (Fig. S2).The results of ML analyses performed on morphological traits were shown in Figure 2 separately for female and male moths.In both cases, the T. pinivora clade (T.pinivora and related species) stood out clearly, forming a separate clade at the base of the tree.The other species that are associated with angiosperms cluster in consistent groups (e.g., apologetica Strand, 1909-dhofarensis Wiltshire, 1980-jordana Staudinger, 1887 and processionea -solitaria) but remained in the T. pityocampa complex.All the individuals of the T. pityocampa complex were mixed up and no patterns were detected with the exception that individuals from the same population group together, irrespective of the putative species (T.hellenica, T. mediterranea, and T. wilkinsoni), or subspecies (T.pityocampa orana) they belong to (Fig. S3).Even when single traits were considered in a comparative way across species and subspecies, a large overlap was observed irrespective of the type of trait (e.g., colour, female anal scale, male genitalia, shape of the canthus, wing venation) or taxonomic allocation (data not shown).

Molecular analysis
The outcome of the molecular study based on the barcode region of the mitochondrial DNA is presented in Figure 3 and in Fig. S4.The ML tree exhibited eight major clades.The two first clades corresponded to species associated with angiosperms in Europe and Northern Africa (T.herculeana, T. processionea, T. solitaria) and in Central Af-rica and the Arabian Peninsula (T.apologetica abyssinica Strand, 1911, T. apologetica, T. dhofarensis).The next clade included the species feeding on conifers in summer (T.bonjeani, T. ispartaensis, T. libanotica, T. pinivora).The ingroup was organized into three larger clades corresponding to species feeding on conifers in winter and differentiated according to geographic distribution.Following the phylogenetic order, T. hellenica, from Greece and Libya (clade 1: BIBSA1724-16, MW756044-45), was retrieved to stand apart from T. mediterranea from Algeria, Pantelleria Island, Tunisia (clade 2: BCLEP001-16, GBGL12179-13, MW756046-52).The second lineage included T. cretensis sp.n., collected in Crete Island (clade 3: MW756053-54, MZ425543-49), and T. wilkinsoni from islands of Cyprus and Rhodes, and Turkey (clade 4: GBGL12176-13, MW756055-64).The T. pityocampa clade spanning from Greece to NW Africa was split into four statistically supported subclades structured according to different geographic locations.Specimens from Corsica Island, Algeria and Morocco were clearly identified, while the others were not so distinct and included individuals from the Balkans to Spain and Portugal.Sequences from S Algeria and S Morocco (MW756065-66) were included in a subclade distinct from that including individuals from Corsica Island, NW Algeria and N Morocco (MW756067-69), which were sister to European specimens..The haplotype network in the complete dataset revealed the presence of 94 haplotypes, which were divided by species, as shown in the ML phylogenetic tree (Fig. 3).The interspecific genetic distances of the cox1 marker between ingroup clades ranged from 3.8% (T.hellenica -T.mediterranea) to 9.4% (T.hellenica -T.cretensis or T. wilkinsoni) (Table 2).Seven sequences for T. hellenica revealed three different haplotypes, each characterized by three nucleotide substitutions with the intraspecific genic distance of 0.6%.The haplotype distribution of Clade 1 showed that two sequences were located near the Libyan coast (MW756044-45).In contrast, a single haplotype from the reference sequences (BIBSA1724-16) belonged to males collected with pheromone traps in two sampling sites in Greece (Trematerra et al. 2017b) (Fig. S5).Thaumetopoea mediterranea was represented by 11 sequences split into nine unique haplotypes, with a 1.6% genetic distance (Table 2).The network analyses showed a subdivision into two groups supported by 13 nucleotide changes (Fig. S6).This set showed the two reference sequences reported, respectively, from Pantelleria Island (BCLEP001-16) (Trematerra et al. 2017b) and in Bizerte (GBGL12179-13) (Kerdelhue et al. 2009), diverged from those obtained by inland specimens (MW756046-52).The same distribution was obtained in the ML phylogenetic tree, with reference sequences (BCLEP001-16, GBGL12179-13) clustered together and were sisters to the others (MW756046-52).The haplotype distribution located the Clade 2 between Algeria and Tunisia, along the Mediterranean side of the Atlas Mountains, where more genetic variability was detected (MW756046-52).
The 18 sequences of T. cretensis were split into nine exclusive haplotypes with an overall genetic distance of 0.9%.The network grouped the haplotypes in two main clades (MZ425547-49 + MW756054 and MZ425543-46 + MW756053), separated by seven changes equally distributed in Crete Island (Fig. S7).This partition was sup- ported, also by nucleotide substitutions (A>T) involving position 130 of the alignment and it was shown in the phylogenetic tree where it was sustained by a fully supported node (Figs 3 and S4).In addition, this substitution reflected the isolation of this species, corroborated by two unique amino acid substitutions (I>L and L>M) occurring in positions 41 and 93 of the amino acid alignment.
The haplotype network of 19 sequences belonging to T. wilkinsoni detected 11 specific polymorphisms with an overall interspecific distance >8% and an intraspecific genetic distance of 1.4%.The analysis showed three main clusters separated by a hypothetical ancestor.The first clade included sequences from NE and SW Turkey (MW756061-64), which resulted sister group of the other clades comprising sequences from Cyprus (MW756055-56), and samples collected near Adana region in the SE Turkey (GBGL12176-13, MW756057-59).Sequence MW756060, retrieved from SW Turkey, was linked to this latter, with six substitutions from the reference sequence (Fig. S8).
Thaumetopoea pityocampa included 62 exclusive haplotypes from 375 analysed sequences arranged into four groups.The overall genetic difference calculated within the group was 2%, while the interspecific distances were >7.5% (Table 2).In addition, the TCS network analyses showed several substitution steps separating the subclades from S Algeria -S Morocco (MW756065-66), NW Algeria -N Morocco, and Corsica Island (MW756067-69) by European clades.The ancestors to link these sequences with the other from Europe were not detected.The group from Spain included sequences from the summer form of T. pityocampa and other sequences from the Iberian Peninsula.The TCS network identified haplotype KT768188 as a common ancestor of sequences retrieved from Spain and Portugal.The same mutation was distributed also in France, collected from specimens near Pyrenean and Brittany regions.The other sequences were combined in a large group from Pyreneans to Greece showed three clades of differentiation, linked to each other and with a few substitutions (BIBSA1730-16, BIBSA1727-16, ABOLD143-16) (Figs S9 and S10).During the analyses, MW756107 from southern France was discarded by the software after detecting length differences among sequences >5%.Therefore, it was added manually at the end of the analysis associating according to the MP principle (Fig. S10).
The barcode-gap evaluation carried out with ABGD resembled the findings above with slight differences.
Through the recursive partition analysis, the software retrieved 11 groups (p-values = 0.001-0.004),estimating the barcode gap boundary at 2%.Furthermore, the clear species delimitation was evaluated at around 5-5.5%, where a second split between distance values was detected.In detail, single clades were identified within Clade 1, 3, and 4, while Clade 2 was separated into two sub-groups based on geographic distribution.Finally, the sequences of T. pityocampa (Clade 5) were arranged into six other subgroups.Each of them agreed with the layout shown in the phylogenetic tree (Figs 3 and S4).The gap delimitation obtained with bPTP was generally congruent with previous results.However, it tended to over-fragment the T. pityocampa clade, retrieving up to seven sub-groups.

Diagnosis. The species can be distinguished from other
Thaumetopoea species based on several morphological and biological traits (de Freina and Witt 1987).The separation from other taxa included in the same group (Table 1) is difficult because of substantial variation of morphological traits within taxa (e.g., individuals from the original site of the material studied and described by Réaumur, Fig. S1 and S11).However, the barcode sequence allows an easy discrimination of T. pityocampa from the three other similar species (T.wilkinsoni, T. hellenica, and T. mediterranea), but not from the synonymous T. pityocampa orana.
Table 2. Genetic distances (base substitutions/site) calculated with MEGA X. Analyses were performed using the Kimura 2-parameter model involving 94 nucleotide sequences.Distances calculated among clades were reported on the diagonal and in bold.Distances calculated between species were reported in the lower-left matrix.Paratypes.Three adult ♂ and four adult ♀, same label.

Thaumetopoea cretensis
Type material deposited at the Naturhistorisches Museum Wien, Austria.
Diagnosis.Smaller than other species of the T. pityocampa complex, from which it can be hardly separated based on morphological traits of both male and female moths.
The anal scale of the female is more similar to T. wilkinsoni than to T. pityocampa.It can be separated from other species of the genus based on the gene sequence data of the barcode region cox1 (primers HCO -LCO) and from a clear separation at nuclear DNA level identified by Petsopoulos et al. (2018).

Description. (Figs 5, S12, S13
). Adult ♂.Forewing length 15 mm.Antenna bipectinate, rami well developed and more or less equal in length except in the apex, where they become progressively shorter; axis and rami pale orange-brown.Head small, labial palpus extremely short, directed ventrally, covered completely in long, dense dark grey hair-like scales; eye surrounded by long dark brown hair-like setae; chitinous and shiny frontal process (crest, canthus) with 6 teeth; first tooth pointing dorsally and others frontally; angle of about 90° between first and second tooth; teeth from second to sixth with lateral horns; height of first tooth about one third of second, height of other teeth progressively decreasing until the sixth, which is about one tenth of second; spacing between first and second tooth larger than that between other teeth, maximum width at level of fourth tooth; frontal process surrounded medially and basally by long yellow hair-like setae with dark brown apex; vertex covered by long yellow hair-like setae with dark brown apex; compound eye large, globular.Thorax, tegula and collar covered in dense, very long dark brown hair-like setae.Abdomen orange-brown dorsally, dark grey laterally.Forewing ground colour pale light brown, costal margin yellowish for two-thirds from the base; basal line and ante-median lines complete, post-median line extended from costal to ventral margin, more distinct in the upper half, discal spot well defined, c-shaped, brown, terminal line distinct and brown.Hindwing off-white, with yellowish venation; discal spot faint; terminal line present with clear anal spot.Underside of forewing as upperside but generally less distinct; underside of hindwing homogenous whitish as upperside but with darker shade, discal spot present but faint.-Genitalia.Uncus short, relative-ly broad at base, apically with two hooks.Socii short, longer than uncus, broad at base, lateral and inner margins straight, apex rounded.Tegumen robust, as long as valvae; juxta wider than long, concave at superior margin, lateral and inferior margins convex, lateral angles rounded.Valvae broad at base, rounded quadrangular, cucullus rounded with sparse, short setae.Ventral margin of valvae convex basally and concave distally; dorsal margin of valvae straight basally and concave distally; upper ribbing of valvae faint in the basal part.Phallus as long as valvae, straight, apically rounded, constricted in the distal half.
Adult ♀.Forewing length 19 mm.Antenna bipectinate, rami more or less equal in length except in the apex, where they become shorter; axis and rami pale yellow brown.Head small, labial palpus short, directed ventrally, light yellow and covered completely by long, dense brownish hair-like setae; eye surrounded by long brownish hair-like setae; chitinous and shining frontal process (crest, canthus) with 6 teeth; first tooth pointing dorsally and the others frontally; angle of about 90° between the first and the second tooth; teeth from the second to the sixth with lateral horns; height of the first tooth about one fourth of the second, the height of other teeth progressively decreasing until the sixth, which is about one eighth of the second; teeth 2-6 with lateral horns; spacing between the first and second tooth larger than that between the other teeth, maximum width at level of fourth tooth; spacing between tooth 2 and 3 larger than between 4 and 5; frontal process surrounded medially and basally by long yellow hairlike setae with brown apex; vertex covered by long yellow hair-like setae with brown apex; compound eye large, globular.Thorax, tegula and collar covered by dense, very long light brown hair-like setae; abdomen orange brown dorsally and covered with short hair-like setae, yellowish with longer hair-like setae laterally and ventrally; 8 th and 9 th tergites covered with dense pack of scales, partially covered by long hair-like setae.Scales elongated, maximum width at apex, 1.75 times longer than wide, light brown except at the base.
Forewing long, broad triangular, costal and ventral margin straight, outer margin evenly arcuate, tornal angle broad, apex narrowly rounded.Forewing ground colour pale light brown, costal margin yellowish for two-thirds from the base; basal line faint, ante-median line visible in the upper half and faint in the lower half, post-median line extended from costal to ventral margin but distinct only in the upper half, discal spot well defined, c-shaped, brown, terminal line distinct and brown.Hindwing off-white, with yellowish venation; discal spot faint; terminal line present with a clear anal spot.Underside of forewing as the upperside but generally less distinct; underside of hindwing homogenous whitish as upperside but with darker shade, discal spot present but faint.
Distribution.Known from Crete Island.Remarks.Etymology: from the island of Crete.

Discussion
The main results of the morphological and molecular analyses of the taxa included in the T. pityocampa complex indicate that morphology alone does not allow to delimit the species while the barcode region of mitochondrial DNA can be helpful.The lack of reliable morphological traits for the separation of the taxa was already mentioned in the first revision of Agenjo (1941), who considered the eastern pine processionary moth T. wilkinsoni as a self-standing species because of ecological and biogeographical traits, while morphological traits overlapped with those of T. pityocampa.Further efforts to find discriminant traits in male genitalia and female anal scales conducted by Démolin (1988) and Démolin et al. (1994), on individuals originating from populations associated with pines and cedars around the Mediterranean basin, failed to have a conclusive outcome on species identity.Some tendencies in the shape of the canthus, the middle rib of male genitalia valvae, or the shape of the female anal scales were evidenced, although they were never clear enough to discriminate among taxa even when these were geographically well separated.Hybridization between the two main species in the group, T. pityocampa and T. wilkinsoni, was achieved under laboratory conditions first by Démolin (Battisti et al. 2015) and then by Petrucco-Toffolo et al. (2018).In both cases, the obstacle represented by different emergence times was overcome by manipulating the temperature regimes under which the diapausing pupae were maintained, making them emerge simultaneously in mating boxes.This suggests that phenology of adult emergence could act as a barrier to gene flow in the field.Interestingly, the viable F1 and F2 generations did not only show hybrid genomes but also intermediate morphological features (Petrucco-Toffolo et al. 2018).An exploration of the geographical contact zone between T. pityocampa and T. wilkinsoni in NW Turkey was carried out by İpekdal et al. (2020) through the analysis of mitochondrial genes and nuclear polymorphic SSR markers, evidencing recurrent introgression by T. wilkinsoni males in several T. pityocampa populations.The phenomenon, however, was not frequent and the occurrence of prezygotic isolation mechanisms, such as differences in timing of the adult emergences, was considered the most likely as no geographical or environmental barriers occur in that region.In addition, both species respond well to the same sexual pheromone ((Z)-13-hexadecen-11-yn-1-ol acetate), isolated first from T. pityocampa (Guerrero et al. 1981) but shown to be equally effective in T. wilkinsoni (Halperin and Golan 1982).A thorough reassessment of the sexual pheromones confirmed the identity of the molecules in both taxa (Frérot and Démolin 1993).
Molecular markers used in the last decades have revealed a complex situation in the group of T. pityocampa.Here, the information differs depending greatly on the origin of genes considered, either mitochondrial or nuclear.Maternally inherited mitochondrial genes mirror the history of female lineages, that is generally restricted to small geographic regions because of females' short life (usually one night) and limited average flight range of few hundred metres (Battisti et al. 2015).Nuclear genes are strongly influenced by the gene flow caused by males, which live longer and spread over several kilometres, as suggested by trapping experiments (e.g., males were caught on mountain tops in the Alps, far from established populations) (Kitschelt 1925;Salvato et al. 2005).At both elevational and latitudinal expansion areas, nuclear markers revealed a larger admixture area than that identified by mtDNA markers, highlighting the role of male-driven gene flow (Salvato et al. 2005).The history of the pine processionary moth in the Mediterranean was reconstructed based on phylogeographic studies and it is essentially based on repeated isolation events in refugia since at least the Miocene, i.e., 10 million years ago (Kerdelhué et al. 2009(Kerdelhué et al. , 2015)).Those events may have given rise to the different taxa presently known in the group.However, some taxa could have come in secondary contact, in the past because of climatic oscillations or even in recent time because of extensive pine plantations in the Mediterranean.Consequently, gene flow in the nuclear genome could have occurred in the absence of barriers, as shown for example between Cyprus and the mainland (Kerdelhué et al. 2009) or between clades occurring in Northern Africa (El Mokhefi et al. 2016).Given that females are less mobile than males, the mitochondrial signature of former isolation may remain despite gene flow assured by males.
This situation has led to high haplotype diversity, as shown for example in the site where the neotype was collected, and to the definition of several clades mainly based on mitochondrial markers (Simonato et al. 2007;Kerdelhué et al. 2009), prompting the description of new taxa such as T. hellenica and T. mediterranea (Trematerra et al. 2017;Trematerra and Colacci 2018).Although these taxa are indistinguishable based on morphological traits, they are supported based on their genetic identity at the mitochondrial level.Basically, they represent two mitochondrial subclades described by Kerdelhué et al. (2009) included in the Eastern-Northern African (ENA) clade, the first subclade from East Algeria, Tunisia, and Pantelleria Island, and the second from the Cyrenaica peninsula in Libya.Interestingly, the haplotype found in Libya was later found in samples collected in the Attika region (Greece) (Avtzis et al. 2016(Avtzis et al. , 2018)), from which the name T. hellenica was chosen by Trematerra et al. (2017).Data suggest that the occurrence of Libyan-mtDNA bearing individuals in Greece is recent, as previous dense phylogeographical studies did not retrieve this corresponding divergent haplotype (Kerdelhué et al. 2009;Korsch et al. 2015).However, a thorough analysis of gene flow at nuclear level would be necessary to conclude about the species status of the new taxa recently described, and to determine if T. hellenica occurs in Greece or if this situation mostly corresponds to a mitochondrial introgression.Such an analysis has been achieved for the populations of Algeria, where two clades (pityocampa, including the subspecies orana, and ENA) are coming into contact, and it showed that nuclear differentiation is limited, and gene flow commonly occurs among them despite the strong mitochondrial genetic structure (El Mokhefi et al. 2016).Based on these results, there is no substantial reason to consider orana as a separate taxon.A different situation was described for the populations of the Crete Island, which are a subclade of T. wilkinsoni (Kerdelhué et al. 2009).Here, a thorough analysis of all markers showed that Crete populations are isolated from all the other populations of the Asian mainland and the Aegean archipelago (Petsopoulos et al. 2018), leading to the description of the new species in this paper.
The economic and medical importance of the pine processionary moths is acknowledged in all its range.The species complex is well characterised based on the unique traits of the life history, namely the winter feeding and the construction of conspicuous silk tents by the larvae.There is, however, evidence that the group is subdivided into various taxa, characterized either by a typical geographic distribution or a particular phenology that may represent barriers to gene flow.In this concern, the recent finding of an allochronic population with shifted phenology, i.e., summer feeding in Portugal (Santos et al. 2007(Santos et al. , 2011a)), partially co-occurring with populations still exhibiting the classical winter-feeding phenology, has confirmed the large adaptation potential of the species to local conditions (Santos et al. 2011b(Santos et al. , 2013)), creating barriers to gene flow that may end up with new species formation (Burban et al. 2016(Burban et al. , 2020)).The exploration of the demographic history (Leblois et al. 2018) and of the underlying genetic mechanisms has just started (Gschloessl et al. 2014(Gschloessl et al. , 2018(Gschloessl et al. , 2022)), and it could shed light on the speciation mechanism and species delimitation in the group.
The study of some traits of the life history as well as evidence from molecular data, mainly using the mitochondrial DNA-barcoding region, support the concept that the taxa included in the T. pityocampa complex are separately evolving lineages.Females of different species are very sedentary (Battisti et al. 2015) and possess distinct mitochondrial haplotypes that unambiguously set each taxon from others.The meta-populations of the five taxa are predominantly geographically separated, but when in contact they exhibit different phenological patterns, which minimizes but does not always prevent hybridization.Indeed, at least T. pityocampa and T. wilkinsoni were shown to produce hybrids both in laboratory and in the field (Petrucco-Toffolo et al. 2018;İpekdal et al. 2020).The production of fertile hybrids in the laboratory requires a strong manipulation of the phenology of the two taxa, thus enforcing very unnatural conditions.Furthermore, the occurrence in the field of hybrids is limited and restricted to a small overlapping portion of the distribution areas of T. pityocampa and T. wilkinsoni (İpekdal et al. 2020).This phenomenon is typical of the early stages of differentiation of taxa that are separating according to a parapatric speciation mechanism.In T. cretensis, the speciation process seems to have progressed even further because this taxon has not been exchanging genetic material with other moths of the complex for a long time, aided by a complete geographic isolation within its distribution range (Petsopoulos et al. 2018).Information on T. mediterranea and T. hellenica is limited, but still some level of isolation seems to characterize both species as judged by DNA-barcoding outputs (present work;Trematerra et al. 2017;Trematerra and Colacci 2018).

Conclusion
The five taxa included in the T. pityocampa complex seem to fulfil the fundamental property of the unified concept of species (De Queiroz 2007), i.e., they appear to evolve as separate lineages.However, it seems that the level of differentiation has not yet achieved the complete isolation in terms of reproduction for all taxa, and some of them still occupy grey zones within the cladogenetic branching pattern (De Queiroz 2007).Available knowledge favours the view that taxa within the T. pityocampa complex are experiencing a process of speciation.Thus, in this we suggest considering them as separate taxa, an approach that allows full appreciation to the biological diversity in this group of moths.

Figure 1 .
Figure 1.Sites where individuals of the Thaumetopoea pityocampa species complex were collected for the molecular (a above) and morphological (b below) analyses.In map «a», different pin colours indicate BOLD-records in purple, data from Burban et al. (2016) in light blue and new sequences in dark blue.

Figure 3 .
Figure 3. Simplified cladogram of ML phylogenetic analysis (-ln= 4358.6671) of the sequences considered in this work.A complete version is provided in Fig. S4.
Funding statement.European Union's Horizon 2020 Program for Research and Innovation under grant agreement no.771271 'HOMED' and the Bolzano/Bozen prov ince for the grant 92/2021.

Table 1 .
Schintlmeister (2013)d in the Thaumetopoea pityocampa complex and their current status based onSchintlmeister (2013)and further descriptions.