Research Article
Corresponding author: Van Lun Low (vanlun_low@um.edu.my)
Corresponding author: Zubaidah Ya’cob (zyacob@um.edu.my)
Academic editor: Brian Wiegmann
© 2023 Noor Izwan-Anas, Van Lun Low, Zubaidah Ya’cob, Emmanuel Y. Lourdes, Mohamad Rasul Abdullah Halim, Mohd Sofian-Azirun, Hiroyuki Takaoka, Peter H. Adler.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Black flies play a prominent role in public health and the epidemiology of parasitic diseases of humans and of domesticated and wild animals. Correct identification and comprehensive surveys are required to identify vector and pest species and thus understand their biological attributes, which play a vital role in monitoring programs. DNA barcoding is an established molecular tool that provides rapid and accurate species identification. Our study strengthens the molecular database for black flies in Malaysia by adding 59 cytochrome c oxidase I sequences for 22 species, of which 14 are included for the first time. These sequences, combined with those in public databases, represent a total of 338 sequences for 52 Malaysian species, nearly 50% of which were collected from type localities. At the subgeneric level, barcode gap analysis most accurately identified species in the subgenus Nevermannia (92%), followed by Simulium s. l. (91%), and Gomphostilbia (81%). The remaining sequences were ambiguous and could not be distinguished from those of nearest neighbour species due to an overlap in genetic divergence and low genetic diversity, especially between insular species. Tree analyses indicate that certain species had incomplete lineage sorting and low mitochondrial signals. Possible cryptic species were indicated in the Simulium (Gomphostilbia) batoense and S. (G.) epistum species groups. Species delimitations were consistent with morphological identifications except in large species groups such as the S. (G.) asakoae, S. (G.) batoense, S. (G.) epistum, and S. (Simulium) melanopus groups. The use of type specimens or specimens collected from type localities (topotypes) in barcoding is strongly recommended for reference sequences to increase the reliability of the molecular database.
cryptic species, cytochrome c oxidase I, Gomphostilbia, Simulium, vector
Black flies are important in public health and play a significant role in the epidemiology of parasitic diseases of humans, domesticated animals, and wildlife. Certain species of the genus Simulium are vectors of Onchocerca volvulus, the sole causative agent of human onchocerciasis. This disease infects more than 15 million people in Africa, Yemen, and Latin America, with approximately a million having lost sight (
Baseline taxonomic information is available for many black flies in Southeast Asian countries (
In the workflow application of DNA barcoding, organisms need to be identified and verified by taxonomists before DNA sequences can be deposited in the reference library (
All specimens in this study were collected from streams across Malaysia (Table
Species and collection details for 22 species of black flies from Malaysia used for barcoding.
Subgenus/species | n | Location | Coordinates | Date |
Nevermannia Enderlein | ||||
Simulium aureohirtum Brunetti, 1911 | 3 | Cameron Highland, Pahang | 4°31.258′N, 101°24.247′E | 23 Feb 2013 |
† Simulium ledangense Ya’cob, Takaoka & Sofian-Azirun, 2014 | 3 | Mount Ledang, Johor | 2°22.76′N, 102°36.615′E | 11 Apr 2013 |
Gomphostilbia Enderlein | ||||
† Simulium auratum Takaoka, 2009 | 2 | Murud, Sarawak | 3°57.097′N, 115°33.075′E | 11 Jun 2013 |
Simulium aziruni Takaoka, Hashim & Chen, 2012 | 1 | Tasik Kenyir, Terengganu | 5°0.5518′N, 102°42.1472′E | 23 Sep 2017 |
Simulium barioense Takaoka, 2008 | 3 | Mesilau, Sabah | 6°2.133′N, 116°35.835′E | 19 Jun 2014 |
† Simulium hiroyukii Ya’cob & Sofian-Azirun, 2015 | 2 | Murud, Sarawak | 3°55.365′N, 115°30.5066′E | 13 Jun 2013 |
Simulium kelabitense Takaoka, 2008 | 3 | Bakalalan, Sarawak | 3°57.419′N, 115°37.057′E | 16 Jun 2013 |
Simulium pegalanense Smart & Clifford, 1969 | 3 | National Park, Pahang | 4°33.282′N, 102°18.995′E | 13 Sep 2013 |
† Simulium sarawakense Takaoka, 2001 | 3 | Pueh, Sarawak | NA | 28 Aug 2008 |
† Simulium terengganuense Takaoka, Sofian-Azirun & Ya’cob, 2012 | 3 | Pasir Raja, Terengganu | 4°33.985′N, 102°57.429′E | 7 Jun 2013 |
Simulium varicorne Edwards, 1929 | 1 | Negeri Sembilan | NA | 29 Dec 2010 |
Simulium Latreille | ||||
Simulium alberti Takaoka, 2008 | 3 | Murud, Sarawak | 3°57.277′N, 115°33.308′E | 10 Jun 2013 |
Simulium beludense Takaoka, 1996 | 1 | Mesilau, Sabah | 3°50.104′N, 115°36.521′E | 9 Apr 2014 |
Simulium bishopi Takaoka & Davies, 1995 | 1 | Cameron Highland, Pahang | 4°18.420′N, 101°19.658′E | 24 Dec 2012 |
2 | 4°26.723′N, 101°22.979′E | 25 Dec 2012 | ||
Simulium brevipar Takaoka & Davies, 1995 | 2 | Raub, Pahang | 4°23.715′N, 101°36.443′E | 27 Dec 2012 |
1 | 4°26.723′N, 101°22.979′E | 22 Feb 2013 | ||
Simulium grossifilum Takaoka & Davies, 1995 | 1 | Cameron Highland, Pahang | 4°23.165′N, 101°22.334′E | 25 Dec 2012 |
† Simulium hackeri Edwards, 1928 | 1 | Cameron Highland, Pahang | 4°34.956′N, 101°20.717′E | 27 May 2012 |
1 | 31 Mar 2012 | |||
1 | 27 Jun 2012 | |||
1 | 26 Dec 2012 | |||
† Simulium hirtinervis Edwards, 1928 | 1 | Cameron Highland, Pahang | 4°22.220′N, 101°21.512′E | 25 Dec 2012 |
2 | 27 May 2013 | |||
Simulium jeffreyi Takaoka & Davies, 1995 | 3 | Tapah, Perak | 4°16.316′N, 101°19.022′E | 24 Dec 2012 |
Simulium malayense Takaoka & Davies, 1995 | 3 | Cameron Highland, Pahang | 4°22.220′N, 101°21.512′E | 25 Dec 2012 |
† Simulium murudense Takaoka, Ya’cob & Sofian-Azirun, 2015 | 3 | Murud, Sarawak | 3°55.6084′N, 115°30.8434′E | 13 Jun 2013 |
Simulium perakense Takaoka, Ya’cob & Sofian-Azirun, 2018 | 3 | Batu Gajah, Kelantan | 5°45.075′N, 101°58.816′E | 2 Feb 2015 |
2 | Janda Baik, Pahang | 3°18.2167′N, 101°52.5′E | 25 Jul 2011 | |
n, total number of sequences. †, samples collected from type localities. NA, not available |
A total of 85 black fly specimens were subjected to DNA extraction using the G-spin Total DNA Extraction Mini Kit (iNtRON™ Biotechnology, Inc., Seongnam, South Korea), according to the animal tissues protocol provided by the manufacturer. Polymerase chain reaction (PCR) was performed as described by
PCR amplification was performed with an Applied Biosystems Veriti 96-Well Thermal Cycler (Applied Biosystems, Inc., Foster City, CA, USA). All amplifications were confirmed using 1.5% agarose gel pre-stained with SYBR Safe (Invitrogen Corp., Carlsbad, CA, USA) and run with a 100 bp DNA ladder (GeneDireX, Inc., Taiwan). Successful PCR products (59 specimens) were confirmed at approximately 700 bp and sent to Apical Scientific Sdn. Bhd., Selangor, Malaysia, for sequencing.
A total of 361 sequences were included in data analyses. Of these, 338 were Malaysian sequences (Table
Malaysian black flies (n = 52) used for DNA barcoding, according to subgenus, with intraspecific genetic distances. Species with newly generated sequences are in bold.
Subgenus | Species | Intraspecific distance (average) | n |
Nevermannia | S. aureohirtum † | 0.25–1.02 (0.68) | 3 |
S. borneoense ‡ | 4.21 | 2 | |
S. ledangense †‡ | 0.00–0.25 (0.10) | 5 | |
S. pairoti ‡ | 0.00–1.54 (0.71) | 16 | |
Gomphostilbia | S. angulistylum ‡ | 0.00–0.25 (0.17) | 3 |
S. asakoae ‡ | 0.00 | 3 | |
S. auratum ‡ | 8.38 | 2 | |
S. aziruni † | — | 1 | |
S. barioense | 2.60–3.94 (3.31) | 3 | |
S. brinchangense ‡ | 0.25–0.77 (0.51) | 3 | |
S. cheongi | 0.00–2.59 (0.90) | 45 | |
S. decuplum | 0.00–1.28 (0.85) | 3 | |
S. duolongum | 0.00 | 4 | |
S. gombakense | 0.51–2.33 (1.72) | 3 | |
S. hiroyukii ‡ | 1.02 | 2 | |
S. izuae ‡ | 0.25–1.02 (0.68) | 3 | |
S. johorense ‡ | — | 1 | |
S. kelabitense | 0.00 | 3 | |
S. leparense ‡ | 0.00 | 3 | |
S. lurauense ‡ | 0.77–1.28 (1.02) | 3 | |
S. parahiyangum | 0.00 | 4 | |
S. pegalanense | 0.51–7.24 (4.99) | 3 | |
S. roslihashimi | 0.00–0.51 (0.34) | 3 | |
S. sarawakense ‡ | 0.77–3.67 (2.43) | 3 | |
S. sazalyi ‡ | 0.00–2.33 (1.27) | 8 | |
S. sheilae ‡ | 0.00 | 3 | |
S. sofiani ‡ | 0.00 | 3 | |
S. tanahrataense ‡ | 0.00 | 3 | |
S. terengganuense ‡ | 0.00–3.11 (2.08) | 3 | |
S. trangense | 0.00–0.77 (0.51) | 3 | |
S. varicorne | — | 1 | |
S. whartoni | 0.51–1.28 (0.84) | 3 | |
Simulium | S. alberti | 0.25–2.33 (1.55) | 3 |
S. argentipes | 0.51 | 2 | |
S. beludense | 0.00–0.25 (0.17) | 3 | |
S. bishopi | 1.02–1.54 (1.28) | 3 | |
S. brevipar | 0.51–0.77 (0.68) | 3 | |
S. crassimanum | — | 1 | |
S. grossifilum | — | 1 | |
S. hackeri †‡ | 0.00–0.51 (0.20) | 8 | |
S. hirtinervis ‡ | 0.25–0.77 (0.51) | 3 | |
S. jeffreyi † | 0.00–2.33 (0.99) | 36 | |
S. kiuliense | 0.00 | 6 | |
S. laterale | — | 1 | |
S. maklarini | — | 1 | |
S. malayense † | 0.00–3.38 (1.02) | 8 | |
S. mirum ‡ | 0.00–2.33 (0.93) | 11 | |
S. murudense ‡ | 0.00 | 3 | |
S. nigripilosum | 0.00–1.80 (1.20) | 4 | |
S. perakense | 0.25–2.60 (1.29) | 5 | |
S. tani | — | 1 | |
S. vanluni | 0.00–2.59 (0.73) | 84 | |
n, total number of sequences. †, species with newly generated sequences and sequences retrieved from GenBank. ‡, samples from type localities. |
Pairwise genetic distance was calculated using the Kimura 2-parameter model in MEGA 11 version 11.0.11 (
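The Kimura 2-parameter distance has a closed form, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of compared sites that differ by a transition and by a transversion, respectively. The Python sketch below illustrates that formula only; it is not the MEGA implementation, and the function name `k2p_distance` and the pairwise-deletion treatment of gaps and ambiguous bases are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Returns substitutions per site; multiply by 100 for a percentage.
    Sites with a gap or ambiguous base in either sequence are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition in ten aligned sites.
example = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

Identical sequences give a distance of zero, and the correction grows faster than the raw proportion of differing sites as sequences diverge.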
Two different species delimitation methods were used: Assemble Species by Automatic Partitioning (ASAP) (
Pairwise intraspecific divergences for all Malaysian sequences ranged from 0.00% to 8.38% (Fig.
A scatter plot of the data demonstrates that almost 80% of all species had a DNA barcode gap, whereas the rest did not due to overlap of the farthest conspecific with the nearest neighbour species (Fig.
DNA barcoding gap represented by a scatter plot of all 338 Malaysian black fly sequences: Maximum intra-distance versus minimum distance to nearest neighbour (NN). The gap exists for species above the 1:1 line. The DNA barcode gap was present in 79.54% of all species. Single sequence species were excluded. Red dots include newly generated sequences, whereas blue dots represent sequences from GenBank.
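The comparison plotted in the scatter plot can be sketched in a few lines of Python: for each species with at least two sequences, take the maximum intraspecific distance and the minimum distance to any sequence of another species, and call the gap present when the latter is larger. The species labels and distance matrix below are hypothetical toy values, not data from this study, and `barcode_gap` is an illustrative name.

```python
def barcode_gap(labels, dist):
    """Return {species: (max_intra, min_nearest_neighbour, has_gap)}.

    labels[i] is the species of sequence i; dist is a symmetric matrix of
    pairwise distances (percent). Species with a single sequence are
    skipped because they have no intraspecific distance.
    """
    n = len(labels)
    result = {}
    for sp in sorted(set(labels)):
        members = [i for i in range(n) if labels[i] == sp]
        others = [i for i in range(n) if labels[i] != sp]
        if len(members) < 2 or not others:
            continue
        max_intra = max(dist[i][j] for i in members for j in members if i < j)
        min_nn = min(dist[i][j] for i in members for j in others)
        # The gap exists when the point lies above the 1:1 line.
        result[sp] = (max_intra, min_nn, min_nn > max_intra)
    return result


# Hypothetical toy data: two sequences each of A and B, one of C.
toy_labels = ["A", "A", "B", "B", "C"]
toy_dist = [
    [0.0, 0.5, 4.0, 4.2, 6.0],
    [0.5, 0.0, 3.8, 4.1, 6.1],
    [4.0, 3.8, 0.0, 5.0, 2.0],
    [4.2, 4.1, 5.0, 0.0, 2.5],
    [6.0, 6.1, 2.0, 2.5, 0.0],
]
gaps = barcode_gap(toy_labels, toy_dist)
share_with_gap = sum(1 for v in gaps.values() if v[2]) / len(gaps)
```

In this toy case species A has a gap (0.5% within, 3.8% to its nearest neighbour) whereas species B does not, because its two sequences differ from each other more than one of them differs from C.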
The overall percentage of correct identification was 88.46% for Best Match (BM) and 87.27% for Best Close Match (BCM) but only 75.44% for All Species Barcode (ASB) (Table
DNA barcode identifications of all 338 Malaysian black fly sequences, according to subgenus, using TaxonDNA functions: Best Match (BM), Best Close Match (BCM), and All Species Barcode (ASB).
Subgenus | BM: Correct (%) | BM: Ambiguous (%) | BM: Incorrect (%) | BCM: Correct (%) | BCM: Ambiguous (%) | BCM: Incorrect (%) | BCM: No match (%) | ASB: Correct (%) | ASB: Ambiguous (%) | ASB: Incorrect (%) | ASB: No match (%) |
Gomphostilbia | 82.40 | 9.60 | 8.00 | 80.80 | 9.60 | 3.20 | 6.40 | 49.60 | 43.20 | 0.80 | 6.40 |
Nevermannia | 100 | 0.00 | 0.00 | 92.31 | 0.00 | 0.00 | 7.69 | 92.31 | 0.00 | 0.00 | 7.69 |
Simulium | 90.91 | 6.42 | 2.67 | 90.91 | 6.42 | 0.00 | 2.67 | 90.37 | 6.95 | 0.00 | 2.67 |
All | 88.46 | 7.10 | 4.43 | 87.27 | 7.10 | 1.18 | 4.43 | 75.44 | 19.82 | 0.29 | 4.43 |
The subgenus Gomphostilbia had the lowest percentage of correct species identifications for all three functions, particularly ASB with less than 50%. All 45 sequences of S. cheongi and S. whartoni were ambiguous. Simulium lurauense was the only incorrect sequence for ASB. For BM, 10 sequences (8.0%) were incorrectly identified, whereas only four sequences were incorrectly identified for BCM (3.2%). Eight sequences were below the threshold value of 3%.
Five singleton sequences (2.67%) for species in the subgenus Simulium were incorrect for BM but not for BCM: Simulium crassimanum, S. laterale, S. maklarini, S. grossifilum, and S. tani. All these sequences were below the threshold value, including a sequence from S. malayense. Twelve sequences (6.42%), comprising S. mirum (six) and S. kiuliense (six), were ambiguous for both BM and BCM.
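The logic behind the Best Match and Best Close Match outcomes can be illustrated with a simplified Python sketch. It is not the TaxonDNA code: the function name `classify_matches`, the toy labels, and the toy distances are assumptions, and All Species Barcode is not covered. Each sequence is queried against all others; the outcome depends on which species own the closest sequence(s), and Best Close Match additionally returns "no match" when the closest sequence lies beyond a threshold (3% here).

```python
def classify_matches(labels, dist, threshold=None):
    """Classify every sequence by its closest match among the others.

    threshold=None gives a Best Match style result; a numeric threshold
    (same units as dist) gives a Best Close Match style result.
    """
    n = len(labels)
    outcomes = []
    for q in range(n):
        candidates = [j for j in range(n) if j != q]
        closest = min(dist[q][j] for j in candidates)
        if threshold is not None and closest > threshold:
            outcomes.append("no match")
            continue
        hit_species = {labels[j] for j in candidates if dist[q][j] == closest}
        if len(hit_species) > 1:
            outcomes.append("ambiguous")  # equally close hits from several species
        elif hit_species == {labels[q]}:
            outcomes.append("correct")
        else:
            outcomes.append("incorrect")
    return outcomes


# Hypothetical toy data: species C is a singleton, far from everything else.
toy_labels = ["A", "A", "B", "B", "C"]
toy_dist = [
    [0.0, 0.5, 4.0, 4.2, 6.0],
    [0.5, 0.0, 3.8, 4.1, 6.1],
    [4.0, 3.8, 0.0, 1.0, 5.0],
    [4.2, 4.1, 1.0, 0.0, 5.5],
    [6.0, 6.1, 5.0, 5.5, 0.0],
]
best_match = classify_matches(toy_labels, toy_dist)
best_close_match = classify_matches(toy_labels, toy_dist, threshold=3.0)
```

A singleton has no conspecific sequence to match, so in this sketch it can never be "correct"; under Best Match it is scored "incorrect", and under Best Close Match it becomes "no match" once its closest sequence exceeds the threshold.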
Overall, 59 species of black flies were used in trees inferred by maximum likelihood (ML): 52 species from Malaysia, eight reference species from Thailand, and two from Vietnam. All species were members of one of three subgenera: Gomphostilbia (seven species groups), Nevermannia (two species groups), and Simulium (nine species groups).
For species delimitation, 57 and 58 operational taxonomic units (OTUs) were recognised in ASAP and GMYC, respectively. The results were slightly different when compared against species groups (Table
Number of species and operational taxonomic units (OTUs), using two different species delimitation methods, ASAP & GMYC, according to species group in the genus Simulium
Subgenus | Species group | Morphology | ASAP | GMYC |
Nevermannia | Aureohirtum | 1 | 2 | 2 |
Feuerborni | 3 | 3 | 3 | |
Total | 4 | 5 | 5 | |
Gomphostilbia | Asakoae | 7 | 5 | 5 |
Batoense | 8 | 11 | 10 | |
Ceylonicum | 3 | 3 | 3 | |
Darjeelingense | 1 | 1 | 1 | |
Epistum | 7 | 8 | 8 | |
Gombakense | 3 | 3 | 3 | |
Varicorne | 1 | 1 | 1 | |
Total | 30 | 32 | 31 | |
Simulium | Argentipes | 3 | 3 | 3 |
Grossifilum | 1 | 1 | 1 | |
Melanopus | 7 | 6 | 6 | |
Multistriatum | 3 | 2 | 3 | |
Nitidithorax | 1 | 1 | 1 | |
Nobile | 2 | 1 | 1 | |
Striatum | 2 | 1 | 2 | |
Tuberosum | 3 | 3 | 3 | |
Variegatum | 2 | 2 | 2 | |
Total | 24 | 20 | 22 | |
Overall | 58 | 57 | 58 |
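As a quick cross-check of the table, the counts can be transcribed into a small Python structure to recompute the column totals and list the species groups in which either delimitation method departs from the morphological count. The variable and function names are illustrative; the numbers are copied from the table above.

```python
# (morphology, ASAP, GMYC) counts per species group, transcribed from the table.
OTU_COUNTS = {
    "Aureohirtum": (1, 2, 2), "Feuerborni": (3, 3, 3),
    "Asakoae": (7, 5, 5), "Batoense": (8, 11, 10), "Ceylonicum": (3, 3, 3),
    "Darjeelingense": (1, 1, 1), "Epistum": (7, 8, 8), "Gombakense": (3, 3, 3),
    "Varicorne": (1, 1, 1),
    "Argentipes": (3, 3, 3), "Grossifilum": (1, 1, 1), "Melanopus": (7, 6, 6),
    "Multistriatum": (3, 2, 3), "Nitidithorax": (1, 1, 1), "Nobile": (2, 1, 1),
    "Striatum": (2, 1, 2), "Tuberosum": (3, 3, 3), "Variegatum": (2, 2, 2),
}


def discordant_groups(counts):
    """Return the groups where ASAP or GMYC differs from morphology."""
    return {
        group: c for group, c in counts.items()
        if c[1] != c[0] or c[2] != c[0]
    }


# Column totals for morphology, ASAP, and GMYC.
totals = tuple(sum(c[i] for c in OTU_COUNTS.values()) for i in range(3))
discordant = discordant_groups(OTU_COUNTS)
```

The recomputed totals agree with the "Overall" row, and eight of the 18 species groups show a difference between morphology and at least one delimitation method.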
For the subgenus Nevermannia, S. ledangense and S. pairoti were recognised as the same entity, whereas S. borneoense consisted of two entities (Fig.
Maximum likelihood tree for the subgenus Nevermannia. Full view of the tree is in the top left corner. The bootstrap value and Bayesian Inference (BI) are shown on the branches. Sequences generated from this study are in bold. Vertical bars on the right are the result of species delimitation, with the species groups indicated to the right.
There were 24 nominal species in the subgenus Simulium (Fig.
Maximum likelihood tree of subgenus Simulium. Full view of the tree is in the top left corner. The bootstrap value and Bayesian Inference (BI) are shown on the branches. Sequences generated from this study are in bold. Vertical bars on the right are the result of species delimitation, with the species group indicated to the right. Species with an asterisk (*) came from a different species group: Simulium kiuliense belongs to the S. nobile species group but was assigned to the S. melanopus species group by species delimitation.
Thirty morphologically identified nominal species of the subgenus Gomphostilbia, including two species from Thailand (S. parahiyangum and S. decuplum) and one species from Vietnam (S. yvonneae), represented the S. batoense species group (Fig.
Maximum likelihood tree of subgenus Gomphostilbia. Full view of the tree is in the top left corner. The bootstrap value and Bayesian Inference (BI) are shown on the branches. Sequences generated from this study are in bold. Vertical bars on the right are the result of species delimitation, with the species group indicated to the right.
In the second clade, S. decuplum from Malaysia and Thailand was distinguished as two distinct entities. DNA barcode sequences of S. pegalanense are reported for the first time and the species was placed in two different lineages and considered two distinct entities. The third clade of the S. batoense species group consisted only of S. terengganuense, recognized by ASAP and GMYC as two and one entities, respectively. Seven species in the S. asakoae species group were clustered together, with S. asakoae, S. brinchangense, and S. tanahrataense forming monophyletic clades with strong support. Simulium izuae and S. roslihashimi formed a paraphyletic group and were recognised as a single entity by species delimitation; the same occurred with S. sofiani and S. lurauense. ASAP and GMYC were both consistent with morphological identification, with three species each in the S. ceylonicum and S. gombakense species groups. Seven nominal species comprised the S. epistum species group, which was split into three clades. The first clade consisted only of S. angulistylum, which was positioned near the subgenus Simulium. The second clade consisted of four nominal species: Simulium barioense (two OTUs), S. kelabitense (one OTU), S. auratum (two OTUs), and S. sarawakense (one OTU). Simulium whartoni and S. cheongi were a single entity sharing the same strong bootstrap and Bayesian support in the third clade.
An increasing demand for accurate and timely species identification, and rapid advances in genetic methodology, have spurred progress in establishing molecular identification tools for black flies. By using short standardized DNA markers from mitochondrial COI, molecular species identification is becoming standard practice in many regions worldwide. The increasing importance of black flies in Malaysia makes the development of comprehensive and reliable species databases essential for accurate identification and comparison of regional faunas. A total of 52 species of black flies from Malaysia are represented in our study, accounting for more than 50% of the total species recorded in the country (
The distance-based approach successfully identified almost 90% of the species, in parallel with morphological identification, except for several species considered ambiguous or misidentified due to overlapping genetic distance with the nearest-neighbour species and high intraspecific distance. Despite the small number of samples per species, the genetic divergence overlap in this study (8.38%) is less than in the comprehensive study of black flies in Thailand (12.6%) (
The barcode gap is the difference between intra- and interspecific genetic distances for each species (
A genetic distance for black fly species of more than 3% suggests the presence of cryptic species (
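The 3% heuristic can be applied mechanically to the maximum intraspecific distances reported in the table of Malaysian species. The Python sketch below uses a subset of those values (percent); the names are illustrative, and a flagged species is only a candidate for closer study, not a confirmed cryptic complex.

```python
THRESHOLD_PERCENT = 3.0

# Maximum intraspecific distance (%) for a subset of species,
# transcribed from the table of Malaysian black flies.
MAX_INTRASPECIFIC = {
    "S. aureohirtum": 1.02, "S. borneoense": 4.21, "S. ledangense": 0.25,
    "S. pairoti": 1.54, "S. auratum": 8.38, "S. barioense": 3.94,
    "S. cheongi": 2.59, "S. pegalanense": 7.24, "S. sarawakense": 3.67,
    "S. terengganuense": 3.11, "S. malayense": 3.38, "S. vanluni": 2.59,
}


def flag_deep_divergence(max_intra, threshold=THRESHOLD_PERCENT):
    """Return (species, distance) pairs above the threshold, largest first."""
    flagged = [(sp, d) for sp, d in max_intra.items() if d > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


flagged = flag_deep_divergence(MAX_INTRASPECIFIC)
flagged_names = [sp for sp, _ in flagged]
```

For this subset, seven species exceed 3%, led by S. auratum (8.38%) and S. pegalanense (7.24%).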
In the subgenus Nevermannia, S. aureohirtum in Malaysia shows high divergence (> 7%) from populations in Thailand. Species delimitation recovers the two populations as different entities. This generalist species has a wide geographic distribution, which could affect the species genetically. Chromosomal and molecular studies show differences among populations (
In the subgenus Simulium, S. kiuliense of the S. nobile species group is in the same lineage as species of the S. melanopus species group (S. mirum and S. murudense). Species delimitation also indicates that they are a single entity. Simulium kiuliense was revalidated after re-examination of S. nobile s. l., which revealed morphological differences and a large molecular distinction between populations in Java and mainland Asia along with a third species, S. vanluni (
Subgenus Gomphostilbia is the largest Simuliidae taxon in Malaysia, consisting of 50 species. In total, 58% of the Gomphostilbia species known from Malaysia were available for our study. In the S. batoense species group, S. parahiyangum and S. sazalyi are in the same lineage and are considered one entity due to incomplete lineage sorting (
Overall, 22% of the species in our study are non-monophyletic and almost half of their sequences are ambiguous at the species level. This finding is consistent with that of
In summary, we report DNA barcodes for black flies in Malaysia, with high (nearly 90%) accuracy for species identification. Increasing the number of sequences per species deposited in DNA barcode databases, particularly when based on correct species identifications, will enhance possible applications, such as monitoring vectors and other species of public health importance.
Noor Izwan-Anas, Van Lun Low and Zubaidah Ya’cob designed research and analyzed the data.
Zubaidah Ya’cob, Mohamad Rasul Abdullah Halim, Van Lun Low, Mohd Sofian-Azirun and Hiroyuki Takaoka collected the samples in the field.
Noor Izwan-Anas, Emmanuel Yogan Lourdes, Van Lun Low and Zubaidah Ya’cob performed the research.
Noor Izwan-Anas, Van Lun Low, Zubaidah Ya’cob, Hiroyuki Takaoka and Peter H. Adler wrote the paper.
We extend our gratitude to Prof. Dr. Sazaly Abu Bakar, the Director of the Higher Institution Centre of Excellence, Tropical Infectious Diseases Research & Education Centre (TIDREC), University of Malaya, for his dedicated support. This study was financially supported by the Fundamental Research Grant Scheme (Ref code: FRGS/1/2019/STG03/UM/02/15) (UM Ref. code: FP024-2019A), Malaysia Ministry of Higher Education for Higher Institution Centre of Excellence (Ref. code: MO002-2019) and Ministry of Environment, Government of Japan under GBIF Biodiversity Information Fund for Asia (BIFA) program (BIFA6_017). The work by P.H.A. was supported by NIFA/USDA under project number SC-1700596 and is Technical Contribution No. 7151 of the Clemson University Experiment Station. The authors declare that no competing interests exist.