Genetic diversity and differentiation of populations of Chlorops oryzae (Diptera, Chloropidae)

Background Chlorops oryzae is an important pest of rice crops. Outbreaks of this pest have been frequent in recent years, and it has become the main rice pest in some regions. To elucidate the molecular mechanism of frequent C. oryzae outbreaks, we estimated the genetic diversity and genetic differentiation of 20 geographical populations based on a dataset of ISSR markers and COI sequences. Results ISSR data revealed a high level of genetic diversity among the 20 populations, as measured by Shannon’s information index (I), Nei’s gene diversity (H), and the percentage of polymorphic bands (PPB). The mean coefficient of gene differentiation (Gst) was 0.0997, which indicates that only 9.97% of the genetic variation is between populations. The estimated gene flow (Nm) value was 4.5165, indicating a high level of gene flow and low, or medium, genetic differentiation among some populations. A Mantel test revealed no significant correlation between genetic and geographic distance among populations, so there is no evidence of significant genetic isolation by distance. A UPGMA (unweighted pair-group method with arithmetic averages) dendrogram based on genetic identity did not indicate any major geographic structure for the 20 populations examined. mtDNA COI data indicate low nucleotide diversity (0.0007) and low haplotype diversity (0.36) across all populations. Fst values suggest that the 20 populations have low, or medium, levels of genetic differentiation, and the topology of a Neighbor-Joining tree suggests that there are no independent groups among the populations examined. Conclusions Our results suggest that C. oryzae populations have high genetic diversity at the species level. There is evidence of frequent gene flow and low, or medium, levels of genetic differentiation among some populations. There is no significant correlation between genetic and geographic distance among C. oryzae populations, and therefore no significant isolation by distance.
All results are consistent with frequent gene exchange between populations, which could increase the genetic diversity, and hence, adaptability of C. oryzae, thereby promoting frequent outbreaks of this pest. Such knowledge may provide a scientific basis for predicting future outbreaks.

Research on this species has focused on its physiology and ecology [1][2][3], and its genetics remains relatively unstudied [4].
Frequent C. oryzae outbreaks in recent years have caused the species to become a major pest in some regions. The propensity for outbreaks may itself play an important role in homogenizing genetic variation and intensifying gene flow between pest populations [5,6]. We hypothesized that frequent gene flow between populations enhances the species' overall adaptability, promoting the frequent outbreaks that occur today. In other words, the frequent outbreaks of C. oryzae are associated with the species' genetic diversity, population demography and high rate of gene flow between populations.
To test this hypothesis, we evaluated the genetic structure of different geographic populations of C. oryzae and the level of gene flow between them. We quantified the genetic diversity and degrees of genetic differentiation of 20 different geographical populations, which may provide a scientific basis for predicting future outbreaks.
We used two effective and promising DNA markers, mitochondrial DNA (mtDNA) and inter-simple sequence repeats (ISSR), to examine between-population differences. Studies of genetic variation between pest populations can not only provide information on population structure in different geographical regions, but also help to infer the demographic history of a species [7][8][9]. Yi et al. used microsatellite and mtDNA loci to investigate the genetic divergence and dispersal ability of Bactrocera dorsalis (Hendel) on six offshore islands in South China. Their results indicated that these populations have high genetic diversity, frequent gene flow and low, or medium, levels of genetic differentiation. Thus, the geographic isolation of the six islands is no barrier to the dispersal of B. dorsalis [10]. Research on the genetic diversity and population structure of Leucinodes orbonalis collected from a variety of agro-climatic conditions found almost no genetic diversity and no significant genetic variation among the mitochondrial Cytochrome Oxidase I (COI) gene sequences of the populations examined. However, a few genetically distinct populations were associated with specific habitat requirements [11]. Similarly, genetic differentiation in ISSR markers and the COI gene among Iranian populations of Hishimonus phycitis may have been induced by geographical and ecological isolation and may affect the vectoring capability of this insect [12].
Our results not only provide information on the genetic structure and phylogeography of C. oryzae, but also provide a potential scientific basis for monitoring and controlling this pest.

COI gene analysis
Genetic diversity and differentiation
In total, 432 individuals collected from different locations were used to amplify 684 bp of the COI gene sequence, which defined 47 haplotypes. Together, the 47 haplotypes contained 43 variable sites, including 26 singleton variable sites and 17 parsimony-informative sites. The mean nucleotide frequencies of A, T, C and G in the sequences from the 20 populations were 29.98%, 36.56%, 16.95% and 16.51%, respectively, which shows an obvious AT bias (66.54%). The transition/transversion rate ratio was higher for purines (36.158) than for pyrimidines (19.381). The overall transition/transversion bias was 12.022.
Genetic diversity parameters of the 20 populations and the results of neutrality tests are shown in Table 1. Haplotype diversity (Hd) for each population ranged from 0 to 0.71739 and the average number of nucleotide differences (k) ranged from 0 to 1.35145. Nucleotide diversity (π) for each population ranged from 0 to 0.00198. Tajima's D and Fu's Fs were negative for 19 populations. When all samples were treated as one population, Tajima's D and Fu's Fs were negative and significant at the 0.1% level, which is strong evidence of population expansion.
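As a rough illustration of how the indices reported above are defined, the following sketch computes haplotype diversity (Hd), the average number of pairwise differences (k) and nucleotide diversity (π) from a small aligned sample. It assumes the standard definitions, Hd = n/(n−1) · (1 − Σp_i²) and π = k / alignment length. The function names and the toy alignment are hypothetical and are not taken from the study's data or from DnaSP.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(sequences)
    if n < 2:
        return 0.0
    counts = Counter(sequences)  # each distinct sequence is one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_pairwise_differences(sequences):
    """Average number of nucleotide differences (k) over all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(sequences):
    """Nucleotide diversity: pi = k / alignment length (equal-length sequences)."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


# Toy alignment (hypothetical): four individuals, three haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(haplotype_diversity(toy))        # Hd
print(mean_pairwise_differences(toy))  # k
print(nucleotide_diversity(toy))       # pi
```

A population in which every individual carries the same haplotype gives Hd = 0 and π = 0, which corresponds to the lower bound of the ranges reported in Table 1.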
Based on our sequence database, inter- and intra-population genetic distances of C. oryzae were 0.00012-0.00184 and 0-0.00198, respectively (Table 2), which indicates no significant genetic differentiation. The Fst values between populations ranged from −0.0595 to 0.1174 (Table 3), indicating that inter-population differences are relatively low. AMOVA results suggest that 97.28% of all genetic variation is within, and only 2.72% between, populations (Table 4).
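Pairwise Fst estimates can be slightly negative, as at the lower end of the range above. The sketch below shows one way this arises, using the sequence-based estimator of Hudson, Slatkin and Maddison (1992), Fst = 1 − Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb the mean number between populations. This estimator is an assumption chosen for illustration and is not necessarily the one used to produce Table 3. The function names and toy sequences are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def hudson_fst(pop1, pop2):
    """Fst = 1 - Hw/Hb (Hudson et al. 1992); returns 0.0 if Hb is zero."""
    hw = (mean_within(pop1) + mean_within(pop2)) / 2
    hb = sum(hamming(a, b) for a, b in product(pop1, pop2)) / (len(pop1) * len(pop2))
    if hb == 0:
        return 0.0
    return 1 - hw / hb


# Two populations sharing the same haplotypes: within-population
# differences exceed between-population differences, so Fst < 0.
shared = hudson_fst(["AAAA", "AAAT"], ["AAAA", "AAAT"])

# Two populations fixed for different haplotypes: Fst = 1.
fixed = hudson_fst(["AAAA", "AAAA"], ["TTTT", "TTTT"])
print(shared, fixed)
```

Negative estimates of this kind are usually read as no detectable differentiation between the two populations.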

Haplotype network and population tree
Evolutionary relationships among the haplotypes were depicted using the median-joining network method (Fig. 1). Among the 47 haplotypes, H1 was the most frequent; it occupied a central position in the network, from which the other 46 haplotypes radiated. Notably, H23 and H45 could each be derived from two different haplotypes by a single mutational step. The topology of the C. oryzae population Neighbor-Joining tree suggested that there were no independent groups among the populations (Fig. 2). The NX population was the most distant in both the haplotype network and the population tree, although there are no distinctive landscape features around the NX location. What prevents the emigration of individuals from this population is still unknown.

Genetic differentiation
The Gst was 0.0997, which indicates that 90.03% of the genetic variation is within populations and only 9.97% is between populations. The estimated Nm value was 4.5165 (Table 6). These results suggest that genetic differentiation among populations of C. oryzae is impeded by high gene flow. Table 7 lists the genetic identity (above diagonal) and genetic distance (below diagonal) among populations.
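The link between Gst and Nm can be illustrated with a short calculation. The sketch below assumes the commonly used relationships Gst = (Ht − Hs)/Ht and Nm = 0.5(1 − Gst)/Gst. The text does not state which formula the analysis software applied, so this is an assumption, and the function names are hypothetical.

```python
def gst(ht, hs):
    """Coefficient of gene differentiation from total (Ht) and
    within-population (Hs) gene diversity: Gst = (Ht - Hs) / Ht."""
    return (ht - hs) / ht


def nm_from_gst(g):
    """Indirect gene flow estimate, assuming Nm = 0.5 * (1 - Gst) / Gst."""
    return 0.5 * (1.0 - g) / g


# Using the rounded Gst reported in the text.
nm = nm_from_gst(0.0997)
print(round(nm, 3))  # about 4.515
```

With the rounded Gst of 0.0997 this gives roughly 4.515, close to the reported 4.5165. The small gap is what one would expect if the reported Nm was derived from an unrounded Gst. By convention, values of Nm above 1 are taken to mean that gene flow is strong enough to counteract differentiation by genetic drift.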
The relationship between genetic and geographic distance is shown in Fig. 3. A Mantel test revealed no significant correlation between genetic and geographic distance (r = 0.54675, p = 0.9992) among C. oryzae populations, and there was therefore no evidence of significant isolation by distance.
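For readers unfamiliar with the Mantel test, the following is a minimal permutation sketch. It correlates the upper triangles of a genetic and a geographic distance matrix, then permutes the population labels of one matrix to build a null distribution for a one-sided test. The function names, the number of permutations and the toy matrices are assumptions for illustration; the software and settings used in the study are not reproduced here.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mantel(gen, geo, permutations=999, seed=0):
    """One-sided Mantel test: returns (r, p) for H1 'r is greater than expected'."""
    rng = random.Random(seed)
    n = len(gen)
    geo_vec = upper_triangle(geo)
    r_obs = pearson(upper_triangle(gen), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns of the genetic matrix together.
        permuted = [[gen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy data (hypothetical): six sites on a line, genetic distance
# exactly proportional to geographic distance.
geo = [[abs(i - j) for j in range(6)] for i in range(6)]
gen = [[2 * d for d in row] for row in geo]
r, p = mantel(gen, geo)
print(r, p)
```

In this toy case the two matrices are perfectly correlated, so the test detects strong isolation by distance. A pattern with no isolation by distance would instead give a p-value well above the chosen significance threshold.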
A UPGMA dendrogram constructed from genetic identity (Fig. 4) grouped the 20 populations into two major clusters. The dendrogram did not reveal any major geographic structure for these populations.

Discussion
The rapid development of molecular techniques has made it possible to directly measure genetic differentiation and genetic diversity among populations [13]. With its relatively fast mutation rate and conserved sequence, the mtDNA COI gene is ideally suited to species identification via DNA barcoding [14]. Analysis of mtCOI gene variation is regarded as an important and reliable tool for defining cryptic species [15], evaluating biodiversity [16], identifying samples [17], and distinguishing closely related species [18].
Various surveys have demonstrated the reliability of ISSR markers, which can generate more polymorphisms than either RAPD or RFLP [19]. For example, Dioscorea hispida accessions were grouped into 10 vital groups based on information provided by ISSR markers, demonstrating significant variation among germplasm specimens. D. hispida shows a high level of genetic diversity among accessions, which suggests that ISSR markers have been very effective in detecting polymorphism in this species [20]. ISSR primers have been used to determine the potential for the diversification of cassava crops in Angola, revealing genetic diversity within populations and genetic information sharing among the three main taxa [21]. The ISSR molecular marker technique has also been used to distinguish between citrus rootstock species, and to reveal the broad genetic base and high genetic variability among them [22].
Table 2 Genetic distances between (below diagonal) and within C. oryzae populations (on diagonal) based on COI sequences

Genetic diversity
Genetic diversity, also known as gene diversity, is the foundation of biodiversity that guarantees the evolution [10]. High levels of genetic diversity are indicative of the strong viability and adaptability of populations [23]. Results from this study indicate that the genetic diversity of C. oryzae was high among the sampled populations. This may increase the fitness of populations under changing conditions. Crawford and Whitney [24] also showed that genetic diversity increases the ability of species to colonize on a short-term ecological timescale by increasing the likelihood of population survival, growth and reproduction in novel environments. European populations of Episyrphus balteatus and Sphaerophoria scripta adapt successfully to changing environmental conditions and have great colonization abilities owing to their high genetic diversity [25]. The Tajima

Genetic differentiation
Fst values between C. oryzae populations ranged from −0.0595 to 0.1174, and the Gst value was 0.0997, both of which are indicative of low genetic differentiation between populations. This suggests frequent gene flow between populations, which may increase the species' adaptability to environmental change [26]. Moreover, gene flow not only reflects the probable genetic differentiation and genetic infiltration among populations, but also reduces the genetic differences among them [27]. In this research, the Nm value of C. oryzae was 4.5165, which indicates high gene flow and low, or medium, genetic differentiation among some populations. High gene flow may impede genetic differentiation in C. oryzae. Gene flow (or the lack of it) plays a crucial role in genetic differentiation and affects both the overall adaptation of a species and adaptive divergence between populations [28]. It has traditionally been considered a homogenizing force that limits adaptive differences [29,30], but recent studies have shown that it can also promote adaptation to local environmental conditions [31,32]. For example, moderate gene flow increases the capacity of Rhagoletis cerasi populations (which occupy different habitats in fragmented landscapes) to adapt to local habitats, thus preventing them from becoming extinct due to genetic processes [33,34]. An AMOVA based on Fst values indicates that most of the genetic variation resulted from differences within populations. Furthermore, a Mantel test indicates no significant correlation between genetic and geographic distance. This result is consistent with the findings of Yang et al. [35], who tested the correlation between symmetric matrices of geographic and genetic distances to assess isolation among populations of Odontotermes formosanus in different regions.
These authors found no significant correlation between geographic distance and genetic distance and no significant isolation by distance. Overall, we found a high level of genetic diversity and a low degree of differentiation among populations of C. oryzae, and gene flow was unaffected by geographic distance. Similarly, geographic distance did not appear to affect gene flow between 10 geographically separated populations of Oedaleus infernalis [36]. Fst and Gst values for these populations are low and gene flow is high, indicating a low level of genetic differentiation among populations [36]. The correlation between genetic and geographic distance was not significant [36]. Our results provide important new information on the genetic diversity and genetic differentiation of C. oryzae, and suggest that high gene flow between populations contributes to the now frequent outbreaks of this pest. However, further research on both additional geographical populations and different genetic markers is necessary before definitive conclusions can be reached. Furthermore, future work could focus on more comprehensive ecological and behavioural research to understand the natural history of C. oryzae in greater detail.
Table 7 Nei's genetic identity (above diagonal) and genetic distance (below diagonal) among C. oryzae populations

Conclusions
This study showed that the now frequent outbreaks of C. oryzae may be due to high gene flow between populations. We found that these populations have high genetic diversity at the species level but exhibit low genetic differentiation. High genetic diversity and frequent gene flow between populations may enhance the tolerance of populations to environmental variability and increase their adaptability to novel environmental pressures, leading to the frequent large-scale outbreaks that have occurred and that may occur in the future.

Sample collection and DNA extraction
A total of 400 specimens of C. oryzae were collected from different parts of Hunan province, China, and an additional 32 specimens from Zhejiang and Guizhou provinces (Fig. 5, Additional file 1: Table S1). Samples were soaked in 100% ethanol and stored at −20 °C until their genomic DNA (gDNA) was isolated. After removing the residual

COI PCR amplification and sequencing
The COI gene was amplified from isolated DNA with the primers COIF (5′-CTA GGT GCT CCA GAT ATA GCA TTT C-3′) and COIR (5′-GGC TAA AAC AAC TCC TGT TAA TCC-3′). PCR was performed in 20 μL volumes comprising 10 μL PrimeSTAR Max DNA Polymerase (TaKaRa, Tokyo, Japan), 1 μL of each primer (10 mmol/L), 1 μL of template DNA solution (70 ng/μL), and 7 μL double-distilled water. Amplifications were conducted as follows: 34 cycles of denaturation at 94 °C for 30 s, annealing at 59 °C for 30 s, and extension at 72 °C for 1 min. All PCR products were checked by electrophoresis on a 1.2% agarose gel, and bidirectional sequencing was completed by TSINGKE (Beijing, China).

ISSR PCR amplification
A total of sixteen primers from the University of British Columbia Biotechnology Laboratory Primer kit No.9 were tested for PCR, and the nine (Additional file 1: Table S2) that produced reproducible, clear, polymorphic electrophoretic bands were chosen for further analysis. PCR was performed in 20 μL volumes comprising 10 μL Premix Taq™ (Ex Taq™ Version 2.0 plus dye) (TaKaRa, Tokyo, Japan), 2 μL of primer (10 mmol/L), 1 μL of template DNA solution (70 ng/μL), and 7 μL double-distilled water. Amplifications were carried out as follows: an initial denaturation at 94 °C for 3 min, followed by 34 cycles of denaturation at 94 °C for 30 s, annealing at an optimized temperature for 30 s, and extension at 72 °C for 1 min, with a final extension at 72 °C for 7 min. All PCR products were electrophoretically separated on 2% agarose.

Data analysis
COI gene data analysis
COI sequences were edited manually with BioEdit v.7.0.9 to produce consensus sequences of 685 bp for each specimen [37]. All indices for sequence polymorphic sites, DNA polymorphism, genetic differentiation, neutrality tests (Tajima's D [38] and Fu's Fs [39]), and haplotype analyses were computed using DnaSP v.5.10 [40]. A haplotype network, which included haplotype frequencies, was calculated using Network v.4.6 [41]. Intra- and inter-population genetic distances and transition/transversion ratios in each codon were computed based on COI gene sequences using MEGA v.7.0 [42]. A population phylogenetic tree based on genetic distances was constructed using the Neighbor-Joining method in MEGA v.7.0. Analysis of molecular variance (AMOVA) was performed with Arlequin software v.3.5.2 [43].
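To make the neutrality test more concrete, the sketch below computes Tajima's D from three summary values: the sample size n, the number of segregating sites S, and the average number of pairwise differences k. It follows the standard constants of Tajima's statistic. The function name and the input values are hypothetical and are not taken from Table 1.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and mean
    pairwise differences k. Requires n >= 4 and s > 0."""
    if n < 4 or s <= 0:
        raise ValueError("Tajima's D needs n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical sample: 10 sequences, 16 segregating sites, k = 3.888889.
d_value = tajimas_d(10, 16, 3.888889)
print(round(d_value, 3))
```

A negative D means that k is smaller than expected from S under neutrality and constant population size, that is, an excess of rare variants. This is the pattern usually interpreted as a signal of population expansion or purifying selection.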

ISSR data analysis
Amplified ISSR fragments were scored as present (1) or absent (0) according to their molecular weight (bp), and the resulting matrix of binary values was used for further analyses. The observed number of alleles (Na), effective number of alleles (Ne), Nei's gene diversity (H), Shannon's information index (I), the percentage of polymorphic bands (PPB), total gene diversity (Ht), genetic diversity within populations (Hs), coefficient of gene differentiation (Gst), and gene flow (Nm) were calculated using POPGENE v.1.31 [44]. Cluster analysis was used to construct dendrograms using the UPGMA (unweighted pair-group