Geographical patterning of sixteen goat breeds from Italy, Albania and Greece assessed by Single Nucleotide Polymorphisms

Background: SNP data of goats of three Mediterranean countries were used for population studies and reconstruction of geographical patterning. 496 individuals belonging to Italian, Albanian and Greek breeds were genotyped to assess the basic population parameters.
Results: A total of 26 SNPs were used, for a total of 12,896 genotypes assayed. Statistical analysis revealed that breeds are not so similar in terms of genetic variability, as reported in studies performed using different markers. The Mantel test showed a strongly significant correlation between genetic and geographic distance. Also, PCA analysis revealed that breeds are grouped according to geographical origin, with the exception of the Greek Skopelos breed.
Conclusion: Our data point out that the use of SNP markers to analyze a wider breed sample could help in understanding the recent evolutionary history of domestic goats. We found correlation between genetic diversity and geographic distance. Also PCA analysis shows that the breeds are well differentiated, with good correspondence to geographical locations, thus confirming the correlation between geographical and genetic distances. This suggests that migration history of the species played a pivotal role in the present-day structure of the breeds and a scenario in which coastal routes were easier for migrating in comparison with inland routes. A westward coastal route to Italy through Greece could have led to gene flow along the Northern Mediterranean.

Even now, goats are of great importance in many developing countries, to exploit marginal agricultural resources, and in developed countries, for the production of high quality products and the achievement of sustainable development of rural areas. However, breeding programs and selection schemes in goats are less advanced than in other livestock.
Archaeological evidence indicates a probable migration of the Neolithic farmers out of the Near East and across Europe following two main routes, through the continental heartland up the Danube valley or along the Mediterranean coast [4][5][6][7] crossing the sea to the major islands. Archaeological data and radiocarbon dates on seeds or bones provide support for an earlier arrival in western Europe via the Mediterranean route rather than the "Danubian" route [8]. In addition, genetic data revealed a weaker degree of phylogeographic structuring in domestic goats compared to other livestock species [9,10], which probably results from high gene flow at the intercontinental level, suggesting that goats have been extensively transported [2,11]. This led to initial settlement in the Balkans and southern Italy [12]. A decrease of genetic diversity likely occurred during this colonization process in Europe [13].
The Mediterranean Sea also had a key role in the history of livestock in post-Neolithic times, when civilisations such as the Phoenicians, Greeks, Romans and Berbers, arriving by sea, probably introduced new species of animals and new breeds of livestock in southwest Europe. Some colonists may have improved local livestock by importing stock from overseas [14,15], explaining the unexpectedly high diversity in breeds of domestic goats [16], the differential cattle migration along the Mediterranean coast [14] and the close genetic relationship between Tuscan cattle breeds and Near Eastern breeds [17]. The role of the Mediterranean Sea as a natural corridor connecting the Italian peninsula to the Near East, North Africa, and southern Europe is particularly plausible for domestic sheep and goats, species adaptable to various environments and easy to transport [15].
A relative lack of breed standardization, herdbook breeding, parentage control and rigorous management may have facilitated gene flow between geographically nearby breeds in south-eastern Europe [18]. Gene flow may also have a historical background as goats were actively traded all over the Mediterranean basin during the Phoenician, Greek and Roman periods. However, in some circumstances, gene flow is limited by distance and local management, which reduces the effective population size as a result of genetic isolation.
So far, studies on genetic diversity of goats have focused on Swiss [19] and Asian breeds [20][21][22][23][24][25][26]. Only a few concerned Mediterranean breeds [27][28][29], and all of these employed mitochondrial markers. A wide-range analysis of goat diversity in Europe and the Middle East has been conducted using microsatellites [18]. Microsatellites, the most common markers used today to estimate neutral genetic diversity [30,31], present disadvantages such as null alleles, difficulty in allele calling and size homoplasy [32]. Single nucleotide polymorphisms (SNPs) are useful markers for estimating parameters such as population history and for inferring relationships [33,34], and they could potentially become the marker of choice in ecological and conservation studies [35,36]. The use of these markers is also promoted by the rapid development of genotyping techniques [32,37,38].
We applied SNP genotyping to the study of goat breeds from Italy, Albania and Greece to analyse genetic diversity of goats in these areas.

Population Genetics
A total of 26 SNPs identified as polymorphic on a panel of European goat breeds [39] were used to genotype 496 individuals belonging to 6 Albanian, 2 Greek and 8 Italian breeds, for a total of 12,896 genotypes assayed. Loci were analysed to identify SNPs under selection in a separate study [40]. Analyses were then performed excluding the three SNPs in two loci identified as outliers, although using all 26 SNPs did not lead to substantial differences in the results.
Expected heterozygosity of the loci ranged from 0.016 (FABP4) to 0.495 (GHR), with a mean of 0.300. FIS per population ranged from 0.017 (Liquenasi) to 0.197 (Capore) (Table 1). Observed heterozygosity of the loci determined from SNP frequencies (Table 2) ranged from 0.012 (FABP4) to 0.463 (GHR), with a mean of 0.272. The frequencies of the major alleles are reported in Table 2 and ranged from 0.508 for the locus IL4 to 0.992 for the locus FABP4. Except for IL2_1 and FABP4, which showed rare-allele frequencies of 0.017 and 0.008, respectively, all SNPs have a rare-allele frequency greater than 5%, as observed for the same loci on a different panel of European breeds [39]. Mean observed and expected heterozygosity also showed values similar to those reported in the same paper. The frequencies of the major alleles per SNP and per population are presented in Additional file 1. Some populations presented fixed alleles at a number of SNPs. In particular, Skopelos and Girgentana showed the highest number of fixed alleles (5). SNPs FABP4 and IL2_1 are fixed in 12 and 8 populations, respectively (Additional file 1).
Significant deviations from Hardy-Weinberg equilibrium over all populations (p-value < 0.005) were observed in three loci. The Mantel test showed a strongly significant correlation between genetic and geographic distances (r = 0.40, p-value < 0.001, over 1000 permutations).

PCA analysis
Genetic relationships were also explored by means of principal component analysis. The coefficients of the linear combinations reveal which SNPs most affect the component value. On the first component, SNP IL4 has the most extreme positive coefficient and SNP LGB the most extreme negative one. Likewise, the second component is mostly affected by the SNPs ACVR2B and MTNR1A, with positive sign, and by the SNPs HLA-DQA_2, IL4 and LGB, with negative sign. To examine the overall pattern of population differentiation, breeds were projected onto the first two axes, which cumulatively explained 52% of the total inertia contained in the data set (Figure 1). Breeds are grouped according to geographical origin, with the exception of the Greek Skopelos breed.

Genetic distance
Distance-based phylogenetic analysis was used to describe the relationships between breeds.

Discussion
Archaeological evidence shows that the colonization of Europe after the initial domestication events in the Fertile Crescent followed two main routes: the Mediterranean route and the Danubian route. Cañon et al. [18], using microsatellites, report a decrease in genetic diversity as well as an increase in the level of differentiation at the breed level from south-east to north-west in European goat breeds, supporting the hypothesis of migration of domestic livestock from the Middle East towards western and northern Europe.
Our results indicate that a highly significant correlation between genetic and geographic distance exists. The presence of a geographic component in genetic diversity was already reported in breeds of Northern and Southern Italy in a previous study using SNPs [43] and it is confirmed here in a larger area. Such a geographic component is generally not observed when using mitochondrial markers. As reported by Luikart et al. [11], only 10% of the variance assessed by mtDNA is partitioned among continents. This could be due to the nature of the markers used for the analysis, as suggested by Naderi et al. [44]. In fact, mtDNA informativeness is limited because it does not detect male-mediated gene flow and does not predict nuclear genomic diversity [45]. In the paper by Naderi et al. [44] the breeds cannot be distinguished on the basis of mtDNA: the authors report that more than 77% of the mtDNA variation is found within breeds, while regional differentiation of haplotypes is low. At a regional scale, the lack of geographic structure has been reported using mtDNA in different regions [16,25,28], with the exception of one paper [24].

Figure 1 PCA projection on the first two axes, which cumulatively explained 52% of the total inertia contained in the data set. Breed acronyms as in Table 1. Albanian breeds, green; North Italian breeds, blue; South Italian breeds, red; Greek breeds, pink.
In the PCA the breeds appear well differentiated, with 52% of the variance explained by the first two principal components. There is also a good correspondence to geographical locations, thus confirming the correlation between geographical and genetic distances identified by the Mantel test. PCA indicates a westward route to southern Italy through Greece, which may suggest contacts between Albania and the Italian peninsula and between Greece and the Italian islands (Sardinia and Sicily). In post-Neolithic times, some colonists may have improved local livestock by importing stock from overseas. Transport of animals by sea has already been proposed for cattle [14,17] and goats [15,18]. The role of the Mediterranean Sea as a natural corridor connecting the Italian peninsula to the Near East, North Africa, and southern Europe is particularly plausible for small-sized species such as sheep and goats, which are adaptable to various environments and easy to transport during human migration and commercial trade [4,11,45].
The Greek Skopelos breed is the most distant one, reflecting the fact that it has been raised only on an island and on the mainland of Magnisia. The distance is not due to inbreeding, as FIS = 0.122 is not the highest value among our breeds (max FIS = 0.197 in Capore), but to the long-standing lack of admixture with other populations and possibly to natural selection driven by the local environment. The Skopelos breed is largely differentiated from the other goat populations in Greece, both morphologically and in terms of performance. According to the inhabitants of Skopelos island, the goat used to live on a small uninhabited island of the Northern Sporades and was only recently domesticated. The breed is also said to have some relationship with the wild goat of Gioura, which originates from the island of the same name [46]. Because of the favourable characteristics of the breed (high prolificacy and high milk production), breeders established a breeders' association and have applied a genetic improvement programme since 1981. The Orobica is also very far apart from the other breeds. Again, the distance is not attributable to inbreeding (FIS = 0.030), but to the isolation of this breed in a very secluded valley of the Italian Alps.
Among the southern Italian breeds, it is interesting that the lowest distance is seen between Argentata dell'Etna, from Sicily, and Sarda, originating from Sardinia. The two islands, although quite far apart, were important trading posts for Phoenician, Punic and Roman traders.
The analysis carried out so far excluded SNPs that were shown to be under selection [40]. If we include these SNPs (CSN1S1_1, CSN1S1_2 and LIPE [39]) we find that the overall distance pattern remains unchanged except for two breeds of Northern Italy, Bionda and particularly Valdostana, which become closer to Orobica (Figure 2). This is due to the casein and LIPE allele frequencies, which are almost fixed in these breeds for the same allele, while the average allele frequency for the other breeds is 50% (Additional file 1). Casein genes were the first to be associated with milk production, characteristics and curdling properties [47][48][49]. It is noteworthy that milk production and cheese making have been a primary agricultural activity in North Italy since historical times, as demonstrated by the high frequency of lactase persistence in humans [50]. We hypothesize that convergent selection for caseins and LIPE (an enzyme also important for cheese making [51]) occurred in the Orobica, Valdostana and Bionda dell'Adamello breeds, making them "similar" with respect to their exploitation objectives.

Figure 2 Principal component analysis (PCA) of allele frequencies from twenty six SNP loci genotyped in the sixteen goat breeds. Projection on axis 1 and axis 2, which cumulatively explained 56.8% of the total inertia contained in the data set. Breed acronyms as in Table 1. Albanian breeds, green; North Italian breeds, blue; South Italian breeds, red; Greek breeds, pink.

Conclusion
Our data indicate that SNP markers, until now scarcely employed for population genetic studies in livestock, could help in understanding the recent evolutionary history of domestic goats when applied to a wider breed sample.
We found a correlation between genetic diversity and geographic distance. The PCA also shows that the breeds are well differentiated, with good correspondence to geographical locations, thus confirming the correlation between geographical and genetic distances. This suggests that the migration history of the species played a pivotal role in the present-day structure of the breeds. By contrast, the limited genetic similarity within the main geographical areas suggests that breed differentiation could have occurred in more recent times, after the main migrations.
On the basis of the observed gradient of genetic diversity decreasing from south-east to north-west, and of the signals of the northward dispersal of populations from the domestication centre, we hypothesize that coastal routes from the domestication centre to Italy through Greece could be a likely explanation for the observed gene flow along the Northern Mediterranean.

Material
Blood samples from a total of 496 goats, about one third male, were collected on farms spread over the traditional rearing area of each breed (Figure 3). No more than 3 unrelated individuals per flock, from an average of 10 farms per breed, were sampled to reduce the relationship among animals and to increase breed representativeness. Samples were obtained following the rules of each of the countries involved in sampling. Wherever possible, we used part of the samples taken by public veterinarians within national animal health plans. A total of 16 breeds were analysed. The breed names, their acronyms, countries of origin, and the sample sizes are given in Table 1. DNA, extracted by phenol-chloroform or commercial kits in the respective sampling laboratory, was tested for quality and concentration by electrophoresis on 0.8% agarose gel, stained with ethidium bromide and compared to a commercial standard.

SNP analysis
SNP characterization has been described elsewhere [39]. SNP ascertainment bias was minimised by sequencing target DNA in at least 8 individuals from different populations [39]. Large-scale genotyping of all animals was outsourced to a commercial genotyping company (http://www.Kbioscience.co.uk). Generally, accuracy greater than 99% was achieved. Quality control criteria were adopted (water as negative control, inter-plate duplicate testing of a known DNA, intra-plate testing of a known DNA). All the SNPs described in [39] were genotyped on our samples. In this investigation 23 SNPs in 19 genes and in one microsatellite were used (Table 2), excluding SNPs showing a frequency >0.05 on our samples.

Figure 3 Rearing area of each analysed breed.

Data analysis
Allele frequencies were calculated using FSTAT 2.93 [52]. Observed and expected heterozygosities (HO and HE, respectively) and Weir and Cockerham's [41] estimates of FIS per population and of FST per locus and per population pair were calculated using GENEPOP 4.0 [53]. The same software was used to test deviations from Hardy-Weinberg equilibrium (HWE) for each locus and population and for each locus over all populations, using a Markov chain of 100 000 steps and 1000 dememorization steps, and to test for genotypic linkage disequilibrium (LD) between each pair of loci, using the same Markov chain parameters as for the HWE test. For HWE we used the probability test (the "exact HW test").
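As a rough illustration of the per-locus quantities named above, the following Python sketch computes the major allele frequency, HO, HE under Hardy-Weinberg proportions and a simple inbreeding coefficient FIS = 1 - HO/HE for one biallelic SNP in one population. It is a simplified moment estimator given for orientation only, not the Weir and Cockerham estimator implemented in GENEPOP; the function name `snp_summary` and the genotype counts in the usage lines are hypothetical.

```python
def snp_summary(n_AA, n_AB, n_BB):
    """Return (major allele frequency, H_O, H_E, F_IS) for one biallelic SNP.

    Inputs are genotype counts in a single population. F_IS is the simple
    ratio estimator 1 - H_O / H_E (an assumption made for illustration).
    """
    n = n_AA + n_AB + n_BB
    if n == 0:
        raise ValueError("no genotypes supplied")
    p_A = (2 * n_AA + n_AB) / (2 * n)   # frequency of allele A
    h_obs = n_AB / n                    # observed heterozygosity
    h_exp = 2 * p_A * (1 - p_A)         # expected heterozygosity under HWE
    # F_IS is undefined when the locus is fixed (H_E = 0)
    f_is = float("nan") if h_exp == 0 else 1 - h_obs / h_exp
    return max(p_A, 1 - p_A), h_obs, h_exp, f_is


# Hypothetical counts for one SNP in one breed (30 animals)
p_major, h_obs, h_exp, f_is = snp_summary(18, 10, 2)
print(f"major allele {p_major:.3f}, HO {h_obs:.3f}, HE {h_exp:.3f}, FIS {f_is:.3f}")
```

A positive FIS in this sketch indicates a deficit of heterozygotes relative to Hardy-Weinberg expectations; a fixed locus returns an undefined (NaN) value.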
Isolation by distance (IBD, [54]) was assessed by plotting the genetic distance among population pairs as a function of the geographic distance between those pairs, to check whether more distant population pairs are more different genetically, thus providing information on the genetic structure of the populations [55]. We applied the Mantel test [56,57] implemented in the ade4 library of the R 2.6.0 open source software (publicly available at http://www.r-project.org) to estimate the correlation between pairwise FST values and pairwise geographic distances, using 1000 replicates to test significance. The matrix of geographic distances was computed using coordinates obtained with a global positioning system (GPS).
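The isolation-by-distance procedure can be sketched in Python as follows. This is a minimal illustration, not the ade4 implementation used in the study: great-circle distances are computed with the haversine formula (an assumption, since the source does not state how distances were derived from the GPS coordinates), and the Mantel test is a one-sided permutation test on the Pearson correlation between the two distance matrices. The function names and the Earth radius of 6371 km are choices made for this example.

```python
import numpy as np


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    radius_km = 6371.0  # mean Earth radius (assumed)
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlam / 2) ** 2
    return 2 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def mantel(gen, geo, n_perm=1000, seed=0):
    """One-sided Mantel test between two square symmetric distance matrices.

    Returns (r, p): the Pearson correlation of the upper-triangle entries and
    the permutation p-value for the alternative r > 0.
    """
    gen = np.asarray(gen, dtype=float)
    geo = np.asarray(geo, dtype=float)
    n = gen.shape[0]
    upper = np.triu_indices(n, k=1)
    geo_vec = geo[upper]
    r_obs = np.corrcoef(gen[upper], geo_vec)[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # permute rows and columns of one matrix together
        r_perm = np.corrcoef(gen[np.ix_(perm, perm)][upper], geo_vec)[0, 1]
        if r_perm >= r_obs:
            hits += 1
    # the observed arrangement counts as one of the permutations
    return r_obs, (hits + 1) / (n_perm + 1)
```

Given arrays `lat` and `lon` of breed coordinates, the geographic matrix can be built by broadcasting, `haversine_km(lat[:, None], lon[:, None], lat[None, :], lon[None, :])`, and passed to `mantel` together with the pairwise FST matrix.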
A principal component analysis (PCA) was performed on the covariance matrix of SNP frequency data to investigate spatial patterns of genetic variation using the R 2.6.0 open source software.
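A PCA on the covariance matrix of allele frequencies can be sketched as below. This is an illustrative NumPy version rather than the R code used in the study; it assumes a matrix with one row per breed and one column per SNP (the frequency of one chosen allele), and the function name `pca_cov` is hypothetical. The eigenvectors correspond to the coefficients of the linear combinations discussed for the components, and the eigenvalue shares to the proportion of total inertia explained by each axis.

```python
import numpy as np


def pca_cov(freq, n_components=2):
    """PCA of a (breeds x SNPs) allele-frequency matrix via its covariance matrix.

    Returns (scores, loadings, explained):
      scores    - breed coordinates on the leading axes
      loadings  - SNP coefficients of each axis (eigenvectors)
      explained - share of total inertia carried by each axis
    """
    x = np.asarray(freq, dtype=float)
    centered = x - x.mean(axis=0)            # centre each SNP column
    cov = np.cov(centered, rowvar=False)     # SNP-by-SNP covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]         # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    explained = eigval / eigval.sum()
    loadings = eigvec[:, :n_components]
    scores = centered @ loadings
    return scores, loadings, explained[:n_components]
```

The covariance matrix, rather than the correlation matrix, is used here because the text specifies it; with this choice SNPs whose frequencies vary more among breeds weigh more heavily on the axes.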
Nei [42] genetic distances between population pairs were calculated using POWERMARKER [58] to obtain relative estimates of the time that has passed since the populations were established.
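For orientation, a distance of this kind can be computed from allele frequencies as in the Python sketch below. The text does not state which of Nei's distances was selected in POWERMARKER, so the example assumes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), where Jx, Jy and Jxy are averages over loci of the sums of allele-frequency products; it also assumes biallelic SNPs and ignores sample-size corrections. The function name is hypothetical.

```python
import numpy as np


def nei_standard_distance(p_x, p_y):
    """Nei's standard genetic distance between two populations.

    p_x, p_y: frequencies of the same reference allele at each biallelic SNP
    in populations X and Y. No small-sample bias correction is applied.
    """
    p_x = np.asarray(p_x, dtype=float)
    p_y = np.asarray(p_y, dtype=float)
    # average homozygosity within each population
    j_x = np.mean(p_x ** 2 + (1 - p_x) ** 2)
    j_y = np.mean(p_y ** 2 + (1 - p_y) ** 2)
    # average probability of drawing identical alleles from X and Y
    j_xy = np.mean(p_x * p_y + (1 - p_x) * (1 - p_y))
    return -np.log(j_xy / np.sqrt(j_x * j_y))
```

The distance is zero for populations with identical allele frequencies and grows as frequencies diverge, which is why it can serve as a relative measure of time since separation.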