Dissecting the multi-scale spatial relationship of earthworm assemblages with soil environmental variability

Background Studying the drivers and determinants of species, population and community spatial patterns is central to ecology. The observed structure of community assemblages is the result of deterministic abiotic (environmental constraints) and biotic factors (positive and negative species interactions), as well as stochastic colonization events (historical contingency). We analyzed the role of multi-scale spatial component of soil environmental variability in structuring earthworm assemblages in a gallery forest from the Colombian “Llanos”. We aimed to disentangle the spatial scales at which species assemblages are structured and determine whether these scales matched those expressed by soil environmental variables. We also tested the hypothesis of the “single tree effect” by exploring the spatial relationships between root-related variables and soil nutrient and physical variables in structuring earthworm assemblages. Multivariate ordination techniques and spatially explicit tools were used, namely cross-correlograms, Principal Coordinates of Neighbor Matrices (PCNM) and variation partitioning analyses. Results The relationship between the spatial organization of earthworm assemblages and soil environmental parameters revealed explicitly multi-scale responses. The soil environmental variables that explained nested population structures across the multi-spatial scale gradient differed for earthworms and assemblages at the very-fine- (<10 m) to medium-scale (10–20 m). The root traits were correlated with areas of high soil nutrient contents at a depth of 0–5 cm. Information on the scales of PCNM variables was obtained using variogram modeling. Based on the size of the plot, the PCNM variables were arbitrarily allocated to medium (>30 m), fine (10–20 m) and very fine scales (<10 m). Variation partitioning analysis revealed that the soil environmental variability explained from less than 1% to as much as 48% of the observed earthworm spatial variation. Conclusions A large proportion of the spatial variation did not depend on the soil environmental variability for certain species. This finding could indicate the influence of contagious biotic interactions, stochastic factors, or unmeasured relevant soil environmental variables. Electronic supplementary material The online version of this article (doi:10.1186/s12898-014-0026-4) contains supplementary material, which is available to authorized users.


Background
Ecological processes are spatially influenced on various scales, ranging from global to local scales [1,2]. In natural communities, the observed spatial pattern is the result of environmental, biological and/or historical drivers [3], which are not exclusive but rather complementary. The existence of spatial structures of species assemblages suggests the influence of at least one structuring factor: i) a spatially distributed environment is the driving force that structures species assemblages according to niche theory [4]; ii) species are assembled on certain spatial scales through the influence of biotic interactions [5][6][7][8][9][10]; and iii) historical contingency, according to neutral theory [10,11], or stochastic variations in the history of species arrival [12,13] drive this process, although the scale of the random effect has not been fully identified [14]. It is challenging to determine which process has a larger effect because historical species arrival data and past ecological processes are usually unknown.
When analyzing spatial datasets, striking and puzzling results are found if spatial autocorrelation is ignored because response variables are structured on various spatial scales [15][16][17][18]. Specific spatially explicit sampling protocols for targeted organisms and different approaches are needed in soil ecology studies [17,19,20], although these methods must be used with caution [16,21]. Geostatistics [22] allows the assessment of the spatial distribution of soil environmental variability and soil organisms [22,23], but other powerful statistical tools are necessary to model spatial structures on various scales, such as principal coordinates of neighbor matrices (PCNM) [3,24,25]. The PCNM approach is part of the distance-based Moran's eigenvector map (MEM) analysis, which is included in the spatial eigenfunction family of tools [2,25,26] and is a powerful statistical method to model spatial structures at all scales; in other words, the environmental variability is linked to community structure on a multi-scale level [3,24] to obtain new ecological insights [21]. It has also been used to test and separate the niche from neutral mechanisms that influence the community structure [15], although it may appear over-simplistic [27,28].
To date, few field studies have been performed on the assembly of soil invertebrate communities to infer overall patterns and draw conclusions on the importance of explicitly accounting for multi-spatial scales. Soil organism communities have been reported to be spatially structured due to their response to spatial variability in soil resources [12,19,[29][30][31][32][33], allowing the co-existence of competing species within the same patch in spatially heterogeneous environments [32,34]. Although complex spatial patterns have been described for soil invertebrates forming patch assemblages that range from the scale of soil aggregates [35] to those of individual plants [36], agricultural lands and natural ecosystems [37][38][39][40][41][42], no study has assessed the multi-scale spatial relationship between soil invertebrates and environmental variability thus far. The influence of disturbance and habitat heterogeneity on Carabidae assemblages has been described recently, but only on the landscape scale [19]. Studies and data analysis using these multi-spatial analysis techniques to perform invertebrate community research are needed, even if caution must also be exercised [43]. In this study, we aimed to i) analyze the spatial location of significant patches and gaps of the species assemblages identified, ii) test whether the relationship between species assemblages and soil environmental variability occurs on very fine (<10 m), fine (10-20 m), and medium scales (>30 m), and iii) investigate the spatial relationship between root traits and soil parameters to test the hypothesis of the "single tree effect" [44].
The CA extracted three axes (72.9% of the total variance), and these three axes were used to discriminate among the various species assemblages according to the axis selected ( Figure 1). Axis I (34.2% of the explained variance) discriminated new genus 1 versus all other species, whereas the second axis (21.7% of the total variance) revealed a clear distinction between endogeic and epigeic + anecic species. Moreover, the position of species along the positive side of axis 2 followed a body size increase among endogeic species. Axis 3 (17.1% of the total variance) separated Aymara, Andiodrilus and new genus 1 from Martiodrilus, Glossodrilus and new genus 2.

Patches and gaps of species assemblages
The SADIE spatial I a index and local v i and v j clustering indices were statistically significant for endogeic species and the group Martiodrilus, Glossodrilus and new genus 2 (1 anecic +2 endogeics), whereas only the v j index was significant for Andiodrilus, Aymara and new genus 1, i.e. one endogeic + two epigeics (Table 2). Significant spatial dissociations were found when using those assemblages identified with CA axes, i.e., −0.232 (p = 0.978) between new genus 1 and the rest of species, −0.278 (p = 0.995) between endogeics and epigecis + anecic group, and −0.383 (p = 0.999) between the group Andiodrilus, Aymara and new genus 1 from Martiodrilus, Glossodrilus and new genus 2 group.
The number of significant clusters of the earthworm assemblages ranged from 1 (new genus 1) to 9 (endogeics), with gaps occupying a larger area than that of patches ( Figure 2). The type of litter and tree root traits may influence the patchy distribution of endogeic earthworms, which is known as the "single tree effect" [44]. The endogeic species assemblage was close to A. maripa trees, except for the large patch at the central part of the surveyed plot, where values of the coarse root length and weight (CoRL, CoRW) were the lowest (Figure 3, kriged contour maps).
The identity and location of tree species within the surveyed plot did not appear to explain the observed spatial patterns of the remaining species assemblages: ■ new genus 1: a significant gap in the lower half area where multiple tree species, mainly A. maripa, were present. The correlogram was not significant (Figure 4a). ■ All other species: significant gaps and patches were not linked to areas of tree presence. The correlogram was not significant (Figure 4a). ■ Endogeics group: four significant patches close to the location where the A. maripa tree species was observed. The correlogram was significant ( Figure 4b). ■ epigeics + anecic group: same as described for the rest of species (CA1-). The correlogram was not significant (Figure 4b). ■ Andiodrilus, Aymara and new genus 1: two significant gaps in the lower half area and two significant patches in the plot edge where the A. maripa tree species was found. The correlogram was not significant (Figure 4c). ■ Martiodrilus, Glossodrilus and new genus 2 group: one significant patch in the western zone of the plot where trees were present and a large significant gap in the upper part; another two significant patches were in the eastern zone. The correlogram was significant ( Figure 4c).

Cross-correlogram analysis
Regarding the spatial cross-correlation between rootrelated variables and soil nutrient-related and physical variables, significant positive cross-correlations were identified at short lag distances (h) between the FiRL and SOC 0-5 , P 0-5 and C:N 5-10 (Table 3), whereas P 5-10 showed a significant negative cross-correlation (Monte Carlo permutation). With regard to the CoRL, the crosscorrelation at short distances was positive for SOC [5][6][7][8][9][10] and C:N 5-10 and negative for N 5-10 and P [5][6][7][8][9][10] . Regarding root biomass, the FiRW showed significant positive spatial cross-correlation with P0-5 and P 5-10 , whereas it was negative for the variables SOC 5-10 , C:N 0-5 and C: N 5-10 at short lag distances ( Table 3). The CoRW showed a positive spatial relationship with SOC 5-10 and C:N 5-10 and a negative relationship with P 5-10 . A significant positive spatial cross-correlation was observed between the FiRL and soil aggregates of less than 5 mm in size, whereas it was negative for larger soil aggregates and moisture at short lag distances (Table 4). Regarding the FiRW, a significant negative spatial cross-correlation at short distances was especially observed for <0.5 and 2-5 mm size soil aggregates, and a positive crosscorrelation was observed for >10 mm aggregates and bulk density (BD). Finally, the CoRL showed a positive spatial cross-correlation with 0.125-0.5 and 1-5 mm soil aggregates and hydraulic conductivity, and a negative spatial cross-correlation was observed for >10 mm soil aggregates and soil moisture. The CoRW showed a positive spatial cross-correlation with 0.125-0.25 and 0.5-5 mm size soil aggregates and a negative spatial cross-correlation for >10 mm soil aggregates (Table 4).

Decomposing multiple scale spatial patterns of species assemblages
Significant multi-scale spatial structures were obtained for the earthworm community, species and assemblages, especially in the case of new genus 1. The forward selection procedure resulted in various numbers of PCNM variables, ranging from 1 to 9 out of 69 positively autocorrelated spatial eigenvectors (significant Moran's I at p < 0.05). Variogram modeling [25] provided information on the scales of PCNM variables. The PCNM eigenfunctions selected to model the distribution of earthworm community are depicted in the Additional file 1.
These parameters indicate clear spatial structures on a medium (>30 m), fine (10-20 m) and very fine scale (<10 m), except for Andiodrilus (which presented only one significant PCNM). Regarding new genus 1, PCNM3 and PCNM8 defined the medium-scale patterns, whereas PCNM12 and PCNM16 encompassed the fine-scale patterns; PCNM29, PCNM33 and PCNM51 described very fine scales (Additional file 1). The maps of the fitted scores of the significant canonical axes in the PCNM analysis for Epigeic: live and feed on the soil surface; Endogeic: live and feed within the soil; Anecic: live within the soil and dig vertical or semi-vertical burrows to feed on the soil surface (after [45,46]). Endo-anecic worms have characteristics of anecic (anterodorsal pigmentation, flattened rear end) and endogeic worms (horizontal burrowing). 2 Average biometric data for adults (g.f.w.indicates grams of fresh weight in 4% formalin, gut contents included). 3 NC: not classified.
species (A), species assemblages and the whole community (B) are depicted in Additional file 2. The significant explanatory environmental variables that best described the multi-spatial structure varied for the earthworm community, species and species assemblages ( Table 5). The nutrient-related variables explained much of the structure of new genus 1 on the medium and fine scales, whereas the physical variables were better explained on a very fine scale, such as soil compaction (negatively) and humidity (positively). The variables C 0-5 (p corr < 0.001) and moisture content (p corr < 0.05) contributed positively to the spatial structure model of new genus 1, whereas C 5-10 (p corr < 0.05), N 0-5 (p corr < 0.001), C:N 0-5 (p < 0.01) and compaction (p corr < 0.05) contributed negatively to medium-scale patterns ( Table 5).
With regard to the endogeic Andiodrilus sp., the medium-scale spatial structure was explained by physical variables associated with the size of soil aggregates, although the values were not significant (p corr > 0.05). When species assemblages were used instead, new environmental variables were detected (Table 5), i.e., variable P 0-5 was negatively correlated to the medium-scale pattern of assemblage of endogeics, and litter contributed negatively to this pattern. For the epigeics and anecic assemblage, litter and moisture contributed positively to the medium-scale spatial structure model, although this contribution was not significant (p corr > 0.05).
Soil environmental control on earthworm species and assemblage spatial patterns The variation partitioning analysis revealed differences among species regarding the explanatory variables ( Table 6). The entire set of environmental and spatial variables explained the various percentages of variation within the community, species and species assemblage. In the case of the earthworm community, the explained variation was 41.9%, of which 32.3% was explained by the soil environment but not the spatial variables (p = 0.005). The environment and fine-scale structure explained 4.98% of the total variation, whereas the environment and medium-scale structure together explained 2.93%. For the species alone, the R a 2 coefficient for the environmental fraction ranged from 1% for Aymara to 48.0% for new genus 1. The medium and fine spatial scales explained 15.4% and 13.4%  for Aymara and 2% and 2.2% for new genus 1, respectively. The amount of variation explained only by spatial variables independent of the environment differed among species; it ranged from 1.3% to 28.8% for new genus 2 and Aymara, respectively (Table 6).

Discussion
Both spatial and environmental variables structured the species, assemblages and earthworm community, although variations were found in the explained contribution of environmental factors, i.e., 33.3% of the total variation of the global spatial structure of the earthworm community was explained by soil environmental variability. The specific soil environmental variables that were significantly linked to particular spatial scales for species and assemblages were also observed in other studies of nematodes in a forest [35]. To a certain extent, our results agree with Hutchinson's environmental control model [4], although a large portion of the variation was also linked to a purely spatial component ( Table 6).
The selected PCNM variables highlighted significant spatial patterns in earthworm assemblages from a gallery forest, allowing us to identify the spatial scale at which the earthworm community was structured. In general, the very fine scale of autocorrelation detected in our study represents spatial patterns of less than 10 m (PCNMs 33, 51; Figure S1 in Additional file 1), fine scales depicted patterns of 15-20 m (PCNMs 12), and medium scales (PCNMs 3, 5, 8) represented spatial patterns of >30 m (see details in Additional file 1). Furthermore, our observation of very fine-, fine-and medium-scale spatial relationships indicates the importance of considering multiple scales during ecological studies of soil organisms. In our study, we carefully chose the scale used to sample earthworms to focus on small-scale patterns. Additional studies are needed to increase the scale of the sampling design, i.e., hundreds of meters to several kilometres.
The influence of environmental constraints on the spatial distribution of species assemblages has previously New genus 1

Martiodrilus, Glossodrilus and new genus 2
Epigeics + anecic been demonstrated in a nearby savanna [29]. Moreover, earthworm activity also contributes to soil heterogeneity [33,37]. The joint influence of the soil environment and species-created heterogeneity, i.e., the so-called "functional domain" [47] of soil parameters, could explain the spatial patterns observed on several scales. On very fine scales, the environmental variables associated with the spatial distribution of earthworms were more difficult to detect, i.e., the concentration of soil C 0-5 and moisture better explained the spatial pattern of new genus 1 on fine and medium scales compared with very fine scales, whereas Andiodrilus sp. was mostly associated with physical variables, such as size class aggregated distributions, because this mediumsize species produces compact casts that influence the surrounding soil environment [32]. Very fine scales (PCNMs 32 and 50) may be overlooked by classical multivariate analyses, as their relationships may be masked by those of other explanatory variables associated with larger scales, such as PCNM 3. As a detailed analysis of the soil environmental variables was performed within a relatively small area, some of the variation could be attributed to unmeasured variables, leading to incomplete predictions [48]. Moreover, the fraction attributed solely to space was smaller than all other fractions, except for Aymara (28.8%), Glossodrilus (14.1%), and assemblages epigeics + anecic and Andiodrilus, Aymara and new genus 1, which represented 23.5% and 22.2% of the total variance, respectively.

Rest of species
The factors affecting the spatial distribution of soil organisms at larger scales include gradients in soil organic matter and vegetation structure [49], whereas at very fine scales (<10 m) earthworm spatial distribution could be influenced by local factors, such as the plant characteristics, soil moisture and micro-topography. A variety of plant species is likely to support important levels of soil heterogeneity [50]. The root architecture or fine-scale spatial patterns within the plant community determine the spatial structures of earthworm populations through fine-scale soil environmental variations, which is known as the "single-tree effect" [44]. Through litter input and root leachates, trees directly influence earthworm populations. They also indirectly influence earthworm populations by altering the soil properties, forming patches beneath tree canopies that influence community structure and ecosystem function [51,52]. Species exclusion was also reported for the savanna grass Imperata brasiliensis, as earthworms become injured upon contact with its sharp-pointed roots [38]. In our study, significant positive cross-correlations were found for the CoRL and CoRW and the soil nutrientand physical-related variables. These close associations between soil variables and vegetation structure have also been described for epigeic invertebrate assemblages [19]. Regarding the importance of soil environmental variables, it should be noted that the influential factors according to our analyses did not operate independently but interacted (e.g., root traits and nutrients); these complex interactions are characteristic of most ecological systems [53].
The idea that species distributions can be linked to key abiotic variables on multiple scales is not new [54]. Our analysis for empirical data has shown that environmental variables are indeed most important on broad scales, whereas purely spatial patterns appear to dominate on finer scales [55]. Applying PCNM analysis toward large-scale assessment of species-environment relationships is a well-established method [19,56,57]. Gilbert and Bennett [43] and Smith and Lundholm [28] criticized the application of variation partitioning to study the relationship between environmental variables and space, although they admitted that it yields useful results when carefully used. Our present analysis supports an optimistic view of this approach. All environmental patterns are spatially correlated on some scale [28], and all ecological processes are spatial to some extent [1]. The common fraction of variance explained jointly by environment and space appears to represent patterns generated by both environmental factors and the limitations on species dispersal [28]. In our study, the highest level of pure spatial variation was obtained for the epigeics + anecic assemblage (23.5%), whereas the endogeics assemblage showed the lowest level (1.6%, Table 6). Species, or even earthworm ecological categories, show specific dispersal behaviors [58], with endogeics typically showing less dispersal than do epigeic and anecic species.
We carefully selected the sampling scale and the spatial statistics tools to address the ecological question at hand [43]. We benefited from our previous knowledge regarding the biology and ecology of the species found in the region where the survey was undertaken [29,32,33,37,38]. Our study lays the groundwork for further detailed analysis of spatial structuring environmental factors and species assemblages on several scales, while also providing A false discovery rate (FDR) procedure was applied to correct the initial p-values (see text for explanation). * p < 0.05; ** p < 0.01; *** p < 0.001; NS, not significant.
clues for developing an accurate and spatially explicit sampling design for earthworm communities. In addition, the utility of the tools used to select species assemblages and analyze their spatial attributes relative to the soil environmental variability was clearly demonstrated. We are confident that our results provide crucial insight into the spatial relationship between species assemblages and soil environmental variability on scales that range several tens of meters. The selection of assemblages from the correspondence analysis, in this case epigeics versus endogeics, and the statistical methods used to draw our conclusions provide insights that improve understanding regarding why particular species assemblages are found at particular sites. From an ecological point of view, our study not only suggests that specific environmental factors determine the structure and spatial distribution of earthworms in the gallery forest but also indicates that a large proportion of unexplained variation exists. Whether this variation is the result of unmeasured soil environmental variables or null model (random) patterns is a subject for further research.

Conclusions
Earthworms were spatially structured within a relatively small but highly heterogeneous plot; i.e., even at ranges of just a few meters, a multi-scale spatial pattern was observed. The amount of variation jointly explained by the environment and space was not high. However, these sources of variation should not be neglected because they represent unmeasured soil environmental factors and processes that limit species dispersal. Further studies are needed because dispersal traits, for example, remain largely unknown in many earthworm communities. In conclusion, specific abiotic factors were responsible for the observed patterns, and the importance of these patterns needs to be elucidated, even if the multi-scale approach carries additional difficulties and caveats when interpreting the results.

Earthworms and soil sampling
Earthworms and soil samples were collected at 100 sampling points evenly distributed within a 45 × 45 m 2 grid with 5 m of inter-sample distance ( Figure 5). The earthworms were identified, and their abundance was counted in situ from soil blocks of 25 × 25 cm2 and a depth of 20 cm [60]. Previously, the fresh, tower-like casts deposited on the soil surface by Martiodrilus sp. (anecic) were counted at each point within 1 m 2 quadrats, as they are reliable indicators of the number of active individuals [61].
In total, 400 soil samples were collected for physical and chemical analyses. Soil cores were collected along the four sides of each sampling point ( Figure 5). The core method (5 cm depth and 5 cm diam. metal cylinder) was used for bulk density (soil dry mass per volume) following [62]. Water content (soil water per volume and soil water per dry mass) was determined gravimetrically, and hydric conductivity and susceptibility to compaction were also measured [63].
A second core (10 cm depth and 5 cm diam. metal cylinder) was used for soil organic carbon (SOC), nitrogen (N) and phosphorous (P) measurements at 0-5 and 5-10 cm. The soil was then oven dried at 75°C for 48 h and finely ground (<200 μm). A standard colorimetric method was used after digestion in H 2 SO 4 to measure SOC, and the Kjeldahl method was used to assess the total N. Available P was determined using the Bray-II extraction method. C:N and C:P ratios were calculated as the SOC concentration divided by the total N and P concentrations, respectively.
Another soil core (15 cm depth and 10 cm diam.) was taken to assess the size-class aggregate distribution; ca. 100 g of air-dried soil was mechanically shaken in a sieve column of 4.75, 2, 1, 0.5 and 0.250 mm for 30 min. The last soil core (15 cm depth and 10 cm diam.) was used for root length and biomass quantification. Soil was washed in the lab and sieved to separate the fine (<2 mm) and coarse roots (>2 mm) and then oven dried at 105°C for 48 h.
Resistance to penetration (RP) was measured (3 replicates) using a hand penetrometer at each sampling point. The soil moisture content (volumetric) of the topsoil at the time of sampling was ca. 38% (pF = 2.8).

Multivariate ordination analysis (CA)
Species abundance (raw data) was analyzed using correspondence analysis (CA). When the species abundance was <5% of the total, it was removed from the species matrix. The extracted factorial axes allowed us to identify various species assemblages, i.e. these were defined according to the sum of the individuals of all species linked to positive or negative row scores of the three axes.
Species assemblage patches and gaps SADIE (Spatial Analysis Distance IndicEs) analysis [64,65] was used to assess the presence of significant patches and gaps within species assemblages. The index uses count data, i.e., the total number of individuals who corresponded to any of the assemblages identified per sampling point. A global index of aggregation (I a ) is computed: where D is the distance moved to achieve the regular pattern for the observed data and Ea is the arithmetic mean distance to regularity for non-regular randomized samples [64]. I a equals 1 for a random distribution, whereas it is >1 or <1 for either a clumped (aggregated) or regular spatial pattern, respectively [65].
SADIE identifies clusters of high (patches) and low (gaps) mean density, respectively, and these clusters are categorized as v i (positive) and v j (negative cluster index). A patch or gap comprises at least one sample location where the cluster index (v i or v j ) is significant at the heuristic threshold of 1.5 and −1.5, respectively. Adjacent sample locations with significant index values form a single cluster [65]. The observed v i or v j indices are tested using random permutations against the H 0 of complete spatial independence of counts [66].
In this study, we used positive and negative row scores extracted from the CA to obtain count data and compute the SADIE v i and v j cluster indices. Factorial coordinates have been used as a typical procedure to analyze the inner structures of data matrices for community analysis [29,33,[37][38][39]67,68]. Because the row scores and factorial coordinates are not count data, which is a requisite for applying SADIE statistics, the various assemblages were obtained by summing the earthworm count data linked to the positive and negative row scores along the CA axes.
Finally, a spatial association/dissociation index was computed between species assemblage pairs [66]. The local association indices calculated from their individual samplingunit clustering indices are correlated between species assemblage pairs. The observed value of the association index is tested against the H 0 of complete spatial independence of the counts (based on random permutations). The two-tailed associated probability levels at α = 5% are <0.025 and >0.975 for significant association and dissociation, respectively [66].

Spatial autocorrelation analysis
In the presence of a spatial dependence, the observation made at one location is more similar to observations made at nearby sites [2], breaking the rule of sample independence for statistical analyses [23]. To assess the degree of spatial autocorrelation, the (semi)-variogram is a function that describes the spatial pattern of any variable with increasing inter-sample (lag) distance. When positive autocorrelation exists, the semi-variance γ(h) increases until it reaches a maximum value (the "sill") for a given lag distance, which is referred to as the range. This parameter defines the limit of spatial dependence of the variable concerned (detailed in [22,69]). Estimated values of γ(h) are adjusted using a theoretical model [70,71] that is later applied with an interpolation technique called "kriging" to estimate values of the variable under study at non-sampled sites [22]. In our study, interpolated maps were used for root-related variables using only the modeled parameters obtained in the variogram [33].
Cross-variograms can be calculated to assess how two variables co-vary in space [72]. Similar to univariate variograms, cross-covariances may be computed using the values of two distinct variables observed at locations separated by lag h [73]. However, variograms and crossvariograms are not associated with formal testing for departures from randomness (an r 2 correlation coefficient could be used to adjust the curve). However, in our study, spatial cross-correlation among the root-related variables, soil nutrient contents and physical variables was assessed by calculating the spatial cross-correlogram [55,74]. A spatial autocorrelation coefficient, named Moran's I, is plotted in the correlogram for increasing distance classes [75]. Data were allocated to 11 distance classes with a minimum of 50 pairs of points for each distance class to compute the cross-correlogram. The significance of the correlogram is tested with a Monte Carlo simulation [20]; it is significant when at least one coefficient is lower than the Bonferroni corrected p′ of α′ = α/k for the k distance classes used [76]. Data normality was tested with a Kolmogorov-Smirnov test; when the normality assumption was not confirmed, a Box-Cox transformation was used [77]. The gstat and ncf packages of the R program 2.15.1 [78] were used to compute the variograms and cross-correlograms and to depict the kriged maps.

Principal coordinates of neighbor matrices (PCNM) and variation partitioning
The multi-scale spatial analysis of fauna data and soil environmental variability was performed using PCNM analysis [24,79]. This method allows to capture extremely complex structures [80] and is based on the principal coordinate analysis (PCoA) of a truncated pairwise geographic distance matrix between sampling sites [25]. It creates PCNM variables (spatial predictors or eigenfunctions) and a spectral decomposition of spatial relationships from broad to fine spatial scales [81] that is encompassed by the data matrix among sampled sites and then determines to which PCNM variables the response data (uni-or multivariate) respond statistically [79]. Only spatial eigenfunctions associated with positive eigenvalues based on Moran's I were used to define the spatial structures [3], which represents a highly conservative method due to its penalization of degrees of freedom and adjusted R 2 statistics [43].
The PCNM variables that significantly contribute toward explaining the species response data are grouped into a small number of submodels, whereas they are normally assigned to broad, intermediate, and fine scales. The predicted values generated for each submodel can then be reanalyzed using canonical analysis against environmental variables to identify the environmental variables associated with species distributions on the scale represented by each submodel [25]. The forward-selection procedure was used [82] to reduce Type I error, as it is known to underestimate the residual variance [83]. In other words, the probability of selecting at least one PCNM is greater than the chosen significance level, even if the response variable is not spatially structured [80]. Appropriate and rigorous approaches for submodel selection have been argued by [43] and [81] is support of improving the methodological developments in MEM-based methods. Scale is generally defined according to the main features of the sampling design, such as the extent of the study area or the size and spacing of the sampling units [84]. Given the dimensions of the plot (45 × 45 m) and our knowledge on the spatial distribution of various earthworm species in the area [33,37,38], the scales were grouped for convenience into medium (>30 m), fine (10-20 m) and very fine (<10 m). The next step is to relate the spatial components using significant Bonferroni-adjusted p values extracted from the species matrix with soil variables. In other words, species-soil environment regression analysis is performed independently on each scale identified by the PCNM variables. Variogram analysis for each PCNM variable is performed to identify the spatial scale at (See figure on previous page.) Figure 5 Sampling protocol used with a regular grid of 10 x 10 sampling points and a 5 m inter-sample distance. A total of 400 soil samples were collected for the various soil analyses and 100 soil monoliths for earthworm species counts. The location of tree species (>5 m diameter at breast height, DBH) within the surveyed 0.2 ha plot are shown together with the soil pit where the earthworms were sampled, identified and counted, as well as the four soil cores taken for physical and chemical determinations from each of the 100 sampling points.
which the relationship was significant. The lowest value of the Akaike information criterion (AIC) identifies the best spatial model.
Partitioning the variation of a species response data table among two or more explanatory tables is performed using multivariate variation partitioning [85], which determines how much of the species variation is spatially structured and associated with the measured environmental variables [57,80]. The variation partitioning analysis is based on the adjusted R 2 statistic R a 2 [86], and patterns on finer scales identified by the PCNM variables appear smoother compared to other spatially explicit models, such as nested variograms and filter kriging [79].
The data matrix included count data for 6 earthworm species, 23 soil environmental variables (Additional files 3 and 4) and xy coordinates for 100 sampling points. It has been recommended that fauna data should be detrended and transformed during PCNM analysis. Contrary to correspondence analysis, earthworm abundance data were Hellinger transformed because PCNM has been found to be inappropriate for raw data that includes many null abundances [87]. Earthworm spatial distribution is represented by patches, and clearly meaningful trends are rarely observed; consequently, the data were not detrended. In other words, we were cautious with the use of tests to determine the presence of trends because they are likely not appropriate for patchy patterns. The packages vegan, mass and packfor were used for all the calculations needed during PCNM analysis in R 2.15.1 [78].