
Contrasting habitat associations of imperilled endemic stream fishes from a global biodiversity hot spot



Knowledge of the factors that drive species distributions provides a fundamental baseline for several areas of research including biogeography, phylogeography and biodiversity conservation. Data from 148 minimally disturbed sites across a large drainage system in the Cape Floristic Region of South Africa were used to test the hypothesis that stream fishes have similar responses to environmental determinants of species distribution. Two complementary statistical approaches, boosted regression trees and hierarchical partitioning, were used to model the responses of four fish species to 11 environmental predictors, and to quantify the independent explanatory power of each predictor.


Elevation, slope, stream size, depth and water temperature were identified by both approaches as the most important causal factors for the spatial distribution of the fishes. However, the species showed marked differences in their responses to these environmental variables. Elevation and slope were of primary importance for the laterally compressed Sandelia spp. which had an upstream boundary below 430 m above sea level. The fusiform shaped Pseudobarbus ‘Breede’ was strongly influenced by stream width and water temperature. The small anguilliform shaped Galaxias ‘nebula’ was more sensitive to stream size and depth, and also penetrated into reaches at higher elevation than Sandelia spp. and Pseudobarbus ‘Breede’.


The hypothesis that stream fishes have a common response to environmental descriptors is rejected. The contrasting habitat associations of stream fishes considered in this study could be a reflection of their morphological divergence which may allow them to exploit specific habitats that differ in their environmental stressors. Findings of this study encourage wider application of complementary methods in ecological studies, as they provide more confidence and deeper insights into the variables that should be managed to achieve desired conservation outcomes.


Knowledge of species specific ecological requirements is a prerequisite for successful conservation[1, 2], and for understanding biogeographic and phylogeographic patterns of extant taxa[3, 4]. Information on determinants of biodiversity patterns in ecological studies has often been derived from traditional regression methods[5]. Generally, most of these methods focus on identifying the single best model, not on quantifying the independent explanatory power of the predictor variables, yet the latter is likely to provide important insights into the variables that should be managed to achieve desired conservation outcomes. Further, the performance of traditional regression methods is influenced by multicollinearity of explanatory variables as well as by outliers and missing data. These problems may result in the exclusion of ecologically more causal variables from the models[6], thus potentially biasing the actual relationships between species distributions and the environment.

There are a number of alternative statistical approaches that have been developed to improve predictive performance and provide reliable identification of explanatory variables that have the strongest influence on species distribution patterns. These techniques include hierarchical partitioning[6–8], variance partitioning[9] and boosted regression trees (BRT)[10]. The ability of partitioning methods to address the problem of multicollinearity makes them more desirable approaches for ecological studies, because explanatory variables are often only nominally independent. BRT is a relatively new approach for modelling species-environment relationships[10]. Advantages of BRT models include superior predictive performance compared to most traditional modelling methods, the ability to handle different types of explanatory variables (categorical, numeric or binary), the ability to accommodate missing data, and no requirement for elimination of outliers or prior data transformation[10]. BRT models are insensitive to differing scales of measurement, and they can fit complex nonlinear relationships and interactions between predictors[10].

Despite the additional insights that may be gained from partitioning methods and boosted regression trees, these approaches have rarely been applied to the analysis of ecological data[11–15]. The present study applied BRT and hierarchical partitioning to provide insights into the important variables that influence the distribution of stream fishes from the Cape Floristic Region (CFR) of South Africa. The CFR is a hotspot for endemic freshwater biota[16–18]. This region’s high degree of endemism is thought to have resulted from its long period of isolation and complex evolutionary history, which promoted in situ diversification[18]. However, the majority of the native stream fishes of the CFR rank amongst the most imperilled freshwater taxa in southern Africa[19]. Nearly all native freshwater fishes of the CFR are already listed in threatened categories of the IUCN, because their historical distributions have declined as a result of multiple anthropogenic impacts, mainly hydrological modifications, degradation of habitats and widespread invasion of the rivers by at least 15 alien fish species[19–22]. These impacts have collectively resulted in several local extinctions in a number of mountain tributaries and extirpation of almost all main-stem populations of native freshwater fishes[20]. The remaining native fish populations persist only in undisturbed headwater tributaries, often above in-stream physical barriers that prevent upstream migration of alien invasive fishes.

Detailed understanding of natural variation of species is essential for predicting past distribution patterns[23], assessing conservation status[24], projecting potential impacts of environmental changes[25], designing and prioritizing conservation areas and formulating recovery programs for threatened species[26]. Such information should best be generated from undisturbed or minimally disturbed systems[27]. The near-natural condition of upland tributaries of the Breede River system in the south-western CFR offered a unique opportunity to study the factors that influence the distribution of stream fishes in the absence of major confounding impacts such as pollution, sedimentation and alien fishes. The Breede River system was previously thought to contain only four indigenous primary freshwater fishes, currently Galaxias zebratus, Pseudobarbus burchelli, Sandelia capensis and Barbus andrewi[28, 29]. Molecular studies have, however, discovered four deeply divergent genetic lineages within G. zebratus, three historically isolated lineages within P. burchelli and three lineages of S. capensis in the Breede River system[30–32] (Chakona et al., in preparation). Taxonomic revision of these groups is underway and some of the lineages will be described as distinct species. This study assessed one lineage of Galaxias zebratus, one of Pseudobarbus burchelli and two lineages of Sandelia capensis that co-occur in a number of undisturbed or near-natural mountain tributaries of the Breede River system. Galaxias ‘nebula’ (~ 75 mm total length (TL)) has a slender body form and Pseudobarbus ‘Breede’ (~ 135 mm TL) is fusiform with forked caudal fins. The two Sandelia spp. lineages (~ 200 mm TL) are genetically closely related and have a laterally compressed body form. They were therefore combined in all analyses and comparisons. Barbus andrewi was not included in the present study because it was only found at two riverine localities. This species now persists in two man-made dams in the Breede River catchment.

Specifically, the study addressed three questions: (i) what are the main environmental determinants of the spatial distributions of stream fishes in the CFR? (ii) are there differences in the main determinants of distribution among species? (iii) are the results of BRT and hierarchical partitioning in concordance?

One potential source of differences in the distribution patterns and environmental relationships between stream fishes is differing body morphologies. Freshwater fishes exhibit high morphological divergence, suggesting that they evolved to exploit specific habitats that differ in their environmental stressors[33, 34]. The fishes considered in this study have distinct body forms [Additional file 1]. It was hypothesised that Sandelia spp. would be mainly associated with lower river reaches because fishes with laterally compressed bodies are generally adapted to life in slow flowing waters[35]. Pseudobarbus ‘Breede’ were predicted to be capable of exploiting stream reaches with faster flowing waters because fishes with forked tails are generally considered to have improved swimming performance[36, 37]. Galaxias ‘nebula’ were hypothesised to be capable of exploiting reaches at higher elevation because anguilliform and slender bodied fishes are expected to have reduced energetic expenditure necessary to maintain position in faster flowing water[34, 36].


Galaxias was the most widespread lineage, occurring in 61% of the sampled sites and 73% of the streams. Pseudobarbus was common and present in 57% of the sampled sites and 62% of the streams. Sandelia was uncommon and was present at only 28% of the sampled sites and in 43% of the streams.

Species distribution and environmental relationships

Boosted regression trees

The simplification procedure indicated that elevation, pH and slope were the strongest correlates of Sandelia spp. distribution. Elevation contributed almost half of the variation (49.1%), while the relative contributions of pH and slope were well balanced and equal to 27.0% and 23.9%, respectively (Table 1). Model evaluation using 10-fold cross validation suggested very good predictive performance (AUC = 0.88), with a predictive deviance of 33% (Table 2). Fitted functions from the BRT models indicated that reaches located below 400 m and gentle gradients (< 15 m/km) were the most suitable for Sandelia spp. (Figure 1). Sandelia were frequently caught in reaches with low pH (< 6). Strong interactions were found between elevation and gradient (50.9), as well as between elevation and pH (32.9). There was a pronounced peak of Sandelia spp. occurrence in reaches that combined both low elevation and gentle gradient (Figure 2).

Table 1 Independent explanatory power of predictors
Table 2 Boosted regression trees model performance
Figure 1

Species response curves. Functions fitted for the most important predictors by a boosted regression trees (BRT) model relating the probability of occurrence of Sandelia spp., Pseudobarbus ‘Breede’ and Galaxias ‘nebula’ to environment.

Figure 2

Probability plot of Sandelia spp. occurrence. Plot of the interaction between slope and elevation showing the predicted probability of occurrence of Sandelia spp.

For Pseudobarbus ‘Breede’, the simplification procedure retained temperature, elevation and width as the most influential predictors, and model evaluation suggested very good predictive performance to independent data (AUC = 0.85) (Table 2). The relative contributions of these variables were 38.6%, 33.7% and 27.7%, respectively (Table 1). Fitted functions from the BRT model indicate that Pseudobarbus ‘Breede’ occurred most frequently in wider stream reaches at elevations below 500 m, and temperatures above 20°C (Figure 1). The strongest interaction was found between width and temperature (183.2). Interactions fitted for Pseudobarbus ‘Breede’ indicate that this species occurs most frequently in wider stream reaches, but this response is strongly affected by temperature, with higher probabilities of detection in reaches with warmer temperatures (Figure 3).

Figure 3

Probability plot of Pseudobarbus ‘Breede’ occurrence. Plot of the interaction between width and temperature showing the predicted probability of occurrence of Pseudobarbus ‘Breede’.

The simplification procedure indicated that Galaxias ‘nebula’ distribution was most strongly influenced by elevation (27%), width (24.6%) and depth (19.5%), with slope and conductivity contributing 14.4% each (Table 1). The final model had a predictive deviance of 10% and an AUC of 0.70 (Table 2), indicating fair or useful predictive performance to independent data. Similar to both Sandelia and Pseudobarbus, fitted functions from the BRT models were non-linear and complex for Galaxias. These functions indicate that Galaxias ‘nebula’ occurred across a wide range of elevations, but rarely occurred at elevations between 400 and 500 m above sea level. Galaxias ‘nebula’ demonstrated a distinct preference for streams up to 6 m wide, and occurred more frequently in reaches with shallow water (< 1 m) (Figure 1). The strongest interactions were found between width and depth (17.0) as well as width and conductivity (16.9). There is a pronounced peak of occurrence of this species in stream sections that combine narrow widths and shallow depths (Figure 4).

Figure 4

Probability plot of Galaxias ‘nebula’ occurrence. Plot of the interaction between depth and width showing the predicted probability of occurrence of Galaxias ‘nebula’.

Hierarchical partitioning

The independent effects of six of the seven variables included in hierarchical partitioning analyses were statistically significant for at least one of the species (Table 3). The total independent contributions (I) for all three species were substantially larger than their joint contributions (J). The |I/J| ratios were 13.5 for Sandelia spp., 5.5 for Pseudobarbus ‘Breede’ and 5.6 for Galaxias ‘nebula’ (Table 4). Similar to the BRT results, hierarchical partitioning also indicated that elevation (27.1%), pH (19.6%) and slope (14.8%) were the most important explanatory variables for Sandelia spp. Hierarchical partitioning, however, also revealed that depth (17.7%) had an important independent effect on the distribution of Sandelia spp. Width (30.1%) and temperature (29.0%) were the most important explanatory variables for the distribution of Pseudobarbus ‘Breede’. Width (42.7%) and depth were the most important explanatory variables for Galaxias ‘nebula’. Similar to BRT results, conductivity did not appear to have a very important independent effect on the distribution of the species considered in this study (Table 4).
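The distinction between independent (I) and joint (J) contributions can be illustrated with a minimal Python sketch of the general hierarchical partitioning idea: the independent contribution of a predictor is its average gain in goodness-of-fit across all hierarchy levels, and the joint contribution is what remains of its single-variable fit. This is only a sketch under stated assumptions. The function name `hierarchical_partition`, the goodness-of-fit table `toy_fit` and its values are invented for illustration and are not the fitted values or the software used in this study.

```python
from itertools import combinations

def hierarchical_partition(variables, fit):
    """Return {variable: (independent, joint)} contributions.

    `fit` maps a frozenset of variable names to a goodness-of-fit value
    (the measure itself is an assumption left to the caller).
    """
    result = {}
    for v in variables:
        others = [u for u in variables if u != v]
        level_means = []
        # One hierarchy level per size of the subset that v is added to.
        for size in range(len(others) + 1):
            gains = [fit(frozenset(s) | {v}) - fit(frozenset(s))
                     for s in combinations(others, size)]
            level_means.append(sum(gains) / len(gains))
        # Independent effect: mean gain averaged over hierarchy levels.
        independent = sum(level_means) / len(level_means)
        # Joint effect: single-variable fit not attributable to v alone.
        joint = (fit(frozenset({v})) - fit(frozenset())) - independent
        result[v] = (independent, joint)
    return result

# Invented toy goodness-of-fit values for two correlated predictors.
toy_fit = {
    frozenset(): 0.00,
    frozenset({"elevation"}): 0.30,
    frozenset({"slope"}): 0.20,
    frozenset({"elevation", "slope"}): 0.40,
}

contributions = hierarchical_partition(["elevation", "slope"],
                                       toy_fit.__getitem__)
total_I = sum(i for i, _ in contributions.values())
total_J = sum(j for _, j in contributions.values())
print(contributions, abs(total_I / total_J))
```

In this toy case the independent contributions sum to the fit of the full model, and the |I/J| ratio is about 4. A large ratio, as reported for the three species, means that little of the explained variation is shared among predictors.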

Table 3 Randomisation tests for predictor variables
Table 4 Joint and independent effects of predictors


Determinants of species distributions

Elevation, slope, width, depth and temperature were identified as having the most important contribution to the distribution of the studied stream fishes. These results are consistent with studies from other regions showing that ecological boundaries of stream fishes are strongly influenced by elevation, slope and stream size[38, 39]. Winemiller et al.[34] documented changes in fish diversity and distributions associated with the altitudinal gradient of streams. A similar effect of altitude has also been reported for the spatial variation in Andean stream fish assemblages[40]. Buisson et al.[13] indicated that elevation and temperature had a strong effect on the spatial distribution of fishes in south-western France, while Amadio et al.[41] have documented the central role of water temperature in determining the upstream boundaries of saugers (Sander canadensis) in North America. Temperature was also found to be an important explanatory variable related to the distribution of fish communities inhabiting mountain tributaries of the central Andes in Colombia[40].

The effect of riparian and aquatic vegetation, stream physical structure and bottom cover in influencing the distribution of stream fishes is also important[42]. Riparian vegetation provides shade and allochthonous organic debris, which is an important source of carbon and energy[43], while aquatic vegetation provides in-stream cover and increases habitat diversity. Results from boosted regression trees in the present study, however, indicated that riparian and aquatic vegetation, substrate type and bottom cover were not relevant in explaining the distribution of the fishes considered in the present study. These variables were dropped from all species models by the recursive feature elimination procedure, which excludes non-informative predictors[10]. A possible reason why riparian vegetation may not be important in the mountain tributaries considered in the present study could be related to the low retention time of allochthonous organic debris due to the swift flowing nature of the streams as well as the occurrence of spates during the rainy season. None of the streams were found to have accumulated organic debris. Aquatic vegetation is also largely lacking in most of the undisturbed streams. The relatively low variation in stream physical structure (predominantly cobbles and boulders) could be the reason why substrate type was non-informative in this study. Thus, while some ecological patterns of stream fishes may be common among different geographic regions, some patterns may also be specific to particular regions. This makes it difficult to make broad ecological generalisations.

Comparative species responses

The distributions of Sandelia spp., Pseudobarbus ‘Breede’ and Galaxias ‘nebula’ were not determined by the same environmental factors. The upstream boundaries of Sandelia spp. in mountain tributaries of the Breede River system were strongly affected by elevation and slope. Sandelia were not found in reaches that had channel slopes greater than 15 m/km and elevation higher than 425 m, and were primarily associated with pools. Slope and topography affect the distribution of stream animals through their influence on the geomorphology and flow dynamics[44]. High elevation streams and steep gradients are characterised by strong currents and turbulent flow, and this selects for species that have adaptations for maintaining position in fast flowing waters. The observed habitat selection for Sandelia may be related to morphological specialisation. Sandelia have laterally compressed bodies, large pectoral fins and lower caudal fin aspect ratios (more square shaped caudal fins) [Additional file 1]. Studies have shown that fishes with these morphological characteristics have poor swimming performance due to high drag penalties[32]. It is therefore likely that Sandelia may not be capable of maintaining position under greater turbulence, due to increased energetic demands. This may explain why Sandelia was absent from reaches at higher elevations and steeper gradients. Similar patterns of habitat segregation associated with body morphology have been reported for tropical and neotropical stream fishes[34, 40].

Water temperature and mean width (used here as a proxy for stream size) were identified as the primary determinants of Pseudobarbus ‘Breede’ distribution. Temperature is considered to be a major ecological factor that directly affects behaviour, metabolism, reproduction, development and growth of freshwater fishes[45–48]. The interaction between stream size and water temperature indicated that the probability of occurrence of Pseudobarbus ‘Breede’ was highest in wider reaches and higher temperatures (> 25°C). This pattern may be related to the dietary requirements of this species. The sub-terminal mouth in this species is suited for scraping periphyton or picking small animals from rock surfaces, a behaviour that has been commonly observed during field surveys. Thus, the strong association of Pseudobarbus ‘Breede’ occurrence with wider streams and higher temperatures could indicate that this species selects habitats in which environmental conditions promote increased primary and secondary productivity. This pattern is concordant with that reported by Angermeier & Karr[49], who found a strong relationship between stream fish biomass and habitat features (e.g. stream size) that maximise the availability of preferred dietary items (reviewed by Winemiller et al.[34]). Although elevation was found to be less influential on the distribution of Pseudobarbus ‘Breede’, it is important to note that this species was found to be capable of utilising habitats at higher elevation and steeper gradients (and hence faster current velocities) compared to Sandelia. Pseudobarbus has a fusiform body shape and higher caudal fin aspect ratio (forked tails) [Additional file 1], two traits that are known to reduce drag and increase swimming ability (thrust) in faster-flowing water[36, 37].

Stream size and depth were selected by both boosted regression trees and hierarchical partitioning as important determinants of Galaxias ‘nebula’ distribution. The interaction between these two variables indicated that the probability of occurrence of this species was highest in smaller streams and shallow habitats. This agrees with findings from tropical systems where many small stream fishes are associated with smaller streams and shallow habitats that provide refugia from piscivores[34]. Galaxias ‘nebula’ occurred at diverse elevations and a wider variety of slopes, and penetrated into higher elevations compared to Pseudobarbus ‘Breede’ and Sandelia spp. Galaxias ‘nebula’ is ecologically adapted to utilise regions of high flow velocity and turbulence. This species has a more slender, cylindrically shaped body and smaller pectoral fins, features that reduce hydrodynamic drag and hence reduce the energetic demands of maintaining position in flowing water[34, 36, 37]. Species with these traits usually have better swimming performance and are capable of exploiting river reaches with faster current velocities[34]. The regression tree model indicated that Galaxias ‘nebula’ had reduced frequency of occurrence at elevations between 300 and 500 m above sea level. This pattern is not readily explainable, but it may possibly reflect the role of other factors (including biotic interactions) that have not been considered in this study[34].

Methodological aspects

Minor differences were found between results from boosted regression trees and hierarchical partitioning for some habitat variables. For example, the relative contributions of elevation, slope and pH in explaining the distribution of Sandelia spp. were higher in the boosted regression trees than the hierarchical partitioning results. This difference could be related to the fact that data for some of the variables were transformed prior to hierarchical partitioning analyses, while boosted regression trees analyses do not require data transformation prior to analysis. Nevertheless, these two approaches were complementary. Hierarchical partitioning addresses the problem of multicollinearity among predictor variables, but a major weakness of this approach is its inability to account for non-monotonic functions, yet nonlinear responses are quite common in species-environment relationships[38]. Hierarchical partitioning also does not provide information about the type of responses, because its purpose is not to generate a predictive model[50]. The boosted regression trees method was an appropriate alternative for addressing these shortcomings, because it has the ability to fit nonlinear responses between species and environmental predictors[10]. Additional advantages of this approach include the capacity to determine the strengths of interactions between predictors, and fitting the interaction effects to identify optimal habitats for the species[10]. Thus, the simultaneous application of boosted regression trees and hierarchical partitioning in this study helped to identify the predictors that were selected by both methods as the most likely causal variables as well as fitting species responses to them. This provides confidence and deeper insights into the variables that need to be targeted and managed to achieve desired conservation outcomes.
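The capacity of boosted trees to fit a non-monotonic response can be illustrated with a deliberately simplified Python sketch. It is not the ‘dismo’ procedure used in this study: it boosts single-split trees (stumps) on one predictor with a logistic link, a fixed learning rate and a 50% bag fraction, and therefore cannot represent interactions. The function names, the default settings and the hump-shaped toy data are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Find the single split that minimises squared error of the residuals."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - ml) ** 2 for r in left)
               + sum((r - mr) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    return best[1:]

def boost(xs, ys, n_trees=500, lr=0.1, bag_fraction=0.5, seed=1):
    """Stochastic gradient boosting of stumps on presence/absence data.

    Assumes ys contains both presences (1) and absences (0).
    """
    rng = random.Random(seed)
    prevalence = sum(ys) / len(ys)
    f0 = math.log(prevalence / (1.0 - prevalence))  # initial log-odds
    scores = [f0] * len(xs)
    stumps = []
    indices = list(range(len(xs)))
    bag_size = max(2, int(bag_fraction * len(xs)))
    for _ in range(n_trees):
        bag = rng.sample(indices, bag_size)  # random subset for this tree
        bag_x = [xs[i] for i in bag]
        if len(set(bag_x)) < 2:
            continue  # no split possible in this bag
        # Gradient of the binomial log-likelihood: observed minus fitted.
        resid = [ys[i] - sigmoid(scores[i]) for i in bag]
        thr, ml, mr = fit_stump(bag_x, resid)
        stumps.append((thr, ml, mr))
        scores = [s + lr * (ml if x <= thr else mr)
                  for s, x in zip(scores, xs)]
    return f0, lr, stumps

def predict(model, x):
    f0, lr, stumps = model
    return sigmoid(f0 + lr * sum(ml if x <= thr else mr
                                 for thr, ml, mr in stumps))

# Invented hump-shaped response: present only in the middle of a gradient.
xs = list(range(30))
ys = [1 if 10 <= x < 20 else 0 for x in xs]
model = boost(xs, ys)
print(predict(model, 2), predict(model, 15), predict(model, 27))
```

On these toy data the fitted probability should be higher in the middle of the gradient than at either end, a response shape that a single monotonic term cannot represent.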

Evaluation of model performance using AUC revealed some differences among the species. Both Sandelia and Pseudobarbus had substantially higher AUC scores than Galaxias. Sandelia and Pseudobarbus also had higher explained deviance than Galaxias. A possible explanation for the poorer performance measures of the Galaxias ‘nebula’ model could be related to the occurrence of this species in more diverse habitats than the other species. Alternatively, this may indicate that other factors that were not considered in this study (for example biotic interactions) could be influential in the distribution of Galaxias ‘nebula’.

Conservation implications

The species-specific spatial patterns and environmental relationships found in this study, and also reported from other studies[13, 34, 38, 40], suggest that stream fishes may respond differently to specific impacts, with some species being potentially more vulnerable than others. The invasion of river landscapes in the CFR by alien species and habitat degradation are considered to be the greatest threats to the freshwater biodiversity of this region[19–22]. The restriction of the remnant populations of Sandelia spp. to lower sections of mountain streams exposes these lineages to multiple impacts, which include increased susceptibility to invasion by alien predators from the main-stems, hydrological alteration and habitat loss due to building of water abstraction structures in upper reaches, sedimentation and increased water turbidity, pollution and pesticides from intensive agricultural activities. The two Sandelia lineages are therefore arguably the most threatened of the fishes considered in the present study. The inclusion of these lineages into one widespread species that was considered to be capable of exploiting diverse habitats[29] clearly masked the real threats to these taxa. Conservation strategies in many data deficient regions have had to rely almost exclusively on expert knowledge, but apart from the implications of cryptic diversity, lack of detailed knowledge of species ecology may misdirect conservation prioritisation, and can potentially lead to loss of biodiversity. For example, the building of weirs to prevent upstream migration of alien species has been considered to be one of the best conservation strategies to secure the remaining populations of threatened fish species[51, 52]. Given the species-specific habitat associations of stream fishes, it is clear that careful selection of the location of such barriers is required so that the protected river sections will encompass optimal habitats for all the target species.

Field surveys indicate that Sandelia and Pseudobarbus have been extirpated from tributaries where weirs have been built at higher altitude. In some instances, the remaining populations only occur in a very short stretch of river above the weirs. This indicates that these weirs have been built just below the fishes’ upper limits. Long-term persistence of these populations is uncertain, because the remaining habitat may not be optimal, and loss of genetic diversity may occur since migration from elsewhere could be blocked by the man-made structures. In most cases, the reaches below water take-off points are completely dry during the summer period, or if water is present, the habitats have been invaded by alien fishes. Given the socio-economic importance of farming and irrigation in the region, complete exclusion of water abstraction from the streams is not feasible. Conservation authorities should therefore seek support from local landowners whose properties have streams that still hold viable populations of native fishes, to ensure that (1) water take-off points and weirs are placed as low as possible in tributary streams, but above alien fish distributions, (2) in-stream habitats are rehabilitated and protected and (3) ecological flows are restored in stream sections that benefit indigenous fishes.

Translocation to undisturbed habitats has been suggested as a useful strategy in the recovery of threatened species[53]. However, in many regions (including the CFR), the remaining undisturbed streams are confined to high altitude mountain catchments where human development is still minimal. Given the species-specific ecological boundaries presented in this study, and also documented for stream fishes from other regions[13, 34, 38, 40], translocation into high altitude streams may not help certain species, while at the same time potentially impacting on other aquatic biota. For example, moving species with laterally compressed bodies such as Sandelia into reaches above their natural upstream boundaries may not be a viable long-term conservation measure, since such species are associated with lower river reaches with gentle gradient and an abundance of pools with slow flow. Findings from this study suggest that Sandelia could be used as umbrella species[54], because successful protection and restoration of their optimal habitat will indirectly protect other broadly co-distributed freshwater taxa.


The contrasting habitat relationships of Sandelia spp., Pseudobarbus ‘Breede’ and Galaxias ‘nebula’ support findings from earlier studies that also reported species-specific responses of stream fishes to environmental descriptors[13, 38]. The species-specific modelling approach used in the present study provides deeper insights into species-environment relationships compared to the use of synthetic descriptors, such as guilds or species richness[55, 56]. The use of boosted regression trees and hierarchical partitioning allowed accurate identification of the most influential environmental predictors and the responses of the species to them. Results from the present study are consistent with previous research on stream fishes that suggests strong relationships between fish morphology and ecology[34–37]. These species-specific responses should be considered in conservation planning and management.


Study area

The geology of the CFR is dominated by the Cape Supergroup which consists of extensively folded Table Mountain, Witteberg and Bokkeveld Groups[57, 58]. The Bokkeveld Formations are marine deposits and the rivers draining these rock types have high conductivity and high salt content. Erosion of the more resistant Table Mountain Formations produces highly leached quartzite sandstones and the rivers draining these formations have low conductivity and oligotrophic waters. The climate is Mediterranean with dry summers and wet winters resulting from orographic rainfall. Undisturbed rivers in this region have perennial flow.

Fish distribution data

The research was conducted under permit from CapeNature (permit number: AAA-004-000205-0035) issued only after the approval of methods by a review panel. Intensive sampling was conducted during low-flow conditions between November 2008 and December 2009. Data from 148 sites from 44 undisturbed mountain tributaries of the Breede River system (Figure 5) were used in this study. Sites were classified as undisturbed and included in the present study if they were: (i) located upstream of weirs and water diversion structures, (ii) located upstream of agricultural or residential areas, (iii) not invaded by alien fishes and (iv) not isolated by apparent fish barriers (e.g. waterfalls). This was done to ensure that the distribution of the fishes was based on intrinsic habitat preference and not influenced by anthropogenic disturbance, alien fish impacts (predation or competition) or exclusion by natural or artificial barriers to dispersal.
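The four inclusion criteria amount to a simple conjunction, which the following Python sketch makes explicit. The `Site` record, its field names and the two example sites are hypothetical and are not taken from the study's data files.

```python
from dataclasses import dataclass

@dataclass
class Site:
    # Hypothetical field names, one per inclusion criterion (i) to (iv).
    name: str
    upstream_of_weirs_and_diversions: bool
    upstream_of_agriculture_or_housing: bool
    alien_fish_present: bool
    isolated_by_fish_barrier: bool

def is_undisturbed(site):
    """A site qualifies only if all four criteria are met."""
    return (site.upstream_of_weirs_and_diversions
            and site.upstream_of_agriculture_or_housing
            and not site.alien_fish_present
            and not site.isolated_by_fish_barrier)

# Invented example sites.
sites = [
    Site("upper reach", True, True, False, False),
    Site("above waterfall", True, True, False, True),
]
included = [s.name for s in sites if is_undisturbed(s)]
print(included)
```

Only the first example site passes, because the second is isolated by a natural barrier and would confound habitat preference with exclusion.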

Figure 5

Map of study area. Location of sampling sites across the Breede River system. Insert shows location of the Breede River system in the Cape Floristic Region at the southern tip of Africa.

At each locality riffles and pools in a stream section of about 30–50 m were sampled. Due to the relatively small size of the streams, this length usually included more than 3 pool-riffle sequences, which is considered to be adequate for getting a representative sample of fish communities within a reach[59]. Sampling techniques varied depending on the size of the stream, depth and water clarity. Electrofishing (SAMUS-725MP) was used for sampling in shallow riffle stream sections with cobble-boulder substratum, while the occurrence of fish in pools with clear water was determined by snorkelling. Deep tannin-stained pools were sampled with a seine net (3 m length, 3 mm mesh size). While there may be advantages in considering fish densities at each station rather than their presence or absence alone, the present study encompassed a wide geographic region. Time and resource constraints precluded assessment of fish densities at the localities sampled. Therefore the presence-absence approach was used for the present study. Fish at each site were either observed or captured using the methods described above. Captured fish were identified and quickly returned to the water alive, but at some of the localities some fish (up to 10 individuals per species per tributary) were retained for tissue samples for genetic analysis (Chakona et al., unpublished). The location of each sampling site was recorded with a hand held Global Positioning System (GPS) unit with accuracy within 10 m.

Environmental predictors

Studies from disparate regions have indicated that the distribution of stream fishes is influenced by a number of environmental factors, such as stream size, water depth, flow velocity, substrate types, water temperature and chemistry, riparian and aquatic vegetation, elevation and channel slope[27, 38–41]. At each sampling locality, habitat was characterised by quantitative and qualitative measurements of 11 environmental variables. Portable electronic meters were used to measure temperature and conductivity (Hanna EC/TDS/Temperature Tester, HI98311 (DiST 5)) and pH (Hanna pH/Temperature tester HI98128). Local habitat features were characterised by measuring channel width and depth and by assessing bottom substratum and aquatic vegetation. Within each reach, 4 to 8 transects were measured for physical habitat variables. Depth was measured with a graduated pole at three equally spaced intervals for each transect. Maximum depth was the greatest water depth measured among transects. Transect widths were used to calculate mean width (used here as a proxy for stream size) for each sampling locality. Dominant substratum was visually estimated and characterised as silt-sand (< 2 mm), gravel (10–64 mm), cobble (64–256 mm), boulders (> 256 mm) and bedrock (solid rock surfaces)[60, 61]. Bottom cover and the presence of aquatic and terrestrial riparian vegetation were visually assessed and characterised as none (0), scarce (< 30%), moderate (30–60%) and abundant (> 60%). Elevation and channel slope for each site were calculated from GPS coordinates using GIS Spatial Analyst.
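The reach-level summaries described above (mean width from transect widths, maximum depth across all depth readings, and the four cover classes) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function names, dictionary keys and transect values are assumptions made for the example, and the handling of a cover value of exactly 30% or 60% follows the stated class limits.

```python
from statistics import mean


def summarise_reach(transects):
    """Summarise one sampling locality.

    `transects` is a list of dicts with a channel width and the three depth
    readings taken at equally spaced intervals (hypothetical structure).
    """
    widths = [t["width_m"] for t in transects]
    depths = [d for t in transects for d in t["depths_m"]]
    return {
        "mean_width_m": mean(widths),   # proxy for stream size
        "max_depth_m": max(depths),     # greatest depth among transects
    }


def cover_class(percent):
    """Map a visually estimated cover percentage to the four classes."""
    if percent == 0:
        return "none"
    if percent < 30:
        return "scarce"
    if percent <= 60:
        return "moderate"
    return "abundant"


# Four made-up transects for a single reach.
reach = summarise_reach([
    {"width_m": 2.0, "depths_m": [0.2, 0.5, 0.3]},
    {"width_m": 3.0, "depths_m": [0.4, 0.9, 0.6]},
    {"width_m": 4.0, "depths_m": [0.3, 0.7, 0.2]},
    {"width_m": 3.0, "depths_m": [0.1, 0.4, 0.2]},
])
```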

Statistical modelling

Boosted regression trees

Boosted regression trees (BRT) were used to determine the relationship between fish occurrence and the 11 environmental variables. A detailed overview of boosted regression trees and guidelines for using this approach are given by Elith et al.[10]. BRT analyses were carried out in the R statistical package version 2.15.1[62] using the ‘dismo’ library following Elith & Leathwick[63]. Because of the binary nature of the response variable (presence/absence), the binomial error distribution and a logistic link function were used. Tree complexity (tc) and learning rate (lr) were altered to determine optimal settings for the base model containing all the descriptors. Ten-fold cross-validation was used for each optimisation trial, with a random subset of 50% of the data being used to fit each new tree. This was followed by the recursive feature elimination procedure, which was used to simplify the base model by dropping non-informative predictors[10]. Predictive performance of the final model was then evaluated using the cross-validation process internal to the model building procedure. This evaluation was based on predictions to sites that were withheld from model fitting. Two performance metrics were determined for each model. Predictive deviance (expressed as a percentage of the total deviance) provides a measure of the goodness-of-fit between predicted and raw values. The second metric is the area under the receiver operating characteristic curve (AUC), which estimates the degree to which fitted values discriminate between observed presences and absences. Values of AUC range from 0.5 (discrimination no better than random) to 1.0 (perfect discrimination). Values of AUC > 0.90 indicate excellent distinction between presences and absences, 0.80–0.90 is considered very good, 0.70–0.80 indicates fair performance, values > 0.60 are considered useful and values < 0.60 indicate poor performance[64–66].
The relative contribution (%) of the individual predictors was evaluated, and environmental optima for each species were determined by plotting the distribution of fitted values in relation to each of the predictors following Elith & Leathwick[63]. The effect of interactions between predictors was also evaluated.
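The two performance metrics can be illustrated with a short, self-contained sketch. It is not the ‘dismo’ implementation: the function names are assumptions, the AUC is computed with the rank-based (Mann-Whitney) formulation, the deviance percentage is interpreted here as the reduction in binomial deviance relative to a null model that predicts the overall prevalence, and the observed and predicted values are made up to stand in for predictions at withheld sites.

```python
import math


def auc(observed, predicted):
    """Probability that a random presence scores above a random absence
    (ties count as one half)."""
    pos = [p for o, p in zip(observed, predicted) if o == 1]
    neg = [p for o, p in zip(observed, predicted) if o == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binomial_deviance(observed, predicted, eps=1e-12):
    """-2 * log-likelihood for presence/absence data."""
    return -2.0 * sum(
        o * math.log(max(p, eps)) + (1 - o) * math.log(max(1.0 - p, eps))
        for o, p in zip(observed, predicted)
    )


def percent_deviance_explained(observed, predicted):
    """Deviance reduction relative to a prevalence-only null model, in %."""
    prevalence = sum(observed) / len(observed)
    null = binomial_deviance(observed, [prevalence] * len(observed))
    return 100.0 * (null - binomial_deviance(observed, predicted)) / null


def auc_category(value):
    """Verbal classes quoted in the text; a value of exactly 0.60 is not
    covered there and is treated as 'poor' here (an assumption)."""
    if value > 0.90:
        return "excellent"
    if value >= 0.80:
        return "very good"
    if value >= 0.70:
        return "fair"
    if value > 0.60:
        return "useful"
    return "poor"


# Made-up presences/absences and held-out predicted probabilities.
observed = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
model_auc = auc(observed, predicted)
model_dev = percent_deviance_explained(observed, predicted)
```

In this toy case eight of the nine presence-absence pairs are ranked correctly, so the AUC is 8/9 and falls in the "very good" class.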

Hierarchical partitioning

Hierarchical partitioning was used to determine the independent contribution of the explanatory variables to the occurrence of each of the taxa. The non-numeric variables (i.e. bottom cover, dominant substratum, aquatic and terrestrial vegetation) were excluded from analyses, leaving seven predictors. Data for conductivity, width, depth, elevation and slope were log transformed prior to analyses. Hierarchical partitioning was performed using the ‘hier.part’ package version 1.0-3[67], which was implemented using the R statistical package version 2.15.1[62]. Logistic regression was used to fit the models and log-likelihood was used as the goodness-of-fit measure. Hierarchical partitioning computes the increase in fit for all models containing a given variable, compared to an equivalent model without that variable. The average improvement in fit (i.e. reduction in deviance) across all possible models containing that predictor is then computed. This process results in the estimation of the independent contribution of each explanatory variable (Ii), and the joint contribution (Ji) resulting from correlation with other variables[50]. The relative independent contribution of each predictor (%Ii) can thus be determined. Following Pont et al.[38], a predictor with %Ii higher than 100/N (where N is the number of predictors) was considered to have high explanatory power. With N = 7, predictors with %Ii higher than 14.3% were therefore considered to be important. Randomisation tests which yield z-scores were used to determine statistical significance of the relative independent contributions based on an upper confidence limit of 0.95[50]. Following Chevan & Sutherland[7] and Mac Nally[6], the ratio |I/J| was also calculated. Values of this ratio below unity indicate high correlation among predictors.
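The averaging step of hierarchical partitioning can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the ‘hier.part’ code: the function name is invented, the goodness-of-fit values are a made-up lookup table of log-likelihoods for three predictors (in the study they would come from logistic regressions on seven predictors), and the increase in fit is averaged within each hierarchy level and then across levels, following the scheme of Chevan & Sutherland[7].

```python
from itertools import combinations


def hierarchical_partition(predictors, fit):
    """Return {predictor: (I, J)}.

    `fit` maps a frozenset of predictor names to a goodness-of-fit value
    (here a log-likelihood, so larger is better).
    """
    n = len(predictors)
    null_fit = fit(frozenset())
    out = {}
    for name in predictors:
        others = [p for p in predictors if p != name]
        level_means = []
        for k in range(n):  # subsets of the other predictors, size 0..n-1
            gains = [
                fit(frozenset(s) | {name}) - fit(frozenset(s))
                for s in combinations(others, k)
            ]
            level_means.append(sum(gains) / len(gains))
        independent = sum(level_means) / n
        # Joint contribution: single-predictor improvement minus I.
        joint = (fit(frozenset({name})) - null_fit) - independent
        out[name] = (independent, joint)
    return out


# Hypothetical log-likelihoods for every subset of three predictors.
TOY_LOGLIK = {
    frozenset(): -100.0,
    frozenset({"elevation"}): -80.0,
    frozenset({"slope"}): -85.0,
    frozenset({"width"}): -95.0,
    frozenset({"elevation", "slope"}): -75.0,
    frozenset({"elevation", "width"}): -76.0,
    frozenset({"slope", "width"}): -81.0,
    frozenset({"elevation", "slope", "width"}): -72.0,
}
names = ["elevation", "slope", "width"]
parts = hierarchical_partition(names, TOY_LOGLIK.__getitem__)

total_i = sum(i for i, _ in parts.values())
percent_i = {k: 100.0 * i / total_i for k, (i, _) in parts.items()}
threshold = 100.0 / len(names)  # 100/N rule; 14.3% when N = 7
important = sorted(k for k, pct in percent_i.items() if pct > threshold)
ratio_ij = {k: abs(i / j) for k, (i, j) in parts.items()}
```

The independent contributions sum to the improvement of the full model over the null model (28 log-likelihood units in this toy table), which is a useful check on any implementation.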


References

1. Dudgeon D, Arthington AH, Gessner MO, Kawabata Z-I, Knowler DJ, Lévêque C, Naiman RJ, Prieur-Richard A-H, Soto D, Stiassny MLJ, Sullivan CA: Freshwater biodiversity: importance, threats, status and conservation challenges. Biol Rev. 2006, 81: 163-182.
2. Holland RA, Darwall WRT, Smith KG: Conservation priorities for freshwater biodiversity: The Key Biodiversity Area approach refined and tested for continental Africa. Biol Conserv. 2012, 148: 167-179. 10.1016/j.biocon.2012.01.016.
3. Thacker CE, Unmack PJ, Matsui L, Rifenbark N: Comparative phylogeography of five sympatric Hypseleotris species (Teleostei: Eleotridae) in south-eastern Australia reveals a complex pattern of drainage basin exchanges with little congruence across species. J Biogeogr. 2007, 34: 1518-1533. 10.1111/j.1365-2699.2007.01711.x.
4. Burridge CP, Craw D, Jack DC, King TM, Waters JM: Does fish ecology predict dispersal across a river drainage divide?. Evolution. 2008, 62: 1484-1499. 10.1111/j.1558-5646.2008.00377.x.
5. Guisan A, Zimmermann NE: Predictive habitat distribution models in ecology. Ecol Model. 2000, 135: 147-186. 10.1016/S0304-3800(00)00354-9.
6. Mac Nally R: Regression and model-building in conservation biology, biogeography and ecology: The distinction between – and reconciliation of – ‘predictive’ and ‘explanatory’ models. Biodivers Conserv. 2000, 9: 655-671. 10.1023/A:1008985925162.
7. Chevan A, Sutherland M: Hierarchical partitioning. Am Stat. 1991, 45: 90-96.
8. Mac Nally R, Walsh CJ: Hierarchical partitioning public-domain software. Biodivers Conserv. 2004, 13: 659-660.
9. Borcard D, Legendre P, Drapeau P: Partialling out the spatial component of ecological variation. Ecology. 1992, 73: 1045-1055. 10.2307/1940179.
10. Elith J, Leathwick JR, Hastie T: A working guide to boosted regression trees. J Anim Ecol. 2008, 77: 802-813. 10.1111/j.1365-2656.2008.01390.x.
11. Heikkinen RK, Luoto M, Kuussaari M, Pöyry J: New insights into butterfly–environment relationships using partitioning methods. P Roy Soc B. 2005, 272: 2203-2210. 10.1098/rspb.2005.3212.
12. Leathwick JR, Elith J, Chadderton WL, Rowe D, Hastie T: Dispersal, disturbance and the contrasting biogeographies of New Zealand’s diadromous and non-diadromous fish species. J Biogeogr. 2008, 35: 1481-1497. 10.1111/j.1365-2699.2008.01887.x.
13. Buisson L, Blanc L, Grenouillet G: Modelling stream fish species distribution in a river network: the relative effects of temperature versus physical factors. Ecol Freshw Fish. 2008, 17: 244-257. 10.1111/j.1600-0633.2007.00276.x.
14. Taverny C, Lassalle G, Ortusi I, Roqueplo C, Lepage M, Lambert P: From shallow to deep waters: habitats used by larval lampreys (genus Petromyzon and Lampetra) over a western European basin. Ecol Freshw Fish. 2012, 21: 87-99. 10.1111/j.1600-0633.2011.00526.x.
15. Eglington SM, Pearce-Higgins JW: Disentangling the relative importance of changes in climate and land-use intensity in driving recent bird population trends. PLoS ONE. 2012, 7: 1-8.
16. Wishart MJ, Day JA: Endemism of the freshwater fauna of the south-western Cape, South Africa. Verh Internat Verein Limnol. 2002, 28: 1-5.
17. Thieme ML, Abell R, Stiassny MLJ, Skelton PH, Lehner B, Teugels GG, Dinerstein E, Kamden TA, Burgess N, Olson DM: Freshwater Ecoregions of Africa and Madagascar. 2005, Island Press.
18. Linder HP, Johnson SD, Kuhlmann M, Matthee CA, Nyffeler R, Swartz ER: Biotic diversity in the Southern African winter-rainfall region. Curr Opin Environ Sustain. 2010, 2: 109-116. 10.1016/j.cosust.2010.02.001.
19. Tweddle D, Bills R, Swartz E, Coetzer W, Da Costa L, Engelbrecht J, Cambray J, Marshall B, Impson D, Skelton PH: The status and distribution of freshwater fishes. In The Status and Distribution of Freshwater Biodiversity in Southern Africa. Edited by: Darwall WRT, Smith KG, Tweddle D, Skelton PH. 2009, Gland and Grahamstown: IUCN and South African Institute for Aquatic Biodiversity, 21-37.
20. Clark BM, Impson D, Rall J: Present status and historical changes in the fish fauna of the Berg River, South Africa. Trans Roy Soc S Afr. 2009, 64: 142-163. 10.1080/00359190909519249.
21. Skelton P, Weyl O: Fishes. In Alien and invasive animals: A South African perspective. Edited by: Picker M, Griffiths C. 2011, Cape Town: Struik Nature, 47-70.
22. Van Rensburg BJ, Weyl OLF, Davies SJ, Van Wilgen NJ, Spear D, Chimimba CT, Peacock F: Invasive vertebrates of South Africa. In Biological invasions: economic and environmental costs of alien plant, animal and microbe species. Edited by: Pimentel D. 2011, USA: CRC Press, 325-380. 2nd edition.
23. Nogués-Bravo D: Predicting the past distribution of species climatic niches. Global Ecol Biogeogr. 2009, 18: 521-531. 10.1111/j.1466-8238.2009.00476.x.
24. IUCN: Guidelines for Application of IUCN Red List Criteria at Regional Levels: Version 3.0. 2003, IUCN, Gland, Switzerland and Cambridge, UK: IUCN Species Survival Commission.
25. Xenopoulos MA, Lodge DM: Going with the flow: using species-discharge relationships to forecast losses in fish biodiversity. Ecology. 2006, 87: 1907-1914. 10.1890/0012-9658(2006)87[1907:GWTFUS]2.0.CO;2.
26. Filipe AF, Marques TA, Seabra S, Tiago P, Ribeiro F, Moreira Da Costa L, Cowx IG, Collares-Pereira MJ: Selection of priority areas for fish conservation in Guadiana river basin, Iberian Peninsula. Conserv Biol. 2004, 18: 189-200. 10.1111/j.1523-1739.2004.00620.x.
27. Fischer JR, Paukert CP: Habitat relationships with fish assemblages in minimally disturbed Great Plains regions. Ecol Freshw Fish. 2008, 17: 597-609. 10.1111/j.1600-0633.2008.00311.x.
28. Barnard KH: Revision of the indigenous freshwater fishes of the S.W. Cape Region. Ann S Afr Mus. 1943, 36: 101-263.
29. Skelton P: A complete guide to the freshwater fishes of Southern Africa. 2001, Cape Town: Struik Publishers.
30. Van Niekerk R: Phylogeography of the Cape Galaxias, Galaxias zebratus. 2004, South Africa: Department of Genetics, University of Pretoria, Unpublished MSc thesis.
31. Roos H: Genetic diversity in the anabantids Sandelia capensis and S. bainsii: A phylogeographic and phylogenetic investigation. 2004, South Africa: Department of Genetics, University of Pretoria, Unpublished MSc thesis.
32. Swartz ER, Skelton PH, Bloomer P: Phylogeny and biogeography of the genus Pseudobarbus (Cyprinidae): Shedding light on the drainage history of rivers associated with the Cape Floristic Region. Mol Phylogenet Evol. 2009, 51: 75-84. 10.1016/j.ympev.2008.10.017.
33. Wikramanayake ED: Ecomorphology and biogeography of a tropical stream fish assemblage: evolution of assemblage structure. Ecology. 1990, 71: 1756-1764. 10.2307/1937583.
34. Winemiller KO, Agostinho AA, Pellegrini-Caramaschi E: Fish ecology in tropical streams. In Tropical Stream Ecology. Edited by: Dudgeon D. 2008, London: Academic Press, 107-146.
35. Webb PW: Swimming. In The Physiology of Fishes. Edited by: Evans DH. 1998, Boca Raton, FL: CRC Press, 3-24. 2nd edition.
36. Webb PW: Body form, locomotion and foraging in aquatic vertebrates. Am Zool. 1984, 24: 107-120.
37. Videler JJ: Fish swimming. 1993, London, New York: Chapman and Hall.
38. Pont D, Hugueny B, Oberdorff T: Modelling habitat requirement of European fishes: do species have similar responses to local and regional environmental constraints?. Can J Fish Aquat Sci. 2005, 62: 163-173. 10.1139/f04-183.
39. Lipsey TSB, Hubert WA, Rahel FJ: Relationships of elevation, channel slope and stream width to occurrences of native fishes of the Great Plains-Rocky Mountains interface. J Freshwater Ecol. 2005, 20: 695-705.
40. Jaramillo-Villa U, Maldonado-Ocampo JA, Escobar F: Altitudinal variation in fish assemblage diversity in streams of the central Andes of Colombia. J Fish Biol. 2010, 76: 2401-2417.
41. Amadio CJ, Hubert W, Johnson K, Oberlie D, Dufek D: Factors affecting the occurrence of saugers in small, high-elevation rivers near the western edge of the species’ natural distribution. T Am Fish Soc. 2005, 134: 160-171. 10.1577/FT03-225.1.
42. Smokorowski KE, Pratt TC: Effect of a change in physical structure and cover on fish and fish habitat in freshwater ecosystems – a review and meta-analysis. Environ Rev. 2007, 15: 15-41. 10.1139/a06-007.
43. Gregory SV, Swanson FJ, McKee WA, Cummins KW: An ecosystem perspective on riparian zones. BioScience. 1991, 41: 540-551. 10.2307/1311607.
44. Allan JD: Stream Ecology: Structure and Function of Running Waters. 1995, London: Chapman & Hall.
45. Mills CA, Mann RHK: Environmentally-induced fluctuations in year-class strength and their implications for management. J Fish Biol. 1985, 27: 209-226. 10.1111/j.1095-8649.1985.tb03243.x.
46. Taniguchi Y, Rahel FJ, Novinger DC, Gerow KG: Temperature mediation of competitive interactions among three fish species that replace each other along longitudinal stream gradients. Can J Fish Aquat Sci. 1998, 55: 1894-1901.
47. Gillooly JF, Brown JH, West GB, Savage VM, Charnov EL: Effects of size and temperature on metabolic rate. Science. 2001, 293: 2248-2251.
48. Wolter C: Temperature influence on the fish assemblage structure in a large lowland river, the lower Oder River, Germany. Ecol Freshw Fish. 2007, 16: 493-503. 10.1111/j.1600-0633.2007.00237.x.
49. Angermeier PL, Karr JR: Fish communities along environmental gradients in a system of tropical streams. Environ Biol Fish. 1983, 9: 117-135. 10.1007/BF00690857.
50. Mac Nally R: Multiple regression and inference in ecology and conservation biology: further comments on identifying important predictor variables. Biodivers Conserv. 2002, 11: 1397-1401. 10.1023/A:1016250716679.
51. Impson D, Swartz E: Threatened fishes of the world: Barbus calidus Barnard, 1938 (Cyprinidae). Environ Biol Fish. 2002, 63: 340. 10.1023/A:1014377310994.
52. Muhlfeld CC, D’Angelo V, Kalinowski ST, Landguth EL, Downs CC, Tohtz J, Kershner JL: A fine-scale assessment of using barriers to conserve native stream salmonids: a case study in Akokala Creek, Glacier National Park, USA. The Open Fish Sci Journ. 2012, 5: 9-20. 10.2174/1874401X01205010009.
53. Maitland PS: The conservation of freshwater fish: past and present experience. Biol Conserv. 1995, 72: 259-270. 10.1016/0006-3207(94)00088-8.
54. Roberge J-M, Angelstam P: Usefulness of the umbrella species concept as a conservation tool. Conserv Biol. 2004, 18: 76-85. 10.1111/j.1523-1739.2004.00450.x.
55. Lamouroux N, Cattanéo F: Fish assemblages and stream hydraulics: consistent relations across spatial scales and regions. Riv Res Applic. 2006, 22: 727-737. 10.1002/rra.931.
56. Taylor CM, Holder TL, Fiorillo RA, Williams LR, Thomas RB, Warren ML: Distribution, abundance, and diversity of stream fishes under variable environmental conditions. Can J Fish Aquat Sci. 2006, 63: 43-54. 10.1139/f05-203.
57. Bond GW: A geological survey of the underground water supplies of the Union of South Africa. Mem. geol. Surv. Un. S. Afr. 1946, 41: 1-208.
58. Theron JN: Geological setting of the fynbos. In Fynbos palaeoecology: A preliminary synthesis. Edited by: Deacon HJ, Hendey QB, Lambrechts JJ. 1983, S Afr Nat Sci Prog Rep, 21-34.
59. Arend KK: Classification of streams and reaches. In Aquatic habitat assessment: common methods. Edited by: Bain MB, Stevenson NJ. 1999, American Fisheries Society, Bethesda, MD, 57-74.
60. Bain MB, Finn JT, Booke HE: Quantifying stream substrate for habitat analysis studies. N Am J Fish Manage. 1985, 5: 499-506. 10.1577/1548-8659(1985)5<499:QSSFHA>2.0.CO;2.
61. Gibson RJ, Hillier KG, Whalen RR: A comparison of three methods of estimating substrate coarseness in rivers. Fisheries Manag Ecol. 1998, 5: 323-329. 10.1046/j.1365-2400.1998.540323.x.
62. R Development Core Team: R: a language and environment for statistical computing. 2004, Vienna, Austria: R Foundation for Statistical Computing.
63. Elith J, Leathwick J: Boosted regression trees for ecological modelling.
64. Swets JA: Measuring the accuracy of diagnostic systems. Science. 1988, 240: 1285-1293. 10.1126/science.3287615.
65. Parisien MA, Moritz MA: Environmental controls on the distribution of wildfire at multiple spatial scales. Ecol Monogr. 2009, 79: 127-154. 10.1890/07-1289.1.
66. Lane JQ, Raimondi PT, Kudela RM: Development of a logistic regression model for the prediction of toxigenic Pseudo-nitzschia blooms in Monterey Bay, California. Mar Ecol Prog Ser. 2009, 383: 37-51.
67. Walsh C, Mac Nally R: Package ‘hier.part’. 2012.


We thank GG, WK, JM, DI and MJ for providing assistance with data collection during field surveys. GG also revised and helped with editing the manuscript. This research was supported by grants from the International Foundation for Science, The Rufford Small Grants for Nature Conservation, WWF Prince Bernhard Scholarship, the National Research Foundation (South Africa) and The Claude Leon Foundation. CapeNature is acknowledged for providing permission to undertake this research (permit number: AAA-004-000205-0035).

Author information



Corresponding author

Correspondence to Albert Chakona.

Additional information

Competing interests

The authors declare that there are no competing interests.

Authors' contributions

This work formed part of AC’s PhD research on the ecology and biogeography of endemic stream fishes from the Cape Floristic Region of South Africa. Both authors contributed to the conception, design and acquisition of data during field surveys. AC analysed the data and drafted the manuscript. Both authors revised and approved the final manuscript.

Albert Chakona and Ernst R Swartz contributed equally to this work.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Chakona, A., Swartz, E.R. Contrasting habitat associations of imperilled endemic stream fishes from a global biodiversity hot spot. BMC Ecol 12, 19 (2012).

  • Sandelia
  • Pseudobarbus
  • Galaxias
  • Habitat use
  • Breede river system
  • Cape floristic region
  • Boosted regression trees
  • Hierarchical partitioning