Measuring specialization in species interaction networks

Background Network analyses of plant-animal interactions hold valuable biological information. They are often used to quantify the degree of specialization between partners, but usually based on qualitative indices such as 'connectance' or number of links. These measures ignore interaction frequencies or sampling intensity, and strongly depend on network size. Results Here we introduce two quantitative indices using interaction frequencies to describe the degree of specialization, based on information theory. The first measure (d') describes the degree of interaction specialization at the species level, while the second measure (H2') characterizes the degree of specialization or partitioning among two parties in the entire network. Both indices are mathematically related and derived from Shannon entropy. The species-level index d' can be used to analyze variation within networks, while H2' as a network-level index is useful for comparisons across different interaction webs. Analyses of two published pollinator networks identified differences and features that have not been detected with previous approaches. For instance, plants and pollinators within a network differed in their average degree of specialization (weighted mean d'), and the correlation between specialization of pollinators and their relative abundance also differed between the webs. Rarefied sampling effort in both networks and null model simulations suggest that H2' is not affected by network size or sampling intensity. Conclusion Quantitative analyses reflect properties of interaction networks more appropriately than previous qualitative attempts, and are robust against variation in sampling intensity, network size and symmetry. These measures will improve our understanding of patterns of specialization within and across networks from a broad spectrum of biological interactions.


Background
The degree of specialization of plants or animals has been studied and debated extensively, and a continuum from complete specialization to full generalization can be found in various systems [1][2][3][4][5][6]. In general, two levels of specialization measures may be distinguished: first, the characterization of focal species and, second, the degree of specialization of an entire interaction network, representing an assemblage of species and their interaction partners (e.g. food webs, mutualistic networks, predator-prey relationships). When interactions are considered as an ecological niche, the first level describes the niche breadth of a species and the second level the degree of niche partitioning across species. While the species level is more straightforward in its biological interpretation, analyses at the network level can be useful for comparisons across different types of networks. Such analyses have been performed to compare plant-pollinator webs versus plant-seed disperser webs [4,5], different plant-pollinator networks along geographic gradients [1,7,8], or food webs of variable size [9,10]. Entire network analyses are also used to study patterns on a community level, such as coevolutionary adaptations [3] and ecosystem stability or resilience [11][12][13][14].

Quantifying specialization at the species level
Specialization or generalization of interactions is most commonly characterized by the number of partners (or 'links'), e.g. the number of pollinator species visiting a flowering plant species or the number of food plant families a herbivore feeds upon. In this qualitative approach, interactions between a consumer and a resource species are only scored in a binary way as 'present' or 'absent', ignoring any distinction between strong interactions and weak or occasional ones. For example, binary representations of interactions do not distinguish a scenario where 99% of the individuals of a herbivore species feed on a single plant species, with only an occasional individual found on another plant, from a different scenario where a herbivore regularly feeds on both food plants. The problem is analogous to the measurement of biodiversity either as crude species richness or as a more elaborate diversity index including relative abundances [15]. Several approaches have thus been used to directly include variation in interaction frequencies (i.e., their evenness) in characterizing the diversity of partners, e.g. Simpson's diversity index for pollinators [16,17] or Lloyd's index for host specificity [18]. Alternatively, other studies indirectly controlled for abundance or sampling intensity using rarefaction methods [13,19]. Correspondingly, Bersier and coworkers [20] have suggested quantifying the diversity of biomass flows in food webs using a Shannon diversity measure. Niche breadth theory provides several additional indices that include some measure of resource frequency or resource use intensity [21], which can be viewed in analogy to 'partner diversity' in the context of association networks. However, Hurlbert [22] emphasized that not only proportional utilization, but also the proportional availability of each niche should be taken into account. A species that uses all niches in the same proportion as their availability in the environment should be considered more opportunistic than a species that uses rare resources disproportionately more. If variation in resource availability is large, diversity-based measures that ignore this availability may be highly misleading [22,23]. Several niche breadth measures thus combine proportional resource utilization with proportional resource availability [22][23][24]. These concepts have rarely been applied in the context of species interaction networks, e.g. plant-pollinator webs, where binary data are more common than quantitative webs.

Quantifying specialization at the community level
The measurement used most commonly to characterize community-wide specialization is the 'connectance' index (C) [1,4,8,9,10,25,26,27]. C is defined as the proportion of the actually observed interactions to all possible interactions. Consider a contingency table showing the association between two parties, with r rows (e.g., plant species) and c columns (e.g., pollinators). Connectance is defined as C = I/(r·c), with I being the total number of non-zero elements in the matrix. Therefore, like the number of partners or links (L) described above, C uses only binary information and ignores interaction strength. C is directly related to the mean number of links ($\bar{L}$) of plant species or pollinator species as $C = \bar{L}_{\mathrm{plants}}/c = \bar{L}_{\mathrm{poll}}/r$. This measure, $\bar{L}$, has also been used to compare networks [1,3,7,8,28]. Recently, it has been suggested to use $\bar{L}$ instead of C to characterize networks [29]. However, note that comparisons across networks of different size (number of species) are problematic, since $\bar{L}$, unlike C, is not scaled according to the number of available partners (see also [2,10]). $\bar{L}$ in a small network may represent a larger proportion of available partners than the same value of $\bar{L}$ in a large network.
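The definitions above can be illustrated with a minimal Python sketch. The matrix `web` and the function names are hypothetical and serve only to show how C and the mean number of links per species follow from the count of non-zero cells.

```python
def count_links(matrix):
    """Number of non-zero cells (I) in an r x c interaction matrix."""
    return sum(1 for row in matrix for a in row if a > 0)

def connectance(matrix):
    """Connectance C = I / (r * c)."""
    r, c = len(matrix), len(matrix[0])
    return count_links(matrix) / (r * c)

def mean_links(matrix):
    """Mean number of links per row species and per column species."""
    r, c = len(matrix), len(matrix[0])
    links = count_links(matrix)
    return links / r, links / c

# Hypothetical visit counts: 3 plant species (rows) x 4 pollinator species (columns)
web = [
    [12, 0, 1, 0],
    [0, 5, 0, 0],
    [3, 0, 0, 7],
]
C = connectance(web)
L_plants, L_poll = mean_links(web)
```

In this sketch the single visit in the first row counts exactly as much as the twelve visits next to it, which is the loss of information that motivates frequency-based indices.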
Analyses based on binary data, both at the species and the community level, have obvious shortcomings, since they are highly dependent on sampling effort, on decisions about which species to include, and on the size of the investigated networks. Several authors thus emphasized the need to move beyond binary representations of interactions to quantitative measures involving some measure of interaction strength [4,20,27,29,30,31,32]. A way to at least partly overcome these deficiencies is to cut off all rare species or weak interactions below a frequency threshold [3,9,33,34] or to control for sampling effort in null models [7,8,13,19,25,35]. However, for interaction webs where more detailed information is available, simplification to binary data as in C or $\bar{L}$ remains unsatisfactory. Conveniently, the observed interaction frequency may represent a meaningful surrogate for interaction strength, at least in pollination and seed-dispersal systems as shown by Vázquez et al. [30] (see also [16]). Incorporating interaction frequency, or even a direct measure of interaction strength, into a network measure of specialization would thus provide the important progress frequently called for.
A severe additional problem of connectance is that its lower and upper constraints are not scale-invariant [25], which limits its use for comparisons across networks. The minimum possible value (C min ) to maintain at least one link per species declines in a hyperbolic function with the number of interacting species, since C min = max(r, c)/(r·c), and an upper limit (C max ) may be constrained by, or a function of, total sampling effort. Across networks, C decays strongly with network size, which has been debated in detail in the context of food web analysis [9,10,26,27,36,37]. The strong relationship between C and network size generates a problem for disentangling any biologically meaningful effect from this mathematically inherent scale dependence. For instance, network comparisons may focus on residual variation in C after an average effect of network size has been controlled for [1,4], or C could be rescaled to account for this size effect (see [25,36]). For natural networks of similar size, the range of actual C values is typically very narrow [4], thus other structural forces may be poorly detectable.
The objective of this paper is to develop and discuss specialization measures that are based on frequency data and thus account for sampling intensity, and that overcome the problem of scale dependence. We then test these approaches by evaluating the effect of sampling effort and scale dependence on a published natural pollination network, and on randomly generated associations as a null model. We differentiate between species-level measures of specialization, useful to investigate variability among species within a web, and a single network-wide measure that can be used for comparisons across networks.

Patterns in two pollinator networks
Two selected plant-pollinator networks (British meadows studied by Memmott [32], Argentinean forests studied by Vázquez and Simberloff [33]) differ markedly in their degree of specialization when quantitative analyses are applied. The qualitative network index, connectance, is similar in both interaction webs (British web: C = 0.15, Argentinean web: C = 0.13). However, frequencies of pollinator visits are much more evenly distributed in the British community than in the Argentinean example. In the British web, the interaction between a dipteran species and Leontodon hispidus was the most frequent one, representing 6% of the total 2183 interactions observed.  1A). In contrast, most pollinators in the Argentinean web are moderately generalized to specialized, with the second highest level of specialization found in the most common species (Fig. 1B). Consequently, the weighted mean degree of specialization is much lower in the former web (<d' poll > = 0.16) than in the latter (<d' poll > = 0.54). The relationship between specialization of species i (d' i ) and its interaction frequency (A i ) across the pol-

Simulation of sampling effort
In order to test whether specialization estimates are dependent on sampling and scale effects, we simulated a decreased sampling intensity in both networks using rarefaction (see Methods: Simulation of sampling effort and matrix architecture). In both networks, H2' is robust and already very well estimated by a small fraction of the interactions sampled (Fig. 2). The coefficient of variation of H2' remains below 5% from about half of the total number of visits onwards in the British web and even at one-tenth of the total sampling effort in the Argentinean web. The estimate of connectance (C) is also relatively stable, at least in the Argentinean web, although it shows a positive trend across sampling effort in the British web (Fig. 2). These findings suggest that network-wide measures of specialization, particularly H2', do not necessarily require a very large or even complete association matrix, but can also be very well estimated from a smaller representative subset as long as there is no systematic sampling bias.

Null model patterns
The degree of specialization can be further characterized by comparison with a null model. The null model used here is that each species has a fixed total number of interactions (given by the observed association matrix), but interactions are assigned randomly. In the above pollinator networks, random associations yield a specialization index H2' that remains close to zero for almost the entire range of sampling intensity, while connectance (C) shows a positive trend over the total number of interactions (m) (Fig. 2). Therefore, H2' derived from real networks may typically be clearly distinguished from this null model, while the comparison of C is complicated by scale dependence and the relatively large values yielded by the null model.
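One simple way to generate such random associations is sketched below in Python. It is an illustrative assumption, not necessarily the randomization algorithm used by the authors: every observed interaction is listed once, and the column partners are shuffled against the row partners, which keeps all row and column totals fixed. The matrix `web` and the function name are hypothetical.

```python
import random

def random_association(matrix, rng):
    """Return a random matrix with the same row and column totals as `matrix`."""
    rows, cols = [], []
    for i, row in enumerate(matrix):
        for j, a in enumerate(row):
            rows.extend([i] * a)  # one entry per observed interaction
            cols.extend([j] * a)
    rng.shuffle(cols)             # re-pair row and column partners at random
    null = [[0] * len(matrix[0]) for _ in matrix]
    for i, j in zip(rows, cols):
        null[i][j] += 1
    return null

# Hypothetical visit counts (3 plants x 4 pollinators), integer frequencies assumed
web = [
    [12, 0, 1, 0],
    [0, 5, 0, 0],
    [3, 0, 0, 7],
]
null = random_association(web, random.Random(1))
```

Any index of interest can then be computed on `null` in the same way as on the observed matrix and the two values compared.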

Figure 1. Patterns within pollinator networks. Frequency distribution of the species-level specialization index (d') for pollinators and plants from two published networks, one from Britain [32] and one from Argentina [33]. Bars show the number of individuals in each category (label '0' defines 0.00 ≤ d' < 0.05, etc.). Bars are separated for different species, and total number of species in each category is given on top. Arrows indicate cases where bars are invisible due to low numbers of individuals.
Simulations of artificially generated random associations (see Methods: Simulation of sampling effort and matrix architecture) confirm that the network-level specialization index H2' is largely unaffected by network size (Fig. 3A), network architecture (Fig. 3B) or total number of interactions (m) for a fixed matrix size (Fig. 3C). For random associations as shown here, H2' is usually close to zero.
Connectance values (C) of random matrices show the known hyperbolic function over the number of associated species (Fig. 3A), change with matrix asymmetry (Fig. 3B) and increase strongly with increasing m (Fig. 3C). For specialization measures at the species level, the average number of links per species ($\bar{L}$) increases strongly with network size, number of available partners, and m (Fig. 3).
While other niche breadth measures may also show some variation across different network scales (not shown), the weighted mean Kullback-Leibler distance <d'> is hardly affected by network size, network asymmetry, and number of interactions (Fig. 3). Both H2' and d' may thus be appropriate for comparisons across matrices of different scale.

Properties of specialization measures
The suggested indices, d' and H2', quantify the degree of specialization of elements within an interaction network and of the entire network, respectively. While the number of links (L) and connectance (C) represent species-level and community-level measures of interactions based on binary data, respectively, d' and H2' represent corresponding measures for frequency-based data. The need to include information on interaction strength or interaction frequency in network analyses has been emphasized by various authors [4,20,27,30,31,38]. Parallel to earlier advances in diversity measures compared to species richness, quantitative network measures account for the heterogeneity in link strength rather than assigning equal weights to every link. Moreover, we have shown that d' and H2' are largely robust against variation in matrix size, shape, and sampling effort. In several cases, C may be strongly affected by sampling effort [25,27], while H2' remained largely unchanged in simulations of random associations over a range of network sizes, variable network asymmetries, and number of interactions. This scale invariance suggests that both d' and H2' can be used directly for comparisons across different networks, while comparisons of L and C are more problematic [1,35].
Figure 2. Sampling effect in pollinator networks. Rarefaction of sampling effort in a British and an Argentinean pollination web [32,33]. Two network-level measures of specialization, the frequency-based specialization index (H2') and the 'connectance' index (C), are shown for networks in which the total number of interactions (m) has been reduced by randomly deleting interactions. Black dots show the effect of sampling effort for the original association matrix, gray dots the effect for a null model, i.e. five networks in which partners were randomly associated (same row and column totals as in the original matrix).
[Figure: Simulated random networks. Axis label: Matrix size (number of rows).]
Quantitative methods like the indices suggested here also allow a more detailed analysis of interaction patterns within and across networks. Fruitful areas include comparisons of networks across different interaction types [4], biogeographical gradients [1], biodiversity and land use gradients [13], robustness of networks against extinction risks [39], asymmetries between plants and animals [38], and relationships between specialization and abundance [35]. While a comparison of the average number of partners between plants versus animals is solely dependent on the matrix architecture (i.e., the number of rows r versus columns c, since $\bar{L}_{\mathrm{plants}} = c \cdot C$ and $\bar{L}_{\mathrm{poll}} = r \cdot C$), this limitation does not apply to d'. In the two selected pollinator webs, plants are either similarly or more specialized than pollinators with regard to weighted mean d'. The index d' thus allows a scale-independent evaluation of asymmetries in the degree of specialization between partners (see also [38]). Moreover, Vázquez and Aizen [35] noted that the number of links of a species (L_i) is strongly positively correlated with its overall frequency (A_i) in five pollination networks including the datasets analyzed above. They argued that this apparent higher generalization of common plants and common pollinators may be largely explained by null models, calling for an improved measurement of specialization. Our results for the correlation between d'_i and A_i in two pollinator webs suggest that the relationship between specialization and abundance may be more variable, and even positive as in the Argentinean network.

Caveats
Some problems apply to any measure used in network analysis, including the proposed indices. Measures of specialization mostly ignore phylogenetic relationships or ecological similarity within an association matrix. For example, a plant species that is pollinated by multiple moth species may be unsuitably regarded as more generalized than a plant pollinated by few insect species comprising several different orders [40]. In addition, the fact that herbivores are commonly specialized on host plant families rather than species may skew network patterns if not carefully accounted for. A first approach to investigate such effects may be to compare the level of specialization after a stepwise reduction of the matrix by pooling species to higher taxonomic units, such as genera, families, and orders. For known phylogenies, more advanced techniques for analyses with a particular evolutionary focus are available [41][42][43]. Another deficiency may be that species or their partners are all given the same individual 'weight' in the analyses, whether they are small bees or large bats visiting a small herb with little nectar or a mass-flowering tree. Null models as in the calculation for both C and H2' imply that all individuals can be shifted around between resources in the same way, irrespective of their size or non-fitting parameters. The role of 'forbidden links' as constraints to network analyses has been discussed elsewhere [44,45]. Similarly, calculations of d' or other niche breadth measures are based on the implicit assumption that each species adjusts its interactions according to the availability of partners (niches), irrespective of morphological or behavioral constraints. Moreover, if data are collected from a large heterogeneous habitat or over a prolonged time period, calculations of the degree of specialization may be severely constrained by the spatiotemporal overlap or non-overlap between partners for reasons other than resource preferences, e.g. when not all species are able to reach all sites in the same way, or when some resources and consumers have asynchronous phenologies. Consequently, network analyses as suggested here will be most useful to study resource-consumer partitioning within a short time frame and limited spatial scale.
For both indices d' and H2', we proposed above to use the total number of interactions for each species as a measure of partner availability (q_j) and as a constraint for standardization (fixed row and column totals). It may be debated whether independent measures of plant and animal abundances could be more appropriate than using interaction frequency data as such. However, apart from the fact that such abundance data barely exist for most networks, note that the actual number of interactions often reflects resource availability and consumer activity more suitably than an independent measure of species abundance. For instance, a flower of one species may have a much higher nectar production than another and consequently receive a higher number of visitors, while the local abundance of the plant species does not reflect such differences in resource quality and/or quantity. Both d' and H2' thus focus on the actual partitioning between the interacting species. In studies where detailed knowledge or theoretical assumptions about resources (availability and quality) or consumers (activity density and consumption rate) are available or under experimental control, such data may be incorporated into the analysis (defining q_j and constraints) instead of interaction frequencies. The constraint of fixed row and column totals has been debated elsewhere in the context of species co-occurrence patterns, where it was found to be most appropriate in null model comparisons, although critics have argued earlier that these marginals themselves may already reflect competitive interactions ([46] and references therein). Any approach to compare networks based on fixed marginals for standardization will fail to detect potentially meaningful patterns displayed by these architectural features, namely the number of resource and consumer species and the heterogeneity of total interaction frequencies. This network architecture may already be shaped by past competitive interactions or indicate fundamental constraints, a largely unexplored hypothesis that merits additional investigation.
It should also be emphasized that analyses of frequency data may be susceptible to pseudoreplication through repeated associations of the same individuals or close associations derived from a single dispersal event (e.g. a social insect colony, aggregating individuals, multiple offspring from a single egg cluster, or monospecific plant clusters). These may lead to an overestimation of specialization. To be more meaningful on a population level, frequency analyses should thus be based on spatially independent association replicates. Note that all species-wise specialization measures such as d' are sensitive to the behavior of the other species. Any systematic sampling bias (e.g. a taxonomic focus within a guild) will therefore affect the conclusions of comparisons within or across networks.

Conclusion
In accordance with previous calls [4,20,27,30,31,38], we suggest that the explicit inclusion of frequency data represents an important step forward in network analyses, as too many assumptions are implicit in any measure based on binary representation. Most notably, connectance and 'number of partners' imply an equal availability of all partners, an unlikely scenario. Qualitative indices are not robust against sampling effort. In contrast, the proposed quantitative measures based on interaction frequencies explicitly account for this source of variation.
Our study suggests that d' and H2' represent scale-independent and meaningful indices to characterize specialization on the level of single species and the entire network, respectively. These novel indices allow us to investigate patterns within and across networks that have not been detected with qualitative measures, such as correlations with species frequencies, network size, and asymmetries in specialization between partners. Recently, Bascompte et al. [38] showed that the incorporation of frequency data may unveil pervasive asymmetries within networks. Particularly since Vázquez et al. [30] demonstrated that interaction frequencies in plant-pollinator and plant-seed disperser systems often correlate with the magnitude of mutualistic services for the plant (although variation in pollinator effectiveness can be important, see [47]), an increased collection of frequency data and appropriate quantitative analyses would greatly benefit future network studies.

Species-level index
As a species-level measure of 'partner diversity', we propose the Kullback-Leibler distance (or Kullback-Leibler divergence, relative entropy) in a standardized form (d'). Coming from information theory, this index quantifies the difference between two probability distributions [48]. While the standardized Hurlbert's and Smith's measures of niche breadth could be used alternatively [21,22,24], d' has some advantages in the context of networks. While all three indices regard an exclusive pairing between two species as a high degree of specialization as long as interactions between the two partners are infrequent, Hurlbert's and Smith's indices show an undesired trend towards full generalization when the number of interactions between the two partners increases, although this should be considered a stronger indication of specialization (see below, Properties of alternative niche breadth measures). The interaction between two parties is commonly displayed in an r × c contingency table, with r rows representing one party such as flowering plant species, and c columns representing the other party such as pollinator species. In each cell, the frequency of interaction between plant species i and pollinator species j (or another useful measure of interaction strength) is given as a_ij (Table 1).
Instead of frequencies (a_ij), each interaction can be assigned a proportion of the total (m) as $p_{ij} = a_{ij}/m$, where $\sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij} = 1$.
Let p'_ij be the proportion of the number of interactions (a_ij) in relation to the respective row total (A_i), and q_j the proportion of all interactions by partner j in relation to the total number of interactions (m). Thus,
$$p'_{ij} = \frac{a_{ij}}{A_i}, \qquad \sum_{j=1}^{c} p'_{ij} = 1, \qquad q_j = \frac{A_j}{m}, \qquad \sum_{j=1}^{c} q_j = 1.$$
To quantify the specialization of a species i, the following index d_i is suggested. This d_i is related to Shannon diversity, similar to an index recently suggested to characterize biomass flow diversity in food webs [20]. However, an appropriate index in this context should not only consider the diversity of partners, but also their respective availability (see [22]). Consequently, the following index compares the distribution of the interactions with each partner (p'_ij) to the overall partner availability (q_j). The Kullback-Leibler distance for species i is denoted as
$$d_i = \sum_{j=1}^{c} p'_{ij} \ln\frac{p'_{ij}}{q_j},$$
which can be standardized as
$$d'_i = \frac{d_i - d_{\min}}{d_{\max} - d_{\min}}.$$
The theoretical maximum is given by d_max = ln(m/A_i), and the theoretical minimum (d_min) is zero for the special case where all p'_ij = q_j. However, a realistic d_min may be constrained at some value above zero given that p'_ij and q_j are calculated from discrete integer values (a_ij). To take this into account, d_min is more suitably computed algorithmically as in a program available from the authors and online [49], providing all d' for a given matrix. This standardized Kullback-Leibler distance (d') ranges from 0 for the most generalized to 1.0 for the most specialized case. Thus, d' can be interpreted as the deviation of the actual interaction frequencies from a null model which assumes that all partners are used in proportion to their availability. An average degree of specialization among the species of a party can be presented as a weighted mean of the standardized index, e.g. <d'_j> for pollinators (columns j) as
$$\langle d'_j \rangle = \frac{1}{m}\sum_{j=1}^{c} A_j \, d'_j.$$
While <d'_i> usually differs from <d'_j>, the weighted means of the non-standardized Kullback-Leibler distances are the same for both parties, hence <d_i> = <d_j>.
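The species-level index can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program [49]: d_min is set to its theoretical value of zero rather than the integer-constrained minimum, every row is assumed to contain at least one interaction, and the matrix must have more than one interacting row (otherwise d_max is zero). The matrix `web` and the function name are hypothetical.

```python
import math

def species_specialization(matrix):
    """Return lists (d, d_prime) for the row species of an interaction matrix."""
    m = sum(sum(row) for row in matrix)
    q = [sum(col) / m for col in zip(*matrix)]   # partner availability q_j = A_j / m
    d, d_prime = [], []
    for row in matrix:
        A_i = sum(row)
        # Kullback-Leibler distance between p'_ij = a_ij / A_i and q_j
        d_i = sum((a / A_i) * math.log((a / A_i) / q[j])
                  for j, a in enumerate(row) if a > 0)
        d_max = math.log(m / A_i)
        d_min = 0.0  # assumption: theoretical minimum, not the algorithmic one
        d.append(d_i)
        d_prime.append((d_i - d_min) / (d_max - d_min))
    return d, d_prime

# Hypothetical counts: rows are focal species, columns are their partners
web = [
    [10, 0],   # uses only partner 0, which other species also use
    [0, 30],   # uses only partner 1
    [5, 15],   # uses both partners in proportion to their availability
]
d, d_prime = species_specialization(web)

exclusive = [[5, 0], [0, 7]]   # each species has a partner of its own
_, d_prime_exclusive = species_specialization(exclusive)
```

In the first matrix the third species receives d' = 0 because its use of partners matches their availability exactly, while in the second matrix both species reach the maximum because each partner is used by one species only.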

Network-level index
The following network-wide measure is based on the bipartite representation of a two-mode network of interactions, such as plant-animal or other resource-consumer interactions, where members of each party interact with members of the other party but not among themselves (unlike many food webs). The two-dimensional Shannon entropy (termed H2 in order to avoid confusion with the common one-dimensional H) is obtained as
$$H_2 = -\sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij} \ln p_{ij}.$$
H2 decreases with higher specialization. This measure is closely related to the weighted mean of the non-standardized Kullback-Leibler distance of all species, since <d_i> = H2max - H2 (see below, Relationship between d i and H 2 ). H2 can be standardized to a scale between 0 and 1.0 when its minimum and maximum values (H2min and H2max), corresponding to extreme specialization and extreme generalization, respectively, are known. H2min and H2max can be calculated for given constraints. The constraints used here are the maintenance of the total number of interactions of each species, thus all row and column totals, A_i and A_j, being fixed (see also [46]). Alternative constraints may be defined depending on the knowledge of the system studied. The theoretical maximum (H2max) is reached when each p_ij equals the product of its marginal proportions, q_i·q_j (with q_i = A_i/m and q_j = A_j/m), so that
$$H_{2\max} = -\sum_{i=1}^{r}\sum_{j=1}^{c} q_i q_j \ln (q_i q_j),$$
while its theoretical minimum (H2min) may be close to zero depending on the matrix architecture. As for d_min above, H2max and H2min are constrained by the fact that they are derived from integer values. A program implementing a heuristic solution to obtain H2max and H2min, and to perform the entire analysis, is available from the authors or online [49].
The degree of specialization is obtained as a standardized entropy on a scale between H2min and H2max as
$$H_2' = \frac{H_{2\max} - H_2}{H_{2\max} - H_{2\min}}.$$
Consequently, H2' ranges between 0 and 1.0 for extreme generalization and specialization, respectively.
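The standardization can be sketched in Python as follows. This is an illustration under explicit assumptions and not the heuristic implemented in the authors' program [49]: H2max is taken as the entropy of the products of the marginal proportions (ignoring the integer constraint), and H2min is approximated by a simple greedy allocation that packs interactions into as few cells as possible. The greedy value is only an upper bound on the true minimum, so H2' may slightly exceed 1 for some matrices. All function names are hypothetical.

```python
import math

def entropy(matrix):
    """Two-dimensional Shannon entropy H2 of an interaction matrix."""
    m = sum(sum(row) for row in matrix)
    return -sum((a / m) * math.log(a / m) for row in matrix for a in row if a > 0)

def h2_max(matrix):
    """Entropy when every p_ij equals q_i * q_j (sum of the marginal entropies)."""
    m = sum(sum(row) for row in matrix)
    totals = [sum(r) for r in matrix] + [sum(c) for c in zip(*matrix)]
    return -sum((A / m) * math.log(A / m) for A in totals if A > 0)

def h2_min_greedy(matrix):
    """Assumed greedy heuristic: low-entropy matrix with the same marginals."""
    rows = [sum(r) for r in matrix]
    cols = [sum(c) for c in zip(*matrix)]
    packed = [[0] * len(cols) for _ in rows]
    while sum(rows) > 0:
        i = max(range(len(rows)), key=rows.__getitem__)  # largest remaining row total
        j = max(range(len(cols)), key=cols.__getitem__)  # largest remaining column total
        a = min(rows[i], cols[j])
        packed[i][j] += a
        rows[i] -= a
        cols[j] -= a
    return entropy(packed)

def h2_prime(matrix):
    """Standardized specialization (H2max - H2) / (H2max - H2min)."""
    h_max, h_min = h2_max(matrix), h2_min_greedy(matrix)
    return (h_max - entropy(matrix)) / (h_max - h_min)

specialized = [[10, 0], [0, 20]]   # every species has an exclusive partner
generalized = [[5, 5], [5, 5]]     # interactions proportional to the marginals
h_spec = h2_prime(specialized)
h_gen = h2_prime(generalized)
```

The two toy matrices mark the ends of the scale: the diagonal matrix yields H2' = 1 and the uniform matrix yields H2' = 0, up to floating-point rounding.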

Comparison with random associations
H2 can be tested against a null model of random associations (H2ran). A number of random permutations of the matrix can be performed using an r × c randomization algorithm (also available at [49]). Because lower entropy indicates higher specialization, the probability (p-value) that the observed H2 is more specialized than expected by random associations is simply given as the proportion of values obtained for H2ran that are equal to or smaller than H2, a common procedure in randomization statistics [25,50]. H2ran is usually only slightly smaller than H2max. Previously, permutations of r × c contingency tables often used a different test statistic instead of H2 [25,51,52]:
$$T = \sum_{i=1}^{r}\sum_{j=1}^{c} a_{ij} \ln a_{ij}.$$
The relationship between T and H2 is described by a constant, the total number of interactions (m), as T = m·ln m - m·H2. Consequently, both methods yield exactly the same p-values.
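The stated relationship between T and H2 can be verified in one step. The derivation below assumes that the statistic has the form $T = \sum_i \sum_j a_{ij} \ln a_{ij}$, which is the form implied by the relation given in the text, and uses $p_{ij} = a_{ij}/m$:

```latex
\begin{align}
H_2 &= -\sum_{i=1}^{r}\sum_{j=1}^{c} \frac{a_{ij}}{m}\,\ln\frac{a_{ij}}{m}
     = -\frac{1}{m}\sum_{i=1}^{r}\sum_{j=1}^{c} a_{ij}\ln a_{ij}
       + \frac{\ln m}{m}\sum_{i=1}^{r}\sum_{j=1}^{c} a_{ij}
     = \ln m - \frac{T}{m}, \\
T &= m \ln m - m\,H_2 .
\end{align}
```

Since m is identical for the observed matrix and for every permutation, T is a strictly decreasing linear function of H2. Ranking the permuted matrices by T therefore gives the reverse of the ranking by H2, and the resulting p-values coincide.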

Relationship between d i and H 2
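In the following we derive the relationship between the individual levels of specialization (d_i) and the community level (H2). The non-standardized Kullback-Leibler distance for row i can be rewritten as
$$d_i = \sum_{j=1}^{c} p'_{ij} \ln\frac{p'_{ij}}{q_j} = \sum_{j=1}^{c} \frac{p_{ij}}{q_i} \ln\frac{p_{ij}}{q_i q_j},$$
because $p'_{ij} = a_{ij}/A_i = p_{ij}/q_i$, with $q_i = A_i/m$. The weighted mean over all rows is therefore
$$\langle d_i \rangle = \sum_{i=1}^{r} q_i d_i = \sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij} \ln p_{ij} - \sum_{i=1}^{r} q_i \ln q_i - \sum_{j=1}^{c} q_j \ln q_j = H_{2\max} - H_2,$$
where $H_{2\max} = -\sum_i q_i \ln q_i - \sum_j q_j \ln q_j$ is the entropy obtained when every $p_{ij} = q_i q_j$.
The same calculation applies for <d_j>, thus <d_i> = <d_j>. Consequently, the degree of specialization of the entire network (corresponding to the deviation of the network-wide entropy from its maximum value) equals the weighted sum of the specialization of its elements (species).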
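The identity can also be checked numerically. The following Python sketch uses a hypothetical matrix without empty rows or columns and computes the weighted mean of the non-standardized distances for both parties, taking H2max as the entropy of the products of the marginal proportions (an assumption that ignores the integer constraint).

```python
import math

def weighted_mean_d(matrix):
    """Weighted mean <d_i> of non-standardized Kullback-Leibler distances (rows)."""
    m = sum(sum(row) for row in matrix)
    q_cols = [sum(col) / m for col in zip(*matrix)]
    total = 0.0
    for row in matrix:
        A_i = sum(row)
        d_i = sum((a / A_i) * math.log((a / A_i) / q_cols[j])
                  for j, a in enumerate(row) if a > 0)
        total += (A_i / m) * d_i   # weight each species by its share of interactions
    return total

def h2(matrix):
    """Two-dimensional Shannon entropy of the matrix."""
    m = sum(sum(row) for row in matrix)
    return -sum((a / m) * math.log(a / m) for row in matrix for a in row if a > 0)

def h2_max(matrix):
    """Entropy for p_ij = q_i * q_j, i.e. the sum of both marginal entropies."""
    m = sum(sum(row) for row in matrix)
    totals = [sum(r) for r in matrix] + [sum(c) for c in zip(*matrix)]
    return -sum((A / m) * math.log(A / m) for A in totals if A > 0)

# Hypothetical visit counts (3 plants x 4 pollinators)
web = [
    [12, 0, 1, 0],
    [0, 5, 0, 0],
    [3, 0, 0, 7],
]
transposed = [list(col) for col in zip(*web)]
mean_rows = weighted_mean_d(web)         # <d_i>, plants
mean_cols = weighted_mean_d(transposed)  # <d_j>, pollinators
deviation = h2_max(web) - h2(web)        # H2max - H2
```

All three quantities agree up to floating-point rounding, although the individual d values of plants and pollinators differ.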

Properties of alternative niche breadth measures
The standardized Hurlbert's (B') and Smith's (FT) measures can be applied widely for niche breadth analysis [21,22,24]. In this context, the Kullback-Leibler distance (d) can be viewed as a modified Shannon-Wiener measure of niche breadth that accounts for niche availabilities. Like the Kullback-Leibler distance, both B' and FT compare the proportional distribution of individuals (p) to the proportional resource availability (q) (here: partner availability). For a certain species i, the two measures are in our notation:
$$B'_i = \frac{\dfrac{1}{\sum_{j=1}^{c} (p'_{ij})^2 / q_j} - q_{\min}}{1 - q_{\min}}, \qquad FT_i = \sum_{j=1}^{c} \sqrt{p'_{ij}\, q_j},$$
where q_min = min(q_1, q_2, ... q_c). Each p'_ij is the proportion of the number of interactions in relation to the respective row total, and q_j is the proportion of all interactions by partner j in relation to the total number of interactions (m). Both the standardized Hurlbert's (B') and Smith's (FT) measures range between 0 for the most specialized case and 1.0 for extreme generalization (broadest niche). In the context of niche breadth, it has been shown that the Shannon-Wiener measure is most sensitive, while Hurlbert's and particularly Smith's measure are less sensitive to the selection of rare resources [21] (see also [20]).
For the application in network analyses, however, both B' and FT may show some undesired properties. Generally, B', FT and d' are reasonably well correlated with each other across the species within a network (e.g., r_s = -0.49 between d' and B', and r_s = -0.36 between d' and FT for the 90 pollinators in the network of Vázquez and Simberloff [33], both p < 0.001). However, differences from d' are substantial when a highly specialized species interacts almost exclusively with a specialized partner, e.g. a specialized pollinator with a plant that is pollinated almost exclusively by this pollinator. Imagine a scenario where one exclusive interaction occurs between a plant species and a pollinator species in a 3 × 3 matrix (Table 2). If the interaction between pollinator sp. 3 and plant sp. 3 is infrequent (e.g. a_33 = 1), all indices show a high degree of specialization (d' = 1.0, B' = 0, FT = 0.14) for both partners. However, as the number of exclusive interactions (a_33) increases, the values of both B' and FT for pollinator sp. 3 and plant sp. 3 shift towards generalization, a highly undesired change given that a higher a_33 is intuitively considered as extreme specialization (e.g., for a_33 = 50 the values for pollinator sp. 3 are B' = 0.31 and FT = 0.70), while only d' remains unaffected (d' = 1.0). FT is always larger than zero, and B' becomes larger than zero when the specialists interact more frequently than one of the other partners, that is, when q_j > min(q_1, q_2, ... q_c). Both FT and B' approach a value of 1.0 (maximum generalization) for very large a_33. This undesired effect of FT and B' is not restricted to completely exclusive interactions between two partners.
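The drift of B' and FT towards generalization can be illustrated with a short sketch. It assumes the standardized Hurlbert measure B' = (B - q_min)/(1 - q_min) with B = 1/Σ(p'_ij²/q_j) and Smith's FT = Σ√(p'_ij·q_j). The 3 × 3 matrix is a hypothetical stand-in, since the entries of Table 2 are not reproduced here, so the values are not expected to match the quoted ones exactly.

```python
import numpy as np

def hurlbert_b_std(p_cond, q):
    """Standardized Hurlbert's B' for one species."""
    b = 1.0 / np.sum(p_cond ** 2 / q)
    return float((b - q.min()) / (1.0 - q.min()))

def smith_ft(p_cond, q):
    """Smith's FT for one species."""
    return float(np.sum(np.sqrt(p_cond * q)))

def indices_for_row(a, i):
    """Return (B', FT) for the species in row i of count matrix a."""
    a = np.asarray(a, dtype=float)
    q = a.sum(axis=0) / a.sum()      # partner availability q_j
    p_cond = a[i] / a[i].sum()       # p'_ij for row i
    return hurlbert_b_std(p_cond, q), smith_ft(p_cond, q)

def scenario(a33):
    """Hypothetical matrix: species 3 of each party interact exclusively."""
    return np.array([[13, 13, 0],
                     [13, 13, 0],
                     [0, 0, a33]])

# B' and FT of pollinator sp. 3 (row index 2) for increasing a_33.
results = {a33: indices_for_row(scenario(a33), 2) for a33 in (1, 10, 50, 500)}
for a33, (b_std, ft) in results.items():
    print(f"a33={a33:4d}  B'={b_std:.2f}  FT={ft:.2f}")
```

In this construction the interaction pattern of species 3 never changes. Only its frequency does, so any increase in B' or FT reflects the growing availability q_3 rather than a broader niche.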

Simulation of sampling effort and matrix architecture
Two published plant-pollinator networks were selected to investigate the behavior of different specialization measures [32,33]. Both articles use their observed interaction matrices as a model to discuss network properties based on the number of links per pollinator or plant species, allowing a comparison of the conclusions drawn. The two networks are comparable because both comprise relatively large datasets from temperate ecosystems and report interaction frequencies between plants and their floral visitors: the British meadow community studied by Memmott [32] involved 79 pollinator and 25 plant species (2183 pollinator visits observed); the forests in Argentina studied by Vázquez and Simberloff [33] involved 90 pollinator and 14 plant species (5285 visits). The datasets can be obtained from the Interaction Web Database [53]. We simulated a decreased sampling intensity in both networks using a rarefaction method in order to investigate how sampling effort affects the estimation of specialization indices. Real association matrices were reduced by randomly extracting interactions, e.g. from the total of m = 2183 visits in Memmott's web down to m = 5 visits (in steps of five, repeated ten times for each m).
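The rarefaction step can be sketched as follows. This is an illustrative reading of the procedure, assuming that every observed visit is equally likely to be retained and that visits are drawn without replacement. The small matrix and the function names are hypothetical and stand in for the published datasets.

```python
import numpy as np

def rarefy(a, m_sub, rng):
    """Draw m_sub interactions without replacement from count matrix a."""
    a = np.asarray(a)
    # One entry per observed visit, labelled with the index of its cell.
    cell_of_visit = np.repeat(np.arange(a.size), a.ravel())
    picked = rng.choice(cell_of_visit, size=m_sub, replace=False)
    return np.bincount(picked, minlength=a.size).reshape(a.shape)

def rarefaction_series(a, step=5, repeats=10, seed=0):
    """Rarefied matrices for m = step, 2*step, ... up to the observed total."""
    rng = np.random.default_rng(seed)
    m = int(np.sum(a))
    return {m_sub: [rarefy(a, m_sub, rng) for _ in range(repeats)]
            for m_sub in range(step, m + 1, step)}

# Hypothetical web with 40 visits (the real webs hold 2183 and 5285 visits).
web = np.array([[10, 2, 0],
                [1, 12, 3],
                [0, 4, 8]])
series = rarefaction_series(web)
```

Each entry of `series` holds ten reduced matrices for one sampling intensity, to which any specialization index can then be applied and plotted against m.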
In order to compare the null model characteristics of the specialization measures, we simulated artificial matrices with randomly associated partners and plotted the indices against an increasing number of partners and/or total number of interactions. We assumed that the total frequency of participating species approximates a lognormal distribution, which is typical for biological communities [21,22,24]. All row and column totals were randomly generated from a lognormal distribution (μ = 50, σ = 1) that was scaled to the desired total number of interactions. Ten different combinations of row and column totals were obtained for each matrix size and taken as templates to randomly associate the partners five times, thus each matrix size was represented by 50 random associations.
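A minimal sketch of this null model simulation is given below. Several details are assumptions rather than the authors' procedure: μ and σ are read as parameters on the log scale (in which case μ cancels once the totals are rescaled), totals are rounded to integers by a largest-remainder rule, and partners are associated by shuffling single interactions. Species that receive a total of zero simply drop out of the matrix. All function names are hypothetical.

```python
import numpy as np

def lognormal_totals(n, total, rng, mu=50.0, sigma=1.0):
    """n integer totals summing to `total`, proportional to lognormal draws."""
    w = rng.lognormal(mean=mu, sigma=sigma, size=n)
    exact = w / w.sum() * total
    counts = np.floor(exact).astype(int)
    # Hand the remaining units to the largest fractional parts.
    short = int(total - counts.sum())
    order = np.argsort(exact - counts)[::-1]
    counts[order[:short]] += 1
    return counts

def random_association(row_tot, col_tot, rng):
    """Random matrix with the given row and column totals."""
    rows = np.repeat(np.arange(len(row_tot)), row_tot)
    cols = np.repeat(np.arange(len(col_tot)), col_tot)
    rng.shuffle(cols)
    out = np.zeros((len(row_tot), len(col_tot)), dtype=int)
    np.add.at(out, (rows, cols), 1)
    return out

def simulate(n_rows, n_cols, total, rng, templates=10, assoc=5):
    """Ten templates of totals, five random associations each (50 matrices)."""
    mats = []
    for _ in range(templates):
        r = lognormal_totals(n_rows, total, rng)
        c = lognormal_totals(n_cols, total, rng)
        mats += [random_association(r, c, rng) for _ in range(assoc)]
    return mats

rng = np.random.default_rng(42)
matrices = simulate(n_rows=8, n_cols=5, total=200, rng=rng)
```

Repeating `simulate` over a grid of matrix sizes and totals gives the set of random matrices on which the indices can be evaluated.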