Study site
Uda Walawe National Park (UWNP), Sri Lanka, is located between latitudes 6° 25' - 6° 34' N and longitudes 80° 46' - 81° 00' E, at an average altitude of 118 m above sea level. It encompasses 308 km2 around the catchment of the Uda Walawe reservoir. The study area comprises approximately 1/3 of this area, which includes tall grassland, dense scrub, riparian forest, secondary forest, a permanent river, seasonal streams, and other water sources. It has a highly predictable pattern of rainfall with two monsoons per year, which occur in March through April and in October through December [58, 59]. There appear to be no non-human predators of elephants at this location. The Sri Lankan subspecies of Asian lion (Panthera leo sinhaleyus) was extinct prior to the colonization of the island by humans [60]. The leopard (Panthera pardus cotiya) is the current top land predator in Uda Walawe, but there is no evidence that it hunts elephants. The only other large predator is the freshwater crocodile (Crocodylus palustris), but predation on elephants has not been documented. The greatest threat to elephants both historically and currently is human activity, but disturbance within the park is minimal. Tourism has also led to the elephant population becoming well-habituated to people in vehicles.
Data collection
The data presented here span 259 field days (twenty months) from years 2007 and 2008, two or three days per week on average except between January and April 2008, when UWNP was temporarily closed due to political unrest. We typically entered the park between 0600-0700 h (sunrise), remaining continuously inside until 1730-1830 h (sunset). Driving routes were varied such that all accessible parts of the park were covered in a week. Locations where animals were closest to the road were marked on a hand-held GPS unit. Temperature, humidity and wind were recorded at least three times per day with a Kestrel™ pocket weather station. Rainfall (mm.) was recorded daily using a standard U.S. Weather Bureau rain gauge.
Individuals were identified photographically and catalogued for two years preceding the study period. All individuals were given numbers; the most frequently seen were also given names. A previous study of Asian elephants considered individuals within 100 m of one another to be associated [30] whereas studies of African elephants have used a distance of up to 500 m to define aggregations [6, 25, 61]. We considered all individuals within visual range of the observer and up to 500 m of one another who moved, rested, shared resources (mud, mineral wells, trees) to be a single aggregation, or group. Occasionally individuals showed affiliative vocal or tactile behavior such as growling and bodily rubbing [41] but such interactions were uncommon and not required for individuals to be considered part of the same group. Multiple groups occasionally shared water without interaction. The term 'group' here carries no implication of social history or permanence (see also Table 1). Individuals from multiple groups which initially co-occurred in space or even passed through one another, were not counted as associated unless they actively moved together. It was possible to spend several hours with a single group. We recorded identities of known individuals and counted the number of individuals in five size-based age classes [49]. Unidentified individuals were counted, but excluded from analyses.
We examined only relationships among adult females, as most sub-adults and juveniles were not identified individually. The strength of one individual's bond with another individual over the course of a season was quantified in terms of their association index. We used the Simple Ratio Index or SRI [6, 35, 36, 62], which is a symmetric measure that shows the proportion of time two individuals spent with each other. Before computing the SRI, we performed data aggregation which consisted of (a) partitioning the data into day-long sampling intervals; (b) identifying individuals that associated with each other in a given sampling interval by merging those groups that shared at least one identified individual; all individuals within such groups were then, by definition, 'associated'; and (c) excluding individuals that were observed only once in a season and only alone. Step (b) was performed in order to exclude potentially non-independent observations of the same group within the same sampling period. After data aggregation, we compute the association index for each pair of individuals A and B as SRI = X
AB
/(X
t
- X
n
) where X
AB
is the number of times individuals A and B were observed together, X
t
is the total number of observations, and X
n
is the number of observations in which neither A nor B were observed [62]. We also compute a measure of uncertainty of the SRI value for each dyad (see Additional file 7: Supplementary Text).
Association data were partitioned according to season. Months that had a total rainfall higher than the two year monthly average of 120 cm were designated as 'wet' months and those that had less were designated as 'dry', consistent with the monsoon cycle [59]. May-September constitute the 'Dry season' and October-December constitute the 'Wet season' according to this classification. January-April, with two wet months followed by two dry months, were considered a 'Transitional' period rather than divided into dry and wet periods since two month periods provided insufficient data for analysis. We refer to seasons as T1, D1, W1, D2 and W2 (Table 2).
Data analysis
To reduce variance in our data that arises from differences in the identities of individuals observed in different seasons, we constrain some of the analyses below to the set of 88 individuals that were observed in all seasons (either more than once or in association with other individuals). We refer to these individuals as 'residents'. To further reduce noise in the data due to low number of sightings, we constrain some of the analyses to the 51 residents that were observed at least 30 times throughout the entire study period. We refer to these individuals as the 'core individuals'. The uncertainty in the estimates of association indices for such individuals is generally below 10% for all seasons (see Additional file 1: Figure S1).
Dyadic structure
To test the data against the null hypothesis that associations within a season are random, we permute the seasonally partitioned datasets so that the number of sightings for each individual and the distribution of group sizes within the season are preserved. We use the 'fill' rather than the 'swap' method to generate 1000 permutations per season [63–65], with the average SRI value as the test statistic. To speed up computations, we partition the dry seasons into two overlapping three-month periods (May-July and July-September). We use some of the random datasets generated by this procedure in other analyses.
We examine the stability of associations among pairs of individuals across seasons in a few different ways. First, we test for correlations between matched SRI matrices across pairs of seasons using the Mantel test [63, 66, 67]. As not all individuals are seen in all seasons, by 'matched' matrices we mean that for each pair of seasons we test correlations only across the subset of individuals seen in both seasons. We used 10,000 permutations per test, with the Pearson product-moment correlation coefficient as the test statistic. This is the most basic way to test whether associations across seasons deviate from random [68].
Second, we track the SRI for each pair of individuals as a function of time and assess, using K-means clustering, whether such temporal SRI trajectories fall into distinct types. Three temporal patterns are natural to expect. If associations are stable, we expect to see a flat SRI trajectory (type A). If associations are temporary, we expect to see an SRI trajectory with a single peak at a particular season (type B). If associations are cyclic, we expect to see an SRI trajectory with more than one peak, in corresponding seasons across years (type C). In order to minimize noise in the data due to rarely observed individuals, we limit this analysis to the 51 core individuals (see above) that can potentially form 1275 dyads. We exclude those dyads that have never associated during the study period, yielding 478 dyads with at least one non-zero SRI value within the study period. We use the correlation distance between SRI trajectories as the metric for K-means clustering because in this analysis we are interested in similarities in the shape of temporal SRI trajectories rather than in their absolute values. The number of clusters, K, that is most appropriate for the data, is chosen using the Bayesian Information Criterion (BIC) where each K-means cluster is assumed to be generated by a Gaussian distribution [69]. We perform the K-means clustering procedure 100 times for each value of K between 2 and 15, starting with a random initial condition, and choose K at which the expected BIC is maximized as the optimal K. After determining the appropriate value of K, we run the K-means clustering algorithm 1000 times with different initial condition and pick the partition of the SRI trajectories into clusters that minimizes the sum of distances between the SRI trajectories and the cluster centroids to which they belong. In order to avoid confusion with the population-level clustering procedure below, K-means clusters are henceforth characterized by their centroids and are referred to as 'typical SRI trajectories'.
To investigate whether an individual's preferred companions change over time irrespective of the strength of the ties, we determine the top-n associates of each core individual within each season. Top-n associates of a core individual i in the given season are defined as the n core individuals (other than i) who have the highest SRI values with respect to individual i in that season. Then, in each season, a core individual i, by definition, allocates n 'companion slots' of time to spend with her top-n associates. Thus, the total number of companion slots available to each individual in 5 seasons is 5n. If individual j is present in the individual i's top-n over m seasons (1 ≤ m ≤ 5), we say that associate j 'occupies' m companion slots of individual i, or that individual i allocates m of her companion slots to individual j. We then call individual j 'an m-term associate' of individual i. If m = 1, individual j is a short-term associate of individual i, while if m = 5, individual j is a long-term associate of individual i, with obvious gradations in between. We arbitrarily set n = 5. To illustrate these concepts, consider two individuals A and B. If B's SRI index with respect to individual A is ranked 3rd, 2nd, 6th, 5th, and 9th in seasons 1 through 5 respectively, then B is in A's top-5 associates for 3 out of 5 seasons and therefore occupies 3 companion slots. Thus, individual B is a 3-term associate of individual A.
Now, if k
i
(m) is the number of individual i's m-term associates, then f
i
(m) = k
i
(m)m/5n is the fraction of companion slots that individual i allocates to all of her m-term associates (note that
because each individual has a total of exactly 5n companion slots). Then the average fraction of companion slots that individuals allocate to their m-term companions is
, where N = 51 is the total number of core individuals.
indicates the extent to which individuals prefer to associate with the same companions over many seasons or to change companions from season to season.
To quantify individual variation in social behavior, we count, for each of the 51 core individuals, the number of associates that are present in her top-n for the whole observation period for n = 5. This number ranges from 0 to n = 5.
Ego-network structure
We represent the SRI matrices as weighted social-network graphs [37, 38], where nodes represent individuals, edges connect those individuals who were associated within a season, and edge weights correspond to the SRI values. An ego-network is a social network that consists only of the subject, called 'ego,' and the nodes to which she is directly connected. For each of the 88 residents, we compute five ego-network measures with intuitive biological interpretations: Size, Ties, Pairs, Density, and 2-Step Reach (defined in Table 5). We then compute the corresponding ego-network statistics, i.e. the average values of ego-network measures over all residents. As ego-network measures for different individuals are non-independent, we test for differences in ego-network statistics between seasons using a bootstrap procedure in which association data for each season are re-sampled 1000 times with replacement [39].
Population structure
In order to investigate how the social network structure of the whole population changes over time, we construct 'network structure curves' (Figure 7) for each season using the following procedure. First, using the full SRI matrix for each season, we construct the social network graph for the whole population (i.e., we include non-resident individuals in this analysis). We then identify social clusters within the network using the Girvan-Newman algorithm for community detection [70]. This algorithm recursively fragments the network into subnetworks by successively removing the edges with the highest between-ness [37]. For each resulting subdivision of the network, the so-called 'modularity quotient,' Q, is computed. The modularity quotient takes values between 0 and 1, where 0 implies that the number of ties within a cluster does not exceed a random expectation, and values above 0.3 indicate potentially meaningful subdivisions [71–74]. We label the maximum modularity for a given social network as Qmax and take the partition or partitions that yield this maximum value to be the most appropriate way of subdividing the network. The term 'cluster' refers then to the set of individuals belonging to the same subnetwork in an optimal subdivision.
The original Girvan-Newman algorithm was designed to find community structure in unweighted networks. To accounts for edge weights, we add one more step. We apply the Girvan-Newman procedure to a social network that consists only of ties with strength above a particular threshold, and record the number of clusters in the resulting network (Figure 7). If multiple subdivisions of the network at a given threshold yield identical Qmax values, we record the average number or clusters (e.g. if either 14 or 15 clusters have an equivalent Qmax at a threshold of 0.1, we record 14.5 clusters). We perform this procedure iteratively for different SRI thresholds, incrementing by 0.02: 0, 0.02, 0.04, ..., 1. We then plot the average number of clusters against the SRI threshold. We call the resulting plot the 'network structure curve,' as it shows the structure of the network at a glance, being a lower-dimensional visual representation than social network graphs (Figure 7).
In order to test whether network structure curves have significantly different shapes in different seasons we modify the method used by Wittemyer et al. [6]. For each SRI threshold value, we compare the distributions of increments of the network structure curve within a window w before and after this value, using the Mann-Whitney test. We then find points in the curve where these distributions are different from each other at the significance level 0.05. The points with significant P-values are consistent when we vary the size of window w between 0.1 and 0.3 (see Additional file 8: Figure S7 and Additional file 9: Figure S8).
Implementation and ethical statement
Permutations of associations, observed group sizes, and corresponding significance tests within and across time partitions were implemented in the OCaml programming language; code is available upon request. Network visualizations were generated in NetDraw using a graph-theoretic layout with node repulsion [38]. All other statistical procedures and analyses were performed on Matlab v. 7.0, and R v. 2.7. This work was carried out in compliance with requirements of the Institutional Animal Care and Use Committee of the University of Pennsylvania, protocol number 801295.