Spatially balanced sampling methods are always more precise than random ones for estimating the size of aggregated populations

Population size is a crucial parameter for both ecological research and conservation planning. When individuals are aggregated, estimating the size of a population through sampling raises methodological challenges, as the high variance between sampling units leads to imprecise estimates. Choosing the right sample design depending on the population aggregation level could improve the precision of estimates; however, this is difficult because studies comparing sample designs for aggregated populations have been limited to a few populations and sampling designs, so their results cannot be generalised. To address this gap, we combined simulations of spatial point populations and field counts of three plant species to compare the relative precision of estimates between three sampling methods: simple random sampling (SRS), systematic sampling (SYS) and spatially balanced sampling (SBS). Comparisons were performed on density and aggregation gradients for a range of sample sizes. Our simulations showed that SYS and SBS were always more precise than SRS when individuals were aggregated, reducing sampling variance up to 80% and 60%. The highest precision for estimating population size was always obtained when the average distance between sampling units equalled the diameter of the clusters (i.e. the groups of individuals). The difference in precision was similar for the natural populations, with sampling variance lowered by up to 75% (SYS) and 60% (SBS) compared to SRS. These findings lead us to recommend using SYS or SBS rather than SRS to estimate population size when individuals are spatially aggregated, as these consistently provide more precise estimates. Assessing cluster diameters in the field enables a quick assessment of the potential gain in precision to expect, and thus the best choice of sampling method depending on the trade‐off between precision and field constraints.


| INTRODUC TI ON
Population size is a central parameter for all fields related to ecology and evolution. Evolutionary biologists use population size to predict the risk of genetic diversity loss due to inbreeding and genetic drift (Crow, 2010;Frankham, 1995). Ecologists study how population size varies over space and time to identify biotic interactions and abiotic factors that shape population dynamics (Bjørnstad & Grenfell, 2001;Quéroué et al., 2021) and their possible cascading consequences on ecosystems, as in biological invasions (Simberloff et al., 2013) or the reintroduction of keystone species (Ripple & Beschta, 2012;Watson & Estes, 2011). In conservation biology, population size is used to assess extinction risk (Dennis et al., 1991) and species conservation status (e.g. IUCN, 2019) and to evaluate the effectiveness of management actions (Beissinger & McCullough, 2002). Despite its apparent simplicity, estimating the size of a population raises several methodological challenges, as it is often impossible to count every individual. Ecologists thus rely on sampling: counting individuals in a subset of spatial units occupied by the studied population to infer the entire population size (Cochran, 1977).
Sampling is a prolific field of statistics, and various methods exist for selecting sampling units (Thompson, 2012). Whatever the sampling method, estimating the size of a population denoted Y is achieved through the same four steps (Cochran, 1977) (Cochran, 1977). Therefore, the distribution of individuals in the studied population strongly affects the precision of the estimates.
In natural populations, individuals are usually not randomly distributed in space (Legendre, 1993;Levin, 1992). Distribution may indicate a process of repulsion, with individuals more distant from their nearest neighbours than expected in a random distribution. Repulsion can occur, for example, in territorial animals (e.g. Hinde, 1956;Maher & Lott, 2000) or in plant species due to intraspecific (Stoll & Bergius, 2005) or interspecific competition (Rayburn & Schupp, 2013). When sampling such populations, the number of individuals per sampling unit is relatively homogeneous, so population density estimates have a high precision even with relatively small sample sizes (Cochran, 1977). Alternatively, individuals may aggregate, living closer to their nearest neighbours than expected in a random distribution (hereafter referred to as 'aggregated populations'). Aggregation of individuals is frequent in herbaceous plant species, leading many authors to state that most plant populations are aggregated (Damgaard & Irvine, 2019;Greig-Smith, 1983;Robinson, 1954). This arises due to limited dispersal capacity or a patchy habitat (Lara-Romero et al., 2016;Seabloom et al., 2005).
Aggregation can also be observed in animals, such as colonial breeding vertebrates (Danchin & Wagner, 1997) and freshwater mussels (Morales et al., 2006;Smith et al., 2011). When sampling aggregated populations, samples are typically composed of many zeros and a few high counts, resulting in imprecise estimates (McGarvey et al., 2016).
This makes improving the precision of population size estimates for aggregated populations by choosing an appropriate sampling design an important challenge (Thompson, 2004;Yoccoz et al., 2001).
Two types of sampling methods can be used for aggregated populations: one-step methods, in which all sampling units are selected prior to measurements in the field, and two-step methods, in which a sample of primary units is selected and counted in the field, and new units are added depending on the outcome of counts on primary units. Adaptive cluster sampling (Thompson, 1990) is a two-step method specifically developed for aggregated populations.
Although it usually improves the precision of population size estimates (Turk & Borkowski, 2005), it is difficult to implement in the field (see, however, Philippi, 2005 andMorrison et al., 2008) and can lead to very uncertain estimates for small or very aggregated populations (Shackleton et al., 2020), so we did not cover it in this study. The most commonly used one-step sampling methods (Smith et al., 2017) are simple random sampling (SRS), in which the units are randomly selected, and systematic sampling (SYS), in which the location of the first unit is randomly selected and the others are arranged along a rectangular grid (Cochran, 1977). Spatially balanced sampling (SBS) is a more recent one-step sampling method in which selected units are evenly distributed over the study area but without imposing a strictly equal distance between units as in SYS. This can be done through various processes, such as dividing the study area into multiple spatial strata and selecting one unit from each (Stevens & Olsen, 2004) or using a random-start low-discrepancy sequence to select the location of the sampling units (Robertson et al., 2013).
These sampling methods yield varyingly precise estimates depending on the spatial distribution of individuals in the studied population (McGarvey et al., 2016). However, no clear guidelines have yet been proposed on the best sampling method to choose depending on the observed aggregation. (i) Some compare the precision of estimates between sampling methods in a purely analytical way: for example, they showed that SYS is more precise than SRS for populations with certain types of autocorrelation in the density of individuals (Cochran, 1946(Cochran, , 1977Matérn, 1986;Quenouille, 1949). However, since the autocorrelation of density is generally unknown prior to fieldwork, these conclusions are difficult to translate into operational recommendations.
(ii) Some map all individuals from a given population in the field and simulate multiple sampling designs on the raw data or simulated populations with similar characteristics. This approach is valuable in identifying the sampling methods that provide the most precise estimates for a given population but lacks generality, especially as the conclusions about the relative precision of the sampling methods vary between studies. Indeed, SYS or SBS is often found to be more precise than SRS, but the difference in precision varies considerably between studies (Kermorvant et al., 2020;Mier & Picquelle, 2008;Morrison et al., 2008), and a non-negligible fraction of the studies find all sampling methods to yield roughly the same precision (Khaemba et al., 2001;Smith et al., 2011). However, to our knowledge, no study has found SRS to be more precise than SYS or SBS. They found that SYS was substantially more precise than SRS for all aggregated populations.
The three types of studies seem to indicate that SYS and SBS usually yield more or as precise estimates than SRS. However, it is unclear which method between SYS and SBS yields the most precise estimates, and no previous study has described how the relative precision of the sampling methods varies as a function of the three parameters affecting the precision of estimates: the mean density of individuals, their aggregation level and sample size. Furthermore, it has been shown analytically that SYS provides more precise estimates than SRS if the variance within the systematic samples is higher than the variance of the whole population (Cochran, 1977: 208). However, the mechanism determining the within-sample variance, and thus the precision of SYS relatively to SRS, has never been described. Understanding this mechanism would allow knowing with certainty which is the optimal sampling design for any studied population.
This study aimed to determine which sampling method provides the most precise estimates depending on the level of aggregation of the population. To this end, we combined computer-based simulations and field counts of plant populations to compare the precision of three one-step sampling methods (SRS, SYS and SBS) over wide gradients of density, aggregation and sample size. We sought to answer four key questions: (i) Does aggregation have the same effect on estimate precision across the three sampling methods? (ii) Does the effect of aggregation on estimate precision change with population density? (iii) Does the effect of aggregation on estimate precision change with sample size? (iv) What is the mechanism generating the differences in precision between sampling methods for aggregated populations?

| MATERIAL S AND ME THODS
Our simulations followed a three-step procedure: (1) we generated virtual populations for a given combination of density and aggregation, (2) we drew samples from each virtual population with three sampling methods (SRS, SYS and SBS) and several sample sizes, and (3), we measured the precision of the density estimates obtained with each sampling method and sample size for each virtual population ( Figure 1).

| Sampling process
We selected nine sample sizes: n = 9, 15, 25, 49, 100, 150, 196, 300 and 400 (out of 10,000 possible sample units) because for SYS they allow the sample units to be arranged in a square or quasi-square grid, which limits major spatial coverage differences between sample sizes. Many ways of performing SBS have been proposed, and we chose one of the most commonly used, called balanced acceptance sampling (BAS), in which the sampling units are selected by drawing their coordinates from a low discrepancy sequence, in our case the Halton sequence (Robertson et al., 2013). BAS is one of the methods yielding the best spatial balance when all units are accepted, which was always the case in our simulations (Robertson et al., 2018). For each virtual population, we simulated 1000 sampling surveys for each of the nine sample sizes. A survey consisted of two steps: drawing a sample of size n from a population with each of the three sampling methods and storing the mean densities calculated from the three samples. For SYS and SRS, all sampling units have the same inclusion probability; thus, the sample mean is an unbiased estimator of the population mean density (Cochran, 1977). For SBS as we simulated it, inclusion probabilities are equal to the 3rd or 4th decimal (Robertson et al., 2013), so we treated them as equiprobable, as weighting the sample mean by the inclusion probability of the units would have had no discernible effect on our results.
Altogether, we simulated 60,000 sampling surveys by combination of density, aggregation and sample size, that is, 1000 sampling surveys on 60 virtual populations.

| Comparison of estimate precision between sampling methods
Before comparing the precision of the density estimates obtained with each sampling method, we first checked whether they were unbiased. Then, for each combination of density, aggregation and sample size, we computed the mean of the 1000 mean densities calculated from the samples for each of the 60 virtual populations. We then averaged the resulting 60 mean densities and compared the result to the mean of the true densities of the 60 virtual populations that were sampled. The estimated density was always very close to the true density (see Appendix S3), indicating that the estimates obtained with the three sampling methods were unbiased, as expected from sampling theory (Thompson, 2012).
F I G U R E 1 Workflow of the simulation process. One level of density (out of seven) and four levels of aggregation (out of 78) are represented. The simulation process is described for one aggregation level and one sample size (out of nine). For every level of aggregation (top row), we simulated 60 virtual populations (step 1). We drew 1000 samples of size n with each of the three sampling methods from every population and computed their means (step 2). We calculated the variance of the 1000 sample means to estimate the sampling variance and computed the ratios of the sampling variance of SYS and spatially balanced sampling (SBS) relative to simple random sampling (SRS; step 3). The horizontal dotted line on the violin plot represents the true mean density of the sampled population. The final result is the curve of the variance ratio as a function of the population aggregation level, each point of the curve being the average over 60 virtual populations. For clarity, only the curve for var(systematic)/var(random) is shown.
To measure the precision of the density estimates, we estimated the sampling variance for every combination of sampling method and sample size by computing the variance of the means of the 1000 corresponding sampling surveys for each virtual population. This method of estimating the sampling variance avoids the need for variance estimators, but requires knowing the mean of all possible samples (although in our case we did not draw all possible samples but a large proportion of them) and can therefore only be used in simulation studies (Magnussen et al., 2020;Magnussen & Fehrmann, 2019;McGarvey et al., 2016). This method allows to estimate the conditional sampling variance, that is, conditional on a realised population covering a finite area as encountered in the field, within the context of design-based inference (see Brus, 2021, for a synthesis on design-based and model-based inference). To compare the estimate precision between the three sampling methods for a given sample size, we calculated the ratio of the sampling variance of SYS and SBS, relative to the sampling variance of SRS. This indicator represents the rate of reduction in variance obtained on average when changing from SRS to another sampling method (Kish, 1965).
A variance ratio below one means that a method (here SYS or SBS) is more precise than SRS. The variance ratio can also be interpreted as the effective sample size (Kish, 1965). Its inverse then represents the rate of increase in sample size required to achieve the same precision as the alternative method using SRS. For example, for a sample size of n = 100, a ratio of 0.5 between the sampling variance of SYS and SRS means that SRS would need, on average, a sample size of n = 100*(1/0.5) = 200 to achieve the same precision as SYS with n = 100. To obtain a generalizable result, for each sample size, we averaged the variance ratio values over the 60 virtual populations that were sampled for each combination of density and aggregation.
Hence, the values of variance ratio we present are the expected values for each combination of density, aggregation and sample size (Magnussen et al., 2020).

| Field study
To compare

| Comparison of estimate precision between the three sampling methods for the virtual populations
The relative precision of the three sampling methods varied considerably with the aggregation level ( Figure 2). For randomly distributed populations (I = 1) and populations exhibiting repulsion (I < 1), the values of the variance ratio were close to one: that is, SRS, SBS and SYS had equivalent precision. This result was consistent for all densities and sample sizes tested. For all populations with aggregated individuals (I > 1), whatever the sample size, SYS and SBS were always as precise or more precise than SRS, and SYS was always more precise than SBS. For both SYS and SBS, the variance ratio decreased rapidly below one as aggregation increased, and after reaching a minimum, it slowly increased towards 1. This means that as the aggregation of individuals increased, the estimates obtained with SYS and SBS quickly became more precise than those obtained with SRS until they reached a maximum of relative precision. After this maximum, their relative precision slowly decreased until it became equivalent to SRS for high aggregation levels. When the sample size was increased, the minimum of the variance ratio had a lower value and shifted towards higher levels of aggregation. In other words, as sample size increased, SYS and SBS became increasingly more precise than SRS, and maximum relative precision was reached at higher aggregation levels.
The effect of the mean density depended on the modality we used to simulate increasing density. For modality 1 (shown in Figure 2), when density increased, the variance ratio had lower values, and its minimum shifted towards higher aggregation levels. In other words, as density increased, SYS and SBS became more precise relative to SRS, and their maximum relative precision shifted towards more aggregated populations. For modality 3, the same pattern occurred in a more pronounced way. In contrast, for modality 2, the variance Variance ratio ratio curves were identical for all densities (Appendix S1). These modalities differed in that the diameter of the clusters changed with density for modalities 1 and 3, but not for modality 2. Therefore, cluster diameter appeared central to explain the relative precision of the sampling methods.
To investigate this result, we represented the variance ratio as a function of the diameter of clusters instead of the dispersion index. Figure 3 shows that the relative precision of SYS compared to SRS for a given sample size was the same for all densities and depended only on cluster diameter. Moreover, the variance ratio only reached one when the cluster diameter was so small that clusters could fit in one sampling unit. We obtained the same result for SBS (Appendix S1) and for all populations, regardless of the modality used to simulate increasing density. Furthermore, the minimum variance ratio of SYS and SBS was always reached when the mean distance between the sampling units was equal to the cluster diameter.

| Relative precision of estimates for three plant populations
The For the three populations, the variance ratio was below 1 for both SYS and SBS for almost all sample sizes, indicating that density estimates obtained with SYS and SBS were generally more precise than those obtained with SRS ( Figure 4)   The overall result that SYS is more precise than SRS for aggregated populations is in line with theoretical work (Matérn, 1986; for SYS compared to SRS, and Cochran (1977: 223) reported studies finding up to 83% lower variance for SYS compared to stratified random sampling. SBS has been less investigated, but the existing simulation studies have also shown that it is usually more precise than SRS (see Kermorvant et al., 2019, for a summary). Our findings demonstrate that this is true for a large variety of aggregated populations and sample sizes.

| How the distribution of individuals drives sampling variance in the virtual populations
Simulating sampling surveys over gradients of population density, aggregation and sample size showed that the relative precision of the sampling methods always followed the same pattern as aggregation increased. Changing the population density and sample size modulated this general pattern, shifting the minimum variance ratio to a higher level of aggregation and lowering its value. Nevertheless, the highest relative precision for SYS and SBS was always achieved when the mean distance between the sampling units was equal to the cluster diameter, whatever the density of individuals (Figure 3). This is because the aggregated populations we simulated were constituted of clusters with the same diameter and number of individuals. In this setup, sampling units located in clusters have values close to each other, and all sampling units located outside clusters are equal to zero. Thus, the sampling variance mainly depends on the between-sample variability in the proportion of sampling units located within clusters. The closer this proportion is between samples, the lower the sampling variance. With SYS and SBS, the spacing between units (fixed for SYS and slightly variable for SBS) leads the proportion of units located in clusters to be less variable between samples than with SRS ( Figure 5). Therefore, SYS and SBS will always lead to more precise estimates than SRS, except if the clusters are smaller than the sampling units, in which case all sampling methods will achieve the same precision.
With SYS, when the distance between the sampling units is equal to the cluster diameter ( Figure 5, 4th column), all clusters are intersected by a single sampling unit so that the proportion of units located in clusters is strictly identical for all samples. Consequently, the means are very close for all samples, and SYS reaches its optimal precision relative to SRS. When the distance between units is smaller or greater than the cluster diameter, the proportion of units located in clusters is not strictly identical between samples, and the relative precision deviates from the optimum. Using the formulation of Cochran (1977: 208), for any aggregated population, the withinsample variance is the highest for SYS when the distance between the sampling units is equal to the cluster diameter, and thereby SYS achieves its highest precision relative to SRS. The same mechanism operates for SBS, but as the distance between sampling units is not strictly constant, the proportion of sampling units located in clusters F I G U R E 5 Illustration of the mechanism underlying the simulation results regarding the impact of the distance between sampling units on the relative precision of the sampling methods. Three samples drawn from the same virtual population are presented for random (SRS) and systematic sampling (SYS), with two sample sizes (n = 9 and n = 16). The green circles represent clusters of individuals. For SYS, with n = 9 (2nd column), the cluster diameter is inferior to the inter-unit distance, so some clusters can be missed, but there is never more than one unit in each cluster. With n = 16 (4th column), the cluster diameter equals the inter-unit distance; all clusters present in the study area are sampled at least once in every sample. This leads to a lower sampling variance for SYS than random sampling, in which the proportion of units located in clusters varies more between samples (1st and 3rd columns).

Systematic Random
Random n = 9 n = 16 Systematic varies more between samples than in SYS. The SBS method is consequently less efficient than SYS for the virtual populations we simulated. Although it has often been stated that sampling methods with better spatial coverage are usually more precise for spatially structured populations (Kermorvant et al., 2019;Robertson et al., 2013;Stevens & Olsen, 2004), the mechanism driving this expectation has, to our knowledge, never been reported before. This mechanism certainly determines the relative precision of all one-step sampling methods when sampling aggregated populations, not only that of the versions of SYS and SBS that we investigated in this study (systematic square-grid sampling and BAS, respectively).

| From computer simulations to field studies
The aggregated populations we simulated had a simplistic spatial structure, as clusters were discs of equal diameter with clear bound- Further studies need to be conducted to identify the other characteristics of the distribution of individuals than the diameter of the clusters involved for natural populations. This will first require to better understand how individuals are distributed within natural populations. Currently, datasets containing the location of all individuals in a population, or more synthetic aggregation metrics, remain scarce (but see Morrison et al., 2008, andLaw et al., 2009), making it difficult to build simulations with realistic distributions of individuals. Once this barrier is removed, simulation studies will have to be carried out to identify the characteristics other than cluster diameter that drive the relative precision of sampling methods.
For example, the non-random location of clusters, heterogeneity in cluster size and shape, or heterogeneity in density between clusters and within each cluster might be good candidates. Finally, metrics that can be measured easily in the field will have to be identified so that the sampling design can be adapted to the spatial structure of the population. This last step will be critical for the results of future methodological work to be implemented in the field.

| Recommendations for field studies
Given our results on both virtual and natural populations, we recommend using SYS or SBS when studying populations with signs of spatial aggregation. These sampling methods will, on average, provide more precise estimates than SRS unless the clusters are of similar Field ecologists should be aware that SYS has a statistical drawback. For SYS, the sample mean is an unbiased estimator of the population mean, but there is no universally unbiased estimator for the variance of this estimate, that is, the sampling variance (Cochran, 1977: 224). In other words, all existing estimators can sometimes give biased estimates of the sampling variance so that although the population mean will be estimated without bias, the precision of this estimate can be under-or overestimated (Magnussen & Fehrmann, 2019). Nevertheless, we argue that this problem should not prevent using SYS, given the substantial potential increase in precision using this sampling method. The search for better estimators is an ongoing research topic, and the best estimators currently known usually allow to see a substantial gain in precision by shifting from SRS to SYS, although variances tend to be slightly overesti- for most populations, although cases of overestimation above 60% were found for a few populations. Simulating even more populations, Magnussen et al. (2020) found two other estimators to overestimate sampling variance on average by less than 10%. We tested these four estimators on our virtual and natural populations (Appendix S4). For the virtual populations, the four estimators overestimated the sampling variance, which was expected given that they are all based on measurements of the correlation among neighbouring units. This is not adapted for populations following a Matérn cluster process as in our simulations. However, we propose a new estimator that is more adapted for this type of populations, and it provided nearly unbiased estimates of the sampling variance. For the natural populations, we found similar levels of performance to previous studies (Magnussen et al., 2020;McGarvey et al., 2016), with several estimators based on correlation between neighbouring units, as well as our new estimator, providing almost unbiased estimates of the sampling variance.
Estimating sampling variance is less problematic for SBS, although some cases of biased estimates have also been reported (Robertson et al., 2013;Stevens & Olsen, 2003). We tested the most commonly used estimator, and it provided nearly unbiased estimates for both the virtual and natural populations (Appendix S4). These results confirm that given a variance estimator appropriate for the studied population is used, a substantial gain in precision will be seen when using SYS or SBS instead of SRS. To help choose the variance estimator, future simulation studies need to be conducted to screen the performance of estimators across many distributions of individuals.
This will be particularly useful if realistic distributions of individuals can be simulated based on a better understanding of how individuals are distributed in natural populations. Other advantages of SBS over the other two methods include the ability to incorporate legacy sites where data have already been accumulated (Foster et al., 2017) and to draw oversamples to replace units that could not be observed in the field (e.g. inaccessible sites) while maintaining spatially balanced samples (Kermorvant et al., 2019).
The results of this paper refer to design-based inference, in which the estimates of the population mean and sampling variance are generally unbiased (except for SYS as discussed above), and no assumptions about the studied population need to be made. Modelbased inference is an alternative approach in which the estimates are obtained by fitting a spatial variation model to the sample. This approach usually provides more precise estimates than design-based inference when there is spatial dependence in the density of individuals. Yet, the accuracy of this approach strongly relies on the realism of model assumptions, and unrealistic assumptions can lead to spurious estimates. Moreover, hybrid methods, called model-assisted inference, which combine the advantages of both approaches have been developed (Brus, 2021). Using model-based or preferably model-assisted inference seems a promising way to further increase the precision of estimates for aggregated populations. However, the difference in estimate precision between design-based and modelbased approaches seems limited when the sample is selected with a SBS method, while it is large with SRS (Dumelle et al., 2022). Finding ways to combine the strengths of model-assisted inference and SBS methods tailored to the aggregation of the studied population would be the next step to further improve the precision of aggregated population size estimates.

AUTH O R CO NTR I B UTI O N S
Jan Perret, Anne Charpentier, Guillaume Papuga and Aurélien Besnard conceived the study and designed the methodology; Jan Perret and Roger Pradel analysed the data; Jan Perret led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.