A new digital method of data collection for spatial point pattern analysis in grassland communities

Abstract A major objective of plant ecology research is to determine the underlying processes responsible for the observed spatial distribution patterns of plant species. Plants can be approximated as points in space for this purpose, and thus, spatial point pattern analysis has become increasingly popular in ecological research. The basic piece of data for point pattern analysis is a point location of an ecological object in some study region. Therefore, point pattern analysis can only be performed if such data can be collected. However, due to the lack of a convenient sampling method, few previous studies have used point pattern analysis to examine the spatial patterns of grassland species. This is unfortunate because being able to explore point patterns in grassland systems has widespread implications for population dynamics, community‐level patterns, and ecological processes. In this study, we developed a new method to measure individual coordinates of species in grassland communities. This method records plant growing positions via digital picture samples that have been sub‐blocked within a geographical information system (GIS). Here, we tested the new method by measuring the individual coordinates of Stipa grandis in grazed and ungrazed S. grandis communities in a temperate steppe ecosystem in China. Furthermore, we analyzed the pattern of S. grandis by using the pair correlation function g(r) with both a homogeneous Poisson process and a heterogeneous Poisson process. Our results showed that individuals of S. grandis were overdispersed according to the homogeneous Poisson process at 0–0.16 m in the ungrazed community, while they were clustered at 0.19 m according to the homogeneous and heterogeneous Poisson processes in the grazed community. These results suggest that competitive interactions dominated the ungrazed community, while facilitative interactions dominated the grazed community.
In sum, we successfully executed a new sampling method, using digital photography and a geographical information system, to collect experimental data on the spatial point patterns for the populations in this grassland community.

For instance, the Neyman-Scott process, also called the Poisson cluster process, can effectively model both the dissemination of offspring around parents and the species response to habitat heterogeneity. Hence, ecological processes cannot be inferred from spatial patterns without additional external information. Thus, the underlying processes responsible for the observed spatial patterns are of key interest to ecologists (Liebhold & Gurevitch, 2002; McIntire & Fajardo, 2009; Tilman & Kareiva, 1997; Tuda, 2007).
Recently, there has been rapid development in spatial pattern methodologies (Dale, 1999; Diggle, 2003, 2013; Ripley, 1981; Stoyan & Stoyan, 1994). Plants can be approximated as points in space, and thus, spatial point pattern analysis has become increasingly popular in ecological research (Velázquez, Martínez, Getzin, Moloney, & Wiegand, 2016). Now, the distribution of point pair distances is being used to describe the characteristics of point patterns with second-order statistics (Diggle, 2013; Velázquez et al., 2016; Wiegand & Moloney, 2014), such as the pair correlation function or Ripley's K (Ripley, 1981), which can describe a range of distances and also detect mixed patterns.
The basic data unit for point pattern analysis is a point location of an ecological object in some study region. All that is necessary is determining its coordinate within the observation window, in this case, a 2D x and y location. In short, the key to performing point pattern analysis is the ability to collect these data. Various approaches can be applied to obtain the coordinates of ecological objects in a given study region. For example, a grid may be used in the field, and the position of an ecological object can be determined by measuring its distance and direction from the nearest node of the grid (Wiegand & Moloney, 2014). The location of objects can also be determined by using survey equipment or GPS devices (Chacón-Labella, de la Cruz, & Escudero, 2016). A third approach is to determine the locations of objects indirectly from aerial photographs or satellite images, potentially allowing a very broad area to be sampled (e.g., Gil, Lobo, Abadi, Silva, & Calado, 2013; Moustakas et al., 2006; Nelson, Niemann, & Wulder, 2002). In all instances of point pattern analysis, it is critical to sample as complete an area as possible. Using these sampling methods, ecologists have often sought to comprehend the spatial distribution of a population by examining forests or shrubs (Velázquez et al., 2016) due to the great availability of data in these ecosystems (Law et al., 2009). Although many studies have examined spatial patterns in grasslands (Greig-Smith, 1987; Kershaw & Looney, 1985; Krahulec, Agnew, Agnew, & Willems, 1990; Purves & Law, 2002), only a few have used point data to analyze plant distribution (Brix & Chadoeuf, 2002). This paucity might be due to the lack of a convenient sampling method by which to collect these data in grassland communities. Some older methods require a lot of labor, and others are no longer applicable.
However, understanding point patterns in grassland systems has widespread implications for population dynamics, community-level patterns, and ecological processes. Therefore, a convenient method for determining point locations accurately is necessary in grassland communities. Recent advances in digital photography and image analysis software provide new opportunities for improved spatial data collection and analysis. Digital cameras provide high-resolution pictures that can be quickly transferred to a computer for analysis.
Geographical information system (GIS) and image analysis software and technology make it possible to objectively quantify spatial information from digital images in a repeatable and timely manner. Thus, we opted to design a new sampling method here to make possible the measurement of individual locations of species in grassland communities and the subsequent point pattern analysis. This new method involves digital photographs and a geographical information system (GIS), and we expect that it will help ecologists study the underlying processes that create the observable spatial patterns in grassland communities.

| Data collection using digital photographs and GIS
A flat 5 m × 5 m sampling block was chosen in a study grassland community and divided with bamboo chopsticks into 100 sub-blocks of 50 cm × 50 cm (Figure 1). A digital camera was then mounted to a telescoping stake and positioned in the center of each sub-block to photograph vegetation within a 0.25-m² area. Pictures were taken 1.75 m above the ground at an approximate downward angle of 90° (Figure 2). Automatic camera settings were used for focus, lighting, and shutter speed. After photographing the plot as a whole, photographs were taken of each individual plant in each sub-block. In order to identify each individual plant from the digital images, each plant was uniquely marked before the pictures were taken (Figure 2b).
Digital images were imported into a computer as JPEG files, and the position of each plant in the pictures was determined using GIS. This involved four steps: (1) a reference frame (Figure 3) was established using R2V software to designate control points, or the four vertexes of each sub-block (Figure S1), so that all plants in each sub-block were within the same reference frame. The parallax and optical distortion in the raster images were then geometrically corrected based on these selected control points; (2) maps, or layers in GIS terminology, were set up for each species as PROJECT files (Figure S2), and all individuals in each sub-block were digitized using R2V software (Figure S3). For accuracy, the digitization of plant individual locations was performed manually; (3) each plant species layer was exported from a PROJECT file to a SHAPE file in R2V software (Figure S4); (4) finally, each species layer was opened in ArcGIS software in the SHAPE file format, and attribute data from each species layer were exported into ArcGIS to obtain the precise coordinates for each species. This last phase involved four steps of its own, from adding the data (Figure S5), to opening the attribute table (Figure S6), to adding new x and y coordinate fields (Figure S7), and to obtaining the x and y coordinates and filling in the new fields (Figure S8).
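The reference frame in step (1) can be illustrated with a minimal Python sketch. It assumes a plot origin at one corner of the 5 m × 5 m block, sub-blocks indexed by hypothetical `row` and `col` values from 0 to 9, and an image that has already been geometrically corrected, so that a plant's position can be expressed as fractions `u` and `v` of the sub-block side. The function names are illustrative only and are not part of R2V or ArcGIS.

```python
SUB = 0.5      # sub-block side length in metres (from the text)
N_SIDE = 10    # 5 m / 0.5 m = 10 sub-blocks per side

def control_points(row, col, size=SUB):
    """Plot coordinates (m) of the four vertices of sub-block (row, col)."""
    x0, y0 = col * size, row * size
    return [(x0, y0), (x0 + size, y0), (x0 + size, y0 + size), (x0, y0 + size)]

def to_plot_coords(row, col, u, v, size=SUB):
    """Convert a position inside a corrected sub-block image to plot metres.

    u and v are fractions (0..1) of the sub-block side, measured from the
    vertex nearest the plot origin (an assumed convention).
    """
    if not (0 <= row < N_SIDE and 0 <= col < N_SIDE):
        raise ValueError("sub-block index outside the 10 x 10 grid")
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("plant lies outside the sub-block")
    return (col * size + u * size, row * size + v * size)

# All distinct sub-block vertices in the plot: 11 x 11 control points
all_vertices = {(c * SUB, r * SUB)
                for r in range(N_SIDE + 1) for c in range(N_SIDE + 1)}
print(len(all_vertices))                 # 121
print(control_points(0, 0))              # vertices of the first sub-block
print(to_plot_coords(2, 3, 0.5, 0.5))    # (1.75, 1.25), centre of sub-block (2, 3)
```

Because every sub-block shares its vertices with its neighbors, placing all plants in one plot-wide frame reduces to adding the sub-block offset to the within-block position.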

| Data reliability assessment
To determine the accuracy of our new method, we measured the individual locations of Leymus chinensis, a perennial rhizome grass, in representative community blocks 5 m × 5 m in size in typical steppe habitat in the Inner Mongolia Autonomous Region of China in July 2010 (Figure 4a). As our standard for comparison, we used a ruler to measure the individual coordinates of L. chinensis. We tested for significant differences between the two methods in (1) the coordinates of L. chinensis and (2) the pair correlation function g of L. chinensis (see Section 3.2). If neither differed significantly, then we could conclude that our new method of measuring the coordinates of L. chinensis was reliable.
We compared the results using a t test (Table 1). We found no significant differences in either (1) the coordinates of L. chinensis or (2) the pair correlation function g of L. chinensis. Further, we compared the pattern characteristics of L. chinensis when measured by our new method against the ruler measurements using a null model.
We found that the two pattern characteristics of L. chinensis did not differ significantly based on the homogeneous Poisson process or complete spatial randomness (Figure 4b). Thus, we concluded that the data obtained using our new method were reliable enough to perform point pattern analysis with a null model in grassland communities.
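The coordinate comparison can be sketched in Python as follows. The text states only that a t test was used, so the paired form shown here is an assumption, and the coordinate values are hypothetical and serve only to show the calculation.

```python
import math
import statistics

def paired_t(a, b):
    """Return (t, df) for a paired t test on two equal-length samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length >= 2")
    d = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(d)          # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(d) / (sd / math.sqrt(len(d)))
    return t, len(d) - 1

# Hypothetical x coordinates (m) of the same five plants from each method
x_photo = [0.12, 0.48, 1.03, 2.27, 3.91]
x_ruler = [0.11, 0.49, 1.02, 2.28, 3.90]
t, df = paired_t(x_photo, x_ruler)
print(f"t = {t:.3f}, df = {df}")
```

The resulting statistic would then be compared with the critical value of the t distribution for the given degrees of freedom; the same calculation applies to the y coordinates.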

| Study sites
The study site is a permanent field site within the Inner Mongolia to prevent grazing. At the time of exclosure, the site was considered to be in excellent condition, representative of an undisturbed climax steppe community (Bai, Han, Wu, Chen, & Li, 2004). The Stipa grandis community represents the most widely distributed type of grassland community across the Eurasian steppe region (Wang, Yong, & Liu, 1985). The area immediately outside of the site is open to large animal grazing and has become seriously degraded (Li, Wang, Liu, & Jiang, 2008; Wang, Liu, Hao, & Liang, 1996). In the region, dark chestnut soils are present, with a humus layer 20-30 cm thick and a calcic layer at 50-60 cm below.

FIGURE 1 A flat 5 m × 5 m sampling block was chosen in an objective community in the desert steppe

Average annual temperature is 0°C with mean yearly precipitation around 350 mm. Interannual precipitation varies between 180 and 550 mm with 60%-80% falling during the summer season from June to August. This is typical of the temperate-semiarid climate of this region (Li et al., 2008). Annual potential evaporation ranges from 1,600 to 1,800 mm. Perennial plant species germinate following the rains that occur in early July, but the growing season technically runs from early April to late September.
In July 2017, we chose S. grandis as the study species and selected three 5 m × 5 m replicate community blocks within the site and three outside the site. In each of the replicate blocks, the individual locations of S. grandis were measured by using our new method with digital photographs and GIS (Figures 5a and 6a).

FIGURE 3 Coordinates of the control points in a 5 m × 5 m study plot. (•) control points (the vertexes of each subblock)

FIGURE 4 Example data collected to assess the reliability of the method developed in this study. (a) Mapped point pattern of observed data, as measured with a ruler (○) and by using digital photographs and geographical information system (•). (b) Analysis of the spatial pattern of observed data in the 5 m × 5 m study plot collected with a ruler (red lines) and with our new method (digital photographs and geographical information system) (blue lines). The black lines show the confidence limits of the pair correlation functions. The confidence limits were constructed using the highest and lowest g(r) from 199 replicates of the homogeneous Poisson process (complete spatial randomness)

| Data analysis
The pair correlation function g(r) (Stoyan & Penttinen, 2000; Stoyan & Stoyan, 1994) and Ripley's K(r) function (Ripley, 1976, 1977, 1981) are commonly used in the analysis of point patterns (Wiegand & Moloney, 2004). The pair correlation function g(r) and Ripley's K(r) function are both based on the distribution of distances between pairs of points. The quantity λK(r) gives the expected number of points found within a distance r of an arbitrarily chosen point, where λ is the point process intensity of the pattern. Here, K(r) is based on all distances between points in the pattern. The pair correlation function g(r) is derived from the K(r) function, where g(r) = (2πr)⁻¹ dK(r)/dr. K(r) and g(r) are related to both the cumulative distribution function and the probability density function of distances between pairs of points (Diggle, 2003; Stoyan & Penttinen, 2000).
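To make the ring-based interpretation of g(r) concrete, the following Python sketch gives a deliberately naive estimate: it counts ordered pairs of points whose separation falls in a ring around r and divides by the count expected under complete spatial randomness at the same intensity. It applies no edge correction, so it is a simplification and not the estimator used in the analysis software; the function and parameter names are assumptions made for illustration.

```python
import math

def pair_correlation(points, r, half_width, area):
    """Naive estimate of g(r) without edge correction.

    Counts ordered pairs whose distance falls in the ring
    [r - half_width, r + half_width) and divides by the number expected
    under complete spatial randomness with the same intensity.
    """
    n = len(points)
    if n < 2 or r - half_width < 0:
        raise ValueError("need >= 2 points and r >= half_width")
    lo, hi = r - half_width, r + half_width
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and lo <= math.hypot(xi - xj, yi - yj) < hi:
                pairs += 1
    ring_area = math.pi * (hi ** 2 - lo ** 2)
    expected = n * (n - 1) / area * ring_area   # ordered pairs expected under CSR
    return pairs / expected

# A perfectly regular 5 x 5 lattice with 1 m spacing in a 5 m x 5 m window
lattice = [(0.5 + i, 0.5 + j) for i in range(5) for j in range(5)]
print(pair_correlation(lattice, 0.5, 0.1, 25.0))   # no neighbors closer than 1 m
print(pair_correlation(lattice, 1.0, 0.1, 25.0))   # excess of pairs at the lattice spacing
```

Values below 1 indicate fewer neighbors at that distance than expected under randomness (overdispersion), and values above 1 indicate more (clustering).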
In this study, the pair correlation function g(r) was used to detect spatial point pattern characteristics. Unbiased interpretations of the pair correlation function g(r) require the selection of an appropriate null model that addresses the specific biological questions being asked (Wiegand & Moloney, 2004). Here, nonparametric methods are used to estimate the intensity function λ(x, y) directly from the data using smoothing techniques based on kernel estimators (Wiegand & Moloney, 2014). From observations made in the field prior to this study, we found that environmental con-

All analyses were conducted using Programita version 2014 (Wiegand & Moloney, 2014). We used this program to compare our observed data with the two null models described above. Confidence limits were constructed using the highest and lowest g(r) from 199 replicates of the null model. This led to an approximate type I error rate of alpha = 0.01.
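The link between 199 replicates and alpha = 0.01 follows from the usual Monte Carlo envelope argument: if the observed curve came from the null model, its value at a given distance would be equally likely to occupy any rank among the 200 values, so it falls outside the highest and lowest simulated values with probability 2/200. The Python sketch below illustrates this under that standard reasoning; the function names are hypothetical and the code does not reproduce the analysis software.

```python
def pointwise_alpha(n_sim, rank=1):
    """Approximate two-sided pointwise error rate of a simulation envelope
    built from the rank-th lowest and rank-th highest simulated values."""
    return 2 * rank / (n_sim + 1)

def envelope(sim_curves):
    """Lowest and highest simulated g(r) at each distance bin."""
    lower = [min(vals) for vals in zip(*sim_curves)]
    upper = [max(vals) for vals in zip(*sim_curves)]
    return lower, upper

def classify(observed, lower, upper):
    """Label each distance bin relative to the envelope."""
    labels = []
    for g, lo, hi in zip(observed, lower, upper):
        if g > hi:
            labels.append("clustered")
        elif g < lo:
            labels.append("overdispersed")
        else:
            labels.append("within envelope")
    return labels

print(pointwise_alpha(199))   # 0.01
```

Note that this error rate applies to each distance separately; inspecting many distances at once inflates the overall chance of at least one excursion.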
In our study, we analyzed the observed data for each of the three replicates and combined the three replicates in each site into a single weighted pair correlation function g(r) (Diggle, 2013; Wiegand & Moloney, 2014).

(Baatz et al., 2002). Digital images and GIS are also used to analyze spatial patterns based on point patterns, which are in some cases based on Ripley's K function (Malkinson, Kadmon, & Cohen, 2003). However, few past studies have considered spatial population patterns in grassland communities by using point pattern analysis on data collected by digital photography and GIS.

| Sampling method by using digital photographs and GIS
In this study, a new method has been proposed to measure spa-

L. chinensis is a rhizome grass, and its aboveground stem diameter is about 2 mm; (3) we carefully measured the individual coordinates of L. chinensis by using a steel ruler with a millimeter scale. We compared the results of the two methods using a t test (Table 1), and found no significant differences in either (1) the coordinates of L. chinensis or (2) the pattern of L. chinensis (Figure 4). Thus, the data obtained using our new method were reliable enough to perform point pattern analysis with a null model in grassland communities. At present, high-precision GPS, such as Leica Viva GS15 (±1 cm), is the most convenient and accurate positioning method. However, when the distance between neighbors is less than 10 cm, the relative error with using high-precision GPS is greater than 10%, which is not practical for measuring fine-scale

Compared with other methods, our new method has its own advantages for spatial point pattern analysis in grassland communities.
First of all, compared with traditional methods, such as the ruler, our method reduces fieldwork. Second, compared with remote sensing or unmanned aerial vehicle technologies that are mainly aimed at large-scale population patterns in forests or shrubs, our new method is better able to detect small-scale population patterns in grassland communities. Third, compared with high-precision GPS, which is suitable for forests and shrubs, our new method is more accurate for grassland communities where plants are smaller and closer together.
In addition, the current high-precision GPS positioning system is expensive, which can make it economically infeasible for some studies.
Finally, our new method captures digital photograph data, which may be revisited more easily than field samples and corrected a posteriori. Overall, our new method is fast and accurate at collecting data for population point analysis in grassland communities.
To ensure the highest accuracy when using our new method to determine population patterns, we offer a few recommendations.
This new method of combining digital photography with GIS is mainly suitable for herbaceous plant communities and communities of small shrubs, especially in grasslands. It has low feasibility for forests and large-sized shrub communities.
With regard to digital cameras, the early Nikon D100 with 6.0 megapixels (lens focal length is 30 mm) was found to accurately mea-

For the subquadrats in a sampling block, a size of 0.5 m × 0.5 m is reasonable. It can be useful for the subquadrats to divide the sampling area into some integer number of subareas, thereby making it easy to determine the coordinates of control points. Also important is that the larger the subquadrat, the higher the shooting height.
Therefore, a subquadrat size such as 1 m × 1 m may be more difficult for a photographer to shoot.
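The relationship between subquadrat size and shooting height can be illustrated with a simple pinhole-camera sketch in Python. It assumes a camera pointing straight down over flat ground and ignores lens distortion. The 30 mm focal length and 1.75 m height come from the text, whereas the sensor dimensions below are an assumption introduced for illustration and are not stated in the source.

```python
def ground_footprint(height_m, focal_mm, sensor_w_mm, sensor_h_mm):
    """Ground area (width, height in m) seen by a camera pointing straight
    down, under a simple pinhole model on flat ground."""
    scale = height_m / focal_mm                       # metres of ground per mm of sensor
    return sensor_w_mm * scale, sensor_h_mm * scale

def min_height(side_m, focal_mm, sensor_short_mm):
    """Lowest camera height (m) at which a square of side side_m fits
    within the short side of the frame."""
    return side_m * focal_mm / sensor_short_mm

SENSOR_W, SENSOR_H = 23.7, 15.6   # mm; assumed sensor size, not from the source
FOCAL = 30.0                      # mm, from the text

print(ground_footprint(1.75, FOCAL, SENSOR_W, SENSOR_H))
print(min_height(0.5, FOCAL, SENSOR_H), min_height(1.0, FOCAL, SENSOR_H))
```

Under this model the required height scales linearly with the subquadrat side, so doubling the side from 0.5 m to 1 m doubles the minimum shooting height for a given lens and sensor.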
When using this method to locate individuals in a population, it is necessary to accurately identify species from their digital photographs. In grassland communities especially, it can be difficult to distinguish species from each other in this way. Thus, it was important to mark each species uniquely before taking the pictures in order to facilitate their identification (Figure 2b). This can increase the amount of fieldwork. Moreover, when manually digitizing plant locations in GIS, caution must be taken to include every individual inside the subquadrat. To avoid digitizing plant individuals outside the subquadrat, a subquadrat boundary can be added through the vertex of the subquadrat using the R2V software (Figure S9).

| Example point pattern analysis of Stipa grandis in a typical steppe
In the example, individuals of S. grandis were overdispersed in the ungrazed S. grandis community, according to the homogeneous Poisson process at small scales (at 0-0.16 m) (Figure 7a). This suggests that competitive interactions dominated the ungrazed S. grandis community (Grime, 1973). Meanwhile, in the grazed S. grandis community, individuals were clustered at 0-0.3 m (Figure 7b). This suggests that habitat heterogeneity can cause individual aggregation. Therefore, we chose the heterogeneous Poisson process to investigate the individual effects of habitat heterogeneity in the grazed S. grandis community. We found that individual aggregation can be attributed to habitat heterogeneity in grazed communities at 0.19-0.3 m, while they are still clustered at 0-0.19 m (Figure 7c). Moreover, we also know that facilitation can bring about individual clustering (Jia, Dai, Shen, Zhang, & Wang, 2011). Facilitation is expected to be more intense than competition under high abiotic stress or consumer pressure (i.e., Bertness & Callaway, 1994; Callaway, 2007; Kikvidze, Suzuki, & Brooker, 2011 (Murrell, Purves, & Law, 2001). As such, further experimental work is necessary for manipulating these different processes.
In our example, three 5 m × 5 m community blocks were established for point pattern analysis. Although point pattern analysis via replicated sampling has been previously proposed (Diggle, 2013; Wiegand & Moloney, 2014), few studies have analyzed spatial population patterns in this way. Here, population patterns of S. grandis differed among the three replicated blocks in the grazed communities (Figure 6b,c). This illustrates that point pattern analysis of a single plot could be misleading for understanding the full population. It is more reliable to integrate the data of replicated plots into point pattern analysis with the pair correlation function g(r), which is a weighted average.
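The idea of a weighted average across replicates can be sketched in Python as follows. The weighting scheme is an assumption: here each replicate is weighted by its number of points, which is one common choice, and the source does not specify the weights actually used. The curve values and point counts are hypothetical.

```python
def combine_replicates(g_curves, n_points):
    """Weighted mean of replicate g(r) curves, one weight per replicate.

    g_curves: list of equal-length lists, one g(r) curve per replicate
    n_points: number of points in each replicate, used as weights (assumed)
    """
    if len(g_curves) != len(n_points) or not g_curves:
        raise ValueError("need one point count per replicate curve")
    total = sum(n_points)
    return [
        sum(w * g for w, g in zip(n_points, vals)) / total
        for vals in zip(*g_curves)
    ]

# Hypothetical g(r) values at three distances for three replicate blocks
curves = [[1.8, 1.2, 1.0], [0.9, 1.0, 1.0], [1.5, 1.1, 0.9]]
counts = [120, 80, 100]
print(combine_replicates(curves, counts))
```

Weighting by point count lets replicates with more individuals, and therefore more point pairs, contribute more to the combined curve than sparse replicates.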

| CONCLUSIONS
Our study successfully designed a new sampling method that employs digital photography and a geographical information system to collect experimental data for the spatial point patterns within a grassland community. Our example showed that the point pattern characteristics of populations in grassland communities can be revealed by our new sampling method and that the underlying processes that create the spatial patterns can be also explored by using null models.

Ke Fang, and Chao Li for their contributions to data collection. We thank Thorsten Wiegand for providing the Programita software. We would like to thank Elizabeth Tokarz at Yale University for her assistance with English language and grammatical editing.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The data are available on Dryad: https://doi.org/10.5061/dryad.brv15dv70.