The scale effects of the spatial autocorrelation measurement: aggregation level and spatial resolution

ABSTRACT The scale effects of the spatial autocorrelation (SA) measurement has been explored for decades. However, the effects of the data aggregation levels and spatial resolution on the SA measurement are often confused. Whether the two types of scale effects are the same is still unclear and requires further investigation. We retrieved the land surface temperature (LST) from Landsat 8 images in 30 capital cities of China. By aggregating the LST images, we observed a decrease in the SA of the LST as the data aggregation level increased; this relationship can be fitted well with a negative logarithmic function. We derived an indicator to measure the scale effects intensity of SA, which was negatively correlated with the spatial complexity of LST. Both aggregating images and the increasing spatial resolution induce weaker SA, but the effect of the former was stronger. The aggregating images negatively affected the SA degree regardless of the spatial resolutions of the original images. The SA degrees of the aggregated images were far below those of the real-life images. This study suggests that the scale effects caused by aggregation levels and spatial resolutions are different, and cautions should be taken when applying relevant conclusions derived from aggregating images.


Introduction
Spatial autocorrelation (hereinafter referred to as SA) is a crucial concept among spatial analyzers and modelers Ord 1969, Getis 2008). It denotes a property of random variables, at pairs of locations a certain distance apart, that are more similar (positive autocorrelation) or less similar (negative autocorrelation) than expected for randomly associated pairs of observations (Hubert andGolledge 1981, Legendre 1993). Numerous indices have been developed to measure the SA intensity, such as Moran's I and Geary's C (Moran 1948, Geary 1954. These indices are extensively used to test the hypothesis of the sample independence in traditional statistics, since most geographic phenomena show an obvious high degree of SA (Cliff and Ord 1972, Anselin 1988, Foresman and Luscombe 2017. We can also gain insight into the underlying process of geographic phenomena that operate across space through the SA measurement (Jones and Subramanian 2013). However, a recurring problem with the SA measurement is its sensitivity to the spatial scale.
Although numerous studies have been conducted on the scale effects of the SA measurement (Legendre and Fortin 1989, Wiens 1989, Sayre 2005, Lloyd 2014, Fischer 2015, the 'scale' in most of them refers to the operational scale in geostatistics (i.e. the scale of spatial variance) instead of the spatial resolution of satellite images. The operational scale is an inherent property of geo-spatial phenomena that represents the spatial extent at which a specific phenomenon operates. It is also a transition point from a pattern with a strong positive autocorrelation to a random pattern, or a pattern with weaker autocorrelation or without autocorrelation (Meisel andTurner. 1998, Diem 2003). For example, Wang et al. (2016) found that the operational scale of the land surface temperature (LST) was 500-650 m in a semivariance model of the LST retrieved from a Landsat-7 ETM+ image. In this case, the operational scale implied that the spatial variability of LST reached the maximum when the distance between the LST samples was 500-650 m, which was also the optimal distance of samples for inspecting the spatial pattern of LST. The decreasing SA measurement with increasing operational scales (i.e. correlogram) has been generally recognized in spatial science and the remote sensing community and is widely used to identify operational scales in ecology and environment science (Dungan et al. 2002, Diem 2003, Negreiros et al. 2010, Fischer 2015, Cheadle et al. 2017. The operational scale affects the SA measurement by changing the spatial relationship between samples, which usually occurs in single data. However, the effects of the satellite image resolution only make sense for multiple data, because the spatial resolution for a specific image is invariant. Furthermore, the response of the SA measurement on the changing spatial resolution was still unclear. For example, the spatial resolution of the MODIS image is 1 km, while it is 30 m for the Landsat image. As a result, images of different resolutions usually bring about different SA measurement values when exploring the same phenomenon at a specific position. These differences lead to misperceptions of the real world and further erroneous practical guidance. Since it is difficult to collect multiple data of continuously different spatial resolutions, the effects of the spatial resolution on the SA measurement was explored mostly based on data aggregation (Woodcock and Strahler 1987, Goodchild and Quattrochi 1997, Song et al. 2014, Wang et al. 2016. A series of coarse images of different spatial resolutions through aggregating a fine image were commonly obtained, and the response of the SA measurement was hence analyzed. For example, Qi and Wu (1996) explored the effects of changing aggregation level on the results of a landscape pattern analysis using SA indices. Wu (2004) revealed the scaling relations of landscape pattern indices through image aggregation. However, the correctness and validity of data aggregation have been questioned, and few studies have examined it. In regard to scale effects of SA measurement, we are apt to confuse the effects of aggregation levels and changing spatial resolutions on the SA measurement. The effects of the aggregation level on the SA measurement is the scale dimension of the modifiable areal unit problem (MAUP), which is also called the 'scale problem' (Openshaw 1984). Another aspect of MAUP is called the 'zoning problem' or the 'aggregation problem', which represent the effects of data aggregation schemes at a given aggregation level (Alexandridis et al. 2010). Numerous studies (Arbia et al. 1996, Wong 2004, Dark and Danielle 2007 investigated them from various perspectives since Openshaw (1984) described the problem in detail. The 'scale problem' was a particular focus of the remote sensing community because of the multiresolution characteristics of remote sensing. It has also been considered to be a core issue of remote sensing science.
Furthermore, most previous studies on the effects of aggregation levels on the SA measurement were based on single data at a specific study area, which failed to provide general knowledge on the scale issue (Fortin 1999, Xu et al. 2017. The conclusions in their studies may not be applicable beyond their study area because of spatial heterogeneity (Goodchild 2009). Therefore, is there a universal and explicit rule for the variability of the SA measurement with increasing aggregation levels? Can we quantitatively describe this rule with a reasonable formulation? If so, the SA intensity for a specific image will be easily obtained, and the scale effects intensity of the SA measurement will also be quantified. Furthermore, most previous studies on the scale issue were limited to describing relevant phenomena rather than analyzing potential causes (Woodcock andStrahler 1987, Cheadle et al. 2017). Is there a potential association between the spatial pattern of the original fine image and the scale effect intensity induced by data aggregation? Exploring its influential factors is beneficial for avoiding negative outcomes that are induced by the scale effects of the SA measurement.
Spatial resolution is an intrinsic attribute of remote sensing images, the value of which is dependent on the data acquisition scheme. It is unclear whether coarse images that are transformed from fine images are capable of describing the geographic phenomena of the corresponding observation scale. Therefore, a systematic analysis of the effects of aggregation levels and spatial resolutions on SA measurement is necessary (Lam and Quattrochi 1992, Hallisey et al. 2017, Flanagan et al. 2018. To achieve this goal, we have to collect at least two remote sensing images of different spatial resolutions with near overpass time. Landsat and MODIS images provide a good opportunity to comprehensively analyze the scale effects of the SA measurement. In thermal infrared remote sensing for the urban climate and environmental studies, Landsat and MODIS images are most widely used to retrieve the land surface temperature (hereafter referred to as LST), which is a key parameter in the physics of land surface processes. Numerous studies have also shown that the LST exhibits strong spatial autocorrelation (Das Majumdar and Atrayee 2016, Du et al. 2016), but few studies have compared the SA measurement of the Landsat LST and the MODIS LST. Furthermore, Landsat and MODIS (Terra) images are also temporally comparable with a little discrepancy.
In this study, we retrieved the LST from Landsat 8 images in 30 capital cities in China as a sample dataset. The corresponding MODIS datasets (MOD021KM and MOD11A1) were also collected for validation and comparison. After preprocessing the LST, this study will be composed of two aspects: the effects of increasing aggregation levels and the changing spatial resolutions on SA measurement. For aggregation levels, we will identify the operational scale of LST in each city first, in Section 4.1. Then, we will quantify the relationship between the SA of LST and the data aggregation levels in Section 4.2. After that, we will investigate the potential causes of the scale effects, as described in Section 4.3. On the other hand, the effects of spatial resolutions on the SA measurement will be analyzed in Section 4.4. The coupling effects of the aggregation levels and spatial resolutions on SA measurement will also be discussed in Section 4.4. Finally, we will provide several concluding remarks. To our knowledge, this is the first systematic, comparative study on the effects of both the aggregation levels and the spatial resolution of satellite images on the SA measurement.

Study area and data
In this study, we select 30 capital cities in China as samples (Figure 1). The criteria for determining samples of cities are as follows: (1) The sample cities are expected to have a relatively high population density and intensive human activities, which is one of the most significant causes of LST (Amiri et al. 2009). China is experiencing rapid urbanization and all of the 30 cities have at least 1 million urban residents in 2015, with several megacities like Beijing and Shanghai reaching more than 20 million people (China's National Bureau of Statistics 2015, Liu et al. 2015). (2) The sample cities should cover as diverse as possible types of spatial pattern for the comprehensiveness of the results, because previous studies have revealed that urban form (e.g. landscape composition and configuration) has remarkable impacts on LST (Estoque et al. 2017). It is noticeable that the 30 capital cities meet this requirement, because they have been experiencing urbanization with different growth processes and expansion speeds. (3) Climate should also be considered as an indispensable factor. Our sample cities are located in different climate zones, reducing the risk of biased consequences. Lhasa was not included because of its small urban size and the inaccessibility of LST data in the same periods.
Cloud-free Landsat 8 images, which were acquired at an urban area of sample cities during the warm season (April-October) during the period 2013-2016, were utilized in this study (Table 1). The images were selected because LST is considerably high during the daytime of the warm season in urban areas. In addition, the well-documented MODIS LST products MOD11A1 and MOD021KM datasets in each city at the corresponding time were also collected.

Methodology
We provide a methodological flow chart for a systematic view ( Figure 2). It involves six major steps: (1) retrieving and validating LST; (2) identifying the operational scale of LST using a semivariogram; (3) analyzing the relationship between the SA of LST and scale  transformation through mean pixel aggregation; (4) describing the spatial complexity of LST using the fractal dimension of LST; (5) identifying the association between the LST spatial complexity and the effect of aggregation levels on SA; and (6) analyzing the coupling effects of the aggregation levels and spatial resolutions on SA using Landsat 8 and MODIS images.

LST retrieval
We employed the practical split-window (SW) algorithm proposed by Du et al. (2015) to retrieve the LST. The reasons why we chose the SW algorithm are as follows: (1) it is relatively simple and highly efficient; (2) accurate information about the atmospheric profiles during satellite acquisition is not required, unlike the single-channel (SC) algorithm, and it is difficult to obtain; and (3) the algorithm was proven to be reliable through its application in different regions based on the analysis of Du et al. (2015).
The algorithm expresses the LST as a nonlinear combination of the two brightness temperatures measured in the two TIR channels: where T s is the LST retrieved from the Landsat 8 TIRS data; T i and T j are the TOA brightness temperatures measured in channels i (~11.0 μm) and j (~12.0 μm), respectively; ε is the average surface emissivity of the two channels (i.e. ε = (ε i + ε j )/2), while Δε is the channel surface emissivity difference (i.e. Δε = ε i -ε j ); and b k (k = 0, 1,…, 7) are the algorithm coefficients predetermined from regressing the simulated satellite data with a set of atmospheres and surface parameters. The thermodynamic initial guess retrieval (TIGR) atmospheric profiles and the American Advanced Spaceborne Thermal Emission Reflection (ASTER) emissivity database were used to represent a worldwide set of atmospheric and surface conditions, respectively (Chevallier et al. 2000, Hulley andHook 2009). The coefficient values were provided for different levels of atmospheric column water vapor (CWV) to improve the accuracy (Table 2). Other details of the algorithm were provided by Du et al. (2015).
To validate the LST precision, we employed the well-documented MODIS LST products MOD11A1 in each city at the corresponding time. The retrieved Landsat 8 LST images were resized to a resolution of 1000 m through the mean pixel aggregation method and then were compared with the MOD11A1 images. The results are shown in Table 2. The coefficients b k (k = 0, 1…7) for different levels of CWV.  Considering the diverse land cover and land use in 30 sample cities and the large validation of samples, the precision is acceptable.

Semivariogram
Many tools or methods have been developed to identify the operational scale of spatial phenomena. In this study, we employed the empirical semivariance and the semivariogram, which proved to be an effective tool for identifying the operational scale of LST. The semivariance analysis assesses the contribution to the total variance of the sample made by the variance of pairs of points that are separated by a certain lag distance (Meisel and Turner. 1998). The formula is shown in (2): whereγðhÞ is the semivariance between two LST observation samples z(x i +h) and z(x i ) in a pair with lag h, z is a pixel value of LST at a particular location, h is the distance between pixels, and n(h) is the number of paired pixels at a distance of h.
As for the semivariogram, the first step is to calculate semivariance of pairs of points that are separated by different lag distances from zero to a specific value, usually less than half the extent of the study area. Then, semivariance against the lag distance is plotted, typically monotonically increasing from zero to a specific value, which is termed as a sill. In an ideal semivariogram, the semivariance will start to level off at a certain lag distance, which is commonly described as the range. The range represents the extent beyond which variables of interest exhibit a random pattern or weaker SA or without SA and is thus identified as the operational scale. To determine the precise values of sill and range, a continuous curve is usually obtained using a recognized model (e.g. spherical, exponential) to fit the discrete points in the semivariogram (Porter 1974, Meisel and Turner. 1998, Sanden and Hoekman 2005. In this study, we used one of the most widely used models, the spherical model, to identify the first inflection points in the semivariogram. The inflection points mean where the semivariance starts to remain stable or changes abruptly. It indicates the range of the influence of the main ecological or geographic process. Previous studies have shown that the lag distance of the first inflection point in the semivariogram represents the operational scale of phenomena (Meisel andTurner. 1998, Diem 2003). Although the statistical significance of the operational scale identified in the semivariogram cannot be tested, it remains one of the tools widely used by geographers and ecologists to explore the spatial structure of the phenomenon.

Spatial autocorrelation analysis for multiple aggregation levels
In this study, we will use Moran's I index to measure the SA of the LST (Anselin 1996, Getis 2008, Negreiros et al. 2010, Dray 2011). Moran's I index ranges from −1 to 1, with the positive and negative index representing similar and dissimilar values clustered together in a specific extent, respectively. Moran's I index will be zero if the variable is stochastically allocated. The formula is shown in (3): where N is the total number of pixels, x i and x represent the i th pixel value and the mean value of the LST image, respectively, and w ij denotes the spatial weight of the i th pixel and the j th pixel, which was conceptualized using inverse distance weights within a threshold distance that is the same as the pixel size in this study. Given the different sizes of the urban area in each city and the intensive computation due to the large number of pixels, we have to resize our data to maintain data consistency. The impact of the spatial extent on SA was thus not included in this study. The LST data in each city was confined to a 36 km × 36 km square, nearly encompassing the whole urban area of each city. Each LST image was resized with a mean pixel aggregation method, from the original resolution (30 m) to 3600 m (i.e. 30, 60, 90, 120, 150, 180, 240, 300, 360, 450, 480, 600, 720, 750, 900, 1200, 1440, 1500, 1800, 2250, 2400, 3000, 3600 m), which is 23 aggregation levels altogether. It is noticeable that the urban area of several megacities is much larger than the square, but their core urban areas are included in our study area, which does not preclude our analysis. Although information loss exists during this procedure, the spatial pattern of LST remains stable based on the stable mean and spatial distribution (see Figure S1 and S2 in the supplementary material). We then calculated Moran's I index of the LST at 23 aggregation levels and analyzed the relationships between the SA of the LST and the corresponding aggregation levels. To quantitatively describe the effects of data aggregation levels on SA, we chose a linear least square method to fit a natural logarithmic linear function for each city (Equation (4)). Because the operational scale is also indicative of the threshold range of the spatial autocorrelation (Meisel and Turner. 1998), the LST will exhibit a random pattern or a pattern with weak autocorrelation or without autocorrelation when the aggregation level exceeds the operational scale. This finding implies that the SA of the LST varies unpredictably with changing aggregation levels beyond the operational scale. The fitting was, therefore, limited to aggregation levels between 30 m and the operational scales: where I m represents Moran's I value of LST at a spatial resolution of m; a and b are the regression coefficients; and OS denotes the operational scales of LST identified by the semivariogram.
To investigate the similarity and difference for the response of the SA measurement to aggregation levels and spatial resolutions, we collected the spectral radiance measured on top of the atmosphere (TOA) in the thermal infrared channel of Landsat 8 TIRS10 and MODIS Band 31 (hereafter referred as TIRS10 and MODIS31) in 24 sample cities. Six other cities are excluded because they are located at the gap between the Modis images. The multiscale spatial autocorrelation analysis of TIRS10 and MODIS31 was also performed by mean pixel aggregation. Analyzing the radiance directly instead of the LST also eliminates the uncertainty caused by the retrieval algorithm. Note that the MODIS31 was resized from 1000 m to 2000 m and 3000 m. The spectral radiance was calculated using (5): where L λ is the spectral radiance in units of W/(m 2 ·sr·µm), and Gain and Offset can be obtained from the metadata of the sensors.

Differential box-counting fractal dimension calculation
It is well-known that the complexity and irregularity of the spatial data can affect the measure of SA, while it is difficult to describe the nature of complex objects using traditional Euclidean geometry (Lam et al. 2002, Mark 2015. The fractal dimension, which is a ratio providing a statistical index of complexity comparing how details in a pattern (strictly speaking, a fractal pattern) change with the scale at which it is measured (Falconer 2003), can effectively describe the self-similarity and complexity of spatial data. We will thus calculate the fractal dimension of LST to investigate its correlation with the SA of the LST. In this study, we employed the differential boxcounting (DBC) method to estimate the fractal dimension of LST in each city, hereinafter referred to as DBFD.
The basic principle to estimate the DBFD is based on the concept of self-similarity. The DBFD of a bounded set A in Euclidean n-space is defined as where N r is the least number of distinct copies of A in the scale r. The union of N r distinct copies must cover set A completely. Detailed explanations of the DBC method could be seen in Sarkar and Chaudhuri (1994).

Identification of the operational scale
The empirical semivariogram was used to analyze the spatial structure and the spatial heterogeneity of LST (Figure 4). The lag corresponding to the blue dash line in each subplot denoted the operational scale identified by the spherical modeling. The semivariance monotonically increasing trend indicated that LST exhibited stronger heterogeneity with the lag distance increasing, and larger scales were acceptable for analyzing the spatial pattern of LST (Wang et al. 2016). Detailed identification results of operational scales for each city are as shown in Table 3. The operational scale of LST differed greatly among cities, but most of them were not higher than 1 km, which is similar to previous studies (Song et al. 2014, Wang et al. 2016. Different operational scales of LST are determined by the different magnitudes of the spatial heterogeneity of LST. The exceptionally high operational scale in Urumqi (1140 m) is related to the relatively heterogeneous land cover and landscape pattern (Estoque et al. 2017). Moreover, the semivariograms for several cities appeared with more than one inflection points. This finding is attributed to the multiple scales of the spatial pattern of LST (Meisel andTurner. 1998, Fu andWeng 2016).

Relationships between SA and the data aggregation levels
Through ordinary least square fitting, we found that the variation of SA with the data aggregation levels could be well described through the logarithmic function (Equation (4)), as shown in Figure 5. Obviously, the SA of LST decreased with aggregation levels increasing, and the rate of decrease gradually slowed down. However, the SA declined with contrasting speed and degrees. Tianjin, for instance, experienced a relatively large drop from approximately 0.95 to 0.55 after scaling up from 30 m to 1000 m. Harbin also experienced similar variation, but it always remained a high SA, with Moran's I index approximately 0.85 at a scale of 1000 m. Furthermore, the LST in each city was highly spatially autocorrelated at the original scale (30 m), with Moran's I index greater than 0.9, which indicated that the LST was closely correlated with the LST nearby. The estimated parameters for the natural logarithmic linear functions (Equation (4)) are shown in Table 4. The fitted logarithmic functions quantitatively describe the decline trend of the SA of LST, with aggregation levels increasing. In the case, the parameters a denote some basic characteristics of the scale effects, while the coefficient b denotes the intercept of the logarithmic function on the Y-axis, which does not possess any physical meaning, because the m can never be zero. The coefficient a represents the decrease rate of SA with respect to the logarithm of aggregation levels, in the sense that the absolute value of a could be regarded as a measure of the magnitude of scale effects. As shown in Table 4, values of coefficients a range from −0.14 to −0.04, thus representing a great deal of difference in the scale effects intensity of SA of LST. Tianjin, for example, possessed the largest absolute value of a (−0.14) among these cities, which in turn appeared to be a relatively steep curve in Figure 5. The −0.14 denotes that a unit increase in the logarithm of scales was predicted to weaken the SA of LST by 0.14.

LST spatial complexity and effect of aggregation levels on SA
We also observed that a large proportion of cities witnessed relatively weak scale effects, with a value a larger than −0.1 ( Figure 5 and Table 4). Now that the value a denoted the scale effect intensity that was induced by data aggregation, we investigated the relationship between the parameter a and spatial complexity of LST. It has been proven that the fractal dimension offers capability of describing spatial complexity of irregular surface. We calculated the differential box-counting fractal dimension (DBFD) of the LST in each city, as shown in Figure 6. Obviously, the DBFD of LST in 30 capital cities differs greatly with the minimum in Wuhan (a little lower than 2.2) and the maximum in Lanzhou (approximately 2.32). LST is primarily determined by climate and meteorology, while other related variables affect the local variation of LST, with a minor magnitude (Berger et al. 2017). Therefore, the LST surface has low spatial complexity, and the DBFD is unlikely to reach a high value (i.e. approaching 3). The Pearson correlation coefficients between the regression coefficients a in Equation (4) and the DBFD of the LST at the original scale were calculated. We found that the DBFD of the LST was significantly positively correlated with the regression coefficients a (r = 0.58, p < 0.01), as shown in Figure 7. In other words, the DBFD of the LST at the original scale was negatively correlated with the scale effects intensity (i.e. absolute value of a) of the SA. Since the DBFD reflects the spatial complexity, this analysis demonstrates that the spatial complexity of the data has a negative impact on the scale effects of SA. The SA measurement of data with a higher spatial complexity decays more slowly with increasing aggregation levels. Figure 8 shows the effects of the data aggregation level on the SA of the spectral radiance of TIRS10 and MODIS31 images. Although the SA of TIRS10 and MODIS31 both decreased with the increasing aggregation level, they varied greatly. For example, the SA degrees of the original MODIS31 (1000 m) were not equal to that of TIRS10 aggregated at 1000 m. The former showed a higher spatial autocorrelation than that of the latter. This implied that the SA degrees of aggregated images were far below those of the reallife images. For effects of spatial resolutions, the SA degrees of MODIS31 at 1000 m were also below those of TIRS10 at 30 m, but they were nearly equivalent in some cities (Shanghai and Zhengzhou, for example). This finding indicated that the SA degree also declined with increasing spatial resolution, with a slower speed than that for aggregation levels. This finding also indicated the different effects of data aggregation levels and spatial resolutions on SA measurement. We observed similar descent rates of SA of TIRS10 and MODIS31 with increasing aggregation levels, as manifested by the parallel Figure 7. The correlation between the regression coefficient a in Equation (4) and the differential box-counting fractal dimension of LST. lines in Figure 8. This finding implied that the data aggregation levels negatively affected the SA degree, regardless of the spatial resolutions of the original images.

Discussion
For the first time, this study compared the effects of the aggregation level and spatial resolutions on the SA measurement. Using the remote sensing data of the LST in 30 capital cities in China as samples, we observed a negative logarithmic rule in the variation of the SA degree with the increasing data aggregation level. This finding indicates that spatial heterogeneity increases with data aggregation. More importantly, this is different from Arbia's second law of geography, which reads, 'Everything is related to everything else, but things observed at a coarse spatial resolution are more related than things observed at a finer resolution' (Arbia et al. 1996). Arbia's second law refers to the scale dependence of correlation between pairs of variables due to the smoothing effect of data aggregation (Miller 2004). Based on this and the results stated above, Arbia's 'Second Law of Geography' may not be applicable to the scale effect of univariate spatial autocorrelation, no matter for data aggregations or spatial resolutions. Therefore, our results may be a supplement to Arbia's work.
The effects of aggregation levels on statistical analysis are the scale aspect of the modifiable areal unit problem (MAUP), which has been recognized and analyzed for decades. The most famous examples on the MAUP should be the boundaries of the 520 English constituencies as reviewed by the Parliamentary Boundary Commission, which was introduced by Openshaw (1984). The negative logarithmic rule found in our study could be regarded as a quantitative examination on the MAUP. Previous studies on the MAUP were focused its effects on statistical models, and little quantitative conclusions were obtained. This study provided a compromising quantitative pathway to simulate the effects of the MAUP on the SA measurement, from the scale perspective only, even though we did not solve the MAUP. Another important type of study on the MAUP in the remote sensing community was to identify the optimal analysis scale (Song et al. 2014). However, most of these studies were conducted at a specific area based on a single dataset, which is not applicable to other areas and other datasets. In addition, these conclusions may be inconsistent in application because of the different responses of the SA measurement to aggregation levels and spatial resolutions (Figure 8). Caution should be given to relevant conclusions derived from the multiscale analysis for aggregating remote sensing images, particularly when applying these conclusions in practice. Future work could be focused on the effects of MAUP based on the multiple satellite images of different spatial resolutions, such as Landsat, ASTER and MODIS images.
The negative effects of data aggregation levels on the SA measurement are understandable according to the definition of SA and the formula of Moran's I index. Moran's I value is equal to the mean value of the local indicator of spatial association (LISA) (Anselin 1996). With the increasing aggregation level of LST, the mean values keep stable while the variances decrease steadily (Wang et al. 2016), which causes the z-score standardized values to approach zero. Consequently, the LISA value for each sample gradually approaches zero with the increasing aggregation level, and therefore, Moran's I decreases gradually. We also observed a relatively moderate negative effect of the spatial resolutions on the SA measurement. Hence, the negative scale effects of the SA measurement could be obtained no matter if aggregating images or changing to coarser images. Obviously, the high SA degree of the fine data denotes data that is spatially homogeneous, while the low SA degree of coarse data represents data that is dominated by spatial heterogeneity. From this perspective, the spatial homogeneity may be transferred to spatial heterogeneity through increasing spatial scales of data. Nevertheless, it is difficult to determine whether this is applicable to all geographic phenomenon. More datasets should be obtained to find a more general rule to describe the scale effects of the SA measurement.
We also observed that intensity of the scale effects varied by cities. The spatial pattern of land use and land cover (LULC) may be responsible for this. The LISA values are usually negative at the edge of different landscapes, such as waterbody and impervious surface. Therefore, the landscape complexity and fragmentation determine the number of negative LISA values and are also critical to Moran's I value. The sample cities in this study possess various landscape patterns and thus exhibit different spatial autocorrelation scale effect intensities of LST. Because accurate LULC data are not available and image classification is a labor-intensive process, we did not explore the relationship between LULC and the scale effects in this study. However, we found that the fractal dimension of the original LST is correlated with the scale effect intensity. As previous studies showed, the spatial pattern of LST and LULC is closely correlated (Fu andWeng 2016, Wang et al. 2016), the fractal dimension of LST may be treated as a quantification of the spatial complexity of LULC.
We also analyzed the effects of data aggregation levels and spatial resolutions on the SA measurement using TIRS10 and MODIS31 images. The spatial autocorrelation of coarse images aggregated from fine images is lower than that of the original images of the same resolution. Actually, we also investigated this based on LST (see Figure S3 in the supplementary material) and also found the same phenomenon. In terms of the discrepancy of the SA of TIRS10 and MODIS31, two reasons may be responsible for this. The first reason should be the difference in the spectral response function of the thermal infrared channels (Figure 9), despite nearly having the same bandwidth as the TIRS10 (10.60-11.19 μm) and the MODIS31 (10.78-11.28 μm). The difference will result in a large discrepancy in the spectral radiance measured on top of the atmosphere.
Another reason should be that the original TIRS10 and MODIS31 images contained different magnitudes of spatial variability. For example, Figure 10 shows the thermal infrared radiance measured on top of the atmosphere in Harbin, China. We can see that the relative high radiance captured by the TIRS10 (yellow part in Figure 10(a)) was replaced by the lower radiance in the MODIS31 image (light blue part in Figure 10(c)). The scale of those ground objects corresponding to the high radiance pixels in TIRS10 was much less than the spatial resolution of the MODIS images. The local high radiance was hence smoothed or covered up, since each pixel in an image is homogeneous when the sensors collect signals. In contrast, the TIRS10 images were fine enough to collect more information on the spatial variability of the spectral radiance. As a result, the different spatial variability intensities of the original images resulted in the aggregated TIRS10 images remaining with a high spatial stratified heterogeneity compared with MODIS31 images, as shown in Figure 10(b). Figure 10(b,c) and exhibit markedly different spatial patterns at similar resolutions (990 m and 1000 m). However, images from different sensors vary in multiple aspects, such as spectral response, solar and view zenith geometry, atmospheric correction algorithms, viewing time, and so on. The different spatial patterns of the phenomenon observed by different images is, therefore, understandable, even though their viewing time is similar.

Conclusions
Scale dependence lies at the center of studies on the spatial analysis and remote sensing community. For the first time, this study compared the effects of scale transformation (i.e. aggregation level) and observation scales (i.e. spatial resolutions) on the SA measurement. Taking 30 capital cities in China as samples, we observed a negative logarithmic rule in the variation of the SA measurement of LST with data aggregation levels. We also found that the scale effects intensity of the SA measurement was significantly correlated with the DBFD of original LST. Effects of the data aggregation levels and spatial resolutions on the SA measurement were different. The coarse images aggregated from fine images exhibited a lower spatial autocorrelation than other original coarse images of the same resolution. In addition, the SA measurement of images with different spatial resolutions (Landsat and MODIS) showed similar behavior in response to data aggregation levels.
The proposed scale effects negative logarithmic rule of SA in this study were quantified by mathematical functions and were validated by multiple samples, which was different from previous relevant rules that are qualitative and developed from a single sample. The rule offered a quantitative method for translating the SA from one scale to another. The logarithmic scale effects rule could be applied for other variables, because the LST is a typical spatially autocorrelated variable. More experimental case studies based on related variables are thus needed to validate our findings. Our analysis also showed that the scale effects intensity of the SA measurement is negatively correlated to the spatial complexity of data that is described through the DBFD in our study. The LST is also a special complex surface reflecting the spatial pattern induced by climate and human activities. However, only DBFD was considered to be an exploratory factor in our analysis. Future studies could include applications and the development of spatial pattern indices, such as landscape pattern indices, to seek potential influential factors of the scale effects of the SA, to attenuate the negative effects caused by it.

Disclosure statement
No potential conflict of interest was reported by the authors.