Mapping volumetric soil moisture in the Vietnamese Red River Delta using Landsat 8 images

ABSTRACT This study estimates the surface soil moisture content in a case study situated in the Vietnamese Red River Delta, using the Landsat 8 satellite images. The trapezoidal relationship between land surface temperature and vegetation index was used to obtain soil wetness indexes. A split-window algorithm was developed to overcome the missing of atmospheric data. The method was validated with ground truth across different land covers. The RMSE between the calculated and measured SMC ranges between 0.556 and 0.971 and varies across different types of land covers. The method is important to monitor SMC across large areas with limited surveyed data.


Introduction
Soil moisture content (SMC) is one of the important hydro-meteorological variables that influence the interactions between land surface and atmospheric processes (Mallick et al. 2009). However, reliable regional soil moisture data are usually available at a limited number of stations and are expensively determined through conventional point measurements (Mallick et al. 2009, Jacome et al. 2013, Sadeghi et al. 2017. Therefore, satellite data, that is, derived from Sentinel-1, Sentinel-2 and Landsat-8 observations are useful for SMC estimation. Remote Sensing (RS) techniques, including microwave and optical and thermal satellite, are of the two frequently used ones for soil moisture monitoring. The former is suitable for monitoring global scale SM dynamics while the latter is well suited for smallscale applications (Sadeghi et al. 2017).
The problem is particularly relevant to our investigation in the Red River Delta (RRD). In the RRD, rice production is heavily dependent on irrigation and information on soil moisture conditions is critical for water management for specific crops such as maize and seasonal vegetables, including broccoli and carrot in the winter season, and Ceylon spinach or fruit vegetables of the Cucurbitaceae families in the summer season. Thus, soil moisture monitoring can provide direct information about crop productivity than rainfall predictions or other weather forecasts, which dramatically aids the farmer in decisionmaking (Shaxson andBarber 2003, Huong et al. 2013). Furthermore, soil moisture monitoring will enable assessments of the transpiration rate of crops (Chen et al. 2012). Productivity can be adversely affected if the crop is not supplied with sufficient soil moisture for growth.
The RRD is currently under severe environmental pressures due to ever-increasing human impacts without proper management. For example, the recent intensification of agricultural activities has significantly degraded water quality in the RRD (Nishina et al. 2010). Land-use change and groundwater consumption have depleted groundwater, which has dramatically increased the risk of flooding in the RRD area (Du Bui et al. 2012, Loc et al. 2021a, Park et al. 2021,Tay et al. 2022. Sand mining and sea-level rise also pose an additional threat to the habitability of RRD (Pruszak et al. 2002, Dang et al. 2010. Thus, RRD's hydrology is disturbed mainly by human activities and has become one of the most vulnerable regions in Southeast Asia in terms of water resource availability. Specifically, the RRD is a wet tropical environment with high monsoonal rainfall throughout the year. With very high evapotranspiration rates due to vast agricultural practices, surface soil moisture plays a critical role in the regional hydrological cycle/budget by controlling the land-atmosphere exchange (Hiep et al. 2018).
Previous studies have focused on soil moisture in other parts of Vietnam based on field-based measurements (Anh et al. 2014, Han et al. 2014. However (Park et al. 2022), point-based studies are limited in providing relevant information for integrative management of a watershed because it provides no information on soil moisture distribution patterns or its dynamics through time (Chen et al. 2011, Anh et al. 2014. In this context, remote sensing could be considered a critical method to map soil moisture at a large scale with dynamics (Rodríguez-Fernández et al. 2019, Feizizadeh et al. 2021a. Besides, the so-called 'trapezoid' or 'triangle' model is of the most frequently applied tools in Remote Sensing of soil moisture estimation by utilizing optical and thermal data (Mohanty et al. 2017, Sadeghi et al. 2017. The potential of thermal infrared remote sensing data for estimating surface soil moisture has been presented by several studies (Price 1980, 1982, Carlson et al. 1994, Fang and Lakshmi 2014, Yang et al. 2015, Zhang and Zhou 2016. By integrating surface temperature and soil moisture through evaporation, the thermal infrared method showed promise in mapping soil moisture contents (Price 1982, Mohanty et al. 2017. For instance, surface soil moisture was estimated by applying the relationship between soil moisture and evaporation fraction (Rahimzadeh-Bajgiran et al. 2013). A new soil moisture index, Perpendicular Soil Moisture Index (PSMI), was developed for simplifying soil moisture estimation. By directly evaluating PSMI through raw digital count data in optical bands and thermal infrared spectral band, Shafian and Maas (2015) showed a feasible and significant potential of the use of PSMI in soil moisture assessment. Finally, the advantages of optical remote sensing for soil moisture mapping are that the wide swath of data availability could be derived on a global scale at multiple spatiotemporal resolutions (Sadeghi et al. 2017, Mohamed et al. 2020, Loc et al. 2020. Although Chen et al. (2011) is among the few studies that utilized remote sensing in Vietnam to measure soil moisture, there are no field-calibrated remote sensing-based studies to map soil moisture in the RRD. This lack of existing knowledge on regional soil moisture mainly relates to the limited research capacity, such as lack of gauge station data and field-work equipment. This study, henceforth, seeks to fill this knowledge gap by applying a thermal infrared-based method to measure soil moisture fluxes, their spatial distribution, and dynamics in response to the recent climatic change in the RRD. We calculated land surface temperature (LST) and used land surface temperature -Vegetation Index (LST-VI) method to estimate SMC. The results may contribute to recognizing the critical need to better understand RRD's regional hydrological cycles in terms of soil moisture and evapotranspiration patterns. We also hope that empirical knowledge will be enhanced to the soil moisture reference library for future research in Vietnam to support crop management, irrigation, and crop planning. Given the region's vulnerability to droughts, providing quality soil moisture data will be useful for forecasting and preparing for droughts. Finally, we expect that our proposed methodological framework will be readily applied for soil moisture monitoring in data-poor regions worldwide.

Methods
We propose an analytical framework to generate soil moisture maps over an ungauged catchment in the RRD by combining (1) land surface temperature (LST) calculations; and (2) soil moisture retrieved through land surface temperature -Vegetation Index (LST-VI) model (Holzman et al. 2014). The results were verified with field-collected data. More specifically, intensive ground sampling was conducted at precisely determined locations across the study area. The time of sampling was the same when Landsat 8 satellite flew over the study area. The soil moisture values were determined using the gravimetric method ( Figure 1).

Study area
Spanning some 150 km in width, the Red River Delta (RRD) is located in the western coastal zone of the Gulf of Tonkin, in the northeast of Vietnam ( Figure 2). It is the secondlargest river in Vietnam and one of the five largest rivers in Southeast Asia, where its catchment lies both over China and Vietnam (Nguyen et al. 2015). Though the RRD makes up only 5% of Vietnam's land with a total area of 16,000 km 2 , 30% of the country's population lives there, making it the most densely populated part of the country (That et al. 1996). Although there is a decent number of industrial zones in the associated cities, such as Viet Tri, Hanoi, Haiphong, and Nam Dinh, the majority of the delta population (80%) are employed in agriculture. In essence, the RRD is the second most crucial riceproducing area in Vietnam after the Mekong Delta in the south, accounting for 20% of the national crop. Production of rice is close to optimal with a minimal yield gap to exploit and to employ double-cropping techniques to achieve maximum yields (Hoang et al. 2005, Devienne 2006, Nishina et al. 2010).

Field measurements
Field soil moisture measurements were conducted in one of the farmlands within the RRD at the same time with the Landsat 8 satellite overpass. We followed the gravimetric method to measure the soil moisture (Blake 1965). It should be noted that to minimize errors, a charging scale was brought to the site for weighting the soil samples upon collection. More specifically, the soil moisture content may be expressed by weight as the ratio of the mass of water present to the dry weight of the soil sample, or by volume as the ratio of the volume of water to the total volume of the soil sample as follows: The water mass must be determined by drying the soil to constant weight and measuring the soil sample mass after and before drying to determine any of these ratios for a particular soil sample, the water mass. The water mass (or weight) is the difference between the weights of the wet and oven-dry samples. The criterion for a dry soil sample Figure 1. Methodological framework developed in this study to derive soil moisture maps. Landsat 8 (L8) images were obtained from the United States Geological Survey's Earth Explorer (http://usgs.gov). NDVI and LST were calculated using a Split-Window Algorithm (SWA) (Du et al. 2015). NDVI and LST data then used as inputs to estimate soil moisture data.
is the soil sample that has been dried to constant weight in the oven at a temperature between 100°C and 110°C (105°C is typical). The materials for the measurements are shown in Figure S1.

Remotely sensed images
This study used Landsat 8 Operational Land Imager (OLI)/Thermal Infrared Sensor (TIRS) Level-1 data. One of the main advantages of the Landsat 8ʹs TIRS over the thermal bands of Landsat missions, that is, Landsat 5 and 6 Thematic Mapper (TM) and Enhanced Thematic Mapper (ETM), is the development of the two-thermal infrared (TIR) bands in the atmospheric window between 10 and 12 µm wavelength. It is considered superior over the single thermal band on TM and ETM. In this study, we used three images for the estimation of soil moisture, which has cloud coverage of less than 15% (Table 1).

Figure 2.
Sampling sites for ground truth data collection. Sampling sites were chosen, considering the homogeneity and representability of the land cover of the RRD, as well as accessibility to the site. The samples were collected approximately 3 to 5 cm below the soil surface, of which the moisture values were determined through the gravimetric method (Blake 1965). The image shown was taken on 21 September 2016: from 09:30 AM to 10:30 AM (GMT +7).

Land Surface Temperature (LST) calculations
To facilitate the calculation of LST, we developed a Geographic Information System model shown in Figure 3.

Normalized Differences Vegetation Indices (NDVI)
NDVI is a vegetation index, which allows mapping the relative healthiness of the biomes (Wang et al. 2005). NDVI is calculated as follows: where R NIR and R RED are the surface bidirectional reflectance values for their respective bands. NDVI of a given pixel range from −1 to +1, which respectively indicates water and healthy vegetation. Besides, a zero (0) corresponds to barren areas of rock of sand.
Based on NDVI, Fractional Vegetation Cover (FVC) was calculated. For this, a NDVI threshold method was used for LSE estimation from Landsat 8 imagery according to Equation (Yu et al. 2014):  If NDVI < 0.2, with b i taken as ρ4 being the red band reflectance (Band 4) If 0:2 � NDVI � 0:5: • ε v,i ε s,i are emissivity of vegetation and soil for band i, respectively. For Landsat 8 TIRS band 10 and 11, ε v,i equal 0.9668 and 0.9747, respectively; whereas ε s,i equal 0.9863 and 0.9896, respectively (Du et al. 2015). • P v is vegetation cover fraction and derived from NDVI.
• In Equation (5), NDVI min = 0.2, NDVI max = 0.5 If NDVI > 0.5 • C i is a term that takes the cavity effect into account due to the surface roughness. In this study, C i was estimated as follows (Sobrino and Raissouni 2000).
• Where F' is the geometrical factor ranging between 0 and 1 (typically 0.55)

Top of Atmospheric spectral radiance (TOA)
Pixel values (Digital Numbers) of the Landsat 8 Thermal Bands have been converted tosensor spectral radiance (Lλ) by using the radiance scaling factors provided in the metadata file (Survey 2016): Where: • Lλ: Spectral radiance (W/(m 2 *srad*μm)) • M L : Radiance multiplicative scaling factor for the band (RADIANCE_MULT_BAND_ 10/ 11 from the metadata) • Qcal: Level 1 pixel value in Digital Number of band 10/11 image. • A L : Radiance additive scaling factor for the band (RADIANCE_ADD_BAND_10/11 from the metadata)

Brightness Temperature (TB)
TB for both the TIRs bands was calculated by adopting the following formula (Tian et al. 2009, Han et al. 2014: Where: • K1 and K2 -thermal conversion constant and it varies for both TIR bands • Lλ -Top of Atmospheric spectral radiance

Land Surface Temperature (LST) calculation
The foundation of the technique to calculate LST from Split-Window Algorithm (SWA, see Figure 1) is that the radiance attenuation for atmospheric absorption is proportional to the radiance difference of simultaneous measurements at two different wavelengths (Jimenez-Munoz et al. 2014). Therefore, the SWA removes the atmospheric effect based on differential atmospheric absorption in the adjacent thermal infrared bands, and the nonlinear combination of the brightness temperatures are applied for LST retrieval (Du et al. 2015, Valizadeh Kamran et al. 2015, Tan et al. 2016). In the case of unsuccessful atmospheric column water vapor retrieval, a practical split-window algorithm was applied in this study (Du et al. 2015): Where: • T i and T j are TOA brightness temperature measured of Band 10 and Band 11, respectively; • ε i and ε j is the channel effective surface emissivity of band 10 and band 11, respectively; • ε = 0.5(ε i + ε j ) is the mean emissivity; • ∆ε = (ε i − ε j ) is the emissivity difference; • b k (k = 0,1, . . . ,7) are the algorithm coefficients derived in the simulation dataset (Du et al. 2015).
The coefficients b k have been obtained through numerical simulations with different atmospheric and surface conditions (see Table 2 by Du et al. (2015)). Thermodynamic Initial Guess Retrieval (TIGR) atmospheric profiles have been used to represent a worldwide set of atmospheric conditions from polar to tropical atmospheres. For the surface conditions, seven levels of LST according to the bottom atmospheric temperature and emissivity database from the American Advanced Spaceborne Thermal Emission Reflection (ASTER) have been considered. Based on the numerical simulation and a statistical regression method, the coefficients b k in the Equation (3) have been determined (Du et al. 2015). In the case of an unsuccessful CWV retrieval, a group of coefficients for the entire CWV range [0.0,6.3] g/cm 2 was applied in this study.

LST-VI triangle
LST mainly depends on soil moisture and FVC. In bare soil and vegetated surfaces, soil moisture determines surface temperature through evaporative control, thermal inertia, and the amount of energy involved in the evapotranspiration processes. A combination of FVC derived from vegetation index (VI) and LST allows the estimation of soil water availability from bare soil to fully vegetated surfaces (Holzman et al. 2014).
In the LST-VI triangle ( Figure S2), the highest LST (LST max ) values along the dry edge represent the driest surface soil conditions when surface soil wetness is zero. The wettest soil conditions are represented through the minimum LST (LST min ) along the 'wet edge' when surface soil wetness is the highest. It is assumed that moisture availability varies linearly from the dry edge to the wet edge (Carlson 2007). A soil wetness index (SWI) on a given day or time (t), representing relative surface soil moisture, is defined as: Where LST (i) is the land surface temperature of (i) pixel, LST min is the minimum LST in the triangle (Wet edge), and LST max(i) is the maximum LST for (i) vegetation index. The values of the Soil wetness index (SWI) range from 0 to 1 (0 ≤ SWI ≤ 1).
Based on the SWI, volumetric surface soil moisture can be estimated by using the following relationship (Mallick et al. 2009

):
Where θ max is the upper limit of surface soil moisture has been set equal to the field capacity (FC), and θ min is the lower limit of surface soil moisture has been set equal to the permanent wilting point (PWP).

Verification of results
Results computed from optical/thermal remote sensing data have been compared with the ground-truth soil moisture data to test the robustness and the ability of the approach for retrieving surface soil moisture. The performance has been assessed using root mean square error (RMSE) and determination coefficient (R 2 ): Table 2. The coefficients b k (k = 0,1, . . . 7) in different atmospheric column water vapor (CWV) subranges and the root-mean-square error (RMSE). Where n is the number of pixels, and S i and E i are the ground-truth data and estimated values, respectively. RMSE for soil moisture under 0.05 m 3 .m 3 or cm 3 .cm 3 is at a reasonable accuracy (Weidong et al. 2002), whereas R 2 (0 to 1) reaches 1 presenting the best performance.

NDVI and LST
Comparing NDVI values and their spatial distribution across the three dates, we found a downward trend of vegetated areas, as opposed to the whole area's trend where vegetation areas had increased to 21.73% in the same period. Variations of planning dates between areas can explain this difference ( Figure S3) With respect to LST (Figure 4), on 1 June 2016 most of the pixels were found to have LST values within 35-39°C, the median and standard deviation values are 37.6 and 1.82, respectively. LST on 21 September 2016 was slightly higher, and the values mostly concentrated on a wider range between 36°C and 43°C, with a median of 39.4 and a standard deviation of 2.9. Finally, the LST values of 7 October 2016 though not the highest of the three images processed, are the most consistent with significantly smaller data standard deviation (1.36), mostly distributed between 36°C and 39°C, with the median of 38.11 .

Surface soil moisture
Combining the LST and NDVI results, three LST-NDVI scatterplots of 1 June 2021 September and 7th October were developed. Then, the surface soil moisture of every pixel was calculated based on the wet edges and dry edges of the LST-VI triangular space (Figure 4). The results were then validated with ground-truth data being the moisture measurements of the respective soil samples. Our LST-NDVI scatterplots show that the R 2 values are 0.9716, 0.9486, and 0.9744 for the 1 June 2016, 21 September 2016 and 7 October 2016. It implies a good performance of the calculated and observed values. Besides, the validation results showed RMSE of the 0.0696 and R 2 of 0.2636 ( Figure 5). Of equal importance is the clear separation of the data points into three distinct groups, that is, the good fit -the blue filled diamonds, the underestimates -the green filled diamonds, and the overestimates -the orange filled diamonds.
The best fit includes A1-2, A3-1, and A5, having the most uniform land cover surfaces ( Figure S4). The first two sampling sites represent bare soil areas while the last exemplifies those areas with high-density vegetation cover. This could be explained that the LST was calculated via two thermal infrared bands 10 and 11 with 100 m resolution, which means the surface temperature is uniform within any 100 × 100 m pixel regardless of the actual land covers. The imbalances of SMC estimation results, as such, imply the problem of scale effect in that, the mixed pixels will affect the verification process.
A1-1 and A3-2 plots, on the other hand, have the highest simulated soil moisture, thus indicating the overestimations, which the presence of adjacent highways can explain. In fact, the highways and other urban areas, in general, have higher LST values than vegetated and bare soil areas. These could have affected the average LST of these pixels, making them less representative of vegetation and bare soil areas. On the contrary, the underestimated cases, for example, A4 or A2-1, were chosen to represent the cornfields within the research area. Apparently, the height of the corn trees (1.2-1.5 m), small canopies, and the low density should not be sufficient to affect the surface temperature. However, from the Remote Sensing lenses, the corn tree land cover could have affected the analyses of satellite images, making the LST values calculated lower than the actual values, hence the underestimation.

Methodological implications
In this study, we have presented a practical method to estimate the surface soil moisture content of an agricultural area of the Vietnamese RRD from Landsat 8 satellite images using the surface temperature -vegetation index triangular method. The estimation results were validated using ground truth samples of the actual soil moisture content measured by the gravimetric method. One significant merit of the proposed method is the freely available primary data resources of satellite images; in this case, Landsat 8 images is one of the most popular sources (Jacome et al. 2013, Hassan-Esfahani et al. 2015, Mohanty et al. 2017, Park et al. 2022. The application of SWA proved to be effective in calculating the land surface temperature, especially in case of missing or incomplete atmospheric data, even though with variable extents (Rozenstein et al. 2014). More specifically, homogeneous land surfaces, for example, uniform vegetation or bare soils, give more accurate estimation results than those of more diverse land covers. Also, the high traffic volumes of the adjacent highways can substantially abrupt the results by locally increasing the land surface temperature (Yang et al. 2020). Another important methodological note is the limitation associated with a certain level of subjectivity in identifying the edges of the temperature -vegetation index triangles. Similarly, we took note that the measurements of soil moisture could have been enhanced using volumetric water content instead of mass water content. Due to the instrumental limitation of the study, the latter were opted for ultimately. Finally, it is noteworthy that while conducting the result verification, attention should be paid to select only homogenous to avoid the scale effect and the problem of mixed pixels.
All in all, despite some disadvantages, we have contributed an inexpensive method to effectively monitor soil moisture content in the RRD, one of the two most important agricultural products of Vietnam. In essence, being able to monitor soil moisture content over a large landscape is of particular importance for agricultural activities in various aspects, such as planning crop calendars, reserving water during dry seasons, or planning for irrigation systems (McNairn et al. 2012, Minh et al. 2019. In a broader sense, soil moisture constitutes one of the most important indicators for monitoring environmental health. Despite accounting for only 0.15% of the globally available freshwater, this water resource has always been considered critical storage within the hydrologic cycle (Chen et al. 2011, Letu et al. 2014, Holzman et al. 2017. Our study henceforth has contributed a practical approach to evaluate soil moisture content across expanded geographical areas. We believe the method is also useful for a wide range of research and management disciplines concerned with weather and climate, runoff potential and flood control, soil erosion and slope failure, reservoir management, geotechnical engineering, and water quality (Berney andKyzar 2012, Orr et al. 2016).

Conclusions
This study estimates surface soil moisture (SMC) content by combining the optical/ thermal infrared remote sensing Landsat 8 over the agricultural dominated region in the Red River Delta (RRD) in Vietnam. The land surface temperature (LST) -vegetation index (VI) triangular method was used for three satellite datasets over the primary crop time. The results showed the significant correlations between SMC and LST with the highest R 2 of 0.765, as opposed to no correlation between SMC and NDVI. The root means square error between the calculated and measured SMC values yet varied greatly across different types of land covers, that is, bare soil, vegetation, and highways. Although our methods and results could be considered by future studies in the region with the same problem of missing or having limited measured SMC data, the estimation results could be sustainably affected by various factors in using remote sensing techniques. The method appears to give consistently better results for homogeneous than heteronomous surfaces, which is an important implication of the scale effect and mixed pixels that could have affected the verification results. In other words, extra attention should be paid to include only homogeneous surface areas for verification. Other factors interfering with the results include high vegetation, for example, corn trees in this case and the clouds of dust from the adjacent highways.
The analytical framework in this study could be useful for retrieving high-resolution maps of soil moisture. The presented method is particularly meaningful for studies at data-poor regions, given the comparatively fewer data demanding and methodological convenience compared to other approaches. Regarding the data sources, our method utilized Landsat 8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) dataset, which are free for acquisition from several sources. Therefore, in the case of ungauged regions and the absence of reliable models, the practical split-window algorithm is worth considering.