A bi-seasonal classification of woody plant species using Sentinel-2A and SPOT-6 in a localised species-rich savanna environment

Abstract
Sustainable management of biodiversity benefits from cost-effective multi-temporal classification schemes afforded by remote sensing techniques. This study compared classification accuracies of woody plant species (n = 27) and three coexisting land cover types using dry and wet season data. Random Forest (RF), Support Vector Machine (SVM) and Deep Neural Network (DNN) classifiers were applied to Sentinel-2A and SPOT-6 images. The results showed higher overall classification accuracies for wet season data (65%–72%) than for dry season data (52%–59%) across both images and all three classifiers. Near infrared bands, available in both Sentinel-2A and SPOT-6 imagery, produced high performance for both wet (83%) and dry (80%) seasons. Overall, the findings show the potential of multispectral remote sensing for classifying woody plant species diversity in different seasons. Such a study should be extended to higher frequency species diversity classification, to capture dynamics that may manifest at short time intervals of the year.


Introduction
The savanna ecosystem is considered a rich source of natural resources, including firewood, building materials and wildlife habitat (Axelsson and Hanan 2017; Sosef et al. 2017; Venter et al. 2018; Ondier et al. 2020). Nevertheless, over-exploitation of these resources, coupled with climate change, is having a devastating effect on the continued integrity of biodiversity in the savanna region (Marchant 2010; Leeuwis et al. 2018). Cost-effective and efficient methods for monitoring the savanna ecosystem during dry and wet seasons are therefore important for maintaining and managing its biodiversity. Monitoring biodiversity assists in determining the extent of changes to species composition, such as changes in abundance and diversity; it also reveals incursions of threats and changes in the condition of vegetation communities (Leeuwis et al. 2018). Remote sensing has emerged as an effective alternative to expensive field-based surveys, considering that seasonal analysis would require more than one field survey (Turner et al. 2003; Maxwell et al. 2018).
Optical remote sensing has been used to assess species diversity in different environments, such as tropical rainforests (Semboli et al. 2014; Wallis et al. 2017), Mediterranean climates (Roberts et al. 2015; Galidaki et al. 2017) and savanna ecosystems (Campos et al. 2018; Silveira et al. 2019). Generally, high accuracy levels are reported in discriminating plant species during the wet season (e.g. Allan et al. 2011; Gessner et al. 2013; Leeuwis et al. 2018; Shaharum et al. 2018; Silveira et al. 2019), due chiefly to the abundance of photosynthetically active species that exhibit a great deal of variability in reflectance during that season. For instance, Ivanova et al. (2019) characterised mixed forests comprising herbaceous, coniferous and birch-larch forest by exploiting Moderate Resolution Imaging Spectroradiometer (MODIS) data, whose high temporal frequency (< 10-day cycle) can minimize the effect of cloud contamination during the wet season. Their study reported high accuracy levels; however, the forest types varied significantly in leaf structure. It is also important to note that MODIS has a coarse spatial resolution that limits its capacity to discriminate species-level diversity. Belay et al. (2013) used Landsat imagery, which has a better spatial resolution than MODIS, for classifying coniferous and broadleaved trees at the stand level during a wet season. Despite the better resolution, they still reported challenges in effectively separating certain species because the target species had homogeneous spectral characteristics.
Widely used machine learning algorithms such as Random Forest (Breiman 2001), Decision Trees (Quinlan 1986), k-Nearest Neighbors (Fukunaga and Narendra 1975), Gradient Boosting (Friedman 2001) and Support Vector Machines (Cortes and Vapnik 1995) have proved useful by improving accuracies compared to traditional algorithms in woody plant species diversity estimations (Pu and Cheng 2015; Brandt et al. 2016; McKenna et al. 2017; Macintyre et al. 2020). For instance, Jombo et al. (2020) reported high overall accuracies of 81% (SVM) and 84% (RF) for classifying five plant species and three land cover types in a savanna environment, showing a marginally better performance using the latter algorithm. A study by Tesfamichael et al. (2018) showed a slightly better performance of SVM than Gradient Boosting Modeling (GBM) (97% vs. 94%) for discriminating grassland and four shrubby invasive species exhibiting morphologically similar characteristics in a savanna region. Odindi et al. (2016) utilised RF to map the distribution and composition of two alien and indigenous woody vegetation in a savanna environment and reported an overall accuracy of 86%. Kganyago et al. (2018) discriminated one species type from three coexisting land cover types in a savanna region using SVM and reported an overall accuracy of 83%. Forkuor et al. (2018) assessed the performance of Stochastic Gradient Boosting (SGB), SVM and RF in separating two woody plant species and three land cover types in a savanna region and reported accuracies ranging between 88% and 94%, with SGB performing better than the other classifiers. Such studies have discriminated only a few woody plant species; it is therefore important to extend the benefits of machine learning algorithms to discerning the many species common in complex vegetation environments over different seasons.
A number of studies have exploited optical remote sensing in the absence of atmospheric interference for species diversity classification in dry seasons (Motohka et al. 2010; Cho et al. 2015; Xiao et al. 2015; Möckel et al. 2016; Shaharum et al. 2018; Silveira et al. 2019; Ritz et al. 2020). A common problem with optical remote sensing in the dry season is the difficulty of discriminating plant species that are devoid of chlorophyll or that shed their leaves because of the absence of moisture in winter (Brandt et al. 2016). This results in a loss of spectral separability between species due to prevalent reflectance from leaf-off branches as well as from background features (Chamaille-Jammes et al. 2006; Cleland et al. 2012; Rocchini et al. 2018; Zimbres et al. 2020). Given this weakness of optical remote sensing in the dry season, careful analysis of the spectral heterogeneity related to species leaf traits and plant phenotyping needs to be strengthened. Adopting several narrow and contiguous bands can allow the detection of subtle differences in plant species (Guerschman et al. 2009; Fricker et al. 2019; Gao et al. 2019). For instance, Peña et al. (2013) successfully discriminated deciduous and evergreen plants using HySpex-VNIR 1600 airborne-based hyperspectral images. However, the plants had relatively distinct characteristics, with A. caven, a deciduous tree, measuring approximately 6 m in height and possessing a crown diameter of 0.45 m, and evergreen trees (L. caustica, C. alba and Q. saponaria) measuring approximately 12-30 m in height and having crown diameters of 0.5-1 m. Furthermore, the hyperspectral remote sensing system used in the study is capable of accurately discriminating the vegetation types. Unfortunately, this system is largely commercial and thus unavailable for low-cost monitoring efforts.
Generally, it should also be noted that dry period analysis of plant species focuses on a certain developmental stage of plants often devoid of leaves. Such a focus on dry season alone, therefore, limits the importance of optical remote sensing in tracking biodiversity throughout the year. Therefore, there is a need to extend the application of optical remote sensing in plant species diversity classification to both wet and dry seasons.
Monitoring plant species diversity in different seasons provides a temporally sound assessment of biodiversity and ecosystem functioning of a given vegetation environment. In this regard, several studies attempted to classify plant species in different seasons (e.g. Ng et al. 2017; Wendelberger et al. 2018; Gara et al. 2019; Borges et al. 2020; Hunter et al. 2020; Macintyre et al. 2020; Piovan 2020). For instance, Gara et al. (2019) successfully separated broadleaf, conifer, and other land cover types during summer, autumn and winter seasons using Sentinel-2A imagery. Similarly, Chrysafis et al. (2019) reported high accuracy levels in the separation of a mixture of extensive forests, sparse forest, cultivation plants, and pastures, in dry and wet seasons, using Landsat 8. Furthermore, Pu et al. (2018) utilised Pleiades imagery to classify different species in an area dominated by 70% broadleaf evergreen plant species and 30% generic plant species (shrub and turf/lawn/grass) in dry and wet periods. The high accuracies reported in the above studies can be attributed to significant differences in vegetation composition, which can be grouped as broadleaf and narrow-leaved, or evergreen and deciduous plants. Foliage dominated by narrow-leaved plants induces spectral responses with little variation, because of the similar physiological structure and chemical composition of the plant leaves (Bayat et al. 2018). In addition, differentiating such plant species spectrally in different seasons can become complicated in a localised environment where evergreen and deciduous plants are difficult to discern.
There is, therefore, the need to exploit commonly-used multispectral data with improved spatial and spectral characteristics for detecting plants in different seasons. This study aims to investigate the performance of SPOT-6 and Sentinel-2A data for discriminating plant species characterised by similar leaf-level morphological characteristics, in dry and wet seasons. Successful findings from this specific assessment will have great potential in tracking the dynamics of plant species diversity in a localised savanna ecosystem.

Study area
The Klipriviersberg Nature Reserve south of Johannesburg, South Africa, was used for the study (Figure 1). The reserve was proclaimed for conservation purposes in 1984 and covers approximately 651 hectares. Generally, the vegetation types in the reserve include Andesite Mountain Bushveld and Clay Grassland which are associated with a savanna environment (South African National Biodiversity Institute 2012). The altitude of the area ranges between 1540 m in the south and 1790 m in the north, with a mean altitude of 1653 m, and a coefficient of variation of 3.9%. The mean annual rainfall of the area surrounding the reserve ranges from 624 to 802 mm (Kruger and Nxumalo 2017). The study area experiences warm to hot summers and cold nights in winter, with mean annual temperature ranging between 17 °C and 26 °C in summer and 5 °C and 7 °C in winter (MacKellar et al. 2014). The geology types found in the area, which lead to the floristic structure of the reserve, include quartzites, conglomerates, and dolomites (Molezzi et al. 2019).

Field data
In this study, we created a grid of 240 points, distributed at approximately 170 m intervals in the north-south and east-west directions, which covered the whole study area and was generated using the fishnet tool in ArcGIS (ESRI® ArcGIS 10.6, Redlands, California, USA). The point coverage was exported into a Global Positioning System (GPS) receiver (Garmin, GPSMAP® 64, Kansas, USA) and the points were located in the field. Field surveys were conducted in June 2017, representing the dry season, and in November 2017, representing the wet season (MacKellar et al. 2014). A buffer with a 20 m radius was created around each plot center; this size was specified to accommodate multiple pixels of the imagery used in the study. In each plot, different plant species and coexisting land covers were enumerated, totalling 30 classes. All woody plants with a height of two metres or above were counted in each plot. A total of 27 distinct woody plant species were recorded, with a minimum of one and a maximum of nine woody plant species in each plot.
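The sampling layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: the bounding box is hypothetical, coordinates are taken to be projected metres, and the helper names `fishnet_points` and `in_plot` are invented here; it does not reproduce the ArcGIS fishnet tool.

```python
import math

def fishnet_points(xmin, ymin, xmax, ymax, spacing):
    """Return (x, y) grid points at a fixed spacing inside a bounding box."""
    nx = int(math.floor((xmax - xmin) / spacing)) + 1
    ny = int(math.floor((ymax - ymin) / spacing)) + 1
    return [(xmin + i * spacing, ymin + j * spacing)
            for j in range(ny) for i in range(nx)]

def in_plot(px, py, cx, cy, radius=20.0):
    """True if point (px, py) falls inside the circular plot centred on (cx, cy)."""
    return math.hypot(px - cx, py - cy) <= radius

# Hypothetical 1700 m x 1700 m extent (not the reserve boundary), 170 m spacing.
points = fishnet_points(0.0, 0.0, 1700.0, 1700.0, 170.0)
print(len(points), "grid points")
```

In practice the grid would be clipped to the reserve boundary, which is presumably why the study ends up with 240 points rather than a full rectangle.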
Remote sensing data and pre-processing
SPOT-6 and Sentinel-2A satellite images were acquired for dry (June 2017) and wet (November 2017) seasons coinciding with the time of the field surveys. The SPOT-6 image was sourced from the South African National Space Agency (SANSA). The imagery has four multispectral bands including red, green, blue (RGB) and near infrared (NIR) in the 0.455 µm–0.89 µm wavelength range (Figure 2) and a panchromatic band in the 0.45 µm–0.75 µm wavelength range. The spatial resolutions of the multispectral and panchromatic images are 6 m and 1.5 m, respectively. Atmospheric correction was applied to all individual bands of the SPOT-6 imagery, using the Dark Object Subtraction (DOS) method (Chavez 1996) implemented in ENVI 5.3 (©2015 Exelis Visual Information Solution Inc., Boulder, CO). Subsequently, the individual bands of SPOT-6 were pansharpened with the panchromatic band using the Gram-Schmidt algorithm, which maximizes image sharpness and minimizes colour distortions (Laben and Brower 2000). The pansharpened bands were combined to create a composite SPOT-6 multispectral image of 1.5 m spatial resolution; one image was generated for each season (wet and dry).
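To make the idea behind Dark Object Subtraction concrete, the sketch below shows its simplest form: the darkest value in a band is treated as an estimate of atmospheric path radiance and subtracted from every pixel. This is an assumption-laden illustration, not the ENVI implementation nor the refined model of Chavez (1996); the function name and the sample values are hypothetical.

```python
def dark_object_subtraction(band, dark_value=None):
    """Subtract a dark-object value from every pixel of one band.

    band: 2-D list of digital numbers or radiances (list of rows).
    dark_value: haze estimate; defaults to the band minimum (an assumption).
    Results are clipped at zero so no pixel becomes negative.
    """
    if dark_value is None:
        dark_value = min(min(row) for row in band)
    return [[max(value - dark_value, 0) for value in row] for row in band]

# Hypothetical 2 x 2 band
corrected = dark_object_subtraction([[12, 15], [10, 30]])
print(corrected)
```

The correction is applied band by band because scattering, and hence the dark-object value, differs with wavelength.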
The Sentinel-2A image was downloaded from the European Space Agency Data Hub (https://scihub.copernicus.eu/dhus/). Sentinel-2A has 13 multispectral bands, covering the visible and infrared regions of the electromagnetic spectrum. We excluded the coastal band (0.43 µm–0.45 µm), water vapour band (0.93 µm–0.95 µm), and cirrus band (1.36 µm–1.39 µm) from the Sentinel-2A image, due to their relative sensitivity to atmospheric interferences (Stych et al. 2019). The remaining 10 bands (Figure 2) covered a wavelength range of 0.46 µm–2.28 µm. The first four bands out of 10 included the RGB and NIR, which together covered a wavelength range of 0.46 µm–0.90 µm and were measured at 10 m spatial resolution. The additional six bands, listed as vegetation red edge-1 (VRE1), vegetation red edge-2 (VRE2), vegetation red edge-3 (VRE3), vegetation red edge-4 (VRE4), NIR, shortwave infrared-1 (SWIR1) and shortwave infrared-2 (SWIR2), together spanned 0.71 µm–2.28 µm and were measured at a 20 m spatial resolution. Atmospheric correction was applied to all individual bands of the Sentinel-2A imagery, using the same method that was applied to SPOT-6. Subsequently, all the individual bands were stacked to create a multispectral image.

Training and classification of the remote sensing data
Training of the 30 different classes was performed on the two composite satellite images (SPOT-6 and Sentinel-2A) separately. It should be noted that as the spatial resolution becomes coarser, individual pixels are less likely to capture small features, resulting in a mixed pixel phenomenon (Lu and Weng 2007; Ma et al. 2015; Zhu et al. 2018). This study adopted the nearest neighbour resampling technique on the Sentinel-2A image, in order to resample the pixels at the resolution of the SPOT-6 image (1.5 m). The nearest neighbour resampling technique was chosen because it does not alter values in the output raster data set and is therefore appropriate for categorical data classification (Cassel and Cassel 2013). A total of n = 8011 (dry season) and n = 8080 (wet season) points representing 27 woody plant species as well as grass, shrubs and bareland were digitized inside the 240 plots on the two satellite images separately. Digitising of points was guided by field surveys in which a local Cartesian coordinate system was used to locate the classes. Finally, points were split into two sets, of which 30% was allocated to training and classification and 70% to evaluating the accuracy of the classification. This amounted to n = 2408 and n = 5603 samples, respectively, for the dry season and n = 2441 and n = 5639 for the wet season. The spatial distribution of the training samples was taken into consideration when splitting the data into training and classification and evaluation sets. The species, along with the proportions allocated to the training and testing of the classifications, are given in Table 1.
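The property that motivates nearest neighbour resampling, namely that every output pixel is a copy of an existing input value, can be seen in a minimal sketch. The function below is a hypothetical stand-in for the GIS resampling tool, operating on a plain list-of-rows grid; pixel-centre alignment is an assumption.

```python
def resample_nearest(grid, src_res, dst_res):
    """Resample a 2-D grid (list of rows) from src_res to dst_res metres.

    Each output pixel takes the value of the source pixel that contains its
    centre, so no new values are interpolated.
    """
    rows, cols = len(grid), len(grid[0])
    out_rows = int(round(rows * src_res / dst_res))
    out_cols = int(round(cols * src_res / dst_res))
    out = []
    for r in range(out_rows):
        # Row of the source pixel containing the centre of output row r
        src_r = min(int((r + 0.5) * dst_res / src_res), rows - 1)
        row = []
        for c in range(out_cols):
            src_c = min(int((c + 0.5) * dst_res / src_res), cols - 1)
            row.append(grid[src_r][src_c])
        out.append(row)
    return out

# Hypothetical 2 x 2 patch of 10 m pixels resampled to 1.5 m
fine = resample_nearest([[1, 2], [3, 4]], src_res=10.0, dst_res=1.5)
print(len(fine), "x", len(fine[0]))
```

Resampling in this way changes the pixel grid but adds no spectral detail: each 10 m Sentinel-2A value is simply repeated across the finer cells it covers.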
Three machine learning classification algorithms, namely Random Forest (RF), Support Vector Machine (SVM) and Deep Neural Network (DNN), were applied to train and classify the two datasets. These classification algorithms were implemented using the caret package for RF and SVM (Kuhn 2018), and the H2O package for DNN (Darren 2016) in R (RStudio Team 2020). An RF classifier is an ensemble machine learning method, which uses bootstrap sampling to build multiple decision tree models (Breiman 2001). It was selected for this study because (i) it can handle large datasets, (ii) it is free from normal distribution assumptions, and (iii) it is robust when treating outliers (Cutler et al. 2007). Internally, the RF uses two-thirds of the data (in-bag) for developing the classification model while the remaining one-third, which is referred to as out-of-bag (OOB) data, is used to evaluate the accuracy of the trained model (Breiman 2001). An RF classifier requires the specification of two parameters to generate a prediction model. These parameters are the number of classification trees (ntree) and the number of predictor variables (mtry) used at each node to grow the trees (Cutler et al. 2012). We used a ten-fold cross-validation analysis, which was repeated 10 times, to determine the optimal parameters. The explanatory power of the input variables (multispectral bands) was quantified to rank the importance of each band for classification accuracy.
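The resampling scheme used for parameter tuning, ten-fold cross-validation repeated 10 times, can be illustrated independently of any classifier. The generator below is a minimal sketch written for this purpose; it is not caret's implementation, and the function name and seed are assumptions. Each candidate parameter setting (for example a value of mtry) would be scored on all 100 train/test partitions and the setting with the best mean score retained.

```python
import random

def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)                        # new random partition per repeat
        folds = [idx[f::k] for f in range(k)]   # k nearly equal folds
        for f in range(k):
            test_idx = folds[f]
            train_idx = [i for g in range(k) if g != f for i in folds[g]]
            yield rep, f, train_idx, test_idx

# 2408 is the dry-season training sample size reported in the text.
splits = list(repeated_kfold(n_samples=2408))
print(len(splits), "train/test partitions")
```

Repeating the ten folds with fresh shuffles reduces the dependence of the selected parameters on any single random partition.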
The SVM classifies features (reflectance of different bands) by trying to identify the optimal decision (separation) boundary that maximizes the margin between two classes (Cortes and Vapnik 1995). Similar to the RF, the SVM does not require the data to have a normal distribution (Maxwell et al. 2018). In addition, SVM performs well when using high dimensional and complex data (Wang and Lin 2014). A nonlinear SVM that utilises a radial basis function kernel (Cortes and Vapnik 1995) and is customized for R (Karatzoglou et al. 2004) was applied in this study, because it accommodates linear and nonlinear relationships between a class and a predictor. The SVM classifier uses two parameters in order to balance the accuracy and reliability of the classification. These parameters include the cost factor (C) and gamma (γ). The C factor relates to the penalty (cost) of misclassification error and γ determines the influence of a training sample in capturing the complexity in the data (Cortes and Vapnik 1995). C and γ were determined by running a ten-fold cross-validation repeated 10 times, similar to the approach applied for selecting optimal parameters of the RF classifier.
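The role of γ is easiest to see in the radial basis function kernel itself, commonly written K(x, y) = exp(−γ‖x − y‖²). The sketch below evaluates it for two hypothetical four-band reflectance vectors; the values are invented for illustration, and the parameterisation in a particular R package may differ in name or scaling.

```python
import math

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical reflectances (blue, green, red, NIR) for two pixels
pixel_a = [0.04, 0.07, 0.05, 0.40]
pixel_b = [0.05, 0.08, 0.09, 0.30]

for gamma in (1.0, 10.0, 100.0):
    print(gamma, rbf_kernel(pixel_a, pixel_b, gamma))
```

A larger γ makes similarity fall away faster with spectral distance, so each training sample influences a smaller neighbourhood and the decision boundary can become more complex.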
Deep learning, which was introduced recently in the field of remote sensing and is a family of machine learning algorithms structured around neural networks, was also used in this study. Deep learning includes multi-layer perceptrons, restricted Boltzmann machines, stacked autoencoders, deep belief networks, and DNN (Schmidhuber 2015). DNN was selected for this study due to its ability to generate non-linear decision boundaries and to learn complex patterns (Lavine and Blank 2009). DNN with error backpropagation was applied using the H2O package in the R language (Darren 2016). It uses multi-layer perceptrons that consist of three or more layers; that is, (i) an input layer which contains one or more processing elements, (ii) hidden layers which are responsible for the transformation of data between input and output layers, and (iii) an output layer which stores the results of the network (Hochreiter 1991; Mousavi et al. 2018; Ma et al. 2019). DNN links multiple functions joined together in hierarchically structured neural networks, which are deeper than three hidden layers (Schmidhuber 2015). We used an output activation function (hyperbolic tangent function), which is the most frequently used activation function in deep learning classification (Lavine and Blank 2009; Ma et al. 2019). The function provides zero-centred outputs and allows model parameters to be more frequently updated in feed-forward neural networks (Lavine and Blank 2009). This study optimised the activation, hidden layers, neurons per layer and epochs hyperparameters using a grid search approach (Beysolow 2017). A ten-fold cross-validation that was repeated 10 times was used to determine the best possible combinations of the hyperparameters.
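A grid search simply enumerates every combination of candidate hyperparameter values and scores each one. The sketch below shows only the enumeration step. All candidate values are hypothetical, because the text does not list the grid that was searched, and the dictionary keys are illustrative labels rather than confirmed argument names of any package.

```python
from itertools import product

# Hypothetical candidate values; the study does not report its search grid.
grid = {
    "activation": ["Tanh", "Rectifier"],
    "hidden": [(50, 50, 50, 50), (100, 100, 100, 100)],  # neurons per hidden layer
    "epochs": [50, 100],
}

def expand_grid(grid):
    """Return one dict per combination of the candidate values."""
    keys = list(grid)
    return [dict(zip(keys, combo))
            for combo in product(*(grid[key] for key in keys))]

candidates = expand_grid(grid)
print(len(candidates), "hyperparameter combinations to cross-validate")
```

The number of combinations grows multiplicatively with each added hyperparameter, and each combination is evaluated on every cross-validation partition, which is why grids are usually kept small.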

Accuracy assessment
Classification accuracies derived from the remotely-sensed data were assessed on the 70% validation dataset (n = 5603 and n = 5639 for dry and wet seasons, respectively) of the woody plant species. An error matrix that uses overall, producer's, and user's accuracies (Congalton and Green 2009) was used in the study. Overall accuracy is the probability that an individual sample will be correctly classified. It is calculated as the number of correctly classified observations, summed over all classes, divided by the total number of tested observations. The producer's accuracy indicates the probability of a reference class species being classified correctly, and it is calculated as the number of true observations of a class divided by the number of true reference observations of the same class. The user's accuracy indicates the probability that classified species on the map represent the same category on the ground. The user's accuracy is calculated as the number of true observations of a class (true positives) divided by the number of predicted observations (sum of true positives and false positives). A Kappa coefficient (Equation 1) was also used as an additional statistic to assess the quality of classified imagery (Story and Congalton 1986). This statistical measure accounts for instances that might have been correctly classified by chance.
$$\kappa = \frac{P_{\mathrm{observed}} - P_{\mathrm{chance}}}{1 - P_{\mathrm{chance}}} \qquad (1)$$
where $P_{\mathrm{observed}}$ = observed proportion of agreement and $P_{\mathrm{chance}}$ = proportion expected by chance. Finally, McNemar's test (McNemar 1947) was used to determine whether there was a significant difference between dry and wet season classification results. This test was implemented per algorithm; that is, dry season vs. wet season for DNN, dry season vs. wet season for RF, and dry season vs. wet season for SVM. The test was computed using Equation 2.
$$z = \frac{f_{12} - f_{21}}{\sqrt{f_{12} + f_{21}}} \qquad (2)$$
where the square of z follows a chi-square (χ²) distribution with 1 degree of freedom; f12 represents the number of samples misclassified by a classifier (e.g. DNN) in the dry season but classified correctly by the same classifier in the wet season; and f21 represents the number of samples classified correctly by a classifier in the dry season but not classified correctly by the same classifier in the wet season. The pairwise comparison was implemented for overall accuracies as well as for species-level accuracies.
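The accuracy measures described in this section can be computed directly from an error matrix. The sketch below assumes the standard forms: kappa as (P_observed − P_chance)/(1 − P_chance), and the McNemar statistic as z = (f12 − f21)/√(f12 + f21) without continuity correction. The matrix layout (rows as reference classes, columns as predicted classes), the function names and all counts are hypothetical.

```python
import math

def accuracy_metrics(cm):
    """cm[i][j] = number of samples of reference class i predicted as class j."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(k)]
    ref_tot = [sum(cm[i]) for i in range(k)]                        # row totals
    pred_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column totals
    overall = sum(diag) / n
    producers = [diag[i] / ref_tot[i] if ref_tot[i] else float("nan") for i in range(k)]
    users = [diag[j] / pred_tot[j] if pred_tot[j] else float("nan") for j in range(k)]
    p_chance = sum(ref_tot[i] * pred_tot[i] for i in range(k)) / (n * n)
    kappa = (overall - p_chance) / (1 - p_chance)
    return overall, producers, users, kappa

def mcnemar(f12, f21):
    """Return (z, chi2, p) for McNemar's test without continuity correction."""
    z = (f12 - f21) / math.sqrt(f12 + f21)
    chi2 = z * z
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 d.f.
    return z, chi2, p

# Hypothetical two-class error matrix and discordant counts
overall, producers, users, kappa = accuracy_metrics([[8, 2], [1, 9]])
z, chi2, p = mcnemar(f12=30, f21=10)
print(overall, producers, users, kappa)
print(z, chi2, p)
```

Only the discordant samples (correct in one season but not in the other) enter McNemar's test; samples that both seasons classify correctly, or both misclassify, carry no information about which season performs better.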

Classification accuracies of dry and wet seasons using RF, SVM and DNN classifiers
Overall classification accuracies of species and coexisting land covers during dry and wet seasons showed that wet season output achieved the highest overall accuracy of 72% (kappa coefficient = 0.68) using a DNN classifier derived from Sentinel-2A data, followed by a 70% accuracy (kappa coefficient = 0.66) with a DNN classifier derived from SPOT-6 data (Figure 3). In contrast, the lowest overall accuracies were recorded in the dry season, at 52% (kappa coefficient = 0.50) using an SVM classifier, followed by an RF classifier with 54% (kappa coefficient = 0.52), both being derived from the same SPOT-6 image. Further analysis of the two seasons and the three classifiers (RF, SVM and DNN) showed overall better results from the wet season data, with a mean accuracy of 68% compared to 55% from the dry season, which equates to a difference of 13 percentage points. A graph of the kappa statistics is given in the supplementary data (Figure S1). Results from McNemar's test revealed that wet and dry season classifications using the DNN classifier differed significantly for Sentinel-2A (χ² = 8.45; p = 0.003) and SPOT-6 (χ² = 7.34; p = 0.007). The SVM classifier also resulted in a statistically significant difference between the two seasons using Sentinel-2A (χ² = 4.48; p = 0.023) and SPOT-6 images (χ² = 5.36; p = 0.034). Such a significant difference was also observed between the two seasons using RF for Sentinel-2A (χ² = 4.15; p = 0.042) and SPOT-6 (χ² = 3.97; p = 0.048). Pairwise comparison of individual species using McNemar's test between wet and dry seasons is provided in supplementary data Table S1. Using the Sentinel-2A image, the test showed significant differences between the two seasons for 22, 21 and 20 species for RF, DNN and SVM classifiers, respectively. For the SPOT-6 based classification, the test showed significant differences between the two seasons for 23, 20 and 8 species for RF, DNN and SVM classifiers, respectively.
Producer's and user's accuracies of individual plant species are shown in Figure 4. The producer's accuracies ranged between 11% (Heteromorpha arborescens) and 96% (Acacia caffra) for different species across wet and dry seasons using RF, SVM and DNN classifiers derived from Sentinel-2A and SPOT-6 images (Figure 4). A closer look at dry season classification results showed that 13 classes (species and coexisting land covers) had a producer's accuracy exceeding 70% using the three classifiers and Sentinel-2A imagery (Figure 4a). In contrast, 28 classes (species and coexisting land cover types) had a producer's accuracy greater than 70% using the three classifiers and Sentinel-2A imagery in the wet season (Figure 4b). Furthermore, six species and coexisting land cover types had a producer's accuracy > 70% using the three classifiers and the SPOT-6 image in the dry season (Figure 4c). Twenty-three species and coexisting land covers had a producer's accuracy exceeding 70% using the three classifiers and the SPOT-6 image in the wet season (Figure 4d).
Focusing on user's accuracy, the three classifiers and Sentinel-2A imagery yielded accuracies ranging between 39% (H. arborescens) and 73% (Acacia karro) in the dry season, with seven species and coexisting land covers having accuracies greater than 70%. In the wet season, the accuracies using the same classifiers and Sentinel-2A ranged between 36% (Tarchonanthus camphoratus) and 96% (A. caffra), with 22 species and coexisting land covers having accuracies greater than 70%. The three classifiers and SPOT-6 yielded accuracies ranging between 31% (grassland) and 87% (A. karro) in the dry season, with five species and coexisting land covers having accuracies greater than 70%. In the wet season, the accuracies using SPOT-6 and the same classifiers ranged between 32% (T. camphoratus) and 96% (A. caffra), with 18 species and coexisting land cover types having accuracies greater than 70%.
The producer's accuracy was used to compare the relative performance of classifiers between dry and wet seasons in identifying species, since it indicates the probability of a reference class species being classified correctly. Figure 5a shows that wet season classification was better than dry season classification for 17 species using DNN, 18 species using SVM and 17 species using RF classifiers, from Sentinel-2A imagery. There was no relative improvement of producer's accuracy in the dry season over the wet season using Sentinel-2A; consequently, the graph has not been included. The relative improvement in producer's accuracy across the three classifiers ranged between 1% and 49%. Using the SPOT-6 image, the wet season classification showed an improvement over the dry season for 21 species using DNN and RF, and for 19 species using SVM (Figure 5b). In general, the relative improvement in the producer's accuracy for wet over dry season ranged between 3% and 66% using SPOT-6 (Figure 5b). Interestingly, classification from SPOT-6 imagery for the dry season showed an improvement over the wet season for eight species when applying DNN, for nine species when utilising SVM, and for seven species when using RF (Figure 5c). The relative improvement in producer's accuracy in the dry season over the wet season ranged between 1% and 31% for the DNN classifier, using SPOT-6 imagery. Comparatively, the improvement in accuracies in the dry season over the wet season using the SVM classifier ranged between 3% and 22%, while it varied between 1% and 23% for the RF classifier (Figure 5c), when applied to SPOT-6 imagery.

Comparison of wet and dry season based on confusions
Classification performances were further evaluated based on the level of confusion in the identification of species in different seasons. Detailed confusion matrices using DNN, RF and SVM classification types are given in supplementary Table S2. Figure 6 provides the count of species and land cover types against which a species is confused. Lower numbers of species confusion were recorded during the wet season compared to the dry season, showing the superiority of the wet season classification. A species was confused with 10 to 12 other species or land cover types in the wet season when utilising DNN and RF classifiers applied to the Sentinel-2A image (Figure 6a). Results of SPOT-6 derived classification showed that species were on average confused with 12 and 13 other species using DNN and RF classifiers, respectively. With the SVM classifier, a species was confused with an average of 11 other species when using Sentinel-2A imagery, and 12 when using SPOT-6 imagery. Overall, results of species confusion from the wet season data showed that B. rotundata, A. karro and Pittosporum viridiflorum recorded the lowest number (three) of confusions. Heteromorpha arborescens recorded the highest number (22) of confusions with other species and coexisting land cover types.
Dry season classification results showed that each species was confused with at least 13 other species using DNN, 15 species using SVM, and 17 species using RF, when applied to the Sentinel-2A imagery (Figure 6b). The SPOT-6 image performed worse still, with each species confused on average with 16 other species using DNN, 17 using SVM, and 22 using the RF classifier. Similar to the wet season output, B. rotundata recorded the lowest number of confusions with other species (confused with three species), whilst A. caffra recorded the highest number (30) of confusions with other land cover types. Overall, more confusion was recorded in the dry season than in the wet season. Image comparison also highlighted that more confusion was recorded using the SPOT-6 image than using Sentinel-2A imagery.

Figure 5. Relative producer's accuracy of wet over dry seasons in identifying species types. Sentinel-2A and SPOT-6 in wet and dry seasons in the y-axes represent satellite images and time period. Species names represented by the two- or three-letter codes are given in Table 1. There was no improvement in dry over wet season for Sentinel-2A.

Variable importance
Variable importance is calculated as the sum of the relative decrease in error when any variable is removed from the classification process (Kuhn 2008). Variable importance was computed for both Sentinel-2A and SPOT-6 images in both wet and dry seasons using the three classifiers (RF, SVM, DNN), in order to observe a general trend in terms of individual band importance. There was a strong similarity between the dry and wet seasons in the important regions of the electromagnetic spectrum of Sentinel-2A imagery. For instance, VRE, NIR and SWIR contributed the most to the classifications (> 50%) using the three classifiers (Figure 7a). SWIR1 contributions in the dry season ranged between 70% and 81%, compared to contributions of 80%–83% in the wet season. The contributions of SWIR2 in the dry season ranged between 55% and 77%, compared to the wet season when it ranged between 53% and 77%. There were also similarities in the contribution of the NIR band between dry and wet seasons, with values exceeding 70%. The VRE bands contributed less in the dry season (31%–76%) than in the wet season (55%–77%). Similarities were seen in the low contributions (< 50%) of the RGB bands in both seasons. Focusing on SPOT-6 image band contributions (Figure 7b), a similar trend of high values of the NIR band was recorded for both dry and wet seasons, with values ranging between 77% and 88% for the former, and between 78% and 83% for the latter. It should be noted that the green band recorded the lowest contributions (36%) for both wet and dry seasons. In contrast, both red and blue scored contributions which ranged between 40% and 66% in both wet and dry seasons (Figure 7b).
We assessed the effect of adding spectral indices on classification accuracy. The indices were limited to the bands that contributed the most in the classifications above, including red and NIR for SPOT-6 and Sentinel-2A as well as SWIR1 and SWIR2 for Sentinel-2A. The addition of the Normalised Difference Vegetation Index (NDVI, Rouse et al. 1974), computed as (NIR - Red)/(NIR + Red), improved the overall classification accuracy of SPOT-6 by only 1.5% in the wet season but not in the dry season. When NDVI was added, the overall classification accuracy of Sentinel-2A improved by 1% in the wet season; however, it did not improve in the dry season. A further addition of a Simple Ratio computed as SWIR1/SWIR2 (Henrich et al. 2009) marginally improved the overall classification accuracy of Sentinel-2A by another 2% in the wet season, compared to a 1% improvement in the dry season. At individual species level, the improvement in the producer's accuracies after adding NDVI to SPOT-6, as well as NDVI and Simple Ratio to Sentinel-2A data, did not exceed 5% for most species. Considerable improvements were observed for a few species that had quite low accuracies (< 20%) without the addition of the spectral indices.
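The two indices can be written out as a short sketch. The function names, the dictionary layout of a pixel and the handling of zero denominators are assumptions made for illustration; only the two formulas, NDVI = (NIR - Red)/(NIR + Red) and Simple Ratio = SWIR1/SWIR2, come from the text.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: a pixel with no signal is given NDVI = 0
    return (nir - red) / total


def simple_ratio(swir1, swir2):
    """Simple Ratio of the two shortwave infrared bands: SWIR1 / SWIR2."""
    if swir2 == 0:
        raise ValueError("SWIR2 reflectance must be non-zero")
    return swir1 / swir2


def add_indices(pixel):
    """Return a copy of a pixel's band dictionary with the indices appended."""
    features = dict(pixel)
    features["NDVI"] = ndvi(pixel["NIR"], pixel["Red"])
    # The Simple Ratio needs SWIR bands, which SPOT-6 does not carry.
    if "SWIR1" in pixel and "SWIR2" in pixel:
        features["SR"] = simple_ratio(pixel["SWIR1"], pixel["SWIR2"])
    return features


# Hypothetical reflectance values for one pixel from each sensor.
sentinel_pixel = {"Red": 0.1, "NIR": 0.5, "SWIR1": 0.3, "SWIR2": 0.15}
spot_pixel = {"Red": 0.1, "NIR": 0.5}

print(add_indices(sentinel_pixel))
print(add_indices(spot_pixel))
```

The resulting index values are simply appended to the band values as extra predictor variables before classification.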

Discussion
This study investigated commonly used multispectral data with improved spatial and spectral characteristics for detecting woody plant species in different seasons (wet and dry). The study area is composed of multiple species, including evergreen and deciduous plants with similar morphological characteristics. These characteristics render effective species diversity estimation challenging, which motivated an analysis of the performance of imagery with improved spatial and spectral resolution. Numerous studies have, however, quantified seasonal species diversity derived from vegetation indices (e.g. Wardlow et al. 2007;Sridhar et al. 2010;Tillack et al. 2014;Adhikari et al. 2016;Wang et al. 2016;Prajesh et al. 2019;Gerard et al. 2020;Wang et al. 2020). In principle, vegetation indices quantify vegetation intensity, and in turn land areas with photosynthetically active woody plant species during wet seasons are misrepresented as high species diversity hotspots. This misrepresentation can hinder effective species diversity applications that use multispectral imagery throughout the year, in both wet and dry seasons.

Performance of multispectral image in wet and dry seasons
The results of our study showed higher overall accuracy and kappa coefficient for the wet season than for the dry season for species diversity classification, with overall accuracy spanning 52%-72% and kappa 0.50-0.68 across the two seasons (Figure 3). Furthermore, the classification results for the wet and dry seasons differed significantly. This observation could generally be attributed to an increase in diurnal temperature and maximum photoperiod leading to increased photosynthetic activity and vegetative growth (Eitel et al. 2019). In addition, wet seasons are characterised by high rainfall, which contributes to vegetation vigour. Our results are in agreement with Gara et al. (2019), who utilised the same imagery source (Sentinel-2A) for seasonal characterisation of leaf canopy, although their study focused on vegetation composition that was dominated by herbaceous plants during the summer, spring and autumn seasons. The better accuracies of our study in the wet season were not surprising. They are attributed to high photosynthetic activity during the wet season, which results in the leaves of different plants exhibiting distinct chemical composition, particularly in chlorophyll content. Similar observations were reported by Kikuzawa (2003), Wendelberger et al. (2018) and Wang et al. (2016), who mapped species diversity in prairie grassland with better results during wet seasons.
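For reference, overall accuracy and the kappa coefficient can both be derived from a confusion matrix, as sketched below. The sketch assumes Cohen's kappa, which is the usual choice in remote sensing accuracy assessment, and a matrix with reference classes in rows and predicted classes in columns; the counts are hypothetical and are not taken from this study.

```python
def overall_accuracy(matrix):
    """Proportion of all samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement expected from the row and column marginals alone.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical three-class confusion matrix (rows: reference, columns: predicted).
example = [
    [30, 5, 5],
    [4, 32, 4],
    [6, 3, 31],
]

print(round(overall_accuracy(example), 4))
print(round(cohen_kappa(example), 4))
```

Because kappa discounts the agreement expected by chance, it is lower than the overall accuracy for the same matrix, which matches the pattern of the values reported above.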
Although the difference in classification output was not significant among DNN, RF and SVM classifiers (Figure 3), the differences observed can be attributed to the different learning approaches and the constraints inherent in each of the algorithms (Bui et al. 2020). For instance, RF is a tree-based classifier that uses repeated sub-setting to group the training data into categories of increasing similarity (Breiman 2001). The SVM is also designed to split the training data mathematically, using hyperplanes to assign classes according to position in multivariate space (Cortes and Vapnik 1995). The results from these classifiers suggest that variability among species is low enough to make such sub-setting inefficient. Thus, tree-based models and the SVM model, which both need well-structured training data in order to provide more accurate models, appear to have less predictive performance than the neural network-based models (Ma et al. 2019). A key advantage of deep neural networks lies in their design, which sends the input (the image data) through successive layers of the network, with each layer hierarchically defining specific features of the images (Abdi 2020). Thus, DNN was able to separate different species better than the RF and SVM machine learning algorithms.
Another notable aspect that could have influenced the overall classification performance is the issue of unbalanced training samples (Table 1). Ordinarily, remote sensing classification produces better results when classes are of similar proportions, which reduces bias towards the most dominant class (Foody 2004;Congalton and Green 2009). Various studies have balanced training data sample proportions in order to improve spectral separation of landcover types (e.g. Olofsson et al. 2014;Romero et al. 2016;Piiroinen et al. 2017). Abdi (2020) mapped vegetation diversity using different machine learning classification algorithms (Extreme Gradient Boosting (EGB), DNN, SVM and RF) and balanced the training samples regardless of class composition proportions, in order to improve their performance. However, for accurate ecological and conservation efforts, sampling design should capture the true representation of identified species in the area of study, rather than focussing on improving accuracy levels.
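One common way of balancing training samples is random undersampling, in which every class is reduced to the size of the smallest class. The sketch below illustrates that approach only; it is an assumption and not necessarily the method used in the studies cited above, and the function name, class codes and sample values are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(members) for members in by_class.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    balanced = []
    for label in sorted(by_class):
        for sample in rng.sample(by_class[label], target):
            balanced.append((sample, label))
    return balanced


# Hypothetical unbalanced training set: 6 samples of one class, 2 of another.
samples = [0.31, 0.29, 0.33, 0.30, 0.28, 0.32, 0.51, 0.49]
labels = ["AC", "AC", "AC", "AC", "AC", "AC", "BR", "BR"]

balanced = undersample(samples, labels)
print(Counter(label for _, label in balanced))
```

Undersampling discards data from the dominant classes, so the balanced set no longer reflects the relative abundance of species in the field, which is the trade-off raised in the paragraph above.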
The Sentinel-2A image produced better results in discriminating multiple narrow-leaved woody plant species in both wet and dry seasons (e.g. A. caffra, A. karro and A. mundianum). Although SPOT-6 has a better spatial resolution than Sentinel-2A, the advantage of the latter lies in its richer spectral content (10 bands versus 4 bands). Similar observations were made by Lewis et al. (2012), who mapped species diversity (n = 22) using SPOT-5 and Landsat, the latter having spectral properties somewhat comparable to those of Sentinel-2A. Hsieh et al. (2001) also reported that an increase in the spatial resolution of satellite imagery could result in increased spectral covariance that diminishes class spectral separability. This could explain the superior performance in our study of Sentinel-2A imagery, with a 10 m spatial resolution, compared to SPOT-6, with a 1.5 m spatial resolution. It should also be noted that the improved spectral characteristics of Sentinel-2A data, particularly the presence of four VRE and two SWIR bands, enable the sensor to capture important biophysical parameters of plant species (Arroyo-Mora et al. 2018;Forkuor et al. 2018;Chrysafis et al. 2019). The greater number of bands in Sentinel-2A than in SPOT-6 supports effective discrimination of subtle variation in species reflectance (Ghoussein et al. 2019).

Confusion levels of species in wet and dry seasons
The results of the present study comparing discrimination of species diversity in wet and dry seasons demonstrated that the wet season classification produced less confusion among species than the dry season classification (Figure 6). This is due to high photosynthetic activity, which allows leaf spectral properties to be fully expressed in wet conditions. Similarly, better performance of wet season classification was noted by Chrysafis et al. (2019), who mapped five different species and three coexisting landcover types in a Mediterranean climate. Bayat et al. (2018) postulated that plant responses to light differ based on the lighting environment and season. Such factors can explain the different confusion levels between wet and dry seasons observed in this study (higher species confusion in the dry than in the wet season, as illustrated in Figure 6a and b). Image comparison also highlighted the better classification performance of the Sentinel-2A image than the SPOT-6 image. This could be attributed to the better spectral properties of the Sentinel-2A image. Specifically, Sentinel-2A imagery contains more spectral bands, including vegetation-focussed channels, many of which are unavailable with SPOT-6. This also implies that the superior spatial resolution of a SPOT-6 image could not compensate for its more limited spectral coverage.
Although inferior to wet season classification, results of species classification in the dry season were encouraging, with producer's accuracies exceeding 70% for both Sentinel-2A and SPOT-6 (Figure 5a and c). Similarly good performance of classification using dry season data was reported by Hunter et al. (2020), who classified fewer species (n = 9). However, the sub-optimal accuracies in dry season classification could arise because deciduous plants shed most of their leaves during this season, rendering plant canopy detection difficult. Another noteworthy morphological characteristic in this study which could have limited the success of species classification is the similarity in size of the species. It should be noted that > 90% of the species identified in the study are narrow-leaved (Table 1), which limits effective species diversity classification.
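The two per-species measures discussed in this section, producer's accuracy and the number of classes a species is confused with, can both be read from a confusion matrix. The sketch below assumes reference classes in rows and predicted classes in columns, and counts a confusion whenever at least one reference sample of a class is assigned to another class; whether the study counted confusions in this direction is an assumption. The class codes and counts are hypothetical.

```python
def producers_accuracy(matrix, labels):
    """Share of each class's reference samples that were labelled correctly."""
    result = {}
    for i, label in enumerate(labels):
        reference_total = sum(matrix[i])
        result[label] = matrix[i][i] / reference_total if reference_total else 0.0
    return result


def confusion_counts(matrix, labels):
    """Number of other classes that received at least one sample of each class."""
    result = {}
    for i, label in enumerate(labels):
        result[label] = sum(
            1 for j, count in enumerate(matrix[i]) if j != i and count > 0
        )
    return result


# Hypothetical matrix for two species codes and one land cover class
# (rows: reference, columns: predicted).
labels = ["AC", "BR", "Grass"]
matrix = [
    [14, 4, 2],
    [0, 18, 2],
    [1, 0, 19],
]

print(producers_accuracy(matrix, labels))
print(confusion_counts(matrix, labels))
```

In this example the class with the lowest producer's accuracy is also the one confused with the most other classes, which mirrors the relationship between accuracy and confusion described above.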

Comparison of images based on band importance
A look at the importance of variables in the classifications shows similarly high contributions from the NIR band of both SPOT-6 and Sentinel-2A images (Figure 7). Such performance is not surprising given that the emissivity rate of the leaf surface of a fully replenished plant (in the wet season) is generally in the range of 0.96 µm-0.99 µm of the electromagnetic spectrum (Xue and Su 2017). The contributions of VRE bands were fairly promising in the classification of plant species with morphologically similar characteristics. Munyati (2018) classified different plant species composition in savanna and noted that VRE performed generally well in discriminating subtle differences in plant species, because it is more sensitive to variation in chlorophyll content. Macintyre et al. (2020) reported good performance of Sentinel-2A imagery in detecting the flowering phenology season of plant species, and attributed it to the presence of VRE bands. Sentinel-2A SWIR bands are centred at 1.61 µm (band 11) and 2.19 µm (band 12), and are thus able to detect variability in water content between different tree species (Lukeš et al. 2013;Macintyre et al. 2020). These properties could also explain the good contributions of SWIR bands. The visible range generally showed a similar pattern of low contributions in classifying different plant species in both dry and wet seasons using both images (Figure 7). This could be attributed to the closed-canopy nature of plant species in the study area. Tucker (1979) demonstrated that the reflectance difference in the visible and NIR regions could be related to various properties of plant density, with NIR performing better than the visible range. Poor performance of the bands in the visible range was also reported by Abdi (2020), who classified different plant species and coexisting landcover types using Sentinel-2A imagery.

Conclusions
This study explored the classification accuracies of narrow-leaved woody plant species in dry and wet seasons using SPOT-6 and Sentinel-2A imagery. Overall, classification of plant species was more accurate in the wet season than in the dry season. This is attributed to higher photosynthetic activity (vegetation vigour) during the wet season than during the dry season. Though not as good as the wet season classification, the dry season classification still yielded promising results. This potentially makes it useful for operational monitoring purposes, because images acquired during dry seasons are less affected by atmospheric interference than those acquired in wet seasons. Comparing the performances of images, Sentinel-2A resulted in better classification accuracies than the SPOT-6 image in both wet and dry seasons. This was due to the better spectral characteristics of Sentinel-2A imagery, and occurred despite the better spatial resolution provided by SPOT-6.
Variable importance results showed consistently high contributions of the NIR and longer wavelength regions to the species classifications using both images in dry and wet seasons. Such performances indicate the adequacy of VRE, NIR and SWIR range bands for narrow-leaved plant species classification in both wet and dry seasons. Our study classified species diversity by comparing only the wet and dry seasons; however, some species are likely to exhibit dynamics at shorter intervals than the two generic seasons considered in the study. Therefore, classification of species type should be extended to higher temporal frequency analysis in order to capture these dynamics for improved biodiversity monitoring and management.