Investigating the effects of landscape characteristics on landslide susceptibility and frequency-area distributions: the case of Taounate province, Northern Morocco

Abstract The geomorphological control on landslides susceptibility maps (LSM) and frequency-area distributions (FAD) is under-investigated. Therefore, we investigate LSMs and FAD distributions in two contrasted landscapes of Taounate province using explanatory statistical techniques. In the geologically heterogenous context, our findings show that LSM spatial distributions for deep landslides are different to those of shallow processes, regardless of the LSM technique used. The Predictive factors also differ between both types of mass movements in this rugged and heterogenous zone. However, the opposite is observed in the geologically homogenous South, where shallow and deep landslides seem to be controlled by the same factors and tend to produce LSMs that are similarly distributed. This is further emphasized by the Area Under Curve (AUC) results which yield good to very good values in the South regardless of the types of landslides used for validation, with average to bad values in the North when shallow landslides are used to validate deep LSMs and vice versa. The same can be said about FADs, which are also sensitive to landscape variations.


Introduction
Landslides were the subject of many scientific research papers because of their great potential to hinder the socio-economic development of certain mountainous areas (Haque et al. 2016;Del Soldato et al. 2017;Haque et al. 2019), as well as their significance and contribution to understanding landscape evolution processes (Lomoschitz et al. 2002;Carlini et al. 2016;Mateos et al. 2018;Tanyaş et al. 2019;Lacroix et al. 2020;Qiu et al. 2021). Although the latter perspective is of great importance to geologists and geomorphologists alike, most of landslide research teams are focussed on landslide hazard assessment in general and spatial susceptibility assessment in particular. Such research, which aims to predict future landslide occurrences, was popularized by the availability of Geographic Information Systems (GIS) and the development of statistical tools and datadriven approaches since the late 1980s and early 1990s (Jibson and Keefer 1989;Pomeroy 1989;Agterberg and Bonharn-Carter 1994;Bonham-Carter 1994). In fact, an excess of 50,000 papers were published on landslide susceptibility mapping (LSM) during the last three decades, which demonstrates the ease of data manipulation and processing due to the recent important technological advances. However, this comes at the expense of geomorphological research, which aims to characterize both landslide mechanisms and the landscape evolution processes in their local and regional contexts, to which less time and effort were dedicated.
The quantitative assessment of LSM performance is generally performed using statistical techniques such as the Area Under the Receiver operator Curve (AUC) or the Rindex (e.g. Shahabi et al. 2014;Razavizadeh et al. 2017;Borrelli et al. 2018;Li et al. 2019). Comparative studies mainly focus on the use of multiple techniques and the weighting of their respective performances with the above-mentioned statistical tools (e.g. Gong 1996;Othman et al. 2015;Razavizadeh et al. 2017;Benchelha et al. 2019). Nevertheless, the effect of input data selection, with regard to both the different combinations of determining factors, and the training data sampling and selection approaches, is rarely discussed (Pereira et al. 2012;Bounab et al. 2022). As such, most LSM studies only measure the goodness of fit of a given validation dataset to certain LS models without discussing their geomorphological significance. Also, investigating the existence of a link of causality that explains the observed statistical associations is a rare practice in the field, with only few examples of such investigations being conducted (Yilmaz and Keskin 2009;Gaidzik and Ram ırez-Herrera 2021; Bounab et al. 2022). In addition, some researchers tend to focus on the impact of using different statistical techniques (bivariate VS multivariate VS Artificial Intelligence), the results of which are not consistent across the world. For instance, bivariate techniques are found to be better in some settings, while multivariate techniques tend to perform better in others (Lee and Pradhan 2007;Kavzoglu et al. 2015). Moreover, sometimes ANN is deemed the best technique and sometimes not. This can be explained by the input data selection practices, which are rarely if ever considered in LSM research (Aditian et al. 2018;Ozdemir 2020;Bounab et al. 2022).
Northern Morocco is not an exception to this situation. It is both affected by many landslides with pronounced consequences (Millies-Lacroix 1968;El Kharim 2002;Mastere 2011;El Kharim et al. 2021;Obda et al. 2021) and the subject of a multitude of landslide susceptibility attempts that do not consider the geomorphological context and processes in the development and discussion of the said predictive maps (e.g. Elmoulat and Brahim 2018;Benchelha et al. 2019;Harmouzi et al. 2019;Elmoulat et al. 2021). In addition to typology, the relative age and the geomorphological characterization of landslides are lacking, which could lead to errors due to the drastic contrast observed between the old Quaternary dynamics and those of the present (Milli es -Lacroix 1965;Millies-Lacroix 1968;El Kharim 2002). This effect was verified and emphasized by Bounab et al. (2022) in a relatively small area of Northern Morocco, where the inclusion of relict landslides induced significant error in comparison to models that exclude such processes. To our knowledge, no other research discussed these effects in this North African region.
The characterization of landslides also includes the study of frequency-area distributions (FAD), which yields important information regarding the completeness of a given landslide inventory map and its overall size distribution patterns (Stark and Hovius 2001;Guzzetti et al. 2002;Malamud et al. 2004;Li et al. 2016;Jafarimanesh et al. 2018). The negative power law exponent and the position of the rollover effect observed in most landslide inventory datasets constitute good metrics for defining the slope dynamics and characterizing their underlying hydro geomorphological mechanisms (Brunetti et al. 2014;Qiu et al. 2017;Tanyaş et al. 2019;Qiu et al. 2021).
In Taounate Province, such geomorphological investigations are of great significance for landslide research in Northern Morocco. In fact, this area is marked by a net contrast between the heterogonous and rugged topographic expressions of the Intrarif and Mesorif geological units in the North and the homogenous gentle topography of the Prerif domain in the South (Maurer 1968;Suter 1980a;Benzaggagh and Atrops 1997). Such difference in terms of slope gradient, and elevation difference between both landscapes renders landslide susceptibility assessment efforts more difficult and necessitates the integration of FAD analysis as well as the typological characterization of the training data before preparing a spatial predictive model. Previous research, which only covered 20% of the province, and focussed mainly on the Northern segment of the study area (e.g. Benchelha et al. 2020;Ozer et al. 2020) fails to investigate such effects and thus more research is needed.
In this regard, we attempt to expose the effect of such a net contrast on the distribution and susceptibility of both shallow small-scale landslides and deep, large-scale ones. To do so, we employed two bivariate analysis techniques, one multivariate technique and an Artificial Neural Network (ANN) algorithm in order to detect biases that are due to the characteristics of some methods. An FAD analysis is also performed for both the Southern and Northern segments of Taounate province in order to highlight the consequences of the topographic and tectonic differences on slope instability characteristics. Finally, the results are discussed with the aim of explaining our observations.

Study area
The Southern (denoted zone B for simplification) and Northern (denoted zone A for simplification) segments of Taounate province are geologically sited in different domains, which are the Prerif and the Intra-Mesorif respectively. The latter segment is marked by a complex geological and geomorphological setting, with an important lithological heterogeneity and geomorphological contrast induced by the major thrust faults in the area. The latter domain however is simpler in structure and is characterized by more gentle slopes and less rugged landforms.
In fact, the Intrarif, Mesorif and Prerif are the names attributed to three dislocated palaeogeographic and tectonic zones of the external Rif (Suter 1980a(Suter , 1980b. Together, they form three distinct sub-domains derived from the tectonic inversion of the North African paleo-margins (Favre et al. 1991;Michard et al. 2014;Gimeno-Vives et al. 2019). The first and upper most unit is the Intrarif, which is comprised of the tectonically dislocated Tangier unit, mainly dominated by Upper Creataceous sedimentary shale and marly formations, and its supposed Lower Cretaceous flysch-like stratigraphic substratum (Ketama unit) with evidence of low grade metamorphism (Durand-Delga et al. 1960). Both of the latter units outcrop in the Northern most part of the study area and occupy the upper tectonic position (Figure 1).
To their south, outcrop the underlying Mesorif formations, which differ from the Intrarif ones in terms of lithostratigraphic facies and recent tectonic evolution. In fact, the Mesorif is more heterogenous with a relatively wider variation of facies that ranges from rocky and semi-rocky formations to unconsolidated material (Suter 1980a;Benzaggagh and Atrops 1997;Sani et al. 2007). With respect to its Neotectonic evolution, a main stratigraphic unconformity clearly distinguished the Mesorif from the Intrarif, with Miocene deposits being absent in the latter and present in the former (Gimeno-Vives et al. 2020). Consequently, the topography is more rugged due to differential erosion processes that exploit the lithological heterogeneities and produce relatively steeper slopes with elevations reaching up 1800 m. In the study area, this geological unit is part of the Northern segment of Taounate province ( Figure 1).
To the South of Oued Ouergha (Figure 1), the lower tectonic unit outcrops (Prerif), which is the most homogenous with regards to the lithology and topography. The terrain is dominated by marly formations with a gentle hilly topography (Fares 1994). The Ouazzane nappes, which tectonically overlay the Prerif formations, are also formed by similar material with thin alternations of sandstone or limestone strata. Similarly, the Miocene deposits are dominated by marls (Sani et al. 2007) and are consequently similar in terms of geomorphological control and landscape processes to the Prerif and Ouazzane units (Benzaggagh and Atrops 1997).

Material and methods
To investigate the effects of the above-described contrast, we used four explanatory research methods to exclude algorithm biases . We also split the training data into largedeep landslides and small-shallow landslides. Such a practice allows investigating the deep and shallow processes separately. Finally an FAD analysis is performed in order to further demonstrate the said contrast between the two segments of the study area and also to further verify and explain our findings.

Landslide inventory mapping (LIM)
Training data selection is one of the most if not the most important factor contributing to the accuracy of LSMs (Pereira et al. 2012;Gaidzik and Ram ırez-Herrera 2021;Bounab et al. 2022). Although automatic techniques tend to perform well in areas with less vegetation and noise, the interpretation of aerial photographs is still considered a very reliable method if done by a trained and experienced geomorphologist (Guzzetti et al. 2012). This is mostly true in very complex settings such as the study area where automatic methods tend to perform poorly.
For this reason, the elaboration of a LIM for the province of Taounate was a time-consuming process, realized with the aim of obtaining the most complete dataset (possible) for this largely uninvestgated and undocumented area. To do so, our research was not limited to the delimitation of landslide-affected areas, but also identifying and characterizing the activity, type, age, date, cause, area and relative depth for each landslide. This information was used to prepare the training data for the LSMs through separating data into deep and shallow landslides based on their area and typology. In our study, shallow landslides are all mass movements the area of which does not exceed 5000 m 2 for earthflows and 2500 m 2 for all other types, which is a wide-spread classification practice in scarce-data areas (Figure 2a and b) (Moser and Schoger 1989;Aleotti et al. 1996;Avanzi et al. 2004;Markart et al. 2007;Raetzo et al. 2007). For this end, we used a multi-source dataset that includes but is not limited to, satellite derived google-earth images, airborne stereoscopic photographs, press articles, previously published thesis (Maurer 1968;Fonseca 2014) and finally field surveys and investigations that we conducted from 2017 to 2020. Although automatic techniques exist, expert-based LIMs are still very reliable tools when done by trained geomorphologist (Guzzetti et al. 2012). Therefore, we relied on the knowledge and experience of the co-authors who already successfully achieved LIMs in Northern Morocco (El Kharim 2002;Bounab 2022;Bounab et al. 2022), and investigated many slope instability cases in the Northern Rif region (El Kharim et al. 2003;Bounab et al. 2021;El Kharim et al. 2021;Obda et al. 2021). The typology was defined according to the Hungr et al. (2014) classification which is a modified version of the infamous Cruden and Varnes (1996) system. With respect to the age of each landslide, the McCalpin (1984) classification system was adopted and simplified due to the lack of sufficient data. As such, we regrouped Active and Inactive-young processes of this classification into one category named new landslides. Inactive mature and Inactive old processes are called in this paper old and dormant/eroded landslides respectively (Figure 2c and d). Then the LIM was divided into shallow and deep landslides based on the typology and size of each individual slope instability case.
To validate the data, we further split the inventory into ante-2016 (training: 90.6%) and post-2016 (validation: 9.4%) landslides. This practice although arbitrary and subjective helps to avoid validating newer landslide with older ones and allows to detect changes in slope dynamics between old and new processes. Conversely, the random splitting practiced by many authors (e.g. Naghibi et al. 2016;Kamali Maskooni et al. 2020;Fang et al. 2021) may induce such undesirable errors, which we prefer to avoid.

Landslide FAD analysis
Numerous studies, have shown that landslide size distributions in general and area distributions in particular follow a power law distribution that can be best fitted to double-Pareto and inverse gamma models (e.g. Guzzetti et al. 2002;Qiu et al. 2021). These two theoretical distributions are the go-to models for landslides FAD analyses since they allow characterizing both the power-law tail for deep landslides and the Rollover for smaller ones (Malamud et al. 2004). The rollover was defined in this study as the smallest value at which the best coefficient of determination was obtained for the power law decay (Kasai and Yamada 2019). In this work, the use of double Pareto and inverse gamma functions is mainly focussed on defining the FAD for zone A and zone B. The mathematical equations that allow doing so are the Double Pareto distribution (Stark and Hovius 2001;Eq. (2)), characterized by positive and negative power scaling parameters, and the three parameter inverse Gamma distribution (Malamud et al. 2004;Eq. (1)), which is fitted to the power-law decay of medium and deep landslides as well as the exponential reversal of smaller ones.
where q is a parameter of the gamma function, (a > 0), b a continuous scale parameter (b > 0), c is a continuous location parameter and C is the gamma function.
where x is the landslide area, Amax is the largest landslide area in the data; Ap is the rollover location; a and b are respectively the positive and negative exponents of power law functions After establishing the models for the entire datasets, the same analysis was performed for specific types, ages, and sizes with the aim of detecting variations that could be due to such variables. In addition, the parameters obtained in each of the above-detailed analyses are compared between zone B and zone A so as to detect differences caused by the geomorphological contrasts between the two zones. Similar analyses were performed by Qiu et al. (2021) for the same purposes.

Landslide conditioning factors
In this work, only the factors that contribute the most to the instability processes were considered for calculating the LSM models. In this respect, the Goodman-Kruskal (G-K) test was set as a method for measuring the contribution of each factor and subsequently select the most influencual in the area of interest (Goodman and Kruskal 1954;Irigaray 1995;El Hamdouni 2001;Fern andez 2001;Irigaray et al. 2007). In fact, G-K values indicate the correlation between independent (factors) and dependent (land movements) variables. When it is greater than 0.5, this means that the factor is considerable and thus can be included (Fern andez et al. 2003;Irigaray et al. 2007). These coefficients are included in this analysis and collected from various sources. Factors such as elevation, slope, aspect and local relief are extracted from a 28-meter digital elevation model (DEM) (Figure 3), while the other factors such as lithology, land use, road lines, hydrographic network and gullies were produced by manually digitizing a variety of data sources (1:50 000 geological maps, 1:25 000 topographic map, satellite-derived google-earth imagery) (Figure 3).

LSM bivariate analyses
3.3.1.1. GIS matrix method. The matrix method was proposed by Degroot-Hedlin and Constable (1990) and later used and developed by Fern andez et al. (1999). It allows calculating the susceptibility index for a given area, by determining all possible combinations between the considered factors. Then, the area corresponding to each combination is calculated and compared to a dichotomous LIM dataset. Subsequently, the area of landslideaffected zones is computed. The result is a contingency table that defines the relationship between spatial factors and landslides in each map unit of a given study area. The level of susceptibility of each matrix is calculated as the ratio of the area of unstable zones to that of stable ones. When the ratio equals or is close to 1, the area is interpreted as extremely susceptible. Conversely, values close to 0 indicate extremely low susceptibility.

Weight of evidence method
Although originally used as a metric to quantitatively evaluate the mineral potential of a given area (Bonham-Carter 1990), this technique was later intensively used in landslides susceptibility mapping (Thiery et al. 2007;Pradhan and Lee 2010a;Ding et al. 2016;Arabameri et al. 2019) to the point where it almost became synonymous with the discipline. It is a probabilistic method based on Eq. (3): where A is the event corresponding to the occurrence of a landslide, B is the event corresponding to the non-occurrence of a landslides and P(AjB) is the probability of A occurring given B.
In fact, this technique assesses the weight of each predictive factor category through calculating the probability of occurrence to that of non-occurrence for each individual factor category (Eqs. (4) and (5)).
where W þ is the positive weight, W À is the negative, NL i is the number of landslide cells in the corresponding category, NL is the total number of landslide cells in the study area, Nc is the number of cells of the corresponding category and N is the total number of cells in the area. The W þ value is an indicator of correlation between the occurrence of landslides and a given predictive variable category, where 0 indicates no effect, positive values indicate correlation and negative ones non-correlation. The Wis a measure of non-occurrence likelihood, which can be interpreted in the same manner.
Based on these two parameters, a contrast value (C) (Eq. (6)) was calculated (Supplementary Tables 1 and 2) which is often interpreted as a LS index.

Logistic regression (multivariate)
The logistic regression (LR) method is useful for predicting the absence or presence of a feature or outcome based on a set of predictors. In fact, it can be used in landslide research to determine the relationship between landslide occurrence and possible conditionning factors. Currently, it is the most widely used multivariate method in landslide susceptibility mapping (Budimir et al. 2015). To deploy this algorithm, the dependent variable must be dichotomous and the determining factors have to present little to no multicollinearity between them. Then, the conditional probability of landslide occurrence P (y ¼ 1jx) is calculated (Supplementary Tables 3-6) using Eqs. (7) and (8).
where b 0 is the intercept of the equation, and b 1 , b 2 … bn are the coefficients attributed to each independent variable, x 1 , x 2 , … , x n are the variables used to build the model. To reduce the effect of measurement units on the interpretability of the final logit model, we performed a standardization of the data. This step is crucial for improving the quality of the model especially in relation to odds ratio (OR) computation (Eq. (9)).

Artificial neural network
The Artificial Neural Networks (ANN) is a popular artificial intelligence method that have been widely used for landslide susceptibility mapping and detection (Pradhan and Lee 2010b;Were et al. 2015;Dou et al. 2018). They are considered robust computational mechanisms capable of acquiring, representing, and analysing multivariate information spaces (Svozil et al. 1997). This approach has many advantages over conventional statistical methods. In fact, ANN algorithms are independent of the statistical distribution of the data, which means that they do not require any preprocessing of the input (Cs aji and others 2001; Zeng-Wang 2001). In landslides research, the Multi-Level Perception (MLP) performed through the backpropagation algorithm is the most frequently used ANN method due to its flexibility and adaptability. (Pham et al. 2017;Chac on et al. 2019;Li et al. 2019;Hong et al. 2020;Bounab et al. 2022). In fact, MLP consists of three sets of layers: the input layer, the hidden layers, and the output layer, with most of the processing occurring between the second and third layers. During these steps, the model undergoes a process of constant updating and improvement until a 'satisfactory' result is reached (Pradhan and Lee 2010b). After the hidden layers had been computed by the algorithms, the weight of each variable is used to generate new ones through multiplying each input value by its corresponding weight and subsequently summing the products (Eq. (10)). The new generated variables are used to calculate the probability of occurrence of landslides for each individual pixel using Eq. (11).
where P j is the pseudo-probability for each individual cell of the study area, and net is the input received by each neuron in the hidden layers. A more detailed explanation of the way by which the 'net' value is computed is given by Pradhan and Lee (2010b).

LIM and landslide size-frequency distributions
In the province of Taounate (5600 km 2 ), a total of 27,891 landslides have been inventoried (Figure 4), which corresponds to an average density of about 5 landslides per km 2 (lsdl/ km 2 ). These processes involve an area equivalent to 4.1% of the territory (Figure 4). About the typology of these said landslides, four main types have been identified: 8091 landslides, 12,248 earth and debris flows, 5078 solifluctions, and 2474 complex landslides ( Figure 2). In zone B, 19,241 slope instability processes are identified, with a mean density of 6.12 lsdl/km. 2 Although important in number, most gravitational processes (around  (Table 1). However, the variability within deep landslides is important for all investigated categories, indicating the heterogeneity of gravitational processes, with standard deviation values exceeding 100% in most cases (Table 1). However, the average landslide size in zone A ($24,825 m 2 ) is almost 6.5 times larger than that of zone B ($3928 m 2 ), suggesting that by and large, mass movements are of bigger proportions in the former zone. In addition, small landslides (<5000 m 2 ) are more frequent in zone B compared to zone A, which further solidifies the previously advanced observation.
Theopposite can be said about deep landslides (Table 1). The double Pareto and inverse gamma models fit well the landslides FAD, which follow in both segments of the study area, a heavy-tailed distribution with a negative power law defining the probability density of medium-scale and large-scale landslides ( Figure  5a). The Rollover effect observed in small landslides, suggests, according to Guzzetti et al. (2002), that the LIM generated in this work is more or less complete. Despite being present in all graphs presented in Figure 5a, the position of the rollover effect varies between zone B and zone A, with values of 800 m 2 and 2800 m 2 , and probability densities of 3.6 Â 10 À4 and 8.5 Â 10 À5 respectively, thus reflecting the above-described typological difference. The power-law tail exponent (a) of the best fit double Pareto and inverse gamma models are respectively 0.94 and 1.08 in the North (Zone A) (with a maximum probability density of 2 Â 10 6 and 2 Â 10 5 ), which are in the low range of values compared toother areas of the world. In the South however, more important a values are computed (1.41 for the inverse gamma curve and 1.5 for the double pareto model, with maximum probability densities of 5 Â 10 5 and 1.5 Â 10 3 ), which corresponds to a steeper decline, thus indicating the relatively low frequency of mid-size landslides in this distribution.
The comparison of FAD distributions for old/new and active/non-active landslides provides good information regarding the temporal evolution of gravitational processes in  both study areas (i.e. Prerif and Intra and Mesorif). However, because of the fact that old and relict shallow landslides are not well preserved, such analyses cannot be performed because of the undersampling in shallow LIM (Malamud et al. 2004;Frattini and Crosta 2013). Thus, only deep processes are subject to such analytical techniques.
Our results show that in zone B, there are more than 3394 totally or partially active landslides, while in the North, only 2097 landslides show signs of activity. The new landslides are more frequent in zone B (2919) compared to zone A (846). On the contrary, old landslides are more frequent in zone A (3088) compared to zone B (2586) ( Table 1).
The inverse gamma and double Pareto distributions appear to match the observed probability of both old and new landslides. In zone A, the most frequent landslide area categories for both both types of processes are 2800 m 2 to 64,000 m 2 ( Figure 5d) and 5000 m 2 to 80,000 m 2 respectively (Figure 5e). Nevertheless, values in zone B, which is sited in the Prerif geological context, are drastically different, with the most frequent classes being the 2800 m 2 to 17,000 m 2 for new landslides (Figure 5e) and 3000 m 2 to 19,700 m 2 for old ones (Figure 5d). With regards to the active and non-active landslides, they also seem to be well fitted to the inverse gamma and double Pareto distributions, with the observed probability density curve corresponding tothe inverse gamma and double Pareto distributions for small landslides <10,000 m. 2 In the Prerif terrains, the most frequent landslides area is around 3000 m 2 for active processes (Figure 5b), and ranges from 2900 m 2 and 15,700 m 2 for inactive ones (Figure 5c). While, in the intrarif and mesorif terrains, the most frequent landslide size is close to 4200 m 2 for active landslides (Figure 5b), and ranges between 30,000 m 2 and 360,000 m 2 for inactive ones (Figure 5c). Such results indicate that the contrasted geological and geomorphological settings of both segments of the study area are translated into large and significant differences in relation to the size distributions of active and inactive, old and new, as well as small and deep landslides.

Selection of landslide determining factors
The contingency table of all landslides predictors used in both zone A and zone B (Table 2) shows the absolute values of the gamma G-K index corresponding to each determining factor. The highest determining factor in Taounate province is gullies with a G-K > 0.95 in both Zone A and Zone B. This is due to the fact that deep water incision causes the removal of support material especially in fine-grained strata such that of the study area. In addition to this, the slope angle is also a good predictor of all the investigated categories of landslides, with G-K values exceeding 0.8. This comes as a result of the marly deposits becoming increasingly unstable close to major thrust faults, where the topography is steeper. The altitude is another good predictive variable especially for shallow landslides (G-K > 0.75) with a relatively poorer performance for deeper ones (G-K < 0.733). Similarly, the local relief factor corresponds to a medium correlation for shallow landslides (G-K < 0.71), and a strong one in relation to deep processes (G-K ¼ 0.8), which is due to the complex mechanisms involved in the latter category. Regarding lithology and Landuse, they tend to present good correlation in Zone A, and low to medium correlation in the South (zone B). Once again, the geological and geomorphological contrasts of the study area are manifested through influencing the predictive performance of the investigated factors. With respect to hydrography, it seems that the stream network is only effective in predicting shallow landslides (G-K > 0.816). For deep ones however, it has a low G-K score (G-K < 0.33) in both segments of the study area.The road network factor is also exclusively influential in the Prerif area given its low G-K values in zone A and its intermediate ones in zone B. Finally, the slope aspect is found to be the least influential variable in the province of Taounate. Artificial neurel networks (ANN) methods. Using the quantile classifier, the susceptibility index values were grouped into 4 main categories: low susceptibility, moderate susceptibility, high susceptibility and very high susceptibility ( Figure 6). Through the visual interpretation of the spatial distribution of LSI values, it can be clearly seen that there is a significant difference between the distribution of deep landslides and shallow landslides in zone A, as the high susceptibility areas for shallow landslides are low susceptibility areas for deep landslides and vice versa. Through comparing the four methods the one another, it is clear that the bivariate models (WoE and GMM) seem to be the most conservative and the least detailed (Figure 6a-d), and it can be easily seen that they are mainly influenced by lithology and land use. In contrast, for LSMs derived from the multivariate LR technique and the ANN method, which showed similar results in terms of spatial distribution and LSI values, these models are the least conservative and most detailed, and seem to be strongly influenced by topographic factors (Slope, Local relief and gullies) (Figure 6e-h). Figure 7 illustrates the LSMs of the Prerif area (Area B) for deep (left) and shallow (right) landslides. The results clearly show similar LSM spatial distributions for both types of mass movements. However, the shallow landslide models are the most conservative due to the geological nature of this area, which strongly favours the triggering of shallow landslides. Areas of high susceptibility are focussed along gullies and marl formations while areas of low susceptibility are characteristic of alluvial deposits and massive sandstone and limestone formations (Figure 7). By comparing the eight LSM models obtained using be above mentioned techniques, it is clear that the models derived from bivariate approaches appear to be very similar in terms of spatial distribution and LSI values (Figure 7a-d). The ANN model is considerably smoother, showing more detailed and very liberal results (Figure 7g and h). Finally, the models produced using the LR technique appear to be the most conservative in the prerif area (zone B) (Figure 7e and f).

Validation of LSMs
With regards to the performance of the LSMs, all models provided in this study are shown to be good (AUC > 0.7) to very good (AUC > 0.8) (Figure 8) at the exception of the WOE model for deep landslides with an AUC ¼ 0.68 (Figure 8b). In general, the multivariate techniques were shown to be more performant in assessing large landslides susceptibility, by providing the most liberal and accurate models, especially for the LR technique (Figure 8c). These findings confirm previous research, which found that multivariate techniques in general and LR in particular perform better than bivariate ones (Kavzoglu et al. 2015). However, for shallow processes, the LSMs produced by the bivariate techniques are more precise but also more conservative with the exception of the LR technique, which gave the most conservative and least accurate model. Thus, the methods Figure 6. LSMs of zone A derived from GIS matrix for deep landslide (a) and shallow landslides (b). LSMs of zone A derived from WoE for deep landslide (c) and shallow landslides (d). LSMs of zone A derived from LR for deep landslide (e) and shallow landslides (f). LSMs of zone A derived from ANN for deep landslide (g) and shallow landslides (h).
used do not present a significant difference in terms of performance between zone B and zone A. Therefore, the observed variability in the frequency of susceptibility classes (Figure 8) must be consequential to the underlying geomorphic processes that control slope instability at the landscape scale.

Performance of predictive factors for shallow and deep landslide susceptibility mapping
Based on the G-K values obtained in this study, the assessment of the performance of landslides predictors is achieved (e.g. Irigaray et al. 1996;Costanzo et al. 2012;Pereira Figure 7. LSMs of zone B derived from GIS matrix for deep landslide (a) and shallow landslides (b). LSMs of zone B derived from WoE for deep landslide (c) and shallow landslides (d). LSMs of zone B derived from LR for deep landslide (e) and shallow landslides (f). LSMs of zone B derived from ANN for deep landslide (g) and shallow landslides (h). et al. 2012). This index is widely used and thus its results constitute a reasonably solid metric in our application. In effect, for the prediction of shallow landslides, all variables are found to be significant at the exception of Aspect and distance to road ( Table 2). The similarity between Zone A and Zone B indicates that shallow processes occur in more or less similar geomorphological and geological settings. The reason for such results is the fact that shallow landslide occurrence is correlated with pedogenic soil thickness as shown by Ho et al. (2012). In geotechnical terms this is due to a loss of cohesion in unconsolidated deposits under extreme rain conditions (Frattini and Crosta 2013;Qiu et al. 2021). In the study area, the formation of thick soil layers is favoured in blue marls and other marl-like lithologies, which tend to resist the least to chemical weathering. Hence, the geomorphic expressions and lithological characteristics that favour the development of shallow gravitational processes are found to be the same for both the Intra and Mesorif (North) and the Prerif (South).
Deep landslides differ however from one segment of the study area to the other. In reality, more variables can be considered for LSM preparation in Zone A when compared to those of Zone B. This shows that despite being of the same category, deep landslides in the North (Zone A), which are significantly larger, are triggered in different conditions due to the complexity of the Intrarif and Mesorif terrain, in comparison to the relatively simpler and more homogenous Prerif structure. Actually, the presence of subhorizontal thrust faults where the lithological, hydrogeological and geomorphological contrasts are severe (Suter 1980a), translates into steeper slopes (Figure 3) due to the superior resistance of sandstone layers to mechanical erosion, as well as differential processes that exaggerate the contrasted relief morphology. Similar oriented landslide distributions are also reported by Carlini et al. (2016) where landslide-affected areas are preferably located alongside thrust faults. In the Prerif no oriented distributions are observed, thus their conditionning factors differ.

The significance of FAD for landscape characterization
A more complete landslide is a more accurate one. Consequently, the evaluation of the sample size and its completeness becomes more relevant and could be considered as crucial for LSM results. In this respect, Li et al. (2016) confirmed previously advanced conclusions by Guzzetti et al. (2002) who propose FADs as a way of measuring the wholeness of LIMs. To do so, the goodness of fit of FADs to double pareto and gamma theoretical distributions is used as an indicator of sample size performance, especially knowing that less complete LIMs are badly fitted. Therefore, our LIM can be considered as more or less complete because the FADs obtained are well-fitted to both previously mentioned statistical models. Moreover, despite the fact that we split our data base into small-shallow and large-deep landslides categories, the obtained distributions can still be considered as reasonably well fitted. This is a good testimony to the robustness and completeness of the LIM used in this study with the rollover and the power law decay being more or less preserved. Nevertheless, the rollover position changes, which was reported by other researchers who split LIM datasets based on the date of occurrence (Guzzetti et al. 2002). It is important to note that in our case, old landslides failed to adjust to the double pareto distribution (Figure 5d). This effect was described in previous research and is considered normal since old shallow and small processes are not well preserved and thus are excluded from the LIM, which induces a deflection effect in the rollover area (Van Den Eeckhaut et al. 2007).
In relation to the landscape evolution processes that control zone A and zone B, the FAD parameters suggest that new and active processes are similar (Figure 5b and c). However, old ones show a significant contrast, expressed through the slightly different rollover position and the largely contrasted power law exponents, which indicates larger processes in Zone A (Figure 5d). In Figure 9, the reasons behind such difference are explained. In zone A, thrust nappes structures of the Meso and intra-rif induces large and deep landslides processes as can be seen through comparing the LIM (Figure 4) to the geological map of the study area (Figure 1). Similar processes are described in other areas around the world where thrust systems are shown to be associated with mass movements (Strecker and Marrett 1999;Arai and Chigira 2019;El Kharim et al. 2021;Bounab et al. 2022). Near the alluvial plain of Oued Ourgha (Figure 1), the gentler topography of the Miocene terrains only favours the development of shallow landslides and thus a significant contrast can be observed between shallow and deep processes in terms of geographic distribution in the North. In the Southern part of the study area (Zone B), the gentler topography and monotonous facies promote smaller slope instability processesthat are spatially associated with deep ravines (see Table 2 and Supplementary Tables 1 and 2).
Hence, no clear spatial variability can be observed between both types of mass movements.

LSM recommendations for complex geomorphological settings
5.3.1. LSM validation using shallow landslides for deep LSMs and vice versa As stated before, and shown in previous landslides research (Zêzere et al. 2005;Dou et al. 2015;Kritikos and Davies 2015;Massey et al. 2018;Shou and Chen 2021), shallow and deep slope instability processes differ both in terms of spatial distribution and trigger mechanisms. To highlight this difference we attempted to validate shallow LSMs produced in this research by deep processes and vice versa. The results presented in Figure 10 reveal that this situation is only true in the North (Zone A), where such a practice yields low AUC values of around 0.45 to 0.59 for all techniques, which is significantly lower than AUC values of Figure 8. Reasons behind such contrast are shown in Figure 9, where the impact of the geological structure and its geomorphological manifestations is illustrated. However, in the South, AUC values (0.7 À 0.88) are similar to those presented in the results section, hence indicating that the spatial distribution of shallow and deep processes follow that of hydrological erosion landforms as shown in Table 2 and explained by Figure 9.

General recommendations for working in complex geological settings
To sum up the findings of this study, we suggest the following as best practices for LSM especially in complex settings.
First, the experts must integrate their geological and geomorphological knowledge and experience in the development of the LSMs. As such, the researchers are supposed to be aware of the different geological units, lithological facies as well as their geomorphological expressions due to the important control that geology has on defining landslide typology, size and spatial distribution (Guzzetti et al. 1996;Lomoschitz et al. 2002;Prager et al. 2009;Carlini et al. 2018). Failure to understand the structure of the field could lead to inaccurate LSMs due to the inability to adequately delimit their extent and subsequently the preparation of the training data. Actually, training data selection and preparation is considered the most important factor controlling the accuracy of LSMs (Kalantar et al. 2018; Gaidzik and Ram ırez- Herrera 2021; Bounab et al. 2022). In fact the inclusion of relict landslides that do not reflect the current slope dynamics could lead to large inaccuracies, the amount of which depends on the methods used . As shown by Shou and Chen (2021), the establishment of separate LSMs for shallow and deep processes, the triggers and conditionning factors of which differ significantly, generates large difference in the final product. Moreover, Pereira et al. (2012) and Pourghasemi et al. (2020) showed that even the sampling and representation techniques of LIM data could drastically change the outcome. Consequently, the training data used in LSM research must be thoroughly investigated before calculating the model. In complex settings, the first step is the distinction between shallow landslides that are mostly induced by short-term intense rainfall, and deep ones that necessitate steeper slopes and more extreme conditions (Zêzere et al. 2005;. This is especially true in the Alboran Sea region and Northern Morocco where the climate is sensitive to NAO variations (Knippertz et al. 2003;Tramblay et al. 2012;Bounab 2022). Another good practice that we suggest is the use of FAD analyses to test for the completeness of the LIM, which is a practice proposed by Malamud et al. (2004) and Van Den Eeckhaut et al. (2007). In fact, the incompleteness of an LIM could lead to drastically different LSMs. The effects of such lack of information is yet to be fully investigated. To do so, LSMs for the same area need to be produced with differentLIM sizes, to see if less landslides in the input data lead to large differences in the output. Input factor selection is the next important recommendation that needs to be carefully examined. Although topographic, geological and environmental variables could all contribute to the predictive model, the inclusion of all of them at one in the computation process leads to noise and inaccuracies, because of the conditional independence assumptioninhenerent to most statitical techniques (Agterberg and Cheng 2002;Pereira et al. 2012), which is rarely verified in recent LSM studies. One way to choose predictive factors is the G-K index, which was proposed by Irigaray et al. (1996). Other indices are the contingency coefficient (CC), the cramer's V (Fern andez et al. 1999), and information gain ratio (IGR) (Tien Bui et al. 2016). And finally, we join El Hamdouni (2001), Boualla et al. (2019), Bounab et al. (2022a) and many others, who stated that for validation purposes, the validation and training data sets need to be split on the basis of a threshold date, instead of using random sampling techniques. The later practice although very common, could lead to the validation of newer landslides by older ones if no temporal threshold is used. As for the typology of landslides used in the validation, the true assessment of the performance of an LSM realized in a complex setting needs to be done using the same types of mass movements as those of the training data. Failure to comply with this recommendation could lead to a false underperformance of the LSM that is due to the bad selection of the input and not the model itself. This was the case in zone A where AUC values were low when validating deep processes by shallow ones and vice versa.

Conclusion
In conclusion, to conduct LSM research, geology and geomorphology must be integrated in all processing steps as well as the interpretation of the output model. The present study attests to this and also stresses the fact that not all LIMs are born equal. Thus one has to evaluate the completeness and thoroughly discuss and analyse the statistical distributions of the training datasets before computing the model, especially in complex settings where different types of mass movements are sensitive to slight changes in facies and tectonic fractures, such as the case of zone A in this study. Nevertheless, such practices are underinvestigated in current research and thus deserve more attention from geologists and geomorphologists working on landslide hazard and risk assessment. Our findings highlight the impact of the landscape on the distribution of shallow and deep landslides, which present a contrasted spatial distribution in heterogenous settings and a similar one in monotonous and homogenous areas. Therefore, it is ill-advised to use LIMs containing both types of mass movements in complex areas. However, our findings need to be further investigated and completed by research in different climate and geological environments. In addition, this paper shows that LSMs are not only useful for prediction, but can also be used as effective tools to study changes in the landscape and statistically demonstrate their impact on the evolution of hillslopes.