Sampling biases of small non-volant mammals (Mammalia: Rodentia and Didelphimorphia) surveys in Paraná state, Brazil

ABSTRACT The lack of information on biological aspects of small mammals in Brazil fits into two of the main gaps in knowledge of biodiversity: the Linnean and Wallacean shortfalls. We performed a broad compilation of studies developed in Paraná state to consolidate the first list of non-volant small mammal species. Furthermore, we indicate which regions lack information and require greater sampling effort to understand the biological aspects of small mammals. We listed 50 species belonging to 30 genera, five families, and two orders, which represent 37% of the marsupial and small rodent species occurring in the Brazilian Atlantic Forest. Our results indicate that there are regions in Paraná state, such as the southwest, without a single record of small mammals, revealing a Wallacean shortfall of approximately 138.584 km². Studies on the small-bodied mammal fauna in the state of Paraná are influenced by accessibility bias, being concentrated at sites less than 50 kilometers from cities, roads, or airports. New research will face the challenge not only of documenting species richness in regions that are still poorly studied or not yet studied, but also of evaluating the conservation status and population dynamics of the species that persist in densely anthropized areas.

Marsupials and small rodents are good indicators of habitat quality because some species are less tolerant of habitat changes, due to their reduced spectrum of rare ecomorphological traits (Püttker et al. 2019). Small-bodied mammals occupy different trophic levels but stand out as the prey base of several groups of larger vertebrates, such as snakes (Hartmann et al. 2009), birds (Rocha et al. 2011), and other mammal species (Bianchi et al. 2011; Giordano et al. 2018). Owing to their wide range of traits and sympatric diversity, small-bodied mammals perform important ecosystem services such as seed predation and dispersal (Vieira et al. 2011; Galetti et al. 2015; Carreira et al. 2020), pollination (Amorim et al. 2020), and predation of invertebrates, small vertebrates, and eggs (Cáceres & Monteiro-Filho 2001; Pinotti et al. 2011), besides being potential reservoirs of diseases (Muylaert et al. 2019).
Information on species richness, diversity, and distribution is the essential baseline for ecological studies and conservation planning (Silveira et al. 2010; Oliveira et al. 2017). Many species of small-bodied mammals are susceptible to fragmentation, habitat changes, and habitat loss (Pardini et al. 2010; Püttker et al. 2013, 2019). Thus, detailed information on geographic distribution, natural history, biogeography, and systematics is paramount to closing knowledge gaps, but these aspects remain poorly studied for many taxa, especially the cryptic-prone mammals (Costa et al. 2005; Hortal et al. 2015; Bovendorp et al. 2017).
The lack of information on biological aspects of small Brazilian mammals fits into two of the main gaps in knowledge of biodiversity, the Linnean and Wallacean shortfalls (Hortal et al. 2015). The Linnean shortfall refers to the discrepancy between the number of species formally described and the number of species that actually exist, including the knowledge gap for extinct species (Hortal et al. 2015). The Wallacean shortfall refers to the lack of knowledge about the geographic distribution of species (Lomolino 2004), generally aggravated by geographic sampling bias in the data on species distribution (Hortal et al. 2015). Both shortfalls have implications for knowledge of large-scale patterns of biodiversity, for processes that modify biodiversity, and for species threat estimates (Hortal et al. 2015).
Recent data papers have shed light on some gaps in the knowledge of biodiversity, providing information with good resolution and a wide geographical scale that helps fill gaps about the distribution and occurrence of small non-volant mammals in the Atlantic Forest (e.g. Bovendorp et al. 2017; Figueiredo et al. 2017). Despite the intense participation of several research groups with study agendas spanning the entire Atlantic Forest of South America, for some regions within this biome, such as Paraná state, there are still few records and little information on the occurrence and distribution of small non-volant mammal species.
With an area of approximately 200,000 km², Paraná state has the second-largest political territory in Southern Brazil (IBGE 2020). Almost entirely inserted in the Atlantic Forest biome (98%), Paraná state also has some relictual patches of the Cerrado biome (Wrege et al. 2017). Despite recent studies on small-bodied mammal diversity and distribution, there are still gaps in knowledge (e.g. Linnean and Wallacean) about the distribution of marsupials and small rodents throughout Brazil (Bovendorp et al. 2017). Since faunistic inventories are fundamental for the development of biodiversity conservation policies and strategies (Silveira et al. 2010; Oliveira et al. 2017), our aims were (1) to improve the spatial resolution of species distribution in the Atlantic Forest biome, and (2) to reveal which regions of the state of Paraná lack information, and then to discuss an agenda to improve sampling effort and increase current knowledge of the small-bodied mammal fauna. To do so, we performed a broad compilation of studies developed in Paraná state to consolidate the first list of non-volant small mammal species.

Study site
Paraná state is located in Southern Brazil, between latitudes 22°29ʹ30" and 26°42ʹ59" South and longitudes 48°02ʹ24" and 54°37ʹ38" West (Maack 2017). Paraná has a territory of 199,298.979 km² comprising 10 mesoregions (i.e. Northwest, Center-North, Pioneer North, West, Center-Western, Center-Eastern, Southeast, South-West, Center-South and Metropolitan of Curitiba), bordering the states of Santa Catarina, São Paulo, and Mato Grosso do Sul, as well as Argentina and Paraguay along its western frontier (IBGE 2020) (Figure 1F). According to the Köppen classification, the climate is of the types Cfa (humid temperate climate with hot summer) and Cfb (humid temperate climate with moderately hot summer), both of which are oceanic climates without a defined dry season (Peel et al. 2007).
The phytoecological regions of Paraná are mainly Mixed Ombrophilous Forest (MOF), Ombrophilous Dense Forest (ODF) and Seasonal Semideciduous Forest (SSF), with some areas of Grassy-Woody Steppe (GWS) and Cerrado patches (WS) (Figure 1) (Wrege et al. 2017). Originally, about 83% of Paraná state consisted of extensive forest formations (ODF, MOF, and SSF) and 17% was open-area vegetation (Maack 2017). Currently, the forest remnants of the Atlantic Forest in Paraná state correspond to less than 28% of its original cover, with the remaining fragments being surrounded by different types of agriculture and forestry, such as Eucalyptus spp. and Pinus spp. (IBÁ 2016; Rezende et al. 2018).

Data collection and bibliographic review
We searched for scientific articles, books, and book chapters in indexing engines (e.g. Web of Science, webofknowledge.com, and Google Scholar, scholar.google.com) using the keywords 'rodents from Paraná', 'Marsupials from Paraná', 'Small nonvolant mammals of the Atlantic Forest of Paraná', 'small non-volant mammals of the state of Paraná' and 'non-flying small mammals of the state of Paraná', both in Portuguese and English. We also considered online databases of species in zoological collections, such as the Centro de Referência em Informação Ambiental (SpeciesLink; CRIA 2020) and the Global Biodiversity Information Facility (GBIF 2020), to obtain distributional, ecological, and taxonomic data on small-bodied mammals of Paraná state. We listed the occurrences (species richness; presence) of small non-volant mammals recorded in Paraná state between 1979 and 2019. Our database comprised a set of 63 studies, but only 40 of these (63.5%) met our prior criteria for analysis (Table 1 and Supplementary Material S8). We considered as valid the records containing information on the collected voucher specimen, as well as georeferencing information for the collection sites.

Data analysis
We used kernel interpolation to create a density map and identify the areas of the state with the highest concentration of small non-volant mammal records. The kernel approach estimates the number of events per unit area in each cell of a regular grid that covers the study area (Wand & Jones 1995). The analysis was performed in QGIS, version 2.18.21, with the 'Heat map' function.
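As a rough illustration of the kernel idea described above, the following Python sketch accumulates kernel weights from point records onto the cell centres of a regular grid. It is not the QGIS implementation: the quartic kernel, the bandwidth, the grid geometry, and the coordinates are all assumptions made for the example, since the text does not report the settings used.

```python
import math


def quartic_kernel(dist, bandwidth):
    """Quartic (biweight) kernel weight; zero at or beyond the bandwidth."""
    if dist >= bandwidth:
        return 0.0
    u = dist / bandwidth
    return (1.0 - u * u) ** 2


def kernel_density_grid(points, x_min, y_min, cell_size, n_cols, n_rows, bandwidth):
    """Sum kernel weights of all points at the centre of each grid cell."""
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        centre_y = y_min + (row + 0.5) * cell_size
        for col in range(n_cols):
            centre_x = x_min + (col + 0.5) * cell_size
            grid[row][col] = sum(
                quartic_kernel(math.hypot(px - centre_x, py - centre_y), bandwidth)
                for px, py in points
            )
    return grid


# Hypothetical record coordinates in projected kilometres (not the paper's data).
records = [(10.0, 10.0), (12.0, 11.0), (11.0, 9.0), (40.0, 40.0)]

# Assumed settings: 5 x 5 grid of 10 km cells and a 15 km bandwidth.
density = kernel_density_grid(records, 0.0, 0.0, 10.0, 5, 5, bandwidth=15.0)

for row in density:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Cells close to the cluster of three records receive higher values than the cell near the isolated record, and cells farther than the bandwidth from every record stay at zero, which is the behaviour a heat map visualizes.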
We evaluated the dissimilarity in species composition across the different phytoecological regions using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) approach derived from the Jaccard coefficient (Legendre & Legendre 2012). The Jaccard coefficient of dissimilarity ranges from 0 to 1, with 1 indicating a completely different faunistic composition between the sites (Legendre & Legendre 2012). Additionally, we used the cophenetic correlation coefficient (r) to verify the fit between the similarity matrix and the dendrogram derived from the Jaccard coefficient. These analyses were performed with the vegan package, version 2.4-1 (Oksanen et al. 2019).
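The three steps named above (Jaccard dissimilarity, UPGMA clustering, and the cophenetic correlation) can be sketched in plain Python as follows. This is a minimal illustration, not the vegan workflow used in the study; the region labels R1 to R4 and their species lists are invented for the example.

```python
import math
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity between two species sets: 1 - |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0  # two empty lists are treated as identical
    return 1.0 - len(a & b) / len(union)


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    `dist` maps frozenset({x, y}) to the distance between items x and y.
    Returns the merge history and the cophenetic distance of every pair.
    """
    def average(c1, c2):
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([n]) for n in names]
    merges, cophenetic = [], {}
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        height = average(c1, c2)
        for x in c1:
            for y in c2:
                cophenetic[frozenset((x, y))] = height  # height at which x and y join
        merges.append((sorted(c1), sorted(c2), height))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges, cophenetic


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Invented presence lists for four hypothetical regions (not the paper's data).
species = {
    "R1": {"sp_a", "sp_b", "sp_c", "sp_d"},
    "R2": {"sp_a", "sp_b", "sp_c", "sp_e"},
    "R3": {"sp_a", "sp_f", "sp_g"},
    "R4": {"sp_f", "sp_g", "sp_h"},
}
names = sorted(species)
pairs = list(combinations(names, 2))
dist = {frozenset(p): jaccard_dissimilarity(species[p[0]], species[p[1]]) for p in pairs}

merges, cophenetic = upgma(names, dist)
cophenetic_r = pearson(
    [dist[frozenset(p)] for p in pairs],
    [cophenetic[frozenset(p)] for p in pairs],
)

for left, right, height in merges:
    print(left, "+", right, "at", round(height, 3))
print("cophenetic correlation:", round(cophenetic_r, 3))
```

A cophenetic correlation close to 1 means the dendrogram heights reproduce the original pairwise dissimilarities well. In R, a comparable result would typically be obtained by combining a Jaccard distance matrix with average-linkage hierarchical clustering, although the exact calls used by the authors are not stated.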
We used the sampbias approach (Zizka et al. 2020) to evaluate the sampling bias of species records across the sites throughout Paraná state. This approach makes it possible to visualize the distribution of records and to quantify the biasing effect of geographic features related to human accessibility (i.e. proximity to airports, cities, rivers, and roads), deriving the biasing effects in space (Figure 1). This is an entirely new approach to assess the sampling bias of any taxonomic group, yet it is restricted to default spatial layers that are presumably partially correlated (e.g. airports and cities). We performed the sampbias analysis with the package sampbias (Zizka et al. 2020). All analyses were carried out in R, version 4.0.1 (R Core Team 2020).
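The intuition behind an accessibility bias can be illustrated with a much simpler descriptive measure: the distance from each record to the nearest accessibility feature, and the share of records within a threshold. The Python sketch below does only that and does not reproduce the statistical model implemented in sampbias. All coordinates are hypothetical, and the 50 km threshold is used only because that distance is discussed in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, adequate for a rough sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_feature_km(record, features):
    """Distance from one record to the closest feature (city, road point, airport)."""
    return min(haversine_km(record[0], record[1], f[0], f[1]) for f in features)


def share_within(records, features, threshold_km):
    """Fraction of records lying within threshold_km of any feature."""
    near = sum(1 for r in records if nearest_feature_km(r, features) <= threshold_km)
    return near / len(records)


# Hypothetical city locations (latitude, longitude), not real gazetteer data.
cities = [(-25.4, -49.3), (-23.3, -51.2)]

# Hypothetical sampling sites (latitude, longitude), not the paper's records.
sites = [(-25.5, -49.2), (-25.3, -49.6), (-24.0, -53.0), (-25.45, -49.35)]

distances = [nearest_feature_km(site, cities) for site in sites]
share_near = share_within(sites, cities, threshold_km=50.0)

for site, km in zip(sites, distances):
    print(site, round(km, 1), "km to nearest city")
print("share of sites within 50 km:", share_near)
```

A high share of sites close to cities, combined with few or no sites far from them, is the descriptive signature of the bias that a model-based method then quantifies jointly for several features.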

Results
Based on our systematic search, we recorded 50 species of small non-volant mammals for Paraná state, distributed in two orders, five families, and 32 genera (Table 1 and Supplementary Material S2 to S7). The order Didelphimorphia is represented by 16 species (32%), all belonging to the family Didelphidae. The order Rodentia had 34 species (68%) distributed in four families (Table 1). Cricetidae was the family with the highest richness, with 26 species (52%), followed by Echimyidae with six species (12%) and Caviidae and Sciuridae with one species each (2%). Among the species recorded for Paraná state, 25 (50%) are endemic to the Atlantic Forest biome (Table 1). Regarding conservation status, only Marmosops paulensis is classified as Vulnerable (VU) at the national level, and Wilfredomys oenax is considered Endangered (EN) at the global and national levels and Critically Endangered (CR) at the state level (Table 1).
Five species have unique records for only one of the five vegetation types that occur in Paraná state (Table 1). In the Mixed Ombrophilous Forest (MOF), 94% of the species (S = 47) occurring in the state were recorded, followed by Grassy-Woody Steppe (GWS) with 70% of the total richness (S = 35), Ombrophilous Dense Forest (ODF) with 56% of the species (S = 28) and Seasonal Semideciduous Forest (SSF) with 38% of the total richness (S = 19) for Paraná state (Table 1). According to the dissimilarity analysis (x̄ = 0.50), MOF and GWS are the most similar in species composition (0.74), whereas ODF and SSF showed greater divergence, forming isolated groups (Figure 2).
We compiled 320 georeferenced sites with records of small non-volant mammals for Paraná state from 40 studies since 1978 (Supplementary Material S1). The highest concentration of small mammal studies was in the MOF phytoecological region (172 sites), followed by ODF (75 sites) and SSF (53 sites), and GWS was the phytoecological region with the lowest number of studies (20 sites). About 50.94% (N = 163) of the records were in protected areas, while 49.06% (N = 157) were in non-protected areas (Figure 1). The highest numbers of studies with at least one record of small mammals were in the metropolitan region of Curitiba and western Paraná (Figure 3A), while in mesoregions such as the northwest, western center, pioneer north, and southwest of Paraná, records were non-existent or occurred at low concentration (Figure 3B, C). We found a strong effect of cities on sampling intensity, a moderate effect of roads and airports, and a negligible effect of rivers (Figure 4A). The highest sampling rates (> 50%) are concentrated at distances of less than 50 kilometers from urbanized areas (cities, roads, and airports) (Figure 4B).
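The site counts and percentages reported in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Site counts per phytoecological region, as reported in the text.
sites_by_region = {"MOF": 172, "ODF": 75, "SSF": 53, "GWS": 20}
total_sites = sum(sites_by_region.values())

# Records inside and outside protected areas, as reported in the text.
protected, non_protected = 163, 157


def percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


pct_protected = percent(protected, total_sites)
pct_non_protected = percent(non_protected, total_sites)

print("total sites:", total_sites)
print(f"protected: {pct_protected:.2f}%  non-protected: {pct_non_protected:.2f}%")
for region, n_sites in sites_by_region.items():
    print(f"{region}: {percent(n_sites, total_sites):.1f}% of sites")
```

The regional counts add up to the 320 compiled sites, and the protected and non-protected records together account for the same total, so the reported figures are internally consistent.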

Discussion
The final list of species recorded in Paraná state comprises approximately 37% of all marsupials and small rodents listed for the entire Brazilian Atlantic Forest (Graipel et al. 2017). However, our results indicate that there are regions in Paraná state, such as the southwest, without a single record of small mammals, revealing a Wallacean shortfall of approximately 138.584 km². The Wallacean shortfall is characterized by geographic biases in the information on species distributions, which cause many maps of observed biodiversity to closely resemble maps of survey effort (Hortal et al. 2007, 2015). Our results indicate that, over the past few decades, the concentration of sampling effort for small-bodied mammals has changed in Paraná state (Supplementary Material S1). Until the mid-2000s, studies were concentrated exclusively in the eastern region of Paraná (metropolitan region of Curitiba). Only in the last 20 years have other regions of the state, such as remote areas in SSF, been the target of studies of the small-bodied mammal fauna. However, sampling in peripheral regions is still insufficient when compared to the eastern region of Paraná state.
Although the pattern of our dissimilarity analysis may reflect only the sampling effort bias, this result was partially independent of sampling effort, since GWS was the least sampled phytoecological region across the entire Paraná state but presented the second-highest species richness. Similar studies show that differences in small-bodied mammal assemblages are mainly due to species-specific environmental preferences and habitat use (Pardini et al. 2005, 2010; Grazzini et al. 2015). Assemblage dissimilarity between neighboring phytoecological regions was lower than between regions farther apart. However, this result seems to be only partially true, since the broad transition between SSF and MOF also revealed high dissimilarity. The pattern of dissimilarity seems to be influenced more by the species richness recorded in each location than by the number of studies in any given phytoecological region. Therefore, our results suggest that, owing to sampling bias, it was impossible to determine how small-bodied mammal assemblages are structured across the different phytoecological regions in Paraná state, indicating a virtual absence of species exclusively adapted to only one habitat type. Despite that, studies reveal that species distribution ranges, patterns of diversity distribution, and the assembly of small-bodied mammal species are intrinsically related to the widespread gradients within the phytoecological regions of the Atlantic Forest biome (Pardini & Umetsu 2006), which remain poorly understood in Paraná state because of sampling biases.
Although the Atlantic Forest is an important biome for the diversity of small mammals (Paglia et al. 2012), few regions of this hotspot have been properly sampled and local lists are generally incomplete. Our results showed that the sampling effort for small mammals was geographically biased, mainly by the distance from urbanized areas (cities, roads, and airports). Sampling rates decrease significantly when the distance from any city is larger than 50 km, and are additionally biased by the distance from roads, with high rates of inventories typically performed near the road network. Geographic sampling bias includes the under-sampling of specific geographic regions, whereby accessible areas tend to be more and better sampled than remote and inaccessible areas (Zizka et al. 2020). We acknowledge that some spatial layers of the sampbias R package may be spatially autocorrelated in densely settled regions such as Paraná state, and that the method does not allow layers to be removed (see Zizka et al. 2020). Despite that, our result based on this approach reveals a sampling bias intrinsically linked to the more economically affluent regions of Paraná state, which consequently host the large majority of research centers and universities. This result indicates that future studies need to adopt an agenda that reduces this asymmetry in small-bodied mammal sampling across Paraná. Further, this issue is possibly common to other Brazilian states and deserves investigation at broad scales.
Our results reveal that Cricetidae was the most representative family among the small-bodied mammals. Cricetidae is the most diverse rodent family in South America (Patton et al. 2015), Brazil (Quintela et al. 2020), and the Atlantic Forest (Graipel et al. 2017). In Brazil, all rats and mice of the family Cricetidae are grouped into a single subfamily, Sigmodontinae (Abreu-Jr et al. 2020). A total of 16 species of Cricetidae from Paraná state are endemic to the Atlantic Forest (Graipel et al. 2017). Among these species, only D. sublineatus and O. dasytrichus were recorded exclusively in the ODF phytoecological region, as documented in different studies. D. sublineatus seems to be a species specialized in the continuous habitat of core forest, being absent in areas at the initial succession stage (Umetsu & Pardini 2007; Gatto-Almeida et al. 2016). In contrast, O. dasytrichus is widely distributed across the Ombrophilous Dense Forest along the Brazilian coast (Peçanha et al. 2016). Regarding H. brasiliensis, C. angustidens, W. oenax, P. dasythrix and P. nigrispinus (recorded only in the MOF), prior studies indicate that they are associated with both open areas and forest, occurring in riparian and swampy habitats across the Brazilian Atlantic Forest (Teixeira et al. 2014; Brandão 2015; Patton et al. 2015; Machado 2016). Nevertheless, we emphasize that most of these unique records come from only one study, and for some of these species the distribution range may be much wider than currently known.
Many taxonomic uncertainties (Linnean shortfall) still exist in the family Cricetidae. For example, although SpeciesLink (CRIA 2020) contains a record of Oxymycterus rufus (G. Fischer, 1814) collected in the municipality of General Carneiro, we did not consider the occurrence of this species in our final list, due to the lack of physical holotype material for Paraná state. This genus is one of the main Cricetidae groups that require further studies, due to low intrageneric diversity and few records (Peçanha et al. 2019). Moreover, the phylogeographic aspects and the groups rufus, angularis and dasytrichus still lack more robust taxonomic approaches, mainly phylogenetically based ones (i.e. Darwinian shortfall).
Didelphidae was the second most representative family in species richness. Didelphids are distributed across all Brazilian biomes, from the Amazon to the Pampas, with the greatest diversity of species found in the dense forests of the Amazon and the Atlantic Forest (Melo & Sponchiado 2012). Our list includes 16 species of marsupials, which represent 66% of all species occurring in the Brazilian Atlantic Forest, with M. paulensis, Monodelphis iheringi, and Monodelphis scalops being endemic to this biome (Graipel et al. 2017).
The family Echimyidae includes arboreal and terrestrial spiny rats and bamboo rats, and was the third most representative family on our list (six species). Four species (Phyllomys dasythrix, Phyllomys nigrispinus, Phyllomys sulinus and Trinomys iheringi) are endemic to the Atlantic Forest (Graipel et al. 2017). Echimyids are the most diverse South American Hystricognathi in terms of species richness and variety of body plans (Fabre et al. 2012; Upham & Patterson 2012; Patton et al. 2015). However, the taxonomic history of this family has been chaotic, with several generic names proposed and others abandoned, and the content of the family and its genera is highly unstable (Patton et al. 2015). Thus, information about the distribution and occurrence of species is essential for understanding the geographical limits of the species of this family.
The only representative of the family Sciuridae was Guerlinguetus brasiliensis (Gmelin, 1788). After the review proposed by Vivo and Carmignotto (2015) it was found that for Brazil the species Guerlinguetus aestuans (Linnaeus, 1766) is recognized for the Amazon and G. brasiliensis (Gmelin, 1788) for the east of the Amazon and from the northeast to the south of the Atlantic Forest. Thus, all previous records of G. aestuans as well as Guerlinguetus ingrami (Thomas, 1901), for the state of Paraná were considered to be G. brasiliensis.
For the family Caviidae, we considered only Cavia aperea (Erxleben, 1777) as a small-bodied representative with valid records for Paraná state. Specimens of Cavia from the Southern Region of Brazil identified as Cavia fulgida (Wagler, 1831), such as those collected in the municipalities of Morretes and Roça Nova, 25°28ʹS and 49°01ʹW (see Cherem & Ferigolo 2012), require further evaluation (Graipel et al. 2017), since morphological analyses do not support the characters that differentiate this species from C. aperea (Cherem & Ferigolo 2012).
In Paraná state, there are insufficient data to assess the degree of threat of most of the small-bodied species recorded in our list. Only two species are considered threatened at one or more levels. Locally, M. paulensis is the rarest species, whose populations are restricted to the Ombrophilous Dense Forest at altitudes above 800 m and are strongly affected by the fragmentation and alteration of this vegetation (Bonvicino et al. 2018). The second is W. oenax, a rare species with few records (Patton et al. 2015). In Paraná state, the last record was made in 1981 in the metropolitan region of Curitiba (Bonvicino et al. 2018). W. oenax is the only species considered extinct in Paraná state; it is classified as locally extinct in the Metropolitan Region of Curitiba (Brandão 2015; Christoff 2018). The urban expansion of the city of Curitiba suppressed all the environments where the species could occur (Bonvicino et al. 2018). Although it has a relatively wide range of occurrence, its area of occupancy is extremely small, as the species lives in forest refugia with a scattered distribution (Bonvicino et al. 2018). Therefore, the absence of records may be related to the fact that collection efforts are concentrated in a few areas, leaving other areas where the species is expected to occur unsampled. To address this issue, directed efforts should be made to capture W. oenax, since it remains a poorly collected species.
Our main results therefore stress that studies on the small mammal fauna in Paraná state are influenced by accessibility bias, with a lack of information about species richness and distribution in large portions of the territory, as is the case in the southwest, western center, northwest, and pioneer north of the state. Fauna inventories can be fundamental starting points for species monitoring and conservation programs (Silveira et al. 2010). New research will face the challenge not only of documenting species richness in regions that are still poorly studied or not yet studied, but also of reevaluating the conservation status and population dynamics of the species that persist in densely anthropized areas, contributing to conservation programs for small-sized species. The knowledge gaps and the absence of regionalized lists make conservation and management initiatives difficult, especially at the local level.