An empirical study of the semantic similarity of geospatial prepositions and their senses

ABSTRACT Spatial prepositions have been studied in some detail from multiple disciplinary perspectives. However, neither the semantic similarity of these prepositions, nor the relationships between the multiple senses of different spatial prepositions, are well understood. In an empirical study of 24 spatial prepositions, we identify the degree and nature of semantic similarity and extract senses for three semantically similar groups of prepositions using t-SNE, DBSCAN clustering, and Venn diagrams. We validate the work by manual annotation with another data set. We find nuances in meaning among proximity and adjacency prepositions, such as the use of close to instead of near for pairs of lines, and the importance of proximity over contact for the next to preposition, in contrast to other adjacency prepositions.


Introduction
The locations of objects on the earth are commonly described using natural language in human speech and written documents. Locations may be identified using place names, but may also be described with relative location expressions, consisting of a spatial preposition and a reference object (Herskovits, 1985). For example, the expression I am near the cinema describes the speaker's location (near) relative to a cinema. In this case, the preposition near does not describe a precise, specific location. Near could refer to a location in any direction within a short distance of the cinema. The distance specified by near is vague, and likely to depend on the context (Fu, Jones & Abdelmoty, 2005).
Spatial prepositions are a key element of relative location descriptions, and a clear understanding of their meaning (semantics) and applicability in different contexts is central to the study of location language, but is far from straightforward. In addition to their vagueness, spatial prepositions often have multiple senses and contexts of use (Coventry & Garrod, 2004; Talmy, 1983; Tyler & Evans, 2003). They are known to be among the most difficult kinds of words for second-language learners to use correctly (Chodorow, Gamon & Tetreault, 2010), and spatial prepositions are often used metaphorically to apply to other situations (for example, I am at the end of my tether) (Coventry & Garrod, 2004). In addition to the inherent interest in the study of spatial prepositions for our understanding of human language use, a clear understanding of the semantics of spatial prepositions in different situations is crucial to the development of effective methods for automated georeferencing and generation of natural language location descriptions. Such automation has multiple applications, including natural language spatial querying; georeferencing of social media, blogs, reports, and archives; automated georeferencing of emergency calls; and natural language support for navigation (Al-Olimat, Shalin, Thirunarayan & Sain, 2019; Chen, Hui, Wu, Lang & Li, 2019; Hu & Wang, 2020).
An important element in understanding the semantics of spatial prepositions and their senses is the consideration of semantic similarity. The semantics of concepts are often understood through their relations with other words (Bittner, Donnelly & Winter, 2005;Sánchez, Batet, Isern & Valls, 2012), and if we know which spatial prepositions and/or spatial preposition senses are synonymous or nearly synonymous, we can better understand their meaning. This knowledge can also be applied in automated natural language processing methods, as it enables us to learn correct interpretations from other semantically similar expressions. For example, the restaurant next to the Auckland Harbor Bridge and the restaurant beside the Auckland Harbor Bridge describe the same location, and awareness of this similarity may be useful for machine learning tasks, or for ontology-based information retrieval. Semantic similarity has long been an essential element for many information retrieval problems, including web search (Hliaoutakis, Varelas, Voutsakis, Petrakis & Milios, 2006), and for tools like WordNet (Fellbaum, 1998), which is built on semantic relations.
Researchers have investigated the semantics of spatial prepositions in some detail (e.g. Coventry & Garrod, 2004;Herskovits, 1985;Talmy, 1983;Tyler & Evans, 2003), exploring the different contexts of use, and describing their senses (Coventry, 1999;Coventry & Garrod, 2004;Herskovits, 1986;Talmy, 1983;Tenbrink, 2008;Tyler & Evans, 2003). However, much of this work focusses on spatial prepositions and/or their senses individually, rather than addressing the semantic similarity between them. A notable exception is the work of Logan and Sadler (1996) who conducted human subject experiments to elicit assertions of similarity between pairs of spatial prepositions before comparing these similarities with similarity measurements between representations of spatial templates that model the applicability of the location of a located object relative to a reference object. They found close agreement between the two which served as evidence for their theory of the significance of spatial templates in human apprehension of spatial relations.
A number of formal, mathematical models have been developed to enable rule-based calculation of the physical configurations in which specific spatial relations occur (Clementini, Sharma & Egenhofer, 1994; Freeman, 1975), but these works focus on the definition of spatial relations on a theoretical level, not natural language spatial prepositions, and do not take context into account. Some work has addressed the problem of mapping spatial relations to the natural language prepositions that are used to describe them, and explored the semantic similarity of different spatial prepositions, but these works largely focus on a single contextual situation (road and park, with different spatial relation terms), rather than developing more broadly applicable models, and do not address different senses of spatial prepositions (Du, Wang, Feng & Zhang, 2017; Mark et al., 1995; Mark & Egenhofer, 1994; Schwering, 2007; Shariff, Egenhofer & Mark, 1998). A third strand of investigation of spatial prepositions comes from the computational linguistics (Kelleher & Costello, 2009) and computer science fields, in which methods for automated interpretation of spatial prepositions implement applicability models, or spatial templates (Collell, Van Gool & Moens, 2018; Hall, Jones & Smart, 2015; Hall et al., 2015), as introduced by Logan and Sadler (1996).
In this paper, we address these gaps in the previous literature and pursue two research questions: (1) Which spatial prepositions are semantically similar to each other across a range of geospatial contexts, and what is the degree and nature of that similarity? (2) How are the semantics of similar spatial prepositions and their senses related to each other?
We address these research questions by studying the semantics of 24 spatial relation prepositions and the senses of a subset of 13 of them using empirical data from a human subjects experiment. Our focus is particularly on the geospatial context, in which these spatial prepositions are used to describe situations in geographic, environmental or some cases of vista space, in Montello's typology (Montello, 1993). We asked respondents to match 720 expressions to the diagrams (from a set of 55) that best reflect their meaning. From the analysis of the human subjects data, we make two main contributions. Firstly, we study spatial preposition semantics using both quantitative and qualitative approaches. Using a quantitative approach, we identify groups of semantically similar spatial prepositions using clustering and t-distributed stochastic neighbor embedding (t-SNE), contrasting the groupings of similar prepositions to the typologies and groupings of prepositions that have been proposed thus far. Then, using a qualitative approach (although based on our quantitative data), we explore the aspects of similarity and difference within and between groups of prepositions using extensional maps.
In our second contribution, we explore the senses of three groups of semantically similar spatial prepositions, again using a combination of qualitative and quantitative approaches. We apply density-based clustering (DBSCAN) to the x, y coordinates for each individual expression that were determined using t-SNE. We then examine the clusters using Venn diagrams to isolate individual senses and the relationships between them using a manual approach. We do not attempt to build sense networks that show the ways in which senses may have been abstracted from other senses of a particular preposition like Tyler and Evans (2003) and Lakoff (2008). Our focus is rather on identifying the senses used in geospatial natural language, and the relationships between the senses of different prepositions. We are particularly interested in geospatial natural language because of the applications of semantic similarity work on the problem of georeferencing. An understanding of the different senses used to describe geospatial location in natural language is important because it enables us to distinguish the different configurations in space that may be referred to by a particular preposition (e.g. the preposition across may describe three different spatial configurations as discussed in Section 6.2), and this is essential for accurate georeferencing.
We combine computational and manual methods to explore the semantic similarity of specific prepositions and their senses, and do not attempt to define an automated approach to the extraction of senses.
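The density-based clustering step mentioned above can be illustrated with a minimal, standard-library sketch of the DBSCAN procedure applied to two-dimensional points. The coordinates, the neighbourhood radius `eps` and the density threshold `min_pts` below are invented for illustration and are not the values used in this study; in practice a library implementation would normally be preferred.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # points[i] is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point: expand
    return labels


# Hypothetical 2D coordinates standing in for t-SNE output, one per expression.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Here the two tight groups of three points receive distinct cluster labels and the isolated point is marked as noise, which is the behaviour that makes a density-based method suitable for separating dense groups of expressions from outliers.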
The structure of this paper is as follows. Section 2 describes related work addressing the spatial prepositions and the similarity between them, Section 3 describes the method used for the human subjects experiment, and Section 4 describes the analysis applied to the data to represent the semantics of the spatial prepositions. Section 5 analyses the semantic similarity of the spatial prepositions using qualitative and quantitative methods and discusses the results. Section 6 analyses the senses of three subgroups of geospatial prepositions (13 of the geospatial prepositions) and discusses the results. Section 7 presents future work and draws conclusions.

The semantics of spatial prepositions
The main elements of a relative spatial description are the locatum (the object being located), the relatum (the reference object) and the spatial relation term, which describes the position of the locatum relative to the relatum (Lehmann, 1983; Quirk, Greenbaum, Leech & Svartvik, 1985; Tyler & Evans, 2003). Spatial relation terms are commonly prepositions (Retz-Schmidt, 1988; Talmy, 1983), but may alternatively, or additionally, consist of other parts of speech such as verbs, adverbs, etc. (Kordjamshidi, Van Otterlo & Moens, 2011). Prepositions may specify the geometric configuration of the relatum relative to the locatum, as well as shape, magnitude, and orientation (Dirven, 1993; Talmy, 1983).
Experimental work has demonstrated the importance of context in the selection of spatial prepositions to describe a scene (Coventry, 1999), and their selection and use may be influenced by space schematization, idealization, image schema and abstraction. For example, in the expression a bar inside the hotel, the spatial preposition inside may indicate that the bar is smaller than the hotel, that the hotel has a volume geometry and that both objects have locative characteristics (Herskovits, 1980, 1985; Talmy, 1983; Vorwerg & Rickheit, 1998; Zelinsky-Wibbelt, 1993; Zwarts, 1997), although note that the application of these aspects depends on the specific situation and perspective of the observer. Other aspects that may impact on the semantics of prepositions include frame of reference, which may be intrinsic (object-centered), relative (viewer-centered) or absolute (environment-centered), and the asymmetry, partiteness (degree of subdivision), plexity (state of articulation into equivalent elements), boundedness and dividedness of figure and ground (Talmy, 1975, 1978, 1983). The role of function alongside geometry in selection of prepositions has also been highlighted, with the relative weight of geometry and function varying by preposition (Coventry & Garrod, 2004; Coventry, Prat-Sala & Richards, 2001). While these different aspects of the semantics of spatial prepositions have been studied in some detail, particularly by linguists and cognitive scientists, investigation of the semantic similarity and relatedness between the spatial configurations expressed by spatial prepositions is more limited.

Spatial preposition senses
It is common for words to have multiple meanings in natural language generally, and spatial prepositions are no exception. Several spatial prepositions are known to be used to describe multiple, different spatial configurations (e.g. the preposition on in the cup is on the table and the key is on the chain) (Coventry & Garrod, 2004). These different meanings of the same preposition are referred to as senses. In some cases, the same word is used to refer to objects or concepts that appear to have no semantic connection (homonyms), thus, for example, the word bank can be used to describe a geographic feature or a financial institution (Lakoff, 2008), but in the case of spatial prepositions, senses are commonly thought to be connected through some underlying principle (polysemes) (Richard-Bollans, Álvarez & Cohn, 2020;Rodrigues, Santos, Lopes, Bennett & Oppenheimer, 2020;Tyler & Evans, 2003). Principles of support and location control have been posited as playing this role for the on and in prepositions respectively (Coventry & Garrod, 2004). Lakoff (1987) describes connections between senses as being defined by metaphors and image schemas and shows how multiple senses are connected for the spatial preposition over. Herskovits (1986) cites contiguity, attachment, and support, but also identifies other factors and exceptions in different cases, rather than a single organizing principle.
Senses of spatial prepositions have been studied and enumerated by several researchers (Cooper, 1968; Leech, 1970; Bennett, 1972; Miller & Johnson-Laird, 1976; Talmy, 1983; Lakoff, 2008), and application of the specific senses of prepositions has been shown to be influenced by the surrounding context (Dahlmeier, Ng & Schultz, 2009). In the Preposition Project (PP), Litkowski and Hargraves (2005) define senses based on dictionary definitions. Cannesson and Saint-Dizier (2002) discuss the difference in senses based on the characteristics of the nouns and verbs in the context. Cooper (1968) defines senses based on a semantic marker that is a specification of a concept, defining different concepts and interpretations. To disambiguate senses, Dahlmeier et al. (2009) and Tratz and Hovy (2009) designed classifiers and trained them on annotated data to predict sense annotations for prepositions in test data. While this work has investigated senses, work on the semantic similarity of senses is limited.
In addition to studying distinct senses, researchers have investigated the means by which senses are related to each other (e.g., through metaphor). Herskovits (1986) refers to use types that describe variations on the ideal meaning of a preposition, and the "stretching" of prepositions to apply in different situations. How then, do we define a distinct sense? Tyler and Evans (2003) propose two criteria. Firstly, "it must contain additional meaning not apparent in other senses associated with a particular form" (pp. 42-43). Secondly, "there must be instances of the sense that are context independent, that is, in which the distinct sense could not be inferred from another sense and the context in which it occurs" (p. 43). We contrast two uses of the preposition across to illustrate this point: the bridge goes across the river and they are found in shops across the country. These two expressions meet the first criterion, in that the second sense contains additional meaning (the idea of coverage) relative to the first (more akin to crossing or overlapping). They meet the second criterion in that the difference cannot be explained by context alone, and the two expressions describe entirely different spatial configurations. Tyler and Evans (2003) distinguish uses of a preposition that meet these two criteria, and thus count as distinct senses, as those that are "conventionalised in semantic memory" (p. 45), in contrast to other uses that are the result of inference and "produced on-line for the purposes of understanding" (p. 45). They acknowledge that these criteria are strict, and that there is no agreement on how fine-grained sense distinctions should be.
They also discuss the notion of a primary sense, which they define as the most prototypical, which can be identified through empirical means (from language studies) and linguistic means such as the earliest use; role in the semantic network relative to other senses; inclusion in composite words; participation in contrast sets with other prepositions (e.g. above/below); and ability to be substituted for related senses (Langacker, 1987;Tyler & Evans, 2003). Richard-Bollans et al. (2020) performed two selective and comparative tasks to identify the polysemy network of prepositions. They asked respondents to select prepositions and figures for a given description in a 3D environment respectively. They introduced a baseline model and reviewed how polysemy can play a role in the improvement of this baseline. Their polysemy detection framework is based on the ideal meaning and principled polysemy. Also, the polysemy hierarchy is introduced in this work and defines how closely a polyseme belongs to a specific preposition. Although this work is an important investigation of the senses of prepositions, it only reviewed four spatial prepositions and their senses.
Furthermore, while in previous work, the semantics of many common spatial prepositions and their senses has been explored, limited attention has been given to the semantic similarity of spatial prepositions and senses, except in a narrow range of situations (e.g. road + park).

Semantic similarity
Semantic similarity is a subset of the general idea of semantic relatedness, which includes any kind of relation between concepts. A vast range of different kinds of semantic relations between objects have been defined, including contrasts (e.g. antonyms, incompatibilities); case/syntactic/syntagmatic relations (e.g. agent-action), part-whole relations and causality (Ballatore, Bertolotto & Wilson, 2014;Budanitsky & Hirst, 2006;Chaffin & Herrmann, 1984).
Definitions of semantic similarity vary, with Chaffin and Herrmann (1984) including synonymity (car-auto); attributional similarity (have the same salient attributes); dimensional similarity (smile-laugh) and necessary attribution (lemon-sour). Ballatore et al. (2014) restrict their attention to synonymity, hypernymity or hyponymity (e.g. house is a kind of building) and Miller and Charles (1991) define semantic similarity in terms of substitutability (whether terms can be used in place of one another without changing meaning, or in a weaker form, truth value). Several criticisms of definitions of similarity have been proposed (Goodman, 1972), but the notion of semantic similarity nevertheless plays a key role in many information retrieval and querying tasks.
Much of the work on semantic similarity has focused on objects (e.g. river, mountain), rather than relations, and methods for determining semantic similarity have considered the presence of shared or similar attributes, relations (e.g. analogy) or affordances (Ballatore et al., 2014;Hahn & Chater, 1997;Janowicz & Raubal, 2007;Turney, 2006); proximity in space; correspondence between objects; or number of transformations needed to change one object into another (Goldstone & Son, 2005).
Janowicz, Raubal and Kuhn (2011) provide a comprehensive review of the semantics of similarity, describing a range of approaches to the measurement of similarity in the context of geographic information retrieval, and identifying the benefits of each. Ontology-based approaches, which formally specify the semantics of concepts using their attributes and relations, have been used to identify semantically similar objects, and have been applied to geographic concepts (river, mountain, forest) (Jones, Alani & Tudhope, 2001;Rodríguez & Egenhofer, 2004). Initiatives such as WordNet define a range of different types of relations to assist in the automation of semantic processing (Pedersen, Patwardhan & Michelizzi, 2004). Another common approach to determining the semantic relationship between objects (or types of objects) uses word context in natural language, assuming that similarity in the terms that appear near words in text corpora indicates that they are semantically similar (Agirre et al., 2009;Rubenstein & Goodenough, 1965;Wang et al., 2020). However, text-based approaches more accurately describe semantic relatedness than semantic similarity, as they do not account for situations such as antonymy (Ballatore et al., 2014;Budanitsky & Hirst, 2006;Miller & Charles, 1991).
In this paper, we address the semantic similarity among geospatial prepositions and define semantically similar prepositions as those that are used to describe a similar spatial configuration between the locatum and relatum of the preposition. Our meaning is thus narrower than many of the definitions described above, most closely aligning with synonymity, and excludes broader definitions of similarity, although we do consider hypernymity and hyponymity when discussing the preposition senses (Section 6). The reason for this narrow interpretation is that we are interested in understanding and automating the interpretation and generation of spatial prepositions, in order to understand and describe specific spatial configurations, and this requires synonymity or near-synonymity.

Semantic similarity of spatial prepositions
Despite extensive investigation into the notion of semantic similarity, application of the concept in the context of spatial prepositions is more limited. Several researchers have addressed the semantics of spatial prepositions by attempting to categorize them, indicating some level of semantic similarity or relatedness (e.g. adjacency and proximity) (Bitters, 2009; Coventry & Garrod, 2004; Hois, Tenbrink, Ross & Bateman, 2009; Kemmerer, 2006; Levinson, Meira & Max Planck Institut Fur Psycholingu, 2003; Retz-Schmidt, 1988; Tenbrink, 2008; Zwarts, 2005; Zwarts & Winter, 2000). However, many of these studies cover only a subset of spatial relation terms, and there is little consensus among schemes (e.g. beside can be classified as projective or proximal) (Coventry & Garrod, 2004; Retz-Schmidt, 1988; Zwarts & Winter, 2000). Other classes contain prepositions that are related in some way but are not semantically similar; for example, the class of topological prepositions includes various types of connection or containment (e.g. contains, outside, overlaps) (Kemmerer, 2006; Levinson et al., 2003). Similarly, the class of projective relations contains relations that rely on projected axes (e.g. left, right, in front, behind) (Coventry & Garrod, 2004; Kemmerer, 2006), but these would not be considered semantically similar for many purposes.
A human subject study of similarity between a set of twelve spatial relations was presented by Logan and Sadler (1996). They modeled the applicability of the individual spatial relations with spatial templates, where each location in a template represents the degree of applicability of the location for the locatum relative to the relatum. They used multidimensional scaling to compare similarities asserted by the participants purely between pairs of spatial relation terms and similarities that they obtained by measuring cosine similarity between vectors representing individual spatial templates derived from the data provided by other experiments in the study. They found that there was close correlation between the two types of similarity assessment, with pairs of relations such as above and over; under and below; next to and near occurring close together, while left and right were far apart. The similarity study did not focus on specific senses of the spatial relation terms, though associated experiments did identify the subjects' use of over with the two senses of above and covering. The study concluded that the experiments provided evidence that spatial templates underlie a theory of apprehension of spatial relations.
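The cosine comparison between spatial templates described above can be sketched as follows, treating each template as a flattened vector of applicability ratings. The grid size and all rating values are invented for illustration and are not taken from Logan and Sadler (1996).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    if len(u) != len(v):
        raise ValueError("templates must have the same number of cells")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for an all-zero template")
    return dot / (norm_u * norm_v)


# Hypothetical 3 x 3 templates flattened row by row (top row first).
# The centre cell holds the relatum and is set to 0; values are made up.
above = [7, 9, 7, 2, 0, 2, 1, 1, 1]
over = [6, 9, 6, 3, 0, 3, 1, 1, 1]
below = [1, 1, 1, 2, 0, 2, 7, 9, 7]

sim_above_over = cosine_similarity(above, over)
sim_above_below = cosine_similarity(above, below)
```

With these made-up ratings, the above and over templates are far more similar to each other than above and below, mirroring the kind of pattern reported in the study.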
Theoretical work by Bitters (2009) describes equivalent and synonymous relations for the spatial preposition near, equivalents being near to, nearby, close, close to, and nigh, and synonyms being adjacent, adjacent to, beside, by, alongside, and next to. However, the focus of this work is to identify frequency of use of prepositions with particular feature type pairs, and the semantic equivalence and synonymous relations are not experimentally verified. In a quantitative approach, Schwering (2007) defines a semantic similarity measure between pairs of 15 natural language spatial terms, combining Shariff et al.'s (1998) mapping from natural language terms to topological and metric relations with Mark and Egenhofer's (1994) conceptual neighborhood graphs that define the semantic similarity between topological relations. The measure is tested with a human subjects experiment, identifying three groupings of semantically similar terms (broadly representing containment, intersection and near/avoid/bypass). However, the experiment uses only road and park as locatum and relatum respectively, and does not consider a wider range of situations. Du et al. (2017) develop a random forest classifier to predict the spatial relation from a sketch, also using Shariff et al.'s (1998) parameters. To aid prediction success, they identify sets of five and seven groups of semantically similar prepositions (from a set of 69) using three methods: human judgment with a sketch drawing task; examination of a confusion matrix to identify misclassification (and thus likely similarity); and average distance between vectors of features. Their groups roughly correspond to: starts and ends in; alongness/enclosure; leads up to; containment; crosses/overlaps; goes into; and near. However, their similarity assessment is relatively coarse-grained, with some groups containing a wide range of terms, and is again confined to the road + park context only.
Stock (2008) demonstrates an approach to determine semantic similarity of spatial relations using a restricted natural language called Natural Semantic Metalanguage, but investigates only the intersects, next to, on and contains spatial relation terms in a theoretical treatment.
In the next section, we explain the human subjects experiment that forms the basis of our determination of semantic similarity of geospatial prepositions and their senses, across a range of different contextual situations.

Method
Our method for studying spatial prepositions and their senses has its theoretical foundations in Gärdenfors' conceptual spaces, in which the semantics of an object can be described by its position in a multidimensional vector space whose axes are defined by quality dimensions. The quality dimensions refer to aspects that define the semantics of the objects (e.g. size, shape), and objects are represented as points in the conceptual space. Thus, the location of a point describes a specific value on each of the quality dimensions. Within the multidimensional conceptual space, the distance between the points that represent objects can be used to determine their semantic similarity (Gärdenfors, 2004). In this work, we create a conceptual space in which objects are spatial prepositions and their senses, and we use 55 geometric configuration diagrams, based on Stock's (2014) Geometric Configuration Ontology, to represent each quality dimension (and thus each axis in the conceptual space). Values for each quality dimension for a given preposition are determined by respondents' assessments of how well each geometric configuration diagram fits a range of expressions using the preposition, and in combination locate a given preposition as a point in the conceptual space. We use 30 expressions for each preposition in order to incorporate a range of different contextual situations (explained in section 3.2), as the interpretation of spatial relations is acknowledged to be highly influenced by context (Coventry & Garrod, 2004). By using a range of different expressions for each preposition, we explore the aspects of preposition semantics that are generic in different situations, as well as different preposition senses.
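To make the conceptual space idea concrete, the sketch below locates each preposition as a point whose coordinates are per-diagram ratings, and compares prepositions by the distance between their points. It is an illustration under stated assumptions only: the averaging rule, the use of Euclidean distance, the three diagrams (rather than 55) and all rating values are hypothetical, and the aggregation actually used in this study is described in Section 4.

```python
import math


def preposition_point(ratings_per_expression):
    """Average per-diagram ratings over expressions to obtain one point.

    ratings_per_expression holds one list per expression, each giving a
    rating for every diagram (one diagram per quality dimension).
    """
    n = len(ratings_per_expression)
    dims = len(ratings_per_expression[0])
    return [sum(expr[d] for expr in ratings_per_expression) / n for d in range(dims)]


def euclidean_distance(p, q):
    """Straight-line distance between two points in the conceptual space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical ratings: two expressions per preposition, three diagrams each.
near = preposition_point([[5, 1, 0], [4, 2, 0]])
close_to = preposition_point([[5, 2, 0], [3, 2, 0]])
inside = preposition_point([[0, 0, 5], [0, 1, 4]])

d_near_close = euclidean_distance(near, close_to)
d_near_inside = euclidean_distance(near, inside)
```

In this toy example the points for near and close to lie much closer together than those for near and inside, so a smaller distance is read as greater semantic similarity.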
Like a number of previous researchers (Coventry, 1999; Levinson et al., 2003; Mark & Egenhofer, 1994; Stock & Yousaf, 2018), we use a diagram matching task, in which respondents select diagrams that match each expression and rate the degree of agreement on a Likert scale. While grouping and pairwise comparison tasks are common alternatives to diagram matching methods for determining semantic similarity (e.g. Chaffin & Herrmann, 1984; Mark & Egenhofer, 1994; Miller & Charles, 1991), we consider them less useful for gaining a clear understanding of the specific meanings of spatial prepositions and their senses because we are interested in exploring the use of prepositions in different contexts, and in the range of different ways that prepositions are used, aspects that can be highlighted through the diagram matching approach. Drawing tasks have also been used in the study of spatial prepositions, but unlike many studies that focus on a single expression (for example, the road crosses the park), we study prepositions across many different contexts, and we considered that it would be difficult to obtain comparable diagrams across such a range of situations, when the experiment is not based on a limited number of expressions. Employing the results of our diagram matching experiment, we apply several methods to determine semantic similarity, including clustering and t-distributed stochastic neighbor embedding (t-SNE) (Section 5.1), as well as qualitative methods (Section 5.2).

Selection of geospatial prepositions
We investigate the semantics of 24 frequently used spatial prepositions. These prepositions were identified by extracting 890 geospatial expressions from the Geograph and Foursquare websites. Geograph aims to crowd-source geographically representative photos and associated captions, descriptions, and locations for every square kilometer of Great Britain and Ireland. Foursquare is a social networking application and website that contains attractions and user reviews. We extracted descriptions and comments from both sites in the central London area (specifically, the TQ 3080 map tile on the British National Grid) and, using manual examination, we excluded any descriptions that did not include place names or location information, resulting in 890 geospatial descriptions. From these descriptions, we manually identified geospatial prepositions as those that described either the location or movement of a geographic object/place. For instance, we excluded the expression a cat behind the table as it does not refer to a named or geographic place, but we included the bridge over the Thames River. We excluded the spatial prepositions to and from because their interpretations are based on the verbs that they are collocated with (e.g. the road goes to the church; the ferry came to the island), and ternary prepositions (e.g. between). This process resulted in 700 expressions with 24 spatial prepositions. The final list consisted of twenty-one single-word prepositions (above, across, along, alongside, around, at, behind, beside, beyond, by, in, inside, near, off, on, opposite, over, outside, past, through and toward) and three prepositional phrases (adjacent to, close to, and next to). Figure 1 shows the frequency of expressions for each preposition.

Selection of expressions
Having selected 24 frequently appearing geospatial prepositions, we randomly selected 30 expressions for each preposition from two other data sets (Table 1):
• The Manaaki Whenua - Landcare Research Specimen Collection data, consisting of four different data sets (soils, 3 flora, 4 terrestrial invertebrates, 5 and fungi 6 ), including specimen types and collection locations in the form of natural language descriptions.
• The Nottingham Corpus of Geospatial Language 7 (NCGL) (Stock et al., 2013), consisting of around 11,000 geospatial expressions collected from 46 websites with content such as news, travel, tourism, etc.
From these expressions, we manually extracted the relatum and locatum for each preposition in each of the 720 expressions. Many of the expressions were complex, involving other elements (e.g. adjectives, adverbs), but these additional elements were disregarded. Expressions with compound prepositions (e.g. across from) were excluded, with the exception of adjacent to, close to and next to, which are not typically used to describe spatial location without the to preposition appended. Specific place names were replaced with the relevant geographic feature type to avoid bias specific to particular locations. For instance, the first example in Table 1 becomes "beside the lake, 1 km north of stream."

Data collection
We collected assessments of the semantics of each expression from respondents using Amazon Mechanical Turk, 8 a platform for crowdsourcing responses to Human Intelligence Tasks (HITs) that has been used in a range of research projects (Mason & Suri, 2012;Schnoebelen & Kuperman, 2010). We created a separate HIT for each of our 720 expressions, and Mechanical Turk Workers were paid US$0.1 to complete each HIT. Workers could complete as many or as few HITs as they liked but could only complete a given HIT once. We collected 30 responses (from 30 different respondents) for each of our 720 expressions (30 expressions for each of the 24 spatial prepositions), in order to ensure that the results were not biased by responses of one, or a small number of respondents.
Each HIT page contained introductory instructions (see S4), an explanation of spatial prepositions, and an ethical statement. The research was conducted in accordance with the <anonymous university> Code of Ethical Conduct for Research, Teaching and Evaluations involving Human Participants, and low-risk ethical approval was obtained from <anonymous university> Ethics Committee prior to the commencement of data collection. 9 For each expression, we asked respondents to select up to three diagrams that best reflected the expression from a set of 55 (see Figure 2) derived from the Geometric Configuration Ontology (GCO) (Stock, 2014). For some of the concepts described in the GCO, we included more than one diagram to reflect different geometry types (for example, different diagrams to show overlapping line or polygon geometries, as in the case of diagrams 32 and 33, which both indicate an overlapping configuration, but with different geometry types), in line with the two basic models of representation of place as regions and vectors (Zwarts, 2017). The GCO provides a comprehensive ontology of different geometry configurations extracted from the literature and text analysis, and includes topology, distance, linear orientation, horizontal projective orientation, direction, adjacency, collocation, and object parthood. The diagrams depict the locatum (in red) and the relatum (in blue) and include spatial relations that are relative to the position of the observer (projective, egocentric frame of reference) (Diagrams 1-10) and cardinal direction relations (absolute frame of reference) (Diagrams 11-26). The observer was represented by a stick figure while the direction of North was represented by an arrow labeled with the word "North." Several diagrams reflect multiple kinds of spatial relations (e.g. Diagram 53 depicts the topological contains relation and a parthood center of relation).
The diagrams intentionally omit contextual information (e.g. scale, location of other objects in the scene). This is because our goal is to focus on the semantics of spatial relations and their senses that occur across a range of different situations, relata and locata, rather than through a single relatum-locatum pair (Mark & Egenhofer, 1994; Shariff et al., 1998), or a specific aspect of context (e.g. Tenbrink, 2008). We acknowledge that this approach excludes a deeper level of understanding of contextual aspects of spatial preposition semantics, including for example the influence of object size and type on the use of proximity prepositions, and the importance of function in the use of spatial prepositions (Coventry, 1999; Coventry et al., 2001), but leave this for later work. Our focus is on qualitative spatial configurations, and the diagrams used do not capture quantitative variations between, for example, different degrees of proximity, except very approximately (e.g. diagram 36 can be used to indicate very close, touching or almost touching objects, while diagram 31 indicates greater separation). Consequently, the measures of similarity resulting from our analysis may group together prepositions that do not differ in the types of configurations they describe but do differ in quantitative configuration (thus making prepositions appear more similar than they might be if quantitative and contextual aspects are considered). Nevertheless, our analysis (see Section 4) does still show clear differences among prepositions that are often thought to be synonymous, particularly in their senses.
We asked respondents to select at least one and no more than 3 diagrams for each expression (in case a single diagram did not exactly reflect the expression and additional diagrams were needed), and to specify closeness of match from a half-Likert 10 scale with options: "agree somewhat," "agree" and "strongly agree" (Stock & Yousaf, 2018). We require the selection of at least one diagram for a given expression in order to force some decision and avoid null responses, which would be difficult to analyze (although could be the subject of another study). We recognize that the case in which no diagram is a good fit is possible, and cater for this using the Likert scale, which allows respondents to identify whether they strongly or weakly agree with a diagram. The limit of three is designed to ensure that respondents do not select every diagram that could fit, but are required to be selective in their mapping of the expression to the diagram/s. To remove bias created by the order of the diagrams in the experimental stimulus, we produced 100 different diagram matrices, each containing the same diagrams, but in different orders (changing the order of diagrams in Figure 2). Each of the 720 HITs was sequentially allocated one of the 100 diagram matrices.
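The ordering control described above can be sketched as follows. This is a minimal illustration only: it assumes that "sequentially allocated" means cycling through the matrices in order, and the function names and random seed are hypothetical, not taken from the study.

```python
import random

NUM_DIAGRAMS = 55   # diagrams in the stimulus set
NUM_MATRICES = 100  # orderings of the same diagrams
NUM_HITS = 720      # one HIT per expression


def build_matrices(num_matrices, num_diagrams, seed=0):
    """Return a list of diagram orderings, each a permutation of 1..num_diagrams.

    The seed is an assumption, used here only to make the sketch reproducible.
    Random shuffles are not guaranteed to differ, although a collision among
    100 permutations of 55 items is extremely unlikely.
    """
    rng = random.Random(seed)
    base = list(range(1, num_diagrams + 1))
    matrices = []
    for _ in range(num_matrices):
        order = base[:]
        rng.shuffle(order)
        matrices.append(order)
    return matrices


def allocate(num_hits, num_matrices):
    """Assign a matrix index to each HIT in turn, cycling through the matrices."""
    return [hit % num_matrices for hit in range(num_hits)]


matrices = build_matrices(NUM_MATRICES, NUM_DIAGRAMS)
allocation = allocate(NUM_HITS, NUM_MATRICES)
```

Under this assumed round-robin scheme, each ordering is used for either seven or eight of the 720 HITs.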
The experiment was restricted to fluent English speakers through self-selection (workers were asked to proceed only if they met this criterion, as shown in Supplementary material 4 (S4)), since prepositions (and not least spatial prepositions) are one of the more difficult aspects of English for learners to acquire (Bitchener, Young & Cameron, 2005; De Felice & Pulman, 2008). Rather than relying on self-selection, it would be possible to use Mechanical Turk Qualifications to validate language skills before allowing respondents to complete the experiment, but this was not done in this work. It is possible that the results may have been influenced by workers who completed the task even though they were not fluent, either in order to receive the payment or because they over-estimated their English-speaking ability. However, we consider this influence to be minimal, as workers were only paid if they completed the task fully, so we anticipate that this would dissuade those who were not genuine. Furthermore, we expect that the analysis of 30 responses per expression that focuses on majority rather than individual selections (see Section 4) would reduce the influence of spurious responses.

Analysis
From the 21,600 HITs (30 responses x 30 expressions x 24 spatial prepositions), 956 blank HITs were submitted. It is likely that blank responses result from workers looking at the task and then deciding not to proceed, or hoping to get payment without completing the task (for example, the HITs can be set up to auto-accept any response after they have not been manually verified in a given time). We manually rejected these blank responses (Mechanical Turk provides the option to accept or reject responses before payment) and returned the rejected expressions into the pool repeatedly until valid responses were received for all HITs. The total number of respondents was 921 and the majority completed fewer than 21 HITs. Figure 3 shows the final number of respondents and HITs completed by them.
We calculated a total agreement score for each expression-diagram combination using the following formula (Equation 1):
$$\text{total agreement score} = \frac{\sum_{k=1}^{n} \text{response}_k \times \text{weight}_k}{n} \quad (1)$$
We assigned a weight to each response: 0.5 for "agree somewhat," 0.75 for "agree" and 1 for "strongly agree," applying the weights used in Stock and Yousaf (2018), which are designed so that the strongest response has a value of 1, and weaker responses are reduced accordingly. This ensures that if a diagram were selected by every respondent with "strongly agree," the expression-diagram pair would have a total agreement score of 1. Here, $\text{response}_k$ specifies an individual response and has a value of 1 (for each respondent who selected the diagram concerned), $\text{weight}_k$ is the weight of that response, and $n$ represents the total number of responses for the given expression. We produced a 55-dimension vector (one number for each diagram representing the average weighted agreement across all respondents with the diagram for that expression) for each expression. We refer to these vectors as expression diagram vectors.
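The weighting and averaging described above can be illustrated with a short sketch. The data layout (one dictionary of selections per respondent) and the function name are assumptions made for illustration; only the three weights and the 55 diagrams come from the text.

```python
# Weights for the three response options, as stated in the text.
WEIGHTS = {"agree somewhat": 0.5, "agree": 0.75, "strongly agree": 1.0}
NUM_DIAGRAMS = 55


def expression_diagram_vector(responses, num_diagrams=NUM_DIAGRAMS):
    """Compute the total agreement score of every diagram for one expression.

    responses: one dict per respondent, mapping a diagram number (1-based)
    to the rating label chosen for it (hypothetical layout).
    Returns a list with one score per diagram, each in the range 0..1.
    """
    n = len(responses)  # total number of responses for the expression
    totals = [0.0] * num_diagrams
    for selections in responses:
        for diagram, rating in selections.items():
            totals[diagram - 1] += WEIGHTS[rating]
    return [total / n for total in totals]


# Toy example with four respondents instead of 30.
toy_responses = [
    {36: "strongly agree"},
    {36: "strongly agree"},
    {36: "agree", 31: "agree somewhat"},
    {31: "agree"},
]
toy_vector = expression_diagram_vector(toy_responses)
```

In this toy case Diagram 36 scores (1 + 1 + 0.75) / 4 = 0.6875 and Diagram 31 scores (0.5 + 0.75) / 4 = 0.3125, while all other diagrams score 0.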
Previous studies have shown that although Mechanical Turk can be a cheap and fast platform for collecting data, sometimes the quality of data may not be at the level that requesters expect (Mason & Suri, 2012; Schnoebelen & Kuperman, 2010). When computing the total agreement score we average across all 30 responses for a given expression in order to reduce the effects of outliers amongst respondents, and we further removed noise from the vectors by considering only average values that were equal to or greater than 0.1 (all average values for a dimension below 0.1 were set to zero). Very low numbers for a given diagram in an expression diagram vector suggest that only one or two people selected the diagram, and therefore it does not reflect a common view across all, or even most, respondents. Our focus in this work was on the majority understanding of the semantics of prepositions and their senses. Thus, the threshold value of 0.1 provided a much clearer picture of clustering than using all data, while ensuring that, within the data removed below this value, no more than two respondents (out of 30 per expression) could have expressed a strongly agree preference for the corresponding diagram. Regarding the process of selecting the threshold, we considered several values between 0.1 and 0.3 and found that 0.1 provided the clearer clustering pattern in combination with what we regard as acceptable data loss. With the 0.1 threshold, the majority of data values that are removed (more than 75%) correspond to no more than one respondent asserting a preference for the diagram at any level of confidence (i.e. agree somewhat, agree or strongly agree). At the strongest levels of preference for expression-diagram pairs, for all data that were filtered out at the 0.1 threshold, we had two cases of two strongly agree and one agree, and fewer than five cases corresponding to one strongly agree and two agree. When considering larger threshold values, the numbers of agree and strongly agree responses that would be lost increased (including cases of three strongly agree for the 0.15 threshold), which we considered might result in unacceptable data loss.
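A brief sketch of the noise filter may help to make the threshold arithmetic concrete. With 30 responses, a score of 0.1 corresponds to a summed weight of 3.0, so two strongly agree plus one agree (2.75) falls below the threshold, whereas three strongly agree (3.0) is retained. The function name is hypothetical.

```python
THRESHOLD = 0.1  # value reported in the text


def remove_noise(vector, threshold=THRESHOLD):
    """Set every score below the threshold to zero and keep the rest unchanged."""
    return [score if score >= threshold else 0.0 for score in vector]


n = 30  # responses per expression
two_strong_one_agree = (1.0 + 1.0 + 0.75) / n  # about 0.092, below 0.1
three_strong = (1.0 + 1.0 + 1.0) / n           # exactly 0.1, retained at 0.1

filtered = remove_noise([two_strong_one_agree, three_strong, 0.5])
filtered_strict = remove_noise([three_strong], threshold=0.15)
```

Raising the threshold to 0.15 in the last line discards the three strongly agree case, which matches the kind of data loss described above.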
We then produced a single diagram vector for each spatial preposition by calculating an average score for each diagram across all 30 expressions that contained the spatial preposition. We refer to these vectors as preposition diagram vectors. The difference between expression diagram vector and preposition diagram vector is shown in Figure 4.
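The averaging step can be sketched in a few lines. The function name is hypothetical, and the toy vectors have two dimensions rather than 55.

```python
def preposition_diagram_vector(expression_vectors):
    """Average the expression diagram vectors of one preposition, diagram by diagram.

    expression_vectors: list of equal-length score lists, one per expression
    (30 per preposition in the study).
    """
    n = len(expression_vectors)
    return [sum(column) / n for column in zip(*expression_vectors)]


# Toy example: two expressions, two diagrams.
toy_preposition_vector = preposition_diagram_vector([[0.5, 0.0], [1.0, 0.5]])
```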

Semantic similarity of spatial prepositions
In this Section, we use the results from our experiment to explore the semantics of spatial prepositions and their similarity. We firstly apply quantitative techniques (clustering and t-distributed stochastic neighbor embedding, or t-SNE) to identify groupings of spatial prepositions and discuss the results from this process. We then study the prepositions using qualitative methods, with an extensional map.

Quantitative analysis, results, and discussion
We apply clustering to the preposition diagram vectors in order to identify groups of semantically similar spatial prepositions, following the assumption that respondents will select similar diagrams for spatial prepositions that have similar meaning. We applied several different clustering configurations in order to identify the dominant groupings robustly, as follows:
• We applied two clustering techniques: Agglomerative Hierarchical Clustering (AHC) and K-means (Hartigan & Wong, 1979; Johnson, 1967).
• We applied the techniques to both the preposition diagram vectors and a modified form of the vectors, in which only the top three diagram values in each preposition diagram vector were retained, and all other values were set to zero (this eliminates all but the most dominant selections), because the top three values show the most frequently chosen diagrams for that specific expression, and thus carry more information than other small values that may be outliers.
We then calculated the co-occurrence between pairs of prepositions as the percentage of configurations in which they appear in the same cluster, across all of these different clustering configurations (20 in total: 5 × 2 × 2) in order to ensure that our groupings of semantically similar prepositions are not influenced by a particular clustering configuration, using the following formula (Equation 2):
$$\text{co-occurrence}_{x,y} = \frac{\text{number of configurations in which } x \text{ and } y \text{ are in the same cluster}}{\text{total number of configurations}} \quad (2)$$
We created a co-occurrence matrix representing the pairwise co-occurrence of the prepositions and plot this data on a t-SNE plot (Figure 5). T-SNE plots are able to express the similarity between multi-dimensional non-linear vectors in two-dimensional space (Maaten & Hinton, 2008).
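The co-occurrence calculation can be illustrated independently of the clustering algorithms themselves. The sketch below assumes that each clustering configuration has already been run and is available as a mapping from preposition to cluster label; that layout and the function name are hypothetical. It returns a fraction between 0 and 1, which can be multiplied by 100 to give a percentage.

```python
from itertools import combinations


def co_occurrence(configurations):
    """Fraction of clustering configurations in which each pair shares a cluster.

    configurations: list of dicts, one per clustering configuration, each
    mapping a preposition to its cluster label (hypothetical layout).
    Returns a dict keyed by alphabetically ordered preposition pairs.
    """
    prepositions = sorted(configurations[0])
    total = len(configurations)
    matrix = {}
    for x, y in combinations(prepositions, 2):
        same = sum(1 for labels in configurations if labels[x] == labels[y])
        matrix[(x, y)] = same / total
    return matrix


# Toy example: four configurations and three prepositions
# (the study used 20 configurations and 24 prepositions).
toy_configurations = [
    {"in": 0, "inside": 0, "near": 1},
    {"in": 2, "inside": 2, "near": 2},
    {"in": 0, "inside": 0, "near": 3},
    {"in": 1, "inside": 1, "near": 0},
]
toy_matrix = co_occurrence(toy_configurations)
```

In the toy data, in and inside share a cluster in all four configurations (co-occurrence 1.0), while in and near do so in only one of four (0.25).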
The t-SNE plot shows several interesting groupings. Unsurprisingly, in and inside are grouped together. While there are differences in the way these prepositions are used (e.g. I live in the street makes sense, while I live inside the street is unlikely), there are significant overlaps that are highlighted once the nuances apparent in the multidimensional conceptual space are reduced to the t-SNE two-dimensional space.
Several adjacency and proximity prepositions are grouped together (next to, near, adjacent to), while beside and close to are together, but some distance from the other proximity and adjacency relations. The groupings do not reflect the distinction between proximity (near, close to) and adjacency (beside, next to, adjacent to) that has been identified in preposition typologies (Bitters, 2009). While Zwarts and Winter (2000) and Retz-Schmidt (1988) class beside as a projective relation, Coventry and Garrod (2004) class it as a proximity relation, consistent with its position in Figure 5 with the close to relation. Interestingly, outside is grouped with next to, near and adjacent to, although it is not commonly presented as either an adjacency or proximity relation, but rather a topological or containment relation (in that it would typically be considered to refer to the situation in which the locatum is external to the containing relatum) (Bitters, 2009). It should be noted that our analysis focused only on qualitative similarities and differences (since our diagrams do not depict scale and thus do not reflect different distances between objects), and it is possible that quantitative analysis would reveal different groupings. Nevertheless, our qualitative analysis reveals interesting patterns in the semantics of the proximity prepositions, including the differences in their use according to feature type and clarification in the relationship between proximal and adjacency prepositions (see Sections 5.2 and 6.3).
Figure 5. T-SNE plot of preposition co-occurrence matrix.

SPATIAL COGNITION & COMPUTATION
Past, beyond, off and by are grouped together in the t-SNE plot. While by might be considered more akin to the adjacency and/or proximity relations, the similarity between the four prepositions is further confirmed by the Pearson Product Moment Correlation Coefficients between the preposition diagram vectors as shown in Figure 6 (see Appendix A for the full matrix of correlation coefficients), in which the similarity of by with off and past is 0.95 and with beyond is 0.7. Across, through and over are close together in the plot. Although the relations expressed by these prepositions might vary if viewed in three-dimensional space, because our diagrams only depict plan view, there is significant overlap in the diagrams selected.
Figure 6. Shaded similarity matrix of the prepositions, from darkest (highest similarity) to lightest (low similarity).
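For readers who wish to reproduce this kind of comparison, the Pearson product-moment correlation between two preposition diagram vectors can be computed as in the following sketch. The function name and toy vectors are illustrative only and do not reproduce the values in Appendix A.

```python
from math import sqrt


def pearson(u, v):
    """Pearson product-moment correlation coefficient of two equal-length vectors.

    Raises ZeroDivisionError if either vector is constant, because the
    coefficient is undefined in that case.
    """
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    covariance = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    variance_u = sum((a - mean_u) ** 2 for a in u)
    variance_v = sum((b - mean_v) ** 2 for b in v)
    return covariance / sqrt(variance_u * variance_v)


# Toy vectors: the second is a scaled copy of the first, the third is reversed.
r_same = pearson([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
r_opposite = pearson([0.1, 0.2, 0.3], [0.3, 0.2, 0.1])
```

Vectors with the same profile give a coefficient close to 1 regardless of scale, while reversed profiles give a coefficient close to -1.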
Above, behind, and opposite are also close to each other in the plot, even though they appear to be semantically very different. As for the across, through and over group, this grouping may be affected by the absence of the three-dimensional view in our diagrams, and the tendency for respondents to select diagrams in which one object is above the other in the plan view, even though the diagrams are not intended to depict the vertical dimension. Thus Diagram 2 was highly scored for both above and behind. While it is intended to reflect the behind relation, given the position of the objects relative to the observer, some respondents also applied it to the above preposition. We consider the specific diagrams selected for each preposition and explore these aspects in more detail in the next Section.

Qualitative analysis, results, and discussion
Following Levinson et al. (2003), we present an extensional map (Figure 7) of the three diagrams with the highest agreement for each preposition. Extensional maps are used to highlight the findings of diagram matching experiments and depict groups of diagrams that are most frequently selected for a given linguistic expression (in our case prepositions). Diagrams are positioned on the extensional map in a way that facilitates display of groups of similar diagrams. Thus, diagrams used for the same preposition are grouped together on the map and enclosed by a boundary, each of which is labeled in Figure 7. Most importantly for our work, the diagram enables comparison of the semantics of individual prepositions, illustrating the associated geometric configuration and the overlaps between the selection of particular geometric configurations to represent different prepositions. The extensional map of our experimental results further elucidates some of the groupings shown in the t-SNE plot. It is important to note that while the t-SNE plot incorporates the full set of average diagram vectors for a preposition, and position on the plot can be influenced by diagrams that have lower agreement scores, the extensional map only shows the three most highly scored diagrams, so gives a more general view of the similarities of the prepositions. Nevertheless, it highlights the explicit distinctions between those views, which is informative.
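The information underlying such a map (the three highest scoring diagrams per preposition, and the diagrams shared between prepositions) can be derived from the preposition diagram vectors roughly as follows. The function names and the five-dimensional toy vectors are assumptions for illustration; the toy scores are not experimental results.

```python
from itertools import combinations


def top_diagrams(vector, k=3):
    """Return the 1-based numbers of the k highest scoring diagrams."""
    ranked = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    return {i + 1 for i in ranked[:k]}


def shared_diagrams(vectors, k=3):
    """Map each alphabetically ordered preposition pair to the top-k diagrams they share."""
    tops = {prep: top_diagrams(vec, k) for prep, vec in vectors.items()}
    return {(a, b): tops[a] & tops[b] for a, b in combinations(sorted(tops), 2)}


# Toy vectors over five diagrams (the study used 55).
toy_vectors = {
    "in": [0.1, 0.6, 0.5, 0.4, 0.0],
    "at": [0.0, 0.5, 0.3, 0.4, 0.1],
    "near": [0.7, 0.0, 0.1, 0.2, 0.6],
}
overlaps = shared_diagrams(toy_vectors)
```

Here the toy in and at vectors share all three of their top diagrams, whereas each shares only one diagram with near.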
In the extensional map, in, inside and at all share the same highest scoring diagrams (Diagrams 26, 40 and 53): those that indicate containment, with greater or lesser degree of centrality in the relatum.
As in the t-SNE plot, the proximity and adjacency relations in the extensional map form two distinct groups, but these do not coincide with the distinction between proximity and adjacency. Beside, by and close to all have the same three highest scoring diagrams, one of which indicates two objects touching, and the other two of which depict a linear object near a polygon object. In contrast, all three highest scoring diagrams for adjacent to, near and next to show two polygons, in one case touching (which overlaps with those for the beside, by and close to group) and the other two near each other. This suggests that beside, by and close to are more appropriate for linear objects than polygonal ones, where adjacent to, near and next to might be preferred. Outside, which was grouped with adjacent to, near and next to in the t-SNE plot, shares two highly scored diagrams with each of the other two groups, and those groups include linear objects as well as touching and near polygons, indicating more general semantics.
Past, beyond, off and by, which are grouped together in the t-SNE plot, all share the same two highly scored diagrams (Diagrams 29 and 30), as well as one other which they do not share (past: Diagram 35, beyond: Diagram 2, off: Diagram 38 and by: Diagram 36). They are the same two diagrams that are included in the top three for beside and close to: a polygon and a linear object near each other. These prepositions thus clearly have some shared semantics, while also some additional aspects of meaning that are independent of the others. In the case of beyond, this additional diagram is a projective relation, indicating one object behind another, relative to the observer, and is also shared with behind. Past includes a diagram showing a linear locatum over a polygonal relatum, and all three of its diagrams combine linear and polygon objects. Off also includes a third diagram (38) involving linear and polygon objects, with the linear locatum outside and leading up to the edge of the polygonal relatum.
Across, through and over were very closely clustered in the t-SNE plot, while in the extensional map, across and through share the same three highly scored diagrams and over shares two of those, with one different diagram. Across, through and over all share a diagram involving two crossing lines, as well as one in which a linear locatum crosses a polygonal relatum. Across and through (but not over) also share a diagram (33) with a linear locatum going into and stopping in the middle of a polygonal relatum. The extra diagram that is highly scored for over involves two overlaid lines, one inside the other. It should be noted that our diagrams are only in plan view, so three-dimensional diagrams are not available, even though they may be more suitable for prepositions like over, and this may affect the results.
The above, behind, and opposite group from the t-SNE plot is not visible in the extensional map, with the three prepositions only sharing one diagram. Above and behind share two diagrams, but it is possible that this is because of a mistaken identification of these diagrams as a view from the side, rather than from above, in the case of the above preposition. All three of the diagrams selected for above show the locatum object geometrically above the relatum object in the diagram (i.e. further up the page), but when the diagrams are interpreted in plan view, they do not reflect the above relation. Instead, in plan view, diagrams that show one object inside another may be considered the most accurate depiction of the above relation.
The above, behind, and opposite prepositions also reveal the tendency for respondents to ignore the intended meaning of the north arrow in the diagrams. Diagrams 11 to 26 include a north arrow and were intended to show cardinal direction spatial relations (north of, south of, etc.) from the original Geometric Configuration Ontology (Stock, 2014). Cardinal directions were not included in our set of 24 spatial relations, although a small number of our expressions (14 expressions) did include cardinal direction references in other parts of speech (e.g. a kitchen on the north side of the town). In any case, respondents appeared to ignore the north arrows, and see the diagrams as if only the objects themselves appeared, in contrast to Diagrams 1 to 10, which included an observer to reflect spatial relations that were relative to the observer's position (the projective relations), for which the selection of diagrams did appear to take the existence of an observer into account.
It is clear from the extensional maps that for some spatial prepositions, the three most highly scored diagrams include different kinds of spatial configurations. For example, the top three diagrams for the around preposition include one in which the entire locatum covers the relatum, and another in which it is only around the edges of the relatum. Some of these selections of different diagrams suggest different senses of the spatial preposition. In the next section, we explore preposition senses in more detail.

Geospatial preposition senses
In this section, we focus on three groups of prepositions that were shown to be semantically similar in the previous section (Figure 5):
• across, through and over;
• proximity and adjacency: beside, close to, near, next to, outside and adjacent to; and
• past, off, beyond and by.
Again, we combine quantitative and qualitative approaches to study individual spatial prepositions and their senses, using the diagram vectors and applying Tyler and Evans' (2003) criteria for identification of distinct senses. Within each group, we present our findings, validating them with explanation and examples in the tradition of Talmy (1983), Tyler and Evans (2003) and Herskovits (1985), and relating them to the previous literature. We then further validate our findings using manual classification. Extracting the senses of spatial prepositions is important since such prepositions are often ambiguous and overloaded in meaning. For example, the preposition across can refer to at least three different spatial configurations (see Section 6.2), and methods to automate the interpretations of spatial prepositions will be limited in their accuracy if sense differences are not taken into account. Furthermore, as we will show in this section, some geospatial prepositions like across and over share a similar sense. However, this is not their only sense, so some uses of over may be semantically similar to some uses of across, but both prepositions are used in other, dissimilar ways. In addition, over has been reviewed in other works (Logan & Sadler, 1996; Tyler & Evans, 2003), and it has a sense in common with above (higher than).

Qualitative and quantitative analysis method
In this section, we interpret the prepositions, their similarity, and their senses using both qualitative and quantitative means. We first apply t-SNE to the expression diagram vectors (in contrast to the preposition diagram vectors that were used in Section 5), reducing them to x, y coordinates in two-dimensional space. We then apply density-based clustering (DBSCAN) (Ester, Kriegel, Sander & Xu, 1996) to the t-SNE coordinates for the expressions for each spatial preposition, to identify clusters of expressions that have similar agreement score profiles across the 55 diagrams. We used DBSCAN as it does not require the number of clusters to be specified as input but rather identifies natural groupings, and because it considers points that are not close to other points (which in our case refer to expressions), to be noise rather than forcing them to be included in a cluster.
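To illustrate the two properties of DBSCAN mentioned above (no preset number of clusters, and isolated points treated as noise), the following is a minimal standard-library sketch of the algorithm applied to two-dimensional points. The toy coordinates stand in for t-SNE output, and the `eps` and `min_pts` values are arbitrary assumptions; the parameter settings used in the study are not stated here, and a library implementation would normally be preferred in practice.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps of it.
    """
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be reclaimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # core point: keep expanding
    return labels


# Toy coordinates: two dense groups and one isolated point.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
toy_labels = dbscan(toy_points, eps=1.5, min_pts=3)
```

With these assumed parameters the two dense groups become separate clusters and the isolated point is labeled as noise, without the number of clusters being supplied.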
We consider each of the clusters identified by DBSCAN a candidate sense for the preposition concerned. We perform manual, qualitative analysis on these clusters using Venn diagrams for each preposition to study the semantics of prepositions and identify their senses. We explain in the following text the method for extracting senses from the Venn diagrams and provide a detailed worked example for the across preposition (see Figure 8). In addition, we include the Venn diagrams for all of the 13 prepositions for which we identified senses in Supplementary Materials S1, S2, S3.
The Venn diagrams 11 allow us to identify which aspects of the semantics of the prepositions (represented by the highly scored diagrams) are shared across all senses (in the section of the Venn diagram where the clusters intersect). The Venn diagrams also clearly identify the aspects of the semantics of each cluster that are distinct to that cluster, as required by Tyler and Evans' (2003) first criterion for a distinct sense (see Section 2.4). To address Tyler and Evans' second criterion, which specifies that instances of a sense must not be capable of being inferred from the context in which they appear, we consider three kinds of similarity between diagrams that may invalidate a given cluster as a separate sense (see Supplementary Materials S1, S2, S3 for examples):
• semantic similarity above a threshold, determined from the semantic similarity matrix in Appendix A;
• representations of the same relation with different geometric types, determined from our mapping from the GCO ontology to diagrams, in which some GCO concepts were mapped to multiple diagrams with different geometry types; and
• representations of the same relation with different plurality (one diagram depicts a single object while another depicts multiple objects, but the diagrams are otherwise identical).
While distinct senses may be invalidated by other kinds of similarity than these three (since Tyler and Evans' second criterion is not clearly specified), we consider that these give an indication of clearly similar clusters that do not qualify as distinct senses, and during our manual study of each sense, we require a clearly different semantic intent for each sense and discuss equivocal cases. Figure 8 shows the Venn diagram for across. Each Venn diagram shows the six most highly scored (by maximum total agreement score for any expression within the cluster) diagrams for each cluster, sized according to the maximum total agreement score. Diagrams that appear in more than one cluster are sized to reflect the highest maximum total agreement score, and the maximum total agreement scores for all clusters are shown as vertical bars beside the diagram, color-coded for the cluster. For example, in the Venn diagram for across (Figure 8), Diagram 35 had the highest maximum total agreement score in cluster 1 (green), with much lower scores in clusters 2 and 3, indicated by the smaller blue and orange bars. The lines between diagrams represent the types of semantic similarity discussed above:
• solid lines indicating semantic similarity are weighted (in width) by degree of similarity (Diagram 48 is more semantically similar to diagram 43 than 51, based on the results of our human subjects experiment);
• dashed lines indicating the same spatial relation represented with different geometry types (for example, Diagrams 35 and 39 are both representations of an overlaps/crosses spatial relation, but in one case the relatum is a line, while in the other it is a polygon); and
• dot-dash lines indicating the same spatial relation with different plurality (not indicated here, see Supplementary Materials S1, S2, S3).
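The selection of diagrams for each Venn diagram can be sketched as follows: for every cluster, take the maximum total agreement score of each diagram over the expressions in that cluster, keep the highest scoring diagrams, and then separate those shared by all clusters from those unique to one cluster. The function names are hypothetical, the exclusion of zero-scored diagrams is an added assumption, and the toy data use four diagrams and `k=2` rather than 55 diagrams and the top six.

```python
def cluster_top_diagrams(cluster_vectors, k=6):
    """Return {diagram number: maximum score} for the k best diagrams of one cluster.

    cluster_vectors: the expression diagram vectors of the expressions in the
    cluster. Diagrams whose maximum score is zero are left out (an assumption).
    """
    max_scores = [max(column) for column in zip(*cluster_vectors)]
    ranked = sorted(range(len(max_scores)), key=lambda i: max_scores[i], reverse=True)
    return {i + 1: max_scores[i] for i in ranked[:k] if max_scores[i] > 0}


def venn_regions(clusters, k=6):
    """Split the top diagrams into those shared by all clusters and those unique to one."""
    tops = {name: set(cluster_top_diagrams(vectors, k)) for name, vectors in clusters.items()}
    shared = set.intersection(*tops.values())
    distinct = {}
    for name, diagrams in tops.items():
        others = set()
        for other, other_diagrams in tops.items():
            if other != name:
                others |= other_diagrams
        distinct[name] = diagrams - others
    return shared, distinct


# Toy clusters over four diagrams.
toy_clusters = {
    "cluster 1": [[0.9, 0.1, 0.5, 0.0], [0.8, 0.2, 0.0, 0.0]],
    "cluster 2": [[0.7, 0.6, 0.1, 0.0]],
}
toy_shared, toy_distinct = venn_regions(toy_clusters, k=2)
```

In the toy data Diagram 1 falls in the intersection, while Diagrams 3 and 2 are distinct to clusters 1 and 2 respectively.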
Where a diagram is scored highly enough to be among the top six, and thus to appear in the Venn diagram, but closer examination shows that its high score derives from use with only one expression, we exclude it from the analysis (such diagrams are shown without borders in the Venn diagrams; for example, see Diagram 54 in the diagram for next to, Supplementary Material S2(e)). Such cases are normally due to other aspects of the expression than the original preposition (e.g. referring to part of a relatum) and are considered outliers (e.g. "A doorway close to the head of the north-western staircase"). We also consider that the intersecting section of the Venn diagrams may be used as guidance as to the primary sense of a preposition, given that Tyler and Evans (2003) view the primary sense of a preposition as its prototypical use, and the intersecting portion of the Venn diagram indicates a "central" meaning of the preposition, but further research is required to verify this.
We also show example expressions from each cluster to assist in analysis of the differences between the kinds of expressions. Supplementary material 5 (S5) summarizes the extraction of the senses from the Venn diagrams for all the prepositions.
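The selection of diagrams for a Venn diagram, as described above, can be illustrated with a minimal sketch. It assumes that the scores are available as (cluster, expression, diagram, total agreement score) records; the function names, the record layout and the toy scores are hypothetical and serve only as an illustration. They are not the data or code used in the study, and the exclusion of single-expression outliers is not modeled.

```python
from collections import defaultdict


def top_diagrams(scores, k=6):
    """Return the k best diagrams per cluster.

    scores: iterable of (cluster, expression, diagram, total_agreement_score).
    A diagram is ranked by its maximum score over all expressions in the
    cluster; ties are broken by diagram number.
    """
    best = defaultdict(dict)
    for cluster, _expression, diagram, score in scores:
        current = best[cluster].get(diagram, float("-inf"))
        best[cluster][diagram] = max(current, score)
    return {
        cluster: sorted(per_diagram.items(), key=lambda item: (-item[1], item[0]))[:k]
        for cluster, per_diagram in best.items()
    }


def shared_diagrams(top):
    """Diagrams in the top list of every cluster (the Venn intersection)."""
    sets = [{diagram for diagram, _ in ranked} for ranked in top.values()]
    return set.intersection(*sets) if sets else set()


# Toy scores, invented for illustration only.
toy_scores = [
    (1, "e1", 35, 0.9), (1, "e2", 35, 0.7), (1, "e1", 39, 0.4),
    (2, "e3", 35, 0.3), (2, "e3", 12, 0.8),
    (3, "e4", 35, 0.2), (3, "e4", 20, 0.6),
]
top = top_diagrams(toy_scores)
print(top[1])                # [(35, 0.9), (39, 0.4)]
print(shared_diagrams(top))  # {35}
```

In this toy example, Diagram 35 appears in all three clusters and would therefore fall in the intersecting section, which the text above treats as a possible indicator of the primary sense.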

Across, through and over
All three of these prepositions have a sense that indicates an overlapping relation between the located and referenced objects. In addition to this sense, we identify two other senses for across. In one of these senses there is a third object between the observer and locatum, and the observer is often implied (e.g. the bus station is just across the road [from me]) (see S5). A third sense indicates a relation in which multiple locata appear throughout different areas of the relatum (e.g., cities across the country). The previous literature mainly refers to the first, and most dominant (given its role in the intersecting part of the Venn diagram) of these senses (Cooper, 1968;Landau & Jackendoff, 1993;Lindstromberg, 2010). Cooper (1968) also identifies a sense that has some similarities with our second sense (e.g. the town across the river), but specifies that "x is located in the space which is contiguous with the distal boundary of y" (p.19).
The through preposition (Supplementary Material S1(b)) has only one sense, which it shares with across. We thus consider that through is a specialization of across, being semantically similar to across sense 1, but not encompassing the semantics of senses 2 and 3. Expressions in across clusters 2 and 3 in which through is substituted for across make little sense (the bus station through the road), or alter the semantics of the expression (the valley is just through the crest). The preposition through has not been widely studied, although Dirven (1993) describes spatial and non-spatial senses of through and Bahm (2019) studies its use in an indoor environment. In the spatial context, the focus of this work is that through is used in movements in a 2D or 3D enclosure (e.g. channel, tunnel or surface).
The over preposition also shares the overlapping sense with through and across, as identified by a number of other researchers (Brugman & Lakoff, 1988; Cooper, 1968; Kreitzer, 1997; Lakoff, 2008; Mackenzie, 1992; Tyler & Evans, 2001). The second sense combines the overlapping relation with varying degrees of linear alignment between relatum and locatum and was identified by Lindstromberg (2010). Our third sense places a greater emphasis on verticality, with diagrams such as the tower over part of the bay reflecting a meaning that is more akin to above than across and through, a sense that has been identified by other researchers (Bennett, 1975; Brugman & Lakoff, 1988; Kreitzer, 1997; Lakoff, 2008). It must be pointed out that only 2-dimensional diagrams were available to respondents, and that these are limited in their ability to represent some uses of over, given that they represent a survey perspective (from above) (Taylor & Tversky, 1996). A final sense that has been described for over but that was not identified in our research describes the case in which the locatum is on the other side of the relatum (e.g. "Arlington is over the river from Georgetown", Tyler & Evans, 2001, page 48) (Geeraerts & Cuyckens, 2007; Lakoff, 2008; Lindstromberg, 2010; Tyler & Evans, 2001), and is like our second sense for across. Figure 9 illustrates the senses of the prepositions in the across, through and over group, and the relationships between them, highlighting the common overlapping sense across all three prepositions that was also identified by Kreitzer (1997). The overlapping sense frequently has a dynamic component of transition relative to the reference object.

Adjacency and proximity prepositions
Six prepositions that relate to adjacency and proximity are grouped together in Figure 5. The Venn diagrams for these are presented in Supplementary Materials S2, and the senses and relationships between them are summarized in Figure 10.
We identify two senses for the adjacent preposition: one describing spatial proximity, and another describing the overlap relation. The more dominant touching or proximal sense reflects the sense of adjacent identified by Klien and Lutz (2005) in their analysis of WordNet definitions. There can be some debate about whether the second sense (overlapping) is merely a stretching of the proximity sense (such "stretched" semantics are described by Herskovits (1986)) to accommodate vague boundaries. Expressions for which Diagram 32 (two overlapping polygons) was selected include land adjacent to the mountain, and a wetland adjacent to the avenue. A similar overlapping sense was identified for the outside preposition (see Supplementary Materials S5 and Supplementary Materials S2), and thus we have included this as a sense of both prepositions, but it should be noted that it is weaker than the other senses, as the maximum scores given by respondents for the overlapping diagrams are much lower. The shared senses, absence of additional senses for either preposition and close positioning in Figure 5 confirm the semantic similarity of adjacent and outside.
The touching or proximal sense is also shared by beside, close to and next to. Near has a similar sense (like close to, near has only one sense), but interestingly near is only used for this sense for polygon-polygon and line-polygon (not line-line) pairs, in contrast to the other prepositions, which are also used to describe the touching or proximal sense for line-line pairs. In order to confirm this finding, we examined the expressions, and noted that all of the near expressions in the data set (randomly extracted from the NCGL) involved polygon objects (with another polygon or a line). We further confirmed this by randomly selecting a larger sample of 174 expressions using near (87 expressions) and close to (87 expressions) from Geograph, and manually identifying the geometry types of the locatum and relatum using the Linguistically-Augmented Geospatial Ontology (Stock & Yousaf, 2018), which identifies geometry types for a range of geographic feature types. The results showed that 31% of close to expressions referred to line-line feature type pairs, in contrast to 3% of near expressions. Figure 11 shows this distribution.
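The comparison of geometry type pairs described above can be sketched as a simple tally. This is a minimal illustration that assumes each expression has already been annotated with the geometry types of its locatum and relatum, and that pairs are treated as unordered (so line-polygon and polygon-line are counted together). The function name and the toy annotations are hypothetical and do not reproduce the Geograph sample.

```python
from collections import Counter


def geometry_pair_shares(annotations):
    """Share of each geometry type pair per preposition.

    annotations: iterable of (preposition, locatum_geometry, relatum_geometry).
    Pairs are treated as unordered (an assumption of this sketch).
    """
    counts = {}
    for preposition, locatum, relatum in annotations:
        pair = "-".join(sorted((locatum, relatum)))
        counts.setdefault(preposition, Counter())[pair] += 1
    return {
        preposition: {pair: n / sum(c.values()) for pair, n in c.items()}
        for preposition, c in counts.items()
    }


# Toy annotations, invented for illustration only.
toy_annotations = [
    ("close to", "line", "line"),
    ("close to", "polygon", "line"),
    ("close to", "polygon", "polygon"),
    ("close to", "line", "line"),
    ("near", "polygon", "polygon"),
    ("near", "line", "polygon"),
]
shares = geometry_pair_shares(toy_annotations)
print(shares["close to"]["line-line"])       # 0.5
print(shares["near"].get("line-line", 0.0))  # 0.0
```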
An additional sense that was evident for next to and beside was the proximal and parallel sense, which was used for pairs of linear objects, rows of multiple objects in a line, or sides of a larger polygon object.
Another interesting observation was the relative importance of the proximal and touching aspects of this group of prepositions. Figure 12 compares the maximum expression scores for diagrams that depict a touching relation vs those that depict a proximal relation. It is unsurprising that proximity is more important than touching for close to and near, and that touching is more important for adjacent. However, next to is more similar to close to and near in that proximity is more important than touching, and beside gives equal scores to both.
It must be acknowledged that our method does not capture the importance of the vertical elements of the adjacency prepositions identified in the literature (Herskovits, 1980;Lautenschütz, Davies, Raubal, Schwering & Pederson, 2006;Lindstromberg, 2010), since we work only with diagrams in plan/survey view. However, the previous literature confirms the role of proximity and the possibility of contact (Lindstromberg, 2010;Mackenzie, 1992;Saint-Dizier, 2006;Zwarts, 1997), without identifying the nuances and inter-relationships shown in Figure 10.

Past, off, beyond and by
The third group of prepositions that we examine in more detail also captures varying kinds of proximity, with some additional semantics for particular senses. The Venn diagrams are presented in Supplementary Materials S3, and the senses and relationships between them are summarized in Figure 13.
Off, past and by all have a sense that conveys proximity. This sense for by is particularly used in expressions involving "by the side of" (e.g. a house by the side of the lake), and has been identified by multiple researchers for linear objects (Cooper, 1968; Landau & Jackendoff, 1993; Lindstromberg, 2010; Mackenzie, 1992). Our data also identifies an additional sense that has been discussed by Hois and Kutz (2008), in which particular verbs combine with the preposition to indicate enclosure (a field bounded by the canal, the platform is surrounded by a ditch). The previous literature identifies the first sense of off shown in our data (Cooper, 1968; Landau & Jackendoff, 1993; Lindstromberg, 2010). Our second sense of off is used for pairs of linear features in various relative orientations, and indicates a branching or veering configuration, sometimes combined with a verb (the avenue off the main road, the path leading off the track). In addition to the proximal sense, past also includes a sense in which the located object overlaps a reference object that is a group, conveying the notion of traveling through that group (a walk past the buildings, a river past the villages). This is similar to the sense described by Lindstromberg (2010), but in our data this sense was mostly confined to grouped objects. Lindstromberg (2010) also identified a sense of past that was similar to beyond, which we did not observe in our data, possibly because it is a less common use of past and did not appear in our sample of 30 expressions. Finally, beyond has only one distinct sense and is thus different from the other three prepositions. That sense is similar to the second sense of across and indicates an object on the other side of some reference object from the observer (a chapel beyond the river). This extends the semantics of beyond described in the previous literature, which mainly focuses on distance (objects that are far away) (Cooper, 1968; Landau & Jackendoff, 1993; Lindstromberg, 2010; Mackenzie, 1992, 2003). We postulate that beyond is close to past, off and by in Figure 5 mainly because some respondents selected diagrams 29 and 30, rather than the diagrams that depicted the observer. While the prepositions in this group appear close to each other due to common semantics mainly related to the proximal sense, they also have additional senses that clarify the nature of their semantic variation.

Validation of the senses
In addition to comparison with the senses identified in the literature, we validate the senses extracted above in two ways. First, we validate the repeatability of the manual sense extraction process. Two of the paper coauthors independently extracted the senses for Group 3 (past, off, beyond and by) using the method described in Section 6.1 and the resulting senses were compared. Both coauthors independently produced the same senses for all four prepositions using the Venn diagram methodology.
Second, we validate the senses by classifying additional data using our senses to identify gaps and/or ambiguities. Two other coauthors, who were not involved in the sense identification step, classified a sample of 100 expressions involving each of the 13 prepositions for which we extracted senses. Four of the prepositions were excluded as we only identified one sense for them (close to, beyond, near and through). The annotators were given a description of the senses (the rightmost column in Supplementary Material S5), and asked to classify the expressions into each of the senses, with the addition of two other classes: non-spatial use (for uses of the prepositions in a non-spatial sense, as these are excluded from our work here) or other sense (a sense that is not included in the set we have extracted here), and to identify any ambiguous cases. The other sense class and the ambiguous cases validate our set of senses by determining (1) completeness: identifying any senses that are found in the sample of expressions but were not identified by our approach; and (2) distinctness: identifying cases in which the sense classification was ambiguous, suggesting that our senses are not sufficiently distinct or well defined.
The sample of 100 expressions for each preposition was randomly selected from the combined set of the NCGL and Landcare corpora, excluding the expressions that had been extracted and used in the main experiment. In the case of adjacent to and beside there were insufficient expressions, so additional expressions were sourced from Geograph, a photo posting web site that includes photo captions and descriptions in which spatial prepositions often appear. For each of these two lower-frequency prepositions, we conducted a manual search using the spatial preposition in Geograph's search images function, and manually extracted the first 75 expressions for adjacent to and the first 67 for beside that contained the respective preposition and included both a locatum and a relatum (some captions in Geograph have an implied locatum, and these were excluded). These 142 expressions were the number needed to reach a total of 100 for each preposition when combined with the expressions already obtained from the NCGL and Landcare corpora.
The 100 expressions for each preposition were divided between the two annotators with an overlap of 22 expressions to check inter-annotator agreement (each annotator classified the shared 22 expressions plus half of the remaining 78). Following annotation, we calculated the inter-annotator agreement for the 22 shared expressions, achieving an average agreement score across all prepositions of 86%, with a range between 72% (by) and 100% (next to). The past preposition had much lower agreement (50%), in part because an additional sense was identified by one annotator (see below).
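The division of expressions and the agreement calculation can be sketched as follows. This is a minimal illustration that assumes agreement is simple percent agreement on the shared expressions, since the text reports percentages without naming a chance-corrected statistic. The function names, the random seed and the placeholder expressions are hypothetical.

```python
import random


def split_for_annotation(expressions, n_shared=22, seed=0):
    """Give both annotators n_shared common items and halve the remainder."""
    items = list(expressions)
    random.Random(seed).shuffle(items)
    shared, rest = items[:n_shared], items[n_shared:]
    half = len(rest) // 2
    return shared + rest[:half], shared + rest[half:], shared


def percent_agreement(labels_a, labels_b):
    """Share of commonly labeled items that received the same class.

    labels_a, labels_b: dicts mapping expression -> assigned class.
    """
    common = labels_a.keys() & labels_b.keys()
    if not common:
        raise ValueError("the annotators share no expressions")
    return sum(labels_a[key] == labels_b[key] for key in common) / len(common)


# Placeholder expressions standing in for one preposition's sample of 100.
expressions = [f"expression_{i}" for i in range(100)]
set_a, set_b, shared = split_for_annotation(expressions)
print(len(set_a), len(set_b), len(shared))  # 61 61 22
```

Under this split each annotator classifies 61 expressions (22 shared plus 39 of the remaining 78), and agreement is computed only over the 22 shared expressions.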

Completeness of senses
Across the nine prepositions, only one additional sense was identified by the annotators that had not emerged from our analysis: a beyond/after sense for the past preposition. For example:
• I'm standing one street from Long Bay College past the roundabout on the right next to the giveaway sign.
• I am standing at the first driveway past the side street on the right side of the road as you face downhill . . .
This additional sense was identified by Lindstromberg (2010), as discussed in Section 6.4, but was not found in the 30 expressions that were used for our experiment (which were a different set of expressions from those used for the validation), due to its low frequency of use (9 of the 100 expressions included in the validation).

Distinctness of senses
We asked the annotators to identify expressions that were of ambiguous class, with a view to determining the distinctness of our set of senses. Six expressions for the by preposition were marked as ambiguous across both annotators. For example, in the expression below, traversed by is the ambiguous case that was not identified by our experiment:
• The Chesterfield canal here passes through the ridge of ground, that is traversed by the road to the north, by means of a tunnel some 270 yards in length and 15 feet in breadth and height.
Furthermore, two expressions for the off preposition were identified as ambiguous. For example:
• The bus ride across the Pyrenean mountain passes into Andorra is spectacular, although a new tunnel cuts off part of the original road over the pass.
The ambiguity is mainly due to the verbs that accompany the prepositions (e.g. cuts off, traversed by, crossed by), which convey a different meaning from the uses of by and off with verbs in our experiment, in which most of the expressions used by in combination with verbs of boundedness (e.g. surrounded by, flanked by). Figure 14 shows the frequency of each sense for the eight validated prepositions, using all 100 expressions and averaging across annotators for the overlapping portions of the sample. As can be seen, most prepositions have a clearly dominant sense, along with one or more other senses that are much less frequent.

Conclusion
In this paper, we used a human subject experiment with 720 expressions across 24 spatial prepositions and multiple geospatial contexts, in order to study the semantic similarity among spatial prepositions and their senses. We identified groups of semantically similar prepositions using t-SNE and studied the nature of differences between the prepositions using an extensional map to address Research Question 1. Groups that were particularly similar included across, through and over; the proximity and adjacency prepositions (beside, close to, near, next to, outside and adjacent to); and past, off, beyond and by. We then studied these three groups of similar spatial prepositions in more detail, identifying the senses and the semantic relations between them using Venn diagrams to address Research Question 2. We validated this work through comparison to previous literature and manual annotation. We found that through is a specialization of across and over, sharing only one of their senses; and that the adjacency and proximity prepositions share a complex network of senses. While these were centered on proximity and touching relations, overlap, orientation and geometry type were also relevant for some senses. The senses of past, off and by were similarly overlapping, while the single sense of beyond was distinct. Our results further showed that:
• the near preposition is rarely used for line-line relations, with close to being preferred to describe proximity in this case; and
• the next to preposition is used to describe proximity more than immediate adjacency (touching), in contrast to adjacent, which more frequently requires a touching relation.
We acknowledge that this analysis provides one perspective on the semantics of the spatial prepositions: a perspective mediated by the experimental method used. The diagrams were deliberately designed to be context neutral in order to study the generic semantics of spatial prepositions across a range of contextual situations (although geometry type is an exception to this given that it is a key component of diagrammatic elicitation methods), but the importance of context in the application of spatial prepositions in specific geographic situations is acknowledged (Landau & Jackendoff, 1993;Schwering, 2007;Talmy, 1983), including the role of function (and specifically functional similarity) in the use of spatial prepositions (Coventry, 1999;Coventry et al., 2001). Future work is needed to build on these findings by exploring specific aspects of context (e.g., image schema, scale, quantitative distance between objects), particularly to identify the degree to which these contextual aspects affect semantic similarity, and to study the impact of the characteristics (such as type and size) of locatum and/or relatum on the use of prepositions. This work also focuses on descriptions of static location in the form of prepositions, and more complex expressions of spatial location including, for example, motion and fictive motion, should be addressed in future work. The focus of our work on two-dimensional (survey view) diagrams is another potential limitation, particularly when applied to prepositions that have a clear vertical component (e.g., above). Future work using three-dimensional diagrams is appropriate to address the semantic similarity of these prepositions and their senses in particular. Finally, this work addresses the semantics of spatial prepositions in generic terms and does not address variations in English dialect. There is much scope for future work on this kind of comparative analysis of use of spatial prepositions.
The findings of this research have provided insights into the semantic similarity of spatial prepositions, in both quantitative and qualitative terms. These insights can be used to assist in the automation of interpretation and generation of natural language descriptions of location through identifying the similarities and differences between spatial prepositions and provide much potential for future research to further clarify the nuances in the use of spatial prepositions in natural language.