Preverbal syntactic complexity leads to local coherence effects

ABSTRACT The effective use of preverbal linguistic cues to make successful clause-final verbal prediction as well as robust maintenance of such predictions has been argued to be a cross-linguistic generalisation for SOV languages such as German and Japanese. In this paper, we show that native speakers of Hindi (an SOV language) falter in forming a clause-final structure in the presence of a centre-embedded relative clause with a non-canonical word order. In particular, the fallibility of the parser is illustrated by the formation of a grammatically illicit locally coherent parse during online processing. Such a parse should not be formed if the grammatically licit matrix clause final structure was being successfully formed. The formation of a locally coherent parse is further illustrated by probing various syntactic dependencies via targeted questions. We show that the parser's susceptibility to form such structures is not driven by top-down processing, rather the effect can only be explained through a bottom-up parsing approach. Further, our investigation suggests that while plausibility is essential, presence of overt agreement features might not be necessary for forming a locally coherent parse in Hindi. These results go against top-down proposals to local coherence such as lossy surprisal and are consistent with the good-enough processing model to comprehension while only partially supporting the SOPARSE account. The work highlights how top-down processing and bottom-up information interact during sentence comprehension in SOV languages – prediction suffers with increased complexity of the preverbal linguistic environment.


Introduction
Top-down processing is an essential feature of the sentence comprehension system (e.g. Altmann & Kamide, 1999; Fischler & Bloom, 1979; Kliegl et al., 2004; Kutas & Hillyard, 1980, 1984; Levy & Keller, 2013; Marslen-Wilson, 1973; Schwanenflugel & Lacount, 1988; Schwanenflugel & Shoben, 1985; Staub & Clifton, 2006; Zola, 1984). For example, Altmann and Kamide (1999) have shown that English native speakers effectively use information about the verb's selectional restriction to preactivate the appropriate syntactic object before it becomes available in the input. In a visual world paradigm, participants selectively moved their gaze to the image of a "cake" on hearing "eat" in an utterance such as 1a. In contrast, while attending to sentences such as 1b, they started looking at the image of a "cake" only at the onset of its utterance.
(1) a. The boy will eat the cake
b. The boy will move the cake
This line of research (Altmann & Kamide, 1999) highlights how prediction can lead to the formation of grammatically licensed structures. See Kutas et al. (2011), Kuperberg and Jaeger (2016), Staub (2015), and Levy (2013) for recent reviews on the role of prediction during sentence comprehension.
Interestingly, the generated structural predictions need not always be licensed by the grammar. For example, in a recent study, Kamide and Kukona (2018) demonstrated that native speakers of English generate predictions that are not consistent with the global structure of the sentence. Rather, the generated parse is consistent only with the local input. Using a picture description task, Kamide and Kukona (2018) showed that while hearing 2a participants were more likely to look at the image of a "bike" after hearing "the man from London". In comparison, the proportion of looks to "bike" was much lower when hearing "the man very much" in 2b. In addition, participants also looked at the globally consistent object "carousel" in condition 2a, suggesting that both the local and the global parse were active.
The above study highlights that in certain syntactic configurations, the parsing process can generate grammatically illicit predictions. One such configuration is the locally coherent structure (the man from London will ride in 2a) which has previously been shown to trigger parsing errors (Konieczny et al., 2009;Tabor et al., 2004). Such locally consistent (but globally inconsistent) parses seem to arise in embedded structures such as relative clauses which are known to incur memory load (Gibson, 1998;Grodner & Gibson, 2005;Levy & Keller, 2013). In fact, another much discussed parsing error also occurs in configurations involving multiple embeddings. In the context of the so called structural forgetting effect (Frazier, 1985), Gibson and Thomas (1999) showed that English native speakers rated sentences such as 3a as good as sentences such as 3b. Note that 3b is ungrammatical as a verb in the embedded structure ("was cleaning") is missing; this missing verb is denoted by a "∅" symbol. Gibson and Thomas (1999) explained this effect by assuming that the parser is unable to maintain the prediction of multiple verbs in 3a due to increased memory load and therefore it tends to forget the verb which incurs the maximum cost.
(3) a. The apartment that the maid who the service had sent over was cleaning every week was well decorated.
b. * The apartment that the maid who the service had sent over ∅ was well decorated.
Effects such as local coherence and structural forgetting highlight that the parser is susceptible to forming grammatically illicit parses. While in the case of local coherence, the global parse was presumably active along with the local parse; in the case of structural forgetting the global parse was completely abandoned in favour of a simpler structure. Both these effects happen with clausal embeddings, suggesting that the parser is particularly vulnerable in such syntactic configurations.

Prediction in SOV languages
In contrast to the processing difficulty due to centre-embedding in English, processing in SOV languages has been argued not to be vulnerable to such a difficulty. In particular, it has been argued that the parser in SOV languages is better adapted to processing certain linguistic patterns such as verb-final structures because of the frequent exposure to such configurations (Christiansen & Chater, 1999; Christiansen & MacDonald, 2009; Engelmann & Vasishth, 2009; Hsiao & MacDonald, 2013; Levy, 2008, 2013; MacDonald & Christiansen, 2002; Vasishth et al., 2010). An important consequence of this adaptability is that the parser can effectively use the preverbal cues to correctly predict the upcoming clause-final structure and, in addition, can effectively maintain such predictions. It has therefore been suggested that, unlike in SVO languages, prediction of the clause-final structure and its maintenance in SOV languages is quite robust even in the face of complex linguistic environments such as centre-embedding. This has been argued through two effects, namely, lack of structural forgetting and anti-locality. The broad goal of this paper is to understand the constraints on such a predictive processing mechanism in SOV languages.
The structural forgetting effect in the context of multiple clausal embeddings discussed above has been shown to disappear in SOV languages like German and Dutch. Vasishth et al. (2010) used the German equivalent of sentences 3a-3b to demonstrate that German native speakers rate 3b equivalents worse than 3a equivalents. Dutch has also been shown to pattern with the German results (Frank et al., 2016). Vasishth et al. (2010) explain this lack of forgetting in German by arguing that German native speakers get exposed to clause final SOV structures much more than speakers of English (though see Häussler & Bader, 2015, for a different explanation in terms of cue-based retrieval). Hence, the parser in German becomes adept at efficient handling of such verb-final structures. A similar rationale has been invoked to explain the anti-locality effect in SOV languages.
Anti-locality effects are characterised by a facilitation at the clause final verb with increased distance between the verb and its prior dependents. For example, Vasishth and Lewis (2006) used Hindi items such as 4a-4b to show that reading times at the relative clause verb dekhaa "saw" were faster in the long conditions compared to the reading times in the short conditions.
(4) a. Short, Subject Relative
vo-laRkaa jisne us-kaagaz-ko dekhaa bahut-jigyaasu thaa
that-boy who that-paper-acc saw very-inquisitive was
"That boy who saw that (piece of) paper was very inquisitive".
b. Long, Subject Relative
vo-laRkaa jisne us-kaagaz-ko mez ke-piiche gire-hue dekhaa bahut-jigyaasu thaa
that-boy who that-paper-acc table behind fallen saw very-inquisitive was
"That boy who saw that (piece of) paper fallen behind a/the table was very inquisitive".
Such facilitatory effects are generally understood to arise due to the better prediction of the clause-final verb in the long condition compared to the short condition. While Vasishth and Lewis (2006) explain anti-locality through memory reactivation, their model assumes a left-corner parsing mechanism to predict the relevant clause-final verbal phrase. Note that the facilitation is found within a relative clause, which is deemed to be a difficult-to-process syntactic environment (e.g. Gibson, 1998; Just & Carpenter, 1992; Miller & Chomsky, 1963; Yngve, 1960). Similar effects have been demonstrated in other head-final languages as well, such as German (Konieczny, 2000) and Japanese (Nakatani & Gibson, 2010). Information-theoretic metrics such as surprisal (Hale, 2001; Levy, 2008) have been successful in accounting for prediction-based effects such as anti-locality.
To summarise, effects such as lack of structural forgetting and anti-locality in SOV languages have been explained by the notion of processing adaptability. In the current context, the parser for SOV languages becomes efficient in robust verbal prediction based on preverbal cues. Consequently, a key aspect of such a processing mechanism is that comprehenders of SOV languages such as Hindi, Japanese, German, etc. will effectively use the preverbal information (e.g. nominal semantics, case-markers) to correctly predict a valid clause-final verb. Critically, as demonstrated by the lack of structural forgetting in these languages, it is also expected that such verbal predictions should be maintained robustly even in complex syntactic configurations involving clausal embeddings.

Revisiting robust prediction in SOV languages
Given the evidence for anti-locality effects as well as lack of structural forgetting during processing of a clause final verb, it is quite possible that local coherence effects that arise due to RCs in English would fail to arise in an SOV language like Hindi. This is because in Hindi, the prediction of the RC structure as well as the matrix verb should be maintained robustly given previous evidence for efficient processing of clause final structures in SOV languages. On such an account, which we refer to as the verb-final infallibility hypothesis, the globally consistent SOV structure should always be formed even when construction of a locally consistent (grammatically illicit) structure is possible.
In this paper, we investigate the verb-final infallibility hypothesis by asking four questions: (1) Can a local coherence effect involving a matrix verb be observed in an SOV language like Hindi? (2) What representations are formed when locally coherent structures are created? (3) Are such locally coherent structures generated top-down (i.e. predicted) or created bottom-up? and (4) What are the possible factors due to which such an effect may arise? We report 6 experiments to investigate these questions. Following the frequent use of centre-embedded relative clause (RC) structures to illustrate robust prediction in Hindi (Husain et al., 2014; Vasishth & Lewis, 2006), we also utilise centre-embedded RCs in our experiments. In particular, we investigate whether the presence of a centre-embedded RC with non-canonical word order causes the parser to form an illicit matrix clause-final structure. This word-order manipulation is central to our investigation into local coherence effects in Hindi. A failure to observe such an effect would be consistent with the verb-final infallibility hypothesis. On the other hand, if a local coherence effect is observed, that would imply that processing of complex syntactic structures can allow for the formation of grammatically illicit parses irrespective of the word order of a language.
A parsing mechanism that can lead to grammatically illicit parses is compatible with a variety of theoretical approaches to sentence processing. For example, local coherence has been explained by a self-organising parsing mechanism (Smith, 2018; Tabor et al., 2004) where global syntactic structures compete with structures formed based on local input. Under such an account, a frequent local structure could be selected over a rarer global structure. A noisy-channel model of language processing (e.g. Gibson et al., 2013; Kuperberg & Jaeger, 2016; Piantadosi et al., 2011) can also account for the formation of grammatically illicit parses when the input is rare. The noisy-channel model assumes that the comprehension system is predictive because it needs to accommodate the noisy transmission of information from the speaker (cf. Kurumada & Jaeger, 2015). Therefore, unexpected input during comprehension can lead to effects such as local coherence (Kamide & Kukona, 2018; Konieczny et al., 2009; Tabor et al., 2004) and misinterpretation due to non-canonical order (Ferreira, 2003). Finally, the good-enough processing system (Ferreira et al., 2002) also assumes that speakers make mistakes and that comprehenders therefore rely on various heuristics to accommodate their error-ridden input and arrive at meaning. Under a good-enough system, the parser can end up relying on very little information from the input, leading to a shallow or incorrect interpretation of the received input. It has been proposed that processing of this kind could result from the limits on time and resources that the parsing system is typically subject to when attending to complex structures such as a non-canonical word order (Ferreira, 2003; Ferreira & Patson, 2007; Karimi & Ferreira, 2016). Together, these proposals predict that, irrespective of the word order of a language, syntactic complexity could lead to the formation of illicit parses.
In the context of the current work, these proposals predict that in SOV languages locally coherent parses could be formed in a matrix clause final environment.
The paper is organised as follows. In Section 2 we provide a brief description of the agreement system in Hindi. In addition, this section discusses the Husain et al. (2014) experiment, which forms the basis of the design for all the experiments in this work. In Sections 3-5, we discuss three experiments to investigate the formation of locally coherent parses in Hindi. Section 3 investigates whether Hindi native speakers form locally coherent parses; Section 4 investigates the nature of the syntactic representation when such structures are formed; Section 5 investigates whether these structures are formed top-down or bottom-up. In Section 6 we present an interim summary of the first three experiments. In the next set of experiments, discussed in Sections 7-9, we investigate semantic and syntactic factors that could lead to a local coherence effect. Finally, we consolidate the findings of all the experiments in Section 10. We conclude in Section 11.

Verb-argument agreement in Hindi
Hindi has a mixed agreement system: verb agreement morphology can index the features of the subject, the object, or no argument, depending on the sentence structure. The verb agrees with the subject if the subject lacks overt case-marking; failing that, the verb agrees with the object if the object lacks overt case-marking; failing that, the verb exhibits default agreement morphology (Kachru, 2006; Pandharipande & Kachru, 1977). Consider the examples in 5. The sentence in 5a is an example where the verb paRhenge "will read" is inflected for masculine plural agreement, showing agreement with the subject of the sentence laRke "boys", as the subject does not bear any overt case-marking. 5b is an example where the verb paRhii "read" now shows feminine singular agreement features, agreeing with the grammatical object kitaab "book", as the subject is overtly case-marked with ergative case here and the object does not bear any overt case-marking. Finally, when both the subject and the object have overt case-marking, as in 5c, the verb does not show agreement features of any of its arguments and surfaces with default agreement instead.
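The three-way agreement rule described above can be sketched as a small decision procedure. This is only an illustration of the descriptive generalisation, not a model of Hindi morphology: the `Argument` class, its fields, and the function name are hypothetical, and the arguments are labelled with English glosses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Argument:
    gloss: str         # English gloss of the noun phrase (illustrative only)
    gender: str        # "masc" or "fem"
    number: str        # "sg" or "pl"
    case_marked: bool  # True if the NP bears an overt case marker


def agreement_controller(subject: Argument,
                         obj: Optional[Argument]) -> Optional[Argument]:
    """Return the argument the verb agrees with, or None for default agreement."""
    if not subject.case_marked:
        return subject            # pattern 5a: unmarked subject controls agreement
    if obj is not None and not obj.case_marked:
        return obj                # pattern 5b: unmarked object controls agreement
    return None                   # pattern 5c: default agreement morphology


boys = Argument("boys", "masc", "pl", case_marked=False)
boys_erg = Argument("boys", "masc", "pl", case_marked=True)
book = Argument("book", "fem", "sg", case_marked=False)
book_acc = Argument("book", "fem", "sg", case_marked=True)

for subj, ob in [(boys, book), (boys_erg, book), (boys_erg, book_acc)]:
    controller = agreement_controller(subj, ob)
    print(controller.gloss if controller else "default agreement")
```

The ordering of the two checks encodes the priority of the subject over the object as agreement controller.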
(5) a. Subject agreement:

Overview of Husain et al. (2014)
The experiments discussed in this work utilise a key manipulation in the design of experiments in Husain et al. (2014). Here, we briefly discuss their items, the key results and their interpretation of the results.
The critical conditions from Husain et al. (2014) are shown in 6. In condition 6a, the subject NP vah laRkaa "That boy Masc" is modified by a subject relative clause (RC) which has an SOV word order. In condition 6b, this relative clause has a non-canonical order such that the object of the relative clause kitaab "book Fem" appears after the relative clause verb paRhii thii "read Fem had Fem"; i.e. the RC has an SVO word order. Additionally, the RC verb in conditions 6a and 6b agrees with the RC object (kitaab "book") since the RC subject bears an overt (ergative) case-marker.
The dependencies that would need to be established for a licit parse for the canonical condition and the non-canonical condition are shown in Figure 1(a). Here NP1 corresponds to the subject NP vah laRkaa "That boy Masc", Relpro=ne to the ergative-marked relative pronoun jisne, and NP2 to the RC object kitaab "book Fem". The crucial contrast is with respect to the dependency between the RC verb and the RC object across the two RC-internal word orders: SOV (canonical) and SVO (non-canonical). The other dependencies, such as that between the subject NP and the matrix copula, or that between the relative pronoun and the RC verb, are identical across the structures.
Husain et al. (2014) found a significantly longer reading time at the RC verb in condition 6b compared to the RC verb in condition 6a. The RT difference at the RC verb was explained through an expectation-based account (Levy, 2008) by arguing that the SOV structure of the RC in conditions 6a-6b is strongly expected at the relative pronoun jisne. Not encountering the expected OV order in 6b leads to a processing slowdown. Critically, their experiment demonstrates that the RC structure is successfully predicted during the comprehension process.
An alternative explanation for the RT slowdown in 6b vs. 6a could be that the agreement feature on the RC verb in the non-canonical order does not find an agreement controller preceding it, while in the canonical condition the controller (i.e. the object) appears before the verb. This lack of agreement match in the non-canonical condition could lead to the observed slowdown. In order to test this hypothesis we ran an SPR experiment (N = 61) with items shown in 7a-7b. In examples 7a-7b, the subject of the relative clause is not overtly case-marked and hence the RC verb agrees with the subject, which precedes the verb in both word orders. If the slowdown observed in Husain et al. (2014) was due to an agreement mismatch, then we would expect no difference in RT at the RC verb in conditions 7a vs. 7b. Results show that the RT at the critical RC verb in the non-canonical condition was significantly slower than the RT in the canonical condition (p<0.01). This suggests that the word-order explanation proposed in Husain et al. (2014) is better suited to explain the observed pattern of results. Details of this experiment can be found in the Appendix.

Experiment 1
We begin by investigating the first question discussed earlier (Section 1.2), i.e. can a local coherence effect involving a matrix verb be observed in an SOV language like Hindi? The goal of this experiment was to investigate if native speakers of Hindi form a locally coherent parse during online comprehension.

Material and methods

Participants
Fifty-two native speakers of Hindi participated in this experiment. All the participants in the experiments reported in this paper were undergraduate or graduate students from the Indian Institute of Technology, Delhi (IIT, Delhi) between the ages of 18 and 40 years, with the exception of Experiment 3, where we also included students from Jawaharlal Nehru University, Delhi. Participants provided informed consent and were paid between INR 150 and 300 for their participation. The amount was pro-rated as a function of time spent in the lab: INR 150 was paid for 30 min of experiment participation; payments for experiments that lasted more than 30 min increased accordingly.

Items
We adapted the design from Husain et al. (2014) (see Section 2.2) to evaluate Hindi speakers' propensity to build a locally coherent parse involving relative clause internal material and the post-RC material. Specifically, we used a post-RC verb which cannot be grammatically integrated with other matrix clause material but which is compatible with relative clause internal material in the non-canonical order as part of a locally coherent parse.
We used a 2 × 2 design crossing RC internal WORD ORDER (Canonical/Non-canonical) and post-RC CLAUSE TYPE (Copula/Transitive). The Canonical conditions have SOV order in the RC (8a, 8c), while the Non-canonical conditions have SVO order (8b, 8d). Importantly, all four experimental conditions are ungrammatical. This is because the post-RC material cannot be integrated with the subject of the clause NP Masc (vah laRkaa "That boy Masc" in 8), which is also the RC head noun. Specifically, in the Copula conditions (8a, 8b), the post-RC verb incorrectly bears feminine agreement, matching the RC internal object NP Fem (kitaab "book Fem") rather than the features of the subject NP Masc. In the Transitive conditions (8c, 8d), in addition to the incorrect feminine agreement, NP Masc and the post-RC verb are thematically incompatible/odd. We use underlining in the sample item below to draw attention to the locally coherent parse. In the Copula condition (8b), the RC internal object (kitaab "book Fem") could be incorrectly treated as the subject of the copula verb thii "was Fem", thus forming a locally coherent parse (kitaab moTii thii "The book was thick"). Similarly, in the Transitive condition (8d), the RC internal object could be incorrectly treated as the object of the transitive verb to form a locally coherent parse (kitaab mujhe bechnii paRii "I had to sell the book").
Twenty-four experimental items in 4 conditions were prepared. 56 filler sentences with varied levels of acceptability were also included. In the current experiment (and in subsequent experiments), some of the filler sentences were ungrammatical. During data analysis, these ungrammatical filler sentences were used to filter out participants who were not paying attention. A Latin-square design was used to present different items.
Since the experimental items are ungrammatical at the post-RC verb and two of the conditions involve a locally coherent parse, we provide a broad translation for various parts of the items separately in 9 for ease of understanding.
Copula verb / Transitive verb: … was thick / I had to buy
c. Spill over material: … and many friends bought the same book as well.

Procedure
We used the self-paced reading (SPR) paradigm (Just et al., 1982) with centred presentation for the reading task. Stimulus items were presented in the lab using the online Ibex Farm platform (Drummond, 2013). A Latin square design ensured that each participant saw each item in only one condition. The target items and fillers were randomised for each participant. The experiment began with an explanation of the task to the participants in verbal as well as in written form. After this, several practice sentences were presented in order to familiarise participants with the task. At the beginning of each trial, the participant saw a fixation cross at the centre of the screen. When the participant pressed the space bar, the first word or phrase was displayed at the centre of the screen. With each successive press of the space bar, the previous word or phrase disappeared and the next word or phrase was displayed. This successive replacement continued until the participant had read the whole sentence. Reading times or RTs (in milliseconds) were taken as the measure of relative momentary processing difficulty. In addition, at the end of each trial, participants were asked to rate the sentence on a scale of 1-7 (with 1 signifying completely unacceptable and 7 signifying completely acceptable).
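The Latin square assignment mentioned above can be illustrated with a minimal sketch, assuming 24 items, the four condition labels a-d, and a simple rotation scheme. The function name and the rotation are assumptions made for illustration; the presentation software may build its lists differently.

```python
CONDITIONS = ["a", "b", "c", "d"]  # the four cells of the 2 x 2 design
N_ITEMS = 24


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by rotating the conditions over items.

    Each list maps an item index to the single condition in which that item
    is shown to participants assigned to the list.
    """
    k = len(conditions)
    return [
        {item: conditions[(item + list_idx) % k] for item in range(n_items)}
        for list_idx in range(k)
    ]


lists = latin_square_lists(N_ITEMS, CONDITIONS)

# Within a list, every condition occurs equally often (24 / 4 = 6 items each).
counts = {c: sum(1 for cond in lists[0].values() if cond == c) for c in CONDITIONS}
print(counts)

# Across lists, every item appears once in each condition.
print(sorted(lst[0] for lst in lists))
```

A participant assigned to one of these lists therefore sees each item exactly once, and the design is balanced across participants when the lists are used equally often.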

Predictions
The verb-final infallibility (VI) hypothesis predicts no difference in RTs between the conditions at the critical post-RC verb. Recall that under the parsing accounts on which the VI hypothesis is based, it is assumed that the parser is actively predicting and successfully maintaining this clause-final verb (Levy, 2008; Vasishth et al., 2010). Since all four conditions become ungrammatical at the critical post-RC region, the VI hypothesis predicts equal difficulty in all four conditions. Indeed, under a metric like surprisal (Hale, 2001; Levy, 2008), which quantifies the processing difficulty at a word as the negative log of its conditional probability in the sentential context, the probability of encountering the critical verb forms in the experimental items will be close to zero in every condition, so surprisal will be uniformly high. This follows from a critical assumption that surprisal makes: the probability mass is distributed over only grammatically licensed structures.
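The surprisal-based prediction can be made concrete with a small sketch. Surprisal is the negative log probability of a word given its context; the probability values below are hypothetical placeholders chosen only to show that equal near-zero probabilities yield equal, and very high, surprisal in all four conditions.

```python
import math


def surprisal(prob):
    """Surprisal in bits: -log2 P(word | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(prob)


# Hypothetical near-zero probability of the ungrammatical post-RC verb,
# assumed identical across conditions because only grammatical
# continuations receive probability mass.
NEAR_ZERO = 1e-6
conditions = {
    "Canonical/Copula": NEAR_ZERO,
    "Non-canonical/Copula": NEAR_ZERO,
    "Canonical/Transitive": NEAR_ZERO,
    "Non-canonical/Transitive": NEAR_ZERO,
}

values = {name: surprisal(p) for name, p in conditions.items()}
for name, bits in values.items():
    print(f"{name}: {bits:.2f} bits")

# Predicted difference between word orders under this assumption: none.
print(values["Canonical/Copula"] - values["Non-canonical/Copula"])
```

Under this assumption no effect of word order is expected at the critical verb, which is the pattern the VI hypothesis predicts.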
The other possibility is that the unexpected SVO order of the embedded relative clause in the Non-canonical conditions could make the parser susceptible to making errors. In such a situation, a locally coherent structure (Smith, 2018; Tabor et al., 2004) could be formed such that the RC internal NP Fem gets structurally integrated with the post-RC material. We would therefore expect to see a main effect of Word Order such that RTs at the (post-RC) matrix verb are shorter in the Non-canonical conditions than in the Canonical conditions. No such locally coherent structure is possible in the Canonical order. This pattern would be consistent with theories of sentence processing that allow for the building of grammatically illicit structures. The predictions discussed above were pre-registered on AsPredicted.com.
Also, since the SPR paradigm is known to have spillover effects (Kaiser, 2014), the predicted RT difference can also appear at the post-critical regions. Consequently, similar to the critical region, the post-critical regions were kept identical across conditions. In addition to the RTs, the accounts that permit grammatically illicit structures also make predictions regarding grammaticality ratings: it may be the case that comprehenders are susceptible to an illusion of grammaticality for the non-canonical conditions due to the availability of a locally coherent parse.

Results
For the data analyses in Experiment 1 and all subsequent experiments, the statistical analysis of the RT/acceptability data was done using linear mixed-effects models through the lme4 package (Bates et al., 2015) in the R statistical computing environment (R Core Team, 2013). Anova sum contrasts were used for fixed-effects factors in the experiments. p-values for fixed effects were computed using the pbkrtest package (Halekoh & Højsgaard, 2014). Maximal models were fit when possible (Barr et al., 2013); in case of convergence failure, a less complex model was fit by successively removing the random slopes of the by-subject and by-item random effects components. Raw RTs were log-transformed and, following Schütze and Sprouse (2014), acceptability ratings were scaled before fitting the lmer models.
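The preprocessing steps described above (log-transforming RTs, scaling ratings, and sum-coding the two factors) can be sketched as follows. The analysis itself was run in R; this Python sketch only illustrates the transformations. The ±1 coding, the per-participant z-scoring of ratings, and all field names are assumptions made for the example.

```python
import math
import statistics
from collections import defaultdict

# Assumed sum-contrast coding (+1 / -1) for the two factors.
WORD_ORDER_CODE = {"Canonical": 1, "Non-canonical": -1}
CLAUSE_TYPE_CODE = {"Copula": 1, "Transitive": -1}


def preprocess(trials):
    """Log-transform RTs, z-score ratings within participant, sum-code factors.

    Each participant needs at least two ratings that are not all identical,
    otherwise the standard deviation is undefined or zero.
    """
    ratings = defaultdict(list)
    for t in trials:
        ratings[t["participant"]].append(t["rating"])
    stats = {p: (statistics.mean(r), statistics.stdev(r)) for p, r in ratings.items()}

    rows = []
    for t in trials:
        mean, sd = stats[t["participant"]]
        wo = WORD_ORDER_CODE[t["word_order"]]
        ct = CLAUSE_TYPE_CODE[t["clause_type"]]
        rows.append({
            "participant": t["participant"],
            "item": t["item"],
            "log_rt": math.log(t["rt_ms"]),
            "rating_z": (t["rating"] - mean) / sd,
            "word_order": wo,
            "clause_type": ct,
            "interaction": wo * ct,
        })
    return rows


example_trials = [
    {"participant": "p1", "item": 1, "word_order": "Canonical",
     "clause_type": "Copula", "rt_ms": 500, "rating": 1},
    {"participant": "p1", "item": 2, "word_order": "Non-canonical",
     "clause_type": "Transitive", "rt_ms": 1000, "rating": 4},
    {"participant": "p1", "item": 3, "word_order": "Canonical",
     "clause_type": "Transitive", "rt_ms": 700, "rating": 7},
]
rows = preprocess(example_trials)
print(rows[0])
```

With sum coding, the coefficient for each factor reflects a main effect estimated across the levels of the other factor rather than a simple effect at a baseline level.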
In addition, participants who rated ungrammatical fillers higher than 3.7 were removed from the analysis. The threshold of 3.7 was chosen as it was the value of the 3rd quartile of all ungrammatical ratings in this experiment. This step led to the removal of data from 14 participants.
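One way such an exclusion criterion could be implemented is sketched below. It assumes that each participant is summarised by their mean rating of the ungrammatical fillers and that the third quartile is taken over those means; both are assumptions for illustration, as are the function name and the toy data.

```python
import statistics


def excluded_participants(filler_ratings, threshold=None):
    """Return (excluded participant ids, threshold used).

    filler_ratings maps a participant id to that participant's ratings of the
    ungrammatical fillers. If no threshold is given, the third quartile of the
    per-participant means is used (an assumption made for this sketch).
    """
    means = {p: statistics.mean(r) for p, r in filler_ratings.items()}
    if threshold is None:
        # method="inclusive" treats the data as the full population of values.
        threshold = statistics.quantiles(means.values(), n=4, method="inclusive")[2]
    excluded = sorted(p for p, m in means.items() if m > threshold)
    return excluded, threshold


# Toy data: five hypothetical participants.
toy = {"p1": [1, 2], "p2": [2, 2], "p3": [3, 3], "p4": [4, 4], "p5": [6, 6]}
print(excluded_participants(toy))                 # data-driven threshold
print(excluded_participants(toy, threshold=3.7))  # fixed threshold, as in the text
```

Passing the threshold explicitly keeps the criterion fixed once it has been chosen, so the same cut-off can be applied unchanged in a re-analysis.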
The average ratings for the experimental conditions were found to be significantly higher than the ratings for the ungrammatical filler sentences (t = −12.32). The mean rating for the experimental items was 3.7 (Min = 1, 1st Quartile = 2, Median = 4, 3rd Quartile = 5, Max = 7) while the mean rating for the ungrammatical fillers was 2.4 (Min = 1, 1st Quartile = 1, Median = 2, 3rd Quartile = 3, Max = 7). The analysis of the acceptability ratings found a significant effect of Word Order (p<0.05) such that the ratings for the non-canonical conditions were lower than the ratings for the canonical conditions. These results can be seen in Tables 1 and 2. These tables also include the acceptability rating results of Experiments 4 and 5, which follow the same experimental methods as this experiment.
RTs at the critical post-RC matrix verb region showed a significant main effect of Clause Type (p = 0.5) such that RTs in the Transitive conditions were higher than the RTs in the Copula conditions. At the critical region, the interaction was not significant (p>0.05). In a nested contrast within the Transitive conditions, the Non-canonical Transitive condition was read faster than the Canonical Transitive condition (p = 0.05). No difference was found in the Copula conditions (p>0.05). RTs at the post-critical region showed a significant effect of Word Order (p<0.05) such that the Non-canonical conditions were read faster than the Canonical conditions. Table 3 shows the results for the critical and the post-critical regions for this experiment and for Experiments 2, 4 and 5, which also included RT measures. The RTs at the critical and the post-critical regions are shown in Figure 2. Figure 3 shows the RTs for all the regions across all the experimental conditions. We also note that retaining the participants who met the exclusion criterion mentioned above did not change the salient findings of the presented analyses.

Discussion
The RT results suggest that Hindi native speakers are entertaining a locally coherent parse when the RC internal NP Fem object appears in a less frequent postverbal position and the post-RC matrix verb incorrectly agrees with the RC object. This suggests that Hindi native speakers are susceptible to local coherence in a clause-final structure. Additionally, the acceptability ratings show that the non-canonical order conditions are dispreferred compared to the canonical order conditions. The formation of a locally coherent parse goes against the VI hypothesis, which predicts no parsing error in a clause-final environment even in the face of clausal embeddings. In particular, these results cannot be explained by the surprisal metric, which would predict no difference between the various ungrammatical conditions. On the other hand, the RT results are compatible with various processing accounts that predict creation of illicit structures during parsing. In particular, the results could be explained by the SOPARSE account, the lossy surprisal account or the good-enough processing account of sentence comprehension. In the subsequent sections we investigate which of these best explains the observed local coherence effect.
One concern regarding the RT results could be that the participants are employing a non-standard reading pattern given the ungrammaticality of the sentences. However, this seems unlikely for several reasons. Firstly, in addition to the facilitation at the matrix clause-final verb in the non-canonical conditions, we also observe a slowdown at the RC verb compared to the RC verb in the canonical conditions (p = 0.001). The slowdown reflects a dashed expectation: the (canonical) preverbal object was not encountered before the RC verb in such conditions. This slowdown was also observed in the original Husain et al. (2014) study, which tested grammatical sentences. If the participants were not engaging in normal processing, we would not have observed such a processing pattern. Secondly, the acceptability ratings suggest that participants are treating the critical items distinctly from the ungrammatical fillers, and engaging in normal processing for the former: the critical experimental items (although ungrammatical) were consistently rated higher than other ungrammatical fillers, with middle-of-the-scale ratings for the critical items and bottom-of-the-scale ratings for ungrammatical fillers. In order to ensure that our data come from participants who adopt a normal processing strategy for the critical experimental items, while also controlling for the impact of non-normal processing strategies for ungrammatical sentences on our conclusions, data from participants who did not rate the ungrammatical fillers low were not included in the final analysis. Finally, our interpretation that participants are forming locally coherent parses in the non-canonical conditions is supported by the question response pattern discussed in Experiment 2, where grammatical sentences were also included as part of the experimental manipulation.
Interestingly, the acceptability ratings results for the critical items show a different pattern than the RT data: Canonical conditions were rated higher than the Non-canonical conditions. This seems to go against the SOPARSE account (as well as other accounts such as the good-enough and the lossy surprisal models) which would predict higher ratings for the locally coherent (non-canonical word order) structures. One way to reconcile the RT pattern and the ratings pattern would be to assume the parser forms a locally coherent parse temporarily before realising that it had made a mistake, which may manifest as an RT difference in later regions of the sentence. However, a post-hoc analysis of the RTs in the post-critical regions showed no evidence for an effect of word order (p = 0.77).
An alternative explanation could be that the RT data and the ratings data index different underlying processes on account of the former being an online measure, thereby measuring a localised behavioural response and the latter being more of an offline measure, measuring an overall behavioural response connected to the judgment task, as it is an end of sentence measure. We return to further discussion of this possibility in the General Discussion.
We next turn to an experiment where the syntactic dependencies that need to be resolved are probed using comprehension questions in order to investigate this connection (between offline and online measures) and the nature of representations formed better. A better picture of the built representations can help us understand the mechanisms that could lead to the formation of such representations.

Experiment 2
The previous experiment clearly pointed to the formation of a locally coherent parse in a matrix clause final environment. The aim of the current experiment was to probe the nature of the syntactic representation during the formation of such locally coherent structures. In particular, we investigate whether Hindi speakers are able to successfully form the various structural dependencies in the sentences presented in the previous experiment by utilising targeted comprehension questions. We presented Hindi speakers with copular verb sentences in a self-paced reading paradigm which was followed by comprehension questions targeting the integration of NPs in the sentences. Our primary focus was the integration of the RC internal object NP: we tested a licit integration (RC object NP with RC verb) and an illicit integration (RC object NP with matrix verb) to evaluate whether faulty NP integrations underlie the observed local coherence effects, but we also tested the integration of the head NP.
Further, in addition to ungrammatical copular sentences, we also tested grammatical copular sentences to establish a comparative baseline for local coherence effects.

Material and methods
Participants
46 native speakers of Hindi participated in this experiment. While the planned number of participants was 70, data collection had to be stopped partway due to COVID-related disruptions.

Items
We used a 2 × 2 within-subjects repeated measures design crossing RC internal Word order (Canonical/Non-canonical) and Grammaticality (Grammatical/Ungrammatical). Canonical order is SOV (10a, 10c) while Non-canonical order is SVO (10b, 10d). Grammaticality is manipulated via the RC head NP and the relationship between this NP and the agreement morphology on the post-RC verb: distinct NPs with different gender specifications were used in the grammatical and ungrammatical conditions. In 10, the ungrammatical conditions have a masculine NP as the head NP (vah laṛkaa "That boy Masc ") while the grammatical conditions have a feminine NP as the head NP (vah raajkumaarii "That princess Fem "). The post-RC verb bears the same agreement morphology across all conditions of an item; in example 10, this is feminine agreement. In the ungrammatical conditions (10a, 10b), this feminine agreement morphology on the post-RC verb does not correspond to the features of the grammatical target of agreement, the RC head NP (vah laṛkaa "That boy Masc "), but rather to the features of the RC internal object (kitaab "book Fem "), as in Experiment 1. In the grammatical conditions (10c, 10d), the feminine agreement morphology on the post-RC verb correctly matches the RC head NP (vah raajkumaarii "That princess Fem "). The words that can form a locally coherent parse are underlined in the examples. In the non-canonical Ungrammatical and Grammatical conditions (10b and 10d, respectively), the RC internal object (kitaab "book Fem ") could be incorrectly treated as the subject of the copula verb thii "was Fem ", thus forming a locally coherent parse (kitaab shaayad moṭii thii "The book was perhaps thick"). 24 experimental items in 4 conditions were prepared. Every experimental trial was followed by a Yes-No comprehension question. 88 filler sentences were also included. The filler sentences included both monoclausal and biclausal sentences in canonical and non-canonical word orders.
The complete list of experimental items has been uploaded to the OSF repository associated with this paper. A Latin-square design was used to present different items. Overall, comprehension questions were included for 70% of trials.
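As an illustration of how a Latin-square assignment of 24 items to 4 conditions can work, the following minimal sketch rotates conditions across presentation lists. The number of lists (four), the condition labels and the rotation rule are assumptions made for this example; the text above states only that a Latin-square design was used.

```python
from collections import Counter

# Assumed labels for the four conditions of example 10 (hypothetical naming).
CONDITIONS = ["a", "b", "c", "d"]
N_ITEMS = 24


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by rotating the
    item-to-condition mapping, so that every list contains each item
    exactly once and each item appears in a different condition
    in every list."""
    k = len(conditions)
    lists = []
    for list_idx in range(k):
        assignment = {
            item: conditions[(item + list_idx) % k]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = latin_square_lists(N_ITEMS, CONDITIONS)

# Number of items per condition within each list.
per_list_counts = [Counter(lst.values()) for lst in lists]
```

Under these assumptions each participant would be assigned one list, seeing every item once and an equal number of items per condition.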

Comprehension questions
The comprehension questions for the critical experimental items sought to evaluate the success of various NP integrations in the sentences. The aim was to understand the representations formed during the parsing process. Question type was manipulated as a between-items factor to give three types of question-item pairs probing distinct NP integrations.
The first two subsets of questions are concerned with the integration of the RC internal NP and are our primary focus. Of these two subsets, one subset tested the integration of the RC internal object NP with the RC internal material. This is a grammatically licit dependency since both these elements are part of the relative clause. To illustrate, if the sample item in 10 was part of the subset probing RC internal NP-RC internal material integration, then the corresponding question would be as in example 11. 8 out of 24 items in the experiment had this kind of comprehension question.
(11) RC internal NP-RC internal integration
kyaa kitaab kal paṛhii gayii thii?
Q book yesterday read go was?
"Was the book read yesterday?"
The second subset tested the integration of the RC internal NP with the post-RC material. Since the RC internal object NP and the post RC material are from distinct clauses and the RC internal object NP is not an argument of the post RC verb, this is a grammatically illicit dependency. If the sample item in 10 were to be part of the subset probing RC internal NP-post RC integration, then the question would be as in example 12. 8 out of 24 items in the experiment had this kind of comprehension question.
In addition, we also had a third subset of questions which probed the integration of the head NP with distinct parts of the structure. Distinct questions had to be used in the grammatical and ungrammatical conditions due to the non-identical head NP across these conditions. In the grammatical conditions, the questions probed the integration of the head NP with the post-RC material. Since both the head NP and the post RC material are part of the matrix clause structure, this integration is licit. However, in the ungrammatical conditions we probed the integration of the head NP with the RC internal verb since the integration with the post RC verb is illicit here. If the sample item in 10 were to be part of the subset probing head NP-RC integration/post RC integration, then the questions would be as in example 13.
(13) head NP integrations
a. kyaa laṛke ne kuchh paṛhaa thaa?
Q boy erg something read was
"Did the boy read something?"
b. kyaa raajkumaarii moṭii thii?
Q princess Fem fat was
"Was the princess fat?"
Note that while we illustrate the various question types for a single item here (10), in the experiment any one item corresponded to only one question type subset for all participants.

Procedure
We followed the same procedure discussed in Section 3. However, instead of ratings, the SPR reading was followed by yes-no comprehension questions. The experiment was conducted using Douglas Rohde's Linger software (version 2.94).

Predictions
We expected to replicate the reading time pattern observed in Experiment 1, wherein Word Order has an impact on RTs. If the RC internal NP gets integrated with the post-RC material, then the RTs for the Non-canonical condition should be lower than for the Canonical condition, as in the previous experiment.
Turning to question comprehension accuracy, the primary predictions are regarding the integration of the RC internal NP. If Hindi speakers are successfully integrating the RC internal object NP with the RC internal material, then they should have high accuracy on questions probing this dependency. Furthermore, since the RC internal object NP and the post RC material are from distinct clauses, Hindi speakers ought not to integrate these elements. If Hindi speakers do not attempt such an integration, then they should exhibit high accuracy on questions targeting this dependency. However, if Hindi speakers erroneously attempt to integrate the RC internal NP with the post RC material then they are expected to exhibit low accuracy on questions targeting this dependency.
In addition, a more general point can be made regarding the integration of the head NP: if Hindi speakers are able to make a successful prediction about upcoming matrix material at the head NP and maintain that prediction, then the integration of the head NP with the matrix material ought to proceed smoothly. However, if they fail to maintain that prediction, they should subsequently have difficulty in answering comprehension questions about this integration, which would manifest as low accuracy in the grammatical conditions of the head NP-integration subset.

Results
We follow the same statistical analysis procedures as discussed in Experiment 1. Data from participants with less than 70% accuracy across all the trials (including fillers) were excluded from analysis. This led to the removal of one participant's data.
RTs at the critical region showed a significant effect of Grammaticality (p < 0.05) such that RTs in the Grammatical conditions were faster than in the Ungrammatical conditions. The effect of Word Order was significant (p = 0.05) in the post-critical2 region: Non-canonical conditions were read faster than the Canonical conditions, which is consistent with the effect of Word Order observed in the previous experiment. Table 3 shows the RT results for the critical region and the post-critical2 region. RTs at the critical and the post-critical2 regions are shown in Figure 4. Figure 5 shows the RTs for all the regions across all the experimental conditions.
Comprehension accuracy results are given in Table 4. The results suggest that Hindi speakers are fairly accurate at answering questions probing the integration of the RC internal NP with the RC internal material. At the same time, they struggle with questions probing the integration of the RC internal NP with the post RC material.
The accuracy data for the two question types probing the RC internal NP were analysed using a generalised linear mixed-effects model (see Table 5). There is a significant main effect of Question Type (p < 0.001). This shows that Hindi speakers are quite susceptible to the incorrect integration of the RC internal NP with post-RC material.
In addition, we separately evaluated accuracy on comprehension questions targeting the integration of the head NP with post-RC matrix material in the grammatical conditions. Hindi speakers appear to struggle with this integration: speakers' accuracy is close to chance levels here (53%). Finally, we note that not removing the participants based on the exclusion criterion mentioned above did not change the salient findings of the presented statistical analyses.

Discussion
The reading time results in this experiment are consistent with the key result of a speed-up in the non-canonical word order observed in the previous experiment, although the effect is not significant in the present experiment.
The comprehension question accuracy results help us better understand the representations formed during the parsing of the items in this (and the previous) experiment. The results show that Hindi speakers are forming faulty integrations during structure building in sentences with centre embedded relative clauses with an SVO word order. Accuracy on questions targeting the RC internal NP suggests that licit integrations and illicit integrations coexist: Hindi speakers integrate the RC object NP with RC internal material and they also integrate the RC object NP with post RC material. Alongside this susceptibility to incorrect integrations, we also observed that Hindi speakers struggle with the integration of the head NP with the post-RC material. The near chance performance on questions targeting this integration suggests that Hindi speakers failed to adequately maintain the predictions associated with this dependency. Figure 6(a) shows the syntactic dependencies that are established during the comprehension process. Note the formation of two distinct parses (shown in two boxes): the first parse is the grammatically licit but incomplete parse, the second parse is the grammatically illicit locally coherent parse. Based on the comprehension results, both these parses seem to coexist at the end of the parsing process. Figure 6(b) shows the dependencies that need to be formed in order to parse the sentence successfully.
The representation shown in Figure 6(a) can be formed by a parsing process assumed by theories that predict formation of illicit structures. For example, the self-organising parsers (SOPARSE) (Smith, 2018;Tabor et al., 2004) could explain the formation of such a structure. As stated earlier, SOPARSE assumes the formation of local structures that compete with non-local structures. In the current experiment, in the non-canonical structure, an infrequent "verb before object RC" structure is formed (parse1 in Figure 6(a)), while the post RC material leads to a more frequent "object before verb" structure (parse2 in Figure 6(a)). For instance, in a Hindi dependency treebank (Bhatt et al., 2009), there were 8620 instances of Object before Verb order, compared to a mere 33 instances of Verb before Object cases. This asymmetry in frequency between the word orders could lead the local structure to win over the non-local structure.
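To make the size of this word order asymmetry concrete, the short calculation below works only from the two treebank counts reported above; the variable names are ours and purely illustrative.

```python
# Counts reported above for the Hindi dependency treebank (Bhatt et al., 2009).
ov_count = 8620  # Object before Verb
vo_count = 33    # Verb before Object

total = ov_count + vo_count

# Proportion of Verb-before-Object cases among all counted cases.
vo_share = vo_count / total

# How many Object-before-Verb cases there are per Verb-before-Object case.
ov_to_vo_ratio = ov_count / vo_count

print(f"VO share: {vo_share:.2%}; OV per VO case: {ov_to_vo_ratio:.0f}")
```

On these counts, Verb before Object orders make up well under 1% of the relevant cases.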
The rarity of the Verb before Object order could also explain the effect using the lossy surprisal model. This model assumes that a noisy sentential context is used to make upcoming structural predictions. Under this model, the local coherence effect (that could lead to the representation shown in Figure 6(a)) can arise assuming that the head NP, the RC as well as the prior prediction of the matrix verb are deleted during the noisy contextual reconstruction step (see Figure 7(a)). However, such forgetting seems to be inconsistent with the question response pattern discussed earlier, which clearly shows that the structural dependencies related to the RC structure and the head NP are not completely lost at the end of the parsing process (cf. Figure 6(a)). Finally, the good-enough account can explain these results by assuming heuristic-based processing triggered by processing load. In particular, this processing load can be understood to be caused by (a) the embedded relative clause, and (b) the revision of the SOV structure at the RC verb in the non-canonical conditions. These factors are known to cause processing difficulty (e.g. Gibson, 1998; Levy, 2008). On this account, when the processing load becomes high, the heuristic-based processing strategy could override the algorithmic processing strategy (cf. Apurva & Husain, 2021; Karimi & Ferreira, 2016; von der Malsburg & Vasishth, 2013). The comprehension system will not necessarily create a grammatically licit structure when operating in a heuristic mode. This in turn could lead to the formation of a locally coherent structure that does not require integration with previously built structure. Consequently, the previously built structure involving the head noun and the RC could linger in memory alongside the locally coherent structure (see Figure 6(a)).
The switch from an algorithmic mode (which is grammar driven) to a heuristic mode, due to the resource intensive construction of the RC in the non-canonical condition, could also be understood as "forgetting" of the predicted matrix verb (see Figure 7(b)). In other words, the comprehension system forgets to anticipate the grammatically licensed clause final matrix verb structure when the processing load is high. Indeed, in the current experiment as well as in the previous experiment, combined RTs at the RC verb and RC object regions show a clear effect of word order (Experiment 2: p < 0.001; Experiment 1: p = 0.001) such that the canonical conditions were read faster than the non-canonical conditions (see Figures 3 and 5). As mentioned earlier, this increase in RTs in the non-canonical conditions can be explained as a revision cost due to a dashed expectation of the SOV order in the RC structure (Husain et al., 2014) and could lead to processing load.
We also observe a main effect of grammaticality in the reading times at the critical region: Hindi speakers are faster at reading grammatical sentences than ungrammatical sentences. The existence of this grammaticality effect in RTs alongside a susceptibility to building locally coherent parses (as suggested by the comprehension question accuracies) may appear puzzling at first glance. However, this could be understood in terms of the ambiguity advantage effect (Logačev & Vasishth, 2016; Swets et al., 2008; Van Gompel et al., 2000). In the Grammatical conditions, the string associated with the grammatically licit global parse (head NP integration with post RC verb) and the sub-string associated with the grammatically illicit local parse (RC-NP integration with post RC verb) are well-formed in and of themselves. In the absence of any disambiguating information at the post-RC verb, speakers will fail to detect the licitness of the global parse and the illicitness of the local parse. Not having to commit to the (licitness of the) global parse or to the (illicitness of the) local parse could translate to fast reading times in the grammatical conditions. In contrast, in ungrammatical sentences, compared to the global (illicit) parse, the local parse remains more viable.
In sum, this experiment shows that Hindi speakers are struggling with the various NP integrations in sentences with centre-embedded relative clauses. The results suggest that the parser does form an illicit structure that is not temporary. The evidence for the formation of a locally coherent structure from the present set of experiments is compatible with parsing accounts wherein the comprehension system can generate grammatically illicit parses. So far we have not investigated whether one of these explanations fits the data better. We noted that the explanation of the lossy surprisal model for the observed effect is not consistent with the question response pattern. In order to investigate whether the lossy surprisal model continues to be a viable explanation of the locally coherent structures attested in the current and the previous experiments, we next discuss a sentence completion study.

Experiment 3
The goal of the current experiment was to investigate the role of prediction during the formation of the locally coherent parse discussed in the previous two experiments. In particular, it is unclear from the results whether the faster RTs in the non-canonical condition reflect active prediction of a locally coherent parse on the part of the parser, or whether the parser forms a locally coherent parse bottom-up using the input encountered in the post-RC region. In order to investigate this, we conducted a sentence completion study.

Participants
36 Hindi native speakers participated in the experiment, but data from one participant had to be excluded due to no responses on nearly all trials.

Items
We used a 1 × 2 design manipulating RC internal WORD ORDER (Canonical/Non-canonical) with the sentence preambles truncated at the RC internal verb in both cases. In the Canonical conditions the RC internal order was SOV, but in the Non-canonical conditions, no object was present and the order was SV.
Participants were shown 24 items following example 14. Items were presented using a Latin square design. In addition, 52 filler items were also shown. The filler items had a variety of preamble constructions involving multiple nominals with different case-marker combinations.
(14) a. Canonical
vah laṛkaa/ jisne/ kal/ bohot dilchaspii se/ kitaab/ paṛhii thii …
that boy Masc who Erg yesterday lots interest with book Fem read Fem had Fem …
"The boy who had read the book with a lot of interest yesterday …"
b. Non-Canonical
vah laṛkaa/ jisne/ kal/ bohot dilchaspii se/ paṛhii thii …
that boy Masc who Erg yesterday lots interest with read Fem had Fem …
"The boy who had read with a lot of interest yesterday …"

Procedure
The sentence completion task was employed as the experimental paradigm (Taylor, 1953). Participants were provided an incomplete sentence and their task was to complete it such that it was meaningful. Each sentence fragment was presented using the centred self-paced reading (SPR) paradigm. Initially, a "+" sign appeared at the centre of the computer screen. When the participant pressed the space-bar key, this "+" sign was replaced with the first word of the sentence. Successive button presses displayed the remaining words of the sentence at the centre of the screen. A ". . ." symbol signalled the end of the sentence fragment and prompted the participant to complete the sentence. This was done in a text box that appeared on pressing the space-bar after the ". . ." symbol. After typing in the text, the participants pressed the "enter" key to move to the next trial. The experiment was conducted using Douglas Rohde's Linger software (version 2.94). Items were automatically randomised by Linger.
It has been argued that cloze probabilities from the completion task provide the most comprehensive measure to quantify predictability during comprehension (Staub, 2015). Indeed, numerous studies employing the sentence completion task have shown that completion patterns correlate strongly with reading time patterns found during online sentence comprehension (e.g. Husain et al., 2014;Jäger et al., 2015;Levy & Keller, 2013;Rayner et al., 2011).

Predictions
A top-down account assumes that a locally coherent parse is predicted at the post-RC object, while a bottom-up account assumes that a locally coherent parse is formed when the necessary strings are made available in the input. In particular, a bottom-up account of local coherence is compatible with SOPARSE and good-enough processing mechanisms. On the other hand, a top-down account would be more compatible with the lossy surprisal model which is based on the noisy-channel processing framework.

Response coding
The completion data were coded for four variables: (a) whether the completion was grammatical (1 for yes, 0 otherwise), (b) whether the direct object of the RC verb was posited (1 for yes, 0 otherwise), (c) whether a resumptive pronoun referencing the head NP was used in the completion (1 for yes, 0 otherwise), and (d) whether the completion was locally coherent with the RC object (1 for yes, 0 otherwise).
Sentence 15a shows an instance where a resumptive pronoun (vah "that") is used during the completion. The use of resumptive pronouns in this context can be understood to constitute a successful attempt at reactivating the head NP (cf. Keshtiari & Vasishth, 2012), especially when the continuation is otherwise well-formed vis-à-vis the integration of the head NP and post RC-verb. Sentence 15b shows an instance of a locally coherent parse. "The man who the day before yesterday had slowly constructed the wall, it collapsed today".
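The four binary codes described in this section lend themselves to a simple tabulation of percentages per condition. The sketch below is illustrative only: the record type, the field names and the four example rows are hypothetical and are not taken from the experimental data.

```python
from dataclasses import dataclass


@dataclass
class CodedCompletion:
    """One coded completion; each code is 1 for yes and 0 otherwise."""
    condition: str           # "Canonical" or "Non-canonical"
    grammatical: int         # (a) completion is grammatical
    object_posited: int      # (b) direct object of the RC verb posited
    resumptive_pronoun: int  # (c) resumptive pronoun for the head NP used
    locally_coherent: int    # (d) locally coherent with the RC object


def percentage_by_condition(rows, variable):
    """Percentage of completions coded 1 on `variable`, per condition."""
    totals, hits = {}, {}
    for row in rows:
        totals[row.condition] = totals.get(row.condition, 0) + 1
        hits[row.condition] = hits.get(row.condition, 0) + getattr(row, variable)
    return {cond: 100.0 * hits[cond] / totals[cond] for cond in totals}


# Hypothetical coded completions, for illustration only.
rows = [
    CodedCompletion("Canonical", 1, 0, 1, 0),
    CodedCompletion("Canonical", 1, 0, 0, 0),
    CodedCompletion("Non-canonical", 0, 1, 0, 0),
    CodedCompletion("Non-canonical", 1, 1, 0, 1),
]

grammatical_pct = percentage_by_condition(rows, "grammatical")
```

The same function can be called with any of the other three field names to obtain the corresponding percentages.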

Results
The statistical analysis of the completion data was carried out using a generalised linear mixed-effects model with a logit link function, fit with the lme4 package (Bates et al., 2015) in R. Maximal models were fit when possible (Barr et al., 2013); in case of convergence failure, a less complex model was fit by successively removing the random slopes of the by-subject and by-item random effects component.
The percentage completion for the four variables is shown in Table 6. This table also presents the percentage completion results for Experiment 6 which also employs the same methodology. A glmer analysis showed that the Canonical condition had significantly higher grammatical completions than the Non-canonical condition (p < 0.0001). In addition, compared to the Non-canonical condition, the number of resumptive pronouns was higher in the Canonical condition (p < 0.0001). The number of locally coherent completions did not differ between the two conditions (p = 0.57). In addition, note that the RC object was posited frequently in the Non-canonical condition. The glmer analysis for this experiment alongside that of Experiment 6 is provided in Table 7.

Discussion
The results show that the total percentage of locally coherent completions is very low. Importantly, such completions do not differ between the Canonical and the Non-canonical conditions. This suggests that the locally coherent parse observed in the previous experiments is not constructed in a top-down fashion. These results are not consistent with the lossy surprisal model but can be explained by accounts where a locally coherent parse is constructed in a bottom-up fashion.
The completion results are consistent with the pattern of comprehension accuracy found in the previous experiment. While the completions relating to the relative clause structure are fine (the direct object of the RC verb in the Non-canonical condition is predicted frequently), we find that the parser is vulnerable to making errors in the post-RC region, especially in the Non-canonical condition. This is manifested by both a high number of ungrammatical completions and a lower use of resumptive pronouns in the Non-canonical condition. A higher number of resumptive pronouns in the Canonical condition suggests that, compared to the Non-canonical condition, the head NP was more likely to be retrieved successfully in the Canonical condition. As discussed earlier, the pattern of question responses in Experiment 2 also goes against the lossy surprisal model. While the current result supports bottom-up processing of local coherence, the issue of which of the two bottom-up accounts, namely SOPARSE and good-enough processing, is the best explanation for the current set of data needs further investigation. We attempt to do this in the next set of experiments.

Interim summary
So far we have observed that a local coherence effect can arise in an SOV language like Hindi in configurations involving clause final structure. We found evidence that the parsing process leads to the formation of two distinct structures, (a) an incomplete grammatically licit parse involving the head NP and the RC, and (b) an illicit parse involving the RC object and the post RC material. Both these structures were found to be active. The formation of a locally coherent parse and the associated representation goes against the VI hypothesis that assumes a robust parsing mechanism for clause final structures in SOV languages. Further, unlike the English data (Kamide & Kukona, 2018), we found that such locally coherent parses are built in a bottom-up fashion. This data is compatible with both the SOPARSE and good-enough accounts of processing but it goes against the lossy surprisal model.
The previous set of reported experiments were able to address the first three questions raised in the Introduction section relating to the existence of local coherence effects in SOV languages, the representations formed when locally coherent structures are created as well as the processing mechanisms behind the generation of such structures. We now move on to the fourth and final question that we had raised earlier: what are the possible factors due to which such an effect may arise? In particular, could there be certain linguistic features in the local environment (within the RC and in the post-RC region) that cause the locally coherent structure to arise? In the three experiments discussed in the following sections we explore two such features which may be important for establishing semantic and syntactic relationships in the structure: (a) semantic plausibility and (b) verbal agreement. Investigation of these factors helps us explore whether one of the two bottom-up approaches explains local coherence in the current work better.

Experiment 4
The goal of the current experiment was to investigate if certain semantic factors in the local linguistic environment are necessary for the local coherence observed in Experiment 1 to arise. Words in a structure are in a semantic relationship with one another and plausibility is a key device to encode the well-formedness of these semantic relations. In this experiment, we evaluate the contribution of semantic plausibility and ask if local coherence arises only when the RC internal object is a plausible argument of the post-RC verb.

Material and methods
Participants
63 native speakers of Hindi participated in this experiment at IIT Delhi. Please note that although the number of participants in this experiment and the next experiment is the same, these experiments were conducted separately.

Items
We used a 2 × 2 design crossing RC internal WORD ORDER (Canonical/Non-canonical) and PLAUSIBILITY (Plausible/ Implausible). All experimental conditions are ungrammatical as in Experiment 1. As before, Canonical order is SOV, and Non-canonical SVO. In the Plausible conditions, the RC internal NP-object is a semantically plausible argument of the post-RC verb. Thus, the text kitaab mujhe bechnii parhii "I had to sell the book" in the Non-canonical, Plausible condition could lead to the formation of a locally coherent parse. On the other hand, in the Non-canonical, Implausible conditions, the NP-object is not a plausible argument of the post-RC verb, thus formation of a locally coherent parse should be difficult for the text aankh mujhe bechnii parhii "I had to sell the eye". 24 experimental items in 4 conditions were prepared. 56 filler sentences with varied levels of acceptability were also included.

Procedure
We followed the same procedure discussed in Section 3.

Predictions
The predictions of the VI hypothesis for the items in the current experiment are similar to those mentioned in Section 3. With respect to the bottom-up accounts of local coherence, both the SOPARSE account and the good-enough processing account would predict that semantic compatibility is a necessary condition for local coherence to arise. In the SOPARSE account, semantic match is a precondition for forming links between nodes of various treelets since the treelets are assumed to be vectors of semantic and syntactic features (see Villata et al., 2018, for more details). Similarly, the good-enough account assumes use of various heuristics during the comprehension process. According to this account, heuristics based on semantics and world knowledge can override the grammatical constraints during online structure building (Ferreira, 2003). So both these accounts predict semantic plausibility to be a prerequisite for local coherence to arise, as plausibility is a key device to encode semantic relations between words. If this is correct, then we would expect to observe a faster reading time in the Plausible Non-canonical condition compared to the Plausible Canonical condition. No difference in RT should be observed between the two Implausible conditions.

Results
We follow the same statistical analysis procedures as discussed in Experiment 1. In addition, participants who rated ungrammatical fillers more than 4.1 were removed from analysis. The threshold of 4.1 was chosen as it was the value for the 3rd quartile of all ungrammatical ratings. This step led to the removal of 16 participants.
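The exclusion step described above can be illustrated with a minimal sketch. It rests on assumptions that the text does not confirm: each participant is summarised by the mean of their ratings for ungrammatical fillers, and the threshold is the third quartile of those means. The function names and the toy ratings are hypothetical and are not the data of this experiment.

```python
from statistics import mean, quantiles


def participant_means(filler_ratings):
    """Mean ungrammatical-filler rating per participant (assumed summary)."""
    return {p: mean(rs) for p, rs in filler_ratings.items()}


def third_quartile(values):
    """Third quartile; quantiles(n=4) returns [Q1, Q2, Q3]."""
    return quantiles(values, n=4, method="inclusive")[2]


def participants_to_exclude(filler_ratings):
    """Return the threshold and the participants whose mean exceeds it."""
    means = participant_means(filler_ratings)
    threshold = third_quartile(list(means.values()))
    excluded = sorted(p for p, m in means.items() if m > threshold)
    return threshold, excluded


# Hypothetical toy ratings on a 1-7 scale (illustration only).
toy_ratings = {
    "p1": [1, 2, 1, 2],
    "p2": [2, 2, 2, 2],
    "p3": [2, 3, 3, 2],
    "p4": [4, 4, 4, 4],
    "p5": [6, 7, 5, 6],
}
threshold, excluded = participants_to_exclude(toy_ratings)
print(threshold, excluded)
```

Under this reading, roughly a quarter of the participants fall above the threshold by construction; a threshold computed over raw ratings rather than participant means would behave differently.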
The average rating for the experimental conditions was found to be numerically higher than the rating for the ungrammatical filler sentences, but the effect was not significant (t = −1.8). The mean rating for the experimental items was 3.1 (Min = 1, 1st Quartile = 1, Median = 2, 3rd Quartile = 5, Max = 7) while the mean rating for the ungrammatical fillers was 2.9 (Min = 1, 1st Quartile = 1, Median = 2, 3rd Quartile = 5, Max = 7). The analysis of the acceptability ratings found a significant effect of Word Order (p = 0.01) such that the ratings for the Non-canonical conditions were lower than the ratings for the Canonical conditions. A significant effect of Plausibility (p < 0.0001) was also observed such that the Plausible conditions were rated higher than the Implausible conditions. These figures can be seen in Tables 1 and 2. The ratings were further evaluated in a nested contrast: the difference between the ratings in the Plausible conditions was in the expected direction but non-significant (p = 0.06). The difference between the ratings in the Implausible conditions was also non-significant (p = 0.1).
RTs at the critical region showed a significant main effect of Word Order (p = 0.01) such that RTs in the Non-canonical conditions were faster than in the Canonical conditions. See Table 3 for the RT results. A nested contrast showed that this effect was driven by the Plausible conditions: the Plausible Non-canonical condition was read faster than the Plausible Canonical condition (p = 0.001), while no difference was found in the Implausible conditions (p = 0.42). See Table 3 for the results of the nested contrast. No significant effect was observed in the post-critical region. RTs at the critical and the post-critical regions are shown in Figure 2. Figure 8 shows the RTs for all the regions across all the experimental conditions. We also note that retaining the participants who met the exclusion criterion mentioned above did not change the salient findings of the presented analyses.
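A nested contrast of the kind reported above estimates the Word Order effect separately within each level of Plausibility. The sketch below shows one common way to code such predictors; the ±0.5 values, the function name and the column names are assumptions made for illustration and are not taken from the analysis scripts of this study.

```python
def nested_contrasts(word_order, plausibility):
    """Code one condition of the 2 x 2 design with Word Order nested in Plausibility.

    Hypothetical +/-0.5 coding: a positive coefficient on a Word Order
    column means a larger value in the Non-canonical condition.
    """
    wo = 0.5 if word_order == "Non-canonical" else -0.5
    plaus = 0.5 if plausibility == "Plausible" else -0.5
    return {
        "plausibility": plaus,
        # Word Order effect estimated only within the Plausible conditions
        "wo_within_plausible": wo if plausibility == "Plausible" else 0.0,
        # Word Order effect estimated only within the Implausible conditions
        "wo_within_implausible": wo if plausibility == "Implausible" else 0.0,
    }


# Build the coding for all four cells of the design.
design = {
    (wo, pl): nested_contrasts(wo, pl)
    for wo in ("Canonical", "Non-canonical")
    for pl in ("Plausible", "Implausible")
}
for cell, coding in design.items():
    print(cell, coding)
```

With this coding, a speed-up that is confined to the Plausible conditions would surface as an effect on the first Word Order column only.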

Discussion
The results from Experiment 4 show that the local coherence effect appeared in the non-canonical condition only when the post-RC object and the clause final matrix verb and its subject formed a plausible parse. It also showed that while the implausible conditions were rated lower than the plausible conditions, the two implausible conditions were not significantly different.
The results replicate the key finding of the speed-up in the non-canonical word order observed in Experiments 1 and 2. Further, this experiment shows that semantic plausibility is a key precondition for the local coherence effect to arise. This is consistent with the processes assumed by SOPARSE and good-enough processing accounts.
Given that accounts such as SOPARSE also assume a syntactic component to local coherence, especially in recent iterations such as Villata et al. (2018), it is important to evaluate if syntactic compatibility in the local context is important for local coherence effects to arise. Next, we report two experiments which evaluate this possibility by utilising the properties of verb agreement in Hindi.

Experiment 5
The goal of the current experiment was to investigate if certain syntactic factors in the local linguistic environment are necessary for the local coherence observed in Experiment 1 to arise. In particular, this experiment evaluated the contribution of verb agreement, which is a key device to encode syntactic relations between words in Hindi.
In order to probe the hypothesis that agreement constitutes a necessary condition for local coherence to arise in structures of the type tested in Experiment 1, we manipulated agreement morphology on the post-RC verb for transitive sentences. We compared sentences where this verb incorrectly bears overt agreement morphology indexing the features of the RC internal NP to sentences where the verb bears default agreement morphology (not indexing any NP's features) to evaluate if the presence or absence of overt agreement morphology impacts local coherence.

Material and methods
Participants
63 native speakers of Hindi participated in this experiment at IIT Delhi.

Items
We used a 2 × 2 design crossing RC internal WORD ORDER (Canonical/Non-canonical) and AGREEMENT-TYPE (Overt/Default). Canonical order is SOV, while non-canonical order is SVO. All four experimental conditions are ungrammatical as in Experiment 1: the post-RC verb is thematically incompatible with the RC head noun, NP Masc (vah larkaa "that boy Masc"). In the Overt Agreement conditions (17a, 17b), the post-RC verb incorrectly bears Feminine agreement matching the RC internal object NP Fem (kitaab "book Fem") rather than the NP Masc. The Overt Agreement conditions are identical to the transitive sentences in Experiment 1 (8c, 8d). In the Default Agreement conditions (17a, 17b), the post-RC verb bears default agreement morphology which does not index any NP's features. Note that in order to allow for the possibility of a locally coherent parse in the Default agreement conditions, the RC internal object NP bears an overt case-marker ko, which blocks agreement in the language and is consistent with the requirements for default agreement on the post-RC verb (see Section 2 for a brief account of agreement rules in Hindi). This manipulation in the Default agreement conditions does not alter the grammaticality status of the sentences in question: the thematic incompatibility between the head-NP and the post-RC verb that exists across all the conditions continues to render the sentences globally ungrammatical.
24 experimental items in 4 conditions were prepared. 56 filler sentences with varied levels of acceptability were also included. Some of the filler sentences were ungrammatical. During the data analysis phase, these ungrammatical fillers were used to filter out participants who were not paying attention. A Latin square design was used to present different items.
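A Latin square assignment of 24 items to 4 conditions can be sketched as follows. This is a generic rotation scheme, not the presentation script used in the experiment; the condition labels and the function name are hypothetical.

```python
# Hypothetical labels for the four cells of the 2 x 2 design.
CONDITIONS = [
    "Canonical-Overt",
    "Non-canonical-Overt",
    "Canonical-Default",
    "Non-canonical-Default",
]


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by rotating conditions over items.

    In list l, item i appears in condition (i + l) mod k, so every list shows
    each item once and every item appears in each condition across lists.
    """
    k = len(conditions)
    return [
        [(item, conditions[(item + lst) % k]) for item in range(n_items)]
        for lst in range(k)
    ]


lists = latin_square_lists(24, CONDITIONS)
for number, presentation_list in enumerate(lists, start=1):
    per_condition = {
        c: sum(1 for _, cond in presentation_list if cond == c) for c in CONDITIONS
    }
    print(number, per_condition)
```

Each participant would be assigned to one of the four lists, so that no participant sees the same item in more than one condition while each condition is still represented by six items per list.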

Procedure
We followed the same procedure discussed in Section 3.1.3.

Predictions
The predictions of the VI hypothesis will be similar to those mentioned in Experiment 1. The SOPARSE account predicts that the presence of overt agreement would make the local coherence effect stronger. This is because the account assumes each node in an activated treelet to be a vector of semantic and syntactic features. Increased match between nodes of different treelets leads to increased strength of link between the treelets (see Villata et al., 2018, for more details). Therefore, if agreement is a necessary condition to build a locally coherent parse, then we expect the local coherence effect on RTs to manifest differently across the Overt Agreement and Default Agreement conditions. The reading time difference between the Canonical and Non-canonical condition ought to be larger in the Overt Agreement conditions since the morphology on the post-RC verb overtly indexes the features of the RC internal object NP here but not in the Default Agreement conditions. On the other hand, the good-enough processing account does not require syntactic feature match to be a necessary condition for local coherence. This is because heuristic-based parsing does not subscribe to any grammatical constraints.

Results
We follow the same statistical analysis procedures as discussed in Experiment 1. In addition, participants who rated ungrammatical fillers more than 3.5 were removed from analysis. The threshold of 3.5 was chosen as it was the value for the 3rd quartile of all ungrammatical ratings. This step led to the removal of data from 17 participants in this experiment.
The average rating for the experimental conditions was found to be significantly higher than the rating for the ungrammatical filler sentences (t = −12.51). The mean rating for the experimental items was 3.6 (Min = 1, 1st Quartile = 2, Median = 4, 3rd Quartile = 5, Max = 7) while the mean rating for the ungrammatical fillers was 2.5 (Min = 1, 1st Quartile = 1, Median = 2, 3rd Quartile = 4, Max = 7). The analysis of the acceptability ratings found no effect of Word Order (p = 0.96); however, the ratings for the Non-canonical conditions were numerically lower than the ratings for the Canonical conditions. In addition, a significant effect was found for Agreement (p = 0.03) such that the Overt agreement conditions were rated higher than the Default agreement conditions. These figures can be seen in Tables 1 and 2.
RTs at the critical and post-critical regions showed a numerical trend for faster RTs in the Non-canonical conditions compared to the Canonical conditions; however, this difference was not statistically significant. The effects of Agreement and the interaction term were found to be non-significant. Table 3 shows the results for both the critical and the post-critical region. RTs at the critical and the post-critical regions are shown in Figure 2. Figure 9 shows the RTs for all the regions across all the experimental conditions.
We also note that retaining the participants who met the exclusion criterion mentioned above did change the results in the following ways. For the acceptability ratings, there was a significant effect of Word Order (p = 0.01), in addition to the main effect of Agreement (p = 0.001). For the RT results at the critical region, the effect of Word Order was not significant (p = 0.07), with the Non-canonical conditions being read numerically faster than the Canonical conditions. For this unfiltered data, the average rating for the experimental conditions was still found to be significantly higher than the rating for the ungrammatical filler sentences (t = −10.35).

Discussion
The RT results and the acceptability ratings results in the current experiment show similar numerical trends to Experiment 1 with regard to the Word Order manipulation. While the effect of Word Order was non-significant for both dependent variables, a main effect of Agreement was observed in the acceptability data.
The lack of a differential impact of overt agreement morphology versus default agreement morphology on the reading times is consistent with the possibility that overt agreement on the post-RC verb does not form a necessary condition for local coherence to arise. This would go against the prediction of the SOPARSE account, which treats syntactic features as critical for the local coherence effect to arise. Further, the lack of an effect of agreement is consistent with the good-enough account, wherein a heuristic-based mechanism does not use any syntactic information to parse the sentence.
A further possibility is that while overt agreement on the post-RC verb is not a necessary condition for local coherence, overt agreement elsewhere in the sentence, for example on the RC internal verb, impacts local coherence effects. In particular, it could be the case that in the Non-canonical condition, forming a locally coherent parse is easier when the RC internal verb exhibits default agreement compared to when the RC internal verb overtly agrees with the RC-object. This is because in the former, the association between the RC verb and its post-verbal object could be weak because of a lack of an agreement dependency. The next experiment investigates this possibility through a sentence completion task.

Experiment 6
This section presents a sentence completion study to investigate if agreement morphology on the RC-verb leads to differential error rates after the RC verb in the two non-canonical conditions. We evaluated participant performance in a sentence completion task involving Non-canonical sentence preambles truncated at the RC-verb with Agreement type manipulated at this verb. We hypothesise that Overt morphology could lead to a stronger association between the RC verb and the post-RC object and, thus, in such cases prediction of post-RC objects could be higher. In contrast, in the Default morphology condition, a weaker link between the RC-verb and the post-RC object is expected to result in reduced prediction of post-RC objects.
The linking hypothesis is that the more susceptible participants are to locally coherent parses after encountering the RC verb, the more likely they are to make completion errors, with a further corollary that the Default morphology condition should lead to more completion errors compared to the Overt morphology condition, if agreement morphology is contrastive for Hindi speakers here. If such a differential error rate is observed, this would suggest that the susceptibility to building a locally coherent parse rests on agreement morphology on the RC-verb. On the other hand, if similar amounts of errors are observed after seeing Overt agreement or Default agreement morphology on the RC-verb in Non-Canonical sentence preambles then that would amount to a failure to find evidence in support of agreement morphology at the RC-verb impacting downstream processing.
Material and methods

Participants
21 native speakers of Hindi participated in the experiment. The participants were undergraduate or graduate students at Jawaharlal Nehru University Delhi and IIT Delhi.

Items
We used a 1 × 2 design manipulating AGREEMENT-TYPE (Overt/Default) at the RC internal verb in sentences with non-canonical word order in the relative clause. Participants were shown 24 items following example 18. The items were presented using a Latin square design. In addition, 122 filler items were also included in the experiment.
(18) a. Non-Canonical, Overt
vah larkaa/ jisne/ bohot dilchaspii se/ parhii thii …
that boy Masc who Erg lots interest with read Fem had Fem …
b. Non-Canonical, Default
vah larkaa/ jisne/ bohot dilchaspii se/ parhaa thaa …
that boy Masc who Erg lots interest with read Def had Def …

Procedure
The sentence completion task was employed as the experimental paradigm in this experiment in a manner identical to that outlined in Section 5. The experiment was conducted using Douglas Rohde's Linger software (version 2.94). Items were automatically randomised by Linger.

Response coding
The completion data was coded for two variables (a) whether the direct object of the RC verb was posited (1 if yes, 0 otherwise), and (b) whether the completion was grammatical (1 for yes, 0 otherwise).
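The two binary codes described above can be aggregated into per-condition percentages of the kind reported for completion data. The sketch below assumes a hypothetical record layout (the field names `condition`, `object_posited` and `grammatical` are illustrative) and uses invented toy trials, not the responses collected in this experiment.

```python
from collections import defaultdict


def completion_percentages(trials):
    """Percentage of completions coded 1 on each variable, per condition."""
    counts = defaultdict(lambda: {"n": 0, "object": 0, "grammatical": 0})
    for trial in trials:
        cell = counts[trial["condition"]]
        cell["n"] += 1
        cell["object"] += trial["object_posited"]   # 1 if the RC object was posited
        cell["grammatical"] += trial["grammatical"]  # 1 if the completion was grammatical
    return {
        cond: {
            "object_pct": 100 * c["object"] / c["n"],
            "grammatical_pct": 100 * c["grammatical"] / c["n"],
        }
        for cond, c in counts.items()
    }


# Hypothetical coded completions (illustration only).
toy_trials = [
    {"condition": "Overt", "object_posited": 1, "grammatical": 1},
    {"condition": "Overt", "object_posited": 1, "grammatical": 0},
    {"condition": "Overt", "object_posited": 0, "grammatical": 0},
    {"condition": "Overt", "object_posited": 1, "grammatical": 1},
    {"condition": "Default", "object_posited": 1, "grammatical": 0},
    {"condition": "Default", "object_posited": 0, "grammatical": 0},
]
summary = completion_percentages(toy_trials)
print(summary)
```

Keeping the two codes separate matters because a completion can posit the RC object and still be ungrammatical as a whole, which is the pattern the linking hypothesis treats as a sign of a discarded matrix prediction.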

Predictions
As stated previously, the aim of the current completion study was to investigate if the susceptibility to form locally coherent structures is affected by the agreement morphology at the RC-verb.
Note that given the SV order in the relative clause, speakers ought to predict an object argument for the RC-verb, as in Experiment 3, and that this step would precede the building of any locally coherent parses. Object prediction may be directly affected by the agreement morphology on the RC-verb: overt agreement requires the positing of a case-unmarked NP with the relevant person/number/gender features. On the other hand, default agreement may not be as strongly predictive, because in the language default agreement can arise in a number of contexts: when argument NPs are case-marked or when there are no argument NPs present (e.g. in infinitive clauses). Therefore, completions with a valid RC-object are predicted to be higher in the Overt condition compared to the Default condition.
In order to form the locally coherent parse the prediction of the head NP-matrix verb structure has to be discarded by the parser. By hypothesis, increased completion errors in this experiment constitute a signal of the discarding of this prediction and can be used as a measure of the strength for a locally coherent parse to be formed in the two conditions. If the local coherence effect is impacted by agreement morphology at the RC-verb, we should expect a differential rate of ungrammatical completions in the Overt condition compared to the Default condition.

Results
The statistical analysis for the completion data was done using the generalised linear mixed-effects model with logit link function. This has been done using the lme4 package (Bates et al., 2015) in R. Maximal models were fit when possible (Barr et al., 2013); in case of convergence failure, a less complex model was fit by successively removing the random slopes of the by-subject and by-item random effects component.
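For concreteness, a maximal logistic mixed-effects model of the kind described above can be written as follows. This is a generic formulation under assumed notation, not a specification taken from the analysis: y_{ij} is the binary response of participant i on item j, x_{ij} is the Agreement-type predictor (the contrast coding is an assumption), and u and w are by-subject and by-item random effects.

```latex
\begin{align*}
\operatorname{logit} P(y_{ij} = 1)
  &= \beta_0 + \beta_1 x_{ij}
   + u_{0i} + u_{1i}\, x_{ij}
   + w_{0j} + w_{1j}\, x_{ij}, \\
(u_{0i}, u_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0}, \Sigma_u), \qquad
(w_{0j}, w_{1j})^{\top} \sim \mathcal{N}(\mathbf{0}, \Sigma_w).
\end{align*}
```

Simplifying the model after a convergence failure then amounts to dropping the slope terms u_{1i} or w_{1j} while keeping the corresponding intercepts.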
The percentage completion for the two response variables coded by us is shown in Table 6. Results show no difference in the completions either for the RC object or for grammatical completions (see Table 7).

Discussion
These results show that completions in sentences with overt agreement features on the RC verb do not differ from cases where there were default agreement features on the RC verb. Hindi speakers are just as likely to (a) posit an RC-object in the Overt agreement condition as in the Default agreement condition; and (b) grammatically complete the sentences in both the conditions. While there may be some generalised susceptibility to locally coherent parses in these sentences (as signalled by the relatively low proportion of grammatical completions across the board), overt agreement at the RC-verb does not modulate this effect.

General discussion
The results of the present set of experiments are summarised in Table 8. A key takeaway of the current work is that local coherence effects can be observed in an SOV language like Hindi. Importantly, through a series of online and offline experiments we found that such locally coherent parses can be formed in clause-final environments. Another important finding was that the observed locally coherent structure is formed bottom-up. These results are striking because processing in Hindi (and other SOV languages) has been argued to have robust mechanisms for parsing clause-final structures involving top-down predictive mechanisms. The results show that the parser can form illicit parses bottom-up in such syntactic environments.

Does processing load lead to local coherence?
Local coherence has typically been explained using the SOPARSE model (Smith, 2018; Tabor et al., 2004). In the current work we do find some support for this model; however, it is not unequivocal. The susceptibility to form a locally coherent parse increases when there is semantic compatibility between the RC object and the matrix verb. On the other hand, we found no evidence that local agreement features are necessary for the formation of locally coherent structures. Such an asymmetry between syntactic and semantic features goes counter to the predictions of the SOPARSE model.
The current results seem to be more compatible with a good-enough account wherein the "good-enough" parse is constructed when the processing load is high. In particular, at the RC-object region in the non-canonical condition, the combined cost of embedding and structural revision could strain the memory resources, leading the parser to resort to heuristic-based processing (cf. Apurva & Husain, 2021; Karimi & Ferreira, 2016; von der Malsburg & Vasishth, 2013). The good-enough processing mechanism has previously been invoked to explain the formation of such illicit parses due to increased memory load. In a recent study investigating anti-locality in Hindi, Apurva and Husain (2021) found that when syntactic complexity (in the form of clausal embeddings) increases, the parser resorts to creating shallow parses. They argued that the signature speed-up at the verb that is observed with increased distance between the verb and its prior argument (the "anti-locality effect") is not driven by robust prediction; rather, the speed-up signifies a shallow processing strategy which arises because of the increase in preverbal complexity. On similar lines, in the context of processing of garden-path sentences, von der Malsburg and Vasishth (2013) showed that low-capacity readers tend to underspecify an ambiguous attachment, which makes them face less difficulty than high-capacity readers. von der Malsburg and Vasishth (2013) come to a conclusion that is very similar to ours; they say that "... good-enough processing ... may consist of a dynamic adaptation to continually fluctuating processing constraints. Situations where the language system cannot afford an exhaustive analysis of the sentence and has to cut corners may even be the norm and not just an occasional deviation from 'normal' processing." Together these results suggest that the formation of locally coherent structures can be thought of as the parser adapting to its limited resources.
Finally, our findings could be used to motivate and test a more curtailed SOPARSE model with weighted vectors such that the model privileges semantic information over syntactic information during bottom-up structure building. We leave the further delineation of these possibilities to future work.
Understanding the mechanisms that subserve the formation of a locally coherent structure during online comprehension is critical because it highlights the interaction between top-down and bottom-up processing. The locally coherent parse in previous work that probed this interaction (e.g. Gibson, 2006) was mainly driven by lexical ambiguity (also see Bicknell & Levy, 2009). Similarly, previous demonstrations of local coherence in SOV languages also exploited lexical ambiguity (e.g. Konieczny et al., 2009; Paape & Vasishth, 2016). Given the proposed widespread role of top-down processing in SOV languages, a fuller understanding of this interaction is critical in uncovering the parsing mechanisms in such languages. To our knowledge, the current work is the first demonstration of local coherence in an SOV language involving argument structure, agreement operations, etc. (cf. Kamide & Kukona, 2018). Critically, unlike previous work, the effect here could be interpreted as being driven by forgetting of structural predictions that have previously been assumed to be quite robust in SOV languages. This has some interesting implications for processing of SOV languages, which we turn to next.

Implications for predictive processing in SOV languages
As discussed in the Introduction, it is generally understood that clause final verbal prediction is quite robust in SOV languages (e.g. Konieczny, 2000;Levy & Keller, 2013;Vasishth et al., 2010). On this account, the current results are quite striking. They show that when the processing load is high, the prediction mechanism in an SOV language can falter.
In the experiments reported, an embedded RC together with a cost of revision due to dashed expectation was assumed to lead to high processing load. Forgetting of the clause final verb was triggered by the combined cost of embedding (Gibson & Thomas, 1999; R. L. Lewis & Vasishth, 2005; Miller & Chomsky, 1963; Yngve, 1960) and structural revision (Hale, 2001; Levy, 2008). In the current set of experiments, speakers needed to parse various dependencies within the relative clause (e.g. argument structure, agreement) and resolve the connection between the relative clause and matrix clause. This was done while also making revisions to the RC structure (due to the word order variation) and retaining the details of the predicted matrix structure across the embedded RC. While an independent processing cost for these two factors (embedding and revision) has been attested previously, it seems that together they lead to a tipping point for memory overload in SOV languages. Interestingly, this also suggests that structure maintenance and structure building use the same memory resource (cf. Gibson, 1998; Just & Carpenter, 1992; Yadav et al., 2022).
Factors such as clausal embeddings and structural revision are known to cause processing difficulty during comprehension (e.g. Frazier, 1985;Gibson, 1998;Levy, 2008). One implication of this would be that processing in SOV languages should be easier when the preverbal linguistic configuration is simple and when the core verbal arguments are close to the clause final verb (cf. Futrell et al., 2020;Gibson et al., 2013;Ros et al., 2015;Ueno & Polinsky, 2009). Recent experimental/corpus studies in Hindi and other SOV languages do support this hypothesis. For example, Husain (2021, 2022) find that parsing errors in Hindi increase with complex argument structure involving multiple embeddings and with increase in core dependency relations between a verb and its arguments. Relatedly, in a cross-linguistic corpus study involving multiple SOV languages (including Hindi), Yadav et al. (2020) found that the linguistic context intervening between a verb and its dependent has few embeddings. Similarly, using a corpus study, Sharma et al. (2020) found that, compared to adjuncts, arguments (subject, objects and indirect objects) in Hindi, are, on average, linearly closer to the verb. Together these results suggest that processing in an SOV language like Hindi involves robust prediction when the preverbal linguistic material is simple; with increased preverbal complexity, prediction can become fallible and the parser can falter.

Broader issues and future directions
Online versus offline measures across tasks
As we saw in Experiment 1, the acceptability rating results for the critical items show a different pattern than the RT data: the Non-canonical conditions were rated lower than the Canonical conditions, even as the Non-canonical conditions were read faster than the Canonical conditions (at the post-critical region). So, even though we have evidence for a local coherence effect from the RT results, the acceptability rating results do not seem to reflect the same. We briefly mentioned the possibility that this indicates that the RT data and the rating data index different underlying processes, with the RT data indexing an online process and the end of sentence rating data indexing an offline measure. If this is correct, this raises the further question of whether the divergent effects observed for the two measures in this experiment should also extend to other online-offline measures. However, as we saw in Experiment 2, the RT data and the comprehension question data show similar patterns: question accuracy is low in the same contexts where RTs are fast, i.e. in the Non-canonical conditions a local coherence effect is reflected across both an online measure and an end of sentence measure. Comparing the two end of sentence measures, in Experiment 1 the ratings seem to be capturing an independent overall preference for canonical word order in Hindi, while in Experiment 2 the comprehension question accuracy seems to be capturing the persistence of the locally coherent parse. We take the difference across the two offline measures to suggest that different end of sentence measures index different aspects of processing, which do not necessarily parallel online measures like RTs. While we do not explore this possibility further in this paper, the potential connection (or lack thereof) between online and offline measures within a single experimental task is worth examining to get a fuller handle on the complexity of sentence comprehension.
For now, we point the reader to existing literature that has discussed the implications of alignment or misalignment of online and offline data for a number of domains in psycholinguistic research; see, among others, S. Lewis and Phillips (2015), Parker (2019), Stowe et al. (2018), and Christianson et al. (2022).

Implications for prediction
Prediction forms a critical component in various processing theories (Gibson et al., 2013; Kurumada & Jaeger, 2015; Pickering & Gambi, 2018). It is assumed to be effortless and to require minimal resources. For example, under the noisy channel model (Kurumada & Jaeger, 2015), comprehension is a problem of inference in a noisy perceptual environment where prediction of the upcoming input becomes a key mechanism in successfully deriving the meaning. Such inferencing is also assumed to be cognitively inexpensive (e.g. Piantadosi et al., 2012). Recent work on processing of SOV languages suggests that prediction in sentence comprehension might be quite constrained. In particular, successful syntactic prediction might only be possible when preverbal complexity is low. Prediction seems to be influenced by working memory constraints and its scope during comprehension could therefore be limited (cf. Brothers & Kuperberg, 2021; Huettig, 2015). The local coherence effect found in the Kamide and Kukona (2018) study as well as in the current study shows the vulnerability of the parser in a similar configuration: an embedded RC with an SVO order. This suggests that the processing pressures in SOV and SVO languages might be very similar, rather than distinct (cf. Yadav et al., 2020), which in turn highlights the cross-linguistic validity of an interaction of prediction and working memory.
The results discussed in this paper thus allow us to segue into more specific questions about the interaction of predictability and the load-bearing capacity of the parser. In particular, while it is clear that Hindi speakers struggle with the prediction of the matrix verb as generated at the head NP in this context, the details of what is remembered and what is forgotten need further probing.

Grammatical typology
The experimental results of this paper also contribute to broader discussions in psycholinguistics regarding the importance of conducting research in typologically distinct languages for robust theory building (Anand et al., 2020; Norcliffe et al., 2015). The experimental findings of this paper demonstrate how language-specific properties may be leveraged to precisely probe the mechanisms underlying sentence processing both generally and specifically. It is only with a firm understanding of cross-linguistic patterns that we can begin to ask to what extent the production or comprehension mechanisms invoked by various theories are universal and which aspects indicate the influence of language-specific factors. Similar momentum has been observed in other empirical domains in psycholinguistics. For example, recent work on the processing of agreement dependencies cross-linguistically has yielded novel insights into the parser-grammar interaction (e.g. Avetisyan et al., 2020; Bhatia & Dillon, 2022; Slioussar, 2018, among others). Work in this vein, therefore, signals a growing shift towards linguistically rich psycholinguistic research that allows researchers to tease apart different hypotheses on account of the grammatical properties of the diverse languages under investigation.
Within the specific context of our paper, the choice of Hindi as the language of interest has allowed for investigations into structurally-driven local coherence effects, which are distinct from previously studied local coherence effects in a number of ways. For instance, we observed that Hindi speakers, despite being users of a language which has traditionally been argued to have robust prediction, exhibit non-predictive local coherence effects (as suggested by the low number of locally coherent completions). This is in contrast to Kamide and Kukona's (2018) findings, wherein English exhibits predictive local coherence effects (as signalled by increased fixations on words which continue the locally coherent parse) despite English being a language argued to have relatively weaker prediction. In addition, the availability of different word orders within what is typically described as a head-final language has helped to identify the processing costs associated with non-canonical orders, which in combination with other sources of processing load may even overwhelm the parser. Further, the specific grammatical properties of Hindi have also made it possible to examine syntactic and semantic factors underlying local coherence effects systematically, and thereafter to formulate hypotheses about the relative contribution of these factors in parsing complex structures.

Conclusion
In this paper, we examined the nature of structure building in the clause final environment of an SOV language, Hindi. We tested the verb-final infallibility hypothesis which predicted that speakers ought to face no difficulty in building head-final structures on account of the parser's adaptation to language specific statistics. Contrary to this prediction, we observed that Hindi speakers struggled with head-final structures. They formed grammatically invalid parses as indexed by a local coherence effect in reading ungrammatical sentences, with this effect resting on the plausibility of the local parse. Hindi speakers also struggled with maintaining predictions regarding the matrix structure as indexed by ungrammatical completions and faulty structural integrations in grammatical and ungrammatical sentences. These results are consistent with processing accounts that allow for building of grammatically illicit sentential structures during comprehension. Further, we found that in the context of these processing models, the pattern of results is best explained by bottom-up processes, with good-enough processing being the most likely model. On the other hand, the lossy-surprisal model cannot explain the data. Together the current work draws attention to the limits of parser adaptability in head-final languages and demonstrates that the parser can falter with respect to predicting and maintaining linguistic structure in the face of increased complexity.