Lexical alignment in second language communication: evidence from a picture-naming task

ABSTRACT Language alignment occurs when interlocutors mimic each other’s language. Language alignment can happen as a result of priming, but may also be mediated by speakers’ beliefs about their interlocutor, including how language-proficient they believe the interlocutor to be. However, it is unknown whether bilingual speakers also show such effects. In this study, the participant and interlocutor took turns labelling pictured objects. These had alternative labels (one preferred, one dispreferred), with the latter used by the interlocutor. Participants were native Mandarin speakers who rated themselves as higher- or lower-intermediate L2 English learners. They were told their interlocutor was either a native English speaker or another L2 English learner. In a series of three experiments, the results showed that participants aligned with the interlocutor by using the dispreferred label. Rates of alignment varied, depending on the perceived proficiency of the interlocutor, and to a lesser extent, the L2 speakers’ self-rated proficiency.

KEYWORDS Unmediated alignment; mediated alignment; L2 lexical alignment; L2 learners; L2 proficiency; second language production

Interactive language alignment is the phenomenon by which speakers repeat each other's expressions, structures, and pronunciation patterns in conversation (Chartrand & Bargh, 1999). For example, if a speaker says boat instead of ship to describe a large seagoing vessel, their interlocutor will probably also use the word boat to refer to that object. Similarly, a speaker may use the same sentence structure as their interlocutor. Or a speaker may alter their speech patterns in subtle ways to sound more like their conversational partner. A great deal of research over the last few decades, since the seminal studies by Bock (1986), has explored the types of linguistic representations that lend themselves to alignment and has sought to identify the mechanisms that underlie it. There are various explanations, falling into two major types. One type is that alignment is the result of priming within the language system: speakers use the same linguistic representations that they have just encountered in comprehension because those representations are still activated in memory (e.g. Pickering & Garrod, 2004); this is an unmediated account of alignment. Another type of explanation is that alignment is mediated by non-linguistic factors, and arises due to an unspoken agreement between interlocutors to use the same expressions in order to facilitate communication (e.g. Brennan & Clark, 1996). Very likely, there are mediated and unmediated elements to many instances of alignment, though the extent to which non-linguistic influences come into play is likely to vary depending on the type of alignment and the characteristics of the interlocutors, including their language proficiency (e.g. Branigan et al., 2011; Tobar-Henríquez et al., 2021).
Most alignment studies have focused on native speakers. Despite the prevalence of bilingualism, interactions between people speaking their second language (L2) have received far less attention. Yet, L2 learners differ in obvious ways from native speakers: L2 processing may be less automatic, may require greater cognitive resources, may contain errors, may be subject to influences from the native language (L1), and will obviously be affected by proficiency in L2 (Costa & Santesteban, 2004). In addition, and importantly, L2 speakers are still learning the language, so they may be especially attuned to the language usage of their interlocutors. If we are to understand the ways in which interlocutors adapt to one another during communicative interactions, we need to consider L2 speakers, as well as native speakers.
In this paper, we investigate the variables that influence alignment in L2 speakers with different degrees of self-perceived proficiency when they interact with interlocutors whom they believed were either native speakers of the L2 or L2 learners. We focus specifically on word-level alignment because learning L2 vocabulary is at the core of learning a second language. By some estimates, people need to learn 8,000-9,000 word families in order to be able to read a variety of authentic texts (Nation, 2006). Thus, it would be beneficial for second language learners to use their interactions in their L2 as learning opportunities. As a consequence, they may be highly likely to show alignment, especially when interacting with native speakers.
Below, we review different views of alignment and some of the evidence that supports these views.

Unmediated alignment
One model of (unmediated) alignment is the Interactive Alignment Model (IAM Model, Pickering & Garrod, 2004), which posits that language alignment is due to priming. In this model, there is a parity between the representations in comprehension and production within and between interlocutors. In a conversation, linguistic representations that are employed during production are the same as those activated during comprehension. Thus, a representation that has just been activated during comprehension remains active for a short period of time and can then be used for production (Pickering & Garrod, 2007).
This theory is consistent with a range of input-output effects that occur at different levels within the language system. For instance, it has been reported that tongue and lip muscles are activated when people listen to the speech of others (Fadiga et al., 2002; Watkins & Paus, 2004). Schneider and Chein (2003) found that people were faster to name a picture of a dog when they had just heard the phonologically similar word dot, suggesting that phonological units that are activated by an input are used by the production system. Studies on lexical alignment, or lexical entrainment (Brennan & Clark, 1996), have shown that interlocutors tend to reuse the same referential expression, even when they could use a simpler term (Brennan & Clark, 1996; Garrod & Anderson, 1987; Levelt & Kelter, 1982). For example, if someone has referred to a shoe as a loafer in a context where there is more than one type of shoe, they will also use the term loafer in a later context in which there is only one shoe (and where the simple expression shoe would refer uniquely to one item). At the syntactic level, speakers not only reuse syntactic structures they have just produced (Bock, 1986; Pickering & Branigan, 1998) but also use the structures that they have heard others use (Potter & Lombardi, 1998). For instance, participants would prefer to say the nun giving the book to the clown rather than the nun giving the clown the book if they have just heard a confederate produce an utterance with the former structure (Branigan et al., 2000).
Of course, many of these examples of alignment could be due to a speaker's desire to achieve communicative success rather than due to priming. What is the evidence for a purely unmediated component to alignment? Evidence comes from manipulations such as those used in research by Branigan et al. (2011). In their lexical alignment study, participants were engaged in a computerised experiment in which they matched a picture to the interlocutor's typed label for the picture (the prime) and on a subsequent trial typed the name of the same picture (the target), which they believed would be sent to the interlocutor to match to the picture. The prime was a label that named the picture but was not the preferred label (e.g. coach instead of bus). Would the participants use their interlocutor's label when they later named the picture? Using the same label would count as an instance of alignment. In order to determine whether alignment was due to priming, the rate of alignment when the prime and target occurred in close proximity was compared to the rate of alignment when the prime and target were separated by unrelated items. According to an unmediated account of alignment, the prime activates a linguistic representation, and as long as that representation is active beyond some threshold, speakers repeat the prime rather than produce an alternative that might be equally suitable. But because activation would decay over time, it would be expected that there would be a decrement in alignment if there is intervening material between a prime and target. Branigan et al. (2011) found that the likelihood of alignment was significantly reduced when the prime and target were separated, at least for one of their participant groups (see also Brennan & Clark, 1996; Pickering & Branigan, 1998), consistent with the unmediated account.
Branigan et al. (2011) also included a manipulation in which participants named a picture twice (with the favoured label) before they were primed with the disfavoured label. According to an unmediated account of alignment, the act of using the favoured label would increase its activation level, making it a stronger competitor with the disfavoured label and thereby reducing alignment (relative to a no-prior-naming condition). Having participants produce the names rather than simply reading them (when produced by others) should make them especially potent (Zhang & Tullis, 2021). The overall results showed reduced alignment of the dispreferred label, again pointing to an unmediated component to alignment. In sum, these results support the idea that the activation level of a target word, which can be raised by the prior appearance of the target label or reduced by the prior use of an alternative label, affects whether the target word is produced by the speaker (i.e. whether alignment occurs).
Would L2 speakers be expected to show the same pattern of results? This is unclear. Prior to the current study, the design employed by Branigan et al. (2011) had not been applied to L2 speakers. However, it has been suggested that L2 speakers might show less unmediated alignment than L1 speakers because L2 speakers' lexicons may contain words that are so much lower in terms of frequency of occurrence that priming does not raise their activation level above threshold and they are therefore not produced. Thus, even when primed, the L2 speaker would instead produce the label with the higher resting level of activation (Costa et al., 2008). Consider, for example, the alternative labels wallet and billfold. For a native speaker, the activation level of wallet might be higher than that of billfold; however, billfold would be readily produced if primed. This would not be so for an L2 speaker who knows the word wallet but has rarely encountered billfold. There is another consideration as well. Pickering and Garrod (2006) argue that alignment is more likely when interlocutors share activation profiles (e.g. similar activation levels for alternative labels). This predicts that unmediated alignment should be greater for L1 speakers interacting with other L1 speakers than with L2 speakers; note that it also predicts that unmediated alignment should be greater for L2 speakers interacting with other similar L2 speakers than with L1 speakers.

Mediated alignment
An account of alignment that holds that it is purely the result of priming cannot accommodate findings that alignment appears to be mediated by a range of non-linguistic factors. A number of studies have shown interlocutor effects on alignment (also called partner-specific adaptation, e.g. Mol et al., 2012) that go beyond similarity of activation profiles: rate of alignment can be affected by a speaker's attitude toward their interlocutor, the interlocutor's accent, the interlocutors' political opinions, and so on. In one study, the rates of alignment involving dative structures were greater when participants interacted with a "nice" vs. a "mean" confederate (Balcetis & Dale, 2005). In another study, relatively greater alignment was observed when participants thought their conversation partners had similar political opinions and a "standard" American accent. In addition, greater alignment was found for participants who favour a conflict management style that involves compromise (Weatherholtz et al., 2014). In other research, a speaker's use of the same words or phrases as their interlocutor has been found to be closely related to the establishment of trust between strangers in a text-chat session (Scissors et al., 2008). In this research, the repetition of words or phrases by both partners was significantly higher for high-trusting pairs than for low-trusting pairs. Such studies suggest both that the desire for communicative success may drive alignment in many instances, and that this desire may be mediated by one's beliefs about and attitude toward an interlocutor.
Importantly for the current study, it has also been shown that alignment is mediated by speakers' beliefs about the linguistic competence of their interlocutor (Branigan et al., 2003; Chun et al., 2016; Branigan et al., 2010, 2011; Koulouri et al., 2016; Tobar-Henríquez et al., 2021). The study by Branigan et al. (2011), described above, was designed not only to explore unmediated components of lexical alignment, but also mediated components. Thus, in each experiment, they tested two groups of participants. One group was told that they were interacting with another human. The other group was told that they were interacting with a computer. In reality, for both groups, there was no synchronous interaction; participants simply responded to an automated sequence of picture-matching and picture-naming turns. Any difference in alignment would be entirely due to the participants' beliefs about their interlocutor. The results showed greater alignment when participants believed they were interacting with a computer than when they believed they were interacting with a human. In another experiment, greater alignment was found for participants who believed that the computers they were interacting with were "less capable" as opposed to "more capable". Overall, then, when interacting with an interlocutor with (perceived) limited linguistic capacity, participants used the vocabulary they knew the interlocutor would understand.
The same picture emerges in some studies with L2 interlocutors. For example, Chun et al. (2016) conducted a computer-based study in which participants heard prerecorded sentences and then described pictures. They found that their participants showed greater syntactic alignment when the recorded sentences were produced by someone with a foreign accent than when the sentences were produced by someone with a native accent. In a recent lexical alignment study by Suffill et al. (2021), participants interacted with either L1 or L2 confederate interlocutors in a map task in which routes defined by various landmarks were described. They found that participants aligned more with the L2 interlocutors than the L1 interlocutors. This shows that although native speakers may enjoy greater unmediated alignment when interacting with other native speakers, due to greater similarity in their activation profiles for lexical items, this can be over-ridden by their desire to adapt their language usage to their interlocutor.
Not all studies with L2 interlocutors have found that alignment is greater with less-proficient interlocutors. Bortfeld and Brennan (1997) had participants working in pairs in a "referential communication task" in which they had to organise a set of pictures so that they both ended up with the same order. They took turns being "director" (describing the cards and their order) and matcher (putting cards in the same order as the director's). Native speakers were paired either with native speakers or non-native speakers. Results showed equivalent rates of lexical alignment when native speakers conversed with non-natives and when they conversed with other native speakers. Given the mixed findings regarding interlocutor proficiency, it is still an open question when and how this variable affects L2 alignment.
Does speaker proficiency affect alignment? No study has tested alignment in L2 participants at different levels of proficiency, as we do. The few studies that have explored speaker proficiency have compared native-speaker with non-native-speaker participants, and the results are mixed. Mol et al. (2012) did find a speaker effect. They conducted an experiment in which participants interacted with a confederate on a map task in which they needed to refer to landmarks. Participants were native Dutch speakers with a minimum of six years of formal English classes. They were assigned to either the Switch condition or the No Switch condition. In the Switch condition, the confederate described two routes through the map, then participants switched rooms and described these routes to a different confederate, then they returned to the first room, then returned to the second room. In the No Switch condition, participants didn't switch confederates and so were able to describe routes to the confederate who had also provided route descriptions. One group of participants interacted with the confederate(s) in L1 and another group interacted in L2. Results showed no speaker effect for overall extent of alignment. However, there was an interaction between speaker type and partner type, with only L2 participants showing a partner-specificity effect (that is, greater alignment when participants did not switch). This contrasts with the map navigation study by Suffill et al. (2021), described above. They tested a group of L2-English speaking participants as well as L1 participants: thus, both L1 and L2 participants interacted with confederates who were L1 or L2 speakers (there were four participant groups). Both L1 and L2 groups showed the same result: more alignment with the L2-speaking confederates than the L1-speaking confederates, but the rates of alignment did not differ for the L1 and L2 groups. Thus, there was an interlocutor effect, but not a speaker effect. Suffill et al. note that the failure to observe a speaker effect is surprising. They argue that "audience design" (the ability of a speaker to adapt their language according to what they believe their interlocutor will understand) is "cognitively demanding", and because L2 speakers "have fewer resources available when interacting in their second language", they should have shown a smaller interlocutor effect, with the least proficient speakers showing the smallest effects. It is worth noting that their non-native speaking participants were living in an English-speaking environment and rated their average daily exposure to English at 63%. Hence, they were able to engage in audience design possibly because their English proficiency was sufficiently high.
In fact, the non-native speakers in both studies described above might also have been "too proficient" to show effects of another potentially powerful influence: the desire to learn L2 and therefore align more with a native-speaking interlocutor than a non-native interlocutor because the native speaker is a more trustworthy language model (e.g. Bortfeld & Brennan, 1997; Costa et al., 2008; Dobao, 2012; Gass & Varonis, 1994; Mackey, 1999). Given these sets of findings, it is an open question whether and how speaker proficiency mediates alignment, especially for less proficient L2 speakers.

The current study
In three experiments, we investigated the unmediated and mediated components of lexical alignment in L2 speakers. To investigate the unmediated components of lexical alignment, we tested, across experiments, whether and how L2 speakers' tendency to lexically align varied as a function of the distance between prime and target and whether an unprimed target was named prior to primed target naming. To investigate mediated components, we tested, in all three experiments, (1) whether and how L2 speakers varied their tendency to lexically align with an interlocutor whom they believed to be a native speaker versus an L2 speaker, and (2) whether and how L2 speakers varied their tendency to lexically align depending on their own self-rated L2 proficiency.
In each experiment, participants saw a written label and two pictures and had to click on the picture that matched the label. This picture would then appear in a subsequent trial and participants were asked to type the label for it. The matching trial is considered to be the "prime" trial; the naming trial is the "target" trial. Because the prime was always the dispreferred label, if participants provided a label that matched the label that appeared in the prime trial, this was considered to be a case of alignment.
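The coding rule just described can be sketched in a few lines of Python. This is only an illustration of the scoring logic, not the scoring script used in the study: the function names are hypothetical, and the decision to ignore letter case and stray whitespace in typed responses is an assumption. The coach/bus pair is the example from Branigan et al. (2011).

```python
def normalize(label: str) -> str:
    """Lower-case and collapse whitespace (an assumed, lenient matching rule)."""
    return " ".join(label.lower().split())


def is_aligned(response: str, prime_label: str) -> bool:
    """True when the typed target label matches the label shown on the prime trial."""
    return normalize(response) == normalize(prime_label)


def alignment_rate(trials):
    """Proportion of (prime_label, response) pairs that are coded as aligned."""
    if not trials:
        return 0.0
    aligned = sum(is_aligned(response, prime) for prime, response in trials)
    return aligned / len(trials)


# Illustrative responses only: the prime is always the dispreferred label.
example_trials = [("coach", "Coach "), ("coach", "bus"), ("coach", "coach"), ("coach", "bus")]
rate = alignment_rate(example_trials)
print(f"alignment rate: {rate:.2f}")
```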
The design of the experiments is as follows. In Experiment 1, one filler item intervened between the prime trial and target trial. In Experiment 2, the prime trial and target trial were separated by eight fillers. In Experiment 3, the prime and target trials were preceded by unprimed instances that required the participant to name the pictures. In all three experiments, one subset of participants was told that they were communicating with a native speaker of English; another subset was told they were interacting with a non-native speaker. All participants were in fact engaged in a computerised experiment that only had the appearance of actual synchronous interaction; this ensured that any differences in L2 alignment responses were due only to the participants' beliefs about who they were interacting with. In addition, in each experiment, we tested two groups of participants who differed in how they perceived their own proficiency in L2.
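The prime-target distance manipulation can be illustrated with a simplified trial list. The sketch below assumes that each critical item is followed by a fixed number of filler items before its target trial (1 in Experiment 1, 8 in Experiment 2); the function name, the item labels, and the way fillers are drawn in order are hypothetical, and the actual experiment additionally alternated matching and naming turns.

```python
def build_sequence(critical_items, fillers, lag):
    """Return a flat trial list with `lag` fillers between each prime and its target."""
    needed = lag * len(critical_items)
    if len(fillers) < needed:
        raise ValueError(f"need {needed} fillers, got {len(fillers)}")
    filler_iter = iter(fillers)
    sequence = []
    for item in critical_items:
        sequence.append(("prime", item))
        for _ in range(lag):
            sequence.append(("filler", next(filler_iter)))
        sequence.append(("target", item))
    return sequence


# Hypothetical item names; 18 critical items and 174 fillers as reported in the Materials.
critical = [f"item{i}" for i in range(18)]
fillers = [f"filler{i}" for i in range(174)]

exp1 = build_sequence(critical, fillers, lag=1)  # close proximity
exp2 = build_sequence(critical, fillers, lag=8)  # greater distance
print(len(exp1), len(exp2))
```

With 18 critical items, a lag of 8 uses 144 of the 174 fillers, so the reported filler pool is large enough for either lag under this simplification.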
The unmediated components of alignment will be assessed through a comparison of the alignment rates across experiments. The effect of increasing the distance between a prime and target is seen in the comparison of the results of Experiment 1 and the results of Experiment 2. If alignment rates are significantly reduced in Experiment 2 (when there is greater distance between the prime and target), this would suggest that alignment in Experiment 1 could be due, at least in part, to priming (the automatic boost in activation of a target word caused by the appearance of the prime), and that the Experiment 2 results would be due to a decay in activation. But if the rates of alignment are similar, with significant alignment in both experiments, then we would have no evidence for unmediated effects. The effect of prior naming is evaluated via a comparison of the results of Experiments 1 and 3. Prior naming would presumably boost the activation level of the preferred label, which would then act as a more robust competitor to the alternative (primed) label. This should have the effect of reducing alignment, compared to Experiment 1. If no such reduction is observed, this would suggest that there are other factors driving alignment besides those that affect the activation levels of alternative target labels.
The mediated components of alignment will be assessed for each experiment. As mentioned above, within each experiment, participant groups differed with respect to their own self-rated proficiency and their beliefs about the proficiency of their interlocutor. If alignment rates differ across participant groups, this would suggest these variables mediate alignment.

Experiment 1
Experiment 1 investigated lexical alignment in L2 speakers when the prime and target were in close proximity, separated by one filler turn. This experiment had three goals: (1) as a preliminary step, to test whether L2 speakers show lexical alignment; (2) to test whether L2 lexical alignment is mediated by beliefs about L2 proficiency; (3) to serve as a baseline experiment for comparison with the results of Experiments 2 and 3, in order to assess whether lexical alignment in L2 has unmediated components.

Method
Participants. Eighty-two undergraduates in an EFL programme in a university in south China participated in this study. None had been educated in an English-speaking country. The age range was between 17 and 24 years of age. Years spent learning English ranged from 7 to 12 years. All the participants were required to take the National College Entrance Exam for English Proficiency before they could become enrolled as an English major in a university. This exam requires the test takers to know at least 3,500 English words. Their scores had to be above the 90th percentile. Because university students' use of social media to communicate with their classmates is so prevalent, the participants were assumed to have had experience chatting online. The selection process worked as follows. First, participants were grouped into two proficiency levels (higher-intermediate vs. lower-intermediate L2) based on their EFL classroom level (juniors vs. freshmen). Then, to obtain information about their beliefs about their English proficiency levels, they were administered an adapted Chinese "Can-Do" test (Barrows, 1981; see Appendix I), which asks participants to judge whether they can accomplish specific types of language interaction tasks. There are 25 questions in the adapted "Can-Do" test. For each question, if the participants endorsed it with a "√", they would get 1 point and if they marked an "X" for the question, they would get 0; points were then tallied for each participant. Then, we eliminated the participants whose Can-Do test scores were exceptionally high in the freshmen group and those whose Can-Do test scores were exceptionally low in the junior group. Thus, only students whose Can-Do results were consistent with their expected proficiency level were included in the alignment task.
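The tallying rule for the adapted Can-Do test is simple enough to state as code. The following is a minimal sketch under the scoring scheme given above (1 point per "√", 0 per "X", 25 questions); the function name and the input format are assumptions, and the cut-offs used to judge a score "exceptionally" high or low are not specified in the text, so none are encoded here.

```python
N_QUESTIONS = 25  # number of questions in the adapted Can-Do test


def can_do_score(responses):
    """Tally one point per endorsed ('√') question; 'X' scores zero."""
    if len(responses) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} responses, got {len(responses)}")
    for mark in responses:
        if mark not in ("√", "X"):
            raise ValueError(f"unexpected mark: {mark!r}")
    return sum(1 for mark in responses if mark == "√")


# Illustrative response sheet: 21 endorsed questions, 4 not endorsed.
sheet = ["√"] * 21 + ["X"] * 4
print(can_do_score(sheet))
```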
Following these steps, we ended up with 82 participants. Of the 82 participants, 41 were juniors and 41 were freshmen. Two subjects were excluded because they either over-estimated or under-estimated their English L2 proficiency. This left 40 in the self-perceived as higher-intermediate L2 proficiency group and 40 in the self-perceived as lower-intermediate group. An independent-samples t-test revealed that the two groups differed significantly in their self-ratings of L2 proficiency, t(78) = 12.815, p < .001, d = 2.87. We will henceforth refer to the two groups as Self-Perceived as Higher Intermediate (SPHI) and Self-Perceived as Lower Intermediate (SPLI). Here and in subsequent experiments, the SPHI group's scores were about 20-23 out of 25 and the SPLI group's scores were in the 14-18 point range. The SPHI subjects were much more likely than the SPLI subjects to indicate, for example, that they could understand English-language movies without subtitles and read English-language novels without using a dictionary.
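For readers who wish to relate the t statistic to a standardised effect size, one common conversion for two independent groups is d = t × sqrt(1/n1 + 1/n2). The sketch below applies it to the reported t value and group sizes. It assumes the pooled-standard-deviation definition of Cohen's d; the text does not state which formula was used, so the result is only a rough consistency check.

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Convert an independent-samples t statistic to Cohen's d (pooled-SD definition)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Reported values: t(78) = 12.815 with 40 participants per group.
d = cohens_d_from_t(12.815, 40, 40)
print(f"d = {d:.2f}")
```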
Here and in Experiments 2 and 3, there were two groups of SPHI students and two groups of SPLI students. Students in each group were taking a Chinese-English translation class and so group size depended on class enrollment. This led to differences in the number of subjects in each group.
In this first experiment, of the 40 SPHI participants, 21 were assigned to interact with (what they believed to be) a "non-native" interlocutor, and 19 were assigned to the "native speaker". Of the 40 SPLI subjects, 21 were randomly assigned to the "non-native" interlocutor, and 19 to the native English-speaking interlocutor. We will refer to these types of perceived interlocutor as Native Speaker (NS) and Non-native Speaker (NNS).
Materials. Selection of the experimental items was constrained in the following ways. The first constraint was to ensure that all the labels used in the experiments were those the subjects knew well enough to recognise and spell. The second was that both labels for the critical pictured items were judged acceptable by native Chinese English L2 learners. These constraints were intended to rule out the possibility of the subjects using a word because they did not know (or accept) the alternative. The third constraint was that of the two alternative labels for a picture, one was preferred and one was dispreferred. By having the interlocutor use the disfavoured label, we could be confident that a repetition of that label was due to alignment with the interlocutor. We conducted the tests described below and ultimately selected 18 experimental items from 260 black-and-white line drawings (adapted from Snodgrass & Vanderwart, 1980). The item selection tests were conducted separately with subjects who rated themselves as higher-intermediate and those who self-rated as lower-intermediate. This was done because it is possible that the different groups of L2 learners may have different criteria for the labels in terms of their familiarity, acceptability, and preferences in use. These subjects did not participate in the alignment task.
Label familiarity survey. Thirty subjects (15 self-rated as higher-proficiency L2 learners and 15 self-rated as lower-proficiency L2 learners) were tested on a paper-and-pencil task. Each image was paired with two labels. Subjects were asked to indicate on a seven-point Likert scale how familiar they were with each of the two labels (see Figure 1). The familiarity scale ranged from 1 (completely unfamiliar) to 7 (completely familiar). Since the subjects were native-Chinese English L2 learners, instructions were given in Chinese. Only those pictures whose labels received a rating of more than 5 by over 80% of the subjects were flagged for the next selection phase. This step resulted in 192 pictures.
Label acceptability survey. The 192 pictures with two familiar labels were presented to another group of 30 subjects (15 self-rated as higher-intermediate L2 learners and 15 self-rated as lower-intermediate L2 learners). This test was conducted to determine the acceptability of the two labels for each picture. Again, a seven-point Likert scale (see Figure 2) was used. Only those object pictures whose two labels were rated above a 5 by over 80% of the subjects were chosen. This step led to 54 pictures whose two labels were judged to be acceptable by L2 learners.
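The screening criterion shared by the familiarity and acceptability surveys (both labels rated above 5 by more than 80% of raters) can be expressed as a small filter. The sketch below is illustrative only: the data layout, function names, and example ratings are hypothetical, and it treats "over 80%" as a strict inequality.

```python
def label_passes(ratings, min_rating=5, min_share=0.80):
    """True when more than `min_share` of raters gave a rating above `min_rating`."""
    share = sum(r > min_rating for r in ratings) / len(ratings)
    return share > min_share


def select_pictures(ratings_by_picture):
    """Keep pictures for which both alternative labels pass the criterion."""
    return [
        picture
        for picture, labels in ratings_by_picture.items()
        if all(label_passes(ratings) for ratings in labels.values())
    ]


# Made-up ratings from 30 raters on a 1-7 scale.
survey = {
    "picture_a": {"label_1": [7] * 28 + [4] * 2, "label_2": [6] * 25 + [3] * 5},
    "picture_b": {"label_1": [7] * 30, "label_2": [6] * 24 + [5] * 6},
}
print(select_pictures(survey))
```

In this example, picture_b is dropped because only 24 of 30 raters (exactly 80%) rated its second label above 5, which does not exceed the threshold.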
Preferred label survey. The final test was conducted to determine which pictured objects had clearly favoured/disfavoured labels. Favoured labels were those that were more frequently used or those that participants felt more confident about using. The 54 remaining items were presented to another 30 subjects (15 self-rated as higher-intermediate proficiency L2 learners and 15 self-rated as lower-intermediate proficiency L2 learners) who were asked to indicate which of the two labels they favoured (i.e. which they thought they used more frequently or felt more comfortable with), on a 1-10 scale, with 1 being highly disfavoured, and 10 being highly favoured (see Figure 3). If they thought they used both labels for an image or they had no preference, they would rate the two labels with 5 or 6. We selected the pairs with a favoured label score of over 8 and a disfavoured label score of below 3. To make the favoured/disfavoured list more accurately represent the L2 learners' naming preference, at least 80% of the participants needed to agree on the preferred label and dispreferred label. The 18 pairs with the highest agreement were selected (see Appendix II). In addition to the 18 experimental items, 174 filler items were constructed.
We note that most of the words selected for these experiments are in the vocabulary list of words that needed to be learned for the National College Entrance Exam for English Proficiency, except for less frequently used labels such as neckwear, lock-up and icebox. But these compounds are made up of words that are included in the required list. We consulted high-school English teachers and they agreed that none of the selected words would be difficult for high-school students to comprehend and to spell. The participants in these studies had all passed the National College Entrance Exam for English Proficiency and were enrolled as English majors. Therefore, we were confident that all the undergraduate English majors (freshmen and juniors) were able to understand and spell the words used in this study.
Procedure. Subjects were seated in front of a desktop computer and told that they would play a picturematching-and-naming game online with a partner who was in the next room. In one condition, subjects were told that they were to play the game with a native English speaker. In the other condition, they were told that they were to play the game with a native Chinese-speaking L2 learner of English.
The matching and naming turns alternated between the participant and the interlocutor. For the matching turns, subjects were informed that their partner would first present two pictures to them simultaneously, and after a short interval (3000-5000ms), a label for one of the pictures would appear on the screen (see Figure 4). The subjects were asked to match the label with one of the pictures. If they chose the picture on the left, they pressed the "1" key on the keyboard, and if they chose the one on the right, they pressed "2". Once they pressed "1" or "2", the chosen picture would be highlighted by a black-colored frame (which the participant believed was sent to the interlocutor for confirmation).
For the naming turns, first, the target picture which had appeared in the previous trial and a distractor picture appeared on the screen side-by-side. Two seconds later, the picture which was to be named in this turn was highlighted by a black frame. The subjects were to type in the label for the image highlighted with a black frame (see Figure 5). The critical matching turn (the prime) and its corresponding naming turn were separated by a filler naming turn and filler matching turn (see Figure 6). To simulate a real exchange between human interlocutors, the interval between the matching and naming turn varied between 3000 and 5000 ms. The time interval was assigned randomly by the programme that ran the experiment. After the participants completed all the trials, they were told the game was over and they pressed the "Esc" key to exit.
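The turn structure described above can be made concrete with a small Python sketch. It is a hedged illustration, not the software used in the study: the function name `build_item_sequence`, the dictionary fields, and the choice of whole-millisecond intervals drawn uniformly from 3000 to 5000 are assumptions introduced here.

```python
import random


def build_item_sequence(item, lag_pairs=1, rng=None):
    """Return the turns for one experimental item.

    A critical matching turn (the prime) is followed by `lag_pairs`
    filler naming/matching pairs and then the critical naming turn.
    """
    rng = rng or random.Random()
    turns = [("matching", item, "prime")]
    for _ in range(lag_pairs):
        turns.append(("naming", "filler", "filler"))
        turns.append(("matching", "filler", "filler"))
    turns.append(("naming", item, "target"))
    # Assumption: each turn gets its own random delay in whole milliseconds.
    return [
        {"turn": kind, "picture": pic, "role": role,
         "interval_ms": rng.randint(3000, 5000)}
        for kind, pic, role in turns
    ]


sequence = build_item_sequence("hypothetical_item", lag_pairs=1,
                               rng=random.Random(0))
roles = [t["role"] for t in sequence]
```

With `lag_pairs=1` the sequence has four turns: prime, filler naming, filler matching, target. Raising `lag_pairs` lengthens the distance between prime and target without changing anything else.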
To remind subjects that they were interacting with a foreign native English speaker or a second language speaker peer, an image of the foreign interlocutor or the L2 speaker always appeared in the dialogue screen (see Figure 5 for examples).

Results
For all three experiments, we conducted the following analyses: (1) Chi-square analyses to determine whether overall rates of alignment were significantly greater than baseline; (2) mixed logit regression analyses to determine whether alignment was mediated by Participant Type and Interlocutor Type. In addition, we conducted mixed logit regression analyses that included cross-experiment comparisons.
First, as an important initial step, we examined whether alignment occurred in the L2 speakers. Picture naming responses were categorised as aligned or not. As long as it was clear which label participants were using, typing mistakes were ignored. The mean proportions of aligned naming responses are shown in Table 1 and Figure 7. The average alignment percentage was 61.3%, with the SPHI participants aligning 63.8% of the time and the SPLI participants aligning 58.8% of the time. Recall that in pretesting the materials, participants were presented with pairs of labels for a given picture and asked to indicate how much they favoured (or disfavoured) each label, on a 10-point scale. We only used pictures for which the participants showed a clear preference for one of the labels: a rating above 8 for one alternative, and below 3 for the other. Using the ratings for the dispreferred label as a proxy for the base rate of dispreferred label usage, we conducted chi-square analyses that compared the observed rates with the expected (base) rates. These tests revealed that alignment rates for all four participant groups were significantly greater than the base rates: SPHI participants with NS interlocutor, χ² by item (1, N = 18 […] (Jaeger, 2008; Johnson, 2009) using R (R Core Team, 2016) and lme4 (Bates et al., 2012). We used this statistic for two reasons. First, the dependent measure is binary: a given trial was scored as either aligned or not (Breslow & Clayton, 1993; Bates & DebRoy, 2004). Second, unlike ANOVA, which would require separate analyses for participants and items, mixed effect models allow for the simultaneous inclusion of participants and items as random variables.
Our approach to the statistical analyses follows that taken by Weatherholtz et al. (2014). Recall that in our experiments, Participant Type (SPHI vs. SPLI) and Interlocutor Type (NS vs. NNS) are between-participant and within-item variables. Into the model, we entered as fixed effects Participant Type and Interlocutor, and their interaction. Participant Type and Interlocutor Type were coded with sum contrasts (−1 and 1) to reduce collinearity. To avoid anticonservativity, we used the maximal converging by-subject and by-item random effects structure: random by-subject and by-item intercepts, and by-item random slopes (Barr et al., 2013). See Table 2 for a summary of statistical analyses for all three experiments.
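To illustrate why sum contrasts reduce collinearity, the sketch below compares the correlation between a main-effect column and the interaction column under sum coding (−1, 1) and under treatment coding (0, 1). It assumes a perfectly balanced 2 × 2 design with one observation per cell, which is a simplification: the actual cell sizes in these experiments are unequal, so the correlation under sum coding would be small rather than exactly zero. The function names are introduced here for illustration.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def design_columns(levels):
    """Columns for factor A, factor B and their product in a balanced 2 x 2."""
    a, b = zip(*product(levels, repeat=2))
    interaction = [i * j for i, j in zip(a, b)]
    return list(a), list(b), interaction


sum_a, sum_b, sum_ab = design_columns((-1, 1))
trt_a, trt_b, trt_ab = design_columns((0, 1))

r_sum = pearson(sum_a, sum_ab)        # uncorrelated under sum coding
r_treatment = pearson(trt_a, trt_ab)  # about 0.58 under treatment coding
```

Under sum coding each coefficient can also be read as an effect averaged over the levels of the other factor, which is what makes the "main effect" label appropriate.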
For Experiment 1, the maximal converging model included the fixed effects (and their interaction) and participants and items as random factors, but excluded the by-item random slopes. There was a significant main effect of Interlocutor Type, β = 0.373, z = 3.322, p = .001, with participants showing greater alignment with the NS interlocutor. The effect of Participant Type was not significant, β = −0.12, z = −1.07, p = .284. The interaction of Interlocutor Type and Participant Type was marginal, β = 0.189, z = 1.684, p = .092.
Despite the only marginal interaction of Participant Type with Interlocutor, there is a large difference in the magnitude of the Interlocutor effect across groups (7% for SPHI groups and 21% for SPLI groups). Therefore, we evaluated the Interlocutor effect for SPHI and SPLI groups separately. The maximal converging model excluded random slopes for Item. Results showed that the Interlocutor effect was highly significant for the SPLI participants, β = 0.562, z = 3.536, p < .001, but not for the SPHI participants, β = 0.184; z = 1.163, p = .245. Thus, only for SPLI participants was there a significantly lower rate of alignment with the NNS interlocutor than the NS interlocutor.
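The relationship between the omnibus coefficients and the separate-group estimates can be checked with simple arithmetic. Under sum coding, the Interlocutor effect within one participant group is the Interlocutor coefficient plus the group's code times the interaction coefficient. The sketch below assumes that SPLI and NS were coded +1 and SPHI and NNS −1; this direction is inferred from the signs of the reported coefficients and is not stated in the text. Because the group-wise models were fitted separately, their estimates need not match this arithmetic exactly.

```python
from math import exp

# Assumed coding direction (inferred from the reported signs, not stated).
SPHI, SPLI = -1, +1


def simple_interlocutor_effect(beta_interlocutor, beta_interaction, group_code):
    """Interlocutor coefficient within one participant group (log-odds units)."""
    return beta_interlocutor + group_code * beta_interaction


def ns_vs_nns_odds_ratio(simple_effect):
    """Odds ratio for NS vs. NNS: the two codes are 2 units apart."""
    return exp(2 * simple_effect)


# Experiment 1 omnibus estimates reported in the text.
beta_interlocutor, beta_interaction = 0.373, 0.189

effect_spli = simple_interlocutor_effect(beta_interlocutor, beta_interaction, SPLI)
effect_sphi = simple_interlocutor_effect(beta_interlocutor, beta_interaction, SPHI)

print(round(effect_spli, 3))  # 0.562, matching the separate SPLI model
print(round(effect_sphi, 3))  # 0.184, matching the separate SPHI model
```

The helper `ns_vs_nns_odds_ratio` converts either value to an odds ratio; the factor of 2 arises because the NS and NNS codes differ by two units under this coding.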

Discussion
The goal of Experiment 1 was to determine whether L2 speakers show lexical alignment, and if so, whether alignment is mediated by beliefs about their interlocutor's proficiency in L2 and their own proficiency in L2. The results show that L2 speakers do adopt their interlocutor's label in naming pictures, and this effect is mediated by their beliefs. Overall, the data show that within both SPHI and SPLI groups, there is greater alignment with NS interlocutors than NNS interlocutors, possibly due to the belief that a native speaker is a good language model and a non-native speaker is less good. But this interlocutor effect is only significant for the SPLI participants. Note that when they are interacting with NS interlocutors, both SPHI and SPLI participants show similar rates of alignment, suggesting that both proficiency groups are equally comfortable using the dispreferred labels. The difference is with the NNS interlocutors: the SPLI subjects show a larger dip in alignment, compared to the SPHI subjects. One reason for this could be that the participants paired with an NNS interlocutor assumed that the interlocutor was at the same level of proficiency in L2 as they themselves were, making the NNS interlocutor less trustworthy as a language model for the SPLI group than for the SPHI group. Now that we have demonstrated lexical alignment in L2 learners, and observed that alignment can be mediated by beliefs about the interlocutor's level of proficiency in L2, we test whether there is an unmediated component to alignment by manipulating the distance between prime and target. This is the main focus of Experiment 2.

Experiment 2
The main goal of this experiment is to determine whether increasing the distance between the prime and target affects the rate of alignment in L2 speakers. Previous studies (e.g. Branigan et al., 2010, 2011; Brennan & Clark, 1996; Purmohammad, 2015) suggested that when the prime and target are adjacent or close together, alignment may occur due to priming. However, when the prime and target are at some distance, any observed alignment cannot be due to priming and is, therefore, due to nonlinguistic factors (Branigan et al., 2011; Pickering & Garrod, 2004, 2007). We created long-distance prime-target sets by inserting eight pairs of fillers between the prime and the target (the standard term for a filler pair that intervenes in this way is "lag"). If alignment occurs over 8 lags, it is mediated, due to the L2 participants assuming that their interlocutor knows the best label for the picture, or due to the participants' desire to facilitate communication by using the same label, or both.
Assuming that the results of Experiment 1 reflect both mediated and unmediated components, we should see a smaller alignment effect if the unmediated component is removed.
Another goal of this experiment is to explore whether the Interlocutor effects observed in the two proficiency groups in Experiment 1 are apparent under "long lag" conditions. In addition, we may see an effect of speaker proficiency. It should be pointed out that in an experiment such as this, it is not enough for the participants to want to align with the interlocutor; they need to remember the label that the interlocutor used. This makes this version of the experiment cognitively more demanding than Experiment 1. Given findings that working memory in L2 correlates with L2 proficiency (e.g. Linck et al., 2014), a long lag would have a potentially greater impact on the lower-proficiency group of participants.

Participants
Ninety-seven subjects participated in this experiment. Their educational background in English as a second language was the same as for the participants in Experiment 1. As in Experiment 1, the adapted Chinese Can-do test was administered to test the participants' perceived L2 competence. Again as in Experiment 1, any participant whose score on the Can-do test misaligned with their class level was not to be tested further; however, there were no such participants. An independent t-test was administered to determine whether the two groups of subjects had reliably different perceptions about their L2 proficiency. The result showed that the two groups of subjects' Can-do scores were reliably different, t(95) = 17.486, p < .001, d = 0.54.
Of the 97 participants, 46 self-rated as higher-intermediate. Of these, 27 were assigned to the NS interlocutor condition, and 19 to the NNS interlocutor condition. Of the 51 participants who self-perceived as lower-intermediate, 28 were assigned to the NS interlocutor condition and 23 to the NNS interlocutor condition.

Materials and procedure
Experiment 2 used the same materials, design and procedure as Experiment 1 except that 8 fillers intervened between a prime and target (see Figure 8). Ultimately, there were 162 images to be matched and labelled.

Results
The primary focus of Experiment 2 was to examine whether L2 alignment occurred when the prime was separated from the target by intervening matching-and-naming turns. The results, displayed in Table 1 and Figure 7, showed that alignment occurred 62.1% of the time, with SPHI participants showing greater alignment than SPLI participants (67% and 57.1% respectively). Chi-square analyses to compare the alignment rates in the experiment with the norming task again showed significantly greater use of the dispreferred label for all four participant groups: SPHI with NS interlocutor, χ² by item (1, N = 18 […] (1, N = 28) = 174.463, p < .001. So, we infer that the increased use of the dispreferred label in the experiment reflects alignment.
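The logic of comparing an observed alignment rate with a base rate can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test. This is an illustration only: the text does not specify how the by-item and by-subject counts or the expected rates were derived from the norming ratings, so the function name, the counts and the base rate below are hypothetical and do not reproduce any reported statistic.

```python
from math import erfc, sqrt


def chi_square_vs_baseline(n_aligned, n_trials, base_rate):
    """Goodness-of-fit test of aligned/not-aligned counts against a base rate.

    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    expected_aligned = n_trials * base_rate
    expected_other = n_trials - expected_aligned
    observed_other = n_trials - n_aligned
    statistic = (
        (n_aligned - expected_aligned) ** 2 / expected_aligned
        + (observed_other - expected_other) ** 2 / expected_other
    )
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical numbers: 60 dispreferred-label responses in 100 trials,
# against an assumed base rate of 20%.
statistic, p_value = chi_square_vs_baseline(60, 100, 0.20)
```

A large statistic with a small p-value indicates that the dispreferred label was used more often than the base rate alone would predict.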
Turning to the effects of the independent variables (Participant Type and Interlocutor Type), mixed logit regression analyses that were used for the data from Experiment 1 were used here. The maximal converging structure included the interaction of the two independent variables, random by-subject and by-item intercepts, and by-item random slope for Participant Type plus Interlocutor. The results indicated significant effects of Participant Type, β = −0.286, z = −2.303, p = .021 (greater alignment for the SPHI participants than the SPLI participants), and Interlocutor type, β = 0.314, z = 2.839, p = .005 (greater alignment with NS than NNS interlocutors). The interaction of the two variables was also significant, β = −0.213, z = −2.017, p = .044. We then examined the Interlocutor effect separately for SPHI and SPLI participants. The results showed that the Interlocutor effect was significant for the SPHI participants, β = 0.526, z = 3.308, p < .001, but not the SPLI participants, β = 0.102, z = 0.699, p = .485. SPHI participants aligned significantly more often with the NS interlocutor than with the NNS interlocutor (see Table 1).
Just as for Experiment 1, the results of Experiment 2 showed significant alignment in our L2 participants. In Experiment 2, rates of alignment were affected by both self-perceived L2 proficiency and by beliefs about the interlocutor's L2 proficiency, with the SPHI subjects showing a larger effect of Interlocutor.

Combined analysis of experiments 1 and 2
In order to investigate the effect of increased "lag", we conducted a statistical comparison of Experiments 1 and 2. The maximal converging model included the fixed effects of Lag (Experiment 1 with 1 lag vs. Experiment 2 with 8 lags), Interlocutor Type (NS vs. NNS interlocutor) and Participant Type (SPHI vs. SPLI participants) and the interaction of the three variables, along with random effects of Participants and Items (and by-item

Discussion
The primary aim of Experiment 2 was to determine whether, relative to Experiment 1, alignment rates declined when the prime and target were separated by intervening filler items. The results showed that the overall rate of alignment was nearly identical for Experiments 1 and 2: increasing the distance between the prime and target did not reduce alignment. This is similar to Branigan et al.'s (2011) finding for their computer-interlocutor participants, who showed similar rates of alignment irrespective of prime-target distance. But it contrasts with the result for their human-interlocutor participants. In their study (Branigan et al., 2011), when participants believed they were interacting with another human, their rates of alignment were significantly lower in the "long lag" condition than in the "no lag" condition. Branigan et al. suggest that these two sets of results are compatible with a "communicative design account in which participants encode their partner's choice of name more deeply when they are less certain that their partner will understand their utterances correctly", that is when they are interacting with a less linguistically competent interlocutor (Branigan et al., 2011, p. 52). But our participants don't do this. Instead of aligning more with the less linguistically competent interlocutor, they align less.
But our results are complex. The significant three-way interaction (of Lag, Participant Type and Interlocutor Type) indicates that the interlocutor effect varied as a function of Lag (with lags vs. without lags) and participants' L2 proficiency. The SPHI subjects showed a larger interlocutor effect in Experiment 2 than in Experiment 1. Conversely, the SPLI subjects showed a much larger interlocutor effect in Experiment 1 than Experiment 2. Therefore, with respect to the interlocutor effect, the addition of intervening material had different effects on the two sets of participants. One possible explanation is that for the SPLI subjects, the burden of trying to remember the labels used by the interlocutor (something the participants would do if they wanted to optimise communicative success) eclipsed their assessment of their interlocutor as a reliable or an unreliable model of L2. This is consistent with Suffill et al.'s (2021) suggestion that non-native speakers have "fewer resources" available to take into account interlocutor proficiency: although we didn't see this in Experiment 1 with the lower proficiency participants, we do see it under these more difficult task conditions. As for why the SPHI group showed a larger interlocutor effect in the 8-lag condition, it is unclear. We do note that a comparison of the mean rates of alignment in the two experiments shows that, compared to Experiment 1, the SPHI subjects in Experiment 2 aligned slightly more with the NS interlocutor and slightly less with the NNS interlocutor, pushing the interlocutor effect into the significant range.
In sum, the results show that prime recency has no effect on overall levels of alignment: our participants aligned with their interlocutors in both long and short lag conditions, likely because they wanted to maximise communicative success, especially with an interlocutor whom they believed was a native speaker of the foreign language. Lexical alignment in L2 appears to have a robust mediated component, potentially swamping any effect that would be due to the increased level of activation of the dispreferred label due to priming.
In the next experiment we further explore unmediated components to alignment. Instead of decreasing the activation of the prime, as in Experiment 2, we increase the activation level of the preferred label, in order to create greater competition between the two alternative labels. Greater competition between the alternatives should result in reduced alignment.

Experiment 3
The primary purpose of Experiment 3 was to examine whether the increased activation of the preferred label (arising due to prior naming with the preferred label) would affect the tendency of L2 speakers to align with their interlocutors. In this experiment, participants name pictures twice before they encounter the alternative, dispreferred label (offered by their interlocutor). If there is a component to alignment in our L2 speakers that is sensitive to the activation levels of alternative names of objects, this manipulation should have the effect of reducing alignment, relative to alignment rates in Experiment 1. But if alignment in L2 is largely mediated, this manipulation may not reduce the alignment rate. In addition, we sought to determine whether the Interlocutor effects present in Experiment 1 were replicable under this different experimental design.

Participants
Eighty-six participants took part in this experiment for partial course credit. Their L2 English education background was the same as that of the participants in Experiments 1 and 2. One participant who overestimated their L2 English proficiency was excluded from the data analysis. This left 85 participants included in the experiment.
Of the 42 who self-rated as higher-intermediate in proficiency, 24 participants were randomly assigned to interact with the NNS interlocutor, and the other 18 participants were assigned to interact with the NS interlocutor. Forty-three participants perceived themselves as lower-intermediate L2 learners. Among them, 21 were assigned to interact with the NNS interlocutor and 22 participants with the NS interlocutor. The result of a t-test showed that the two groups of subjects perceived their English proficiency differently, t(83) = 15.745, p < .001, d = 0.26.

Materials, design and procedures
The materials, design and procedures were similar to those in Experiment 1 except for the following differences. In terms of experiment materials, we used the 174 fillers from the picture pool for which the L2 learners had judged their acceptability and familiarity with the two labels. In addition to 18 pairs of labels used in experimental trials, we needed 270 fillers. So, some of the fillers in the pool were used twice (randomly assigned by the experiment software). The design of Experiment 3 was similar to Experiment 1 except that the primed naming trial was preceded by two unprimed naming trials. Five filler trials intervened between the two unprimed trials and four fillers appeared between the second unprimed trial and priming trial (see Figure 9). The time interval between every matching and naming turn was 3000-5000 ms.

Results
The major goal of this experiment was to explore whether the rate and pattern of alignment observed in Experiment 1 was changed by an experimental design which would lead to an increased activation level of the preferred label. We included prior instances of naming a picture with the preferred label to determine whether this prior naming would reduce the tendency to use the same label as the interlocutor (the dispreferred label). The average rate of alignment was 76.2% (see Table 1 and Figure 7). The SPHI participants aligned on 80% of the experimental trials and the SPLI participants aligned on 72.3%. Again, chi-square analyses were conducted to compare the alignment rates in the experiment with the expected rates, based on the norming task. These tests showed significantly greater usage of the dispreferred label in the experiment: SPHI participants with the NS interlocutor, […]
With respect to the regression analyses, the maximal converging structure included the interaction of the two independent variables, random by-subject and by-item intercepts, and by-item random slope for Participant Type. The results showed a main effect of Interlocutor Type, β = 0.416, z = 3.452, p < .001. There was no significant effect of Participant Type, β = −0.188, z = −1.378, p = .168; and no significant interaction of the two variables, β = 0.155, z = 1.29, p = .197. Just as in Experiment 1, the magnitude of the interlocutor effect was much greater for the SPLI group than the SPHI group, and analyses showed a significant effect for the SPLI participants, β = 0.571, z = 3.445, p < .001, but not for the SPHI participants, β = 0.261, z = 1.495, p = .135: the SPLI participants aligned significantly more with NS than NNS interlocutors.

Combined analysis of experiments 1 and 3
In order to examine the effect of prior naming on alignment patterns, we compared the results of Experiments 1 & 3. In the combined statistical analysis, the maximal converging model included the fixed effect Prior Naming (no prior naming in Experiment 1 vs. prior unprimed naming in Experiment 3), Interlocutor Type (NS vs. NNS), and Participant Type (SPLI vs. SPHI) and their interaction, and random by-subject and by-item intercepts, and by-item random slope for Participant Type.
There were significant effects of Prior Naming, β = 0.452, z = 5.508, p < .001 and Interlocutor Type, β = 0.393, z = 4.792, p < .001. The Participant Type factor was marginally significant, β = −0.166, z = −1.828, p = .068. There was a significant interaction between Participant Type and Interlocutor Type, β = 0.172, z = 2.098, p = .036. The three factors (Participant Type, Interlocutor Type, and Prior Naming) did not significantly interact, β = −0.026, z = −0.314, p = .753. Neither of the other interactions was significant (Participant Type and Prior Naming, Interlocutor Type and Prior Naming, p's ≥ .439). Overall, there was greater alignment with Prior Naming than without, marginally greater alignment for the SPHI participants than the SPLI participants, and a larger interlocutor effect for the SPLI participants than the SPHI participants.

Discussion
Having participants name pictures prior to encountering their interlocutor's name was expected to have the effect of raising the activation level of a competitor label and thereby reduce how often participants produced the dispreferred (primed) label. However, this is not what was found. Alignment rates are higher than in Experiment 1, not lower, an issue we discuss in some detail in the General Discussion.
With respect to the Interlocutor effect, these results are comparable to those obtained in Experiment 1, with both groups showing greater alignment with the NS interlocutor, significantly so for the SPLI participants.

General discussion
In three experiments, we examined L2 lexical alignment patterns in self-rated high intermediate and low intermediate L2 speakers, some of whom believed they were interacting with a native speaker of the L2, and others of whom believed they were interacting with another L2 learner. The results showed that all groups of L2 speakers exhibited lexical alignment: across the three experiments, the usage of dispreferred labels significantly exceeded the baseline. In fact, our participants show quite high rates of lexical alignment, higher than native language speakers in similar studies (Branigan et al., 2011). In addition, all three experiments showed that alignment was modulated by participants' beliefs about the interlocutor and, to a lesser extent, beliefs about their own L2 proficiency.

Effects of "lag" and prior naming
The purpose of the "lag" manipulation (Experiment 2) and the "prior naming" manipulation (Experiment 3) was to explore whether there is an unmediated component to alignment in L2 speakers. Recall that Branigan et al. (2011) had found evidence that long lags and prior naming reduced alignment in native speakers interacting with (what they believed to be) another human interlocutor. In contrast to these findings from Branigan et al.'s study, the comparison of results from Experiments 1 and 2 showed that the overall rates of L2 alignment were not significantly lower in the long-lag experiment than in the short-lag experiment. We believe this is because alignment in our participants was mediated by a strong desire to repeat the interlocutor's labels in both experiments, either because of the belief that the interlocutor "knows best" and L2 speakers could learn from this, or because they wanted to optimise communicative success, or both. Were the participants in Experiment 1 primed at all? We think it likely that the participants in Experiment 1 were primed (that is, their lexical representations corresponding to the dispreferred labels automatically received a boost in activation from the prime); however, this increase in activation was swamped by the mediating factors described above. This would make the activation boost irrelevant because participants intended to use the interlocutor's referring expression anyway. This is similar to the situation found by Branigan et al. (2011) for speakers who believed they were interacting with a computer. They showed high rates of alignment across lags, with equal alignment rates at a lag of zero and a lag of eight. One assumes that at a lag of zero, there was a priming effect, but this was obscured by mediated effects.
The comparison of the results of Experiment 1 and 3 likewise showed that relative activation levels of the alternative labels for a picture had no obvious effect on L2 alignment. Again, this is in contrast to Branigan et al.'s findings for effects of prior naming in their human speaker-human interlocutor condition. Prior (unprimed) instances of naming a given picture should have had the effect of raising the activation level of the preferred label. Branigan et al. (2011) refer to this as "self-priming", and in their study, it did result in a reduction in alignment, compared to a no-prior-naming condition. But in our study with L2 speakers, not only did we not see a decrement, we saw a significant increase in alignment. It might be argued that this increase was due to the closer proximity of the prime (a lag of zero in Experiment 3 vs. a lag of one in Experiment 1), although the comparison of Experiments 1 and 2 showed that greater prime-target distance had no effect whatsoever on alignment. But another possibility is that participants noticed that their interlocutor did not use the same label that they themselves did and assumed that the interlocutor had a strong preference for the alternative label. As second language learners, the participants could be especially responsive to what they might see as corrective feedback about their language use.
Overall, we found no evidence for an unmediated component of alignment in our L2 learners. As mentioned above, we think it is likely that our participants were, in fact, primed (at least in Experiments 1 and 3, the "short-lag" experiments); however, this effect was overwhelmed by mediated factors.

Effects of interlocutor proficiency in L2
We found a robust interlocutor effect in all three experiments. Overall, there was greater alignment with interlocutors that participants believed were native speakers. This finding contrasts with findings from previous studies that native speakers align more with non-native speakers than other native speakers (Chun et al., 2016; Suffill et al., 2021) and with Branigan et al.'s (2011) finding of greater alignment with (less language-proficient) computers than humans. These studies suggest that native speakers are more likely to use their interlocutor's referring expressions if they think they might not be understood otherwise. That is, they know that less proficient interlocutors are likely to have a smaller, spotty lexicon, and that they may not know, for example, that a bus is more commonly called a bus rather than a coach. Thus, to ensure communicative success, they use the interlocutor's term. But when they interact with other native speakers, they may assume that they will be understood no matter which label they use. Importantly, non-native speakers do this too: they align more with non-native speakers, presumably for the same reason that native speakers do (Suffill et al., 2021).
So then why don't our non-native speakers also align more with the non-native interlocutor? There are two possibilities. One is that, compared to Suffill et al.'s (2021) participants, our participants were more focused on language learning. They were likely less proficient than those in Suffill et al.'s study; they were, after all, living in an L1 community and most certainly not speaking their L2 more than their L1 (as Suffill et al.'s participants did). In addition, they were in an EFL programme, taking English courses, and were, in fact, recruited from a Chinese-English translation class to participate in the experiment. Hence, they may have been in a language-learning state of mind and therefore aligned more with the native speakers. We will refer to this as the language-learning account. Another, intriguing, possibility relates to the difference in the NNS-NNS (speaker-interlocutor) groups in the two studies. The non-native speakers in the Suffill et al. (2021) study spoke a variety of native languages, none of which were the same as that of the non-native speaking confederate. Therefore, they could not assume that the confederate had the same lexical knowledge as they did; but, like the native speakers in that study, they could assume that the interlocutor might not understand them if they didn't use the interlocutor's terms to describe items that appear in the map task. In contrast, our participants believed that the non-native interlocutor was a "peer", essentially someone just like them, with, presumably, similar lexical knowledge. Therefore, thinking "if I know this alternative term, they will too", they could assume that they would be understood no matter which alternative label they chose. However, when they interacted with the NS interlocutor, our participants could have assumed that they and their interlocutor would have a different "activation profile". Thinking something like, "They called that a couch; I prefer to call it a sofa but maybe sofa means something different to them", they would endeavour to use the NS interlocutor's labels in order to ensure understanding. If this explanation is correct, that means that the same mechanism underlies our results and the results of studies that show greater alignment with the "less proficient" interlocutor (e.g. Chun et al., 2016; Suffill et al., 2021); the general principle is "consider the extent to which the interlocutor is likely to have the same lexicon as I do; to ensure communicative success, align more when there is a greater disparity". We will refer to this as the lexicon-disparity account.
Although the current study was not designed to tease these two possibilities apart, the evidence concerning effects of speaker proficiency, which we discuss next, favours the language-learning account: L2 speakers assess their interlocutor with respect to how reliable they are as a model of the L2 and align more with a more reliable model.

Effects of speaker proficiency in L2
The overall results also showed a slight tendency for the SPHI groups to show more alignment than the SPLI groups. However, close consideration of the data shows that this is not always the case. Let us first focus on the similar pattern of results for Experiments 1 and 3. The SPHI and SPLI groups show different rates of alignment with the NNS interlocutor, with the SPHI group showing greater alignment. We hypothesised that these different alignment rates could be due to an assumption of proficiency-equivalence with the non-native interlocutor: if participants believe that the non-native interlocutor is at their own level of proficiency, then the SPHI group would assume they were interacting with a HI speaker (a reasonably good language model) and the SPLI group would assume they were interacting with a LI speaker (a less good language model). This would be consistent with the language-learning account described above. On this analysis, the two groups actually had different types of interlocutors, so it may be unrevealing about speaker effects to compare the rates of alignment in the two cases. If we consider the data from the condition in which they had the same type of interlocutor (the NS condition), we see equivalent rates of alignment: no speaker proficiency effect. Now, let us examine the results of Experiment 2, the long-lag experiment. In the NS condition, we see that the SPHI group aligns about 17% more than the SPLI group. The SPHI group also shows a larger interlocutor effect than the SPLI group. We argued earlier that the smaller interlocutor effect for the SPLI participants could have been because the relatively greater difficulty of the task demanded all their attention, such that they could not consider their interlocutor's proficiency in L2. As Suffill et al. (2021) suggest, L2 speakers likely have limited processing resources that vary as a function of proficiency; therefore, one might expect the lower proficiency participants to show both smaller interlocutor effects and lower rates of alignment when they have to remember words over some distance.
Why the difficulty of remembering the interlocutor's labels across more lags enhanced the interlocutor effect for the SPHI group (compared to the other two experiments) is unclear. Perhaps they, too, experienced some difficulty with the task, and the subgroup of SPHI participants who were paired with the native speaker interlocutor tried especially hard to remember the native speaker's labels because they wanted to learn them, whereas the group paired with the NNS interlocutor was less motivated to do so.

Conclusion
This research adds to the body of research on the variables that lead speakers to align their referring expressions with those of their interlocutor, focusing on the effects of L2 proficiency. Our experiments explored unmediated and mediated components of lexical alignment in L2 speakers at two different proficiency levels, interacting with an interlocutor whom they believed to be either a native speaker or a peer L2 learner.
Our results show the following. First, we observed robust alignment in our L2 speakers. Second, alignment in our two groups of intermediate L2 speakers is mediated by the following variables. (1) The assessment of the interlocutor as a good language model (or not): speakers show greater alignment with the more "expert" interlocutor. This contrasts with findings of more alignment with lower proficiency interlocutors (Suffill et al., 2021). Whether L2 speakers adopt a language-learning mindset when interacting with others likely depends on their actual and self-perceived L2 proficiency level, the perceived proficiency of their interlocutor, and the context and demands of the interaction. (2) The difficulty of the task: when the prime and target were separated by intervening material, this adversely affected alignment in the lower proficiency group. Presumably, this is because actual and perceived L2 proficiency correlate positively with L2 working memory capacity and general cognitive resources.
Third, we found no evidence for an unmediated component to L2 lexical alignment. The mediating factors were so powerful that neither increasing the amount of intervening material between prime and target, nor introducing a prior naming condition that would elicit the participant's preferred label for a picture, reduced alignment.