Migration and academic performance in higher education: evidence for Colombia

ABSTRACT We study the relationship between academic performance of students in higher education and the decision to migrate. We focus on the case of Colombia due to the good availability of data on standardised tests for students in higher and secondary education. We exploit this information following an empirical strategy that allows us to identify the existence of negative effects associated with the decision to migrate, controlling for potential academic benefits of migration, such as belonging to better social networks in the receiving cities of migrants. These negative effects are associated with the psychological and financial costs that students face when migrating. Similarly, we follow a novel strategy by controlling for potential commuter students who are not identified in the sample, or who may be misclassified as migrants. These robustness exercises show that the result found previously is maintained, which is favourable to the hypothesis of the existence of negative effects associated with migration on academic performance. This result is relevant for the elaboration of educational policies in developing countries.


INTRODUCTION
Owing to the large amount of data available for elementary and high-school education, considerable research has been conducted on students' academic performance in pre-college education. Further, the results of various standardised tests of academic performance for large groups of students are available for these educational levels. However, such data are rare in the case of higher education (Chan & Luk, 2021; Zlatkin-Troitschanskaia et al., 2015). To address this shortage, the Organisation for Economic Co-operation and Development (OECD) developed the Assessment of Higher Education Learning Outcomes programme, which provides information on university students' knowledge levels at the end of their first (bachelor's level) degrees. However, this information does not allow for reliable comparisons between countries, and there is no relevant information on students before their university stage. This last aspect is relevant to understanding the migratory processes related to higher education. Colombia is amongst the countries with standardised evaluations for higher education and for previous levels such as secondary education; therefore, it is a valuable source of information for research on topics related to higher education and migration, including technical schools and universities.
Higher education encourages the accumulation of individual and social capital, allowing skilled individuals to promote development and innovation as engines of economic growth and social prosperity (Bloom et al., 2007; Di Maria & Stryszowski, 2009; Nakajima & Nakamura, 2012). Migration is a significant choice for individuals who want to access higher education because some of the best higher education institutions (HEIs) are in major cities. Many students do not have HEIs in their cities of residence or are reluctant to commute the distance that separates their home from HEIs (Selod & Shilpi, 2021). Additionally, benefits such as the development of greater personal independence and access to new and better social networks, which tend to be located in the main cities, are associated with migration (Beine et al., 2014; Vieira et al., 2018). Such social networks consist of highly trained teachers and talented students, amongst others, who support the personal and academic development of those who migrate for education.
However, the decision to migrate entails financial and psychological costs for students who decide to pursue higher education. Therefore, we examine how university students' migration decisions and their resulting costs affect their academic performance. We propose an empirical strategy to gather evidence of migration's adverse effects on university students' academic performance. Our focus is limited to finding evidence of the adverse effects of students' decision to migrate; because migration also has potential benefits, we controlled for these positive outcomes to obtain reliable estimates.
This study has a two-fold significance. First, the impacts of decisions such as funding or migration on university students' academic performance are relevant for policymakers (Christou & Haliassos, 2006). Second, the identification of factors to improve university students' academic performance has scholarly relevance.
We used data from the Sabre (SB) Pro tests of the Instituto Colombiano para el Fomento de la Educación Superior (ICFES) conducted on Colombian university students between 2017 and 2019. The Sabre Pro test is a standardised assessment that all university students must take as a requirement to graduate and attain their university degrees. Additionally, we used data from the ICFES's Sabre 11 tests, applicable to all high-school students at the end of their school stage, conducted between 2011 and 2015. Thus, the data for higher education students from all Colombian universities were used for our analysis. The two sets of data allowed us to identify students in the pre-university stage along with other high-school students who could not be observed in the Sabre Pro test. This is a novel database, and to the best of our knowledge no previous study has employed it to analyse students' higher education academic performance and its relationship with migration.
In Colombia, HEIs are mostly concentrated in major cities (Ospina Londoño et al., 2015). As a result, part of the student population faces the decision to migrate to access higher education. Policies have been proposed for the creation of HEIs in areas that lack them. This could extend access to higher education to students unable to overcome the economic costs of migrating and reduce gaps in regional development. However, the top universities are in major cities, and migration is still relevant for students' future professions. Notably, the relationship between migration and academic performance in higher education has not yet been evaluated in Colombia.
Our results show evidence of the adverse effects of students' decision to migrate, controlling for potential gains associated with the universities' characteristics and the students' programmes. These negative effects can be explained by the psychological (nostalgia effect) and financial costs of migrating, which can affect students' emotional state and diminish their academic performance. These results are robust to controlling for possible commuter students. The results associated with a lower academic performance owing to students' decision to migrate are relevant for educational policy in emerging countries.
The remainder of the paper is structured as follows. Section 2 briefly reviews the literature on factors that explain the relationship between migration and academic performance in the higher education context. Section 3 presents the estimation strategy. Section 4 presents the investigation results. Section 5 concludes.

ACADEMIC PERFORMANCE IN HIGHER EDUCATION, DISTANCE FROM HEIs AND MIGRATION
We first discuss the determinants of university students' migration to understand the relationship between migration and academic performance in higher education. The distance between students' place of residence and HEIs is relevant because it affects students' decision to migrate or to commute to HEIs. It is difficult to establish students' migrant status in the data because this question does not tend to be asked directly; therefore, it must be inferred from the survey. Studies follow various strategies to ascertain whether students can be classified as migrants; some have based their analysis on the distance that separates a student's reported residence from their HEI.
Concerning the determinants of university students' migration, Rosenzweig (2008) and Bessey (2012) have highlighted the inadequate provision of higher education in students' places of origin and the low wages earned by skilled individuals. The low wages in places of origin explain the low probability of students returning to these places after completing their studies. Beine et al. (2014) estimated the importance of determinants of international student mobility using data covering more than 180 origin countries and 13 destination countries. They found the presence of fellow nationals at the destination to be a fundamental determinant of migration. Other relevant determinants include prospective wages and the quality of higher education at the destination, as well as living costs and host capacity.
At a local level, students willing and able to pursue higher education who live far away from HEIs may migrate to the destination city or commute daily from their home city. In both cases, the distance between the student's city of origin and destination can influence their academic performance. The literature does not agree on the relationship between the distance separating a student's home from their HEI and their academic performance. A greater distance can positively or negatively influence university students' academic performance (Garza & Fullerton, 2018; Vieira et al., 2018). Evidence indicates that physical proximity to HEIs improves the probability of entering such institutions, even after controlling for more relevant determinants, such as individual characteristics, socio-economic conditions and institutional factors (Cullinan & Duggan, 2016; Saenz et al., 2006; Vieira et al., 2018).
Further, because of economic cost or geographical considerations, university students often enrol in HEIs close to their homes (Saenz & Barrera, 2007). This is because distance increases the financial and personal costs of pursuing higher education, limiting individuals' options to carry out enriching activities outside of academics to maintain their motivation levels. Therefore, distance reduces participation in higher education, particularly amongst those from less-favoured socio-economic groups. These aspects imply that the distance separating university students' homes and HEIs is a negative factor in academic performance.
For university students, continually living at home is not the same as being, often for the first time, away from the family environment (Chow & Healey, 2008; Desmond & Turley, 2009). Both scenarios present aspects that may affect students' academic performance in HEIs. Those who decide to pursue higher education studies and be away from their family environment have new responsibilities, such as money and time management; higher financial costs; less socialising with family and old friends; and less direct parental support and supervision (Bufton, 2003; Trueman & Hartley, 1996). These factors can lead to the individual not possessing the requisite disposition to maintain good academic performance. Some students live far from their HEI but do not migrate. They must constantly travel long distances, so they also face problems similar to those of migrant students. These commuter students may not be able to take full advantage of the opportunities the higher education programme offers to strengthen their capacities. Astin (1993) found that living close to their HEI contributes to student satisfaction, giving students a better overall experience in their college careers. This is because it allows students to better exploit different academic and non-academic activities. These results are valid for those students whose homes are close to the HEI and those who decided to migrate nearby. Living close to campus has been found to lead to significant intellectual, cognitive and critical thinking gains (Feldman, 1994; Pascarella et al., 1993; Pike & Kuh, 2005).
Furthermore, Kuh et al. (2001) have highlighted that distance affects access to academic development opportunities because of the low contact that students may have with course instructors. The above aspects suggest that the relationship between the distance from a student's home to the HEI and academic performance is negative. Considering the distance between their homes and HEIs, many students may migrate to a closer site. Moreover, they may choose to migrate because of the costs that the distance implies for them. However, these aspects predict that the direction of the relationship between the student's academic performance and their decision to migrate is ambiguous.
Evidence has been collected on the broad relationship between the distance from students' homes to HEIs and the academic performance of university students. Existing studies in the literature mostly focus on single institutions. This strategy avoids the problems of comparability of scores amongst HEIs and ensures that the available data are relatively homogeneous. Pitkethly and Prosser (2001) studied how initial experiences on campus are meaningful and influence students' persistence in higher education. The authors collected data from students attending their first course at La Trobe University in Melbourne, Victoria, Australia. They confirmed that the lack of adaptation to a new environment is amongst the leading causes of student dropout or desertion. This factor is associated with low mood. Intellectual difficulties were found to have little relevance in explaining students' desertion. Low mood can be associated with the nostalgia effect, an explanation proposed for the negative effect of students' migration decision on academic performance (Vieira et al., 2018). Based on evidence from different cultures, the nostalgia effect occurs in university students who decide to leave home to study (Burt, 1993; Fisher & Hood, 1987; Scopelliti & Tiberio, 2010; Tognoli, 2003; Xu et al., 2015). In the university context, nostalgia has two components: personal and community. The former is related to less-frequent physical contact with family and friends (Fisher & Hood, 1987). The latter refers to ruptures in the place of attachment and students' ties with their community (Tognoli, 2003). To our knowledge, no research work has attempted to deconstruct the nostalgia effect to identify the components that are empirically the most relevant.
The geographical proximity between HEIs and family residence reduces the impact and intensity of the nostalgia effect, translating into good academic performance (Fisher et al., 1985; Katsikas & Panagiotidis, 2011; Vieira et al., 2018; Williams & Luo, 2010; Xu et al., 2015). This effect of geographical proximity is also evident in secondary education (Falch et al., 2013). Thus, past research supports the hypothesis that university students' decision to migrate hurts their academic performance. However, some studies present the opposite result (López Turley & Wodtke, 2010). Newly acquired independence can positively affect individuals' autonomy and personal responsibility. López Turley and Wodtke (2010) found that living in a family home during higher education can be detrimental to students because problems at home may affect the student, impeding their study time. The abovementioned factors imply an ambiguity regarding the effect of distance on university students' academic performance. Garza and Fullerton (2018) studied how the distance between students' home during their high-school stage and HEI impacted their academic performance. They used data from the Beginning Postsecondary Students 2004-2009 database, which includes survey data on the academic and work experiences of undergraduate students who obtained their university degrees (Wine et al., 2011). Their results contradict the expected outcome of the nostalgia effect of students' migration decision. They found that for college students in their first academic course, a greater distance from HEIs to their homes implied a greater probability of graduating. However, they found no significant relationship between distance and students' academic performance, as measured by the grade-point average. The authors noted that the lack of a relationship between distance and academic performance, but one existing between distance and graduating, suggests that the factors impeding student persistence have more to do with social, environmental or personal circumstances than with skills and talent. We consider that the non-significant effect of distance on academic performance can be explained by the fact that distance absorbs two effects that counteract each other: the nostalgia effect and the gains of belonging to new academic and social networks.
Regarding works on higher education in Colombia about the issue of migration, we only know of the study by Ospina Londoño et al. (2015), which focused on the relationship between the supply of higher education and students' enrolment and migration processes. Between 2000 and 2013, higher education in Colombia expanded substantially in response to greater demand, a greater supply of programmes by HEIs, and policies enhancing HEI access and attractiveness (Carranza & Ferreyra, 2019). The results show that Colombian students' interest in accessing higher education has increased because they are more likely to enrol when there is a greater supply of HEIs. The authors did not find a significant effect between the supply of higher education and migration. This result is closely related to the high concentration of HEIs in the country's major cities. Given the increase in the supply of higher education in these cities, the migration patterns from municipalities without HEIs to cities with such institutions are not affected. This implies that migratory dynamics remain the same before and after the increase in the supply of higher education. Our work differs from this paper in terms of the objective we pursue. We propose a strategy to study how migration, motivated by the distance that separates students from their HEIs, can affect students' academic performance. To our knowledge, such an investigation has not been carried out so far in Colombia.
In summary, in terms of the decision to migrate, some factors have positive and negative effects on university students' academic performance. The positive factors are associated with the better conditions of the new environment, which constitute the initial motivation for students' decision to migrate, including access to higher quality social networks, better universities and greater future employability options. These factors can be captured by the new environment to which students migrate. Meanwhile, the negative factors are associated with the financing and psychological costs of living in a new environment. In this work, we developed an empirical methodology allowing us to identify the adverse effects of students' decision to migrate, controlling for potential benefits.

EMPIRICAL STRATEGY
We used two sources of information obtained from ICFES, an autonomous entity attached to the Ministry of Education of Colombia that evaluates education at all levels. The two sources of information provide the data reported by students in the registration forms for their mandatory secondary and higher education tests. The first source includes the registration form data of the Sabre 11 test, through which sociodemographic and economic information is collected on high-school students and their homes. Further, this database incorporates test results in different areas of knowledge, such as mathematics or language. The second source includes the results data of the Sabre Pro test, through which sociodemographic and economic information on university students is collected, including the results in generic quantitative reasoning and reading tests.
These data include the academic programmes and the HEIs to which students belong. Notably, taking these tests is a degree requirement at the high-school level (Sabre 11) and the university level (Sabre Pro). We used data from students who took the Sabre Pro test in 2017-19. University programmes last for four to six years in Colombia, and students can take the Sabre Pro test after completing at least 75% of the academic credits of their career (Castro-Avila & Ruiz-Linares, 2019). Therefore, we used data from students who took the Sabre 11 test between 2011 and 2015. Subsequently, we combined both databases to obtain information on high-school students who took the Sabre Pro test and those who did not. To have a homogeneous sample, we eliminated the information of students who took the Sabre 11 test more than once. We then eliminated those who took Sabre Pro more than once from the sample. Finally, we discarded the data of students who took Sabre Pro in more than one college career.
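The sample construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout, the `student_id` key and the function names are hypothetical, and the sketch treats repeat Sabre Pro takers as dropped from the sample rather than as unobserved.

```python
from collections import Counter


def single_takers(records, id_key="student_id"):
    """Return {id: record} for students who appear exactly once.

    `id_key` is a hypothetical field name used only for illustration.
    """
    counts = Counter(r[id_key] for r in records)
    return {r[id_key]: r for r in records if counts[r[id_key]] == 1}


def link_tests(school_records, university_records, id_key="student_id"):
    """Link school-test records to university-test records.

    Students who took either test more than once are dropped. Students who
    took the school test once and never appear in the university test are
    kept and flagged as not observed.
    """
    school = single_takers(school_records, id_key)
    university = single_takers(university_records, id_key)
    uni_counts = Counter(r[id_key] for r in university_records)
    linked = []
    for sid, school_row in school.items():
        if uni_counts[sid] > 1:
            continue  # repeat university-test taker: excluded from the sample
        uni_row = university.get(sid)
        linked.append({
            "student_id": sid,
            "school": school_row,
            "university": uni_row,
            "observed_in_pro": uni_row is not None,
        })
    return linked
```

The `observed_in_pro` flag plays the role of the indicator later used in the selection model: it is true only for students who appear in both tests.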
The location of the municipality of residence in the Sabre 11 test was compared with the municipality of the location of the HEI's programme in Sabre Pro to define migrant status. If the location was different, then the student was classified as a migrant. This was the first method used for determining migrant status. Given that Colombia has municipalities that form conurbations or metropolitan areas, the distances separating one municipality from another may be small (e.g., municipalities in the Metropolitan Area of Medellín). Therefore, the method for determining migrant status was corrected to reflect the distances students travelled between their residence and the municipalities where their HEIs were located. Thus, the application programming interface of Google Maps was used to calculate the Euclidean distance between the centroids of the municipalities of residence in Sabre 11 and the municipalities of the programmes in Sabre Pro. Trips between nearby municipalities can be made by motorised or non-motorised vehicles, and students can choose not to migrate. For students who live in remote municipalities, which may be located several hours away by road or may not be connected to a road network, migration is unavoidable. The Euclidean distance approximates, to a reasonable extent, the travel distance that separates students' home municipality from the municipality of their HEIs. With these measures, we obtained a classification of three categories of students: non-migrants, commuters and migrants. Students were defined as non-migrants when they lived in the municipality of their HEI, as commuters when their municipality of residence was located below a certain distance threshold from the municipality of their HEIs, and as migrants when their municipality was beyond the distance threshold. We chose various values for the threshold to define students as commuters or migrants, which allowed us to test the robustness of our econometric model to the definition of a commuter.
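The three-way classification above can be illustrated with a short sketch. It rests on several assumptions that are not in the paper: the great-circle (haversine) formula is swapped in for the centroid-to-centroid distance the authors obtained through the Google Maps API, the function names are hypothetical, and any threshold value passed in is illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def centroid_distance_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees.

    Assumption: haversine stands in for the straight-line distance between
    municipality centroids described in the text.
    """
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def classify_student(home_muni, hei_muni, home_centroid, hei_centroid, threshold_km):
    """Return 'non-migrant', 'commuter' or 'migrant'.

    - same municipality                          -> non-migrant
    - different municipality, below threshold    -> commuter
    - different municipality, at/above threshold -> migrant
    """
    if home_muni == hei_muni:
        return "non-migrant"
    distance = centroid_distance_km(home_centroid, hei_centroid)
    return "commuter" if distance < threshold_km else "migrant"
```

Re-running the classification over a grid of `threshold_km` values mirrors the robustness exercise in which the commuter definition is varied.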
The Sabre Pro database for 2017-19 consisted of 738,462 students, of which 487,348 took the Sabre Pro and Sabre 11 tests only once, representing 66% of the total university students who took the Sabre Pro test. Of the 487,348 students, 320,975 took the Sabre 11 test between 2011 and 2015. These 320,975 students constitute our study population, and they had taken a reasonable time to meet the requirements to complete their college career. Finally, after considering the inconsistencies in the students' reports or the lack of responses to questions relevant to our research, the final database comprised 202,784 records. This database has information about students' residence during high school and during their university period; measures of academic performance in both stages; and information on the financing of their higher education, their academic programme and the HEIs. A complete description of the variables used in this research is presented in Table A1 in Appendix A in the supplemental data online.

Descriptive analysis
Table 1 presents the descriptive statistics of the control variables for migrant and non-migrant students. The proportion of students in our sample whose HEIs were in municipalities other than those of their homes of residence is between 41% and 46%. The scores on the Sabre Pro and Sabre 11 tests were standardised by period, considering all the students who took the test and were included in our sample. The standardisation of the Sabre Pro results was done with a smaller sample of students than the one used to standardise the results of the Sabre 11 test. For all years of the sample, non-migrant students had higher average standardised scores in Sabre 11 and Sabre Pro in mathematics and reading tests than migrant students. Regarding the position occupied by students in Sabre 11, we found that migrants had lower average positions than their non-migrant peers.
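Standardisation by period, as described above, amounts to computing a z-score within each test period. The sketch below is illustrative only: the field names (`score`, `year`, `z_score`) are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_by_period(rows, score_key="score", period_key="year"):
    """Add a within-period z-score to each row.

    Assumption: the population standard deviation (pstdev) is used.
    """
    by_period = defaultdict(list)
    for row in rows:
        by_period[row[period_key]].append(row[score_key])

    # Mean and standard deviation of the scores in each period.
    stats = {p: (mean(v), pstdev(v)) for p, v in by_period.items()}

    standardised = []
    for row in rows:
        mu, sigma = stats[row[period_key]]
        # A period with no score variation gets z = 0 rather than a division error.
        z = (row[score_key] - mu) / sigma if sigma > 0 else 0.0
        standardised.append({**row, "z_score": z})
    return standardised
```

Because each period is centred on its own mean, scores from different test years become comparable in units of standard deviations.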
Additionally, migrant students were found to have low socio-economic levels, and the proportion of women tended to be higher amongst migrants than non-migrants. The average distance that separates the HEIs of students from their residences was over 130 km. Fewer migrant students worked while studying than non-migrants. Further, compared with non-migrant students, a higher proportion of migrants pay their tuition through scholarships, whereas a lower proportion is helped by their parents.
Figure 1 presents a map of the municipalities in Colombia. We show in circles the centroids of municipalities with HEIs in 2017. The purple circles are municipalities that are not the respective department's capital, and the green ones are capitals. Additionally, the size of the circle reflects the number of HEIs that exist in the municipality. Notably, HEIs are concentrated in the main cities of the country. The vast majority of the country's municipalities do not have HEIs. Therefore, high-school students' access to higher education in these areas is tied to their decision to migrate. In 2017, students participated in the Sabre Pro test in 248 HEIs. Amongst them, 79 are in Bogotá, 27 in Medellín, 15 in Cali, 13 in Barranquilla and 12 in Cartagena. In other words, 58.8% of the HEIs in the country are located in its five major cities. Additionally, 197 of the 248 HEIs are in capitals. Therefore, 20% of the HEIs are in municipalities that are not capitals. Considering that some municipalities are located near capital cities, the percentage of HEIs in municipalities far from the main cities is relatively low.

Estimation strategy
The study population consisted of high-school graduates who could take the Sabre Pro test between 2017 and 2019. These students took the Sabre 11 test between 2011 and 2015. We assumed that the population regression function takes the following form:

$$y_1 = a_1 y_2 + z_1 b_1 + u_1 \tag{1}$$

where $y_1$ is the score in Sabre Pro; $y_2$ is a dummy variable of migrants (1 when migrating, 0 otherwise); $z_1$ are control variables (the score on the Sabre 11 test, index of the student's socioeconomic level, gender, the value of the tuition, forms of payment of tuition, characteristics of the home and parents, whether or not they had a job, fixed effects by year, and fixed effects by the HEI's programme¹), which are exogenous ($y$ represents endogenous variables and $z$ is exogenous); and $u_1$ is the error term. To identify $a_1$, two problems must be overcome: (1) the possible correlation between $y_2$ and $u_1$ because of omitted variables and (2) the selection bias because only the students who took the Sabre Pro test were observed.

Selection bias
For several reasons, a considerable proportion of the population was not observed in the base corresponding to the Sabre Pro test. First, once students finish high school, they must decide whether to study at HEIs, get a job or pursue a technical career, amongst other options. Those who decided not to study in HEIs did not take the Sabre Pro test. Second, even students who decide to study in HEIs may still not have taken the Sabre Pro test between 2017 and 2019 if they dropped out or delayed their studies. This implies that the sample is composed mainly of good students, that is, those who wanted to study and entered a HEI, did not drop out and did not fall behind. In other words, we have a selected sample. To overcome this problem, the probability of having taken the Sabre Pro test between 2017 and 2019 is modelled using the following:

$$P(y_3 = 1 \mid z_2) = F(z_2 \delta_2) \tag{2}$$

where $y_3$ is a variable that takes the value of 1 when a student took the Sabre Pro test, and 0 otherwise; $z_2$ refers to the controls used (the score in the Sabre 11 test, age, gender, characteristics of home and parents, characteristics of school, and fixed effects of the municipality); $\delta_2$ is the vector of coefficients on these controls; and $F(\cdot)$ is the standard normal cumulative distribution function. This regression is performed using data from the Sabre 11 tests conducted between 2011 and 2015, which included students who took the Sabre Pro test and those that did not. This model yields an estimate of the probability of taking the Sabre Pro test in the period of analysis but does not admit a disaggregation of the probability by the reasons behind not taking the Sabre Pro test. Thus, estimating the probability of a student not being able to take the test because of no longer studying, choosing a technical career, dropping out or falling behind in their professional career is not possible. However, the estimation strategy does not require such disaggregation. This strategy only needs to estimate the probability of being observed in the Sabre Pro test. To obtain consistent estimators of $a_1$ and $b_1$, we must correct for the probability of observing individuals in the Sabre Pro test.
A problem arises from the Sabre 11 test data, which contains the information of all high-school students, and the Sabre Pro test data, which contains the information of those high-school students who pursued higher education. A substantial imbalance exists between students who appear in the Sabre Pro test and those who do not. The number of students who cannot be observed in the Sabre Pro is relatively large. This is called the class imbalance problem, which refers to the imbalanced distribution of values of the response variable. Moreover, this problem implies inconsistent estimates and reduces the classification performance of an algorithm (Ali et al., 2013; Buda et al., 2018; Thabtah et al., 2020). Generally, a classification estimation method, such as the probability models applied to binary response variables, maximises the overall accuracy when classifying observations. However, maximising the overall accuracy in the case of an imbalanced dataset involves serious problems.
Moreover, maximising the overall accuracy necessarily requires assigning more weight to the majority class. Hence, the estimation method can achieve a high accuracy level for the majority class but performs poorly on the minority set. In this study, identifying the minority cases is of greater importance because they are the students of interest we observed in the Sabre Pro test (Thabtah et al., 2020). The imbalance problem was corrected by randomly sampling equal numbers of students observed and not observed in Sabre Pro. This option is known as the data-driven approach with under-sampling and is recommended by Ali et al. (2013) and Kotsiantis et al. (2006). Once the model is estimated, the inverse Mills ratio is constructed for the observed students in Sabre Pro:

$$\hat{\lambda} = \frac{f(z_2 \hat{\delta}_2)}{F(z_2 \hat{\delta}_2)} \tag{3}$$

where $f(\cdot)$ and $F(\cdot)$ are the standard normal density and cumulative distribution functions, and $\hat{\delta}_2$ is the vector of estimated coefficients of the selection model. Including this variable in the regression with the students observed in Sabre Pro controls for selection bias:

$$y_1 = a_1 y_2 + z_1 b_1 + \gamma_1 \hat{\lambda} + e_1 \tag{4}$$

where $\gamma_1$ is the coefficient on the inverse Mills ratio and $e_1$ is the error term. The exclusion restriction in this model is that at least one variable in $z_2$ is not part of $z_1$. This condition holds because the characteristics of home and parents, characteristics of the school, and fixed effects of the municipality in $z_2$ are all measured in Sabre 11. By contrast, $z_1$, where these variables appear, is measured several years later, when the student appears in the Sabre Pro test. The variable that is common to $z_1$ and $z_2$ is the Sabre 11 test score.
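Two steps from this paragraph, the under-sampling of the majority class and the construction of the inverse Mills ratio, can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the probit index passed to `inverse_mills_ratio` is assumed to have been estimated elsewhere.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def undersample(observed, unobserved, seed=0):
    """Draw equal-sized random samples (without replacement) from both classes.

    The common size is the size of the smaller class; `seed` only makes the
    illustration reproducible.
    """
    rng = random.Random(seed)
    n = min(len(observed), len(unobserved))
    return rng.sample(observed, n), rng.sample(unobserved, n)


def inverse_mills_ratio(index):
    """Inverse Mills ratio f(index) / F(index) for an observed unit.

    `index` is the fitted probit index (controls times estimated
    coefficients). For very negative values F(index) underflows to zero, so
    a production version would need a numerically safer formula.
    """
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)
```

The ratio falls as the index rises: students with a high predicted probability of being observed receive a small correction term, while those unlikely to be observed receive a large one.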

Omitted variables
Amongst the factors determining whether a student decides to migrate are the presence of relatives or acquaintances at the destination, prospective wages, the quality of higher education at the destination and living costs. The inclusion of fixed effects by HEI programmes allows us to absorb aspects related to the characteristics of the destination city that do not vary between students, such as potential wages, the quality of higher education, and the average living costs at the destination. However, the presence of relatives or acquaintances at the destination is not controlled for. Individuals may be motivated to migrate because of unobservable factors contained in $u_1$. For example, a student who lives in a municipality without a HEI but who has family in a city with a HEI, keeping everything else constant, is more likely to make the decision to migrate than another student who lives in the same municipality but has no relatives in the city.² This unobservable variable (i.e., relatives or acquaintances in a city or, in general, any factor that decreases the individual cost of migrating) is probably contained in $u_1$ (predictably, a student with family in her HEI's city has lower adaptation costs, which results in higher performance in the programme and Sabre Pro test). This implies that $y_2$ is potentially endogenous, and this endogeneity is essential because $y_2$ is a student decision. By contrast, the controls (including the inverse Mills ratio) can be considered given conditions and, in many cases, predetermined (e.g., talent, socio-economic level and others). Therefore, they do not correlate with unobservable factors that decrease migrating costs.
Endogeneity was controlled for by estimating the equation using two-stage least squares (2SLS), instrumenting y₂ with the student's position in the Sabre 11 test. High-school students who took the Sabre 11 test were informed of their scores and position. To calculate their position, the students are divided, according to their locations, into groups of 1000. They receive a number between 1 and 1000, with 1 being the highest position and 1000 the lowest. Students who obtain higher positions can more easily enter the best HEIs. Many students come from municipalities without HEIs, and the HEIs are in the main cities. Thus, a better position increases the probability that a student migrates, so the instrument is relevant. Because the position is predetermined with respect to the student's migration decision, the instrument is exogenous. For example, having family in a major city has nothing to do with a student's position. Finally, students' position does not explain their Sabre Pro test scores once we control for students' talent, measured by the score in the Sabre 11 test, so the instrument meets the exclusion restriction.
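The logic of the 2SLS correction can be illustrated with a deliberately small example. The sketch below is not the study's specification: it has one endogenous regressor, one instrument and no further controls, it uses a continuous regressor rather than a migrant dummy, and the data and the function names `ols` and `two_stage_least_squares` are hypothetical. The unobserved factor `u` raises both the regressor and the outcome, so the naive regression is biased towards zero, whereas the instrumented estimate recovers the effect built into the data.

```python
from statistics import fmean


def ols(x, y):
    """Intercept and slope of a simple least-squares regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """2SLS slope of y on x, using z as the single instrument for x."""
    a, b = ols(z, x)                      # first stage: regress x on z
    x_hat = [a + b * zi for zi in z]      # fitted values of x
    return ols(x_hat, y)[1]               # second stage: regress y on fitted x


# Toy data: u is unobserved, raises both x and y, and is unrelated to z.
z = [-1.0, 1.0, -1.0, 1.0]
u = [-1.0, -1.0, 1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [-0.5 * xi + ui for xi, ui in zip(x, u)]  # effect of x built in as -0.5

naive = ols(x, y)[1]                      # ignores the endogeneity of x
iv = two_stage_least_squares(z, x, y)     # instruments x with z
```

With these numbers the naive slope is zero while the 2SLS slope equals the built-in value of -0.5, which mirrors the direction of bias discussed in the text.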
Based on the above discussion, we propose the strategy of estimating the inverse Mills ratio, plugging it into the performance equation on the Sabre Pro test, and estimating by 2SLS to yield a consistent estimator (Wooldridge, 2010). Standard errors of equation (4) are obtained through bootstrapping. Of the 202,784 students who took the Sabre Pro test and were included in our sample, we randomly resampled 150,000. We proceeded to calculate the estimated coefficients, correcting for selection bias and using instrumental variables (IV). This procedure was repeated 100 times. We calculated the standard error for each coefficient as the standard deviation amongst the 100 estimated coefficients.
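The bootstrap procedure described above can be sketched generically: draw a subsample, re-estimate, repeat, and take the standard deviation of the estimates. The code below is an illustration under stated assumptions. The text does not say whether the 150,000 students were drawn with or without replacement, so the sketch exposes this as the `replace` argument; the estimator is a simple mean standing in for the full selection-corrected IV estimation, and `bootstrap_se` and the toy data are hypothetical.

```python
import random
from statistics import fmean, stdev


def bootstrap_se(data, estimator, draw_size, reps=100, seed=0, replace=True):
    """Standard deviation of `estimator` across `reps` random resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        if replace:
            sample = rng.choices(data, k=draw_size)   # with replacement
        else:
            sample = rng.sample(data, draw_size)      # without replacement
        estimates.append(estimator(sample))
    return stdev(estimates)


# Toy data standing in for the estimation sample.
scores = [float(v) for v in range(1, 201)]
se_mean = bootstrap_se(scores, fmean, draw_size=150)
```

In an application along the lines of the text, `estimator` would be a function that re-runs the whole pipeline on the resample and returns one coefficient, with `draw_size=150000` and `reps=100`.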

RESULTS
Table 2 presents the estimation results of equation (4).³ Columns 1-4 present the results of the maths test and columns 5-8 those of the reading test. Columns 1 and 5 present the relationship between Sabre Pro test performance and migration, controlling only for test performance in Sabre 11. Columns 2 and 6 add, as controls, an index of the student's socio-economic level, gender, ordered categorical variables for the value of tuition and parents' education, the method of payment of tuition, students' job status, characteristics of the home of current residence, fixed effects by period and fixed effects by HEI academic programme. Columns 3 and 7 add the inverse Mills ratio to control for the potential selection bias of not observing the performance of individuals who did not take the Sabre Pro test, either because they did not pursue higher education or because they dropped out and did not complete their studies. Columns 4 and 8 present the estimate, correcting with IV for unobservable factors in the new environment that may affect students' academic performance and are correlated with the decision to migrate (e.g., family members in the new area of residence). The migrant dummy variable takes the value of 1 when the municipality of residence in Sabre 11 and the municipality of the student's HEI are different. We then studied how the presence of commuter students modifies the results obtained.
Our estimation results show that migrant students obtain scores that are, on average, lower than those of their non-migrant peers. This result holds for the maths and reading tests and across specifications: with and without fixed effects and controls, with the sample selection correction, and with the correction for the relevant omitted variable problem. Similarly, the control for students' ability, measured by their results in the Sabre 11 test, has, as expected, a positive and significant sign under all specifications. Correcting for selection bias does not materially change the magnitude of the estimated coefficients for ability and for the effect of migration. Additionally, when the omitted variable problem is corrected, migrant students are still found to have lower scores, on average, than their non-migrant peers.⁴ Table A3 in Appendix A in the supplemental data online shows the fit of the model for the probability of taking the Sabre Pro test. The results show that, based on the variables used, the probability model adequately classifies the students: it correctly classifies 69% of those who took the Sabre Pro test and 72% of those who did not. Table A2 online presents the results for other controls. We only interpreted the results obtained by IV-SB (columns 4 and 8). We found that, on average, male students obtain higher scores than female students in mathematics, and female students obtain better scores in language. Moreover, people with more income tend to obtain lower scores; parents' higher educational level is related to higher scores, and students who have a job obtain lower scores.
Note (Table 2): Standard errors were clustered at the programme level in higher education institutions (HEIs) for ordinary least squares (OLS) and obtained through bootstrapping for OLS-SB and IV-SB. All regressions included a constant term. We included additional controls, such as students' gender; parents' education; the value and the form of payment of students' tuition; the socio-economic index of students and the socio-economic stratum of the home of residence; hours of work per week; dummy variables for the presence of the internet, television, computer and washing machine in the home of residence; and fixed effects of the year and of the academic programme of the HEI to which the student belongs. The weak IV t-test was a t-test of the first stage on the instrumental variable (IV). We corrected this by bootstrapping the standard errors of the regression coefficient of the IV. ***p < 0.01, **p < 0.05, *p < 0.1.
We propose that the fixed effects of HEI programmes allow us to capture the benefits associated with students' decision to migrate. These effects represent a measure of the average score obtained in the Sabre Pro test associated with belonging to a specific HEI programme, while students' other characteristics are held constant. Therefore, we can expect that, when the fixed effects of HEIs' programmes are not included as a control, the estimated coefficient of the migrant dummy in the regression on the Sabre Pro score has a positive bias; that is, it approaches zero or takes a positive value. The reason is that the migrant coefficient would then absorb some of the benefits of migrating that are otherwise explained by the fixed effects of HEIs' programmes. Table 3 presents the role of fixed effects in HEIs' programmes. The results show that when the fixed effects of HEIs' programmes are not included, the estimated coefficient for the migrant dummy tends to approach zero, and the coefficient on the Sabre 11 score increases. We interpret this result as evidence in favour of the hypothesis that the fixed effects of HEIs' programmes allow us to control for the potential benefits of migration. When the fixed effects of HEIs' programmes are not included, a part of the benefit of belonging to a specific programme at a HEI is absorbed by the measure of student talent and by the migration dummy.
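The mechanism can be illustrated with a stylised example. The sketch below uses invented data, not the study's: two hypothetical programmes, A and B, where B has a higher baseline score and attracts more migrants, and where migrants score one point lower than non-migrants within each programme. Programme fixed effects are implemented through the within transformation (demeaning by programme), which gives the same slope as including programme dummies; `slope` and `demean_by_group` are hypothetical helper names.

```python
from collections import defaultdict
from statistics import fmean


def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    means = {g: fmean(vs) for g, vs in by_group.items()}
    return [v - means[g] for v, g in zip(values, groups)]


# Invented data: programme B has a higher baseline and more migrants.
programme = ["A"] * 4 + ["B"] * 4
migrant = [0, 0, 0, 1, 1, 1, 1, 0]
score = [0.0, 0.0, 0.0, -1.0, 1.0, 1.0, 1.0, 2.0]

pooled = slope(migrant, score)  # no programme fixed effects
within = slope(demean_by_group(migrant, programme),
               demean_by_group(score, programme))  # with fixed effects
```

Here the pooled slope is zero, because the programme premium offsets the migration penalty, while the within-programme slope is -1, the penalty built into the data.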

Controlling for commuters
In this section, we focus on a difficulty presented by our results in Table 2, given the inability to observe commuter students. Students who live in municipalities other than those of their HEIs may, owing to their proximity, choose to commute to their HEIs rather than migrate. This makes it necessary to separate potential commuters from migrants in order to identify accurately the effect of migration on students' math or reading test scores in Sabre Pro. As a strategy to control for potential commuter students in our sample, we defined three categories of students, as follows: (1) non-migrants, whose area of residence during the Sabre 11 test was the same as that of the HEI in which their academic programme was located; (2) commuter students, whose municipality of residence during the Sabre 11 test was at a distance less than or equal to d from the municipality in which their HEI was located; and (3) migrant students, whose municipality of residence during the Sabre 11 test was at a distance greater than d from the municipality in which their HEI was located.
To measure the distance between one municipality and another, we took the coordinates of the centroids of the municipalities and calculated the Euclidean distances between them. For the threshold d, we define five values: 20, 40, 60, 80 and 100 km. Table 4 shows the regression results of equation (4), controlling for commuter students.
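The three-way classification can be sketched in a few lines. The example assumes that centroid coordinates are expressed in a projected system measured in kilometres, so that the Euclidean distance is directly comparable with d; the text does not specify the coordinate system, and the function `classify` and the coordinates are hypothetical.

```python
import math


def classify(home_xy, hei_xy, same_municipality, d_km):
    """Label a student as non-migrant, commuter or migrant for threshold d_km."""
    if same_municipality:
        return "non-migrant"
    # Euclidean distance between municipality centroids (assumed to be in km).
    distance = math.dist(home_xy, hei_xy)
    return "commuter" if distance <= d_km else "migrant"


thresholds_km = [20, 40, 60, 80, 100]
home, hei = (0.0, 0.0), (30.0, 40.0)  # centroids 50 km apart
labels = {d: classify(home, hei, False, d) for d in thresholds_km}
```

A student whose home municipality lies 50 km from the HEI is labelled a migrant under the 20 and 40 km thresholds and a commuter under the 60, 80 and 100 km thresholds, which shows how the composition of the two groups shifts as d grows.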
The results when we control for potential commuter students are similar to those obtained in Table 2. This implies that our results are robust to the strategy we followed for identifying potential commuter students amongst those we classified as migrants in Table 2. After controlling for potential commuter students, migrant students are still found to have lower scores on the Sabre Pro tests than their non-migrant peers. In the case of commuter students, we did not find enough evidence that their scores differ significantly from those obtained by their non-migrant peers. Finally, as we increased the threshold d, the estimated coefficient for the migrant dummy became more negative for the math test, while the pattern for the reading test was ambiguous. This implies that, as we better identify students who live farther from their HEIs and are therefore more confident that they are correctly classified as migrants, their Sabre Pro performance tends to be lower than that of their non-migrant peers. We interpret this result as additional evidence supporting our hypothesis regarding the costs associated with migrating.
Additionally, when we define commuters using a low threshold (20-40 km), we identify commuter students accurately because such distances are short and can be covered without difficulty. However, for larger thresholds (80-100 km), we identify commuter students more noisily, because a student is more likely to migrate as the distance between their municipality and the HEI grows. The evidence that the commuter coefficient is not significant under lower thresholds (suitable identification of commuters) but is significant under larger thresholds (inadequate identification of commuters) is reasonable and consistent with our evidence of an adverse effect of migration on the score. Because larger thresholds identify commuter students noisily, some migrant students may be included in the commuter category, and these students may drive the estimated commuter coefficient to be negative and significant, consistent with our evidence that being a migrant implies lower scores.

CONCLUSIONS
This study examines the relationship between the academic performance of students in higher education and their decision to migrate, which presents both benefits and costs. New environments can offer better social conditions, greater access to new social networks, significant personal development and independence for students. However, the decision to migrate involves financial and psychological costs to students, associated with abandoning the family circle and the customs of the local community. This implies that the relationship between performance and migration is ambiguous because some factors associated with migration have positive or adverse effects. In this work, we propose an empirical strategy to find evidence for adverse effects, controlling for positive effects, of the decision to migrate on university students' performance.
This work uses publicly available data from ICFES of standardised tests in high-school education (Sabre 11) and higher education (Sabre Pro). We used data from 2017 to 2019 for university students, and from 2011 to 2015 for high-school tests. This method was adopted to control for their initial conditions before entering a HEI. Academic performance in higher education is measured as the standardised score on quantitative reasoning and reading tests. Students were defined as migrants when the municipality of residence that they reported in high-school education was different from the one in which the HEI they attended was located. This definition is arbitrary and potentially includes a few commuter students. Thus, different robustness exercises were carried out using the Euclidean distance between the urban centre of the municipality where the high-school student lives and that of the HEI to control for potential commuter students.
The results of this research provide evidence of the adverse effects of students' decision to migrate, controlling for potential gains, on their academic performance in Colombia. Migrant students, on average, obtain lower scores than others, controlling for other potential explanatory variables of Sabre Pro test scores. These results are robust to how the variable of interest, migrant status, is measured. The robustness exercises controlled for possible commuter students, that is, students who live in municipalities near HEIs and can travel daily without needing to migrate.

Table 2. Academic performance and migration.

Table 3. Academic performance and migration: the role of fixed effects (FE) in higher education institutions' (HEIs) programmes.
Note: Standard errors are obtained through bootstrapping for OLS-SB and IV-SB. All regressions include a constant term; the controls and FE are the same as in Table 2, except when it is explicitly mentioned that the fixed effects of HEIs' programmes are excluded. The weak IV t-test is a t-test of the first stage on the instrumental variable (IV). We correct this by bootstrapping the standard errors of the regression coefficient of the IV. ***p < 0.01, **p < 0.05, *p < 0.1.

Table 4. Academic performance and migration with commuter students.
Note: Standard errors are obtained through bootstrapping for OLS-SB and IV-SB. All regressions include a constant term; controls and fixed effects are the same as in Table 2. ***p < 0.01, **p < 0.05, *p < 0.1.