Constructing independent evidence from regression and instrumental variables with an application to the effect of violent conflict on altruism and risk preference

ABSTRACT To provide an unbiased estimate, a regression analysis depends, among other things, on there being no unmeasured confounding. Often, unmeasured confounding is thought to be possible but not severe, which leads to a secondary instrumental variables (IV) analysis. However, these two analyses are correlated, so it is unclear how much independent evidence the IV analysis provides. We resolve this redundancy using a new estimator, which extracts the part of the regression estimator that is uncorrelated with the IV-based 2SLS estimator. We apply our approach to analyze the effect of exposure to violent conflict on preferences for altruistic behavior, time and risk.


Instrumental variables analysis as a secondary analysis: a redundancy
The linear regression model is often used to infer the causal effect of a treatment while controlling for various confounders. Consider the linear regression model of $y_i$ on $d_i$ and $x_i$, for i = 1, 2, . . ., n, with β denoting the coefficient of the treatment; this model is popular in practice for assessing the significance of the effect of a treatment d on another variable y, controlling for confounders x. The Gauss-Markov theorem says that the best linear unbiased estimator of β, under the conditions of zero-mean, uncorrelated and homoskedastic errors, is the ordinary least squares (OLS) estimator, $\hat{\beta}_{OLS}$. Most empirical studies of the effect of one variable on another start with the traditional t-statistic and the corresponding p-value based on $\hat{\beta}_{OLS}$. The most common criticism of such an analysis is that there might be unmeasured confounding variables that bias the OLS estimate and thus make a test of significance of β based on $\hat{\beta}_{OLS}$ invalid for assessing the causal effect of the treatment d. To address this criticism, an instrumental variables (IV) based analysis is often presented next (see, e.g. [1][2][3]). Brookhart et al. [4] recommend IV methods as a secondary analysis to regression analysis for the following setting: 'IV methods are inefficient and should not be used as a primary analysis unless unmeasured confounding is thought to be strong . . . . If unmeasured confounding is possible, but not expected to be severe, IV may be more appropriately used as a secondary analysis.'
IV analysis has been used as a secondary analysis in various studies [5][6][7].
When IV is used as a secondary analysis to a primary regression analysis using the same data set, the two analyses are correlated. Let us consider the simplest instrumental variable model with a single instrument, z, for our variable of interest d, in which Equation (2) relates $d_i$ to $z_i$ through the coefficient γ. The standard estimate of the coefficient β using instrumental variables is the two-stage least squares (2SLS) estimator, which is obtained by first regressing d on z and then regressing y on the predicted values of d from the first regression. The 2SLS estimate of β here is denoted by $\hat{\beta}_{2SLS}$.
Figure 1 shows a scatter plot of the two estimators $\hat{\beta}_{OLS}$ and $\hat{\beta}_{2SLS}$ for 1000 simulations with β = 0.1, γ = −1, and no covariates. The high (positive) correlation of the two estimators is immediate from this plot. A quantitative view of the correlation structure is given in Table 1. Even with a very small correlation of $d_i$ and $z_i$ through Equation (2), we can see a high correlation between the two estimators. For example, with γ = 0.5, these two estimators in this simple model have correlation around 0.5. When γ = ±2, the simulated values of the correlation are as high as 0.92. Therefore, in an empirical study, a chance error that led to concluding statistical significance of the effect based on the OLS estimator when there is truly no effect could very well be the reason for a significant effect in the secondary IV analysis as well. This is worrisome, since we would not like to be a fool who bought 'several copies of the morning paper to assure himself that what it said was true' ([8, #265], quoted in [9]) by reporting two highly correlated pieces of evidence as if they are separate pieces. In supplementary Section S8, we provide additional simulation results that also include a covariate.
Suppose that after the primary regression analysis finds a significant treatment effect, the secondary IV analysis also finds a significant treatment effect. Does this provide meaningful confirmatory evidence for there being a true causal effect of treatment? As these analyses are correlated, it is difficult to interpret such findings.
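The correlation described above can be explored with a short simulation. The following is a minimal sketch, not the simulation code behind Figure 1 or Table 1: it assumes a no-intercept outcome model y = βd + u and first stage d = γz + v with independent normal errors of standard deviation 0.5, and it assumes a Uniform(−1, 1) instrument, which is not stated in the text. The function names are hypothetical.

```python
import numpy as np

def simulate_estimators(gamma, beta=0.1, n=100, n_sims=1000, sigma=0.5, seed=0):
    """Return arrays of OLS and 2SLS estimates over repeated simulated samples."""
    rng = np.random.default_rng(seed)
    ols = np.empty(n_sims)
    tsls = np.empty(n_sims)
    for s in range(n_sims):
        z = rng.uniform(-1.0, 1.0, n)   # assumed instrument distribution
        v = rng.normal(0.0, sigma, n)
        u = rng.normal(0.0, sigma, n)   # drawn independently of v: no confounding
        d = gamma * z + v
        y = beta * d + u
        ols[s] = (d @ y) / (d @ d)      # no-intercept OLS slope of y on d
        tsls[s] = (z @ y) / (z @ d)     # single-instrument IV estimate (equals 2SLS)
    return ols, tsls

def estimator_correlation(gamma, **kwargs):
    """Sample correlation between the OLS and 2SLS estimates across simulations."""
    ols, tsls = simulate_estimators(gamma, **kwargs)
    return float(np.corrcoef(ols, tsls)[0, 1])
```

Calling `estimator_correlation(0.5)` and `estimator_correlation(2.0)` gives the simulated correlation for a weaker and a stronger first stage; under these assumptions the correlation grows with |γ|.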

Summary of the proposed methodology
In this paper, we develop an approach to constructing two independent pieces of evidence from a regression analysis and an IV analysis for the setting when the IV analysis is used as a secondary analysis. We first present the methodology in the context of the linear structural equations model. Then, in Section S4 of the supplementary, we extend the method to a general treatment effect model that allows for heterogeneity and non-additive effects.
For the linear structural equations model, our procedure first carries out the primary OLS analysis. If this analysis rejects the null hypothesis, we propose to present two secondary analyses, one based on the instrumental variable based 2SLS estimator and a second based on a new estimator, denoted in Figure 2 by $\hat{\beta}_{EX}$, which is asymptotically independent from the 2SLS estimator. These two secondary analyses provide separate evidence about the causal effect. Using the testing of hypotheses in order argument of [10], we show that for this two stage procedure, which conducts three tests of hypothesis each at a prespecified level α, the overall type I error is still controlled at level α.

[Figure caption: . . . (2), based on a sample of size n = 100; $u_i, v_i \sim N(0, 0.5^2)$; no x variable.]
Rosenbaum [9] uses the term evidence factors when two independent analyses are conducted for one null hypothesis where the individual analyses depend on different sets of assumptions. In the case when only one of two sets of assumptions fails, while the other remains true, the decision based on combining the two analyses remains reliable. In our proposal, the two evidence factors in the secondary analysis are based on the estimators $\hat{\beta}_{2SLS}$ and $\hat{\beta}_{EX}$, both of which test the same null hypothesis of no causal effect of the treatment while they are derived from different assumptions. The 2SLS estimator assumes that the instruments are valid, namely, A.1 the instruments are associated with the treatment, A.2 instruments cannot directly affect the outcome, and A.3 there is no unmeasured confounding between the instruments and the outcome. The EX estimator assumes that A.4 there is no unmeasured confounding between the treatment and the outcome, and A.1 the instruments are associated with the treatment. Note that the assumption A.1 is testable while A.2, A.3 and A.4 are not. We also discuss later a way to assess the sensitivity of these analyses to potential violations of assumptions A.2, A.3 and A.4.

[Figure caption: . . . (2), with the same data as in Figure 1.]
The rest of the paper is organized as follows. Our motivating example is presented in Section 2. A new estimator, the EX estimator, is defined via (8) in Section 3.1. Section 3.2 verifies the consistency of the proposed estimator. The technical result showing that the 2SLS analysis and the proposed analysis are asymptotically independent is proved in Section 3.3. Results for our motivating example are presented in Section 4. Section 5 discusses methods of sensitivity analysis, and the results of sensitivity analysis for the example are presented in Section 6. The supplementary further provides an extension of the estimation procedure to a general instrumental variables model that allows for heterogeneity and non-additive effects.

Motivating example: does exposure to violence alter preferences? Data from the civil war in Burundi
Wars have been referred to as 'development in reverse' because of their destruction of capital [11]. In several countries over the past century, we have observed remarkable economic and social rebounds after a period of war. These postwar recoveries can be partly attributed to generous humanitarian aid [12]. An additional hypothesis for rapid economic growth after the war is that violent conflict can spur societal reforms that promote economic growth (e.g. [13]). One mechanism by which violent conflict could spur (or hinder) economic growth is that the conflict may alter preferences [14][15][16][17]. Voors et al. [16] conducted a study to test whether exposure to a violent conflict changes people's preferences toward altruistic behavior, time preference (in particular, a measure of impatience for gaining some money now vs. more money later) and risk preference. The study was based on data on the history of violence in different communities of Burundi combined with data from field experiments used to determine preferences.
Burundi went through a civil war between 1993 and 2006, resulting from long-standing ethnic divisions between the Hutu and Tutsi ethnic groups; in the war, the Tutsi-dominated army clashed with Hutu militias.The war left over 300,000 people dead and many displaced [18].
Following Voors et al., our treatment variable of interest is a measure of conflict victimization, the number (relative to the community) of war-related deaths between 1993 and 2003. We have household-level observations from 35 communities on various household-level and community-level variables. These communities are shown in the map in Figure 3. Out of the 35 communities, 11 did not experience any form of violence, while among the communities that were exposed to violence the percentage of deaths during the period 1993-2003 varied from 0.078% to 15.62%. Table 2 provides the descriptive statistics of the variables in the data set.
Four outcome variables are used to quantify preferences toward altruism, risk and time preference. Experimental games were used to measure these preferences. The degree of altruism was measured on a scale from 0 (purely selfish preference) to 100 (unselfish preferences) with a value of 50 denoting the social optimum which maximizes joint payoffs. The average of this measure was 27, which indicates an overall selfish preference. Risk preferences for gains/losses were measured on a scale of 0 to 3. The average values of these risk preferences for gains and losses were 1.87 and 2.31, respectively, indicating an overall risk-taking behavior. Finally, time preference of a subject was measured as the discount rate d such that the subject is willing to choose a fixed amount of money at d% interest rate two weeks later over receiving the fixed amount the following day. For detailed information on these experiments, see the web Appendix to [16].

To estimate the causal effect of exposure to violence on preferences, Voors et al.'s primary analysis regressed the preference measures on the exposure to violence. For this analysis to provide a consistent estimate of the causal effect, there should be no unmeasured confounding. Support for the assumption of no unmeasured confounding is provided by the fact that the violence in the civil war was thought to be largely indiscriminate because of the army's inability to identify rebels, a desire for extermination, 'revenge by proxy', plundering, and a perceived need to demonstrate power as part of the tactics of fear to control a population [19][20][21]. However, there are possible unmeasured confounders. Communities with greater ethnic or political cleavages may be easier targets because they are less able to defend themselves or, conversely, communities with fewer cleavages may be more likely to be targeted because of their potential support for the 'other side' [16]. To address concerns about unmeasured confounding, Voors et al. conducted a secondary instrumental variable (IV) analysis using altitude and distance to Bujumbura, Burundi's capital, as instruments.
Following Voors et al., we consider two instrumental variables: the distance to the capital Bujumbura and the altitude of the community, both on a log scale. These two variables are moderately negatively correlated, with a correlation of −0.39. A principal component analysis of these two instruments shows that 95% of the total variability is explained by the first component, which has a loading of −0.995 on the distance to the capital.
Figure 3 shows the geography of Burundi and remoteness of various cities of the country from its capital Bujumbura which is situated on the western land border by Lake Tanganyika. Data were collected from 35 randomly selected communities from 13 provinces. Within each community a number of random household heads were surveyed to measure their preferences. Different communities in the data are shown in the map with a color gradient depicting the violence level experienced by the community during the war. The primary regression analysis tests the null hypothesis that violence does not affect preferences by assuming that violence experienced by a community, reflected by the color gradient in Figure 3, is independent of what the community's preferences would have been in the absence of violence. The IV analysis using distance from the capital as the IV tests the null hypothesis of no effect of violence on preferences by assuming that distance to capital is independent of what the community's preferences would have been in the absence of violence. Essentially, the IV analysis is assuming that what the average community preferences would be in the absence of violence in a contour of average distance from the capital (dashed contours) is the same across contours while allowing for unmeasured confounding within a contour, whereas the regression analysis assumes no unmeasured confounding both within a contour and across contours.

[Table note: . . . (1993-2003). Only the estimate of the coefficient of the variable of interest and the corresponding standard errors clustered at the community level are reported. For a list of all the other covariates, see Table 2. * Significant at the 10 percent level. ** Significant at the 5 percent level. *** Significant at the 1 percent level.]
For example, the first row of Table 3 (discussed in detail later in Section 4) shows the regression analysis (OLS) and IV analysis (2SLS) for the effect of exposure to violence on altruistic preferences.The regression analysis shows a significant positive effect, providing evidence that exposure to violence causes altruistic preferences under the assumption that altruistic preferences in the absence of violence would be the same on average between the communities that were actually exposed to high levels of violence compared to communities actually exposed to low levels of violence.The IV analysis also shows a significant positive effect, providing evidence that exposure alters altruistic preferences under the assumption that altruistic preferences in the absence of violence would be the same on average among communities that are near vs. far from the capital.But these two analyses are not separate; they use the same data set and they are correlated.

A new estimator
We first describe our causal model involving instrumental variables. Let the units of the data be denoted by i = 1, 2, . . ., n. We denote by y the outcome, d the treatment, and z the instrument(s). Each unit i has a potential outcome $y_i^{(d^*, z)}$, the outcome the unit would have if the unit received the treatment $d^*$ and the instrument value z [22,23]. We start by assuming the exclusion restriction, which says that,
A.2 The instrument only affects the outcome through its effect on treatment received, i.e. the instrument has no direct effect on the outcome.
This allows us to write the potential outcome as $y_i^{(d^*)}$, dropping the effect of the instrument. In the Burundi data, the treatment is the level of violence measured as the percentage of deaths in violent conflict in the community. The outcome is a measure of preference. Each unit thus has a vector of potential outcomes and, depending on the observed value of the treatment for unit i, a particular corresponding potential outcome value is observed. Let $D_i := d_i^{obs}$ denote the observed level of the treatment for unit i, $Z_i := z_i^{obs}$ denote the observed instrument level, and $y_i := y_i^{(d_i^{obs})}$ denote the observed outcome. An additive, linear constant-effect causal model for the potential outcomes is (see [24])
$$y_i^{(d^*)} = y_i^{(0)} + \beta d^*. \qquad (3)$$
Our interest is in the parameter β, which is the causal effect of the treatment d on the outcome variable y for a one unit increase in the treatment level. A variable (or vector of variables) $Z_i$ is a valid instrument for estimating β if, along with A.2, the following are satisfied (see [25,26]):
A.1 The instrument is associated with the treatment.
A.3 There is no confounding between the instrument and the observed outcome. Under model (3) this is the same as saying $y_i^{(0)}$ and $Z_i$ are independent.
Often we need to condition on observed covariates, so that the proposed instrument(s) $Z_i$ satisfies these conditions. Let $X_i$ be the vector of observed covariates, and suppose that, conditional on $X_i$, the conditions A.1, A.2, and A.3 are satisfied by $Z_i$. We consider a linear model, (4), for the conditional expectation of the baseline potential outcome given the covariates. Under (4), together with a linear model for $D_i = d_i^{obs}$, we consider the two-equation model (5) on the observed data. The first equation of (5) is implied by the potential outcomes model (4). The second equation is an additional assumption. Model (5) is the linear structural (simultaneous) equations model that is widely used in economics [27].
In our presentation, we frame the problem in a general setting allowing the treatment to be a multilevel variable. In matrix notation, model (5) is written as the outcome equation (6) and the IV model (7). In (6), $y = (y_1, \ldots, y_n)$ is an n × 1 vector of the outcome variable, $D = (D_1 : \ldots : D_n)$ is an n × m matrix of observations on an m-dimensional variable of interest (i.e. treatment variables), $X = (X_1 : \ldots : X_n)$ is an n × $k_1$ matrix of control covariates, and $u = (u_1, \ldots, u_n)$, of dimension n × 1, is the error term. In the IV model (7), $Z = (Z_1 : \ldots : Z_n)$ is the matrix of observations on the instruments. For a matrix A, let $P_A$ be the projection matrix onto the column space of A and $M_A$ be the projection matrix onto the orthogonal complement of the column space of A. With a slight abuse of notation, we write $P_{XZ}$ and $M_{XZ}$ for $P_{[X:Z]}$ and $M_{[X:Z]}$, respectively. Our interest is in the parameter β, whose OLS and 2SLS estimators are $\hat{\beta}_{OLS}$ and $\hat{\beta}_{2SLS}$. Our new estimator, $\hat{\beta}_{EX}$, is defined in (8). This estimator can be calculated as the coefficient of D when y is regressed on D and X, Z.
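For reference, the three estimators can be written in closed form using the projection notation just introduced. The display below is a sketch derived from the verbal descriptions in this section via the Frisch-Waugh-Lovell theorem, assuming the matrices being inverted are nonsingular; it is not a reproduction of the original displays, and $\hat{D}$ is shorthand for the first-stage fitted values.

```latex
\begin{aligned}
\hat{\beta}_{OLS}  &= (D^{\top} M_{X} D)^{-1} D^{\top} M_{X}\, y, \\
\hat{\beta}_{2SLS} &= (\hat{D}^{\top} M_{X} \hat{D})^{-1} \hat{D}^{\top} M_{X}\, y,
                      \qquad \hat{D} = P_{XZ} D, \\
\hat{\beta}_{EX}   &= (D^{\top} M_{XZ} D)^{-1} D^{\top} M_{XZ}\, y .
\end{aligned}
```

The last line is the coefficient of D when y is regressed on D, X and Z, because partialling out [X : Z] from both y and D leaves $M_{XZ} y$ and $M_{XZ} D$.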
The purpose of this new estimator is for it to be used along with the 2SLS estimator to strengthen the supplementary analysis when the OLS analysis is conducted as the primary analysis.Below, we provide a discussion of how existing literature motivates this estimator.
The 2SLS estimator motivates our estimator $\hat{\beta}_{EX}$. The 2SLS estimator is easily calculated using a two-stage least squares method in which, in the first stage, a predicted value of D ($\hat{D}$) from (7) is calculated and, in the second stage, β is estimated through regression model (6) with $\hat{D}$ replacing D. Our proposed estimator, $\hat{\beta}_{EX}$, is a different type of two-stage estimator: at the end of the first stage of regressing D on X and Z, we store the residual $\hat{V}$, and in the second stage, β is estimated through regression model (6) with $\hat{V}$ replacing D.
The 2SLS estimator uses the part of the variation in D that is explained by the instrument Z to estimate the treatment effect. Our proposed estimator is the converse: it uses the part of the variation in D that is not explained by the instrument Z to estimate the treatment effect.
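The two-stage descriptions above translate directly into a few least-squares fits. The sketch below is illustrative only: the function names and the simulated data-generating process are assumptions made for the example, not part of the paper. It computes the 2SLS estimate, the EX estimate from the residual-based second stage, and the EX estimate as the coefficient of D in a single regression of y on D, X and Z, so that the two routes to the EX estimate can be compared.

```python
import numpy as np

def fit(y, A):
    """Least-squares coefficients from regressing y on the columns of A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

def two_stage_estimates(y, D, X, Z):
    """Return (2SLS, EX via residuals, EX via one regression of y on D, X, Z)."""
    m = D.shape[1]
    XZ = np.column_stack([X, Z])
    D_hat = XZ @ fit(D, XZ)          # first stage: fitted values of D given X and Z
    V_hat = D - D_hat                # first-stage residuals
    beta_2sls = fit(y, np.column_stack([D_hat, X]))[:m]
    beta_ex = fit(y, np.column_stack([V_hat, X]))[:m]
    beta_ex_direct = fit(y, np.column_stack([D, X, Z]))[:m]
    return beta_2sls, beta_ex, beta_ex_direct

def simulate_data(n=500, seed=0):
    """Hypothetical data: one treatment, one instrument, intercept plus one covariate."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    Z = rng.normal(size=(n, 1))
    D = Z + 0.5 * X[:, [1]] + rng.normal(size=(n, 1))
    y = 0.3 * D[:, 0] + X[:, 1] + rng.normal(size=n)   # true effect 0.3, no confounding
    return y, D, X, Z
```

For example, `two_stage_estimates(*simulate_data())` returns three length-one arrays; the second and third agree up to floating-point error.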
This two-stage procedure might remind a reader of the augmented regression version of the Durbin-Wu-Hausman test ([28][29][30]; see [31], Section 8.7 for the augmented regression version) for testing whether there is unmeasured confounding assuming valid instruments. Yet, they are different. The Durbin-Wu-Hausman test tests whether the expected value of the difference of the OLS and 2SLS estimators is zero, as it is expected to be when the outcome model is free of any unmeasured confounders. The augmented regression version of the Durbin-Wu-Hausman test, which is asymptotically equivalent to the original test, uses the residuals of the first stage regression fit, $\hat{V}$, as a new independent variable in the second stage regression alongside D and X and tests if its coefficient is zero. Thus, unlike the two stage procedure that gives us $\hat{\beta}_{EX}$ with the second stage regressing y on m + $k_1$ regressors, the second stage regression in the augmented regression version of the Durbin-Wu-Hausman test regresses y on 2m + $k_1$ regressors.
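To make the contrast concrete, the sketch below builds the second-stage design of the augmented regression, with D, the first-stage residuals and X as regressors, and returns the coefficients on D and on the residuals. The function names are hypothetical and the code is a minimal illustration rather than an implementation of the full test, which would also need a standard error for the residual coefficient. In a linear model the coefficient on D in this augmented regression coincides with the 2SLS estimate, which the helper `two_sls` lets a reader check numerically.

```python
import numpy as np

def fit(y, A):
    """Least-squares coefficients from regressing y on the columns of A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

def augmented_regression(y, D, X, Z):
    """Regress y on D, first-stage residuals and X; return both coefficient blocks."""
    m = D.shape[1]
    XZ = np.column_stack([X, Z])
    V_hat = D - XZ @ fit(D, XZ)                  # first-stage residuals
    design = np.column_stack([D, V_hat, X])      # 2m + k1 regressors
    coef = fit(y, design)
    return coef[:m], coef[m:2 * m], design.shape[1]

def two_sls(y, D, X, Z):
    """2SLS estimate: regress y on first-stage fitted values of D and on X."""
    m = D.shape[1]
    XZ = np.column_stack([X, Z])
    D_hat = XZ @ fit(D, XZ)
    return fit(y, np.column_stack([D_hat, X]))[:m]
```

The EX second stage, by comparison, uses only the residuals and X, that is, m + $k_1$ regressors.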

Consistency of the proposed estimator
Under standard conditions on the variables, the proposed estimator $\hat{\beta}_{EX}$ in (8) is a consistent estimator of the parameter β under the assumption of no unmeasured confounding between the treatment and the outcome. Theorem 3.1 states this result under the condition $E(V^{\top} u \mid X, Z) = 0$; its proof follows from arguments similar to those in Chapter 5.2 of [32].
The assumption $E(V^{\top} u \mid X, Z) = 0$, or equivalently $E(D^{\top} u \mid X, Z) = 0$, is the assumption that, conditional on the observed covariates and the instruments, there is no unmeasured confounding between the treatment and the outcome. This condition implies no unmeasured confounding in (6), i.e. D is exogenous with respect to y conditional on the covariates X, if assumption A.3 is satisfied.
If the condition of Theorem 3.1 is not satisfied, the bias of the estimator (8) can be quantified. The bias of the estimator $\hat{\beta}_{EX}$ is of the order of $E(V_1 u_1 \mid X, Z)$. Details of the bias calculation are given in the supplementary.

$\hat{\beta}_{2SLS}$ and $\hat{\beta}_{EX}$ are asymptotically independent
We show that $\hat{\beta}_{2SLS}$ and $\hat{\beta}_{EX}$ are asymptotically independent. Therefore, evidence based on the two IV-based estimators $\hat{\beta}_{2SLS}$ and $\hat{\beta}_{EX}$ carries separate information about the coefficient β.

Theorem 3.2:
Assume the conditions of Theorem 3.1, along with the additional assumptions of this theorem. Then, $\hat{\beta}_{2SLS}$ and $\hat{\beta}_{EX}$ are asymptotically independent.
Our results all hold true under setups with heteroskedasticity and clustering in which robust standard errors are valid [33,34].
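The independence result can be illustrated numerically in the simplest setting. The sketch below assumes a single instrument, no covariates, no intercept, homoskedastic independent normal errors and a Uniform(−1, 1) instrument; these choices and the function name are assumptions made for illustration, not the paper's simulation design. It returns, for each simulated sample, the OLS, 2SLS and EX estimates so that their pairwise sample correlations can be inspected.

```python
import numpy as np

def simulate_three_estimators(gamma=1.0, beta=0.1, n=100, n_sims=2000,
                              sigma=0.5, seed=1):
    """Return an (n_sims, 3) array with columns OLS, 2SLS, EX."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_sims, 3))
    for s in range(n_sims):
        z = rng.uniform(-1.0, 1.0, n)                # assumed instrument distribution
        d = gamma * z + rng.normal(0.0, sigma, n)
        y = beta * d + rng.normal(0.0, sigma, n)     # no unmeasured confounding
        d_hat = z * (z @ d) / (z @ z)                # first-stage fitted values
        v_hat = d - d_hat                            # first-stage residuals
        out[s, 0] = (d @ y) / (d @ d)                # OLS
        out[s, 1] = (d_hat @ y) / (d_hat @ d_hat)    # 2SLS
        out[s, 2] = (v_hat @ y) / (v_hat @ v_hat)    # EX
    return out
```

With `est = simulate_three_estimators()`, the matrix `np.corrcoef(est, rowvar=False)` shows a sizeable OLS-2SLS correlation and a 2SLS-EX correlation close to zero under these assumptions.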

OLS, 2SLS, and EX estimators
An OLS analysis provides a consistent estimator of the causal parameter assuming no unmeasured confounding between the treatment and the outcome. The instrumental variables-based 2SLS analysis is consistent when the instruments satisfy assumptions A.1, A.2, and A.3. Even if there is concern about unmeasured confounding and the instruments are thought to be valid (satisfy A.1, A.2 and A.3), OLS may still be the preferred method for primary analysis if the unmeasured confounding is thought not likely to be serious, because 2SLS is inefficient compared to OLS [35,36]. We propose a method of analysis where both the OLS analysis and the 2SLS analysis can be considered together, while keeping the OLS analysis as primary. If we are going to consider the two analyses together, then we need to control for multiple testing. A Bonferroni correction treats the OLS and IV analyses equally. The testing in order method we suggest gives more power to the OLS analysis while allowing us to look at the 2SLS and EX analyses. In Section 3.3, we showed that the two instrumental variables-based estimators, the 2SLS estimator and the EX estimator, are asymptotically independent. Further, we note that the EX estimator is valid as long as assumption A.1 is satisfied and unmeasured confounding is not severe (i.e. $E(D^{\top} u \mid X, Z) = o(1)$). Thus, these two estimators form evidence factors in testing for a treatment effect [9]. We propose to change the standard practice of considering 2SLS analysis as a secondary analysis and provide both 2SLS and EX analysis as secondary analyses.
Our method is the following:
(1) Conduct the OLS analysis and test for a treatment effect at level 0.05. If the OLS analysis does not reject, stop.
(2) If the OLS analysis rejects, then test for a treatment effect using each of the 2SLS and EX analyses, both at level 0.05.
Even though our method involves potentially carrying out three different significance tests at level 0.05, we show in the next section that our method controls the overall type I error rate because of the sequential way in which tests are performed and the relationship between the tests.

Primary and secondary analyses: testing of hypotheses in order
Let $E_{|X}(\cdot)$ denote the conditional expectation given X. In the population, $\hat{\beta}_{OLS}$ estimates the population quantity $\partial E_{|X}(y_1 \mid D_1)/\partial D_1$, since it estimates the slope of D when y is projected on X and D. Similarly, $\hat{\beta}_{2SLS}$, which is computed as the coefficient of the projection of D on Z when the outcome y is regressed on this projection, estimates the ratio $\{\partial E_{|X}(y_1 \mid Z_1)/\partial Z_1\} / \{\partial E_{|X}(D_1 \mid Z_1)/\partial Z_1\}$, and $\hat{\beta}_{EX}$ estimates the slope of D when y is projected on X, Z and D. Whether or not any of the assumptions are valid, significance tests based on the three estimators $\hat{\beta}_{OLS}$, $\hat{\beta}_{2SLS}$ and $\hat{\beta}_{EX}$ correspond to three null hypotheses, $H_{OLS}$, $H_{2SLS}$ and $H_{EX}$, respectively, each stating that the population quantity estimated by the corresponding estimator is zero. We observe that the three hypotheses are interrelated as follows. Suppose $\partial E_{|X}(D_1 \mid Z_1)/\partial Z_1 \neq 0$, which corresponds to assumption A.1 that the instrument is associated with the treatment. As a consequence, if $H_{2SLS}$ is true, then $H_{EX}$ cannot be false, and if $H_{OLS}$ is false, $H_{2SLS}$ cannot be true.
We partition these three hypotheses as $\mathcal{T} = \{T_{primary}, T_{secondary}\}$, where $T_{primary} = \{H_{OLS}\}$ and $T_{secondary} = \{H_{2SLS}, H_{EX}\}$. Then, with the ordering primary ≺ secondary on the index set of $\mathcal{T}$, in the language of [10], the partition of disjoint intervals of hypotheses $\mathcal{T}$ is sequentially exclusive and Rosenbaum's Proposition 2 applies. Hence, we have proof of the following proposition.

Proposition 3.1:
The probability that the two step method proposed earlier, in Section 3.4, rejects at least one true hypothesis is at most α = 0.05.
When the OLS analysis shows significance, the proposed method allows us to perform the two secondary analyses. Based on their results, we can say that we have 'no evidence', 'one piece of evidence' or 'two pieces of evidence' at the secondary level, depending on whether neither, only one, or both of the secondary estimators give statistical significance.

Results: three estimates in the Burundi data
Table 3 reports the estimates of the effect of violence and the corresponding robust standard errors clustered at the community level for the four outcome variables. The OLS analyses show statistically significant positive effects of conflict on preferences toward altruistic behavior, risk for gains and time preference. The OLS estimate does not show a significant non-zero effect on risk preferences for losses. For the three outcomes where the OLS analysis shows statistical significance, we perform the 2SLS and EX analyses simultaneously. When considering the effect of conflict on the degree of altruism, both secondary analyses show significance. Only the EX analysis shows a positive effect of conflict on risk preferences for gains at the 5% significance level; the 2SLS estimate is not significant for this outcome. The 2SLS estimate for time preference is statistically significant while the EX estimate is not. Therefore, a naive summary of Table 3 would say that a person exposed to violent conflict tends to exhibit more altruistic behavior, be more of a risk taker in the context of gains and be less patient.
Table 3 tells us that when considering the effect of violent conflict on the degree of altruism, we have two pieces of evidence against the null hypothesis of no effect, while for the effects on risk preferences for gains and on time preference, we have only one piece of evidence at significance level 0.05.
We can further test for any specified effect of the treatment. For example, suppose we are interested in the null hypothesis $H_0: \beta = 3$ for the effect of conflict on the degree of altruism. The OLS analysis rejects this null hypothesis at level 0.05; then, among the two secondary analyses, only the EX analysis rejects it. This is because the 95% confidence interval for the OLS estimator does not contain the hypothesized effect, the same is true for the EX confidence interval, while the 95% confidence interval for the 2SLS estimator contains the hypothesized effect. Such statistical decisions based on our proposed method for any such null hypothesis can be shown in a plot such as Figure 4. This figure presents, for each of the four outcome variables, the effect levels which we fail to reject, or reject only based on the primary OLS analysis, or reject with one piece of secondary evidence, or reject with two pieces of secondary evidence. As an illustration, let us consider risk preference in losses. We fail to reject any null hypothesis $H_0: \beta = \beta_0$ with value of $\beta_0$ between −0.019 and 0.058. A specified effect value between −0.033 and −0.019 is rejected only in the primary OLS analysis. We have one piece of secondary evidence against the null hypothesis $H_0: \beta = 0.12$, and any $\beta_0$ more than 0.152 or less than −0.067 is rejected with two pieces of secondary evidence.
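The classification of hypothesized effects described above amounts to checking which confidence intervals exclude a given value. The sketch below is an illustration under stated assumptions: it uses Wald intervals with a normal critical value, whereas the paper's intervals rest on clustered standard errors and may use a different reference distribution, and the function name, labels and dictionary keys are hypothetical.

```python
from statistics import NormalDist

def classify_effect(beta0, estimates, std_errors, alpha=0.05):
    """Label a hypothesized effect beta0 by how strongly the ordered tests reject it."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # assumed normal critical value

    def rejects(name):
        # Reject when beta0 lies outside the Wald interval for this estimator.
        return abs(estimates[name] - beta0) > z_crit * std_errors[name]

    if not rejects("ols"):
        return "not rejected"
    pieces = int(rejects("2sls")) + int(rejects("ex"))
    labels = ["rejected by OLS only",
              "rejected with one piece of secondary evidence",
              "rejected with two pieces of secondary evidence"]
    return labels[pieces]
```

Evaluating `classify_effect` over a grid of `beta0` values for one outcome gives the kind of partition of effect levels that a plot like Figure 4 displays.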

Sensitivity analyses
A sensitivity analysis asks the following question: if an analysis is based on a certain assumption, what magnitude of violation of that assumption would need to be present to alter the conclusion of the analysis? If we know that the conclusion of the naive analysis that presumes a certain assumption is still valid when some amount of bias is allowed, then this knowledge strengthens the conclusion. For discussion of various methods of sensitivity analysis for observational studies, see [37][38][39][40][41][42][43][44][45][46] and [47]. One crucial step of a sensitivity analysis is to determine a suitable parameter that can be used to quantify the deviation from the assumption. A sensitivity parameter should be such that a larger value of the parameter indicates a bigger deviation from the assumption in some intuitive sense. Once such a sensitivity parameter, say δ, is determined, a sensitivity analysis reports, for a given bound on δ, whether the conclusion of an analysis based on the assumption would remain unchanged if we allow for a deviation of magnitude at most that bound.
In our discussion, we are interested in the effect β of the treatment on the outcome y. Thus, for a given value $\beta_0$ of the effect, we conclude either that we reject the null hypothesis $H_0: \beta = \beta_0$ or that we fail to reject it. For simplicity, we consider the case where we have only one variable of interest (m = 1). For ease of explanation we use d to denote an n × 1 univariate treatment variable, rather than the matrix D which allowed multiple treatments.
Let us consider the OLS analysis first. The validity of the OLS analysis depends on the assumption that there are no unmeasured confounders. Denote the potential unmeasured confounder by w. Our sensitivity analysis method builds on a method proposed by Hosman et al. [44]. We use the parameter $\delta_1 = \rho_{wd \cdot X}$, which is the partial correlation between the unmeasured confounder and the treatment conditional on the covariates X. The parameter $\delta_1$ measures the magnitude of association of the unmeasured confounder w and the treatment d, while we let the magnitude of association between the unmeasured confounder and the outcome be unrestricted. A larger value of $|\delta_1|$ indicates a larger deviation from the assumption of no unmeasured confounding. When $|\delta_1|$ is allowed to take values in a certain range, for a given significance level α, we calculate an interval in which the (1 − α)100% confidence interval of β based on its OLS estimator must be contained. Such an interval is called a sensitivity interval. For $H_0: \beta = \beta_0$, allowing for an unmeasured confounder with sensitivity parameter $\delta_1$, we would still be able to reject the null hypothesis if the sensitivity interval does not include $\beta_0$.
A sensitivity interval under a given bound on $|\delta_1|$ can be calculated using (9), where $r_X = \mathrm{rank}(X)$ and $c(l, b)$ is defined in terms of $t_{\alpha,l}$, the (1 − α)th quantile of a t-distribution with l degrees of freedom. Using (9), we can determine the sensitivity of the OLS analysis for different bounds on the partial correlation $\delta_1 = \rho_{wd \cdot X}$. The derivations of (9) and of the other sensitivity intervals presented below in this section are given in the supplementary.
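The sensitivity parameter above is a partial correlation, which can be computed by residualizing both variables on the covariates and correlating the residuals. The sketch below shows only that definition; it does not compute the sensitivity interval (9). Since w is unobserved in practice, the function can only be evaluated on simulated data or with an observed covariate standing in for w, and the function names are hypothetical. X is assumed to include an intercept column.

```python
import numpy as np

def residualize(a, X):
    """Residuals from the least-squares regression of a on the columns of X."""
    return a - X @ np.linalg.lstsq(X, a, rcond=None)[0]

def partial_correlation(a, b, X):
    """Correlation between a and b after linearly adjusting both for X."""
    ra = residualize(a, X)
    rb = residualize(b, X)
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))
```

A bound on $|\delta_1|$ can then be read on the familiar correlation scale, between 0 (no association with the treatment given X) and 1.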
Let us now consider the instrumental variable-based 2SLS estimator. This estimator depends on the validity of the instruments, that is, on assumptions A.1, A.2 and A.3 being true. Assumption A.1, which says that there is an association between the instruments and the treatment, can be empirically validated. If this association is very small, i.e. the instruments are weak, then conventional asymptotic results for the 2SLS estimator are misleading even for large finite samples [48][49][50]. Whether the instrumental variable is strong enough for the asymptotic results to be reliable can be tested using the data, and thus a sensitivity analysis is not needed for this assumption [51,52].
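As a rough illustration of how instrument strength can be checked from the data, the sketch below computes the first-stage F-statistic in the simplest case of one instrument, an intercept and no covariates, where it equals $r^2(n-2)/(1-r^2)$ for the sample correlation $r$ between instrument and treatment. The function name `first_stage_f` and the toy data are hypothetical; the tests cited in [51,52] are more refined, and the setting of this paper has two instruments and covariates. A commonly quoted rule of thumb (not taken from this paper) treats a first-stage F below about 10 as a sign of weak instruments.

```python
def first_stage_f(z, d):
    """F-statistic for regressing treatment d on one instrument z (with intercept).

    Assumption: a single instrument and no covariates.
    """
    n = len(z)
    mz, md = sum(z) / n, sum(d) / n
    szd = sum((a - mz) * (b - md) for a, b in zip(z, d))
    szz = sum((a - mz) ** 2 for a in z)
    sdd = sum((b - md) ** 2 for b in d)
    r2 = szd ** 2 / (szz * sdd)        # first-stage R-squared
    return r2 * (n - 2) / (1.0 - r2)   # F with (1, n - 2) degrees of freedom

# Toy data (hypothetical), for illustration only.
z = [1.0, 2.0, 3.0, 4.0, 5.0]
d = [2.0, 1.0, 4.0, 3.0, 5.0]
f_stat = first_stage_f(z, d)
```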
To see how the observed outcome behaves when assumptions A.2 and A.3 are not enforced, we return to the potential outcomes framework. Recall that $y_i^{(d^*,z)}$ is the potential outcome for unit $i$ on receiving treatment $d^*$ at instrument value $z$. The additive, linear constant-effect causal model for the potential outcomes is parallel to (3): , so that $\kappa_{1i}(\cdot)$ measures the direct effect of the IV on the outcome; a non-zero value of $\kappa_{1i}(\cdot)$ violates A.2. As before, let $d_i := d^{\mathrm{obs}}$ denote the observed level of the treatment for unit $i$, $Z_i := z^{\mathrm{obs}}$ the observed instrument level, and $y_i := y_i^{(d_i,Z_i)}$ the observed outcome. Now write In the above equation, $\kappa_{2i}(z)$ := E(y is the effect of unmeasured confounders between the IV and the outcome; a non-zero value of $\kappa_{2i}(z)$ violates A.3. We assume the error term u i = y Finally, combining these, the observed outcome model for unit $i$ is given by Therefore, is the term that measures the violation of either of the two assumptions A.2 and A.3. We assume that it is a linear function of $g(Z_i)$. The function $g(Z)$, which maps the instruments to a real number, indicates the mechanism by which the IV assumptions A.2 and A.3 could be violated. In our study, we take $g(Z)$ to be the distance of the community from the capital city, Bujumbura. We explain this choice of $g(Z)$ further in Section 6. Also, see Section 7 for choices of $g(Z)$ in other studies and how they are calculated from the data. The larger the magnitude of $\kappa$, the more the assumptions are violated, with $\kappa = 0$ corresponding to the assumptions holding (see [24,26]).
Now we include our observed covariates in model (11), and in matrix notation our model for the observed outcomes is The 2SLS estimator, $\hat{\beta}_{\mathrm{2SLS}}$, and its standard error under models (11) and (7) can be written as We calculate $\rho_{g(Z)d \cdot X}$ from the data. We consider $\delta_2 = \rho_{yg(Z) \cdot dX}$ as our sensitivity parameter for the 2SLS estimator. This sensitivity parameter $\delta_2$ can be seen as a scaled version of the coefficient $\kappa$ in model (11), since we have the equality κ/{ , where the right-hand side ratio is an odd function that is increasing in $\delta_2$ on the positive axis. Therefore, $\delta_2$ is an appropriate measure of the amount of violation of A.2 and A.3.
When we restrict the sensitivity parameter $\delta_2$ to the interval [− , ] for the 2SLS estimator, (after some algebra) the sensitivity interval can be calculated, when 1 , as where the sensitivity interval is where For a given value of , we can determine whether we can reject the null hypothesis by checking whether $\beta_0$ falls outside the interval (12) or (13), whichever one is appropriate.
Finally, the EX analysis depends on the assumption that there are no unmeasured confounders and on assumption A.1, that the instrument is associated with the treatment. We only consider violation of the assumption of no unmeasured confounding, since A.1 can be empirically validated as explained above. To measure the violation, we use the parameter $\delta_1 = \rho_{wd \cdot X}$, as in the OLS analysis. We further assume that there is no unmeasured confounding between the instrument and the outcome. Then, the sensitivity interval for the EX estimator when we have the range where $r_{XZ} = \mathrm{rank}([X : Z])$. Recall that $c(l, b)$ is defined just after (9). In calculating the sensitivity interval, we calculate $\rho_{dZ \cdot X}$ from the data.
The sensitivity parameters for the OLS and 2SLS estimators are $\delta_1$ and $\delta_2$, respectively. The parameter $\delta_1$ measures the potential violation of the assumption of no unmeasured confounders of the treatment-outcome relationship. When there is no confounding between the outcome and the instruments, we get the sensitivity analysis for the EX estimator at no extra cost using $\delta_1$. In that case, we consider potential violation in the instruments only through a violation of assumption A.2, not A.3.
In the sensitivity analysis for the null hypothesis $H_0: \beta = \beta_0$, for a given pair $(\delta_1, \delta_2)$ there are four possible decisions: the primary OLS analysis shows no statistical significance; the primary analysis rejects the null but neither of the two secondary analyses shows significance; OLS rejects and only one of the two secondary analyses shows significance; or OLS and both secondary analyses show significance. We present the sensitivity analysis results based on the primary OLS estimator and the secondary 2SLS and EX estimators using a gray-scale plot depicting these four decisions, as in Figure 5, which is discussed in the next section.
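The four-way decision rule can be sketched as follows. This is only an illustration under stated assumptions: the three sensitivity intervals are taken as already computed for a given pair $(\delta_1, \delta_2)$, the function names `rejects` and `decision` and the integer coding 0 to 3 are hypothetical, and the numeric intervals are made up rather than taken from the Burundi data.

```python
def rejects(beta0, interval):
    """True if beta0 lies outside the (lower, upper) sensitivity interval."""
    lower, upper = interval
    return beta0 < lower or beta0 > upper

def decision(beta0, ols_int, tsls_int, ex_int):
    """Map three sensitivity intervals to one of the four decisions.

    0: primary OLS analysis shows no significance
    1: OLS rejects, neither secondary analysis does
    2: OLS rejects, exactly one secondary analysis does
    3: OLS and both secondary analyses reject
    """
    if not rejects(beta0, ols_int):
        return 0
    n_secondary = int(rejects(beta0, tsls_int)) + int(rejects(beta0, ex_int))
    return 1 + n_secondary

# Hypothetical intervals, for illustration only (not results from the paper).
example = decision(0.0, (0.5, 1.5), (-0.2, 2.0), (0.1, 1.2))
```

Evaluating `decision` on a grid of $(\delta_1, \delta_2)$ values and shading each cell by the returned code would give a plot in the spirit of the gray-scale display described above.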

Results: sensitivity analysis in the Burundi data
We now return to the empirical study of the Burundi civil war and its effect on the preferences of people exposed to violent conflict. A potential unmeasured confounder of the treatment-outcome relationship is the number of ethnic or political cleavages in the communities. A larger number of political cleavages may leave a community more vulnerable to violent attacks. Also, the cleavages may be homogeneous in social preferences, so that the number of political cleavages will also be associated with social preferences. If this theory is true, not having information on cleavages can lead the OLS analysis to signal a statistically significant effect when there is no causal effect.
A potential concern about the validity of the instruments is that distance to the capital could have a direct effect on preferences, thus violating assumption A.2 if the distance to capital is a proxy for distance to markets (see [53]).
In Figure 5, we present the results of the sensitivity analyses for the four outcome variables. We also consider different effect sizes for each outcome. The results of Table 3, which correspond to $\delta_1 = \delta_2 = 0$ and an effect size of 0, can be seen in the plots as well. For example, we have two pieces of secondary evidence for a non-zero effect of conflict on altruistic behavior. We lose the evidence from the 2SLS analysis for sensitivity parameters $|\delta_2| > 0.07$, and the evidence from the EX estimator is sensitive for $|\delta_1| > 0.05$. For an effect size of 0.1, there is no evidence from the EX analysis at any level, and the 2SLS analysis is sensitive for $|\delta_2| > 0.06$.
As noted in Section 2, of the two instrumental variables in the data, distance of the community from the nation's capital and altitude, the first is the first principal component, explaining more than 95% of the total variation of the two variables. In the sensitivity interval calculations for the 2SLS estimator, we use distance from Bujumbura as the mechanism by which the instruments can be invalid, i.e. $g(Z)$ = distance from Bujumbura. This choice of $g(Z)$ is also reasonable in view of our earlier discussion that the validity of the instrument may be violated because distance to the capital is a proxy for distance to markets.
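To give a sense of what a first principal component explaining 95% of the variation of two variables entails, here is a small sketch under a simplifying assumption that the paper does not state: both variables are standardized, so the analysis is of their $2 \times 2$ correlation matrix. The function name `first_pc_share` and the input correlation are hypothetical and are not quantities reported for the Burundi data.

```python
def first_pc_share(r):
    """Share of total variance explained by the first principal component
    of two standardized variables with correlation r.

    The correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and
    1 - |r|, and the total variance is 2.
    """
    return (1.0 + abs(r)) / 2.0

# Hypothetical correlation, for illustration only.
share = first_pc_share(0.9)
```

Under this assumption, a share above 0.95 would require the two variables to have an absolute correlation above 0.9.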
Even though exposure to violence showed a non-zero effect on risk preferences in gains in the EX analysis, this analysis is sensitive for $|\delta_1| > 0.065$. For a hypothesized negative effect size of −0.01, 2SLS does not show any evidence (this would have been seen as a horizontal strip). For risk preferences in losses with a hypothesized effect size of −0.01, we do get statistical significance from both secondary analyses when the deviation from the assumptions is in the range $|\delta_1| \le 0.11$ and $|\delta_2| \le 0.06$ (the rectangular area with the darkest shade). Differing results for gains and losses are not surprising, since it has been noted that people perceive gains and losses differently [54].
For time preference, although the 2SLS analysis does show a significant effect, it is sensitive even to a deviation as small as $\delta_2 = 0.005$. Thus, a conclusion that exposure to conflict results in less patience cannot be made with confidence.
In summary, two secondary analyses along with their sensitivity analyses add considerable information about the evidence provided by the study for the theory that exposure to violent conflict affects preferences.

Discussion
We have found two independent sources of evidence that, in Burundi, exposure to violent conflict increased a community's level of altruistic behavior and risk-taking behavior. We arrived at this conclusion through a two-step analysis procedure in which the first step, the primary analysis, uses standard regression analysis, and the secondary analysis uses two evidence factors constructed from instrumental variables. While this two-step procedure also finds two sources of evidence for conflict increasing impatience (time preference), assuming that the instrumental variables of distance to the capital and altitude are strictly valid, this evidence can easily be challenged on the ground that the instruments considered are not strictly valid.
For the three estimators involved in our analysis (OLS, 2SLS and EX), we chose sensitivity analysis methods with sensitivity parameters intended to measure intuitively the deviation from the assumption of concern. In the sensitivity analysis of the least squares regression estimator to potential violation of the no unmeasured confounding assumption, we used the partial correlation of the unmeasured confounder with the treatment. The same parameter is used for the sensitivity analysis of the EX estimator. The sensitivity analyses for the three estimators can be conveniently presented in a gray-scale plot, as in Figure 5. The sensitivity analysis of the 2SLS estimator requires additional information on the mechanism by which the instruments may be invalid. This mechanism is encoded through the function $g$ of the instrumental variables. The choice of the function $g$ is contextual and requires subject matter consideration. For example, a study of the effect of imprisonment on earnings [55] uses judges' ID as the instrument; the validity of the instrument can be violated because the characteristics of the pool of convicted felons may vary as a function of the judge's harshness. In this case, $g$ can be taken as a function of judges' ID that measures their harshness. Ertefaie et al. use this function in a sensitivity analysis based on the Anderson-Rubin statistic [56].
The methods and visualizations of this paper are implemented in the R package ivregEX, available on CRAN. The Burundi data set analyzed in this paper is available as a Web supplement to the paper by Voors et al. [16].

Figure 2. Scatter plot of the estimates $\hat{\beta}_{\mathrm{2SLS}}$ and the proposed estimator $\hat{\beta}_{\mathrm{EX}}$ for the models (1) and (2), with the same data as in Figure 1.

Figure 3. Geography of Burundi. The circles correspond to the communities surveyed. A color gradient shows the violence level in the communities during the civil war period. Distance contours from the capital of the country, Bujumbura, are shown as dashed lines.

Theorem 3.1: Suppose the observations are iid with finite second moments such that E{ Finally, $\hat{\beta}_{\mathrm{EX}}$ estimates the population quantity $\partial E_{|X}(y_1 \mid D_1, Z_1)/\partial D_1$. These population parameters do not have a causal interpretation unless further assumptions are made. For example, $\partial E_{|X}(y_1 \mid D_1)/\partial D_1$ only indicates the causal effect of $D$ on the outcome under the assumption of no unmeasured confounding, $\partial E_{|X}\{y_1 \mid E_{|X}(D_1 \mid Z_1)\}/\partial E_{|X}(D_1 \mid Z_1)$ has a causal interpretation only when $Z_1$ is a valid instrument, and finally $\partial E_{|X}(y_1 \mid D_1, Z_1)/\partial D_1$ has a causal interpretation when we have no unmeasured confounding and the instrument is associated with $D$.

Figure 4. Different confidence regions, at the 95% level, for the effect of violent conflict on the four outcome variables of the Burundi data.

Figure 5. Sensitivity analysis of the four outcome variables for different hypothesized treatment effect sizes, using OLS as the primary analysis and 2SLS and EX as secondary analyses. The decision is indicated using four different shades of gray.

Table 1. Correlation of $\hat{\beta}_{\mathrm{OLS}}$ and $\hat{\beta}_{\mathrm{2SLS}}$ for varied values of $\beta$ and $\gamma$ in the models (1) and (

Table 2. Summary statistics of the variables.

Table 3. Conflict and preferences.
Notes: Variable of interest: percentage dead in attacks.