Bayesian Detection of Bias in Peremptory Challenges Using Historical Strike Data

Abstract United States law bars using peremptory strikes during jury selection because of prospective juror race, ethnicity, sex, or membership in certain other cognizable classes. Here, we extend a Bayesian approach for detecting such illegal strike bias by showing how to incorporate historical data on an attorney’s use of peremptory strikes in past cases. In so doing, we use the power prior to adjust the weight of such historical information in the analysis. Using simulations, we show how the choice of the power prior’s discounting parameter influences bias detection (how likely the credible interval for the bias parameter is to exclude zero), depending on the degree of incompatibility between current and historical trial data. Finally, we extend this approach with a prototype software application that lawyers could use to detect strike bias in real time during jury selection. We illustrate this application’s use with real historical strike data from a convenience sample of cases from one court.


Introduction
In the United States, individuals selected for jury service appear in court as scheduled and are questioned by the parties' attorneys and the trial judge. During this process, a prospective juror, if not excused by the trial judge for cause, may still be dismissed if a party's attorney uses one of their limited number of peremptory challenges against them. By asserting a peremptory challenge, a party can declare a prospective juror ineligible ("strikes" that juror) for a seat on the jury without the burden of explaining why.
Since Batson v. Kentucky (1986), a party violates the Equal Protection Clause of the United States Constitution by using peremptory challenges if motivated by the prospective jurors' race, ethnicity, sex, or membership in another social category that the law prohibits as a basis for striking that juror. The party bringing a Batson challenge bears the burden of proving such illegal strike bias is more likely than not to be true (LaFave et al. 2022, §22.3(d)). Such illegal strike bias is the parameter of interest, not to be confused with the bias of an estimator of a parameter (how much the estimator's expected value differs from the parameter's true value).
A Batson challenge typically proceeds in a four-step sequence. First, the attorney must decide whether to challenge the opposing attorney's use of one or more peremptory strikes as based on illegal strike bias. Second, if challenged, the challenger must meet an initial burden (the prima facie case) of producing just enough evidence to raise a sufficient initial inference of illegal strike bias. Third, the striking attorney must proffer permissible reasons for their challenged strikes. Fourth, the challenger must discredit those proffered reasons and ultimately persuade the trial judge that it is more likely than not that the striking attorney acted with illegal bias. For Batson challenges, the "ultimate inquiry" is whether the striking party "was motivated in substantial part by discriminatory intent" (Flowers v. Mississippi 2019, p. 2241).
CONTACT Sachin S. Pandya sachin.pandya@uconn.edu School of Law, University of Connecticut, Hartford, CT. Supplementary materials for this article are available online. Please go to www.tandfonline.com/r/TAS.
In contrast, for similar challenges under State law, a few States require only that an "objective observer" or an "objectively reasonable person" would find that race, ethnicity, or another cognizable class was a "factor" in that party's use of strikes (Wash. General Rule 37(e) (2022); Calif. Code of Civil Procedure Section 231.7(d)(1) (2022); Conn. Superior Court Rule 5-12(d) (2023); for a similar rule, see N.J. Court Rule 1:8-3A(d) (2023)). Peremptory-challenge procedure, and thus the task of proving illegal strike bias, varies not only by State, but also by court, including the number of peremptory challenges assigned to each side and the order in which each party uses those strikes (Williams 2017).
In any case, evidence of such illegal strike bias may include data on the use of peremptory challenges in past cases (Flowers v. Mississippi 2019, p. 2243; Wash. General Rule 37(g)(v) (2022); Calif. Code Civil Procedure §231.7(d)(3)(G) (2022)). Prior studies have collected such historical strike data in past cases and reported the observed difference in strike rates by race or sex. Typically, they test for the probability of observing a nonzero difference in strike rates by the race or sex of the struck prospective jurors, given repeated sampling from a hypothetical population of peremptory strikes with zero such difference (e.g., Eisenberg 2017; Grosso and O'Brien 2012; for discussion, see Gastwirth and Xu 2014, pp. 289-297; Gastwirth and Yu 2013). Prior studies have also modeled how much a prospective juror's race affected the odds of being struck. For different modeling approaches using the same historical strike data from Mississippi, see Craft (2018), DeCamp (2021), and Dunn and Zhuo (2022). Without historical strike data, there is a higher risk of underpowered and inflated estimates of an attorney's illegal strike bias, because of the low number of strikes per trial: typically from 6 to 15 strikes per party, depending on the jurisdiction (National Center for State Courts - Center for Jury Studies 2023). Yet, complete pooling of current and historical strike data may be affected by the degree of incompatibility between current and historical strike data. Such incompatibility may occur if historical strike data is missing not at random due to incomplete or inaccessible jury-selection records (Grosso and O'Brien 2017; Wright, Chavis, and Parks 2018).
Incompatibility may also occur if attorney strike bias depends on certain trial-level characteristics (e.g., defendant race, charge severity) that take on one value in the current trial but that vary across the historical trials. For example, suppose an attorney is more likely to strike Black prospective jurors when the defendant is Black, has zero strike bias in cases with white defendants, and has historical strike data from past trials with an equal number of Black and white defendants. If so, then inferring strike bias in the current trial with a Black defendant may depend in part on whether we use all the historical data or only the subset of past trials with Black defendants.
As a result, for any given estimate of illegal strike bias in a current trial, incompatibility between strike data from that current trial and strike data from past trials can lead to errors in detecting illegal strike bias (false positives and false negatives) (Bennett et al. 2021). Other methods to increase power may in turn require adjusting for incompatibility implicitly. For example, Gastwirth and Xu (2014) stratify strike data by trial and apply the Cochran-Mantel-Haenszel test [pp. 295-297], albeit after first selecting only historical strike data from past trials that are "similar" in their bias-salient characteristics to the current trial [pp. 289, 297]. This initial step presumably aims to reduce incompatibility due to variation in those salient characteristics. Any such choice of "similar" historical strike data would need to be disclosed and justified.
In this article, we extend a Bayesian approach to Batson and similar challenges (Kadane 2018a, 2018b, 2021) to incorporate historical strike data in a manner that allows for transparently adjusting for the assumed degree of incompatibility between current and historical trials. We proceed in three steps. First, we specify a model of the peremptory-strike process in the court of interest that includes a strike-bias parameter to which we assign an initial prior distribution. Second, we use an attorney's strike data from past trials, generated by that same peremptory-strike process, to estimate a posterior distribution for that bias parameter. In so doing, we use the power prior (Chen and Ibrahim 2000; Ibrahim et al. 2015; for a gentle introduction, see Viele et al. 2014). The power prior raises the likelihood of the historical strike data to a fixed power (α), typically between zero and one, to down-weight the historical strike data in accord with expected incompatibility between the historical trials and the current trial. Zero denotes no weight to the historical data (for complete incompatibility) and one denotes equal weight (for complete compatibility). Third, we use this posterior distribution as an informative prior for the bias parameter in the current trial.
We demonstrate this approach with simulations in which we vary the degree of incompatibility between current and historical trial data, and test how well we can "detect" attorney strike bias, that is, whether the credible interval for the bias parameter excludes zero, given different values of the power prior's alpha parameter.Then, we present a software prototype that encodes the same approach with actual historical strike data from a convenience sample of criminal cases from one court.With that data, we use the prototype to illustrate how attorneys and others can use this approach to detect strike bias during jury selection.
To be sure, neither this nor any other statistical approach can automatically detect Batson violations, because Batson law permits any evidence of strike bias, not just strikes in past cases. And Batson law requires the trial judge to weigh all such evidence to decide whether illegal strike bias likely exists (the Batson violation). Still, before accounting for other relevant evidence of strike bias, our approach can help attorneys detect illegal strike bias, given current and historical strike data, a valid model of the relevant court's strike procedure, and a fixed value for the power prior's α parameter. After testing for sensitivity of strike-bias detection to different fixed α values for the power prior (Carvalho and Ibrahim 2021, p. 5252; Ibrahim et al. 2015, p. 3734), attorneys can then combine the resulting inference with other evidence to decide whether to argue Batson; to prove the Batson prima facie case; and to persuade the trial judge to find a Batson violation (Figure 1).

Statistical Procedure
Following others (e.g., Gastwirth 2005, p. 183; Barrett 2007; Kadane 2018a), we model a peremptory-challenge process in which each party strikes prospective jurors in an alternating sequence. Under such a procedure, the trial judge rules on all challenges for cause before the parties exercise any peremptory strikes. Then, of the potential jurors who remain, a subset of them are subject to peremptory strike, usually a number that accounts for the number of seats on the jury (plus alternates, if any) and the number of strikes allotted to each party. The parties exercise their strikes on anyone among this subset of potential jurors in an alternating sequence. Once all strikes are used or waived, remaining prospective jurors are assigned to seats on the jury or as alternate jurors.
Accordingly, for any given case i in which jury selection occurs, let j denote a peremptory strike used, and let $\delta_{ij}$ denote whether or not a party used that strike on a person who belongs to a "cognizable class". If "race" is the bias type of interest, the cognizable class is racial minority jurors ($\delta_{ij} = 1$; $\delta_{ij} = 0$ for white jurors). If "sex" is the bias of interest, the cognizable class is female jurors ($\delta_{ij} = 1$; $\delta_{ij} = 0$ for male jurors). In turn, let $c_{ij}$ denote the number of cognizable class members subject to strike; and $m_{ij}$ denote the number of cognizable class nonmembers subject to strike, such that $c_{ij} + m_{ij}$ is the total number of jurors potentially subject to strike. If there is no bias, the probability is $\frac{c_{ij}}{c_{ij} + m_{ij}}$ for striking a cognizable class member, and $1 - \frac{c_{ij}}{c_{ij} + m_{ij}}$ for striking someone who does not belong to that class.
By adding one parameter w, we can measure strike bias by defining the probability of a cognizable class member being struck to be $\frac{w c_{ij}}{w c_{ij} + m_{ij}}$. To avoid making the weight of the non-cognizable class be the reciprocal of the weight of the cognizable class, let $b = \log(w)$.
Accordingly, for any given value of the bias parameter b, the probability of strike of a member from either class, or $\Pr(\delta_{ij})$, is such that:
$$\Pr(\delta_{ij} = 1 \mid b) = \frac{e^{b} c_{ij}}{e^{b} c_{ij} + m_{ij}}, \qquad \Pr(\delta_{ij} = 0 \mid b) = \frac{m_{ij}}{e^{b} c_{ij} + m_{ij}}. \tag{1}$$
This (1) is equivalent to
$$\Pr(\delta_{ij} \mid b) = \frac{\left(e^{b} c_{ij}\right)^{\delta_{ij}} m_{ij}^{\,1 - \delta_{ij}}}{e^{b} c_{ij} + m_{ij}}. \tag{2}$$
Given the strike data we have, that is, $\delta_{ij}$, $c_{ij}$, and $m_{ij}$, by estimating the value of b, we can measure bias when a party is striking potential jurors. If b = 0, there is no bias, and the probability of strike is simply a function of the share of cognizable members (nonmembers) in the pool of prospective jurors that could be struck. If b > 0, we infer that the party has strike bias against a prospective juror falling within the cognizable class (e.g., a racial minority). Where b < 0, the party has strike bias against nonmembers of the cognizable class (e.g., a white prospective juror).
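As a minimal sketch of the strike probability in terms of b (with w = e^b), the following Python function evaluates it for a single strike. The function name `strike_prob` and the example pool counts are illustrative assumptions, not part of the original analysis.

```python
import math

def strike_prob(b, c, m):
    """Probability that a strike lands on a cognizable-class member.

    b: bias parameter (log of the weight w).
    c, m: numbers of cognizable-class members and nonmembers
          currently subject to strike.
    """
    w = math.exp(b)
    return w * c / (w * c + m)

# Hypothetical pool: 5 cognizable-class members and 15 nonmembers.
no_bias = strike_prob(0.0, 5, 15)            # the class share, 5 / 20
tripled = strike_prob(math.log(3.0), 5, 15)  # w = 3 gives 15 / (15 + 15)
print(no_bias, tripled)
```

With b = 0 the probability reduces to the class share of the pool; positive b shifts it toward cognizable-class members and negative b away from them.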
The likelihood function of b is
$$L(b \mid \delta) = \prod_{i=1}^{n_i} \prod_{j=1}^{n_j} \Pr(\delta_{ij} \mid b), \tag{3}$$
where $n_i$ is the total number of jury selections (trials), $n_j$ is the total number of peremptory strikes, and $\delta = (\delta_{11}, \delta_{12}, \ldots, \delta_{n_i n_j})$. In this article, we assume that the law for Batson and similar challenges entails a weakly-informative initial prior:
$$b \sim N(\mu, \sigma), \tag{4}$$
with mean μ and standard deviation σ. For this initial prior, we let μ = 0, because the law assigns the burden of proof in a Batson challenge to the party bringing the challenge. Thus, if the challenging party produces no relevant evidence of illegal bias, the law requires a trial judge to reject the challenge as unproven. This is tantamount to treating zero as the most-likely value of the bias parameter, absent any data. Moreover, we take the law to imply that, absent any data, one must assume that higher degrees of illegal strike bias are less likely than lower degrees of such bias. For this reason, we use a normal (Gaussian) distribution with σ = 2. This prior accords with Kadane (2021, p. 51), who suggests a prior for estimating Batson strike bias that is at least unimodal, symmetric, and not dependent on the data.

Incorporating Historical Strike Data
We incorporate data on strikes in past cases and allow for adjustment of the weight of that historical strike data on the posterior distribution of the bias parameter. To do this, we introduce the power prior:
$$\pi(b \mid \delta_0, \alpha_0) \propto L(b \mid \delta_0)^{\alpha_0} \, \pi_0(b), \tag{5}$$
where $0 \le \alpha_0 \le 1$ is the parameter controlling the weight of the historical information; $\delta_0 = (\delta_{011}, \delta_{012}, \ldots, \delta_{0 n_i n_j})$ is the observed historical data; $L(b \mid \delta_0)$ is the likelihood function of b given the historical data; and $\pi_0(b)$ is the initial prior before the historical data is observed.
After including the historical information through the power prior, the posterior distribution of b is proportional to the product of the likelihood function of b and the power prior of b:
$$\pi(b \mid \delta, \delta_0, \alpha_0) \propto L(b \mid \delta) \, L(b \mid \delta_0)^{\alpha_0} \, \pi_0(b), \tag{6}$$
where
$$L(b \mid \delta_0) = \prod_{i=1}^{n_{0i}} \prod_{j=1}^{n_{0j}} \Pr(\delta_{0ij} \mid b)$$
is the likelihood function of b given historical data; $n_{0i}$ is the total number of jury selections (trials) in the historical data; $n_{0j}$ is the total number of peremptory strikes in the historical data; and $\delta_{0ij}$ denotes whether or not a party used that strike on a person who belongs to a cognizable class in the historical trials.
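To make the role of the discounting parameter concrete, here is a minimal Python sketch of the unnormalized log posterior (current-trial likelihood, historical likelihood raised to the power α, and a normal initial prior with mean 0 and standard deviation 2), evaluated on a grid of b values. The grid approximation, the function names, and the strike records are assumptions made for illustration; the analysis described in this article samples from the posterior rather than using a grid.

```python
import math

def log_lik(b, strikes):
    """Log-likelihood of b for a list of (delta, c, m) strike records."""
    total = 0.0
    for delta, c, m in strikes:
        denom = math.exp(b) * c + m
        numer = math.exp(b) * c if delta == 1 else m
        total += math.log(numer / denom)
    return total

def log_posterior(b, current, historical, alpha, mu=0.0, sigma=2.0):
    """Unnormalized log posterior under the power prior."""
    log_prior = -0.5 * ((b - mu) / sigma) ** 2
    return log_lik(b, current) + alpha * log_lik(b, historical) + log_prior

def grid_posterior(current, historical, alpha, lo=-8.0, hi=8.0, n=801):
    """Normalized posterior weights on an evenly spaced grid of b values."""
    grid = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    logp = [log_posterior(b, current, historical, alpha) for b in grid]
    top = max(logp)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logp]
    total = sum(weights)
    return grid, [w / total for w in weights]

def posterior_mean(grid, probs):
    return sum(b * p for b, p in zip(grid, probs))

# Hypothetical records: (delta, c, m) at the moment of each strike.
current = [(1, 6, 14), (1, 5, 13), (0, 4, 12)]
historical = [(1, 7, 13), (1, 6, 12), (1, 5, 11), (0, 4, 10)]

for alpha in (0.0, 0.5, 1.0):
    grid, probs = grid_posterior(current, historical, alpha)
    print(alpha, round(posterior_mean(grid, probs), 3))
```

Setting α = 0 removes the historical term entirely, so the result matches an analysis of the current trial alone; α = 1 treats every historical strike like a current one.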

Model Performance on Simulated Data
To evaluate the proposed method, we conducted a simulation study using Stan 2.26.1, a Hamiltonian Monte Carlo engine for Bayesian inference, by way of R version 4.2.2 and RStan (Stan Development Team 2022). These simulations primarily show how our ability to detect bias depends on the size of the historical data and on how much the power prior discounts the weight of that data. As introduced above, b > 0 represents a prosecutor's bias against a prospective juror within the cognizable class, while b = 0 denotes no bias. For simplicity, we assume the defense attorney has no bias.
Because we use the power prior, we can control how much we account for the historical strike data by modifying the discounting parameter α (denoted as $\alpha_0$ in (5) and (6)). If α = 1, the historical strike data is equally weighted with the data on strikes in the current trial. If α < 1, the historical data are discounted and weighted proportionally less than the current trial data. We assign α < 1 to evaluate how sensitive the posterior of the bias parameter is to the historical data on strikes. Accordingly, in the simulation study, we evaluated different values of α: α = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1} to show how that parameter affected bias detection.
We generated historical strike data of three sizes (same, double, and triple the size of current data, that is, data on one, two, or three previous trials) using seven different values of the bias parameter for generating that historical data: $b_{\text{hist}}$ = {0, 0.5, 1, 1.5, 2, 2.5, 3}.
We generated strike data for a current trial using seven different values of the bias parameter: $b_{\text{curr}}$ = {0, 0.5, 1, 1.5, 2, 2.5, 3}. To evaluate the strength of the power prior as a function of the degree of incompatibility between current and historical strike data ($|b_{\text{curr}} - b_{\text{hist}}| > 0$), we assigned $b_{\text{hist}}$ the same set of bias values as $b_{\text{curr}}$, that is, $b_{\text{hist}}$ = {0, 0.5, 1, 1.5, 2, 2.5, 3}. We did this to simulate noise in the historical strike data that can arise due to not-at-random missingness or because a confounder (e.g., defendant race, charge severity) affects $b_{\text{curr}}$ but not all the past trials in the historical strike data.
For our simulations, we define bias detection as the proportion of times we identify strike bias when strike bias is present in the current trial (i.e., $b_{\text{curr}} > 0$), based on whether credible intervals of the bias parameter exclude zero. Since we use a weakly-informative prior that pulls the estimate of $b_{\text{curr}}$ toward zero, we focus on how accurately our model can detect that bias, not on how accurately the model can recover the true value of $b_{\text{curr}}$ (i.e., a traditional coverage rate). For instance, a 90% bias detection rate means that in 900 of 1000 model fits, a given credible interval does not contain zero when $b_{\text{curr}}$ is not zero. Put another way, if we set $b_{\text{curr}} > 0$, we calculate the proportion of the 95% credible intervals that lie to the right of zero (lower bound is positive) among the 1000 model fits. For each scenario, we generated 1000 datasets and fit the model on those datasets. For each replicate, we generated an HMC sample of 10,000 iterations with a burn-in period of 1000 iterations. For each replicate, we calculated the posterior mean and 80%, 90%, and 95% highest posterior density intervals for $b_{\text{curr}}$.
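The detection-rate calculation can be sketched in Python as follows. This is an illustration under several assumptions that are not from the original study: a pool of 15 cognizable-class members and 15 nonmembers, six strikes per attorney, a grid approximation in place of HMC, and equal-tailed rather than highest-posterior-density intervals. All function names are hypothetical.

```python
import math
import random

GRID = [0.05 * (k - 160) for k in range(321)]  # b values from -8 to 8

def simulate_trial(b_pros, rng, n_cog=15, n_non=15, strikes_each=6):
    """Alternate prosecution (bias b_pros) and defense (no bias) strikes;
    return the prosecution's (delta, c, m) records."""
    c, m, records = n_cog, n_non, []
    for turn in range(2 * strikes_each):
        pros_turn = turn % 2 == 0
        b = b_pros if pros_turn else 0.0
        p = math.exp(b) * c / (math.exp(b) * c + m)
        delta = 1 if rng.random() < p else 0
        if pros_turn:
            records.append((delta, c, m))
        if delta == 1:
            c -= 1
        else:
            m -= 1
    return records

def log_lik(b, records):
    return sum(math.log((math.exp(b) * c if d else m) / (math.exp(b) * c + m))
               for d, c, m in records)

def interval(current, historical, alpha, level=0.95, sigma=2.0):
    """Equal-tailed credible interval for b from the grid posterior."""
    logp = [log_lik(b, current) + alpha * log_lik(b, historical)
            - 0.5 * (b / sigma) ** 2 for b in GRID]
    top = max(logp)
    w = [math.exp(v - top) for v in logp]
    total, tail = sum(w), (1.0 - level) / 2.0
    cum, lower, upper, have_lower = 0.0, GRID[0], GRID[-1], False
    for b, wk in zip(GRID, w):
        cum += wk / total
        if not have_lower and cum >= tail:
            lower, have_lower = b, True
        if cum >= 1.0 - tail:
            upper = b
            break
    return lower, upper

def detection_rate(b_curr, b_hist, n_hist, alpha, n_reps=100, seed=1):
    """Share of replicates whose 95% interval lies entirely above zero."""
    rng = random.Random(seed)
    hits = 0
    for _rep in range(n_reps):
        current = simulate_trial(b_curr, rng)
        historical = []
        for _trial in range(n_hist):
            historical.extend(simulate_trial(b_hist, rng))
        lower, upper = interval(current, historical, alpha)
        hits += lower > 0.0
    return hits / n_reps

print(detection_rate(b_curr=2.0, b_hist=2.0, n_hist=3, alpha=1.0))
```

Calling `detection_rate` with $b_{\text{curr}} = 0$ and a large $b_{\text{hist}}$ gives a rough false-positive rate for a chosen α under these assumptions.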
Additionally, we present a sensitivity analysis for the number of available strikes per trial (6 strikes, 10 strikes, and 15 strikes for each attorney). Thus, in total, we considered 4851 different scenarios, that is, the combination of seven different bias parameters of current data and of historical data, three different amounts of historical data, 11 different values for the power prior weight parameter, and three values for the number of strikes per attorney.
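The scenario count follows from multiplying the sizes of the five factors (7 × 7 × 3 × 11 × 3). A short Python check, with variable names chosen here for illustration:

```python
from itertools import product

b_curr_values = [0, 0.5, 1, 1.5, 2, 2.5, 3]   # bias in the current trial
b_hist_values = [0, 0.5, 1, 1.5, 2, 2.5, 3]   # bias in the historical trials
hist_sizes = [1, 2, 3]                        # number of historical trials
alphas = [k / 10 for k in range(11)]          # 0, 0.1, ..., 1
strikes_per_attorney = [6, 10, 15]

scenarios = list(product(b_curr_values, b_hist_values, hist_sizes,
                         alphas, strikes_per_attorney))
print(len(scenarios))  # 4851
```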

Results
Figure 2 depicts the bias detection rate with a 95% credible interval for the 15 strikes scenario. When $b_{\text{curr}}$ is high (close to 3), the bias detection rate is high (very close to 1). However, when $b_{\text{curr}}$ is lower (close to zero), the bias detection rate drops below 0.8. When there is no bias in the current or historical trials ($b_{\text{curr}} = b_{\text{hist}} = 0$), the bias detection rate for the 95% credible interval is 0.05 or less. As we place more weight on the historical strike data (as α increases), and the historical strike data is consistent with current information (high compatibility), the bias detection rate is close to 1, especially when bias is high (upper right-hand corner of plots). Conversely, as α decreases, bias detection is only high when there is high bias in the current trial ($b_{\text{curr}} \ge 1.5$). Increasing the number of historical trials leads to an increase in bias detection, especially when current and historical trials are compatible. However, when current and historical trials are incompatible, both false positives (bias detection when $b_{\text{curr}} = 0$) and false negatives (no bias detection when $b_{\text{curr}} > 0$) can be observed if α is close to one. As we discuss in Section 4.1, the choice of α depends on assumptions about the degree of incompatibility of current and historical trials.
As we decrease the number of strikes each attorney has from fifteen to six (Figure 3), bias detection decreases. (For the simulation results for ten strikes, see the supplementary material.) Bias detection is only high when α is high (close to 1); current trial bias is high; there is high compatibility between current and historical trials; and the number of historical trials is large.
The choice of credible interval also affects bias detection. For example, with a 90% credible interval, when $b_{\text{curr}} = 2$, the bias-detection rate is lower than with an 80% credible interval; when $b_{\text{curr}} = 0$, that detection rate is higher. The tradeoff is lower accuracy. With a 90% credible interval, when the bias parameter is 2, the model is less accurate in detecting bias. When the bias parameter is 3, however, the detection rate is still high. For details on simulation results for 90% and 80% credible intervals, see the supplementary material.
Bias detection will also be sensitive to the choice of the standard deviation of the initial prior (σ). The results above depend in part on our choice of a conservative initial prior for the bias parameter (4). Given little historical strike data for an attorney and a low number of strikes per trial, this initial prior will dominate. As a result, our model will only detect severe bias. This sensitivity, however, decreases after including the historical data, especially when α is close to 1.
Figure 4 depicts how assigning different values for the initial prior's standard deviation (σ = {1, 2, 100}) affects the bias detection rate at the 95% credible interval for three α values (α = {0.1, 0.5, 1}) under scenarios where each attorney had 10 strikes and where $b_{\text{curr}} = b_{\text{hist}}$. In our simulations, we set σ = 2 for our initial prior. If we instead select σ = 1, that matters most when α = 0.1, that is, when we borrow less information from the historical data. As α increases, the initial prior's standard deviation matters less. Moreover, when we increased the initial prior's standard deviation to make it much less informative (σ = 100), that had little impact on bias detection rates, regardless of the choice of α.
Finally, Figure 5 summarizes our simulation results for bias detection at the 95% credible interval, given perfect compatibility ($b_{\text{curr}} = b_{\text{hist}}$). These results indicate that our model can accurately detect strong bias if present in both the current and historical trials. Given high compatibility, increasing α improves bias detection, as do more historical trials and more strikes per trial. Under the low number of strikes per trial (n = 6), bias detection requires strong bias in the current trial and a high α.

Simultaneous Strikes
In some courts, both parties simultaneously exercise their peremptory challenges on the prospective jurors subject to strikes. This simultaneous-strikes process can be modeled as a special case of the model of an alternating-strikes process above (see (3)), that is, as equivalent to one party engaging in an uninterrupted sequence of strikes against a subset of prospective jurors eligible to be struck. The premise: Regardless of the order in which a party announced those strikes, the posterior for the bias parameter would be the same.
To test this premise, we conducted the following simulation study. We first generated a single trial in which one attorney used 15 strikes in an uninterrupted sequence against 30 prospective jurors, 15 of whom were members of a cognizable class (e.g., Black jurors). Then, we shuffled the order of those strikes to generate 50 trials with the same proportion of struck cognizable-class members but different orders. We then fit the model to the 50 trials to examine whether the estimated bias parameters across the 50 simulated trials were equivalent.
Figure 6 depicts the results of the simultaneous strike simulations. We find that the estimated bias parameters of the 50 trials with different strike orders are close to each other for all three scenarios. None of their credible intervals includes zero. This shows that the order of strikes does not materially influence the estimate of the bias parameter. Accordingly, the simultaneous-strikes procedure can be modeled as a special case of our initial model of an alternating-strikes process with an identical likelihood.

The Software Prototype
We describe here a prototype software application ("app") that implements the approach described above and that attorneys and others can use in real time to detect bias in the use of peremptory challenges. We built this app with R version 4.2.2 and the shiny package (Chang et al. 2021; R Core Team 2022).
Unlike the simulations, we built the app using Rcpp (Eddelbuettel and Balamuta 2018) and the Metropolis-Hastings algorithm to sample b from the posterior distribution. As in the simulations, we assumed a normal prior distribution for b. The length of the Markov chain was 110,000 and the first 10,000 iterations were dropped as burn-in. No thinning was performed, as the correlation was weak and convergence occurred rapidly. We checked convergence with both traceplots and the Gelman-Rubin statistic (Gelman and Rubin 1992). Computation times vary from 0.76 to 1.2 seconds, depending on the amount of data used (quicker with current strike data only vs. current plus historical strike data).
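For readers unfamiliar with the sampler, the following is a minimal random-walk Metropolis-Hastings sketch in Python for the posterior of b given one attorney's strike records and the N(0, 2) initial prior. It is not the app's Rcpp implementation; the proposal step size, the shorter chain length, the function names, and the strike tally are assumptions chosen for illustration.

```python
import math
import random

def log_post(b, strikes, sigma=2.0):
    """Unnormalized log posterior: strike log-likelihood plus normal prior."""
    ll = sum(math.log((math.exp(b) * c if d else m) / (math.exp(b) * c + m))
             for d, c, m in strikes)
    return ll - 0.5 * (b / sigma) ** 2

def metropolis(strikes, n_iter=11000, burn_in=1000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings draws of b after burn-in."""
    rng = random.Random(seed)
    b = 0.0
    lp = log_post(b, strikes)
    chain = []
    for it in range(n_iter):
        proposal = b + rng.gauss(0.0, step)   # symmetric proposal
        lp_prop = log_post(proposal, strikes)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            b, lp = proposal, lp_prop
        if it >= burn_in:
            chain.append(b)
    return chain

# Hypothetical tally for one attorney: (delta, c, m) at each of six strikes.
strikes = [(1, 6, 14), (1, 5, 13), (0, 4, 12), (1, 4, 10), (1, 3, 9), (1, 2, 8)]
chain = metropolis(strikes)
draws = sorted(chain)
mean_b = sum(chain) / len(chain)
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws)) - 1]
print(round(mean_b, 2), round(lower, 2), round(upper, 2))
```

Historical strike data would enter by adding α times the historical log-likelihood inside `log_post`, as in the power-prior posterior (6).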
To show how lawyers might use this app in real cases during jury selection, we loaded this app with actual strike data from a convenience sample of attorneys who appeared during jury selection in criminal cases in the federal district court for Connecticut during fiscal years 2013 through 2017. Using publicly-available court docket sheets for each case, we matched particular lawyers to these strikes and later assigned them aliases using the charlatan package (Chamberlain and Voytovich 2020). For details on this dataset, see the supplementary material.
Figure 4. [...] SD) by level of discounting (alpha). Simulations were run with three levels of the initial prior SD: low (SD = 1), medium (SD = 2), and high (SD = 100). Bias detection rate corresponds to the proportion of simulations in which bias was detected based on 0 being excluded from the 95% credible interval. Bias was equal in current and historical trials and three levels were selected: 0 (no bias scenario), 1, and 2. In general, bias detection is relatively unaffected by the choice of initial SD, especially when alpha is close to 1.
Figure 5. Bias detection under compatibility scenarios (equal bias in current and historical trials) for different numbers of strikes (columns), sizes of historical data (rows) and levels of alpha (shapes). As the weight applied to historical data increases (higher alpha), bias detection increases. Bias detection rates also increase as the number of strikes and historical datasets increase.
To use the app, the user enters by hand the strike information in the case before them in the strike tally table (Figure 7, top left). In the strike tally, round denotes the order of strikes; num_cog denotes the number of prospective jurors that could be struck that belong to the cognizable class; total denotes the total number of prospective jurors that could be struck; cog indicates whether the prospective juror actually struck in that round was a member of the cognizable class (1 = yes, 0 = no); and party indicates which side used the strike (PP = prosecutor, PD = defense attorney). The user can add or delete rows to the strike tally as needed.
To illustrate, suppose the strike tally in Table 1 depicts the pattern of strikes in the present case with defense attorney Aaron Waelchi and prosecutor Lawrance Klocko V (both aliases for actual attorneys in the historical strike data). After the user enters this strike tally, the app initially displays two graphs, one for the prosecution and one for the defense. Each graph depicts the prior density plot (colored light grey) and posterior density plot (blue and red for defense and prosecution, respectively) for the bias parameter (Figure 7(a)). Here, the 95% credible interval includes zero, indicating no credible basis to infer bias, given the current strike tally alone. In the default setting, the pull-down menus for prosecutor and defense are set to "None". As a result, the app ignores any historical strike data and estimates the posterior distributions of the bias parameter for this prosecutor and defense attorney based only on the strike tally data and the initial prior (b ∼ N(0, 2)).
To use historical strike data, the user selects the name of the prosecutor or defense attorney from the pull-down menus. If an attorney's name cannot be found, the app has no historical strike data for that attorney. Once selected, the prior and posterior density plots automatically update to account for the pre-loaded historical strike data for that attorney. For the weight to assign that attorney's historical strike data, the default is set to equal weight of historical information and current information (α = 1). The user has two other options: half weight (α = 0.5) and minimal weight (α = 0.2).
In our illustration, we select the names of the prosecutor and defense attorney from their respective pull-down menus, and leave the weight setting at "Equal". The density plots update accordingly (Figure 7(b)). Now the credible intervals clearly exclude zero. Thus, we have a credible basis to infer strike bias against racial-minority prospective jurors in how this attorney uses their peremptory challenges in the present case. In the supplementary material, we present additional illustrations wherein we use the app to detect gender bias and where we assign less weight to the historical strike data.

Discussion
Statistical methods for Batson and similar challenges often use historical strike data to infer illegal strike bias. The Bayesian approach presented here accounts for available historical strike data when estimating the posterior for the bias parameter. By incorporating the power prior, this approach makes assumptions about the degree of incompatibility between current and historical trial data an explicit part of the process of identifying likely strike bias. Moreover, with a tool like our prototype app, attorneys can use the approach presented here during jury selection to help them decide whether to bring a Batson challenge at all. If argued, the attorney can also introduce that posterior distribution as relevant evidence, along with any other admissible evidence, for presenting a prima facie case or for ultimately proving illegal strike bias (Figure 1).

Selecting the Power Prior Value
In practice, the anticipated degree of incompatibility should influence which value of α to select for the power prior. Given perfect compatibility ($|b_{\text{curr}} - b_{\text{hist}}| = 0$), we should let α = 1 (equal weight between current and historical data). This will maximize bias detection even when the number of strikes per trial is small. This effect of increased bias detection can be seen in the simulation results depicted in Figures 2 and 3 along the positive diagonals of the individual blocks.
Bias detection also increases with the number of historical trials and the number of strikes per trial. Higher values of α lead to higher bias detection rates across compatibility scenarios, especially in scenarios with a lower number of strikes per trial and few historical trials (top left panel of Figure 5). The choice of α matters less for bias detection when bias is high; the number of strikes per trial is high; and the number of historical trials is high (lower right panel of Figure 5). Overall, given high compatibility, increasing α will increase bias detection.
But our simulations also show that, given high incompatibility, bias detection can be (erroneously) high when current strike bias ($b_{\text{curr}}$) is low and strike bias in past trials ($b_{\text{hist}}$) is high. For example, Figure 2 shows that when $b_{\text{curr}} = 0$, $b_{\text{hist}} = 3$, and α = 1, the simulated bias detection rate is 1 when there are two or three historical trials. This might occur where an attorney is motivated to strike Black prospective jurors only in criminal trials with Black defendants; all that attorney's past criminal trials had Black defendants; but the current trial has a white defendant. Similarly, strike-bias detection can be (erroneously) low where current strike bias ($b_{\text{curr}}$) is high but bias in historical trials ($b_{\text{hist}}$) is low. For either case of high incompatibility, the difficulty is that $b_{\text{curr}}$ and $b_{\text{hist}}$ are never directly observable in real cases.
Where lawyers dispute the degree of incompatibility, they can and should interrogate the choice of α by selecting multiple α values and observing how sensitive the posterior distribution of the striking attorney's bias parameter is to the choice of α value. Our prototype app permits this kind of sensitivity testing by providing multiple options for α values. If the inference of bias is highly sensitive to the choice of α values, all else equal, then one should worry more about incompatibility and choose a lower α value accordingly. For example, Figure 2 shows that when b_curr = 0 and b_hist = 3, lowering α to 0.1 drops the simulated bias detection rate to 0.07 and 0.13 for two and three historical trials, respectively.
This approach also applies when the unit of analysis is not the attorney, but the law firm or office. To illustrate, suppose prosecutors from the same office were all exposed to training that recommended striking Black jurors. If so, our approach allows for (but does not require) aggregating all those prosecutors' cases, that is, acting as if all prosecutors from that office were a single individual prosecutor (e.g., the strike bias of the "district attorney's office"). Thereafter, any down-weighting adjustment (choosing α < 1) depends on assumptions about how dissimilarities between the trial-level characteristics (e.g., defendant race, offense type) in the current trial and in past trials may still produce incompatibility, notwithstanding the prosecutors' common exposure to the training. If one assumes no such incompatibility, one could assign equal weight to the historical data (α = 1). If one allows for some incompatibility nonetheless, one could subset the historical data (sacrificing power) to include only those past trials that share the bias-salient characteristics of the current trial. Alternatively, one could keep all the historical data, choose an α value to match the degree of suspected incompatibility, and check how sensitive any bias detection is to different α values.
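A sensitivity check of this kind can be sketched as a loop over candidate α values that reports, for each, the posterior mean and 95% credible interval of the bias parameter. The sketch below is not the prototype app's code. It assumes a simplified binomial model with a fixed pool composition (one class member per two other jurors), a Normal(0, 2) initial prior, and hypothetical strike counts (3 of 6 current strikes and 24 of 30 historical strikes against class members).

```python
import numpy as np

GRID = np.linspace(-8.0, 8.0, 1601)   # candidate values of the bias parameter b
OFFSET = np.log(1.0 / 2.0)            # assumed pool: one class member per two others

def summarize(k_curr, n_curr, k_hist, n_hist, alpha, prior_sd=2.0):
    """Posterior mean and 95% equal-tailed interval for b under a given alpha."""
    p = 1.0 / (1.0 + np.exp(-(GRID + OFFSET)))
    k = k_curr + alpha * k_hist       # historical counts enter with weight alpha
    n = n_curr + alpha * n_hist
    log_post = -0.5 * (GRID / prior_sd) ** 2 + k * np.log(p) + (n - k) * np.log1p(-p)
    w = np.exp(log_post - log_post.max())
    w /= w.sum()
    cdf = np.cumsum(w)
    lo = float(GRID[np.searchsorted(cdf, 0.025)])
    hi = float(GRID[np.searchsorted(cdf, 0.975)])
    return {
        "alpha": alpha,
        "mean": float((GRID * w).sum()),
        "lo": lo,
        "hi": hi,
        "excludes_zero": lo > 0.0 or hi < 0.0,
    }

# Hypothetical counts of strikes against class members (current, then historical).
results = [summarize(3, 6, 24, 30, a) for a in (0.0, 0.1, 0.5, 1.0)]
# The conclusion is alpha-sensitive if detection differs across the alpha values tried.
sensitive = len({r["excludes_zero"] for r in results}) > 1

for r in results:
    print(f"alpha={r['alpha']:.1f}  mean={r['mean']:.2f}  "
          f"95% CI=({r['lo']:.2f}, {r['hi']:.2f})  excludes 0: {r['excludes_zero']}")
print("conclusion depends on alpha:", sensitive)
```

When the flag is true, as with these hypothetical counts, the inference of bias rests on the weight given to the historical data, which is the situation in which the assumed degree of incompatibility deserves the closest scrutiny.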
In general, by their choice of α, the lawyer explicitly and transparently weights the historical strike data in the analysis. This transparency thwarts lawyer efforts to strategically select α values. Rather, at the key steps of the Batson challenge process (Figure 1), the lawyers on both sides know that at any time the lawyers on the other side can always assess how sensitive any strike bias estimation is to the choice of α. In any case, the choice of α cannot alone determine whether a trial judge finds that illegal strike bias is more likely than not (i.e., a Batson violation). Future work might examine how various criteria for deriving a guide value of α (e.g., Ibrahim et al. 2015, pp. 3724-3738) could appropriately apply in this context while also considering incompatibility in selecting α (Ollier et al. 2020). Knowing exactly how to quantify incompatibility and specify an appropriate prior for α is an area of active research (e.g., Han, Ye, and Wang 2023; Pawel et al. 2023).

Modeling the Strike Procedure
The Bayesian approach here also requires a model to match the strike procedure of the court of interest. This choice of model can matter in at least three ways. First, it can affect the missingness of historical strike data. For example, given the model we used (1), we required data on the cognizable-class composition of the prospective jurors not just at the time of the first strike, but at each time either party used a strike. Such information may be missing from jury selection records of past cases.
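The data requirement just described can be pictured as one record per strike that carries the pool composition at the moment of that strike. The sketch below is a hypothetical record layout, not the schema used by our prototype app; the class name, field names, and helper are illustrative assumptions. It shows how past trials with missing composition data might be screened out before analysis.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class StrikeRecord:
    """One peremptory strike (hypothetical layout)."""
    party: str                       # e.g., "prosecution" or "defense"
    struck_in_class: bool            # was the struck juror in the cognizable class?
    n_class_in_pool: Optional[int]   # class members eligible to be struck; None if unrecorded
    n_other_in_pool: Optional[int]   # other eligible jurors; None if unrecorded

def is_usable(trial: List[StrikeRecord]) -> bool:
    """A trial is usable only if every strike has a recorded pool composition."""
    return len(trial) > 0 and all(
        r.n_class_in_pool is not None and r.n_other_in_pool is not None
        for r in trial
    )

# Two hypothetical past trials: the second lacks composition data for its later strike.
complete_trial = [
    StrikeRecord("prosecution", True, 6, 12),
    StrikeRecord("defense", False, 5, 12),
]
incomplete_trial = [
    StrikeRecord("prosecution", True, 6, 12),
    StrikeRecord("prosecution", True, None, None),
]
usable = [t for t in (complete_trial, incomplete_trial) if is_usable(t)]
print(len(usable), "of 2 past trials usable")
```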
Second, the choice of model can affect prospective data collection not only by courts but also by courtroom observers. The public has a presumptive First Amendment right to attend jury selection in criminal cases (Press-Enterprise Co. v. Superior Court 1984). Similarly, a criminal defendant has a Sixth Amendment right to a public trial that presumptively precludes excluding the public from the courtroom during jury selection (Presley v. Georgia 2010). As a result, any person can often attend jury selection and collect relevant strike data based on what they observe. Thus, such courtroom observers can prospectively collect strike data that they would have been less likely to obtain from court records after the fact.
Third, the choice of model matters if historical strike data for a particular attorney covers strikes exercised in multiple courts with different strike procedures. In some cases, one court's strike procedure can be modeled as a special case of a more general model that can be applied to another court's procedure (e.g., the simultaneous strike example in Section 3.1). In other cases, however, one cannot apply the same model to courts with different jury selection practices. For example, the model we used here does not apply to strike data generated by a court's strike procedure in which prospective jurors appear before the attorneys one at a time to be either struck or seated on the jury.

Conclusion
In this article, we extended a Bayesian approach to estimate attorney strike bias in the use of peremptory challenges by incorporating historical data on that attorney's use of strikes in past cases. In so doing, we used the power prior to adjust the weight of such historical information. Our simulations showed that how well our approach detects attorney bias depends on the number of past trials in which that attorney used strikes; the number of strikes per trial; and the degree of incompatibility, that is, the distance between the bias parameters for the current trial and for the past trials. Finally, we discussed a prototype software application with which attorneys could use our approach in real time during jury selection to detect attorney strike bias.

Figure 1. Inference from Historical Strike Data at Key Steps of Batson Challenge. Filled circles indicate steps for which bias estimates could be used.

Figure 2. Bias detection rates (proportion of simulations where 0 is excluded from the 95% credible interval of bias parameter) for simulations based on 15 strikes. Rows of larger squares correspond to alpha values (0.1, 0.5, and 1) for low, medium, and equal weight of historical data relative to current trial data. Columns of larger squares correspond to 1, 2, or 3 historical trials. Small squares correspond to combinations of bias in current (rows) and historical (columns) trials.

Figure 3. Bias detection rates (proportion of simulations where 0 is excluded from the 95% credible interval of bias parameter) for simulations based on six strikes.

Figure 4. Sensitivity to Initial Prior's Standard Deviation (SD) by level of discounting (alpha). Simulations were run with three levels of the initial prior SD: low (SD = 1), medium (SD = 2), and high (SD = 100). Bias detection rate corresponds to the proportion of simulations in which bias was detected based on 0 being excluded from the 95% credible interval. Bias was equal in current and historical trials and three levels were selected: 0 (no bias scenario), 1, and 2. In general, bias detection is relatively unaffected by the choice of initial SD, especially when alpha is close to 1.

Figure 6. Simulation results for simultaneous strikes example. Columns correspond to values of bias (b_curr) used to simulate data. Points and horizontal lines denote estimated posterior mean and 95% credible interval for bias parameter. Vertical dashed lines denote true value of b_curr in the simulations.

Figure 7. Screenshots of R-Shiny application showing density plots for race bias of prosecutor and defense attorney. Vertical dotted lines depict 95% credible interval. Blue and red density plots represent draws from posterior distributions of defense attorneys and prosecutors, respectively. Gray background density plots show the initial priors (before using current strike pattern). (a) shows the result when no historical data are included, and (b) shows the result when historical data are included with equal weight to current data ("Strike History Weight" set to "Equal," i.e., alpha = 1).