A multivariate spatiotemporal model for tracking COVID-19 incidence and death rates in socially vulnerable populations

Recent studies have produced inconsistent findings regarding the association between community social vulnerability and COVID-19 incidence and death rates. This inconsistency may be due, in part, to the fact that these studies modeled cases and deaths separately, ignoring their inherent association and thus yielding imprecise estimates. To improve inferences, we develop a Bayesian multivariate negative binomial model for exploring joint spatial and temporal trends in COVID-19 infections and deaths. The model introduces smooth functions that capture long-term temporal trends, while maintaining enough flexibility to detect local outbreaks in areas with vulnerable populations. Using multivariate autoregressive priors, we jointly model COVID-19 cases and deaths over time, taking advantage of convenient conditional representations to improve posterior computation. As such, the proposed model provides a general framework for multivariate spatiotemporal modeling of counts and rates. We adopt a fully Bayesian approach and develop an efficient posterior Markov chain Monte Carlo algorithm that relies on easily sampled Gibbs steps. We use the model to examine incidence and death rates among counties with high and low social vulnerability in the state of Georgia, USA, from 15 March to 15 December 2020.


Introduction
The global pandemic brought about by coronavirus disease 2019 (COVID-19) has created an unprecedented public health crisis. As of late December 2021, there have been approximately 50 million confirmed COVID-19 cases and over 800,000 related deaths in the United States (US) alone [4]. Although the pandemic has affected all US communities, numerous studies suggest that socially vulnerable populations have been disproportionately impacted by the disease [12,13]. To assist federal, state, and local governments in mobilizing resources for at-risk counties in response to adverse events such as the COVID-19 pandemic, the US Centers of Disease Control and Prevention developed a Social Vulnerability Index, or SVI, for each county in the US [3].
Recent studies have used the SVI to examine associations between social vulnerability and COVID-19 cases and deaths, with somewhat inconsistent findings. In a study from early May 2020, Karaye and Horney [12] found that the SVI was associated with higher rates of COVID-19. A second study conducted in April 2020 found that those living in the most vulnerable counties were at greatest risk for infection and death [13]. However, Nayak et al. [15] found no association between social vulnerability and incident cases. There are a number of potential explanations for this inconsistency. For example, underreporting may be more prevalent in vulnerable populations. Additionally, these prior studies examined COVID-19 cases and fatalities separately, ignoring their underlying association. This univariate approach leads not only to imprecise inferences, but it obscures the inherent interplay between correlated outcomes, such as infections and fatalities, that share common spatiotemporal patterns. Our primary goal, therefore, is to develop a multivariate framework to jointly examine spatial and temporal trends in COVID-19 incidence and death rates among counties with differing characteristics, such as social vulnerability. Through this multivariate approach, we seek to obtain more accurate predictions of spatiotemporal trends and more precise estimates of relative risks comparing communities with differing social vulnerability profiles. These improved inferences can, in turn, assist policymakers in targeting resources to areas of greatest need. A second, somewhat more methodological aim is to derive convenient conditional representations of multivariate intrinsic Gaussian Markov random fields, including conditionally autoregressive (CAR) and random walk priors, to facilitate Bayesian computation and to advance multivariate spatiotemporal analysis more broadly.
Our multivariate approach extends recent univariate spatiotemporal analyses based on penalized Bsplines [6]. Ugarte et al. [25] developed a univariate spatiotemporal Poisson model for the analysis of brain cancer mortality across 47 Spanish provinces over a 10-year period. They modeled the coordinates of province centroids via cubic Bsplines for each coordinate dimension as well at their interaction. They assigned random walk penalties to the basis coefficients to impose temporal smoothing and used penalized quasi-likelihood for estimation. In the Bayesian setting, MacNab and Gustafson [14] introduced a Poisson model with temporal penalized splines and separate spatial random effects that were assigned CAR priors. They used their model to examine hospitalizations due to iatrogenic injuries across 84 local health areas in British Columbia from 1991 to 2000. More recently, Bauer et al. [2] developed a univariate spatiotemporal Poisson model that included a spatial CAR main effect for spatial region, a temporal main effect with a random walk prior, and a space-time interaction that, analogous to Ugarte et al. [25], was modeled via the tensor product of two univariate Bsplines formed from the coordinates of region centroids. Approximate Bayesian computation via INLA [22] was used for inference.
Our goal is to advance this work by developing a Bayesian multivariate negative binomial model to jointly examine spatiotemporal trends in correlated counts and rates. As in prior work, we use cubic Bsplines to model the temporal trends for each outcome. However, in our case, the spline coefficients are assigned multivariate rather than univariate autoregressive priors that provide shared temporal and spatial smoothing across outcomes. This shared smoothing is appealing in our application, as we expect the temporal patterns in COVID-19 cases and deaths to behave similarly across adjacent spatial regions. This multivariate autoregressive penalty also ensures that the model is robust to the number and placement of interior knots. For posterior computation, we develop an efficient data-augmentation algorithm with a number of attractive features. First, we extend recent work on Pólya-Gamma samplers [19,20] to the setting where rates rather than raw counts are of interest. Second, we develop a conditional approach to sampling the spline coefficients that aids posterior computation. This is especially relevant as the number of temporal units, spatial units, and outcomes increases, as jointly sampling the coefficients becomes increasingly inefficient in large dimensions. We develop explicit expressions for the conditional prior distributions of the spatiotemporal effects and investigate their asymptotic properties as the number of spatial and temporal units increases. Finally, we develop a Gibbs sampler that relies primarily on closed-form full conditionals, thus enabling convenient full posterior rather than approximate inference.

Bayesian negative binomial regression
We begin by briefly reviewing Bayesian inference for negative binomial regression models based on Pólya-Gamma data augmentation. For i = 1, . . . , n, let y i denote a count random variable distributed according to NB(r, ψ i ), which we parameterize as where η i = x T i β denotes the linear predictor, x i is a p × 1 vector of covariates, and β is a p × 1 vector of regression parameters. The conditional mean and variance are given by implying overdispersion relative to the Poisson distribution. For Bayesian inference, Pillow and Scott [19] develop an efficient data-augmented Gibbs sampler by introducing weights, ω i , that follow independent Pólya-Gamma distributions [20] of the general form where the g ik 's are independently distributed according to Ga(a i , 1) and c i ∈ . The primary advantage of Pólya-Gamma data augmentation is its computational efficiency. Conditional on the Pólya-Gamma weights, posterior inference for β proceeds via straightforward Gibbs sampling, making the approach scalable to large data sets while avoiding computationally burdensome procedures such as Metropolis-Hastings steps. The approach relies on a key connection between the inverse logit function and the Pólya-Gamma density [20], namely where κ = a − b/2 and p(ω | b, 0) denotes a PG(b, 0) density. For the negative binomial model in (1), we have which has the same functional form as the left-hand side of (2) with a i = y i , b i = r + y i , and η i = x T i β. Thus, from equation (2), it follows that where κ i = (y i − r)/2 and p(ω i |r + y i , 0) denotes a PG(r + y i , 0) density. Using this identity, Pillow and Scott [19] show that if ω i ∼ PG(y i + r, x T i β), then the random variable As a result, posterior inference for β can condition on the latent normal responses, z 1 , . . . , z n , leading to the following data-augmented Gibbs sampler: (1) For i = 1, . . . , n, draw ω i from PG(y i + r, x T i β) using the accept-reject algorithm implemented via the R package BayesLogit Here, X is an n × p design matrix and is an n × n diagonal matrix with diagonal entries ω 1 , . . . , ω n . (4) Assuming a conjugate gamma prior, draw r from its gamma full conditional using the algorithm outlined in Zhou et al. [28].
For more on Pólya-Gamma data-augmentation, including detailed derivations of the above full conditionals, see Neelon [16].

Univariate spatiotemporal model
We now consider extensions to the spatiotemporal setting. We begin with a single response y ij denoting the number of COVID-19 cases in county i (i = 1, . . . , n) on day j (j = 1, . . . , J). We assume that y ij is distributed as NB(r, ψ ij ). We model ψ ij in terms of the incidence rate, λ ij , per C individuals (e.g.C = 100,000) in a county of population size P i ; that is, where log P i and log C are offsets. To model the overall and county-specific incidence trends, we specify the following functional form for λ ij where w ij is a p × 1 vector of county-level covariates, α is a p × 1 vector of fixed effect coefficients, f 1 (t j ) is smooth temporal function evaluated at day t j = j for j = 1, . . . , J, and f 2i (t j ) is an analogous temporal function for county i. To allow for sufficient smoothness over time, we model f 1 and f 2 using cubic Bsplines [6] with q evenly spaced interior knots: where x j is a K × 1 vector of Bspline basis functions for time j; β = (β 1 , . . . , β K ) T is a K × 1 vector of fixed effect basis coefficients that capture the overall temporal trend; b i = (b i1 , . . . , b iK ) T is a K × 1 vector county-specific basis coefficients that capture countyspecific departures from the overall trend. Without loss of generality, we allow x j to span an intercept and exclude an intercept from w ij , implying that K = q + 4. If we collect the J observations for county i into a single vector y i , we can extend equation (3) as follows where the log operator is applied elementwise to the J × 1 vector is a J × K Bspline basis matrix. Finally, combining all N = nJ observations into a single vector y = (y 11 , . . . , y 1J , . . . , y n1 , . . . , y nJ ) T , we obtain where the log operator is again applied elementwise to the N × 1 vector λ = (λ T 1 , . . . , λ T n ) T , W is an N × p covariate matrix, X is an N × K Bspline basis matrix formed by stacking X * row-wise n times, vec(·) denotes the vec operator, and B is the n × K matrix of spline coefficients for all counties of the form implying that

Bayesian inference for the univariate spatiotemporal model
We adopt a Bayesian approach and assign weakly informative conjugate priors to all model parameters, resulting in convenient closed-form full conditionals throughout. For α, we assign a N p (α 0 , 0 ) prior with default hyperparameters α 0 = 0 and 0 = 100I p , where I p denotes the p-dimensional identity matrix. We assign a conjugate Ga(c, d) prior to the dispersion parameter r, with default shape c = 0.01 and rate d = 0.01. For β, we assign a first-order random walk (RW1) prior of the form where τ β is a prior precision term, K is a penalty matrix of rank K−1 with entries Finally, we let b 1 , . . . , b K denote the K columns of B in (6). We model b 1 , . . . , b K using a random walk, conditionally autoregressive (RWCAR) prior of the form where, for k = 1, . . . , K, δ k follows an intrinsic CAR prior given by Here, τ b is a prior precision term and Q is an n × n neighborhood penalty matrix of rank n−1 with entries where m i is the number of geographic neighbors for county i, and i ∼ j indicates that counties i and j are neighbors. The matrices K and Q are known as Laplacian matrices for simple connected graphs, and their generalized inverses have appealing properties that aid posterior computation in large dimensions, as detailed below in Section 2.4. It follows from equations (10) and (11) that the joint prior of the nK × 1 vector of county-specific spline coefficients, b = vec(B), is improper and proportional to the kernel of a mean-zero, singular (i.e. rank-deficient) multivariate normal density with penalty matrices K and Q: Priors (8) and (13) are smoothing priors: prior (8) encourages temporal smoothing of the overall time trend, while prior (13) smooths the county-specific trends over both time and space, so that neighboring counties exhibit similar temporal trends. To complete the prior specification, we assign conjugate Ga(c, d) priors to τ β , τ b , and r, with default shape and rate parameters c = d = 0.01. These conjugate priors admit closed-form full conditionals, leading to the following Gibbs sampling algorithm: (1) For county i = 1, . . . , n and day j = 1 . . . , J, sample weight, ω ij , from its Pólya-Gamma conditional distribution. (2) For all i and j, form the latent response, z ij = (3) Sample α from its multivariate normal full conditional. (4) Assuming the RW1 prior given in (8), sample β from its normal full conditional. For identifiability, mean-center β as discussed in [1]. (5) Sample τ β from its gamma full conditional. (6) Assuming the RWCAR prior given in (13), sample b k , the kth column of B, from its multivariate normal full conditional for k = 1, . . . , K. Center B row-and column-wise for identifiability. (7) Sample τ b from its gamma full conditional. (8) Sample r from its gamma full conditional as described in Zhou et al. [28].
Convergence is monitored using standard Markov chain Monte Carlo (MCMC) diagnostics, including trace plots, Geweke [9] statistics, and effective sample sizes. Details of the algorithm can be found in Appendix A of the Supplementary Material.

Multivariate spatiotemporal model
We now shift our focus to joint inference for multiple count or rate outcomes. Let y lij denote the lth count (l = 1, . . . , L) for county i on day j. In our COVID-19 application, for example, we have L = 2 outcomes, with y 1ij denoting COVID-19 cases and y 2ij denoting corresponding deaths. We assume y lij ∼ NB(r l , ψ lij ), where r l is an outcome-specific dispersion term and ψ lij is modeled as which extends the univariate model by incorporating an outcome-specific rate parameter λ lij and rate constant C l . For instance, it is customary to report COVID-19 cases per 100,000 individuals (implying C l = 100, 000), and to report deaths per million, as the latter are rarer events. Following the exposition in the previous section, we arrive at the following expression for the N × 1 rate vector for outcome l: where, as in equation (5), the log operator is applied elementwise. It is often reasonable to assume that the L rates are correlated spatially and temporally. In the case of COVID-19, for example, it is likely that the incident cases and deaths track together, and that areas with high COVID-19 cases also experience increased deaths. We can link the outcomes by incorporating dependence among the overall trends, represented by β l , as well as the county-specific trends, captured by B l . First, for the overall trend, consider the K × L matrix B comprising the K spline coefficients for each of the L outcomes: We assume that the improper prior for B is proportional to the kernel of a singular matrix normal density of the form where the L × L covariance matrix β operates across columns of B to ensure that the overall temporal trends behave similarly across outcomes. Equivalently, we can note that the KL × 1 vector β * = vec(B) has an improper prior that is proportional to the kernel of a singular multivariate normal distribution; that is, Similarly, we allow the county-specific trends to be correlated across outcomes by first forming an n × K × L array, B, where the lth element of the third dimension comprises Next, we assume an improper prior for B is proportional to the kernel of a mean-zero, singular array normal distribution with penalty and precision matrices Q, K and −1 b of rank n−1, K−1, and L respectively. The covariance matrix b captures dependence over both space and time across the L outcomes, ensuring that outcomes exhibit similar spatiotemporal patterns with respect to the county-specific trends. As above, we can vectorize the array B by forming an nKL × 1 vector, , with improper prior However, as the dimensions n, K and L increase, updating the vectors β * and becomes computationally burdensome. As a convenient alternative, we can update the spline coefficients separately for each outcome by specifying appropriate conditional priors. In particular, for outcome l, we update β l , the spline coefficients for the lth overall trend, one at a time by conditioning on the K(L − 1) × 1 vector of remaining spline coefficients, denoted β * −l . Likewise, we update the kth county-specific spline coefficients for outcome l, represented by the n × 1 vector b lk , by conditioning on the n(KL − 1) × 1 remaining outcomeand county-specific spline coefficients, −(lk) . These conditional priors greatly reduce the dimension of the updates and can yield substantial computational gains during posterior sampling. We codify these results in the following theorems.  (16). Then, the conditional prior of the K × 1 vector β l , given the K(L − 1) × 1 vector β * −l , is given by  = (b l1k , . . . , b lnk ) T , given the n(KL − 1) × 1 vector −(lk) of remaining spline coefficients, is expressed as one of the following: Here where τ b,l is defined as in (i) and Here, the n × 1 vectors b l(k−1) and b l(k+1) denote, respectively, the k−1th and k + 1th columns of B K in equation (17); and −(l) k , −(l) k−1 and −(l) k+1 are the n(L − 1) × 1 vectors formed by stacking, respectively, columns k, k−1 and k + 1 of B h for all h = l.
where τ b,l is defined in (i) and Here, the n × 1 vector b l(K−1) denotes the (K − 1)th column of B K in equation (17); and −(l) By appealing to properties of the generalized inverse of graph Laplacian matrices [10], we can establish the limiting expressions for the conditional means given in Theorems 2.1 and 2.2. We provide these in the following corollary. Likewise, let = vec(B) have prior (18). Then, as n → ∞, the conditional prior mean of the n × 1 vector b lk (l = 1, . . . , L) converges to The proofs are provided in Appendix B of the Supplementary Material. Corollary 2.3 implies that for large K and n, we can approximate K + K and Q + Q by the identity matrices I K and I n , respectively. This eliminates the need to compute generalized inverses as part of the Gibbs sampler, leading to significant computational gains. In fact, our simulations suggest that the approximations work well even when K and n are of modest size.
To complete the prior specification for the multivariate model, we assign a N p (α 0 , 0 ) prior to α l , a Ga(c, d) prior to r l , and conjugate inverse-Wishart IW(ν 0 , G 0 ) priors to β and b , with defaults ν 0 = L + 2 and G 0 = I L . For posterior inference, we modify the Gibbs sampler described in Section 2.3 by cycling through outcomes 1, . . . , L, and for model l, implementing the following steps: (1) Sample ω lij , z lij , α l and r l using updates analogous to those described in Section 2.3.
(2) Assuming the conditional prior given in Theorem 2.1, sample β l from its multivariate normal full conditional.

Simulation study
We conducted a simulation study to evaluate the performance of the model. To mimic the spatial domain of the application, we used the US Census county-level adjacency matrix for Georgia. This matrix comprises 159 counties and 431 unique adjacencies. For i = 1, . . . , 159 and j = 1, . . . , 300, we generated outcomes y 1ij and y 2ij ('cases' and 'deaths', respectively) from a bivariate version of model (14) with rate constants C 1 = 10 5 and C 2 = 10 6 . The model included two covariatesw 1 generated from N(0, 1) and w 2 generated from Bern(0.75) -with outcome-specific coefficients α 1 = (0.25, −0.25) T and α 2 = (−0.25, 0.25) T . The model also included cubic Bspline fixed effects for each outcome, with interior knots evenly spaced every 10 days from day 10 to 290, for a total of 29 interior knots. We allowed the spline bases to span and intercept and sampled the 29 + 4 = 33 spline coefficients from prior (16) with β = 0.50 0.25 0.25 0.20 , implying a correlation of 0.79 between outcomes. Similarly, we generated the county-specific spline coefficients, , from prior (18) with b = 0.20 0.05 0.05 0.20 , for a more modest correlation of 0.25 across outcomes. Lastly, we set the dispersion parameters to be r 1 = 1.00 and r 2 = 1.25. We assumed N(0, 100I 2 ) priors for α 1 and α 2 , and Ga(0.01, 0.01) priors for r 1 and r 2 . For the spline coefficients, we assumed the approximate conditional priors presented in Corollary 1. Finally, we assigned IW(4, I 4 ) priors to β and b . We ran the Gibbs sampler described in Appendix C of the Supplementary Material for 10,500 iterations with a burn-in of 500 and a thinning interval of 5 to reduce autocorrelation. MCMC diagnostics, presented in Figure S1 of the Supplementary Material, indicated that the algorithm converged rapidly and exhibited excellent mixing. Table S1 in Appendix D of the Supplementary Material presents the simulated parameter values along with the posterior mean estimates and 95% credible intervals (CrIs) for the model. Overall, the estimates were close to the true value with 95% intervals that overlapped the simulated value. Figures 1(a) and 1(b) present the population average trends for incident cases per 100,000 (panel a) and deaths per million (panel b). For reference, we overlaid the simulated daily mean counts across counties. As expected, the posterior mean trends mirror the observed data and are able to capture abrupt changes in the data. This highlights the flexibility of the model and, in particular, its ability to detect subtle changes in trends. This flexibility can prove useful in settings where functional behavior is characterized by abrupt spikes and troughs.
Figures 2(a)-2(d) display the estimated county-specific trends for two randomly selected counties, together with the true, simulated curves and the observed daily counts. As with the population-average plots, the estimated trends mirror the simulated curves and provide reasonable fit to the observed data. These results highlight the model's ability to capture not only population average trends, but county-specific trends as well.
To evaluate the predictive performance of the proposed model, we reran the analysis assuming separate, uncorrelated models for cases and deaths. We then compared the correlated and uncorrelated models using the 'widely applicable information criterion', or WAIC [27], a Bayesian comparison measure that combines an assessment of predictive accuracy with a penalty for model complexity, with lower values indicating improved fit. The WAIC values for the correlated and uncorrelated models were 363,360 and 363,398, respectively, indicating that even with a modest correlation of 0.25 between the county-specific trends for cases and deaths, joint modeling resulted in improved predictive performance. To further investigate model performance, we conducted a second simulation, increasing the county-specific correlation to 0.70. Here, the joint model yielded a 95-point improvement in WAIC relative to the uncorrelated model, further supporting the multivariate approach when spatiotemporal trends are correlated across outcomes.
Finally, to compare the performance of the conditional prior approximation in Corollary 2.3 to the exact expressions in Theorems 2.1 and 2.2, we re-ran the models and updated the spline coefficients assuming the exact prior distributions that rely on generalized inverses. To compute the generalized inverse, we applied the computationally efficient inversion algorithm proposed by Courrieu [5], which we implemented using the C++-based function geninv available at https://github.com/jaredhuling/rfunctions. The results are presented in supplemental Figures S2 and S3, which show the population average trends for cases and deaths ( Figure S2) and the county-specific for counties 70 and 151 ( Figure S3). The results are nearly identical to those presented in Figures 1 and 2 above, indicating that the identity-matrix approximations in Corollary 2.3 provide a suitable alternative to the exact expressions based on generalized inverses. Of note, the sampler based on the approximation ran 25% faster than the one relying on generalized inverses, highlighting the computational advantage to specifying approximate rather than exact conditional priors for the spline coefficients.

Application to Georgia COVID-19 data
For our case study, we downloaded daily COVID-19 incident case and death data from the Johns Hopkins Center for Systems Science and Engineering [11] for the state of Georgia from 15 March to 15 December 2020 inclusive, for a total of 276 study days. Because the first COVID-19 case was not reported in Georgia until March 2 [8] and there were relatively few cases in early March, we initiated our analysis on March 15. Next, we obtained the SVI for each county from the CDC's Agency for Toxic Substances and Disease Registry [3]. For our analysis, we used the overall SVI summary measure, which is a percentile-based score of social vulnerability derived from four themes: socioeconomic status; household composition; race, ethnicity and language; and housing and transportation availability. For the stratified analysis, we compared counties in the top and bottom quartiles of SVI. We adjusted for several county-level variables unrelated to the components of SVI that could explain the differential impact of COVID-19 on high and low SVI counties. These variables were obtained from the Robert Wood Johnson Foundation's (RWJF) 2019 County Health Rankings & Roadmaps: Rankings Data & Documentation [23] and included the percentage of each county designated as rural, the percentage of residents with obesity, the percentage of adult smokers, the average daily PM2.5 for each county, and the number of primary care physicians per 100,000 residents. Finally, we downloaded the population size for each county from the 2019 database maintained by the US Census Bureau [26] and used this information to adjust for county population density per square mile.
We fit a bivariate version of model (14) to the Georgia COVID-19 data, with daily incident cases and deaths for each county as joint outcomes. In addition to the covariates listed above, the model incorporated county population size as an offset along with constants C 1 = 10 5 and C 2 = 10 6 for incident cases and deaths, respectively. We used cubic Bsplines to model both the overall time trend and county-specific trends. We allowed the splines to span an intercept and placed 16 evenly spaced interior knots every 15 days from day 30 to 255 inclusive, leading to K = 20 basis functions and coefficients. We assigned the priors outlined in Section 2.4 to the respective model parameters. We ran the MCMC algorithm for 10,500 iterations with a burn-in of 500. MCMC diagnostics are presented in Figure S4 of the Supplementary Material and demonstrate excellent mixing. Figures 3(a) and 3(b) present the predicted incidence and death rate trends across counties. The average incidence trend increased from March 15 through mid-April before leveling off in May, potentially as a result of the statewide social distancing orders imposed from April 3 to April 30. However, cases increased precipitously from mid-June to early August. On August 6, for example, there were an estimated 30.86 (95% CrI: 30.17, 31.55) COVID-19 cases per 100,000 on average across the 159 counties. Statewide, the average incidence declined through August and September, achieving a local nadir of 12.19 cases per 100,000 (95% CrI: 11.87, 12.53) on Oct 18. Cases began to rise again in mid-October and climbed steadily through the remainder of the year, as a 'second wave' of infections began to take hold. Death rates showed a similar but somewhat delayed pattern for the first half of the pandemic: there was a initial increase during the onset of the pandemic, followed by a decline in May and June and a mid-summer increase to 8.99 deaths per million (95% CrI: 8.07, 9.75) on August 24. Deaths declined in early fall before leveling off to around 6 per million in November and December. Figure 4(a) presents the average incidence trends for counties in the top (most vulnerable) and bottom (least vulnerable) quartiles of SVI. For counties in the top quartile, the average incidence increased steadily from an estimated 0.04 (95% CrI: 0.03, 0.06) cases per 100,000 on March 15 to 18.50 (95% CrI: 17.24, 19.83) cases per 100,000 on April 10. The incidence fell through April and leveled off until June 12 (7.88 cases per 100,000; 95% CrI: 7.29, 8.53) before a precipitous increase through August 6 (36.58 cases per 100,000; 95% CrI: 35.04, 38.10). Incident cases declined thereafter, before a final uptick starting in late October. By December 15, the incidence rate for the upper quartile counties achieved a maximum value of 44.53 (95% CI: 40.52, 48.62). The bottom quartile exhibited a similar but less pronounced trend during the first half of the pandemic: there was a modest increase from March 15 (0.09 cases per 100,000; 95% CrI: 0.07, 0.12) to April 21 (5.26 cases per 100,000; 95% CrI: 4.77, 5.43) and a longer plateau lasting until June 9 (4.32 cases per 100,000; 95% CrI: 4.07, 4.59). The incidence climbed steadily afterward before peaking at 26.032 (95% CrI: 25.06, 27.00) on August 6, the same summer peak date as the top quartile. As with the top quartile, cases declined in late summer before climbing again in November, achieving a maximum of 59.10 (95% CrI: 54.07, 64.65) on December 15. Figure 4(b) presents the posterior mean RRs comparing the top and bottom quartiles on each day, with RRs > 1 indicating more cases for the top quartile. On March 15, the RR for incident cases was 0.48 (95% CrI: 0.33, 0.67), suggesting that the more vulnerable top quartile counties had approximately half as many cases per 100,000 on average as the less vulnerable lower quartile counties. However, on March 22, we observed a 'crossover effect' in which the RR became significantly greater than 1.00, indicating that the more vulnerable counties had higher COVID-19 incidence on average compared to less vulnerable counties (March 18 RR = 1.30, 95% CrI: 1.09, 1.96). The RRs increased for the next few weeks and achieved a maximum of 3.64 (95% CrI: 3.32, 3.98) on April 9, then decreased steadily through the summer and the fall. On October 31, there was a second crossover event in which the bottom quartile counties again experienced significantly fewer cases than the top quartile.     5(a) and 5(b) present the corresponding plots for deaths. The risk ratios increased in March and early April, achieving a maximum of 5.46 (95% CrI: 4.37, 6.72) on April 12. The RRs declined until mid-June before leveling off for the remainder of the study period. However, unlike cases, the trends did not cross, and counties in the upper SVI quartile experienced higher death rates on average than those in the lower quartile throughout the study. Thus, in terms of mortality, the COVID-19 appears to have disproportionately impacted vulnerable counties, particularly in the early stage of the pandemic.  6(a)-6(d) show incidence and death rate trends for Fulton and Dekalb counties, two neighboring counties in the Atlanta metro area. The shared temporal patterning is evident from the figures, with each county experiencing relatively constant incidence rates of 5-8 cases per 100,000 from March through mid-June, at which time there was a stark increase in cases. Deaths remained relatively constant, ranging from 1 to 4 cases per million throughout most of the study period, with a nadir in early July and a peak in early to mid-August.  7(a)-7(d) present analogous trends for a Randolph and Terrell counties, two neighboring counties in rural, southwest Georgia. These are sparsely populated counties with fewer than 10,000 residents. These counties also have high social vulnerability, with SVIs of 0.97 to 0.96, respectively. For each county, the incidence and death rates peaked in early to mid-April which, according to media reports, can be traced back to super-spreader events at local funerals in the nearby town of Albany [7]. In mid-April, cases and deaths were concentrated in the highly vulnerable southwest corner of the state, which includes Randolph and Terrell counties. By July, the virus had spread eastward, leading to spikes in vulnerable counties such as Washington and Jefferson, each with an SVI score of 0.94. Cases subsided somewhat in the fall, although a pocket of high infections and deaths remained in the south-central portion of the state. By mid-December, cases had spread throughout the state, whereas the highest death rates were confined to vulnerable counties in the central eastern and northernmost portions of the state. Health professionals can potentially use this information to prepare for future outbreaks that may follow similar spatiotemporal patterns. Of note, the shrinkage property of the RWCAR prior is reflected in the spatial smoothing evident throughout the maps. The correlation between cases and deaths is also apparent in the maps, as areas with elevated case rates also tend to have higher death rates. This again highlights the benefit of modeling the dependence across outcomes via the multivariate RWCAR prior.

Discussion
We have proposed a Bayesian multivariate spatiotemporal model for the analysis of correlated counts and rates. The model has a number of appealing features, including its multivariate RWCAR prior for the spline coefficients, which provides shared spatiotemporal smoothing across outcomes. The RWCAR prior also ensures robustness to the number and placement of knots, allowing the model to incorporate a large number of knots without the risk of overfitting. We demonstrated that jointly modeling incidence and deaths yields improved predictive performance over independent univariate models. To enhance computational efficiency, we developed conditional specifications of popular multivariate intrinsic Gaussian Markov random fields, including CAR and random walk priors, and established asymptotic properties that obviate calculation of large pseudoinverses. Finally, we developed a data-augmented Gibbs sampler that extends recent work on Pólya-Gamma sampling to the setting where inference on correlated rates is of interest.
Results from our case study suggest a dynamic impact of COVID-19 on socially vulnerable communities. At the outset of the pandemic, COVID-19 disproportionately impacted less vulnerable counties before spreading to more vulnerable areas in mid-March and April. By fall 2020, however, the least vulnerable counties in the bottom quartile began to keep pace with the top quartile, and in fact surpassed the top quartile by November. This likely reflects the natural migration of the virus from less vulnerable counties to more vulnerable counties and back again throughout the pandemic. On the other hand, the most vulnerable counties experienced higher death rates throughout, suggesting that the pandemic has had a disproportionate impact on mortality in more vulnerable communities. This may point to a lack of appropriate health resources in rural areas. Importantly, the model was able to accurately capture the COVID-19 trends for individual counties. This could aid local health officials in tracking the emergence of new variants of the virus in socially vulnerable areas. By monitoring COVID-19 trends more generally, policymakers can allocate health resources, target vaccine efforts, and administer recovery aid to areas most impacted by the pandemic at a particular point in time.
The model can easily handle multiple outcomes and is scalable to reasonably large spatiotemporal domains. Adding temporal knots to the model imposes relatively little computational burden. For large spatial domains, there are a couple of options. One option is to assume uncorrelated spatial units (i.e. ignore spatial smoothing and allow only temporal smoothing). In previous applications, we have fit uncorrelated county effects to all 3041 US counties and obtained reasonable results [17]. However, because this approach ignores spatial smoothing, it is not ideal for sparse spatial data. An alternative is to adopt a model fitting strategy specifically designed for massive geospatial data, such as meshed Gaussian processes [18]. Future work could explore this approach.
There are a number of other methodological extensions of this work. Currently, the model is best suited for quantifying existing spatiotemporal trends (i.e. estimating insample case and death rates) rather than forecasting future trends. However, it may be possible to extend the model to allow for short-term forecasting by adapting the work of Ugarte et al. [24] to the Bayesian setting The proposed methods could also be used to jointly model correlated physical activity counts measured by GPS-enabled wearable devices. In health services research, the models could be used to explore spatiotemporal patterns in inpatient and outpatient admissions. The models could also be extended to zero-inflated count outcomes. More broadly, the methods proposed here could be expanded to settings where interest lies in multivariate curve fitting for correlated counts and rates.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported in part by the Biostatistics Shared Resource, Hollings Cancer Center, Medical University of South Carolina [P30 CA138313].