2023-02-09 CaseAssignm
- Instituut Voor Tropische Geneeskunde - Antwerp, Belgium
Step by Step
Import data
Importing a CSV database under the name of “VL_Bihar”:
Starting the Exercise:
1. Description of the study population
1.1 Sex
```{r}
table(VL_Bihar$sex, useNA = "always") #frecuency
prop.table(table(VL_Bihar$sex)) #proportion
```
1 2 <NA>
907 973 0
1 2
0.4824468 0.5175532
1.2 Age
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
2.00 8.00 19.50 24.15 35.00 90.00 2
1.2.1 Categories of age
```{r}
#Group Age
VL_Bihar$groupage <- NA
VL_Bihar$groupage[VL_Bihar$age<15] <- "0"
VL_Bihar$groupage[VL_Bihar$age>14 & VL_Bihar$age<35] <- "1"
VL_Bihar$groupage[VL_Bihar$age>34] <- "2"
table(VL_Bihar$groupage, useNA = "always") #frecuency
prop.table(table(VL_Bihar$groupage, useNA = "always")) #proportion
```
0 1 2 <NA>
765 608 505 2
0 1 2 <NA>
0.40691489 0.32340426 0.26861702 0.00106383
1.3 Case
```{r}
table(VL_Bihar$case, useNA = "always") #frecuency
prop.table(table(VL_Bihar$case)) #proportion
```
0 1 <NA>
1673 207 0
0 1
0.8898936 0.1101064
1.4. Died
```{r}
table(VL_Bihar$died, useNA = "always") #frecuency
prop.table(table(VL_Bihar$died)) #proportion
```
0 1 <NA>
1877 3 0
0 1
0.998404255 0.001595745
1.5. neem_tree
```{r}
table(VL_Bihar$neem_tree, useNA = "always") #frecuency
prop.table(table(VL_Bihar$neem_tree)) #proportion
```
0 1 <NA>
1122 758 0
0 1
0.5968085 0.4031915
1.6. bamboo_tree
```{r}
table(VL_Bihar$bamboo_tree, useNA = "always") #frecuency
prop.table(table(VL_Bihar$bamboo_tree)) #proportion
```
0 1 <NA>
1144 736 0
0 1
0.6085106 0.3914894
1.7. banana_tree
```{r}
table(VL_Bihar$banana_tree, useNA = "always") #frecuency
prop.table(table(VL_Bihar$banana_tree)) #proportion
```
0 1 <NA>
304 1576 0
0 1
0.1617021 0.8382979
1.8. rice_field
```{r}
table(VL_Bihar$rice_field, useNA = "always") #frecuency
prop.table(table(VL_Bihar$rice_field)) #proportion
```
0 1 <NA>
379 1501 0
0 1
0.2015957 0.7984043
1.9. permanent_water_body
```{r}
table(VL_Bihar$permanent_water_body, useNA = "always") #frecuency
prop.table(table(VL_Bihar$permanent_water_body)) #proportion
```
0 1 <NA>
1002 878 0
0 1
0.5329787 0.4670213
1.10 granaries_in_hh
```{r}
table(VL_Bihar$granaries_in_hh, useNA = "always") #frecuency
prop.table(table(VL_Bihar$granaries_in_hh)) #proportion
```
0 1 <NA>
806 1074 0
0 1
0.4287234 0.5712766
1.11 owngoat
```{r}
table(VL_Bihar$owngoat, useNA = "always") #frecuency
prop.table(table(VL_Bihar$owngoat)) #proportion
```
0 1 <NA>
1282 598 0
0 1
0.6819149 0.3180851
1.12 ownpoul
```{r}
table(VL_Bihar$ownpoul, useNA = "always") #frecuency
prop.table(table(VL_Bihar$ownpoul)) #proportion
```
0 1 <NA>
1799 81 0
0 1
0.95691489 0.04308511
1.13 ownbov
```{r}
table(VL_Bihar$ownbov, useNA = "always") #frecuency
prop.table(table(VL_Bihar$ownbov)) #proportion
```
0 1 <NA>
1161 719 0
0 1
0.6175532 0.3824468
1.14 bednet
```{r}
table(VL_Bihar$bednet, useNA = "always") #frecuency
prop.table(table(VL_Bihar$bednet)) #proportion
```
0 1 <NA>
1507 373 0
0 1
0.8015957 0.1984043
1.15 asset_index
```{r}
table(VL_Bihar$asset_index, useNA = "always") #frecuency
prop.table(table(VL_Bihar$asset_index)) #proportion
```
1 2 3 4 5 <NA>
447 374 325 392 342 0
1 2 3 4 5
0.2377660 0.1989362 0.1728723 0.2085106 0.1819149
1.16 riskwall
```{r}
table(VL_Bihar$riskwall, useNA = "always") #frecuency
prop.table(table(VL_Bihar$riskwall)) #proportion
```
0 1 <NA>
1104 776 0
0 1
0.587234 0.412766
1.17 earthfloor
```{r}
table(VL_Bihar$earthfloor, useNA = "always") #frecuency
prop.table(table(VL_Bihar$earthfloor)) #proportion
```
0 1 <NA>
171 1709 0
0 1
0.09095745 0.90904255
1.18 sc_caste
2. Bivariate analysis
2.1 Case and sex
```{r}
GLM1 <- glm(formula = case ~ factor(sex), family = binomial, data = VL_Bihar)
summary(GLM1)
exp(coef(GLM1))
exp(confint(GLM1))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(sex), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5011 -0.5011 -0.4657 -0.4657 2.1332
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.0118 0.1029 -19.544 <2e-16 ***
factor(sex)2 -0.1550 0.1474 -1.051 0.293
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1302.6 on 1878 degrees of freedom
AIC: 1306.6
Number of Fisher Scoring iterations: 4
(Intercept) factor(sex)2
0.1337500 0.8564302
2.5 % 97.5 %
(Intercept) 0.1087251 0.1628343
factor(sex)2 0.6410060 1.1433795
2.2 Case and age
```{r}
GLM2 <- glm(formula = case ~ age, family = binomial, data = VL_Bihar)
summary(GLM2)
exp(coef(GLM2))
exp(confint(GLM2))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ age, family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5340 -0.5134 -0.4815 -0.4272 2.2860
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.854781 0.118111 -15.704 <2e-16 ***
age -0.010493 0.004288 -2.447 0.0144 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1299.1 on 1877 degrees of freedom
Residual deviance: 1292.8 on 1876 degrees of freedom
(2 observations deleted due to missingness)
AIC: 1296.8
Number of Fisher Scoring iterations: 5
(Intercept) age
0.1564872 0.9895616
2.5 % 97.5 %
(Intercept) 0.1236807 0.1965649
age 0.9810855 0.9977385
2.2.1 Case and age group
```{r}
GLM2_1 <- glm(formula = case ~ groupage, family = binomial, data = VL_Bihar)
summary(GLM2_1)
exp(coef(GLM2_1))
exp(confint(GLM2_1))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ groupage, family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5265 -0.5265 -0.4677 -0.4270 2.2092
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.9062 0.1077 -17.696 <2e-16 ***
groupage1 -0.2515 0.1712 -1.469 0.1419
groupage2 -0.4430 0.1910 -2.319 0.0204 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1299.1 on 1877 degrees of freedom
Residual deviance: 1293.1 on 1875 degrees of freedom
(2 observations deleted due to missingness)
AIC: 1299.1
Number of Fisher Scoring iterations: 5
(Intercept) groupage1 groupage2
0.1486486 0.7776480 0.6420824
2.5 % 97.5 %
(Intercept) 0.1196675 0.1826217
groupage1 0.5538452 1.0847758
groupage2 0.4379600 0.9277561
2.3 Case and died
```{r}
GLM3 <- glm(formula = case ~ factor(died), family = binomial, data = VL_Bihar)
summary(GLM3)
exp(coef(GLM3))
exp(confint(GLM3))
```
Waiting for profiling to be done...
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Call:
glm(formula = case ~ factor(died), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4797 -0.4797 -0.4797 -0.4797 2.1068
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.10425 0.07416 -28.375 <2e-16 ***
factor(died)1 16.67032 509.65213 0.033 0.974
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1290.5 on 1878 degrees of freedom
AIC: 1294.5
Number of Fisher Scoring iterations: 13
(Intercept) factor(died)1
1.219366e-01 1.737115e+07
2.5 % 97.5 %
(Intercept) 1.051421e-01 0.1406351
factor(died)1 2.290754e-22 NA
2.4 Case and neem_tree
```{r}
GLM4 <- glm(formula = case ~ factor(neem_tree), family = binomial, data = VL_Bihar)
summary(GLM4)
exp(coef(GLM4))
exp(confint(GLM4))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(neem_tree), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5023 -0.5023 -0.5023 -0.4533 2.1571
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.00635 0.09236 -21.724 <2e-16 ***
factor(neem_tree)1 -0.21755 0.15330 -1.419 0.156
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1301.7 on 1878 degrees of freedom
AIC: 1305.7
Number of Fisher Scoring iterations: 4
(Intercept) factor(neem_tree)1
0.1344793 0.8044893
2.5 % 97.5 %
(Intercept) 0.1117282 0.160517
factor(neem_tree)1 0.5933996 1.083238
2.5 Case and bamboo_tree
```{r}
GLM5 <- glm(formula = case ~ factor(bamboo_tree), family = binomial, data = VL_Bihar)
summary(GLM5)
exp(coef(GLM5))
exp(confint(GLM5))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(bamboo_tree), family = binomial,
data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5287 -0.5287 -0.4518 -0.4518 2.1600
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.23069 0.09987 -22.335 <2e-16 ***
factor(bamboo_tree)1 0.33357 0.14817 2.251 0.0244 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1298.7 on 1878 degrees of freedom
AIC: 1302.7
Number of Fisher Scoring iterations: 4
(Intercept) factor(bamboo_tree)1
0.107454 1.395946
2.5 % 97.5 %
(Intercept) 0.08787735 0.1300431
factor(bamboo_tree)1 1.04293293 1.8656202
2.6 Case and banana_tree
```{r}
GLM6 <- glm(formula = case ~ factor(banana_tree), family = binomial, data = VL_Bihar)
summary(GLM6)
exp(coef(GLM6))
exp(confint(GLM6))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(banana_tree), family = binomial,
data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5021 -0.4793 -0.4793 -0.4793 2.1076
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.00747 0.17751 -11.309 <2e-16 ***
factor(banana_tree)1 -0.09866 0.19511 -0.506 0.613
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1303.5 on 1878 degrees of freedom
AIC: 1307.5
Number of Fisher Scoring iterations: 4
(Intercept) factor(banana_tree)1
0.1343284 0.9060498
2.5 % 97.5 %
(Intercept) 0.09330639 0.1875041
factor(banana_tree)1 0.62506879 1.3458822
2.7 Case and rice_field
```{r}
GLM7 <- glm(formula = case ~ factor(rice_field), family = binomial, data = VL_Bihar)
summary(GLM7)
exp(coef(GLM7))
exp(confint(GLM7))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(rice_field), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4949 -0.4949 -0.4949 -0.4336 2.1960
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.3172 0.1797 -12.894 <2e-16 ***
factor(rice_field)1 0.2790 0.1970 1.416 0.157
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1301.6 on 1878 degrees of freedom
AIC: 1305.6
Number of Fisher Scoring iterations: 4
(Intercept) factor(rice_field)1
0.09855074 1.32186819
2.5 % 97.5 %
(Intercept) 0.06804181 0.1379605
factor(rice_field)1 0.90960592 1.9741271
2.8 Case and permanent_water_body
```{r}
GLM8 <- glm(formula = case ~ factor(permanent_water_body), family = binomial, data = VL_Bihar)
summary(GLM8)
exp(coef(GLM8))
exp(confint(GLM8))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(permanent_water_body), family = binomial,
data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4846 -0.4846 -0.4812 -0.4812 2.1040
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.08281 0.10065 -20.693 <2e-16 ***
factor(permanent_water_body)1 -0.01469 0.14773 -0.099 0.921
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1303.7 on 1878 degrees of freedom
AIC: 1307.7
Number of Fisher Scoring iterations: 4
(Intercept) factor(permanent_water_body)1
0.1245791 0.9854151
2.5 % 97.5 %
(Intercept) 0.1017388 0.151011
factor(permanent_water_body)1 0.7368210 1.315789
2.9 Case and granaries_in_hh
```{r}
GLM9 <- glm(formula = case ~ factor(granaries_in_hh), family = binomial, data = VL_Bihar)
summary(GLM9)
exp(coef(GLM9))
exp(confint(GLM9))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(granaries_in_hh), family = binomial,
data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4838 -0.4838 -0.4825 -0.4825 2.1016
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.086439 0.112383 -18.565 <2e-16 ***
factor(granaries_in_hh)1 -0.005634 0.148829 -0.038 0.97
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1303.7 on 1878 degrees of freedom
AIC: 1307.7
Number of Fisher Scoring iterations: 4
(Intercept) factor(granaries_in_hh)1
0.1241283 0.9943820
2.5 % 97.5 %
(Intercept) 0.09893469 0.1537828
factor(granaries_in_hh)1 0.74366989 1.3337585
2.10 Case and owngoat
```{r}
GLM10 <- glm(formula = case ~ factor(owngoat), family = binomial, data = VL_Bihar)
summary(GLM10)
exp(coef(GLM10))
exp(confint(GLM10))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(owngoat), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5065 -0.5065 -0.4717 -0.4717 2.1218
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.13963 0.09099 -23.516 <2e-16 ***
factor(owngoat)1 0.15100 0.15514 0.973 0.33
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1302.8 on 1878 degrees of freedom
AIC: 1306.8
Number of Fisher Scoring iterations: 4
(Intercept) factor(owngoat)1
0.1176983 1.1629911
2.5 % 97.5 %
(Intercept) 0.09804742 0.1401079
factor(owngoat)1 0.85459470 1.5712859
2.11 Case and ownpoul
```{r}
GLM11 <- glm(formula = case ~ factor(ownpoul), family = binomial, data = VL_Bihar)
summary(GLM11)
exp(coef(GLM11))
exp(confint(GLM11))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(ownpoul), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5915 -0.4777 -0.4777 -0.4777 2.1105
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.11302 0.07601 -27.800 <2e-16 ***
factor(ownpoul)1 0.45846 0.31210 1.469 0.142
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1301.8 on 1878 degrees of freedom
AIC: 1305.8
Number of Fisher Scoring iterations: 4
(Intercept) factor(ownpoul)1
0.1208723 1.5816404
2.5 % 97.5 %
(Intercept) 0.1038311 0.1398946
factor(ownpoul)1 0.8221058 2.8219122
2.12 Case and ownbov
```{r}
GLM12 <- glm(formula = case ~ factor(ownbov), family = binomial, data = VL_Bihar)
summary(GLM12)
exp(coef(GLM12))
exp(confint(GLM12))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(ownbov), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4889 -0.4889 -0.4793 -0.4793 2.1075
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.10587 0.09435 -22.320 <2e-16 ***
factor(ownbov)1 0.04199 0.15105 0.278 0.781
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1303.7 on 1878 degrees of freedom
AIC: 1307.7
Number of Fisher Scoring iterations: 4
(Intercept) factor(ownbov)1
0.1217391 1.0428795
2.5 % 97.5 %
(Intercept) 0.1007174 0.1458382
factor(ownbov)1 0.7734163 1.3992746
2.13 Case and bednet
```{r}
GLM13 <- glm(formula = case ~ factor(bednet), family = binomial, data = VL_Bihar)
summary(GLM13)
exp(coef(GLM13))
exp(confint(GLM13))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(bednet), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5103 -0.5103 -0.5103 -0.3568 2.3605
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.97272 0.07868 -25.073 < 2e-16 ***
factor(bednet)1 -0.74972 0.22918 -3.271 0.00107 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1291.2 on 1878 degrees of freedom
AIC: 1295.2
Number of Fisher Scoring iterations: 5
(Intercept) factor(bednet)1
0.1390779 0.4725000
2.5 % 97.5 %
(Intercept) 0.1188343 0.1617957
factor(bednet)1 0.2941532 0.7252757
2.14 Case and asset_index
```{r}
GLM14 <- glm(formula = case ~ factor(asset_index), family = binomial, data = VL_Bihar)
summary(GLM14)
exp(coef(GLM14))
exp(confint(GLM14))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(asset_index), family = binomial,
data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5882 -0.5860 -0.4628 -0.2995 2.5007
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.666909 0.129399 -12.882 < 2e-16 ***
factor(asset_index)2 -0.008126 0.192010 -0.042 0.966243
factor(asset_index)3 -0.513337 0.224659 -2.285 0.022315 *
factor(asset_index)4 -0.860198 0.232339 -3.702 0.000214 ***
factor(asset_index)5 -1.415001 0.294023 -4.813 1.49e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1260.9 on 1875 degrees of freedom
AIC: 1270.9
Number of Fisher Scoring iterations: 5
(Intercept) factor(asset_index)2 factor(asset_index)3
0.1888298 0.9919070 0.5984951
factor(asset_index)4 factor(asset_index)5
0.4230784 0.2429255
2.5 % 97.5 %
(Intercept) 0.1454019 0.2416577
factor(asset_index)2 0.6793381 1.4439490
factor(asset_index)3 0.3813144 0.9223585
factor(asset_index)4 0.2647424 0.6603038
factor(asset_index)5 0.1317253 0.4204629
2.15 Case and riskwall
```{r}
GLM15 <- glm(formula = case ~ factor(riskwall), family = binomial, data = VL_Bihar)
summary(GLM15)
exp(coef(GLM15))
exp(confint(GLM15))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(riskwall), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5717 -0.5717 -0.4124 -0.4124 2.2391
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.4218 0.1100 -22.019 < 2e-16 ***
factor(riskwall)1 0.6933 0.1489 4.657 3.21e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1281.8 on 1878 degrees of freedom
AIC: 1285.8
Number of Fisher Scoring iterations: 5
(Intercept) factor(riskwall)1
0.0887574 2.0003035
2.5 % 97.5 %
(Intercept) 0.071064 0.1094251
factor(riskwall)1 1.495792 2.6830202
2.16 Case and earthfloor
```{r}
GLM16 <- glm(formula = case ~ factor(earthfloor), family = binomial, data = VL_Bihar)
summary(GLM16)
exp(coef(GLM16))
exp(confint(GLM16))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(earthfloor), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5002 -0.5002 -0.5002 -0.5002 2.5884
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.3142 0.4154 -7.979 1.47e-15 ***
factor(earthfloor)1 1.2990 0.4221 3.077 0.00209 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1289.8 on 1878 degrees of freedom
AIC: 1293.8
Number of Fisher Scoring iterations: 5
(Intercept) factor(earthfloor)1
0.03636366 3.66544816
2.5 % 97.5 %
(Intercept) 0.01429586 0.07505732
factor(earthfloor)1 1.74657291 9.41739018
2.17 Case and sc_caste
```{r}
GLM17 <- glm(formula = case ~ factor(sc_caste), family = binomial, data = VL_Bihar)
summary(GLM17)
exp(coef(GLM17))
exp(confint(GLM17))
```
Waiting for profiling to be done...
Call:
glm(formula = case ~ factor(sc_caste), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.0842 -0.4512 -0.4512 -0.4512 2.1611
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.23339 0.07954 -28.079 < 2e-16 ***
factor(sc_caste)1 2.01024 0.25015 8.036 9.28e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1248.7 on 1878 degrees of freedom
AIC: 1252.7
Number of Fisher Scoring iterations: 5
(Intercept) factor(sc_caste)1
0.1071647 7.4651429
2.5 % 97.5 %
(Intercept) 0.09138689 0.1248467
factor(sc_caste)1 4.54860989 12.1710268
3. Multivariate analysis. Association between incidence of visceral leshmanis and regular use of bednet and which factors are potential confounders
We will do the “Change in Estimate Model”
Step 1. We select the factor that can be the main exposure and the other exposures that have logic
Main exposure: regular use of bed net
Exclude: Died
Step 2. First do the crude odd ratio (bivariate)
```{r}
ML1 <- glm(factor(case) ~ factor(bednet), family=binomial, data=VL_Bihar)
summary(ML1)
exp(coef(ML1))
exp(confint(ML1))
```
Waiting for profiling to be done...
Call:
glm(formula = factor(case) ~ factor(bednet), family = binomial,
data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5103 -0.5103 -0.5103 -0.3568 2.3605
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.97272 0.07868 -25.073 < 2e-16 ***
factor(bednet)1 -0.74972 0.22918 -3.271 0.00107 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1291.2 on 1878 degrees of freedom
AIC: 1295.2
Number of Fisher Scoring iterations: 5
(Intercept) factor(bednet)1
0.1390779 0.4725000
2.5 % 97.5 %
(Intercept) 0.1188343 0.1617957
factor(bednet)1 0.2941532 0.7252757
Step 3. Then adding each other variable and writing it down. And doing the table of how much change the OR
Case - bednet + sex
```{r}
ML2 <- glm(factor(case) ~ factor(bednet) + factor(sex), family=binomial, data=VL_Bihar)
exp(coef(ML2))
exp(confint(ML2))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(sex)2
0.1497557 0.4737404 0.8623835
2.5 % 97.5 %
(Intercept) 0.1208915 0.1836368
factor(bednet)1 0.2948994 0.7272629
factor(sex)2 0.6449093 1.1523372
Case - bednet + age
```{r}
ML3 <- glm(factor(case) ~ factor(bednet) + age, family=binomial, data=VL_Bihar)
exp(coef(ML3))
exp(confint(ML3))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 age
0.1689851 0.4963053 0.9909758
2.5 % 97.5 %
(Intercept) 0.1331263 0.2129796
factor(bednet)1 0.3083839 0.7636801
age 0.9824605 0.9991968
Case - bednet + groupage
```{r}
ML3_2 <- glm(factor(case) ~ factor(bednet) + groupage, family=binomial, data=VL_Bihar)
exp(coef(ML3_2))
exp(confint(ML3_2))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 groupage1 groupage2
0.1613022 0.4961901 0.8078296 0.6846378
2.5 % 97.5 %
(Intercept) 0.1293122 0.1990533
factor(bednet)1 0.3082705 0.7636340
groupage1 0.5745769 1.1285375
groupage2 0.4659519 0.9917488
Case - bednet + neem_tree
```{r}
ML4 <- glm(factor(case) ~ factor(bednet) + factor(neem_tree), family=binomial, data=VL_Bihar)
exp(coef(ML4))
exp(confint(ML4))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(neem_tree)1
0.1507119 0.4740451 0.8096511
2.5 % 97.5 %
(Intercept) 0.1242335 0.1813191
factor(bednet)1 0.2950723 0.7277817
factor(neem_tree)1 0.5967055 1.0911650
Case - bednet + bamboo_tree
```{r}
ML5 <- glm(factor(case) ~ factor(bednet) + factor(bamboo_tree), family=binomial, data=VL_Bihar)
exp(coef(ML5))
exp(confint(ML5))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(bamboo_tree)1
0.1218403 0.4833793 1.3557781
2.5 % 97.5 %
(Intercept) 0.09872222 0.1488309
factor(bednet)1 0.30071367 0.7426603
factor(bamboo_tree)1 1.01175573 1.8139321
Case - bednet + banana_tree
```{r}
ML6 <- glm(factor(case) ~ factor(bednet) + factor(banana_tree), family=binomial, data=VL_Bihar)
exp(coef(ML6))
exp(confint(ML6))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(banana_tree)1
0.1481994 0.4736328 0.9261514
2.5 % 97.5 %
(Intercept) 0.1025907 0.2077231
factor(bednet)1 0.2948139 0.7271530
factor(banana_tree)1 0.6381431 1.3772929
Case - bednet + rice_field
```{r}
ML7 <- glm(factor(case) ~ factor(bednet) + factor(rice_field), family=binomial, data=VL_Bihar)
exp(coef(ML7))
exp(confint(ML7))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(rice_field)1
0.1144057 0.4800229 1.2666583
2.5 % 97.5 %
(Intercept) 0.07841589 0.1615314
factor(bednet)1 0.29863762 0.7374461
factor(rice_field)1 0.86999235 1.8945309
Case - bednet + permanent_water_body
```{r}
ML8 <- glm(factor(case) ~ factor(bednet) + factor(permanent_water_body), family=binomial, data=VL_Bihar)
exp(coef(ML8))
exp(confint(ML8))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1
0.1386849 0.4723452
factor(permanent_water_body)1
1.0062094
2.5 % 97.5 %
(Intercept) 0.1125509 0.1691994
factor(bednet)1 0.2939865 0.7252469
factor(permanent_water_body)1 0.7516130 1.3449976
Case - bednet + granaries_in_hh
```{r}
ML9 <- glm(factor(case) ~ factor(bednet) + factor(granaries_in_hh), family=binomial, data=VL_Bihar)
exp(coef(ML9))
exp(confint(ML9))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(granaries_in_hh)1
0.1385799 0.4724105 1.0063584
2.5 % 97.5 %
(Intercept) 0.1097963 0.1727742
factor(bednet)1 0.2940746 0.7252060
factor(granaries_in_hh)1 0.7519452 1.3510593
Case - bednet + owngoat
```{r}
ML10 <- glm(factor(case) ~ factor(bednet) + factor(owngoat), family=binomial, data=VL_Bihar)
exp(coef(ML10))
exp(confint(ML10))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(owngoat)1
0.1348600 0.4785828 1.0907549
2.5 % 97.5 %
(Intercept) 0.1110257 0.1623881
factor(bednet)1 0.2973262 0.7365004
factor(owngoat)1 0.7996930 1.4770939
Case - bednet + ownpoul
```{r}
ML11 <- glm(factor(case) ~ factor(bednet) + factor(ownpoul), family=binomial, data=VL_Bihar)
exp(coef(ML11))
exp(confint(ML11))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(ownpoul)1
0.1359797 0.4769139 1.5118321
2.5 % 97.5 %
(Intercept) 0.1156162 0.1589157
factor(bednet)1 0.2968099 0.7323453
factor(ownpoul)1 0.7844602 2.7032767
Case - bednet + ownbov
```{r}
ML12 <- glm(factor(case) ~ factor(bednet) + factor(ownbov), family=binomial, data=VL_Bihar)
exp(coef(ML12))
exp(confint(ML12))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(ownbov)1
0.1344998 0.4674983 1.0953236
2.5 % 97.5 %
(Intercept) 0.1106581 0.1620431
factor(bednet)1 0.2906619 0.7186983
factor(ownbov)1 0.8108404 1.4725822
Case - bednet + asset_index
```{r}
ML13 <- glm(factor(case) ~ factor(bednet) + factor(asset_index), family=binomial, data=VL_Bihar)
summary(ML13)
exp(coef(ML13))
exp(confint(ML13))
```
Waiting for profiling to be done...
Call:
glm(formula = factor(case) ~ factor(bednet) + factor(asset_index),
family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.5955 -0.5941 -0.4723 -0.3225 2.5833
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.64489 0.13006 -12.647 < 2e-16 ***
factor(bednet)1 -0.37012 0.24058 -1.538 0.123934
factor(asset_index)2 0.00511 0.19228 0.027 0.978800
factor(asset_index)3 -0.49232 0.22509 -2.187 0.028725 *
factor(asset_index)4 -0.79317 0.23566 -3.366 0.000763 ***
factor(asset_index)5 -1.28561 0.30390 -4.230 2.33e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1258.3 on 1874 degrees of freedom
AIC: 1270.3
Number of Fisher Scoring iterations: 5
(Intercept) factor(bednet)1 factor(asset_index)2
0.1930338 0.6906510 1.0051226
factor(asset_index)3 factor(asset_index)4 factor(asset_index)5
0.6112062 0.4524071 0.2764830
2.5 % 97.5 %
(Intercept) 0.1484556 0.2473784
factor(bednet)1 0.4214821 1.0867165
factor(asset_index)2 0.6880463 1.4639808
factor(asset_index)3 0.3891080 0.9427867
factor(asset_index)4 0.2813302 0.7108052
factor(asset_index)5 0.1472895 0.4884626
Case - bednet + riskwall
```{r}
ML14 <- glm(factor(case) ~ factor(bednet) + factor(riskwall), family=binomial, data=VL_Bihar)
exp(coef(ML14))
exp(confint(ML14))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(riskwall)1
0.1011328 0.5528385 1.8463805
2.5 % 97.5 %
(Intercept) 0.07972857 0.1265816
factor(bednet)1 0.34181591 0.8561327
factor(riskwall)1 1.37410587 2.4884630
Case - bednet + earthfloor
```{r}
ML15 <- glm(factor(case) ~ factor(bednet) + factor(earthfloor), family=binomial, data=VL_Bihar)
exp(coef(ML15))
exp(confint(ML15))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(earthfloor)1
0.04606946 0.52529186 3.16935398
2.5 % 97.5 %
(Intercept) 0.01793912 0.09657055
factor(bednet)1 0.32593161 0.80963385
factor(earthfloor)1 1.49945482 8.17768200
Case - bednet + sc_caste
```{r}
ML16 <- glm(factor(case) ~ factor(bednet) + factor(sc_caste), family=binomial, data=VL_Bihar)
exp(coef(ML16))
exp(confint(ML16))
```
Waiting for profiling to be done...
(Intercept) factor(bednet)1 factor(sc_caste)1
0.1186469 0.5475738 6.7960195
2.5 % 97.5 %
(Intercept) 0.09993218 0.1398283
factor(bednet)1 0.33945784 0.8451291
factor(sc_caste)1 4.12689597 11.1198834
Table of how much change the OR
Factor | Co factor | OR | 95IC | Change-in-estimate |
---|---|---|---|---|
Bed nets | 0.4725 | 0.2941532 0.7252757 | Ref | |
Bed nets | factor(sex) | 0.47374 | 0.2948994 0.7272629 | -0.3% |
Bed nets | age | 0.496305 | 0.3083839 0.7636801 | -5.0% |
Bed nets | group age | 0.49619 | 0.3082705 0.7636340 | -5.0% |
Bed nets | under5 | 0.482438 | 0.3000965 0.7413054 | -2.1% |
Bed nets | factor(neem_tree), f | 0.474045 | 0.2950723 0.7277817 | -0.3% |
Bed nets | factor(bamboo_tree), | 0.483379 | 0.30071367 0.7426603 | -2.3% |
Bed nets | factor(banana_tree), | 0.473633 | 0.2948139 0.7271530 | -0.2% |
Bed nets | factor(rice_field), | 0.480023 | 0.29863762 0.7374461 | -1.6% |
Bed nets | factor(permanent_wat | 0.472345 | 0.2939865 0.7252469 | 0.0% |
Bed nets | factor(granaries_in_ | 0.472411 | 0.2940746 0.7252060 | 0.0% |
Bed nets | factor(owngoat), | 0.478583 | 0.2973262 0.7365004 | -1.3% |
Bed nets | factor(ownpoul), | 0.476914 | 0.2968099 0.7323453 | -0.9% |
Bed nets | factor(ownbov), | 0.467498 | 0.2906619 0.7186983 | 1.1% |
Bed nets | factor(asset_index) | 0.690651 | 0.4214821 1.0867165 | -46.2% |
Bed nets | factor(riskwall), f | 0.552839 | 0.34181591 0.8561327 | -17.0% |
Bed nets | factor(earthfloor), | 0.525292 | 0.32593161 0.80963385 | -11.2% |
Bed nets | factor(sc_caste), f | 0.547574 | 0.33945784 0.8451291 | -15.9% |
So far we know that assess index, riskwall, earthfloor and sccastle have more than 10% of change in the OR of bednet
Assess index : change of 46.2%
Riskwall : change of 17%
earthfloor: 11.2%
sccastle : 15.9 ( Castle is similar to assess index, so we won´t take in to account)
So the model so far is:
Step 4: We are going to exclude factors removing one by one (starting with the one with less change: earthfloor)
We will star removing earthfloor:
```{r}
ML18 <- glm(factor(case) ~ factor(bednet) + factor(asset_index) + factor(riskwall) , family=binomial, data=VL_Bihar)
summary(ML18)
exp(coef(ML18))
exp(confint(ML18))
```
Waiting for profiling to be done...
Call:
glm(formula = factor(case) ~ factor(bednet) + factor(asset_index) +
factor(riskwall), family = binomial, data = VL_Bihar)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.6197 -0.5569 -0.4514 -0.3220 2.5816
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.82815 0.19068 -9.588 < 2e-16 ***
factor(bednet)1 -0.36266 0.24083 -1.506 0.132088
factor(asset_index)2 0.04281 0.19434 0.220 0.825645
factor(asset_index)3 -0.40452 0.23418 -1.727 0.084100 .
factor(asset_index)4 -0.66343 0.25476 -2.604 0.009212 **
factor(asset_index)5 -1.10510 0.33366 -3.312 0.000926 ***
factor(riskwall)1 0.23261 0.17392 1.337 0.181078
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1303.7 on 1879 degrees of freedom
Residual deviance: 1256.5 on 1873 degrees of freedom
AIC: 1270.5
Number of Fisher Scoring iterations: 5
(Intercept) factor(bednet)1 factor(asset_index)2
0.1607100 0.6958205 1.0437402
factor(asset_index)3 factor(asset_index)4 factor(asset_index)5
0.6672951 0.5150839 0.3311786
factor(riskwall)1
1.2618946
2.5 % 97.5 %
(Intercept) 0.1096759 0.2317313
factor(bednet)1 0.4244549 1.0954504
factor(asset_index)2 0.7116732 1.5265074
factor(asset_index)3 0.4176057 1.0483734
factor(asset_index)4 0.3091537 0.8414938
factor(asset_index)5 0.1675919 0.6242947
factor(riskwall)1 0.8990299 1.7791027
And then comparing both models (ML17 vs ML18)
Analysis of Deviance Table
Model 1: factor(case) ~ factor(bednet) + factor(asset_index) + factor(riskwall) +
factor(earthfloor)
Model 2: factor(case) ~ factor(bednet) + factor(asset_index) + factor(riskwall)
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 1872 1254.1
2 1873 1256.5 -1 -2.4739 0.1157
Because the Likelihood ratio test was not significant (p = 0.1157) there is not significant difference between LM17 an ML18, so we keep the simplest model ML18: factor(case) - factor(bednet) + factor(asset_index) + factor(riskwall)
Now we will try to remove other factor (riskwall)
And then comparing both (ML18 vs ML19)
Analysis of Deviance Table
Model 1: factor(case) ~ factor(bednet) + factor(asset_index) + factor(riskwall)
Model 2: factor(case) ~ factor(bednet) + factor(asset_index)
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 1873 1256.5
2 1874 1258.3 -1 -1.8029 0.1794
Because the Likelihood ratio test was not significant (p = 0.1794) there is not significant difference between LM18 an ML19, so we keep the simplest model LM19: factor(case) - factor(bednet) + factor(asset_index)
Now we will try to remove the other factor (assest_index)
And then comparing both (ML19 vs ML20)
Analysis of Deviance Table
Model 1: factor(case) ~ factor(bednet) + factor(asset_index)
Model 2: factor(case) ~ factor(bednet)
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 1874 1258.3
2 1878 1291.2 -4 -32.812 1.305e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Because the Likelihood ratio test was significant (1.305e-06 ) there is a significant difference between LM18 an ML19, so we keep the complex model: factor(case) - factor(bednet) + factor(asset_index)
So the final model should include: Bednet and asset_index
Now the question is if those factors are ok like that or there is an interaction between them?
Step 5: Analyzing the interaction