2023-02-13 Session 1
Survival Analysis - Session 1
Starting the Exercise
Question 1
Which of the following types of data can/should be analyzed using survival analysis? What would be the analysis variable? What is the time of origin and the end point? Are the data censored, and if yes, which type of censoring occurs?
- Two groups of patients with skin cancer are treated: group A gets surgical removal of the cancer, group B gets surgical removal + chemotherapy. We believe that the chemotherapy will reduce the risk of recurrence of he cancer. The patients are asked to return for a check up every 3 months for the following 5 years. At every check-up, the patients are examined for skin cancers.
a1. Should be analyzed using survival analysis?: Yes.
Why?, Because we are interested in analyzing the time from one event to another and also because the data could be censored (when we don’t know exactly when is the end point for some individuals)
Why not linear regression?: Because of the censorship. There is a possibility that some individuals did not continue to follow-up or completed the study without having the event.
a2. Analysis variable: Time to cancer recurrence
a3. Time:
Moment of origin: Moment in which the intervention is carried out
End point: Time to cancer recurrence
a4. Are the data censored?: Yes, censoring in survival analysis refers to the situation in which the exact time to event is not known for some individuals in a study.
What type of censoring occurs?: Interval, because if you are studying the recurrence of cancer, you do not know if it just appeared before the appointment Also, the point where cancer begins is not well defined.
- Two groups of patients with malaria are entered in a clinical trial: 1 group receives a new treatment, the other group receives standard care. After 3 months they are asked to return for parasitological examination. We want to see if the new treatment cures more patient than standard therapy.
b1. Should be analyzed using survival analysis?: No. Our outcome doesn’t involve time. It is just one measure.
For this we should use Logistic Regression. Because the outcome is a binary variable..
b2. Analysis variable:% of patients parasite-free at month 3
- To assess risk behaviour for HIV and other STI’s, a group of adolescents (12 to 18 years) are asked when they had sexual intercourse for the first time. We are interested in estimating the age of sexual debut.
c1. Should be analyzed using survival analysis?: Yes. Our outcome is the time of sexual debut. We are interested in the time from one event to another. Also the data could be censored
c2. Analysis variable: Age of sexual debut
c3. Time:
Time of origin: Born
End point: Age of sexual debut
c4. Are the data censored?: Yes. There may be adolescents who at the end of the study have not developed the result (they have not yet had sexual relations at the time of the survey).
This is a cross-sectional study but you ask for a time. (It’s a bit tricky)
What type of censoring occurs? Right. Because of some teens who have not yet had sex at the time of the survey. We know that the outcome could develop after the end of the study.
- To assess risk behaviors for childhood asthma, we perform a case-control study comparing mothers of children with asthma and children without asthma. We ask the mothers when they were first pregnant. We are interested in estimating the age of first pregnancy for women with a child with asthma vs women with a child without asthma
d1. Should be analyzed using survival analysis?: Yes. We are interested in the time from one event to another. But, since there is no censoring here, we can also choose to use linear regression.
d2. Analysis variable: Age at first pregnancy
Independent variable: having or not a child with asthma
d3. Time:
Time of origin: Birth
End point: Age of first pregnancy
d4. Are the data censored? No, censorship is when we don’t know the endpoint for some individuals. In this case, all the women have had a first pregnancy, so they have all reached the outcome and we know when. Also, I cannot have losses to follow-up, because it is a cross-sectional study.
Question 2
We ran a clinical trial of a new less invasive surgical procedure (treatment A) for patients with myocardial infarction. This procedure is compared to the standard treatment (treatment B) which requires open-heart surgery. After surgery patients were required to return to the clinic every 3 months for a check-up. The trial was completed on 1-Jan-2006. The primary objective of the study is to compare the two treatment groups with respect to mortality, analyzed using an intention-to-treat approach. As a secondary analysis, time to cardiac event (defined as new MI or a new cardiac procedure or cardiac death) is analyzed, using a per-protocol approach.
In an intention-to-treat analysis, you analyze everyone according to the groups from the initial randomization. (Although in the end for X reasons some did not follow the planned treatment)
In a per-protocol analysis, you analyze only those who actually undergo initial randomization.
- Patient 001 was randomized to treatment A. The surgery was performed on 1-Jul-2004 and was successful according to the surgeon. The patient came in for his visits and was in good health at the end of the study.
Table according to an intention-to-treat analysis
ID | Original Treatment | Event (death) | Start date | End date | Reason of the end date | Time (start date - end date) |
---|---|---|---|---|---|---|
1 | A | 0 | 1-jul-04 | 1-ene-06 | End of study | 18.30 |
Table according to an per-protocol analysis
ID | Treatment original | Real treatment | Cardiac Event | Start date | End date (first date) | Reason of the end date | Time to event (months) |
---|---|---|---|---|---|---|---|
1 | A | A | 0 | 1-jul-04 | 1-ene-06 | End of study | 18.30 |
The cardiac event is defined as new MI or a new cardiac procedure or cardiac death.
- Patient 002 was randomized to treatment A. The surgery was performed on 15-Jul-2004 and was successful according to the surgeon. On 15-Nov-2004, the patient came into the hospital with a new myocardial infarction. The patient however died 15 days later in hospital.
According to an intention-to-treat analysis
ID | Original Treatment | Event (death) | Start date | End date | Reason of the end date | Time (start date - end date) |
---|---|---|---|---|---|---|
2 | A | 1 | 15-jul-04 | 1-dic-04 | Died | 4.63 |
According to an per-protocol analysis
ID | Treatment original | Real treatment | Cardiac Event | Start date | End date (first date) | Reason of the end date | Time to event (months) |
---|---|---|---|---|---|---|---|
2 | A | A | 1 | 15-jul-04 | 15-nov-04 | New MI | 4.10 |
- Patient 003 was randomized to treatment B. The surgery was performed on 30-Jul-2004 and was successful. The patient came in hospital for the first 2 visits. The second visit was on 1-Feb-2005 when the patient was in good health. However the patient but did not return to the hospital after this date.
According to an intention-to-treat analysis
ID | Original Treatment | Event (death) | Start date | End date | Reason of the end date | Time (start date - end date) |
---|---|---|---|---|---|---|
3 | B | 0 | 30-jul-04 | 1-feb-05 | Lost follow up | 6.20 |
According to an per-protocol analysis
ID | Treatment original | Real treatment | Cardiac Event | Start date | End date (first date) | Reason of the end date | Time to event (months) |
---|---|---|---|---|---|---|---|
3 | B | B | 0 | 30-jul-04 | 1-feb-05 | Lost follow up | 6.20 |
- Patient 004 was randomized to standard treatment. When in the operating room (on 1-Aug-2004), the surgeon however decided that the standard surgery was too dangerous to perform, due to the bad condition of the patient. Instead, the new less invasive operation was performed. The patient died in hospital 15 days after the operation.
According to an intention-to-treat analysis
ID | Original Treatment | Event (death) | Start date | End date | Reason of the end date | Time (start date - end date) |
---|---|---|---|---|---|---|
4 | B | 1 | 1-ago-04 | 16-ago-04 | Died | 0.50 |
According to an per-protocol analysis
ID | Treatment original | Real treatment | Cardiac Event | Start date | End date (first date) | Reason of the end date | Time to event (months) |
---|---|---|---|---|---|---|---|
4 | B | A | Excluded |
Why in per-protocol analysis is B? Because this person did not carry out the assigned procedure (randomized), then is excluded
- Patient 005 was randomized to the treatment B. The surgery was performed on 1-Nov-2004 and was successful according to the surgeon. The patient returned for his first follow-up visit on 1-Feb-2005. The patient did not return to the hospital for any of the next follow-up visits. However one of the study nurses read in the newspaper that the patient (whose name she remembered) had died on 1-Sep-2005 due to a fall when performing repairs to the roof of his house.
According to an intention-to-treat analysis
ID | Original Treatment | Event (death) | Start date | End date | Reason of the end date | Time (start date - end date) |
---|---|---|---|---|---|---|
5 | B | 1 | 1-nov-04 | 1-sep-05 | Died (even it was by fall) | 10.13 |
According to an per-protocol analysis
ID | Treatment original | Real treatment | Cardiac Event | Start date | End date (first date) | Reason of the end date | Time to event (months) |
---|---|---|---|---|---|---|---|
5 | B | B | 0 | 1-nov-04 | 1-feb-05 | Lost follow up | 3.07 |
- Patient 006 was randomized to the treatment A. The surgery was performed on 1-Dec-2004 and was successful according to the surgeon. On 1-Mar-2005, the patient came into the hospital with a new MI. The surgeon decided to perform open heart surgery, similar to treatment B. After this surgery, the patient did well and returned regularly to the check-ups. On 1-Jan-2006 was still alive and doing well
According to an intention-to-treat analysis
ID | Original Treatment | Event (death) | Start date | End date | Reason of the end date | Time (start date - end date) |
---|---|---|---|---|---|---|
6 | A | 0 | 1-dic-04 | 1-ene-06 | End of study | 13.20 |
According to an per-protocol analysis
ID | Treatment original | Real treatment | Cardiac Event | Start date | End date (first date) | Reason of the end date | Time to event (months) |
---|---|---|---|---|---|---|---|
6 | A | A | 1 | 1-dic-04 | 1-mar-05 | New MI | 3.00 |
Why in per-protocol analysis is still A? Because this person did carry out the assigned procedure (randomized)
So the final table for intention-to-treat analysis is:
ID | Treatment | Time (months) | Event |
---|---|---|---|
1 | A | 18.30 | 0 |
2 | A | 4.63 | 1 |
3 | B | 6.20 | 0 |
4 | B | 0.50 | 1 |
5 | B | 10.13 | 1 |
6 | A | 13.20 | 0 |
And the final table for per-protocol analysis is:
ID | Treatment | Time (months) | Event |
---|---|---|---|
1 | A | 18.30 | 0 |
2 | A | 4.10 | 1 |
3 | B | 6.20 | 0 |
4 | A | Excluded | |
5 | B | 3.07 | 0 |
6 | A | 3.00 | 1 |
Question 3
Patient 003 was lost to follow-up in the trial. Can you think of ways to recover some additional data for this patient?
- Check with other hospitals
- Contact with their house or relatives.
- Check demographics records
Question 4
The data below are from a cervical cancer trial. Enter the data in Excel and analyze the data using R.
Give: - Kaplan-Meier plot for the two groups
median survival time in each group
The proportion of patients alive after 2 years in each group
Test the difference between the two curves using the log-rank test
4.1. First we import the data from the excel file
4.2. To start working with survival analysis we have to create an event variable (1 when the event happened (in this case dead))
4.3. We start doing the kaplan-meyer plot for the two groups
4.4. To see the median survival time for each group we ask for the numbers of the model:
Call: survfit(formula = Surv(SurvivalTime, event) ~ Therapy, data = dat)
n events median 0.95LCL 0.95UCL
Therapy=A 16 11 1037 291 NA
Therapy=B 14 5 1307 827 NA
The median survival time is 1037 for therapy A and 1307 for therapy B.
Interpretation:
50% of the population was alive at 1037 days in therapy group A
50% of the population was alive at 1307 days in therapy group B
So, group B (new treatment) keeps more alive for more time
4.5. Proportion of patients alive after two years
For this we can see the graph:
Seeing the proportion of patients alive after two years (730 days)
More or less 60% in the A group and about 80% in the B group are alive after two years
But we also can see the proportions of people alive in a certain moment using summary
Call: survfit(formula = Surv(SurvivalTime, event) ~ Therapy, data = dat)
Therapy=A
time n.risk n.event survival std.err lower 95% CI upper 95% CI
90 16 1 0.938 0.0605 0.8261 1.000
142 15 1 0.875 0.0827 0.7271 1.000
150 14 1 0.812 0.0976 0.6421 1.000
269 13 1 0.750 0.1083 0.5652 0.995
291 12 1 0.688 0.1159 0.4941 0.957
680 10 1 0.619 0.1230 0.4191 0.914
837 9 1 0.550 0.1271 0.3497 0.865
1037 7 1 0.471 0.1310 0.2735 0.813
1153 4 1 0.354 0.1417 0.1612 0.775
1297 3 1 0.236 0.1348 0.0768 0.723
1429 2 1 0.118 0.1072 0.0198 0.701
Therapy=B
time n.risk n.event survival std.err lower 95% CI upper 95% CI
272 14 1 0.929 0.0688 0.803 1
362 13 1 0.857 0.0935 0.692 1
373 12 1 0.786 0.1097 0.598 1
827 7 1 0.673 0.1401 0.448 1
1307 3 1 0.449 0.2057 0.183 1
In terapy A , at the time 680, 68.8% were alive
In terapy B , at the time 827, 78.6% were alive
(You need to see the number in the previous row)
4.6 Test differences between both curves
Are both curves different? (Curve from terapy A vs curve from terapy B). We use the Long-rank test
Call:
survdiff(formula = Surv(SurvivalTime, event) ~ Therapy, data = dat)
N Observed Expected (O-E)^2/E (O-E)^2/V
Therapy=A 16 11 8.44 0.780 1.68
Therapy=B 14 5 7.56 0.869 1.68
Chisq= 1.7 on 1 degrees of freedom, p= 0.2
The p value is 0.2, so there is no statistical difference between both curves.
4.7 Conclusions
The prognosis of patients with the new treatment appeared to be better to reduce mortality cases. The median survival was 1037 days (2.8 years) in therapy group A (standard treatment) and 1307 days (3.5 years of survival) in the therapy group B (new treatment)
This difference was however not statistically significant (p-value = 0.2)
This could be because it was a small study. A new and larger clinical trial should be planned to reach a definitive conclusion.