Solution: Cox-regression in R for Pneumonia data
Eksempel
Here we solve the problem linking to this page.
Solution to (a)
From the Cox-regression model \((1.32)\) we have
\[\alpha_i(t) = \alpha_0(t)e^{\beta x_i(t)},\]
so for a patient with pneumonia status \(x\) the hazard rate is
\[\alpha(t|x) = \alpha_0(t)e^{\beta x}.\]
Solution to (b)
We get that the partial likelihood in \((4.7)\) becomes:
\[L(\beta) = \prod_{T_j} \frac{\exp(\beta x_{i_j})}{\sum_{l \in R_j}\exp(\beta x_l)},\]
where \(i_j\) is the index of the individual with event time \(T_j\) and \(R_j = \{l: T_l \geq T_j\}\) are the indices of individuals still at risk right before \(T_j\). We can also calculate the partial log-likelihood:
\[l(\beta) = \sum_{T_j} \exp(\beta x_{i_j}) - \sum_{T_j}\sum_{l \in R_j}\exp(\beta x_l).\]
Code 1 shows R-code that calculates and plots the partial likelihood as a function of \(\beta\). NB: If you are refused access to the dataset through R, you may open the link in a web-browser and download the txt-file to your own computer.
Figure 1 shows the resulting plot, and we can see that the maximum likelihood estimate of \(\beta\) is \[\hat \beta = -1.6.\]
Solution to (c)
Running the provided code,
gives the output:
The estimate for \(\beta\) is read out under "coef", next to "x". Thus the estimate is \(\hat \beta = -1.5971\), corresponding to what we found above.
Solution to (d)
We want to test the hypothesis \[H_0: \beta = 0 \quad \text{vs.} \quad H_1: \beta \neq 0.\]
Using a significance level of \(5\%\), and the results of the score test in the summary above, we can conclude that there is a significant difference between the discharge times for patients without and with pneumonia (note that the Wald test does not yield a significant result, and that in an actual experiment the choice of test would be done before seeing the results of the tests).
The relative risk of a patient without pneumonia compared to a patient with pneumonia is
\[e^{\beta\cdot 1}/e^{\beta \cdot 0} = e^{\beta} = 0.2025,\]
meaning that a the rate of disharge for a patient with pneumonia at admission is a fifth of the rate for a patient without pneumonia.
Solution to (e)
The log-rank test tests a more general hypothesis that the two groups have the same hazard rate at all times, i .e. \[H_0: \alpha_1(t) = \alpha_2(t)\] for all \(t \in [0,t_0]\), where the hazard rates need not be specified. The tests above, on the other hand, test if the hazard rates are the same under the restriction of the Cox-regression model.
R-code for performing the log-rank test "by hand", as described in section 3.3.1 in ABG, is provided in Code 2 below (note that the dataset has been rewritten in a slightly different form):
This results in a p-value of 0.038, meaning the hypothesis that the hazard functions are equal is rejected.
Solution to (f)
Running the command
gives the output:
Hence we reach the same conclusion as we did in (d) (note that the test performed in R uses a slightly different estimate for the variance of \(Z_1\) than we did in (e), hence the slight difference in test statistic and p-value).