Solution: Maximum partial likelihood estimator for beta (Exam 2022)

Example

Solution

To find an expression for the partial likelihood, we start with the general expression for the partial likelihood of a relative risk regression model, given in (4.7) in ABG,

\[ L(\beta) = \prod_{T_j} \frac{r(\beta,x_{i_j}(T_j))}{\sum_{\ell \in {\cal R}_j} r(\beta,x_\ell(T_j))}. \]

Since our covariates are time-invariant, we can simplify this expression by inserting \(x_{i_j}(T_j)=x_{i_j}\) and \(x_\ell(T_j)=x_\ell\). Since we have only one covariate, \(\beta\) is scalar and the relative risk function is \(r(\beta,x)=e^{\beta x}\). We then get

\[ L(\beta) = \prod_{T_j} \frac{e^{\beta x_{i_j}}}{\sum_{\ell \in {\cal R}_j} e^{\beta x_\ell}}. \]

Since all the units are at risk at all times, we have \({\cal R}_j=\{1,2,\ldots,n\}\) for every \(j\), so

\[L(\beta) = \prod_{T_j} \frac{e^{\beta x_{i_j}}}{\sum_{\ell =1}^n e^{\beta x_\ell}}.\]

To simplify the denominator in this expression we can define

\[{\cal B}_1 = \{i:x_i=1\}\]

to be the set of units that have the covariate equal to one, and correspondingly define

\[{\cal B}_0 = \{i:x_i=0\} = \{1,2,\ldots,n\} \setminus {\cal B}_1\]

to be the set of units that have the covariate equal to zero. Moreover, let \(z_1=|{\cal B}_1|\) be the number of units that have the covariate equal to one, and let \(z_0=|{\cal B}_0|=n-z_1\) be the number of units that have the covariate equal to zero.

Since the covariate value is binary, the denominator can then be written as

\[\sum_{\ell =1}^n e^{\beta x_\ell} = \sum_{\ell:x_\ell=0} e^{\beta \cdot 0} + \sum_{\ell:x_\ell=1} e^{\beta\cdot 1} = \sum_{\ell:x_\ell=0} 1 + \sum_{\ell:x_\ell=1} e^{\beta} = z_0 + z_1e^\beta.\]

The partial likelihood function can thereby be expressed as

\[L(\beta) = \prod_{T_j} \frac{e^{\beta x_{i_j}}}{z_0 + z_1e^\beta}.\]

Taking the logarithm of this expression we get the log-partial likelihood function,

\[\ell (\beta) = \sum_{T_j} \left(\beta x_{i_j} - \ln(z_0+z_1e^\beta)\right).\]

Letting \(K_0\) and \(K_1\) denote the number of failures observed up to time \(\tau\) on units with covariate value equal to zero and one, respectively, and letting \(K=K_0+K_1\) denote the total number of failures observed up to time \(\tau\), the expression for \(\ell(\beta)\) can be written as

\begin{align*} \ell (\beta) &= \sum_{T_j} \beta x_{i_j} - \sum_{T_j} \ln(z_0+z_1e^\beta)\\ &= \sum_{T_j:x_{i_j}=0} \beta\cdot 0 + \sum_{T_j:x_{i_j}=1}\beta\cdot 1 - K\ln (z_0+z_1e^\beta)\\ &= K_0\cdot 0+K_1\cdot \beta - K\ln (z_0+z_1e^\beta)\\ &= \beta K_1 - K\ln(z_0+z_1e^\beta). \end{align*}
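The simplification above can be checked numerically. The following Python sketch compares the general log-partial likelihood, summed failure by failure, with the closed form \(\beta K_1 - K\ln(z_0+z_1e^\beta)\). The covariate vector `x` and the list `failure_units` are hypothetical illustration data, not taken from the exam, and the function names are chosen here for convenience.

```python
import math

def log_partial_likelihood_general(beta, x, failure_units):
    """Sum over failures of beta*x_{i_j} - ln(sum_l exp(beta*x_l)).

    Assumes every unit is at risk at every failure time, as in the text.
    """
    log_denom = math.log(sum(math.exp(beta * xl) for xl in x))
    return sum(beta * x[i] - log_denom for i in failure_units)

def log_partial_likelihood_closed(beta, K1, K, z0, z1):
    """Closed form: beta*K1 - K*ln(z0 + z1*exp(beta))."""
    return beta * K1 - K * math.log(z0 + z1 * math.exp(beta))

# Hypothetical data: binary covariate for n = 5 units, and the
# (0-based) index of the unit that fails at each failure time.
x = [0, 0, 0, 1, 1]
failure_units = [3, 0, 4, 3, 1, 3]

z1 = sum(x)                               # units with covariate 1
z0 = len(x) - z1                          # units with covariate 0
K1 = sum(x[i] for i in failure_units)     # failures on units with covariate 1
K = len(failure_units)                    # total number of failures

beta = 0.7
print(log_partial_likelihood_general(beta, x, failure_units))
print(log_partial_likelihood_closed(beta, K1, K, z0, z1))
```

Both printed values should agree up to floating-point rounding, since the two functions evaluate the same expression.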

Using the chain rule to evaluate the derivative of \(\ell (\beta)\) we get

\begin{align*} \ell^\prime(\beta) &= K_1 - K\frac{1}{z_0+z_1e^\beta}\cdot z_1e^\beta. \end{align*}

Setting \(\ell^\prime(\beta)=0\) to find the maximum partial likelihood estimator for \(\beta\), we get

\begin{align*} K_1 &= \frac{K z_1 e^\beta}{z_0+z_1e^\beta}\\ K_1(z_0+z_1e^\beta) &= Kz_1e^\beta\\ K_1 z_0 &= (K-K_1)z_1e^\beta\\ e^\beta &= \frac{K_1z_0}{(K-K_1)z_1} = \frac{K_1z_0}{K_0z_1}\\ \beta &= \ln\left( \frac{K_1z_0}{K_0z_1}\right). \end{align*}
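To confirm that this stationary point is a maximum, one can differentiate once more. This step is an addition to the solution, assuming \(K>0\), \(z_0>0\) and \(z_1>0\):

\[ \ell^{\prime\prime}(\beta) = -K\,\frac{z_1e^\beta(z_0+z_1e^\beta) - z_1e^\beta\cdot z_1e^\beta}{(z_0+z_1e^\beta)^2} = -\frac{K z_0 z_1 e^\beta}{(z_0+z_1e^\beta)^2} < 0. \]

The log-partial likelihood is therefore strictly concave, so the solution of \(\ell^\prime(\beta)=0\) is the unique maximum. Note that the solution is finite only when both \(K_0>0\) and \(K_1>0\).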

So the maximum partial likelihood estimator for \(\beta\) is

\[ \underline{\underline{\widehat{\beta} = \ln\left( \frac{K_1z_0}{K_0z_1}\right)}}. \]
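As a numerical sanity check, the following Python sketch evaluates the closed-form estimator and verifies that the score \(\ell^\prime(\beta)\) vanishes there. The counts \(K_0=2\), \(K_1=4\), \(z_0=3\), \(z_1=2\) are hypothetical values chosen for illustration, and the function names are not from the source.

```python
import math

def beta_hat(K0, K1, z0, z1):
    """Closed-form estimator ln(K1*z0 / (K0*z1)).

    Requires all four counts to be positive; otherwise the
    estimator is not finite.
    """
    if min(K0, K1, z0, z1) <= 0:
        raise ValueError("K0, K1, z0 and z1 must all be positive")
    return math.log(K1 * z0 / (K0 * z1))

def score(beta, K0, K1, z0, z1):
    """Derivative of the log-partial likelihood:
    K1 - K*z1*exp(beta) / (z0 + z1*exp(beta))."""
    K = K0 + K1
    e = math.exp(beta)
    return K1 - K * z1 * e / (z0 + z1 * e)

# Hypothetical counts, for illustration only.
K0, K1, z0, z1 = 2, 4, 3, 2

b = beta_hat(K0, K1, z0, z1)          # equals ln(3) for these counts
print(b)
print(score(b, K0, K1, z0, z1))       # zero up to rounding error
```

For these counts the ratio \(K_1z_0/(K_0z_1)\) equals 3, so the estimate is \(\ln 3\). The score is positive to the left of the estimate and negative to the right, consistent with a maximum.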