Semaj Christian

2022-06-26

Survivor function of a variable that has discrete and continuous components
I'm currently reading The Statistical Analysis of Failure Time Data by Kalbfleisch and Prentice and had trouble arriving at the expression for the survivor function of a random variable T having both discrete and continuous components. The setup is the following:
Let T be a random variable on $\left[0,\mathrm{\infty }\right)$ with survivor function F(t)=P(T>t). Then
if T is absolutely continuous with density f, then the hazard function $\lambda$ can be defined as
$\lambda \left(t\right):=\underset{h\to {0}^{+}}{\lim}\frac{P\left(t\le T<t+h\mid T\ge t\right)}{h}=\frac{f\left(t\right)}{F\left(t\right)}$
for $t\ge 0$, and hence we have
$F\left(t\right)=\mathrm{exp}\left(-{\int }_{0}^{t}\lambda \left(s\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}s\right),\phantom{\rule{1em}{0ex}}t\ge 0.$
if T is discrete taking on the values $0\le {a}_{1}<{a}_{2}<\cdots$, then we define the hazard at ${a}_{i}$ as
${\lambda }_{i}=P\left(T={a}_{i}\mid T\ge {a}_{i}\right),\phantom{\rule{1em}{0ex}}i=1,2,\dots .$
Then we can show that
$F\left(t\right)=\prod _{j\mid {a}_{j}\le t}\left(1-{\lambda }_{j}\right),\phantom{\rule{1em}{0ex}}t\ge 0.$
These expressions for the survivor functions I am ok with. Now they write the following:
More generally, the distribution of T may have both discrete and continuous components. In this case, the hazard function can be defined to have the continuous component ${\lambda }_{c}\left(t\right)$ and discrete components ${\lambda }_{1},{\lambda }_{2},\dots$ at the discrete times ${a}_{1}<{a}_{2}<\cdots$
The overall survivor function can then be written
$\begin{array}{}\text{(1)}& F\left(t\right)=\mathrm{exp}\left(-{\int }_{0}^{t}{\lambda }_{c}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}s\right)\prod _{j\mid {a}_{j}\le t}\left(1-{\lambda }_{j}\right).\end{array}$
That T has both discrete and continuous components means that the distribution of T is of the form
${P}_{T}\left(\mathrm{d}x\right)={f}_{c}\left(x\right)\lambda \left(\mathrm{d}x\right)+\sum _{j=1}^{\mathrm{\infty }}{b}_{j}{\delta }_{{a}_{j}}\left(\mathrm{d}x\right)$
or equivalently
$P\left(T\in A\right)={\int }_{A}{f}_{c}\left(x\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}x+\sum _{j\mid {a}_{j}\in A}{b}_{j}$
for some sequence ${a}_{1}<{a}_{2}<\cdots$, weights ${b}_{j}\in \left(0,1\right)$, and some non-negative measurable function ${f}_{c}$ with ${\int }_{0}^{\mathrm{\infty }}{f}_{c}\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}\lambda +\sum _{j=1}^{\mathrm{\infty }}{b}_{j}=1$, where $\lambda \left(\mathrm{d}x\right)$ denotes Lebesgue measure. If we define
${\lambda }_{c}\left(t\right)=\frac{{f}_{c}\left(t\right)}{P\left(T\ge t\right)}=\frac{{f}_{c}\left(t\right)}{F\left(t\right)},\phantom{\rule{1em}{0ex}}t\ne {a}_{i},$
and
${\lambda }_{i}=P\left(T={a}_{i}\mid T\ge {a}_{i}\right),$
then how do I show (and is it even true) that the survivor function of T is given by (1)?

Ryan Newman


The function $F\left(t\right)=\mathsf{P}\left(T>t\right)=1-\mathsf{P}\left(T\le t\right)$ is right-continuous with left limits (RCLL) on $\left[0,\mathrm{\infty }\right)$. As a result, the definitions of the continuous part ${\lambda }_{c}$ of the hazard function and of its discrete parts let you compute F by integrating ${\lambda }_{c}$ between the jumps and applying jump conditions at $t={a}_{j}$. The jump conditions take the following form:
${\lambda }_{j}=\mathsf{P}\left(T={a}_{j}\mid T\ge {a}_{j}\right)=\frac{F\left({a}_{j}-\right)-F\left({a}_{j}\right)}{F\left({a}_{j}-\right)}\phantom{\rule{thickmathspace}{0ex}}⟹\phantom{\rule{thickmathspace}{0ex}}F\left({a}_{j}\right)=F\left({a}_{j}-\right)\left(1-{\lambda }_{j}\right)$
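where $F\left(t-\right):=\underset{s↑t}{\lim}F\left(s\right)$. Between consecutive jumps, ${F}^{\prime }\left(t\right)=-{f}_{c}\left(t\right)$, so $\frac{\mathrm{d}}{\mathrm{d}t}\mathrm{log}F\left(t\right)=-{\lambda }_{c}\left(t\right)$ and hence $F\left(t\right)=F\left({a}_{j}\right)\mathrm{exp}\left(-{\int }_{{a}_{j}}^{t}{\lambda }_{c}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{d}s\right)$ for ${a}_{j}\le t<{a}_{j+1}$. Starting from $F\left(0-\right)=1$ and alternating these two rules gives (1) by induction on $j$, at least when the ${a}_{j}$ have no finite accumulation point.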
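As a sanity check on (1), here is a minimal numerical sketch in Python. The distribution is a hypothetical one chosen only for illustration (continuous part ${f}_{c}\left(x\right)=0.5{e}^{-x}$, atoms of mass 0.3 at 1 and 0.2 at 2), and the helper names are my own. It computes $F$ directly from the distribution, derives ${\lambda }_{c}$ and ${\lambda }_{j}$ from the definitions in the question, and compares the result with the right-hand side of (1).

```python
import math

# Hypothetical mixed distribution, chosen only for illustration:
# continuous part f_c(x) = 0.5 * exp(-x) plus atoms (a_j, b_j).
# Total mass: 0.5 + 0.3 + 0.2 = 1.
ATOMS = [(1.0, 0.3), (2.0, 0.2)]


def f_c(x):
    return 0.5 * math.exp(-x)


def survivor(t):
    """F(t) = P(T > t), computed directly from the distribution."""
    return 0.5 * math.exp(-t) + sum(b for a, b in ATOMS if a > t)


def lambda_c(t):
    """Continuous hazard f_c(t) / F(t), for t not equal to any a_j."""
    return f_c(t) / survivor(t)


def lambda_j(a, b):
    """Discrete hazard P(T = a_j | T >= a_j) = b_j / F(a_j-)."""
    return b / (survivor(a) + b)  # F(a_j-) = F(a_j) + b_j


def integrate(g, lo, hi, n=2000):
    """Composite midpoint rule on [lo, hi]; never evaluates g at the endpoints."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def survivor_from_hazards(t):
    """Right-hand side of (1)."""
    # Split the integral at the atoms, where lambda_c is discontinuous.
    cuts = [0.0] + [a for a, _ in ATOMS if a < t] + [t]
    integral = sum(integrate(lambda_c, lo, hi) for lo, hi in zip(cuts, cuts[1:]))
    product = math.prod(1.0 - lambda_j(a, b) for a, b in ATOMS if a <= t)
    return math.exp(-integral) * product


for t in [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]:
    print(t, survivor(t), survivor_from_hazards(t))
```

The two printed columns should agree up to the quadrature error of the midpoint rule. This only illustrates (1) for one example and does not replace the argument above.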
