Joint survival analysis model of longitudinal data for binary data: inference, residuals and applications.
Survival Analysis, longitudinal data, joint model, residual analysis.
Survival analysis is widely used in many fields, such as health, economics, and engineering. Parametric, semi-parametric, and non-parametric models have been developed to study the time until the occurrence of an event of interest, for example, the death of an individual or the failure of an electronic component. When hazards are proportional, for example between two groups, researchers commonly use Cox’s proportional hazards model (Cox, 1972). Variables observed repeatedly over time, known as longitudinal variables, are also common. In practice, the interest often lies in studying the time until the event in the presence of such variables, for example, in studies of patients with Acquired Immunodeficiency Syndrome (AIDS) that aim to study the time until the patient’s death in the presence of the CD4 lymphocyte count, which is observed longitudinally. Joint models for survival and longitudinal data are suitable for practical situations that involve both survival data and longitudinal data. This dissertation studies practical situations in which the longitudinal variable is binary, for example, whether a patient’s lifetime is affected by the patient’s satisfaction or dissatisfaction with their life. We obtain maximum likelihood estimates via a two-stage approach and via the Expectation-Maximization (EM) algorithm. The two-stage approach is computationally less costly. In the first stage, we fit a generalized linear mixed model (GLMM) for binary data and obtain the estimate of the population mean. In the second stage, the mean estimate obtained in the first stage is included as an explanatory variable in Cox’s proportional hazards model. Via Monte Carlo simulation, we evaluated the asymptotic behavior of the two-stage maximum likelihood estimators and studied the empirical distribution
of martingale, quantile, deviance, NRSP, and NMSP residuals. Resampling methods such as the jackknife, the bootstrap, and their extensions were used to estimate the bias of the two-stage maximum likelihood estimators. Finally, a data set is used to validate the developed methodology.
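
The jackknife bias estimate mentioned above can be illustrated with a minimal sketch. It does not reproduce the two-stage estimators of this work: the estimator used here (the plug-in variance, which is the maximum likelihood estimator under normality), the function names, and the data values are all assumptions chosen only to make the resampling step concrete.

```python
from statistics import fmean, variance


def mle_variance(xs):
    """Plug-in variance (divides by n); a stand-in for a biased ML estimator."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def jackknife_bias(xs, estimator):
    """Jackknife bias estimate: (n - 1) * (mean of leave-one-out estimates - full estimate)."""
    n = len(xs)
    full = estimator(xs)
    # Recompute the estimator n times, each time leaving out one observation.
    leave_one_out = [estimator(xs[:i] + xs[i + 1:]) for i in range(n)]
    return (n - 1) * (fmean(leave_one_out) - full)


# Made-up sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]

theta_hat = mle_variance(data)                 # biased estimate
bias_hat = jackknife_bias(data, mle_variance)  # estimated bias (negative here)
theta_corrected = theta_hat - bias_hat         # bias-corrected estimate
```

For this particular estimator the jackknife correction recovers the usual unbiased sample variance exactly, which makes the sketch easy to check. For the two-stage estimators, both stages would need to be refitted for each leave-one-out sample, with the unit left out presumably being an individual rather than a single observation.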