The Influence of Variable Coding on the Interpretation of the Cox Proportional Hazards Model Parameters

Beata BIESZK-STOLORZ and Iwona MARKOWICZ

University of Szczecin, Poland

Abstract

Methods of duration analysis (also called survival analysis or failure-free analysis) were initially used to analyse the length of human life. Owing to their versatility, however, they are increasingly applied to the study of socio-economic phenomena. Nonparametric and semiparametric models are of particular importance because they do not require knowledge of the distribution of the random variable under study, a requirement that is a significant obstacle to applying parametric models in research. A prerequisite for using survival analysis models is the availability of data from which the duration of a defined state can be determined for individual units of the studied population. Such studies are usually retrospective and rely on available registers; the unemployment register is one example of such a database. The research is methodological and analytical in character. Its aim is to show how the method of coding variables influences the estimation of Cox regression model parameters. Special attention is paid to interpreting the results obtained with different coding methods. The authors also present the relationship between model parameters estimated for data coded in two ways. The empirical example is a study of a cohort of unemployed persons deregistered at a specific time. The division into subgroups was based on age, which is a determinant of job search time.

Keywords: Survival Analysis, Semiparametric Models, Cox Regression Model, Variable Coding.
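
The relationship between parameters estimated under two coding schemes, mentioned in the abstract, can be illustrated with a minimal sketch. The abstract does not name the two schemes, so the sketch assumes reference coding (0/1 dummies, with one age group as the reference) and deviation coding (-1/0/1, where each coefficient is measured against the mean of all group effects). The function names and coefficient values are hypothetical and are not taken from the study. Because the Cox model has no intercept, a constant shift of the linear predictor is absorbed by the baseline hazard, so the two codings imply the same hazard ratios between groups.

```python
import math


def dummy_to_effect(betas):
    """Convert reference-coded (0/1) log-hazard coefficients into
    deviation-coded (-1/0/1) effects.

    betas: coefficients of the non-reference groups (reference group = 0).
    Returns one deviation per group, reference group first. A fitted
    deviation-coded model reports only k-1 of these; the remaining one
    is minus the sum of the others.
    """
    etas = [0.0] + list(betas)       # log relative hazard of every group
    mean = sum(etas) / len(etas)     # unweighted mean of the group effects
    return [eta - mean for eta in etas]


def effect_to_dummy(gammas):
    """Inverse mapping: express every group relative to the first group."""
    reference = gammas[0]
    return [gamma - reference for gamma in gammas[1:]]


# Hypothetical coefficients for two age groups relative to a reference group.
betas = [0.30, -0.20]
gammas = dummy_to_effect(betas)

# Hazard ratio of group 2 versus group 3 under each coding.
hr_dummy = math.exp(betas[0] - betas[1])
hr_effect = math.exp(gammas[1] - gammas[2])

print("deviation-coded effects:", [round(g, 4) for g in gammas])
print("hazard ratios agree:", math.isclose(hr_dummy, hr_effect))
```

Under these assumptions the coefficients differ only in their point of reference: exp(beta) compares a group with the reference group, whereas exp(gamma) compares a group with the average level of all groups.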