Year : 2019  |  Volume : 8  |  Issue : 3  |  Page : 150-159

Prostate cancer survival estimates: An application with piecewise hazard function derivation

Centre for Cancer Epidemiology, The Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Navi Mumbai, Maharastra, India

Date of Web Publication: 01-Aug-2019

Correspondence Address:
Dr. Atanu Bhattacharjee
Centre for Cancer Epidemiology, The Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Navi Mumbai, Maharastra

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/sajc.sajc_245_18


Background: The hazard function is, by definition, time dependent. However, estimation of the hazard function over successive time intervals remains an overlooked area of research. Estimating the hazard separately between successive time points shows how the risk changes over follow-up and gives an overall picture of the survival trend. This work proposes a data-driven method for working with the piecewise hazard rate that provides estimates of the hazard function at different time points. Methods: The proposed method is explored with prostate cancer patients registered in the Surveillance, Epidemiology, and End Results Program, with age at diagnosis ranging from 40 years to 80 years and above. A total of 610,814 patients are included in this study. The piecewise hazard rate is formulated to serve the objective. Piecewise hazard rates are compared using Wald-type test statistics, and the corresponding R function is provided. The duration of follow-up is split into intervals to obtain the piecewise hazard rate estimates. Results: The maximum duration of follow-up observed in this study is 40 years. The piecewise hazard rates are almost the same across follow-up intervals, except for a few later intervals. The hazard observed in younger patients is lower than that in older patients. The hazard rates in different grades of prostate cancer are also reported separately. Conclusion: The application of piecewise hazard supports more detailed statistical inference. This analysis provides a better understanding of the need for effective treatment toward prolonged survival benefit for patients of different ages.

Keywords: Piecewise hazard function, prostate cancer, SEER

How to cite this article:
Bhattacharjee A, Budukh A, Dikshit R. Prostate cancer survival estimates: An application with piecewise hazard function derivation. South Asian J Cancer 2019;8:150-9

How to cite this URL:
Bhattacharjee A, Budukh A, Dikshit R. Prostate cancer survival estimates: An application with piecewise hazard function derivation. South Asian J Cancer [serial online] 2019 [cited 2019 Aug 21];8:150-9. Available from:

Introduction

There have been significantly more deaths due to prostate cancer among patients aged 62–76 years than among those aged <61 years.[1] This prompted us to seek a better estimate of the influence of age on prostate cancer deaths. We are interested in estimates of the hazard function that change with time. Three parameters are required to estimate the hazard function: duration of survival, cause of death, and age of the patient. Since we aim to establish the hazard function with reference to each age in years, it is also important to start with a large sample. Classifying the data by age creates several strata with small sample sizes. Unless the cohort is large enough, it is difficult to establish robust statistical inference on hazard functions for different ages in years. A relatively large sample of prostate cancer data was obtained from the Surveillance, Epidemiology, and End Results (SEER) Program Public-Use Data (1973–2014), National Cancer Institute.

Prostate cancer is a deadly disease, yet it is observed with prolonged survival. Over such a long period, patients may also die from other causes. Prolonging the duration of survival is always of interest in clinical practice. However, the challenge of prolonging survival is not the same for younger and older patients. For instance, the effort needed to prolong the survival of a 40-year-old patient to 41 years is not the same as that needed to take a 60-year-old patient to 61 years.[2],[3] The reason is that life expectancy differs between age groups. Older patients are more likely to be diagnosed with prostate cancer alongside several comorbidities. In the presence of these comorbidities, it becomes difficult for an older prostate cancer patient to reach even the minimum level of life expectancy in the general population. Younger prostate cancer patients may be free from such comorbidities, but extending their life toward average life expectancy is a different challenge because of the long gap between their age and the life expectancy of the general population. In this circumstance, we preferred to use the piecewise hazard to capture the magnitude of mortality risk due to prostate cancer in different age groups. Hence, the objective of this study is to estimate hazard functions that change with time. In the literature on piecewise hazard functions, both single change-point analysis[4],[5] and multiple change-point analysis[6] of the hazard function have been attempted. We adopted a data-driven approach for detecting the number of change points in a piecewise hazard function. The results were further compared using the likelihood ratio test[7] with piecewise hazard estimates.[5],[6],[7],[8],[9],[10]

Piecewise Hazard Function

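The idea of comparing treatment effects by the cumulative risk of an event is useful for quantifying the ultimate treatment benefit.[11],[12] In our motivating context, the theoretical quantities of interest are the survival benefit in specific time intervals, which help identify the steps needed to modify the treatment management strategy. Let the total follow-up time be measured on the interval $(0, \infty)$. The total time interval is split into $k + 1$ subintervals $[\tau_j, \tau_{j+1})$, $j = 0, \ldots, k$, where $0 = \tau_0 < \tau_1 < \cdots < \tau_k < \tau_{k+1} = \infty$. The corresponding hazard on the $j$th subinterval is defined as $\lambda g_j$, with $g_0 = 1$, so that $\lambda$ is the hazard on the first subinterval.

The piecewise constant hazard function is defined[13] as follows:

$$h(t) = \lambda g_j, \qquad \tau_j \le t < \tau_{j+1}, \quad j = 0, \ldots, k.$$

The survival function is:

$$S(t) = \exp\left\{-\int_0^t h(u)\,du\right\}.$$

The cumulative hazard is obtained as follows:

$$H(t) = \int_0^t h(u)\,du = \lambda \sum_{j=0}^{k} g_j \left\{\min(t, \tau_{j+1}) - \min(t, \tau_j)\right\} = -\log S(t).$$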
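As a minimal numerical sketch of the piecewise constant hazard, cumulative hazard, and survival function described above, the R code below evaluates all three at a few time points. The cut points `tau`, the interval hazards `haz`, and the function names `pw_hazard`, `pw_cumhaz`, and `pw_surv` are illustrative assumptions and are not estimates or code from the SEER analysis.

```r
# Illustrative values only (assumed, not taken from the SEER analysis)
tau <- c(0, 20, 40)            # left end points of the intervals, in months
haz <- c(0.010, 0.015, 0.020)  # constant hazard on [0,20), [20,40), [40,Inf)

# Hazard at time t: look up the interval that contains t
pw_hazard <- function(t, tau, haz) {
  haz[findInterval(t, tau)]
}

# Cumulative hazard: sum over intervals of hazard x time spent in the interval
pw_cumhaz <- function(t, tau, haz) {
  upper <- c(tau[-1], Inf)
  sapply(t, function(ti) sum(haz * pmax(0, pmin(ti, upper) - tau)))
}

# Survival function S(t) = exp(-H(t))
pw_surv <- function(t, tau, haz) {
  exp(-pw_cumhaz(t, tau, haz))
}

pw_hazard(c(10, 30, 50), tau, haz)
pw_cumhaz(c(10, 30, 50), tau, haz)
pw_surv(c(10, 30, 50), tau, haz)
```

Because the hazard is constant within each interval, the cumulative hazard is piecewise linear in time and the survival curve is piecewise exponential.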

Piecewise Hazard With Multiple Testing Problem

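Let $X_1, \ldots, X_n$ denote independent and identically distributed survival times, and let $C_1, \ldots, C_n$ be the censoring times, which are assumed to be independent of $X$. We only observe the pairs $(T_i, \delta_i)$, $i = 1, 2, \ldots, n$, where $T_i = \min(X_i, C_i)$ and $\delta_i = 1$ if $X_i \le C_i$ and zero otherwise. Consider the following change-point model:

$$\lambda(t) = \begin{cases} \alpha_1, & 0 \le t < \tau_1 \\ \alpha_2, & \tau_1 \le t < \tau_2 \\ \;\vdots & \\ \alpha_{k+1}, & t \ge \tau_k \end{cases} \qquad (1)$$

where $0 < \tau_1 < \cdots < \tau_k$ are the change points, $k$ is the number of change points in the model, and $\alpha_j$ is the value of the hazard function between the time points $\tau_{j-1}$ and $\tau_j$.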

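We propose maximum likelihood estimation for the unknown parameters. Based on Equation (1), the log-likelihood function is formulated as follows:

$$\log L(\alpha_1, \ldots, \alpha_{k+1}, \tau_1, \ldots, \tau_k) = \sum_{j=1}^{k+1} \left[X(\tau_j) - X(\tau_{j-1})\right] \log \alpha_j - \sum_{j=1}^{k+1} \alpha_j \sum_{i=1}^{n} \left(T_i \wedge \tau_j - T_i \wedge \tau_{j-1}\right)$$

where $X(t) = \sum_{i=1}^{n} I(T_i < t)\,\delta_i$ is the number of deaths observed up to time $t$, $a \wedge b = \min(a, b)$, $\tau_0 = 0$, and $\tau_{k+1} = \infty$. With $\tau_{j}, j = 1, \ldots, k$ fixed, some algebra yields that the maximizers with respect to $\alpha_j$ are given by:

$$\hat{\alpha}_j = \frac{X(\tau_j) - X(\tau_{j-1})}{\sum_{i=1}^{n} \left(T_i \wedge \tau_j - T_i \wedge \tau_{j-1}\right)}, \qquad j = 1, \ldots, k+1.$$

Substituting these values into $\log L$ gives the profile likelihood for the $\tau_{j}$'s, which can be expressed as:

$$l(\tau_1, \ldots, \tau_k) = \sum_{j=1}^{k+1} \left[X(\tau_j) - X(\tau_{j-1})\right] \log \hat{\alpha}_j - X(\infty)$$

where $X(\infty)$ is the total number of deaths. We then maximize $l(\tau_1, \ldots, \tau_k)$ with respect to $(\tau_1, \ldots, \tau_k)$ and insert the obtained values $\hat{\tau}_j$ back into $\hat{\alpha}_j$ to obtain the MLEs of $\alpha_j$.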
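The estimation steps above can be sketched in R as follows. For fixed change points, the hazard in each interval is estimated as the number of deaths divided by the total time at risk in that interval, and the profile log-likelihood is then evaluated over a grid of candidate change points. The function name `pw_mle`, the toy data, and the candidate grid are assumptions made for illustration; this is not the code used for the SEER analysis.

```r
# Closed-form hazard estimates for fixed change points: deaths / time at risk
pw_mle <- function(time, status, tau) {
  lower <- c(0, tau)    # tau_0 = 0
  upper <- c(tau, Inf)  # tau_{k+1} = Inf
  k1 <- length(lower)
  deaths <- exposure <- numeric(k1)
  for (j in seq_len(k1)) {
    # deaths observed in [lower_j, upper_j)
    deaths[j] <- sum(status == 1 & time >= lower[j] & time < upper[j])
    # time each subject spends at risk inside the interval
    exposure[j] <- sum(pmax(0, pmin(time, upper[j]) - lower[j]))
  }
  alpha <- deaths / exposure
  # profile log-likelihood; intervals without deaths contribute zero
  pl <- sum(ifelse(deaths > 0, deaths * log(alpha), 0)) - sum(deaths)
  list(alpha = alpha, deaths = deaths, exposure = exposure,
       profile_loglik = pl)
}

# Toy data (assumed): months of follow-up and death indicator (1 = death)
time   <- c(5, 12, 18, 25, 30, 33, 45, 50, 58, 60)
status <- c(1,  0,  1,  1,  0,  1,  1,  0,  1,  0)

fit <- pw_mle(time, status, tau = c(20, 40))
fit$alpha

# Profile search for a single change point over a grid of candidates
grid <- seq(10, 50, by = 5)
pl <- sapply(grid, function(g) pw_mle(time, status, g)$profile_loglik)
grid[which.max(pl)]
```

The grid is deliberately kept away from the extremes of follow-up, since the likelihood of a change-point hazard model is known to behave poorly when a candidate change point approaches the largest observed times.[8],[9]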

Now, the objective is to identify the changes in $\tau_j$. This can be confirmed through the hypothesis test of $H_0: \tau_j = \tau_{j-1}$ against $H_1: \tau_j \neq \tau_{j-1}$. The representation of $\tau_j$ can be prepared by different factors; in this work, it is taken with respect to age. It has been shown that $\hat{\tau}_j$ and $\hat{\tau}_{j-1}$ are independent in nature.[7] The Wald-type test statistic is as follows:

$$W = \frac{(\hat{\tau}_j - \hat{\tau}_{j-1})^2}{\widehat{\mathrm{Var}}(\hat{\tau}_j - \hat{\tau}_{j-1})}$$

Under the null hypothesis, it follows the chi-square distribution with one degree of freedom. We wrote an R function, called WaldTest(), to compute the test statistic. Its source code is reported in Appendix A.
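A small worked example of the Wald-type comparison may help. The sketch below assumes that each interval estimate is a hazard rate of the form deaths divided by time at risk, and that its large-sample variance is approximated by deaths divided by the squared time at risk; the article does not state which variance estimator was used, so this choice, the function name `wald_adjacent`, and the input numbers are all illustrative assumptions.

```r
# Wald-type comparison of hazard rates in two adjacent intervals.
# d1, d2: deaths; e1, e2: total time at risk (assumed inputs).
wald_adjacent <- function(d1, e1, d2, e2) {
  h1 <- d1 / e1
  h2 <- d2 / e2
  # assumed variance of h1 - h2, treating the two estimates as independent
  v <- d1 / e1^2 + d2 / e2^2
  W <- (h1 - h2)^2 / v
  p <- pchisq(W, df = 1, lower.tail = FALSE)
  out <- c(W, 1, p)
  names(out) <- c("W", "df", "P value")
  out
}

# Hypothetical counts for two adjacent follow-up intervals
wald_adjacent(d1 = 120, e1 = 10000, d2 = 150, e2 = 9000)
```

Under independence, the variance of the difference is the sum of the two variances, which is why the denominator adds the two terms.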

Data analysis

The proposed method is explored with prostate cancer data from the SEER Program Public-Use Data (1973–2014), National Cancer Institute, with follow-up until December 2014. The data set includes the cancer incidence and survival status of the patients. There are several causes of death among patients included in these data. However, we only consider deaths due to prostate cancer and censored cases. Deaths due to other causes are excluded from this analysis. Other subsites recorded in this data set, such as “Prepuce,” “Glans penis,” “Body of penis,” “Overlapping lesion of penis,” “Penis, NOS,” “Prostate gland,” “Undescended testis,” “Descended testis,” “Testis, NOS,” “Epididymis,” “Spermatic cord,” “Scrotum, NOS,” “Other specified parts of male genital organs,” “Overlapping lesion of male genital organs,” and “Male genital organs, NOS,” are excluded from this analysis to maintain consistency as much as possible.

Results

A total of 610,814 patients are included in this study. Registered patients who died due to prostate cancer or were censored are included. Initially, we prepared descriptive statistics to check the occurrence of prostate cancer with respect to age. Very few cases have an age at diagnosis of up to 40 years; at some ages at diagnosis, the count of prostate cancer cases is zero or very small. Our intention was to present the hazard rate due to prostate cancer for each age at diagnosis. However, this is not feasible because of the zero or very small counts at some ages at diagnosis, even though the sample is very large. These small counts were expressed as percentages of the cohort size, that is, 611,133, and many of them round to zero at two decimal places. Only ages at diagnosis with a cumulative frequency of 0.01 or more in the cohort of 611,133 are included in this study. Finally, only patients with an age at diagnosis of at least 40 years are included. The number of cases and the death rate at different ages at diagnosis are shown in [Figure 1]. The counts of cases and deaths are presented in [Table 1]. In the next step, we split the duration of survival into different survival intervals $\tau_k$, where $\tau_1$ represents 0–20 months and $\tau_2$ represents 20–40 months. Under the null hypothesis, it is assumed that $\tau_k = \tau_{k-1}$, where k = 1 to 11. However, to avoid the multiple testing problem, the hypothesis tests are performed with $\tau_k > \tau_{k-1}$ and k = 1 to 11. The upper limit of k is set to 39 (i.e., 39 years) because the maximum duration of follow-up with death occurrences in this data set is 468 months, that is, 39 years. Therefore, a total of 38 survival intervals are generated with a 12-month window from the observed duration of survival.
The piecewise hazard estimates with their 95% lower control limit (LCL) and upper control limit (UCL) are presented in [Figure 2]. The numerical outputs are presented in [Table 2]. There are four different grades. The piecewise hazard estimates adjusted for grade are presented in [Table 3]. The results show no significant changes in the piecewise hazard estimates between different ages at prostate cancer diagnosis. In the initial period of follow-up, the hazard rates are almost equal across the intervals and not significantly different at any age at diagnosis. A few significant changes are observed in the later intervals, from interval 25 to interval 39 onward. However, in most of these cases, the interval estimates are not significantly different once the upper and lower confidence limits are taken into account. Hence, our null hypothesis is not rejected.
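The interval-wise calculation described above can be sketched in R as follows: follow-up is cut into fixed 12-month windows, and the hazard in each window is the number of deaths divided by the time at risk. The function name `pw_table`, the toy data, and the log-scale approximation used for the 95% limits are assumptions for illustration; the article does not state how its LCL and UCL were computed.

```r
# Piecewise hazard per fixed-width window, with approximate 95% limits
pw_table <- function(time, status, width = 12) {
  # windows [lower, upper) wide enough to cover the longest follow-up
  breaks <- seq(0, (floor(max(time) / width) + 1) * width, by = width)
  lower <- breaks[-length(breaks)]
  upper <- breaks[-1]
  out <- data.frame(lower = lower, upper = upper,
                    deaths = NA_real_, exposure = NA_real_)
  for (j in seq_along(lower)) {
    out$deaths[j] <- sum(status == 1 & time >= lower[j] & time < upper[j])
    out$exposure[j] <- sum(pmax(0, pmin(time, upper[j]) - lower[j]))
  }
  out$hazard <- out$deaths / out$exposure
  # assumed approximation: se(log hazard) = 1 / sqrt(deaths);
  # the limits are undefined for windows without deaths
  se <- 1 / sqrt(out$deaths)
  out$LCL <- out$hazard * exp(-1.96 * se)
  out$UCL <- out$hazard * exp(1.96 * se)
  out
}

# Toy data (assumed): months of follow-up and death indicator (1 = death)
time   <- c(3, 8, 14, 20, 22, 27, 31, 35)
status <- c(1, 0,  1,  1,  0,  1,  0,  1)
tab <- pw_table(time, status, width = 12)
tab
```

A patient contributes time at risk to every window entered, so the exposure column sums to the total follow-up time in the data.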
Figure 1: Distribution of age at diagnosis, number of prostate cancer cases, and death due to prostate cancer

Table 1: Prostate cancer occurrence and death presentation at different ages at diagnosis

Figure 2: Piecewise hazard rate estimated in different survival duration intervals

Table 2: Piecewise hazard ratio estimates in different survival intervals in months

Table 3: Piecewise Hazard Ratio Estimates in Different Survival Intervals in Months for Grade I, II, III and IV


In the final step, we computed age-adjusted piecewise hazard estimates to test the real impact of age at diagnosis on the hazard rate in prolonged survival of prostate cancer. Patients with an age at diagnosis of 40 years and above are considered in this step, and patients aged 80 years and above at diagnosis are grouped into a single category. The maximum age at diagnosis observed is 107 years. In this step, the duration of survival is split into a maximum of 11 intervals $\tau_k$, where $\tau_1$ represents 0–20 months and $\tau_2$ represents 20–40 months. Fewer intervals are used than in the earlier step because fewer patients fall into each survival interval once the data are stratified by age. In addition, in some survival intervals, the estimates could not be generated because of the limited number of cases. This occurred for the later survival intervals, not the initial ones. The problem is overcome by extending the survival interval to a longer window. For example, if the piecewise hazard estimate could not be generated for the interval between 280 and 300 months, the interval is extended to 280–320 months and the piecewise hazard is generated for that window. If the estimate still could not be generated, the interval is further extended to 280–340 months. A total of 10 intervals are generated. The corresponding piecewise hazard estimates are provided in [Figure 3]. A similar hypothesis is assumed, with $\tau_k = \tau_{k-1}$, where k = 1 to 11.
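The interval-extension rule described above can be sketched in R. Starting from equal 20-month windows, a window is closed only once it contains a minimum number of deaths; otherwise it is extended by another 20 months. The function name `merge_sparse`, the threshold `min_deaths`, the toy data, and the choice to merge a sparse final window backward into the previous one are assumptions for illustration and are not taken from the article.

```r
# Extend sparse follow-up windows until each holds at least min_deaths deaths
merge_sparse <- function(time, status, width = 20, min_deaths = 1) {
  # initial equal-width edges, wide enough to cover the longest follow-up
  edges <- seq(0, (floor(max(time) / width) + 1) * width, by = width)
  count <- function(lo, hi) sum(status == 1 & time >= lo & time < hi)
  keep <- 0  # cut points retained so far
  lo <- 0
  for (hi in edges[-1]) {
    # close the current window only once it holds enough deaths
    if (count(lo, hi) >= min_deaths) {
      keep <- c(keep, hi)
      lo <- hi
    }
  }
  last_edge <- edges[length(edges)]
  if (keep[length(keep)] < last_edge) {
    # assumed rule: a sparse final window is merged into the previous one
    if (length(keep) > 1) keep[length(keep)] <- last_edge
    else keep <- c(keep, last_edge)
  }
  lower <- keep[-length(keep)]
  upper <- keep[-1]
  deaths <- mapply(count, lower, upper)
  exposure <- sapply(seq_along(lower), function(j)
    sum(pmax(0, pmin(time, upper[j]) - lower[j])))
  data.frame(lower = lower, upper = upper, deaths = deaths,
             exposure = exposure, hazard = deaths / exposure)
}

# Toy data (assumed): months of follow-up and death indicator (1 = death)
time   <- c(5, 12, 18, 25, 30, 33, 45, 50, 58, 60)
status <- c(1,  0,  1,  1,  0,  1,  1,  0,  1,  0)
res <- merge_sparse(time, status, width = 20, min_deaths = 2)
res
```

Raising `min_deaths` produces fewer and wider windows, which mirrors the trade-off in the text between the number of intervals and the ability to estimate a hazard in each of them.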
Figure 3: Piecewise hazard estimates in different ages at diagnosis


The hazard rates with 95% LCL and UCL are presented in [Figure 3]. The numerical outputs are presented in [Table 4] and [Table 5]. [Figure 3] shows that in the initial period of follow-up, the hazard rates are higher in older patients. As the duration of survival moves from 20–40 months to 40–60 months, the hazard rate in older patients starts to decline, whereas the hazard rate in younger patients increases steadily with the duration of survival. By the end of follow-up, the hazard rates in younger and older patients are similar. It can be concluded that prostate cancer is more fatal in older patients soon after diagnosis. Over longer durations, however, it becomes more fatal in younger patients than in older patients.
Table 4: Piecewise Hazard Ratio Estimates in Different Survival Intervals in Months

Table 5: Piecewise Hazard Ratio Estimates in Different Survival Intervals in Months (Continued)


Discussion

Applications of the piecewise proportional hazard model are very limited. It has been applied to determine the effect of hormone therapy in women's health.[14] It has also been used to compare infant and early childhood mortality rates.[15] The risks associated with home hemodialysis utilization in Canada were compared through a piecewise proportional hazard function.[16] It is always better to start with the conventional hazard rate, because it is conditional in nature and easy to handle with time-dependent treatment.[17],[18] However, it is not suitable for multiple timescales.[19] The piecewise Poisson model is found suitable for working with multiple timescales to evaluate the impact of an event by the likelihood ratio test.[9] In this work, the likelihood-based estimates were likewise compared through Wald-type test statistics.

The estimates of the hazard function can also be used to develop a prediction score. This would provide another dimension for establishing therapeutic effect and may be important for health policy decisions. With an enhanced understanding of how the hazard function is estimated at different time points, the estimation procedure can be improved.

By analyzing the change in hazard for different age groups in the SEER data, we can establish the different phases of mortality risk in prostate cancer patients. We identified that patients aged more than 40 years are highly affected by prostate cancer death; death due to prostate cancer becomes influential from 40 years of age onward.

The duration of follow-up in prostate cancer patients is relatively long. However, interpreting the causes of death among prostate cancer patients is relatively difficult in comparison with other types of cancer, because during a prolonged follow-up period patients can be exposed to several other causes of death, which may jointly or separately shorten the duration of survival. It is assumed that the longer patients survive, the more causes of death they are exposed to. For this reason, the age of the patient is considered as a separate factor in this study. Time-varying effects and biologically plausible interactions also need to be considered. The resulting model can be complex, and the piecewise hazard function can be an appropriate tool.

One recent study on SEER data confirmed that patients with conservatively managed, localized, well-to-moderately differentiated prostate cancer had an 8%–9% incidence of mortality within 10 years from the date of diagnosis.[20] It was also concluded that the majority of prostate cancer patients die from other causes. Other factors, such as lifestyle, need to be modified.[21]

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.

Appendix A

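# Wald-type test for the difference between two adjacent piecewise hazard estimates.
# L: contrast matrix, e.g. rbind(c(1, -1)); thetahat: c(tau1, tau2); v: variance of tau1 - tau2.
# The arguments thetahat and v are assumed here; the published listing shows only L.
WaldTest = function (L, thetahat, v)
{
  WaldTest = numeric (3)
  names (WaldTest) = c("W", "df", "P value")
  r = dim (L)[1]  # number of contrasts, reported as the degrees of freedom
  tau1 = thetahat[1]; tau2 = thetahat[2]
  W = ((tau1 - tau2)^2)/v
  W = as.numeric(W)
  pval = 1 - pchisq(W, 1)
  WaldTest[1] = W; WaldTest[2] = r; WaldTest[3] = pval
  WaldTest  # return W, df and P value
} # End function WaldTest

LL = rbind (c(1, -1)); LL
thetahat = c(1, 1)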

References

1. Parikh RR, Kim S, Stein MN, Haffty BG, Kim IY, Goyal S, et al. Trends in active surveillance for very low-risk prostate cancer: Do guidelines influence modern practice? Cancer Med 2017;6:2410-8.
2. Badar F, Mahmood S. Epidemiology of cancers in Lahore, Pakistan, among children, adolescents and adults, 2010-2012: A cross-sectional study part 2. BMJ Open 2017;7:e016559.
3. Watts EL, Appleby PN, Albanes D, Black A, Chan JM, Chen C, et al. Circulating sex hormones in relation to anthropometric, sociodemographic and behavioural factors in an international dataset of 12,300 men. PLoS One 2017;12:e0187741.
4. Gijbels I, Gürler U. Estimation of a change point in a hazard function based on censored data. Lifetime Data Anal 2003;9:395-411.
5. Goodman MS, Li Y, Tiwari RC. Detecting multiple change points in piecewise constant hazard functions. J Appl Stat 2011;38:2523-32.
6. Kim HJ, Fay MP, Feuer EJ, Midthune DN. Permutation tests for joinpoint regression with applications to cancer rates. Stat Med 2000;19:335-51.
7. Yao YC. Maximum likelihood estimation in hazard rate models with a change-point. Commun Stat Theory Methods 1986;15:2455-66.
8. Henderson R. A problem with the likelihood ratio test for a change-point hazard rate model. Biometrika 1990;77:835-43.
9. Matthews DE, Farewell VT. On a singularity in the likelihood for a change-point hazard rate model. Biometrika 1985;72:703-4.
10. Nguyen HT, Rogers GS, Walker EA. Estimation in change-point hazard rate models. Biometrika 1984;71:299-304.
11. Rebora P, Galimberti S, Valsecchi MG. Using multiple timescale models for the evaluation of a time-dependent treatment. Stat Med 2015;34:3648-60.
12. Pepe MS, Mori M. Kaplan-Meier, marginal or conditional probability curves in summarizing competing risks failure time data? Stat Med 1993;12:737-51.
13. Walke R. Example for a Piecewise Constant Hazard Data Simulation in R. Max Planck Institute for Demographic Research; 2010. Available from: [Last accessed details on 2018 Jul 07].
14. Yang S, Prentice RL. Assessing potentially time-dependent treatment effect from clinical trials and observational studies for survival data, with applications to the women's health initiative combined hormone therapy trial. Stat Med 2015;34:1801-17.
15. Kuate Defo B. Determinants of infant and early childhood mortality in Cameroon: The role of socioeconomic factors, housing characteristics, and immunization status. Soc Biol 1994;41:181-211.
16. Perl J, Na Y, Tennankore KK, Chan CT. Temporal trends and factors associated with home hemodialysis technique survival in Canada. Clin J Am Soc Nephrol 2017. pii: CJN.13271216.
17. Mantel N, Byar DP. Evaluation of response-time data involving transient states: An illustration using heart-transplant data. J Am Stat Assoc 1974;69:81-6.
18. Anderson JR, Cain KC, Gelber RD. Analysis of survival by tumor response. J Clin Oncol 1983;1:710-9.
19. Rebora P, Salim A, Reilly M. Bshazard: A flexible tool for nonparametric smoothing of the hazard function. R J 2014;6:114-22.
20. Lu-Yao GL, Albertsen PC, Moore DF, Shih W, Lin Y, DiPaola RS, et al. Outcomes of localized prostate cancer following conservative management. JAMA 2009;302:1202-9.
21. Epstein MM, Edgren G, Rider JR, Mucci LA, Adami HO. Temporal trends in cause of death among Swedish and US men with prostate cancer. J Natl Cancer Inst 2012;104:1335-42.

