1. 程式人生 > >用survminer優雅的畫生存曲線

用survminer優雅的畫生存曲線

生存曲線繪製

#Survival Curves
#The ggsurvplot() function creates ggplot2 plots from survfit objects.
data(lung)
head(lung)
  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90
1225 15 3 3 1010 1 56 1 0 90 90 NA 15 4 5 210 2 57 1 1 90 60 1150 11 5 1 883 2 60 1 0 100 90 NA 0 6 12 1022 1 74 1 1 50 80 513 0 table(lung$sex) 1
2 138 90 fit <- survfit(Surv(time,status)~ sex, data = lung) class(fit) [1] "survfit" library(survminer) ggsurvplot(fit,data = lung)

這裡寫圖片描述

#Use the fun argument to set the transformation of the survival curve
#"event" for cumulative events, "cumhaz" for the cumulative hazard function or "pct"
for survival probability in percentage. ggsurvplot(fit, data = lung, fun = "event") ggsurvplot(fit, data = lung, fun = "cumhaz") ggsurvplot(fit, data = lung, fun = "pct")#預設pct

這裡寫圖片描述 這裡寫圖片描述

#很多的引數可以調,position and content of the legend; additional annotations like p-value, title, subtitle.
ggsurvplot(fit, data = lung,
           conf.int = TRUE,#新增置信區間
           pval = TRUE,#新增P值
           fun = "pct",
           risk.table = TRUE,
           size = 1,
           linetype = "strata",
           palette = c("#E7B800","#2E9FDF"),
           legend = "bottom",
           legend.title = "Sex",
           legend.labs = c("Male","Female"))

這裡寫圖片描述

Diagnostics of Cox Model

Cox模型的診斷一般包括三方面的內容:

  • 比例風險假定;
  • 模型影響點(異常值)識別;
  • 比例風險的對數值與協變數之間的非線性關係識別; 對上述三方面的診斷,常見的方法為殘差法。

  • Schoenfeld殘差用於檢驗比例風險假定;

  • Deviance殘差用於影響點(異常值)識別;
  • Martingale殘差用於非線性檢驗;

The function cox.zph() from survival package may be used to test the proportional hazards assumption for a Cox regression model fit. The graphical verification of this assumption may be performed with the function ggcoxzph() from the survminer package. For each covariate it produces plots with scaled Schoenfeld residuals against the time.

模型診斷——PH假定。PH假定可通過假設檢驗和殘差圖檢驗。正常情況下,Schoenfeld殘差應該與時間無關,如果殘差與時間有相關趨勢,則違反PH假設的證據。殘差圖上,橫軸代表時間,如果殘差均勻的分佈則,表示殘差與時間相互獨立。

library(survival)
fit <- coxph(Surv(time, status) ~ sex + age, data = lung)
ftest <- cox.zph(fit)
ftest
           rho chisq     p
sex     0.1236 2.452 0.117
age    -0.0275 0.129 0.719
GLOBAL      NA 2.651 0.266
#繪圖
library(survminer)
ggcoxzph(ftest)

這裡寫圖片描述 從上面的結果可以看出,三個變數的P值都大於0.05,說明每個變數均滿足PH檢驗,而模型的整體檢驗P值0.266,模型整體滿足PH檢驗。上圖中實線是擬合的樣條平滑曲線,虛線表示擬合曲線上下2個單位的標準差。如果曲線偏離2個單位的標準差則表示不滿足比例風險假定。從上圖中可見,各協變數滿足PH風險假設。

模型診斷——模型影響點(異常值)識別

我們可以通過繪製Deviance殘差圖或者dfbeta值實現上述診斷。在R語言survminer中ggcoxdiagnostics()函式可以畫出Deviance殘差圖。 The function ggcoxdiagnostics() plots different types of residuals as a function of time, linear predictor or observation id. The type of residual is selected with type argument. Possible values are “martingale”, “deviance”, “score”, “schoenfeld”, “dfbeta”’, “dfbetas”, and “scaledsch”. The ox.scale argument defines what shall be plotted on the OX axis. Possible values are “linear.predictions”, “observation.id”, “time”. Logical arguments hline and sline may be used to add horizontal line or smooth line to the plot.

library("survival")
library("survminer")
fit <- coxph(Surv(time, status) ~
               sex + age, data = lung)
ggcoxdiagnostics(fit,
                 type = "deviance",
                 ox.scale = "linear.predictions")
ggcoxdiagnostics(fit,
                 type = "schoenfeld",
                 ox.scale = "time")

這裡寫圖片描述 殘差值均勻的分佈在0上下,表明滿足上述假定。

Summary of Cox Model

The function ggforest() from the survminer package creates a forest plot for a Cox regression model fit. Hazard ratio estimates along with confidence intervals and p-values are plotter for each variable.

> library("survival")
> library("survminer")
> lung$age <- ifelse(lung$age > 70, ">70","<= 70")
> fit <- coxph( Surv(time, status) ~ sex + ph.ecog + age, data = lung)
> fit
Call:
coxph(formula = Surv(time, status) ~ sex + ph.ecog + age, data = lung)

          coef exp(coef) se(coef)     z       p
sex     -0.567     0.567    0.168 -3.37 0.00075
ph.ecog  0.470     1.600    0.113  4.16 3.1e-05
age>70   0.307     1.359    0.187  1.64 0.10175

Likelihood ratio test=31.59  on 3 df, p=6e-07
n= 227, number of events= 164 
   (1 observation deleted due to missingness)

這裡寫圖片描述

校正生存曲線繪製

The function ggcoxadjustedcurves() from the survminer package plots Adjusted Survival Curves for Cox Proportional Hazards Model. Adjusted Survival Curves show how a selected factor influences survival estimated from a Cox model. Note that these curves differ from Kaplan Meier estimates since they present expected ssurvival based on given Cox model.

library("survival")
library("survminer")
lung$sex <- ifelse(lung$sex == 1,"Male", "Female")
fit <- coxph(Surv(time, status) ~sex + ph.ecog + age,data = lung)
ggcoxadjustedcurves(fit, data=lung,variable=lung$sex)

Note that it is not necessary to include the grouping factor in the Cox model. Survival curves are estimated from Cox model for each group defined by the factor independently.

lung$age3 <- cut(lung$age, c(35,55,65,85))
ggcoxadjustedcurves(fit, data=lung, variable=lung$age3)