Title: | Regression Models for Event History Outcomes |
---|---|
Description: | A user friendly, easy to understand way of doing event history regression for marginal estimands of interest, including the cumulative incidence and the restricted mean survival, using the pseudo observation framework for estimation. For a review of the methodology, see Andersen and Pohar Perme (2010) <doi:10.1177/0962280209105020> or Sachs and Gabriel (2022) <doi:10.18637/jss.v102.i09>. The interface uses the well known formulation of a generalized linear model and allows for features including plotting of residuals, the use of sampling weights, and corrected variance estimation. |
Authors: | Michael C Sachs [aut, cre], Erin E Gabriel [aut], Morten Overgaard [ctb] (Corrected variance calculation), Thomas A Gerds [ctb] (Fast computation of leave one out cumulative incidence), Terry Therneau [ctb] (Restricted mean computation) |
Maintainer: | Michael C Sachs <[email protected]> |
License: | GPL-3 |
Version: | 1.4.4 |
Built: | 2025-01-04 04:41:46 UTC |
Source: | https://github.com/sachsmc/eventglm |
Compute inverse probability of censoring weights pseudo observations
calc_ipcw_pos(mr, time, causen, type, ipcw.method, Gi)
calc_ipcw_pos(mr, time, causen, type, ipcw.method, Gi)
mr |
Model response object returned by Surv |
time |
Max time |
causen |
Cause of interest (numeric) |
type |
Outcome type, "cuminc", "survival", or "rmean" |
ipcw.method |
"binder" or "hajek" |
Gi |
vector of estimated censoring probabilities |
Censoring model must take the same named arguments as the predefined modules (though they do not all have to be used), and return a vector of pseudo observations.
check_mod_cens(model.censoring)
check_mod_cens(model.censoring)
model.censoring |
censoring model specification as character or function |
These are data from one of the first successful trials of adjuvant chemotherapy for colon cancer. Levamisole is a low-toxicity compound previously used to treat worm infestations in animals; 5-FU is a moderately toxic (as these things go) chemotherapy agent. There are only one record per patient for the death outcome (or censoring). This is redistributed from the survival package, with a small modification to include only the death outcome.
colon
colon
A data frame with 929 rows and 17 variables:
id
1 for all patients
Treatment - Obs(ervation), Lev(amisole), Lev(amisole)+5-FU
1=male
in years
obstruction of colon by tumour
perforation of colon
adherence to nearby organs
number of lymph nodes with detectable cancer
days until death or censoring
censoring status
differentiation of tumour (1=well, 2=moderate, 3=poor)
Extent of local spread (1=submucosa, 2=muscle, 3=serosa, 4=contiguous structures)
time from surgery to registration (0=short, 1=long)
more than 4 positive lymph nodes
event type: 1=recurrence,2=death
event indicator: censored, death
Computes Wald confidence intervals for one or more parameters in a fitted model. Users can specify the type of variance estimate used, with the default being the robust sandwich variance estimator.
## S3 method for class 'pseudoglm' confint(object, parm, level = 0.95, type = "robust", ...)
## S3 method for class 'pseudoglm' confint(object, parm, level = 0.95, type = "robust", ...)
object |
|
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
type |
The type of variance estimate to use, see vcov.pseudoglm |
... |
Not used |
A matrix (or vector) with columns giving lower and upper confidence limits for each parameter. These will be labelled as (1-level)/2 and 1 - (1-level)/2 in
cumincipcw <- cumincglm(survival::Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) confint(cumincipcw)
cumincipcw <- cumincglm(survival::Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) confint(cumincipcw)
Using pseudo observations for the cumulative incidence, this function then
runs a generalized linear model and estimates the parameters representing
contrasts in the cumulative incidence at a particular set of times (specified
by the time
argument) across covariate values. The link function can
be "identity" for estimating differences in the cumulative incidence, "log"
for estimating ratios, and any of the other link functions supported by
quasi.
cumincglm( formula, time, cause = 1, link = "identity", model.censoring = "independent", formula.censoring = NULL, ipcw.method = "binder", data, survival = FALSE, weights, subset, na.action, offset, control = list(...), model = FALSE, x = TRUE, y = TRUE, singular.ok = TRUE, contrasts = NULL, id, ... )
cumincglm( formula, time, cause = 1, link = "identity", model.censoring = "independent", formula.censoring = NULL, ipcw.method = "binder", data, survival = FALSE, weights, subset, na.action, offset, control = list(...), model = FALSE, x = TRUE, y = TRUE, singular.ok = TRUE, contrasts = NULL, id, ... )
formula |
A formula specifying the model. The left hand side must be a Surv object specifying a right censored survival or competing risks outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. If there are multiple time points, the special term "tve(.)" can be used to specify that the effect of the variable inside the parentheses will be time varying. In the output this will be represented as the interaction between the time points and the variable. |
time |
Numeric vector specifying the times at which the cumulative incidence or survival probability effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
link |
Link function for the cumulative incidence regression model. |
model.censoring |
Type of model for the censoring distribution. Options are "stratified", which computes the pseudo-observations stratified on a set of categorical covariates, "aareg" for Aalen's additive hazards model, and "coxph" for Cox's proportional hazards model. With those options, we assume that the time to event and event indicator are conditionally independent of the censoring time, and that the censoring model is correctly specified. If "independent", we assume completely independent censoring, i.e., that the time to event and covariates are independent of the censoring time. the censoring time is independent of the covariates in the model. Can also be a custom function, see Details and the "Extending eventglm" vignette. |
formula.censoring |
A one sided formula (e.g., |
ipcw.method |
Which method to use for calculation of inverse probability of censoring weighted pseudo observations. "binder" the default, uses the number of observations as the denominator, while the "hajek" method uses the sum of the weights as the denominator. |
data |
Data frame in which all variables of formula can be interpreted. |
survival |
Set to TRUE to use survival (one minus the cumulative incidence) as the outcome. Not available for competing risks models. |
weights |
an optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector found in data. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset. If length(time) > 1, then any offset terms must appear in the formula. |
control |
a list of parameters for controlling the fitting process. This is passed to glm.control. |
model |
a logical value indicating whether model frame should be included as a component of the returned value. |
x |
logical value indicating whether the model matrix used in the fitting process should be returned as components of the returned value. |
y |
logical value indicating whether the response vector (pseudo-observations) used in the fitting process should be returned as components of the returned value. |
singular.ok |
logical; if FALSE a singular fit is an error. |
contrasts |
an optional list. See the contrasts.arg of model.matrix.default. |
id |
An optional integer vector of subject identifiers, found in data. If this is present, then generalized estimating equations will be used to fit the model. This can be used, for example, if there are multiple observations per individual represented as multiple rows in data. |
... |
Other arguments passed to glm.fit |
The argument "model.censoring" determines how the pseudo observations are calculated. This can be the name of a function or the function itself, which must have arguments "formula", "time", "cause", "data", "type", "formula.censoring", and "ipcw.method". If it is the name of a function, this code will look for a function with the prefix "pseudo_" first, to avoid clashes with related methods such as coxph. The function then must return a vector of pseudo observations, one for each subject in data which are used in subsequent calculations. For examples of the implementation, see the "pseudo-modules.R" file, or the vignette "Extending eventglm".
A pseudoglm object, with its own methods for print, summary, and vcov. It inherits from glm, so predict and other glm methods are supported.
cumincipcw <- cumincglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) # stratified on only the categorical covariate cumincipcw2 <- cumincglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "stratified", formula.censoring = ~ sex, data = mgus2) # multiple time points cuminct2 <- cumincglm(Surv(etime, event) ~ age + sex, time = c(50, 100, 200), cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) cuminct3 <- cumincglm(Surv(etime, event) ~ age + tve(sex), time = c(50, 100, 200), cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2)
cumincipcw <- cumincglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) # stratified on only the categorical covariate cumincipcw2 <- cumincglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "stratified", formula.censoring = ~ sex, data = mgus2) # multiple time points cuminct2 <- cumincglm(Surv(etime, event) ~ age + sex, time = c(50, 100, 200), cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) cuminct3 <- cumincglm(Surv(etime, event) ~ age + tve(sex), time = c(50, 100, 200), cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2)
A user friendly, easy to understand way of doing event history regression for marginal estimands of interest, including the cumulative incidence and the restricted mean survival, using the pseudo observation framework for estimation. The interface uses the well known formulation of a generalized linear model and allows for features including plotting of residuals, the use of sampling weights, and corrected variance estimation.
Sachs MC, Gabriel EE (2022). "Event History Regression with Pseudo-Observations: Computational Approaches and an Implementation in R." Journal of Statistical Software, 102(9), 1-34. <doi:10.18637/jss.v102.i09>
Utility to get jackknife pseudo observations of cumulative incidence
get_pseudo_cuminc(marginal.estimate, time, cause, mr)
get_pseudo_cuminc(marginal.estimate, time, cause, mr)
marginal.estimate |
A survfit object with no covariates |
time |
Time at which to calculate the obs |
cause |
which cause |
mr |
Model response of the survival object |
A vector of pseudo-observations
Utility to get jackknife pseudo observations of restricted mean
get_pseudo_rmean(marginal.estimate, time, cause, mr)
get_pseudo_rmean(marginal.estimate, time, cause, mr)
marginal.estimate |
A survfit object with no covariates |
time |
Time at which to calculate the obs |
cause |
which cause |
mr |
Model response of the survival object |
A vector of pseudo-observations
Compute jackknife pseudo-observations of the cause-specific cumulative incidence for competing risks
jackknife.competing.risks2(object, times, cause, mr)
jackknife.competing.risks2(object, times, cause, mr)
object |
A survfit object, with competing events |
times |
Times at which the cumulative incidence is computed, must be length 1 |
cause |
Value indicating for which cause the cumulative incidence is to be computed, it must match one of the values available in object (see example) |
mr |
Model response, the result of a call to Surv, or a matrix with two columns: "time" (observed follow up time) and "status" (0 = censored, 1, ..., k = event types) |
A vector of jackknifed pseudo-observations of the cause-specific cumulative incidence at time times
sfit.cuminc <- survival::survfit(survival::Surv(etime, event) ~ 1, data = mgus2) mrs <- with(mgus2, Surv(etime, event)) pseudo.obs <- jackknife.competing.risks2(sfit.cuminc, times = 200, cause = "pcm", mrs) mean(pseudo.obs) # agrees with summary(sfit.cuminc, times = 200)
sfit.cuminc <- survival::survfit(survival::Surv(etime, event) ~ 1, data = mgus2) mrs <- with(mgus2, Surv(etime, event)) pseudo.obs <- jackknife.competing.risks2(sfit.cuminc, times = 200, cause = "pcm", mrs) mean(pseudo.obs) # agrees with summary(sfit.cuminc, times = 200)
Compute jackknife pseudo-observations of the survival function
jackknife.survival2(object, times, mr)
jackknife.survival2(object, times, mr)
object |
A survfit object, with a single event (no competing risks) |
times |
Times at which the survival is computed, must be length 1 |
mr |
Model response, the result of a call to Surv, or a matrix with two columns: "time" (observed follow up time) and "status" (0 = censored, 1 = event) |
A vector of jackknifed estimates of survival at time times
sfit.surv <- survival::survfit(survival::Surv(time, status) ~ 1, data = colon) mrs <- with(colon, Surv(time, status)) pseudo.obs <- jackknife.survival2(sfit.surv, times = 1000, mrs) mean(pseudo.obs) # agrees with summary(sfit.surv, times = 1000)
sfit.surv <- survival::survfit(survival::Surv(time, status) ~ 1, data = colon) mrs <- with(colon, Surv(time, status)) pseudo.obs <- jackknife.survival2(sfit.surv, times = 1000, mrs) mean(pseudo.obs) # agrees with summary(sfit.surv, times = 1000)
This version computes them for all times up to times
, for the restricted mean lifetime lost
leaveOneOut.competing.risks(object, times, cause, mr)
leaveOneOut.competing.risks(object, times, cause, mr)
object |
A survfit object, with competing events |
times |
Times at which the cumulative incidence is computed, must be length 1 |
cause |
Value indicating for which cause the cumulative incidence is to be computed, it must match one of the values available in object (see example) |
mr |
Model response, the result of a call to Surv, or a matrix with two columns: "time" (observed follow up time) and "status" (0 = censored, 1, ..., k = event types) |
A vector of jackknifed values of the cause-specific cumulative incidence at time times
sfit.cuminc <- survival::survfit(survival::Surv(etime, event) ~ 1, data = mgus2) mrs <- with(mgus2, Surv(etime, event)) jackvals <- leaveOneOut.competing.risks(sfit.cuminc, times = 200, cause = "pcm", mrs)
sfit.cuminc <- survival::survfit(survival::Surv(etime, event) ~ 1, data = mgus2) mrs <- with(mgus2, Surv(etime, event)) jackvals <- leaveOneOut.competing.risks(sfit.cuminc, times = 200, cause = "pcm", mrs)
Compute jackknife pseudo-observations of the cause-specific cumulative incidence for competing risks
leaveOneOut.competing.risks2(object, times, cause, mr)
leaveOneOut.competing.risks2(object, times, cause, mr)
object |
A survfit object, with competing events |
times |
Times at which the cumulative incidence is computed, must be length 1 |
cause |
Value indicating for which cause the cumulative incidence is to be computed, it must match one of the values available in object (see example) |
mr |
Model response, the result of a call to Surv, or a matrix with two columns: "time" (observed follow up time) and "status" (0 = censored, 1, ..., k = event types) |
A vector of jackknifed values of the cause-specific cumulative incidence at time times
sfit.cuminc <- survival::survfit(survival::Surv(etime, event) ~ 1, data = mgus2) mrs <- with(mgus2, Surv(etime, event)) jackvals <- leaveOneOut.competing.risks2(sfit.cuminc, times = 200, cause = "pcm", mrs)
sfit.cuminc <- survival::survfit(survival::Surv(etime, event) ~ 1, data = mgus2) mrs <- with(mgus2, Surv(etime, event)) jackvals <- leaveOneOut.competing.risks2(sfit.cuminc, times = 200, cause = "pcm", mrs)
For each subject, the survival function is recomputed leaving that subject out. This one does the calculation for all observed times, for calculation of the restricted mean
leaveOneOut.survival(object, times, mr)
leaveOneOut.survival(object, times, mr)
object |
A survfit object, with a single event (no competing risks) |
times |
Compute values at observed times up to and including this time |
mr |
Model response, the result of a call to Surv, or a matrix with two columns: "time" (observed follow up time) and "status" (0 = censored, 1 = event) |
A vector of jackknifed values of survival at time times
sfit.surv <- survival::survfit(survival::Surv(time, status) ~ 1, data = colon) mrs <- with(colon, Surv(time, status)) jackvals <- leaveOneOut.survival(sfit.surv, 1000, mrs)
sfit.surv <- survival::survfit(survival::Surv(time, status) ~ 1, data = colon) mrs <- with(colon, Surv(time, status)) jackvals <- leaveOneOut.survival(sfit.surv, 1000, mrs)
For each subject, the survival function is recomputed leaving that subject out. This is the workhorse for jackknife.survival2 and will generally not be called by the user.
leaveOneOut.survival2(object, times, mr)
leaveOneOut.survival2(object, times, mr)
object |
A survfit object, with a single event (no competing risks) |
times |
Times at which the survival is computed, must be length 1 |
mr |
Model response, the result of a call to Surv, or a matrix with two columns: "time" (observed follow up time) and "status" (0 = censored, 1 = event) |
A vector of jackknifed values of survival at time times
sfit.surv <- survival::survfit(survival::Surv(time, status) ~ 1, data = colon) mrs <- with(colon, Surv(time, status)) jackvals <- leaveOneOut.survival2(sfit.surv, times = 1000, mrs)
sfit.surv <- survival::survfit(survival::Surv(time, status) ~ 1, data = colon) mrs <- with(colon, Surv(time, status)) jackvals <- leaveOneOut.survival2(sfit.surv, times = 1000, mrs)
Match cause specification against model response
match_cause(mr, cause)
match_cause(mr, cause)
mr |
model.response as returned by Surv |
cause |
Numeric or string indicating the cause of interest |
Natural history of 1341 sequential patients with monoclonal gammopathy of undetermined significance (MGUS). This is a superset of the mgus data, at a later point in the accrual process. This dataset is redistributed from the survival package with an added competing risks event indicator.
mgus2
mgus2
A data frame with 1384 observations on the following 10 variables.
id
subject identifier
age
age at diagnosis, in years
sex
a factor with levels F
M
dxyr
year of diagnosis
hgb
hemoglobin
creat
creatinine
mspike
size of the monoclonal serum spike
ptime
time until progression to a plasma cell malignancy (PCM) or last contact, in months
pstat
occurrence of PCM: 0=no, 1=yes
futime
time until death or last contact, in months
death
occurrence of death: 0=no, 1=yes
etime
time until either death, pcm, or last contact
event
factor indicating which event occurred first
Print method for pseudoglm
## S3 method for class 'pseudoglm' print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'pseudoglm' print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
|
digits |
Number of significant digits |
... |
Not used |
x, invisibly
Assuming that the censoring depends on covariates, the pseudo observations are calculated with the inverse probability of censoring weighted approach, where the censoring probabilities are estimated using Aalen's additive hazards model.
pseudo_aareg( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
pseudo_aareg( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
formula |
A formula specifying the outcome model. The left hand side must be a Surv object specifying a right censored survival or competing risks outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. |
time |
Numeric constant specifying the time at which the cumulative incidence or survival probability effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
data |
Data frame in which all variables of formula can be interpreted. |
type |
One of "survival", "cuminc", or "rmean" |
formula.censoring |
A right-sided formula specifying which variables to use in the model for the censoring distribution. |
ipcw.method |
Which method to use for calculation of inverse probability of censoring weighted pseudo observations. "binder" the default, uses the number of observations as the denominator, while the "hajek" method uses the sum of the weights as the denominator. |
A vector of pseudo observations
POi <- pseudo_aareg(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "rmean", formula.censoring = ~ sex + age, ipcw.method = "binder") mean(POi)
POi <- pseudo_aareg(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "rmean", formula.censoring = ~ sex + age, ipcw.method = "binder") mean(POi)
Assuming that the censoring depends on covariates, the pseudo observations are calculated with the inverse probability of censoring weighted approach, where the censoring probabilities are estimated using Cox's proportional hazards model.
pseudo_coxph( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
pseudo_coxph( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
formula |
A formula specifying the outcome model. The left hand side must be a Surv object specifying a right censored survival or competing risks outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. |
time |
Numeric constant specifying the time at which the cumulative incidence or survival probability effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
data |
Data frame in which all variables of formula can be interpreted. |
type |
One of "survival", "cuminc", or "rmean" |
formula.censoring |
A right-sided formula specifying which variables to use in the model for the censoring distribution. |
ipcw.method |
Which method to use for calculation of inverse probability of censoring weighted pseudo observations. "binder" the default, uses the number of observations as the denominator, while the "hajek" method uses the sum of the weights as the denominator. |
A vector of pseudo observations
POi <- pseudo_coxph(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "survival", formula.censoring = ~ sex + age, ipcw.method = "hajek") mean(POi)
POi <- pseudo_coxph(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "survival", formula.censoring = ~ sex + age, ipcw.method = "hajek") mean(POi)
Assuming completely independent censoring, i.e., censoring does not depend on the survival time nor any covariates in the model, the pseudo observations are calculated with the standard jackknife approach
pseudo_independent( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
pseudo_independent( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
formula |
A formula specifying the model. The left hand side must be a Surv object specifying a right censored survival or competing risks outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. |
time |
Numeric constant specifying the time at which the cumulative incidence or survival probability effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
data |
Data frame in which all variables of formula can be interpreted. |
type |
One of "survival", "cuminc", or "rmean" |
formula.censoring |
Not used with this method, see pseudo_stratified, pseudo_aareg or pseudo_coxph |
ipcw.method |
Not used with this method |
A vector of jackknife pseudo observations
POi <- pseudo_independent(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "survival") mean(POi)
POi <- pseudo_independent(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "survival") mean(POi)
Assuming that the censoring depends on covariates with a finite set of levels, the pseudo observations are calculated with the infinitesimal jackknife approach stratified on those covariates. If no covariates are specified in the censoring model, then the pseudo observations are calculated under the completely independent censoring assumption. This function allows survival objects with entry and exit times, thus multi-state models, recurrent events, and delayed entry/left truncation. With delayed entry, the pseudo observation approach theoretically works under the assumption that the entry time is independent of covariates.
pseudo_infjack( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
pseudo_infjack( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
formula |
A formula specifying the outcome model. The left hand side must be a Surv object specifying a right censored survival, competing risks, counting process, or multistate outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks and multi state models, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. |
time |
Numeric constant specifying the time at which the cumulative incidence or survival probability effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
data |
Data frame in which all variables of formula can be interpreted. |
type |
One of "survival", "cuminc", or "rmean" |
formula.censoring |
A optional right-sided formula specifying which variables to stratify on. All variables in this formula must be categorical. |
ipcw.method |
Not used with this method |
A vector of pseudo observations
POi <- pseudo_infjack(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "survival", formula.censoring = ~ sex) mean(POi)
POi <- pseudo_infjack(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, type = "survival", formula.censoring = ~ sex) mean(POi)
Compute pseudo-observations for the restricted mean survival
pseudo_rmst2(sfit, jacks, times, tmax, type = "cuminc")
pseudo_rmst2(sfit, jacks, times, tmax, type = "cuminc")
sfit |
A survfit object |
jacks |
A matrix of leave-one-out jackknife values, subjects in the rows, times in the columns |
times |
Times at which the survival is calculated |
tmax |
Max time |
type |
"cuminc" or "survival" |
A vector of pseudo observations for the restricted mean or lifetime lost
Assuming that the censoring depends on covariates with a finite set of levels, the pseudo observations are calculated with the jackknife approach stratified on those covariates.
pseudo_stratified( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
pseudo_stratified( formula, time, cause = 1, data, type = c("cuminc", "survival", "rmean"), formula.censoring = NULL, ipcw.method = NULL )
formula |
A formula specifying the model. The left hand side must be a Surv object specifying a right censored survival or competing risks outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. |
time |
Numeric constant specifying the time at which the cumulative incidence or survival probability effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
data |
Data frame in which all variables of formula can be interpreted. |
type |
One of "survival", "cuminc", or "rmean" |
formula.censoring |
A right-sided formula specifying which variables to stratify on. All variables in this formula must be categorical. |
ipcw.method |
Not used with this method |
A vector of jackknife pseudo observations
POi <- pseudo_stratified(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, formula.censoring = ~ sex, type = "rmean") mean(POi)
POi <- pseudo_stratified(Surv(time, status) ~ 1, 1500, cause = 1, data = colon, formula.censoring = ~ sex, type = "rmean") mean(POi)
Computes residuals according to the recommendations of Pohar-Perme and Andersen (2009) <doi: 10.1002/sim.3401>.
## S3 method for class 'pseudoglm' residuals(object, type = NULL, ...)
## S3 method for class 'pseudoglm' residuals(object, type = NULL, ...)
object |
|
type |
Either "scaled" (the default for cumulative incidence outcomes) or one of the types available in residuals.glm for restricted mean outcomes, with the default being "deviance". |
... |
Arguments passed on to residuals.glm. |
The scaled residuals are computed as
When the outcome is the cumulative incidence, the denominator corresponds to an estimate of the standard error of the conditional estimate of the outcome in the absence of censoring. For the restricted mean, no such rescaling is done and the computation is passed off to residuals.glm.
A numeric vector of residuals
Perme MP, Andersen PK. Checking hazard regression models using pseudo-observations. Stat Med. 2008;27(25):5309-5328. <doi:10.1002/sim.3401>
Using pseudo observations for the restricted mean, or the restricted mean
lifetime lost in the competing risks case, this function then runs a
generalized linear model to estimate associations of the restricted
mean/lifetime lost up to a particular time (specified by the time
argument) with covariates. The link function can be "identity" for estimating
differences in the restricted mean, "log" for estimating ratios, and any of
the other link functions supported by quasi.
rmeanglm( formula, time, cause = 1, link = "identity", model.censoring = "independent", formula.censoring = NULL, ipcw.method = "binder", data, weights, subset, na.action, offset, control = list(...), model = FALSE, x = TRUE, y = TRUE, singular.ok = TRUE, contrasts = NULL, id, ... )
rmeanglm( formula, time, cause = 1, link = "identity", model.censoring = "independent", formula.censoring = NULL, ipcw.method = "binder", data, weights, subset, na.action, offset, control = list(...), model = FALSE, x = TRUE, y = TRUE, singular.ok = TRUE, contrasts = NULL, id, ... )
formula |
A formula specifying the model. The left hand side must be a Surv object specifying a right censored survival or competing risks outcome. The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For competing risks, the event variable will be a factor, whose first level is treated as censoring. The right hand side is the usual linear combination of covariates. |
time |
Numeric constant specifying the time up to which the restricted mean effect estimates are desired. |
cause |
Numeric or character constant specifying the cause indicator of interest. |
link |
Link function for the restricted mean regression model. |
model.censoring |
Type of model for the censoring distribution. Options are "stratified", which computes the pseudo-observations stratified on a set of categorical covariates, "aareg" for Aalen's additive hazards model, and "coxph" for Cox's proportional hazards model. With those options, we assume that the time to event and event indicator are conditionally independent of the censoring time, and that the censoring model is correctly specified. If "independent", we assume completely independent censoring, i.e., that the time to event and covariates are independent of the censoring time. the censoring time is independent of the covariates in the model. Can also be a custom function, see Details and the "Extending eventglm" vignette. |
formula.censoring |
A one sided formula (e.g., |
ipcw.method |
Which method to use for calculation of inverse probability of censoring weighted pseudo observations. "binder" the default, uses the number of observations as the denominator, while the "hajek" method uses the sum of the weights as the denominator. |
data |
Data frame in which all variables of formula can be interpreted. |
weights |
an optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset. |
control |
a list of parameters for controlling the fitting process. This is passed to glm.control. |
model |
a logical value indicating whether model frame should be included as a component of the returned value. |
x |
logical value indicating whether the model matrix used in the fitting process should be returned as components of the returned value. |
y |
logical value indicating whether the response vector (pseudo-observations) used in the fitting process should be returned as components of the returned value. |
singular.ok |
logical; if FALSE a singular fit is an error. |
contrasts |
an optional list. See the contrasts.arg of model.matrix.default. |
id |
An optional integer vector of subject identifiers, found in data. If this is present, then generalized estimating equations will be used to fit the model. This can be used, for example, if there are multiple observations per individual represented as multiple rows in data. |
... |
Other arguments passed to glm.fit |
The argument "model.censoring" determines how the pseudo observations are calculated. This can be the name of a function or the function itself, which must have arguments "formula", "time", "cause", "data", "type", "formula.censoring", and "ipcw.method". If it is the name of a function, this code will look for a function with the prefix "pseudo_" first, to avoid clashes with related methods such as coxph. The function then must return a vector of pseudo observations, one for each subject in data which are used in subsequent calculations. For examples of the implementation, see the "pseudo-modules.R" file, or the vignette "Extending eventglm".
A pseudoglm object, with its own methods for print, summary, and vcov. It inherits from glm, so predict and other glm methods are supported.
rmeanipcw <- rmeanglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) # stratified on only the categorical covariate rmeanipcw2 <- rmeanglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "stratified", formula.censoring = ~ sex, data = mgus2)
rmeanipcw <- rmeanglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "independent", data = mgus2) # stratified on only the categorical covariate rmeanipcw2 <- rmeanglm(Surv(etime, event) ~ age + sex, time = 200, cause = "pcm", link = "identity", model.censoring = "stratified", formula.censoring = ~ sex, data = mgus2)
Summary method
## S3 method for class 'pseudoglm' summary( object, correlation = FALSE, symbolic.cor = FALSE, type = "robust", ... )
## S3 method for class 'pseudoglm' summary( object, correlation = FALSE, symbolic.cor = FALSE, type = "robust", ... )
object |
|
correlation |
logical; if TRUE, the correlation matrix of the estimated parameters is returned and printed. |
symbolic.cor |
logical; If TRUE, print the correlations in a symbolic form rather than as numbers. |
type |
The method to use for variance estimation; one of "corrected", "robust", "naive", or "cluster" |
... |
Not used |
An object of class summary.glm
Compute covariance matrix of regression coefficient estimates
## S3 method for class 'pseudoglm' vcov(object, type = "robust", ...)
## S3 method for class 'pseudoglm' vcov(object, type = "robust", ...)
object |
|
type |
The method to use for variance estimation; one of "corrected", "robust", "naive", or "cluster" |
... |
Not used |
The "corrected" variance estimate for the cumulative incidence is as described in Overgaard et al. (2017) <doi:10.1214/16-AOS1516>, with code adapted from Overgaard's Stata program. This method does not handle ties and only has marginal benefits in reasonable sample sizes. The default is "robust" which uses the sandwich estimator vcovHC as implemented in the sandwich package. "cluster" is another option if you have clustered observations that uses the vcovCL function in sandwich. Finally "naive" uses the same method as glm to compute the variance, and is known to be anti-conservative. The bootstrap is another recommended option that can be implemented using other tools; there is an example in the vignette.
A numeric matrix containing the variance-covariance estimates
Overgaard, Morten; Parner, Erik Thorlund; Pedersen, Jan. Asymptotic theory of generalized estimating equations based on jack-knife pseudo-observations. Ann. Statist. 45 (2017), no. 5, 1988–2015. <doi:10.1214/16-AOS1516>.