synthcity.plugins.core.models.survival_analysis.third_party.nonparametric module

class CensoringDistributionEstimator(*args: Any, **kwargs: Any)

Bases: synthcity.plugins.core.models.survival_analysis.third_party.nonparametric.SurvivalFunctionEstimator

Kaplan–Meier estimator for the censoring distribution.

fit(y: numpy.ndarray) → synthcity.plugins.core.models.survival_analysis.third_party.nonparametric.CensoringDistributionEstimator

Estimate censoring distribution from training data.

Parameters: y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
Returns: self – The fitted estimator.
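The structured-array layout described for y can be sketched with NumPy as follows. The field names "event" and "time" and the sample values are illustrative assumptions; the description above only fixes the order of the fields (event indicator first, time second).

```python
import numpy as np

# Hypothetical toy data: True marks an observed event, False a censored sample.
event = np.array([True, False, True, True, False, True])
time = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 5.0])

# Field names are arbitrary here; only the field order follows the documented layout.
y = np.empty(len(event), dtype=[("event", bool), ("time", float)])
y["event"] = event
y["time"] = time

print(y.dtype.names)
print(y.shape)
```

An array built this way has shape (n_samples,) and carries both columns in a single object, which is the form that fit and predict_ipcw describe for their y argument.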

predict_ipcw(y: numpy.ndarray) → numpy.ndarray

Return inverse probability of censoring weights at given time points.

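\(\omega_i = \delta_i / \hat{G}(y_i)\)

Here \(\delta_i\) is the event indicator, \(y_i\) is the observed time, and \(\hat{G}\) is the Kaplan–Meier estimate of the censoring distribution, so censored samples receive a weight of zero.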

Parameters: y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
Returns: ipcw – Inverse probability of censoring weights.
Return type: array, shape = (n_samples,)
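To make the weight formula concrete, the following is a minimal pure-Python sketch, not the library's implementation. The helper names censoring_km and ipcw and the toy data are hypothetical, and the sketch evaluates the weights on the same samples used to estimate the censoring distribution.

```python
def censoring_km(times, events):
    """Product-limit estimate G(t) of the censoring distribution.

    Events tied with a censoring time are removed from the risk set first.
    Returns a dict mapping each unique observed time to G at that time.
    """
    g, prob = {}, 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e)
        n_censored = sum(1 for u, e in zip(times, events) if u == t and not e)
        if n_censored:
            prob *= 1.0 - n_censored / (at_risk - n_events)
        g[t] = prob
    return g


def ipcw(times, events):
    """Weights delta_i / G(y_i); censored samples get weight 0."""
    g = censoring_km(times, events)
    # Assumes G(y_i) > 0 at every event time, which holds for the toy data below.
    return [1.0 / g[t] if e else 0.0 for t, e in zip(times, events)]


# Hypothetical toy data: True marks an observed event, False a censored sample.
times = [1, 2, 2, 3, 4, 5]
events = [True, False, True, True, False, True]
weights = ipcw(times, events)
print(weights)  # roughly [1.0, 0.0, 1.333, 1.333, 0.0, 2.667]
```

Events observed late, when few uncensored samples remain, receive the largest weights, which compensates for the samples lost to censoring before that time.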

predict_proba(time: numpy.ndarray) → numpy.ndarray

Return the probability that the event occurs after the given time point.

\(\hat{S}(t) = P(T > t)\)

Parameters: time (array, shape = (n_samples,)) – Time to estimate probability at.
Returns: prob – Estimated probability that the event occurs after each given time point.
Return type: array, shape = (n_samples,)

class SurvivalFunctionEstimator(*args: Any, **kwargs: Any)

Bases: sklearn.base.BaseEstimator

Kaplan–Meier estimate of the survival function.

fit(y: numpy.ndarray) → synthcity.plugins.core.models.survival_analysis.third_party.nonparametric.SurvivalFunctionEstimator

Estimate survival distribution from training data.

Parameters: y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
Returns: self – The fitted estimator.

predict_proba(time: numpy.ndarray) → numpy.ndarray

Return the probability that the event occurs after the given time point.

\(\hat{S}(t) = P(T > t)\)

Parameters: time (array, shape = (n_samples,)) – Time to estimate probability at.
Returns: prob – Estimated probability that the event occurs after each given time point.
Return type: array, shape = (n_samples,)
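A Kaplan–Meier estimate is a right-continuous step function, so evaluating it at an arbitrary time amounts to looking up the last observed time that does not exceed the query. The sketch below illustrates that lookup in pure Python. The function name step_lookup and the hand-computed curve are hypothetical, and how the library itself treats query times beyond the largest observed time is not specified here.

```python
from bisect import bisect_right


def step_lookup(unique_times, probs, t):
    """Evaluate a right-continuous step function at time t.

    unique_times must be sorted ascending; probs[i] is the value from
    unique_times[i] onwards. Before the first time the value is 1.0.
    """
    idx = bisect_right(unique_times, t) - 1
    return 1.0 if idx < 0 else probs[idx]


# Hypothetical fitted curve: unique times with their survival estimates.
unique_times = [1, 2, 3, 4, 5]
probs = [5 / 6, 2 / 3, 4 / 9, 4 / 9, 0.0]

for t in (0.5, 2, 2.5, 5):
    print(t, step_lookup(unique_times, probs, t))
```

A query between two observed times, such as 2.5, returns the value set at the earlier one, and a query before the first observed time returns 1.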

kaplan_meier_estimator(event: numpy.ndarray, time_exit: numpy.ndarray, time_enter: Optional[numpy.ndarray] = None, time_min: Optional[float] = None, reverse: bool = False) → tuple

Kaplan–Meier estimator of the survival function.

See [1] for further description.

Parameters

event (array-like, shape = (n_samples,)) – Contains binary event indicators.
time_exit (array-like, shape = (n_samples,)) – Contains event/censoring times.
time_enter (array-like, shape = (n_samples,), optional) – Contains time when each individual entered the study for left truncated survival data.
time_min (float, optional) – Compute estimator conditional on survival at least up to the specified time.
reverse (bool, optional, default: False) – Whether to estimate the censoring distribution instead of the survival function. When event and censoring times are tied, events are treated as occurring first and are subtracted from the denominator. Only available for right-censored data, i.e. time_enter must be None.

Returns

time (array, shape = (n_times,)) – Unique times.
prob_survival (array, shape = (n_times,)) – Survival probability at each unique time point. If time_enter is provided, estimates are conditional probabilities.

Examples

Creating a Kaplan-Meier curve:

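>>> import matplotlib.pyplot as plt
>>> from synthcity.plugins.core.models.survival_analysis.third_party.nonparametric import kaplan_meier_estimator
>>> # event: binary event indicators; time: event/censoring times (same length)
>>> x, y = kaplan_meier_estimator(event, time)
>>> plt.step(x, y, where="post")
>>> plt.ylim(0, 1)
>>> plt.show()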
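For readers who want to see what the returned arrays contain, the product-limit calculation can be sketched in pure Python. This is a simplified illustration for right-censored data only, not the function's actual code. The name kaplan_meier and the toy data are hypothetical, and left truncation (time_enter), time_min and reverse are not handled.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    Returns (unique_times, survival), where survival[i] is the estimated
    probability of surviving beyond unique_times[i].
    """
    unique_times = sorted(set(times))
    survival, prob = [], 1.0
    for t in unique_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e)
        # Multiply by the conditional probability of surviving past t.
        prob *= 1.0 - n_events / at_risk
        survival.append(prob)
    return unique_times, survival


# Hypothetical toy data: True marks an observed event, False a censored sample.
times = [1, 2, 2, 3, 4, 5]
events = [True, False, True, True, False, True]
unique_times, survival = kaplan_meier(times, events)
print(unique_times)  # [1, 2, 3, 4, 5]
print(survival)      # roughly [0.833, 0.667, 0.444, 0.444, 0.0]
```

The curve only drops at times with at least one observed event. At time 4, where the only sample is censored, the estimate stays flat while the risk set shrinks.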

References

[1] Kaplan, E. L. and Meier, P., “Nonparametric estimation from incomplete observations”, Journal of The American Statistical Association, vol. 53, pp. 457-481, 1958.