synthcity.plugins.core.models.survival_analysis.third_party.nonparametric module

class CensoringDistributionEstimator(*args: Any, **kwargs: Any)

Bases: synthcity.plugins.core.models.survival_analysis.third_party.nonparametric.SurvivalFunctionEstimator

Kaplan–Meier estimator for the censoring distribution.

fit(y: numpy.ndarray) synthcity.plugins.core.models.survival_analysis.third_party.nonparametric.CensoringDistributionEstimator

Estimate censoring distribution from training data.

Parameters

y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

Returns

self – The fitted estimator.

Return type

CensoringDistributionEstimator
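As an illustration of the expected input, the structured array y could be assembled with NumPy roughly as sketched below. The field names "event" and "time" and the toy values are assumptions made for this example; the parameter description only fixes the order of the two fields.

```python
import numpy as np

# Toy data (hypothetical): True = event observed, False = censored.
events = [True, False, True, True, False]
times = [2.0, 3.0, 3.0, 5.0, 8.0]

# First field: binary event indicator; second field: event or censoring time.
# The field names are illustrative only.
y = np.array(
    list(zip(events, times)),
    dtype=[("event", bool), ("time", float)],
)

print(y.shape)        # one record per sample
print(y.dtype.names)  # field order: indicator first, time second
```

An array built this way should then be usable as the y argument of fit and predict_ipcw.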

predict_ipcw(y: numpy.ndarray) numpy.ndarray

Return inverse probability of censoring weights at given time points.

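\(\omega_i = \delta_i / \hat{G}(y_i)\)

where \(\delta_i\) is the event indicator of sample \(i\), \(y_i\) is its observed time, and \(\hat{G}\) is the estimated survival function of the censoring distribution.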

Parameters

y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

Returns

ipcw – Inverse probability of censoring weights.

Return type

array, shape = (n_samples,)
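The weight formula above can be sketched in plain Python for right-censored data. This is a minimal illustration under stated assumptions, not the module's implementation: the helper names censoring_km_sketch and ipcw_sketch are hypothetical, tied times are handled with events leaving the risk set before censorings, and raising an error when \(\hat{G}(y_i) = 0\) is a choice made for this sketch.

```python
from bisect import bisect_right


def censoring_km_sketch(event, time):
    """Kaplan-Meier estimate of G(t) = P(C > t) for right-censored data.

    Censorings play the role of "events". At tied times, observed events
    are removed from the risk set before censorings are counted.
    """
    n_at_risk = len(time)
    times, probs, g = [], [], 1.0
    for t in sorted(set(time)):
        n_event = sum(1 for e, u in zip(event, time) if u == t and e)
        n_cens = sum(1 for e, u in zip(event, time) if u == t and not e)
        n_at_risk -= n_event  # events at t leave the risk set first
        if n_cens:
            g *= 1.0 - n_cens / n_at_risk
        times.append(t)
        probs.append(g)
        n_at_risk -= n_cens
    return times, probs


def ipcw_sketch(event, time):
    """Weights delta_i / G_hat(y_i); censored samples get weight 0."""
    times, probs = censoring_km_sketch(event, time)
    weights = []
    for e, t in zip(event, time):
        if not e:
            weights.append(0.0)
            continue
        g = probs[bisect_right(times, t) - 1]  # step-function lookup at y_i
        if g == 0.0:
            raise ValueError("censoring survival estimate is zero at an event time")
        weights.append(1.0 / g)
    return weights


# Toy data (hypothetical): True = event observed, False = censored.
event = [True, False, True, True, False]
time = [2.0, 3.0, 3.0, 5.0, 8.0]
weights = ipcw_sketch(event, time)
```

In this toy data the samples with events at times 3 and 5 receive weights above 1 because one subject was censored at time 3, so the remaining events stand in for it.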

predict_proba(time: numpy.ndarray) numpy.ndarray

Return the estimated probability of remaining uncensored after the given time points.

\(\hat{G}(t) = P(C > t)\), where \(C\) denotes the censoring time.

Parameters

time (array, shape = (n_samples,)) – Time points at which to estimate the probability.

Returns

prob – Estimated probability at each given time point.

Return type

array, shape = (n_samples,)

class SurvivalFunctionEstimator(*args: Any, **kwargs: Any)

Bases: sklearn.base.BaseEstimator

Kaplan–Meier estimate of the survival function.

fit(y: numpy.ndarray) synthcity.plugins.core.models.survival_analysis.third_party.nonparametric.SurvivalFunctionEstimator

Estimate survival distribution from training data.

Parameters

y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

Returns

self – The fitted estimator.

Return type

SurvivalFunctionEstimator

predict_proba(time: numpy.ndarray) numpy.ndarray

Return the estimated probability that the event occurs after each given time point.

\(\hat{S}(t) = P(T > t)\)

Parameters

time (array, shape = (n_samples,)) – Time points at which to estimate the probability.

Returns

prob – Estimated survival probability at each given time point.

Return type

array, shape = (n_samples,)
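Evaluating a Kaplan–Meier curve at arbitrary time points amounts to a step-function lookup. The following is a minimal sketch of that idea, assuming a curve given as sorted unique times with their survival probabilities; step_lookup and the toy values are hypothetical, and the sketch makes no claim about how the estimator treats times beyond the last observed time.

```python
from bisect import bisect_right


def step_lookup(unique_times, probs, t):
    """Evaluate a right-continuous step function at time t.

    Before the first observed time the survival probability is 1.
    """
    idx = bisect_right(unique_times, t)
    return 1.0 if idx == 0 else probs[idx - 1]


# Hypothetical fitted curve: unique times and survival probabilities.
unique_times = [2.0, 3.0, 5.0, 8.0]
probs = [0.8, 0.6, 0.3, 0.3]

queries = [1.0, 2.0, 4.0, 8.0]
estimates = [step_lookup(unique_times, probs, t) for t in queries]
```

A query that falls between two observed times takes the value of the most recent step to its left.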

kaplan_meier_estimator(event: numpy.ndarray, time_exit: numpy.ndarray, time_enter: Optional[numpy.ndarray] = None, time_min: Optional[float] = None, reverse: bool = False) tuple

Kaplan–Meier estimator of the survival function.

See [1] for further description.
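For right-censored data without left truncation, the product-limit form of the estimator can be written as below. The symbols are notation introduced here for illustration: \(t_j\) are the distinct observed times, \(d_j\) is the number of events at \(t_j\), and \(n_j\) is the number of individuals still at risk just before \(t_j\).

\(\hat{S}(t) = \prod_{j:\, t_j \le t} \left(1 - \frac{d_j}{n_j}\right)\)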

Parameters
  • event (array-like, shape = (n_samples,)) – Contains binary event indicators.

  • time_exit (array-like, shape = (n_samples,)) – Contains event/censoring times.

  • time_enter (array-like, shape = (n_samples,), optional) – Contains the time at which each individual entered the study, for left-truncated survival data.

  • time_min (float, optional) – Compute estimator conditional on survival at least up to the specified time.

  • reverse (bool, optional, default: False) – Whether to estimate the censoring distribution instead of the survival function. When an event and a censoring are tied at the same time, the event is taken to occur first, so tied events are subtracted from the number at risk (the denominator) before the censoring probability is computed. Only available for right-censored data, i.e. time_enter must be None.

Returns

  • time (array, shape = (n_times,)) – Unique times.

  • prob_survival (array, shape = (n_times,)) – Survival probability at each unique time point. If time_enter is provided, estimates are conditional probabilities.

Examples

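Creating a Kaplan–Meier curve:

>>> import matplotlib.pyplot as plt
>>> from synthcity.plugins.core.models.survival_analysis.third_party.nonparametric import kaplan_meier_estimator
>>> # event: binary event indicators; time: event/censoring times (same length)
>>> x, y = kaplan_meier_estimator(event, time)
>>> plt.step(x, y, where="post")
>>> plt.ylim(0, 1)
>>> plt.show()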
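To make the returned pair (time, prob_survival) concrete, here is a minimal pure-Python sketch of the product-limit computation for right-censored data without left truncation. The helper km_sketch and the toy data are hypothetical and are not the library function; time_enter, time_min, and reverse are not covered.

```python
from collections import Counter


def km_sketch(event, time_exit):
    """Product-limit estimate for right-censored data (no left truncation).

    Returns (unique_times, survival_probabilities).
    """
    n_at_risk = len(time_exit)
    deaths = Counter(t for e, t in zip(event, time_exit) if e)
    exits = Counter(time_exit)  # events and censorings leaving the risk set

    times, probs = [], []
    surv = 1.0
    for t in sorted(exits):
        d = deaths.get(t, 0)
        surv *= 1.0 - d / n_at_risk  # multiply by the conditional survival at t
        times.append(t)
        probs.append(surv)
        n_at_risk -= exits[t]
    return times, probs


# Toy data (hypothetical): True = event observed, False = censored.
event = [True, False, True, True, False]
time_exit = [2.0, 3.0, 3.0, 5.0, 8.0]
times, probs = km_sketch(event, time_exit)
```

The curve only drops at times where an event was observed; a time with censorings alone leaves the probability unchanged but shrinks the risk set for later steps.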

References

[1] Kaplan, E. L. and Meier, P., “Nonparametric estimation from incomplete observations”, Journal of the American Statistical Association, vol. 53, pp. 457-481, 1958.