synthcity.plugins.core.models.survival_analysis.third_party.util module

class Surv

Bases: object

Helper class to construct structured array of event indicator and observed time.

static from_arrays(event: numpy.ndarray, time: numpy.ndarray, name_event: Optional[str] = None, name_time: Optional[str] = None) Any

Create structured array.

Parameters
  • event (array-like) – Event indicator. A boolean array or array with values 0/1.

  • time (array-like) – Observed time.

  • name_event (str|None) – Name of event, optional, default: ‘event’

  • name_time (str|None) – Name of observed time, optional, default: ‘time’

Returns

y – Structured array with two fields.

Return type

np.array
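
The layout of the returned structured array can be illustrated with plain NumPy. The following is a minimal sketch, not the library's implementation: `make_survival_array` is a hypothetical stand-in for `Surv.from_arrays`, and the `bool` and `float` field types are assumptions based on the parameter descriptions above.

```python
import numpy as np


def make_survival_array(event, time, name_event="event", name_time="time"):
    """Hypothetical stand-in for Surv.from_arrays, built with plain NumPy."""
    event = np.asarray(event, dtype=bool)  # 0/1 values become False/True
    time = np.asarray(time, dtype=float)
    if event.shape != time.shape:
        raise ValueError("event and time must have the same shape")
    # One record per sample: a boolean field followed by a float field.
    y = np.empty(event.shape[0], dtype=[(name_event, bool), (name_time, float)])
    y[name_event] = event
    y[name_time] = time
    return y


y = make_survival_array([1, 0, 1], [5.0, 12.5, 3.2])
print(y.dtype.names)          # field names of the structured array
print(y["event"], y["time"])  # each field is accessed by name
```

Each field is read back by name, so downstream code can use `y["event"]` and `y["time"]` without tracking column positions.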

static from_dataframe(event: Any, time: Any, data: Any) numpy.ndarray

Create structured array from data frame.

Parameters
  • event (object) – Identifier of column containing event indicator.

  • time (object) – Identifier of column containing time.

  • data (pandas.DataFrame) – Dataset.

Returns

y – Structured array with two fields.

Return type

np.array
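
As a rough illustration of the data-frame variant, the sketch below selects two columns from a `pandas.DataFrame` and packs them into a structured array. `survival_array_from_frame` and the column names `status` and `days` are hypothetical, and naming the fields after the column identifiers is an assumption rather than documented behavior.

```python
import numpy as np
import pandas as pd


def survival_array_from_frame(event, time, data):
    """Hypothetical stand-in for Surv.from_dataframe."""
    # Assumption: the fields are named after the column identifiers.
    name_event, name_time = str(event), str(time)
    y = np.empty(len(data), dtype=[(name_event, bool), (name_time, float)])
    y[name_event] = data[event].to_numpy(dtype=bool)
    y[name_time] = data[time].to_numpy(dtype=float)
    return y


data = pd.DataFrame(
    {
        "status": [True, False, True],  # event indicator column
        "days": [5.0, 12.5, 3.2],       # observed time column
        "age": [61, 47, 70],            # unrelated feature, left out of y
    }
)
y = survival_array_from_frame("status", "days", data)
print(y.dtype.names)
```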

check_arrays_survival(X: numpy.ndarray, y: numpy.ndarray, **kwargs: Any) tuple

Check that all arrays have consistent first dimensions.

Parameters
  • X (array-like) – Data matrix containing feature vectors.

  • y (structured array with two fields) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

  • kwargs (dict) – Additional arguments passed to sklearn.utils.check_array().

Returns

  • X (array, shape=[n_samples, n_features]) – Feature vectors.

  • event (array, shape=[n_samples,], dtype=bool) – Binary event indicator.

  • time (array, shape=[n_samples,], dtype=float) – Time of event or censoring.
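
The consistency check described above can be sketched as follows. This is an illustrative approximation under stated assumptions: `split_and_check` is a hypothetical helper, it does not call `sklearn.utils.check_array()`, and it only verifies that `X` is two-dimensional and that all first dimensions agree.

```python
import numpy as np


def split_and_check(X, y):
    """Hypothetical sketch: unpack y and compare first dimensions with X."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be a 2-D array")
    if y.dtype.names is None or len(y.dtype.names) != 2:
        raise ValueError("y must be a structured array with two fields")
    event_field, time_field = y.dtype.names  # first field: event, second: time
    event = y[event_field].astype(bool)
    time = y[time_field].astype(float)
    if not (X.shape[0] == event.shape[0] == time.shape[0]):
        raise ValueError("X and y have inconsistent numbers of samples")
    return X, event, time


X = [[0.1, 1.0], [0.4, 2.0], [0.9, 3.0]]
y = np.array(
    [(True, 5.0), (False, 12.5), (True, 3.2)],
    dtype=[("event", bool), ("time", float)],
)
X_checked, event, time = split_and_check(X, y)
print(X_checked.shape, event.dtype, time.dtype)
```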

check_y_survival(y_or_event: numpy.ndarray, *args: Any, allow_all_censored: bool = False) tuple

Check that array correctly represents an outcome for survival analysis.

Parameters
  • y_or_event (structured array with two fields, or boolean array) – If a structured array, it must contain the binary event indicator as first field, and time of event or time of censoring as second field. Otherwise, it is assumed that a boolean array representing the event indicator is passed.

  • *args (list of array-likes) – Any number of array-like objects representing time information. Elements that are None are passed along in the return value.

  • allow_all_censored (bool, optional, default: False) – Whether to allow all events to be censored.

Returns

  • event (array, shape=[n_samples,], dtype=bool) – Binary event indicator.

  • time (array, shape=[n_samples,], dtype=float) – Time of event or censoring.
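
The effect of `allow_all_censored` can be illustrated with a small sketch. `check_events` is a hypothetical helper that covers only the boolean-array call form with a single time array. The exact validation rules and error messages of `check_y_survival` are not reproduced here.

```python
import numpy as np


def check_events(event, time, allow_all_censored=False):
    """Hypothetical sketch of the validation around an event indicator."""
    event = np.asarray(event)
    if event.dtype != bool:
        raise ValueError("event indicator must be a boolean array")
    # With no observed event at all, every sample is censored.
    if not allow_all_censored and not event.any():
        raise ValueError("all samples are censored")
    time = np.asarray(time, dtype=float)
    if time.shape[0] != event.shape[0]:
        raise ValueError("event and time have inconsistent lengths")
    return event, time


all_censored = np.array([False, False, False])
times = [2.0, 4.0, 6.0]

try:
    check_events(all_censored, times)
    rejected = False
except ValueError:
    rejected = True  # the default setting rejects fully censored data

event, time = check_events(all_censored, times, allow_all_censored=True)
print(rejected, event, time)
```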

safe_concat(objs: Any, *args: Any, **kwargs: Any) Any

Alternative to pandas.concat() that preserves categorical variables.

Parameters
  • objs (a sequence or mapping of Series, DataFrame, or Panel objects) – If a dict is passed, the sorted keys will be used as the keys argument, unless keys is passed explicitly, in which case the values will be selected (see below). Any None objects will be dropped silently unless they are all None, in which case a ValueError will be raised.

  • axis ({0, 1, ...}, default 0) – The axis to concatenate along

  • join ({'inner', 'outer'}, default 'outer') – How to handle indexes on other axis(es)

  • join_axes (list of Index objects) – Specific indexes to use for the other n - 1 axes instead of performing inner/outer set logic

  • verify_integrity (boolean, default False) – Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation.

  • keys (sequence, default None) – If multiple levels passed, should contain tuples. Construct hierarchical index using the passed keys as the outermost level

  • levels (list of sequences, default None) – Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they will be inferred from the keys

  • names (list, default None) – Names for the levels in the resulting hierarchical index

  • ignore_index (boolean, default False) – If True, do not use the index values along the concatenation axis. The resulting axis will be labeled 0, …, n - 1. This is useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information. Note that the index values on the other axes are still respected in the join.

  • copy (boolean, default True) – If False, do not copy data unnecessarily

Notes

The keys, levels, and names arguments are all optional.

Returns

concatenated

Return type

type of objects
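
To show why preserving categorical variables matters, the sketch below concatenates two frames whose categorical columns have different categories. Plain `pandas.concat()` then drops the categorical dtype. The restoration step uses `pandas.api.types.union_categoricals`, which is chosen here for illustration and is not necessarily what `safe_concat` does internally. The column name `stage` is hypothetical.

```python
import pandas as pd
from pandas.api.types import union_categoricals

a = pd.DataFrame({"stage": pd.Categorical(["I", "II"])})
b = pd.DataFrame({"stage": pd.Categorical(["II", "III"])})

# Plain concatenation: the categories differ, so the dtype is not kept.
plain = pd.concat([a, b], ignore_index=True)
lost = not isinstance(plain["stage"].dtype, pd.CategoricalDtype)

# Illustrative fix: compute the union of categories, then rebuild the column.
merged = union_categoricals([a["stage"], b["stage"]])
restored = plain.copy()
restored["stage"] = pd.Categorical(plain["stage"], categories=merged.categories)

print(lost)
print(restored["stage"].dtype)
```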