synthcity.benchmark package

class Benchmarks

Bases: object

static evaluate(tests: List[Tuple[str, str, dict]], X: synthcity.plugins.core.dataloader.DataLoader, X_test: Optional[synthcity.plugins.core.dataloader.DataLoader] = None, metrics: Optional[Dict] = None, repeats: int = 3, synthetic_size: Optional[int] = None, synthetic_constraints: Optional[synthcity.plugins.core.constraints.Constraints] = None, synthetic_cache: bool = True, synthetic_reuse_if_exists: bool = True, augmented_reuse_if_exists: bool = True, task_type: str = 'classification', workspace: pathlib.Path = PosixPath('workspace'), augmentation_rule: str = 'equal', strict_augmentation: bool = False, ad_hoc_augment_vals: Optional[Dict] = None, use_metric_cache: bool = True, **generate_kwargs: Any) → pandas.core.frame.DataFrame

Benchmark the performance of several algorithms.

Parameters
  • tests – List[Tuple[str, str, dict]] Tuples of the form (testname: str, plugin_name: str, plugin_args: dict)

  • X – DataLoader The baseline dataset to learn from

  • X_test – Optional[DataLoader] Optional test dataset for evaluation. If None, X will be split into train/test datasets.

  • metrics – Optional[Dict] The metrics to evaluate, grouped by category. By default, all metrics are evaluated. The full dictionary of metrics is:

    {
        'sanity': ['data_mismatch', 'common_rows_proportion', 'nearest_syn_neighbor_distance', 'close_values_probability', 'distant_values_probability'],
        'stats': ['jensenshannon_dist', 'chi_squared_test', 'feature_corr', 'inv_kl_divergence', 'ks_test', 'max_mean_discrepancy', 'wasserstein_dist', 'prdc', 'alpha_precision', 'survival_km_distance'],
        'performance': ['linear_model', 'mlp', 'xgb', 'feat_rank_distance'],
        'detection': ['detection_xgb', 'detection_mlp', 'detection_gmm', 'detection_linear'],
        'privacy': ['delta-presence', 'k-anonymization', 'k-map', 'distinct l-diversity', 'identifiability_score', 'DomiasMIA_BNAF', 'DomiasMIA_KDE', 'DomiasMIA_prior'],
    }

  • repeats – Number of test repeats

  • synthetic_size – int The size of the synthetic dataset. By default, it is len(X).

  • synthetic_constraints – Optional constraints on the synthetic data. By default, it inherits the constraints from X.

  • synthetic_cache – bool Enable experiment caching

  • synthetic_reuse_if_exists – bool If the current synthetic dataset is cached, it will be reused for the experiments. Defaults to True.

  • augmented_reuse_if_exists – bool If the current augmented dataset is cached, it will be reused for the experiments. Defaults to True.

  • task_type – str The type of problem. Relevant for evaluating the downstream models with the correct metrics. Valid tasks are: “classification”, “regression”, “survival_analysis”, “time_series”, “time_series_survival”.

  • workspace – Path Path for caching experiments. Default: “workspace”.

  • augmentation_rule – str The rule used to achieve the desired proportion of records with each value in the fairness column. Possible values are: 'equal', 'log', and 'ad-hoc'. Defaults to "equal".

  • strict_augmentation – bool Flag to ensure that the condition for generating synthetic data is strictly met. Defaults to False.

  • ad_hoc_augment_vals – Dict A dictionary mapping each class to the number of records to augment the real data with. Only required when augmentation_rule="ad-hoc". Defaults to None.

  • use_metric_cache – bool If the current metric has been previously run and is cached, it will be reused for the experiments. Defaults to True.

  • generate_kwargs – Optional kwargs for each algorithm, collected via **generate_kwargs in the signature. Example: {"adsgan": {"n_iter": 10}}
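
A minimal usage sketch (the dataset, plugin name, and metric subset below are illustrative choices, not requirements of this API):

    from sklearn.datasets import load_diabetes

    from synthcity.benchmark import Benchmarks
    from synthcity.plugins.core.dataloader import GenericDataLoader

    # Load a small regression dataset and wrap it in a DataLoader.
    X, y = load_diabetes(return_X_y=True, as_frame=True)
    X["target"] = y
    loader = GenericDataLoader(X, target_column="target")

    # Each test is a (testname, plugin_name, plugin_args) tuple.
    score = Benchmarks.evaluate(
        [("test_marginals", "marginal_distributions", {})],
        loader,
        metrics={"stats": ["jensenshannon_dist", "ks_test"]},
        task_type="regression",
        repeats=2,
    )

Each plugin is trained on the loader, asked to generate a synthetic dataset, and scored with the selected metrics; with synthetic_cache enabled, intermediate results are cached under workspace and reused across runs.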

static highlight(results: Dict) → None
static print(results: Dict, only_comparatives: bool = True) → None
print_score(mean: pandas.core.series.Series, std: pandas.core.series.Series) → pandas.core.series.Series
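
A minimal reporting sketch, assuming score is the object returned by Benchmarks.evaluate above:

    # Print a summary table per test; by default only comparative metrics are shown.
    Benchmarks.print(score, only_comparatives=True)

    # Highlight the benchmark results for side-by-side comparison.
    Benchmarks.highlight(score)

print_score is a formatting helper: it merges a mean Series and a standard-deviation Series into a single formatted score Series.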

Submodules