synthcity.benchmark package
- class Benchmarks
Bases:
object
- static evaluate(tests: List[Tuple[str, str, dict]], X: synthcity.plugins.core.dataloader.DataLoader, X_test: Optional[synthcity.plugins.core.dataloader.DataLoader] = None, metrics: Optional[Dict] = None, repeats: int = 3, synthetic_size: Optional[int] = None, synthetic_constraints: Optional[synthcity.plugins.core.constraints.Constraints] = None, synthetic_cache: bool = True, synthetic_reuse_if_exists: bool = True, augmented_reuse_if_exists: bool = True, task_type: str = 'classification', workspace: pathlib.Path = PosixPath('workspace'), augmentation_rule: str = 'equal', strict_augmentation: bool = False, ad_hoc_augment_vals: Optional[Dict] = None, use_metric_cache: bool = True, **generate_kwargs: Any) → pandas.core.frame.DataFrame
Benchmark the performance of several algorithms.
- Parameters
tests – List[Tuple[str, str, dict]] Tuples of the form (testname: str, plugin_name: str, plugin_args: dict)
X – DataLoader The baseline dataset to learn from.
X_test – Optional[DataLoader] Optional test dataset for evaluation. If None, X will be split into train/test datasets.
metrics –
Dict of metrics to evaluate. By default, all metrics are evaluated. The full dictionary of metrics is:
{
    "sanity": ["data_mismatch", "common_rows_proportion", "nearest_syn_neighbor_distance", "close_values_probability", "distant_values_probability"],
    "stats": ["jensenshannon_dist", "chi_squared_test", "feature_corr", "inv_kl_divergence", "ks_test", "max_mean_discrepancy", "wasserstein_dist", "prdc", "alpha_precision", "survival_km_distance"],
    "performance": ["linear_model", "mlp", "xgb", "feat_rank_distance"],
    "detection": ["detection_xgb", "detection_mlp", "detection_gmm", "detection_linear"],
    "privacy": ["delta-presence", "k-anonymization", "k-map", "distinct l-diversity", "identifiability_score", "DomiasMIA_BNAF", "DomiasMIA_KDE", "DomiasMIA_prior"]
}
repeats – int Number of test repeats. Defaults to 3.
synthetic_size – int The size of the synthetic dataset. By default, it is len(X).
synthetic_constraints – Optional constraints on the synthetic data. By default, it inherits the constraints from X.
synthetic_cache – bool Enable experiment caching
synthetic_reuse_if_exists – bool If the current synthetic dataset is cached, it will be reused for the experiments. Defaults to True.
augmented_reuse_if_exists – bool If the current augmented dataset is cached, it will be reused for the experiments. Defaults to True.
task_type – str The type of problem. Relevant for evaluating the downstream models with the correct metrics. Valid tasks are: "classification", "regression", "survival_analysis", "time_series", "time_series_survival".
workspace – Path Path for caching experiments. Default: "workspace".
augmentation_rule – str The rule used to achieve the desired proportion of records with each value in the fairness column. Possible values are: "equal", "log", and "ad-hoc". Defaults to "equal".
strict_augmentation – bool Flag to ensure that the condition for generating synthetic data is strictly met. Defaults to False.
ad_hoc_augment_vals – Dict A dictionary containing the number of each class to augment the real data with. This is only required when using the rule="ad-hoc" option. Defaults to None.
use_metric_cache – bool If the current metric has been previously run and is cached, it will be reused for the experiments. Defaults to True.
generate_kwargs – Optional kwargs for each algorithm. Example: {"adsgan": {"n_iter": 10}}.
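A hedged usage sketch of Benchmarks.evaluate. The plugin names ("ctgan", "tvae"), their arguments, the metric subset, and the target_column value are illustrative assumptions, not prescribed defaults:

```python
# Sketch: benchmark two synthetic-data plugins against a real dataset.
# Plugin names, n_iter values, and the metric subset are assumptions.

# Each test is a tuple of (testname, plugin_name, plugin_args).
tests = [
    ("ctgan-fast", "ctgan", {"n_iter": 100}),
    ("tvae-fast", "tvae", {"n_iter": 100}),
]

# Restrict evaluation to a subset of the full metric dictionary above.
metrics = {
    "sanity": ["data_mismatch", "common_rows_proportion"],
    "detection": ["detection_mlp"],
}

def run_benchmark(df, target_column="target"):
    """Run the benchmark on a pandas DataFrame (requires synthcity)."""
    from synthcity.benchmark import Benchmarks
    from synthcity.plugins.core.dataloader import GenericDataLoader

    # Wrap the raw DataFrame in a DataLoader, as evaluate() expects.
    X = GenericDataLoader(df, target_column=target_column)
    score = Benchmarks.evaluate(
        tests,
        X,                           # X_test=None: X is split internally
        metrics=metrics,
        repeats=2,
        task_type="classification",
    )
    Benchmarks.print(score)          # per-plugin comparison tables
    return score
```

Because X_test is left as None, evaluate() splits X into train/test internally; pass an explicit X_test DataLoader to control the held-out data.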
- static highlight(results: Dict) → None
- static print(results: Dict, only_comparatives: bool = True) → None
- print_score(mean: pandas.core.series.Series, std: pandas.core.series.Series) → pandas.core.series.Series