synthcity.metrics.eval_statistical module
- class AlphaPrecision(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Evaluates the alpha-precision, beta-recall, and authenticity scores.
The class evaluates the synthetic data using a tuple of three metrics: alpha-precision, beta-recall, and authenticity. Each metric can be computed for an individual synthetic data point, which is useful for auditing and post-processing. Here the per-sample scores are averaged to reflect the overall quality of the data. The formal definitions are given in the reference below:
Alaa, Ahmed, Boris van Breugel, Evgeny S. Saveliev, and Mihaela van der Schaar. “How faithful is your synthetic data? Sample-level metrics for evaluating and auditing generative models.” In International Conference on Machine Learning, pp. 290-306. PMLR, 2022.
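The idea of averaging per-sample scores can be illustrated with a minimal sketch of alpha-precision alone. It is not the synthcity implementation: the function name `alpha_precision` is hypothetical, and the ball is assumed to be centred on the mean of the real data in raw feature space, whereas the paper works in a learned embedding space.

```python
import math


def alpha_precision(real, synth, alpha):
    """Fraction of synthetic points inside the ball that holds a fraction
    `alpha` of the real points (hypothetical helper, raw feature space)."""
    dim = len(real[0])
    # Assumption: the centre is the plain mean of the real data.
    center = tuple(sum(p[d] for p in real) / len(real) for d in range(dim))
    # Radius = alpha-quantile of the real distances to the centre.
    real_dists = sorted(math.dist(p, center) for p in real)
    idx = max(0, math.ceil(alpha * len(real_dists)) - 1)
    radius = real_dists[idx]
    # One 0/1 score per synthetic sample; their mean is the dataset-level score.
    per_sample = [1.0 if math.dist(p, center) <= radius else 0.0 for p in synth]
    return sum(per_sample) / len(per_sample), per_sample
```

The per-sample list is what makes auditing possible: samples scored 0.0 fall outside the typical region of the real data and could be filtered out in post-processing.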
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- metrics(X: numpy.ndarray, X_syn: numpy.ndarray, emb_center: Optional[numpy.ndarray] = None) Tuple
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class ChiSquaredTest(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Performs the one-way chi-square test.
- Returns
The p-value. A small value indicates that we can reject the null hypothesis and that the distributions are different.
- Score:
0: the distributions are different.
1: the distributions are identical.
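As an illustration of how a one-way chi-square p-value behaves, here is a standard-library sketch for a single categorical column. The helper names `chi2_sf` and `chi_squared_p_value` are hypothetical and are not part of synthcity; the sketch assumes every real category count is positive.

```python
import math


def chi2_sf(stat, df):
    """Survival function of the chi-square distribution for integer df >= 1."""
    x = stat / 2.0
    # Start from the closed forms for df = 2 and df = 1, then apply the
    # recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_squared_p_value(real_counts, syn_counts):
    """One-way chi-square test of synthetic counts against real frequencies."""
    total_real, total_syn = sum(real_counts), sum(syn_counts)
    # Expected synthetic counts if the synthetic data followed the real frequencies.
    expected = [c * total_syn / total_real for c in real_counts]
    stat = sum((o - e) ** 2 / e for o, e in zip(syn_counts, expected))
    return chi2_sf(stat, df=len(real_counts) - 1)
```

Identical category counts give a statistic of zero and therefore a p-value of 1, while strongly mismatched counts push the p-value towards 0, matching the score interpretation above.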
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class FrechetInceptionDistance(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Calculates the Fréchet Inception Distance (FID) to evaluate GANs.
Paper: GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium.
The FID metric calculates the distance between two distributions of images. Typically, summary statistics (mean and covariance matrix) are available for one of these distributions, while the second distribution is given by a GAN.
Adapted by Boris van Breugel (bv292@cam.ac.uk)
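The role of the mean and covariance statistics can be seen in a simplified sketch of the Fréchet distance between two Gaussians. It assumes uncorrelated features (diagonal covariance), so the matrix square root of the general formula reduces to a product of standard deviations; the function name `frechet_distance_diag` is hypothetical and this is not the code used by the class.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diag(real, synth):
    """Frechet distance between Gaussians fitted per feature.

    Simplifying assumption: features are treated as uncorrelated.
    """
    total = 0.0
    for d in range(len(real[0])):
        r = [row[d] for row in real]
        s = [row[d] for row in synth]
        var_r, var_s = pvariance(r), pvariance(s)
        # Squared difference of the means for this feature.
        mean_term = (fmean(r) - fmean(s)) ** 2
        # One-dimensional form of Tr(S1 + S2 - 2 * sqrt(S1 * S2)).
        cov_term = var_r + var_s - 2.0 * math.sqrt(var_r * var_s)
        total += mean_term + cov_term
    return total
```

With identical inputs both terms vanish and the distance is 0; shifting one dataset by a constant changes only the mean term.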
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class InverseKLDivergence(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Returns the average inverse of the Kullback–Leibler Divergence metric.
- Score:
0: the datasets are from different distributions.
1: the datasets are from the same distribution.
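One way to obtain a score with this 0-to-1 behaviour is to map the divergence through 1 / (1 + KL). The sketch below assumes that mapping and a single column of binned counts; `inverse_kl` is a hypothetical helper, and it requires every bin that is non-empty in the first dataset to be non-empty in the second.

```python
import math


def inverse_kl(p_counts, q_counts):
    """Map KL(P || Q) over shared bins to (0, 1] via 1 / (1 + KL)."""
    p_total, q_total = sum(p_counts), sum(q_counts)
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p, q = pc / p_total, qc / q_total
        if p > 0:  # bins with p == 0 contribute nothing to the divergence
            kl += p * math.log(p / q)
    return 1.0 / (1.0 + kl)
```

Because the KL divergence is 0 for identical distributions and grows without bound as they diverge, the mapped score is exactly 1 in the first case and approaches 0 in the second.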
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class JensenShannonDistance(normalize: bool = True, **kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Evaluate the average Jensen-Shannon distance (metric) between two probability arrays.
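For reference, the Jensen-Shannon distance between two probability arrays is the square root of the Jensen-Shannon divergence. The sketch below uses base-2 logarithms so the result lies in [0, 1]; the function name `js_distance` and the choice of base are assumptions made for illustration.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance between two probability arrays (base 2)."""
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(x, y):
        # KL divergence in bits; zero-probability entries of x are skipped.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2.0
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, divergence))
```

Unlike the KL divergence, this quantity is symmetric and stays finite when one array has zeros where the other does not.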
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class KolmogorovSmirnovTest(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Performs the Kolmogorov-Smirnov test for goodness of fit.
- Score:
0: the distributions are totally different.
1: the distributions are identical.
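A score with this orientation can be obtained as one minus the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical CDFs. The sketch below assumes that convention for a single numeric column; `ks_score` is a hypothetical name.

```python
from bisect import bisect_right


def ks_score(real, synth):
    """Return 1 - D, where D is the two-sample KS statistic."""
    r, s = sorted(real), sorted(synth)
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every value that occurs in either sample.
    stat = max(
        abs(bisect_right(r, v) / len(r) - bisect_right(s, v) / len(s))
        for v in r + s
    )
    return 1.0 - stat
```

Samples with disjoint ranges give D = 1 and a score of 0, while identical samples give D = 0 and a score of 1.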
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class MaximumMeanDiscrepancy(kernel: str = 'rbf', **kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Empirical maximum mean discrepancy. The lower the result, the stronger the evidence that the distributions are the same.
- Parameters
kernel – the kernel to use: “rbf”, “linear”, or “polynomial”.
- Score:
0: The distributions are the same.
1: The distributions are totally different.
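The empirical estimate can be sketched for the RBF kernel as follows. This is the biased estimator of the squared MMD; the helper names `rbf` and `mmd_squared` and the bandwidth parameter `gamma` are assumptions made for illustration, not the library's API.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel exp(-gamma * ||a - b||^2); gamma is an assumed bandwidth."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def mmd_squared(X, Y, kernel=rbf):
    """Biased estimate: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""

    def mean_k(A, B):
        return sum(kernel(a, b) for a in A for b in B) / (len(A) * len(B))

    return mean_k(X, X) + mean_k(Y, Y) - 2.0 * mean_k(X, Y)
```

When both samples are the same, the three kernel means coincide and the estimate is 0; the further apart the samples lie, the smaller the cross term becomes and the larger the result.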
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class PRDCScore(nearest_k: int = 5, **kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Computes precision, recall, density, and coverage given two manifolds.
- Parameters
nearest_k (int) – the number of nearest neighbours used to estimate each manifold.
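How nearest_k enters the four quantities can be shown with a small, unoptimised sketch of the usual k-nearest-neighbour definitions. It assumes Euclidean distance and more than nearest_k points per dataset; the function names `kth_nn_radii` and `prdc` are hypothetical and the code is not taken from synthcity.

```python
import math


def kth_nn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour in the same set."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def prdc(real, fake, nearest_k):
    """Precision, recall, density, and coverage from k-NN balls."""
    r_real = kth_nn_radii(real, nearest_k)
    r_fake = kth_nn_radii(fake, nearest_k)
    n_real, n_fake = len(real), len(fake)
    dist = [[math.dist(r, f) for f in fake] for r in real]
    # Precision: share of fake points that fall inside some real ball.
    precision = sum(
        any(dist[i][j] < r_real[i] for i in range(n_real)) for j in range(n_fake)
    ) / n_fake
    # Recall: share of real points that fall inside some fake ball.
    recall = sum(
        any(dist[i][j] < r_fake[j] for j in range(n_fake)) for i in range(n_real)
    ) / n_real
    # Density: average number of real balls containing a fake point, over k.
    density = sum(
        dist[i][j] < r_real[i] for i in range(n_real) for j in range(n_fake)
    ) / (nearest_k * n_fake)
    # Coverage: share of real balls that contain at least one fake point.
    coverage = sum(min(dist[i]) < r_real[i] for i in range(n_real)) / n_real
    return {"precision": precision, "recall": recall,
            "density": density, "coverage": coverage}
```

A larger nearest_k widens every ball, which tends to raise all four values; a smaller one makes the estimate stricter but noisier.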
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class StatisticalEvaluator(**kwargs: Any)
Bases:
synthcity.metrics.core.metric.MetricEvaluator
- abstract static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- abstract static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class SurvivalKMDistance(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
The distance between two Kaplan-Meier plots. Used for survival analysis.
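To make the notion of a distance between two Kaplan-Meier curves concrete, the sketch below estimates each curve and averages the absolute gap on a time grid. The averaging rule and the names `kaplan_meier`, `survival_at`, and `km_distance` are assumptions made for illustration; the class may summarise the gap differently.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event flag 1 = observed)."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        died = sum(1 for u, e in zip(times, events) if u == t and e)
        # Product-limit update: multiply by the conditional survival at t.
        surv *= 1.0 - died / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, t):
    """Evaluate the step function defined by `curve` at time t."""
    s = 1.0
    for u, v in curve:
        if u > t:
            break
        s = v
    return s


def km_distance(real_times, real_events, syn_times, syn_events, grid):
    """Mean absolute difference between two Kaplan-Meier curves on `grid`."""
    real_curve = kaplan_meier(real_times, real_events)
    syn_curve = kaplan_meier(syn_times, syn_events)
    gaps = [abs(survival_at(real_curve, t) - survival_at(syn_curve, t)) for t in grid]
    return sum(gaps) / len(gaps)
```

Censored records (event flag 0) still count towards the at-risk set until their recorded time, which is what distinguishes this estimate from a plain empirical survival curve.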
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool
- class WassersteinDistance(**kwargs: Any)
Bases:
synthcity.metrics.eval_statistical.StatisticalEvaluator
Computes the Wasserstein distance between the original data and the synthetic data.
- Parameters
X – original data
X_syn – synthetically generated data
- Returns
WD_value – the Wasserstein distance
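For intuition, the Wasserstein-1 distance has a closed form in one dimension when both samples have the same size: sort both and average the absolute differences. The sketch below covers only that special case; `wasserstein_1d` is a hypothetical helper, and multivariate data needs an optimal-transport solver instead.

```python
def wasserstein_1d(x, y):
    """Wasserstein-1 distance between two equal-sized one-dimensional samples."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes samples of equal size")
    # In 1-D the optimal transport plan pairs the i-th smallest values.
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)
```

The result is expressed in the units of the data, so it is 0 for identical samples and equals the shift when one sample is a translated copy of the other.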
- static direction() str
- evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
- evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
- classmethod fqdn() str
- static name() str
- reduction() Callable
- static type() str
- use_cache(path: pathlib.Path) bool