synthcity.metrics.eval_statistical module

class AlphaPrecision(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Evaluates the alpha-precision, beta-recall, and authenticity scores.

The class evaluates the synthetic data using a tuple of three metrics: alpha-precision, beta-recall, and authenticity. Each metric can be computed for an individual synthetic data point, which is useful for auditing and post-processing. Here the per-sample scores are averaged to reflect the overall quality of the data. The formal definitions are given in the reference below:

Alaa, Ahmed, Boris Van Breugel, Evgeny S. Saveliev, and Mihaela van der Schaar. “How faithful is your synthetic data? Sample-level metrics for evaluating and auditing generative models.” In International Conference on Machine Learning, pp. 290-306. PMLR, 2022.

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
metrics(X: numpy.ndarray, X_syn: numpy.ndarray, emb_center: Optional[numpy.ndarray] = None) Tuple
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class ChiSquaredTest(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Performs the one-way chi-square test.

Returns

The p-value. A small value indicates that the null hypothesis can be rejected, i.e. that the distributions are different.

Score:

0: the distributions are different.
1: the distributions are identical.
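To make the p-value above concrete, here is a minimal standard-library sketch of a one-way chi-square test on a single categorical column. The helper names `chi2_sf` and `chi_square_pvalue` are hypothetical and not part of synthcity. The sketch assumes at least two categories and that every synthetic category also occurs in the real data; the library's own handling of columns and edge cases may differ.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even degrees of freedom.
        return math.exp(-half) * sum(
            half**j / math.factorial(j) for j in range(df // 2)
        )
    # Closed form for odd degrees of freedom.
    tail = sum(
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail


def chi_square_pvalue(real_col, syn_col):
    """p-value of synthetic category counts tested against real frequencies."""
    real_counts, syn_counts = Counter(real_col), Counter(syn_col)
    categories = sorted(real_counts)  # assumption: no unseen synthetic categories
    n_real, n_syn = len(real_col), len(syn_col)
    stat = 0.0
    for c in categories:
        # Expected count: real frequency rescaled to the synthetic sample size.
        expected = real_counts[c] / n_real * n_syn
        stat += (syn_counts[c] - expected) ** 2 / expected
    return chi2_sf(stat, df=len(categories) - 1)


real = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
skewed = ["a"] * 10 + ["b"] * 10 + ["c"] * 80
p_same = chi_square_pvalue(real, real)    # matching counts give a p-value of 1
p_diff = chi_square_pvalue(real, skewed)  # mismatched counts give a p-value near 0
```

A p-value near 1 corresponds to the "identical" end of the score and a p-value near 0 to the "different" end.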

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class FrechetInceptionDistance(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Calculates the Frechet Inception Distance (FID) to evaluate GANs.

Paper: GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium.

The FID metric calculates the distance between two distributions of images. Typically, summary statistics (mean and covariance matrix) are available for one of these distributions, while the second distribution is given by a GAN.
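For reference, the distance defined in the cited paper treats both distributions as Gaussians summarised by their mean and covariance. The symbols below are chosen for illustration: the subscript r marks the reference statistics and g the generated ones.

```latex
d^2\big((\mu_r, \Sigma_r), (\mu_g, \Sigma_g)\big)
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

The distance is zero when the two sets of summary statistics coincide and grows as the means or covariances drift apart.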

Adapted by Boris van Breugel (bv292@cam.ac.uk)

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class InverseKLDivergence(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Returns the average inverse of the Kullback–Leibler Divergence metric.

Score:

0: the datasets are from different distributions.
1: the datasets are from the same distribution.
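One way to obtain a score with this orientation is to map the divergence through 1 / (1 + KL). The sketch below does that for discrete distributions defined on the same bins. It is an assumption about the form of the inverse, not a statement of the library's exact formula, and `inverse_kl` and `eps` are hypothetical names.

```python
import math


def inverse_kl(p, q, eps=1e-12):
    """Return 1 / (1 + KL(p || q)) for two discrete distributions on shared bins."""
    # Terms with p_i == 0 contribute nothing; eps guards against q_i == 0.
    kl = sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )
    return 1.0 / (1.0 + kl)


def mean_inverse_kl(columns):
    """Average the score over (p, q) pairs, one pair per column."""
    return sum(inverse_kl(p, q) for p, q in columns) / len(columns)


same = inverse_kl([0.5, 0.5], [0.5, 0.5])       # KL is 0, so the score is 1
different = inverse_kl([0.9, 0.1], [0.1, 0.9])  # KL > 0, so the score drops below 1
```

Because the KL divergence is non-negative, the mapped score lies in (0, 1] and reaches 1 only when the two distributions agree.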

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class JensenShannonDistance(normalize: bool = True, **kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Evaluate the average Jensen-Shannon distance (metric) between two probability arrays.
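As a reminder of what the distance measures, the sketch below computes it for two discrete probability arrays: the square root of the Jensen-Shannon divergence, taken here with base-2 logarithms so the result lies in [0, 1]. The function name `js_distance` and the choice of base are assumptions made for illustration.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Whenever a_i > 0 the mixture entry b_i is also > 0, so this is safe.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, divergence))  # clamp tiny negative round-off


identical = js_distance([0.5, 0.5], [0.5, 0.5])  # no divergence
disjoint = js_distance([1.0, 0.0], [0.0, 1.0])   # no shared support
```

Unlike the KL divergence, this quantity is symmetric in its two arguments.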

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class KolmogorovSmirnovTest(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Performs the Kolmogorov-Smirnov test for goodness of fit.

Score:

0: the distributions are totally different.
1: the distributions are identical.
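The two-sample Kolmogorov-Smirnov statistic is the largest gap between the two empirical CDFs. A score with the orientation above can be obtained as one minus that statistic. The sketch below illustrates this for a single numeric column; `ks_statistic` and `ks_score` are hypothetical helpers, and the one-minus mapping is an assumption about how the score is derived.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both CDFs are step functions, so the supremum occurs at a sample point.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_score(a, b):
    """Map the statistic so that 1 means identical and 0 means fully separated."""
    return 1.0 - ks_statistic(a, b)


score_same = ks_score([1, 2, 3], [1, 2, 3])
score_apart = ks_score([1, 2, 3], [10, 11, 12])
```

Identical samples give a score of 1, and samples with no overlap in range give 0.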

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class MaximumMeanDiscrepancy(kernel: str = 'rbf', **kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Empirical maximum mean discrepancy. The lower the result, the stronger the evidence that the distributions are the same.

Parameters

kernel – “rbf”, “linear” or “polynomial”

Score:

0: The distributions are the same.
1: The distributions are totally different.
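For intuition, the sketch below computes the biased empirical estimate of the squared MMD with an RBF kernel: the mean kernel value within each sample, minus twice the mean kernel value across samples. The names `rbf` and `mmd_rbf` and the default `gamma=1.0` are assumptions for illustration. This sketch applies no rescaling, so its values need not fall in the range described above.

```python
import math


def rbf(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd_rbf(X, Y, gamma=1.0):
    """Biased empirical estimate of the squared MMD between samples X and Y."""

    def mean_kernel(A, B):
        return sum(rbf(a, b, gamma) for a in A for b in B) / (len(A) * len(B))

    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2.0 * mean_kernel(X, Y)


X = [(0.0, 0.0), (0.0, 1.0)]
Y_far = [(10.0, 10.0), (10.0, 11.0)]
mmd_same = mmd_rbf(X, X)     # identical samples: the three terms cancel
mmd_far = mmd_rbf(X, Y_far)  # distant samples: the cross term vanishes
```

The estimate is zero for identical samples and grows as the cross-sample kernel values shrink relative to the within-sample ones.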

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class PRDCScore(nearest_k: int = 5, **kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Computes precision, recall, density, and coverage given two manifolds.

Parameters

nearest_k – int. The number of nearest neighbours used to estimate each manifold.
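The role of nearest_k is easier to see in code. The sketch below follows the usual k-nearest-neighbour construction of the four quantities: each point gets a ball whose radius is the distance to its k-th nearest neighbour in its own set, and the metrics count how real and synthetic points fall inside those balls. It is a plain-Python illustration with hypothetical helper names, not the library's implementation, and it is quadratic in the number of points.

```python
import math


def kth_nn_radius(points, k):
    """Distance from each point to its k-th nearest neighbour in the same set."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def prdc(real, fake, nearest_k=5):
    """Precision, recall, density and coverage from k-NN balls."""
    real_r = kth_nn_radius(real, nearest_k)
    fake_r = kth_nn_radius(fake, nearest_k)
    n_real, n_fake = len(real), len(fake)
    # d[i][j] is the distance from real point i to synthetic point j.
    d = [[math.dist(r, f) for f in fake] for r in real]

    # Precision: share of synthetic points inside at least one real ball.
    precision = sum(
        any(d[i][j] < real_r[i] for i in range(n_real)) for j in range(n_fake)
    ) / n_fake
    # Recall: share of real points inside at least one synthetic ball.
    recall = sum(
        any(d[i][j] < fake_r[j] for j in range(n_fake)) for i in range(n_real)
    ) / n_real
    # Density: average number of real balls containing a synthetic point, over k.
    density = sum(
        d[i][j] < real_r[i] for i in range(n_real) for j in range(n_fake)
    ) / (nearest_k * n_fake)
    # Coverage: share of real balls containing at least one synthetic point.
    coverage = sum(min(d[i]) < real_r[i] for i in range(n_real)) / n_real
    return {
        "precision": precision,
        "recall": recall,
        "density": density,
        "coverage": coverage,
    }


grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
shifted = [(x + 100.0, y + 100.0) for x, y in grid]
overlap = prdc(grid, grid, nearest_k=2)
disjoint = prdc(grid, shifted, nearest_k=2)
```

A larger nearest_k widens every ball, which tends to raise all four values; a smaller one makes the estimate more local.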

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class StatisticalEvaluator(**kwargs: Any)

Bases: synthcity.metrics.core.metric.MetricEvaluator

abstract static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
abstract static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class SurvivalKMDistance(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

The distance between two Kaplan-Meier plots. Used for survival analysis.
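As an illustration of what a distance between two Kaplan-Meier curves can look like, the sketch below estimates each survival curve with the product-limit formula and then averages the absolute gap between the curves over a time grid. The mean absolute gap is an assumed choice of distance, and `kaplan_meier`, `survival_at` and `km_distance` are hypothetical names; the quantities the library actually reports may be defined differently.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    curve, surv = [], 1.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, t):
    """Evaluate the step function S(t); S is 1 before the first event."""
    s = 1.0
    for ti, si in curve:
        if ti > t:
            break
        s = si
    return s


def km_distance(curve_a, curve_b, grid):
    """Mean absolute gap between two survival curves over a time grid."""
    gaps = [abs(survival_at(curve_a, t) - survival_at(curve_b, t)) for t in grid]
    return sum(gaps) / len(gaps)


# events: 1 = event observed, 0 = censored
real_curve = kaplan_meier([1, 2, 3], [1, 1, 1])
late_curve = kaplan_meier([11, 12, 13], [1, 1, 1])
gap_same = km_distance(real_curve, real_curve, grid=[1, 2, 3])
gap_late = km_distance(real_curve, late_curve, grid=[1, 2, 3])
```

A gap of zero means the two curves agree at every grid point; keeping the sign of the gap instead of its absolute value would show which curve is more optimistic.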

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool
class WassersteinDistance(**kwargs: Any)

Bases: synthcity.metrics.eval_statistical.StatisticalEvaluator

Computes the Wasserstein distance between the original data and the synthetic data.

Parameters
  • X – original data

  • X_syn – synthetically generated data

Returns

WD_value – the Wasserstein distance
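For intuition about the returned value, the sketch below computes the Wasserstein-1 distance in the simplest setting: two one-dimensional samples of equal size, where the optimal transport plan pairs the sorted values. This is a deliberately reduced illustration with a hypothetical function name; the library's own computation may differ, for instance by treating all columns jointly.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size one-dimensional samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal coupling matches the i-th smallest values.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


wd_same = wasserstein_1d([0, 1, 2], [2, 0, 1])   # same values, different order
wd_shift = wasserstein_1d([0, 1, 2], [3, 1, 2])  # every value moved up by 1
```

Shifting every value by a constant c yields a distance of |c|, which makes the result easy to read in the units of the data.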

static direction() str
evaluate(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) Dict
evaluate_default(X_gt: synthcity.plugins.core.dataloader.DataLoader, X_syn: synthcity.plugins.core.dataloader.DataLoader) float
classmethod fqdn() str
static name() str
reduction() Callable
static type() str
use_cache(path: pathlib.Path) bool