synthcity.plugins.core.models.mbi.graphical_model module

class GraphicalModel(domain, cliques, total=1.0, elimination_order=None)

Bases: object

belief_propagation(potentials, logZ=False)

Compute the marginals of the graphical model with the given parameters.

Note: this is an efficient, numerically stable implementation of belief propagation.

Parameters
  • potentials – the (log-space) parameters of the graphical model

  • logZ – flag to return logZ instead of marginals

Returns

the marginals of the graphical model
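
As an illustration of what computing marginals from log-space potentials involves, here is a minimal NumPy sketch on a hypothetical two-clique chain A - B - C. It is not the library's implementation: the names `logsumexp`, `theta_ab`, `theta_bc`, and `total` are invented for this example, and the clique sizes are arbitrary.

```python
import numpy as np

def logsumexp(x, axis=None, keepdims=False):
    """Numerically stable log(sum(exp(x))) along the given axis."""
    m = np.max(x, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True))
    return out if keepdims else np.squeeze(out, axis=axis)

rng = np.random.default_rng(0)
# Hypothetical log-potentials; the large scale stresses numerical stability.
theta_ab = rng.normal(size=(2, 3)) * 50   # clique (A, B)
theta_bc = rng.normal(size=(3, 4)) * 50   # clique (B, C)

# Messages over the shared attribute B, kept in log space.
msg_bc_to_ab = logsumexp(theta_bc, axis=1)   # sums out C, shape (3,)
msg_ab_to_bc = logsumexp(theta_ab, axis=0)   # sums out A, shape (3,)

# Clique belief = own potential + incoming message.
belief_ab = theta_ab + msg_bc_to_ab[None, :]
belief_bc = theta_bc + msg_ab_to_bc[:, None]

# Log-partition function, then marginals scaled to a hypothetical record count.
log_z = logsumexp(belief_ab)
total = 100.0
marg_ab = total * np.exp(belief_ab - log_z)
marg_bc = total * np.exp(belief_bc - log_z)
```

Subtracting the maximum before exponentiating is what keeps the computation stable when the potentials are large. Both clique marginals agree on the shared attribute B.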

calculate_many_marginals(projections)
Calculates marginals for all the projections in the list using the algorithm for answering many out-of-clique queries (Section 10.3 in Koller and Friedman).

This method may be faster than calling project many times.

Parameters

projections – a list of projections, where each projection is a subset of attributes (represented as a list or tuple)

Returns

a list of marginals, where each marginal is represented as a Factor

datavector(flatten=True)

Materialize the explicit representation of the distribution as a data vector.

fit(data)
krondot(matrices)
Compute the answer to the set of queries Q1 x Q2 x … x Qd, where Qi is a query matrix on the ith attribute and “x” is the Kronecker product.

This may be more efficient than computing a supporting marginal and then multiplying it by Q, in particular when each Qi has only a few rows.

Parameters

matrices – a list of matrices for each attribute in the domain

Returns

the vector of query answers
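
The idea behind answering Kronecker-structured queries without building the full query matrix can be sketched in plain NumPy. This is an illustration on a dense table, not the library's code: `kron_query` and the example shapes are hypothetical, and the table is flattened in row-major (C) order.

```python
import numpy as np
from functools import reduce

def kron_query(matrices, table):
    """Answer (Q1 x ... x Qd) @ table.flatten() without forming the Kronecker product."""
    out = table
    for axis, q in enumerate(matrices):
        # Contract q's columns with the current axis, then put the new axis back in place.
        out = np.moveaxis(np.tensordot(q, out, axes=([1], [axis])), 0, axis)
    return out.flatten()

rng = np.random.default_rng(0)
table = rng.random((2, 3, 4))            # dense distribution over three attributes
matrices = [rng.random((1, 2)),          # one query row on attribute 0
            rng.random((2, 3)),          # two query rows on attribute 1
            rng.random((3, 4))]          # three query rows on attribute 2

answers = kron_query(matrices, table)

# Reference: materialize the (6 x 24) Kronecker product explicitly.
explicit = reduce(np.kron, matrices) @ table.flatten()
```

Each contraction shrinks one axis to the number of rows in the corresponding Qi, so intermediate results stay small when every Qi has few rows.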

static load(path)
mle(marginals)

Compute the model parameters from the given marginals

Parameters

marginals – target marginals of the distribution

Returns

the potentials of the graphical model with the given marginals

project(attrs)
Project the distribution onto a subset of attributes, i.e., compute the marginal of the distribution over those attributes.

Parameters

attrs – a subset of attributes in the domain, represented as a list or tuple

Returns

a Factor object representing the marginal distribution
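
For intuition, projecting a distribution that is stored as a dense array amounts to summing out the axes of the attributes that are dropped. The sketch below uses a plain NumPy array rather than a Factor object; `project_table` and the attribute names are hypothetical.

```python
import numpy as np

def project_table(table, attrs_all, attrs_keep):
    """Sum out every axis whose attribute is not in attrs_keep."""
    drop = tuple(i for i, a in enumerate(attrs_all) if a not in attrs_keep)
    return table.sum(axis=drop)

attrs = ("A", "B", "C")
table = np.arange(24, dtype=float).reshape(2, 3, 4)   # counts over (A, B, C)

marg_ac = project_table(table, attrs, ("A", "C"))     # shape (2, 4)
```

The remaining axes keep the order they have in `attrs_all`, and the total mass of the table is preserved.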

static save(model, path)
synthetic_data(rows=None, method='round')

Generate synthetic tabular data from the distribution. Valid options for method are ‘round’ and ‘sample’.
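
The text above does not spell out how the two options differ. The sketch below shows one plausible interpretation for a single column, and it is an assumption rather than the library's documented behavior: 'sample' draws values independently from the normalized counts, while 'round' keeps the integer part of each scaled count and distributes the leftover rows at random according to the fractional parts. The helper `draw_column` is hypothetical.

```python
import numpy as np

def draw_column(counts, total, method="round", rng=None):
    """Turn fractional counts into `total` discrete values (assumed semantics)."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    if method == "sample":
        # Independent draws from the normalized counts.
        return rng.choice(counts.size, size=total, replace=True, p=counts / counts.sum())
    # Randomized rounding: keep integer parts, then hand out the leftover rows.
    scaled = counts * total / counts.sum()
    frac, integ = np.modf(scaled)
    integ = integ.astype(int)
    extra = total - integ.sum()
    if extra > 0:
        idx = rng.choice(counts.size, size=extra, replace=False, p=frac / frac.sum())
        integ[idx] += 1
    vals = np.repeat(np.arange(counts.size), integ)
    rng.shuffle(vals)
    return vals

counts = [2.5, 1.5, 6.0]
rounded = draw_column(counts, 10, method="round", rng=np.random.default_rng(0))
sampled = draw_column(counts, 10, method="sample", rng=np.random.default_rng(0))
```

Under this reading, rounding keeps every value's frequency within one of its target count, whereas sampling adds ordinary sampling noise.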

greedy_order(domain, cliques, elim)
variable_elimination(factors, elim)

Run variable elimination on a list of (non-logspace) factors.
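
A generic sum-product variable elimination over non-logspace factors can be sketched as follows. The representation of a factor as an `(attrs, array)` pair and the helpers `multiply` and `eliminate` are assumptions made for this example; they do not mirror the module's Factor class or this function's exact signature.

```python
import numpy as np

def multiply(f, g):
    """Multiply two (attrs, array) factors, aligning shared attributes."""
    (fa, fv), (ga, gv) = f, g
    attrs = tuple(dict.fromkeys(fa + ga))                 # ordered union of attributes
    letters = {a: chr(ord("a") + i) for i, a in enumerate(attrs)}
    spec = "{},{}->{}".format(
        "".join(letters[a] for a in fa),
        "".join(letters[a] for a in ga),
        "".join(letters[a] for a in attrs),
    )
    return attrs, np.einsum(spec, fv, gv)

def eliminate(factors, elim):
    """Remove the attributes in `elim`, in order, by multiplying then summing."""
    factors = list(factors)
    for var in elim:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        if not touching:
            continue
        attrs, vals = touching[0]
        for f in touching[1:]:
            attrs, vals = multiply((attrs, vals), f)
        axis = attrs.index(var)
        factors.append((attrs[:axis] + attrs[axis + 1:], vals.sum(axis=axis)))
    # Combine whatever is left into a single factor.
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

rng = np.random.default_rng(0)
ab = rng.random((2, 3))
bc = rng.random((3, 4))
attrs_c, vals_c = eliminate([(("A", "B"), ab), (("B", "C"), bc)], ["A", "B"])
```

Only the factors that mention the attribute being eliminated are multiplied together, which is what keeps intermediate tables small when the elimination order is chosen well.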

variable_elimination_logspace(potentials, elim, total)

Run variable elimination on a list of logspace factors.