gvar - Gaussian Random Variables

Introduction

Objects of type gvar.GVar represent gaussian random variables, which are specified by a mean and standard deviation. They are created using gvar.gvar(): for example,

>>> x = gvar.gvar(10,3)          # 0 +- 3
>>> y = gvar.gvar(12,4)          # 2 +- 4
>>> z = x + y                    # 2 +- 5
>>> print(z)
22.0(5.0)
>>> print(z.mean)
22.0
>>> print(z.sdev)
5.0

This module contains a variety of tools for creating and manipulating gaussian random variables, including:

Functions

The function used to create Gaussian variable objects is:

gvar.gvar(...)

Creates one or more new gvar.GVars.

gvar.gvar is an object of type gvar.GVarFactory. Each of the following creates new gvar.GVars:

gvar.gvar(x, xsdev)

Returns a gvar.GVar with mean x and standard deviation xsdev. Returns an array of gvar.GVars if x and xsdev are arrays with the same shape; the shape of the result is the same as the shape of x. Returns a gvar.BufferDict if x and xsdev are dictionaries with the same keys and layout; the result has the same keys and layout as x.

gvar.gvar(x, xcov)

Returns an array of gvar.GVars with means given by array x and a covariance matrix given by array xcov, where xcov.shape = 2*x.shape; the result has the same shape as x. Returns a gvar.BufferDict if x and xcov are dictionaries, where the keys in xcov are (k1,k2) for any keys k1 and k2 in x. Returns a single gvar.GVar if x is a number and xcov is a one-by-one matrix. The layout for xcov is compatible with that produced by gvar.evalcov() for a single gvar.GVar, an array of gvar.GVars, or a dictionary whose values are gvar.GVars and/or arrays of gvar.GVars. Therefore gvar.gvar(gvar.mean(g), gvar.evalcov(g)) creates gvar.GVars with the same means and covariance matrix as the gvar.GVars in g provided g is a single gvar.GVar, or an array or dictionary of gvar.GVars.

gvar.gvar(x, xcov, verify=True)

Same as gvar.gvar(x, xcov) above but checks that the covariance matrix is symmetric and positive definite (which covariance matrices should be). This check is expensive for large matrices and so is not done by default. Note, however, that unpredictable outcomes will result from specifying an improper covariance matrix.

gvar.gvar(x, xcov, fast=True)

Normally gvar.gvar(x, xcov) tries to break the covariance matrix into disjoint diagonal blocks, if there are any. For example,

xcov = [[1,1,0], [1,2,0], [0,0,3]]

can be decomposed into two blocks. This decomposition saves memory, and can make later manipulations of the resulting gvar.GVars faster. This is at the expense of extra processing to create the gvar.GVars. Setting keyword fast=True prevents gvar.gvar from doing this, which would make sense, for example, if it was known ahead of time that xcov has no sub-blocks. The default is fast=False. Either choice gives correct answers; the difference is about efficiency.

gvar.gvar((x, xsdev))

Returns a gvar.GVar with mean x and standard deviation xsdev.

gvar.gvar(xstr)

Returns a gvar.GVar corresponding to string xstr which is either of the form "xmean +- xsdev" or "x(xerr)" (see GVar.fmt()).

gvar.gvar(xgvar)

Returns gvar.GVar xgvar unchanged.

gvar.gvar(xdict)

Returns a dictionary (BufferDict) b where b[k] = gvar.gvar(xdict[k]) for every key in dictionary xdict. The values in xdict, therefore, can be strings, tuples or gvar.GVars (see above), or arrays of these.

gvar.gvar(xarray)

Returns an array a having the same shape as xarray where every element a[i...] = gvar.gvar(xarray[i...]). The values in xarray, therefore, can be strings, tuples or gvar.GVars (see above).

gvar.gvar(ymean, ycov, x, xycov)

Returns a 1-d array of gvar.GVars y[i] constructed from the 1-d array of mean values ymean and the 2-d covariance matrix ycov. The y[i] are correlated with the primary gvar.GVars in 1-d array x. The x-y covariance matrix is xycov whose shape is x.shape + y.shape. Note that this changes the gvar.GVars in x (because they are correlated with the y[i]); it has no effect on the variance or on correlations between different x[i]s.

The following function is useful for constructing new functions that can accept gvar.GVars as arguments:

gvar.gvar_function(x, f, dfdx)

Create a gvar.GVar for function f(x) given f and df/dx at x.

This function creates a gvar.GVar corresponding to a function of gvar.GVars x whose value is f and whose derivatives with respect to each x are given by dfdx. Here x can be a single gvar.GVar, an array of gvar.GVars (for a multidimensional function), or a dictionary whose values are gvar.GVars or arrays of gvar.GVars, while dfdx must be a float, an array of floats, or a dictionary whose values are floats or arrays of floats, respectively.

This function is useful for creating functions that can accept gvar.GVars as arguments. For example,

import math
import gvar as gv

def sin(x):
    if isinstance(x, gv.GVar):
        f = math.sin(x.mean)
        dfdx = math.cos(x.mean)
        return gv.gvar_function(x, f, dfdx)
    else:
        return math.sin(x)

creates a version of sin(x) that works with either floats or gvar.GVars as its argument. This particular function is unnecessary since it is already provided by gvar.

Parameters:
  • x (gvar.GVar, array of gvar.GVars, or a dictionary of gvar.GVars) – Point at which the function is evaluated.

  • f (float) – Value of function at point gvar.mean(x).

  • dfdx (float, array of floats, or a dictionary of floats) – Derivatives of function with respect to x at point gvar.mean(x).

Returns:

A gvar.GVar representing the function’s value at x.

Means, standard deviations, variances, formatted strings, covariance matrices and correlation/comparison information can be extracted from arrays (or dictionaries) of gvar.GVars using:

gvar.mean(g)

Extract means from gvar.GVars in g.

g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. Result has the same layout as g.

Elements of g that are not gvar.GVars are left unchanged.

gvar.sdev(g)

Extract standard deviations from gvar.GVars in g.

g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. Result has the same layout as g.

The deviation is set to 0.0 for elements of g that are not gvar.GVars.

gvar.var(g)

Compute variances for elements of array/dictionary g or a single gvar.GVar.

If g is an array of gvar.GVars, var returns the variances as an array with shape g.shape. If g is a dictionary whose values are gvar.GVars or arrays of gvar.GVars, the result is a dictionary where cov[k] is the variance for g[k].

gvar.is_primary(g)

Determine whether or not the gvar.GVars in g are primary gvar.GVars.

A primary gvar.GVar is one created using gvar.gvar() (or a function of such a variable). A derived gvar.GVar is one that is constructed from arithmetic expressions and functions that combine multiple primary gvar.GVars. The standard deviations for all gvar.GVars originate with the primary gvar.GVars. In particular,

z = z.mean + sum_p (p - p.mean) * dz/dp

is true for any gvar.GVar z, where the sum is over all primary gvar.GVars p.

Here g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. The result has the same layout as g. Each gvar.GVar is replaced by True or False depending upon whether it is primary or not.

When the same gvar.GVar appears more than once in g, only the first appearance is marked as a primary gvar.GVar. This avoids double counting the same primary gvar.GVar — each primary gvar.GVar in the list is unique.

gvar.dependencies(g, all=False)

Collect primary gvar.GVars contributing to the covariances of gvar.GVars in g.

Parameters:
  • ggvar.GVar or a dictionary, array, etc. containing gvar.GVars.

  • all (bool) – If True the result includes all primary gvar.GVars including those that are in g; otherwise it only includes those not in g. Default is False.

Returns:

An array containing the primary gvar.GVars contributing to covariances of gvar.GVars in g.

gvar.filter(g, f, *args, **kargs)

Filter gvar.GVars in g through function f.

Sample usage:

import gvar as gv
g_mod = gv.filter(g, gv.mean)

replaces every gvar.GVar in g by its mean. This is useful when gv.mean(g) doesn’t work — for example, because g contains data types other than gvar.GVars, or consists of nested dictionaries.

Parameters:
  • g – Object consisting of (possibly nested) dictionaries, deques, lists, numpy.arrays, tuples, and/or other data types that contain gvar.GVars and other types of data.

  • f – Function that takes an array of gvar.GVars as an argument and returns an array of results having the same shape. Given an array gvar_array containing all the gvar.GVars from g, the function call is f(gvar_array, *args, **kargs). Results from the returned array replace the gvar.GVars in the original object.

  • args – Additional arguments for f.

  • kargs – Additional keyword arguments for f.

Returns:

An object like g but with all the gvar.GVars replaced by objects generated by function f.

gvar.fmt(g, ndecimal=None, sep='')

Format gvar.GVars in g.

g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. Each gvar.GVar gi in g is replaced by the string generated by gi.fmt(ndecimal,sep). Result has same structure as g.

gvar.tabulate(g, ncol=1, headers=True, offset='', ndecimal=None)

Tabulate contents of an array or dictionary of gvar.GVars.

Given an array g of gvar.GVars or a dictionary whose values are gvar.GVars or arrays of gvar.GVars, gvar.tabulate(g) returns a string containing a table of the values of g’s entries. For example, the code

import collections
import gvar as gv

g = collections.OrderedDict()
g['scalar'] = gv.gvar('10.3(1)')
g['vector'] = gv.gvar(['0.52(3)', '0.09(10)', '1.2(1)'])
g['tensor'] = gv.gvar([
    ['0.01(50)', '0.001(20)', '0.033(15)'],
    ['0.001(20)', '2.00(5)', '0.12(52)'],
    ['0.007(45)', '0.237(4)', '10.23(75)'],
    ])
print(gv.tabulate(g, ncol=2))

prints the following table:

   key/index          value     key/index          value
---------------------------  ---------------------------
      scalar     10.30 (10)           1,0     0.001 (20)
    vector 0     0.520 (30)           1,1     2.000 (50)
           1      0.09 (10)           1,2      0.12 (52)
           2      1.20 (10)           2,0     0.007 (45)
  tensor 0,0      0.01 (50)           2,1    0.2370 (40)
         0,1     0.001 (20)           2,2     10.23 (75)
         0,2     0.033 (15)
Parameters:
  • g – Array of gvar.GVars (any shape) or dictionary whose values are gvar.GVars or arrays of gvar.GVars (any shape).

  • ncol – The table is split over ncol columns of key/index values plus gvar.GVar values. Default value is 1.

  • headers – Prints standard header on table if True; omits the header if False. If headers is a 2-tuple, then headers[0] is used in the header over the indices/keys and headers[1] over the gvar.GVar values. (Default is True.)

  • offset (str) – String inserted at the beginning of each line in the table. Default is ''.

  • ndecimal – Number of digits displayed after the decimal point. Default is ndecimal=None which adjusts table entries to show 2 digits of error.

  • keys – When g is a dictionary and keys is a list of keys, entries will be tabulated only for keys in the keys list; ignored if keys=None (default).

gvar.correlate(g, corr, upper=False, lower=False, verify=False)

Add correlations to uncorrelated gvar.GVars in g.

This method creates correlated gvar.GVars from uncorrelated gvar.GVars g, using the correlations specified in corr.

Note that correlations initially present in g, if any, are ignored.

Examples

A typical application involves the construction of correlated gvar.GVars give the means and standard deviations, together with a correlation matrix:

>>> import gvar as gv
>>> g = gv.gvar(['1(1)', '2(10)'])
>>> print(gv.evalcorr(g))           # uncorrelated
[[ 1.  0.]
 [ 0.  1.]]
>>> g =  gv.correlate(g, [[1. , 0.1], [0.1, 1.]])
>>> print(gv.evalcorr(g))           # correlated
[[ 1.   0.1]
 [ 0.1  1. ]]

This also works when g and corr are dictionaries:

>>> g = gv.gvar(dict(a='1(1)', b='2(10)'))
>>> print(gv.evalcorr(g))
{('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.]]),('b', 'a'): array([[ 0.]]),('b', 'b'): array([[ 1.]])}
>>> corr = {}
>>> corr['a', 'a'] = 1.0
>>> corr['a', 'b'] = 0.1
>>> corr['b', 'a'] = 0.1
>>> corr['b', 'b'] = 1.0
>>> g = correlate(g, corr)
>>> print(gv.evalcorr(g))
{('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.1]]),('b', 'a'): array([[ 0.1]]),('b', 'b'): array([[ 1.]])}
Parameters:
  • g – An array of gvar.GVars or a dictionary whose values are gvar.GVars or arrays of gvar.GVars.

  • corr – Correlations between gvar.GVars: corr[i, j] is the correlation between g[i] and g[j]. Should be a symmetric and positive-definite (unless upper or lower is specified).

  • upper (bool) – If True, replaces lower triangular part of corr with transpose of the upper triangular part. The diagonal is set to one. Default is False.

  • lower (bool) – If True, replaces upper triangular part of corr with transpose of the lower triangular part. The diagonal is set to one. Default is False.

  • verify (bool) – If True, verifies that corr is symmetric and positive definite. Default is False because verification is costly for large matrices.

gvar.evalcov(g)

Compute covariance matrix for elements of array/dictionary g.

If g is an array of gvar.GVars, evalcov returns the covariance matrix as an array with shape g.shape+g.shape. If g is a dictionary whose values are gvar.GVars or arrays of gvar.GVars, the result is a doubly-indexed dictionary where cov[k1,k2] is the covariance for g[k1] and g[k2].

gvar.cov(g1, g2)

Covariance of gvar.GVar g1 with g2.

gvar.evalcov_blocks(g, compress=False)

Evaluate covariance matrix for elements of g.

Evaluates the covariance matrices for gvar.GVars stored in array or dictionary of arrays/gvar.GVars g. The covariance matrix is decomposed into its block diagonal components, and a list of tuples (idx,bcov) is returned where bcov is a diagonal block of the covariance matrix and idx an array containing the corresponding indices in g.flat for that block. So to reassemble the blocks into a single matrix cov, for example, one would use:

import numpy as np
import gvar as gv
cov = np.zeros((len(g.flat), len(g.flat)), float)
for idx, bcov in gv.evalcov_blocks(g):
    cov[idx[:, None], idx] = bcov

gvar.evalcov_blocks() is particularly useful when the covariance matrix is sparse; only nonzero elements are retained.

Setting argument compress=True (default is False) changes the meaning of the first element in the list: idx for the first element contains the indices of all gvar.GVars in g.flat that are uncorrelated from other gvar.GVars in g.flat, and bcov contains the standard deviations for those gvar.GVars. Thus the full covariance matrix is reconstructed using the following code:

import numpy as np
import gvar as gv
cov = np.zeros((len(g.flat), len(g.flat)), float)
blocks = gv.evalcov_blocks(g, compress=True)
# uncorrelated pieces are diagonal
idx, sdev = blocks[0]
cov[idx, idx] = sdev ** 2
# correlated pieces
for idx, bcov in blocks[1:]:
    cov[idx[:, None], idx] = bcov

The code with compress=True should be slightly faster if there are many uncorrelated gvar.GVars.

It is easy to create an array of gvar.GVars having the covariance matrix from g.flat: for example,

import numpy as np
import gvar as gv
new_g = np.empty(len(g.flat), dtype=object)
for idx, bcov in evalcov_blocks(g):
    new_g[idx] = gv.gvar(new_g[idx], bcov)

creates an array of gvar.GVars with zero mean and the same covariance matrix as g.flat. This works with either value for argument compress.

Parameters:
  • g – A gvar.GVar, an array of gvar.GVars, or a dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

  • compress (bool) – Setting compress=True collects all of the uncorrelated gvar.GVars in g.flat into the first element of the returned list (see above). Default is False.

gvar.evalcorr(g)

Compute correlation matrix for elements of array/dictionary g.

If g is an array of gvar.GVars, evalcorr returns the correlation matrix as an array with shape g.shape+g.shape. If g is a dictionary whose values are gvar.GVars or arrays of gvar.GVars, the result is a doubly-indexed dictionary where corr[k1,k2] is the correlation for g[k1] and g[k2].

The correlation matrix is related to the covariance matrix by:

corr[i,j] = cov[i,j] / (cov[i,i] * cov[j,j]) ** 0.5
gvar.corr(g1, g2)

Correlation between gvar.GVars g1 and g2.

gvar.uncorrelated(g1, g2)

Return True if gvar.GVars in g1 uncorrelated with those in g2.

g1 and g2 can be gvar.GVars, arrays of gvar.GVars, or dictionaries containing gvar.GVars or arrays of gvar.GVars. Returns True if either of g1 or g2 is None.

gvar.chi2(g1, g2, svdcut=1e-12, dof=None)

Compute chi**2 of g1-g2.

chi**2 equals dg.invcov.dg, where ``dg = g1 - g2 and invcov is the inverse of dg’s covariance matrix. It is a measure of how well multi-dimensional Gaussian distributions g1 and g2 (dictionaries or arrays) agree with each other — that is, do their means agree within errors for corresponding elements. The probability is high if chi2(g1,g2)/dof is of order 1 or smaller, where dof is the number of degrees of freedom being compared.

Usually g1 and g2 are dictionaries with the same keys, where g1[k] and g2[k] are gvar.GVars or arrays of gvar.GVars having the same shape. Alternatively g1 and g2 can be gvar.GVars, or arrays of gvar.GVars having the same shape.

One of g1 or g2 can contain numbers instead of gvar.GVars, in which case chi**2 is a measure of the likelihood that the numbers came from the distribution specified by the other argument.

One or the other of g1 or g2 can be missing keys, or missing elements from arrays. Only the parts of g1 and g2 that overlap are used. Also setting g2=None is equivalent to replacing its elements by zeros.

A typical application tests whether distribution g1 and g2 are consistent with each other (within errors): the code

>>> chi2 = gvar.chi2(g1, g2)
>>> print(gvar.fmt_chi2(chi2))
chi2/dof = 1.1 [100]    Q = 0.26

shows that the distributions are reasonably consistent. The number of degrees of freedom (here 100) in this example equals the number of variables from g1 and g2 that are compared; this can be changed using the dof argument.

Parameters:
  • g1gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g1 specifies a sample from a distribution.

  • g2gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g2 specifies a sample from a distribution. Setting g2=None (default) is equivalent to setting its elements all to zero.

  • eps (float) – If positive, singularities in the correlation matrix for g1-g2 are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).

  • svdcut (float) – If nonzero, singularities in the correlation matrix for g1-g2 are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.

  • dof (int or None) – Number of independent degrees of freedom in g1-g2. This is set equal to the number of elements in g1-g2 if dof=None is set. This parameter affects the Q value assigned to the chi**2.

Returns:

The return value is the chi**2. Extra attributes attached to this number give additional information:

  • dof — Number of degrees of freedom (that is, the number

    of variables compared if not specified).

  • Q — The probability that the chi**2 could have

    been larger, by chance, even if g1 and g2 agree. Values smaller than 0.1 or so suggest that they do not agree. Also called the p-value.

  • residuals — Decomposition of the chi**2 in terms of the

    independent modes of the correlation matrix: chi**2 = sum(residuals**2).

gvar.qqplot(g1, g2, plot=None, svdcut=1e-12, dof=None)

QQ plot g1-g2.

The QQ plot compares the distribution of the means of Gaussian distribution g1-g2 to that of random samples from the distribution. The resulting plot will approximate a straight line along the diagonal of the plot (dashed black line) if the means have a Gaussian distribution about zero with the correct standard deviations.

Usually g1 and g2 are dictionaries with the same keys, where g1[k] and g2[k] are gvar.GVars or arrays of gvar.GVars having the same shape. Alternatively g1 and g2 can be gvar.GVars, or arrays of gvar.GVars having the same shape.

One of g1 or g2 can contain numbers instead of gvar.GVars.

One or the other of g1 or g2 can be missing keys, or missing elements from arrays. Only the parts of g1 and g2 that overlap are used. Also setting g2=None is equivalent to replacing its elements by zeros.

In a typical application, the plot displayed by

gvar.qqplot(gsample, g).show()

tests whether gsample is likely a random sample from a multi-dimensional Gaussian distribtuion g. It is consistent with being a random sample if the QQ plot is a straight line through the origin with unit slope. The is most useful when there are many variables (g.size >> 1).

Parameters:
  • g1gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g1 specifies a sample from a distribution.

  • g2gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g2 specifies a sample from a distribution. Setting g2=None (default) is equivalent to setting its elements all to zero.

  • plot – a matplotlib plotter. If None (default), uses matplotlib.pyplot.

  • eps (float or None) – eps used by gvar.regulate() when inverting the covariance matrix. Ignored if svdcut is set (and not None).

  • svdcut (float) – SVD cut used when inverting the covariance matrix of g1-g2. See documentation for gvar.svd() for more information. Default is svdcut=1e-12.

  • dof (int or None) – Number of independent degrees of freedom in g1-g2. This is set equal to the number of elements in g1-g2 if dof=None is set. This parameter affects the Q value assigned to the chi**2.

Returns:

Plotter plot.

This method requires the scipy and matplotlib modules.

gvar.fmt_chi2(f)

Return string containing chi**2/dof, dof and Q from f.

Assumes f has attributes chi2, dof and Q. The logarithm of the Bayes factor will also be printed if f has attribute logGBF.

gvar.GVars are compared by:

gvar.equivalent(g1, g2, rtol=1e-10, atol=1e-10)

Determine whether g1 and g2 contain equivalent gvar.GVars.

Compares sums and differences of gvar.GVars stored in g1 and g2 to see if they agree within tolerances. Operationally, agreement means that:

abs(diff) < abs(summ) / 2 * rtol + atol

where diff and summ are the difference and sum of the mean values (g.mean) or derivatives (g.der) associated with each pair of gvar.GVars.

gvar.GVars that are equivalent are effectively interchangeable with respect to both their means and also their covariances with any other gvar.GVar (including ones not in g1 and g2).

g1 and g2 can be individual gvar.GVars or arrays of gvar.GVars or dictionaries whose values are gvar.GVars and/or arrays of gvar.GVars. Comparisons are made only for shared keys when they are dictionaries. Array dimensions must match between g1 and g2, but the shapes can be different; comparisons are made for the parts of the arrays that overlap in shape.

Parameters:
  • g1 – A gvar.GVar or an array of gvar.GVars or a dictionary of gvar.GVars and/or arrays of gvar.GVars.

  • g2 – A gvar.GVar or an array of gvar.GVars or a dictionary of gvar.GVars and/or arrays of gvar.GVars.

  • rtol – Relative tolerance with which mean values and derivatives must agree with each other. Default is 1e-10.

  • atol – Absolute tolerance within which mean values and derivatives must agree with each other. Default is 1e-10.

gvar.GVars can be stored (serialized) and retrieved from files (or strings) using:

gvar.dump(g, outputfile=None, add_dependencies=False, **kargs)

Write a representation of g to file outputfile.

Object g consists of (possibly nested) dictionaries, deques, lists, numpy.arrays, and/or tuples that contain gvar.GVars and other types of data. Calling gvar.dump(g, 'filename') writes a serialized representation of g into the file named filename. Object g can be recovered later using gvar.load('filename').

Typical usage is:

# create file xxx.pickle containing GVars in g
import gvar as gv
gv.dump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.load('xxx.pickle')

gvar.dump() is an alternative to pickle.dump() which, unlike the latter, works when there are gvar.GVars in g. In particular it preserves correlations between different gvar.GVars, as well as relationships (i.e., derivatives) between derived gvar.GVars and primary gvar.GVars in g. gvar.dump() uses gvar.remove_gvars() to search (recursively) for the gvar.GVars in g.

The partial variances for derived gvar.GVars in g coming from primary gvar.GVars in g are preserved by gvar.dump(). (These are used, for example, to calculate error budgets.) Partial variances coming from derived (rather than primary) gvar.GVars, however, are unreliable unless every primary gvar.GVar that contributes to the covariances in g is included in g. To guarantee that this is the case, set keyword add_dependencies=True. This can greatly increase the size of the output file, and so should only be done if error budgets, etc. are needed. (Also the cost of evaluating covariance matrices for the reconstituted gvar.GVars is increased if there are large numbers of primary gvar.GVars.) The default is add_dependencies=False.

Parameters:
  • g – Object to be serialized. Consists of (possibly nested) dictionaries, deques, lists, numpy.arrays, and/or tuples that contain gvar.GVars and other types of data.

  • outputfile – The name of a file or a file object in which the serialized g is stored. If outputfile=None (default), the results are written to a BytesIO.

  • add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.

  • kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle).

Returns:

The BytesIO object containing the serialized data, if outputfile=None; otherwise outputfile.

gvar.dumps(g, add_dependencies=False, **kargs)

Return a serialized representation of g.

This function is shorthand for:

gvar.dump(g).getvalue()

Typical usage is:

# create bytes object containing GVars in g
import gvar as gv
gbytes = gv.dumps(g)

# convert bytes object back into g
new_g = gv.loads(gbytes)
Parameters:
  • g – Object to be serialized. Consists of (possibly nested) dictionaries, deques, lists, numpy.arrays, and/or tuples that contain gvar.GVars and other types of data.

  • add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.

  • kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle).

Returns:

A bytes object containing a serialized representation of object g.

gvar.load(inputfile, **kargs)

Read and return object serialized in inputfile by gvar.dump().

This function recovers data serialized with gvar.dump(). Typical usage is:

# create file xxx.pickle containing data in g
import gvar as gv
gv.dump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.gload('xxx.pickle')

Note that the format used by gvar.dump() changed with version 11.0 of gvar. gvar.load() will attempt to read the old formats if they are encountered, but old data should be converted to the new format (by reading it in with load() and them writing it out again with dump()).

Parameters:
  • inputfile – The name of the file or a file object in which the serialized gvar.GVars are stored (created by gvar.dump()).

  • kargs (dict) – Additional arguments, if any, that are passed to the underlying de-serializer (pickle).

Returns:

The reconstructed data object.

gvar.loads(inputbytes, **kargs)

Read and return object serialized in inputbytes by gvar.dumps().

This function recovers data serialized with gvar.dumps(). It is shorthand for:

gvar.load(BytesIO(inputbytes))

Typical usage is:

# create bytes object containing data in g
import gvar as gv
gbytes = gv.dumps(g)

# recreate g from bytes object gbytes
new_g = gv.gloads(gbytes)
Parameters:
  • inputbytes (bytes) – Bytes object created by gvar.dumps().

  • kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle).

Returns:

The reconstructed data.

gvar.gdump(g, outputfile=None, method=None, add_dependencies=False, **kargs)

Write a representation of g to file outputfile.

Object g is a gvar.GVar, an array of gvar.GVars (any shape), or a dictionary whose values are gvar.GVars and/or arrays of gvar.GVars; it describes a general (multi-dimensional) Gaussian distribution. Calling gvar.gdump(g, 'filename') writes a serialized representation of g into the file named filename. The Gaussian distribution described by g can be recovered later using gvar.load('filename'). Correlations between different gvar.GVars in g are preserved, as are relationships (i.e., derivatives) between derived gvar.GVars and any primary gvar.GVars in g.

gvar.gdump() differs from gvar.dump() in that the elements in g must all be gvar.GVars for the former, whereas gvar.GVars may be mixed in with other data types for the latter. The structure of g is also more restricted for gvar.gdump(), but it is typically faster than gvar.dump().

Typical usage is:

# create file xxx.pickle containing GVars in g
import gvar as gv
gv.gdump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.gload('xxx.pickle')

Object g is serialized using one of the json (text file) or pickle (binary file) Python modules. Method 'json' can produce smaller files, but method 'pickle' can be significantly faster. 'json' is more secure than 'pickle', if that is an issue.

json can have trouble with dictionaries whose keys are not strings. A workaround is used here that succeeds provided eval(repr(k)) == k for every key k. This works for a wide variety of standard key types including strings, integers, and tuples of strings and/or integers. Try pickle if the workaround fails.

The partial variances for derived gvar.GVars in g coming from primary gvar.GVars in g are preserved by gvar.gdump(). (These are used, for example, to calculate error budgets.) Partial variances coming from derived (rather than primary) gvar.GVars, however, are unreliable unless every primary gvar.GVar that contributes to the covariances in g is included in g. To guarantee that this is the case set keyword add_dependencies=True. This can greatly increase the size of the output file, and so should only be done if error budgets, etc. are needed. (Also the cost of evaluating covariance matrices for the reconstituted gvar.GVars is increased if there are large numbers of primary gvar.GVars.) The default is add_dependencies=False.

Parameters:
  • g – A gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

  • outputfile – The name of a file or a file object in which the serialized gvar.GVars are stored. If outputfile=None (default), the results are written to a StringIO (for method='json') or BytesIO (for method='pickle') object.

  • method (str) – Serialization method, which should be either 'json' or 'pickle'. If method=None (default), the method is inferred from the filename’s extension: .json for 'json'; and .pickle or .pkl or .p for 'pickle'. If that fails the method defaults to 'json'.

  • add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.

  • kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle or json).

Returns:

if outputfile=None, the StringIO or BytesIO object containing the serialized data is returned. Otherwise outputfile is returned.

gvar.gdumps(g, method='json', add_dependencies=False)

Return a serialized representation of g.

This function is shorthand for:

gvar.gdump(g, method='json').getvalue()

Typical usage is:

# create string containing GVars in g
import gvar as gv
gstr = gv.gdumps(g)

# convert string back into GVars
new_g = gv.gloads(gstr)
Parameters:
  • g – A gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

  • method – Serialization method, which should be either 'json' or 'pickle'. Default is 'json'.

  • add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.

Returns:

A string or bytes object containing a serialized representation of object g.

gvar.gload(inputfile, method=None, **kargs)

Read and return gvar.GVars that are serialized in inputfile.

This function recovers gvar.GVars serialized with gvar.gdump(). Typical usage is:

# create file xxx.pickle containing GVars in g
import gvar as gv
gv.gdump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.gload('xxx.pickle')
Parameters:
  • inputfile – The name of the file or a file object in which the serialized gvar.GVars are stored (created by gvar.gdump()).

  • method (str or None) – Serialization method, which should be either 'pickle' or 'json'. If method=None (default), the method is inferred from the filename’s extension: .json for 'json'; and .pickle or .pkl or .p for 'pickle'. If that fails the method defaults to 'json'. Argument method is ignored if inputfile is either a StringIO or BytesIO object, with the method being set to 'json' or 'pickle', respectively.

  • kargs (dict) – Additional arguments, if any, that are passed to the underlying de-serializer (pickle or json).

Returns:

The reconstructed gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

gvar.gloads(inputstring)

Return gvar.GVars that are serialized in inputstr.

This function recovers the gvar.GVars serialized with gvar.gdumps(g)(). It is shorthand for:

gvar.gload(StringIO(inputstr), method='json')

Typical usage is:

# create string containing GVars in g
import gvar as gv
gstr = gv.gdumps(g)

# convert string back into GVars
new_g = gv.gloads(gstr)
Parameters:

inputstr (str or bytes) – String or bytes object created by gvar.dumps().

Returns:

The reconstructed gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

gvar.remove_gvars(g, gvlist)

Remove gvar.GVars from structure g, replacing them with gvar.GVarRefs and collecting them in gvlist.

remove_gvars() searches object g (recursively) for gvar.GVars and replaces them with GVarRef objects. The gvar.GVars are collected in list gvlist. g can be a standard container object (dictionary, list, etc.) or an object obj whose contents can be accessed through obj.__dict__ or obj.__slots__. An object that defines method obj._remove_gvars is replaced by obj._remove_gvars(gvlist).

The gvar.GVars are restored using gvar.distribute_gvars: e.g.,

gvlist = []
new_g = gvar.distribute_gvars(gvar.remove_gvars(g, gvlist), gvlist)

creates a copy new_g of g with the gvar.GVars restored (and preserving correlations between different gvar.GVars). gvlist contains copies of the restored gvar.GVars.

The default treatment of a class instance obj without an obj._remove_gvars(gvlist) method is equivalent to adding the following method to the object’s class

def _remove_gvars(self, gvlist):
    tmp = copy.copy(self)
    tmp.__dict__ = gvar.remove_gvars(tmp.__dict__, gvlist)
    return tmp

assuming the class does not use __slots__. A class that has method obj._remove_gvars(gvlist) should have a corresponding method obj._distribute_gvars(gvlist), for use by gvar.distribute_gvars(). The default behavior when this method is undefined is equivalent to:

def _distribute_gvars(self, gvlist):
    self.__dict__ = gvar.distribute_gvars(self.__dict__, gvlist)

There are analogous defaults for classes that uses __slots__.

These routines are used by gvar.dump and gvar.load to facilitate pickling of objects containing gvar.GVars.

Parameters:
  • g – Object containing gvar.GVars.

  • gvlistgvar.GVars removed from g are appended to list gvlist.

Returns:

Copy of object g with gvar.GVars replaced by GVarRefs. The gvar.GVars are appended to gvlist.

gvar.distribute_gvars(g, gvlist)

Distribute gvar.GVars from gvlist in structure g, replacing gvar.GVarRefs.

distribute_gvars() undoes what remove_gvars() does to g. See discussion of remove_gvars() for more details.

Parameters:
Returns:

Object g with GVarRefs replaced by corresponding gvar.GVars from list gvlist.

class gvar.GVarRef(g, gvlist)

Placeholder for a gvar.GVar, used by gvar.remove_gvars().

Typical usage, when g is a dictionary containing gvar.GVars:

import gvar as gv

gvlist = []
for k in g:
    if isinstance(g[k], gv.GVar):
        g[k] = gv.GVarRef(g[k], gvlist)

where at the end gvlist contains the gvar.GVars removed from g.

The gvar.GVars are restored using:

for k in g:
    if isinstance(g[k], gv.GVarRef):
        g[k] = g[k](gvlist)

where the gvar.GVars are drawn from list gvlist.

gvar.disassemble(g)

Disassemble collection g of gvar.GVars.

Parameters:

g (dict, array, or gvar.GVar) – Collection of gvar.GVars to be disassembled.

gvar.reassemble(data, cov=gvar.gvar.cov)

Convert data from gvar.disassemble() back into gvar.GVars.

Parameters:

gvar.GVars contain information about derivatives with respect to the primary gvar.GVars from which they were constructed. This information can be extracted using:

gvar.deriv(g, x)

Compute partial derivatives of gvar.GVars in g w.r.t. primary gvar.GVars in x.

Primary gvar.GVars are those created using gvar.gvar() (or any function of such a variable); see gvar.is_primary().

Parameters:
Returns:

g but with each gvar.GVar in it replaced by an array of derivatives with respect to the primary gvar.GVars in x. The derivatives have the same shape as x.

The following functions are used to generate random arrays or dictionaries from the distribution defined by array (or dictionary) g of gvar.GVars. The random numbers incorporate any correlations implied by the gs.

gvar.raniter(g, n=None, svdcut=1e-12)

Return iterator for random samples from distribution g

The Gaussian variables (gvar.GVar objects) in array (or dictionary) g collectively define a multidimensional gaussian distribution. The iterator defined by raniter() generates an array (or dictionary) containing random numbers drawn from that distribution, with correlations intact.

The layout for the result is the same as for g. So an array of the same shape is returned if g is an array. When g is a dictionary, individual entries g[k] may be gvar.GVars or arrays of gvar.GVars, with arbitrary shapes.

raniter() also works when g is a single gvar.GVar, in which case the resulting iterator returns random numbers drawn from the distribution specified by g.

Parameters:
  • g – An array (or dictionary) of objects of type gvar.GVar; or a gvar.GVar.

  • n (int or None) – Maximum number of random iterations. Setting n=None (the default) implies there is no maximum number.

  • eps (float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).

  • svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.

  • uniform (float or None) – Replace Gaussian distribution specified by g with a uniform distribution covering the interval [-uniform, uniform] times the standard deviation centered on the mean (along each principal axis of the error ellipse). Ignored if None (default).

Returns:

An iterator that returns random arrays or dictionaries with the same shape as g drawn from the Gaussian distribution defined by g.

gvar.sample(g, svdcut=1e-12)

Generate random sample from distribution g.

Equivalent to next(gvar.raniter(g, svdcut=svdcut, eps=eps)).

Parameters:
  • g – An array or dictionary of objects of type gvar.GVar; or a gvar.GVar.

  • eps (float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).

  • svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.

Returns:

A random array or dictionary, with the same shape as g, drawn from the Gaussian distribution defined by g.

gvar.bootstrap_iter(g, n=None, svdcut=1e-12)

Return iterator for bootstrap copies of g.

The gaussian variables (gvar.GVar objects) in array (or dictionary) g collectively define a multidimensional gaussian distribution. The iterator created by bootstrap_iter() generates an array (or dictionary) of new gvar.GVars whose covariance matrix is the same as g’s but whose means are drawn at random from the original g distribution. This is a bootstrap copy of the original distribution. Each iteration of the iterator has different means (but the same covariance matrix).

bootstrap_iter() also works when g is a single gvar.GVar, in which case the resulting iterator returns bootstrap copies of the g.

Parameters:
  • g – An array (or dictionary) of objects of type gvar.GVar.

  • n – Maximum number of random iterations. Setting n=None (the default) implies there is no maximum number.

  • eps (float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).

  • svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.

Returns:

An iterator that returns bootstrap copies of g.

This function is used to seed the random number generator used by gvar:

gvar.ranseed(a)

Seed random number generators with tuple seed.

Argument seed is an integer or a tuple of integers that is used to seed the random number generators used by numpy and random (and therefore by gvar). Reusing the same seed results in the same set of random numbers.

ranseed generates its own seed when called without an argument or with seed=None. This seed is stored in ranseed.seed and also returned by the function. The seed can be used to regenerate the same set of random numbers at a later time.

Parameters:

seed (int, tuple, or None) – Seed for generator. Generates a random tuple if None.

Returns:

The seed used to reseed the generator.

The following two functions that are useful for tabulating results and for analyzing where the errors in a gvar.GVar constructed from other gvar.GVars come from:

gvar.fmt_errorbudget(outputs, inputs, ndecimal=2, percent=True, verify=False, colwidth=10)

Tabulate error budget for outputs[ko] due to inputs[ki].

For each output outputs[ko], fmt_errorbudget computes the contributions to outputs[ko]’s standard deviation coming from the gvar.GVars collected in inputs[ki]. This is done for each key combination (ko,ki) and the results are tabulated with columns and rows labeled by ko and ki, respectively. If a gvar.GVar in inputs[ki] is correlated with other gvar.GVars, the contribution from the others is included in the ki contribution as well (since contributions from correlated gvar.GVars cannot be distinguished). The table is returned as a string.

Parameters:
  • outputs – Dictionary of gvar.GVars for which an error budget is computed.

  • inputs – Dictionary of: gvar.GVars, arrays/dictionaries of gvar.GVars, or lists of gvar.GVars and/or arrays/dictionaries of gvar.GVars. fmt_errorbudget tabulates the parts of the standard deviations of each outputs[ko] due to each inputs[ki].

  • ndecimal (int) – Number of decimal places displayed in table.

  • percent (bool) – Tabulate % errors if percent is True; otherwise tabulate the errors themselves.

  • colwidth (int) – Width of each column. This is set automatically, to accommodate label widths, if colwidth=None (default).

  • verify (bool) – If True, a warning is issued if: 1) different inputs are correlated (and therefore double count errors); or 2) the sum (in quadrature) of partial errors is not equal to the total error to within 0.1% of the error (and the error budget is incomplete or overcomplete). No checking is done if verify==False (default).

Returns:

A table (str) containing the error budget. Output variables are labeled by the keys in outputs (columns); sources of uncertainty are labeled by the keys in inputs (rows).

gvar.fmt_values(outputs, ndecimal=None)

Tabulate gvar.GVars in outputs.

Parameters:
  • outputs – A dictionary of gvar.GVar objects.

  • ndecimal (int) – Format values v using v.fmt(ndecimal).

Returns:

A table (str) containing values and standard deviations for variables in outputs, labeled by the keys in outputs.

The following functions are used to make correlation matrices less singular:

gvar.regulate(g, svdcut=1e-12, eps=None, wgts=False, noise=False)

Regulate singularities in the correlation matrix of the gvar.GVars in g.

Standard usage is, for example,

import gvar as gv

# with eps cutoff
gmod = gv.regulate(g, eps=1e-6)

# or with svd cutoff
gmod = gv.regulate(g, svdcut=1e-6)

where g is an array of gvar.GVars or a dictionary containing gvar.GVars and/or arrays of gvar.GVars. The result gmod is a copy of g whose gvar.GVars have been modified to make their correlation matrix less singular than that of the original g. Parameter eps or svdcut specifies the extent of the modification.

The modification of g is implemented by adding a set of gvar.GVars with zero means:

gmod = g + gmod.correction

where gmod.correction is an array/dictionary containing the gvar.GVars. When g is an array and eps is specified, for example, g[i] is modified by adding:

gmod.correction[i] = gv.gvar(0, (eps * norm) ** 0.5 * g[i].sdev)

where norm = numpy.linalg.norm(corr, numpy.inf) is an estimate of the largest eigenvalue of the correlation matrix corr. This correction typically has a negligible effect on the final standard deviations (relative order eps*norm/2), but can make noticeable changes in the correlation matrix for highly correlated data. Strong correlations lead to small eigenvalues in the correlation matrix, and these are significantly increased by the cutoff, which in effect replaces each eigenvalue eig by eig + eps * norm. Only members of g that are correlated with other members of g are modified; uncorrelated members are left unchanged (i.e., the correction is gv.gvar(0,0)).

Adding gmod.correction to g increases the uncertainties in g but does not affect random fluctuations in its mean values. If parameter noise=True, random noise is included in gmod.correction,

gmod.correction += gv.sample(gmod.correction),

before it is added to g. This adds random fluctuations to the means in gmod that are commensurate with the additions to the uncertainties. This is important, for example, when interpreting a chi**2 involving gmod, since chi**2 compares fluctuations in the means with the uncertainties.

Specifying svdcut rather than eps regulates the correlation matrix using gvar.svd(): calling gvar.regulate(g, svdcut=1e-4) is the same as calling gvar.svd(g, svdcut=1e-4). When svdcut>=0, each eigenvalue eig of the correlation matrix is replaced by max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue. See the gvar.svd() documentation for more details. SVD cuts are numerically more robust, but more costly, especially for large systems.

There are a variety of reasons for regulating the correlation matrices. Roundoff error, for example, can make the smallest eigenvalues unreliable and destabilize calulations involving the correlation matrix. A similar situation arises when the correlation matrix is estimated from a small set of random data. This can result in small eigenvalues that are badly underestimated.

There is an additional parameter wgts in gvar.regulate() whose default value is False. Setting wgts=1 or wgts=-1 instead causes gvar.regulate() to return a tuple (gmod, i_wgts) where gmod is the modified copy of g, and i_wgts contains a decomposition of either the modified covariance matrix, when wgts=1, or the inverse of the modified covariance matrix, when wgts=-1. The covariance matrix is decomposed into non-overlapping sub-matrices, with i_wgts[0] containing the contributions from all 1x1 sub-matrices. Typical usage, for example to compute the inverse inv_cov of the gmod covariance matrix, is:

gmod, i_wgts = gvar.regulate(g, eps=1e-6, wgts=-1)
inv_cov = numpy.zeros((gmod.size, gmod.size))
# 1x1 sub-matrices
i, wgts = i_wgts[0]
inv_cov[i, i] = wgts ** 2
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    inv_cov[i[:, None], i] = wgts.T @ wgts

Similarly, we can compute the expectation value, u.dot(inv_cov.dot(v)), between two vectors (numpy arrays) using:

result = 0.0
i, wgts = i_wgts[0]
# 1x1 sub-matrices
if len(i) > 0:
    result += numpy.sum((u[i] * wgts) * (v[i] * wgts))
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))

where result is the desired expectation value.

These decompositions are useful for least squares fitting and simulating correlated data. A Cholesky decomposition is used when eps is specified, while an SVD decomposition is used with svdcut.

Parameters:
  • g – An array of gvar.GVars or a dicitionary whose values are gvar.GVars and/or arrays of gvar.GVars.

  • eps (float) – The diagonal elements of the g covariance matrix are multiplied by 1 + eps*norm where norm is the norm of the correlation matrix, numpy.linalg.norm(corr, numpy.inf). Ignored if svdcut is specified (and not None).

  • svdcut (float) – If positive, replaces eigenvalues eig of the correlation matrix with max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue; if negative, discards eigenmodes with eigenvalues smaller than |svdcut| * max_eig. Note |svdcut| < 1. Default is 1e-12.

  • wgts – Setting wgts=1 causes gvar.regulate() to compute and return a decomposition of the covariance matrix of the modified gvar.GVars, gmod. Setting wgts=-1 results in a decomposition of the inverse of the covariance matrix. The default value is False, in which case only gmod is returned.

  • noise (bool) – If True, noise is added to the correction (see above). Default is False.

Returns:

A copy gmod of g with the modified correlation matrix. If wgts is not False, a tuple (g, i_wgts) is returned where i_wgts contains a decomposition of the gmod covariance matrix or its inverse (see above).

Additional information is stored in gmod:

gmod.correction

Array or dictionary containing the SVD corrections added to g to create gmod: gmod = g + gmod.correction.

gmod.eps

eps used to create gmod if set (otherwise None).

gmod.svdcut

svdcut used to create gmod if set (otherwise None).

gmod.dof

Number of degrees of freedom in gmod.

gmod.nmod

Number of members of gmod modified by regulation.

gmod.nblocks

A dictionary where gmod.nblocks[s] contains the number of block-diagonal s-by-s sub-matrices in the correlation matrix. Useful for debugging.

gmod.logdet

Logarithm of the determinant of the covariance matrix after eps regulation is applied.

gvar.svd(g, svdcut=1e-12, wgts=False, noise=False)

Apply SVD cuts to collection of gvar.GVars in g.

Standard usage is, for example,

import gvar as gv
...
gmod = gv.svd(g, svdcut=1e-4)

where g is an array of gvar.GVars or a dictionary containing gvar.GVars and/or arrays of gvar.GVars. When svdcut>0, gmod is a copy of g whose gvar.GVars have been modified to make their correlation matrix less singular than that of the original g: each eigenvalue eig of the correlation matrix is replaced by max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue. This SVD cut, which is applied separately to each block-diagonal sub-matrix of the correlation matrix, increases the variance of the eigenmodes with eigenvalues smaller than svdcut * max_eig.

The modification of g’s covariance matrix is implemented by adding (to g) a set of gvar.GVars with zero means:

gmod = g + gmod.correction

where gmod.correction is an array/dictionary containing the gvar.GVars.

Adding gmod.correction to g increases the uncertainties in g but does not affect random fluctuations in the mean values. If parameter noise=True, random noise is included in gmod.correction,

gmod.correction += gv.sample(gmod.correction),

before it is added to g. This adds random fluctuations to the means in gmod that are commensurate with the additions to the uncertainties. This is important, for example, when interpreting a chi**2 involving gmod, since chi**2 tests whether fluctuations in the means are consistent with the uncertainties.

When svdcut is negative, eigenmodes of the correlation matrix whose eigenvalues are smaller than |svdcut| * max_eig are dropped from the new matrix and the corresponding components of g are zeroed out (that is, replaced by 0(0)) in gmod.

There is an additional parameter wgts in gvar.svd() whose default value is False. Setting wgts=1 or wgts=-1 instead causes gvar.svd() to return a tuple (gmod, i_wgts) where gmod is the modified copy of g, and i_wgts contains an SVD decomposition of either the modified covariance matrix, when wgts=1, or the inverse of the modified covariance matrix, when wgts=-1. The covariance matrix is decomposed into non-overlapping sub-matrices, with i_wgts[0] containing the contributions from all 1x1 sub-matrices. Typical usage, for example to compute the inverse inv_cov of the gmod covariance matrix, is:

gmod, i_wgts = gvar.svd(g, svdcut=1e-6, wgts=-1)
inv_cov = numpy.zeros((gmod.size, gmod.size))
# 1x1 sub-matrices
i, wgts = i_wgts[0]
inv_cov[i, i] = wgts ** 2
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    inv_cov[i[:, None], i] = wgts.T @ wgts

Similarly, we can compute the expectation value, u.dot(inv_cov.dot(v)), between two vectors (numpy arrays) using:

result = 0.0
i, wgts = i_wgts[0]
# 1x1 sub-matrices
if len(i) > 0:
    result += numpy.sum((u[i] * wgts) * (v[i] * wgts))
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))

where result is the desired expectation value. The SVD decompositions are useful for least squares fitting and simulating correlated data.

Parameters:
  • g – An array of gvar.GVars or a dicitionary whose values are gvar.GVars and/or arrays of gvar.GVars.

  • svdcut (None or float) – If positive, replace eigenvalues eig of the correlation matrix with max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than |svdcut| * max_eig. Note |svdcut| < 1. Default is 1e-12.

  • wgts – Setting wgts=1 causes gvar.svd() to compute and return a spectral decomposition of the covariance matrix of the modified gvar.GVars, gmod. Setting wgts=-1 results in a decomposition of the inverse of the covariance matrix. The default value is False, in which case only gmod is returned.

  • noise – If True, noise is added to the SVD correction (see above). Default is False.

Returns:

A copy gmod of g whose correlation matrix is modified by SVD cuts. If wgts is not False, a tuple (g, i_wgts) is returned where i_wgts contains an SVD decomposition of gmod’s covariance matrix or its inverse.

Data from the SVD analysis is stored in gmod:

gmod.svdcut

SVD cut used to create gmod.

gmod.dof

Number of independent degrees of freedom left after the SVD cut. This is the same as the number initially unless svdcut < 0 in which case it may be smaller.

gmod.nmod

Number of modes whose eignevalue was modified by the SVD cut.

gmod.nblocks

A dictionary where gmod.nblocks[s] contains the number of block-diagonal s-by-s sub-matrices in the correlation matrix.

gmod.eigen_range

Ratio of the smallest to largest eigenvalue before SVD cuts are applied (but after rescaling).

gmod.logdet

Logarithm of the determinant of the covariance matrix after SVD cuts are applied (excluding any omitted modes when svdcut < 0 and any diagonal zero modes).

gmod.correction

Array or dictionary containing the SVD corrections added to g to create gmod: gmod = g + gmod.correction.

This function is useful when the correlation matrix is singular or almost singular, and its inverse is needed (as in curve fitting).

The following function can be used to rebuild collections of gvar.GVars, ignoring all correlations with other variables. It can also be used to introduce correlations between uncorrelated variables.

gvar.rebuild(g, corr=0.0, gvar=gvar.gvar)

Rebuild g stripping correlations with variables not in g.

g is either an array of gvar.GVars or a dictionary containing gvar.GVars and/or arrays of gvar.GVars. rebuild(g) creates a new collection gvar.GVars with the same layout, means and covariance matrix as those in g, but discarding all correlations with variables not in g.

If corr is nonzero, rebuild will introduce correlations wherever there aren’t any using

cov[i,j] -> corr * sqrt(cov[i,i]*cov[j,j])

wherever cov[i,j]==0.0 initially. Positive values for corr introduce positive correlations, negative values anti-correlations.

Parameter gvar specifies a function for creating new gvar.GVars that replaces gvar.gvar() (the default).

Parameters:
  • g (array or dictionary) – gvar.GVars to be rebuilt.

  • gvar (gvar.GVarFactory or None) – Replacement for gvar.gvar() to use in rebuilding. Default is gvar.gvar().

  • corr (number) – Size of correlations to introduce where none exist initially.

Returns:

Array or dictionary (gvar.BufferDict) of gvar.GVars (same layout as g) where all correlations with variables other than those in g are erased.

The following functions creates new functions that generate gvar.GVars (to replace gvar.gvar()):

gvar.switch_gvar()

Switch gvar.gvar() to new gvar.GVarFactory.

gvar.GVars created from different factory functions (see gvar.gvar_factory()), with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixes gvar.GVars in this way will generate an error message: “incompatible GVars”.

Returns:

New gvar.gvar().

gvar.restore_gvar()

Restore previous gvar.gvar().

gvar.GVars created from different factory functions (see gvar.gvar_factory()), with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixes gvar.GVars in this way will generate an error message: “incompatible GVars”.

Returns:

Previous gvar.gvar().

gvar.gvar_factory(cov=None)

Return new function for creating gvar.GVars (to replace gvar.gvar()).

If cov is specified, it is used as the covariance matrix for new gvar.GVars created by the function returned by gvar_factory(cov). Otherwise a new covariance matrix is created internally.

gvar.GVars created from different factory functions, with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixes gvar.GVars in this way will generate an error message: “incompatible GVars”.

gvar.GVar Objects

The fundamental class for representing Gaussian variables is:

class gvar.GVar

The basic attributes are:

mean

Mean value.

sdev

Standard deviation.

var

Variance.

Two methods allow one to isolate the contributions to the variance or standard deviation coming from other gvar.GVars:

partialvar(*args)

Compute partial variance due to gvar.GVars in args.

This method computes the part of self.var due to the gvar.GVars in args. If args[i] is correlated with other gvar.GVars, the variance coming from these is included in the result as well. (This last convention is necessary because variances associated with correlated gvar.GVars cannot be disentangled into contributions corresponding to each variable separately.)

Parameters:

args[i] – A gvar.GVar or array/dictionary of gvar.GVars contributing to the partial variance.

Returns:

Partial variance due to all of args.

partialsdev(*args)

Compute partial standard deviation due to gvar.GVars in args.

This method computes the part of self.sdev due to the gvar.GVars in args. If args[i] is correlated with other gvar.GVars, the standard deviation coming from these is included in the result as well. (This last convention is necessary because variances associated with correlated gvar.GVars cannot be disentangled into contributions corresponding to each variable separately.)

Parameters:

args[i] (gvar.GVar or array/dictionary of gvar.GVars) – Variables contributing to the partial standard deviation.

Returns:

Partial standard deviation due to args.

Partial derivatives of the gvar.GVar with respect to the independent gvar.GVars from which it was constructed are given by:

deriv(x)

Derivative of self with respest to primary gvar.GVars in x.

All gvar.GVars are constructed from primary gvar.GVars (see gvar.is_primary()). self.deriv(x) returns the partial derivative of self with respect to primary gvar.GVar x, holding all of the other primary gvar.GVars constant.

Parameters:

x – A primary gvar.GVar or an array of primary gvar.GVars.

Returns:

Derivatives of self with respect to the gvar.GVars in x. The result has the same shape as x.

There are two methods for converting self into a string, for printing:

__str__()

Return string representation of self.

The representation is designed to show at least one digit of the mean and two digits of the standard deviation. For cases where mean and standard deviation are not too different in magnitude, the representation is of the form 'mean(sdev)'. When this is not possible, the string has the form 'mean +- sdev'.

fmt(ndecimal=None, sep='')

Convert to string with format: mean(sdev).

Leading zeros in the standard deviation are omitted: for example, 25.67 +- 0.02 becomes 25.67(2). Parameter ndecimal specifies how many digits follow the decimal point in the mean. Parameter sep is a string that is inserted between the mean and the (sdev). If ndecimal is None (default), it is set automatically to the larger of int(2-log10(self.sdev)) or 0; this will display at least two digits of error. Very large or very small numbers are written with exponential notation when ndecimal is None.

Setting ndecimal < 0 returns mean +- sdev.

Two attributes and a method make reference to the original variables from which self is derived:

cov

Underlying covariance matrix (type gvar.smat) shared by all gvar.GVars.

der

Array of derivatives with respect to underlying (original) gvar.GVars.

dotder(v)

Return the dot product of self.der and v.

gvar.BufferDict Objects

gvar.BufferDict objects are ordered dictionaries that are heavily used in gvar’s implementation. They provide the most flexible representation for multi-dimensional Gaussian distributions.

gvar.BufferDict objects differ from ordinary dictionaries in two respects. The first difference is that the dictionary’s values must be scalars or numpy arrays (any shape) of scalars. The scalars can be ordinary integers or floats, but the dictionary was designed especially for gvar.GVars. The dictionary’s values are packed into different parts of a single one-dimensional array or buffer. Items can be added one at a time as in other dictionaries,

>>> import gvar as gv
>>> b = gv.BufferDict()
>>> b['s'] = 0.0
>>> b['v'] = [1., 2.]
>>> b['t'] = [[3., 4.], [5., 6.]]
>>> print(b)
{
    's': 0.0,
    'v': array([1., 2.]),
    't': array([[3., 4.],
                [5., 6.]]),
 }

but the values can also be accessed all at once through the buffer:

>>> print(b.buf)
[0. 1. 2. 3. 4. 5. 6.]

A previous entry can be overwritten, but the size, shape, and type of the data needs to be the same. For example, here setting b['s'] = [10., 20.] would generate a ValueError exception, but b['s'] = 22. is fine.

The second difference between gvar.BufferDicts and other dictionaries is illustrated by the following code:

>>> b = gv.BufferDict()
>>> b['log(a)'] = gv.gvar('1(1)')
>>> print(b)
{'log(a)': 1.0(1.0)}
>>> print(b['a'], b['log(a)'])
2.7(2.7) 1.0(1.0)

Even though 'a' is not a key in the dictionary, b['a'] is still defined: it equals exp(b['log(a)']). This feature is used to provide (limited) support for non-Gaussian distributions. Here gvar.BufferDict b specifies a distribution that is Gaussain for p['log(a)'], and therefore log-normal for p['a']. Thus, for example,

>>> [x['a'] for x in gv.raniter(b, n=4)]
[2.1662650927997817, 2.3350022125310317, 8.732161128765775, 3.578188553455522]

creates a list of four random numbers drawn from a log-normal distribution. Note that 'a' in this example is not a key in dictionary b, even though both b['a'] and b.get('a') return values:

>>> print('a' in b)
False
>>> print('log(a)' in b)
True
>>> print(list(b))
['log(a)']
>>> print(b['a'], b.get('a'))
2.7(2.7) 2.7(2.7)

This functionality is used routinely by other modules (e.g., lsqfit). The supported distributions, and methods for adding new ones, are described in the documentation for gvar.BufferDict.add_distribution(), below.

class gvar.BufferDict(*args, **kargs)

Ordered dictionary whose values are packed into a 1-d buffer (numpy array).

gvar.BufferDicts can be created in the usual way dictionaries are created:

>>> b = BufferDict()
>>> b['a'] = 1
>>> b['b'] = [2.7, 3.2]
>>> print(b)
{'a': 1.0, 'b': array([2.7, 3.2])}
>>> print(b.buf)
[1. 2.7 3.2]
>>> b = BufferDict(a=1, b=[2.7, 3.2])
>>> b = BufferDict([('a',1.), ('b',[2.7, 3.2])])

They can also be created from other dictionaries or gvar.BufferDicts:

>>> c = BufferDict(b)
>>> print(c)
{'a': 1.0, 'b': 2.7}
>>> c = BufferDict(b, keys=['b'])
>>> print(c)
{'b': 2.7}

The keys keyword restricts the copy of a dictionary to entries whose keys are in keys.

The values in a gvar.BufferDict are scalars or arrays of a scalar type (gvar.GVar, float, int, etc.). The data type of the buffer is normally inferred (dynamically) from the data itself:

>>> b = BufferDict(a=1)
>>> print(b, b.dtype)
{'a': 1} int64
>>> b['b'] = 2.7
>>> print(b, b.dtype)
{'a': 1.0, 'b': 2.7} float64

The data type of the gvar.BufferDict’s buffer can be specified when creating the gvar.BufferDict from another dictionary or list by using keyword dtype:

>>> b = BufferDict(dict(a=1.2), dtype=int)
>>> print(b, b.dtype)
{'a': 1} int64
>>> b['b'] = 2.7
>>> print(b, b.dtype)
{'a': 1, 'b': 2} int64

Note in this example that the data type is not changed by subsequent additions to the gvar.BufferDict when the dtype keyword is specified. To create an empty gvar.BufferDict with a specified type use, for example, BufferDict({}, dtype=int). Any data type supported by numpy arrays can be specified.

Some simple arithemetic is allowed between two gvar.BufferDicts, say, g1 and g2 provided they have the same keys and array shapes. So, for example:

>>> a = BufferDict(a=1, b=[2, 3])
>>> b = BufferDict(b=[20., 30.], a=10.)
>>> print(a + b)
    {'a': 11.0, 'b': array([22., 33.])}

Subtraction is also defined, as are multiplication and division by scalars. The corresponding +=, -=, *=, /= operators are supported, as are unary + and -.

Finally a gvar.BufferDict can be cloned from another one using a different buffer (containing different values):

>>> b = BufferDict(a=1., b=2.)
>>> c = BufferDict(b, buf=[10, 20])
>>> print(c)
{'a': 10, 'b': 20}

The new buffer itself (i.e., not a copy) is used in the new gvar.BufferDict if the buffer is a numpy array:

>>> import numpy as np
>>> b = BufferDict(a=1., b=2.)
>>> nbuf = np.array([10, 20])
>>> c = BufferDict(b, buf=nbuf)
>>> c['a'] += 1
>>> print(c)
{'a': 11, 'b': 20}
>>> print(nbuf)
[11 20]

The main attributes are:

size

Size of buffer array.

flat

Buffer array iterator.

dtype

Data type of buffer array elements.

buf

The (1d) buffer array. Allows direct access to the buffer: for example, self.buf[i] = new_val sets the value of the i-th element in the buffer to value new_val. Setting self.buf = nbuf replaces the old buffer by new buffer nbuf. This only works if nbuf is a one-dimensional numpy array having the same length as the old buffer, since nbuf itself is used as the new buffer (not a copy).

shape

Always equal to None. This attribute is included since gvar.BufferDicts share several attributes with numpy arrays to simplify coding that might support either type. Being dictionaries they do not have shapes in the sense of numpy arrays (hence the shape is None).

In addition to standard dictionary methods, the main methods here are:

flatten()

Copy of buffer array.

slice(k)

Return slice/index in self.flat corresponding to key k.

slice_shape(k)

Return tuple (slice/index, shape) corresponding to key k.

has_dictkey(k)

Returns True if self[k] is defined; False otherwise.

Note that k may be a key or it may be related to a related key associated with a non-Gaussian distribution (e.g., 'log(k)'; see gvar.BufferDict.add_distribution() for more information).

all_keys()

Iterator over all keys and implicit keys.

For example, the following code

b = BufferDict()
b['log(x)'] = 1.
for k in b.keys():
    print(k, b[k])
print()
for k in b.all_keys():
    print(k, b[k])

gives the following output:

log(x) 1

log(x) 1
x 2.718281828459045

Here 'x' is not a key in dictionary b but b['x'] is defined implicitly from b['log(x)']. See gvar.BufferDict.add_distribution() for more information.

static add_distribution(name, invfcn)

Add new parameter distribution.

gvar.BufferDicts can be used to represent a restricted class of non-Gaussian distributions. For example, the code

import gvar as gv
gv.BufferDict.add_distribution('log', gv.exp)

enables the use of log-normal distributions for parameters. So defining, for example,

b = gv.BufferDict()
b['log(a)'] = gv.gvar('1(1)')

means that b['a'] has a value (equal to exp(b['log(a)']) even though 'a' is not a key in the dictionary.

The distributions available by default correspond to:

gv.BufferDict.add_distribution('log', gv.exp)
gv.BufferDict.add_distribution('sqrt', gv.square)
gv.BufferDict.add_distribution('erfinv', gv.erf)
Parameters:
  • name (str) – Distributions’ function name. A ValueError is raised if name is already being used for a distribution; the error can be avoided by deleting the old definition first using BufferDict.del_distribution().

  • invfcn (callable) – Inverse of the transformation function.

static del_distribution(name)

Delete gvar.BufferDict distribution name.

Raises a ValueError if name is not the name of an existing distribution.

static has_distribution(name)

True if name has been defined as a distribution; False otherwise.

static uniform(fname, umin, umax, shape=())

Create uniform distribution on interval [umin, umax].

The code

import gvar as gv
b = gv.BufferDict()
b['f(w)'] = gv.BufferDict.uniform('f', 2., 3.)

adds a distribution function f(w) designed so that b['w'] corresponds to a uniform distribution on the interval [2., 3.] (see gvar.BufferDict.add_distribution() for more about distributions).

Parameters:
  • fname (str) – Name of function used in the BufferDict key. Note that names can be reused provided they correspond to the same interval as in previous calls.

  • umin (float) – Minimum value of the uniform distribution.

  • umax (float) – Maximum value of the uniform distribution.

  • shape (tuple) – Shape of array of uniform variables. Default is ().

Returns:

gvar.GVar object corresponding to a uniform distribution.

gvar.SVD Objects

SVD analysis is handled by the following class:

class gvar.SVD(mat, svdcut=None, svdnum=None, compute_delta=False, rescale=False)

SVD decomposition of a pos. sym. matrix.

SVD is a function-class that computes the eigenvalues and eigenvectors of a symmetric matrix mat. Eigenvalues that are small (or negative, because of roundoff) can be eliminated or modified using svd cuts. Typical usage is:

>>> mat = [[1.,.25],[.25,2.]]
>>> s = SVD(mat)
>>> print(s.val)             # eigenvalues
[ 0.94098301  2.05901699]
>>> print(s.vec[0])          # 1st eigenvector (for s.val[0])
[ 0.97324899 -0.22975292]
>>> print(s.vec[1])          # 2nd eigenvector (for s.val[1])
[ 0.22975292  0.97324899]

>>> s = SVD(mat,svdcut=0.6)  # force s.val[i]>=s.val[-1]*0.6
>>> print(s.val)
[ 1.2354102   2.05901699]
>>> print(s.vec[0])          # eigenvector unchanged
[ 0.97324899 -0.22975292]

>>> s = SVD(mat)
>>> w = s.decomp(-1)         # decomposition of inverse of mat
>>> invmat = sum(numpy.outer(wj,wj) for wj in w)
>>> print(numpy.dot(mat,invmat))    # should be unit matrix
[[  1.00000000e+00   2.77555756e-17]
 [  1.66533454e-16   1.00000000e+00]]
Parameters:
  • mat – Positive, symmetric matrix.

  • svdcut – If positive, replace eigenvalues of mat with svdcut*(max eigenvalue); if negative, discard eigenmodes with eigenvalues smaller than svdcut times the maximum eigenvalue.

  • svdnum – If positive, keep only the modes with the largest svdnum eigenvalues; ignore if set to None.

  • compute_delta (bool) – Compute delta (see below) if True; set delta=None otherwise.

  • rescale – Rescale the input matrix to make its diagonal elements equal to +-1.0 before diagonalizing.

The results are accessed using:

val

An ordered array containing the eigenvalues of mat. Note that val[i]<=val[i+1].

vec

Eigenvectors vec[i] corresponding to the eigenvalues val[i].

valmin

Minimum eigenvalue allowed in the modified matrix.

valorig

Eigenvalues of original matrix.

D

The diagonal matrix used to precondition the input matrix if rescale==True. The matrix diagonalized is D M D where M is the input matrix. D is stored as a one-dimensional vector of diagonal elements. D is None if rescale==False.

nmod

The first nmod eigenvalues in self.val were modified by the SVD cut (equals 0 unless svdcut > 0).

eigen_range

Ratio of the smallest to the largest eigenvector in the unconditioned matrix (after rescaling if rescale=True)

delta

A vector of gvars whose means are zero and whose covariance matrix is what was added to mat to condition its eigenvalues. Is None if svdcut<0 or compute_delta==False.

decomp(n)

Vector decomposition of input matrix raised to power n.

Computes vectors w[i] such that

mat**n = sum_i numpy.outer(w[i],w[i])

where mat is the original input matrix to svd. This decomposition cannot be computed if the input matrix was rescaled (rescale=True) except for n=1 and n=-1.

Parameters:

n (number) – Power of input matrix.

Returns:

Array w of vectors.

Requirements

gvar makes heavy use of numpy for array manipulations. It also uses the numpy code for implementing elementary functions (e.g., sin, exp …) in terms of member functions.