`gvar` - Gaussian Random Variables¶

Introduction¶

Objects of type gvar.GVar represent Gaussian random variables, which are specified by means and covariances. They are created using gvar.gvar(): for example,

>>> x = gvar.gvar(10, 3)         # 0 ± 3
>>> y = gvar.gvar(12, 4)         # 2 ± 4
>>> z = x + y                    # 2 ± 5
>>> print(z)
22.0(5.0)
>>> print(z.mean)
22.0
>>> print(z.sdev)
5.0

This module contains a variety of tools for creating and manipulating Gaussian random variables, including:

Creating or modifying

gvar.gvar(...) — create new gvar.GVar

gvar.gvar_function(x, f, dfdx) — create gvar.GVar for function of gvar.GVar.

regulate(g, svdcut|eps) — regulate g’s correlation matrix.

svd(g, svdcut) — SVD regulation of correlation matrix.

correlate(g, corr) — add correlations to gvar.GVars in array/dictionary g.

Examining and manipulating

mean(g) — extract means.

sdev(g) — extract standard deviations.

var(g) — extract variances.

deriv(g, x) — derivatives of g with respect to x.

is_primary(g) — test whether gvar.GVar is primary or derived.

evalcov(g) — compute covariance matrix.

evalcov_blocks(g) — compute diagonal blocks of covariance matrix.

evalcorr(g) — compute correlation matrix.

cov(g1, g2) — covariance between g1 and g2.

corr(g1, g2) — correlation between g1 and g2.

uncorrelated(g1, g2) — test for correlation

chi2(g1, g2) — chi**2 of g1-g2.

filter(g, f, *args, **kargs) — filter gvar.GVars in g through function f.

Sampling

sample(g, nbatch) — random sample from collection of gvar.GVars.

raniter(g, n, nbatch) — iterator for random numbers.

gvar_from_sample(gs) — gvar.GVars from random sample.

bootstrap_iter(g, n) — bootstrap iterator.

ranseed(seed) — seed random number generator.

Formatting

fmt(g) — replace all gvar.GVars in array/dictionary by string representations.

tabulate(g) — tabulate entries in array/dictionary of gvar.GVars.

qqplot(g1, g2) — QQ plot of g1-g2,

equivalent(g1, g2) — gvar.GVars the same in g1 and g2?

fmt_values(g) — list values for printing.

fmt_errorbudget(g) — create error-budget table for printing.

fmt_chi2(f) — format chi**2 information in f as string for printing.

Saving

dump(g, outputfile) — serialize data, including gvar.GVars, from g in file.

dumps(g) — serialize data from g in a bytes object.

load(inputfile) — reconstitute data serialized in a file.

loads(inputbytes) — reconstitute data serialized in a bytes object.

gdump(g, outputfile) — serialize a collection of gvar.GVars in file.

gdumps(g) — serialize a collection of gvar.GVars in a string.

gload(inputfile) — reconstitute a collection of gvar.GVars from a file.

gloads(inputstr) — reconstitute a collection of gvar.GVars from a string.

remove_gvars(g, gvlist) — remove gvar.GVars from g, appending them to gvlist.

distribute_gvars(g, gvlist) — restore gvar.GVars to g from gvlist.

class GVarRef — placeholder for gvar.GVar removed by gvar.remove_gvars().

dependencies(g) — collect primary gvar.GVars contributing to g.

disassemble(g) — low-level routine to disassemble a collection of gvar.GVars.

reassemble(data,cov) — low-level routine to reassemble a collection of gvar.GVars.

Related classes

class BufferDict — ordered dictionary with data buffer.

class SVD — SVD decomposition of symmetric matrix

class PDF — probability density function.

class PDFStatistics — statistical analysis of moments of a random variable.

class PDFHistogram — tool for building PDF histograms.

Analyzing Monte Carlo datasets

dataset.bin_data(data) — bin random sample data.

dataset.avg_data(data) — estimate means of random sample data.

dataset.bootstrap_iter(data,N) — bootstrap random sample data.

class dataset.Dataset — class for collecting random sample data.

Functions¶

The function used to create Gaussian variable objects is:

gvar.gvar(...)¶

Creates one or more new gvar.GVars.

gvar.gvar is an object of type gvar.GVarFactory. Each of the following creates new gvar.GVars:

gvar.gvar(x, xsdev): Returns a gvar.GVar with mean x and standard deviation xsdev. Returns an array of gvar.GVars if x and xsdev are arrays with the same shape; the shape of the result is the same as the shape of x. Returns a gvar.BufferDict if x and xsdev are dictionaries with the same keys and layout; the result has the same keys and layout as x.

gvar.gvar(x, xcov): Returns an array of gvar.GVars with means given by array x and a covariance matrix given by array xcov, where xcov.shape = 2*x.shape; the result has the same shape as x. Returns a gvar.BufferDict if x and xcov are dictionaries, where the keys in xcov are (k1,k2) for any keys k1 and k2 in x. Returns a single gvar.GVar if x is a number and xcov is a one-by-one matrix. The layout for xcov is compatible with that produced by gvar.evalcov() for a single gvar.GVar, an array of gvar.GVars, or a dictionary whose values are gvar.GVars and/or arrays of gvar.GVars. Therefore gvar.gvar(gvar.mean(g), gvar.evalcov(g)) creates gvar.GVars with the same means and covariance matrix as the gvar.GVars in g provided g is a single gvar.GVar, or an array or dictionary of gvar.GVars.

gvar.gvar(x, xcov, verify=True): Same as gvar.gvar(x, xcov) above but checks that the covariance matrix is symmetric and positive definite (which covariance matrices should be). This check is expensive for large matrices and so is not done by default. Note, however, that unpredictable outcomes will result from specifying an improper covariance matrix.

gvar.gvar(x, xcov, fast=True)

Normally gvar.gvar(x, xcov) tries to break the covariance matrix into disjoint diagonal blocks, if there are any. For example,

xcov = [[1,1,0], [1,2,0], [0,0,3]]

can be decomposed into two blocks. This decomposition saves memory, and can make later manipulations of the resulting gvar.GVars faster. This is at the expense of extra processing to create the gvar.GVars. Setting keyword fast=True prevents gvar.gvar from doing this, which would make sense, for example, if it was known ahead of time that xcov has no sub-blocks. The default is fast=False. Either choice gives correct answers; the difference is about efficiency.

gvar.gvar((x, xsdev)): Returns a gvar.GVar with mean x and standard deviation xsdev.

gvar.gvar(xstr): Returns a gvar.GVar corresponding to string xstr which is either of the form "xmean ± xsdev" or "x(xerr)". Here ± can be replaced by +/- or +-.

gvar.gvar(xgvar): Returns gvar.GVar xgvar unchanged.

gvar.gvar(xdict): Returns a dictionary (BufferDict) b where b[k] = gvar.gvar(xdict[k]) for every key in dictionary xdict. The values in xdict, therefore, can be strings, tuples or gvar.GVars (see above), or arrays of these.

gvar.gvar(xarray): Returns an array a having the same shape as xarray where every element a[i...] = gvar.gvar(xarray[i...]). The values in xarray, therefore, can be strings, tuples or gvar.GVars (see above).

gvar.gvar(ymean, ycov, x, xycov): Returns a 1-d array of gvar.GVars y[i] constructed from the 1-d array of mean values ymean and the 2-d covariance matrix ycov. The y[i] are correlated with the primary gvar.GVars in 1-d array x. The x-y covariance matrix is xycov whose shape is x.shape + y.shape. Note that this changes the gvar.GVars in x (because they are correlated with the y[i]); it has no effect on the variance or on correlations between different x[i]s.

The following function is useful for constructing new functions that can accept gvar.GVars as arguments:

gvar.gvar_function(x, f, dfdx)¶

Create a gvar.GVar for function f(x) given f and df/dx at x.

This function creates a gvar.GVar corresponding to a function of gvar.GVars x whose value is f and whose derivatives with respect to each x are given by dfdx. Here x can be a single gvar.GVar, an array of gvar.GVars (for a multidimensional function), or a dictionary whose values are gvar.GVars or arrays of gvar.GVars, while dfdx must be a float, an array of floats, or a dictionary whose values are floats or arrays of floats, respectively.

This function is useful for creating functions that can accept gvar.GVars as arguments. For example,

import math
import gvar as gv

def sin(x):
    if isinstance(x, gv.GVar):
        f = math.sin(x.mean)
        dfdx = math.cos(x.mean)
        return gv.gvar_function(x, f, dfdx)
    else:
        return math.sin(x)

creates a version of sin(x) that works with either floats or gvar.GVars as its argument. This particular function is unnecessary since it is already provided by gvar.

Parameters:

x (gvar.GVar, array of gvar.GVars, or a dictionary of gvar.GVars) – Point at which the function is evaluated.
f (float) – Value of function at point gvar.mean(x).
dfdx (float, array of floats, or a dictionary of floats) – Derivatives of function with respect to x at point gvar.mean(x).

Returns:

A gvar.GVar representing the function’s value at x.

Means, standard deviations, variances, formatted strings, covariance matrices and correlation/comparison information can be extracted from arrays (or dictionaries) of gvar.GVars using:

gvar.mean(g)¶

Extract means from gvar.GVars in g.

g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. Result has the same layout as g.

Elements of g that are not gvar.GVars are left unchanged.

gvar.sdev(g)¶

Extract standard deviations from gvar.GVars in g.

g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. Result has the same layout as g.

The deviation is set to 0.0 for elements of g that are not gvar.GVars.

gvar.var(g)¶

Compute variances for elements of array/dictionary g or a single gvar.GVar.

If g is an array of gvar.GVars, var returns the variances as an array with shape g.shape. If g is a dictionary whose values are gvar.GVars or arrays of gvar.GVars, the result is a dictionary where cov[k] is the variance for g[k].

gvar.is_primary(g)¶

Determine whether or not the gvar.GVars in g are primary gvar.GVars.

A primary gvar.GVar is one created using gvar.gvar() (or a function of such a variable). A derived gvar.GVar is one that is constructed from arithmetic expressions and functions that combine multiple primary gvar.GVars. The standard deviations for all gvar.GVars originate with the primary gvar.GVars. In particular,

z = z.mean + sum_p (p - p.mean) * dz/dp

is true for any gvar.GVar z, where the sum is over all primary gvar.GVars p.

Here g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. The result has the same layout as g. Each gvar.GVar is replaced by True or False depending upon whether it is primary or not.

When the same gvar.GVar appears more than once in g, only the first appearance is marked as a primary gvar.GVar. This avoids double counting the same primary gvar.GVar — each primary gvar.GVar in the list is unique.

gvar.dependencies(g, all=False)¶

Collect primary gvar.GVars contributing to the covariances of gvar.GVars in g.

Parameters:

g – gvar.GVar or a dictionary, array, etc. containing gvar.GVars.
all (bool) – If True the result includes all primary gvar.GVars including those that are in g; otherwise it only includes those not in g. Default is False.

Returns:

An array containing the primary gvar.GVars contributing to covariances of gvar.GVars in g.

gvar.filter(g, f, *args, **kargs)¶

Filter gvar.GVars in g through function f.

Sample usage:

import gvar as gv
g_mod = gv.filter(g, gv.mean)

replaces every gvar.GVar in g by its mean. This is useful when gv.mean(g) doesn’t work — for example, because g contains data types other than gvar.GVars, or consists of nested dictionaries.

Parameters:

g – Object consisting of (possibly nested) dictionaries, deques, lists, numpy.arrays, tuples, and/or other data types that contain gvar.GVars and other types of data.
f – Function that takes an array of gvar.GVars as an argument and returns an array of results having the same shape. Given an array gvar_array containing all the gvar.GVars from g, the function call is f(gvar_array, *args, **kargs). Results from the returned array replace the gvar.GVars in the original object.
args – Additional arguments for f.
kargs – Additional keyword arguments for f.

Returns:

An object like g but with all the gvar.GVars replaced by objects generated by function f.

gvar.fmt(g, ndecimal=None, sep='')¶

Format gvar.GVars in g.

g can be a gvar.GVar, an array of gvar.GVars, or a dictionary containing gvar.GVars or arrays of gvar.GVars. Each gvar.GVar gi in g is replaced by the string generated by gi.fmt(format,ndecimal,sep). The result has same structure as g.

Parameters:

ndecimal (int or None) – Format each gvar.GVar using f'{self:.{ndecimal}f}' when ndecimal is a non-negative integer, or f'{self:2P}' when ndecimal is negative. Ignored if ndecimal=None (default).
sep (str) – String inserted between the mean and the (sdev) in 'mean(sdev)'. Default is sep=''.
format (str) – Format string used to convert each gvar.GVar. Ignored if ndecimal is not None.

gvar.tabulate(g, ncol=1, headers=True, offset='', ndecimal=None)¶

Tabulate contents of an array or dictionary of gvar.GVars.

Given an array g of gvar.GVars or a dictionary whose values are gvar.GVars or arrays of gvar.GVars, gvar.tabulate(g) returns a string containing a table of the values of g’s entries. For example, the code

import collections
import gvar as gv

g = collections.OrderedDict()
g['scalar'] = gv.gvar('10.3(1)')
g['vector'] = gv.gvar(['0.52(3)', '0.09(10)', '1.2(1)'])
g['tensor'] = gv.gvar([
    ['0.01(50)', '0.001(20)', '0.033(15)'],
    ['0.001(20)', '2.00(5)', '0.12(52)'],
    ['0.007(45)', '0.237(4)', '10.23(75)'],
    ])
print(gv.tabulate(g, ncol=2))

prints the following table:

   key/index          value     key/index          value
---------------------------  ---------------------------
      scalar     10.30 (10)           1,0     0.001 (20)
    vector 0     0.520 (30)           1,1     2.000 (50)
           1      0.09 (10)           1,2      0.12 (52)
           2      1.20 (10)           2,0     0.007 (45)
  tensor 0,0      0.01 (50)           2,1    0.2370 (40)
         0,1     0.001 (20)           2,2     10.23 (75)
         0,2     0.033 (15)

Parameters:

g – Array of gvar.GVars (any shape) or dictionary whose values are gvar.GVars or arrays of gvar.GVars (any shape).
ncol – The table is split over ncol columns of key/index values plus gvar.GVar values. Default value is 1.
headers – Prints standard header on table if True; omits the header if False. If headers is a 2-tuple, then headers[0] is used in the header over the indices/keys and headers[1] over the gvar.GVar values. (Default is True.)
offset (str) – String inserted at the beginning of each line in the table. Default is ''.
ndecimal – Number of digits displayed after the decimal point. Default is ndecimal=None which adjusts table entries to show 2 digits of error.
keys – When g is a dictionary and keys is a list of keys, entries will be tabulated only for keys in the keys list; ignored if keys=None (default).

gvar.correlate(g, corr, upper=False, lower=False, verify=False)¶

Add correlations to uncorrelated gvar.GVars in g.

This method creates correlated gvar.GVars from uncorrelated gvar.GVars g, using the correlations specified in corr.

Note that correlations initially present in g, if any, are ignored.

Examples

A typical application involves the construction of correlated gvar.GVars give the means and standard deviations, together with a correlation matrix:

>>> import gvar as gv
>>> g = gv.gvar(['1(1)', '2(10)'])
>>> print(gv.evalcorr(g))           # uncorrelated
[[ 1.  0.]
 [ 0.  1.]]
>>> g =  gv.correlate(g, [[1. , 0.1], [0.1, 1.]])
>>> print(gv.evalcorr(g))           # correlated
[[ 1.   0.1]
 [ 0.1  1. ]]

This also works when g and corr are dictionaries:

>>> g = gv.gvar(dict(a='1(1)', b='2(10)'))
>>> print(gv.evalcorr(g))
{('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.]]),('b', 'a'): array([[ 0.]]),('b', 'b'): array([[ 1.]])}
>>> corr = {}
>>> corr['a', 'a'] = 1.0
>>> corr['a', 'b'] = 0.1
>>> corr['b', 'a'] = 0.1
>>> corr['b', 'b'] = 1.0
>>> g = correlate(g, corr)
>>> print(gv.evalcorr(g))
{('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.1]]),('b', 'a'): array([[ 0.1]]),('b', 'b'): array([[ 1.]])}

Parameters:

g – An array of gvar.GVars or a dictionary whose values are gvar.GVars or arrays of gvar.GVars.
corr – Correlations between gvar.GVars: corr[i, j] is the correlation between g[i] and g[j]. Should be a symmetric and positive-definite (unless upper or lower is specified).
upper (bool) – If True, replaces lower triangular part of corr with transpose of the upper triangular part. The diagonal is set to one. Default is False.
lower (bool) – If True, replaces upper triangular part of corr with transpose of the lower triangular part. The diagonal is set to one. Default is False.
verify (bool) – If True, verifies that corr is symmetric and positive definite. Default is False because verification is costly for large matrices.

gvar.evalcov(g)¶

Compute covariance matrix for elements of array/dictionary g.

If g is an array of gvar.GVars, evalcov returns the covariance matrix as an array with shape g.shape+g.shape. If g is a dictionary whose values are gvar.GVars or arrays of gvar.GVars, the result is a doubly-indexed dictionary where cov[k1,k2] is the covariance for g[k1] and g[k2].

gvar.cov(g1, g2)¶: Covariance of gvar.GVar g1 with g2.

gvar.evalcov_blocks(g, compress=False)¶

Evaluate covariance matrix for elements of g.

Evaluates the covariance matrices for gvar.GVars stored in array or dictionary of arrays/gvar.GVars g. The covariance matrix is decomposed into its block diagonal components, and a list of tuples (idx,bcov) is returned where bcov is a diagonal block of the covariance matrix and idx an array containing the corresponding indices in g.flat for that block. So to reassemble the blocks into a single matrix cov, for example, one would use:

import numpy as np
import gvar as gv
cov = np.zeros((len(g.flat), len(g.flat)), float)
for idx, bcov in gv.evalcov_blocks(g):
    cov[idx[:, None], idx] = bcov

gvar.evalcov_blocks() is particularly useful when the covariance matrix is sparse; only nonzero elements are retained.

Setting argument compress=True (default is False) changes the meaning of the first element in the list: idx for the first element contains the indices of all gvar.GVars in g.flat that are uncorrelated from other gvar.GVars in g.flat, and bcov contains the standard deviations for those gvar.GVars. Thus the full covariance matrix is reconstructed using the following code:

import numpy as np
import gvar as gv
cov = np.zeros((len(g.flat), len(g.flat)), float)
blocks = gv.evalcov_blocks(g, compress=True)
# uncorrelated pieces are diagonal
idx, sdev = blocks[0]
cov[idx, idx] = sdev ** 2
# correlated pieces
for idx, bcov in blocks[1:]:
    cov[idx[:, None], idx] = bcov

The code with compress=True should be slightly faster if there are many uncorrelated gvar.GVars.

It is easy to create an array of gvar.GVars having the covariance matrix from g.flat: for example,

import numpy as np
import gvar as gv
new_g = np.empty(len(g.flat), dtype=object)
for idx, bcov in evalcov_blocks(g):
    new_g[idx] = gv.gvar(new_g[idx], bcov)

creates an array of gvar.GVars with zero mean and the same covariance matrix as g.flat. This works with either value for argument compress.

Parameters:

g – A gvar.GVar, an array of gvar.GVars, or a dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.
compress (bool) – Setting compress=True collects all of the uncorrelated gvar.GVars in g.flat into the first element of the returned list (see above). Default is False.

gvar.evalcorr(g)¶

Compute correlation matrix for elements of array/dictionary g.

If g is an array of gvar.GVars, evalcorr returns the correlation matrix as an array with shape g.shape+g.shape. If g is a dictionary whose values are gvar.GVars or arrays of gvar.GVars, the result is a doubly-indexed dictionary where corr[k1,k2] is the correlation for g[k1] and g[k2].

The correlation matrix is related to the covariance matrix by:

corr[i,j] = cov[i,j] / (cov[i,i] * cov[j,j]) ** 0.5

gvar.corr(g1, g2)¶: Correlation between gvar.GVars g1 and g2.

gvar.uncorrelated(g1, g2)¶

Return True if gvar.GVars in g1 uncorrelated with those in g2.

g1 and g2 can be gvar.GVars, arrays of gvar.GVars, or dictionaries containing gvar.GVars or arrays of gvar.GVars. Returns True if either of g1 or g2 is None.

gvar.chi2(g1, g2, svdcut=1e-12, dof=None)¶

Compute chi**2 of g1-g2.

chi**2 equals dg.invcov.dg, where ``dg = g1 - g2 and invcov is the inverse of dg’s covariance matrix. It is a measure of how well multi-dimensional Gaussian distributions g1 and g2 (dictionaries or arrays) agree with each other — that is, do their means agree within errors for corresponding elements. The probability is high if chi2(g1,g2)/dof is of order 1 or smaller, where dof is the number of degrees of freedom being compared.

Usually g1 and g2 are dictionaries with the same keys, where g1[k] and g2[k] are gvar.GVars or arrays of gvar.GVars having the same shape. Alternatively g1 and g2 can be gvar.GVars, or arrays of gvar.GVars having the same shape.

One of g1 or g2 can contain numbers instead of gvar.GVars, in which case chi**2 is a measure of the likelihood that the numbers came from the distribution specified by the other argument.

One or the other of g1 or g2 can be missing keys, or missing elements from arrays. Only the parts of g1 and g2 that overlap are used. Also setting g2=None is equivalent to replacing its elements by zeros.

A typical application tests whether distribution g1 and g2 are consistent with each other (within errors): the code

>>> chi2 = gvar.chi2(g1, g2)
>>> print(gvar.fmt_chi2(chi2))
chi2/dof = 1.1 [100]    Q = 0.26

shows that the distributions are reasonably consistent. The number of degrees of freedom (here 100) in this example equals the number of variables from g1 and g2 that are compared; this can be changed using the dof argument.

Parameters:

g1 – gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g1 specifies a sample from a distribution.
g2 – gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g2 specifies a sample from a distribution. Setting g2=None (default) is equivalent to setting its elements all to zero.
eps (float) – If positive, singularities in the correlation matrix for g1-g2 are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).
svdcut (float) – If nonzero, singularities in the correlation matrix for g1-g2 are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.
dof (int or None) – Number of independent degrees of freedom in g1-g2. This is set equal to the number of elements in g1-g2 if dof=None is set. This parameter affects the Q value assigned to the chi**2.

Returns:

The return value is the chi**2. Extra attributes attached to this number give additional information:

dof — Number of degrees of freedom (that is, the number
of variables compared if not specified).
Q — The probability that the chi**2 could have
been larger, by chance, even if g1 and g2 agree. Values smaller than 0.1 or so suggest that they do not agree. Also called the p-value.
residuals — Decomposition of the chi**2 in terms of the
independent modes of the correlation matrix: chi**2 = sum(residuals**2).

gvar.qqplot(g1, g2, plot=None, svdcut=1e-12, dof=None)¶

QQ plot g1-g2.

The QQ plot compares the distribution of the means of Gaussian distribution g1-g2 to that of random samples from the distribution. The resulting plot will approximate a straight line along the diagonal of the plot (dashed black line) if the means have a Gaussian distribution about zero with the correct standard deviations.

Usually g1 and g2 are dictionaries with the same keys, where g1[k] and g2[k] are gvar.GVars or arrays of gvar.GVars having the same shape. Alternatively g1 and g2 can be gvar.GVars, or arrays of gvar.GVars having the same shape.

One of g1 or g2 can contain numbers instead of gvar.GVars.

One or the other of g1 or g2 can be missing keys, or missing elements from arrays. Only the parts of g1 and g2 that overlap are used. Also setting g2=None is equivalent to replacing its elements by zeros.

In a typical application, the plot displayed by

gvar.qqplot(gsample, g).show()

tests whether gsample is likely a random sample from a multi-dimensional Gaussian distribtuion g. It is consistent with being a random sample if the QQ plot is a straight line through the origin with unit slope. The is most useful when there are many variables (g.size >> 1).

Parameters:

g1 – gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g1 specifies a sample from a distribution.
g2 – gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead of gvar.GVars, in which case g2 specifies a sample from a distribution. Setting g2=None (default) is equivalent to setting its elements all to zero.
plot – a matplotlib plotter. If None (default), uses matplotlib.pyplot.
eps (float or None) – eps used by gvar.regulate() when inverting the covariance matrix. Ignored if svdcut is set (and not None).
svdcut (float) – SVD cut used when inverting the covariance matrix of g1-g2. See documentation for gvar.svd() for more information. Default is svdcut=1e-12.
dof (int or None) – Number of independent degrees of freedom in g1-g2. This is set equal to the number of elements in g1-g2 if dof=None is set. This parameter affects the Q value assigned to the chi**2.

Returns:

Plotter plot.

This method requires the scipy and matplotlib modules.

gvar.fmt_chi2(f)¶

Return string containing chi**2/dof, dof and Q from f.

Assumes f has attributes chi2, dof and Q. The logarithm of the Bayes factor will also be printed if f has attribute logGBF.

gvar.GVars are compared by:

gvar.equivalent(g1, g2, rtol=1e-10, atol=1e-10)¶

Determine whether g1 and g2 contain equivalent gvar.GVars.

Compares sums and differences of gvar.GVars stored in g1 and g2 to see if they agree within tolerances. Operationally, agreement means that:

abs(diff) < abs(summ) / 2 * rtol + atol

where diff and summ are the difference and sum of the mean values (g.mean) or derivatives (g.der) associated with each pair of gvar.GVars.

gvar.GVars that are equivalent are effectively interchangeable with respect to both their means and also their covariances with any other gvar.GVar (including ones not in g1 and g2).

g1 and g2 can be individual gvar.GVars or arrays of gvar.GVars or dictionaries whose values are gvar.GVars and/or arrays of gvar.GVars. Comparisons are made only for shared keys when they are dictionaries. Array dimensions must match between g1 and g2, but the shapes can be different; comparisons are made for the parts of the arrays that overlap in shape.

Parameters:

g1 – A gvar.GVar or an array of gvar.GVars or a dictionary of gvar.GVars and/or arrays of gvar.GVars.
g2 – A gvar.GVar or an array of gvar.GVars or a dictionary of gvar.GVars and/or arrays of gvar.GVars.
rtol – Relative tolerance with which mean values and derivatives must agree with each other. Default is 1e-10.
atol – Absolute tolerance within which mean values and derivatives must agree with each other. Default is 1e-10.

gvar.GVars can be stored (serialized) and retrieved from files (or strings) using:

gvar.dump(g, outputfile=None, add_dependencies=False, **kargs)¶

Write a representation of g to file outputfile.

Object g consists of (possibly nested) dictionaries, deques, lists, numpy.arrays, and/or tuples that contain gvar.GVars and other types of data. Calling gvar.dump(g, 'filename') writes a serialized representation of g into the file named filename. Object g can be recovered later using gvar.load('filename').

Typical usage is:

# create file xxx.pickle containing GVars in g
import gvar as gv
gv.dump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.load('xxx.pickle')

gvar.dump() is an alternative to pickle.dump() which, unlike the latter, works when there are gvar.GVars in g. In particular it preserves correlations between different gvar.GVars, as well as relationships (i.e., derivatives) between derived gvar.GVars and primary gvar.GVars in g. gvar.dump() uses gvar.remove_gvars() to search (recursively) for the gvar.GVars in g.

The partial variances for derived gvar.GVars in g coming from primary gvar.GVars in g are preserved by gvar.dump(). (These are used, for example, to calculate error budgets.) Partial variances coming from derived (rather than primary) gvar.GVars, however, are unreliable unless every primary gvar.GVar that contributes to the covariances in g is included in g. To guarantee that this is the case, set keyword add_dependencies=True. This can greatly increase the size of the output file, and so should only be done if error budgets, etc. are needed. (Also the cost of evaluating covariance matrices for the reconstituted gvar.GVars is increased if there are large numbers of primary gvar.GVars.) The default is add_dependencies=False.

Parameters:

g – Object to be serialized. Consists of (possibly nested) dictionaries, deques, lists, numpy.arrays, and/or tuples that contain gvar.GVars and other types of data.
outputfile – The name of a file or a file object in which the serialized g is stored. If outputfile=None (default), the results are written to a BytesIO.
add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.
kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle).

Returns:

The BytesIO object containing the serialized data, if outputfile=None; otherwise outputfile.

gvar.dumps(g, add_dependencies=False, **kargs)¶

Return a serialized representation of g.

This function is shorthand for:

gvar.dump(g).getvalue()

Typical usage is:

# create bytes object containing GVars in g
import gvar as gv
gbytes = gv.dumps(g)

# convert bytes object back into g
new_g = gv.loads(gbytes)

Parameters:

g – Object to be serialized. Consists of (possibly nested) dictionaries, deques, lists, numpy.arrays, and/or tuples that contain gvar.GVars and other types of data.
add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.
kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle).

Returns:

A bytes object containing a serialized representation of object g.

gvar.load(inputfile, **kargs)¶

Read and return object serialized in inputfile by gvar.dump().

This function recovers data serialized with gvar.dump(). Typical usage is:

# create file xxx.pickle containing data in g
import gvar as gv
gv.dump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.gload('xxx.pickle')

Note that the format used by gvar.dump() changed with version 11.0 of gvar. gvar.load() will attempt to read the old formats if they are encountered, but old data should be converted to the new format (by reading it in with load() and them writing it out again with dump()).

Parameters:

inputfile – The name of the file or a file object in which the serialized gvar.GVars are stored (created by gvar.dump()).
kargs (dict) – Additional arguments, if any, that are passed to the underlying de-serializer (pickle).

Returns:

The reconstructed data object.

gvar.loads(inputbytes, **kargs)¶

Read and return object serialized in inputbytes by gvar.dumps().

This function recovers data serialized with gvar.dumps(). It is shorthand for:

gvar.load(BytesIO(inputbytes))

Typical usage is:

# create bytes object containing data in g
import gvar as gv
gbytes = gv.dumps(g)

# recreate g from bytes object gbytes
new_g = gv.gloads(gbytes)

Parameters:

inputbytes (bytes) – Bytes object created by gvar.dumps().
kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle).

Returns:

The reconstructed data.

gvar.gdump(g, outputfile=None, method=None, add_dependencies=False, **kargs)¶

Write a representation of g to file outputfile.

Object g is a gvar.GVar, an array of gvar.GVars (any shape), or a dictionary whose values are gvar.GVars and/or arrays of gvar.GVars; it describes a general (multi-dimensional) Gaussian distribution. Calling gvar.gdump(g, 'filename') writes a serialized representation of g into the file named filename. The Gaussian distribution described by g can be recovered later using gvar.load('filename'). Correlations between different gvar.GVars in g are preserved, as are relationships (i.e., derivatives) between derived gvar.GVars and any primary gvar.GVars in g.

gvar.gdump() differs from gvar.dump() in that the elements in g must all be gvar.GVars for the former, whereas gvar.GVars may be mixed in with other data types for the latter. The structure of g is also more restricted for gvar.gdump(), but it is typically faster than gvar.dump().

Typical usage is:

# create file xxx.pickle containing GVars in g
import gvar as gv
gv.gdump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.gload('xxx.pickle')

Object g is serialized using one of the json (text file) or pickle (binary file) Python modules. Method 'json' can produce smaller files, but method 'pickle' can be significantly faster. 'json' is more secure than 'pickle', if that is an issue.

json can have trouble with dictionaries whose keys are not strings. A workaround is used here that succeeds provided eval(repr(k)) == k for every key k. This works for a wide variety of standard key types including strings, integers, and tuples of strings and/or integers. Try pickle if the workaround fails.

The partial variances for derived gvar.GVars in g coming from primary gvar.GVars in g are preserved by gvar.gdump(). (These are used, for example, to calculate error budgets.) Partial variances coming from derived (rather than primary) gvar.GVars, however, are unreliable unless every primary gvar.GVar that contributes to the covariances in g is included in g. To guarantee that this is the case set keyword add_dependencies=True. This can greatly increase the size of the output file, and so should only be done if error budgets, etc. are needed. (Also the cost of evaluating covariance matrices for the reconstituted gvar.GVars is increased if there are large numbers of primary gvar.GVars.) The default is add_dependencies=False.

Parameters:

g – A gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.
outputfile – The name of a file or a file object in which the serialized gvar.GVars are stored. If outputfile=None (default), the results are written to a StringIO (for method='json') or BytesIO (for method='pickle') object.
method (str) – Serialization method, which should be either 'json' or 'pickle'. If method=None (default), the method is inferred from the filename’s extension: .json for 'json'; and .pickle or .pkl or .p for 'pickle'. If that fails the method defaults to 'json'.
add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.
kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (pickle or json).

Returns:

if outputfile=None, the StringIO or BytesIO object containing the serialized data is returned. Otherwise outputfile is returned.

gvar.gdumps(g, method='json', add_dependencies=False)¶

Return a serialized representation of g.

This function is shorthand for:

gvar.gdump(g, method='json').getvalue()

Typical usage is:

# create string containing GVars in g
import gvar as gv
gstr = gv.gdumps(g)

# convert string back into GVars
new_g = gv.gloads(gstr)

Parameters:

g – A gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.
method – Serialization method, which should be either 'json' or 'pickle'. Default is 'json'.
add_dependencies (bool) – If True, automatically includes all primary gvar.GVars that contribute to the covariances of the gvar.GVars in g but are not already in g. Default is False.

Returns:

A string or bytes object containing a serialized representation of object g.

gvar.gload(inputfile, method=None, **kargs)¶

Read and return gvar.GVars that are serialized in inputfile.

This function recovers gvar.GVars serialized with gvar.gdump(). Typical usage is:

# create file xxx.pickle containing GVars in g
import gvar as gv
gv.gdump(g, 'xxx.pickle')

# read file xxx.pickle to recreate g
new_g = gv.gload('xxx.pickle')

Parameters:

inputfile – The name of the file or a file object in which the serialized gvar.GVars are stored (created by gvar.gdump()).
method (str or None) – Serialization method, which should be either 'pickle' or 'json'. If method=None (default), the method is inferred from the filename’s extension: .json for 'json'; and .pickle or .pkl or .p for 'pickle'. If that fails the method defaults to 'json'. Argument method is ignored if inputfile is either a StringIO or BytesIO object, with the method being set to 'json' or 'pickle', respectively.
kargs (dict) – Additional arguments, if any, that are passed to the underlying de-serializer (pickle or json).

Returns:

The reconstructed gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

gvar.gloads(inputstring)¶

Return gvar.GVars that are serialized in inputstr.

This function recovers the gvar.GVars serialized with gvar.gdumps(g)(). It is shorthand for:

gvar.gload(StringIO(inputstr), method='json')

Typical usage is:

# create string containing GVars in g
import gvar as gv
gstr = gv.gdumps(g)

# convert string back into GVars
new_g = gv.gloads(gstr)

Parameters:: inputstr (str or bytes) – String or bytes object created by gvar.dumps().
Returns:: The reconstructed gvar.GVar, array of gvar.GVars, or dictionary whose values are gvar.GVars and/or arrays of gvar.GVars.

gvar.remove_gvars(g, gvlist)¶

Remove gvar.GVars from structure g, replacing them with gvar.GVarRefs and collecting them in gvlist.

remove_gvars() searches object g (recursively) for gvar.GVars and replaces them with GVarRef objects. The gvar.GVars are collected in list gvlist. g can be a standard container object (dictionary, list, etc.) or an object obj whose contents can be accessed through obj.__dict__ or obj.__slots__. An object that defines method obj._remove_gvars is replaced by obj._remove_gvars(gvlist).

The gvar.GVars are restored using gvar.distribute_gvars: e.g.,

gvlist = []
new_g = gvar.distribute_gvars(gvar.remove_gvars(g, gvlist), gvlist)

creates a copy new_g of g with the gvar.GVars restored (and preserving correlations between different gvar.GVars). gvlist contains copies of the restored gvar.GVars.

The default treatment of a class instance obj without an obj._remove_gvars(gvlist) method is equivalent to adding the following method to the object’s class

def _remove_gvars(self, gvlist):
    tmp = copy.copy(self)
    tmp.__dict__ = gvar.remove_gvars(tmp.__dict__, gvlist)
    return tmp

assuming the class does not use __slots__. A class that has method obj._remove_gvars(gvlist) should have a corresponding method obj._distribute_gvars(gvlist), for use by gvar.distribute_gvars(). The default behavior when this method is undefined is equivalent to:

def _distribute_gvars(self, gvlist):
    self.__dict__ = gvar.distribute_gvars(self.__dict__, gvlist)

There are analogous defaults for classes that uses __slots__.

These routines are used by gvar.dump and gvar.load to facilitate pickling of objects containing gvar.GVars.

Parameters:

g – Object containing gvar.GVars.
gvlist – gvar.GVars removed from g are appended to list gvlist.

Returns:

Copy of object g with gvar.GVars replaced by GVarRefs. The gvar.GVars are appended to gvlist.

gvar.distribute_gvars(g, gvlist)¶

Distribute gvar.GVars from gvlist in structure g, replacing gvar.GVarRefs.

distribute_gvars() undoes what remove_gvars() does to g. See discussion of remove_gvars() for more details.

Parameters:

g – Object containing GVarRefs created by remove_gvars().
gvlist – List of gvar.GVars created by remove_gvars().

Returns:

Object g with GVarRefs replaced by corresponding gvar.GVars from list gvlist.

class gvar.GVarRef(g, gvlist)¶

Placeholder for a gvar.GVar, used by gvar.remove_gvars().

Typical usage, when g is a dictionary containing gvar.GVars:

import gvar as gv

gvlist = []
for k in g:
    if isinstance(g[k], gv.GVar):
        g[k] = gv.GVarRef(g[k], gvlist)

where at the end gvlist contains the gvar.GVars removed from g.

The gvar.GVars are restored using:

for k in g:
    if isinstance(g[k], gv.GVarRef):
        g[k] = g[k](gvlist)

where the gvar.GVars are drawn from list gvlist.

gvar.disassemble(g)¶

Disassemble collection g of gvar.GVars.

Parameters:: g (dict, array, or gvar.GVar) – Collection of gvar.GVars to be disassembled.

gvar.reassemble(data, cov=gvar.gvar.cov)¶

Convert data from gvar.disassemble() back into gvar.GVars.

Parameters:

data (BufferDict, array) – Disassembled collection of gvar.GVars from gvar.disassemble() that are to be reassembled.
cov (gvar.smat) – Covariance matrix corresponding to the gvar.GVars in data. (Default is gvar.gvar.cov.)

gvar.GVars contain information about derivatives with respect to the primary gvar.GVars from which they were constructed. This information can be extracted using:

gvar.deriv(g, x)¶

Compute partial derivatives of gvar.GVars in g w.r.t. primary gvar.GVars in x.

Primary gvar.GVars are those created using gvar.gvar() (or any function of such a variable); see gvar.is_primary().

Parameters:

g – A gvar.GVar or array of gvar.GVars, or a dictionary whose values are gvar.GVars or arrays of gvar.GVars.
x – A gvar.GVar or array of primary gvar.GVars. Arrays are typically one dimensional but can have any shape.

Returns:

g but with each gvar.GVar in it replaced by an array of derivatives with respect to the primary gvar.GVars in x. The derivatives have the same shape as x.

The following functions are used to generate and analyze random arrays or dictionaries from the distribution defined by array (or dictionary) g of gvar.GVars. The random numbers incorporate any correlations implied by the gs.

gvar.sample(g, eps=None, svdcut=None, uniform=None, nbatch=None, mode='rbatch')¶

Generate random sample(s) from distribution g.

The gvar.GVars in array (or dictionary) g collectively define a multidimensional Gaussian distribution. sample(g) returns an array (or dictionary) containing random samples drawn from that distribution, with correlations intact:

>>> import gvar as gv
>>> g = dict(a=gv.gvar(1,1), b=gv.gvar([10, 100], [[1,0.99], [0.99,1]]))
>>> print(g)
{'a': 1.0(1.0), 'b': array([10.0(1.0), 100.0(1.0)], dtype=object)}
>>> print(gv.sample(g))
{'a': 0.03301713218946034, 'b': array([ 11.65655789, 101.64702394])}

The layout for each sample is the same as for g – an array (or dictionary) of the same shape but where each gvar.GVar is replaced by a number.

If multiple samples are needed, it is most efficient to create them all at once by specifying parameter nbatch. The following code creates a batch of 10,000 samples:

>>> gs = gv.sample(g, nbatch=10_000)
>>> print(gs['a'].shape, gs['b'].shape)
(10000,) (2, 10000)
>>> print(gs['a'])
[ 1.71081068 -0.12894464  0.41654825 ...  0.39854737  0.42461779
  0.97666448]
>>> print(gs['b'])
[[ 10.94348928  10.46761851  11.10887492 ...  10.46469306   8.6012229
   11.07082101]
 [100.8592698  100.55615515 101.3894677  ... 100.50053354  98.60155065
  101.11041462]]

Here individual samples are gs['a'][i] and gs['b'][:, i] where batch index i=0...9999 labels the sample.

Specifying mode='lbatch' (instead of default 'rbatch') moves the batch index to the left, gs['b'][i, :]:

>>> gs = gv.sample(g, nbatch=10_000, mode='lbatch')
>>> print(gs['b'].shape)
(10000, 2)
>>> print(gs['b'])
[[  9.28127683  99.53331081]
 [  8.62228298  98.70607512]
 [  9.63029983  99.8487004 ]
 ...
 [  9.3074379   99.38207642]
 [ 10.22812261 100.19961713]
 [  9.74325077  99.92798523]]

Note the strong correlations in the fluctuations of gs['b'][0] and gs['b'][1].

Equivalent to:

next(gvar.raniter(g, svdcut=svdcut, eps=eps, uniform=uniform, nbatch=nbatch, mode=mode))

Parameters:

g – An array or dictionary of objects of type gvar.GVar; or a gvar.GVar.
eps (float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).
svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.
nbatch (int or None) – If not None, returns nbatch samples drawn from the distribution associated with g. The results are packaged in arrays or dictionaries whose elements have an extra index labeling the different samples in the batch. The batch index is the rightmost index if mode='rbatch'; it is the leftmost index if mode='lbatch'. Ignored if nbatch is None (default).
mode (bool) – Batch mode. Allowed modes are 'rbatch' (default) or 'lbatch', corresponding to batch indices that are on the right or the left, respectively. Ignored if nbatch is None.
uniform (float or None) – Replace Gaussian distribution specified by g with a uniform distribution covering the interval [-uniform, uniform] times the standard deviation centered on the mean along each principal axis of the error ellipse. Ignored if None (default).

Returns:

A random array or dictionary, with the same shape as g, drawn from the distribution defined by g; or nbatch such samples if nbatch is not None.

gvar.raniter(g, n=None, eps=None, svdcut=None, uniform=None, nbatch=None, mode='rbatch')¶

Return iterator for random samples from distribution g

raniter(g, n, nbatch=nbatch, ...) returns an iterator that yields n samples, gvar.sample(g, nbatch=nbatch, ...), drawn from the distribution specified by array (or dictionary) g of gvar.GVars.

Functionally equivalent to, but much faster than

for _ in range(n):
    sample = gvar.sample(g, eps=eps, svdcut=svdcut, ...)

Parameters:

g – An array (or dictionary) of objects of type gvar.GVar; or a gvar.GVar.
n (int or None) – Maximum number of random iterations. Setting n=None (the default) implies there is no maximum number.
eps (float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).
svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.
uniform (float or None) – Replace Gaussian distribution specified by g with a uniform distribution covering the interval [-uniform, uniform] times the standard deviation centered on the mean along each principal axis of the error ellipse. Ignored if None (default).
nbatch (int or None) – If not None, the iterator will return nbatch samples drawn from the distribution associated with g. The results are packaged in arrays or dictionaries whose elements have an extra index labeling the different samples in the batch. The batch index is the rightmost index if mode='rbatch'; it is the leftmost index if mode is 'lbatch'. Ignored if nbatch is None (default).
mode (bool) – Batch mode. Allowed modes are 'rbatch' (default) or 'lbatch', corresponding to batch indices that are on the right or the left, respectively. Ignored if nbatch is None.

Returns:

An iterator that returns random arrays or dictionaries (or single gvar.GVars with the same shape as g drawn from the Gaussian distribution defined by g, or nbatch such samples when nbatch is specified.

gvar.gvar_from_sample(gs, mode='rbatch', unbias=False)¶

Convert samples gs from Gaussian distribution into gvar.GVars.

gs is an array (or dictionary) of samples drawn from a multidimensional distribution. Each element of the array (or dictionary) has an extra batch index labeling the sample: gs[..., i] (or gs[key][…, i] for dictionaries) where i is the batch index. gvar_from_sample(gs) calculates the means and covariances of the elements from the variations across samples (i.e., as i varies) and returns an array (or dictionary) whose elements are the corresponding gvar.GVars. These gvar.GVars specify a multidimensional Gaussian distribution that has the same means and covariances as the samples.

For example, the following code

>>> g = gvar.gvar([1., 2.], [[0.81, 0.1], [0.1, 0.81]])
>>> gs = gvar.sample(g, nbatch=10_000)
>>> print(gs.shape)
(2, 10000)
>>> print(gs)
[[-0.28683979  2.01220091  0.88237606 ...  0.11822824 -0.43494815
  -0.31612878]
 [ 3.01067406  1.63392898  1.41686281 ...  2.87272547  1.7961971
   4.19975679]]
>>> g_gs = gvar.gvar_from_sample(gs)
>>> print(g)
['1.00(90)' '2.00(90)']
>>> print(gs)
['0.99(89)' '1.99(90)']
>>> print(gvar.evalcov(g))
[[0.81 0.1 ]
 [0.1  0.81]]
>>> print(gvar.evalcov(g_gs))
[[0.8001385  0.10049553]
 [0.10049553 0.81289579]]

creates 10,000 samples [gs[0,i], gs[1,i]] drawn from the Gaussian distribution g. Using these samples, gvar_from_sample estimates the means and covariances of the original distribution. The resulting estimate g_gs agrees well with the exact distribution g because the latter is Gaussian.

gvar_from_sample can also be used to estimate the uncertainty (standard error) in, for example, its estimates of the means and standard deviations:

>>> mom = gvar_from_sample([gs, gs**2], stderr=True)
>>> mean = mom[0]
>>> sdev = (mom[1] - mean**2) ** 0.5
>>> print(f'mean = {mean}    sdev = {sdev}')
mean = [0.9923(89) 1.9932(90)]    sdev = [0.8945(63) 0.9016(65)]

The means and standard deviations estimated from the sample agree within errors with the correct values of [1. 2.] and [0.9 0.9] respectively.

The sample index i is on the right in the example above. If mode='lbatch' (instead of default 'rbatch') the index is on the right: [gs[i,0], gs[i,1]] where i=0...9999.

Parameters:

gs (array or dict) – Array of samples or dictionary whose values are samples or arrays of samples from a distribution. See gvar.sample() for more information.
mode (str) – The sample index is the rightmost index when mode='rbatch' (default), and the leftmost index when mode='lbatch'.
stderr (bool) – If True (default) the variances and covariances represent the uncertainties in the estimates of the means (corresponding to the standard errors). If False they represent the spread in the data (corresponding to the standard deviations of the distribution). The former is smaller by a factor of one over the number of samples.
unbias (bool) – The default choice unbias=False means that variances are estimated using sum((x[:] - mean(x))**2)/N where N is the number of random samples. This estimate is biased but is typically more accurate than the unbiased estimate obtained when N is replaced by N-1 (because statistical fluctuations are then larger than the bias). The difference is negligible unless N is very small. Setting unbias=True gives the unbiased estimate.

Results:: An array of gvar.GVars or a dictionary whose values are gvar.GVars or arrays of gvar.GVars that specifies a multi-dimensional Gaussian distribution whose means and covariances are obtained from the sample.

gvar.bootstrap_iter(g, n=None, eps=None, svdcut=None)¶

Return iterator for bootstrap copies of g.

The gaussian variables (gvar.GVar objects) in array (or dictionary) g collectively define a multidimensional gaussian distribution. The iterator created by bootstrap_iter() generates an array (or dictionary) of new gvar.GVars whose covariance matrix is the same as g’s but whose means are drawn at random from the original g distribution. This is a bootstrap copy of the original distribution. Each iteration of the iterator has different means (but the same covariance matrix).

bootstrap_iter() also works when g is a single gvar.GVar, in which case the resulting iterator returns bootstrap copies of the g.

Parameters:

g – An array (or dictionary) of objects of type gvar.GVar.
n – Maximum number of random iterations. Setting n=None (the default) implies there is no maximum number.
eps (float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).
svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.

Returns:

An iterator that returns bootstrap copies of g.

gvar’s random number generator is gvar.RNG. It is currently (since version 12.1) an instance of numpy.random.Generator. The following function is used to reseed the random number generator:

gvar.ranseed(seed=None, size=3, version=None)¶

Seed random number generators with tuple seed.

Argument seed is an integer or a tuple of integers that is used to seed the random number generators used by by gvar. Reusing the same seed results in the same set of random numbers.

ranseed generates its own seed when called without an argument or with seed=None. This seed is stored in ranseed.seed and is also returned by the function. The seed can be used to regenerate the same set of random numbers at a later time. Keyword argument size specifies the size or shape of the new seed.

gvar’s random number generator is gvar.RNG, which is (currently) an instance of the numpy.random.Generator class. (Random numbers are generated by calling routines like gvar.RNG.random(size); see the documentation for numpy.random.Generator.) Calling gvar.ranseed assigns gvar.RNG to a new generator of this type, corresponding to the seed.

Note: The random number generator changed in version 12.1 of gvar in order to track changes in version 1.17 of numpy, which provides the generator. To restore the old generator set keyword version=0. This option should only be used for legacy code as it is likely to disappear eventually. Note that gvar.RNG=numpy.random in this case. This is also the choice if the numpy version is earlier than 1.17.

Parameters:

seed (int, tuple, or None) – Seed for generator. Generates a random tuple for the seed if None.
size (int) – Size of the tuple of integers used to seed the random number generator when seed=None. Default is size=3. Ignored if seed is not None.
version (int or None) – Version of random number generator: version=0 specifies the generator used in gvar versions earlier than 12.1; version=None (default) specifies the current random number generator.

Returns:

The seed used to reseed the generator.

The following two functions that are useful for tabulating results and for analyzing where the errors in a gvar.GVar constructed from other gvar.GVars come from:

gvar.fmt_errorbudget(outputs, inputs, ndecimal=2, percent=True, verify=False, colwidth=10)¶

Tabulate error budget for outputs[ko] due to inputs[ki].

For each output outputs[ko], fmt_errorbudget computes the contributions to outputs[ko]’s standard deviation coming from the gvar.GVars collected in inputs[ki]. This is done for each key combination (ko,ki) and the results are tabulated with columns and rows labeled by ko and ki, respectively. If a gvar.GVar in inputs[ki] is correlated with other gvar.GVars, the contribution from the others is included in the ki contribution as well (since contributions from correlated gvar.GVars cannot be distinguished). The table is returned as a string.

Parameters:

outputs – Dictionary of gvar.GVars for which an error budget is computed.
inputs – Dictionary of: gvar.GVars, arrays/dictionaries of gvar.GVars, or lists of gvar.GVars and/or arrays/dictionaries of gvar.GVars. fmt_errorbudget tabulates the parts of the standard deviations of each outputs[ko] due to each inputs[ki].
ndecimal (int) – Number of decimal places displayed in table.
percent (bool) – Tabulate % errors if percent is True; otherwise tabulate the errors themselves.
colwidth (int) – Width of each column. This is set automatically, to accommodate label widths, if colwidth=None (default).
verify (bool) – If True, a warning is issued if: 1) different inputs are correlated (and therefore double count errors); or 2) the sum (in quadrature) of partial errors is not equal to the total error to within 0.1% of the error (and the error budget is incomplete or overcomplete). No checking is done if verify==False (default).

Returns:

A table (str) containing the error budget. Output variables are labeled by the keys in outputs (columns); sources of uncertainty are labeled by the keys in inputs (rows).

gvar.fmt_values(outputs, ndecimal=None)¶

Tabulate gvar.GVars in outputs.

Parameters:

outputs – A dictionary of gvar.GVar objects.
ndecimal (int) – Format values v using v.fmt(ndecimal).

Returns:

A table (str) containing values and standard deviations for variables in outputs, labeled by the keys in outputs.

The following functions are used to make correlation matrices less singular:

gvar.regulate(g, svdcut=1e-12, eps=None, wgts=False, noise=False)¶

Regulate singularities in the correlation matrix of the gvar.GVars in g.

Standard usage is, for example,

import gvar as gv

# with eps cutoff
gmod = gv.regulate(g, eps=1e-6)

# or with svd cutoff
gmod = gv.regulate(g, svdcut=1e-6)

where g is an array of gvar.GVars or a dictionary containing gvar.GVars and/or arrays of gvar.GVars. The result gmod is a copy of g whose gvar.GVars have been modified to make their correlation matrix less singular than that of the original g. Parameter eps or svdcut specifies the extent of the modification.

The modification of g is implemented by adding a set of gvar.GVars with zero means:

gmod = g + gmod.correction

where gmod.correction is an array/dictionary containing the gvar.GVars. When g is an array and eps is specified, for example, g[i] is modified by adding:

gmod.correction[i] = gv.gvar(0, (eps * norm) ** 0.5 * g[i].sdev)

where norm = numpy.linalg.norm(corr, numpy.inf) is an estimate of the largest eigenvalue of the correlation matrix corr. This correction typically has a negligible effect on the final standard deviations (relative order eps*norm/2), but can make noticeable changes in the correlation matrix for highly correlated data. Strong correlations lead to small eigenvalues in the correlation matrix, and these are significantly increased by the cutoff, which in effect replaces each eigenvalue eig by eig + eps * norm. Only members of g that are correlated with other members of g are modified; uncorrelated members are left unchanged (i.e., the correction is gv.gvar(0,0)).

Adding gmod.correction to g increases the uncertainties in g but does not affect random fluctuations in its mean values. If parameter noise=True, random noise is included in gmod.correction,

gmod.correction += gv.sample(gmod.correction),

before it is added to g. This adds random fluctuations to the means in gmod that are commensurate with the additions to the uncertainties. This is important, for example, when interpreting a chi**2 involving gmod, since chi**2 compares fluctuations in the means with the uncertainties.

Specifying svdcut rather than eps regulates the correlation matrix using gvar.svd(): calling gvar.regulate(g, svdcut=1e-4) is the same as calling gvar.svd(g, svdcut=1e-4). When svdcut>=0, each eigenvalue eig of the correlation matrix is replaced by max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue. See the gvar.svd() documentation for more details. SVD cuts are numerically more robust, but more costly, especially for large systems.

There are a variety of reasons for regulating the correlation matrices. Roundoff error, for example, can make the smallest eigenvalues unreliable and destabilize calulations involving the correlation matrix. A similar situation arises when the correlation matrix is estimated from a small set of random data. This can result in small eigenvalues that are badly underestimated.

There is an additional parameter wgts in gvar.regulate() whose default value is False. Setting wgts=1 or wgts=-1 instead causes gvar.regulate() to return a tuple (gmod, i_wgts) where gmod is the modified copy of g, and i_wgts contains a decomposition of either the modified covariance matrix, when wgts=1, or the inverse of the modified covariance matrix, when wgts=-1. The covariance matrix is decomposed into non-overlapping sub-matrices, with i_wgts[0] containing the contributions from all 1x1 sub-matrices. Typical usage, for example to compute the inverse inv_cov of the gmod covariance matrix, is:

gmod, i_wgts = gvar.regulate(g, eps=1e-6, wgts=-1)
inv_cov = numpy.zeros((gmod.size, gmod.size))
# 1x1 sub-matrices
i, wgts = i_wgts[0]
inv_cov[i, i] = wgts ** 2
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    inv_cov[i[:, None], i] = wgts.T @ wgts

Similarly, we can compute the expectation value, u.dot(inv_cov.dot(v)), between two vectors (numpy arrays) using:

result = 0.0
i, wgts = i_wgts[0]
# 1x1 sub-matrices
if len(i) > 0:
    result += numpy.sum((u[i] * wgts) * (v[i] * wgts))
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))

where result is the desired expectation value.

Setting wgts=True causes gvar.regulate() to return a tuple (gmod, i_wgts, i_invwgts) where i_wgts is the decomposition of the covariance matrix and i_invwgts is the decomposition of the inverse of the covariance matrix.

These decompositions are useful for least squares fitting and simulating correlated data. A Cholesky decomposition is used when eps is specified, while an SVD decomposition is used with svdcut.

Parameters:

g – An array of gvar.GVars or a dicitionary whose values are gvar.GVars and/or arrays of gvar.GVars.
eps (float) – The diagonal elements of the g covariance matrix are multiplied by 1 + eps*norm where norm is the norm of the correlation matrix, numpy.linalg.norm(corr, numpy.inf). Ignored if svdcut is specified (and not None).
svdcut (float) – If positive, replaces eigenvalues eig of the correlation matrix with max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue; if negative, discards eigenmodes with eigenvalues smaller than |svdcut| * max_eig. Note |svdcut| < 1. Default is 1e-12.
wgts – Setting wgts=1 causes gvar.regulate() to compute and return a decomposition of the covariance matrix of the modified gvar.GVars, gmod. Setting wgts=-1 results in a decomposition of the inverse of the covariance matrix. Setting wgts=True results in both decompositions. The default value is False, in which case only gmod is returned.
noise (bool) – If True, noise is added to the correction (see above). Default is False.

Returns:

If wgts is not False, a tuple (g, i_wgts) is returned where i_wgts contains a decomposition of the gmod covariance matrix or its inverse (see above), or if wgts=True both decompositions are returned: (gmod, i_wgts, i_invwgts).

Additional information is stored in gmod:

gmod.correction: Array or dictionary containing the SVD corrections added to g to create gmod: gmod = g + gmod.correction.

gmod.eps: eps used to create gmod if set (otherwise None).

gmod.svdcut: svdcut used to create gmod if set (otherwise None).

gmod.dof: Number of degrees of freedom in gmod.

gmod.nmod: Number of members of gmod modified by regulation.

gmod.nblocks: A dictionary where gmod.nblocks[s] contains the number of block-diagonal s-by-s sub-matrices in the correlation matrix. Useful for debugging.

gmod.logdet: Logarithm of the determinant of the covariance matrix after eps regulation is applied.

gvar.svd(g, svdcut=1e-12, wgts=False, noise=False)¶

Apply SVD cuts to collection of gvar.GVars in g.

Standard usage is, for example,

import gvar as gv
...
gmod = gv.svd(g, svdcut=1e-4)

where g is an array of gvar.GVars or a dictionary containing gvar.GVars and/or arrays of gvar.GVars. When svdcut>0, gmod is a copy of g whose gvar.GVars have been modified to make their correlation matrix less singular than that of the original g: each eigenvalue eig of the correlation matrix is replaced by max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue. This SVD cut, which is applied separately to each block-diagonal sub-matrix of the correlation matrix, increases the variance of the eigenmodes with eigenvalues smaller than svdcut * max_eig.

The modification of g’s covariance matrix is implemented by adding (to g) a set of gvar.GVars with zero means:

gmod = g + gmod.correction

where gmod.correction is an array/dictionary containing the gvar.GVars.

Adding gmod.correction to g increases the uncertainties in g but does not affect random fluctuations in the mean values. If parameter noise=True, random noise is included in gmod.correction,

gmod.correction += gv.sample(gmod.correction),

before it is added to g. This adds random fluctuations to the means in gmod that are commensurate with the additions to the uncertainties. This is important, for example, when interpreting a chi**2 involving gmod, since chi**2 tests whether fluctuations in the means are consistent with the uncertainties.

When svdcut is negative, eigenmodes of the correlation matrix whose eigenvalues are smaller than |svdcut| * max_eig are dropped from the new matrix and the corresponding components of g are zeroed out (that is, replaced by 0(0)) in gmod.

There is an additional parameter wgts in gvar.svd() whose default value is False. Setting wgts=1 or wgts=-1 instead causes gvar.svd() to return a tuple (gmod, i_wgts) where gmod is the modified copy of g, and i_wgts contains an SVD decomposition of either the modified covariance matrix, when wgts=1, or the inverse of the modified covariance matrix, when wgts=-1. The covariance matrix is decomposed into non-overlapping sub-matrices, with i_wgts[0] containing the contributions from all 1x1 sub-matrices. Typical usage, for example to compute the inverse inv_cov of the gmod covariance matrix, is:

gmod, i_wgts = gvar.svd(g, svdcut=1e-6, wgts=-1)
inv_cov = numpy.zeros((gmod.size, gmod.size))
# 1x1 sub-matrices
i, wgts = i_wgts[0]
inv_cov[i, i] = wgts ** 2
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    inv_cov[i[:, None], i] = wgts.T @ wgts

Similarly, we can compute the expectation value, u.dot(inv_cov.dot(v)), between two vectors (numpy arrays) using:

result = 0.0
i, wgts = i_wgts[0]
# 1x1 sub-matrices
if len(i) > 0:
    result += numpy.sum((u[i] * wgts) * (v[i] * wgts))
# nxn sub-matrices (n>1)
for i, wgts in i_wgts[1:]:
    result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))

where result is the desired expectation value. The SVD decompositions are useful for least squares fitting and simulating correlated data.

Setting wgts=True causes gvar.svd() to return a tuple (gmod, i_wgts, i_invwgts) where i_wgts is the decomposition of the covariance matrix and i_invwgts is the decomposition of the inverse of the covariance matrix.

Parameters:

g – An array of gvar.GVars or a dicitionary whose values are gvar.GVars and/or arrays of gvar.GVars.
svdcut (None or float) – If positive, replace eigenvalues eig of the correlation matrix with max(eig, svdcut * max_eig) where max_eig is the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than |svdcut| * max_eig. Note |svdcut| < 1. Default is 1e-12.
wgts – Setting wgts=1 causes gvar.svd() to compute and return a spectral decomposition of the covariance matrix of the modified gvar.GVars, gmod. Setting wgts=-1 results in a decomposition of the inverse of the covariance matrix. Setting wgts=True results in both decompositions. The default value is False, in which case only gmod is returned.
noise – If True, noise is added to the SVD correction (see above). Default is False.

Returns:

A copy gmod of g whose correlation matrix is modified by SVD cuts. If wgts=1, a tuple (gmod,i_wgts) is returned where i_wgts contains an SVD decomposition of gmod’s covariance matrix. If wgts=-1, a tuple (gmod,i_invwgts) is returned where i_invwgts contains an SVD decomposition of the inverse of gmod’s covariance matrix. If wgts=True, both decompositions are returned: (gmod,i_wgts,i_invwgts).

Data from the SVD analysis is stored in gmod:

gmod.svdcut: SVD cut used to create gmod.

gmod.dof: Number of independent degrees of freedom left after the SVD cut. This is the same as the number initially unless svdcut < 0 in which case it may be smaller.

gmod.nmod: Number of modes whose eignevalue was modified by the SVD cut.

gmod.nblocks: A dictionary where gmod.nblocks[s] contains the number of block-diagonal s-by-s sub-matrices in the correlation matrix.

gmod.eigen_range: Ratio of the smallest to largest eigenvalue before SVD cuts are applied (but after rescaling).

gmod.logdet: Logarithm of the determinant of the covariance matrix after SVD cuts are applied (excluding any omitted modes when svdcut < 0 and any diagonal zero modes).

gmod.correction: Array or dictionary containing the SVD corrections added to g to create gmod: gmod = g + gmod.correction.

This function is useful when the correlation matrix is singular or almost singular, and its inverse is needed (as in curve fitting).

The following function can be used to rebuild collections of gvar.GVars, ignoring all correlations with other variables. It can also be used to introduce correlations between uncorrelated variables.

gvar.rebuild(g, corr=0.0, gvar=gvar.gvar)¶

Rebuild g stripping correlations with variables not in g.

g is either an array of gvar.GVars or a dictionary containing gvar.GVars and/or arrays of gvar.GVars. rebuild(g) creates a new collection gvar.GVars with the same layout, means and covariance matrix as those in g, but discarding all correlations with variables not in g.

If corr is nonzero, rebuild will introduce correlations wherever there aren’t any using

cov[i,j] -> corr * sqrt(cov[i,i]*cov[j,j])

wherever cov[i,j]==0.0 initially. Positive values for corr introduce positive correlations, negative values anti-correlations.

Parameter gvar specifies a function for creating new gvar.GVars that replaces gvar.gvar() (the default).

Parameters:

g (array or dictionary) – gvar.GVars to be rebuilt.
gvar (gvar.GVarFactory or None) – Replacement for gvar.gvar() to use in rebuilding. Default is gvar.gvar().
corr (number) – Size of correlations to introduce where none exist initially.

Returns:

Array or dictionary (gvar.BufferDict) of gvar.GVars (same layout as g) where all correlations with variables other than those in g are erased.

The following functions creates new functions that generate gvar.GVars (to replace gvar.gvar()):

gvar.switch_gvar()¶

Switch gvar.gvar() to new gvar.GVarFactory.

gvar.GVars created from different factory functions (see gvar.gvar_factory()), with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixes gvar.GVars in this way will generate an error message: “incompatible GVars”.

Parameters:: cov – Covariance matrix for new gvar.gvar(). A new covariance matrix is created if cov=None (default). If cov is a gvar.GVar, the covariance matrix of the gvar.GVar is used.
Returns:: New gvar.gvar().

gvar.restore_gvar()¶

Restore previous gvar.gvar().

gvar.GVars created from different factory functions (see gvar.gvar_factory()), with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixes gvar.GVars in this way will generate an error message: “incompatible GVars”.

Returns:: Previous gvar.gvar().

gvar.gvar_factory(cov=None)¶

Return new function for creating gvar.GVars (to replace gvar.gvar()).

If cov is specified, it is used as the covariance matrix for new gvar.GVars created by the function returned by gvar_factory(cov). Otherwise a new covariance matrix is created internally.

gvar.GVars created from different factory functions, with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixes gvar.GVars in this way will generate an error message: “incompatible GVars”.

`gvar.GVar` Objects¶

The fundamental class for representing Gaussian variables is:

class gvar.GVar¶

The basic attributes are:

mean¶: Mean value.

sdev¶: Standard deviation.

var¶: Variance.

Two methods allow one to isolate the contributions to the variance or standard deviation coming from other gvar.GVars:

partialvar(*args)¶

Compute partial variance due to gvar.GVars in args.

This method computes the part of self.var due to the gvar.GVars in args. If args[i] is correlated with other gvar.GVars, the variance coming from these is included in the result as well. (This last convention is necessary because variances associated with correlated gvar.GVars cannot be disentangled into contributions corresponding to each variable separately.)

Parameters:: args[i] – A gvar.GVar or array/dictionary of gvar.GVars contributing to the partial variance.
Returns:: Partial variance due to all of args.

partialsdev(*args)¶

Compute partial standard deviation due to gvar.GVars in args.

This method computes the part of self.sdev due to the gvar.GVars in args. If args[i] is correlated with other gvar.GVars, the standard deviation coming from these is included in the result as well. (This last convention is necessary because variances associated with correlated gvar.GVars cannot be disentangled into contributions corresponding to each variable separately.)

Parameters:: args[i] (gvar.GVar or array/dictionary of gvar.GVars) – Variables contributing to the partial standard deviation.
Returns:: Partial standard deviation due to args.

Partial derivatives of the gvar.GVar with respect to the independent gvar.GVars from which it was constructed are given by:

deriv(x)¶

Derivative of self with respest to primary gvar.GVars in x.

All gvar.GVars are constructed from primary gvar.GVars (see gvar.is_primary()). self.deriv(x) returns the partial derivative of self with respect to primary gvar.GVar x, holding all of the other primary gvar.GVars constant.

Parameters:: x – A primary gvar.GVar or an array of primary gvar.GVars.
Returns:: Derivatives of self with respect to the gvar.GVars in x. The result has the same shape as x.

There are three methods for converting self into a string for printing and display, and fourth for modifying the defaults used in formatting:

__format__(spec)¶

Format strings for gvar.GVars.

Support is provided for standard presentation formats normally used for floats including: 'e', 'f', 'g', 'n', and '%'. The format is applied to the mean value and the output modified to include the standard deviation when the mean is larger in magnitude than the standard deviation:

>>> x = gvar.gvar(12.314, 1.56)
>>> print(f'{x:.2g}'')
12(2)
>>> print(f'{x:.^25.3e}')
.....1.231(156)e+01......

The format is applied first to the standard deviation when it is the larger of the two.

There is an additional format, 'p', where the precision field specifies the number of digits displayed in the standard deviation:

>>> print(f'{x:.2p}')
12.3(1.6)

The default precision is 2.

An alternative form '#p' of this format may add further digits to the standard deviation (when it is larger in magnitude than the mean) so that at least one non-zero digit of the mean is displayed:

>>> x = gvar.gvar(9, 123)
>>> print(f'{x:#.2p}'')
9(123)

If the mean is exactly zero, the result is ‘0 ± sdev’.

Format 'P' is similar to '#p' but always uses the plus-minus format: 'mean ± sdev'.

The default format, when none is specified, is ‘{:#.2p}’. This can be changed using GVar.set(default_format=...).

__str__()¶: Returns string using the default format: f'{self}'.

fmt(ndecimal=None, sep='', format='{}')¶

Format gvar.GVar.

Typical usage:

>>> g = gvar.gvar(27.9315, 1.23)
>>> print(g.fmt(), g.fmt(format='{:.2g}'), g.fmt(ndecimal=3))
27.9(1.2) 28(1) 27.931(1.230)
>>> print(g.fmt(ndecimal=-1), g.fmt(sep='|'))
27.9 ± 1.2 27.9|(1.2)

Parameters:

ndecimal (int or None) – Format gvar.GVar using f'{self:.{ndecimal}f}' when ndecimal is a non-negative integer, or f'{self:.2P}' when ndecimal is negative. Ignored if ndecimal=None (default).
sep (str) – String inserted between the mean and the (sdev) in 'mean(sdev)'. Default is sep=''.
format (str) – Format string. Ignored if ndecimal is not None. Default is '{}'.

Returns:

Formatted string.

static set(**kargs)¶

Change gvar.GVar’s format parameters.

Typical usage is:

# change default format spec and plus/minus symbol
oldparam = GVar.set(default_format='{:.3f}', plusminus=' +/- ')
...
# change back
GVar.set(**oldparam)

The following parameters can be reset.

Parameters:

formatter (callable) – Sets the default formatter. The formatted string is returned by formatter(g, spec) where g is the gvar.GVar to be formatted and spec is the format specification (e.g, '.3g' or '.^20.3f'). Setting formatter=None restores the original formatter. Setting formatter='old' switches to the formatter used before gvar version 12.0 (buggy and deprecated).
default_format (str) – Sets the format used when none is specified. (Note that default_format='{}' is not allowed.) The format can be specified as a full format string (e.g., '{:.4g}') or with just the format specification ('.4g'). Default is '{:#.2p}'.
plusminus (str) – Sets the plus/minus symbol(s) for format strings. Default is ' ± '.
prange (tuple) – Sets the range of values of abs(g.mean)/g.sdev formatted with a compressed format when using the 'p' presentation format. Values outside this range are formatted as '{g.mean} ± {g.sdev}' with the 'p' format. Default is (1e-5, 10 ** sys.float_info.dig).

Two attributes and a method make reference to the original variables from which self is derived:

cov¶: Underlying covariance matrix (type gvar.smat) shared by all gvar.GVars.

der¶: Array of derivatives with respect to underlying (original) gvar.GVars.

dotder(v)¶: Return the dot product of self.der and v.

`gvar.BufferDict` Objects¶

gvar.BufferDict objects are ordered dictionaries that are heavily used in gvar’s implementation. They provide the most flexible representation for multi-dimensional Gaussian distributions.

gvar.BufferDict objects differ from ordinary dictionaries in two respects. The first difference is that the dictionary’s values must be scalars or numpy arrays (any shape) of scalars. The scalars can be ordinary integers or floats, but the dictionary was designed especially for gvar.GVars. The dictionary’s values are packed into different parts of a single one-dimensional array or buffer. Items can be added one at a time as in other dictionaries,

>>> import gvar as gv
>>> b = gv.BufferDict()
>>> b['s'] = 0.0
>>> b['v'] = [1., 2.]
>>> b['t'] = [[3., 4.], [5., 6.]]
>>> print(b)
{
    's': 0.0,
    'v': array([1., 2.]),
    't': array([[3., 4.],
                [5., 6.]]),
 }

but the values can also be accessed all at once through the buffer:

>>> print(b.buf)
[0. 1. 2. 3. 4. 5. 6.]

A previous entry can be overwritten, but the size, shape, and type of the data needs to be the same. For example, here setting b['s'] = [10., 20.] would generate a ValueError exception, but b['s'] = 22. is fine.

The second difference between gvar.BufferDicts and other dictionaries is illustrated by the following code:

>>> b = gv.BufferDict()
>>> b['log(a)'] = gv.gvar('1(1)')
>>> print(b)
{'log(a)': 1.0(1.0)}
>>> print(b['a'], b['log(a)'])
2.7(2.7) 1.0(1.0)

Even though 'a' is not a key in the dictionary, b['a'] is still defined: it equals exp(b['log(a)']). This feature is used to provide (limited) support for non-Gaussian distributions. Here gvar.BufferDict b specifies a distribution that is Gaussain for p['log(a)'], and therefore log-normal for p['a']. Thus, for example,

>>> [x['a'] for x in gv.raniter(b, n=4)]
[2.1662650927997817, 2.3350022125310317, 8.732161128765775, 3.578188553455522]

creates a list of four random numbers drawn from a log-normal distribution. Note that 'a' in this example is not a key in dictionary b, even though both b['a'] and b.get('a') return values:

>>> print('a' in b)
False
>>> print('log(a)' in b)
True
>>> print(list(b))
['log(a)']
>>> print(b['a'], b.get('a'))
2.7(2.7) 2.7(2.7)

This functionality is used routinely by other modules (e.g., lsqfit). The supported distributions, and methods for adding new ones, are described in the documentation for gvar.BufferDict.add_distribution(), below.

class gvar.BufferDict(*args, **kargs)¶

Ordered dictionary whose values are packed into a 1-d buffer (numpy array).

gvar.BufferDicts can be created in the usual way dictionaries are created:

>>> b = BufferDict()
>>> b['a'] = 1
>>> b['b'] = [2.7, 3.2]
>>> print(b)
{'a': 1.0, 'b': array([2.7, 3.2])}
>>> print(b.buf)
[1. 2.7 3.2]
>>> b = BufferDict(a=1, b=[2.7, 3.2])
>>> b = BufferDict([('a',1.), ('b',[2.7, 3.2])])

They can also be created from other dictionaries or gvar.BufferDicts:

>>> c = BufferDict(b)
>>> print(c)
{'a': 1.0, 'b': 2.7}
>>> c = BufferDict(b, keys=['b'])
>>> print(c)
{'b': 2.7}

The keys keyword restricts the copy of a dictionary to entries whose keys are in keys and in the same order as in keys.

The values in a gvar.BufferDict are scalars or arrays of a scalar type (gvar.GVar, float, int, etc.). The data type of the buffer is normally inferred (dynamically) from the data itself:

>>> b = BufferDict(a=1)
>>> print(b, b.dtype)
{'a': 1} int64
>>> b['b'] = 2.7
>>> print(b, b.dtype)
{'a': 1.0, 'b': 2.7} float64

The data type of the gvar.BufferDict’s buffer can be specified when creating the gvar.BufferDict from another dictionary or list by using keyword dtype:

>>> b = BufferDict(dict(a=1.2), dtype=int)
>>> print(b, b.dtype)
{'a': 1} int64
>>> b['b'] = 2.7
>>> print(b, b.dtype)
{'a': 1, 'b': 2} int64

Note in this example that the data type is not changed by subsequent additions to the gvar.BufferDict when the dtype keyword is specified. To create an empty gvar.BufferDict with a specified type use, for example, BufferDict({}, dtype=int). Any data type supported by numpy arrays can be specified.

Some simple arithemetic is allowed between two gvar.BufferDicts, say, g1 and g2 provided they have the same keys and array shapes. So, for example:

>>> a = BufferDict(a=1, b=[2, 3])
>>> b = BufferDict(b=[20., 30.], a=10.)
>>> print(a + b)
    {'a': 11.0, 'b': array([22., 33.])}

Subtraction is also defined, as are multiplication and division by scalars. The corresponding +=, -=, *=, /= operators are supported, as are unary + and -.

Finally a gvar.BufferDict can be cloned from another one using a different buffer (containing different values):

>>> b = BufferDict(a=1., b=2.)
>>> c = BufferDict(b, buf=[10, 20])
>>> print(c)
{'a': 10, 'b': 20}

The new buffer itself (i.e., not a copy) is used in the new gvar.BufferDict if the buffer is a numpy array:

>>> import numpy as np
>>> b = BufferDict(a=1., b=2.)
>>> nbuf = np.array([10, 20])
>>> c = BufferDict(b, buf=nbuf)
>>> c['a'] += 1
>>> print(c)
{'a': 11, 'b': 20}
>>> print(nbuf)
[11 20]

If many clones of a gvar.BufferDict are needed, it is usually substantially more efficient to create them together in a single batch (or a small number of large batches). For example, the following code

>>> a = BufferDict(s=1., v=[2., 3., 4.])
>>> print(a, '   ', a.buf)
{'s': 1.0, 'v': array([2., 3.])}    [1. 2. 3. 4.]
>>> lbuf = np.array(
... [[10., 20., 30., 40.], [100., 200., 300., 400.], [1000., 2000., 3000., 4000.]]
... )
>>> lb = BufferDict(a, lbatch_buf=lbuf)
>>> print(lb)
{
    's': array([  10.,  100., 1000.]),
    'v':
    array([[  20.,   30.,   40.],
        [ 200.,  300.,  400.],
        [2000., 3000., 4000.]]),
}

merges three clones of the original gvar.BufferDict a into gvar.BufferDict lb. Each clone has its own buffer lbuf[i] where i=0...2 labels the clone. Each element of lb has an extra (leftmost) index labeling the clone: lb['s'][i] and lb['v'][i, :]. If needed, the individual clones can be pulled out into separate dictionaries using gvar.BufferDict.batch_iter():

>>> for clone in b.batch_iter('lbatch'):
...     print(clone)
...
{'s': 10.0, 'v': array([20., 30., 40.])}
{'s': 100.0, 'v': array([200., 300., 400.])}
{'s': 1000.0, 'v': array([2000., 3000., 4000.])}

These clones all provide views into b.buf.

The batch index is always the leftmost index in the example above. It is frequently more convenient for the batch index to be the rightmost index. This is done by setting rbatch_buf rather than lbatch_buf when making the batch gvar.BufferDict: for example,

>>> rbuf = np.array(
... [[10., 100., 1000.], [20., 200., 2000.], [30., 300., 3000.], [40., 400., 4000.]]
... )
>>> rb = BufferDict(a, rbatch_buf=rbuf)
>>> print(rb)
{
    's': array([  10.,  100., 1000.]),
    'v':
    array([[  20.,  200., 2000.],
           [  30.,  300., 3000.]]),
           [  40.,  400., 4000.]]),
}

creates a batch gvar.BufferDict equivalent to the previous one but now where the batch index is the last index for each element: rb[k][...,i] for each key k and batch index i=0..2. The batch buffer rbuf used here is the transpose of that used for the lbatch case; rb.buf is a view into rbuf (not a copy as is the case for lbatch). Again we can iterate over the individual clones (and these are the same as above):

>>> for clone in rb.batch_iter('rbatch'):
...     print(clone)
...
{'s': 10.0, 'v': array([20., 30., 40.])}
{'s': 100.0, 'v': array([200., 300., 400.])}
{'s': 1000.0, 'v': array([2000., 3000., 4000.])}

The main attributes are:

size¶: Size of buffer array.

flat¶: Buffer array iterator.

dtype¶: Data type of buffer array elements.

buf¶: The (1d) buffer array. Allows direct access to the buffer: for example, self.buf[i] = new_val sets the value of the i-th element in the buffer to value new_val. Setting self.buf = nbuf replaces the old buffer by new buffer nbuf. This only works if nbuf is a one-dimensional numpy array having the same length as the old buffer, since nbuf itself is used as the new buffer (not a copy).

shape¶: Always equal to None. This attribute is included since gvar.BufferDicts share several attributes with numpy arrays to simplify coding that might support either type. Being dictionaries they do not have shapes in the sense of numpy arrays (hence the shape is None).

In addition to standard dictionary methods, the main methods here are:

flatten()¶: Copy of buffer array.

slice(k)¶: Return slice/index in self.flat corresponding to key k.

slice_shape(k)¶: Return tuple (slice/index, shape) corresponding to key k.

has_dictkey(k)¶

Returns True if self[k] is defined; False otherwise.

Note that k may be a key or it may be related to a another key associated with a non-Gaussian distribution (e.g., 'log(k)'; see gvar.BufferDict.add_distribution() for more information).

all_keys()¶

Iterator over all keys and implicit keys.

For example, the following code

b = BufferDict()
b['log(x)'] = 1.
for k in b.keys():
    print(k, b[k])
print()
for k in b.all_keys():
    print(k, b[k])

gives the following output:

log(x) 1

log(x) 1
x 2.718281828459045

Here 'x' is not a key in dictionary b but b['x'] is defined implicitly from b['log(x)']. See gvar.BufferDict.add_distribution() for more information.

static add_distribution(name, invfcn)¶

Add new parameter distribution.

gvar.BufferDicts can be used to represent a restricted class of non-Gaussian distributions. For example, the code

import gvar as gv
gv.BufferDict.add_distribution('log', gv.exp)

enables the use of log-normal distributions for parameters. So defining, for example,

b = gv.BufferDict()
b['log(a)'] = gv.gvar('1(1)')

means that b['a'] has a value (equal to exp(b['log(a)']) even though 'a' is not a key in the dictionary.

The distributions available by default correspond to:

gv.BufferDict.add_distribution('log', gv.exp)
gv.BufferDict.add_distribution('sqrt', gv.square)
gv.BufferDict.add_distribution('erfinv', gv.erf)

See also gvar.BufferDict.uniform() for uniform distributions.

Parameters:

name (str) – Distributions’ function name. A ValueError is raised if name is already being used for a distribution; the error can be avoided by deleting the old definition first using BufferDict.del_distribution().
invfcn (callable) – Inverse of the transformation function.

static del_distribution(name)¶

Delete gvar.BufferDict distribution name.

Raises a ValueError if name is not the name of an existing distribution.

static has_distribution(name)¶: True if name has been defined as a distribution; False otherwise.

static uniform(fname, umin, umax, shape=())¶

Create uniform distribution on interval [umin, umax].

The code

import gvar as gv
b = gv.BufferDict()
b['f(w)'] = gv.BufferDict.uniform('f', 2., 3.)

adds a distribution function f(w) designed so that b['w'] corresponds to a uniform distribution on the interval [2., 3.] (see gvar.BufferDict.add_distribution() for more about distributions).

Parameters:

fname (str) – Name of function used in the BufferDict key. Note that names can be reused provided they correspond to the same interval as in previous calls.
umin (float) – Minimum value of the uniform distribution.
umax (float) – Maximum value of the uniform distribution.
shape (tuple) – Shape of array of uniform variables. Default is ().

Returns:

gvar.GVar object corresponding to a uniform distribution.

batch_iter(mode)¶

Iterate over dictionaries in a batch gvar.BufferDict.

Parameters:: mode (str) – Type of batch gvar.BufferDict: 'rbatch' if the batch index on each entry is on the right; or 'lbatch' if the batch index is on the left.
Returns:: An iterator that yields the individual dictionaries merged into the batch gvar.BufferDict. The returned dictionaries are type dict (not type gvar.BufferDict). The values are views into the original gvar.BufferDict, not copies.

`gvar.SVD` Objects¶

SVD analysis is handled by the following class:

class gvar.SVD(mat, svdcut=None, svdnum=None, compute_delta=False, rescale=False)¶

SVD decomposition of a pos. semi-def. sym. matrix.

SVD is a function-class that computes the eigenvalues and eigenvectors of a symmetric matrix mat. Eigenvalues that are small (or negative, because of roundoff) can be eliminated or modified using svd cuts. Typical usage is:

>>> mat = [[1.,.25],[.25,2.]]
>>> s = SVD(mat)
>>> print(s.val)             # eigenvalues
[ 0.94098301  2.05901699]
>>> print(s.vec[0])          # 1st eigenvector (for s.val[0])
[ 0.97324899 -0.22975292]
>>> print(s.vec[1])          # 2nd eigenvector (for s.val[1])
[ 0.22975292  0.97324899]

>>> s = SVD(mat,svdcut=0.6)  # force s.val[i]>=s.val[-1]*0.6
>>> print(s.val)
[ 1.2354102   2.05901699]
>>> print(s.vec[0])          # eigenvector unchanged
[ 0.97324899 -0.22975292]

>>> s = SVD(mat)
>>> w = s.decomp(-1)         # decomposition of inverse of mat
>>> invmat = sum(numpy.outer(wj,wj) for wj in w)
>>> print(numpy.dot(mat,invmat))    # should be unit matrix
[[  1.00000000e+00   2.77555756e-17]
 [  1.66533454e-16   1.00000000e+00]]

Parameters:

mat – Positive, symmetric matrix.
svdcut – If positive (or zero), replace mat’s eigenvalues e[i] -> max(e[i], svdcut*max(e)); otherwwise, discard eigenmodes with eigenvalues smaller than |svdcut| times the maximum eigenvalue.
svdnum – If positive, keep only the modes with the largest svdnum eigenvalues; ignore if set to None.
compute_delta (bool) – Compute delta (see below) if True; set delta=None otherwise.
rescale – Rescale the input matrix to make its diagonal elements equal to +-1.0 before diagonalizing.
warn – Suppress warnings if False. Default is True.

The results are accessed using:

val¶: An ordered array containing the eigenvalues of mat``after the SVD cut. Note that ``val[i]<=val[i+1].

vec¶: Eigenvectors vec[i] corresponding to the eigenvalues val[i].

valmin¶: Minimum eigenvalue allowed in the modified matrix.

valorig¶: Eigenvalues of original matrix.

D¶: The diagonal matrix used to precondition the input matrix if rescale==True. The matrix diagonalized is D M D where M is the input matrix. D is stored as a one-dimensional vector of diagonal elements. D is None if rescale==False.

nmod¶: The first nmod eigenvalues in self.val were modified by the SVD cut.

eigen_range¶: Ratio of the smallest to the largest eigenvector in the unconditioned matrix (after rescaling if rescale=True)

delta¶: A vector of gvars whose means are zero and whose covariance matrix is what was added to mat to condition its eigenvalues. Is None if svdcut<0 or compute_delta==False.

decomp(n)¶

Vector decomposition of input matrix raised to power n.

Computes vectors w[i] such that

mat**n = sum_i numpy.outer(w[i],w[i])

where mat is the original input matrix to svd. This decomposition cannot be computed if the input matrix was rescaled (rescale=True) except for n=1 and n=-1.

Parameters:: n (number) – Power of input matrix.
Returns:: Array w of vectors.

PDF-related Objects¶

Expectation values using probability density functions defined by collections of gvar.GVars can be evaluated using the vegas module (for multi-dimensional integration) and class vegas.PDFIntegrator. Related classes are:

class gvar.PDF(g, svdcut=1e-12, mode=None)¶

Probability density function (PDF) for g.

Given an array or dictionary g of gvar.GVars, pdf=PDF(g) is the probability density function for the (correlated) multi-dimensional Gaussian distribution defined by g. That is pdf(p) is the probability density for random sample p drawn from g. The logarithm of the PDF is obtained using pdf.logpdf(p).

Parameters:

g – gvar.GVar or array of gvar.GVars, or dictionary of gvar.GVars or arrays of gvar.GVars.
svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using gvar.regulate() with an SVD cutoff svdcut. Default is svdcut=1e-12.
eps (non-negative float) – If positive, singularities in the correlation matrix for g are regulated using gvar.regulate() with cutoff eps. Ignored if svdcut is specified (and not None).
noise (bool) – If True adds noise to the corrections caused by svdcut/eps (see documentation for gvar.regulate()). Default value is False.
mode (str or None) –
The PDF is evaluated simultaneously for batches of points in the parameter space when mode equals either rbatch or lbatch. For example, in the code
```
import gvar as gv
g = gv.gvar([1, 2], [[1., .1], [.1, 2]])
pdf = gvar.PDF(g, mode='rbatch')
p = gv.sample(g, nbatch=1000, mode='rbatch')
pdf_p = pdf(p)
```
p[d, i] represents a collection of 1000 points in the parameter space, labeled by batch index i=0..999. Index d=0..3 labels directions in parameter space. pdf_p=pdf(p) returns the PDF value pdf_p[i] for each point. This is generally much more efficient than calculating the PDF values one at a time (for example, in PDF integrals using vegas).

In batch mode an extra batch index is added to each parameter. The index is the rightmost index when mode='rbatch' and the leftmost index when mode='lbatch' (e.g., p[d,i] and p[i,d], respectively, in the example above).

Default is mode=None which turns batch mode off.
decorrelate (bool) – If True correlations between parameters are ignored. Ignored otherwise. Default value is False.

Objects of type PDF have the following attributes:

size¶

Dimension of the parameter space.

Type:: int

nchiv¶

Dimension of the self.chiv(p) vector. This can be smaller than self.size if svdcut is negative.

Type:: int

shape¶

Shape of g array or None if g is a dictionary.

Type:: tuple or None

mean¶

Mean values of g.

Type:: array or dictionary

meanflat¶

Flattened array containing mean values of g.

Type:: array

cov¶

Modified covariance matrix of the elements in g.flat[:].

Type:: array

invcov¶

Inverse of modified covariance matrix.

Type:: array

i_wgts¶: gvar.regulate() decomposition of the modified covariance matrix.

i_invwgts¶: gvar.regulate() decomposition of the inverse modified covariance matrix.

logdet¶: Determinant of modified covariance matrix.

dp_dchiv¶: Jacobian $\mathrm{det}(\partial p_i/\partial \chi_j)$ .

distribution¶: Array or dictionary of gvar.GVars describing the distribution corresponding to the PDF. This may differ from the initial distribution if svdcut or eps is nonzero.

correction¶: self.distribution - self.correction is the original distribution before applying svdcut/eps corrections.

nmod¶: Number of eigenmodes modified by the SVD cut.

nblocks¶

nblocks[s] equals the number of block-diagonal sub-matrices of the y–prior covariance matrix that are size s-by-s. This is sometimes useful for debugging.

Type:: dict

The main methods are:

chi2(p, mode='default')¶: sum(self.chiv(p)**2).

chiv(p, pflat=None, mode='default')¶

Returns $\chi_i(p)$ where $\chi^2(p) = \sum_i \chi_i^2(p)$ .

The PDF is proportional to $\exp(-\chi^2(p)/2)$ where

$\chi^2(p) = \Delta p \cdot\mathrm{cov}^{-1}_p\cdot\Delta p$

$\Delta p_i\equiv \overline p_i - p_i$ , and $\overline p_i$ and $\mathrm{cov}_p$ are the means and covariance matrix of the distribution. By replacing the parameters $p_i$ with a new set of uncorrelated parameters (linear combinations of the originals), $\chi^2(p)$ is reexpressed as a sum of terms $(\chi_i(p))^2$ . By default the individual terms correspond to individual eigenmodes of the correlation matrix. If eps is specified (instead of svdcut), a Cholesky decomposition is used instead (see gvar.regulate()).

Parameters:

p – An array or dictionary represention a point in the distribution’s parameter space (or multiple points if in batch mode). Ignored if pflat is specified.
pflat (array) – The coordinates of a point in parameter space flattened into a 1-d array. In batch mode an extra batch index is added to the array, either on the right (mode='rbatch') or on the left (mode='lbatch'), to represent multiple points.
mode – Batch mode. Defaults to self.mode if not specified (mode='default').

Returns:

An array containing $\chi_i(p)$ . In batch mode the array has an extra batch index (on the right or left depending upon mode).

pflat(chiv, mode='default')¶

Inverse of PDF.chiv().

Parameters:

chiv – An array of values for $\chi_i(p)$ evaluate a point p in paramter space. In batch mode the array has an extra batch index, either on the right (mode='rbatch') or on the l left (mode='lbatch'), to identify multiple points in parameter space.
mode – Batch mode used. Defaults to self.mode if not specified (mode='default').

Returns:

Am array containing the flattened version of p corresponding to chiv. In batch mode the array has an extra batch index (on the right or left depending upon mode).

__call__(p, mode='default')¶

The PDF evaluated at p.

The PDF is exp(-sum(self.chiv(p)**2)/2 - self.lognorm).

logpdf(p, mode='default')¶

Logarithm of the PDF evaluated at p.

The logarithm of the PDF is -sum(self.chiv(p)**2)/2 - self.lognorm.

sample(nbatch=None, uniform=None, mode='default')¶

Generate random sample(s) from the PDF.

Parameters:

nbatch (int or None) – If not None, the iterator will return nbatch samples drawn from the PDF. The results are packaged in arrays or dictionaries whose elements have an extra index labeling the different samples in the batch. The batch index is the rightmost index if mode='rbatch'; it is the leftmost index if mode is 'lbatch'. Ignored if nbatch is None (default).
mode (bool) – Batch mode. Allowed modes are 'rbatch'``or ``'lbatch', corresponding to batch indices that are on the right or the left, respectively. Defaults to self.mode if not specified (mode='default').
uniform (float or None) – Replace Gaussian distribution specified by g with a uniform distribution covering the interval [-uniform, uniform] times the standard deviation centered on the mean along each principal axis the distribution. Ignored if None (default).

class gvar.PDFStatistics(moments=None, histogram=None)¶

Compute statistical information about a distribution.

Given moments mom[i] of a random variable, PDFStatistics computes the mean, standard deviation, skewness, and excess kurtosis.

With histogram data, it estimates the median and the intervals on either side of the median (each) containing 34% of the probability. Also it does a maximum likelihood fit of a split-normal distribution to the histogram. Finally the moments are estimated from the histogram if they are not otherwise specified.

Typical usage:

>>> stats = gvar.PDFStatistics(moments=mom, histogram=(bins, prob))
>>> print(stats)
    mean = 0.9548573(45)   sdev = 0.028731(14)   skew = -1.0217(42)   ex_kurt = 1.643(29)
    split-normal: 0.9765402(99) +/- 0.0124721(50)/0.038875(19)
            median: 0.9596352(52) +/- 0.0222820(67)/0.032022(17)
>>> stats.plot_histogram(show=True)

where the uncertainties on the various quantities reflect uncertainties specified in the inputs. The distribution in this example is significantly skewed, with the tail on the negative side of the peak roughly three times wider than that on the positive side. The last line displays a plot of the histogram, overlayed with plots of the Gaussians or two-sided Gaussians corresponding to the mean and standard deviation, or the median or split-normal distributions.

Parameters:

moments (array of floats or GVars) – moments[i] is the (i+1)-th moment. Optional argument unless histgram=None.
histogram (tuple) – Tuple (bins,prob) where prob[i] is the probability in the bin between bins[i-1] and bins[i]. prob[0] is the probability below bins[0] and prob[-1] is the probability above bins[-1]. Array bins is ordered. Optional argument unless moments=None.

The attributes are as follows:

mean¶: mean value

sdev¶: standard deviation

skew¶: skewness coefficient

ex_kurt¶: excess kurtosis

median¶

self.median.loc is the location of the median estimated from the histogram, while the intervals

(self.median.loc, self.median.loc + self.median.plus)
(self.median.loc - self.median.minus, self.median.loc)

each contain 34% of the probability.

splitnormal¶: self.splitnormal.loc is the location of the peak of the split-normal distribution fit to the histogram. The standard deviations above and below that point are self.splitnormal.plus and self.splitnormal.minus, respectively.

gvar¶: gvar.gvar(mean, sdev)

moments¶: array containing the values of (up to) the first four moments.

bins¶: array of bin edges used for the histogram

prob¶: array of probabilities prob[i] associated with the intervals (bins[i-1], bins[i]). prob[0] is the probability from below bins[0]; prob[-1] is the probability from above bins[-1].

plot_histogram(plot=None, show=False)¶

Plot histogram.

Plots the histogram, overlayed with plots of the Gaussian or two-sided Gaussians corresponding to the mean and standard deviation, or the median or split-normal distributions.

Parameters:

plot – Plotter. Set to matplotlib.pyplot if plot=None (default).
show (bool) – Plot is displayed if show=True, and not otherwise (default).

Returns:

The plotter.

class gvar.PDFHistogram(g, nbin=None, binwidth=None, bins=None)¶

Utility class for creating PDF histograms.

This class is designed to facilitate studies of probability density functions associated with gvar.GVars. The following code, for example, makes a histogram of probabilities for the Gaussian distribution corresponding to gvar.GVar 1.0(5):

g = gv.gvar('1.0(5)')
data = [g() for i in range(10000)]

hist = gv.PDFHistogram(g)
count = hist.count(data)
a = hist.analyze(count)
print('probabilities:', a.prob)
print('statistics:\n', a.stats)

Here hist defines a histogram with 8 bins, centered on g.mean, and each with width equal to g.sdev. The data in array data is a random sampling from g’s distribution. The number of data elements in each bin is determined by hist.count(data) and turned into probabilities by hist.analyze(count). The probabilities (a.prob) and a statistical analysis of the probability distribution based on the histogram (a.stats) are then printed out:

probabilities: [ 0.0017  0.0213  0.1358  0.3401  0.3418  0.1351  0.0222  0.0018]
statistics:
      mean = 1.001   sdev = 0.52334   skew = 0.0069999   ex_kurt = 0.034105
   median = 1.00141666398   plus = 0.499891542549   minus = 0.501710986504

A plot of the histogram can be created and displayed using, for example:

plt = hist.make_plot(count)
plt.xlabel('g')
plt.ylabel('probabilities')
plt.show()

vegas.PDFIntegrator can be used to create histograms for more complicated, multi-dimensional distributions: the expectation value of hist.count(f(p)) over values of p drawn from a multi-dimensional distribution gives the the probability distribution for function f(p).

Parameters:

g (gvar.GVar or None) – The mean and standard deviation of g are used to design the histogram bins, which are centered on g.mean. Ignored if None (in which case bins must be specified).
nbin (int) – The number of histogram bins. Set equal to PDFHistogram.default_nbin (=8 initially) if None.
binwidth (float) – The width of each bin is binwidth * g.sdev. Set equal to PDFHistogram.default_binwidth (=1. initially) if None.
bins (array or None) – Ignored if None (default). Otherwise specifies the histogram’s bin edges, overriding the default bin design specified by g. len(bins) is one larger than the number of bins. If specified it overrides the default bin design indicated by g. One of g or bins must be specified.

The main attributes are:

g¶: gvar.GVar used to design the histogram.

bins¶: Bin edges for the histogram (see above).

midpoints¶: Bin midpoints.

widths¶: Bin widths.

The main methods are:

count(data)¶

Compute histogram of data.

Counts the number of elements from array data in each bin of the histogram. Results are returned in an array, call it h, of length nbin+2 where h[0] is the number of data elements that fall below the range of the histogram, h[-1] (i.e., h[nbin+1]) is the number that fall above the range, and h[i] is the number in the i-th bin for i=1...nbin.

Argument data can also be a float, in which case the result is the same as from histogram([data]). Note that the expectation value of count(f(p)) over parameter values p drawn from a random distribution gives the probabilities for values of f(p) to fall in each histogram bin. Dividing by the bin widths gives the average probability density for random variable f(p) in each bin.

Bin intervals are closed on the left and open on the right, except for the last interval which is closed on both ends.

analyze(count)¶

Analyze count data from PDFHistogram.count().

Turns an array of counts (see PDFHistogram.count()) into a histogram of probabilities, and estimates the mean, standard deviation, and other statistical characteristics of the corresponding probability distribution.

Parameters:: count (array) – Array of length nbin+2 containing histogram data where count[0] is the count for values that are below the range of the histogram, count[-1] is the count for values above the range, and count[i] is the count for the i-th bin where i=1...nbin.

Returns a named tuple containing the following information (in order):

bins: Array of bin edges for histogram (length nbin+1)

prob: Array of probabilities for each bin.

stats: Statistical data about histogram. See PDFStatistics.

norm: Convert counts into probabilities by dividing by norm.

make_plot(count, plot=None, show=False, plottype='probability', bar=dict(alpha=0.15, color='b'), errorbar=dict(fmt='b.'), gaussian=dict(ls='--', c='r'))¶

Convert histogram counts in array count into a plot.

Parameters:

count (array) – Array of histogram counts (see PDFHistogram.count()).
plot (plotter) – matplotlib plotting window. If None uses the default window. Default is None.
show (boolean) – Displayes plot if True; otherwise returns the plot. Default is False.
plottype (str) – The probabilities in each bin are plotted if plottype='probability' (default). The average probability density is plot if plottype='density'. The cumulative probability is plotted if plottype=cumulative.
bar (dictionary) – Additional plotting arguments for the bar graph showing the histogram. This part of the plot is omitted if bar=None.
errorbar (dictionary) – Additional plotting arguments for the errorbar graph, showing error bars on the histogram. This part of the plot is omitted if errorbar=None.
gaussian (dictionary) – Additional plotting arguments for the plot of the Gaussian probability for the gvar.GVar (g) specified in the initialization. This part of the plot is omitted if gaussian=None or if no g was specified.

Requirements¶

gvar makes heavy use of numpy for array manipulations. It also uses the numpy code for implementing elementary functions (e.g., sin, exp …) in terms of member functions.

`gvar` - Gaussian Random Variables¶

Introduction¶

Functions¶

`gvar.GVar` Objects¶

`gvar.BufferDict` Objects¶

`gvar.SVD` Objects¶

Requirements¶

Table of Contents

Previous topic

Next topic

This Page

gvar - Gaussian Random Variables¶

Introduction¶

Functions¶

gvar.GVar Objects¶

gvar.BufferDict Objects¶

gvar.SVD Objects¶

PDF-related Objects¶

Requirements¶

`gvar` - Gaussian Random Variables¶

`gvar.GVar` Objects¶

`gvar.BufferDict` Objects¶

`gvar.SVD` Objects¶