gvar
- Gaussian Random Variables¶
Introduction¶
Objects of type gvar.GVar
represent Gaussian random variables,
which are specified by means and covariances. They are created
using gvar.gvar()
: for example,
>>> x = gvar.gvar(10, 3) # 10 ± 3
>>> y = gvar.gvar(12, 4) # 12 ± 4
>>> z = x + y # 22 ± 5
>>> print(z)
22.0(5.0)
>>> print(z.mean)
22.0
>>> print(z.sdev)
5.0
>>> print((z - x) / y) # z is correlated with y
1(0)
This module contains a variety of tools for creating and manipulating Gaussian random variables, including:
Creating or modifying
Examining and manipulating
mean
(g)
— extract means.
sdev
(g)
— extract standard deviations.
var
(g)
— extract variances.
deriv
(g, x)
— derivatives ofg
with respect tox
.
is_primary
(g)
— test whethergvar.GVar
is primary or derived.
evalcov
(g)
— compute covariance matrix.
evalcov_blocks
(g)
— compute diagonal blocks of covariance matrix.
evalcorr
(g)
— compute correlation matrix.
cov
(g1, g2)
— covariance betweeng1
andg2
.
corr
(g1, g2)
— correlation betweeng1
andg2
.
uncorrelated
(g1, g2)
— test for correlation
chi2
(g1, g2)
—chi**2
ofg1-g2
.
filter
(g, f, *args, **kargs)
— filtergvar.GVar
s ing
through functionf
.
Sampling
sample
(g, nbatch)
— random sample from collection ofgvar.GVar
s.
raniter
(g, n, nbatch)
— iterator for random numbers.
gvar_from_sample
(gs)
—gvar.GVar
s from random sample.
bootstrap_iter
(g, n)
— bootstrap iterator.
ranseed
(seed)
— seed random number generator.
Formatting
fmt
(g)
— replace allgvar.GVar
s in array/dictionary by string representations.
tabulate
(g)
— tabulate entries in array/dictionary ofgvar.GVar
s.
qqplot
(g1, g2)
— QQ plot ofg1-g2
,
equivalent
(g1, g2)
—gvar.GVar
s the same ing1
andg2
?
fmt_values
(g)
— list values for printing.
fmt_errorbudget
(g)
— create error-budget table for printing.
fmt_chi2
(f)
— format chi**2 information in f as string for printing.
Saving
dump
(g, outputfile)
— serialize data, includinggvar.GVar
s, fromg
in file.
dumps
(g)
— serialize data fromg
in a bytes object.
load
(inputfile)
— reconstitute data serialized in a file.
loads
(inputbytes)
— reconstitute data serialized in a bytes object.
gdump
(g, outputfile)
— serialize a collection ofgvar.GVar
s in file.
gdumps
(g)
— serialize a collection ofgvar.GVar
s in a string.
gload
(inputfile)
— reconstitute a collection ofgvar.GVar
s from a file.
gloads
(inputstr)
— reconstitute a collection ofgvar.GVar
s from a string.
remove_gvars
(g, gvlist)
— removegvar.GVar
s fromg
, appending them togvlist
.
distribute_gvars
(g, gvlist)
— restoregvar.GVar
s tog
fromgvlist
.class
GVarRef
— placeholder forgvar.GVar
removed bygvar.remove_gvars()
.
dependencies
(g)
— collect primarygvar.GVar
s contributing tog
.
disassemble
(g)
— low-level routine to disassemble a collection ofgvar.GVar
s.
reassemble
(data,cov)
— low-level routine to reassemble a collection ofgvar.GVar
s.
Related classes
class
BufferDict
— ordered dictionary with data buffer.class
SVD
— SVD decomposition of symmetric matrixclass
class
PDFStatistics
— statistical analysis of moments of a random variable.
Analyzing Monte Carlo datasets
dataset.bin_data
(data)
— bin random sample data.
dataset.avg_data
(data)
— estimate means of random sample data.
dataset.bootstrap_iter
(data,N)
— bootstrap random sample data.class
dataset.Dataset
— class for collecting random sample data.
Functions¶
The function used to create Gaussian variable objects is:
- gvar.gvar(...)¶
Creates one or more new
gvar.GVar
s.gvar.gvar
is an object of typegvar.GVarFactory
. Each of the following creates newgvar.GVar
s:- gvar.gvar(x, xsdev)
Returns a
gvar.GVar
with meanx
and standard deviationxsdev
. Returns an array ofgvar.GVar
s ifx
andxsdev
are arrays with the same shape; the shape of the result is the same as the shape ofx
. Returns agvar.BufferDict
ifx
andxsdev
are dictionaries with the same keys and layout; the result has the same keys and layout asx
.
- gvar.gvar(x, xcov)
Returns an array of
gvar.GVar
s with means given by arrayx
and a covariance matrix given by arrayxcov
, wherexcov.shape = 2*x.shape
; the result has the same shape asx
. Returns agvar.BufferDict
ifx
andxcov
are dictionaries, where the keys inxcov
are(k1,k2)
for any keysk1
andk2
inx
. Returns a singlegvar.GVar
ifx
is a number andxcov
is a one-by-one matrix. The layout forxcov
is compatible with that produced bygvar.evalcov()
for a singlegvar.GVar
, an array ofgvar.GVar
s, or a dictionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s. Thereforegvar.gvar(gvar.mean(g), gvar.evalcov(g))
createsgvar.GVar
s with the same means and covariance matrix as thegvar.GVar
s ing
providedg
is a singlegvar.GVar
, or an array or dictionary ofgvar.GVar
s.
- gvar.gvar(x, xcov, verify=True)
Same as
gvar.gvar(x, xcov)
above but checks that the covariance matrix is symmetric and positive definite (which covariance matrices should be). This check is expensive for large matrices and so is not done by default. Note, however, that unpredictable outcomes will result from specifying an improper covariance matrix.
- gvar.gvar(x, xcov, fast=True)
Normally
gvar.gvar(x, xcov)
tries to break the covariance matrix into disjoint diagonal blocks, if there are any. For example,xcov = [[1,1,0], [1,2,0], [0,0,3]]
can be decomposed into two blocks. This decomposition saves memory, and can make later manipulations of the resulting
gvar.GVar
s faster. This is at the expense of extra processing to create thegvar.GVar
s. Setting keywordfast=True
preventsgvar.gvar
from doing this, which would make sense, for example, if it was known ahead of time thatxcov
has no sub-blocks. The default isfast=False
. Either choice gives correct answers; the difference is about efficiency.
- gvar.gvar((x, xsdev))
Returns a
gvar.GVar
with meanx
and standard deviationxsdev
.
- gvar.gvar(xstr)
Returns a
gvar.GVar
corresponding to stringxstr
which is either of the form"xmean ± xsdev"
or"x(xerr)"
. Here±
can be replaced by+/-
or+-
.
- gvar.gvar(xgvar)
Returns
gvar.GVar
xgvar
unchanged.
- gvar.gvar(xdict)
Returns a dictionary (
BufferDict
)b
whereb[k] = gvar.gvar(xdict[k])
for every key in dictionaryxdict
. The values inxdict
, therefore, can be strings, tuples orgvar.GVar
s (see above), or arrays of these.
- gvar.gvar(xarray)
Returns an array
a
having the same shape asxarray
where every elementa[i...] = gvar.gvar(xarray[i...])
. The values inxarray
, therefore, can be strings, tuples orgvar.GVar
s (see above).
- gvar.gvar(ymean, ycov, x, xycov)
Returns a 1-d array of
gvar.GVar
sy[i]
constructed from the 1-d array of mean valuesymean
and the 2-d covariance matrixycov
. They[i]
are correlated with the primarygvar.GVar
s in 1-d arrayx
. Thex-y
covariance matrix isxycov
whose shape isx.shape + y.shape
. Note that this changes thegvar.GVar
s inx
(because they are correlated with they[i]
); it has no effect on the variance or on correlations between differentx[i]
s.
The following function is useful for constructing new functions that
can accept gvar.GVar
s as arguments:
- gvar.gvar_function(x, f, dfdx)¶
Create a
gvar.GVar
for function f(x) given f and df/dx at x.This function creates a
gvar.GVar
corresponding to a function ofgvar.GVar
sx
whose value isf
and whose derivatives with respect to eachx
are given bydfdx
. Herex
can be a singlegvar.GVar
, an array ofgvar.GVar
s (for a multidimensional function), or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s, whiledfdx
must be a float, an array of floats, or a dictionary whose values are floats or arrays of floats, respectively.This function is useful for creating functions that can accept
gvar.GVar
s as arguments. For example,import math import gvar as gv def sin(x): if isinstance(x, gv.GVar): f = math.sin(x.mean) dfdx = math.cos(x.mean) return gv.gvar_function(x, f, dfdx) else: return math.sin(x)
creates a version of
sin(x)
that works with either floats orgvar.GVar
s as its argument. This particular function is unnecessary since it is already provided bygvar
.- Parameters:
- Returns:
A
gvar.GVar
representing the function’s value atx
.
Means, standard deviations, variances, formatted strings, covariance
matrices and correlation/comparison information can be extracted from arrays
(or dictionaries) of gvar.GVar
s using:
- gvar.mean(g)¶
Extract means from
gvar.GVar
s ing
.g
can be agvar.GVar
, an array ofgvar.GVar
s, or a dictionary containinggvar.GVar
s or arrays ofgvar.GVar
s. Result has the same layout asg
.Elements of
g
that are notgvar.GVar
s are left unchanged.
- gvar.sdev(g)¶
Extract standard deviations from
gvar.GVar
s ing
.g
can be agvar.GVar
, an array ofgvar.GVar
s, or a dictionary containinggvar.GVar
s or arrays ofgvar.GVar
s. Result has the same layout asg
.The deviation is set to 0.0 for elements of
g
that are notgvar.GVar
s.
- gvar.var(g)¶
Compute variances for elements of array/dictionary
g
or a singlegvar.GVar
.If
g
is an array ofgvar.GVar
s,var
returns the variances as an array with shapeg.shape
. Ifg
is a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s, the result is a dictionary wherecov[k]
is the variance forg[k]
.
- gvar.is_primary(g)¶
Determine whether or not the
gvar.GVar
s ing
are primarygvar.GVar
s.A primary
gvar.GVar
is one created usinggvar.gvar()
(or a function of such a variable). A derivedgvar.GVar
is one that is constructed from arithmetic expressions and functions that combine multiple primarygvar.GVar
s. The standard deviations for allgvar.GVar
s originate with the primarygvar.GVar
s. In particular,z = z.mean + sum_p (p - p.mean) * dz/dp
is true for any
gvar.GVar
z
, where the sum is over all primarygvar.GVar
sp
.Here
g
can be agvar.GVar
, an array ofgvar.GVar
s, or a dictionary containinggvar.GVar
s or arrays ofgvar.GVar
s. The result has the same layout asg
. Eachgvar.GVar
is replaced byTrue
orFalse
depending upon whether it is primary or not.When the same
gvar.GVar
appears more than once ing
, only the first appearance is marked as a primarygvar.GVar
. This avoids double counting the same primarygvar.GVar
— each primarygvar.GVar
in the list is unique.
- gvar.dependencies(g, all=False)¶
Collect primary
gvar.GVar
s contributing to the covariances ofgvar.GVar
s ing
.
- gvar.filter(g, f, *args, **kargs)¶
Filter
gvar.GVar
s ing
through functionf
.Sample usage:
import gvar as gv g_mod = gv.filter(g, gv.mean)
replaces every
gvar.GVar
ing
by its mean. This is useful whengv.mean(g)
doesn’t work — for example, becauseg
contains data types other thangvar.GVar
s, or consists of nested dictionaries.- Parameters:
g – Object consisting of (possibly nested) dictionaries, deques, lists,
numpy.array
s, tuples, and/or other data types that containgvar.GVar
s and other types of data.f – Function that takes an array of
gvar.GVar
s as an argument and returns an array of results having the same shape. Given an arraygvar_array
containing all thegvar.GVar
s fromg
, the function call isf(gvar_array, *args, **kargs)
. Results from the returned array replace thegvar.GVar
s in the original object.args – Additional arguments for
f
.kargs – Additional keyword arguments for
f
.
- Returns:
An object like
g
but with all thegvar.GVar
s replaced by objects generated by functionf
.
- gvar.fmt(g, ndecimal=None, sep='')¶
Format
gvar.GVar
s ing
.g
can be agvar.GVar
, an array ofgvar.GVar
s, or a dictionary containinggvar.GVar
s or arrays ofgvar.GVar
s. Eachgvar.GVar
gi
ing
is replaced by the string generated bygi.fmt(format,ndecimal,sep)
. The result has same structure asg
.- Parameters:
ndecimal (int or None) – Format each
gvar.GVar
usingf'{self:.{ndecimal}f}'
whenndecimal
is a non-negative integer, orf'{self:2P}'
whenndecimal
is negative. Ignored ifndecimal=None
(default).sep (str) – String inserted between the
mean
and the(sdev)
in'mean(sdev)'
. Default issep=''
.format (str) – Format string used to convert each
gvar.GVar
. Ignored ifndecimal
is notNone
.
- gvar.tabulate(g, ncol=1, headers=True, offset='', ndecimal=None)¶
Tabulate contents of an array or dictionary of
gvar.GVar
s.Given an array
g
ofgvar.GVar
s or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s,gvar.tabulate(g)
returns a string containing a table of the values ofg
’s entries. For example, the codeimport collections import gvar as gv g = collections.OrderedDict() g['scalar'] = gv.gvar('10.3(1)') g['vector'] = gv.gvar(['0.52(3)', '0.09(10)', '1.2(1)']) g['tensor'] = gv.gvar([ ['0.01(50)', '0.001(20)', '0.033(15)'], ['0.001(20)', '2.00(5)', '0.12(52)'], ['0.007(45)', '0.237(4)', '10.23(75)'], ]) print(gv.tabulate(g, ncol=2))
prints the following table:
key/index value key/index value --------------------------- --------------------------- scalar 10.30 (10) 1,0 0.001 (20) vector 0 0.520 (30) 1,1 2.000 (50) 1 0.09 (10) 1,2 0.12 (52) 2 1.20 (10) 2,0 0.007 (45) tensor 0,0 0.01 (50) 2,1 0.2370 (40) 0,1 0.001 (20) 2,2 10.23 (75) 0,2 0.033 (15)
- Parameters:
g – Array of
gvar.GVar
s (any shape) or dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s (any shape).ncol – The table is split over
ncol
columns of key/index values plusgvar.GVar
values. Default value is 1.headers – Prints standard header on table if
True
; omits the header ifFalse
. Ifheaders
is a 2-tuple, thenheaders[0]
is used in the header over the indices/keys andheaders[1]
over thegvar.GVar
values. (Default isTrue
.)offset (str) – String inserted at the beginning of each line in the table. Default is
''
.ndecimal – Number of digits displayed after the decimal point. Default is
ndecimal=None
which adjusts table entries to show 2 digits of error.keys – When
g
is a dictionary andkeys
is a list of keys, entries will be tabulated only for keys in thekeys
list; ignored ifkeys=None
(default).
- gvar.correlate(g, corr, upper=False, lower=False, verify=False)¶
Add correlations to uncorrelated
gvar.GVar
s ing
.This method creates correlated
gvar.GVar
s from uncorrelatedgvar.GVar
sg
, using the correlations specified incorr
.Note that correlations initially present in
g
, if any, are ignored.Examples
A typical application involves the construction of correlated
gvar.GVar
s give the means and standard deviations, together with a correlation matrix:>>> import gvar as gv >>> g = gv.gvar(['1(1)', '2(10)']) >>> print(gv.evalcorr(g)) # uncorrelated [[ 1. 0.] [ 0. 1.]] >>> g = gv.correlate(g, [[1. , 0.1], [0.1, 1.]]) >>> print(gv.evalcorr(g)) # correlated [[ 1. 0.1] [ 0.1 1. ]]
This also works when
g
andcorr
are dictionaries:>>> g = gv.gvar(dict(a='1(1)', b='2(10)')) >>> print(gv.evalcorr(g)) {('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.]]),('b', 'a'): array([[ 0.]]),('b', 'b'): array([[ 1.]])} >>> corr = {} >>> corr['a', 'a'] = 1.0 >>> corr['a', 'b'] = 0.1 >>> corr['b', 'a'] = 0.1 >>> corr['b', 'b'] = 1.0 >>> g = correlate(g, corr) >>> print(gv.evalcorr(g)) {('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.1]]),('b', 'a'): array([[ 0.1]]),('b', 'b'): array([[ 1.]])}
- Parameters:
g – An array of
gvar.GVar
s or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s.corr – Correlations between
gvar.GVar
s:corr[i, j]
is the correlation betweeng[i]
andg[j]
. Should be a symmetric and positive-definite (unlessupper
orlower
is specified).upper (bool) – If
True
, replaces lower triangular part ofcorr
with transpose of the upper triangular part. The diagonal is set to one. Default isFalse
.lower (bool) – If
True
, replaces upper triangular part ofcorr
with transpose of the lower triangular part. The diagonal is set to one. Default isFalse
.verify (bool) – If
True
, verifies thatcorr
is symmetric and positive definite. Default isFalse
because verification is costly for large matrices.
- gvar.evalcov(g)¶
Compute covariance matrix for elements of array/dictionary
g
.If
g
is an array ofgvar.GVar
s,evalcov
returns the covariance matrix as an array with shapeg.shape+g.shape
. Ifg
is a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s, the result is a doubly-indexed dictionary wherecov[k1,k2]
is the covariance forg[k1]
andg[k2]
.
- gvar.evalcov_blocks(g, compress=False)¶
Evaluate covariance matrix for elements of
g
.Evaluates the covariance matrices for
gvar.GVar
s stored in array or dictionary of arrays/gvar.GVar
sg
. The covariance matrix is decomposed into its block diagonal components, and a list of tuples(idx,bcov)
is returned wherebcov
is a diagonal block of the covariance matrix andidx
an array containing the corresponding indices ing.flat
for that block. So to reassemble the blocks into a single matrixcov
, for example, one would use:import numpy as np import gvar as gv cov = np.zeros((len(g.flat), len(g.flat)), float) for idx, bcov in gv.evalcov_blocks(g): cov[idx[:, None], idx] = bcov
gvar.evalcov_blocks()
is particularly useful when the covariance matrix is sparse; only nonzero elements are retained.Setting argument
compress=True
(default isFalse
) changes the meaning of the first element in the list:idx
for the first element contains the indices of allgvar.GVar
s ing.flat
that are uncorrelated from othergvar.GVar
s ing.flat
, andbcov
contains the standard deviations for thosegvar.GVar
s. Thus the full covariance matrix is reconstructed using the following code:import numpy as np import gvar as gv cov = np.zeros((len(g.flat), len(g.flat)), float) blocks = gv.evalcov_blocks(g, compress=True) # uncorrelated pieces are diagonal idx, sdev = blocks[0] cov[idx, idx] = sdev ** 2 # correlated pieces for idx, bcov in blocks[1:]: cov[idx[:, None], idx] = bcov
The code with
compress=True
should be slightly faster if there are many uncorrelatedgvar.GVar
s.It is easy to create an array of
gvar.GVar
s having the covariance matrix fromg.flat
: for example,import numpy as np import gvar as gv new_g = np.empty(len(g.flat), dtype=object) for idx, bcov in evalcov_blocks(g): new_g[idx] = gv.gvar(new_g[idx], bcov)
creates an array of
gvar.GVar
s with zero mean and the same covariance matrix asg.flat
. This works with either value for argumentcompress
.
- gvar.evalcorr(g)¶
Compute correlation matrix for elements of array/dictionary
g
.If
g
is an array ofgvar.GVar
s,evalcorr
returns the correlation matrix as an array with shapeg.shape+g.shape
. Ifg
is a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s, the result is a doubly-indexed dictionary wherecorr[k1,k2]
is the correlation forg[k1]
andg[k2]
.The correlation matrix is related to the covariance matrix by:
corr[i,j] = cov[i,j] / (cov[i,i] * cov[j,j]) ** 0.5
Return
True
ifgvar.GVar
s ing1
uncorrelated with those ing2
.g1
andg2
can begvar.GVar
s, arrays ofgvar.GVar
s, or dictionaries containinggvar.GVar
s or arrays ofgvar.GVar
s. ReturnsTrue
if either ofg1
org2
isNone
.
- gvar.chi2(g1, g2, svdcut=1e-12, dof=None)¶
Compute chi**2 of
g1-g2
.chi**2 equals
dg.invcov.dg, where ``dg = g1 - g2
andinvcov
is the inverse ofdg
’s covariance matrix. It is a measure of how well multi-dimensional Gaussian distributionsg1
andg2
(dictionaries or arrays) agree with each other — that is, do their means agree within errors for corresponding elements. The probability is high ifchi2(g1,g2)/dof
is of order 1 or smaller, wheredof
is the number of degrees of freedom being compared.Usually
g1
andg2
are dictionaries with the same keys, whereg1[k]
andg2[k]
aregvar.GVar
s or arrays ofgvar.GVar
s having the same shape. Alternativelyg1
andg2
can begvar.GVar
s, or arrays ofgvar.GVar
s having the same shape.One of
g1
org2
can contain numbers instead ofgvar.GVar
s, in which casechi**2
is a measure of the likelihood that the numbers came from the distribution specified by the other argument.One or the other of
g1
org2
can be missing keys, or missing elements from arrays. Only the parts ofg1
andg2
that overlap are used. Also settingg2=None
is equivalent to replacing its elements by zeros.A typical application tests whether distribution
g1
andg2
are consistent with each other (within errors): the code>>> chi2 = gvar.chi2(g1, g2) >>> print(gvar.fmt_chi2(chi2)) chi2/dof = 1.1 [100] Q = 0.26
shows that the distributions are reasonably consistent. The number of degrees of freedom (here 100) in this example equals the number of variables from
g1
andg2
that are compared; this can be changed using thedof
argument.- Parameters:
g1 –
gvar.GVar
or array ofgvar.GVar
s, or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead ofgvar.GVar
s, in which caseg1
specifies a sample from a distribution.g2 –
gvar.GVar
or array ofgvar.GVar
s, or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead ofgvar.GVar
s, in which caseg2
specifies a sample from a distribution. Settingg2=None
(default) is equivalent to setting its elements all to zero.eps (float) – If positive, singularities in the correlation matrix for
g1-g2
are regulated usinggvar.regulate()
with cutoffeps
. Ignored ifsvdcut
is specified (and notNone
).svdcut (float) – If nonzero, singularities in the correlation matrix for
g1-g2
are regulated usinggvar.regulate()
with an SVD cutoffsvdcut
. Default issvdcut=1e-12
.dof (int or None) – Number of independent degrees of freedom in
g1-g2
. This is set equal to the number of elements ing1-g2
ifdof=None
is set. This parameter affects theQ
value assigned to thechi**2
.
- Returns:
The return value is the
chi**2
. Extra attributes attached to this number give additional information:- dof — Number of degrees of freedom (that is, the number
of variables compared if not specified).
- Q — The probability that the
chi**2
could have been larger, by chance, even if
g1
andg2
agree. Values smaller than 0.1 or so suggest that they do not agree. Also called the p-value.
- Q — The probability that the
- residuals — Decomposition of the
chi**2
in terms of the independent modes of the correlation matrix:
chi**2 = sum(residuals**2)
.
- residuals — Decomposition of the
- gvar.qqplot(g1, g2, plot=None, svdcut=1e-12, dof=None)¶
QQ plot
g1-g2
.The QQ plot compares the distribution of the means of Gaussian distribution
g1-g2
to that of random samples from the distribution. The resulting plot will approximate a straight line along the diagonal of the plot (dashed black line) if the means have a Gaussian distribution about zero with the correct standard deviations.Usually
g1
andg2
are dictionaries with the same keys, whereg1[k]
andg2[k]
aregvar.GVar
s or arrays ofgvar.GVar
s having the same shape. Alternativelyg1
andg2
can begvar.GVar
s, or arrays ofgvar.GVar
s having the same shape.One of
g1
org2
can contain numbers instead ofgvar.GVar
s.One or the other of
g1
org2
can be missing keys, or missing elements from arrays. Only the parts ofg1
andg2
that overlap are used. Also settingg2=None
is equivalent to replacing its elements by zeros.In a typical application, the plot displayed by
gvar.qqplot(gsample, g).show()
tests whether
gsample
is likely a random sample from a multi-dimensional Gaussian distribtuiong
. It is consistent with being a random sample if the QQ plot is a straight line through the origin with unit slope. The is most useful when there are many variables (g.size >> 1
).- Parameters:
g1 –
gvar.GVar
or array ofgvar.GVar
s, or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead ofgvar.GVar
s, in which caseg1
specifies a sample from a distribution.g2 –
gvar.GVar
or array ofgvar.GVar
s, or a dictionary whose values aregvar.GVar
s or arrays ofgvar.GVar
s. Specifies a multi-dimensional Gaussian distribution. Alternatively the elements can be numbers instead ofgvar.GVar
s, in which caseg2
specifies a sample from a distribution. Settingg2=None
(default) is equivalent to setting its elements all to zero.plot – a
matplotlib
plotter. IfNone
(default), usesmatplotlib.pyplot
.eps (float or None) –
eps
used bygvar.regulate()
when inverting the covariance matrix. Ignored ifsvdcut
is set (and notNone
).svdcut (float) – SVD cut used when inverting the covariance matrix of
g1-g2
. See documentation forgvar.svd()
for more information. Default issvdcut=1e-12
.dof (int or None) – Number of independent degrees of freedom in
g1-g2
. This is set equal to the number of elements ing1-g2
ifdof=None
is set. This parameter affects theQ
value assigned to thechi**2
.
- Returns:
Plotter
plot
.
This method requires the
scipy
andmatplotlib
modules.
- gvar.fmt_chi2(f)¶
Return string containing
chi**2/dof
,dof
andQ
fromf
.Assumes
f
has attributeschi2
,dof
andQ
. The logarithm of the Bayes factor will also be printed iff
has attributelogGBF
.
gvar.GVar
s are compared by:
- gvar.equivalent(g1, g2, rtol=1e-10, atol=1e-10)¶
Determine whether
g1
andg2
contain equivalentgvar.GVar
s.Compares sums and differences of
gvar.GVar
s stored ing1
andg2
to see if they agree within tolerances. Operationally, agreement means that:abs(diff) < abs(summ) / 2 * rtol + atol
where
diff
andsumm
are the difference and sum of the mean values (g.mean
) or derivatives (g.der
) associated with each pair ofgvar.GVar
s.gvar.GVar
s that are equivalent are effectively interchangeable with respect to both their means and also their covariances with any othergvar.GVar
(including ones not ing1
andg2
).g1
andg2
can be individualgvar.GVar
s or arrays ofgvar.GVar
s or dictionaries whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s. Comparisons are made only for shared keys when they are dictionaries. Array dimensions must match betweeng1
andg2
, but the shapes can be different; comparisons are made for the parts of the arrays that overlap in shape.- Parameters:
g1 – A
gvar.GVar
or an array ofgvar.GVar
s or a dictionary ofgvar.GVar
s and/or arrays ofgvar.GVar
s.g2 – A
gvar.GVar
or an array ofgvar.GVar
s or a dictionary ofgvar.GVar
s and/or arrays ofgvar.GVar
s.rtol – Relative tolerance with which mean values and derivatives must agree with each other. Default is
1e-10
.atol – Absolute tolerance within which mean values and derivatives must agree with each other. Default is
1e-10
.
gvar.GVar
s can be stored (serialized) and retrieved from files (or strings) using:
- gvar.dump(g, outputfile=None, add_dependencies=False, **kargs)¶
Write a representation of
g
to fileoutputfile
.Object
g
consists of (possibly nested) dictionaries, deques, lists,numpy.array
s, and/or tuples that containgvar.GVar
s and other types of data. Callinggvar.dump(g, 'filename')
writes a serialized representation ofg
into the file namedfilename
. Objectg
can be recovered later usinggvar.load('filename')
.Typical usage is:
# create file xxx.pickle containing GVars in g import gvar as gv gv.dump(g, 'xxx.pickle') # read file xxx.pickle to recreate g new_g = gv.load('xxx.pickle')
gvar.dump()
is an alternative topickle.dump()
which, unlike the latter, works when there aregvar.GVar
s ing
. In particular it preserves correlations between differentgvar.GVar
s, as well as relationships (i.e., derivatives) between derivedgvar.GVar
s and primarygvar.GVar
s ing
.gvar.dump()
usesgvar.remove_gvars()
to search (recursively) for thegvar.GVar
s ing
.The partial variances for derived
gvar.GVar
s ing
coming from primarygvar.GVar
s ing
are preserved bygvar.dump()
. (These are used, for example, to calculate error budgets.) Partial variances coming from derived (rather than primary)gvar.GVar
s, however, are unreliable unless every primarygvar.GVar
that contributes to the covariances ing
is included ing
. To guarantee that this is the case, set keywordadd_dependencies=True
. This can greatly increase the size of the output file, and so should only be done if error budgets, etc. are needed. (Also the cost of evaluating covariance matrices for the reconstitutedgvar.GVar
s is increased if there are large numbers of primarygvar.GVar
s.) The default isadd_dependencies=False
.- Parameters:
g – Object to be serialized. Consists of (possibly nested) dictionaries, deques, lists,
numpy.array
s, and/or tuples that containgvar.GVar
s and other types of data.outputfile – The name of a file or a file object in which the serialized
g
is stored. Ifoutputfile=None
(default), the results are written to aBytesIO
.add_dependencies (bool) – If
True
, automatically includes all primarygvar.GVar
s that contribute to the covariances of thegvar.GVar
s ing
but are not already ing
. Default isFalse
.kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (
pickle
).
- Returns:
The
BytesIO
object containing the serialized data, ifoutputfile=None
; otherwiseoutputfile
.
- gvar.dumps(g, add_dependencies=False, **kargs)¶
Return a serialized representation of
g
.This function is shorthand for:
gvar.dump(g).getvalue()
Typical usage is:
# create bytes object containing GVars in g import gvar as gv gbytes = gv.dumps(g) # convert bytes object back into g new_g = gv.loads(gbytes)
- Parameters:
g – Object to be serialized. Consists of (possibly nested) dictionaries, deques, lists,
numpy.array
s, and/or tuples that containgvar.GVar
s and other types of data.add_dependencies (bool) – If
True
, automatically includes all primarygvar.GVar
s that contribute to the covariances of thegvar.GVar
s ing
but are not already ing
. Default isFalse
.kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (
pickle
).
- Returns:
A bytes object containing a serialized representation of object
g
.
- gvar.load(inputfile, **kargs)¶
Read and return object serialized in
inputfile
bygvar.dump()
.This function recovers data serialized with
gvar.dump()
. Typical usage is:# create file xxx.pickle containing data in g import gvar as gv gv.dump(g, 'xxx.pickle') # read file xxx.pickle to recreate g new_g = gv.gload('xxx.pickle')
Note that the format used by
gvar.dump()
changed with version 11.0 ofgvar
.gvar.load()
will attempt to read the old formats if they are encountered, but old data should be converted to the new format (by reading it in withload()
and them writing it out again withdump()
).- Parameters:
inputfile – The name of the file or a file object in which the serialized
gvar.GVar
s are stored (created bygvar.dump()
).kargs (dict) – Additional arguments, if any, that are passed to the underlying de-serializer (
pickle
).
- Returns:
The reconstructed data object.
- gvar.loads(inputbytes, **kargs)¶
Read and return object serialized in
inputbytes
bygvar.dumps()
.This function recovers data serialized with
gvar.dumps()
. It is shorthand for:gvar.load(BytesIO(inputbytes))
Typical usage is:
# create bytes object containing data in g import gvar as gv gbytes = gv.dumps(g) # recreate g from bytes object gbytes new_g = gv.gloads(gbytes)
- Parameters:
inputbytes (bytes) – Bytes object created by
gvar.dumps()
.kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (
pickle
).
- Returns:
The reconstructed data.
- gvar.gdump(g, outputfile=None, method=None, add_dependencies=False, **kargs)¶
Write a representation of
g
to fileoutputfile
.Object
g
is agvar.GVar
, an array ofgvar.GVar
s (any shape), or a dictionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s; it describes a general (multi-dimensional) Gaussian distribution. Callinggvar.gdump(g, 'filename')
writes a serialized representation ofg
into the file namedfilename
. The Gaussian distribution described byg
can be recovered later usinggvar.load('filename')
. Correlations between differentgvar.GVar
s ing
are preserved, as are relationships (i.e., derivatives) between derivedgvar.GVar
s and any primarygvar.GVar
s ing
.gvar.gdump()
differs fromgvar.dump()
in that the elements ing
must all begvar.GVar
s for the former, whereasgvar.GVar
s may be mixed in with other data types for the latter. The structure ofg
is also more restricted forgvar.gdump()
, but it is typically faster thangvar.dump()
.Typical usage is:
# create file xxx.pickle containing GVars in g import gvar as gv gv.gdump(g, 'xxx.pickle') # read file xxx.pickle to recreate g new_g = gv.gload('xxx.pickle')
Object
g
is serialized using one of thejson
(text file) orpickle
(binary file) Python modules. Method'json'
can produce smaller files, but method'pickle'
can be significantly faster.'json'
is more secure than'pickle'
, if that is an issue.json
can have trouble with dictionaries whose keys are not strings. A workaround is used here that succeeds providedeval(repr(k)) == k
for every keyk
. This works for a wide variety of standard key types including strings, integers, and tuples of strings and/or integers. Trypickle
if the workaround fails.The partial variances for derived
gvar.GVar
s ing
coming from primarygvar.GVar
s ing
are preserved bygvar.gdump()
. (These are used, for example, to calculate error budgets.) Partial variances coming from derived (rather than primary)gvar.GVar
s, however, are unreliable unless every primarygvar.GVar
that contributes to the covariances ing
is included ing
. To guarantee that this is the case set keywordadd_dependencies=True
. This can greatly increase the size of the output file, and so should only be done if error budgets, etc. are needed. (Also the cost of evaluating covariance matrices for the reconstitutedgvar.GVar
s is increased if there are large numbers of primarygvar.GVar
s.) The default isadd_dependencies=False
.- Parameters:
g – A
gvar.GVar
, array ofgvar.GVar
s, or dictionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s.outputfile – The name of a file or a file object in which the serialized
gvar.GVar
s are stored. Ifoutputfile=None
(default), the results are written to aStringIO
(formethod='json'
) orBytesIO
(formethod='pickle'
) object.method (str) – Serialization method, which should be either
'json'
or'pickle'
. Ifmethod=None
(default), the method is inferred from the filename’s extension:.json
for'json'
; and.pickle
or.pkl
or.p
for'pickle'
. If that fails the method defaults to'json'
.add_dependencies (bool) – If
True
, automatically includes all primarygvar.GVar
s that contribute to the covariances of thegvar.GVar
s ing
but are not already ing
. Default isFalse
.kargs (dict) – Additional arguments, if any, that are passed to the underlying serializer (
pickle
orjson
).
- Returns:
if
outputfile=None
, theStringIO
orBytesIO
object containing the serialized data is returned. Otherwiseoutputfile
is returned.
- gvar.gdumps(g, method='json', add_dependencies=False)¶
Return a serialized representation of
g
.This function is shorthand for:
gvar.gdump(g, method='json').getvalue()
Typical usage is:
# create string containing GVars in g import gvar as gv gstr = gv.gdumps(g) # convert string back into GVars new_g = gv.gloads(gstr)
- Parameters:
g – A
gvar.GVar
, array ofgvar.GVar
s, or dictionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s.method – Serialization method, which should be either
'json'
or'pickle'
. Default is'json'
.add_dependencies (bool) – If
True
, automatically includes all primarygvar.GVar
s that contribute to the covariances of thegvar.GVar
s ing
but are not already ing
. Default isFalse
.
- Returns:
A string or bytes object containing a serialized representation of object
g
.
- gvar.gload(inputfile, method=None, **kargs)¶
Read and return
gvar.GVar
s that are serialized ininputfile
.This function recovers
gvar.GVar
s serialized withgvar.gdump()
. Typical usage is:# create file xxx.pickle containing GVars in g import gvar as gv gv.gdump(g, 'xxx.pickle') # read file xxx.pickle to recreate g new_g = gv.gload('xxx.pickle')
- Parameters:
inputfile – The name of the file or a file object in which the serialized
gvar.GVar
s are stored (created bygvar.gdump()
).method (str or None) – Serialization method, which should be either
'pickle'
or'json'
. Ifmethod=None
(default), the method is inferred from the filename’s extension:.json
for'json'
; and.pickle
or.pkl
or.p
for'pickle'
. If that fails the method defaults to'json'
. Argumentmethod
is ignored ifinputfile
is either aStringIO
orBytesIO
object, with the method being set to'json'
or'pickle'
, respectively.kargs (dict) – Additional arguments, if any, that are passed to the underlying de-serializer (
pickle
orjson
).
- Returns:
The reconstructed
gvar.GVar
, array ofgvar.GVar
s, or dictionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s.
- gvar.gloads(inputstring)¶
Return
gvar.GVar
s that are serialized ininputstr
.This function recovers the
gvar.GVar
s serialized withgvar.gdumps(g)()
. It is shorthand for:gvar.gload(StringIO(inputstr), method='json')
Typical usage is:
# create string containing GVars in g import gvar as gv gstr = gv.gdumps(g) # convert string back into GVars new_g = gv.gloads(gstr)
- Parameters:
inputstr (str or bytes) – String or bytes object created by
gvar.dumps()
.- Returns:
The reconstructed
gvar.GVar
, array ofgvar.GVar
s, or dictionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s.
- gvar.remove_gvars(g, gvlist)¶
Remove
gvar.GVar
s from structure g, replacing them withgvar.GVarRef
s and collecting them ingvlist
.remove_gvars()
searches objectg
(recursively) forgvar.GVar
s and replaces them withGVarRef
objects. Thegvar.GVar
s are collected in listgvlist
.g
can be a standard container object (dictionary, list, etc.) or an objectobj
whose contents can be accessed throughobj.__dict__
orobj.__slots__
. An object that defines methodobj._remove_gvars
is replaced byobj._remove_gvars(gvlist)
.The
gvar.GVar
s are restored usinggvar.distribute_gvars
: e.g.,gvlist = [] new_g = gvar.distribute_gvars(gvar.remove_gvars(g, gvlist), gvlist)
creates a copy
new_g
ofg
with thegvar.GVar
s restored (and preserving correlations between differentgvar.GVar
s).gvlist
contains copies of the restoredgvar.GVar
s.The default treatment of a class instance
obj
without anobj._remove_gvars(gvlist)
method is equivalent to adding the following method to the object’s classdef _remove_gvars(self, gvlist): tmp = copy.copy(self) tmp.__dict__ = gvar.remove_gvars(tmp.__dict__, gvlist) return tmp
assuming the class does not use
__slots__
. A class that has methodobj._remove_gvars(gvlist)
should have a corresponding methodobj._distribute_gvars(gvlist)
, for use bygvar.distribute_gvars()
. The default behavior when this method is undefined is equivalent to:def _distribute_gvars(self, gvlist): self.__dict__ = gvar.distribute_gvars(self.__dict__, gvlist)
There are analogous defaults for classes that uses
__slots__
.These routines are used by
gvar.dump
andgvar.load
to facilitate pickling of objects containinggvar.GVar
s.
- gvar.distribute_gvars(g, gvlist)¶
Distribute
gvar.GVar
s fromgvlist
in structure g, replacinggvar.GVarRef
s.distribute_gvars()
undoes whatremove_gvars()
does tog
. See discussion ofremove_gvars()
for more details.- Parameters:
g – Object containing
GVarRef
s created byremove_gvars()
.gvlist – List of
gvar.GVar
s created byremove_gvars()
.
- Returns:
Object
g
withGVarRef
s replaced by correspondinggvar.GVar
s from listgvlist
.
- class gvar.GVarRef(g, gvlist)¶
Placeholder for a
gvar.GVar
, used bygvar.remove_gvars()
.Typical usage, when
g
is a dictionary containinggvar.GVar
s:import gvar as gv gvlist = [] for k in g: if isinstance(g[k], gv.GVar): g[k] = gv.GVarRef(g[k], gvlist)
where at the end
gvlist
contains thegvar.GVar
s removed fromg
.The
gvar.GVar
s are restored using:for k in g: if isinstance(g[k], gv.GVarRef): g[k] = g[k](gvlist)
where the
gvar.GVar
s are drawn from listgvlist
.
- gvar.reassemble(data, cov=gvar.gvar.cov)¶
Convert data from
gvar.disassemble()
back intogvar.GVar
s.- Parameters:
data (BufferDict, array) – Disassembled collection of
gvar.GVar
s fromgvar.disassemble()
that are to be reassembled.cov (gvar.smat) – Covariance matrix corresponding to the
gvar.GVar
s indata
. (Default isgvar.gvar.cov
.)
gvar.GVar
s contain information about derivatives with respect to the primary
gvar.GVar
s from which they were constructed. This information can be extracted using:
- gvar.deriv(g, x)¶
Compute partial derivatives of
gvar.GVar
s ing
w.r.t. primarygvar.GVar
s inx
.Primary
gvar.GVar
s are those created usinggvar.gvar()
(or any function of such a variable); seegvar.is_primary()
.
The following functions are used to generate and analyze random arrays or dictionaries
from the distribution defined by array (or dictionary) g
of gvar.GVar
s.
The random numbers incorporate any correlations implied by the g
s.
- gvar.sample(g, eps=None, svdcut=None, uniform=None, nbatch=None, mode='rbatch')¶
Generate random sample(s) from distribution
g
.The
gvar.GVar
s in array (or dictionary)g
collectively define a multidimensional Gaussian distribution.sample(g)
returns an array (or dictionary) containing random samples drawn from that distribution, with correlations intact:>>> import gvar as gv >>> g = dict(a=gv.gvar(1,1), b=gv.gvar([10, 100], [[1,0.99], [0.99,1]])) >>> print(g) {'a': 1.0(1.0), 'b': array([10.0(1.0), 100.0(1.0)], dtype=object)} >>> print(gv.sample(g)) {'a': 0.03301713218946034, 'b': array([ 11.65655789, 101.64702394])}
The layout for each sample is the same as for
g
– an array (or dictionary) of the same shape but where eachgvar.GVar
is replaced by a number.If multiple samples are needed, it is most efficient to create them all at once by specifying parameter
nbatch
. The following code creates a batch of 10,000 samples:>>> gs = gv.sample(g, nbatch=10_000) >>> print(gs['a'].shape, gs['b'].shape) (10000,) (2, 10000) >>> print(gs['a']) [ 1.71081068 -0.12894464 0.41654825 ... 0.39854737 0.42461779 0.97666448] >>> print(gs['b']) [[ 10.94348928 10.46761851 11.10887492 ... 10.46469306 8.6012229 11.07082101] [100.8592698 100.55615515 101.3894677 ... 100.50053354 98.60155065 101.11041462]]
Here individual samples are
gs['a'][i]
andgs['b'][:, i]
where batch indexi=0...9999
labels the sample.Specifying
mode='lbatch'
(instead of default'rbatch'
) moves the batch index to the left,gs['b'][i, :]
:>>> gs = gv.sample(g, nbatch=10_000, mode='lbatch') >>> print(gs['b'].shape) (10000, 2) >>> print(gs['b']) [[ 9.28127683 99.53331081] [ 8.62228298 98.70607512] [ 9.63029983 99.8487004 ] ... [ 9.3074379 99.38207642] [ 10.22812261 100.19961713] [ 9.74325077 99.92798523]]
Note the strong correlations in the fluctuations of
gs['b'][0]
andgs['b'][1]
.Equivalent to:
next(gvar.raniter(g, svdcut=svdcut, eps=eps, uniform=uniform, nbatch=nbatch, mode=mode))
- Parameters:
g – An array or dictionary of objects of type
gvar.GVar
; or agvar.GVar
.eps (float) – If positive, singularities in the correlation matrix for
g
are regulated usinggvar.regulate()
with cutoffeps
. Ignored ifsvdcut
is specified (and notNone
).svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using
gvar.regulate()
with an SVD cutoffsvdcut
. Default issvdcut=1e-12
.nbatch (int or None) – If not
None
, returnsnbatch
samples drawn from the distribution associated withg
. The results are packaged in arrays or dictionaries whose elements have an extra index labeling the different samples in the batch. The batch index is the rightmost index ifmode='rbatch'
; it is the leftmost index ifmode='lbatch'
. Ignored ifnbatch
isNone
(default).mode (bool) – Batch mode. Allowed modes are
'rbatch'
(default) or'lbatch'
, corresponding to batch indices that are on the right or the left, respectively. Ignored ifnbatch
isNone
.uniform (float or None) – Replace Gaussian distribution specified by
g
with a uniform distribution covering the interval[-uniform, uniform]
times the standard deviation centered on the mean along each principal axis of the error ellipse. Ignored ifNone
(default).
- Returns:
A random array or dictionary, with the same shape as
g
, drawn from the distribution defined byg
; ornbatch
such samples ifnbatch
is notNone
.
- gvar.raniter(g, n=None, eps=None, svdcut=None, uniform=None, nbatch=None, mode='rbatch')¶
Return iterator for random samples from distribution
g
raniter(g, n, nbatch=nbatch, ...)
returns an iterator that yieldsn
samples,gvar.sample(g, nbatch=nbatch, ...)
, drawn from the distribution specified by array (or dictionary)g
ofgvar.GVar
s.Functionally equivalent to, but much faster than
for _ in range(n): sample = gvar.sample(g, eps=eps, svdcut=svdcut, ...)
- Parameters:
g – An array (or dictionary) of objects of type
gvar.GVar
; or agvar.GVar
.n (int or
None
) – Maximum number of random iterations. Settingn=None
(the default) implies there is no maximum number.eps (float) – If positive, singularities in the correlation matrix for
g
are regulated usinggvar.regulate()
with cutoffeps
. Ignored ifsvdcut
is specified (and notNone
).svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using
gvar.regulate()
with an SVD cutoffsvdcut
. Default issvdcut=1e-12
.uniform (float or None) – Replace Gaussian distribution specified by
g
with a uniform distribution covering the interval[-uniform, uniform]
times the standard deviation centered on the mean along each principal axis of the error ellipse. Ignored ifNone
(default).nbatch (int or None) – If not
None
, the iterator will returnnbatch
samples drawn from the distribution associated withg
. The results are packaged in arrays or dictionaries whose elements have an extra index labeling the different samples in the batch. The batch index is the rightmost index ifmode='rbatch'
; it is the leftmost index ifmode
is'lbatch'
. Ignored ifnbatch
isNone
(default).mode (bool) – Batch mode. Allowed modes are
'rbatch'
(default) or'lbatch'
, corresponding to batch indices that are on the right or the left, respectively. Ignored ifnbatch
isNone
.
- Returns:
An iterator that returns random arrays or dictionaries (or single
gvar.GVar
s with the same shape asg
drawn from the Gaussian distribution defined byg
, ornbatch
such samples whennbatch
is specified.
- gvar.gvar_from_sample(gs, mode='rbatch', unbias=False)¶
Convert samples gs from Gaussian distribution into
gvar.GVar
s.gs
is an array (or dictionary) of samples drawn from a multidimensional distribution. Each element of the array (or dictionary) has an extra batch index labeling the sample:gs[..., i]
(or gs[key][…, i] for dictionaries) wherei
is the batch index.gvar_from_sample(gs)
calculates the means and covariances of the elements from the variations across samples (i.e., asi
varies) and returns an array (or dictionary) whose elements are the correspondinggvar.GVar
s. Thesegvar.GVar
s specify a multidimensional Gaussian distribution that has the same means and covariances as the samples.For example, the following code
>>> g = gvar.gvar([1., 2.], [[0.81, 0.1], [0.1, 0.81]]) >>> gs = gvar.sample(g, nbatch=10_000) >>> print(gs.shape) (2, 10000) >>> print(gs) [[-0.28683979 2.01220091 0.88237606 ... 0.11822824 -0.43494815 -0.31612878] [ 3.01067406 1.63392898 1.41686281 ... 2.87272547 1.7961971 4.19975679]] >>> g_gs = gvar.gvar_from_sample(gs) >>> print(g) ['1.00(90)' '2.00(90)'] >>> print(gs) ['0.99(89)' '1.99(90)'] >>> print(gvar.evalcov(g)) [[0.81 0.1 ] [0.1 0.81]] >>> print(gvar.evalcov(g_gs)) [[0.8001385 0.10049553] [0.10049553 0.81289579]]
creates 10,000 samples
[gs[0,i], gs[1,i]]
drawn from the Gaussian distributiong
. Using these samples,gvar_from_sample
estimates the means and covariances of the original distribution. The resulting estimateg_gs
agrees well with the exact distributiong
because the latter is Gaussian.gvar_from_sample
can also be used to estimate the uncertainty (standard error) in, for example, its estimates of the means and standard deviations:>>> mom = gvar_from_sample([gs, gs**2], stderr=True) >>> mean = mom[0] >>> sdev = (mom[1] - mean**2) ** 0.5 >>> print(f'mean = {mean} sdev = {sdev}') mean = [0.9923(89) 1.9932(90)] sdev = [0.8945(63) 0.9016(65)]
The means and standard deviations estimated from the sample agree within errors with the correct values of [1. 2.] and [0.9 0.9] respectively.
The sample index
i
is on the right in the example above. Ifmode='lbatch'
(instead of default'rbatch'
) the index is on the right:[gs[i,0], gs[i,1]]
wherei=0...9999
.- Parameters:
gs (array or dict) – Array of samples or dictionary whose values are samples or arrays of samples from a distribution. See
gvar.sample()
for more information.mode (str) – The sample index is the rightmost index when
mode='rbatch'
(default), and the leftmost index whenmode='lbatch'
.stderr (bool) – If
True
(default) the variances and covariances represent the uncertainties in the estimates of the means (corresponding to the standard errors). IfFalse
they represent the spread in the data (corresponding to the standard deviations of the distribution). The former is smaller by a factor of one over the number of samples.unbias (bool) – The default choice
unbias=False
means that variances are estimated usingsum((x[:] - mean(x))**2)/N
whereN
is the number of random samples. This estimate is biased but is typically more accurate than the unbiased estimate obtained whenN
is replaced byN-1
(because statistical fluctuations are then larger than the bias). The difference is negligible unlessN
is very small. Settingunbias=True
gives the unbiased estimate.
- gvar.bootstrap_iter(g, n=None, eps=None, svdcut=None)¶
Return iterator for bootstrap copies of
g
.The gaussian variables (
gvar.GVar
objects) in array (or dictionary)g
collectively define a multidimensional gaussian distribution. The iterator created bybootstrap_iter()
generates an array (or dictionary) of newgvar.GVar
s whose covariance matrix is the same asg
’s but whose means are drawn at random from the originalg
distribution. This is a bootstrap copy of the original distribution. Each iteration of the iterator has different means (but the same covariance matrix).bootstrap_iter()
also works wheng
is a singlegvar.GVar
, in which case the resulting iterator returns bootstrap copies of theg
.- Parameters:
g – An array (or dictionary) of objects of type
gvar.GVar
.n – Maximum number of random iterations. Setting
n=None
(the default) implies there is no maximum number.eps (float) – If positive, singularities in the correlation matrix for
g
are regulated usinggvar.regulate()
with cutoffeps
. Ignored ifsvdcut
is specified (and notNone
).svdcut (float) – If nonzero, singularities in the correlation matrix are regulated using
gvar.regulate()
with an SVD cutoffsvdcut
. Default issvdcut=1e-12
.
- Returns:
An iterator that returns bootstrap copies of
g
.
gvar
’s random number generator is gvar.RNG
. It is currently
(since version 12.1) an instance of numpy.random.Generator
.
The following function is used to reseed the random number generator:
- gvar.ranseed(seed=None, size=3, version=None)¶
Seed random number generators with tuple
seed
.Argument
seed
is an integer or atuple
of integers that is used to seed the random number generators used by bygvar
. Reusing the sameseed
results in the same set of random numbers.ranseed
generates its own seed when called without an argument or withseed=None
. This seed is stored inranseed.seed
and is also returned by the function. The seed can be used to regenerate the same set of random numbers at a later time. Keyword argumentsize
specifies the size or shape of the new seed.gvar
’s random number generator isgvar.RNG
, which is (currently) an instance of thenumpy.random.Generator
class. (Random numbers are generated by calling routines likegvar.RNG.random(size)
; see the documentation fornumpy.random.Generator
.) Callinggvar.ranseed
assignsgvar.RNG
to a new generator of this type, corresponding to the seed.Note: The random number generator changed in version 12.1 of
gvar
in order to track changes in version 1.17 ofnumpy
, which provides the generator. To restore the old generator set keywordversion=0
. This option should only be used for legacy code as it is likely to disappear eventually. Note thatgvar.RNG=numpy.random
in this case. This is also the choice if thenumpy
version is earlier than 1.17.- Parameters:
seed (int, tuple, or None) – Seed for generator. Generates a random tuple for the seed if
None
.size (int) – Size of the tuple of integers used to seed the random number generator when
seed=None
. Default issize=3
. Ignored ifseed
is notNone
.version (int or None) – Version of random number generator:
version=0
specifies the generator used ingvar
versions earlier than 12.1;version=None
(default) specifies the current random number generator.
- Returns:
The seed used to reseed the generator.
The following two functions that are useful for tabulating results
and for analyzing where the errors in a gvar.GVar
constructed from
other gvar.GVar
s come from:
- gvar.fmt_errorbudget(outputs, inputs, ndecimal=2, percent=True, verify=False, colwidth=10)¶
Tabulate error budget for
outputs[ko]
due toinputs[ki]
.For each output
outputs[ko]
,fmt_errorbudget
computes the contributions tooutputs[ko]
’s standard deviation coming from thegvar.GVar
s collected ininputs[ki]
. This is done for each key combination(ko,ki)
and the results are tabulated with columns and rows labeled byko
andki
, respectively. If agvar.GVar
ininputs[ki]
is correlated with othergvar.GVar
s, the contribution from the others is included in theki
contribution as well (since contributions from correlatedgvar.GVar
s cannot be distinguished). The table is returned as a string.- Parameters:
outputs – Dictionary of
gvar.GVar
s for which an error budget is computed.inputs – Dictionary of:
gvar.GVar
s, arrays/dictionaries ofgvar.GVar
s, or lists ofgvar.GVar
s and/or arrays/dictionaries ofgvar.GVar
s.fmt_errorbudget
tabulates the parts of the standard deviations of eachoutputs[ko]
due to eachinputs[ki]
.ndecimal (int) – Number of decimal places displayed in table.
percent (bool) – Tabulate % errors if
percent is True
; otherwise tabulate the errors themselves.colwidth (int) – Width of each column. This is set automatically, to accommodate label widths, if
colwidth=None
(default).verify (bool) – If
True
, a warning is issued if: 1) different inputs are correlated (and therefore double count errors); or 2) the sum (in quadrature) of partial errors is not equal to the total error to within 0.1% of the error (and the error budget is incomplete or overcomplete). No checking is done ifverify==False
(default).
- Returns:
A table (
str
) containing the error budget. Output variables are labeled by the keys inoutputs
(columns); sources of uncertainty are labeled by the keys ininputs
(rows).
- gvar.fmt_values(outputs, ndecimal=None)¶
Tabulate
gvar.GVar
s inoutputs
.- Parameters:
outputs – A dictionary of
gvar.GVar
objects.ndecimal (int) – Format values
v
usingv.fmt(ndecimal)
.
- Returns:
A table (
str
) containing values and standard deviations for variables inoutputs
, labeled by the keys inoutputs
.
The following functions are used to make correlation matrices less singular:
- gvar.regulate(g, svdcut=1e-12, eps=None, wgts=False, noise=False)¶
Regulate singularities in the correlation matrix of the
gvar.GVar
s ing
.Standard usage is, for example,
import gvar as gv # with eps cutoff gmod = gv.regulate(g, eps=1e-6) # or with svd cutoff gmod = gv.regulate(g, svdcut=1e-6)
where
g
is an array ofgvar.GVar
s or a dictionary containinggvar.GVar
s and/or arrays ofgvar.GVar
s. The resultgmod
is a copy ofg
whosegvar.GVar
s have been modified to make their correlation matrix less singular than that of the originalg
. Parametereps
orsvdcut
specifies the extent of the modification.The modification of
g
is implemented by adding a set ofgvar.GVar
s with zero means:gmod = g + gmod.correction
where
gmod.correction
is an array/dictionary containing thegvar.GVar
s. Wheng
is an array andeps
is specified, for example,g[i]
is modified by adding:gmod.correction[i] = gv.gvar(0, (eps * norm) ** 0.5 * g[i].sdev)
where
norm = numpy.linalg.norm(corr, numpy.inf)
is an estimate of the largest eigenvalue of the correlation matrixcorr
. This correction typically has a negligible effect on the final standard deviations (relative ordereps*norm/2
), but can make noticeable changes in the correlation matrix for highly correlated data. Strong correlations lead to small eigenvalues in the correlation matrix, and these are significantly increased by the cutoff, which in effect replaces each eigenvalueeig
byeig + eps * norm
. Only members ofg
that are correlated with other members ofg
are modified; uncorrelated members are left unchanged (i.e., the correction isgv.gvar(0,0)
).Adding
gmod.correction
tog
increases the uncertainties ing
but does not affect random fluctuations in its mean values. If parameternoise=True
, random noise is included ingmod.correction
,gmod.correction += gv.sample(gmod.correction),
before it is added to
g
. This adds random fluctuations to the means ingmod
that are commensurate with the additions to the uncertainties. This is important, for example, when interpreting achi**2
involvinggmod
, sincechi**2
compares fluctuations in the means with the uncertainties.Specifying
svdcut
rather thaneps
regulates the correlation matrix usinggvar.svd()
: callinggvar.regulate(g, svdcut=1e-4)
is the same as callinggvar.svd(g, svdcut=1e-4)
. Whensvdcut>=0
, each eigenvalueeig
of the correlation matrix is replaced bymax(eig, svdcut * max_eig)
wheremax_eig
is the largest eigenvalue. See thegvar.svd()
documentation for more details. SVD cuts are numerically more robust, but more costly, especially for large systems.There are a variety of reasons for regulating the correlation matrices. Roundoff error, for example, can make the smallest eigenvalues unreliable and destabilize calulations involving the correlation matrix. A similar situation arises when the correlation matrix is estimated from a small set of random data. This can result in small eigenvalues that are badly underestimated.
There is an additional parameter
wgts
ingvar.regulate()
whose default value isFalse
. Settingwgts=1
orwgts=-1
instead causesgvar.regulate()
to return a tuple(gmod, i_wgts)
wheregmod
is the modified copy ofg
, andi_wgts
contains a decomposition of either the modified covariance matrix, whenwgts=1
, or the inverse of the modified covariance matrix, whenwgts=-1
. The covariance matrix is decomposed into non-overlapping sub-matrices, withi_wgts[0]
containing the contributions from all 1x1 sub-matrices. Typical usage, for example to compute the inverseinv_cov
of thegmod
covariance matrix, is:gmod, i_wgts = gvar.regulate(g, eps=1e-6, wgts=-1) inv_cov = numpy.zeros((gmod.size, gmod.size)) # 1x1 sub-matrices i, wgts = i_wgts[0] inv_cov[i, i] = wgts ** 2 # nxn sub-matrices (n>1) for i, wgts in i_wgts[1:]: inv_cov[i[:, None], i] = wgts.T @ wgts
Similarly, we can compute the expectation value,
u.dot(inv_cov.dot(v))
, between two vectors (numpy
arrays) using:result = 0.0 i, wgts = i_wgts[0] # 1x1 sub-matrices if len(i) > 0: result += numpy.sum((u[i] * wgts) * (v[i] * wgts)) # nxn sub-matrices (n>1) for i, wgts in i_wgts[1:]: result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))
where
result
is the desired expectation value.Setting
wgts=True
causesgvar.regulate()
to return a tuple(gmod, i_wgts, i_invwgts)
wherei_wgts
is the decomposition of the covariance matrix andi_invwgts
is the decomposition of the inverse of the covariance matrix.These decompositions are useful for least squares fitting and simulating correlated data. A Cholesky decomposition is used when
eps
is specified, while an SVD decomposition is used withsvdcut
.- Parameters:
g – An array of
gvar.GVar
s or a dicitionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s.eps (float) – The diagonal elements of the
g
covariance matrix are multiplied by1 + eps*norm
wherenorm
is the norm of the correlation matrix,numpy.linalg.norm(corr, numpy.inf)
. Ignored ifsvdcut
is specified (and notNone
).svdcut (float) – If positive, replaces eigenvalues
eig
of the correlation matrix withmax(eig, svdcut * max_eig)
wheremax_eig
is the largest eigenvalue; if negative, discards eigenmodes with eigenvalues smaller than|svdcut| * max_eig
. Note|svdcut| < 1
. Default is 1e-12.wgts – Setting
wgts=1
causesgvar.regulate()
to compute and return a decomposition of the covariance matrix of the modifiedgvar.GVar
s,gmod
. Settingwgts=-1
results in a decomposition of the inverse of the covariance matrix. Settingwgts=True
results in both decompositions. The default value isFalse
, in which case onlygmod
is returned.noise (bool) – If
True
, noise is added to the correction (see above). Default isFalse
.
- Returns:
If
wgts
is notFalse
, a tuple(g, i_wgts)
is returned wherei_wgts
contains a decomposition of thegmod
covariance matrix or its inverse (see above), or ifwgts=True
both decompositions are returned:(gmod, i_wgts, i_invwgts)
.
Additional information is stored in
gmod
:- gmod.correction
Array or dictionary containing the SVD corrections added to
g
to creategmod
:gmod = g + gmod.correction
.
- gmod.eps
eps
used to creategmod
if set (otherwiseNone
).
- gmod.svdcut
svdcut
used to creategmod
if set (otherwiseNone
).
- gmod.dof
Number of degrees of freedom in
gmod
.
- gmod.nmod
Number of members of
gmod
modified by regulation.
- gmod.nblocks
A dictionary where
gmod.nblocks[s]
contains the number of block-diagonals
-by-s
sub-matrices in the correlation matrix. Useful for debugging.
- gmod.logdet
Logarithm of the determinant of the covariance matrix after
eps
regulation is applied.
- gvar.svd(g, svdcut=1e-12, wgts=False, noise=False)¶
Apply SVD cuts to collection of
gvar.GVar
s ing
.Standard usage is, for example,
import gvar as gv ... gmod = gv.svd(g, svdcut=1e-4)
where
g
is an array ofgvar.GVar
s or a dictionary containinggvar.GVar
s and/or arrays ofgvar.GVar
s. Whensvdcut>0
,gmod
is a copy ofg
whosegvar.GVar
s have been modified to make their correlation matrix less singular than that of the originalg
: each eigenvalueeig
of the correlation matrix is replaced bymax(eig, svdcut * max_eig)
wheremax_eig
is the largest eigenvalue. This SVD cut, which is applied separately to each block-diagonal sub-matrix of the correlation matrix, increases the variance of the eigenmodes with eigenvalues smaller thansvdcut * max_eig
.The modification of
g
’s covariance matrix is implemented by adding (tog
) a set ofgvar.GVar
s with zero means:gmod = g + gmod.correction
where
gmod.correction
is an array/dictionary containing thegvar.GVar
s.Adding
gmod.correction
tog
increases the uncertainties ing
but does not affect random fluctuations in the mean values. If parameternoise=True
, random noise is included ingmod.correction
,gmod.correction += gv.sample(gmod.correction),
before it is added to
g
. This adds random fluctuations to the means ingmod
that are commensurate with the additions to the uncertainties. This is important, for example, when interpreting achi**2
involvinggmod
, sincechi**2
tests whether fluctuations in the means are consistent with the uncertainties.When
svdcut
is negative, eigenmodes of the correlation matrix whose eigenvalues are smaller than|svdcut| * max_eig
are dropped from the new matrix and the corresponding components ofg
are zeroed out (that is, replaced by 0(0)) ingmod
.There is an additional parameter
wgts
ingvar.svd()
whose default value isFalse
. Settingwgts=1
orwgts=-1
instead causesgvar.svd()
to return a tuple(gmod, i_wgts)
wheregmod
is the modified copy ofg
, andi_wgts
contains an SVD decomposition of either the modified covariance matrix, whenwgts=1
, or the inverse of the modified covariance matrix, whenwgts=-1
. The covariance matrix is decomposed into non-overlapping sub-matrices, withi_wgts[0]
containing the contributions from all 1x1 sub-matrices. Typical usage, for example to compute the inverseinv_cov
of thegmod
covariance matrix, is:gmod, i_wgts = gvar.svd(g, svdcut=1e-6, wgts=-1) inv_cov = numpy.zeros((gmod.size, gmod.size)) # 1x1 sub-matrices i, wgts = i_wgts[0] inv_cov[i, i] = wgts ** 2 # nxn sub-matrices (n>1) for i, wgts in i_wgts[1:]: inv_cov[i[:, None], i] = wgts.T @ wgts
Similarly, we can compute the expectation value,
u.dot(inv_cov.dot(v))
, between two vectors (numpy
arrays) using:result = 0.0 i, wgts = i_wgts[0] # 1x1 sub-matrices if len(i) > 0: result += numpy.sum((u[i] * wgts) * (v[i] * wgts)) # nxn sub-matrices (n>1) for i, wgts in i_wgts[1:]: result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))
where
result
is the desired expectation value. The SVD decompositions are useful for least squares fitting and simulating correlated data.Setting
wgts=True
causesgvar.svd()
to return a tuple(gmod, i_wgts, i_invwgts)
wherei_wgts
is the decomposition of the covariance matrix andi_invwgts
is the decomposition of the inverse of the covariance matrix.- Parameters:
g – An array of
gvar.GVar
s or a dicitionary whose values aregvar.GVar
s and/or arrays ofgvar.GVar
s.svdcut (None or float) – If positive, replace eigenvalues
eig
of the correlation matrix withmax(eig, svdcut * max_eig)
wheremax_eig
is the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than|svdcut| * max_eig
. Note|svdcut| < 1
. Default is 1e-12.wgts – Setting
wgts=1
causesgvar.svd()
to compute and return a spectral decomposition of the covariance matrix of the modifiedgvar.GVar
s,gmod
. Settingwgts=-1
results in a decomposition of the inverse of the covariance matrix. Settingwgts=True
results in both decompositions. The default value isFalse
, in which case onlygmod
is returned.noise – If
True
, noise is added to the SVD correction (see above). Default isFalse
.
- Returns:
A copy
gmod
ofg
whose correlation matrix is modified by SVD cuts. Ifwgts=1
, a tuple(gmod,i_wgts)
is returned wherei_wgts
contains an SVD decomposition ofgmod
’s covariance matrix. Ifwgts=-1
, a tuple(gmod,i_invwgts)
is returned wherei_invwgts
contains an SVD decomposition of the inverse ofgmod
’s covariance matrix. Ifwgts=True
, both decompositions are returned:(gmod,i_wgts,i_invwgts)
.
Data from the SVD analysis is stored in
gmod
:- gmod.svdcut
SVD cut used to create
gmod
.
- gmod.dof
Number of independent degrees of freedom left after the SVD cut. This is the same as the number initially unless
svdcut < 0
in which case it may be smaller.
- gmod.nmod
Number of modes whose eignevalue was modified by the SVD cut.
- gmod.nblocks
A dictionary where
gmod.nblocks[s]
contains the number of block-diagonals
-by-s
sub-matrices in the correlation matrix.
- gmod.eigen_range
Ratio of the smallest to largest eigenvalue before SVD cuts are applied (but after rescaling).
- gmod.logdet
Logarithm of the determinant of the covariance matrix after SVD cuts are applied (excluding any omitted modes when
svdcut < 0
and any diagonal zero modes).
- gmod.correction
Array or dictionary containing the SVD corrections added to
g
to creategmod
:gmod = g + gmod.correction
.
This function is useful when the correlation matrix is singular or almost singular, and its inverse is needed (as in curve fitting).
The following function can be used to rebuild collections of gvar.GVar
s,
ignoring all correlations with other variables. It can also be used to
introduce correlations between uncorrelated variables.
- gvar.rebuild(g, corr=0.0, gvar=gvar.gvar)¶
Rebuild
g
stripping correlations with variables not ing
.g
is either an array ofgvar.GVar
s or a dictionary containinggvar.GVar
s and/or arrays ofgvar.GVar
s.rebuild(g)
creates a new collectiongvar.GVar
s with the same layout, means and covariance matrix as those ing
, but discarding all correlations with variables not ing
.If
corr
is nonzero,rebuild
will introduce correlations wherever there aren’t any usingcov[i,j] -> corr * sqrt(cov[i,i]*cov[j,j])
wherever
cov[i,j]==0.0
initially. Positive values forcorr
introduce positive correlations, negative values anti-correlations.Parameter
gvar
specifies a function for creating newgvar.GVar
s that replacesgvar.gvar()
(the default).- Parameters:
g (array or dictionary) –
gvar.GVar
s to be rebuilt.gvar (
gvar.GVarFactory
orNone
) – Replacement forgvar.gvar()
to use in rebuilding. Default isgvar.gvar()
.corr (number) – Size of correlations to introduce where none exist initially.
- Returns:
Array or dictionary (gvar.BufferDict) of
gvar.GVar
s (same layout asg
) where all correlations with variables other than those ing
are erased.
The following functions creates new functions that generate gvar.GVar
s (to
replace gvar.gvar()
):
- gvar.switch_gvar()¶
Switch
gvar.gvar()
to newgvar.GVarFactory
.gvar.GVar
s created from different factory functions (seegvar.gvar_factory()
), with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixesgvar.GVar
s in this way will generate an error message: “incompatible GVars”.- Parameters:
cov – Covariance matrix for new
gvar.gvar()
. A new covariance matrix is created ifcov=None
(default). Ifcov
is agvar.GVar
, the covariance matrix of thegvar.GVar
is used.- Returns:
New
gvar.gvar()
.
- gvar.restore_gvar()¶
Restore previous
gvar.gvar()
.gvar.GVar
s created from different factory functions (seegvar.gvar_factory()
), with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixesgvar.GVar
s in this way will generate an error message: “incompatible GVars”.- Returns:
Previous
gvar.gvar()
.
- gvar.gvar_factory(cov=None)¶
Return new function for creating
gvar.GVar
s (to replacegvar.gvar()
).If
cov
is specified, it is used as the covariance matrix for newgvar.GVar
s created by the function returned bygvar_factory(cov)
. Otherwise a new covariance matrix is created internally.gvar.GVar
s created from different factory functions, with different covariance matrices, should never be mixed in arithmetic or other expressions. Such mixing is unsupported and will result in unpredictable behavior. Arithmetic that mixesgvar.GVar
s in this way will generate an error message: “incompatible GVars”.
gvar.GVar
Objects¶
The fundamental class for representing Gaussian variables is:
- class gvar.GVar¶
The basic attributes are:
- mean¶
Mean value.
- sdev¶
Standard deviation.
- var¶
Variance.
Two methods allow one to isolate the contributions to the variance or standard deviation coming from other
gvar.GVar
s:- partialvar(*args)¶
Compute partial variance due to
gvar.GVar
s inargs
.This method computes the part of
self.var
due to thegvar.GVar
s inargs
. Ifargs[i]
is correlated with othergvar.GVar
s, the variance coming from these is included in the result as well. (This last convention is necessary because variances associated with correlatedgvar.GVar
s cannot be disentangled into contributions corresponding to each variable separately.)
- partialsdev(*args)¶
Compute partial standard deviation due to
gvar.GVar
s inargs
.This method computes the part of
self.sdev
due to thegvar.GVar
s inargs
. Ifargs[i]
is correlated with othergvar.GVar
s, the standard deviation coming from these is included in the result as well. (This last convention is necessary because variances associated with correlatedgvar.GVar
s cannot be disentangled into contributions corresponding to each variable separately.)
Partial derivatives of the
gvar.GVar
with respect to the independentgvar.GVar
s from which it was constructed are given by:- deriv(x)¶
Derivative of
self
with respest to primarygvar.GVar
s inx
.All
gvar.GVar
s are constructed from primarygvar.GVar
s (seegvar.is_primary()
).self.deriv(x)
returns the partial derivative ofself
with respect to primarygvar.GVar
x
, holding all of the other primarygvar.GVar
s constant.
There are three methods for converting
self
into a string for printing and display, and fourth for modifying the defaults used in formatting:- __format__(spec)¶
Format strings for
gvar.GVar
s.Support is provided for standard presentation formats normally used for floats including:
'e'
,'f'
,'g'
,'n'
, and'%'
. The format is applied to the mean value and the output modified to include the standard deviation when the mean is larger in magnitude than the standard deviation:>>> x = gvar.gvar(12.314, 1.56) >>> print(f'{x:.2g}'') 12(2) >>> print(f'{x:.^25.3e}') .....1.231(156)e+01......
The format is applied first to the standard deviation when it is the larger of the two.
There is an additional format,
'p'
, where the precision field specifies the number of digits displayed in the standard deviation:>>> print(f'{x:.2p}') 12.3(1.6)
The default precision is 2.
An alternative form
'#p'
of this format may add further digits to the standard deviation (when it is larger in magnitude than the mean) so that at least one non-zero digit of the mean is displayed:>>> x = gvar.gvar(9, 123) >>> print(f'{x:#.2p}'') 9(123)
If the mean is exactly zero, the result is ‘0 ± sdev’.
Format
'P'
is similar to'#p'
but always uses the plus-minus format:'mean ± sdev'
.The default format, when none is specified, is ‘{:#.2p}’. This can be changed using
GVar.set(default_format=...)
.
- __str__()¶
Returns string using the default format:
f'{self}'
.
- fmt(ndecimal=None, sep='', format='{}')¶
Format
gvar.GVar
.Typical usage:
>>> g = gvar.gvar(27.9315, 1.23) >>> print(g.fmt(), g.fmt(format='{:.2g}'), g.fmt(ndecimal=3)) 27.9(1.2) 28(1) 27.931(1.230) >>> print(g.fmt(ndecimal=-1), g.fmt(sep='|')) 27.9 ± 1.2 27.9|(1.2)
- Parameters:
ndecimal (int or None) – Format
gvar.GVar
usingf'{self:.{ndecimal}f}'
whenndecimal
is a non-negative integer, orf'{self:.2P}'
whenndecimal
is negative. Ignored ifndecimal=None
(default).sep (str) – String inserted between the
mean
and the(sdev)
in'mean(sdev)'
. Default issep=''
.format (str) – Format string. Ignored if
ndecimal
is notNone
. Default is'{}'
.
- Returns:
Formatted string.
- static set(**kargs)¶
Change
gvar.GVar
’s format parameters.Typical usage is:
# change default format spec and plus/minus symbol oldparam = GVar.set(default_format='{:.3f}', plusminus=' +/- ') ... # change back GVar.set(**oldparam)
The following parameters can be reset.
- Parameters:
formatter (callable) – Sets the default formatter. The formatted string is returned by
formatter(g, spec)
whereg
is thegvar.GVar
to be formatted andspec
is the format specification (e.g,'.3g'
or'.^20.3f'
). Settingformatter=None
restores the original formatter. Settingformatter='old'
switches to the formatter used beforegvar
version 12.0 (buggy and deprecated).default_format (str) – Sets the format used when none is specified. (Note that
default_format='{}'
is not allowed.) The format can be specified as a full format string (e.g.,'{:.4g}'
) or with just the format specification ('.4g'
). Default is'{:#.2p}'
.plusminus (str) – Sets the plus/minus symbol(s) for format strings. Default is
' ± '
.prange (tuple) – Sets the range of values of
abs(g.mean)/g.sdev
formatted with a compressed format when using the'p'
presentation format. Values outside this range are formatted as'{g.mean} ± {g.sdev}'
with the'p'
format. Default is(1e-5, 10 ** sys.float_info.dig)
.
Two attributes and a method make reference to the original variables from which
self
is derived:- dotder(v)¶
Return the dot product of
self.der
andv
.
gvar.BufferDict
Objects¶
gvar.BufferDict
objects are ordered dictionaries that are heavily used
in gvar
’s implementation. They provide the most flexible
representation for multi-dimensional Gaussian distributions.
gvar.BufferDict
objects differ from ordinary dictionaries in two respects. The
first difference is that the dictionary’s values must be scalars or numpy
arrays (any shape) of scalars. The scalars can be ordinary integers or floats,
but the dictionary was designed especially for gvar.GVar
s.
The dictionary’s values are packed into different parts of a single
one-dimensional array or buffer. Items can be added one at a time as in other
dictionaries,
>>> import gvar as gv
>>> b = gv.BufferDict()
>>> b['s'] = 0.0
>>> b['v'] = [1., 2.]
>>> b['t'] = [[3., 4.], [5., 6.]]
>>> print(b)
{
's': 0.0,
'v': array([1., 2.]),
't': array([[3., 4.],
[5., 6.]]),
}
but the values can also be accessed all at once through the buffer:
>>> print(b.buf)
[0. 1. 2. 3. 4. 5. 6.]
A previous entry can be overwritten, but the size, shape, and type of the
data needs to be the same. For example, here setting b['s'] = [10., 20.]
would generate a ValueError
exception, but b['s'] = 22.
is fine.
The second difference between gvar.BufferDict
s and other dictionaries is
illustrated by the following code:
>>> b = gv.BufferDict()
>>> b['log(a)'] = gv.gvar('1(1)')
>>> print(b)
{'log(a)': 1.0(1.0)}
>>> print(b['a'], b['log(a)'])
2.7(2.7) 1.0(1.0)
Even though 'a'
is not a key in the dictionary, b['a']
is still
defined: it equals exp(b['log(a)'])
. This feature is used to provide
(limited) support for non-Gaussian distributions. Here gvar.BufferDict
b
specifies a distribution that is Gaussain for p['log(a)']
, and
therefore log-normal for p['a']
. Thus, for example,
>>> [x['a'] for x in gv.raniter(b, n=4)]
[2.1662650927997817, 2.3350022125310317, 8.732161128765775, 3.578188553455522]
creates a list of four random numbers drawn from a log-normal distribution.
Note that 'a'
in this
example is not a key in dictionary b
, even though both b['a']
and b.get('a')
return values:
>>> print('a' in b)
False
>>> print('log(a)' in b)
True
>>> print(list(b))
['log(a)']
>>> print(b['a'], b.get('a'))
2.7(2.7) 2.7(2.7)
This functionality is used routinely by other modules (e.g., lsqfit
).
The supported distributions, and methods for adding new ones, are
described in the documentation for gvar.BufferDict.add_distribution()
,
below.
- class gvar.BufferDict(*args, **kargs)¶
Ordered dictionary whose values are packed into a 1-d buffer (
numpy
array).gvar.BufferDict
s can be created in the usual way dictionaries are created:>>> b = BufferDict() >>> b['a'] = 1 >>> b['b'] = [2.7, 3.2] >>> print(b) {'a': 1.0, 'b': array([2.7, 3.2])} >>> print(b.buf) [1. 2.7 3.2] >>> b = BufferDict(a=1, b=[2.7, 3.2]) >>> b = BufferDict([('a',1.), ('b',[2.7, 3.2])])
They can also be created from other dictionaries or
gvar.BufferDict
s:>>> c = BufferDict(b) >>> print(c) {'a': 1.0, 'b': 2.7} >>> c = BufferDict(b, keys=['b']) >>> print(c) {'b': 2.7}
The
keys
keyword restricts the copy of a dictionary to entries whose keys are inkeys
and in the same order as inkeys
.The values in a
gvar.BufferDict
are scalars or arrays of a scalar type (gvar.GVar
,float
,int
, etc.). The data type of the buffer is normally inferred (dynamically) from the data itself:>>> b = BufferDict(a=1) >>> print(b, b.dtype) {'a': 1} int64 >>> b['b'] = 2.7 >>> print(b, b.dtype) {'a': 1.0, 'b': 2.7} float64
The data type of the
gvar.BufferDict
’s buffer can be specified when creating thegvar.BufferDict
from another dictionary or list by using keyworddtype
:>>> b = BufferDict(dict(a=1.2), dtype=int) >>> print(b, b.dtype) {'a': 1} int64 >>> b['b'] = 2.7 >>> print(b, b.dtype) {'a': 1, 'b': 2} int64
Note in this example that the data type is not changed by subsequent additions to the
gvar.BufferDict
when thedtype
keyword is specified. To create an emptygvar.BufferDict
with a specified type use, for example,BufferDict({}, dtype=int)
. Any data type supported bynumpy
arrays can be specified.Some simple arithemetic is allowed between two
gvar.BufferDict
s, say,g1
andg2
provided they have the same keys and array shapes. So, for example:>>> a = BufferDict(a=1, b=[2, 3]) >>> b = BufferDict(b=[20., 30.], a=10.) >>> print(a + b) {'a': 11.0, 'b': array([22., 33.])}
Subtraction is also defined, as are multiplication and division by scalars. The corresponding
+=
,-=
,*=
,/=
operators are supported, as are unary+
and-
.Finally a
gvar.BufferDict
can be cloned from another one using a different buffer (containing different values):>>> b = BufferDict(a=1., b=2.) >>> c = BufferDict(b, buf=[10, 20]) >>> print(c) {'a': 10, 'b': 20}
The new buffer itself (i.e., not a copy) is used in the new
gvar.BufferDict
if the buffer is anumpy
array:>>> import numpy as np >>> b = BufferDict(a=1., b=2.) >>> nbuf = np.array([10, 20]) >>> c = BufferDict(b, buf=nbuf) >>> c['a'] += 1 >>> print(c) {'a': 11, 'b': 20} >>> print(nbuf) [11 20]
If many clones of a
gvar.BufferDict
are needed, it is usually substantially more efficient to create them together in a single batch (or a small number of large batches). For example, the following code>>> a = BufferDict(s=1., v=[2., 3., 4.]) >>> print(a, ' ', a.buf) {'s': 1.0, 'v': array([2., 3.])} [1. 2. 3. 4.] >>> lbuf = np.array( ... [[10., 20., 30., 40.], [100., 200., 300., 400.], [1000., 2000., 3000., 4000.]] ... ) >>> lb = BufferDict(a, lbatch_buf=lbuf) >>> print(lb) { 's': array([ 10., 100., 1000.]), 'v': array([[ 20., 30., 40.], [ 200., 300., 400.], [2000., 3000., 4000.]]), }
merges three clones of the original
gvar.BufferDict
a
intogvar.BufferDict
lb
. Each clone has its own bufferlbuf[i]
wherei=0...2
labels the clone. Each element oflb
has an extra (leftmost) index labeling the clone:lb['s'][i]
andlb['v'][i, :]
. If needed, the individual clones can be pulled out into separate dictionaries usinggvar.BufferDict.batch_iter()
:>>> for clone in b.batch_iter('lbatch'): ... print(clone) ... {'s': 10.0, 'v': array([20., 30., 40.])} {'s': 100.0, 'v': array([200., 300., 400.])} {'s': 1000.0, 'v': array([2000., 3000., 4000.])}
These clones all provide views into
b.buf
.The batch index is always the leftmost index in the example above. It is frequently more convenient for the batch index to be the rightmost index. This is done by setting
rbatch_buf
rather thanlbatch_buf
when making the batchgvar.BufferDict
: for example,>>> rbuf = np.array( ... [[10., 100., 1000.], [20., 200., 2000.], [30., 300., 3000.], [40., 400., 4000.]] ... ) >>> rb = BufferDict(a, rbatch_buf=rbuf) >>> print(rb) { 's': array([ 10., 100., 1000.]), 'v': array([[ 20., 200., 2000.], [ 30., 300., 3000.]]), [ 40., 400., 4000.]]), }
creates a batch
gvar.BufferDict
equivalent to the previous one but now where the batch index is the last index for each element:rb[k][...,i]
for each keyk
and batch indexi=0..2
. The batch bufferrbuf
used here is the transpose of that used for thelbatch
case;rb.buf
is a view intorbuf
(not a copy as is the case forlbatch
). Again we can iterate over the individual clones (and these are the same as above):>>> for clone in rb.batch_iter('rbatch'): ... print(clone) ... {'s': 10.0, 'v': array([20., 30., 40.])} {'s': 100.0, 'v': array([200., 300., 400.])} {'s': 1000.0, 'v': array([2000., 3000., 4000.])}
The main attributes are:
- size¶
Size of buffer array.
- flat¶
Buffer array iterator.
- dtype¶
Data type of buffer array elements.
- buf¶
The (1d) buffer array. Allows direct access to the buffer: for example,
self.buf[i] = new_val
sets the value of thei-th
element in the buffer to valuenew_val
. Settingself.buf = nbuf
replaces the old buffer by new buffernbuf
. This only works ifnbuf
is a one-dimensionalnumpy
array having the same length as the old buffer, sincenbuf
itself is used as the new buffer (not a copy).
- shape¶
Always equal to
None
. This attribute is included sincegvar.BufferDict
s share several attributes withnumpy
arrays to simplify coding that might support either type. Being dictionaries they do not have shapes in the sense ofnumpy
arrays (hence the shape isNone
).
In addition to standard dictionary methods, the main methods here are:
- flatten()¶
Copy of buffer array.
- slice(k)¶
Return slice/index in
self.flat
corresponding to keyk
.
- slice_shape(k)¶
Return tuple
(slice/index, shape)
corresponding to keyk
.
- has_dictkey(k)¶
Returns
True
ifself[k]
is defined;False
otherwise.Note that
k
may be a key or it may be related to a another key associated with a non-Gaussian distribution (e.g.,'log(k)'
; seegvar.BufferDict.add_distribution()
for more information).
- all_keys()¶
Iterator over all keys and implicit keys.
For example, the following code
b = BufferDict() b['log(x)'] = 1. for k in b.keys(): print(k, b[k]) print() for k in b.all_keys(): print(k, b[k])
gives the following output:
log(x) 1 log(x) 1 x 2.718281828459045
Here
'x'
is not a key in dictionaryb
butb['x']
is defined implicitly fromb['log(x)']
. Seegvar.BufferDict.add_distribution()
for more information.
- static add_distribution(name, invfcn)¶
Add new parameter distribution.
gvar.BufferDict
s can be used to represent a restricted class of non-Gaussian distributions. For example, the codeimport gvar as gv gv.BufferDict.add_distribution('log', gv.exp)
enables the use of log-normal distributions for parameters. So defining, for example,
b = gv.BufferDict() b['log(a)'] = gv.gvar('1(1)')
means that
b['a']
has a value (equal toexp(b['log(a)']
) even though'a'
is not a key in the dictionary.The distributions available by default correspond to:
gv.BufferDict.add_distribution('log', gv.exp) gv.BufferDict.add_distribution('sqrt', gv.square) gv.BufferDict.add_distribution('erfinv', gv.erf)
See also
gvar.BufferDict.uniform()
for uniform distributions.- Parameters:
name (str) – Distributions’ function name. A
ValueError
is raised ifname
is already being used for a distribution; the error can be avoided by deleting the old definition first usingBufferDict.del_distribution()
.invfcn (callable) – Inverse of the transformation function.
- static del_distribution(name)¶
Delete
gvar.BufferDict
distributionname
.Raises a
ValueError
ifname
is not the name of an existing distribution.
- static has_distribution(name)¶
True
ifname
has been defined as a distribution;False
otherwise.
- static uniform(fname, umin, umax, shape=())¶
Create uniform distribution on interval
[umin, umax]
.The code
import gvar as gv b = gv.BufferDict() b['f(w)'] = gv.BufferDict.uniform('f', 2., 3.)
adds a distribution function
f(w)
designed so thatb['w']
corresponds to a uniform distribution on the interval[2., 3.]
(seegvar.BufferDict.add_distribution()
for more about distributions).- Parameters:
fname (str) – Name of function used in the
BufferDict
key. Note that names can be reused provided they correspond to the same interval as in previous calls.umin (float) – Minimum value of the uniform distribution.
umax (float) – Maximum value of the uniform distribution.
shape (tuple) – Shape of array of uniform variables. Default is
()
.
- Returns:
gvar.GVar
object corresponding to a uniform distribution.
- batch_iter(mode)¶
Iterate over dictionaries in a batch
gvar.BufferDict
.- Parameters:
mode (str) – Type of batch
gvar.BufferDict
:'rbatch'
if the batch index on each entry is on the right; or'lbatch'
if the batch index is on the left.- Returns:
An iterator that yields the individual dictionaries merged into the batch
gvar.BufferDict
. The returned dictionaries are typedict
(not typegvar.BufferDict
). The values are views into the originalgvar.BufferDict
, not copies.
gvar.SVD
Objects¶
SVD analysis is handled by the following class:
- class gvar.SVD(mat, svdcut=None, svdnum=None, compute_delta=False, rescale=False)¶
SVD decomposition of a pos. semi-def. sym. matrix.
SVD
is a function-class that computes the eigenvalues and eigenvectors of a symmetric matrixmat
. Eigenvalues that are small (or negative, because of roundoff) can be eliminated or modified using svd cuts. Typical usage is:>>> mat = [[1.,.25],[.25,2.]] >>> s = SVD(mat) >>> print(s.val) # eigenvalues [ 0.94098301 2.05901699] >>> print(s.vec[0]) # 1st eigenvector (for s.val[0]) [ 0.97324899 -0.22975292] >>> print(s.vec[1]) # 2nd eigenvector (for s.val[1]) [ 0.22975292 0.97324899] >>> s = SVD(mat,svdcut=0.6) # force s.val[i]>=s.val[-1]*0.6 >>> print(s.val) [ 1.2354102 2.05901699] >>> print(s.vec[0]) # eigenvector unchanged [ 0.97324899 -0.22975292] >>> s = SVD(mat) >>> w = s.decomp(-1) # decomposition of inverse of mat >>> invmat = sum(numpy.outer(wj,wj) for wj in w) >>> print(numpy.dot(mat,invmat)) # should be unit matrix [[ 1.00000000e+00 2.77555756e-17] [ 1.66533454e-16 1.00000000e+00]]
- Parameters:
mat – Positive, symmetric matrix.
svdcut – If positive (or zero), replace
mat
’s eigenvaluese[i] -> max(e[i], svdcut*max(e))
; otherwwise, discard eigenmodes with eigenvalues smaller than|svdcut|
times the maximum eigenvalue.svdnum – If positive, keep only the modes with the largest
svdnum
eigenvalues; ignore if set toNone
.compute_delta (bool) – Compute
delta
(see below) ifTrue
; setdelta=None
otherwise.rescale – Rescale the input matrix to make its diagonal elements equal to +-1.0 before diagonalizing.
warn – Suppress warnings if
False
. Default isTrue
.
The results are accessed using:
- val¶
An ordered array containing the eigenvalues of
mat``after the SVD cut. Note that ``val[i]<=val[i+1]
.
- vec¶
Eigenvectors
vec[i]
corresponding to the eigenvaluesval[i]
.
- valmin¶
Minimum eigenvalue allowed in the modified matrix.
- valorig¶
Eigenvalues of original matrix.
- D¶
The diagonal matrix used to precondition the input matrix if
rescale==True
. The matrix diagonalized isD M D
whereM
is the input matrix.D
is stored as a one-dimensional vector of diagonal elements.D
isNone
ifrescale==False
.
- nmod¶
The first
nmod
eigenvalues inself.val
were modified by the SVD cut.
- eigen_range¶
Ratio of the smallest to the largest eigenvector in the unconditioned matrix (after rescaling if
rescale=True
)
- delta¶
A vector of
gvar
s whose means are zero and whose covariance matrix is what was added tomat
to condition its eigenvalues. IsNone
ifsvdcut<0
orcompute_delta==False
.
- decomp(n)¶
Vector decomposition of input matrix raised to power
n
.Computes vectors
w[i]
such thatmat**n = sum_i numpy.outer(w[i],w[i])
where
mat
is the original input matrix tosvd
. This decomposition cannot be computed if the input matrix was rescaled (rescale=True
) except forn=1
andn=-1
.- Parameters:
n (number) – Power of input matrix.
- Returns:
Array
w
of vectors.
Requirements¶
gvar
makes heavy use of numpy
for array manipulations. It
also uses the numpy
code for implementing elementary functions
(e.g., sin
, exp
…) in terms of member functions.