Configurations and Ensembles

We perform Markov-chain Monte Carlo and estimate expectation values stochastically. Each step in the Markov chain is called a ‘configuration’; a set of configurations is called an ‘ensemble’.

We need to store data the same way for each configuration. A batch is a place to store data for a single physical field for every configuration.

class supervillain.batch.Batch(draws_or_data, *, cls=None, shape=None, dtype=None, **item_kwargs)[source]

Bases: object

A column of MCMC draws stored as an array with shape (draw, …).

extendable is for storage only; computation uses batch[i] (a scalar, ndarray slice, or cls-wrapped element such as Form). For whole-column ndarray operations use array.

Parameters
  • draws_or_data – If an int, allocate a new zeroed column of that many draws. Otherwise wrap existing batched data (draw axis must be 0).

  • cls – Optional element class (Form, etc.). None for plain ndarray / scalar columns.

  • shape – Spatial shape when cls is None and draws_or_data is an int.

  • dtype – Column dtype. When allocating a new column (draws_or_data is an int) it defaults to float. When wrapping existing data it defaults to the data’s own dtype; if given explicitly it must be able to hold the data without loss, otherwise a TypeError is raised (a lossy cast such as complex→float or float→int is rejected rather than silently dropping data).

  • item_kwargs – Column-constant keyword arguments passed to cls on each draw (e.g. degree, lattice for Form).
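The dtype rule above (accept a cast only if it cannot lose data, otherwise raise a TypeError) can be sketched with NumPy's own casting rules. This is an illustration, not the library's implementation: the helper name checked_dtype is hypothetical, and using np.can_cast with casting="safe" as the test for "without loss" is an assumption.

```python
import numpy as np


def checked_dtype(data, dtype=None):
    """Hypothetical helper: wrap `data`, refusing casts NumPy does not call safe."""
    data = np.asarray(data)
    if dtype is None:
        return data  # default to the data's own dtype
    target = np.dtype(dtype)
    # Assumption: "safe" casting stands in for "can hold the data without loss".
    if not np.can_cast(data.dtype, target, casting="safe"):
        raise TypeError(f"cannot store {data.dtype} data in a {target} column without loss")
    return data.astype(target)


widened = checked_dtype([1, 2, 3], float)  # int to float is accepted

try:
    checked_dtype([1.0 + 2.0j], float)     # complex to float would drop the imaginary part
    complex_rejected = False
except TypeError:
    complex_rejected = True
```

Raising instead of casting keeps a mistaken dtype from silently corrupting a whole column of draws.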

cls

The element class used to wrap each draw, or None for plain arrays.

dtype

The column dtype.

classmethod from_data(data, *, dtype=None, **kwargs)[source]

Construct a Batch from existing batched data.

Parameters
  • data (array_like) – Data whose zeroth axis is the draw index.

  • dtype – Optional dtype override when wrapping data.

  • kwargs – Forwarded to __init__() (e.g. cls, degree, lattice).

Returns

A batch wrapping data.

Return type

Batch

property array

The resizable storage column (extendable.array, shape (draw, …)).

Use when you already have a Batch. Prefer batch[i] for one draw; use as_array() when a value might still be a legacy column.

static as_array(column)[source]

Unwrap a column for NumPy analysis.

A Batch is unwrapped to its array; anything else passes through (legacy extendable.array, plain ndarray). Use at boundaries where the static type is unknown, not on attributes known to be Batch.

property shape

Returns

Shape of the underlying storage array, (draw, …).

Return type

tuple

__len__()[source]
Returns

Number of draws (length of axis 0).

Return type

int

__getitem__(index)[source]
Parameters

index (int, slice, or tuple) – If an int, return one draw (a scalar, ndarray slice, or cls instance). If a slice, return a new Batch sharing metadata. Otherwise delegate fancy indexing to the underlying storage array.

Returns

One element, a sub-batch, or an indexed view of the storage array.

Return type

scalar, ndarray, Form, or Batch
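The three indexing cases described for __getitem__ can be sketched with a toy column built on plain lists. ToyBatch is a hypothetical stand-in, not supervillain's Batch, and the list-of-indices branch only imitates what delegating to the storage array would do.

```python
class ToyBatch:
    """Hypothetical stand-in for a column of draws; not supervillain's Batch."""

    def __init__(self, draws, **metadata):
        self.draws = list(draws)   # position in this list plays the role of the draw axis
        self.metadata = metadata   # column-constant information shared by every draw

    def __len__(self):
        return len(self.draws)

    def __getitem__(self, index):
        if isinstance(index, int):
            return self.draws[index]                             # one draw
        if isinstance(index, slice):
            return ToyBatch(self.draws[index], **self.metadata)  # sub-batch, same metadata
        return [self.draws[i] for i in index]                    # raw storage for other indices

    def __iter__(self):
        return iter(self.draws)    # yields the same objects as batch[i]


column = ToyBatch([[0, 1], [2, 3], [4, 5]], degree=0)
one_draw = column[1]
sub_batch = column[0:2]
picked = column[[0, 2]]
```

Only the slice case returns another batch, so metadata such as degree survives slicing but not list-style selection.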

__setitem__(index, item)[source]
Parameters
  • index – Draw index or indices to overwrite (numpy indexing).

  • item – Value to store, coerced to dtype.

__iter__()[source]

Yield one draw per step (same objects as batch[i] for integer i).

extend_h5(group)[source]

Append this batch’s draws to an on-disk column.

The group must be an HDF5 group produced by the batch write strategy (it must contain a resizable data dataset).

Parameters

group (h5py.Group) – Group storing the batch column to extend.

Each set of Configurations contains a Batch for each physical field.

class supervillain.configurations.Configurations(dictionary)[source]

Bases: Extendable, ReadWriteable

A group of configurations has fields (which you can access by doing cfgs.field) and other auxiliary information (one per configuration).

However, you can also use cfgs[step] to get a dictionary with keys that correspond to the names of the fields (and the auxiliary information) and associated values.

If you like you can think of a set of Configurations as a very lightweight barely-featured pandas DataFrame.

Parameters

dictionary (dict) – A dictionary with {key: value} pairs, where each value is an array whose first dimension is one per configuration.

__getitem__(index)[source]
Parameters

index (fancy indexing) – A subset of numpy fancy indexing is supported; this selection is used for selecting configurations based on their location in the dataset, rather than their index. Some valid choices are 7, [1,2,3], slice(1,4), slice(1,10,2).

Returns

If index is an integer, returns a dictionary with key/value pairs for the requested configuration. If the index is fancier, return another set of Configurations.

Return type

one or many configurations
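A rough picture of this behaviour, using a dictionary of plain lists: an integer index yields one configuration as a dictionary, while a list or slice yields another container. ToyConfigurations and the field names phi and weight are hypothetical; this is a sketch of the access pattern, not the library's class.

```python
class ToyConfigurations:
    """Hypothetical dict-of-columns container; not supervillain's Configurations."""

    def __init__(self, dictionary):
        # Each value is a column whose first dimension runs over configurations.
        self.fields = {key: list(column) for key, column in dictionary.items()}

    def __len__(self):
        return len(next(iter(self.fields.values())))

    def __getattr__(self, name):
        # Only called when normal lookup fails, so cfgs.phi finds the 'phi' column.
        try:
            return self.__dict__["fields"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, index):
        if isinstance(index, int):
            # One configuration: a dictionary keyed by field name.
            return {key: column[index] for key, column in self.fields.items()}
        positions = range(len(self))[index] if isinstance(index, slice) else index
        # Several configurations: another container with the same fields.
        return ToyConfigurations(
            {key: [column[i] for i in positions] for key, column in self.fields.items()}
        )

    def items(self):
        return self.fields.items()


cfgs = ToyConfigurations({"phi": [0.1, 0.2, 0.3, 0.4], "weight": [1.0, 1.0, 1.0, 1.0]})
single = cfgs[2]
subset = cfgs[slice(1, 4, 2)]
```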

__setitem__(index, new)[source]
Parameters
  • index (fancy indexing) – Index or indices to overwrite.

  • new (dictionary or Configurations) – Data to write.

items()[source]

Like a dictionary’s .items(), iterates over the fields and auxiliary information.

extend_h5(group, _top=True)[source]
copy()[source]

Ensembles are made up of configurations and also have other physics information—not just the stored fields themselves.

class supervillain.ensemble.Ensemble(action)[source]

Bases: Extendable

An ensemble of configurations importance-sampled according to the action.

Parameters

action (an action) – An action which describes the path integral of interest.

Action

The action for the ensemble.

from_configurations(configurations)[source]
Parameters

configurations – A set of pre-computed configurations.

Return type

The ensemble itself, so that one can do ensemble = Ensemble(action).from_configurations(cfgs).

generate(steps, generator, start='cold', progress=<function _no_op>, starting_index=0, index_stride=1)[source]
Parameters
  • steps (int) – Number of configurations to generate.

  • generator – Something which produces a new configuration if called as generator.step(previous_configuration).

  • start (‘cold’, or a configuration as a dictionary) – A cold start begins with the all-zero configuration. If a dictionary is passed it is used as the zeroth configuration.

  • progress (something which wraps an iterator and provides a progress bar.) – In a script you might use tqdm.tqdm, and in a notebook tqdm.notebook. Defaults to no progress reporting. Must accept a desc keyword argument.

  • starting_index (int) – An ensemble has a .index which is an array of regularly-spaced integers labeling the configurations; this sets the lower value.

  • index_stride (int) – The increment of the .index for each call of the generator.

Return type

the ensemble itself, so that one can do ensemble = GrandCanonical(action).generate(...).
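How the parameters of generate() fit together can be sketched as a plain loop. This is not the library's implementation: generate_chain and RandomWalk are hypothetical names, and the sketch assumes the starting configuration is stored as configuration 0, so the generator is called steps - 1 times.

```python
import random


class RandomWalk:
    """Hypothetical generator: proposes the next configuration from the previous one."""

    def __init__(self, seed=1):
        self.rng = random.Random(seed)

    def step(self, previous):
        return {"x": previous["x"] + self.rng.choice([-1, 1])}


def generate_chain(steps, generator, start, starting_index=0, index_stride=1):
    """Return (index, configurations) for a chain of `steps` configurations."""
    configurations = [start]  # assumption: the start is the zeroth configuration
    for _ in range(1, steps):
        configurations.append(generator.step(configurations[-1]))
    # Regularly spaced integer labels, one per configuration.
    index = [starting_index + n * index_stride for n in range(steps)]
    return index, configurations


cold = {"x": 0}  # the all-zero configuration of this toy model
index, chain = generate_chain(5, RandomWalk(), cold, starting_index=10, index_stride=2)
```

Keeping the index separate from the position in the chain means labels can remain meaningful after configurations are dropped.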

classmethod continue_from(ensemble, steps, progress=<function _no_op>)[source]

Use the last configuration and generator of ensemble to produce a new ensemble of steps configurations.

Parameters
  • ensemble (supervillain.Ensemble or an h5py.Group that encodes such an ensemble) – The ensemble to continue. Raises a ValueError if it is not a supervillain.Ensemble or an h5py.Group with an action, generator, and at least one configuration.

  • steps (int) – Number of configurations to generate.

  • progress – As in generate().

Returns

An ensemble with steps new configurations generated in the same way as ensemble.

Return type

supervillain.Ensemble

Todo

The starting weight should automatically be read in; currently not.

measure(observables=None)[source]

If observables is None, measure every known primary observable on this ensemble. Otherwise measure only those observables named. If an observable is already computed, no new computation occurs.

Parameters

observables (None or iterable of strings naming observables.) – Observables to compute on this ensemble.

Returns

Keys are observable names, values are the measurements.

Return type

dict
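The rule that an already-computed observable is not recomputed amounts to caching measurements by name. A minimal sketch, assuming observables are simple functions of one sample; ToyEnsemble and the observable names are hypothetical and the real class resolves observables differently.

```python
class ToyEnsemble:
    """Hypothetical ensemble that caches measurements by observable name."""

    def __init__(self, samples, observables):
        self.samples = samples
        self.observables = observables  # name -> function of one sample
        self._measurements = {}

    @property
    def measured(self):
        return set(self._measurements)

    def measure(self, observables=None):
        names = list(self.observables) if observables is None else list(observables)
        for name in names:
            if name not in self._measurements:  # compute only what is missing
                function = self.observables[name]
                self._measurements[name] = [function(sample) for sample in self.samples]
        return {name: self._measurements[name] for name in names}


calls = {"square": 0}


def square(x):
    calls["square"] += 1  # count evaluations to make the caching visible
    return x * x


ensemble = ToyEnsemble([1, 2, 3], {"square": square, "double": lambda x: 2 * x})
first = ensemble.measure(["square"])
second = ensemble.measure(["square"])  # served from the cache, no new calls
```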

property measured

A set of strings naming measured observables.

autocorrelation_time(observables=None, every=False)[source]

Compute the autocorrelation time for the ensemble’s measurements. The autocorrelation time of an observable is computed only if that observable’s autocorrelation() is true for this ensemble.

If no measurements have been made, so that measured is empty, try every observable with a true autocorrelation(). This may trigger measurement and is usually what you want: after generation you want to thermalize or decorrelate.

Note

The measurement of some observables, particularly those for which autocorrelation() is false for this ensemble, is not triggered automatically unless it is a prerequisite for an observable for which autocorrelation() is true.

Parameters
  • observables (None or iterable of strings naming observables.) – Which observables to consider. If None, consider all previously-measured observables.

  • every (boolean) – If True, return a dictionary with keys given by observable names and values given by the computed autocorrelation times.
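For orientation, one common estimator of an integrated autocorrelation time is sketched below. The source does not say which estimator the library uses, so both the function name and the choice to truncate the sum at the first non-positive autocorrelation are assumptions.

```python
import random


def integrated_autocorrelation_time(series):
    """Estimate tau = 1/2 + sum of rho(lag), stopping at the first non-positive rho."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return 0.5  # a constant series has no fluctuations to correlate
    tau = 0.5
    for lag in range(1, n):
        covariance = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = covariance / variance  # normalized autocorrelation at this lag
        if rho <= 0:
            break                    # assumption: truncate once correlations die out
        tau += rho
    return tau


rng = random.Random(0)
white = [rng.gauss(0, 1) for _ in range(2000)]  # independent draws
correlated = [0.0]
for _ in range(1999):
    # Each draw remembers 90% of the previous one, so the chain decorrelates slowly.
    correlated.append(0.9 * correlated[-1] + rng.gauss(0, 1))

tau_white = integrated_autocorrelation_time(white)
tau_correlated = integrated_autocorrelation_time(correlated)
```

A larger estimate means more Markov-chain steps separate effectively independent measurements, which is what guides the choice of a stride for decorrelation.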

cut(start)[source]

Good for thermalization.

thermalized = ensemble.cut(start)
Parameters

start (int) – How many configurations to drop from the beginning of the ensemble.

Returns

An ensemble with fewer configurations.

Return type

Ensemble

every(stride)[source]

Good for decorrelation.

The generator is wrapped in KeepEvery so that continue_from() produces a strided follow-on ensemble.

decorrelated = thermalized.every(stride)
Parameters

stride (int) – How many configurations to skip.

Returns

An ensemble with fewer configurations.

Return type

Ensemble
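In terms of the configuration labels, cut() followed by every() behaves like two list slices. The sketch below works on a bare list of indices rather than an ensemble, and it assumes every(stride) keeps the first configuration and then one out of each stride.

```python
def cut(index, start):
    """Drop the first `start` entries (thermalization)."""
    return index[start:]


def every(index, stride):
    """Keep one entry out of each `stride` (decorrelation)."""
    return index[::stride]


index = list(range(20))            # labels of 20 generated configurations
thermalized = cut(index, 5)        # labels 5 through 19
decorrelated = every(thermalized, 3)
```

Here decorrelated holds the labels 5, 8, 11, 14 and 17: cutting first means the stride is counted from the first thermalized configuration.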

plot_history(axes, observable, label=None, histogram_label=None, bins=31, density=True, alpha=0.5, color=None, history_kwargs={'label': None})[source]

See also

Blocking.plot_history.