h5

pysammos.data_write.h5 package

Subpackage for writing data computed by coarse graining to HDF5 files.

Writer module

pysammos.data_write.h5.writer module

This module provides a manager class for reading and writing HDF5 files that contain array data and for converting them to xarray Datasets. It supports adding grid point positions and phase labels, and saving data at specific indices with custom dimension values.

The HDF5 file can be loaded as an xarray Dataset with appropriate coordinates and dimension names.

The module handles scalar, vector, and tensor data, and ensures unique dimension names for each variable.

It is intended for managing simulation data in a structured format: large datasets are stored in and retrieved from HDF5, and xarray provides the interface for data analysis.

This module contains the following class:
  1. H5XarrayManager:

This class manages an HDF5 file and provides methods to add positions and phases and to update datasets with new data. It also converts the HDF5 file into an xarray Dataset, handling various data shapes and assigning appropriate dimension names. The manager is initialized with the path to the HDF5 file.

Methods:
  • add_positions(): Adds grid point positions to the HDF5 file.

  • add_phases(): Adds phase labels to the HDF5 file.

  • update_h5py_file(): Saves a single step of data to the HDF5 file at a specific index with a custom dimension value.

  • h5_to_xarray(): Loads the HDF5 file as an xarray Dataset, handling various data shapes and ensuring appropriate dimension names.

class pysammos.data_write.h5.writer.H5XarrayManager(filename)

Bases: object

Manager for reading and writing HDF5 files with array data and converting to xarray Datasets.

Inputs

filename : str

Path to the HDF5 file to read/write.

Examples

>>> manager = H5XarrayManager("data.h5")
>>> manager.add_positions(positions_array)
>>> manager.add_phases(phase_labels)
>>> manager.update_h5py_file(data_dict, dim_index=0, dim_value=0.0)
>>> ds = manager.h5_to_xarray()
>>> print(ds)

HDF5 file structure:
    - positions: (n_points, 3) array of grid point positions
    - phases: (n_phases,) array of phase labels
    - data variables: (time, point, [phase], ...) arrays of data
    - time: (n_times,) array of time values

add_phases(phase_labels)

Add phase labels to the HDF5 file if not already present.

Inputs

phase_labels : list or array-like of str

List of phase label names.

add_positions(positions)

Add grid point positions to the HDF5 file if not already present.

Inputs

positions : array-like, shape (n_points, 3)

Array of grid point positions.
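
The "if not already present" behaviour described for add_positions() and add_phases() can be pictured with plain h5py. The following is a minimal sketch under assumptions, not the pysammos implementation: the helper name add_if_missing, the dataset name "positions", and the temporary file are all hypothetical and chosen only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np


def add_if_missing(path, name, data):
    """Create dataset `name` in the file at `path` unless it already exists.

    Hypothetical helper: returns True if the dataset was written and
    False if an existing dataset was left untouched.
    """
    with h5py.File(path, "a") as f:  # "a": read/write, create file if absent
        if name in f:
            return False
        f.create_dataset(name, data=np.asarray(data))
        return True


path = os.path.join(tempfile.mkdtemp(), "demo.h5")
positions = np.zeros((4, 3))  # (n_points, 3)

first = add_if_missing(path, "positions", positions)         # writes the data
second = add_if_missing(path, "positions", positions + 1.0)  # skipped

with h5py.File(path, "r") as f:
    stored = f["positions"][...]  # still the array from the first call
```

With this pattern a second call is harmless, so the positions can be registered from every run of a script without overwriting what is already on disk.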

h5_to_xarray(dim_name='time')

Load the HDF5 file as an xarray Dataset.

Inputs

dim_name : str, optional

Name of the main dimension (default is "time").

Outputs

ds : xarray.Dataset

The loaded dataset with appropriate coordinates and dimension names:

  • For 2x2 tensors, trailing dimensions are named 'dim1_2D', 'dim2_2D'.

  • For 3x3 tensors, trailing dimensions are named 'dim1_3D', 'dim2_3D'.

  • For other shapes, unique dimension names are generated per variable.
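
The naming rules above can be sketched as a small pure-Python function. This is an illustration only: the function name trailing_dim_names is hypothetical, and the pattern used for "other shapes" (variable name plus a running index) is an assumption, since the documentation only states that the names are unique per variable.

```python
def trailing_dim_names(var_name, trailing_shape):
    """Return illustrative names for the trailing dimensions of a variable.

    Hypothetical helper mirroring the documented rules; it is not the
    code used by h5_to_xarray().
    """
    trailing_shape = tuple(trailing_shape)
    if trailing_shape == (2, 2):
        return ("dim1_2D", "dim2_2D")
    if trailing_shape == (3, 3):
        return ("dim1_3D", "dim2_3D")
    # Any other shape: prefix with the variable name so that names cannot
    # clash between variables (the exact pattern is an assumption).
    return tuple(
        f"{var_name}_dim{i}" for i in range(1, len(trailing_shape) + 1)
    )


stress_dims = trailing_dim_names("stress", (3, 3))
velocity_dims = trailing_dim_names("velocity", (3,))
density_dims = trailing_dim_names("density", ())
```

Sharing names for the 2x2 and 3x3 cases lets all tensors of the same size align on common dimensions in the Dataset, while per-variable names keep arrays of unrelated shape from colliding.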

update_h5py_file(data_dict, dim_index, dim_value, dim_name='time')

Save a single step of data to an HDF5 file at a specific index, with a custom dimension value.

Inputs

data_dict : dict

Dictionary where keys are variable names and values are arrays to store.

dim_index : int

The index in the main dimension (e.g., time) to write to.

dim_value : float or int or str

The value for the main dimension at this index (e.g., the time value).

dim_name : str, optional

The name of the main dimension (default is "time").
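
One common way to store a single step at a given index is an HDF5 dataset whose leading axis is resizable. The sketch below shows that technique with plain h5py. It is an assumption about how such a writer could work, not the actual update_h5py_file() code: the function write_step is hypothetical, it handles only numeric values (the documented dim_value may also be a str), and it stores the dimension value in a dataset named after dim_name.

```python
import os
import tempfile

import h5py
import numpy as np


def write_step(path, data_dict, dim_index, dim_value, dim_name="time"):
    """Write one step of each variable at `dim_index` along the leading axis.

    Hypothetical helper for illustration; numeric data only.
    """
    with h5py.File(path, "a") as f:
        items = dict(data_dict)
        items[dim_name] = dim_value  # store the coordinate like a variable
        for name, value in items.items():
            value = np.asarray(value, dtype=float)
            if name not in f:
                # Unlimited leading axis so later steps can be added.
                f.create_dataset(
                    name,
                    shape=(dim_index + 1,) + value.shape,
                    maxshape=(None,) + value.shape,
                    dtype="f8",
                    fillvalue=np.nan,
                )
            dset = f[name]
            if dset.shape[0] <= dim_index:
                dset.resize(dim_index + 1, axis=0)  # grow to fit the index
            dset[dim_index] = value


path = os.path.join(tempfile.mkdtemp(), "steps.h5")
stress = np.ones((4, 3, 3))  # (point, 3, 3) tensor field at one step
write_step(path, {"stress": stress}, dim_index=0, dim_value=0.0)
write_step(path, {"stress": 2.0 * stress}, dim_index=1, dim_value=0.5)

with h5py.File(path, "r") as f:
    stress_shape = f["stress"].shape
    second_step = f["stress"][1]
    times = f["time"][...].tolist()
```

Writing by explicit index rather than appending means a step can be rewritten in place, for example when a simulation is restarted from a checkpoint; indices that are skipped remain filled with NaN in this sketch.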