tokio.tools.hdf5 module

Retrieve data from TOKIO Time Series files using times as inputs

Provides a mapping between dates and times and a site’s time-indexed repository of TOKIO Time Series HDF5 files.

tokio.tools.hdf5.enumerate_h5lmts(fsname, datetime_start, datetime_end)

Alias for tokio.tools.hdf5.enumerate_hdf5()

tokio.tools.hdf5.enumerate_hdf5(fsname, datetime_start, datetime_end)

Returns all time-indexed HDF5 files falling within a time range

Given a starting and ending datetime, returns the names of all HDF5 files that should contain data falling within that date range (inclusive).

Parameters:
  • fsname (str) – Logical file system name; should match a key within the hdf5_files config item in site.json.
  • datetime_start (datetime.datetime) – Begin including files corresponding to this start date, inclusive.
  • datetime_end (datetime.datetime) – Stop including files with timestamps that follow this end date. Resulting files _will_ include this date.
Returns:

List of strings, each describing a path to an existing HDF5 file that should contain data relevant to the requested start and end dates.

Return type:

list
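
The enumeration described above can be sketched without pytokio itself. This is a minimal illustration, not the library's implementation: it assumes one file per calendar day and a strftime-style path template, and the function name enumerate_daily_files, the must_exist flag, and the example template are all hypothetical.

```python
import datetime
import os


def enumerate_daily_files(template, datetime_start, datetime_end, must_exist=True):
    """List one path per calendar day from datetime_start through datetime_end.

    template is a hypothetical strftime-style path, one file per day.
    Both end points are inclusive, mirroring the documented behavior.
    """
    paths = []
    day = datetime_start.date()
    while day <= datetime_end.date():
        path = day.strftime(template)
        # Only report files that exist unless the caller opts out
        if not must_exist or os.path.isfile(path):
            paths.append(path)
        day += datetime.timedelta(days=1)
    return paths


# Hypothetical template; a real site would take this from its configuration
example_paths = enumerate_daily_files(
    "/data/%Y-%m-%d/fs1.hdf5",
    datetime.datetime(2019, 1, 30, 12, 0, 0),
    datetime.datetime(2019, 2, 1, 6, 0, 0),
    must_exist=False,
)
```

With must_exist=False the sketch returns one path for each of the three days touched by the range, whether or not the files are present on disk.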

tokio.tools.hdf5.get_dataframe_from_time_range(fsname, dataset_name, datetime_start, datetime_end, fix_errors=False)

Returns all TOKIO Time Series data within a time range as a DataFrame.

Given a time range,

  1. Find all TOKIO Time Series HDF5 files that exist and overlap with that time range
  2. Open each and load all data that falls within the given time range
  3. Convert loaded data into a single, time-indexed DataFrame
Parameters:
  • fsname (str) – Name of file system whose data should be retrieved. Should exist as a key within tokio.config.CONFIG['hdf5_files']
  • dataset_name (str) – Dataset within each matching HDF5 file to load
  • datetime_start (datetime.datetime) – Lower bound of time range to load, inclusive
  • datetime_end (datetime.datetime) – Upper bound of time range to load, exclusive
  • fix_errors (bool) – Replace negative values with -0.0. Necessary if any HDF5 files contain negative values as a result of being archived with a buggy version of pytokio.
Returns:

DataFrame indexed in time and whose columns correspond to those in the given dataset_name.

Return type:

pandas.DataFrame
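
Steps 2 and 3 above, together with the fix_errors option, can be illustrated with a plain-Python stand-in. This sketch is an assumption-laden simplification: it uses lists of (timestamp, value) tuples in place of HDF5 datasets and a DataFrame, and the name merge_time_range is hypothetical rather than part of pytokio.

```python
import datetime


def merge_time_range(chunks, datetime_start, datetime_end, fix_errors=False):
    """Merge per-file (timestamp, value) rows into one time-ordered list.

    chunks holds one list of rows per file (a stand-in for HDF5 datasets).
    Rows are kept when datetime_start <= timestamp < datetime_end.
    """
    merged = []
    for rows in chunks:
        for timestamp, value in rows:
            if datetime_start <= timestamp < datetime_end:
                # Mirror the documented fix_errors behavior on negative values
                if fix_errors and value < 0:
                    value = -0.0
                merged.append((timestamp, value))
    # A single result ordered by time, analogous to a time-indexed DataFrame
    merged.sort(key=lambda row: row[0])
    return merged


day1 = [(datetime.datetime(2019, 1, 30, 23, 59, 55), 4.0)]
day2 = [
    (datetime.datetime(2019, 1, 31, 0, 0, 0), -3.0),
    (datetime.datetime(2019, 1, 31, 0, 0, 5), 7.0),
]
merged_rows = merge_time_range(
    [day2, day1],
    datetime.datetime(2019, 1, 30, 23, 59, 55),
    datetime.datetime(2019, 1, 31, 0, 0, 5),
    fix_errors=True,
)
```

The lower bound is inclusive and the upper bound exclusive, so the row stamped exactly at datetime_end is dropped while the row stamped exactly at datetime_start is kept.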

tokio.tools.hdf5.get_files_and_indices(fsname, dataset_name, datetime_start, datetime_end)

Retrieve filenames and indices within files corresponding to a date range

Given a logical file system name and a dataset within that file system’s TOKIO Time Series files, return a list of all file names and the indices within those files that fall within the specified date range.

Parameters:
  • fsname (str) – Logical file system name; should match a key within the hdf5_files config item in site.json.
  • dataset_name (str) – Name of a TOKIO Time Series dataset
  • datetime_start (datetime.datetime) – Begin including files corresponding to this start date, inclusive.
  • datetime_end (datetime.datetime) – Stop including files with timestamps that follow this end date. Resulting files _will_ include this date.
Returns:

List of three-item tuples of types (str, int, int), where

  • element 0 is the path to an existing HDF5 file
  • element 1 is the first index (inclusive) of dataset_name within that file containing data that falls within the specified date range
  • element 2 is the last index (exclusive) of dataset_name within that file containing data that falls within the specified date range

Return type:

list
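
To make the inclusive first index and exclusive last index concrete, the following sketch computes them for a single file. It assumes rows are evenly spaced from a known file start time; the function name index_range and its parameters are hypothetical, and the way pytokio itself derives these indices may differ.

```python
import datetime


def index_range(file_start, timestep, num_rows, datetime_start, datetime_end):
    """Return (first, last) row indices for rows inside the time range.

    Row i is assumed to carry the timestamp file_start + i * timestep.
    first is inclusive and last is exclusive, so rows[first:last] selects
    every row with datetime_start <= timestamp < datetime_end.
    """
    # Ceiling division on timedeltas: -(-a // b) expressed as -((-a) // b)
    first = -((file_start - datetime_start) // timestep)
    last = -((file_start - datetime_end) // timestep)
    # Clamp to the rows that actually exist in this file
    first = max(0, min(first, num_rows))
    last = max(0, min(last, num_rows))
    return first, last


file_start = datetime.datetime(2019, 1, 30)
timestep = datetime.timedelta(seconds=5)  # assumed spacing between rows
num_rows = 17280  # one day of rows at 5-second spacing

inside = index_range(
    file_start, timestep, num_rows,
    datetime.datetime(2019, 1, 30, 0, 0, 10),
    datetime.datetime(2019, 1, 30, 0, 1, 0),
)
spanning = index_range(
    file_start, timestep, num_rows,
    datetime.datetime(2019, 1, 29),
    datetime.datetime(2019, 1, 31, 12, 0, 0),
)
```

Because the last index is exclusive, each returned pair can be used directly as a Python slice, and a range that extends past either end of the file is clamped to the rows that file holds.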