tokio.connectors.darshan module

Connect to Darshan logs.

This connector provides an interface to Darshan logs created by Darshan 3.0 or higher and represents the counters and data they contain as a Python dictionary. The dictionary has the following structure, where header, counters, and mounts are literal key names, and modulename, recordname, ranknum, and counternames are placeholders for keys that vary from log to log.

  • header which contains key-value pairs corresponding to each line in the header. exe and metadata are lists; the other keys correspond to a single scalar value.
    • compression, end_time, end_time_string, exe, etc
  • counters
    • modulename which is posix, lustre, stdio, etc
      • recordname, which is usually the full path to a file opened by the profiled application, or one of the special names _perf (performance summary metrics) or _total (aggregate file statistics)
        • ranknum, which is a string (0, 1, etc., or -1)
          • counternames, which depends on the Darshan module defined by modulename above
  • mounts which is the mount table with keys of a path to a mount location and values of the file system type
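As a rough illustration of the nesting described above, the following sketch walks a hand-built dictionary of the same shape. The paths and values are invented for illustration and do not come from a real Darshan log, and iter_counters is a hypothetical helper, not part of this module.

```python
# Hand-built stand-in for the dictionary described above. Every path and
# value here is invented for illustration only.
darshan_data = {
    "header": {
        "exe": ["/home/user/bin/myjob.exe", "--whatever"],
        "end_time": 1500000000,
    },
    "counters": {
        "posix": {
            "/scratch/output.dat": {
                "0": {"BYTES_WRITTEN": 1048576, "WRITES": 4},
                "1": {"BYTES_WRITTEN": 2097152, "WRITES": 8},
            },
        },
    },
    "mounts": {"/scratch": "lustre"},
}


def iter_counters(data):
    """Yield (module, record, rank, counter, value) for every leaf counter."""
    for module, records in data["counters"].items():
        for record, ranks in records.items():
            for rank, counters in ranks.items():
                for counter, value in counters.items():
                    yield module, record, rank, counter, value


rows = list(iter_counters(darshan_data))
for row in rows:
    print(row)
```

Flattening the hierarchy this way is only one option; indexing directly, as in darshan_data["counters"]["posix"]["/scratch/output.dat"]["0"], follows the same key order.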

The counternames are module-specific and have their module name prefix stripped off. The following counter names are examples of what a Darshan log may expose through this connector for the posix module:

  • BYTES_READ and BYTES_WRITTEN - number of bytes read/written to the file
  • MAX_BYTE_WRITTEN and MAX_BYTE_READ - highest byte written/read; useful if an application re-reads or re-writes a lot of data
  • WRITES and READS - number of write and read ops issued
  • F_WRITE_TIME and F_READ_TIME - amount of time spent inside write and read calls (in seconds)
  • F_META_TIME - amount of time spent in metadata (i.e., non-read/write) calls
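The posix counters listed above can be combined into simple derived metrics. The sketch below is one way to do that for a single rank's counters; write_summary is a hypothetical helper and the input values are invented. Because F_WRITE_TIME measures time spent inside write calls, the resulting rate is bytes per second of write-call time, not wall-clock bandwidth.

```python
def write_summary(counters):
    """Derive mean write size and write rate from one rank's posix counters."""
    bytes_written = counters.get("BYTES_WRITTEN", 0)
    writes = counters.get("WRITES", 0)
    write_time = counters.get("F_WRITE_TIME", 0.0)
    return {
        # Average number of bytes moved per write operation
        "mean_write_size": bytes_written / writes if writes else None,
        # Bytes written per second spent inside write calls
        "write_bytes_per_sec": bytes_written / write_time if write_time else None,
    }


# Invented counter values for illustration only
example = {"BYTES_WRITTEN": 4194304, "WRITES": 4, "F_WRITE_TIME": 2.0}
summary = write_summary(example)
print(summary)
```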

Similarly, the lustre module provides the following counter keys:

  • MDTS - number of MDTs in the underlying file system
  • OSTS - number of OSTs in the underlying file system
  • OST_ID_0 - the OBD index for the 0th OST over which the file is striped
  • STRIPE_OFFSET - the setting used to define stripe offset when the file was created
  • STRIPE_SIZE - the size, in bytes, of each stripe
  • STRIPE_WIDTH - how many OSTs the file touches

Note

This connector currently relies on darshan-parser to convert the binary logs to ASCII and then converts the ASCII into Python objects. In the future, we plan to use the Python API provided by darshan-utils to bypass the ASCII translation.

class tokio.connectors.darshan.Darshan(log_file=None, *args, **kwargs)[source]

Bases: tokio.connectors.common.SubprocessOutputDict

__init__(log_file=None, *args, **kwargs)[source]

Initialize the object from either a Darshan log or a cache file.

Configures the object’s internal state to operate on a Darshan log file or a cached JSON representation of a previously processed Darshan log.

Parameters:
  • log_file (str, optional) – Path to a Darshan log to be processed
  • cache_file (str, optional) – Path to a cached JSON representation of a Darshan log’s contents
  • *args – Passed to tokio.connectors.common.SubprocessOutputDict
  • **kwargs – Passed to tokio.connectors.common.SubprocessOutputDict
Variables:
  • log_file (str) – Path to the Darshan log file to load

__repr__()[source]

Serialize self into JSON.

Returns: JSON representation of the object
Return type: str
_darshan_parser()[source]

Call darshan-parser to initialize values in self

_load_subprocess_iter(*args)[source]

Run a subprocess and pass its stdout to a self-initializing parser

_parse_darshan_parser(lines)[source]

Load values from output of darshan-parser

Parameters: lines – Any iterable that produces lines of darshan-parser output
darshan_parser_base(modules=None, counters=None)[source]

Populate data produced by darshan-parser --base

Runs darshan-parser --base and converts all results into key-value pairs, which are inserted into the object.

Parameters:
  • modules (list of str) – If specified, only return data from the given Darshan modules
  • counters (list of str) – If specified, only return data for the given counters
Returns:

Dictionary containing all key-value pairs generated by running darshan-parser --base. These values are also accessible via the BASE key in the object.

Return type:

dict
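A minimal usage sketch follows, assuming pytokio and the darshan-parser binary are installed, that the Darshan object behaves like the nested dictionary described at the top of this page, and that counter values are numeric. total_bytes_written is a hypothetical helper, not part of this module.

```python
def total_bytes_written(log_path):
    """Sum the posix BYTES_WRITTEN counter over all records and ranks."""
    # Imported lazily so that defining this helper does not require pytokio
    import tokio.connectors.darshan

    darshan_log = tokio.connectors.darshan.Darshan(log_path)
    darshan_log.darshan_parser_base()

    total = 0
    posix_records = darshan_log.get("counters", {}).get("posix", {})
    for record_name, ranks in posix_records.items():
        # Skip the special _perf and _total records; only sum real files
        if record_name.startswith("_"):
            continue
        for rank_counters in ranks.values():
            total += rank_counters.get("BYTES_WRITTEN", 0)
    return total
```

Calling total_bytes_written with the path to a Darshan log would then return a single number under the assumptions above.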

darshan_parser_perf(modules=None, counters=None)[source]

Populate data produced by darshan-parser --perf

Runs darshan-parser --perf and converts all results into key-value pairs, which are inserted into the object.

Parameters:
  • modules (list of str) – If specified, only return data from the given Darshan modules
  • counters (list of str) – If specified, only return data for the given counters
Returns:

Dictionary containing all key-value pairs generated by running darshan-parser --perf. These values are also accessible via the PERF key in the object.

Return type:

dict

darshan_parser_total(modules=None, counters=None)[source]

Populate data produced by darshan-parser --total

Runs darshan-parser --total and converts all results into key-value pairs, which are inserted into the object.

Parameters:
  • modules (list of str) – If specified, only return data from the given Darshan modules
  • counters (list of str) – If specified, only return data for the given counters
Returns:

Dictionary containing all key-value pairs generated by running darshan-parser --total. These values are also accessible via the TOTAL key in the object.

Return type:

dict

load()[source]

Load based on initialization state of object

Parameters: cache_file (str or None) – The cached input file to load. If not specified, self.cache_file is used.
load_str(input_str)[source]

Load from either a json cache or the output of darshan-parser

Parameters: input_str – Either (1) stdout of the darshan-parser command as a string, (2) the JSON-encoded representation of a Darshan object that can be deserialized to initialize self, or (3) an iterator that produces the output of darshan-parser line by line
tokio.connectors.darshan.parse_base_counters(line)[source]

Parse a counter line from darshan-parser --base.

Parse the line containing an actual counter’s data. It is a tab-delimited line of the form

module, rank, record_id, counter, value, file_name, mount_pt, fs_type

Parameters: line (str) – A single line of output from darshan-parser --base
Returns: A tuple containing eight values. If line is not a valid counter line, all values are None. The returned values are:
  1. module name
  2. MPI rank
  3. record id
  4. counter name
  5. counter value
  6. file name
  7. mount point
  8. file system type
Return type: tuple
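The eight-field split described above can be sketched as follows. This is not the library's implementation: split_base_line is a hypothetical stand-in, the sample line is invented, and every field is left as a string because the type conversions applied by parse_base_counters are not documented here.

```python
def split_base_line(line):
    """Split one tab-delimited counter line into its eight fields.

    Returns a tuple of eight None values for comments or malformed lines.
    """
    if line.startswith("#"):
        return (None,) * 8
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        return (None,) * 8
    return tuple(fields)


# Invented example line in the documented field order
sample = "POSIX\t0\t1234567890\tPOSIX_BYTES_WRITTEN\t1048576\t/scratch/output.dat\t/scratch\tlustre"
module, rank, record_id, counter, value, file_name, mount_pt, fs_type = split_base_line(sample)
print(module, counter, value)
```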
tokio.connectors.darshan.parse_filename_metadata(filename)[source]

Extracts metadata from a Darshan log’s file name

Parameters: filename (str) – Name of a Darshan log file. Can be a basename or a full path.
Returns: Key-value pairs describing the metadata extracted from the file name.
Return type: dict
tokio.connectors.darshan.parse_header(line)[source]

Parse the header lines of darshan-parser.

Accepts a line that may or may not be a header line as printed by darshan-parser. Such header lines take the form:

# darshan log version: 3.10
# compression method: ZLIB
# exe: /home/user/bin/myjob.exe --whatever
# uid: 69615

If it is a valid header line, return a key-value pair corresponding to its decoded contents.

Parameters: line (str) – A single line of output from darshan-parser
Returns: A (key, value) tuple decoded from the header line, or (None, None) if the line does not appear to contain a known header field.
Return type: tuple
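A simplified sketch of this kind of header decoding is shown below. split_header_line is a hypothetical stand-in: unlike parse_header, it does not check the key against a list of known header fields and does not rename keys, so it will also accept other comment lines of the form "# key: value".

```python
def split_header_line(line):
    """Split a '# key: value' line into (key, value) strings."""
    if not line.startswith("# "):
        return None, None
    # Split on the first ': ' so that values containing colons stay intact
    key, sep, value = line[2:].strip().partition(": ")
    if not sep:
        return None, None
    return key, value


print(split_header_line("# compression method: ZLIB"))
print(split_header_line("# exe: /home/user/bin/myjob.exe --whatever"))
```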
tokio.connectors.darshan.parse_mounts(line)[source]

Parse a mount table line from darshan-parser.

Accepts a line that may or may not be a mount table entry from darshan-parser. Such lines take the form:

# mount entry:  /usr/lib64/libibverbs.so.1.0.0  dvs

If line is a valid mount table entry, return a key-value representation of its contents.

Parameters: line (str) – A single line of output from darshan-parser
Returns: A (key, value) tuple corresponding to the mount table entry, or (None, None) if the line is not a valid mount table entry.
Return type: tuple
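The mount table decoding can be sketched as follows, assuming the mount point is the key and the file system type is the value, as in the mounts dictionary described at the top of this page. split_mount_line is a hypothetical stand-in for parse_mounts, and it assumes mount points contain no whitespace.

```python
MOUNT_PREFIX = "# mount entry:"


def split_mount_line(line):
    """Return (mount_point, fs_type) for a mount table line."""
    if not line.startswith(MOUNT_PREFIX):
        return None, None
    # Whatever follows the prefix should be exactly two whitespace-separated fields
    fields = line[len(MOUNT_PREFIX):].split()
    if len(fields) != 2:
        return None, None
    mount_point, fs_type = fields
    return mount_point, fs_type


entry = split_mount_line("# mount entry:  /usr/lib64/libibverbs.so.1.0.0  dvs")
print(entry)
```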
tokio.connectors.darshan.parse_perf_counters(line)[source]

Parse a counter line from darshan-parser --perf.

Parse a line containing counter data from darshan-parser --perf. Such lines look like:

# total_bytes: 2199023259968
# unique files: slowest_rank_io_time: 0.000000
# shared files: time_by_cumul_io_only: 39.992327
# agg_perf_by_slowest: 28670.996545
Parameters: line (str) – A single line of output from darshan-parser --perf
Returns: A single (key, value) pair corresponding to the performance metric encoded in line. If line is not a valid performance counter line, (None, None) is returned.
Return type: tuple
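Some of the lines above carry a category prefix such as "unique files:" before the metric name, so one simple approach is to split on the last colon. The sketch below does this; split_perf_line is a hypothetical stand-in, and the key it produces (the full text before the last colon) is an assumption that may differ from the key names parse_perf_counters returns. It also accepts any "# name: number" comment, so it does not distinguish performance lines from numeric header lines.

```python
def split_perf_line(line):
    """Return (name, value) for a '# name: number' line."""
    if not line.startswith("# "):
        return None, None
    # Split on the last ': ' so a prefix like 'unique files:' stays in the name
    name, sep, value = line[2:].strip().rpartition(": ")
    if not sep:
        return None, None
    try:
        return name, float(value)
    except ValueError:
        return None, None


print(split_perf_line("# shared files: time_by_cumul_io_only: 39.992327"))
```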
tokio.connectors.darshan.parse_total_counters(line)[source]

Parse a counter line from darshan-parser --total.

Parse a line containing counter data from darshan-parser --total. Such lines are of the form:

total_MPIIO_F_READ_END_TIMESTAMP: 0.000000
Parameters: line (str) – A single line of output from darshan-parser --total
Returns: A single (key, value) pair corresponding to a counted metric and its total value. If line is not a valid counter line, (None, None) is returned.
Return type: tuple
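A line of this form can be decoded with a single split, as in the sketch below. split_total_line is a hypothetical stand-in; it returns the key exactly as it appears in the line, and whether parse_total_counters strips the total_ or module prefix is not specified here.

```python
def split_total_line(line):
    """Return (name, value) for a 'total_<COUNTER>: number' line."""
    if not line.startswith("total_"):
        return None, None
    name, sep, value = line.strip().partition(": ")
    if not sep:
        return None, None
    try:
        return name, float(value)
    except ValueError:
        return None, None


print(split_total_line("total_MPIIO_F_READ_END_TIMESTAMP: 0.000000"))
```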