tokio.connectors.darshan module

Connect to Darshan logs.

This connector provides an interface to Darshan logs created by Darshan 3.0 or higher and represents the counters and data they contain as a Python dictionary. The dictionary has the following structure, in which header, counters, and mounts are literal key names, while modulename, recordname, ranknum, and counternames stand for values that vary from log to log.

- header, which contains key-value pairs corresponding to each line in the header. exe and metadata are lists; every other key holds a single scalar value.
  - compression, end_time, end_time_string, exe, etc.
- counters
  - modulename, which is posix, lustre, stdio, etc.
    - recordname, which is usually the full path to a file opened by the profiled application, or _perf (contains performance summary metrics), or _total (contains aggregate file statistics)
      - ranknum, which is a string (0, 1, etc., or -1)
        - counternames, which depend on the Darshan module defined by modulename above
- mounts, which is the mount table; its keys are paths to mount locations and its values are file system types
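The nesting described above can be illustrated with a small hand-written dictionary. This is only a sketch: the file path, counter values, and the helper name iter_counters are hypothetical and come from no real log, and the exact contents of header in a real log will differ.

```python
# Hypothetical data shaped like the structure described above.
# All values are invented for illustration.
darshan_data = {
    "header": {
        "compression": "ZLIB",
        "exe": ["/home/user/bin/myjob.exe --whatever"],
    },
    "counters": {
        "posix": {
            "/scratch/output.dat": {
                "0": {"BYTES_WRITTEN": 1048576, "WRITES": 16},
                "1": {"BYTES_WRITTEN": 2097152, "WRITES": 32},
            },
        },
    },
    "mounts": {"/scratch": "lustre"},
}


def iter_counters(data):
    """Yield (module, record, rank, counter, value) for every counter."""
    for module, records in data["counters"].items():
        for record, ranks in records.items():
            for rank, counters in ranks.items():
                for counter, value in counters.items():
                    yield module, record, rank, counter, value


rows = list(iter_counters(darshan_data))

# Sum one counter across every module, record, and rank.
total_written = sum(v for _, _, _, c, v in rows if c == "BYTES_WRITTEN")
print(total_written)
```

Flattening the four levels into rows like this makes it straightforward to filter by module or counter name without writing nested loops each time.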

The counternames are module-specific and have their module name prefix stripped off. The following counter names are examples of what a Darshan log may expose through this connector for the posix module:

- BYTES_READ and BYTES_WRITTEN - number of bytes read/written to the file
- MAX_BYTE_WRITTEN and MAX_BYTE_READ - highest byte written/read; useful if an application re-reads or re-writes a lot of data
- WRITES and READS - number of write and read ops issued
- F_WRITE_TIME and F_READ_TIME - amount of time spent inside write and read calls (in seconds)
- F_META_TIME - amount of time spent in metadata (i.e., non-read/write) calls
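As a rough illustration of how these posix counters combine, the sketch below divides bytes moved by the time spent inside the corresponding calls. The function name read_write_rates and the sample values are hypothetical, and the result is only a per-record estimate, not a measure of application bandwidth.

```python
def read_write_rates(counters):
    """Return (read_rate, write_rate) in bytes per second spent in I/O calls.

    A rate is None when the corresponding time counter is zero or missing.
    """
    def rate(nbytes, seconds):
        return nbytes / seconds if seconds else None

    return (
        rate(counters.get("BYTES_READ", 0), counters.get("F_READ_TIME", 0.0)),
        rate(counters.get("BYTES_WRITTEN", 0), counters.get("F_WRITE_TIME", 0.0)),
    )


# Invented counter values for one record and one rank.
sample = {
    "BYTES_READ": 0,
    "BYTES_WRITTEN": 4194304,
    "READS": 0,
    "WRITES": 4,
    "F_READ_TIME": 0.0,
    "F_WRITE_TIME": 2.0,
    "F_META_TIME": 0.5,
}
read_rate, write_rate = read_write_rates(sample)
```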

Similarly, the lustre module provides the following counter keys:

- MDTS - number of MDTs in the underlying file system
- OSTS - number of OSTs in the underlying file system
- OST_ID_0 - the OBD index for the 0th OST over which the file is striped
- STRIPE_OFFSET - the setting used to define stripe offset when the file was created
- STRIPE_SIZE - the size, in bytes, of each stripe
- STRIPE_WIDTH - how many OSTs the file touches
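Assuming that a file striped over several OSTs exposes one OST_ID_n key per stripe (only OST_ID_0 is named above), the OSTs a file touches could be collected as follows. The helper name ost_ids and the sample values are hypothetical.

```python
import re


def ost_ids(lustre_counters):
    """Return the OBD indices from OST_ID_<n> counters, ordered by <n>."""
    found = []
    for name, value in lustre_counters.items():
        match = re.fullmatch(r"OST_ID_(\d+)", name)
        if match:
            found.append((int(match.group(1)), value))
    return [value for _, value in sorted(found)]


# Invented lustre counters for a single file record.
sample_lustre = {
    "MDTS": 1,
    "OSTS": 248,
    "STRIPE_OFFSET": -1,
    "STRIPE_SIZE": 1048576,
    "STRIPE_WIDTH": 2,
    "OST_ID_1": 42,
    "OST_ID_0": 17,
}
touched = ost_ids(sample_lustre)
```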

Note

This connector presently relies on darshan-parser to convert the binary logs to ASCII and then converts the ASCII into Python objects. In the future, we plan to use the Python API provided by darshan-utils to avoid the ASCII translation.

class tokio.connectors.darshan.Darshan(log_file=None, *args, **kwargs)

Bases: tokio.connectors.common.SubprocessOutputDict

__init__(log_file=None, *args, **kwargs)

Initialize the object from either a Darshan log or a cache file.

Configures the object’s internal state to operate on a Darshan log file or a cached JSON representation of a previously processed Darshan log.

Parameters:
- log_file (str, optional) – Path to a Darshan log to be processed
- cache_file (str, optional) – Path to a cached copy of a Darshan log’s contents
- args – Passed to tokio.connectors.common.SubprocessOutputDict
- kwargs – Passed to tokio.connectors.common.SubprocessOutputDict
Variables:	log_file (str) – Path to the Darshan log file to load

__repr__()

Serialize self into JSON.

Returns:	JSON representation of the object
Return type:	str

_darshan_parser(): Call darshan-parser to initialize values in self

_load_subprocess_iter(*args): Run a subprocess and pass its stdout to a self-initializing parser

_parse_darshan_parser(lines)

Load values from output of darshan-parser

Parameters:	lines – Any iterable that produces lines of darshan-parser output

darshan_parser_base(modules=None, counters=None)

Populate data produced by darshan-parser --base

Runs darshan-parser --base and converts all results into key-value pairs, which are inserted into the object.

Parameters:
- modules (list of str) – If specified, only return data from the given Darshan modules
- counters (list of str) – If specified, only return data for the given counters
Returns:	Dictionary containing all key-value pairs generated by running `darshan-parser --base`. These values are also accessible via the BASE key in the object.
Return type:	dict

darshan_parser_perf(modules=None, counters=None)

Populate data produced by darshan-parser --perf

Runs darshan-parser --perf and converts all results into key-value pairs, which are inserted into the object.

Parameters:
- modules (list of str) – If specified, only return data from the given Darshan modules
- counters (list of str) – If specified, only return data for the given counters
Returns:	Dictionary containing all key-value pairs generated by running `darshan-parser --perf`. These values are also accessible via the PERF key in the object.
Return type:	dict

darshan_parser_total(modules=None, counters=None)

Populate data produced by darshan-parser --total

Runs darshan-parser --total and converts all results into key-value pairs, which are inserted into the object.

Parameters:
- modules (list of str) – If specified, only return data from the given Darshan modules
- counters (list of str) – If specified, only return data for the given counters
Returns:	Dictionary containing all key-value pairs generated by running `darshan-parser --total`. These values are also accessible via the TOTAL key in the object.
Return type:	dict

load()

Load based on initialization state of object

Parameters:	cache_file (str or None) – The cached input file to load. If not specified, uses whatever self.cache_file is

load_str(input_str)

Load from either a JSON cache or the output of darshan-parser

Parameters:	input_str – Either (1) stdout of the darshan-parser command as a string, (2) the JSON-encoded representation of a Darshan object that can be deserialized to initialize self, or (3) an iterator that produces the output of darshan-parser line-by-line
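One way to tell the three accepted inputs apart is sketched below. This is an illustration only, not the connector's actual logic: the function name sketch_load_str and its return convention are hypothetical.

```python
import json


def sketch_load_str(input_str):
    """Classify input as cached JSON or darshan-parser output.

    Returns ("json", dict) for a JSON-encoded dictionary, or
    ("parser", list_of_lines) for anything else.
    """
    if isinstance(input_str, str):
        try:
            loaded = json.loads(input_str)
            if isinstance(loaded, dict):
                return "json", loaded
        except ValueError:
            # Not JSON; fall through and treat it as parser output.
            pass
        return "parser", input_str.splitlines()
    # Any other iterable is assumed to yield parser output line by line.
    return "parser", list(input_str)


kind_a, data_a = sketch_load_str('{"header": {"uid": "69615"}}')
kind_b, data_b = sketch_load_str("# uid: 69615\n# compression method: ZLIB")
kind_c, data_c = sketch_load_str(iter(["# uid: 69615"]))
```

Trying JSON first keeps the common cached case cheap, since darshan-parser output begins with comment lines that a JSON decoder rejects immediately.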

tokio.connectors.darshan.parse_base_counters(line)

Parse a counter line from darshan-parser --base.

Parse a line containing an actual counter’s data. It is a tab-delimited line of the form:

module, rank, record_id, counter, value, file_name, mount_pt, fs_type

Parameters:	line (str) – A single line of output from `darshan-parser --base`
Returns:	A tuple of eight values. If line is not a valid counter line, all values are None. The returned values are:
- module name
- MPI rank
- record id
- counter name
- counter value
- file name
- mount point
- file system type
Return type:	tuple
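A minimal sketch of this kind of parsing, written independently of the connector's own code, might look like the following. The function name sketch_parse_base_line, the sample line, and the choice to convert the rank and value to numbers are all assumptions made for illustration.

```python
def _number(text):
    """Convert text to int when possible, otherwise to float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def sketch_parse_base_line(line):
    """Split one tab-delimited counter line into its eight fields."""
    invalid = (None,) * 8
    fields = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(fields) != 8:
        return invalid
    module, rank, record_id, counter, value, file_name, mount_pt, fs_type = fields
    try:
        rank = int(rank)
        value = _number(value)
    except ValueError:
        return invalid
    return module, rank, record_id, counter, value, file_name, mount_pt, fs_type


# Invented line in the eight-field form shown above.
sample_line = "\t".join([
    "POSIX", "0", "6301063301082038805", "POSIX_BYTES_WRITTEN",
    "1048576", "/scratch/output.dat", "/scratch", "lustre",
])
parsed = sketch_parse_base_line(sample_line)
```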

tokio.connectors.darshan.parse_filename_metadata(filename)

Extracts metadata from a Darshan log’s file name

Parameters:	filename (str) – Name of a Darshan log file. Can be basename or a full path.
Returns:	key-value pairs describing the metadata extracted from the file name.
Return type:	dict

tokio.connectors.darshan.parse_header(line)

Parse the header lines of darshan-parser.

Accepts a line that may or may not be a header line as printed by darshan-parser. Such header lines take the form:

# darshan log version: 3.10
# compression method: ZLIB
# exe: /home/user/bin/myjob.exe --whatever
# uid: 69615

If it is a valid header line, return a key-value pair corresponding to its decoded contents.

Parameters:	line (str) – A single line of output from `darshan-parser`
Returns:	A (key, value) tuple decoded from the header line, or `(None, None)` if the line does not appear to contain a known header field.
Return type:	tuple
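The behavior described above can be approximated with a regular expression and a table of recognized labels. In this sketch the function name sketch_parse_header and the mapping from labels to keys are hypothetical; the key names used by the real connector may differ.

```python
import re

# Hypothetical mapping from header labels to dictionary keys.
KNOWN_HEADER_FIELDS = {
    "darshan log version": "version",
    "compression method": "compression",
    "exe": "exe",
    "uid": "uid",
}

# "# <label>: <value>", with the label ending at the first colon.
HEADER_LINE = re.compile(r"^#\s*([^:]+?)\s*:\s*(.*?)\s*$")


def sketch_parse_header(line):
    """Return (key, value) for a recognized header line, else (None, None)."""
    match = HEADER_LINE.match(line)
    if match is None:
        return None, None
    key = KNOWN_HEADER_FIELDS.get(match.group(1))
    if key is None:
        return None, None
    return key, match.group(2)


version = sketch_parse_header("# darshan log version: 3.10")
exe = sketch_parse_header("# exe: /home/user/bin/myjob.exe --whatever")
```

Restricting matches to a table of known labels is what lets other comment lines, such as mount table entries, fall through as `(None, None)`.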

tokio.connectors.darshan.parse_mounts(line)

Parse a mount table line from darshan-parser.

Accepts a line that may or may not be a mount table entry from darshan-parser. Such lines take the form:

# mount entry:  /usr/lib64/libibverbs.so.1.0.0  dvs

If line is a valid mount table entry, return a key-value representation of its contents.

Parameters:	line (str) – A single line of output from `darshan-parser`
Returns:	A (key, value) tuple corresponding to the mount table entry, or `(None, None)` if the line is not a valid mount table entry.
Return type:	tuple
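A minimal sketch of mount-entry parsing follows, assuming that neither the mount path nor the file system type contains whitespace. The function name sketch_parse_mount is hypothetical, and the key/value order follows the mounts description earlier in this page (path as key, file system type as value).

```python
import re

# "# mount entry:  <path>  <fs type>"
MOUNT_LINE = re.compile(r"^#\s*mount entry:\s+(\S+)\s+(\S+)\s*$")


def sketch_parse_mount(line):
    """Return (mount_path, fs_type) for a mount entry, else (None, None)."""
    match = MOUNT_LINE.match(line)
    if match is None:
        return None, None
    return match.group(1), match.group(2)


lines = [
    "# uid: 69615",
    "# mount entry:  /usr/lib64/libibverbs.so.1.0.0  dvs",
]

# Build a mount table, skipping lines that are not mount entries.
mounts = {}
for text in lines:
    path, fs_type = sketch_parse_mount(text)
    if path is not None:
        mounts[path] = fs_type
```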

tokio.connectors.darshan.parse_perf_counters(line)

Parse a counter line from darshan-parser --perf.

Parse a line containing counter data from darshan-parser --perf. Such lines look like:

# total_bytes: 2199023259968
# unique files: slowest_rank_io_time: 0.000000
# shared files: time_by_cumul_io_only: 39.992327
# agg_perf_by_slowest: 28670.996545

Parameters:	line (str) – A single line of output from `darshan-parser --perf`
Returns:	Returns a single (key, value) pair corresponding to the performance metric encoded in line. If line is not a valid performance counter line, `(None, None)` is returned.
Return type:	tuple
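The two line shapes shown above (with and without a "unique files" or "shared files" prefix) could be handled by a single pattern, as sketched here. The function name sketch_parse_perf is hypothetical, and folding the prefix into the key as unique_files_... or shared_files_... is an assumption of this sketch, not necessarily what the connector does.

```python
import re

# Optional "<unique|shared> files:" prefix, then "<name>: <number>".
PERF_LINE = re.compile(
    r"^#\s*(?:(unique|shared) files:\s*)?(\w+):\s*(\d+(?:\.\d+)?)\s*$"
)


def sketch_parse_perf(line):
    """Return (key, value) for a performance line, else (None, None)."""
    match = PERF_LINE.match(line)
    if match is None:
        return None, None
    category, name, text = match.groups()
    key = f"{category}_files_{name}" if category else name
    value = float(text) if "." in text else int(text)
    return key, value


total_bytes = sketch_parse_perf("# total_bytes: 2199023259968")
slowest = sketch_parse_perf("# unique files: slowest_rank_io_time: 0.000000")
```

Note that this pattern accepts any "# name: number" line, so it is only meaningful when applied to `darshan-parser --perf` output.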

tokio.connectors.darshan.parse_total_counters(line)

Parse a counter line from darshan-parser --total.

Parse a line containing counter data from darshan-parser --total. Such lines are of the form:

total_MPIIO_F_READ_END_TIMESTAMP: 0.000000

Parameters:	line (str) – A single line of output from `darshan-parser --total`
Returns:	A single (key, value) pair corresponding to a counted metric and its total value. If line is not a valid counter line, `(None, None)` is returned.
Return type:	tuple
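A line of this form can be matched with a short pattern, as in the sketch below. The function name sketch_parse_total is hypothetical, and dropping the leading total_ from the key is an assumption made for illustration; the connector may keep or transform the name differently.

```python
import re

# "total_<NAME>: <number>"
TOTAL_LINE = re.compile(r"^total_(\w+):\s*(-?\d+(?:\.\d+)?)\s*$")


def sketch_parse_total(line):
    """Return (name, value) for a total counter line, else (None, None)."""
    match = TOTAL_LINE.match(line)
    if match is None:
        return None, None
    name, text = match.groups()
    value = float(text) if "." in text else int(text)
    return name, value


end_time = sketch_parse_total("total_MPIIO_F_READ_END_TIMESTAMP: 0.000000")
```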