tokio.connectors.darshan module
Connect to Darshan logs.
This connector provides an interface into Darshan logs created by Darshan 3.0 or
higher and represents the counters and data contained therein as a Python
dictionary. This dictionary has the following structure, where names in
backticks denote literal key names.
- `header`, which contains key-value pairs corresponding to each line in the header. `exe` and `metadata` are lists; the other keys correspond to a single scalar value.
  - `compression`, `end_time`, `end_time_string`, `exe`, etc.
- `counters`
  - modulename, which is `posix`, `lustre`, `stdio`, etc.
    - recordname, which is usually the full path to a file opened by the profiled application, or `_perf` (contains performance summary metrics), or `_total` (contains aggregate file statistics)
      - ranknum, which is a string (`0`, `1`, etc., or `-1`)
        - counternames, which depend on the Darshan module defined by modulename above
- `mounts`, which is the mount table with keys of a path to a mount location and values of the file system type
The counternames are module-specific and have their module name prefix
stripped off. The following counter names are examples of what a Darshan log
may expose through this connector for the `posix` module:
- `BYTES_READ` and `BYTES_WRITTEN` - number of bytes read/written to the file
- `MAX_BYTE_WRITTEN` and `MAX_BYTE_READ` - highest byte written/read; useful if an application re-reads or re-writes a lot of data
- `WRITES` and `READS` - number of write and read ops issued
- `F_WRITE_TIME` and `F_READ_TIME` - amount of time spent inside write and read calls (in seconds)
- `F_META_TIME` - amount of time spent in metadata (i.e., non-read/write) calls
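As a rough illustration of the nesting described above, the following sketch builds a small hand-written dictionary with the same layout and sums one `posix` counter across file records. The file paths, ranks, and counter values are invented, and `sum_counter` is a hypothetical helper that is not part of the connector.

```python
# Hand-written stand-in for a loaded Darshan object; every value is invented.
sample = {
    "header": {"exe": ["/home/user/bin/myjob.exe", "--whatever"]},
    "counters": {
        "posix": {
            "/scratch/out.dat": {
                "-1": {"BYTES_READ": 0, "BYTES_WRITTEN": 4096, "WRITES": 4},
            },
            "/scratch/in.dat": {
                "0": {"BYTES_READ": 1024, "BYTES_WRITTEN": 0, "WRITES": 0},
            },
        },
    },
    "mounts": {"/scratch": "lustre"},
}


def sum_counter(darshan_data, module, counter):
    """Sum one counter over every file record and rank of a module."""
    total = 0
    for record, ranks in darshan_data["counters"].get(module, {}).items():
        if record in ("_perf", "_total"):  # summary records, not files
            continue
        for counters in ranks.values():
            total += counters.get(counter, 0)
    return total


print(sum_counter(sample, "posix", "BYTES_WRITTEN"))
print(sample["mounts"]["/scratch"])
```

Note that rank keys are strings, so a shared record is looked up with `"-1"` rather than the integer `-1`.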
Similarly, the `lustre` module provides the following counter keys:
- `MDTS` - number of MDTs in the underlying file system
- `OSTS` - number of OSTs in the underlying file system
- `OST_ID_0` - the OBD index for the 0th OST over which the file is striped
- `STRIPE_OFFSET` - the setting used to define stripe offset when the file was created
- `STRIPE_SIZE` - the size, in bytes, of each stripe
- `STRIPE_WIDTH` - how many OSTs the file touches
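The `lustre` counters for one record can be used to recover which OSTs a file was striped over. The sketch below assumes that stripes beyond the first appear as `OST_ID_1`, `OST_ID_2`, and so on, which the list above implies but does not state. `ost_ids` is a hypothetical helper and the counter values are invented.

```python
def ost_ids(lustre_counters):
    """Return OBD indices in stripe order from one record's lustre counters."""
    prefix = "OST_ID_"
    indexed = [
        (int(name[len(prefix):]), value)
        for name, value in lustre_counters.items()
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    ]
    # Sort by the numeric suffix so OST_ID_10 follows OST_ID_9.
    return [value for _, value in sorted(indexed)]


# Invented counters for a file striped over two OSTs with 1 MiB stripes.
record = {
    "MDTS": 1,
    "OSTS": 248,
    "STRIPE_OFFSET": -1,
    "STRIPE_SIZE": 1048576,
    "STRIPE_WIDTH": 2,
    "OST_ID_1": 57,
    "OST_ID_0": 12,
}
print(ost_ids(record))
```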
Note
This connector presently relies on `darshan-parser` to convert the binary
logs to ASCII, and then converts the ASCII into Python objects. In the future,
we plan on using the Python API provided by darshan-utils to circumvent the
ASCII translation.
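The binary-to-ASCII step described in this note can be pictured with the sketch below, which runs `darshan-parser` in one of the three modes named in this module and yields its output line by line. It is not the connector's own implementation. `iter_darshan_parser` is a hypothetical helper, and it assumes `darshan-parser` is installed and on `PATH`.

```python
import shutil
import subprocess

VALID_MODES = ("--base", "--perf", "--total")


def iter_darshan_parser(log_path, mode="--base"):
    """Yield lines of ASCII output from darshan-parser for one log file."""
    if mode not in VALID_MODES:
        raise ValueError("unsupported darshan-parser mode: %s" % mode)
    if shutil.which("darshan-parser") is None:
        raise RuntimeError("darshan-parser was not found on PATH")
    # The context manager waits for the subprocess once its output is consumed.
    with subprocess.Popen(["darshan-parser", mode, log_path],
                          stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")
```

Because this is a generator, nothing runs until the first line is requested. The yielded lines could then be handed to line-oriented parsers such as the module-level functions documented below.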
- class tokio.connectors.darshan.Darshan(log_file=None, *args, **kwargs)
  Bases: tokio.connectors.common.SubprocessOutputDict
  - __init__(log_file=None, *args, **kwargs)
    Initialize the object from either a Darshan log or a cache file.
    Configures the object’s internal state to operate on a Darshan log file or a cached JSON representation of a previously processed Darshan log.
    Variables: log_file (str) – Path to the Darshan log file to load
  - __repr__()
    Serialize self into JSON.
    Returns: JSON representation of the object
    Return type: str
  - _load_subprocess_iter(*args)
    Run a subprocess and pass its stdout to a self-initializing parser.
  - _parse_darshan_parser(lines)
    Load values from output of `darshan-parser`.
    Parameters: lines – Any iterable that produces lines of `darshan-parser` output
  - darshan_parser_base(modules=None, counters=None)
    Populate data produced by `darshan-parser --base`.
    Runs `darshan-parser --base` and converts all results into key-value pairs, which are inserted into the object.
    Parameters:
    - modules (list of str) – If specified, only return data from the given Darshan modules
    - counters (list of str) – If specified, only return data for the given counters
    Returns: Dictionary containing all key-value pairs generated by running `darshan-parser --base`. These values are also accessible via the BASE key in the object.
    Return type: dict
  - darshan_parser_perf(modules=None, counters=None)
    Populate data produced by `darshan-parser --perf`.
    Runs `darshan-parser --perf` and converts all results into key-value pairs, which are inserted into the object.
    Parameters:
    - modules (list of str) – If specified, only return data from the given Darshan modules
    - counters (list of str) – If specified, only return data for the given counters
    Returns: Dictionary containing all key-value pairs generated by running `darshan-parser --perf`. These values are also accessible via the PERF key in the object.
    Return type: dict
  - darshan_parser_total(modules=None, counters=None)
    Populate data produced by `darshan-parser --total`.
    Runs `darshan-parser --total` and converts all results into key-value pairs, which are inserted into the object.
    Parameters:
    - modules (list of str) – If specified, only return data from the given Darshan modules
    - counters (list of str) – If specified, only return data for the given counters
    Returns: Dictionary containing all key-value pairs generated by running `darshan-parser --total`. These values are also accessible via the TOTAL key in the object.
    Return type: dict
  - load()
    Load based on initialization state of object.
    Parameters: cache_file (str or None) – The cached input file to load. If not specified, uses whatever self.cache_file is
  - load_str(input_str)
    Load from either a JSON cache or the output of `darshan-parser`.
    Parameters: input_str – Either (1) stdout of the `darshan-parser` command as a string, (2) the JSON-encoded representation of a Darshan object that can be deserialized to initialize self, or (3) an iterator that produces the output of `darshan-parser` line-by-line
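The methods of the `Darshan` class documented above might be combined as in the following sketch. It uses only the constructor and the three `darshan_parser_*` methods listed here, called without filters. `load_darshan_log` is a hypothetical wrapper, and it assumes pytokio is installed and `darshan-parser` is available.

```python
def load_darshan_log(log_path):
    """Load base, total, and perf data from one Darshan log."""
    # Deferred import: requires pytokio, and darshan-parser must be on PATH.
    import tokio.connectors.darshan

    darshan_log = tokio.connectors.darshan.Darshan(log_path)
    darshan_log.darshan_parser_base()
    darshan_log.darshan_parser_total()
    darshan_log.darshan_parser_perf()
    return darshan_log
```

The returned object should then expose the `header`, `counters`, and `mounts` keys described at the top of this page.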
- tokio.connectors.darshan.parse_base_counters(line)
  Parse a counter line from `darshan-parser --base`.
  Parse the line containing an actual counter’s data. It is a tab-delimited line of the form
    module, rank, record_id, counter, value, file_name, mount_pt, fs_type
  Parameters: line (str) – A single line of output from `darshan-parser --base`
  Returns: A tuple containing eight values. If line is not a valid counter line, all values will be None. The returned values are:
  - module name
  - MPI rank
  - record id
  - counter name
  - counter value
  - file name
  - mount point
  - file system type
  Return type: tuple
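To make the eight-field layout concrete, here is a minimal stand-in that splits such a tab-delimited line. It is not the library's implementation. `split_base_counter_line` is a hypothetical name, the sample line is invented, and keeping the rank as a string while converting the value to a number is an assumption.

```python
def split_base_counter_line(line):
    """Split a tab-delimited counter line into its eight fields."""
    fields = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(fields) != 8:
        return (None,) * 8
    module, rank, record_id, counter, value, file_name, mount_pt, fs_type = fields
    try:
        try:
            number = int(value)
        except ValueError:
            number = float(value)  # time-like counters are printed as decimals
    except ValueError:
        return (None,) * 8
    return (module, rank, record_id, counter, number, file_name, mount_pt, fs_type)


# Invented example line in the documented field order.
sample_line = "POSIX\t0\t1234567890\tPOSIX_BYTES_WRITTEN\t4096\t/scratch/out.dat\t/scratch\tlustre"
print(split_base_counter_line(sample_line))
```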
- tokio.connectors.darshan.parse_filename_metadata(filename)
  Extracts metadata from a Darshan log’s file name.
  Parameters: filename (str) – Name of a Darshan log file. Can be basename or a full path.
  Returns: Key-value pairs describing the metadata extracted from the file name.
  Return type: dict
- tokio.connectors.darshan.parse_header(line)
  Parse the header lines of `darshan-parser`.
  Accepts a line that may or may not be a header line as printed by `darshan-parser`. Such header lines take the form:
    # darshan log version: 3.10
    # compression method: ZLIB
    # exe: /home/user/bin/myjob.exe --whatever
    # uid: 69615
  If it is a valid header line, return a key-value pair corresponding to its decoded contents.
  Parameters: line (str) – A single line of output from `darshan-parser`
  Returns: A (key, value) corresponding to the key and value decoded from the header line, or (None, None) if the line does not appear to contain a known header field.
  Return type: tuple
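The header line shape shown above can be matched with a single regular expression, as in this simplified sketch. Unlike the documented function, it accepts any `# key: value` comment rather than only known header fields, and it returns the key text verbatim. `split_header_line` is a hypothetical name.

```python
import re

# "# <key>: <value>", where the key ends at the first colon.
HEADER_LINE = re.compile(r"^#\s*([^:]+):\s+(.*\S)\s*$")


def split_header_line(line):
    """Return (key, value) for a '# key: value' line, else (None, None)."""
    match = HEADER_LINE.match(line)
    if match is None:
        return (None, None)
    return (match.group(1), match.group(2))


for text in ["# darshan log version: 3.10",
             "# exe: /home/user/bin/myjob.exe --whatever",
             "not a header line"]:
    print(split_header_line(text))
```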
- tokio.connectors.darshan.parse_mounts(line)
  Parse a mount table line from `darshan-parser`.
  Accepts a line that may or may not be a mount table entry from `darshan-parser`. Such lines take the form:
    # mount entry: /usr/lib64/libibverbs.so.1.0.0 dvs
  If line is a valid mount table entry, return a key-value representation of its contents.
  Parameters: line (str) – A single line of output from `darshan-parser`
  Returns: A (key, value) corresponding to the mount table entry, or (None, None) if the line is not a valid mount table entry.
  Return type: tuple
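A mount table keyed by mount location, as described for the `mounts` key, could be assembled from such lines roughly as follows. `split_mount_line` is a hypothetical stand-in rather than the library function, and it assumes that neither the mount path nor the file system type contains whitespace.

```python
import re

MOUNT_LINE = re.compile(r"^#\s*mount entry:\s+(\S+)\s+(\S+)\s*$")


def split_mount_line(line):
    """Return (mount_path, fs_type) for a mount entry line, else (None, None)."""
    match = MOUNT_LINE.match(line)
    if match is None:
        return (None, None)
    return (match.group(1), match.group(2))


mounts = {}
for text in ["# mount entry: /usr/lib64/libibverbs.so.1.0.0 dvs", "# uid: 69615"]:
    key, value = split_mount_line(text)
    if key is not None:  # skip lines that are not mount entries
        mounts[key] = value
print(mounts)
```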
- tokio.connectors.darshan.parse_perf_counters(line)
  Parse a counter line from `darshan-parser --perf`.
  Parse a line containing counter data from `darshan-parser --perf`. Such lines look like:
    # total_bytes: 2199023259968
    # unique files: slowest_rank_io_time: 0.000000
    # shared files: time_by_cumul_io_only: 39.992327
    # agg_perf_by_slowest: 28670.996545
  Parameters: line (str) – A single line of output from `darshan-parser --perf`
  Returns: A single (key, value) pair corresponding to the performance metric encoded in line. If line is not a valid performance counter line, (None, None) is returned.
  Return type: tuple
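The example lines above come in two shapes, with and without a leading category such as `unique files:`. One way to handle both, sketched here under the assumption that the category prefix is discarded, is to take the last `name: number` pair on the line. `split_perf_line` is a hypothetical helper; it accepts any comment line that ends in a number, so it is looser than the documented function.

```python
def split_perf_line(line):
    """Return (metric, value) from the last 'name: number' pair of a # line."""
    text = line.strip()
    if not text.startswith("#") or ":" not in text:
        return (None, None)
    left, _, raw_value = text.rpartition(":")
    # Drop the leading '#' and any category prefix such as 'unique files:'.
    key = left.lstrip("#").split(":")[-1].strip()
    try:
        value = float(raw_value)
    except ValueError:
        return (None, None)
    if not key:
        return (None, None)
    return (key, value)


for text in ["# total_bytes: 2199023259968",
             "# unique files: slowest_rank_io_time: 0.000000",
             "# exe: /home/user/bin/myjob.exe --whatever"]:
    print(split_perf_line(text))
```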
- tokio.connectors.darshan.parse_total_counters(line)
  Parse a counter line from `darshan-parser --total`.
  Parse a line containing counter data from `darshan-parser --total`. Such lines are of the form:
    total_MPIIO_F_READ_END_TIMESTAMP: 0.000000
  Parameters: line (str) – A single line of output from `darshan-parser --total`
  Returns: A single (key, value) pair corresponding to a counted metric and its total value. If line is not a valid counter line, (None, None) is returned.
  Return type: tuple
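A line of the form shown above can be split into a counter name and a numeric total as in this sketch. Stripping the `total_` prefix and then separating the module name at the first underscore are assumptions made for illustration, not confirmed behavior of the library. `split_total_line` is a hypothetical name.

```python
import re

TOTAL_LINE = re.compile(r"^total_(\S+):\s+(\S+)\s*$")


def split_total_line(line):
    """Return (name_without_total_prefix, value), else (None, None)."""
    match = TOTAL_LINE.match(line)
    if match is None:
        return (None, None)
    try:
        value = float(match.group(2))
    except ValueError:
        return (None, None)
    return (match.group(1), value)


key, value = split_total_line("total_MPIIO_F_READ_END_TIMESTAMP: 0.000000")
# Assumed convention: the text before the first underscore is the module name.
module, _, counter = key.partition("_")
print(module.lower(), counter, value)
```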