|Tests Parallel HDF5 performance.|
<pre><code class="language-bash">h5perf [-h | --help]</code></pre>
h5perf is a tool for testing the performance of the Parallel HDF5 library. The tool can perform testing with 1-dimensional and 2-dimensional buffers and datasets. For details regarding data organization and access, see the “h5perf User Guide.”
The following environment variables affect h5perf behavior:
|If set, h5perf does not remove data files. (Default: Data files are removed.)|
|Must be set to a string containing a list of semicolon-separated key=value pairs for the MPI Info object.|
|Sets the prefix for parallel output data files.|
Options and Parameters:
|These terms are used as follows in this section:|
|file ||A filename|
|size||A size specifier, expressed as an integer greater than or equal to 0 (zero) followed by a size indicator: |
K for kilobytes (1024 bytes)
M for megabytes (1048576 bytes)
G for gigabytes (1073741824 bytes)
Example: 37M specifies 37 megabytes, or 38797312 bytes.
|N||An integer greater than or equal to 0 (zero)|
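The size specifier described above can be illustrated with a small converter. This is only a sketch of the rule as stated in this section, not h5perf's own parser: the function name `parse_size` is hypothetical, and it assumes an uppercase indicator is always present, whereas the real tool may be more lenient.

```python
import re

# Multipliers for the size indicators listed in this section.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(spec: str) -> int:
    """Convert a size specifier such as '37M' into a byte count.

    Assumption: a non-negative integer followed by exactly one of K, M, G.
    """
    match = re.fullmatch(r"(\d+)([KMG])", spec)
    if match is None:
        raise ValueError(f"not a valid size specifier: {spec!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


print(parse_size("37M"))  # 38797312
```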
| ||Prints a usage message and exits.|
| ||Specifies the alignment of objects in the HDF5 file. |
Specifies which APIs to test. api_list is a comma-separated list with the following valid values:
(Default: All APIs)
Example: --api=mpiio,phdf5 specifies that the MPI I/O and Parallel HDF5 APIs are to be tested.
| ||Controls the block size within the transfer buffer. |
(Default: Half the number of bytes per process per dataset)
Block size versus transfer buffer size:
The transfer buffer size is the size of a buffer in memory. The data in that buffer is broken into pieces of block size and written to the file.
Transfer buffer size is discussed below with the --min-xfer-size and --max-xfer-size options.
The pattern in which the blocks are written to the file is described in the discussion of the --interleaved option.
| ||Creates HDF5 datasets in chunked layout. |
| ||Uses collective I/O for the MPI I/O and Parallel HDF5 APIs. |
(Default: Off, i.e., independent I/O)
If this option is set and the MPI I/O and Parallel HDF5 APIs are in use, all the blocks of every process will be written at once with an MPI derived type.
| ||Sets the number of datasets per file. |
Sets the debugging level. debug_flags is a comma-separated list of debugging flags with the following valid values:
|Moderate debugging (“not quite everything”)|
|Extensive debugging (“everything”)|
|All possible debugging (“the kitchen sink”)|
|Raw data I/O throughput information|
|Times, in addition to throughputs|
|Verify data correctness|
(Default: No debugging)
Example: --debug=2,r,t specifies to run a moderate level of debugging while collecting raw data I/O throughput information and reporting times in addition to throughputs.
Throughput values are computed by dividing the total amount of transferred data (excluding metadata) by the time spent by the slowest process. Several time counters are defined to measure the data transfer time and the total elapsed time; the latter includes the time spent during file open and close operations. A number of iterations can be specified with the --num-iterations option to create the desired population of measurements from which maximum, minimum, and average values can be obtained. The timing scheme is the following:
for each iteration
    initialize elapsed time counter
    initialize data transfer time counter
    for each file
        start and accumulate elapsed time counter
            start and accumulate data transfer time counter
                access entire file
            stop data transfer time counter
        stop elapsed time counter
    save elapsed time counter
    save data transfer time counter
The reported write throughput is based on the accumulated data transfer time, while the write open-close throughput uses the accumulated elapsed time.
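The throughput computation described above can be sketched as follows. This is an illustration under stated assumptions rather than h5perf's implementation: the function names and the sample timings are hypothetical, and the average is assumed to be taken over the per-iteration throughputs. The same functions apply whether the times passed in are data transfer times or elapsed times.

```python
def slowest_accumulated_time(times_per_process):
    """times_per_process[p] lists the time process p spent on each file.

    Each process accumulates its time over all files of the iteration;
    the slowest process determines the time used for the throughput.
    """
    return max(sum(file_times) for file_times in times_per_process)


def throughput_stats(data_bytes, iteration_times):
    """Return (maximum, minimum, average) throughput in bytes per second."""
    rates = [data_bytes / t for t in iteration_times]
    return max(rates), min(rates), sum(rates) / len(rates)


MIB = 1024 * 1024
# Hypothetical timings: 2 iterations, 2 processes, 2 files each (seconds).
iterations = [
    [[1.0, 1.0], [1.5, 2.5]],  # slowest process accumulated 4.0 s
    [[0.5, 0.5], [1.0, 1.0]],  # slowest process accumulated 2.0 s
]
times = [slowest_accumulated_time(it) for it in iterations]
best, worst, mean = throughput_stats(100 * MIB, times)
print(best / MIB, worst / MIB, mean / MIB)  # 50.0 25.0 37.5
```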
| ||Specifies the number of bytes per process per dataset. |
(Default: 256K for 1D, 8K for 2D)
Depending on the selected geometry, each test dataset can be a linear array of size bytes-per-process * num-processes or a square array of size (bytes-per-process * num-processes) × (bytes-per-process * num-processes). The number of processes is set by the --min-num-processes and --max-num-processes options.
| ||Specifies the number of files. |
| ||Selects 2D geometry for testing. |
(Default: Off, i.e., 1D geometry)
| ||Sets the number of iterations to perform. |
| ||Sets interleaved block I/O. |
(Default: Contiguous block I/O)
Interleaved and contiguous patterns in 1D geometry:
When a contiguous access pattern is chosen, the dataset is evenly divided into num-processes regions and each process writes data to its assigned region. When an interleaved access pattern is chosen, blocks are laid out round-robin: space is allocated in the dataset for the first block of the first process, then for the first block of the second process, and so on until the first block of every process has been placed. The second block of each process follows in the same order, and so on.
For example, with a three-process run, 512KB bytes-per-process, 256KB transfer buffer size, and 64KB block size, each process must issue two transfer requests to complete access to the dataset.
Contiguous blocks of the first transfer request are written as follows:
Interleaved blocks of the first transfer request are written as follows:
The actual number of I/O operations involved in a transfer request depends on the access pattern and communication mode. When using independent I/O with an interleaved access pattern, each process performs four small non-contiguous I/O operations per transfer request. If collective I/O is turned on, the combined content of the buffers of the three processes will be written using one collective I/O operation per transfer request.
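The 1D layouts described above can be reproduced with a short sketch. It is derived only from the description in this section, not from h5perf's source; the function name `first_request_layout` is hypothetical. Each output character stands for one block of the dataset: a digit is the process (numbered from 1) that writes that block during its first transfer request, and `-` marks a block left for a later request.

```python
def first_request_layout(num_procs, bytes_per_proc, xfer_size, block_size,
                         interleaved):
    """Return one character per dataset block for the first transfer request.

    Digits identify the writing process (1-based, so fewer than 10
    processes are assumed); '-' marks blocks not yet written.
    """
    blocks_per_proc = bytes_per_proc // block_size
    blocks_per_xfer = xfer_size // block_size
    layout = ["-"] * (num_procs * blocks_per_proc)
    for rank in range(num_procs):
        for b in range(blocks_per_xfer):
            if interleaved:
                # Round-robin: block b of every process, then block b + 1.
                pos = b * num_procs + rank
            else:
                # Contiguous: each process fills its own region in order.
                pos = rank * blocks_per_proc + b
            layout[pos] = str(rank + 1)
    return "".join(layout)


KB = 1024
params = dict(num_procs=3, bytes_per_proc=512 * KB,
              xfer_size=256 * KB, block_size=64 * KB)

print(params["bytes_per_proc"] // params["xfer_size"])     # 2 transfer requests
print(first_request_layout(**params, interleaved=False))   # 1111----2222----3333----
print(first_request_layout(**params, interleaved=True))    # 123123123123------------
```

With independent I/O, the four digits belonging to one process in the interleaved line are separated by other processes' blocks, which is why each process needs four small non-contiguous operations per transfer request.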
For details regarding the impact of performance and access patterns in 2D, see the “h5perf User Guide.”
|This option is no longer available.|
|Specifies to not write fill values to HDF5 datasets. This option is supported only in HDF5 Release v1.6 or later. |
(Default: Off, i.e., write fill values)
|Sets the output file for raw data to file. |
|Sets the minimum number of processes to be used. |
|Sets the maximum number of processes to be used. |
|Sets the threshold for alignment of objects in the HDF5 file. |
|Performs only write tests, not read tests. |
(Default: Read and write tests)
|Sets the minimum transfer buffer size. |
(Default: Half the number of bytes per process per dataset)
This option and the -X size option (or --max-xfer-size=size) control transfer-buffer-size, the size of the transfer buffer in memory. In 1D geometry, the transfer buffer is a linear array of size transfer-buffer-size. In 2D geometry, the transfer buffer is a rectangular array of size block-size × transfer-buffer-size, or transfer-buffer-size × block-size if the interleaved access pattern is selected.
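The dataset and transfer buffer dimensions stated in this section can be combined into one small helper. This is a sketch of the stated sizes only; the function name `shapes`, its parameter names, and the choice of four processes in the example are hypothetical.

```python
def shapes(bytes_per_proc, num_procs, xfer_size, block_size,
           two_d=False, interleaved=False):
    """Return (dataset_shape, buffer_shape), each a tuple of dimensions."""
    side = bytes_per_proc * num_procs
    if not two_d:
        # 1D: linear dataset and linear transfer buffer.
        return (side,), (xfer_size,)
    # 2D: square dataset; the buffer orientation depends on the pattern.
    if interleaved:
        buffer_shape = (xfer_size, block_size)
    else:
        buffer_shape = (block_size, xfer_size)
    return (side, side), buffer_shape


KB = 1024
# 2D defaults from this section: 8K bytes per process, a maximum transfer
# buffer equal to that, and a block size of half that; 4 processes assumed.
dataset, buffer = shapes(8 * KB, 4, 8 * KB, 4 * KB, two_d=True)
print(dataset, buffer)  # (32768, 32768) (4096, 8192)
```

Because the 2D dataset is square, its total size grows with the square of bytes-per-process * num-processes, which is consistent with the much smaller 2D default for bytes per process.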
|Sets the maximum transfer buffer size. |
(Default: The number of bytes per process per dataset)
|> 0 ||An error occurred.|
|1.6.0||Tool introduced in this release.|
|1.6.8 and 1.8.0||Option --geometry introduced in this release.|