H5P_SET_SZIP sets an SZIP compression filter, H5Z_FILTER_SZIP, for a dataset. SZIP is a compression method designed for use with scientific data.

Before proceeding, all users should review the “Limitations” section below. Users familiar with SZIP outside the HDF5 context may benefit from reviewing “Notes for Users Familiar with SZIP in Other Contexts” below.

In the text below, the term pixel refers to an HDF5 data element. This terminology derives from SZIP compression's use with image data, where pixel referred to an image pixel.

The SZIP bits_per_pixel value (see Notes, below) is set automatically, based on the HDF5 datatype. SZIP can be used with atomic datatypes whose size is 8, 16, 32, or 64 bits. Specifically, a dataset whose datatype is an 8-, 16-, 32-, or 64-bit signed or unsigned integer; char; or a 32- or 64-bit float can be compressed with SZIP. See Notes, below, for further discussion of the SZIP bits_per_pixel setting.

SZIP options are passed in an options mask, options_mask, as follows.

| Option | Description (Mutually exclusive; select one.) |
|---|---|
| H5_SZIP_EC_OPTION_MASK | Selects entropy coding method |
| H5_SZIP_NN_OPTION_MASK | Selects nearest neighbor coding method |

The following guidelines can be used in determining which option to select:

- The entropy coding method, the EC option specified by H5_SZIP_EC_OPTION_MASK, is best suited for data that has been processed. The EC method works best for small numbers.
- The nearest neighbor coding method, the NN option specified by H5_SZIP_NN_OPTION_MASK, preprocesses the data and then applies the EC method as above.
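
The selection guidelines above can be sketched as a small caller-side helper. This is only an illustration: the `SKETCH_` constants and both functions are hypothetical names, and the numeric values 4 and 32 are assumed to match the HDF5 public header. Real code should include the HDF5 header and use H5_SZIP_EC_OPTION_MASK and H5_SZIP_NN_OPTION_MASK directly.

```c
#include <stdbool.h>

/* Local stand-ins for H5_SZIP_EC_OPTION_MASK and H5_SZIP_NN_OPTION_MASK.
 * The values 4 and 32 are an assumption about the HDF5 public header;
 * real code should use the library's own macros instead. */
#define SKETCH_SZIP_EC_MASK 4u
#define SKETCH_SZIP_NN_MASK 32u

/* Hypothetical helper: pick a coding method following the guidelines.
 * Data that has already been processed goes to EC; otherwise NN is
 * chosen so that SZIP preprocesses the data before applying EC. */
static unsigned sketch_choose_szip_mask(bool data_already_processed)
{
    return data_already_processed ? SKETCH_SZIP_EC_MASK : SKETCH_SZIP_NN_MASK;
}

/* Hypothetical helper: true when exactly one of the two coding-method
 * bits is set, i.e. the options are treated as mutually exclusive. */
static bool sketch_szip_mask_is_exclusive(unsigned options_mask)
{
    bool ec = (options_mask & SKETCH_SZIP_EC_MASK) != 0;
    bool nn = (options_mask & SKETCH_SZIP_NN_MASK) != 0;
    return ec != nn;
}
```

The exclusivity check reflects only the "select one" rule in the table; whether the library itself rejects other combinations is not stated in this description.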

Other factors may affect results, but the above criteria provide a good starting point for optimizing data compression.

SZIP compresses data block by block, with a user-tunable block size. This block size is passed in the parameter pixels_per_block and must be even and not greater than 32, with typical values being 8, 10, 16, or 32. This parameter affects the compression ratio; the more pixel values vary, the smaller this number should be to achieve better performance.

In HDF5, compression can be applied only to chunked datasets. If pixels_per_block is bigger than the total number of elements in a dataset chunk, H5P_SET_SZIP will succeed but the subsequent call to H5D_CREATE will fail; the conflict can be detected only when the property list is used.

To achieve optimal performance for SZIP compression, it is recommended that a chunk's fastest-changing dimension be equal to N times pixels_per_block, where N is the maximum number of blocks per scan line allowed by the SZIP library. In the current version of SZIP, N is set to 128.

SZIP compression is an optional HDF5 filter.

Limitations:

- SZIP compression cannot be applied to compound, array, variable-length, enumeration, or any other user-defined datatypes. If an SZIP filter is set in a dataset creation property list used to create a dataset containing a non-allowed datatype, the call to H5D_CREATE will fail; the conflict can be detected only when the property list is used.
- Users should be aware that there are factors that affect one’s rights and ability to use SZIP compression. See the documents at SZIP Compression in HDF5 for important information regarding terms of use and the SZIP copyright notice, for further discussion of SZIP compression in HDF5, and for a list of SZIP-related references.
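
Because the conflicts described above surface only when H5D_CREATE is called, it can help to check the parameters beforehand. The following is a minimal sketch of such a pre-check, not part of the HDF5 API: all `sketch_` names are hypothetical, and the requirement that pixels_per_block be nonzero is an added assumption, since the text states only that it must be even and not greater than 32.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limits taken from the description above; the names are local to this sketch. */
#define SKETCH_MAX_PIXELS_PER_BLOCK 32u
#define SKETCH_MAX_BLOCKS_PER_SCANLINE 128u /* N in the current SZIP version */

/* True when the element size is one SZIP accepts: 8, 16, 32, or 64 bits.
 * A size check alone is not sufficient: the datatype must also be an
 * atomic integer, char, or float type, which this sketch does not inspect. */
static bool sketch_szip_size_ok(size_t element_bytes)
{
    return element_bytes == 1 || element_bytes == 2 ||
           element_bytes == 4 || element_bytes == 8;
}

/* True when pixels_per_block is even, at most 32, and no larger than the
 * number of elements in one chunk. Rejecting zero is an assumption. */
static bool sketch_szip_block_ok(unsigned pixels_per_block,
                                 unsigned long long chunk_elements)
{
    return pixels_per_block != 0 &&
           pixels_per_block % 2 == 0 &&
           pixels_per_block <= SKETCH_MAX_PIXELS_PER_BLOCK &&
           pixels_per_block <= chunk_elements;
}

/* Recommended length of a chunk's fastest-changing dimension:
 * N times pixels_per_block, with N = 128. */
static unsigned long long sketch_szip_fast_dim(unsigned pixels_per_block)
{
    return (unsigned long long)SKETCH_MAX_BLOCKS_PER_SCANLINE * pixels_per_block;
}
```

For example, with pixels_per_block set to 16, the recommended fastest-changing chunk dimension under this rule is 128 × 16 = 2048 elements. Values that pass these checks would then be passed to H5P_SET_SZIP (H5Pset_szip in the C binding).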