If you are new to HDF5, please read the Learning the Basics topic first.




Overview of Parallel HDF5 (PHDF5) Design

There were several requirements that we had for Parallel HDF5 (PHDF5). These were:

Given these requirements, our initial target was to support MPI programming, but not shared memory programming. We had experimented with thread-safe support for Pthreads and for OpenMP, and decided not to use these.

Implementation requirements were to:

The following shows the Parallel HDF5 implementation layers:

Parallel Programming with HDF5

This tutorial assumes that you are somewhat familiar with parallel programming with MPI (Message Passing Interface).

If you are not familiar with parallel programming, here is a tutorial that may be of interest:

Some of the terms that you must understand in this tutorial are:


Parallel HDF5 opens a parallel file with a communicator. It returns a file handle to be used for future access to the file.

All processes in the communicator are required to participate in collective Parallel HDF5 API calls. Different files can be opened using different communicators.

Examples of what you can do with the Parallel HDF5 collective API:

Once a file is opened by the processes of a communicator:

Please refer to the Supported Configuration Features Summary in the release notes for the current release of HDF5 for an up-to-date list of the platforms on which Parallel HDF5 is supported.

Creating and Accessing a File with PHDF5

The programming model for creating and accessing a file is as follows:

1. Set up an access template object to control the file access mechanism.

2. Open the file.

3. Close the file.

Each process of the MPI communicator creates an access template and sets it up with MPI parallel access information. This is done with the H5Pcreate / h5pcreate_f call to obtain the file access property list and the H5Pset_fapl_mpio / h5pset_fapl_mpio_f call to set up parallel I/O access.

Following is example code for creating an access template in HDF5:

C:

    MPI_Comm comm  = MPI_COMM_WORLD;
    MPI_Info info  = MPI_INFO_NULL;

    /*
     * Initialize MPI
     */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(comm, &mpi_size);
    MPI_Comm_rank(comm, &mpi_rank);

    /*
     * Set up file access property list with parallel I/O access
     */
    plist_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(plist_id, comm, info);

F90:

    comm = MPI_COMM_WORLD
    info = MPI_INFO_NULL

    CALL MPI_INIT(mpierror)
    CALL MPI_COMM_SIZE(comm, mpi_size, mpierror)
    CALL MPI_COMM_RANK(comm, mpi_rank, mpierror)
    !
    ! Initialize FORTRAN interface
    !
    CALL h5open_f(error)

    !
    ! Set up file access property list with parallel I/O access.
    !
    CALL h5pcreate_f(H5P_FILE_ACCESS_F, plist_id, error)
    CALL h5pset_fapl_mpio_f(plist_id, comm, info, error)

The following example programs create an HDF5 file using Parallel HDF5: C, F90

Creating and Accessing a Dataset with PHDF5

The programming model for accessing a dataset with Parallel HDF5 is:

The following code demonstrates a collective write using Parallel HDF5:

C:

    /*
     * Create property list for collective dataset write.
     */
    plist_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);

    status = H5Dwrite(dset_id, H5T_NATIVE_INT, memspace, filespace,
                      plist_id, data);

F90:

    ! Create property list for collective dataset write
    !
    CALL h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error)
    CALL h5pset_dxpl_mpio_f(plist_id, H5FD_MPIO_COLLECTIVE_F, error)

    !
    ! Write the dataset collectively.
    !
    CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, dimsfi, error, &
                    file_space_id = filespace, mem_space_id = memspace, xfer_prp = plist_id)

The following example programs create a dataset in an HDF5 file using Parallel HDF5: C, F90

Writing and Reading Hyperslabs 

The programming model for writing and reading hyperslabs is:

The memory and file hyperslabs in the first step are defined with the H5Sselect_hyperslab (C) / h5sselect_hyperslab_f (F90) call.
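In a parallel program, each process usually selects a different hyperslab of the file dataspace. As a minimal sketch, assume a dataset whose rows are split into contiguous blocks, one block per MPI rank. The helper below computes the row offset and row count that a rank might use for its selection. The name rows_for_rank and the policy of giving leftover rows to the lowest ranks are hypothetical choices for illustration and are not part of HDF5. The rank and process count are plain integers here so the arithmetic can be checked without MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of HDF5): split `nrows` rows of a dataset
 * into contiguous blocks, one per MPI rank. The first `nrows % nprocs`
 * ranks receive one extra row, so every row is assigned exactly once. */
void rows_for_rank(size_t nrows, int nprocs, int rank,
                   size_t *offset, size_t *count)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    size_t base  = nrows / (size_t)nprocs;  /* rows that every rank receives */
    size_t extra = nrows % (size_t)nprocs;  /* leftover rows to distribute   */
    size_t r     = (size_t)rank;

    /* Ranks below `extra` own one additional row. */
    *count  = base + (r < extra ? 1 : 0);
    /* Skip the rows owned by all lower ranks. */
    *offset = r * base + (r < extra ? r : extra);
}
```

A rank would typically place the resulting offset and count in the row dimension of the start and count arrays, select all columns in the other dimension, and pass those arrays to H5Sselect_hyperslab.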

The start (or offset), count, stride, and block parameters define the portion of the dataset to write to. By changing the values of these parameters you can write hyperslabs with Parallel HDF5 by contiguous hyperslab, by regularly spaced data in a column/row, by patterns, and by chunks:

by Contiguous Hyperslab

by Regularly Spaced Data  

by Pattern  

by Chunk