This is a filter for HDF5 that uses the Blosc compressor; by installing this filter, you can read and write HDF5 files with Blosc-compressed datasets.
You need to be a bit careful before using this filter: do not activate the shuffle directly in HDF5, but rather from Blosc itself. This is because Blosc uses a SIMD shuffle internally, which is much faster.
Instead of just linking this Blosc filter into your HDF5 application, it is possible to install it as a system-wide HDF5 plugin (with HDF5 1.8.11 or later). This is useful because it allows every HDF5-using program on your system to transparently read Blosc-compressed HDF5 files.
As described in the HDF5 plugin documentation, you just need to compile the Blosc plugin into a shared library and copy it to the plugin directory (which defaults to /usr/local/hdf5/lib/plugin on non-Windows systems). Following the cmake instructions below produces a libH5Zblosc.so shared library file (or .dylib/.dll on Mac/Windows), which you can copy to the HDF5 plugin directory.
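Once the plugin is installed, a program can ask the HDF5 library whether the Blosc filter (registered filter ID 32001) is usable. The check below is only a sketch; depending on the HDF5 release, H5Zfilter_avail may report only filters that are already registered rather than plugins that would be loaded on demand:

    #include <stdio.h>
    #include "hdf5.h"

    #define FILTER_BLOSC 32001  /* Blosc filter ID registered with the HDF Group */

    int main(void)
    {
        htri_t avail = H5Zfilter_avail(FILTER_BLOSC);
        printf("Blosc filter available: %s\n", avail > 0 ? "yes" : "no");
        return 0;
    }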
To write Blosc-compressed HDF5 files, on the other hand, an HDF5-using program must be specially modified to enable the Blosc filter when writing HDF5 datasets, as described below.
Instead of (or in addition to) installing the Blosc plugin system-wide as described above, you can also link the Blosc filter directly into your application. Although this only makes the Blosc filter available in your application (as opposed to other HDF5-using applications), it is useful in cases where installing the plugin is inconvenient. Compile the Blosc filter as described below, but link libblosc_filter.a (generated by make) directly into your program.
In order to register Blosc in your HDF5 application, you then need to call a function in blosc_filter.h, with the following signature:
int register_blosc(char **version, char **date)
Calling this will register the filter with the HDF5 library and return information about the Blosc release through the version and date char pointers.
A non-negative return value indicates success. If the registration fails, an error is pushed onto the current error stack and a negative value is returned.
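As a minimal sketch of the registration step (the helper name and error handling here are illustrative, not part of the filter's API):

    #include <stdio.h>
    #include "blosc_filter.h"

    /* Call once, before creating or reading any Blosc-compressed datasets. */
    static int init_blosc_filter(void)
    {
        char *version, *date;
        if (register_blosc(&version, &date) < 0) {
            fprintf(stderr, "could not register the Blosc filter with HDF5\n");
            return -1;
        }
        printf("Blosc filter registered (Blosc %s, %s)\n", version, date);
        return 0;
    }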
An example C program ('src/example.c') is included which demonstrates the proper use of the filter.
This filter has been tested against HDF5 versions 1.6.5 through 1.8.10. It is released under the MIT license (see LICENSE.txt for details).
Assuming the filter is installed (either by a system-wide plugin or registered directly in your program as described above), your application can transparently read HDF5 files with Blosc-compressed datasets. (The HDF5 library will detect that the dataset is Blosc-compressed and invoke the filter automatically).
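For instance, a plain HDF5 read needs no Blosc-specific calls at all. In this sketch the file name, dataset name and buffer shape are hypothetical and must match whatever was actually written:

    #include <stdio.h>
    #include "hdf5.h"

    int main(void)
    {
        static float data[256][256];  /* must match the stored dataset */

        /* Ordinary HDF5 calls; decompression happens inside the filter
           pipeline, provided the Blosc filter is registered or installed
           as a plugin. */
        hid_t file = H5Fopen("blosc_example.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
        hid_t dset = H5Dopen2(file, "dset", H5P_DEFAULT);

        H5Dread(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);
        printf("data[0][0] = %g\n", (double)data[0][0]);

        H5Dclose(dset);
        H5Fclose(file);
        return 0;
    }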
To write an HDF5 file with a Blosc-compressed dataset, you call the H5Pset_filter function on the property list of the dataset you are creating, and pass FILTER_BLOSC (defined in blosc_filter.h) for the filter_id parameter. In addition, HDF5 only supports compression for "chunked" datasets; this just means that you need to call H5Pset_chunk to specify a chunk size (e.g. 1 MB chunks), and the subsequent chunking of the dataset I/O is performed transparently by HDF5.
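A minimal write sketch, assuming the filter is linked in directly, might look like the following. The dataset shape, chunk shape, compression level and compressor choice are illustrative, and the cd_values layout follows the bundled 'src/example.c' (slots 0-3 are reserved for the filter, slot 4 is the compression level, slot 5 toggles Blosc's internal shuffle, slot 6 picks the compressor):

    #include "hdf5.h"
    #include "blosc.h"
    #include "blosc_filter.h"

    int main(void)
    {
        static float data[256][256];           /* sample data to compress */
        hsize_t dims[2]   = {256, 256};
        hsize_t chunks[2] = {64, 256};         /* compression requires chunking */
        unsigned int cd_values[7];
        char *version, *date;

        /* Needed when the filter is linked in directly; with a system-wide
           plugin, HDF5 can load the filter on its own. */
        if (register_blosc(&version, &date) < 0) return 1;

        /* cd_values[0..3] are reserved and filled in by the filter itself. */
        cd_values[4] = 5;              /* compression level */
        cd_values[5] = 1;              /* 1: use Blosc's internal SIMD shuffle */
        cd_values[6] = BLOSC_BLOSCLZ;  /* compressor code, from blosc.h */

        hid_t file  = H5Fcreate("blosc_example.h5", H5F_ACC_TRUNC,
                                H5P_DEFAULT, H5P_DEFAULT);
        hid_t space = H5Screate_simple(2, dims, NULL);
        hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);

        H5Pset_chunk(dcpl, 2, chunks);
        H5Pset_filter(dcpl, FILTER_BLOSC, H5Z_FLAG_OPTIONAL, 7, cd_values);

        hid_t dset = H5Dcreate2(file, "dset", H5T_NATIVE_FLOAT, space,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);
        H5Dwrite(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

        H5Dclose(dset);
        H5Pclose(dcpl);
        H5Sclose(space);
        H5Fclose(file);
        return 0;
    }

With H5Z_FLAG_OPTIONAL the dataset is still written (uncompressed) if the filter turns out to be unavailable; use H5Z_FLAG_MANDATORY if you would rather have the write fail in that case.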
The filter consists of a single 'src/blosc_filter.c' source file and 'src/blosc_filter.h' header, which need the Blosc library installed to work. It is simplest to just use the provided cmake build scripts, which compile both the filter and the Blosc library into a library for you.
Assuming you have cmake and other standard Unix build tools installed, do:
    mkdir build
    cd build
    cmake ..
    make
This generates the library/plugin files required above in the build
directory.
See THANKS.rst.
Enjoy data!