Implementation of Container and mixed loaders (H4EP001) #138
Conversation
Codecov Report
@@ Coverage Diff @@
## dev #138 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 9 9
Lines 592 654 +62
=========================================
+ Hits 592 654 +62
Continue to review full report at Codecov.
I'll wait for some input from @1313e; I only note two small things:
raise FileError("HDF5-file does not have the proper attributes!")
I'm thinking of trying to get my hands on these two issues in the next days while we are waiting for input from @1313e. Not sure if I will manage or fail. Another thing: I have checked the two other pull requests #136 and #137. They are basically the same, two minor adaptations of
#136 and #137 exist, because I specifically asked for 2 PRs (I still have to merge them). I have just finished an incredibly busy period, so I now finally have time to go and take a look at this.
@1313e: So shall I wait with the amendments for the uncovered line and the implementation of the not-yet-included verification for proper loading of 4.0.0 files too, or would you prefer that these amendments are done before reviewing?
There you go, now those PRs are merged.
I do propose the following approach
Am I missing anything that I should additionally consider and cover?
You first want to merge
OK, I think the rebase worked. @1313e thank you very much for the reminder.
Think I managed it. Full coverage and proper loading of hickle 4.0.0 and 4.0.1 files. Added a file created with master and a pickled version of the same data, which the test uses to verify that the data was properly restored. Hope I did not miss any tricky obstacle. Real-world use will reveal it.
@hernot Having fun over there? ;)
Not any more; one last squashing and things are ready. I just had to crawl down a tight rabbit hole to find the one nasty error occurring in Python 3.8. And as I do not have a native Python 3.8 installation available here, I have to abuse Travis a bit for this. The error seemed to be caused by the fact that
which is not used any more in the new container-based scheme, and further according to the Python doc. I will clean up the code, squash the intermediate commits, and repush one final time when done.
OK, final cleanup and squashing done. NOTE: phew!
Well, I better do some reviewing then huh?
Take your time; the next steps will definitely be smaller than the switch to the loader-based design in 4.0.0 and the extension of the PyContainer concept initiated in 4.0.0. Further, a big part is covered by the (hopefully) rather thorough unit tests of the individual modules (helpers, lookup, load_builtins, load_numpy, load_scipy, load_astropy, hickle), in addition to the tests already present. These are also the reason why I stumbled over the compression issue related to scalars and strings, as reported in #140, and implemented a proposed fix. And some other things which fitted for demonstration purposes (confess, #125).
Force-pushed 2e46fec to 5d63b64
Hi, just a small ping on how the plans stand.
Force-pushed bf12c6a to e4bc715
Force-pushed e4bc715 to c681971
@telegraphic Thoughts?
@telegraphic I know you are quite busy, but nevertheless, is it possible to decide on this in a timely manner? With each fix @1313e implements, rebasing this pull request and all dependent branches for upcoming pull requests becomes more complicated and tedious.
Force-pushed c681971 to 661a2b8
With hickle 4.0.0 the code for dumping and loading dedicated objects like scalar values or numpy arrays was moved to dedicated loader modules. This first step of disentangling the hickle core machinery from object specifics included all objects and structures which were mappable to h5py.Dataset objects. This commit provides an implementation of hickle extension proposal H4EP001 (telegraphic#135). In this proposal the extension of the loader concept introduced by hickle 4.0.0 towards generic PyContainer-based and mixed loaders is specified. In addition to the proposed extension, this implementation includes the following extensions to hickle 4.0.0 and H4EP001:

H4EP001:
========
The PyContainer interface includes a filter method which allows loaders, when data is loaded, to adjust, suppress, or insert additional data subitems of h5py.Group objects. To accomplish the temporary modification of h5py.Group and h5py.Dataset objects when the file is opened in read-only mode, the H5NodeFilterProxy class is provided. This class stores all temporary modifications while the original h5py.Group and h5py.Dataset objects stay unchanged.

hickle 4.0.0 / 4.0.1:
=====================
Strings and arrays of bytes are stored as Python bytearrays and not as variable-sized strings and bytes. The benefit is that hdf5 filters and hdf5 compression filters can be applied to Python bytearrays. The downside is that the data is stored as bytes of int8 datatype. This change affects native Python string scalars as well as numpy arrays containing strings.

numpy masked arrays are now stored as an h5py.Group containing a dedicated dataset each for data and mask.

scipy.sparse matrices are now stored as an h5py.Group containing the datasets data, indices, indptr and shape.

Dictionary keys are now used as names for h5py.Dataset and h5py.Group objects. Only string, bytes, int, float, complex, bool and NoneType keys are converted to name strings; for all other keys a key-value-pair group is created containing the key and value as its subitems. String and bytes keys which contain slashes are converted into key-value pairs instead of converting slashes to backslashes. The distinction from hickle 4.0.0 string and bytes keys with converted slashes is made by enclosing the string value within double quotes instead of single quotes as produced by the Python repr function or the !r and %r string format specifiers. Consequently, on load, all string keys which are enclosed in single quotes will be subjected to slash conversion, while any others will be used as they are.

h5py.Group and h5py.Dataset objects whose 'base_type' refers to 'pickle' automatically get assigned object as their py_obj_type on load. The related 'type' attribute is ignored. h5py.Dataset objects which do not expose a 'base_type' attribute are assumed to contain a pickle string and thus implicitly get assigned the 'pickle' base type. Thus, on dump, the 'base_type' and 'type' attributes are omitted for all h5py.Dataset objects which contain pickle strings, as their values are 'pickle' and object respectively.

Other stuff:
============
Full separation between hickle core and loaders.
Distinct unit tests for individual loaders and hickle core.
Cleanup of no-longer-required functions and classes.
Simplification of recursion on dump and load through a self-contained loader interface.
Capable of loading hickle 4.0.x files which do not yet support the PyContainer concept beyond list, tuple, dict and set; includes extended tests for loading hickle 4.0.x files.
Contains a fix for the lambda py_obj_type issue on numpy arrays with single non-list/tuple object content; Python 3.8 refuses to unpickle the lambda function string. This was observed while finalizing the pull request. The fixes are only activated when a 4.0.x file is to be loaded.
Exceptions thrown by load now include the triggering exception, with its stacktrace, for better localization of the error in debugging and error reporting.
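The dictionary-key rule described in the commit message can be illustrated with a small sketch. The helper name and the exact quoting details are hypothetical illustrations of the described behaviour, not hickle's actual code:

```python
# Sketch of the dict-key naming rule: scalar keys of simple types become
# group/dataset names, everything else (and any key containing a slash,
# which is illegal in HDF5 names) falls back to an explicit
# key-value-pair subgroup.  (Hypothetical helper, not hickle's code.)

SIMPLE_KEY_TYPES = (str, bytes, int, float, complex, bool, type(None))

def key_to_node_name(key):
    """Return a name string for simple keys, or None to signal that a
    key-value-pair subgroup must be created instead."""
    if not isinstance(key, SIMPLE_KEY_TYPES):
        return None  # e.g. tuple keys -> key-value-pair group
    if isinstance(key, (str, bytes)):
        slash = b'/' if isinstance(key, bytes) else '/'
        if slash in key:
            return None  # slashes are not converted, a pair group is used
    if isinstance(key, str):
        # double quotes mark new-style string keys, distinguishing them
        # from hickle 4.0.0 single-quoted repr() output
        return '"{}"'.format(key)
    return repr(key)

print(key_to_node_name("plain"))   # '"plain"'
print(key_to_node_name("a/b"))     # None -> key-value-pair group
print(key_to_node_name(42))        # '42'
print(key_to_node_name((1, 2)))    # None
```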
Related to issue telegraphic#83, making astropy/scipy optional dependencies. One can now install e.g. hickle[astropy] to add astropy support. Uses pkg_resources.require('hickle[astropy]') to check, and only loads the module if no error is raised.
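The extras check described here can be sketched roughly like this (illustrative only; the function name is hypothetical and hickle's actual guard may differ):

```python
# Sketch: probe whether the 'astropy' extra of hickle is satisfied via
# pkg_resources, and only register the optional loader if it is.
import pkg_resources

def astropy_available():
    """Return True if the hickle[astropy] extra can be satisfied."""
    try:
        pkg_resources.require('hickle[astropy]')
        return True
    except (pkg_resources.DistributionNotFound,
            pkg_resources.VersionConflict,
            pkg_resources.UnknownExtra):
        return False

if astropy_available():
    pass  # the astropy loader module would be imported/registered here
```

The same pattern applies to the scipy extra; newer setuptools deprecates pkg_resources in favour of importlib.metadata, but the idea is identical.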
h5py version limited to <3.x according to issue telegraphic#143
Force-pushed 661a2b8 to d4ef711
Hi @hernot and @1313e -- I'm taking a dive into the PR this week. I'm currently on paternity leave so have been even slower to respond than usual (first child so tough learning curve!). For now, let me say @hernot I really appreciate all the effort and thought you've clearly put into this. Apologies for my tardiness!
@telegraphic no worry, everything is set and prepared here, so I'm just waiting for your go.
Making some notes here as I go through:

Compare file structure 4.0.4 and 4.1.0

```python
# Make a basic test file with 4.1.0 and 4.0.4
a = {'a': 1, 'b': 'hello', 'c': [1, 2, 3]}
b = np.array([1, 2, 3, 5])
c = (a, b)
hkl.dump(c, 'hkl_410.hkl')
# (rinse and repeat for hkl_404.hkl with 4.0.4)
```

Compare the two file structures: the difference in file format means 4.0.4 will not be able to load 4.1.0; it fails with an error.

Check 4.1.0 can load 4.0.4 still

Yup (at least for this basic test file).
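A structural comparison like the one above can also be scripted with h5py. A minimal sketch, assuming the two dump files from the snippet above exist and using hickle's `base_type` attribute name:

```python
# Walk an HDF5 file and collect (path, node kind, base_type attribute)
# for every node, so two dumps can be diffed structurally.
import h5py

def tree(path):
    """Return a list of (name, kind, base_type) tuples for all nodes."""
    nodes = []
    def visit(name, obj):
        kind = 'group' if isinstance(obj, h5py.Group) else 'dataset'
        nodes.append((name, kind, obj.attrs.get('base_type')))
    with h5py.File(path, 'r') as f:
        f.visititems(visit)
    return nodes

# e.g. print only the structural differences between the two dumps:
# for n in tree('hkl_404.hkl'):
#     print(n)
# for n in tree('hkl_410.hkl'):
#     print(n)
```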
Hey @hernot and @1313e, overall I am happy for this to be merged, as it is a precondition on H4EP002 and H4EP003.

What changed and was it worth it?

Firstly, I note that this is a pretty major underlying change to how hickle works under the hood. Even though there are a lot of changes to the loading/dumping logic, changes to the actual file structure are minor. I like the changes to how dictionaries are dumped without extraneous attributes. Part of the motivation for the changes in H4EP001 was issue #125, to do with the use of pickle. So, the big question: are the major changes in H4EP001 worth it? At first glance, the end functionality to the user has not changed!

A quick case study

To weigh this up, I found it useful to look at how the scipy sparse matrix support changed -- see diff here. Sparse matrices are stored in the hdf5 file as three datasets -- which need to be recombined into a single sparse array when loaded by hickle. Where this previously required special-case handling, it is now done by a

class SparseMatrixContainer(PyContainer):

which I think is a slightly better design pattern. However there is some more complexity in register_class:

```python
def register_class(myclass_type, hkl_str, dump_function=None, load_function=None, container_class=None):
    """ Register a new hickle class.

    Parameters:
    -----------
        myclass_type type(class): type of class
        hkl_str (str): String to write to HDF5 file to describe class
        dump_function (function def): function to write data to HDF5
        load_function (function def): function to load data from HDF5
        container_class (class def): proxy class to load data from HDF5

    Raises:
    -------
        TypeError:
            myclass_type represents a py_object the loader for which is to
            be provided by hickle.lookup and hickle.hickle module only
    """
```

My conclusions

These changes add complexity, but they do promise to make some things easier in the future. I think, @hernot and @1313e, we can now agree that some data structures in Python are not easy to map to HDF5 optimally without some ugly code... Overall I am supportive of merging this given H4EP002 and H4EP003, which extend H4EP001 with more tangible improvements. The improvement to the file structure when dumping dictionaries is also nice. My apologies once again for the latency on the review. Thanks @hernot and @1313e for your patience.

Small request: @hernot in the future, can you pretty please make smaller commits instead of one large one so it's easier to review? (I know this is difficult when refactoring, but it would be very helpful!)
I am still very sceptical of the usefulness of this PR.
@1313e @telegraphic that is why I wrote this note in one of my comments above. I'm pretty fine if you consider this and all the already prepared following pull requests, which will introduce further, even bigger changes to the file format rendering them major and not just minor changes, something to spare for a hickle >= 5.0 release instead of just bumping the version 4 minor. Thinking about it, that might anyway be the wiser option. @1313e I'm pretty fine to first assemble all prepared pieces (#138, #139, #145, and clean-up and finalization) in a hickle 5 RC-1 proposal branch in my fork. Thereby it is very important to me that I do this in full agreement and coordination with you two, @telegraphic and @1313e. So may I suggest that I create a Hickle-5-RC branch in my forked repo, add there all prepared pull requests as commits, and post here when it is ready for discussion, either in continuation of this discussion or as part of reviewing the new branch together.
Yes, that's fine with me.
Also, the compression thing I already fixed.
I would have appreciated also getting the OK from @telegraphic before closing.
Just following up: I see @1313e's point that major changes should come with functionality improvements, and I see you've opened a new PR for a v5 bump -- all sounds good to me. I am more ambivalent about the code changes, so @1313e when we have RC5 ready I'll be relying on you to identify areas that you flag for reversion / request changes.
Sounds good to me 👍
At first: @1313e, with this pull request I want to express how much I appreciate the really great work you did for hickle 4.0.0, implementing the first step towards dedicated loaders.
Second: the reason why I'm pushing so hard for the implementation of H4EP001.
The research conducted by the research group I'm establishing and leading is split into two tracks: a methodological one dealing with the improvement and development of new algorithms and methods for clinical procedures in diagnostics and treatment, and a second one concerned with clinical research utilizing the tools based upon the methods and algorithms provided by the first track.

In the first track Python, numpy, scipy etc. are the primary tools for working on the algorithms and investigating new procedures and algorithmic approaches. The work in the second track is primarily conducted by clinicians. Therefore the tools provided for their research and studies have to be thoroughly tested and validated. This validation, at least the part which can be automated through unit tests, utilizes test data, including intermediate data and results obtained and provided by the Python programs and scripts developed alongside the underlying algorithms.

As the clinical tools are implemented in compiled languages which support true multi-threading, the data passed on has to be stored in a file format readable outside Python, ruling out pickle strings. Therefore jsonpickle was used to dump the data. Meanwhile the amount of data has grown so large that JSON files, even if compressed using zip, gzip or other compression schemes, are not feasible any more. NPY and NPZ files, which were the next choice, mandate a dependency upon the numpy library. Just for conducting unit tests, a self-contained file format for which only the corresponding library, without any further dependencies, has to be included would be the better choice.
And this is the point where the hdf5 libraries and hickle come into play. I do consider both the best and most suitable option I have found so far. And the current limitation that objects without a dedicated loader are stored as pickle strings can be solved by supporting the Python copy protocol, which I hereby offer to contribute to hickle.
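What "supporting the Python copy protocol" could mean here, as a minimal sketch (the class `Sample` is a hypothetical illustration, not hickle code):

```python
# The copy protocol lets an object describe itself as a reconstructor
# callable plus an argument tuple via __reduce__.  A dumper could store
# these pieces as ordinary HDF5-mappable data instead of an opaque
# pickle string, and a loader could rebuild the object from them.

class Sample:
    def __init__(self, a, b):
        self.a, self.b = a, b
    def __reduce__(self):
        # (callable, args) -- both parts are storable without pickling
        # the whole object as a blob
        return (Sample, (self.a, self.b))

obj = Sample(1, [2, 3])
reconstructor, args = obj.__reduce__()
# a loader can later rebuild the object from the stored pieces:
rebuilt = reconstructor(*args)
print(rebuilt.a, rebuilt.b)  # 1 [2, 3]
```

In the general case `__reduce_ex__` may also return state, list items, and dict items, which would each map naturally to subitems of an h5py.Group.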
Third, the content of this pull request: the implementation of container-based and mixed loaders as proposed by hickle extension proposal H4EP001 (#135). For details see the commit message and the proposal #135.
Finally, I do recommend not putting this into an official release yet. Some extended tests using a real dataset, compiled for testing and validating software tools and components developed for use in the clinical track, showed that an important part is missing to keep file sizes at a reasonable level. Especially type strings and pickle strings for class and function objects currently take up most of the file space, letting dumped files quickly grow into the GB range even with hdf5 file compression activated, where the plain pickle stream requires just 400 MB of space. Therefore I do recommend additionally implementing memoization (H4EP002, #139) first, before considering the resulting code base ready for release.
PS: loading of hickle 4.0.0 files should still be possible out of the box. Due to the lack of an appropriate test file, no test is included to verify this.