forked from telegraphic/hickle
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Implements issue telegraphic#139 H4EP002 Meomoisation scheme (object …
…and type) Basic Menoisation: ================== Both types of memoisation are handled by Reference manager dictionary type object. For storing object instance references it is used as python dict object which stores the references to the py_obj and related node using py_obj(id) as key when dumping. On load the id of the h_node is used as key for storing the to be shared reference of the restored object. Additoinal references to the same object are represented by h5py.Datasets with their dtype set to ref_dtype. They are created by assinging a h5py.Refence object as returned by h5py.Dataset.ref or h5py.Group.ref attribute. These datasets are resolved by the filter iterator method of the ExpandReferenceContainer class and returned as sub_item of the reference dataset. Type Memoisation ================ The 'type' attribute of all nodes exempt datasets which contain pickle strings, or expose a ref_dtype as their dtype now contains a reference to the approriate py_obj_type entry in the global 'hickle_types_table' this table host the datasets representing all py_obj_types and base_types encountered by hickle.dump once. Each py_obj_type is represened by a numbered dataset containing the corresponding pickle string. The base_types are represented by empty datasets the name of which is the name of the base_type as defined by class_register table of loaders. No entry is stored for object, b'pickle' as well as hickle.RerfenceType,b'!node-refence' as these can be resolved implicitly on load. The 'base_type' attribute of a py_obj_type entry referres to the base_type used to encode and required to properly restore it again from the hickle file. The entries in the 'hickle_types_table' are managed by the ReferenceManager.store_type and ReferenceManager.resolve_type methods. The latter is also taking care of properly distinguishing pickle datasets from reference datasets and resolving hickle 4.0.X dict_item groups. The ReferenceManager is implemented as context manager and thus can and shall be used within the with statement, to ensure proper cleanup. Each file has its own ReferenceManager instance, therefore different data can be dumped to distinct files which are open in parallel. The basic management of managers is provided by the BaseManager base class which can be used build futher managers for example to allow loaders to be activated only when specific feature flags are passed to hickle.dump method or encountered by hickle.load from the file attributes. The BaseManager class has to be subclassed. Other changes: ============== - lookup.register_class and class_register tables have an addtional memoise flag indicating whether py_obj shall be remmembered for representing and resolving multiple references to it or if it shall be dumped and restored everytime it is encountered - lookup.hickle_types table entries include the memoise flag as third entry - lookup.load_loader: the tuple returned in addtion to py_obj_type includes the memoise flag - hickle.load: Whether to use load_fn stored in lookup.hkl_types_table or use a PyContainer object stored in hkl_container_dict is decided upon the is_container flag returned by ReferenceManager.resolve_type instead of checking whether processed node is of type h5py.Group - dtype of string and bytes datasets is now set to 'S1' instead of 'u8' and shape is set to (1,len)
- Loading branch information
Showing
12 changed files
with
1,163 additions
and
242 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.