While studying the OneNote file spec further, I have the impression that it might be a good idea to separate the file entities described in MS-ONESTORE from the actual content described by MS-ONE.
Roughly, this would mean that the lib first parses the RevisionStoreFile, with all its Revisions, FileNodes, etc.
Once all Chunks are declared, the lib would instantiate a higher-level Document class which represents the MS-ONE spec: Sections, Pages, Textboxes, etc. It would also extract binary data such as images and embedded files.
This new Document class could then be processed further by the librevenge converters, since they would no longer need to care how the chunks were originally stored.
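To make the two stages concrete, here is a minimal sketch. Every type and function name in it (`FileNode`, `RevisionStore`, `Page`, `Document`, `buildDocument`, `firstPageTitle`, `kPageTitleNode`) is made up for illustration and is not taken from libone or from the specs; the node id used to mark a page title is likewise an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not libone's actual classes.

// Stage 1 result (MS-ONESTORE level): storage entities, no content semantics.
struct FileNode {
    std::uint32_t id;     // stand-in for a FileNodeID
    std::string payload;  // raw data carried by the node
};

struct RevisionStore {
    std::vector<FileNode> nodes;
};

// Stage 2 result (MS-ONE level): content entities, no storage details.
struct Page {
    std::string title;
};

struct Document {
    std::vector<Page> pages;
};

// Arbitrary id that marks a page title in this toy example.
constexpr std::uint32_t kPageTitleNode = 1;

// Stage 2: build the content model from an already parsed store.
// Nodes with any other id are skipped.
Document buildDocument(const RevisionStore& store) {
    Document doc;
    for (const FileNode& node : store.nodes) {
        if (node.id == kPageTitleNode) {
            doc.pages.push_back(Page{node.payload});
        }
    }
    return doc;
}

// What a converter would call: it only sees the Document, never a FileNode.
std::string firstPageTitle(const Document& doc) {
    assert(!doc.pages.empty());  // precondition: at least one page
    return doc.pages.front().title;
}
```

The point of the split is visible in `firstPageTitle`: it compiles without knowing that `FileNode` or `RevisionStore` exist, at the cost of copying the payload into the `Document`.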
Why would this be a good idea? After all, it will result in more boilerplate and additional moving/copying of data in memory.
However, the lib could compartmentalize the RevisionStoreFile into a specific stream. This stream would inherit from a more general libone stream, which serves as the general input stream for the Document class and masks the revision store file structure.
In the long run, this would also mean we could write other input streams which fetch data from other sources, such as OneDrive, without needing to touch the revision store file parser again.
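The stream idea could look roughly like the sketch below. All names (`ContentSource`, `RevisionStoreSource`, `RemoteStubSource`, `readDocument`) are hypothetical and not part of libone, and the "remote" source is only a stub that returns one fixed title rather than talking to any real service.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not libone's actual class layout.

// General input interface that the Document layer reads from.
class ContentSource {
public:
    virtual ~ContentSource() = default;
    // Writes the next page title and returns true, or returns false
    // once the source is exhausted.
    virtual bool nextPageTitle(std::string& title) = 0;
};

// Source backed by the result of parsing a revision store file.
// The titles are passed in directly to keep the sketch short.
class RevisionStoreSource : public ContentSource {
public:
    explicit RevisionStoreSource(std::vector<std::string> titles)
        : m_titles(std::move(titles)) {}

    bool nextPageTitle(std::string& title) override {
        assert(m_pos <= m_titles.size());  // read position never overruns
        if (m_pos == m_titles.size()) return false;
        title = m_titles[m_pos++];
        return true;
    }

private:
    std::vector<std::string> m_titles;
    std::size_t m_pos = 0;
};

// Stand-in for a second adapter (e.g. a cloud source); yields one page.
class RemoteStubSource : public ContentSource {
public:
    explicit RemoteStubSource(std::string title) : m_title(std::move(title)) {}

    bool nextPageTitle(std::string& title) override {
        if (m_done) return false;
        title = m_title;
        m_done = true;
        return true;
    }

private:
    std::string m_title;
    bool m_done = false;
};

struct Document {
    std::vector<std::string> pageTitles;
};

// The Document layer depends only on the abstract interface.
Document readDocument(ContentSource& source) {
    Document doc;
    std::string title;
    while (source.nextPageTitle(title)) {
        doc.pageTitles.push_back(title);
    }
    return doc;
}
```

With this shape, adding another adapter means adding one more `ContentSource` subclass; `readDocument` and the revision store parser stay untouched.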
That said, I don't really know whether this is necessary, since there is a REST API for notebooks in the MS cloud. The other protocol, used by SharePoint, is likely out of scope for libone. So splitting up the parsing stages might be unnecessary if no other adapter is ever needed.
blu-base changed the title from "Thoughs on parsing stages RevisionStoreFile and OneDocument" to "Thoughts on parsing stages RevisionStoreFile and OneDocument" on Aug 13, 2020
I don't have a rigid opinion on this at the moment: on paper, it does sound technically better to compartmentalize parsing MS-ONESTORE and MS-ONE, although I wonder if we'll ever get there ourselves. One thing I'm thinking of is that we're not changing the public API of this library for the foreseeable future, so I guess that gives us more freedom to move things around in the library if we ever get to the point of wanting to do it in a "proper way".
On the other hand, we'll have to create higher-level classes for the parsed nodes anyway; I wonder if we can take advantage of that.