Thoughts on parsing stages RevisionStoreFile and OneDocument #18

blu-base · 2020-08-13T23:01:31Z

While studying the OneNote file spec further, I have the impression that it might be a good idea to separate the file entities described in MS-ONESTORE from the actual content described by MS-ONE.

Roughly, this would mean that the lib would first parse the RevisionStoreFile, with all its Revisions, FileNodes, etc.

Once all Chunks are declared, the lib would instantiate a higher-level Document class which represents the MS-ONE spec (Sections, Pages, Textboxes, etc.). It would also extract binary data such as images and embedded files.

This new Document class could then be processed further by the librevenge converters, since they would no longer need to care how chunks were originally stored.
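To make the two stages a bit more concrete, here is a minimal sketch of the second stage. Everything in it is hypothetical and not taken from libone or the specs: `StoredObject` stands in for whatever the RevisionStoreFile parser would emit, and `Document`, `Section`, `Page` and `buildDocument` are illustrative names. It also assumes, as a simplification, that objects arrive in document order rather than being linked by references as in the real format.

```cpp
#include <string>
#include <vector>

// Hypothetical stage 1 output: a flat list of already-resolved objects.
// Real MS-ONESTORE objects are far richer; this is only a stand-in.
struct StoredObject {
  enum class Kind { Section, Page, Text };
  Kind kind;
  std::string payload;  // name, title or text, depending on kind
};

// Hypothetical stage 2 model, loosely following MS-ONE concepts.
struct Page {
  std::string title;
  std::vector<std::string> textBoxes;
};
struct Section {
  std::string name;
  std::vector<Page> pages;
};
struct Document {
  std::vector<Section> sections;
};

// Builds the content model from the flat list. Assumes objects are given
// in document order: a Section opens a new section, a Page opens a new
// page in the current section, Text is appended to the current page.
Document buildDocument(const std::vector<StoredObject>& objects) {
  Document doc;
  for (const auto& obj : objects) {
    switch (obj.kind) {
    case StoredObject::Kind::Section:
      doc.sections.push_back(Section{obj.payload, {}});
      break;
    case StoredObject::Kind::Page:
      if (doc.sections.empty()) doc.sections.push_back(Section{});
      doc.sections.back().pages.push_back(Page{obj.payload, {}});
      break;
    case StoredObject::Kind::Text:
      // Create unnamed containers if text appears before any page.
      if (doc.sections.empty()) doc.sections.push_back(Section{});
      if (doc.sections.back().pages.empty())
        doc.sections.back().pages.push_back(Page{});
      doc.sections.back().pages.back().textBoxes.push_back(obj.payload);
      break;
    }
  }
  return doc;
}
```

A converter would then only walk `Document` and never see how the objects were laid out in the file.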

Why would this be a good idea? It will result in more boilerplate and additional moving/copying of data in memory.
However, the lib could compartmentalize the RevisionStoreFile into a specific stream. This stream would inherit from a more general libone stream, which serves as the general input stream for the Document class and masks the revision store file structure.
In the long run, this would also mean we could write other input streams that fetch data from other sources, such as OneDrive, without needing to touch the revision store file parser again.
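A rough sketch of what such a stream hierarchy might look like. All names here (`ContentStream`, `RevisionStoreStream`, `readAll`) are made up for illustration and are not existing libone or librevenge API. It also assumes the revision store parser has already resolved the chunks into one contiguous buffer, which is exactly the extra copying mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical general input interface the Document class would read from.
class ContentStream {
public:
  virtual ~ContentStream() = default;
  // Copies up to `count` bytes into `out`; returns the number copied.
  virtual std::size_t read(std::uint8_t* out, std::size_t count) = 0;
  virtual bool atEnd() const = 0;
};

// Hypothetical adapter for data coming out of a RevisionStoreFile.
// Assumes the chunks were already resolved into a contiguous buffer.
class RevisionStoreStream : public ContentStream {
public:
  explicit RevisionStoreStream(std::vector<std::uint8_t> resolved)
    : data_(std::move(resolved)) {}

  std::size_t read(std::uint8_t* out, std::size_t count) override {
    assert(pos_ <= data_.size());
    const std::size_t n = std::min(count, data_.size() - pos_);
    std::copy_n(data_.data() + pos_, n, out);
    pos_ += n;
    return n;
  }

  bool atEnd() const override { return pos_ == data_.size(); }

private:
  std::vector<std::uint8_t> data_;
  std::size_t pos_ = 0;
};

// Document-side consumer: it only knows ContentStream, so another source
// (e.g. a cloud-backed stream) could be swapped in without changes here.
std::vector<std::uint8_t> readAll(ContentStream& in) {
  std::vector<std::uint8_t> result;
  std::uint8_t buf[4];
  while (!in.atEnd()) {
    const std::size_t n = in.read(buf, sizeof buf);
    result.insert(result.end(), buf, buf + n);
  }
  return result;
}
```

The point of the virtual base is only that the Document side depends on `ContentStream`; whether the extra indirection is worth it is the open question.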

That said, I'm not sure whether this is really necessary, since there is a REST API for notebooks in the MS cloud. The other protocol, used by SharePoint, is likely out of scope for libone. So splitting up the parsing stages might be unnecessary if no other adapter is ever needed.

tshikaboom · 2020-08-25T21:54:54Z

I don't have a rigid opinion on this at the moment: on paper, it does sound technically better to compartmentalize parsing MS-ONESTORE and MS-ONE, although I wonder if we'll get there one day ourselves. One thing I'm thinking of is that we're not changing the public API of this library for the foreseeable future, so I guess that gives us more freedom to move things around in the library if we ever get to the point of wanting to do it in a "proper way".

On the other hand, we'll have to create higher-level classes for the parsed nodes anyway; I wonder if we can take advantage of that.

blu-base changed the title ~~Thoughs on parsing stages RevisionStoreFile and OneDocument~~ Thoughts on parsing stages RevisionStoreFile and OneDocument Aug 13, 2020
