Thoughts on parsing stages RevisionStoreFile and OneDocument #18

Open
blu-base opened this issue Aug 13, 2020 · 1 comment

Comments

@blu-base
Contributor

blu-base commented Aug 13, 2020

While studying the OneNote file spec further, I got the impression that it might be a good idea to separate the file entities described in MS-ONESTORE from the actual content described in MS-ONE.

Roughly, this would mean that the lib first parses the RevisionStoreFile, with all its Revisions, FileNodes, etc.

Once all Chunks are declared, the lib would instantiate a higher-level Document class that represents the MS-ONE spec: Sections, Pages, Textboxes, etc. It would also extract binary data such as images and embedded files.

The librevenge converters could then process this new Document class directly, since they would no longer need to care how the chunks were originally stored.
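To make the two stages concrete, here is a minimal sketch. Every name in it (`FileNode`, `RevisionStoreFile`, `Document`, `buildDocument`, the `kPageTitle`/`kTextbox` ids) is a hypothetical placeholder and not existing libone API, and the node ids are invented for illustration rather than taken from MS-ONESTORE. It only shows the shape of the idea: stage 1 yields storage-level structures, and stage 2 turns them into content-level ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stage 1 (MS-ONESTORE side): storage-level entities. Hypothetical types.
struct FileNode {
    std::uint32_t id;                // illustrative kind id, NOT a real FileNodeID
    std::vector<std::uint8_t> data;  // raw payload as stored in the file
};
struct Revision { std::vector<FileNode> nodes; };
struct RevisionStoreFile { std::vector<Revision> revisions; };

// Stage 2 (MS-ONE side): content-level entities. Hypothetical types.
struct Page {
    std::string title;
    std::vector<std::string> textboxes;
};
struct Section { std::vector<Page> pages; };
struct Document { std::vector<Section> sections; };

// Invented node kinds, used only by this sketch.
constexpr std::uint32_t kPageTitle = 1;
constexpr std::uint32_t kTextbox = 2;

// Helper that builds a node from text, to keep the example short.
FileNode makeNode(std::uint32_t id, const std::string& text) {
    return FileNode{id, std::vector<std::uint8_t>(text.begin(), text.end())};
}

// Builds the content model from the most recent revision only.
// An empty store yields an empty Document.
Document buildDocument(const RevisionStoreFile& store) {
    Document doc;
    if (store.revisions.empty()) return doc;

    Section section;
    for (const FileNode& node : store.revisions.back().nodes) {
        std::string text(node.data.begin(), node.data.end());
        if (node.id == kPageTitle) {
            section.pages.push_back(Page{text, {}});         // a title starts a new page
        } else if (node.id == kTextbox && !section.pages.empty()) {
            section.pages.back().textboxes.push_back(text);  // attach to the current page
        }
        // Unknown kinds, and textboxes that precede any page, are skipped.
    }
    doc.sections.push_back(section);
    return doc;
}

// Usage: a converter would only ever see the Document, never the FileNodes.
void twoStageExample() {
    RevisionStoreFile store;
    store.revisions.push_back(Revision{{makeNode(kPageTitle, "Notes"),
                                        makeNode(kTextbox, "first box")}});
    Document doc = buildDocument(store);
    assert(doc.sections.size() == 1);
    assert(doc.sections[0].pages[0].title == "Notes");
}
```

The cost mentioned in this thread is visible here too: each payload is copied once more when the content model is built.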

Why would this be a good idea? After all, it will result in more boilerplate and additional moving/copying of data in memory.
However, the lib could compartmentalize the RevisionStoreFile handling into a specific stream. This stream would inherit from a more general libone stream, which serves as the general input stream for the Document class and masks the revision store file structure.
In the long run, this would also mean we could write other input streams that fetch data from other sources, such as OneDrive, without having to touch the revision store file parser again.
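The stream idea could look roughly like the sketch below. `ContentStream`, `RevisionStoreStream`, `RemoteStream` and `countBytes` are hypothetical names, not existing libone or librevenge classes, and the "remote" source is just an in-memory stand-in, since no real cloud API is modelled here. The only point is that the consumer depends on the general interface alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Chunk = std::vector<std::uint8_t>;

// Hypothetical general input interface that a Document class would read from.
class ContentStream {
public:
    virtual ~ContentStream() = default;
    // Returns the next chunk; an empty chunk signals the end of the stream
    // (a simplification made for this sketch).
    virtual Chunk nextChunk() = 0;
};

// Hides the revision store layout. Here the chunks are assumed to have been
// extracted already; a real version would walk the parsed file structure.
class RevisionStoreStream : public ContentStream {
public:
    explicit RevisionStoreStream(std::vector<Chunk> chunks)
        : m_chunks(std::move(chunks)) {}

    Chunk nextChunk() override {
        if (m_pos >= m_chunks.size()) return {};
        return m_chunks[m_pos++];
    }

private:
    std::vector<Chunk> m_chunks;
    std::size_t m_pos = 0;
};

// Stand-in for another source (e.g. data fetched from elsewhere). It delivers
// one buffer as a single chunk; no network access is modelled.
class RemoteStream : public ContentStream {
public:
    explicit RemoteStream(Chunk payload) : m_payload(std::move(payload)) {}

    Chunk nextChunk() override {
        Chunk out;
        out.swap(m_payload);  // first call returns the payload, later calls return empty
        return out;
    }

private:
    Chunk m_payload;
};

// A consumer written against the general interface only.
std::size_t countBytes(ContentStream& in) {
    std::size_t total = 0;
    for (Chunk c = in.nextChunk(); !c.empty(); c = in.nextChunk())
        total += c.size();
    return total;
}

// Usage: the same consumer works with either source.
void streamExample() {
    std::vector<Chunk> chunks = {{1, 2, 3}, {4, 5}};
    RevisionStoreStream fromFile(chunks);
    RemoteStream fromElsewhere(Chunk{9, 9});
    assert(countBytes(fromFile) == 5);
    assert(countBytes(fromElsewhere) == 2);
}
```

Adding a new source then means adding one subclass, while the revision store parser stays untouched.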

That said, I don't really know whether this is necessary, since there is already a REST API for notebooks in the MS cloud, and the other protocol, used by SharePoint, is likely out of scope for libone. So splitting up the parsing stages might be unnecessary if no other adapter is ever needed.

@blu-base changed the title from "Thoughs on parsing stages RevisionStoreFile and OneDocument" to "Thoughts on parsing stages RevisionStoreFile and OneDocument" on Aug 13, 2020
@tshikaboom
Owner

I don't have a rigid opinion on this at the moment. On paper, it does sound technically better to compartmentalize the parsing of MS-ONESTORE and MS-ONE, although I wonder if we'll ever get there ourselves. One thing I'm thinking of is that we're not changing the public API of this library for the foreseeable future, so I guess that gives us more freedom to move things around inside the library if we ever get to the point of wanting to do it the "proper way".

On the other hand, we'll have to create higher-level classes for the parsed nodes anyway, so I wonder if we can take advantage of that.


2 participants