-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blacklist without subtree #19
base: develop
Are you sure you want to change the base?
Conversation
* Handle node injection after the all text in a text node * Apply blacklist to a node, not its subtree * Fix a couple of other bugs
Thanks @marcuswu Update: as @jccr reminded me, "marker" elements with empty content are also inserted, which is non-disruptive to CFI processing (whole subtree can be safely ignored/skipped). There is the potential edge-case of broken CSS selectors (in cases whereby the "sibling" relationship is used), but that is not a big concern due to the rare probability of such problem occurring, or having a significant impact on document rendering (same remarks applies to the last sentence at the bottom of this comment). Code references: That being said, I like the idea of inserted markup being truly "transparent", in the structural sense. Besides the bookmarks / annotations use-case, this could be useful (for example) when wrapping images/videos in an extra div to achieve some kind of rendering / interaction effect. More importantly, this would eliminate some existing caveats with the "overlay" technique: semi-opacity is a suboptimal way to create a background colour for the text selection, as it produces undesirable (uncontrollable) font colour. Furthermore, the overlay div(s) need to be visually relocated (CSS fixed positioning) for example when the font size changes, or when the viewport is resized. One possible caveat for "real" inserted markup is that this may break authored CSS selectors that have strict structural dependencies (i.e. it should work fine in most cases such as class names, but ">" would break). |
For the sake of completeness, I should also point out that in the context of EPUB3 Media Overlays (talking / readaloud books with synchronised text / audio) the "highlighter" implements experimental support for CFI character ranges (i.e. the SMIL does not need to reference the IDs of target HTML elements). This functionality relies on a library called "Rangy", which provides a utility that inserts necessary span element around a text selection, using a given CSS class name. This process temporarily produces CFI-breaking markup within the HTML document (i.e. for the duration of the current audio/text playback), so in theory the bookmark / annotation tool should not be invoked at that point in time. In practice, this is an extremely rare edge-case as there are currently no Media Overlays ebooks that use this feature, and there is not even a clear IDPF specification for it. My general suggestion would be to disable the bookmark / annotation commands during MO playback. In a future version of Readium, "audio bookmarking" would be a desirable feature anyway, and would require an alternative framework (to preserve not only the currently-playing SMIL fragment, but also the time offset within the associated audio clip ... as you can see, this would be irrespective of the currently-highlighted DOM element / character range). Code references: https://github.com/readium/readium-shared-js/tree/develop/lib/rangy |
At BiblioLabs we have been using a different approach to highlighting. We had issues with properly calculating positioning for overlays in certain conditions. They also do not provide the ability to perform text selections on the content while displayed (for instance, selecting text for creating an annotation / note while book text search results are highlighted). The tests fail due to the difference in how blacklisting works with the patch applied. I can look into updated tests if the approach at blacklisting is desired. |
Hi Marcus, I'm a web developer and I'm working on an e-Reader project. I saw your comments on the Readium/idpf forum on CFI Element blacklisting and wanted to comment. Unfortunately, I can't seem to get a Readium forum account to comment so I went here. We have been undertaking to implement a highlight feature and can infer from the public APIs on readium that the "preferred" approach is to use the API to add a highlight, which adds a div element (or group of div elements, one per line) outside of the document body and some zero length span elements at the start and end point of the selected text of the highlight. This approach works okay, but the lack of ability to get a reference to the colored div is causing us lots of technical challenges. Probably like you, we concluded that a better approach would be to wrap the document's selected text in a span and use css styles on the span to get color, hover treatment etc. Partway down that path we discovered that the insertion of non zero length span elements pushes all the subsequent cfi references in that same parent out even if the span is blacklisted. Like you pointed out, the content within the span does not get blacklisted so if mucks up our cfis. We are persisting highlights to the cloud, not by changing the document dom but by saving the cfi references and annotation properties and reapplying them on spine change events or when the book is reloaded (browser client). So, like you, the overlay positioning is pretty tricky to deal with when multiple columns are rendered and the browser window is resized. i should say, the readium code positions the highlights as expected, but dynamic events (for instance, sidebar popped out) which changes the pagination will throw them off position and we need to refresh their position. (it doesn't seem like the highlight refresh method is exposed to us for our particular view but it looks implemented.) There is also another problem with the overlay approach which is causing us grief. That is, a drag selection that drags over an existing highlight makes the browser select all text in the spine since the highlight div overlay is below the bottom of the body containing the content. This precludes us from having a user friendly UI for the situation when we select into another highlight. All this to get to my point which is that we like the "embed in dom" approach, and can deal with the highlight persistence readily. Plus we have reference to our own spans or divs that hold the highlighted text so we can manage the positioning, styling and events more easily. However, this is a fork in the road, if we deviate from the Readium approach and I would wonder if we would cause ourselves grief in the future. Also, we would need the "blacklist span plus content" capability that you describe otherwise we'd be constantly refreshing the cfi tree to make sure it didn't invalidate our persisted annotation references. Have you considered the impact of deviating, and what kind of feedback have you gotten from Readium regarding the pull request? thanks, |
We did end up deviating (hence the fact that we have the new blacklist code If you do end up doing any deviating work on Readium, I recommend that you What we have decided to do is to periodically issue pull requests back to I have successfully had other pull requests merged for smaller issues -- So far the feedback on this particular pull request has been pretty Thanks, On Wed, Nov 12, 2014 at 11:30 AM, dannyvenier [email protected]
|
The original implementation of highlighting inserted markers into the DOM tree that then later were used to calculate the coordinates for highlight divs. The markers were empty spans (no children) and the divs were outside of the document content so that blacklists could be effectively applied. Modifying spine DOM strikes me as an poor choice since much annotation functionality/highlighting functionality relies on CFIs references that are incorrect (if the DOM has been modified). Blacklisting is certainly a possible solution. Unfortunately we've seen massive performance degradation with blacklists when it comes to larger spine items. This is just traversing the DOM. An annotation implementation where whole DOM subtrees need to be reordered to apply CFIs implies an even bigger performance hit. Moving forward we should avoid DOM modifications, if possible. That will allow elimination of blacklisting which would improve performance and certainly simplify some things. |
I did notice some performance degradation using the original code until I changed the blacklisting. The original blacklist code iterates through a list of blacklisted classes, elements, and ids for each element that is being checked. The approach I chose is to instead build a single selector filter from the id and class blacklist items and apply those together using jquery which should fall to the browser's selection capabilities rather than a javascript loop. This is much faster and I have not seen nearly as large performance hits with this approach. In any case, I do agree with avoiding the necessity for blacklisting as much as possible, but that is a separate issue from having the ability to blacklist and being able to do so in a flexible way that allows content to be wrapped without being eliminated. |
Modifying the DOM seems to me to be a more pure and intuitive solution to modifying the view of a document versus an overlay. However, I have no background with the challenge of keeping the cfi tree "correct with blacklist exceptions". The fact that you say you've seen massive performance degradation when using lengthy blacklists is worrisome for sure. Still, there seem to be an continuing string of challenges in maintaining highlights and supporting elements for the overlay approach:
Something that might help item 1 is a public method to convert a cfi position into a document container pixel position. That might exist in rangy, not sure if it was there whether public. Item 2 is tricky and item 3 would need some testing to determine if performance degradation was substantial for a particular client machine. |
Well, CFIs specify a location within a DOM so modifying the DOM implies a subsequent effort to revise the tree so that CFIs resolve correctly. I find the whole concept challenging so my preference has been to try to think of spine DOM as constant. Not to say DOM modification approach won't work; it introduces another layer of complexity that I would prefer to avoid.
|
I definitely respect the depth of knowledge that went in to designing the annotation approach the way it was designed. We are reluctant to diverge from the intended approach as it would likely limit us in the future if we want to continue using the Readium framework. So if I were to stay true to the scope of this thread, the blacklist extension, I would say that we should first try to find means to do what we want within the current approach rather than diverging. Just my opinion. (That being said, the pull request doesn't take away our ability to use the framework per the original intent.) From the standpoint of our project, I think the biggest improvement could be made by allowing the client to reference the overlay divs. If the annotation id was placed in the element(s), that would give us a way to access the annotation div, style it, add listeners for what we needed, get the position for related controls, implement a button to hide highlights etc. We can access them now, but only using jquery class selectors for example $('.highlight') but that doesn't relate the div to the annotation. Maybe just a getHighlight(id) method that gives us access to the overlay elements would do - maybe return a highlight group object? That wouldn't solve my issue 2. We've been able to get around issue 2 (iframe text interaction obscured by overlay div) by toggling the z-index of the div during the drag gesture that we use for text selection. But that could get messy if we want to support more gestures, like double click of text. Regarding your statement that you've gotten rid of the span markers completely, I'd like to know more. Is that change in the dev branch or just planned at this point? Can you point me to the right place. thanks |
I understand the desire to move away from altering the spine DOM, and I can understand phasing that out as part of the Readium viewer project. However, blacklist use is pretty integrated into the epub cfi project. Do you intend to deprecate and remove that functionality from the API? If the intention is to leave it in the API but not utilize it in the viewer project, doesn't it make sense to provide the API feature in a way that makes it as useful as possible to anyone who might use the library? |
By the way, blacklisting is currently passed as function parameters in shared-js, instead of being centralised by some sort of "provider" interface. This could be made more elegant / maintainable. e.g. https://github.com/readium/readium-shared-js/blob/develop/lib/annotations_module.js#L583 |
I think it is very likely that will need to modify the DOM (and use a blacklist) for numerous purposes for our application. Enabling this capability need not turn Readium in to "Readium-Writeum" (...sorry) and diverge from it's original intent. For example, the findText() method in Rangy wraps search results in span markers (albeit it would likely be only temporary). The point is, that there could be many cases where modifying the main DOM and injecting elements into it makes more sense for the particular application than finding an overlay solution. |
@dannyvenier agree about your observation regarding Rangy's approach to "highlighting". Allow me to mention my above comment (starting with "For the sake of completeness") which points out the existing EPUB3 Media Overlays CFI highlighting method (audio synchronisation with arbitrary character ranges). |
Ah, yes, you were on point. That particular point was lost on me at the time of reading as I hadn't really looked at Rangy. When I ran into the injection on the rangy equivalent to search-text the light came on. As it turns out, the highlighting of search results is likely similar to your example because it is a short term modification so another edge case. However the key point is that there are numerous features that one might try to implement where injecting into the DOM is possible, which in my mind, justifies the use and enhancement of the blacklist feature. Your suggestion of implementing it as a provider makes a lot of sense. Dmitry mentioned in one post that the span markers for highlight have or are being completely removed. Has this happened yet and if so what branch is that in? And is there a known timeframe for the pull request which is the subject of this thread? |
I think the start/end markers still get inserted: https://github.com/readium/readium-shared-js/blob/master/lib/annotations_module.js#L565 https://github.com/readium/readium-shared-js/blob/master/lib/annotations_module.js#L976 |
There are some use cases where applyBlacklist does not work. applyBlacklist removes the entire subtree starting from the element that is blacklisted. If the injected element surrounds original content (for instance in the case of a span tag to highlight text in a spine item), any content surrounded by the span tag is no longer CFI addressable which would also lead to calculation of CFIs within the wrapped content to be incorrect.
I have altered applyBlacklist to instead only ignore those elements directly marked for the blacklist. Any children of the element not marked for blacklist are logically promoted to the spot of the blacklisted parent.
This means that all elements that should be blacklisted must be directly marked (subtrees are no longer automatically filtered out), but it also means one can wrap content in tags and it will not affect addressing those elements via CFI.
I think this is necessary for highlighting and notes functionality.
I had these committed in our own fork for some time and I noticed that those commits also have some other random fixes integrated in them. Sorry, but I didn't see an easy way to tease those out. If desired, I can squash these into one commit.
Since this departs from existing functionality, I fully expect some conversation surrounding these changes and I welcome it.