-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Editable: Treat value as HTML string, eliminate children source #2750
Comments
Related Slack discussion in |
We will still require some post processing, which is currently handled in the nodeToReact loop (cheap).
I'm not sure I understand this concern. Could you elaborate? This is a simple loop that looks what each DOM node is and makes an object out of it, so it's a cheap call.
In this case (which I like much more than passing HTML), we'd retain the same logic, but instead of nodeToReact, we'll have something like nodeToArrayStructure.
I like this idea in general. Would also be great for assigning keys so the block implementor does not need to concern themselves with that. |
Yep, we're on the same page now. This was discussed in the Slack conversation noted above, though not clearly referenced here.
It's a concern as the document becomes more complex. In a single TinyMCE instance containing thousands of nodes, for example, this loop occurring on each keystroke could become noticeable for slower machines. In our case, I don't think it's as much of a problem, though of course the
A main desire here is to eliminate |
If we end of keeping a local or global state, it would also be easier for Editable to make use of it, mostly together with focus state (assuming we'll also have a focus/selection state). |
I think it could still be reasonable (or at least technically possible) for Editable to convert the string to its own internal representation of state. The more pertinent question to me is from the perspective of a plugin author implementing a block, should I be exposed to this separate content shape? Is it at all useful outside representing the value of Editable? |
As a global state, I think so, yes. Things that come to mind are footnotes, (future) collaborative editing, maybe some more advanced things like spell checking and suggestions (not even unrealistic as @Yoast has been looking into that). |
And if we are doing this, the block implementor will be familiar with the shape? :) It we're still making it agnostic that is... |
leveraging |
Atm, we don't have a string value anywhere? As for hpq, if we have an array value for the rest, we could use this for the matchers/source as well, where the content is extracted form the arrays (maybe with some helper functions for easy use). With that, it would even all be unified under one structure. State (local and global), matchers/source and template agnostic framework. |
For now, we have react elements and no string value
So, with an array value, we'll still need helpers (matchers) to manipulate this data. Granted these helpers may have a simpler implementation, but on the other side, with a string value no need for serializing/parsing HTML,
template agnostic framework, means custom elements, so creating a new standard, I'm ok on relying on an existing approach familiar to a lot of developpers aka JSX or So, IMO, the only advantage of an array structure is having a simpler helpers/matchers implementation. Is this worth-it? I don't know.
|
I didn't say this at all, the matchers only came up later, and it's extra goodness of the array structure. See #2750 (comment) and #2750 (comment). |
|
Sorry, I meant |
I guess I should make a PR to make things clearer. @aduth did some interesting work here https://github.com/WordPress/gutenberg/pull/2463/files. |
what about Not saying the array structure is a bad choice, just that I don't see any major downside of using |
Why destroy a proper nested structure (DOM => Array) by mashing it it all together in HTML if it's useful internally in Editable, for footnotes, collaborative editing, perhaps doing stuff with editable like spell check/suggestions, having the same structure for matchers (no need for block implementors to understand two concepts)? If we do the agnostic framework approach (which is not my main argument here), it could be the same concept as well (https://github.com/WordPress/gutenberg/pull/2463/files#diff-861daa55c62f4f58dc9b3c0cca54f80bR31). |
I like the driven by use-cases approach, I think moving to a string is easy enough and will remove a lot of code. Let's see if the array structure solves the use-cases better than a simple string. |
Because the developer-facing implications (distinguishing between types of attribute values) should take precedence, particularly if it's still technically feasible to satisfy all internal (or the rare block-specific) requirements. Documenting what and why |
Coming in late to the party I can say that I don't understand the implications here. A string does seem a bit overkill and backwards but maybe we've had problems with trees? I can understand a confusion with a mixture of strings and rich types, but have we looked at making everything a rich type such as an array or tree? Lots of earlier work was explored to serve a tree-representation of HTML as a POJO. TinyMCE won't spit out this tree which is kind of a bummer. It all ties in with parser work and "how do we represent block data in the parse?" If everything is a tree it's easy. If everything is a string it's probably also easy. Here's what a tree doesn't get us: an answer to why the serialized format is an HTML/JSON hybrid. I do know that I've found great merit in representing text fundamentally as a plain-text string with attributes operating over ranges of the text. It has worked well with notifications (especially in multi-platform contexts) and it seems to work well in Draft-JS and other code editors. Nothing funny has to happen in order to move from a tree to a decorated view component. If we make it all strings it seems like that could provide a path to figure out better what our needs are and reduce the code in play until we have a straightforward conception for the trees. My personal hope would be that in the end we're working with data in the language of the editor (JavaScript and JavaScript objects) vs data in another language (and a complicated one at that) - HTML. |
@dmsnell That's not true though... it does spit out a tree. It's in the document, so it's a tree, just do I guess it's all fairly easy to adjust again, and I'm not going to argue against it further without any code. :) I agree with @youknowriad there. |
Cool @iseulde! I didn't know that. If we were to have a tree structure for How hard is it to feed TinyMCE a tree? How hard would it be to convert a block placeholder to something TinyMCE could represent? These kinds of structures are easily processed with stupid-simple recursive-descent parsers. For example, converting to a React tree or a DOM tree or building a string are all basically done with the same abstraction: walk the tree in a depth-first search. For some examples (lacking documentation or explanation of what they are doing) I have some code lying around from a prototype for the notifications panel at WordPress.com https://github.com/dmsnell/wrnc/blob/master/src/api-poller/block-parser.js https://github.com/dmsnell/wrnc/blob/master/src/blocks/block-tree-parser.jsx The commit which brought those two files into existence explains things in its comment. There's lots to ignore in that code as well though so please don't let that distract from the core idea. The original doodles for that code turned the tree back into a linear string and took a similar amount (but less) code as the |
If the content is clean (a valid structure, etc.), you can map the JSON tree to a DOM tree and attach it to the TinyMCE editor root node. At the moment we run it through
I'm not sure what you mean? Could you explain? |
This is just a test with editable state as array tree instead of React tree: https://github.com/WordPress/gutenberg/compare/try/editable-state-tree |
I find it hard to jump in without context, but here goes. In general I would prefer any kind of data structure that is more rich than a tree to represent anything. In the content analysis we are using in Yoast SEO we are currently moving a way from string processing and moving towards building a tree form the content our users write. If we can somehow use the tree already present in Gutenberg we can speed up our content analysis by a lot. If the concern is how easy it is to make a block, I think there are plenty of ways to make the API easy and still use a tree structure internally. It is a matter of high level APIs versus low level APIs. The low level APIs are exposed to trees and the higher level APIs could be exposed to strings. I would assume that the block API is a high level API. As for specific use-cases. We are trying to mark specific pieces of text in the editor to highlight places where the user can improve their text. In current WordPress/tinyMCE they are completely removed once the user focuses the editor, because we cannot guarantee we won't corrupt the data otherwise. We would really like to have real-time marking of the text. |
You may also be interested in #2541 which seeks to explore solutions to enabling block implementers to change their block markup without invalidating previously created blocks. |
Wow, that's mind blowing! But anyway, I can see a couple of cases where a complete server-side solution might be a better option. What's the current strategy for core blocks? They all get invalidated on subsequent changes, right? |
Depending on how severe the change is, and notably for markup changes it tends to be more disruptive but, generally, yes a saved core block may simply become invalid if its markup is changed version to version. |
After some unsuccessful iterations, I'm going to close this for reasons discussed in #5380. Certainly can be revisited, but I don't think this is a priority item for now, and the tree shape may prove useful in the long-term. |
Previously: #421
Related: #2463, #2418
The
children
source matcher was introduced as a means of extracting and representing the value of rich text in content. It was not expected that a developer should need to interact with it, thus we were comfortable with it being an "opaque" object shape: in the original implementation, an array of React elements. Primarily, this helped avoid the need fordangerouslySetInnerHTML
when implementing thesave
representation of a block, allowing the implementer to instead use the attribute value directly.Example:
There were a few downsides to this approach however:
onChange
callbacks.children
source increases knowledge necessary to become productive with blocks, as block implementers now need to understand difference between:text
,html
,children
,node
Proposal:
To address the original need to avoid
dangerouslySetInnerHTML
, we could instead create a separate component which abstracts this implementation detail. Example:In this case, the underlying implementation may be to merely assign
dangerouslySetInnerHTML
, but it is not surfaced to the block implementer, thereby avoiding some verbosity and worries around its application.This could then allow us to eliminate or alter the value of rich text to that of our choosing. In #2463 (more accurately the
update/children-value
branch), I tried an array structure.After further thought, if possible, it would be desirable to eliminate the separate
children
source altogether, if it is possible to represent the value in raw HTML (leveragehtml
source instead). This has the added benefit of requiring no post-processing in the Editable change handlers, instead merely passing the value ofthis.editor.getBody().innerHTML
(orthis.editor.getContent( { format: 'raw' } )
). Finally, as a string, it can be easily concatenated, but may be difficult to work with if further manipulation of the DOM structure is desirable.Open questions:
getContent
is to run a serializer over the body contents. We should consult with the Ephox team to determine what of this serialization behavior we might need, whether it can be avoided, or if there is a performance concern in frequently calling to the non-raw format content getter (particularly given that we can expect single TinyMCE instances not to be very large or complex documents)The text was updated successfully, but these errors were encountered: