Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue loading large notes (draft-js editor) #501

Closed
dmsnell opened this issue Jan 24, 2017 · 11 comments · Fixed by #2148
Closed

Performance issue loading large notes (draft-js editor) #501

dmsnell opened this issue Jan 24, 2017 · 11 comments · Fixed by #2148
Labels
bug Something isn't working [feature] editor Anything related to the editor.
Milestone

Comments

@dmsnell
Copy link
Member

dmsnell commented Jan 24, 2017

It was bound to happen at some point: the draft-js editor has introduced a performance hit when opening very large notes. I discovered this in the notes500 test account. The issue has more to do with the number of lines/paragraphs in the note than the total amount of content unless I am wrong.

Each line in the editor has its own overhead associated with it. It's actually slower to load the editor than the markdown preview 🙃. However I am optimistic that we can mitigate these issues if we approach the problem properly. It seems like this problem is inherent to draft-js for long contents and I doubt we will be able to fix it unless draft-js fixes it. I have offered our help on their issue for this problem.

The note in question has been retitled to "Slow load/update note" and took just under one second to lead the editor view on my machine.

CPU profile
screen shot 2017-01-23 at 9 21 20 pm
Note that in this profile the NoteContentEditor loads relatively faster than the rest of the editor view. Most of the time might actually be spent updating and reconciling props if I read the chart properly.

React Performance: Inclusive Time
screen shot 2017-01-23 at 9 08 48 pm

React Performance: Exclusive Time
screen shot 2017-01-23 at 9 09 46 pm

Since performance can be significantly impacted by the development environment I will update this issue after testing with NODE_ENV=production

Garbage Collector: Going crazy
screen shot 2017-01-23 at 11 21 42 pm

It takes about a second to load this note and over 800ms of that time is being spent in the small garbage collection pauses. Something with the allocations are going berserk. With over a thousand lines in the editor it's no surprise that this is happening.

@gubrist
Copy link

gubrist commented Feb 8, 2017

I do not want to intrude in the "developper's ktichen" of electron - but is there a possible link to the
performance issue (lots and large notes) on the Simplenote web version (Simperium) as reportet here:

Automattic/simplenote-macos#83

@dmsnell
Copy link
Member Author

dmsnell commented Feb 8, 2017

@gubrist I'm not sure of what kind of "developers kitchen" you think you have intruded upon - you are welcome here!

we don't have a public page like this for reporting issues with the web version at app.simplenote.com but we do pay attention to concerns when people bring them up.

thanks for the feedback!

@JM-Mendez
Copy link

JM-Mendez commented Aug 15, 2017

@dmsnell Are you still having this issue and have you checked it with the production React version yet?

I saw the referenced post on the Draftjs repo, and one thing to keep in mind is that Draftjs uses immutablejs internally. So it will definitely be slower than Prosemirror regardless. Although the numbers posted over there I believe are indicative of using the production version on their demo site. I loaded 160 pages worth of text and saw the same issue, but when I used the examples in their repo, load time significantly improved.

Also, each text block has it's own overhead. Including line breaks. So a note with more single lines will be slower than a note with large blocks of text.

@dmsnell
Copy link
Member Author

dmsnell commented Aug 15, 2017

@JM-Mendez as far as I'm aware there has been no movement on this issue in the draft-js repo. Also although the author of that issue is comparing to ProseMirror we aren't. Based on my observations above and notes in the linked issue I believe this is the result of an initial parse-and-generate process which thrashes the garbage collector. In that regard Immutable.js probably isn't at fault so much as something is iterating over operations when they potentially could be batched in one call/one allocation. Unfortunately I don't have time to dig into the internals of draft-js without someone guiding me in. This issue is very much a problem with the production build of React as well. To test I have been using the text from Pygmalion from Project Gutenberg.

As a workaround there is also an option to switch back to a basic text field in #568

@JM-Mendez
Copy link

@dmsnell I read that apparently FB has made it clear they're mostly interested in fixing issues that affect their own workflow, so I don't think you'll see much movement there.

I took a cursory glance through the code, and when pasting html they sanitize the html, then sanitize the text while creating multiple new RegEx objects per chunk of text, convert to an Immutable record, then batch render all the new blocks.

Since the pygmalion text is a play with 1126 separate pieces of text (I tested the length of the blocks array), I think the overhead of creating RegEx objects for each of those chunks if thrashing the garbage collector.

@dmsnell
Copy link
Member Author

dmsnell commented Aug 16, 2017

I think the overhead of creating RegEx objects for each of those chunks if thrashing the garbage collector.

@JM-Mendez that's interesting. I hadn't performed any analysis at the draft-js code level. Hrm… after taking a look, I'm not seeing anything absurd such as creating RegExps inside of any loops (unless it's in sanitizeDraftText() - I didn't look).

I wonder if we could replace createFromText() with something like this (from the "clean" portion of the draft-js code)…

// normalize all `\r\n` to `\n` elsewhere
const strings = text.split("\n");
const blocks = strings.map( block => new ContentBlock({
	key: generateRandomKey(),
	text: block,
	type: 'unstyled',
	characterList: List(Repeat(CharacterMetadata.EMPTY, block.length)),
});
return ContentState.createFromBlockArray(blocks);

if we did this, we could also potentially replace blocks only as they are changed from the Simperium side, not having to recreate the entire document…

@JM-Mendez
Copy link

JM-Mendez commented Aug 17, 2017

Ok, so I'm not familiar with this project's data structures (I stumbled upon this issue while trying to resolve a similar one for myself), but I ran some more tests and came up with these conclusions.

  1. The RegEx is not the problem. They're properly cached.
  2. You're right that most of the time is spent in React. Draftjs re-renders the entire editor with each change.
    • If you're just typing, then it looks like it just passes a new editor state object as an undo/redo point.
    • pressing backspace or loading the editor requires each block to render according to the react devtools highlight

I threw together a test using the production version of both Draftjs and React. I logged to the console the time it took to update the block, exclusive of any work React does, and was able to get it significantly down, but it looks like React adds an extra 60-80ms. Still upper bounds were 140ms to update under stress. You'd still be dealing with typing and deleting lag with something this size. I set up the test like so:

  • 4000 draftjs blocks (line breaks count as a block), with a total of over 120k words.

    • 100k words is the length of a long history non-fiction book, so I thought it'd be a decent stress test
    • the blocks of text are of varying length
    • I only updated the last two blocks for recording purposes.

The following code assumes

  • your block data structures contain at least a key and text prop. (I'd recommend using convertToRaw and using the resulting structure if possible.)
  • you are managing adding new blocks anywhere in the document
    • cutting and pasting reassigns the key
    • unfound keys are set to the end of the OrderedMap

I also came up with a solution to just update the text, but it's inelegant. I can post that if you want.

// current editorState since each change is a complete editor component (which is why immutablejs is useful)
const currentContent = this.state.editorState.getCurrentContent()

//  _**key must be a string and unique**_ 
const arrayOfPojos = [{key: '123', text: 'text of 1'}, {key: '456', text: 'text of 2'} ]

// entityMap must be provided
const newBlocks = convertFromRaw({ entityMap: {}, blocks: arrayOfPojos })

const newEditorState = newBlocks.blockMap.reduce((contentState, block) => {
    const newBlockMap = contentState.blockMap.set(block.key, block)
    return contentState.set('blockMap', newBlockMap)
}, currentContent)

this.setState({editorState: EditorState.push(this.state.editorState, newEditorState)})

Let me know if you find this useful.

Off topic, are you a maintainer of the simplenote-electron app? I'm curious as to how it only shows usage of around 35mb, when all other electron apps are hovering in the hundreds.

@dmsnell
Copy link
Member Author

dmsnell commented Aug 17, 2017

Thanks again for that great analysis @JM-Mendez

Draftjs re-renders the entire editor with each change.

While it's definitely slower than other editors on edit, the original provocation for this PR is on initialization of a new document. Even if it takes 200ms to re-render it may take a second or two to load initially. I think that you should be able to recreate this in your tests.

Once the blocks have loaded, re-renders become much cheaper and only those blocks which change need to be recalculated and altered. This is actually a pretty cool aspect about the draft-js architecture and allowed for fairly performant syntax highlighting/formatting in #594. Unfortunately that's hardly worth it if large documents don't load quickly.

I also came up with a solution to just update the text, but it's inelegant. I can post that if you want.

I'm not sure I follow what you are proposing here, could you clarify what your code is accomplishing?


I'm a big fan of draft-js but I'm also trying to be practical with its limits. If we could get the performance up I'd love to be able to keep it around because I have many ideas for enhancing the editing experience that are much easier to accomplish with the draft-js model than with textarea fields or other editor components. it's mainly the block-oriented functional design which lets us do incremental updates and changes that's to boast about.


are you a maintainer of the simplenote-electron app? I'm curious as to how it only shows usage of around 35mb, when all other electron apps are hovering in the hundreds.

35MB? that surprises me! when I boot my test account in macOS it appears to be using around 130 MB. Electron apps will all be inefficient at a base level but don't have to grow too much beyond that. Visual Studio Code is an excellent example of building a performant Electron-based application. We haven't optimized our code at all so if it's lean it's by accident.

Simplenote is pretty simple. We don't have many images or animations or files or media to take up memory and processing time. It should be a leaner application because it doesn't do much.

@JM-Mendez
Copy link

JM-Mendez commented Aug 18, 2017

I'm not sure I follow what you are proposing here, could you clarify what your code is accomplishing?

Previously you said:

if we did this, we could also potentially replace blocks only as they are changed from the Simperium side, not having to recreate the entire document…

The code I posted just does that. Updates individual blocks with changes and creates a new editorState object. I did notice the initial slow load, but thought you were recreating the entire doc from a raw state on changes (so 1-2 seconds to update on each change). Re-renders are much cheaper, but the entire editor is passed through on each change, so React still has to diff each child (thousands in your example). Anyways, that's the reason I shared the code.

I haven't found a way around the slow load times. Honestly I think it has more to do with the combination of React and Draftjs. Atom editor had issues with React powering their editor as well, and decided to move to a more manual solution as can be seen here atom/atom#5624. So using the combination may very well not be ideal for your problem domain.

Although since you're using redux and it can store an immutable object - if you store the note as a complete contentState object (using convertToRaw), then you'd be able to update individual blocks even while it's in the store. You could run it in an invisible browserWindow render process to offload the work as well.

But that's just shooting from the hip. I've no idea how performant or viable a solution that would be.

And yeah, 35mb no lie. There's only one instance running that I could find. Not one helper or anything. So weird. lol

@dmsnell
Copy link
Member Author

dmsnell commented Aug 18, 2017

The code I posted just does that.

ah. in our case I think it's simpler because what I meant is that Simperium gives us the queues to where changes occur and we can run them through the same draft-js update function as direct input does. that is, we can use things like Modifier.replaceText() or Modifier.insertText() etc… no need to manually interact with the block structure.

Atom editor had issues with React powering their editor as well, and decided to move to a more manual solution as can be seen here atom/atom#5624.

This is the reason I created #568 and want to see if we can get Monaco to do what we want. I've also considered a custom solution but think that's probably too big of a task for our needs, and Monaco is essentially that custom solution done for us with a few extra constraints. What Monaco doesn't seem as easy for is decoration and variable-height lines, but I may just be inexperienced with it to know of easier ways.


How can I be a help to you with this? are you wanting to explore some ideas and post PRs for them? I'd be happy to review or answer any questions you might raise. Really appreciate your energy on this as I want Simplenote to be able to handle those large use cases, such as when writing a book.

@JM-Mendez
Copy link

At the moment I'm exploring possibilities for writing a collaborative editor using Draftjs. Since I was reading the codebase and running tests anyways, I thought I'd share whatever I found in case you'd find it useful.

I'm self taught (mostly by following the open source community), and have never worked on a project with others, so I don't have much of an idea on where to begin to make pull requests to be honest. But my editor is the only software I'm working on during my free time, and I figure you probably have a lot more on your plate than I do. So I'm more than happy to post my findings over here.


ah. in our case I think it's simpler because what I meant is that Simperium gives us the queues to where changes occur...

I see. So your logic flow updates the material using Simperium, and then syncs it into the app
using Draftjs, is that correct?

What Monaco doesn't seem as easy for is decoration and variable-height lines, but I may just be inexperienced with it to know of easier ways.

For my use cases, I'd need variable-height lines, so Monaco won't work unless I could figure that out. But it's something I'll look into. Although tbh, I really like the Draftjs internal data structure. I'm contemplating forking it and modifying the base, especially looking into the load times to see if I can bring it down to at least a second for large docs. I personally don't think FB is interested in figuring out this use case because it does not impact their core business. It'll take me some time to get more familiar with the code tho. They micromanage the entire contenteditable state, so my main concern is introducing unknown regressions since I'm not familiar with it.

@bummytime bummytime added the [feature] editor Anything related to the editor. label May 21, 2019
@bummytime bummytime added this to the Future milestone May 21, 2019
dmsnell added a commit that referenced this issue Sep 21, 2020
co-authored by @belcherj and @codebykat with design work by @SylvesterWilmott 

## Description

We've been unable to resolve the worst sync bugs in the application and feature development has been artificially limited by the fact that we have to fight a number of data-flow issues tightly coupling different levels of the application's architecture together.

In this branch, we're ripping apart the entire app state and Simperium data flow to rebuild it in order to remove a number of those couplings and races. This commit changes most of the app and was rolled out in staging to test the changes.

We're also replacing `draft-js` with `Monaco` which is almost as big of a change conceptually as changing the state. Moving to `Monaco` allows us to remove a copy of the note data from the app and allows us to maintain a fully-synchronous update cycle, eliminating a race condition between the Simperium copy of the data, the app copy, and the contents in the text buffer.

## Major code changes

 - The interface between `node-simperium` and this app has been moved into Redux state and out of `indexedDB`. The `indexedDB` interface as asynchronous and led to sync issues under a variety of race conditions with the network data, remote updates, browser tab scheduling, whether the browser was focused or hidden behind other windows, and more. The new synchronous interface guarantees that updates occur when we expect them to occur and therefore will be updated by the time we continue processing changes and updates from the server.
 - `Monaco`'s synchronous plaintext interface allows us to extend the atomic "all updates occur together and instantaneously" paradigm to the text buffer. By sharing the note data in the text editor, in the app, and in `node-simperium`, we will guarantee that we won't accidentally apply new edits to old data.
 - In many cases we have been dispatching multiple actions in order to perform one real action. For example, edit a note and then clear out the search state. Because these presented intermediate states for the app, partial updates, they have been removed. Now we have created new Redux actions for these real actions and app state has been adjusted so that each kind of data has its own reducer and those reducers listen to all the actions which could affect them.  Now we will see a single dispatch that multiple reducers listen to instead of dispatching one action type for each reducer.
 - The Simperium connection has been moved into a new centralized middleware. All Simperium interactions take place in this middleware and are no longer woven into the app. This allows the app to update so that there's only ever one copy of a note's data and it's always up to date with the text editor. This resolves longstanding issues with the note list showing expired data. It also allows us to model the Simperium connection as a reactive system that responds to changes in the app and injects new events from the server in a way that's independent from local operation. This has dramatically simplified many different subsystems.
 - The persistence layer has now been created as a separate subsystem from before, when it was integrated with the Simperium code. It now operates as a kind of background worker that persists the Redux state into `indexedDB` and stores the entire contents atomically. Previously, integrated into the note bucket, the persistence layer would update each note, ghost, and bucket value separately leading to mismatches between a note and the base version it was built from, leading to sync issues.
 - So-called "prop drilling" has been replaced by connecting components in the React tree to state. This was done to simplify the interfaces around the app. It was previously difficult to understand what exactly was executing with the `onAction` props because there wasn't a clear way to walk up the component tree in an editor without resorting to text searching. Any `onEdit` type actions have been replaced with `dispatchProps` that directly dispatch the intended actions.

## Related or fixed issues

### Defects
Resolves #2171 (note display doesn't update height immediately upon change from menu bar)
Resolves #2074 (floating IME on Korean input)
Resolves #2014 (when opened note changes so it's not in the search, it still stays open)
Resolves #1953 (when renaming tag, update in search bar if tag opened)
Resolves #1942 (can't easily select start of note content)
Resolves #1887 (renaming a tag removes it from a note)

#### Cursor position and movement
Resolves #2085 (cursor jumps to new note)
Resolves #2035 (cursor hidden at bottom of note)
Resolves #1595 (restore cursor position when flipping between edit/preview mode)
Resolves #1477 (cursor jumping in Windows)

#### Synchronization
Resolves #1938 (not restoring tags when reverting to an earlier revision)
Resolves #1641 (updates to a tag in one session don't update in other sessions until they reload)
Resolves #1640 (infinite duplication of tag name when editing tag)
Resolves #1562 (content not syncing)
Resolves #1520 (only syncing one note at a time/per session)
Resolves #1291 (quirky unsynced changes info)
Resolves #502 (data loss when editing simultaneous with iOS)
Resolves #459 (show note sync indicator)

#### Force-sync
#1897
#800

#### Ghost-Writing
Resolves #2030
Resolves #1787

### Enhancements
Resolves #2162 (mostly - additional syncing metadata for notes)
Resolves #1836 (remove `app-state`)
Resolves #1816 (confusing revision history ordering)
#1537 (add indicator to show syncing status)
Resolves #2036 (logging-out in one sessions logs out all sessions)
Resolves #1410 (logout is buried in app settings and hard to find)

#### Private Mode
#1924 - functionality in Firefox private mode other than offline persistence

### Performance
Resolves #2172 (slow app)
Resolves #501 (slow loading large notes)

### Deprecations
Resolves #2117 (audit use of draftJS)
Resolves #1762 (decorator performance in draftJS)
Resolves #1026 (use of `token` in `localStorage` in `boot.js`)

### Possibly - check these
#1698 (problems with Korean input and lists at start of note)
#1619 (buggy Japanese IME conversion)
#1572 (RTL languages - checked tasks moving to the right)
#1511 (missing characters on Korean IME conversion)
#1456 (slow, possibly 4K-resolution-related)
@codebykat codebykat modified the milestones: Future, 2.0.0 Dec 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working [feature] editor Anything related to the editor.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants