Refactor document delta handling #1745
Refactor delta handling code to:
- Combine the "insertText" and "insertLines" delta types into a single "insert" delta type
- Combine the "removeText" and "removeLines" delta types into a single "remove" delta type
- Make all document mutations in a single applyDelta function.
- Add basic delta validation (more needed; see TODOs)
- Rework anchor logic to handle new delta types (also simplified)
- Rename "insert()" to "insertText()" and "remove()" to "removeText()"
- Rename "insertLines()" to "insertFullLines()" and "removeLines()" to "removeFullLines()"

See related issue for more information. All tests are passing and the changes appear functional under preliminary testing, but careful review and testing will be necessary.
if (range.start.row == range.end.row && delta.action != "insertLines" && delta.action != "removeLines")
    lastRow = range.end.row;
else
    lastRow = Infinity;
Removing this breaks screen updating after removing a line.
Yep, sure enough. I'll update this to:
var range = e.data.range;
var lastRow = (range.start.row == range.end.row ? range.end.row : Infinity);
this.renderer.updateLines(range.start.row, lastRow);
I hadn't realized that multi-line actions require regenerating everything below, but it makes sense that they do.
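A minimal sketch of that rule, pulled out into a hypothetical helper (`lastRowToRedraw` is not an Ace function; only `renderer.updateLines` appears in this thread). A change confined to one row only touches that row, whereas a multi-row change shifts every row below it, so the redraw has to run to the end:

```javascript
// Hypothetical helper illustrating the rule above; not part of Ace.
function lastRowToRedraw(range) {
    // A change confined to one row cannot shift the rows beneath it.
    if (range.start.row === range.end.row)
        return range.end.row;
    // A multi-row change moves every following row, so redraw to the end.
    return Infinity;
}

var singleRow = { start: { row: 3, column: 0 }, end: { row: 3, column: 5 } };
var multiRow = { start: { row: 3, column: 0 }, end: { row: 7, column: 0 } };
```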
This is great! Also this still needs some work:
this._emit("change", { data: delta });

if (row < this.getLength() - 1 && row >= 0)
{
Please use style matching the rest of code in ace
if (...) {
}
Hi @nightwing, thanks for the feedback. I'm working on improvements based on your suggestions above. I'm not sure I understand your final point, however. Which of the following are you suggesting?
I've emailed the CLA agreement.
This seeks to keep the public API intact while improving method names within Ace by keeping the old methods as wrappers around the new, better-named methods. For example, document.insert() now simply calls document.insertText() and warns the caller via a console.log() that they are using a deprecated method. I've also updated the coding style of my changes (where I noticed discrepancies) to match the rest of Ace.
Both were introduced in 2e6f127.
console.warn makes better sense than console.log and matches similar warnings in ace (see gutter.js for example).
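For illustration, the deprecation-wrapper pattern described above might look roughly like this. `SketchDocument` and its single-string storage are stand-ins invented for the example; only the method names `insert()` and `insertText()` and the use of `console.warn` come from the discussion.

```javascript
// Stand-in for the real Document class; stores text as one string for brevity.
function SketchDocument(text) {
    this.text = text;
}

// New, better-named method (simplified: offset-based, single string).
SketchDocument.prototype.insertText = function(offset, text) {
    this.text = this.text.slice(0, offset) + text + this.text.slice(offset);
    return offset + text.length;
};

// Old name kept as a thin wrapper so existing callers keep working.
SketchDocument.prototype.insert = function(offset, text) {
    console.warn("insert() is deprecated; use insertText() instead.");
    return this.insertText(offset, text);
};
```

Because the wrapper delegates rather than duplicating logic, the old and new names cannot drift apart.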
Not quite, since insertInLine doesn't have to do an additional split. Without that, it might be impossible to make indenting or commenting long text fast again. But let's leave this for later.
Yes, that's why I am saying it would be good to find a way to do that and still keep the API easy to use.
Thanks!
2e6f127 slowed down the application of deltas that only affect a single line. The slowdown, though trivial for a single line, is significant for operations that separately modify thousands of rows (such as indenting a large document). This commit speeds up single-line deltas by avoiding unnecessary calls to splitLine() and joinLineWithNext().
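A rough sketch of the kind of fast path this commit describes, assuming a delta carries its content as `lines` and its position as `start` (both field names, and the function name, are assumptions here). When the delta holds exactly one line, the affected row can be rewritten directly, with no split or join:

```javascript
// Sketch only: docLines is a plain array of strings, delta = { start, lines }.
function applySingleLineInsert(docLines, delta) {
    if (delta.lines.length !== 1)
        throw new Error("fast path handles single-line deltas only");
    var row = delta.start.row;
    var column = delta.start.column;
    var line = docLines[row];
    // Rewrite just this row; no other rows are touched or re-joined.
    docLines[row] = line.substring(0, column) + delta.lines[0] + line.substring(column);
}
```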
Good point. I was concerned about that initially but (testing in Chrome) I didn't see any appreciable slowdown. Most of the slowdown was from inefficiencies in
That's fair. I like using
Every function can be misused; there's nothing we can do to prevent it. But let's leave it out for now, since there are still other ways to improve performance.
one could use
I must be biased too. But cutting out the range object and an array for most deltas saves lots of memory, which is worth some added complexity.
Do you use Ace deltas in the OT implementation or convert them to something else in the change handler?
this.applyDelta = function(delta) {

function splitLine(lines, point)
It would be good to move these helper functions to the outer scope, since that is faster.
This makes it possible to break out helper functions without exposing them to the rest of the document class. Also, long term, we may want to have a stand-alone test suite for applyDelta, so it makes sense in its own file. All other changes involve syntax corrections (some syntax issues were mine, others pre-existed) to make the documentation compilation work.
Since .apply() can't handle more than 65535 parameters, splice.apply() is brittle. It's also hard to read. This replaces splice.apply() calls throughout ace code with lang.spliceIntoArray().
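One way to avoid the argument-count limit is to splice in bounded chunks. The sketch below is not the actual `lang.spliceIntoArray()` from this PR (its signature is not shown here); `spliceInChunks` and the default chunk size are assumptions chosen for illustration.

```javascript
// Hypothetical helper: insert `items` into `target` at `index` without ever
// passing more than `chunkSize` items to a single apply() call.
function spliceInChunks(target, index, items, chunkSize) {
    chunkSize = chunkSize || 20000; // assumed safe margin below 0xFFFF
    for (var i = 0; i < items.length; i += chunkSize) {
        var chunk = items.slice(i, i + chunkSize);
        // splice(start, deleteCount, ...chunk): insert this chunk in place.
        target.splice.apply(target, [index + i, 0].concat(chunk));
    }
    return target;
}
```

Each chunk lands directly after the previous one, so the final order matches a single large splice.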
Yes, if the difference is appreciable. In terms of speed, the difference in delta-creation time between an array and a string isn't significant from what I can tell. I'm seeing a difference of about 500 milliseconds across 10 million delta objects in Chrome and FF: http://jsfiddle.net/y24yq/2/. I'm not certain about memory usage.
True, although if you do that very much you'll blow away any performance benefit gained.
I convert them, but they closely resemble the original ace delta in structure. If we use My preference is for using
That's the JIT compiler being smart and creating one array for all deltas; when it can't cheat, the array version takes 2-3 times more: http://jsfiddle.net/y24yq/4/. But yeah, I wasn't talking about performance, but about memory usage of deltas stored in undoManager. If we go with lines, we'll have to modify them for storing in the undoManager.
Code that cares about performance has to special-case one-line deltas anyway, and if it doesn't, we can't help it.
Since this changes the API, we need to release the current version first, likely next week. Also we need to make this as fast as the current implementation. Now for one-line deltas this is only 1.5 times slower, but multiline deltas are very slow, due to line-by-line splicing. It would be good to add As I said, I don't think renaming
OK, please review the To Do list below to make sure we're completely on the same page. This corresponds basically to your suggestions above, but diverges slightly with regards to Note that I think the changes in the Deferred group are good changes. I'm going to be personally crimped for time over the next six weeks, however, so I'd like to focus on getting the core changes thoroughly vetted. The deferred changes can be subsequently handled. To Do:
I propose deferring:
Your input on items 3, 5, and 6 would be particularly appreciated.
I didn't mean that
Try pasting large text at the start of a large file. With line by line splicing it takes so for
and I am getting around 16 MB (20.234 instead of 4.29959); on 64-bit Chrome it will be twice that. So, unless I am measuring something wrong, undo history has to use text. For For
To summarize: To Do:
Note: I'm aware of one bug that Question: Once the above is done, will you merge this as-is or will you need me to submit a new pull request with all these commits squashed into one?
Yes, i agree about the todo. (also would be nice if you could remove all uses of There is no need to create a new pull request, and you can when this is ready i'll merge this it into
Why? I actually want to modify
Ha! Old coding practices die hard. I thought I had brought all my
That sounds good. Any idea when v1.2 will be released?
This reverts commit 8624ab8.
Matches previous naming convention.
Avoids an extra $split call.
Also brings back the functionality where large deltas are split into smaller deltas so that .splice.apply() calls will work.
This uncovered the fact that until now delta.range had not always been a Range object. This inconsistency has been resolved by my changes in mirror.js.
Set it to true in insertInLine/removeInLine. Also sped up indent/dedent by using insertInLine and removeInLine.
Stores single-line delta content as .text instead of .lines in undo history. This is done without modifying the original delta object in case the caller still retains a handle to the original.
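A sketch of the idea, using the `.lines` and `.text` field names from the commit message and a hypothetical function name. The delta is shallow-copied before being compacted, so a caller holding the original still sees `.lines`:

```javascript
// Hypothetical: returns a compact copy for undo storage; never mutates `delta`.
function compactDeltaForUndo(delta) {
    var copy = {};
    for (var key in delta) {
        if (Object.prototype.hasOwnProperty.call(delta, key))
            copy[key] = delta[key];
    }
    if (copy.lines && copy.lines.length === 1) {
        // Single-line content: keep the string and drop the one-element array.
        copy.text = copy.lines[0];
        delete copy.lines;
    }
    return copy;
}
```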
- Fix unconventional '{' formatting
- Reformat `UndoManager` changes
- Revert change from `insertInLine` to `insert` in text.js
This should be faster since we don't have to re-initialize the helper functions each time Anchor.onChange is fired.
OK, all of the above TODOs are done.
Any ETA on when this can get merged into the 1.2 branch? As soon as it is, I'll check out the branch and start using / testing it in my project.
Refactor document delta handling
Sorry for the delay. I've created the v-1.2 branch and expect to merge it into master in late February. My todo for merging this is:
// 2. fn.apply() doesn't work for a large number of params. The smallest threshold is on Safari: 0xFFFF.
//
// To Do: Ideally we'd be consistent and also split 'delete' deltas. We don't do this now, because delete
// delta handling is too slow. If we make delete delta handling faster we can split all large deltas
Why is delete delta handling too slow?
On very long documents (like 1,000,000 lines), deleting the entire document all in one delta is fast but deleting it in multiple deltas is slow. I'm not sure why. I'm guessing it's related to inefficiencies in the rendering layer, but IDK.
Hi @nightwing, no problem. Thanks for staying on top of this. See responses below:
Just to clarify: you're planning on making these changes? I'm happy to help as you find issues, but I won't be able to give this much serious time in the near future.
Makes sense. This will allow us to remove my fix in mirror.js added in ef0e8da. See my comment above, however, for why relying on window.message still seems like a bad idea.
I'm confused.
I personally prefer the former ( |
Hi
I don't like the word Btw, validateDelta helped to find a bug in setValue too, so thanks for keeping it!
Nice work! Looking forward to seeing this released. I'm glad
Yeah, "merged" isn't great. The idea was to communicate that the first and last lines are merged with the existing document (e.g.
Closing this in favor of #1819. Also note the link in the changelog there.
This required a refactor of all code that listens to change events, since the API has changed. See ajaxorg/ace#1745 for more details.
The changes proposed below seek to simplify Ace's handling of document mutations in the interest of bug-free code and easy integration. I hope that the benefits of the recommended changes will be clear, but I'd be glad to elaborate as necessary.
Feedback and review of these changes from a core contributor would be appreciated. Having extensive experience with web-based editing, I realize the risk of destabilization inherent in touching the core document code. In this case, I think it's worth the trade-off, but the changes will need careful scrutiny.
Refactor delta types
- Combine the insertText and insertLines delta types into a single insert delta type
- Combine the removeText and removeLines delta types into a single remove delta type

The new delta types will store inserted or deleted text as an array of lines (split on the detected newline character). This is efficient and has no presuppositions regarding newlines. Pressing ENTER in an empty document, therefore, will create the following delta:
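The lines representation can be illustrated with a small sketch. The helper name and the choice of `\n` as the detected newline are assumptions made for the example, not the PR's code; the point is only that a lone newline splits into two empty lines.

```javascript
// Hypothetical helper: split inserted text on the document's newline sequence.
function textToLines(text, newLine) {
    return text.split(newLine);
}

// Pressing ENTER inserts a single newline, which becomes two empty lines:
// the (empty) end of the current line and the (empty) start of the next.
var enterLines = textToLines("\n", "\n");
```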
Handle all document changes in a single spot
Basically insert(), insertLines(), insertNewLine(), remove(), etc. should all wrap the applyDelta() function instead of the inverse. This way it's easy to ensure that all deltas are being applied in the same way and are completely reversible. This also makes it easy to validate deltas in a single location.
Supporting changes
- Rename insert() to insertText()
- Rename remove() to removeText()
- Rename insertLines() to insertFullLines()
- Rename removeLines() to removeFullLines()
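
To make the "single spot" idea concrete, here is a minimal sketch in which the public methods only build a delta and hand it to one applyDelta() function. The delta shape (`action`, `start`, `lines`), the validation check, and the `SketchDoc` class are assumptions for illustration, not the code from this PR; only insertion is shown, and the target row must already exist.

```javascript
function SketchDoc(lines) {
    this.lines = lines;
}

// The only place that changes this.lines.
SketchDoc.prototype.applyDelta = function(delta) {
    if (delta.action !== "insert" || !Array.isArray(delta.lines) || !delta.lines.length)
        throw new Error("invalid delta");
    var row = delta.start.row, column = delta.start.column;
    var line = this.lines[row];
    var before = line.substring(0, column), after = line.substring(column);
    var inserted = delta.lines.slice();
    // First and last inserted lines merge with the existing line's two halves.
    inserted[0] = before + inserted[0];
    inserted[inserted.length - 1] += after;
    this.lines = this.lines.slice(0, row).concat(inserted, this.lines.slice(row + 1));
};

// Public methods wrap applyDelta() rather than mutating the document themselves.
SketchDoc.prototype.insertText = function(position, text) {
    this.applyDelta({ action: "insert", start: position, lines: text.split("\n") });
};

SketchDoc.prototype.insertFullLines = function(row, fullLines) {
    // Full lines are an insert at column 0 whose last line is empty.
    this.applyDelta({ action: "insert", start: { row: row, column: 0 }, lines: fullLines.concat("") });
};
```

With every mutation funneled through one function, validation and the inverse (undo) logic only need to be written once.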