Linear data structure. #65

Closed
pjasiun opened this issue Jul 8, 2015 · 3 comments

Comments

pjasiun commented Jul 8, 2015

The document prototype uses a linear data array that contains strings (for characters without styles), arrays (for characters with styles) and objects (for opening and closing elements). This structure could be simplified by using arrays for all types of characters, or by using objects everywhere, but research is needed to check the performance of each structure.
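To make the three item kinds concrete, here is a minimal sketch of such a mixed linear array and of the "objects everywhere" alternative. The exact shapes (the `type` and `open` fields, the `[ char, attributes ]` layout, the `kindOf` helper) are assumptions for illustration; the issue does not spell them out.

```javascript
// Hypothetical shapes: the issue names the three kinds of items,
// but not their exact layout.
const mixedData = [
  { type: 'paragraph', open: true },  // opening element (assumed fields)
  'f',                                // character without styles: a string
  ['o', ['bold']],                    // character with styles: [ char, attributes ]
  'o',
  { type: 'paragraph', open: false }  // closing element (assumed fields)
];

// With mixed types, every consumer has to branch on the kind of item.
function kindOf(item) {
  if (typeof item === 'string') return 'plain character';
  if (Array.isArray(item)) return 'styled character';
  return 'element';
}

// The "objects everywhere" simplification: every character becomes an object,
// elements are left as they are.
const uniformData = mixedData.map(item => {
  if (typeof item === 'string') return { char: item, attr: [] };
  if (Array.isArray(item)) return { char: item[0], attr: item[1] };
  return item;
});
```

In the uniform version a consumer can read `item.char` and `item.attr` without first checking whether the item is a string or an array.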

pjasiun commented Jul 8, 2015

I have pushed tests to the performance branch.

I used an array of 40K elements, every character with 3 styles. This document is 10 times bigger than the Apollo text used for the CKEditor 4 examples and is equivalent to 15-20 pages of text (depending on the standard). I used random characters to avoid misleading results caused by browser optimizations.

Memory usage

I measured memory usage on Chrome using the built-in memory profiler. Memory usage:

  • array of strings (x): 406KB - definitely the smallest structure, but it only works for text without styles,
  • array of objects ({ char: x, attr: [ 0, 1, 2 ] }): 9.2MB - it does not matter whether the object has long or short keys, or whether it was created as a literal or with a constructor,
  • array of arrays ([ 'x', [ 1, 2, 3 ] ]): 9.9MB - this is (surprise, surprise) bigger than the array of objects; even though it seems simpler, the V8 engine optimizes objects better than arrays,
  • array of objects with attributes as an object ({ char: x, attr: { 0: 1, 1: 1, 2: 1 } }): 13MB - this is the biggest, but it is useful because of performance.
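The four structures compared above could be built roughly as follows. This is a sketch, not the code from the performance branch: `COUNT`, `randomChar` and `build` are hypothetical names, and the random lowercase letter is just one way to get the "random character" the test describes.

```javascript
const COUNT = 40000; // 40K elements, as in the test described above

// Random lowercase letter, so the data is not one fixed value repeated.
function randomChar() {
  return String.fromCharCode(97 + Math.floor(Math.random() * 26));
}

// Build an array of COUNT items, each produced by makeItem(char).
function build(makeItem) {
  const data = new Array(COUNT);
  for (let i = 0; i < COUNT; i++) {
    data[i] = makeItem(randomChar());
  }
  return data;
}

const asStrings = build(c => c);
const asObjects = build(c => ({ char: c, attr: [0, 1, 2] }));
const asArrays = build(c => [c, [1, 2, 3]]);
const asObjectsWithAttrObject = build(c => ({ char: c, attr: { 0: 1, 1: 1, 2: 1 } }));
```

Keeping only one of these arrays alive at a time and taking a heap snapshot in the browser's memory profiler is one way to compare their sizes.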

On Firefox it is hard to measure memory usage. All I can do without rebuilding the browser is to measure how much memory the tab is using, which seems to be good enough. Results on Firefox are very similar to those on Chrome.

Performance tests

I tested 4 operations: creating, adding attribute, removing attribute and checking type.

Creating

It is interesting that creating objects with long property names (attributes, character) takes ~50% longer than with short names (a, c). Also, when attributes are stored as an object, creating the whole structure takes twice as long. You can also see that browsers do some optimizations: when I was using fixed data, the results were far different than with random data. Creating an array of objects takes ~60ms on Chrome and ~6ms on Firefox.
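A creation test along these lines can be sketched with a simple timer. The helper names (`timeMs`, `randomLetter`, `createShort`, `createLong`) are hypothetical, `Date.now()` is a coarse clock chosen only to keep the sketch portable, and a single run like this is noisy, so the numbers it prints should not be expected to match the ones reported here.

```javascript
const N = 40000;

// Run fn once and return the elapsed time in milliseconds.
function timeMs(fn) {
  const start = Date.now();
  fn();
  return Date.now() - start;
}

function randomLetter() {
  return String.fromCharCode(97 + Math.floor(Math.random() * 26));
}

// Objects with short property names (c, a).
function createShort() {
  const data = [];
  for (let i = 0; i < N; i++) {
    data.push({ c: randomLetter(), a: [0, 1, 2] });
  }
  return data;
}

// Objects with long property names (character, attributes).
function createLong() {
  const data = [];
  for (let i = 0; i < N; i++) {
    data.push({ character: randomLetter(), attributes: [0, 1, 2] });
  }
  return data;
}

console.log('short names:', timeMs(createShort), 'ms');
console.log('long names:', timeMs(createLong), 'ms');
```

Repeating each measurement several times and comparing medians would give a fairer picture than one run.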

Checking type

I tested 5 ways of checking whether an item is character data (is array, duck-typing, data.type, data.t, instanceof) and the results are very similar. It takes about 2ms to check 40000 items.
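The five checks might look like the sketch below. The `CharacterData` class, the `'character'` type string, the `t === 0` encoding and the `char` field used for duck-typing are all assumptions made for illustration; only the names of the five approaches come from the test itself.

```javascript
// Hypothetical class, so that `instanceof` has something to test against.
class CharacterData {
  constructor(char, attr) {
    this.type = 'character'; // long key variant (assumed value)
    this.t = 0;              // short key variant (0 = character, assumed encoding)
    this.char = char;
    this.attr = attr;
  }
}

const charItem = new CharacterData('x', { 0: 1, 1: 1, 2: 1 });
const elementItem = { type: 'element', t: 1, name: 'p' };
const arrayItem = ['x', [1, 2, 3]];

const checks = {
  // Only meaningful for the array-of-arrays layout.
  isArray: item => Array.isArray(item),
  // Duck-typing: anything with a `char` property counts as character data.
  duckTyping: item => item.char !== undefined,
  longKey: item => item.type === 'character',
  shortKey: item => item.t === 0,
  instanceOf: item => item instanceof CharacterData
};
```

Since all five performed about the same in the test, the choice can be made on readability rather than speed.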

Adding and removing attribute

Adding and removing attributes is about 10 times faster when attributes are stored as an object instead of an array: it then takes 1-2 ms instead of 20 ms. This holds only if the attribute is already defined and we do not add or remove a property, but only change its value. So if I use delete to remove an attribute it is slow, but if I set it to 0 it is much faster. Of course it is a small memory leak, but it should not be a problem.
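The difference between the two approaches can be sketched as follows. All function names here are hypothetical, and the array variant is only one plausible reading of how attributes were stored in the slower case.

```javascript
// Attributes as an array: adding or removing changes the array itself.
function addToArray(item, id) {
  if (!item.attr.includes(id)) item.attr.push(id);
}
function removeFromArray(item, id) {
  const index = item.attr.indexOf(id);
  if (index !== -1) item.attr.splice(index, 1);
}

// Attributes as an object whose keys already exist: only the value changes,
// so the set of properties stays the same.
function setFlag(item, id) {
  item.attr[id] = 1;
}
function clearFlag(item, id) {
  item.attr[id] = 0; // instead of `delete item.attr[id]`
}
function hasFlag(item, id) {
  return item.attr[id] === 1;
}

const withArray = { char: 'x', attr: [0, 1, 2] };
const withObject = { char: 'x', attr: { 0: 1, 1: 1, 2: 1 } };

removeFromArray(withArray, 1); // attr is now [0, 2]
clearFlag(withObject, 1);      // attr is now { 0: 1, 1: 0, 2: 1 }, the key stays
```

The cleared key staying in the object is the small memory cost mentioned above: the property is never removed, only set to 0.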

fredck commented Jul 9, 2015

Very interesting research, @pjasiun.

I think that memory usage should not be an issue at this point. The most important thing is having a model that performs well in the most intensive tasks we need to perform: (1) startup creation and (2) runtime manipulation. If (1) is a matter of a few milliseconds' difference, our focus must be on (2), because this is the point that has the most impact in terms of UX - features must perform as fast as possible.

So it looks like "array of objects with attributes as objects" is the way to go.

pjasiun commented Jul 9, 2015

I definitely agree. The most important conclusion is that memory usage and performance are acceptable. And because changing attributes is the operation users will perform most often, "array of objects with attributes as objects" seems to be the best data structure.
