
Potential memory leak in store.query() #4041

Closed

trumb1mj opened this issue Jan 4, 2016 · 62 comments

Comments
@trumb1mj

trumb1mj commented Jan 4, 2016

When calling store.query(), the heap grows. I've set up a sample project that demonstrates the issue here: https://github.com/trumb1mj/ember-data-growing-recordarrays

@runspired
Contributor

@bmac this is the leak I spent a few hours investigating with @trumb1mj. I haven't had time to revisit it, but I've had a few thoughts on what's causing it.

The initial query is the model hook's return. The subsequent queries are done by the poller. I suspect the interplay of the initial query staying alive, active, and bound to the route is what leads to this behavior.

@fivetanley
Member

I'm not sure this is a leak. query intentionally creates a new record array each time, since the query is executed by the server and we can't reuse a single array.

@runspired
Contributor

@fivetanley it's not a "leak" in the sense that nothing is unintentionally retained, but it is a leak. The memory growth in this setup is ridiculous, and it's a fairly common use case.

@trumb1mj
Author

trumb1mj commented Jan 4, 2016

@runspired in my example there is no query done on the model hook return. @fivetanley even when you use the .clear() method to empty out that array, the heap continues to grow, so there is something else going on here.

@fivetanley
Member

@trumb1mj What's the result when you use destroy instead?

@trumb1mj
Author

trumb1mj commented Jan 4, 2016

Using destroy seems to slow the memory growth, but I'm still seeing about 0.5 MB of growth per minute.

fivetanley pushed a commit to fivetanley/data that referenced this issue Jan 4, 2016
Although the result from `store.query` is intended to never be re-used
as we delegate to the server for the truth of a query, the user still
needs an escape hatch to destroy these AdapterPopulatedRecordArrays
created by query in order to prevent excessive memory.

Previously, when calling `destroy` on an AdapterPopulatedRecordArray (the type
returned from store.query), the logic in
RecordArrayManager#unregisterRecordArray did not account for adapter-populated
record arrays, resulting in a leak.

This commit changes the logic to search for the array in the
RecordArrayManager's AdapterPopulatedRecordArray collection when
unregistering a record array.

Fixes emberjs#4041
@fivetanley
Member

I have opened #4042, which I think addresses a cause of the leak.

@fivetanley
Member

@trumb1mj if you can verify these changes help that'd be great. I didn't have time to check them against the sample app.

@trumb1mj
Author

trumb1mj commented Jan 4, 2016

I grabbed the code from https://github.com/fivetanley/data.git#adapter-populated-record-array-leaks and retested with the .destroy() method. The leak still seems to be occurring.

@fivetanley
Member

I think the remaining leak after my change is in Pretender. It never seems to cycle out its handledRequests array, as you can see in the diff in config/mirage.js here: trumb1mj/ember-data-growing-recordarrays@master...fivetanley:maybe-its-pretender

This includes all the FakeResponses and their JSON payloads. The entries in yellow in the Chrome inspector are nodes retained between snapshots. I don't see anything from Ember Data, but there is a lot from Pretender besides the globals it exposes:

(screenshot: memory leak from Pretender in the Chrome inspector)

@fivetanley
Member

The two heap snapshots in this screenshot indicate that the memory leak should be fixed. Yes, one is slightly larger than the other, but I came back after a couple of minutes of making coffee and this was the result:

I patched Pretender to reset this.handledRequests = [] so the array wouldn't grow. Another leak seemed related to Ember 1.13.8, so I bumped Ember to the latest as well.

There may be a slight leak left, but I'm not sure at this point. The liveness of the demo, and our inability to take snapshots at precise cutoff points, doesn't help either. A good update would be to add a button that calls query instead of polling: you could click and then take a heap snapshot after each click, rather than landing on some random moment where the browser happens to use more memory, which also muddles the retainers.
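The Pretender workaround mentioned above can be sketched as follows. This is a minimal, framework-free sketch that assumes `server` is the Pretender instance Mirage creates; `handledRequests` is the Pretender property that accumulates every handled request (and its FakeResponse, including the JSON payload) until shutdown. The helper name is illustrative.

```javascript
// Clear Pretender's request history so a long-lived polling demo doesn't
// retain every FakeRequest/FakeResponse pair it has ever served.
function trimPretenderHistory(server) {
  if (server && Array.isArray(server.handledRequests)) {
    server.handledRequests.length = 0; // drop retained requests/responses
  }
}

// Stub usage (a plain object stands in for a real Pretender instance):
const fakeServer = { handledRequests: [{ url: '/people' }, { url: '/people' }] };
trimPretenderHistory(fakeServer);
console.log(fakeServer.handledRequests.length); // 0
```

Calling this between polls (or on a timer) keeps the demo's heap growth attributable to Ember Data rather than to the mock server.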

@runspired
Contributor

@fivetanley reopened? Did you find more?

@fivetanley
Member

No, it was accidentally auto-closed when I merged the PR. I'd like to close it if you two don't have anything else to look into.

@runspired
Contributor

I haven't had a moment to test out the fix, but will do so.

@trumb1mj
Author

trumb1mj commented Jan 7, 2016

I had some more thoughts on this. Right now the query method retains model history by default, and while I'm not sure the majority of use cases require this, I think there is a need for it. But would it make more sense to allow us to opt out?

For example,

this.store.query('person', { filter: { name: 'Peter' } }, { retainHistory: false })

That way a user would not have to wait for the promise to be returned and do a .destroy().

@bmac
Member

bmac commented Jan 11, 2016

Hey @trumb1mj sorry I didn't have much time to look at this last week. What do you mean by "retains model history"?

@trumb1mj
Author

I think part of the problem was not a leak, but rather that on each query call _internalModel._recordArrays gets populated with a model record. When polling, that can add up rather quickly. It would be nice if this was either not the default behavior or we had a choice when submitting queries.

@bmac
Member

bmac commented Jan 11, 2016

That makes sense. Thanks @trumb1mj. I'll try to find out why the results of query get added to _internalModel._recordArrays. I suspect Ember Data could opt out of doing that altogether.

bmac pushed a commit that referenced this issue Jan 11, 2016
Although the result from `store.query` is intended to never be re-used
as we delegate to the server for the truth of a query, the user still
needs an escape hatch to destroy these AdapterPopulatedRecordArrays
created by query in order to prevent excessive memory.

Previously, when calling `destroy` on an AdapterPopulatedRecordArray (the type
returned from store.query), the logic in
RecordArrayManager#unregisterRecordArray did not account for adapter-populated
record arrays, resulting in a leak.

This commit changes the logic to search for the array in the
RecordArrayManager's AdapterPopulatedRecordArray collection when
unregistering a record array.

Fixes #4041

(cherry picked from commit 0eedecb)
@smcclstocks

Is there any guidance on how best to handle this situation? I have a page that polls using query with dynamic filters and sets the promise to a controller property so the template can show loaders, failures, etc. as the promise goes through its state transitions.

What I do now is get the old resolved promise from the query, call destroy on it, and then fire off the next one, setting it to the same controller property. I'm seeing the same leak mentioned in this thread. Is calling destroy not enough, and should we be doing more? Thanks.
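The destroy-then-swap pattern described above can be sketched framework-free. The stub store below stands in for Ember Data's store (whose real query() returns a promise resolving to an AdapterPopulatedRecordArray; this stub returns synchronously for brevity), and all names are illustrative.

```javascript
// Stub store: query() hands back a destroyable "record array" and counts
// how many results have been destroyed so far.
function makeStubStore() {
  const store = {
    destroyedCount: 0,
    query(modelName, params) {
      return {
        modelName,
        params,
        isDestroyed: false,
        destroy() {
          this.isDestroyed = true;
          store.destroyedCount += 1;
        }
      };
    }
  };
  return store;
}

// Polling helper: keep only the latest result alive, destroying the
// previous one before swapping it out (the pattern from this thread).
class Poller {
  constructor(store) {
    this.store = store;
    this.current = null; // latest record array, e.g. shown by a template
  }
  poll() {
    const next = this.store.query('person', { filter: { active: true } });
    if (this.current && !this.current.isDestroyed) {
      this.current.destroy(); // release the previous result
    }
    this.current = next;
    return next;
  }
}

const store = makeStubStore();
const poller = new Poller(store);
poller.poll();
poller.poll();
poller.poll();
console.log(store.destroyedCount); // 2 — everything but the newest was destroyed
```

In a real app the destroy call would go in the promise's then-handler (or before issuing the next query), but the bookkeeping is the same: exactly one live result per poller.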

@trumb1mj
Author

Calling destroy will empty out that array, but the leak will continue to happen in .query(). In our project I'm actually using plain Ajax and loading the records manually until the fix mentioned in this thread is in an Ember Data release.

@smcclstocks

Bummer. I may need to do the same thing. Thanks, man.

@bmac
Member

bmac commented Jan 27, 2016

This issue was discussed on the Ember Data call today, and it was determined that the fact that Ember Data holds a reference to the query's record array is not something Ember Data needs to do anymore. It was likely added at a time when Ember Data worked differently and is no longer relevant after some internal refactorings.

I think the path forward to closing this issue would be to stop registering the query's RecordArray with the RecordArrayManager to reduce the memory leaks.

@runspired
Contributor

+1

@trumb1mj
Author

Great, thanks for the follow up on this guys!

@trumb1mj
Author

trumb1mj commented Jun 1, 2016

I've updated the demo repo to the latest versions of ember and ember-data and the RecordArray is still growing -- Do we anticipate a fix for this in the near future?

Here is the repo that demonstrates the leak: https://github.com/trumb1mj/ember-data-growing-recordarrays

@stefanpenner
Member

stefanpenner commented Feb 27, 2017

This seems to be working as expected.

For live record arrays such as the ones returned from query(), once you are done with them you must invoke destroy on them so they are cleaned up.


Any chance this could be due to ArrayProxy array observers not being cleaned up? We've got a long lived polling app and I think I've narrowed our leak problems down to that. Wondering if there's any crossover.

I don't believe it is related. Although that may be a leak of its own?


A manual solution would be something like (clearly not pretty):

{
  filteredList: Ember.computed(function() {
    let list = this.store.query('foo', { something: 1 });
    if (this._lastFilteredList) {
      this._lastFilteredList.destroy();
    }
    this._lastFilteredList = list;
    return list;
  }),

  destroy() {
    this._super(...arguments);
    this._lastFilteredList && this._lastFilteredList.destroy();
  }
}

Another approach is to create a CP (computed property) macro that does this automatically when recomputed:

filteredList: ensureDestroyed(dependentKey, function() {
  return this.store.query('foo', { something: 1 });
});

Finally, there could be a future where anything fetched from this.store is automatically destroyed/unloaded when this is destroyed. This is something we should likely explore.
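The `ensureDestroyed` macro sketched above is hypothetical, not an Ember API. A framework-free sketch of the idea — destroy the previously produced value each time the wrapped factory recomputes — might look like this; a real version would wrap it in an Ember computed property.

```javascript
// Wrap a factory so each recompute destroys the previous value before
// producing a new one. Plain JavaScript stand-in for the hypothetical
// ensureDestroyed computed-property macro discussed above.
function ensureDestroyed(factory) {
  let previous = null;
  return function (...args) {
    if (previous && typeof previous.destroy === 'function' && !previous.isDestroyed) {
      previous.destroy();
    }
    previous = factory.apply(this, args);
    return previous;
  };
}

// Usage with stub "record arrays":
function makeRecordArray(id) {
  return {
    id,
    isDestroyed: false,
    destroy() { this.isDestroyed = true; }
  };
}

let counter = 0;
const filteredList = ensureDestroyed(() => makeRecordArray(++counter));
const first = filteredList();
const second = filteredList(); // recompute: first is destroyed automatically
console.log(first.isDestroyed, second.isDestroyed); // true false
```

The Ember version would also need a destroy hook on the owner (as in the manual solution above) so the final value is released when the object goes away.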

@stefanpenner
Member

stefanpenner commented Feb 27, 2017

Just spoke with @runspired:

  • fix docs: they need to be updated to inform users to invoke destroy
  • investigate leak: there may be a related leak that will need to be investigated (@runspired has an example; he will share when he finds it)

Future work:

  • explore auto(or ergonomic)-destroy/unloading

@trumb1mj
Author

You can use the repo (linked at the top of this issue) as an example: https://github.com/trumb1mj/ember-data-growing-recordarrays

@stefanpenner
Member

stefanpenner commented Feb 27, 2017

@trumb1mj that is a pretty old version of ember-data. I'll upgrade and investigate...

@stefanpenner
Member

upgraded your example to the recent stuff: trumb1mj/ember-data-growing-recordarrays#1

I am unable to see the leak in question. It is possible I missed something though:

(screenshot: Chrome heap snapshots after the upgrade, 2017-02-27 1:40 PM)

It's worth pointing out that there was work in October fixing up the correctness of RecordArrayManager that may have addressed this.


If I have missed something, let me know. If I can reproduce locally, I can likely fix quickly.

@stefanpenner
Member

There does appear to be a leak with FakeRequest though:

(screenshot: retained FakeRequest objects in the heap, 2017-02-27 1:48 PM)

@trumb1mj
Author

After merging your PR I'm still reproducing a small leak like I was seeing before. Are you saying the old leak is gone and the leak on display here is within Mirage?

@stefanpenner
Member

stefanpenner commented Feb 27, 2017

Yes. Specifically, Pretender only releases some of its internal data structures, like handledRequests, on shutdown, which causes the payload (in this example) to be retained.

(screenshot: payloads retained via Pretender's handledRequests, 2017-02-27 1:50 PM)

@stefanpenner
Member

stefanpenner commented Feb 27, 2017

As far as I can tell there is no obvious leak here any longer, but if someone can demonstrate one I'll gladly debug it.


I've opened an issue to improve documentation related to this here. We should also explore more ergonomic ways of auto-deleting things.

@trumb1mj
Author

@fpauser is still seeing it in his app on the same version of ember-data. This should not be closed yet.

@trumb1mj
Author

@stefanpenner how do you recommend testing this without mirage?

@BryanCrotaz
Contributor

I'll add destroy to our code and let you know whether the leak goes away. Good way to show it in our code is to repeatedly store.push the same object as jsonapi using ember-concurrency in a while(true) loop with a yield timeout(25). That leaks fast enough to show up clearly in a few seconds

@trumb1mj
Author

We all agree that having to call destroy is just a hack for an underlying problem, right? Why in the world would ED keep those record arrays around after the record gets removed?

@BryanCrotaz
Contributor

@trumb1mj after which record gets removed?

I'm seeing it by getting a hasMany on a model. It appears that you get a new array proxy each time you do that, with the same inner array content. So it's not about the record getting removed; the proxy needs to be destroyed when the subscriber has finished with it, and that's not something ember-data can know.

@trumb1mj
Author

Sorry, I misspoke. I think the recommendation is to call .destroy on the results from query. I think if you want to keep those arrays around it should be opt in. Anyone polling for data is going to quickly experience a really hard to detect leak.

@BryanCrotaz
Contributor

here's what we're doing with your advice, @stefanpenner - but still leaking observers.

_observedAssetlist: computed({
  get() {
    return get(this, '_currentlyObservedAssetlist');
  },

  set(key, value) {
    const currentlyObservedAssetlist = get(this, '_currentlyObservedAssetlist');
    if (currentlyObservedAssetlist) {
      // currentlyObservedAssetlist.removeArrayObserver(this);
      currentlyObservedAssetlist.destroy();
    }

    if (value) {
      if (typeof value.addArrayObserver === 'function') {
        // it's an Ember array
        value.addArrayObserver(this);
      } else {
        set(this, '_currentlyObservedAssetlist', null);
        throw new Error('cannot efficiently filter a source that is not an Ember array');
      }
    }

    return set(this, '_currentlyObservedAssetlist', value);
  }
}),

This code is called by

let newSourceAssets = get(this, 'source.slides');
set(this, '_observedAssetlist', newSourceAssets);

source is an Ember Data model with a hasMany called slides.

I'm console logging result.length in ember.debug.js matchingListeners and the count is climbing by one every time we push.

@stefanpenner
Member

@stefanpenner how do you recommend testing this without mirage?

In your application adapter, implement the query method and return the payload you previously received via Mirage.
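A sketch of that suggestion, with an illustrative JSON:API payload. In an Ember app this would live in app/adapters/application.js as a JSONAPIAdapter subclass overriding query(); the plain object below models just the query hook so the sketch is self-contained, and the payload contents are made up.

```javascript
// Canned payload standing in for what Mirage used to serve. Shape is
// JSON:API; ids, types, and attributes here are illustrative only.
const CANNED_PAYLOAD = {
  data: [
    { id: '1', type: 'people', attributes: { name: 'Peter' } },
    { id: '2', type: 'people', attributes: { name: 'Paula' } }
  ]
};

const stubAdapter = {
  // The real adapter hook is query(store, type, query); the extra
  // arguments are omitted because this stub ignores them.
  query() {
    return Promise.resolve(CANNED_PAYLOAD);
  }
};

stubAdapter.query().then((payload) => {
  console.log(payload.data.length); // 2
});
```

With the adapter serving a fixed payload, any remaining heap growth during polling can be attributed to Ember Data itself rather than to Mirage/Pretender bookkeeping.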

@fpauser is still seeing it in his app on the same version of ember-data. This should not be closed yet.

We will need a reproduction; if provided, I will absolutely debug and address it (I actually enjoy debugging memory leaks...)

I'll add destroy to our code and let you know whether the leak goes away. Good way to show it in our code is to repeatedly store.push the same object as jsonapi using ember-concurrency in a while(true) loop with a yield timeout(25). That leaks fast enough to show up clearly in a few seconds

I do not believe ember-concurrency does anything to auto-delete record arrays for you. So the above hypothesis stands.

We all agree that having to call destroy is just a hack for an underlying problem, right? Why in the world would ED keep those record arrays around after the record gets removed?

I don't believe so. RecordArrays (as currently designed) are live and can mutate in reaction to the store changing (in the query case, when records are deleted/unloaded). To accomplish this, bookkeeping from store to recordArray is required, which necessarily causes the store to retain the recordArrays. To release a RecordArray, manual intervention is required.


There are some conflicting constraints at play: it is most likely not possible for a live RecordArray and an automatically garbage-collected RecordArray to be the same RecordArray.

It would be possible to introduce a new type of RecordArray which is "frozen" and would be automatically garbage collected. Unfortunately, a side effect of that would be that if the "frozen" record array survives longer than a record it contains, that record would be retained...

If the language had WeakCollection / WeakRef, more automagical things could happen...


I am sympathetic to the problem described here, I suspect we need to do a better job of documenting and providing good memory management strategies for our ember-data users. Something I hope to have time to help with over the next little while...

@stefanpenner
Member

Sorry, I misspoke. I think the recommendation is to call .destroy on the results from query.

Yes the RecordArray is what should be destroyed.

I think if you want to keep those arrays around it should be opt in.

Ya, unfortunately that is a breaking change. So for now, it will need to be opt-out, which today means manually invoking destroy. We can do better, although that seems like a new feature entirely (not an issue, but an RFC).

Anyone polling for data is going to quickly experience a really hard to detect leak.

I suspect the "polling" use case itself could become more first-class; one could imagine several approaches:

  • reloadable query results: recordArray.reload(), so we would have only one recordArray that changes over time.
  • snapshotted record arrays: query.snapshot(type, {}), which themselves are not retained but would retain their own models (and not update when the store changes).
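The first idea — one long-lived result refreshed in place — can be modeled framework-free. Note that `reload()` as named above is a hypothetical API (later Ember Data versions expose a similar `update()` on record arrays); this stub implements the behavior directly rather than using either.

```javascript
// Stub model of a reloadable query result: a single object whose contents
// are refreshed in place, so polling reuses one retained object instead of
// accumulating a new record array per poll. fetchRecords is a placeholder
// for whatever produces fresh records (a server round-trip in real life).
function makeReloadableQuery(fetchRecords) {
  return {
    content: fetchRecords(),
    reload() {
      this.content = fetchRecords(); // refresh in place; same outer object
      return this;
    }
  };
}

let version = 0;
const result = makeReloadableQuery(() => [{ version: ++version }]);
const before = result;
result.reload();
result.reload();
console.log(result === before, result.content[0].version); // true 3
```

Because the outer object's identity is stable, nothing accumulates in store-side bookkeeping: the memory cost of polling is bounded by a single result set.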

@stefanpenner
Member

here's what we're doing with your advice, @stefanpenner - but still leaking observers.

@BryanCrotaz if you can provide a reproduction of the leaking observers, I will gladly debug.

@BryanCrotaz
Contributor

@stefanpenner Happy to send you a reproducible case, but it's using private repos, so can I send you a zip with node_modules populated?

@BryanCrotaz
Contributor

@stefanpenner and any hints and tips about how to debug memory leaks in JS much appreciated. I can use the chrome profiler to show there's a leak but no idea how to turn that into a line of code :)

@stefanpenner
Member

@stefanpenner and any hints and tips about how to debug memory leaks in JS much appreciated. I can use the chrome profiler to show there's a leak but no idea how to turn that into a line of code :)

I typically debug memory leaks like I play sudoku...

More seriously, my general path is using all the tools available to me (memory profiler, debugger, pen & paper, etc.), but most importantly building a mental model of the ownership and object-lifecycle graph, and using those tools to validate/inform that mental model. I realize this sounds cryptic; I hope to someday find time to write this down (or, even better, find someone's existing resource to share)...

@stefanpenner
Member

@BryanCrotaz sure zip works fine. Or adding me to that repo... Ping me on slack

@dknutsen

@stefanpenner so does this also affect query results in a route's model hook, or does ember somehow destroy the RecordArray when it tears down the view?

@stefanpenner
Member

stefanpenner commented Feb 28, 2017

@stefanpenner so does this also affect query results in a route's model hook, or does ember somehow destroy the RecordArray when it tears down the view?

Ember does not do automatic cleanups here. Although this is something we will be exploring (as a new feature).

@dknutsen

@stefanpenner 🙇 thanks for all the thorough explanations, I really appreciate how Ember's core team specifically is patient and responsive on a lot of this stuff. Hopefully I'm not about to test said patience 😬

Fully acknowledging that I have a limited grasp of the Ember Data internals, and that I as a developer represent a very tiny slice of the concerns and use cases of Ember Data, and that my awareness of the full scope of said concerns and use cases is very limited, I think this design is really problematic and would imagine that to be the case for many developers. It seems that you are aware of this and have some potential solutions in mind and obviously these things take time, but I still want to throw my 🎩 in the ring on this discussion.

From a naive developer perspective I would fully expect a RecordArray to be garbage collected when I am no longer using it (when I leave a route, when my poller function has finished executing, etc). I would also not really want to concern myself with the difference between the different types of RecordArrays, nor would I ever have known that I was expected to manually destroy these RecordArrays (as up until now it was not in the documentation or docs backlog). I don't think these are unreasonable assumptions, and my guess is that they would be pretty common among developers.

I know that's where we are now, and that time and effort will be required to make any of the changes you described above, but I personally would still consider this a leak and a fairly serious usability issue even if it's partly by design. If I'm understanding all of this correctly, it means that any time the average dev uses store.query() in pretty much any normal way, the store retains memory without their knowledge or consent. That, to me, qualifies as a memory leak. So I guess what I'm saying is that I humbly request that this be treated with a reasonably high level of priority. This was originally opened over a year ago and it seems like we're just now digging into some of the core problems.

Also, is this limited to just results from store.query or do the other store methods also retain RecordArrays (findAll, peekAll, etc) which must be manually destroyed?

@stefanpenner
Member

stefanpenner commented Mar 8, 2017

@dknutsen ya, I get the pain here. Unfortunately the system is working as designed (the design is just unfortunate), and there exists no solution that both maintains compatibility with query and does not leak, given the functionality JavaScript provides today (the future may hold nice things: https://github.com/tc39/proposal-weakrefs).

I believe the only reasonable path forward, is to:

  • properly document when/how cleanup is required
  • explore improved integration with other parts of ember, so that these records/recordArrays are automatically purged based on some reasonable heuristics.
  • or explore deprecating query in favor of something that is not live and thus does not leak by default.

Note: I am hoping to find some cycles over the next little while to improve the current state of things here, as I (and many others on ember/data core) share your feelings – this is simply not reasonable.


Also, is this limited to just results from store.query or do the other store methods also retain RecordArrays (findAll, peekAll, etc) which must be manually destroyed?

findAll and peekAll record arrays are also not garbage collected, but they do not pose the same leak potential as above. This is because:

  • they are singleton arrays (1 RecordArray per type, subsequent calls merely reuse)
  • they represent merely what is already in the store

query really is the funky one here.

@dknutsen

dknutsen commented Mar 8, 2017

@stefanpenner understood, thanks a lot for the thorough explanation and the info about the other methods too.

10 participants