Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reset and cleanup of dataLayer to prevent memory leaks #3475

Merged
merged 1 commit into from
Aug 3, 2022

Conversation

bjester
Copy link
Member

@bjester bjester commented Aug 2, 2022

Summary

Description of the change(s) you made

  • Calls dataLayer.reset() according to official documentation
  • Also adds manual removal of gtm.element since dataLayer.reset() doesn't seem to affect it, but should be safe as long as it occurs in a function queued onto the dataLayer

Manual verification steps performed

  1. Logged into the Studio (dev) view in Google Analytics (see this notion page for more info)
  2. Observed events triggered as I used my local instance of Studio
  3. Implemented the PR changes
  4. Ensured that the changes produced no regressions to the data in GA

References

Fixes #3428

Comments

Again, dataLayer.reset() didn't do the trick, so I added a little bit extra


Contributor's Checklist

PR process:

  • If this is an important user-facing change, PR or related issue the CHANGELOG label been added to this PR. Note: items with this label will be added to the CHANGELOG at a later time
  • If this includes an internal dependency change, a link to the diff is provided
  • The docs label has been added if this introduces a change that needs to be updated in the user docs?
  • If any Python requirements have changed, the updated requirements.txt files also included in this PR
  • Opportunities for using Google Analytics here are noted
  • Migrations are safe for a large db

Studio-specifc:

  • All user-facing strings are translated properly
  • The notranslate class been added to elements that shouldn't be translated by Google Chrome's automatic translation feature (e.g. icons, user-generated text)
  • All UI components are LTR and RTL compliant
  • Views are organized into pages, components, and layouts directories as described in the docs
  • Users' storage used is recalculated properly on any changes to main tree files
  • If there new ways this uses user data that needs to be factored into our Privacy Policy, it has been noted.

Testing:

  • Code is clean and well-commented
  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Any new interactions have been added to the QA Sheet
  • Critical and brittle code paths are covered by unit tests

Reviewer's Checklist

This section is for reviewers to fill out.

  • Automated test coverage is satisfactory
  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

Copy link
Member

@MisRob MisRob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @bjester, code changes are looking good to me. Overall, the approach looks similar to what I saw in https://www.voorhoede.nl/en/blog/your-website-probably-has-a-memory-leak/ where they were able to resolve their memory leaks by clearing elements references (they decided to go with a bit different method eventually but the point was the same). I can confirm that elements references are being cleared up.

One question about Google Tag Manager's reset, do you expect it to clear the dataLayer? I assumed it should happen, but documentation is minimal, and according to my browser console it doesn't seem to be the case. This is its output after I logged in and navigated to "Content Library" tab:

plugin.js:77 Analytics.trackEvent("channel_list", {"eventLabel":"PUBLIC","eventAction":"Click"})
:8080/en/channels/#/public:1 Autofocus processing was blocked because a document already has a focused element.
plugin.js:77 Analytics.trackEvent("Catalog search", {"eventAction":"","eventLabel":"{\"total\":1,\"matched\":[\"66a31fdbdd745c7c9e499206f3c4cced Published Channel\"]}","total":1,"matched":["66a31fdbdd745c7c9e499206f3c4cced Published Channel"]})
window.dataLayer
(8) [{…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, push: ƒ]0: {gtm.start: 1659525874116, event: 'gtm.js', gtm.uniqueEventId: 1}1: {pageCategory: 'Channel list', currentUser: {…}}2: {event: 'gtm.dom', gtm.uniqueEventId: 11}3: {event: 'gtm.load', gtm.uniqueEventId: 14}4: {event: 'gtm.click', gtm.element: null, gtm.elementClasses: 'v-tabs__item', gtm.elementId: '', gtm.elementTarget: '', …}5: {eventLabel: 'PUBLIC', eventAction: 'Click', event: 'channel_list', gtm.uniqueEventId: 56}6: {event: 'gtm.linkClick', gtm.element: null, gtm.elementClasses: 'v-tabs__item--active v-tabs__item v-tabs__item--active', gtm.elementId: '', gtm.elementTarget: '', …}7: {eventAction: '', eventLabel: '{"total":1,"matched":["66a31fdbdd745c7c9e499206f3c4cced Published Channel"]}', total: 1, matched: Array(1), event: 'Catalog search', …}8: ƒ ()push: ƒ ()length: 9[[Prototype]]: Array(0)
plugin.js:31 Analytics.reset()
window.dataLayer
(9) [{…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, ƒ, push: ƒ]0: {gtm.start: 1659525874116, event: 'gtm.js', gtm.uniqueEventId: 1}1: {pageCategory: 'Channel list', currentUser: {…}}2: {event: 'gtm.dom', gtm.uniqueEventId: 11}3: {event: 'gtm.load', gtm.uniqueEventId: 14}4: {event: 'gtm.click', gtm.element: null, gtm.elementClasses: 'v-tabs__item', gtm.elementId: '', gtm.elementTarget: '', …}5: {eventLabel: 'PUBLIC', eventAction: 'Click', event: 'channel_list', gtm.uniqueEventId: 56}6: {event: 'gtm.linkClick', gtm.element: null, gtm.elementClasses: 'v-tabs__item--active v-tabs__item v-tabs__item--active', gtm.elementId: '', gtm.elementTarget: '', …}7: {eventAction: '', eventLabel: '{"total":1,"matched":["66a31fdbdd745c7c9e499206f3c4cced Published Channel"]}', total: 1, matched: Array(1), event: 'Catalog search', …}8: ƒ ()push: ƒ ()length: 9[[Prototype]]: Array(0)
plugin.js:31 Analytics.reset()
window.dataLayer
(10) [{…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, ƒ, ƒ, push: ƒ]0: {gtm.start: 1659525874116, event: 'gtm.js', gtm.uniqueEventId: 1}1: {pageCategory: 'Channel list', currentUser: {…}}2: {event: 'gtm.dom', gtm.uniqueEventId: 11}3: {event: 'gtm.load', gtm.uniqueEventId: 14}4: {event: 'gtm.click', gtm.element: null, gtm.elementClasses: 'v-tabs__item', gtm.elementId: '', gtm.elementTarget: '', …}5: {eventLabel: 'PUBLIC', eventAction: 'Click', event: 'channel_list', gtm.uniqueEventId: 56}6: {event: 'gtm.linkClick', gtm.element: null, gtm.elementClasses: 'v-tabs__item--active v-tabs__item v-tabs__item--active', gtm.elementId: '', gtm.elementTarget: '', …}7: {eventAction: '', eventLabel: '{"total":1,"matched":["66a31fdbdd745c7c9e499206f3c4cced Published Channel"]}', total: 1, matched: Array(1), event: 'Catalog search', …}8: ƒ ()9: ƒ ()push: ƒ ()length: 10[[Prototype]]: Array(0)
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
plugin.js:31 Analytics.reset()
window.dataLayer
(26) [{…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, ƒ, push: ƒ]0: {gtm.start: 1659525874116, event: 'gtm.js', gtm.uniqueEventId: 1}1: {pageCategory: 'Channel list', currentUser: {…}}2: {event: 'gtm.dom', gtm.uniqueEventId: 11}3: {event: 'gtm.load', gtm.uniqueEventId: 14}4: {event: 'gtm.click', gtm.element: null, gtm.elementClasses: 'v-tabs__item', gtm.elementId: '', gtm.elementTarget: '', …}5: {eventLabel: 'PUBLIC', eventAction: 'Click', event: 'channel_list', gtm.uniqueEventId: 56}6: {event: 'gtm.linkClick', gtm.element: null, gtm.elementClasses: 'v-tabs__item--active v-tabs__item v-tabs__item--active', gtm.elementId: '', gtm.elementTarget: '', …}7: {eventAction: '', eventLabel: '{"total":1,"matched":["66a31fdbdd745c7c9e499206f3c4cced Published Channel"]}', total: 1, matched: Array(1), event: 'Catalog search', …}8: ƒ ()9: ƒ ()10: ƒ ()11: ƒ ()12: ƒ ()13: ƒ ()14: ƒ ()15: ƒ ()16: ƒ ()17: ƒ ()18: ƒ ()19: ƒ ()20: ƒ ()21: ƒ ()22: ƒ ()23: ƒ ()24: ƒ ()25: ƒ ()push: ƒ ()length: 26[[Prototype]]: Array(0)
plugin.js:31 Analytics.reset()

Also attaching a screenshot where you can see the final dataLayer's content better
Screenshot from 2022-08-03 13-33-59

Notice that dataLayer keeps growing even though as a user I didn't do any action. It seems that

  1. It isn't actually reset
  2. It's getting bigger after each analytics reset as we push the reset function to the array

I tried to google a bit and played around with code but haven't figured out yet what's going on. I agree that ideally, we'd call it but if it doesn't seem to work (and again, I'm not sure what to expect actually, but growing the array even more seems unexpected and potentially problematic), maybe we could try only clearing up the elements' references (that was also the only thing they needed to do in the article I referenced above). Have you seen this? Any thoughts?

@bjester
Copy link
Member Author

bjester commented Aug 3, 2022

@MisRob Sorry I didn't make it clear in my PR description. I experienced the same thing-- reset() doesn't appear to empty the array. That was the driving factor in adding the custom clearing of gtm.element. I didn't want to remove it because the official documentation still refers to using reset as the "... most useful with a page that will remain open and the data layer size continues to grow over time."

I did confirm that what I've done is calling it correctly. The documentation doesn't quite describe that it will empty the the dataLayer array, only that it "resets" the data, so it's possible it's doing something we're not exactly seeing at first glance or perhaps over longer periods of time it does something different. Either way, since Studio consists of single page apps, it seems to make sense having it with regards to their documentation.

@MisRob
Copy link
Member

MisRob commented Aug 3, 2022

Thank you for your clarification. Sounds good. The documentation is really opaque in that sense, so I think it's fine to give it a try if it was intentional. Just a bit concerned over the 30s frequency - the array will grow pretty fast, and it's not very clear to me if all those functions ever get cleared. Do you think it's safe?

@MisRob
Copy link
Member

MisRob commented Aug 3, 2022

Maybe we could give it a try by leaving the Studio open for half a day or so and see what happens?

@MisRob
Copy link
Member

MisRob commented Aug 3, 2022

I saw a note somewhere that it should get cleared automatically after couple hundred items are there but if I recall it wasn't from the official documentation, so not sure what to expect.

@bjester
Copy link
Member Author

bjester commented Aug 3, 2022

Just a bit concerned over the 30s frequency - the array will grow pretty fast, and it's not very clear to me if all those functions ever get cleared. Do you think it's safe?

That's a good point. Perhaps we should try to clear the old reset functions too. I could also increase the interval to a couple of minutes? What do you think?

@bjester
Copy link
Member Author

bjester commented Aug 3, 2022

I saw a note somewhere that it should get cleared automatically after couple hundred items are there but if I recall it wasn't from the official documentation, so not sure what to expect.

I also saw something that mentioned that there's a cap at 300 items. Perhaps they added that and didn't update the documentation regarding reset 🤷

@rtibbles rtibbles merged commit 18d9dad into learningequality:unstable Aug 3, 2022
@MisRob
Copy link
Member

MisRob commented Aug 4, 2022

@bjester

Perhaps we should try to clear the old reset functions too. I could also increase the interval to a couple of minutes? What do you think?

Make sense to me, or we could push reset on router navigation, for example. That way, it wouldn't need to run when the page is just open with no new interactions. I think you have better insight into what events we track, so whatever makes more sense to you.

I also saw something that mentioned that there's a cap at 300 items. Perhaps they added that and didn't update the documentation regarding reset

I saw the same number. However, I let the Studio run today for a longer time to watch its behavior and it's not the case. The number could be different, or it never gets cleared. So I think it'd be better to be careful here.

Screenshot from 2022-08-04 12-48-27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize Google Tag Manager performance wise
3 participants