Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove GOVUKTracker #2152

Merged
merged 1 commit into from
Apr 30, 2020
Merged

Remove GOVUKTracker #2152

merged 1 commit into from
Apr 30, 2020

Conversation

kevindew
Copy link
Member

I came across this code while exploring the options for moving static's
asset files from being served from the www host rather than the assets
one. It proved to be a blocking issue as it required the presence of a
"/static/a" on the assets host.

Further investigation led me to question whether this data is actually
used for anything and I could find no evidence that we do anything other
than record this data nor anyone who knew about it. For reference this has
existed for 3 years, stores ~100 GB in S3 and produces HTTP requests that
mirror each Google Analytics one.

The simplest option to get past the blocking issue seemed to be removing
this code, which then allows cleaning up the configuration in AWS and
Fastly CDN. If we decide we have a use for this later we can always
resurrect it.

I came across this code while exploring the options for moving static's
asset files from being served from the www host rather than the assets
one. It proved to be a blocking issue as it required the presence of a
"/static/a" on the assets host.

Further investigation led me to question whether this data is actually
used for anything and I could find no evidence that we do anything other
than record this data nor anyone who knew about it. For reference this has
existed for 3 years, stores ~100 GB in S3 and produces HTTP requests that
mirror each Google Analytics one.

The simplest option to get past the blocking issue seemed to be removing
this code, which then allows cleaning up the configuration in AWS and
Fastly CDN. If we decide we have a use for this later we can always
resurrect it.
@bevanloon bevanloon temporarily deployed to govuk-static-remove-gov-oz1xgi April 29, 2020 21:58 Inactive
Copy link
Contributor

@benthorner benthorner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a positive change to me, not least because we should avoid using two systems to store analytics data. I think this is worth trying out: it shouldn't cause an issue for users, and we can always revert it if it turns out we are using the S3 data.

Before this goes in, it would help to have a bit more explanation about what this code was/is doing. The code seems to support GA, but also sending to a gifURL (?!). It would be helpful to know:

  • What the /static/a route actually does. Can we delete it once this is deployed?
  • How we can be sure the GA functionality really is unused.

@kevindew
Copy link
Member Author

This looks like a positive change to me, not least because we should avoid using two systems to store analytics data. I think this is worth trying out: it shouldn't cause an issue for users, and we can always revert it if it turns out we are using the S3 data.

Before this goes in, it would help to have a bit more explanation about what this code was/is doing. The code seems to support GA, but also sending to a gifURL (?!). It would be helpful to know:

  • What the /static/a route actually does. Can we delete it once this is deployed?
  • How we can be sure the GA functionality really is unused.

Thanks for having a look and sure, I should have added more in the description. It's been a wild journey of discovery around this functionality.

So as far as I have been able to tell (with various slack questions):

  • if you have accepted cookies, any requests to GA also send a request to https://assets.publishing.service.gov.uk/static/a with a lengthy of query string
  • Assets CDN has a route set-up for /static/a to return a empty 200 response
  • Assets CDN logs all those requests to a bucket
  • Nothing else happens

To try verify what would happen without this I did various searches to find any references and came up a blank, no-one in #govuk-developers or #analysis seems to know about it either. So I changed the path used for the bucket so that if there was any processes using this data they would break, that change went through last Friday. So I'm now confident that nothing that runs regularly is using it (well at least something monitored).

Once this is removed and deployed I will proceed to remove the logging from Fastly and the route from our CDN config.

My current conclusion is that this was a well intentioned idea that never actually led to anything.

Copy link
Contributor

@benthorner benthorner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevindew thanks for adding more comments - it's reassuring to understand the confidence we have this is unused. I'm still unsure about the interaction of this code with GA, though:

Wouldn't removing the GOVUKTracker mean we lose all the events in GA that aren't covered by the universal one? It looks like they're doing roughly the same thing, so maybe only the universal one is actually active? I'm a bit confused why we've got two.

However, I think the "fail fast" reasoning still applies. We shouldn't have to do this much investigation to justify someone else's code. Let's ditch it!

@kevindew
Copy link
Member Author

@kevindew thanks for adding more comments - it's reassuring to understand the confidence we have this is unused. I'm still unsure about the interaction of this code with GA, though:

Wouldn't removing the GOVUKTracker mean we lose all the events in GA that aren't covered by the universal one? It looks like they're doing roughly the same thing, so maybe only the universal one is actually active? I'm a bit confused why we've got two.

However, I think the "fail fast" reasoning still applies. We shouldn't have to do this much investigation to justify someone else's code. Let's ditch it!

Thanks - sorry I didn't appreciate fully what you meant first time around and now looked into it more.

So the call to global.ga doesn't actually send any data but is used as an asynchronous callback to access GA properties: https://developers.google.com/analytics/devguides/collection/analyticsjs/command-queue-reference#ready-callback

So what the code is doing is adding the GA tracking id to the data send to the /static/a URL.

@kevindew kevindew merged commit fd68a4b into master Apr 30, 2020
@kevindew kevindew deleted the remove-govuk-tracker branch April 30, 2020 15:25
kevindew added a commit to alphagov/govuk-cdn-config that referenced this pull request May 5, 2020
The code for this was removed from static in
alphagov/static#2152 and thus this endpoint is
not used anymore. The 802 status response in this host was only used by
this endpoint so can also be removed.
kevindew added a commit to alphagov/govuk-cdn-config that referenced this pull request May 6, 2020
The code for this was removed from static in
alphagov/static#2152 and thus this endpoint is
not used anymore. The 802 status response in this host was only used by
this endpoint so can also be removed.
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 16, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 16, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

I'm a bit baffled by the modules/govuk/templates/node/s_backend_lb/assets-carrenza.conf.erb
file and whether it's still relevant, but it seems a big tangent to
investigate this over something so small.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

I'm a bit baffled by the modules/govuk/templates/node/s_backend_lb/assets-carrenza.conf.erb
file and whether it's still relevant, but it seems a big tangent to
investigate this over something so small.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

I'm a bit baffled by the modules/govuk/templates/node/s_backend_lb/assets-carrenza.conf.erb
file and whether it's still relevant, but it seems a big tangent to
investigate this over something so small.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

I'm a bit baffled by the modules/govuk/templates/node/s_backend_lb/assets-carrenza.conf.erb
file and whether it's still relevant, but it seems a big tangent to
investigate this over something so small.

[1]: alphagov/static#2152
kevindew added a commit to alphagov/govuk-puppet that referenced this pull request Jun 17, 2020
This analytics path has been removed from static [1] and is therefore no
longer used. The configuration for it can now be removed.

I'm a bit baffled by the modules/govuk/templates/node/s_backend_lb/assets-carrenza.conf.erb
file and whether it's still relevant, but it seems a big tangent to
investigate this over something so small.

[1]: alphagov/static#2152
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants