-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server side analytics #4131
Server side analytics #4131
Conversation
I'm expecting a lint error here around the naming of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks great with some gating implemented, unless I'm missing it.
For this use case, I think it might actually be more useful to have a sample that isn't project specific, since our initial goal is to compare ad rates across all projects. However, we don't have a way to easily handle that from the DB, so I don't want scaling to require a deploy to change the % of traffic to send.
readthedocs/analytics/models.py
Outdated
@@ -0,0 +1 @@ | |||
"""Intentionally blank""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is necessary with the AppConfig.
readthedocs/analytics/signals.py
Outdated
'ua': request.META.get('HTTP_USER_AGENT'), | ||
'uip': get_client_ip(request), | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this is gated by the feature flag, so it would start sending all footer data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, it is not yet.
How about sending ad clicks to GA server side rather than client side? That would be hundreds (but not thousands) of clicks per day. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. I just mentioned a couple small nits.
readthedocs/analytics/tasks.py
Outdated
'ec': None, # Event category (required) | ||
'ea': None, # Event action (required) | ||
'el': None, # Event label | ||
'ev': None, # Event value (numeric) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be setting these to None by default? It seems like we should be a bit more defensive here, and make sure the incoming event_data
contains them.
readthedocs/analytics/utils.py
Outdated
log = logging.getLogger(__name__) # noqa | ||
|
||
# Used to anonymize an IP by zero-ing out the last 2 bytes | ||
MASK = int('0xFFFFFFFFFFFFFFFFFFFFFFFFFFFF0000', 16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this just live in the anonymize_ip_address
function? Not sure if we need it elsewhere.
readthedocs/analytics/utils.py
Outdated
'https://www.google-analytics.com/collect', | ||
data=data, | ||
) | ||
except requests.Timeout: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default timeout here? We should probably set it to something really low.
['rtfd._trackEvent', 'Promo', 'Click', data.id] | ||
); | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder about leaving this logic in, until we can properly confirm the server-side stuff is working. Not sure how important it is, if it's better to possible double count or to just lose the data.
I made the updates based on the feedback including making it more obvious what is required in terms of analytics. |
💯 |
This PR sends an event to Google Analytics from the server side when the Footer API is called.
This is one of our identified possible alternatives to client side GA. Sending data from the server side has many advantages such as being able to anonymize data before it is sent and not having clients directly connect to a site they may not want to. It comes with the downside of needing to have many web servers to handle all our analytics calls.
I am seeking feedback on the structure especially around the structure in terms of using signals/hooks to find when footer calls are made and whether this seems like a sane approach.