Do Not Track support #4046

Merged
merged 16 commits into from
May 30, 2018
Changes from 6 commits
53 changes: 49 additions & 4 deletions docs/advertising-details.rst
@@ -110,13 +110,50 @@ However, we always give advance notice in our issue tracker
and via email about showing ads where none were shown before.


.. _do-not-track:

Do Not Track Policy
Member

Not sure this is the right part of the docs for this. It feels like it should probably be its own thing, since it applies to RTD itself, and not just ads.

I realize there are ad-specific things, so maybe a DNT page in the docs, and then also a section here?

Member

Perhaps having it in the Privacy Policy is enough as an additional section. I guess it depends how heavily we want to promote the fact that we support it.

Contributor Author

I think a section in the privacy policy is good and the advertising details will link there.

Member

👍

-------------------

Read the Docs supports Do Not Track (DNT) and respects users' tracking preferences.
Specifically, we support the `W3C's tracking preference expression`_
and the `EFF's DNT Policy`_.

This means:

* We **do not** do behavioral ad targeting regardless of your DNT preference.
  You probably already knew that from reading the rest of this document.
* When DNT is enabled, both logged-in and logged-out users
  are considered opted-out of analytics.
* Regardless of DNT preference, our logs that contain IP addresses
  and user agent strings are deleted after 10 days unless a DNT exception applies.
* Our full DNT policy is `available here`_.

For more details about DNT, visit `All About Do Not Track`_.

Our DNT policy applies without reservation to ``readthedocs.org``.
A best effort is made to apply this to documentation sites hosted for authors
(typically ``*.readthedocs.io``, but also other domains),
but we do not have complete control over the contents of these sites.

.. _W3C's tracking preference expression: https://www.w3.org/TR/tracking-dnt/
.. _EFF's DNT Policy: https://www.eff.org/issues/do-not-track
.. _available here: https://readthedocs.org/.well-known/dnt-policy.txt
.. _All About Do Not Track: http://www.allaboutdnt.com

.. important::

    Due to the nature of our environment where documentation is built as necessary,
    the analytics opt-out only applies to documentation sites built after May 1, 2018.


.. _advertising-analytics:

Analytics
---------

Analytics are a sensitive enough issue that they require their own section.
In the spirit of full transparency, Read the Docs currently uses Google Analytics (GA).
In the spirit of full transparency, Read the Docs uses Google Analytics (GA).
We go into a bit of detail on our use of GA in our :doc:`privacy-policy`.

GA is a contentious issue inside Read the Docs and in our community.
@@ -126,14 +163,22 @@ The developers at Read the Docs understand that different users have different p
and we try to respect the different viewpoints as much as possible while also accomplishing
our own goals.

We have taken steps to address some of the privacy concerns surrounding GA.
These steps apply both to analytics collected by Read the Docs and when
:doc:`authors enable analytics on their docs <guides/google-analytics>`.

* Users can opt-out of analytics by using the Do Not Track feature of their browser.
* Read the Docs instructs Google to anonymize IPs sent to them before they are stored.
* The cookies set by GA expire more rapidly (30 days) than the default.
Member

"We configured the cookies set by GA to only last 30 days, instead of the default of 2 years" reads better.


Why we use analytics
~~~~~~~~~~~~~~~~~~~~

Advertisers ask us questions that are easily answered with an analytics solution like
"how many users do you have in Switzerland browsing Python docs?". We need to be able
to easily get this data. We also use data from GA for some development decisions such
as what browsers to support (or not) or how much usage a particular page or feature gets.

We have taken steps to address some of the privacy concerns.
Read the Docs instructs Google to anonymize IPs sent to them before they are stored.

Alternatives
~~~~~~~~~~~~

2 changes: 1 addition & 1 deletion docs/ethical-advertising.rst
@@ -96,7 +96,7 @@ Additional details

* We have additional documentation on the
:doc:`technical details of our advertising <advertising-details>`
including our use of analytics.
including our Do Not Track policy and our use of analytics.
* We have an `advertising FAQ`_ written for advertisers.
* We have gone into more detail about our views in our
`blog post <https://blog.readthedocs.com/ads-on-read-the-docs/>`_ about this topic.
10 changes: 9 additions & 1 deletion docs/guides/google-analytics.rst
@@ -9,4 +9,12 @@ You can enable it by:

Once your documentation rebuilds it will include your Analytics tracking code and start sending data.
Google Analytics usually takes 60 minutes,
and sometimes can take up to a day before it starts reporting data.
and sometimes can take up to a day before it starts reporting data.

.. note::

    Read the Docs takes some extra precautions with analytics to protect user privacy.
    As a result, users with Do Not Track enabled will not be counted
    for the purpose of analytics.

For more details, see our :ref:`Do Not Track Policy <do-not-track>`.
12 changes: 12 additions & 0 deletions docs/privacy-policy.rst
@@ -218,6 +218,17 @@ We do not sell that content; it is yours.
Our use of cookies and tracking
-------------------------------

Do Not Track
~~~~~~~~~~~~

Read the Docs complies with the "Do Not Track" ("DNT") standard
recommended by the World Wide Web Consortium and the Electronic Frontier Foundation.
Users on readthedocs.org or Documentation Sites with DNT enabled
will be opted-out of analytics.
Regardless of your DNT preference, Read the Docs does not use behavioral targeting for advertising.
At this time, DNT does not apply to our commercial hosting solution on readthedocs.com.
For more details, see our :ref:`Do Not Track policy <do-not-track>`.

Cookies
~~~~~~~

@@ -258,6 +269,7 @@ collect any User Personal Information other than IP address;
or correlate your IP address with your identity.
Google provides further information about its own privacy practices and offers a
`browser add-on to opt out of Google Analytics tracking <https://tools.google.com/dlpage/gaoptout>`_.
You may also opt-out of analytics on Read the Docs by enabling Do Not Track.


How Read the Docs secures your information
59 changes: 33 additions & 26 deletions media/javascript/readthedocs-analytics.js
@@ -2,33 +2,40 @@
// https://docs.readthedocs.io/en/latest/advertising-details.html#analytics


// RTD Analytics Code
// Skip analytics for users with Do Not Track enabled
Member

You can use something like this script to check whether Do Not Track is enabled.

Contributor Author

That entire script is basically to handle an IE10 bug. I'm not sure it's worth the effort.

Member

I think it should be in a script or a function that we can call from any script. Regarding IE, we may still support IE10.

Contributor Author

IE10 is not supported by Microsoft with a couple exceptions and it is a tiny fraction of our users (sub-0.1%). I don't think it is unreasonable to not support a privacy feature for users who are using a browser unsupported by the vendor. In addition, the "support" that the linked script offers is mostly just to mark IE10 as "unspecified" for DNT which for our purpose would be off.

I lean toward simplicity here.

Member

Understood. Then we can keep (window.doNotTrack === '1' || navigator.doNotTrack === '1') in a function and call it from everywhere!

Contributor Author

Actually, after testing this I'm reconsidering. It looks like IE11 on Windows 7 and Windows 8 set the DNT default to on.

Contributor Author

Considering that to use this script would mean marking IE11 and IE10 as having DNT as "unspecified" I'm leaning toward maybe just checking navigator.doNotTrack === '1' and that's it. This would mean that no versions of IE can opt-out of tracking. It would be supported in MS Edge, however.

Contributor Author

> we can keep (window.doNotTrack === '1' || navigator.doNotTrack === '1') in a function and call it from everywhere!

Not very easily actually. readthedocs-analytics.js is loaded on the docs pages and should not have any outside dependencies apart from READTHEDOCS_DATA.

Contributor Author

I updated the PR to just check navigator.doNotTrack === '1'.

if (navigator.doNotTrack === '1') {
Member

Should the JS respect the setting as well? Probably needs to be added to the context manager.

Contributor Author

I don't understand this comment. Can you elaborate?

Member

We shouldn't add this JS opt out unless the DO_NOT_TRACK_ENABLED setting is True

Contributor Author

Ideally, yes, we want to respect DNT only if the people running RTD have the setting enabled. However, the use case we would be supporting by changing that is people who take the readthedocs.org code base and don't want to support DNT.

To do this, we would need to pass the DNT setting through the READTHEDOCS_DATA object. We can do that, but I don't think it's worth it. Do you think it is?

Member

Yea, that makes sense for user docs. I guess I was thinking for the base.html more than the analytics.js.

Member

I think that's the only tidbit I'd add before it's deployable, just so that all our users' dashboards don't get DNT'd automatically.

Contributor Author

I'm going to add this to base.html, but user dashboards are going to get DNT'd. The only thing this will change is that people who take the readthedocs.org codebase and use it in their infrastructure won't get DNT'd automatically on their installation except on docs pages where they still will.

console.log('Respecting DNT with respect to analytics...');
} else {
// RTD Analytics Code
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
if (typeof READTHEDOCS_DATA !== 'undefined') {
if (READTHEDOCS_DATA.global_analytics_code) {
ga('create', READTHEDOCS_DATA.global_analytics_code, 'auto', 'rtfd', {
'cookieExpires': 30 * 24 * 60 * 60
});
ga('rtfd.set', 'dimension1', READTHEDOCS_DATA.project);
ga('rtfd.set', 'dimension2', READTHEDOCS_DATA.version);
ga('rtfd.set', 'dimension3', READTHEDOCS_DATA.language);
ga('rtfd.set', 'dimension4', READTHEDOCS_DATA.theme);
ga('rtfd.set', 'dimension5', READTHEDOCS_DATA.programming_language);
ga('rtfd.set', 'dimension6', READTHEDOCS_DATA.builder);
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');
}

if (typeof READTHEDOCS_DATA !== 'undefined') {
if (READTHEDOCS_DATA.global_analytics_code) {
ga('create', READTHEDOCS_DATA.global_analytics_code, 'auto', 'rtfd');
ga('rtfd.set', 'dimension1', READTHEDOCS_DATA.project);
ga('rtfd.set', 'dimension2', READTHEDOCS_DATA.version);
ga('rtfd.set', 'dimension3', READTHEDOCS_DATA.language);
ga('rtfd.set', 'dimension4', READTHEDOCS_DATA.theme);
ga('rtfd.set', 'dimension5', READTHEDOCS_DATA.programming_language);
ga('rtfd.set', 'dimension6', READTHEDOCS_DATA.builder);
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');
// User Analytics Code
if (READTHEDOCS_DATA.user_analytics_code) {
ga('create', READTHEDOCS_DATA.user_analytics_code, 'auto', 'user', {
'cookieExpires': 30 * 24 * 60 * 60
});
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
}
// End User Analytics Code
}

// User Analytics Code
if (READTHEDOCS_DATA.user_analytics_code) {
ga('create', READTHEDOCS_DATA.user_analytics_code, 'auto', 'user');
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
}
// End User Analytics Code
// end RTD Analytics Code
}

// end RTD Analytics Code
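
The `cookieExpires` field passed to `ga('create', ...)` in this hunk is expressed in seconds. As a rough check of the arithmetic, the sketch below compares the 30-day value from the diff with the two-year figure mentioned in the review. The constant names are illustrative and do not appear in the PR, and the two-year default is taken from the analytics.js field reference as an assumption.

```python
# Illustrative arithmetic only; these names are not part of the PR.
SECONDS_PER_DAY = 24 * 60 * 60

# Value used in the diff: 30 * 24 * 60 * 60
rtd_cookie_expires = 30 * SECONDS_PER_DAY

# Assumed analytics.js default for cookieExpires (two years, in seconds).
ga_default_cookie_expires = 63072000

print(rtd_cookie_expires)                             # 2592000
print(ga_default_cookie_expires // SECONDS_PER_DAY)   # 730
```

The configured lifetime is therefore about 4% of the assumed default.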
16 changes: 15 additions & 1 deletion readthedocs/core/views/__init__.py
@@ -12,7 +12,7 @@
import logging

from django.conf import settings
from django.http import HttpResponseRedirect, Http404
from django.http import HttpResponseRedirect, Http404, JsonResponse
from django.shortcuts import render, get_object_or_404, redirect
from django.views.decorators.csrf import csrf_exempt
from django.views.generic import TemplateView
@@ -115,3 +115,17 @@ def server_error_404(request, exception, template_name='404.html'): # pylint: d
    r = render(request, template_name)
    r.status_code = 404
    return r


def do_not_track(request):
    dnt_header = request.META.get('HTTP_DNT')

    # https://w3c.github.io/dnt/drafts/tracking-dnt.html#status-representation
    return JsonResponse({  # pylint: disable=redundant-content-type-for-json-response
        'policy': 'https://docs.readthedocs.io/en/latest/privacy-policy.html',
        'same-party': [
            'readthedocs.org',
            'readthedocs.io',
        ],
        'tracking': 'N' if dnt_header == '1' else 'T',
    }, content_type='application/tracking-status+json')
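
The decision made by the view above can be exercised without Django. The sketch below is a minimal, framework-free restatement of the same logic; the function name `tracking_status` is hypothetical and not part of the PR. It only shows how the raw `DNT` header value maps to the `tracking` field, where `N` means "not tracking" and `T` means "tracking" in the W3C status representation.

```python
import json


def tracking_status(dnt_header):
    """Build the status payload from the raw DNT header value (hypothetical helper)."""
    return {
        'policy': 'https://docs.readthedocs.io/en/latest/privacy-policy.html',
        'same-party': ['readthedocs.org', 'readthedocs.io'],
        # Only the exact string '1' counts as an opt-out; anything else,
        # including a missing header (None), is reported as tracking.
        'tracking': 'N' if dnt_header == '1' else 'T',
    }


for header in ('1', '0', None):
    print(repr(header), json.dumps(tracking_status(header)['tracking']))
```

Note that a missing header and an explicit `0` are treated identically, which mirrors the strict `navigator.doNotTrack === '1'` check used on the client side.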
2 changes: 2 additions & 0 deletions readthedocs/settings/base.py
@@ -59,6 +59,8 @@ class CommunityBaseSettings(Settings):
SESSION_COOKIE_DOMAIN = 'readthedocs.org'
SESSION_COOKIE_HTTPONLY = True
CSRF_COOKIE_HTTPONLY = True
# See: docs/advertising-details.rst
CSRF_COOKIE_AGE = None # session cookie (expires on browser quit)
Member

This will break user experience

Contributor Author

How so?

Member

If you have closed the browser and then restore the window, the page will be loaded from cache, but the CSRF cookie will not be there, so submitting a form may raise a CSRF error. Maybe you can try using Django's session-based CSRF?

Contributor Author

This is potentially true and is warned about in the Django docs. I don't think it will be a big issue for us since it only affects form submissions of which the only ones on a non-authed page are the login/signup forms. However, we could make the CSRF cookie age match the logged in cookie age (~2 weeks) to mitigate it. Thoughts?

Contributor Author

I set it to 30 days so it matches the GA cookie. I think that's a reasonable balance so it's pretty obvious we aren't using it to track users.
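
Django's `CSRF_COOKIE_AGE` is given in seconds, and `None` produces a session cookie. The sketch below lays out the three options raised in this thread side by side. The helper `cookie_age` and the `OPTIONS` labels are hypothetical, and the final assignment is an assumption based on the last comment rather than a copy of the merged settings file.

```python
from datetime import timedelta


def cookie_age(days):
    """Convert a lifetime in days to seconds; None means a session cookie."""
    if days is None:
        return None
    return int(timedelta(days=days).total_seconds())


# The three options discussed in the review (labels are illustrative).
OPTIONS = {
    'session': cookie_age(None),     # as in the diff shown: expires on browser quit
    'two_weeks': cookie_age(14),     # roughly the logged-in session lifetime
    'thirty_days': cookie_age(30),   # matches the GA cookieExpires value
}

# Assumed final value, following the last comment in the thread.
CSRF_COOKIE_AGE = OPTIONS['thirty_days']

for name, seconds in OPTIONS.items():
    print(name, seconds)
```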


# Application classes
@property
42 changes: 26 additions & 16 deletions readthedocs/templates/base.html
@@ -17,22 +17,32 @@

<!-- Google Analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', '{{ GLOBAL_ANALYTICS_CODE }}', 'auto', 'rtfd');
ga('rtfd.set', 'anonymizeIp', true);
ga('rtfd.send', 'pageview');

{% if DASHBOARD_ANALYTICS_CODE %}
// Dashboard Analytics Code
ga('create', '{{ DASHBOARD_ANALYTICS_CODE }}', 'auto', 'user');
ga('user.set', 'anonymizeIp', true);
ga('user.send', 'pageview');
// End Dashboard Analytics Code
{% endif %}
if (navigator.doNotTrack === '1') {
  console.log('Respecting DNT with respect to analytics...');
} else {
  // For more details on analytics at Read the Docs, please see:
  // https://docs.readthedocs.io/en/latest/advertising-details.html#analytics
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

  ga('create', '{{ GLOBAL_ANALYTICS_CODE }}', 'auto', 'rtfd', {
    'cookieExpires': 30 * 24 * 60 * 60
  });
  ga('rtfd.set', 'anonymizeIp', true);
  ga('rtfd.send', 'pageview');

  {% if DASHBOARD_ANALYTICS_CODE %}
  // Dashboard Analytics Code
  ga('create', '{{ DASHBOARD_ANALYTICS_CODE }}', 'auto', 'user', {
    'cookieExpires': 30 * 24 * 60 * 60
  });
  ga('user.set', 'anonymizeIp', true);
  ga('user.send', 'pageview');
  // End Dashboard Analytics Code
  {% endif %}
}
</script>
<!-- End Google Analytics -->
