From b319fff1ebee11ec4f664fe0dd8fcae2399b89b1 Mon Sep 17 00:00:00 2001
From: David Fischer
Date: Mon, 16 Apr 2018 15:53:57 -0700
Subject: [PATCH 1/3] Add advertising details docs

---
 docs/advertising-details.rst              | 130 ++++++++++++++++++++++
 docs/ethical-advertising.rst              |  17 ++-
 docs/index.rst                            |   1 +
 media/javascript/readthedocs-analytics.js |  20 +---
 4 files changed, 147 insertions(+), 21 deletions(-)
 create mode 100644 docs/advertising-details.rst

diff --git a/docs/advertising-details.rst b/docs/advertising-details.rst
new file mode 100644
index 00000000000..254c1587a2e
--- /dev/null
+++ b/docs/advertising-details.rst
@@ -0,0 +1,130 @@
+Advertising Details
+===================
+
+.. NOTE: This document is linked from:
+.. https://media.readthedocs.org/javascript/readthedocs-analytics.js
+
+Read the Docs largely funds our operations and development through advertising.
+However, we aren't willing to compromise our values, document authors,
+or site visitors simply to make a bit more money.
+That's why we created our
+:doc:`ethical advertising ` initiative.
+
+We get a lot of inquiries about our approach to advertising which range
+from questions about our practices to requests to partner.
+The goal of this document is to shed light on the advertising industry,
+exactly what we do for advertising, and how what we do is different.
+If you have questions or comments,
+`send us an email `_
+or `open an issue on GitHub `_.
+
+
+Other ad networks' targeting
+----------------------------
+
+Some ad networks build a database of user data in order to predict the types
+of ads that are likely to be clicked.
+In the advertising industry, this is called *behavioral targeting*.
+This can include data such as:
+
+* sites a user has visited
+* a user's search history
+* ads, pages, or stories a user has clicked on in the past
+* demographic information such as age, gender, or income level
+
+Typically, getting a user's page visit history is accomplished by the use of trackers
+(sometimes called beacons or pixels).
+For example, if a site uses a tracker from an ad network and a user visits their site,
+the site can now target future advertising to that user -- a known past visitor --
+with that network. This is called *retargeting*.
+
+Other ad predictions are made by grouping similar users
+together based on user data using machine learning.
+Frequently this involves an advertiser uploading personal data on users
+(often past customers of the advertiser)
+to an ad network and telling the network to target similar users.
+The idea is that two users with similar demographic information
+and similar interests would like the same products.
+In ad tech, this is known as *lookalike audiences* or *similar audiences*.
+
+Understandably, many people have concerns about these targeting techniques.
+The modern advertising industry has built enormous value by centralizing
+massive amounts of data on as many people as possible.
+
+
+Our targeting details
+---------------------
+
+**Read the Docs doesn't use the above techniques**.
+Instead, we target based solely upon:
+
+* Details of the page where the advertisement is shown including:
+
+  * The name, keywords, or programming language associated with the project being viewed
+  * Content of the page (eg. H1, title, theme, etc.)
+  * Whether the page is being viewed from a mobile device
+
+* General geography
+
+  * We allow advertisers to target ads to a list of countries or to exclude
+    countries from their advertising.
+  * We geolocate a user's IP address to a country when a request is made.
+
+Read the Docs uses GeoLite2 data created by `MaxMind `_.
+
+.. note::
+
+    We are considering expanding geographic targeting in the USA and Canada.
+    Because the USA and Canada are so large, we are considering allowing ads to be
+    targeted to a state or province or to a major metro area (DMA).
+    This document will be updated if that happens.
+
+
+.. _advertising-analytics:
+
+Analytics
+---------
+
+Analytics are a sensitive enough issue that they require their own section.
+In the spirit of full transparency, Read the Docs currently uses Google Analytics (GA).
+
+GA is a contentious issue inside Read the Docs and in our community.
+Some users are very sensitive and privacy conscious to usage of GA.
+Some authors want their own analytics on their docs to see the usage their docs get.
+The developers at Read the Docs understand that different users have different priorities
+and we try to respect the different viewpoints as much as possible while also accomplishing
+our own goals.
+
+Advertisers ask us questions that are easily answered with an analytics solution like
+"how many users do you have in Switzerland browsing Python docs?". We need to be able
+to easily get this data. We also use data from GA for some development decisions such
+as what browsers to support (or not) or how much usage a particular page or feature gets.
+
+We have taken steps to address some of the privacy concerns.
+Read the Docs instructs Google to anonymize IPs sent to them before they are stored.
+
+Alternatives
+~~~~~~~~~~~~
+
+We are always exploring our options with respect to analytics.
+There are alternatives but none of them are without downsides.
+Some alternatives are:
+
+* Run a different cloud analytics solution from a provider other than Google
+  (eg. Parse.ly, Matomo Cloud, Adobe Analytics).
+  We priced a couple of these out based on our load and they are very expensive.
+  They also just substitute one problem of data sharing with another.
+* Send data to GA (or another cloud analytics provider) on the server side and
+  strip or anonymize personal data such as IPs before sending them.
+  This would be a complex solution and involve additional infrastructure,
+  but it would have many advantages. It would result in a loss of data on
+  "sessions" and new vs. returning visitors which are of limited value to us.
+* Run a local JavaScript based analytics solution (eg. Matomo community).
+  This involves additional infrastructure that needs to be always up.
+  Frequently there are very large databases associated with this.
+  Many of these solutions aren't built to handle Read the Docs' load.
+* Run a local analytics solution based on web server log parsing.
+  This has the same infrastructure problems as above while also
+  not capturing all the data we want (without additional engineering) like the
+  programming language of the docs being shown or
+  whether the docs are built with Sphinx or something else.
diff --git a/docs/ethical-advertising.rst b/docs/ethical-advertising.rst
index 7ca97b34206..9fe5c07b196 100644
--- a/docs/ethical-advertising.rst
+++ b/docs/ethical-advertising.rst
@@ -33,9 +33,6 @@ or seeing paid ads if you want.
 You will still see :ref:`community ads `, which we run
 for free that promote community projects.
 
-We have gone into more detail about our views in our `blog post `_ about this topic.
-Eric Holscher, one of our co-founders `talks a bit more `_ about funding open source this way on his blog.
-
 .. _ethical-info:
 
 Our worldview
@@ -94,6 +91,20 @@ in a way that makes us feel good.
 
 .. _fake ad clicks: https://en.wikipedia.org/wiki/Click_fraud
 
+Additional details
+------------------
+
+* We have additional documentation on the
+  :doc:`technical details of our advertising `.
+* We have an `advertising FAQ`_ written for advertisers.
+* We have gone into more detail about our views in our
+  `blog post `_ about this topic.
+* Eric Holscher, one of our co-founders
+  `talks a bit more `_
+  about funding open source this way on his blog.
+
+.. _advertising FAQ: https://readthedocs.org/sustainability/advertising/faq/
+
 Join us
 -------
 
diff --git a/docs/index.rst b/docs/index.rst
index d6e30f31272..022aebca03e 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -59,6 +59,7 @@ Information about development is also available:
    gsoc
    code-of-conduct
    ethical-advertising
+   advertising-details
    sponsors
    open-source-philosophy
    story
diff --git a/media/javascript/readthedocs-analytics.js b/media/javascript/readthedocs-analytics.js
index e04090308f1..3f550ca8ba5 100644
--- a/media/javascript/readthedocs-analytics.js
+++ b/media/javascript/readthedocs-analytics.js
@@ -1,21 +1,5 @@
-// Google Analytics is a contentious issue inside Read the Docs and in our community.
-// Some users are very sensitive and privacy conscious to usage of GA.
-// Other users want their own GA tracker on their docs to see the usage their docs get.
-// The developers at Read the Docs understand that different users have different priorities
-// and we try to respect the different viewpoints as much as possible while also accomplishing
-// our own goals.
-
-// Read the Docs largely funds our operations and development through advertising and
-// advertisers ask us questions that are easily answered with an analytics solution like
-// "how many users do you have in Switzerland browsing Python docs?". We need to be able
-// to easily get this data. We also use data from GA for some development decisions such
-// as what browsers to support (or not) or how much usage a particular page/feature gets.
-
-// We have taken steps with GA to address some of the privacy issues.
-// Read the Docs instructs Google to anonymize IPs sent to them before they are stored (see below).
-
-// We are always exploring our options with respect to analytics and if you would like
-// to discuss further, feel free to open an issue on github.
+// For more details on analytics at Read the Docs, please see:
+// https://docs.readthedocs.io/en/latest/advertising-details.html#analytics
 
 
 // RTD Analytics Code

From a12851d77454ac046553060ae4ea459eaad050e2 Mon Sep 17 00:00:00 2001
From: David Fischer
Date: Tue, 17 Apr 2018 12:29:21 -0700
Subject: [PATCH 2/3] Changes based on feedback

---
 docs/ethical-advertising.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/ethical-advertising.rst b/docs/ethical-advertising.rst
index 9fe5c07b196..1de603cdcff 100644
--- a/docs/ethical-advertising.rst
+++ b/docs/ethical-advertising.rst
@@ -95,7 +95,8 @@ Additional details
 ------------------
 
 * We have additional documentation on the
-  :doc:`technical details of our advertising `.
+  :doc:`technical details of our advertising `
+  including our use of analytics.
 * We have an `advertising FAQ`_ written for advertisers.
 * We have gone into more detail about our views in our
   `blog post `_ about this topic.

From d703f1c881e6d2df3a42537c807971ba31a9c277 Mon Sep 17 00:00:00 2001
From: David Fischer
Date: Tue, 17 Apr 2018 12:35:57 -0700
Subject: [PATCH 3/3] Small verbiage change for clarity

---
 docs/advertising-details.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/advertising-details.rst b/docs/advertising-details.rst
index 254c1587a2e..e52c7d014b7 100644
--- a/docs/advertising-details.rst
+++ b/docs/advertising-details.rst
@@ -34,7 +34,7 @@ This can include data such as:
 
 Typically, getting a user's page visit history is accomplished by the use of trackers
 (sometimes called beacons or pixels).
-For example, if a site uses a tracker from an ad network and a user visits their site,
+For example, if a site uses a tracker from an ad network and a user visits that site,
 the site can now target future advertising to that user -- a known past visitor --
 with that network. This is called *retargeting*.
 