Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ID Library #5863

Merged
merged 9 commits into from
Nov 12, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
243 changes: 243 additions & 0 deletions modules/idLibrary.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,243 @@
import {getGlobal} from '../src/prebidGlobal.js';
import {ajax} from '../src/ajax.js';
import {config} from '../src/config.js';
import * as utils from '../src/utils.js';
import MD5 from 'crypto-js/md5.js';

let email;
let conf;
const LOG_PRE_FIX = 'ID-Library: ';
const CONF_DEFAULT_OBSERVER_DEBOUNCE_MS = 250;
const CONF_DEFAULT_FULL_BODY_SCAN = true;
const OBSERVER_CONFIG = {
subtree: true,
attributes: true,
attributeOldValue: false,
childList: true,
attirbuteFilter: ['value'],
characterData: true,
characterDataOldValue: false
};
const logInfo = createLogInfo(LOG_PRE_FIX);
const logError = createLogError(LOG_PRE_FIX);

function createLogInfo(prefix) {
return function (...strings) {
utils.logInfo(prefix + ' ', ...strings);
}
}

function createLogError(prefix) {
return function (...strings) {
utils.logError(prefix + ' ', ...strings);
}
}

function getEmail(value) {
const matched = value.match(/([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9._-]+)/gi);
if (!matched) {
return null;
}
logInfo('Email found' + matched[0]);
return matched[0];
}

function bodyAction(mutations, observer) {
logInfo('BODY observer on debounce called');
// If the email is found in the input element, disconnect the observer
if (email) {
observer.disconnect();
logInfo('Email is found, body observer disconnected');
return;
}

const body = document.body.innerHTML;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are scanning the entire page with regex for an email?

Can you explain the use case here? This doesn't seem very prebid-y.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless the publisher has configured the element from which the email needs to be fetched, we will have to scan the entire page to get email.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't the publisher just pass the hashed email directly to the library when they initiate it? i agree that scanning an entire page is not ideal

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend we make this scanning of entire document a configurable option via the setConfig call to this module.

We will leave it defaulting to true but I think it is in our best interest to document the behavior of the module thoroughly, and make sure the publisher has the control necessary.

@jdwieland8282 Can we make sure the associated docs PR contains the steps outlined and the fallbacks?

Something that outlines the flow:

How the target is found:

  1. Uses publisher defined target element
  2. Searches for HTML input (text/email) element
  3. Searches entire document for email using regex.

And outline that the latter (3) can be configured off by something like:

pbjs.setConfig({
   idLibrary: {
      fallbackSearch: false // defaults to true (I am bad at naming vars)
   }
});

@SKOCHERI

Copy link
Member

@jdwieland8282 jdwieland8282 Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertrmartinez the existing Docs PR describes the following:

A persistent identifier can be extracted in the following ways:

1. From a generic `<div>` element
2. From a publisher configured HTML element id
3. From an `<input>` element of type text/email

Are you suggesting your 1-3 above replace the existing 1-3?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well just something that outlines exactly the flow of the module.

Could be another flow chart or something.

email = getEmail(body);
if (email !== null) {
logInfo(`Email obtained from the body ${email}`);
observer.disconnect();
logInfo('Post data on email found in body');
postData();
}
}

function targetAction(mutations, observer) {
logInfo('Target observer called');
for (const mutation of mutations) {
for (const node of mutation.addedNodes) {
email = node.textContent;

if (email) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get an email do we need to keep going through all other mutations?

Should there only ever be one email on page?

Should we break here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once email is found, data is posted and returned now

logInfo('Email obtained from the target ' + email);
observer.disconnect();
logInfo('Post data on email found in target');
postData();
return;
}
}
}
}

function addInputElementsElementListner(conf) {
logInfo('Adding input element listeners');
const inputs = document.querySelectorAll('input[type=text], input[type=email]');

for (var i = 0; i < inputs.length; i++) {
logInfo(`Original Value in Input = ${inputs[i].value}`);
inputs[i].addEventListener('change', event => processInputChange(event));
inputs[i].addEventListener('blur', event => processInputChange(event));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a forEach instead of a for loop since you do not need to exit early or anything.

Just a personal preference, can leave it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using forEach LGTM analysis: JavaScript test failed so used for loop

}

function removeInputElementsElementListner() {
logInfo('Removing input element listeners');
const inputs = document.querySelectorAll('input[type=text], input[type=email]');

for (var i = 0; i < inputs.length; i++) {
inputs[i].removeEventListener('change', event => processInputChange(event));
inputs[i].removeEventListener('blur', event => processInputChange(event));
}
}

function processInputChange(event) {
const value = event.target.value;
logInfo(`Modified Value of input ${event.target.value}`);
email = getEmail(value);
if (email !== null) {
logInfo('Email found in input ' + email);
postData();
removeInputElementsElementListner();
}
}

function debounce(func, wait, immediate) {
var timeout;
return function () {
const context = this;
const args = arguments;
const later = function () {
timeout = null;
if (!immediate) func.apply(context, args);
};
var callNow = immediate && !timeout;
clearTimeout(timeout);
if (callNow) {
func.apply(context, args);
} else {
logInfo('Debounce wait time ' + wait);
timeout = setTimeout(later, wait);
}
};
};

function handleTargetElement() {
const targetObserver = new MutationObserver(debounce(targetAction, conf.debounce, false));

const targetElement = document.getElementById(conf.target);
if (targetElement) {
email = targetElement.innerText;

if (!email) {
logInfo('Finding the email with observer');
targetObserver.observe(targetElement, OBSERVER_CONFIG);
} else {
logInfo('Target found with target ' + email);
logInfo('Post data on email found in target with target');
postData();
}
}
}

function handleBodyElements() {
if (doesInputElementsHaveEmail()) {
logInfo('Email found in input elements ' + email);
logInfo('Post data on email found in target without');
postData();
return;
}
email = getEmail(document.body.innerHTML);
if (email !== null) {
logInfo('Email found in body ' + email);
logInfo('Post data on email found in the body without observer');
postData();
return;
}
addInputElementsElementListner();
if (conf.fullscan === true) {
const bodyObserver = new MutationObserver(debounce(bodyAction, conf.debounce, false));
bodyObserver.observe(document.body, OBSERVER_CONFIG);
}
}

function doesInputElementsHaveEmail() {
const inputs = document.getElementsByTagName('input');

for (let index = 0; index < inputs.length; ++index) {
const curInput = inputs[index];
email = getEmail(curInput.value);
if (email !== null) {
return true;
}
}
return false;
}

function syncCallback() {
return {
success: function () {
logInfo('Data synced successfully.');
},
error: function () {
logInfo('Data sync failed.');
}
}
}

function postData() {
(getGlobal()).refreshUserIds();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we refreshing userIds every single call?

Not sure that was the intent when pbjs.refreshUserIds was implemented.

@jdwieland8282

CC @bretg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refreshUserIds is called prior to posting data which happens only when a email is found.

const userIds = (getGlobal()).getUserIds();
if (Object.keys(userIds).length === 0) {
logInfo('No user ids');
return;
}
logInfo('Users' + JSON.stringify(userIds));
const syncPayload = {};
syncPayload.hid = MD5(email).toString();
syncPayload.uids = JSON.stringify(userIds);
const payloadString = JSON.stringify(syncPayload);
logInfo(payloadString);
ajax(conf.url, syncCallback(), payloadString, {method: 'POST', withCredentials: true});
}

function associateIds() {
if (window.MutationObserver || window.WebKitMutationObserver) {
if (conf.target) {
handleTargetElement();
} else {
handleBodyElements();
}
}
}

export function setConfig(config) {
if (!config) {
logError('Required confirguration not provided');
return;
}
if (!config.url) {
logError('The required url is not configured');
return;
}
if (typeof config.debounce !== 'number') {
config.debounce = CONF_DEFAULT_OBSERVER_DEBOUNCE_MS;
logInfo('Set default observer debounce to ' + CONF_DEFAULT_OBSERVER_DEBOUNCE_MS);
}
if (typeof config.fullscan !== 'boolean') {
config.fullscan = CONF_DEFAULT_FULL_BODY_SCAN;
logInfo('Set default fullscan ' + CONF_DEFAULT_FULL_BODY_SCAN);
}
conf = config;
associateIds();
}

config.getConfig('idLibrary', config => setConfig(config.idLibrary));
24 changes: 24 additions & 0 deletions modules/idLibrary.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
## ID Library Configuration Example


|Param |Required |Description |
|----------------|-------------------------------|-----------------------------|
|url |Yes | The url endpoint is used to post the hashed email and user ids. |
|target |No |It should contain the element id from which the email can be read. |
|debounce |No | Time in milliseconds before the email and ids are fetched |
|fullscan |No | Option to enable/disable full body scan to get email. By default the full body scan is enabled. |

### Example
```
pbjs.setConfig({
idLibrary:{
url: <url>,
debounce: 250,
target: 'username',
fullscan: false
},
});
```


```
61 changes: 61 additions & 0 deletions test/spec/modules/idLibrary_spec.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
import * as utils from 'src/utils.js';
import * as idlibrary from 'modules/idLibrary.js';

var expect = require('chai').expect;

describe('currency', function () {
let fakeCurrencyFileServer;
let sandbox;
let clock;

let fn = sinon.spy();

beforeEach(function () {
fakeCurrencyFileServer = sinon.fakeServer.create();
sinon.stub(utils, 'logInfo');
sinon.stub(utils, 'logError');
});

afterEach(function () {
utils.logInfo.restore();
utils.logError.restore();
fakeCurrencyFileServer.restore();
idlibrary.setConfig({});
});

describe('setConfig', function () {
beforeEach(function() {
sandbox = sinon.sandbox.create();
clock = sinon.useFakeTimers(1046952000000); // 2003-03-06T12:00:00Z
});

afterEach(function () {
sandbox.restore();
clock.restore();
});

it('results when no config available', function () {
idlibrary.setConfig({});
sinon.assert.called(utils.logError);
});
it('results with config available', function () {
idlibrary.setConfig({ 'url': 'URL' });
sinon.assert.called(utils.logInfo);
});
it('results with config default debounce ', function () {
let config = { 'url': 'URL' }
idlibrary.setConfig(config);
expect(config.debounce).to.be.equal(250);
});
it('results with config default fullscan ', function () {
let config = { 'url': 'URL' }
idlibrary.setConfig(config);
expect(config.fullscan).to.be.equal(true);
});
it('results with config fullscan ', function () {
let config = { 'url': 'URL', 'fullscan': false }
idlibrary.setConfig(config);
expect(config.fullscan).to.be.equal(false);
});
});
});