Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: sanitize field names #1898

Merged
merged 58 commits into from
Dec 22, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
58 commits
Select commit Hold shift + click to select a range
d48f320
feat: sanitize_field_names functionality implemented
Nov 30, 2020
ea2df4e
fix: inevitable lint
Nov 30, 2020
e54e1bc
fix: fixes out of control recursion
Nov 30, 2020
45fd126
fix: lint
Nov 30, 2020
2ae3bcc
tests: starting integration tests for sanitize field names
Dec 1, 2020
154685d
fix: lint
Dec 1, 2020
80f886c
test: test runner and fixtures for express
Dec 2, 2020
3931b39
chore: remove comments
Dec 2, 2020
f02c413
feat: add koa to test harness for sanitize_field_names
Dec 3, 2020
27495ec
chore: fix accidental "accept both", remove console.log
Dec 3, 2020
a501ea2
feat: hapi tests
Dec 3, 2020
7717e6d
feat: restify tests added to harness
Dec 3, 2020
4746d29
feat: fastify tests
Dec 3, 2020
0bdc12d
feat: fastify tests
Dec 3, 2020
bb7e852
chore: adding issue links
Dec 3, 2020
c5f562e
chore: use strict
Dec 3, 2020
ab80a07
chore: revert this -- handling in a seperate pr
Dec 3, 2020
f1b5e24
feat: document the new configuration value
Dec 3, 2020
56351d8
chore: tav for body parsing
Dec 4, 2020
4050379
chore: update config type
Dec 4, 2020
1bf733d
chore: changelog entry
Dec 4, 2020
819c3b8
chore: console.log removal
Dec 4, 2020
6958a65
chore: remove unneeded tests
Dec 4, 2020
0bba0ee
fix: making sanitizeFieldNames optional
Dec 4, 2020
dc4169c
chore: fix ascii doc build errors
Dec 4, 2020
12a6e6d
feat: redact instead of remove, fix changelog
Dec 16, 2020
d049c29
chore: typos
Dec 16, 2020
58f3af7
feat: adds sanitize regular expressions to is-secret
Dec 17, 2020
b9735f3
chore: lint
Dec 17, 2020
a2b8b21
chore: update default in docs
Dec 17, 2020
5dfc498
fix: adjusting tests to reflect the new world
Dec 17, 2020
4ad6433
chore: lint lint lint
Dec 18, 2020
56f91c1
Update docs/configuration.asciidoc
astorm Dec 18, 2020
295d76e
Update docs/configuration.asciidoc
astorm Dec 18, 2020
38388f8
chore: alphabetical order
Dec 18, 2020
b90b973
chore: update comments, rename function
Dec 18, 2020
ae3b34c
Update test/sanitize-field-names/fixtures.js
astorm Dec 19, 2020
bea04d9
Update test/sanitize-field-names/fixtures.js
astorm Dec 19, 2020
ca4f97f
chore: central place for consants
Dec 19, 2020
435cf80
chore: lint
Dec 19, 2020
a851b08
chore: assert the error, not the fail
Dec 19, 2020
7b333f1
chore: fix copy/pasta
Dec 19, 2020
545571e
chore: ensure shared and fixture are not seen as tests
Dec 19, 2020
7261952
chore: old comment hanging around
Dec 19, 2020
2d0cf94
chore: dup test?
Dec 19, 2020
6424cde
chore: fixing test label
Dec 19, 2020
a78bbb3
chore: subtest naming conventions
Dec 19, 2020
48fd4ee
chore: redact instead of remove
Dec 19, 2020
ea40071
fix: avoid extra array allocated
Dec 19, 2020
18c6d58
chore: deault typo
Dec 19, 2020
14066f7
chore: adjust so things are loaded after the agent
Dec 19, 2020
2dc2d6f
chore: new fixture
Dec 19, 2020
b831268
chore: lint
Dec 19, 2020
04769bf
Merge branch 'master' into astorm/sanitize_field_names
Dec 22, 2020
03f5a7f
fix: _conf, not _config
Dec 22, 2020
eb44700
Update test/sanitize-field-names/express.js
astorm Dec 22, 2020
0ce1520
feat: remove Buffer handling code and tests
Dec 22, 2020
026f6df
chore: remove code that will never be called
Dec 22, 2020
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .tav.yml
Original file line number Diff line number Diff line change
Expand Up @@ -348,3 +348,8 @@ finalhandler:
memcached:
versions: '>=2.2.0'
commands: node test/instrumentation/modules/memcached.js

body-parser:
Copy link
Contributor Author

@astorm astorm Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Express has an external body parser middleware module for capturing application/x-www-form-urlencoded form bodies, so we run its tests through TAV.

Restify and HAPI do not use external middleware for capturing application/x-www-form-urlencoded form bodies, so there's no module to run through TAV.

Fastify and koa do use external middleware for capturing application/x-www-form-urlencoded bodies, but the agent doesn't currently capture their request bodies, so we don't test those module (yet!)

versions: '>=1.19.0'
commands:
- node test/sanitize-field-names/express.js
6 changes: 5 additions & 1 deletion CHANGELOG.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ Notes:
==== x.x.x - YYYY/MM/DD
////


[[release-notes-3.x]]
=== Node.js Agent version 3.x

Expand Down Expand Up @@ -65,6 +64,11 @@ slightly to commonalize:

* feat: Add `log_level` central config support. {pull}1908[#1908] +
Spec: https://github.com/elastic/apm/blob/master/specs/agents/logging.md
* feat: implemented sanitize_feature_names specification +
Allows users to configure a list of wildcard patterns to _remove_ items
from the agent's HTTP header and `application/x-www-form-urlencoded` payloads.
** https://github.com/elastic/apm/blob/master/specs/agents/sanitization.md[spec]
** https://github.com/elastic/apm-agent-nodejs/blob/master/docs/configuration.asciidoc#sanitize-field-names[docs]

[float]
===== Bug fixes
Expand Down
16 changes: 15 additions & 1 deletion docs/configuration.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -706,6 +706,20 @@ The timeout is applied once the agent has sent the entire request body to the AP
If the response from the server takes longer than allowed by this timeout,
the HTTP request is terminated and the TCP socket closed.

[[sanitize-field-names]]
==== `sanitizeFieldNames`
* *Type:* Array
* *Default:* `['password', 'passwd', 'pwd', 'secret', '*key', '*token*', '*session*', '*credit*', '*card*', 'authorization', 'set-cookie', 'pw', 'pass', 'connect.sid']`
* *Env:* `ELASTIC_SANITIZE_FIELD_NAMES`

Remove sensitive data sent to Elastic APM.

The `sanitizeFieldNames` configuration value allows you to configure a list of wildcard patterns of field names which should be redacted from agent payloads. Wildcard matches are
case-insensitive by default. You may make wildcard searches case-sensitive by
using the `(?-i)` prefix. These patterns apply to the request and response HTTP headers, as well as any form field captured during an `application/x-www-form-urlencoded` data request.

The `sanitizeFieldNames` will redact any matched _field names_. If you wish to filter or _redact_ other data the <<filter-http-headers,`filterHttpHeaders`>> configuration field or the <<apm-add-filter,API filtering functions>> may be a better choice.

[[filter-http-headers]]
==== `filterHttpHeaders`

Expand All @@ -723,7 +737,7 @@ the header value will be set to `[REDACTED]`

Currently, the following HTTP headers are anonymized by default:

* `Authorization` - The full value of this header is redacted
* `Authorization` - The full value of this header is redacted. In versions of the agent greater than v3.9, the authorization header is redacted by the default configuration of <<sanitize-field-names>>.
* `Cookie` - The cookies inside the `Cookie` header are analyzed and their values redacted if they appear sensitive (like a session cookie).
See the https://github.com/watson/is-secret[is-secret] module for details about which patterns are considered sensitive.

Expand Down
1 change: 1 addition & 0 deletions index.d.ts
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,7 @@ interface AgentConfigOptions {
metricsInterval?: string; // Also support `number`, but as we're removing this functionality soon, there's no need to advertise it
payloadLogFile?: string;
centralConfig?: boolean;
sanitizeFieldNames?: Array<string>;
secretToken?: string;
serverCaCertFile?: string;
serverTimeout?: string; // Also support `number`, but as we're removing this functionality soon, there's no need to advertise it
Expand Down
21 changes: 20 additions & 1 deletion lib/config.js
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,10 @@ var DEFAULTS = {
logUncaughtExceptions: false, // TODO: Change to `true` in the v4.0.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything in this file is just configuration handling (including converting the wildcards into regular expressions)

metricsInterval: '30s',
metricsLimit: 1000,
sanitizeFieldNames: ['password', 'passwd', 'pwd', 'secret', '*key', '*token*',
'*session*', '*credit*', '*card*', 'authorization', 'set-cookie',
'pw', 'pass', 'connect.sid'
],
serviceNodeName: undefined,
serverTimeout: '30s',
sourceLinesErrorAppFrames: 5,
Expand Down Expand Up @@ -117,6 +121,7 @@ var ENV_TABLE = {
metricsInterval: 'ELASTIC_APM_METRICS_INTERVAL',
metricsLimit: 'ELASTIC_APM_METRICS_LIMIT',
payloadLogFile: 'ELASTIC_APM_PAYLOAD_LOG_FILE',
sanitizeFieldNames: 'ELASTIC_SANITIZE_FIELD_NAMES',
serverCaCertFile: 'ELASTIC_APM_SERVER_CA_CERT_FILE',
secretToken: 'ELASTIC_APM_SECRET_TOKEN',
serverTimeout: 'ELASTIC_APM_SERVER_TIMEOUT',
Expand All @@ -142,7 +147,8 @@ var CENTRAL_CONFIG = {
transaction_sample_rate: 'transactionSampleRate',
transaction_max_spans: 'transactionMaxSpans',
capture_body: 'captureBody',
transaction_ignore_urls: 'transactionIgnoreUrls'
transaction_ignore_urls: 'transactionIgnoreUrls',
sanitize_field_names: 'sanitizeFieldNames'
}

var VALIDATORS = {
Expand Down Expand Up @@ -195,6 +201,7 @@ var MINUS_ONE_EQUAL_INFINITY = [

var ARRAY_OPTS = [
'disableInstrumentations',
'sanitizeFieldNames',
'transactionIgnoreUrls'
]

Expand All @@ -214,6 +221,7 @@ class Config {
this.ignoreUserAgentStr = []
this.ignoreUserAgentRegExp = []
this.transactionIgnoreUrlRegExp = []
this.sanitizeFieldNamesRegExp = []
// If we didn't find a config file on process boot, but a path to one is
// provided as a config option, let's instead try to load that
if (confFile === null && opts && opts.configFile) {
Expand Down Expand Up @@ -408,9 +416,20 @@ function normalize (opts) {
normalizeTime(opts)
normalizeBools(opts)
normalizeIgnoreOptions(opts)
normalizeSanitizeFieldNames(opts)
truncateOptions(opts)
}

function normalizeSanitizeFieldNames (opts) {
if (opts.sanitizeFieldNames) {
const wildcard = new WildcardMatcher()
for (const ptn of opts.sanitizeFieldNames) {
const re = wildcard.compile(ptn)
opts.sanitizeFieldNamesRegExp.push(re)
}
}
}

function normalizeIgnoreOptions (opts) {
if (opts.transactionIgnoreUrls) {
// We can't guarantee that opts will be a Config so set a
Expand Down
7 changes: 7 additions & 0 deletions lib/constants.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
/**
* Central location for shared constants
*/

module.exports = {
REDACTED: '[REDACTED]'
}
43 changes: 27 additions & 16 deletions lib/filters/http-headers.js
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
'use strict'

const REDACTED = '[REDACTED]'
const REDACTED = require('../constants').REDACTED

const cookie = require('cookie')
const redact = require('../redact-secrets')(REDACTED)
Expand All @@ -12,18 +12,28 @@ function httpHeaders (obj) {
const requestHeaders = obj.context && obj.context.request && obj.context.request.headers
const responseHeaders = obj.context && obj.context.response && obj.context.response.headers

if (requestHeaders) filterSensitiveHeaders(requestHeaders)
if (responseHeaders) filterSensitiveHeaders(responseHeaders)
if (requestHeaders) filterCookieHeaders(requestHeaders)
if (responseHeaders) filterCookieHeaders(responseHeaders)

return obj
}

Copy link
Contributor Author

@astorm astorm Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We changed the handling of undefined in this module to be consistent across our all redacted header fields. The was prompted by the fact that

  1. We filter headers before they're redacted
  2. We filter headers by setting their value to undefined
  3. This meant an authorization header we had marked for removal via setting it to undefined would set the REDACTED before the agent had a chance to serialize it as JSON

function filterSensitiveHeaders (headers) {
/**
* Filters cookie _values_ in http headers
*
* The filterCookieHeaders method filters individual
* cookie values
*/
function filterCookieHeaders (headers) {
for (const key in headers) {
astorm marked this conversation as resolved.
Show resolved Hide resolved
// undefined headers will be dropped when serialized as
// json, and don't need to be redacted. If a cookie
// header is already redacted there's no need to parse its
// values.
if (headers[key] === undefined || REDACTED === headers[key]) {
astorm marked this conversation as resolved.
Show resolved Hide resolved
continue
}
switch (key.toLowerCase()) {
case 'authorization':
headers[key] = REDACTED
break
case 'cookie':
if (typeof headers[key] === 'string') {
const cookies = cookie.parse(headers[key])
Expand All @@ -34,16 +44,17 @@ function filterSensitiveHeaders (headers) {
}
break
case 'set-cookie':
if (typeof headers[key] !== 'undefined') {
try {
const setCookies = new SetCookie(headers[key])
redact.forEach(setCookies)
headers[key] = stringify(setCookies)
} catch (err) {
// Ignore error
headers[key] = '[malformed set-cookie header]'
}
// if the sanitize_field_names module has redacted this there's
// no need to attempt the sanitizaion of individual cookies
try {
const setCookies = new SetCookie(headers[key])
redact.forEach(setCookies)
headers[key] = stringify(setCookies)
} catch (err) {
// Ignore error
headers[key] = '[malformed set-cookie header]'
}

break
}
}
Expand Down
77 changes: 77 additions & 0 deletions lib/filters/sanitize-field-names.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
'use strict'
const querystring = require('querystring')

const HEADER_FORM_URLENCODED = 'application/x-www-form-urlencoded'
const REDACTED = require('../constants').REDACTED

/**
* Handles req.body as object or string
*
* Express provides multiple body parser middlewares with x-www-form-urlencoded
* handling. See http://expressjs.com/en/resources/middleware/body-parser.html
*/
function redactKeysFromPostedFormVariables (body, requestHeaders, regexes) {
// only redact from application/x-www-form-urlencoded
if (HEADER_FORM_URLENCODED !== requestHeaders['content-type']) {
return body
}

// if body is a plain object, use redactKeysFromObject
if (body !== null && !Buffer.isBuffer(body) && typeof body === 'object') {
return redactKeysFromObject(body, regexes)
}

// if body is a string, use querystring to create object,
// pass to redactKeysFromObject, and reserialize as string
if (typeof body === 'string') {
const objBody = querystring.parse(body)
redactKeysFromObject(objBody, regexes)
return querystring.stringify(objBody)
}

return body
}

function redactKeyFromObject (obj, regex) {
for (const [key] of Object.entries(obj)) {
if (regex.test(key)) {
obj[key] = REDACTED
}
}
return obj
}

function redactKeysFromObject (obj, regexes) {
if (!obj || !Array.isArray(regexes)) {
return obj
}
for (const [, regex] of regexes.entries()) {
redactKeyFromObject(obj, regex)
}
return obj
}

function createFilter (conf) {
return sanitizeFieldNames

function sanitizeFieldNames (obj) {
const requestHeaders = obj.context && obj.context.request && obj.context.request.headers
const responseHeaders = obj.context && obj.context.response && obj.context.response.headers
const body = obj.context && obj.context.request && obj.context.request.body

redactKeysFromObject(requestHeaders, conf.sanitizeFieldNamesRegExp)
redactKeysFromObject(responseHeaders, conf.sanitizeFieldNamesRegExp)

if (body) {
obj.context.request.body = redactKeysFromPostedFormVariables(body, requestHeaders, conf.sanitizeFieldNamesRegExp)
}

return obj
}
}

module.exports = {
createFilter,
redactKeysFromObject,
redactKeysFromPostedFormVariables
}
13 changes: 12 additions & 1 deletion lib/parsers.js
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,11 @@ var stackman = require('./stackman')

const _MAX_HTTP_BODY_CHARS = 2048

const {
redactKeysFromObject,
redactKeysFromPostedFormVariables
} = require('./filters/sanitize-field-names')

var mysqlErrorMsg = /(ER_[A-Z_]+): /

function parseMessage (msg) {
Expand Down Expand Up @@ -145,7 +150,10 @@ function getContextFromRequest (req, conf, type) {
}

if (conf.captureHeaders) {
context.headers = Object.assign({}, req.headers)
context.headers = redactKeysFromObject(
Object.assign({}, req.headers),
conf.sanitizeFieldNamesRegExp
)
}

var contentLength = parseInt(req.headers['content-length'], 10)
Expand All @@ -160,6 +168,8 @@ function getContextFromRequest (req, conf, type) {
} else if (Buffer.isBuffer(body)) {
context.body = '<Buffer>'
} else {
body = redactKeysFromPostedFormVariables(body, req.headers, conf.sanitizeFieldNamesRegExp)

if (typeof body !== 'string') {
body = tryJsonStringify(body) || stringify(body)
}
Expand All @@ -186,6 +196,7 @@ function getContextFromResponse (res, conf, isError) {

if (conf.captureHeaders) {
context.headers = res.headers || httpHeaders(res, true)
context.headers = redactKeysFromObject(context.headers, conf.sanitizeFieldNamesRegExp)
}

if (isError) {
Expand Down
10 changes: 6 additions & 4 deletions lib/redact-secrets/is-secret.js
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,9 @@
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.

const agent = require('../..')

var KEYS = [
// generic
/passw(or)?d/i,
Expand All @@ -39,14 +42,13 @@ var KEYS = [
var VALUES = [
/^\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}$/ // credit card number
]

exports.key = key
exports.value = value

function key (str) {
return KEYS.some(function (regex) {
return regex.test(str)
})
// grab current value of agent._conf.sanitizeFieldNames
// and include those as keys
return KEYS.some((regex) => regex.test(str)) || agent._conf.sanitizeFieldNamesRegExp.some((regex) => regex.test(str))
}

function value (str) {
Expand Down
3 changes: 3 additions & 0 deletions package.json
Original file line number Diff line number Diff line change
Expand Up @@ -118,8 +118,10 @@
"@babel/core": "^7.8.4",
"@babel/preset-env": "^7.8.4",
"@elastic/elasticsearch": "^7.10.0",
"body-parser": "^1.19.0",
"@hapi/hapi": "^18.4.1",
"@koa/router": "^9.0.1",
"koa-bodyparser": "^3.2.0",
"@types/node": "^13.7.4",
"ajv": "^6.12.6",
"apollo-server-express": "^2.10.1",
Expand All @@ -138,6 +140,7 @@
"express-graphql": "^0.9.0",
"express-queue": "^0.0.12",
"fastify": "^2.12.0",
"fastify-formbody": "^3.2",
"finalhandler": "^1.1.2",
"generic-pool": "^3.7.1",
"get-port": "^5.1.1",
Expand Down
Loading