Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Requests to Elasticsearch with two traceparent headers should yield 400 Bad Request #107338

Closed
trentm opened this issue Apr 10, 2024 · 2 comments · Fixed by #107355
Closed

Requests to Elasticsearch with two traceparent headers should yield 400 Bad Request #107338

trentm opened this issue Apr 10, 2024 · 2 comments · Fixed by #107355
Labels
>bug :Distributed Coordination/Network Http and internode communication implementations Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@trentm
Copy link
Member

trentm commented Apr 10, 2024

Elasticsearch Version

8.12.2

Installed Plugins

No response

Java Version

bundled

OS Version

Darwin (though this happens on ESS, and an Elasticsearch docker image)

Problem Description

Requests to Elasticsearch with two traceparent headers fail. I don't think they should, but I'm happy to be proven wrong.

We received an issue from a user that requests to Elasticsearch from the JS client were failing when their Node.js app was instrumented with the Elastic APM Node.js agent. It turns out that the user was using two separate APM agents on the app. With both APM agents running, HTTP requests to Elasticsearch would have two traceparent headers.

This section of the W3C trace-context spec (https://w3c.github.io/trace-context/#a-traceparent-is-received) says:

5.1 Processing Model for Working with Trace Context Request Header
...
If a traceparent header is received:

  1. The vendor checks an incoming request for a traceparent and a tracestate header.
  2. Because the traceparent header is present, the vendor tries to parse the version of the traceparent header.
    1. If the version cannot be parsed, the vendor creates a new traceparent header and deletes tracestate.
      ...

My read of that spec is that an invalid traceparent is received (I'd classify multiple traceparent headers as invalid, though the spec doesn't specifically say that), then the traceparent and tracestate headers should be ignored.

Steps to Reproduce

curl -v $ES_URL -H traceparent:asdf -H traceparent:qwerty

For example:

% curl -v https://[REDACTED].es.us-west2.gcp.elastic-cloud.com/ -H traceparent:asdf -H traceparent:qwerty
...
> GET / HTTP/2
> Host:[REDACTED].es.us-west2.gcp.elastic-cloud.com
> User-Agent: curl/8.4.0
> Accept: */*
> traceparent:asdf
> traceparent:qwerty
>
* received GOAWAY, error=0, last_stream=1
< HTTP/2 502
< content-type: application/json; charset=UTF-8
< x-cloud-request-id: xHAxKfEoQKiYGTbKqHzfzw
< x-found-handling-cluster: [REDACTED]
< x-found-handling-instance: instance-0000000000
< content-length: 51
< date: Wed, 10 Apr 2024 21:58:48 GMT
<
{"ok":false,"message":"backend closed connection"}

A log message from a v8.13.0-SNAPSHOT test cluster of mine for a request list this:

[instance-0000000000] caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=/172.17.0.12:18804, remoteAddress=/10.47.207.223:45596} java.lang.IllegalArgumentException: multiple values for single-valued header [traceparent]. at org.elasticsearch.action.ActionModule.copyRequestHeadersToThreadContext(ActionModule.java:591) ~[elasticsearch-8.13.0-SNAPSHOT.jar:?] at org.elasticsearch.xpack.security.Security.lambda$getHttpTransports$35(Security.java:1753) ~[?:?] at org.elasticsearch.xpack.security.Security.lambda$getHttpServerTransportWithHeadersValidator$38(Security.java:1815) ~[?:?] at org.elasticsearch.http.netty4.internal.HttpHeadersAuthenticatorUtils.lambda$getValidatorInboundHandler$2(HttpHeadersAuthenticatorUtils.java:47) ~[?:?] at org.elasticsearch.http.netty4.Netty4HttpHeaderValidator.requestStart(Netty4HttpHeaderValidator.java:126) ~[?:?] at org.elasticsearch.http.netty4.Netty4HttpHeaderValidator.channelRead(Netty4HttpHeaderValidator.java:61) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?] at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[?:?] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?] at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?] at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383) ~[?:?] at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246) ~[?:?] at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295) ~[?:?] at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529) ~[?:?] at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468) ~[?:?] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[?:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[?:?] at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) ~[?:?] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) ~[?:?] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[?:?] at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[?:?] at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?] at java.lang.Thread.run(Thread.java:1583) ~[?:?]

Logs (if relevant)

[2024-04-11T10:27:50,384][WARN ][o.e.h.AbstractHttpServerTransport] [runTask-0] caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=/[0:0:0:0:0:0:0:1]:9200, remoteAddress=/[0:0:0:0:0:0:0:1]:62287} java.lang.IllegalArgumentException: multiple values for single-valued header [traceparent].
        at [email protected]/org.elasticsearch.action.ActionModule.copyRequestHeadersToThreadContext(ActionModule.java:584)
        at [email protected]/org.elasticsearch.xpack.security.Security.lambda$getHttpTransports$35(Security.java:1803)
        at [email protected]/org.elasticsearch.xpack.security.Security.lambda$getHttpServerTransportWithHeadersValidator$38(Security.java:1865)
        at [email protected]/org.elasticsearch.http.netty4.internal.HttpHeadersAuthenticatorUtils.lambda$getValidatorInboundHandler$2(HttpHeadersAuthenticatorUtils.java:47)
        at [email protected]/org.elasticsearch.http.netty4.Netty4HttpHeaderValidator.requestStart(Netty4HttpHeaderValidator.java:126)
        at [email protected]/org.elasticsearch.http.netty4.Netty4HttpHeaderValidator.channelRead(Netty4HttpHeaderValidator.java:61)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at [email protected]/io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
        at [email protected]/io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at [email protected]/io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at [email protected]/io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at [email protected]/io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at [email protected]/io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
        at [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
        at [email protected]/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at [email protected]/io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.base/java.lang.Thread.run(Thread.java:1583)
@DaveCTurner
Copy link
Contributor

The spec describes how to handle the case where no header is received, and when a (singular) header is received, but not how to deal with multiple such headers, so I think it's legitimate to consider such requests malformed and reject them straight away. We generally prefer to be as strict as possible about matters like this. I dread to think how hard it would be to debug this if we made things more lenient.

That said, we should be returning a 400 Bad Request response, not just closing the connection abruptly like this, so that's a bug.

@DaveCTurner DaveCTurner added :Distributed Coordination/Network Http and internode communication implementations and removed needs:triage Requires assignment of a team area label labels Apr 11, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Apr 11, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner DaveCTurner changed the title requests to Elasticsearch with two traceparent headers fail, and shouldn't Requests to Elasticsearch with two traceparent headers should yield 400 Bad Request Apr 11, 2024
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Apr 11, 2024
Today if the HTTP header validation throws an exception (rather than
calling `listener.onFailure()`) then we treat this as a server-side
error, record it in the logs, and close the connection abruptly without
sending a response. In practice such an exception is more likely a
client-side error, so with this commit we catch it and marshal it back
to the client instead.

Closes elastic#107338
elasticsearchmachine pushed a commit that referenced this issue Apr 15, 2024
Today if the HTTP header validation throws an exception (rather than
calling `listener.onFailure()`) then we treat this as a server-side
error, record it in the logs, and close the connection abruptly without
sending a response. In practice such an exception is more likely a
client-side error, so with this commit we catch it and marshal it back
to the client instead.

Closes #107338
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Apr 15, 2024
Today if the HTTP header validation throws an exception (rather than
calling `listener.onFailure()`) then we treat this as a server-side
error, record it in the logs, and close the connection abruptly without
sending a response. In practice such an exception is more likely a
client-side error, so with this commit we catch it and marshal it back
to the client instead.

Closes elastic#107338
elasticsearchmachine pushed a commit that referenced this issue Apr 15, 2024
Today if the HTTP header validation throws an exception (rather than
calling `listener.onFailure()`) then we treat this as a server-side
error, record it in the logs, and close the connection abruptly without
sending a response. In practice such an exception is more likely a
client-side error, so with this commit we catch it and marshal it back
to the client instead.

Closes #107338
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Network Http and internode communication implementations Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants