Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add responseBodyHook for http request and response #3104

Closed
haddasbronfman opened this issue Jul 21, 2022 · 21 comments · Fixed by #3407
Closed

Add responseBodyHook for http request and response #3104

haddasbronfman opened this issue Jul 21, 2022 · 21 comments · Fixed by #3407
Labels
bug Something isn't working pkg:instrumentation-http priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect signal:traces Issues and PRs related to general traces signal

Comments

@haddasbronfman
Copy link
Member

haddasbronfman commented Jul 21, 2022

Is your feature request related to a problem? Please describe.

I work on otel distribution, and as part of our HTTP instrumentation, we want to have the HTTP request body and the HTTP response body. These apparently are not easy to get. This is because otel is closing the span before the 'end' callback on my request/response has been triggered.
This is happening here and here

Describe the solution you'd like

I want to add a responseBodyHook of this type:

export interface HttpResponseBodyFunction {
    (span: Span, response: IncomingMessage | ServerResponse, body: String): void;
}

(I think the body should be of type to be String | Buffer, but for POC I tried it with 'String')

Describe alternatives you've considered

In order to get the body, I tried to override the 'end' function in the same way otel overrides it. something like this:

const originalEnd = clientRequest.end;
clientRequest.end = function (..._args: ResponseEndArgs) {
        clientRequest.end = originalEnd;
        span.setAttribute('http.request.body', _args[0] as AttributeValue);
        return clientRequest.end.apply(this, arguments as never);
};

This works but not for all cases, and I think the solution should come from otel, because it shouldn't close the span before the user got the body. The user should have the opportunity to have the body and add attributes according to it to his span.

Additional context

I would like to solve this issue, if this is acceptable can you assign it to me?

@legendecas
Copy link
Member

Are requestHook and responseHook sufficient for the problem? You can add body listeners on the given request and response.

@haddasbronfman
Copy link
Member Author

I think they are not sufficient. When I tried to use them I added this code to my responseHook:

      const listener = (chunk: any) => {
        bodyHandler.addChunk(chunk); //bodyHandler aggregates the chunks
      };
      response.on('data', listener);

      response.on('end', () => {
        span.setAttribute(SemanticAttributes.HTTP_RESPONSE_BODY, bodyHandler.getChunks());
        response.removeListener('data', listener);
      });

but when the end callback was called, the span was already closed and setAttribute failed.
This happens because the end function (overridden in line 448) is closing the span, and only after that, my end callback is called.

@legendecas
Copy link
Member

I believe the problem is that response.end is asynchronous -- however, in the instrumentation, it is been treated as a synchronous method.

@haddasbronfman
Copy link
Member Author

What do you mean by "treated as a synchronous method"? I see that my end callback is called after the span is closed. what can I do differently?

@legendecas
Copy link
Member

legendecas commented Jul 25, 2022

response.end accepts a callback as the last parameter. Only when the callback is invoked, we can safely assert that the data passed to response.end is sent to the other end. So, I think the span should be marked as ended when the callback of response.end is invoked, which should solve the problem in the OP. I'd believe this is a bug of the http-instrumentation.

@haddasbronfman
Copy link
Member Author

But this callback argument is optional

@legendecas
Copy link
Member

It is asynchronous regardless of the presence of the optional callback.

@github-actions
Copy link

github-actions bot commented Oct 3, 2022

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Oct 3, 2022
@legendecas legendecas added bug Something isn't working pkg:instrumentation-http signal:traces Issues and PRs related to general traces signal and removed feature-request stale labels Oct 8, 2022
@dyladan
Copy link
Member

dyladan commented Oct 12, 2022

@legendecas is there some reason you think this is a bug? If so can you add a priority label for it?

@legendecas legendecas added priority:p3 Bugs which cause problems in user apps not related to correctness (performance, memory use, etc) priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect and removed priority:p3 Bugs which cause problems in user apps not related to correctness (performance, memory use, etc) labels Oct 13, 2022
@legendecas
Copy link
Member

Yeah, the http-instrumentation didn't end the span in the end callback of the response, which may end up finishing the span earlier in conditions mentioned in #3104 (comment).

I'm tagging this as priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect as it causes correctness issues.

@legendecas
Copy link
Member

legendecas commented Nov 14, 2022

@haddasbronfman I've submitted a fix at #3407. Would you mind verifying if it meets your use case? Thank you.

@haddasbronfman
Copy link
Member Author

Hi, I'm no longer working on this so I can't answer for sure. thanks for your solution!

@gmann42
Copy link

gmann42 commented Jan 27, 2023

@legendecas @haddasbronfman @dyladan
Hey folks, I've been trying to capture the body by using responseHook.
However not able to make this work, any inputs/directions please ?

 responseHook: (span, response) => {
          response.on('data', (chunk) => {
            if(!response['myBody']){
              response['myBody'] = [];
              response['myBody'].push(chunk);
            }
            else{
              response['myBody'].push(chunk);
            }
          });
          response.on('end', () => {
            const responseBody = Buffer.concat(response['myBody']).toString();
            console.log('respnse body is', responseBody);
            console.log('response headers are', response.headers);
            span.setAttribute('http.response.body',responseBody);
            span.setAttributes('http.response.headers', response.headers);

          });

@legendecas
Copy link
Member

@gmann42 response is a writable and writable emit finish when the write has completed.

@gmann42
Copy link

gmann42 commented Jan 31, 2023

@gmann42 response is a writable and writable emit finish when the write has completed.

@legendecas Thanks, I don't quite follow will look at docs.

But I've managed to do this one level above by using opentelemetry-express. Patched the send method of response object from express to add the body as a span attribute.

@u11d-michal-miler
Copy link

@haddasbronfman I've submitted a fix at #3407. Would you mind verifying if it meets your use case? Thank you.

I confirm the following works fine:

const responseHook = async (
  span: api.Span,
  response: IncomingMessage | ServerResponse,
) => {
  let body = '';
  response.on('data', (chunk: Buffer) => {
    body += chunk.toString();
  });
  response.on('end', () => {
    span.setAttribute('http.response.body', body);
    response.removeAllListeners();
  });
};

@haddasbronfman
Copy link
Member Author

@u11d-michal-miler I no longer use this use case.

@helloworld
Copy link

@legendecas thank you so much for the fix! I'm sorry to bump an old thread here—

@u11d-michal-miler's code example works, but only for non compressed responses. When we have a gzip/deflate/br compressed response, we see gibberish in the span. Here's an example:

const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const {
  ConsoleSpanExporter,
  BatchSpanProcessor,
} = require("@opentelemetry/sdk-trace-base");

registerInstrumentations({
  instrumentations: [new HttpInstrumentation({
    responseHook: async (
      span,
      response
    ) => {
      if (!response.headers) return;

      const contentEncoding = response.headers['content-encoding'];
      span.setAttribute('http.response.content_encoding', contentEncoding);

      let body = '';
      response.on('data', (chunk) => {
        body += chunk.toString();
      });
      response.on('end', () => {
        span.setAttribute('http.response.body', body);
        response.removeAllListeners();
      });
    },
  })],
});

const provider = new NodeTracerProvider();
const exporter = new ConsoleSpanExporter();
const processor = new BatchSpanProcessor(exporter);
provider.addSpanProcessor(processor);
provider.register();

const http = require("http");
const axios = require('axios');

const server = http.createServer(async (req, res) => {
  console.log("Request started")
  // No compression
  await axios.get(
    `https://api.github.com/users/defunkt`,
    {
      headers: {
        'Accept-Encoding': 'identity',
        Accept: 'application/vnd.github+json',
      },
    }
  );

  // Gzip compressed response
  await axios.get(
    `https://api.github.com/users/defunkt`,
  );

  res.end("Success");
});

server.listen(3000, () => {
  console.log("Server running at http://localhost:3000");
});

Here we have a server with two requests, one that has a gzip'd response and one that doesn't. Here are the resulting spans:

{
  traceId: 'af939f97f5cc6ed91562669ecf1fa45f',
  parentId: '528bf06e369592df',
  traceState: undefined,
  name: 'GET',
  id: 'aa2ecd5ba2654566',
  kind: 2,
  timestamp: 1688264149851000,
  duration: 95547,
  attributes: {
    'http.url': 'https://api.github.com/users/defunkt',
    'http.method': 'GET',
    'http.target': '/users/defunkt',
    'net.peer.name': 'api.github.com',
    'http.host': 'api.github.com:443',
    'net.peer.ip': '140.82.113.5',
    'net.peer.port': 443,
    'http.status_code': 200,
    'http.status_text': 'OK',
    'http.flavor': '1.1',
    'net.transport': 'ip_tcp',
    'http.response.content_encoding': 'gzip',
    'http.response.body': '\x1F�\b\x00\x00\x00\x00\x00\x00\x03���n�@\x14E\x7F����\x01��1R�E�颕"%m��5�\x18�\x1Df��\x03�E����/�\x13z\x07p�zQ�����}�ǻ�L�R\x19��B\x1EZ�؂���ɂ\x19[�}�f\x1F��m>\x7F���o\x1Fnq_t��۷N�^E������_���6k�t�5$\r]��-O�v�k����A\x19\x07\x172��$F\x0E:��X���\x17�džC�K��jm�`/]�G�?Cp5^+S�\x17\x00�sK�Ĉ`�)<��4��\x00�<��\x15\x04\t��;Y̱3!0s4��s\'\x1B;h��ϝjHY3��? ��+�Q?�l!�\x1E|04��\x00\x00�\x1D6k\x169\x12=o��D~\n' +
      "cp2���L�]�\x10�S#��\x0Fx�a\x8A�^\x14uH�Ah/\x11'Q��w�S��\x17a<9A\x15j���0'��V�\x05�\x10�)W�U\x1Eʏ��aˁh�\x0F3?3�\x16\n" +
      'a\x1C\x05*��4�M��B���ͣ"�6st4\x165m�U�\x1F���v���a\x15�AخΩ@��?��ܼ�I8Y�aЛ0RA�D�v\x19G�$����d�ƻGxh��uM�ZF�e��O�t�M�7���/͗u�\x04\x00\x00'
  },
  status: { code: 0 },
  events: [],
  links: []
}
{
  traceId: '301574b2815191e189fc0e714511029b',
  parentId: '28c2684141a1cf96',
  traceState: undefined,
  name: 'GET',
  id: 'bd6658905463b6de',
  kind: 2,
  timestamp: 1688264149932000,
  duration: 92838,
  attributes: {
    'http.url': 'https://api.github.com/users/defunkt',
    'http.method': 'GET',
    'http.target': '/users/defunkt',
    'net.peer.name': 'api.github.com',
    'http.host': 'api.github.com:443',
    'net.peer.ip': '140.82.112.6',
    'net.peer.port': 443,
    'http.response_content_length': 485,
    'http.status_code': 200,
    'http.status_text': 'OK',
    'http.flavor': '1.1',
    'net.transport': 'ip_tcp',
    'http.response.content_encoding': 'gzip',
    'http.response.body': '\x1F�\b\x00\x00\x00\x00\x00\x00\x03���n�@\x14E\x7F����\x01��1R�E�颕"%m��5�\x18�\x1Df��\x03�E����/�\x13z\x07p�zQ�����}�ǻ�L�R\x19��B\x1EZ�؂���ɂ\x19[�}�f\x1F��m>\x7F���o\x1Fnq_t��۷N�^E������_���6k�t�5$\r]��-O�v�k����A\x19\x07\x172��$F\x0E:��X���\x17�džC�K��jm�`/]�G�?Cp5^+S�\x17\x00�sK�Ĉ`�)<��4��\x00�<��\x15\x04\t��;Y̱3!0s4��s\'\x1B;h��ϝjHY3��? ��+�Q?�l!�\x1E|04��\x00\x00�\x1D6k\x169\x12=o��D~\n' +
      "cp2���L�]�\x10�S#��\x0Fx�a\x8A�^\x14uH�Ah/\x11'Q��w�S��\x17a<9A\x15j���0'��V�\x05�\x10�)W�U\x1Eʏ��aˁh�\x0F3?3�\x16\n" +
      'a\x1C\x05*��4�M��B���ͣ"�6st4\x165m�U�\x1F���v���a\x15�AخΩ@��?��ܼ�I8Y�aЛ0RA�D�v\x19G�$����d�ƻGxh��uM�ZF�e��O�t�M�7���/͗u�\x04\x00\x00'
  },
  status: { code: 0 },
  events: [],
  links: []
}

I tried updating the response hook like so, but it still doesn't work. My guess is that there is a race condition where we are not seeing all the chunks by the time the response closes.

responseHook: async (
      span,
      response
    ) => {
      const encoding = response.headers['content-encoding'];
      span.setAttribute('http.response.content_encoding', encoding);

      let stream = response;

      // Handle different types of encoding.
      if (['gzip', 'deflate', 'br'].includes(encoding)) {
        const decompress = encoding === 'gzip' ? zlib.createGunzip() :
                          encoding === 'deflate' ? zlib.createInflate() :
                          zlib.createBrotliDecompress();
        
        await pipelineAsync(response, decompress);
        stream = decompress;
      }
      const chunks = [];
      for await (const chunk of stream) {
        chunks.push(chunk.toString());
      }

      const body = chunks.join('');
      span.setAttribute('http.response.body', body);
    },
  })],

Here's a reproduction: HTTP response body reproduction

@legendecas
Copy link
Member

@helloworld I don't think it is helpful to decompress the server response to collect span attributes. It is computational costing and should be avoided. I'd suggest hooking a server framework instrumentation instead.

@helloworld
Copy link

@legendecas do you have any pointers to server framework instrumentation I should be looking at? Do you mean express instrumentation, and such?

@emilys314
Copy link

I believe I'm encountering the same issue with try to get the response added as a custom attribute.

The requestHook is working fine for me:

new HttpInstrumentation({
      requestHook: (span, req) => {
        let body = "";
        req.on("data", (chunk) => {
          body += chunk;
        });
        req.on("end", (chunk) => {
          span.setAttribute("http.request.body", body);
        });
      }
      // responseHook: async (span, response) => {
      // console.log(response.req);
      // }
    }),

But trying the same thing for the responseHook, the response callback functions don't execute:

responseHook: (span, res) => {
        console.log("This is run");
        res.on("data", (chunk) => {
          console.log("not run");
        });
        res.on("end", (chunk) => {
          console.log("not run");
        });
      }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working pkg:instrumentation-http priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect signal:traces Issues and PRs related to general traces signal
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants