Add ability to pass custom http client into node #880

Merged
silesky merged 24 commits into master from add-ability-to-pass-custom-node-client
Jul 13, 2023

Contributor

@silesky silesky commented Jul 5, 2023

Package: @segment/analytics-node.
Please see the types/default implementation here: packages/node/src/lib/http-client.ts

This is based on work / research from @MichaelGHSeg!

Use a custom fetch-like implementation with proxy (simple, recommended)

import { Analytics, HTTPFetchFn } from '@segment/analytics-node'
import axios from 'axios'

// Fetch-like function that routes requests through an HTTP proxy via axios.
const httpClient: HTTPFetchFn = async (url, { body, ...options }) => {
  return axios({
    url,
    data: body, // axios takes the request payload as `data`, not `body`
    proxy: {
      protocol: 'http',
      host: 'proxy.example.com',
      port: 8886,
      auth: {
        username: 'user',
        password: 'pass',
      },
    },
    ...options,
  })
}

const analytics = new Analytics({
  writeKey: '<YOUR_WRITE_KEY>',
  httpClient,
})

Augment the default HTTP Client

import { Analytics, FetchHTTPClient, HTTPClientRequest } from '@segment/analytics-node'

// Reuse the default fetch-based client, adding a custom header to every request.
class MyClient extends FetchHTTPClient {
  async makeRequest(options: HTTPClientRequest) {
    return super.makeRequest({
      ...options,
      headers: { ...options.headers, foo: 'bar' },
    })
  }
}

const analytics = new Analytics({
  writeKey: '<YOUR_WRITE_KEY>',
  httpClient: new MyClient(),
})

Completely override the full HTTPClient (advanced; you probably don't need to do this)

import { Analytics, HTTPClient, HTTPClientRequest } from '@segment/analytics-node'

// `someRequestLibrary` stands in for whatever HTTP library you use.
class CustomClient implements HTTPClient {
  async makeRequest(options: HTTPClientRequest) {
    return someRequestLibrary(options.url, {
      method: options.method,
      body: JSON.stringify(options.data), // serialize data
      headers: options.headers,
    })
  }
}

const analytics = new Analytics({
  writeKey: '<YOUR_WRITE_KEY>',
  httpClient: new CustomClient(),
})

changeset-bot bot commented Jul 5, 2023

🦋 Changeset detected

Latest commit: 84ac43b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@segment/analytics-node Minor

@silesky silesky force-pushed the add-ability-to-pass-custom-node-client branch 3 times, most recently from 47c2e34 to e9dca8d Compare July 5, 2023 20:51
@silesky silesky changed the title Add ability to pass custom node client Add ability to pass custom http client into node Jul 5, 2023
silesky added 4 commits July 6, 2023 15:17
wip
@silesky silesky force-pushed the add-ability-to-pass-custom-node-client branch from e9dca8d to a6a0d95 Compare July 6, 2023 23:14
Contributor

Was this added accidentally?

Contributor Author

Good catch!

try {
const requestInit = {
signal: signal,
Contributor

Did you have a specific fetch replacement library in mind that doesn't support 'signal' here? I have questions about simultaneously allowing access to tinker with our internals and passing around non-standard objects.

Contributor Author

@silesky silesky Jul 7, 2023

All fetch libraries support AbortSignal, but not all HTTP libraries do (e.g., Request).

That's why signal is present in the FetchHTTPClient class/interface but not the HTTPClient interface -- it's fetch specific.

Does this answer your concern? I am not entirely sure I am following you on the non-standard objects / monkeying with internals -- maybe you can elaborate on which part of the examples (see PR description) you feel is problematic. I wanted to allow a couple of layers of extensibility -- users can swap out the default fetch client or, if they need to, replace the entire HTTP client with something that doesn't use fetch or have any relationship with the fetch interface at all.

Contributor

I just wanted to know what the impetus was. Figured you had a reason for moving some of the fiddly bits into the user replaceable thing that they then have to repeat.

@@ -14,7 +14,7 @@ export type NodeEmitterEvents = CoreEmitterContract<Context> & {
   url: string
   method: string
   headers: Record<string, string>
-  body: string
+  body: Record<string, any>
Contributor

What's the reasoning for changing this from a string to an object? Seems like it saves us from having to do a JSON.parse() in our tests, but I'd also expect an HTTP body to be a string if I'm specifically listening for HTTP requests/want to know exactly what got sent to the service.

(Also this would technically be a breaking change going from a string to an object - someone already doing a JSON.parse() would now encounter an error, though I don't expect many users of this API yet)
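To make the breaking-change concern concrete: calling JSON.parse() on an object coerces it to the string "[object Object]" and throws a SyntaxError. A listener that has to work across both versions could normalize the body first. This is only a sketch; `EmittedBody` and `normalizeBody` are hypothetical names, not part of the library.

```typescript
// Hypothetical helper: accepts the emitted body in either shape and
// always returns an object.
type EmittedBody = string | Record<string, unknown>

function normalizeBody(body: EmittedBody): Record<string, unknown> {
  // Old shape: serialized JSON string. New shape: already an object.
  return typeof body === 'string' ? JSON.parse(body) : body
}

// Both shapes yield an equivalent object.
const fromString = normalizeBody('{"batch":[{"event":"a"}]}')
const fromObject = normalizeBody({ batch: [{ event: 'a' }] })
```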

Contributor Author

The serialization part is the client's responsibility, and I didn't want the HTTP client to have to take an emitter -- so the emitting part now happens in the publisher code (before the request is passed to the HTTP client). I saw no downsides to this change, and only upsides.
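A rough sketch of that ordering, under stated assumptions: every name below (`RequestLike`, `ClientLike`, `HttpRequestEvent`, `publish`) is hypothetical and only mirrors the shapes discussed in this thread, not the library's actual publisher code.

```typescript
// Hypothetical shapes for illustration; not the library's actual types.
interface RequestLike {
  url: string
  method: string
  headers: Record<string, string>
  data: Record<string, unknown>
}

interface ClientLike {
  makeRequest(req: RequestLike): Promise<{ status: number }>
}

// What a listener receives: the same request, with the payload still an object.
type HttpRequestEvent = Omit<RequestLike, 'data'> & { body: Record<string, unknown> }

function publish(
  req: RequestLike,
  client: ClientLike,
  emit: (event: HttpRequestEvent) => void
): Promise<{ status: number }> {
  const { data, ...rest } = req
  // Emit before handing off, so the client needs no reference to an emitter.
  emit({ ...rest, body: data })
  // Serialization is left to the client.
  return client.makeRequest(req)
}

// Record the order of operations with a fake client and listener.
const order: string[] = []
const fakeClient: ClientLike = {
  makeRequest(req) {
    order.push('send ' + JSON.stringify(req.data))
    return Promise.resolve({ status: 200 })
  },
}

publish(
  { url: 'https://example.com/v1/batch', method: 'POST', headers: {}, data: { batch: [] } },
  fakeClient,
  (event) => {
    order.push('emit ' + typeof event.body)
  }
)
// order is now ['emit object', 'send {"batch":[]}']
```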

* Timeout in milliseconds
* @example 10000
*/
timeout: number
Contributor

I've seen a lot of different HTTP-related timeouts (timeout to establish connection, socket idle timeout, time for response headers, time for full response body) - versus browsers that were way more limited with what XHR provided. Can we be more detailed on what this timeout covers?

Contributor Author

Good point. We currently support httpRequestTimeout which is "timeout for full response body" -- I can rename it to match our config.

* JSON data to be sent with the request (will be stringified)
* @example { "batch": [ ... ]}
*/
data: Record<string, any>
Contributor

It feels a little weird for an HTTP client to have data instead of body. (headers and query params are data too!)

How come we accept an object rather than a string/byte array here? I would expect the HTTP Client to only be concerned with making the HTTP request, not transforming the body.

Contributor

I can see the possibility of wanting to change the data before sending to the server and hence wanting to have the object and not parse from JSON.

But we already have a few places to do these operations, so I agree with this: let's keep the HTTP client dumb to discourage putting too much logic here.

Contributor Author

@silesky silesky Jul 11, 2023

The reasons I used data here are:

  1. To distinguish the higher-level representation of the body (JSON object) from the serialized body (string) -- fetch takes a string
  2. To allow people to use a custom JSON serializer if they want
  3. To allow people to make modifications to "data/payload" without deserializing and re-serializing (expensive)
  4. This is what axios uses, so it's a common convention

The HTTPClient is meant to be generic in terms of what we support (for example, if we didn't support timeout, that wouldn't be an option).

PS: this pattern is also used by Stripe in their SDK's HTTPClient.

PPS: The shape will be { batch: [{ ..... }] }
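As a sketch of points 1 and 2 above: when the client receives `data` as an object, the serializer becomes a pluggable detail of the client. `RequestLike`, `Serializer`, and `toFetchInit` are hypothetical names used only for illustration; the payload follows the `{ batch: [...] }` shape mentioned above.

```typescript
// Hypothetical request shape mirroring the fields discussed in this thread.
interface RequestLike {
  url: string
  method: string
  headers: Record<string, string>
  data: Record<string, unknown>
}

type Serializer = (data: Record<string, unknown>) => string

// Turn the high-level request into fetch-style options, serializing `data`
// into `body` with whichever serializer the caller prefers.
function toFetchInit(req: RequestLike, serialize: Serializer = JSON.stringify) {
  return { method: req.method, headers: req.headers, body: serialize(req.data) }
}

const request: RequestLike = {
  url: 'https://example.com/v1/batch',
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  data: { batch: [{ event: 'Signed Up' }] },
}

// Default serializer (JSON.stringify).
const defaultInit = toFetchInit(request)
// Custom serializer: pretty-printed JSON, as a stand-in for any alternative.
const prettyInit = toFetchInit(request, (data) => JSON.stringify(data, null, 2))
```

Had the client been handed a pre-serialized string instead, swapping the serializer would require a parse followed by a re-serialize.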

* @link https://developer.mozilla.org/en-US/docs/Web/API/Response
*/
export interface HTTPResponse {
ok: boolean
Contributor

Kind of wonder if we really need ok - it's just a boolean that is true if status is 200-299. We get status already, so we could check that ourselves (then if someone uses an HTTP client that doesn't return ok, they don't have to worry about it).
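The check described above is small enough to live on the caller's side. A minimal sketch, where `isOk` is a hypothetical helper name and the range follows the Response.ok definition linked in the interface's doc comment:

```typescript
// Hypothetical helper: derives the fetch-style `ok` flag from a status code.
function isOk(status: number): boolean {
  // Successful responses are those with a status in the 200-299 range.
  return status >= 200 && status <= 299
}

const okFor200 = isOk(200)
const okFor404 = isOk(404)
```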

}

return this._fetch(options.url, requestInit).finally(() =>
clearTimeout(timeoutId)
Contributor

Ah I see, timeout is time to response headers (to answer my earlier question 😄)
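A sketch of the pattern in the snippet above, to show why the timer only covers the wait for response headers: the promise returned by fetch settles once headers arrive, and `finally` clears the timer at that point. `FetchLike` and `fetchWithTimeout` are hypothetical names, and a fake fetch is injected so the example does not touch the network.

```typescript
// Hypothetical minimal fetch-like signature, so a fake can stand in for real fetch.
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<unknown>

function fetchWithTimeout(fetchFn: FetchLike, url: string, timeoutMs: number): Promise<unknown> {
  const controller = new AbortController()
  // Abort the request if it has not settled within timeoutMs.
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs)
  // Stop the clock as soon as the fetch promise settles (resolved or rejected).
  return fetchFn(url, { signal: controller.signal }).finally(() => clearTimeout(timeoutId))
}

// Fake fetch that records the signal it was given and resolves immediately.
const seenSignals: AbortSignal[] = []
const fakeFetch: FetchLike = (_url, init) => {
  seenSignals.push(init.signal)
  return Promise.resolve({ status: 200 })
}

fetchWithTimeout(fakeFetch, 'https://example.com/v1/batch', 10000)
```

Reading a slow response body after this point would not be interrupted by the timer, since it has already been cleared.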

Contributor

@oscb oscb left a comment

Looks good. +1 on some of the earlier comments; I think the most important one is to remove ok from the responses, to prevent weird issues where someone forgets to set it (or the other way around).

@silesky silesky force-pushed the add-ability-to-pass-custom-node-client branch from 9c43c89 to 84ac43b Compare July 12, 2023 20:20
Contributor

@oscb oscb left a comment

👍

@silesky silesky merged commit 5f50363 into master Jul 13, 2023
@silesky silesky deleted the add-ability-to-pass-custom-node-client branch July 13, 2023 16:23
@github-actions github-actions bot mentioned this pull request Jul 13, 2023
5 participants