Design document for the new HTTP API #2971

Merged
merged 16 commits into from
Apr 21, 2023
399 changes: 399 additions & 0 deletions docs/design/018-new-http-api.md
@@ -0,0 +1,399 @@
# New HTTP API

## Authors
The k6 core team

## Why is this needed?

The HTTP API (in k6 <=v0.43.0) used by k6 scripts has many limitations, inconsistencies and performance issues that lead to a poor user experience. Considering that it's the most commonly used JS API, improving it would benefit most k6 users.

The list of issues with the current API is too long to mention in this document, but you can see a detailed list of [GitHub issues labeled `new-http`](https://github.com/grafana/k6/issues?q=is%3Aopen+is%3Aissue+label%3Anew-http) that should be fixed by this proposal, as well as the [epic issue #2461](https://github.com/grafana/k6/issues/2461). Here we'll only mention the relatively more significant ones:

* [#2311](https://github.com/grafana/k6/issues/2311): files being uploaded are copied several times in memory, causing more memory usage than necessary. Related issue: [#1931](https://github.com/grafana/k6/issues/1931)
* [#857](https://github.com/grafana/k6/issues/857), [#1045](https://github.com/grafana/k6/issues/1045): it's not possible to configure transport options, such as proxies or DNS, per VU or group of requests.
* [#761](https://github.com/grafana/k6/issues/761): specifying configuration options globally is not supported out-of-the-box, and workarounds like the [httpx library](https://k6.io/docs/javascript-api/jslib/httpx/) are required.
* [#746](https://github.com/grafana/k6/issues/746): async functionality like Server-sent Events is not supported.
* Related to the previous point, all current methods except `asyncRequest` are synchronous, which is inflexible and doesn't align with modern APIs from other JS runtimes.
* [#436](https://github.com/grafana/k6/issues/436): the current API is not very friendly or ergonomic. Different methods also have parameters that change places, e.g. `params` is the second argument in `http.get()`, but the third one in `http.post()`.


## Proposed solution(s)

### Design

In general, the design of the API should follow these guidelines:

- It should be familiar to users of HTTP APIs from other JS runtimes, and easy for new users to pick up.

As such, it would serve us well to draw inspiration from existing runtimes and frameworks. Particularly:

- The [Fetch API](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API), a [WHATWG standard](https://fetch.spec.whatwg.org/) supported by most modern browsers.
[Deno's implementation](https://deno.land/manual/examples/fetch_data) and [GitHub's polyfill](https://github.com/github/fetch) are good references to follow.

This was already suggested in [issue #2424](https://github.com/grafana/k6/issues/2424).

- The [Streams API](https://developer.mozilla.org/en-US/docs/Web/API/Streams_API), a [WHATWG standard](https://streams.spec.whatwg.org/) supported by most modern browsers.
[Deno's implementation](https://deno.land/manual/examples/fetch_data#files-and-streams) is a good reference to follow.

There's a related but very old [proposal](https://github.com/grafana/k6/issues/592) that predates the standardization of the Streams API, so we shouldn't use it, but it's clear there's community interest in such an API.

Streaming files both from disk to RAM to the network, and from network to RAM and possibly disk, would also partly solve our [performance and memory issues with loading large files](https://github.com/grafana/k6/issues/2311).

- Native support for the [`FormData` API](https://developer.mozilla.org/en-US/docs/Web/API/FormData).

Currently this is supported with a [JS polyfill](https://k6.io/docs/examples/data-uploads/#advanced-multipart-request), which should be deprecated.

- Aborting requests or any other async process with the [`AbortSignal`/`AbortController` API](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal), part of the [WHATWG DOM standard](https://dom.spec.whatwg.org/#aborting-ongoing-activities).

This is slightly out of scope for the initial phases of implementation, but aborting async processes like `fetch()` is an important feature.

- The Fetch API alone would not address all our requirements (e.g. specifying global and transport options), so we still need more flexible and composable interfaces.

One source of inspiration is the Go `net/http` package, which the k6 team is already familiar with. Based on this, our JS API could have similar entities:

- `Dialer`: a low-level interface for configuring TCP/IP options, such as TCP timeout and keep-alive, TLS settings, DNS resolution, IP version preference, etc.

- `Transport`: interface for configuring HTTP connection options, such as proxies, HTTP version preferences, etc.

It enables advanced behaviors like intercepting requests before they're sent to the server.

- `Client`: the main entrypoint for making requests, it encompasses all of the above options. A k6 script should be able to initialize more than one `Client`, each with their separate configuration.

In order to simplify the API, the creation of a `Client` should use sane defaults for `Dialer` and `Transport`.

There should be some research into existing JS APIs that offer similar features (e.g. Node/Deno), as we want to offer an API familiar to JS developers, not necessarily Go developers.

- `Request`/`Response`: represent objects sent by the client, and received from the server. In contrast to the current API, the k6 script should be able to construct `Request` objects declaratively, and then reuse them to make multiple requests with the same (or similar) data.

- All methods that perform I/O calls must be asynchronous. Now that we have `Promise`, event loop and `async`/`await` support natively in k6, there's no reason for these to be synchronous anymore.

- The API should avoid any automagic behavior. That is, it should not attempt to infer desired behavior or options based on some implicit value.

We've historically had many issues with this ([#878](https://github.com/grafana/k6/issues/878), [#1185](https://github.com/grafana/k6/issues/1185)), resulting in confusion for users, and we want to avoid it in the new API. Even though we want to have sane defaults for most behavior, instead of guessing what the user wants, all behavior should be explicitly configured by the user. In cases where some behavior is ambiguous, the API should raise an error indicating so.


#### Sockets

A Socket represents the file or network socket over which client/server or peer communication happens.

It can be of three types:
- `tcp`: a stream-oriented network socket using the Transmission Control Protocol.
- `udp`: a message-oriented network socket using the User Datagram Protocol.
- `ipc`: a mechanism for communicating between processes on the same machine, typically using files.

The Socket state can be either _active_ (connected for a TCP socket, bound for a UDP socket, or open for an IPC socket) or _inactive_ (disconnected, unbound, or closed, respectively).
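
A minimal sketch of this state model, written as plain JavaScript rather than against the proposed `k6/x/net` module. The `openSocket` factory and its stubbed connect step are hypothetical; the sketch only shows a socket object that is handed out already active, with a read-only `active` property.

```javascript
// Hypothetical factory: the socket is only returned once it is active.
async function openSocket(type, address) {
  const allowed = ['tcp', 'udp', 'ipc'];
  if (!allowed.includes(type)) {
    throw new TypeError(`unknown socket type: ${type}`);
  }
  // Stand-in for the real connect/bind/open system call.
  await Promise.resolve();

  let active = true;
  const socket = {
    close() { active = false; },
  };
  // Expose type, address and state as read-only properties.
  Object.defineProperties(socket, {
    type: { value: type, enumerable: true },
    address: { value: address, enumerable: true },
    active: { get: () => active, enumerable: true },
  });
  return socket;
}
```

Because `active` only has a getter, scripts can observe the state but cannot set it; only `close()` changes it.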

##### Example

- TCP:
```javascript
import { dialTCP } from 'k6/x/net';
import { Client } from 'k6/x/net/http';

export default async function () {
  const socket = await dialTCP('192.168.1.1:80', {
Contributor Author

I'm not a fan of these dial* functions. I added them inspired by Go's net package, but this doesn't provide the ability to implement custom dialers. So maybe we should expose some Dialer type? I wasn't sure how to structure it, though.

And I'm slightly against calling these "dialers" to begin with, since it's mostly a Go concept. At the same time, I didn't want to call this connect() either, since it wouldn't make sense for UDP or IPC. dialIPC still doesn't make sense to me, since it's essentially opening a file, but this too is following Go's net package. 🤔

WDYT?

Contributor

I agree, what about trying to be consistent with the HTTP Client API and doing something similar?

We have a class, a constructor, and methods.

import { TCPSocket, <Other Protocols>... } from 'k6/x/net'; // TCP, TCPSocket, TCPConn (we should use a name that makes sense for the context and is aligned with the RFC)

export default async function () {
  const socket = new TCPSocket('192.168.1.1:80');
  await socket.open();

If it doesn't make sense for UDP then we can have new UDP() and that's all. This should also be very similar to the Linux API.

Contributor Author

I had something like that initially, but decided against it, since you shouldn't be able to create a socket object, without it being actually connected/bound/open. Otherwise, you might be tempted to set some of its properties, which would be meaningless unless it's actually bound to an OS socket. In fact, most/all of its properties should be read-only. This is why I opted to have a constructor function instead.

An HTTP client is a bit different, since it's only used to make requests, and doesn't correspond to some system state. So you could theoretically have a Client instance that hasn't established a socket connection yet, and making the first request would establish the connection.

I also considered something like:

import { TCP, UDP, IPC } from 'k6/x/net';
const socket = TCP.dial('192.168.1.1:80');

So in a sense you have some abstract protocol object, and calling dial() on it returns you a TCPSocket, UDPSocket or IPCSocket instance. I'm slightly leaning towards this now, even though dial is still in the name. WDYT?

Contributor

Suggested change
const socket = await dialTCP('192.168.1.1:80', {
const socket = await TCP.dial('192.168.1.1:80');

Yeah, I'm also happy with it. The name dial is never mentioned in the RFC, so it might not be familiar to someone not experienced with Go. I think open or connect would be more readable (though that's a matter of opinion) and more widely used in the JS ecosystem.

Contributor Author

I went with open in 8c97fe0.

connect doesn't quite make sense for UDP or IPC, whereas open is generic enough to apply for all 3.

I think dial is not just Go's invention. It does have some networking roots, though I'm not able to find definitive references outside of Go, and "dial-up modem" 😄, right now.

Contributor Author

@imiric imiric Mar 31, 2023

One more thing: the proposal doesn't currently specify a way to define custom dialers.

I wonder if we could handle this with an event system as well. E.g. have before/after events scripts can subscribe to, similar to the HTTP events, emitted before/after the socket is opened.

This could be a way to handle DNS lookups, instead of the currently proposed lookup hook function, but it would be generic enough for any other use cases (TCP proxies, custom metrics, etc.).

So you would have something like:

const socket = await TCP.open('myhost', {
  eventHandlers: {
    before: async event => {
      event.data.address = await dns.resolve(event.data.hostname, {
        rrtype: 'A',
        servers: ['1.1.1.1:53', '8.8.8.8:53'],
      });
    },
  },
});

I'm not entirely sold on the syntax to change event.data like this, as it's unclear which properties will be used after the event handler executes. Also, event handlers typically shouldn't return any values either, as that would make them too specific for a single purpose, so returning is not an option.

I'm not really sure we should allow creating custom dialer implementations from scratch in JS. I.e. nobody would choose to implement custom protocols from scratch in JS, and if they need to do that, then our Go API should allow it by being easily extensible. So for JS, it might just be good enough to have hooks, or event handlers, so scripts can implement some custom logic around the socket implementation.

    // default | possible values
    ipVersion: 0, // 0 | 4 (IPv4), 6 (IPv6), 0 (both)
    keepAlive: true, // false |
    lookup: null, // dns.lookup() |
    proxy: 'myproxy:3030', // '' |
  });
  console.log(socket.active); // true

  // Writing directly to the socket.
  // Requires TextEncoder implementation, otherwise typed arrays can be used as well.
  await socket.write(new TextEncoder().encode('GET / HTTP/1.1\r\n\r\n'));

  // And reading...
  socket.on('data', (data) => {
    console.log(`received ${data}`);
    socket.close();
  });

  await socket.done();
}
```

- UDP:
```javascript
import { dialUDP } from 'k6/x/net';
import { Client } from 'k6/x/net/http';

export default async function () {
  const socket = await dialUDP('192.168.1.1:9090');

  await socket.write(new TextEncoder().encode('GET / HTTP/1.1\r\n\r\n'));
}
```

- IPC:
```javascript
import { dialIPC } from 'k6/x/net';
import { Client } from 'k6/x/net/http';

export default async function () {
  const socket = await dialIPC('/tmp/unix.sock');

  console.log(socket.file.path); // /tmp/unix.sock

  // The HTTP client supports communicating over a Unix socket.
  const client = new Client({
    socket: socket,
  });
  await client.get('http://unix/get');
}
```

#### Client

An HTTP Client is used to communicate with an HTTP server.

##### Examples

- Using a client with default transport settings, and making a GET request:
```javascript
import { Client } from 'k6/x/net/http';

export default async function () {
  const client = new Client();
  const response = await client.get('https://httpbin.test.k6.io/get');
  const jsonData = await response.json();
Member

@na-- na-- Apr 10, 2023

This part of the current HTTP API is probably not a good design to copy. Why should a generic HTTP response have a .json() method?

It's fine to have something that wraps a regular Client and works only with JSON, but a generic http.Client should probably not need to know what JSON is.

Contributor Author

http.Client wouldn't know what JSON is, but I don't see why http.Response couldn't have helper methods to parse the response body and return it as various objects.

The web Fetch API has a Response with such methods, and so does Deno's implementation. Regardless if we end up implementing Fetch or not, this is very convenient, and it would be a UX regression if we decide to remove them.

Member

Fetch and Deno might have this, but Go, axios and got don't. We should consider what makes sense for us.

These are my thoughts on the topic, in roughly a descending order of certainty:

  1. We definitely shouldn't have a .json([selector]) method like the current one, where we have a selector as the argument. That should be a separate JSONPath API that is not tied directly together to the core HTTP API, it should probably work with any string or buffer.
  2. It's fine to have a Fetch polyfill/wrapper that has a .json() method, built on top of our generic HTTP Client (either in Go or in JS!). This solves some of the UX problem of a generic API not having one.
  3. It's probably a good idea to have a dedicated custom JSONClient type built on top of the generic Client, which will be well suited for easily working with JSON REST APIs. Including potentially automatically marshaling request bodies that are objects and maybe even automatically universalizing responses. Maybe even with some JSONPath integration 🤔 This provides all of the UX benefits, but without anything magical and with a clear separation of concerns (i.e. the composable approach).
  4. The default generic HTTP Client should not make any guesses about the content of the HTTP requests and responses it is handling. This gives us more room for optimization, a clear separation of concerns, and a more consistent UX.
  5. We probably shouldn't have a .json() method (without any arguments) on the generic HTTP Client. If someone really wants to use the generic k6 http Client for one-off request with a JSON response, then JSON.parse(resp.body) is not that much worse than resp.json()...
  6. If we decide to add .json(), are we also going to port the current .html()? If not, then why not? And are we also going to have .xml() or .protobuf()?

Contributor Author

> Fetch and Deno might have this, but Go, axios and got don't.

I wasn't familiar with got, looks nice 👍 It doesn't implement json() directly on Response, but does on the Promise. It even supports the json property for POST requests.

BTW, there's another HTTP lib from the same team called ky. 😅 This one apparently targets Deno, but it follows similar design principles as got.

Axios actually parses JSON bodies by default 😄 But I don't think we should look at it for inspiration, considering it's quite old at this point, and has been superseded by other libs. It's still very popular and featureful, but also has many limitations.

Go shouldn't be an inspiration for the JS API. As stated in the design goals, we want to make things familiar to JS devs, not Go devs. We can borrow certain concepts to make the API composable and whatnot, but referring to it directly would only pollute our way of thinking about idiomatic JS.

> We should consider what makes sense for us.

If by "us" you mean "k6 users", then agreed. These convenience methods are all part of offering a good experience for JS developers. I don't see why that's so controversial.

To address some of your points:

  • Sure, the Selector API makes sense as a separate JSONPath API. json() should just return the body as an object, as it does in the other JS libs.

  • Having purpose-built Client implementations just for sending different headers, interpreting requests and responses, seems like overkill, and might be too limiting in some cases. What if you want to send JSON but receive binary data, or any such weird combination we didn't predict? Having a single Client that offers some helper mechanisms for the most common use cases adds a fairly low overhead, which can be ignored for those who don't need it. I.e. everyone can just ignore the json property, and use body instead.

  • I'm not sure why you keep mentioning this, but Client itself wouldn't have a json() method. It would return a Promise<Response> which would have it.

  • I don't see why we couldn't have html() either. The point of these methods is convenience for the most common use cases. If there are good reasons to include xml() or protobuf(), then we should consider them as well.

    Are you suggesting that we have an HTMLClient, a ProtobufClient or an XMLClient instead?

    Separate client implementations make sense for truly different protocols. So it probably makes sense to have a GRPCClient that extends the base Client, or maybe a SOAPClient 😄, but not just to avoid these simple payload handling helpers.

Member

@na-- na-- Apr 11, 2023

> Go shouldn't be an inspiration for the JS API. As stated in the design goals, we want to make things familiar to JS devs, not Go devs. We can borrow certain concepts to make the API composable and whatnot, but referring to it directly would only pollute our way of thinking about idiomatic JS.

I mostly agree, but I will push back slightly on this. Go can and probably should be an inspiration, as long as the end result is also an idiomatic JavaScript API. If Go has an elegant solution to a problem, and if that solution doesn't look strange for JS APIs, is easy to understand, and cleanly solves a bunch of problems, why not adopt it? 😕 Sure, we don't need to copy the Go API directly, but we don't have to avoid its good ideas just because they are not JS. Good API design is somewhat universal after all...

> Having purpose-built Client implementations just for sending different headers, interpreting requests and responses, seems like overkill, and might be too limiting in some cases.

I didn't mean completely separate Client implementations. Just wrappers around a generic Client with maybe a few extra methods and some pre-defined Event handlers for the same events that you mention in the design doc.

> What if you want to send JSON but receive binary data, or any such weird combination we didn't predict?

Use the generic Client, if it's a one-off, or build your own wrapper (because you should be able to easily do that, since everything is composable).

> Having a single Client that offers some helper mechanisms for the most common use cases adds a fairly low overhead, which can be ignored for those who don't need it. I.e. everyone can just ignore the json property, and use body instead.

Sure, but having more than one way to do the same thing, especially when it's a simple thing, is generally somewhat of an API design smell. I don't object too strongly to the simple .json() method, it's only at point 5 in my list, but I don't see the need. It's also one of these extra things that we can easily add at any future point, but once we add it we can never remove, so I don't see the need to start with it.

> I'm not sure why you keep mentioning this, but Client itself wouldn't have a json() method. It would return a Promise which would have it.

Sorry, I meant the Response object returned by the generic Client (or, I guess, the Promise that resolves to it).

> I don't see why we couldn't have html() either. The point of these methods is convenience for the most common use cases. If there are good reasons to include xml() or protobuf(), then we should consider them as well.

See the problems that .html() causes for the current k6/http API... Tightly coupling these is a mistake.

> Are you suggesting that we have an HTMLClient, a ProtobufClient or an XMLClient instead?

No, that was the point. At best, we may have some helpers and wrappers, but we probably shouldn't. We should make it super easy and efficient for users to build whatever helpers and wrappers. We should provide the flexible and generic toolbox, not try to build a dedicated tool to solve every possible weird use case that exists out there.

Member

Hmm actually, I am slightly walking back some of my claims... 😅 Thinking about idiomatic Promise-based JS APIs, a .json() helper on a Promise that also returns another Promise actually makes some sense 😅 I still don't like it, but it probably provides enough value to pass muster 🤔

  console.log(jsonData);
}
```
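
The thread above contrasts a `.json()` helper on the response with a JSON-specific wrapper built on top of a generic client. Here is a rough sketch of the wrapper approach, using a stubbed client since the proposed `Client` does not exist yet. `StubClient`, `JSONClient`, the `request()` signature and the response shape are all assumptions made for illustration.

```javascript
// Stub standing in for the proposed generic Client: it only deals in strings.
class StubClient {
  async request(method, url, { body = '', headers = {} } = {}) {
    // Echo the request back, roughly like an httpbin-style endpoint would.
    return { status: 200, headers, body: JSON.stringify({ method, url, sent: body }) };
  }
}

// Hypothetical wrapper: all JSON handling lives here, not in the generic client.
class JSONClient {
  constructor(client) {
    this.client = client;
  }

  async request(method, url, data) {
    const options = { headers: { 'Content-Type': 'application/json' } };
    if (data !== undefined) {
      options.body = JSON.stringify(data);
    }
    const response = await this.client.request(method, url, options);
    return JSON.parse(response.body);
  }
}
```

With this split, the generic client never inspects payloads, and a script that wants JSON convenience opts in by wrapping it: `new JSONClient(new StubClient())`.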

- Passing a socket with custom transport settings, some HTTP options, and making a POST request:
```javascript
import { dialTCP } from 'k6/x/net';
import { Client } from 'k6/x/net/http';

export default async function () {
  const socket = await dialTCP('10.0.0.10:80', { keepAlive: true });
  const client = new Client({
    socket: socket,
    proxy: 'https://myproxy',
    version: 1.1, // force a specific HTTP version
    headers: { 'User-Agent': 'k6' }, // set some global headers
  });
  await client.post('http://10.0.0.10/post', {
    json: { name: 'k6' }, // automatically adds 'Content-Type: application/json' header
  });
Comment on lines +185 to +187
Member

Same logic as https://github.com/grafana/k6/pull/2971/files#r1161730930 - we shouldn't add any special handling of json options in the default Client. This is not composable, it's the opposite of that - trying to make a Client that satisfies all use cases. And it adds "automagic behavior", as your // automatically adds 'Content-Type: application/json' header comment directly shows.

The request bodies that the normal HTTP Client accepts should probably be limited to string, ArrayBuffer or some sort of a Stream. Anything else needs to be an error. We can have a separate RESTClient or JSONClient that has additional automated handling of JSON requests and responses.

Contributor Author

This is not really "automagic" behavior, but a convenience for a very common use case. It's inspired by the request library, and was suggested in #436. The json property makes it clear what happens behind the scenes, and implicitly adding the header is much more convenient than having to remember to type it correctly manually, and using JSON.stringify().

I'll remove this if you feel strongly about it, but having separate client implementations just for this seems overkill.

Member

The problem is that it makes the API have weird edge cases. For example, what happens if I provide both body and json and formData? 😅 Now the library needs to have some extra internal logic for the order of precedence, and we have to document and test that, etc. Having more than one way to do the same thing in the same object is generally not great API design.

Member

Actually, from looking at the request library API you linked to, its json property is a boolean one, so you still have to supply the body, right? Which may be even worse than this suggestion, since you now have extra parameters that just control the behavior of how the body is automagically processed.

Contributor Author

Yes, it slightly complicates the validation, but passing conflicting options should just error out.

No, json can either be a boolean or an object.

> json - sets body to JSON representation of value and adds Content-type: application/json header. Additionally, parses the response body as JSON.

I don't like the boolean functionality either, but serializing the passed object and adding the header makes a lot of sense. This is also supported by got, as I mentioned in the other thread.

}
```
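
One way the "conflicting options should just error out" rule from this thread could look. The option names `body`, `json` and `formData` come from the discussion; `resolveBody` itself is a hypothetical helper, not part of the proposal.

```javascript
// Hypothetical helper: accept at most one way of specifying the payload.
function resolveBody(options = {}) {
  const given = ['body', 'json', 'formData'].filter((key) => options[key] !== undefined);
  if (given.length > 1) {
    throw new TypeError(`conflicting options: ${given.join(', ')}`);
  }
  if (options.json !== undefined) {
    return {
      body: JSON.stringify(options.json),
      headers: { 'Content-Type': 'application/json' },
    };
  }
  // formData serialization is left out of this sketch.
  return { body: options.body ?? options.formData ?? null, headers: {} };
}
```

Rejecting the ambiguous case up front avoids having to define, document and test an order of precedence between the three options.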

- A tentative HTTP/3 example:
```javascript
import { dialUDP } from 'k6/x/net';
import { Client } from 'k6/x/net/http';

export default async function () {
  const socket = await dialUDP('192.168.1.1:9090');

  const client = new Client({
    socket: socket,
    version: 3, // A UDP socket would imply HTTP/3, but this makes it explicit.
  });
  await client.get('https://httpbin.test.k6.io/get');
}
```


#### Host name resolution

Host names can be resolved to IP addresses in several ways:

- Via a static lookup map defined in the script.
- Via the operating system's facilities (`/etc/hosts`, `/etc/resolv.conf`, etc.).
- By querying specific DNS servers.

When connecting to an address using a host name, the resolution can be controlled via the `lookup` function passed to the socket constructor. By default, the mechanism provided by the operating system is used (`dns.lookup()`).

For example:
```javascript
import { dialTCP } from 'k6/x/net';
import dns from 'k6/x/net/dns';

const hosts = {
  'hostA': '10.0.0.10',
  'hostB': '10.0.0.11',
};

export default async function () {
  const socket = await dialTCP('myhost', {
    lookup: async hostname => {
      // Return either the IP from the static map, or do an OS lookup,
      // or fallback to making a DNS query to specific servers.
      return hosts[hostname] || await dns.lookup(hostname) ||
        await dns.resolve(hostname, {
          rrtype: 'A',
          servers: ['1.1.1.1:53', '8.8.8.8:53'],
        });
    },
  });
}
```
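
In a `||` chain like the one in the example above, a lookup step that rejects (rather than resolving to an empty value) would skip the remaining fallbacks. Whether the proposed `dns.lookup()` rejects or resolves to an empty value on failure is not specified, so the following is only a sketch of a fallback chain that tolerates both cases. `chainLookups` is a hypothetical helper, and the three resolvers are stubs standing in for a static map, the OS lookup and a DNS query.

```javascript
// Hypothetical helper: try each resolver in order, moving on when one
// throws or returns nothing.
function chainLookups(...resolvers) {
  return async (hostname) => {
    for (const resolve of resolvers) {
      try {
        const address = await resolve(hostname);
        if (address) return address;
      } catch (err) {
        // Fall through to the next resolver.
      }
    }
    throw new Error(`could not resolve ${hostname}`);
  };
}

// Stubs standing in for a static map, the OS lookup and a DNS query.
const staticHosts = { hostA: '10.0.0.10' };
const fromMap = async (hostname) => staticHosts[hostname];
const osLookup = async () => { throw new Error('not found'); };
const dnsQuery = async (hostname) => (hostname === 'myhost' ? '10.0.0.12' : undefined);

const lookup = chainLookups(fromMap, osLookup, dnsQuery);
```

The resulting `lookup` has the same shape as the function passed to the socket constructor in the example: it takes a host name and resolves to an address.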

#### Requests and responses

HTTP requests can be created declaratively, and sent only when needed. This allows reusing request data to send many similar requests.

For example:
```javascript
import { Client, Request } from 'k6/x/net/http';

export default async function () {
  const client = new Client({
    headers: { 'User-Agent': 'k6' }, // set some global headers
  });
  const request = new Request('https://httpbin.test.k6.io/get', {
    // These will be merged with the Client options.
    headers: { 'Case-Sensitive-Header': 'somevalue' },
  });
  const response = await client.get(request, {
    // These will override any options for this specific submission.
    headers: { 'Case-Sensitive-Header': 'anothervalue' },
  });
Comment on lines +277 to +284
Member

@na-- na-- Apr 10, 2023

This API proposal seems a bit messy. The HTTP request method should probably be part of the Request object, right? The whole point of a Request object is to fully contain everything you need to make a request.

But then, client.get() probably shouldn't accept Request parameters, since that Request could be a POST. I am not sure if methods like .get() and .post() are even necessary in the default HTTP Client API 🤔 Maybe a single generic Client.request(Request) or Client.do(Request) is enough. But if we choose to have helper methods, they probably shouldn't accept a Request, just a method, params, body or something like that 🤔

Contributor Author

This example is intentionally contrived, to show how request options can be applied globally and then merged/overridden, either in a Request object, or for each individual client.get() call.

method should indeed be part of the Request object. Essentially, most options that can be passed to client.{get,request,...}, can be used to construct a Request object. The point of Request is to reuse common options to make multiple requests.

I think overriding should be done from the bottom-up (or top-down, depending on how you look at it 😄). I.e. the options passed to client.{get,request,...} will override whatever was set in Request, which will in turn override whatever was set globally. So if you set method: 'POST' in the Request, and you call client.get() with it, then a GET request will be done, with whatever else was in the Request object.

The same confusion could arise if we only allow Client.request(Request). What should happen if you have:

const r = new Request('https://k6.io/', { method: 'GET' });
client.request(r, { method: 'POST' });

?

Removing the helper .get() and .post() methods would just remove the convenience of not specifying the method every time, but not this possible inconsistency.

Or would you want to remove all options from client.request() as well, and only allow passing a Request object? 🤔 I think forcing the use of Request always would be inconvenient.

const jsonData = await response.json();
console.log(jsonData);
}
```
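
The override order discussed in the review thread above (options passed to the call win over options set in the `Request`, which in turn win over global options) can be sketched in plain JavaScript. This is only an illustration: `mergeOptions` is a hypothetical helper, not part of the proposed API, and merging headers key by key (rather than replacing the whole `headers` object) is an assumption.

```javascript
// Hypothetical helper: merges option objects from least to most specific.
// Later sources win; headers are merged key by key (an assumption of this sketch).
function mergeOptions(...sources) {
  const merged = { headers: {} };
  for (const src of sources) {
    if (!src) continue;
    const { headers, ...rest } = src;
    Object.assign(merged, rest);
    Object.assign(merged.headers, headers);
  }
  return merged;
}

// Illustrative values only.
const globalOptions = { method: 'GET', headers: { 'User-Agent': 'k6' } };
const requestOptions = { method: 'POST', headers: { 'Case-Sensitive-Header': 'somevalue' } };
const callOptions = { method: 'GET', headers: { 'Case-Sensitive-Header': 'anothervalue' } };

// Global options first, then the Request's options, then the per-call options.
const effective = mergeOptions(globalOptions, requestOptions, callOptions);
console.log(effective);
```

With this order, a `Request` created with `method: 'POST'` and sent through a `GET` helper ends up as a `GET`, which matches the behavior described in the thread.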


#### Data streaming

The [Streams API](https://developer.mozilla.org/en-US/docs/Web/API/Streams_API) allows streaming data that is received or sent over the network, or read from or written to the local filesystem. This enables more efficient usage of memory, as only chunks of it need to be allocated at once.

This is a separate project from the HTTP API, tracked in [issue #2978](https://github.com/grafana/k6/issues/2978), and involves changes in other parts of k6. Certain HTTP API functionality, however, depends on this API being available.

An example inspired by [Deno](https://deno.land/manual/examples/fetch_data#files-and-streams) of how this might work in k6:
```javascript
import { open } from 'k6/x/file';
import { Client } from 'k6/x/net/http';

// Will need supporting await in init context
const file = await open('./logo.svg'); // by default assumes 'read'

export default async function () {
  const client = new Client();
  await client.post('https://httpbin.test.k6.io/post', { body: file.readable });
}
```
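
To illustrate why streaming keeps memory usage low, here is a minimal sketch that uses the standard `ReadableStream` global (available in browsers and in Node.js 18+), not the k6 API, which does not exist yet. The `chunkedStream` and `consume` helpers are hypothetical names, and the in-memory payload stands in for a file or socket.

```javascript
// Build a stream that hands out a payload in chunks of at most chunkSize bytes.
// In k6 the source would be a file or a socket; here it is an in-memory buffer.
function chunkedStream(bytes, chunkSize) {
  let offset = 0;
  return new ReadableStream({
    pull(controller) {
      if (offset >= bytes.length) {
        controller.close();
        return;
      }
      controller.enqueue(bytes.subarray(offset, offset + chunkSize));
      offset += chunkSize;
    },
  });
}

// Consume the stream one chunk at a time, tracking the total size
// and the largest chunk that was handled at once.
async function consume(stream) {
  const reader = stream.getReader();
  let total = 0;
  let largest = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value.length;
    largest = Math.max(largest, value.length);
  }
  return { total, largest };
}

const payload = new TextEncoder().encode('x'.repeat(10000));
const result = consume(chunkedStream(payload, 1024));
result.then(({ total, largest }) => console.log(total, largest));
```

The consumer sees all 10000 bytes, but never handles more than 1024 of them in a single step.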


#### Fetch API

The [Fetch API](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API) is a convenience wrapper over existing Client, Socket and other low-level interfaces, with the benefit of being easy to use, and having sane defaults. It's a quick way to fire off some HTTP requests and get some responses, without worrying about advanced configuration.

The implementation in k6 differs slightly from the web API, but we've tried to make it familiar to use wherever possible.

Example:
```
import { fetch } from 'k6/x/net/http';

export default async function () {
  await fetch('https://httpbin.test.k6.io/get');
  await fetch('https://httpbin.test.k6.io/post', {
    // Supports the same options as Client.request()
    method: 'POST',
    headers: { 'User-Agent': 'k6' },
    json: { name: 'k6' },
  });
}
```
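
The "convenience wrapper" idea can be sketched as a function that lazily creates one shared client and forwards to its `request()` method. Everything here is an assumption made for illustration: the `Client` class is a stand-in with a stubbed transport (so the sketch runs without a network), and `fetchLike` is a hypothetical name chosen to avoid clashing with the built-in `fetch`.

```javascript
// Hypothetical stand-in for the proposed Client. The transport is injected so the
// sketch runs offline; a real client would send the request over a connection.
class Client {
  constructor({ transport, defaults = {} } = {}) {
    this.defaults = defaults;
    // The default transport just echoes the request back with a 200 status.
    this.transport = transport || (async (req) => ({ status: 200, request: req }));
  }

  async request(url, options = {}) {
    // Per-call options override the client's defaults.
    return this.transport({ method: 'GET', ...this.defaults, ...options, url });
  }
}

// One lazily created client shared by all calls. Creating a new client
// per call would be the other possible design.
let sharedClient = null;

function fetchLike(url, options = {}) {
  if (sharedClient === null) sharedClient = new Client();
  return sharedClient.request(url, options);
}

fetchLike('https://httpbin.test.k6.io/post', { method: 'POST', headers: { 'User-Agent': 'k6' } })
  .then((res) => console.log(res.status, res.request.method));
```

Reusing a single client keeps connection state shared between calls, at the cost of hidden global state; which of the two behaviors the final API should have is still an open question.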


### Implementation

Trying to solve all `new-http` issues with a single large and glorious change wouldn't be reasonable, so improvements will undoubtedly need to be done gradually, in several phases, and over several k6 development cycles.

With this in mind, we propose the following phases:

#### Phase 1: create initial k6 extension

**Goals**:

- Implement a barebones async API that serves as a proof-of-concept for what the final developer experience will look and feel like.
The code should be in a state that allows it to be easily extended.

By barebones, we mean:

- The `Client` interface with only one method: `request()`, which will work similarly to the current `http.asyncRequest()`.

For the initial PoC, it's fine if only `GET` and `POST` methods are supported.

It's not required to make `Dialer` and `Transport` fully configurable at this point, but they should use sane defaults, and it should be clear how the configuration will be done.

- This initial API should solve a minor, but concrete, issue of the current API. It should fix something that's currently not possible and doesn't have a good workaround.

Good candidates: [#936](https://github.com/grafana/k6/issues/936), [#970](https://github.com/grafana/k6/issues/970).
Contributor

If they are confirmed, do you plan to include the concrete proposal for them in this document?

Contributor Author

#936 would be fixed by the version property.

h2c could be supported via another property, so maybe:

  const client = new Client({
    version: 2,
    insecure: true,
  });

Though this would mean that you couldn't pass an already connected socket, or it would force a reconnection. 🤔 Not sure, we need to decide how to handle the whole Socket/Transport API, and if we're implementing that first, or exposing it later. Once we have a better idea of what we're doing there, making h2c configurable would be a relatively minor addition.

But to answer your question: yes, we should have proposals for issues we plan to work on in phase 1 before merging this PR.

Member

From what I know, HTTP version negotiation happens at a lower level, so I am not sure if the Client is the place where we should have a version property 🤔 On the network or transport level is probably more appropriate 🤔 Though I am not sure if we should even specify these things in this design doc, a PoC seems like a better place to figure them out

Contributor Author

> HTTP version negotiation happens at a lower level

Right, it can happen during TLS negotiation, as part of ALPN. But HTTP/1.1 connections can also be upgraded via a header. I'm conflicted about it as well, though it definitely should be somewhere lower level as well.

Maybe part of TLS.connect()?

import { TLS } from 'k6/x/net';
const tlsSocket = await TLS.connect('10.0.0.10:443',
  { alpn: [ 'h2', 'http/1.1' ] });
const client = new Client({
  socket: tlsSocket,
});

This way you could force HTTP/1.1 over TLS, even if the server supports HTTP/2. This could be simplified to a versions array, instead of alpn.

Since h2c must be negotiated via an Upgrade header, and can only be done without TLS, then something like the insecure flag above on the Client itself would be the way to go, in which case we should also keep the version property. It wouldn't make sense to specify that as a, say, TCP.open() option.

But I agree that we don't need to agree on every single detail to start working on the PoC. We can iron out these details as we make progress.
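
As a rough illustration of how the `alpn` list above would influence the negotiated HTTP version, here is a sketch of ALPN selection. It assumes the server walks its own preference list and picks the first protocol the client also offered, which is a common policy but not the only possible one; `selectAlpn` is a hypothetical helper, not part of any proposed API.

```javascript
// Pick the first protocol in the server's preference list that the client
// also offered. Returns null when the two lists have nothing in common.
function selectAlpn(clientOffered, serverPreferred) {
  const offered = new Set(clientOffered);
  for (const proto of serverPreferred) {
    if (offered.has(proto)) return proto;
  }
  return null;
}

const serverPreferred = ['h2', 'http/1.1'];
console.log(selectAlpn(['h2', 'http/1.1'], serverPreferred)); // h2
console.log(selectAlpn(['http/1.1'], serverPreferred));       // http/1.1
```

Offering only `http/1.1` is what would force HTTP/1.1 over TLS even against a server that supports HTTP/2.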

Contributor

Just to note that the UPGRADE header is only used for (in practice) websockets and h2c without "prior knowledge".

The usual http2 upgrade happens in practice in the tls handshake and h2c with prior knowledge just starts talking http2 directly.

http3 being over UDP means that the upgrade is basically a completely new connection. Except again if "prior knowledge" is used in which case it is still on a new connection, it just skips the first one ;).

I am not certain where this should be configured ... or even if it should be one place or if it should be solved with some more involved setup.

Contributor

Suggested change
Good candidates: [#936](https://github.com/grafana/k6/issues/936), [#970](https://github.com/grafana/k6/issues/970).
This initial API must solve a minor, but concrete, issue of the current API. It fixes something that's currently not possible and doesn't have a good workaround, such as [#936](https://github.com/grafana/k6/issues/936) and [#970](https://github.com/grafana/k6/issues/970).

I think we can promote them.

Contributor Author

In 0fde824, I added TLS options to TCP.open(), instead of having TLS be a separate k6/x/net class.


- Features like configuring options globally, or per VU or request, should be implemented.
Deprecating the `httpx` library should be possible after this phase.


**Non-goals**:

- We won't yet try to solve performance/memory issues of the current API, or implement major new features like data streaming.


#### Phase 2: work on major issues

**Goals**:

- Work should be started on some of the most impactful issues from the current API.
Issues like high memory usage when uploading files ([#2311](https://github.com/grafana/k6/issues/2311)), and data streaming ([#592](https://github.com/grafana/k6/issues/592)), are good candidates to focus on first.


#### Phase 3: work on leftover issues

**Goals**:

- All leftover `new-http` issues should be worked on in this phase.
**TODO**: Specify which issues and in what order should be worked on here.
Contributor

I guess this is the part with all the socket and dns stuff?

Contributor Author

It's not clear yet, so we should decide how to handle that.

The DNS lib? Sure, it can be done here, as it's not critical. For the socket lib, I'm not so sure, and think it should be done earlier, maybe even in phase 1.

This is partly a reply to your top comment, but I think we should structure the Go API in such a way that it mirrors the JS API, so that exposing things like the Sockets API would just be a matter of adding JS bindings to it. If we don't think about how the API will look from the JS side and drive the implementation based on that, then we might end up with a state that uses Go semantics and is difficult to later expose to JS.

This is why I think we should start with the lower level APIs the HTTP API depends on first, and then build on top of it. I added an introduction to the Design section that touches on why this is important.

Contributor Author

Ah, forgot to mention that phase 3 would essentially be for features we consider lower priority. The high priority features would be done in phases 1 and 2, but we should decide which features should be done when. The spreadsheet I shared attempts to do this, so we can start there.


- The extension should be thoroughly tested, by both internal and external users.


#### Phase 4: expand, polish and stabilize the API

**Goals**:

- The API should be expanded to include all HTTP methods supported by the current API.
For the most part, it should reach feature parity with the current API.

- A standalone `fetch()` function should be added that resembles the web Fetch API. There will be some differences in the options compared to the web API, as we want to make parts of the transport/client configurable.

Internally, this function will create a new client (or reuse a global one?), and will simply act as a convenience wrapper over the underlying `Client`/`Dialer`/`Transport` implementations, which will be initialized with sane default values.

- Towards the end of this phase, the API should be mostly stable, based on community feedback.
Small changes will be inevitable, but there should be no discussion about the overall design.
Comment on lines +467 to +479
Contributor

I kind of don't see the benefit of any of this especially as part of this proposal.

While some or all of those changes (I won't call them improvements, as I personally don't like them) might be added, none of them is, IMO, a good reason to not make the API generally available.

Contributor Author

@imiric imiric Mar 29, 2023

Which part are you specifically referring to? All 3 points of the Goals section, or just the Fetch point?

I'm surprised you object to the Fetch API, since you opened #2424. 😉

I don't understand why this is such a hot topic...

First of all, this is not a blocker for making the API generally available. It will be available to anyone who wants to use the extension starting from phase 1. And we now agreed that the extension should be merged into k6 as an experimental module at the end of phase 2.

Secondly, once the main API is implemented, adding a fetch() wrapper on top of it is such a minuscule and trivial part of it, that it's not even worth debating at this point.

The reason I think it's worth mentioning as a general design goal, even if it's not as important as other aspects of the API (which is why it's in phase 4), is because it would be very convenient to use, similar to how k6/http is now. Most users won't care about configuring clients or changing transport-related settings, and would just want to fire off a quick request. In this sense, I expect fetch() to be the most popular API, since it's familiar and easy to use. A good indicator of usable APIs is to expose complexity gradually and as needed; make things easy to use for newcomers, but flexible enough for people to explore naturally. To quote Alan Kay: "Simple things should be simple; complex things should be possible." ❤️ Since the overall goal of k6 is to deliver the best DX possible, having familiar and friendly wrappers like this aligns well with that mission.

So please suggest which parts of this you would take out, or how you would change it, but I don't think we should remove the Fetch API as a design goal.

Contributor

I would argue the proposal here is about making things possible.

Both we and every user can (and will) extend this API.

But for me, adding a bunch of UX improvements, which make up no small part of the specification, does not help the discussion; it hinders it. I, or whoever else reads it, now have to read a lot more text and then decide whether the author and the people who agreed meant those as inspirational things rather than as goals this proposal must reach to be called fulfilled.

Contributor Author

Would it help if we split this large proposal into several smaller ones? The scope is quite large already, and we haven't even fleshed out the details of the Sockets or DNS APIs. Splitting this into separate proposals would make each one more manageable, and allow us to prioritize them as a related group. Then the UX improvements, the Fetch API and any such convenience wrappers, could go into one.

I'm not sure where to start with this, so if we agree to do this, any suggestions to move it forward would be appreciated.

Member

At this point, I think we would be better served by proofs of concept. Iterating on code instead of on more lengthy text seems like it would be the more productive way forward.

Contributor

@codebien codebien Apr 12, 2023

> Would it help if we split this large proposal into several smaller ones?

@imiric If possible, I would prefer to have one single source of truth and not fragment the information. In the end, the doc sounds still manageable to me.

I think the unique very detailed phase should be the first. All the rest should give us a guideline in terms of vision and roadmap for the long-term. We should re-iterate this doc at the end of each phase and before starting a new one.

I suggest changing phase 4 with something like this:

Suggested change
#### Phase 4: expand, polish and stabilize the API
**Goals**:
- The API should be expanded to include all HTTP methods supported by the current API.
For the most part, it should reach feature parity with the current API.
- A standalone `fetch()` function should be added that resembles the web Fetch API. There will be some differences in the options compared to the web API, as we want to make parts of the transport/client configurable.
Internally, this function will create a new client (or reuse a global one?), and will simply act as a convenience wrapper over the underlying `Client`/`Dialer`/`Transport` implementations, which will be initialized with sane default values.
- Towards the end of this phase, the API should be mostly stable, based on community feedback.
Small changes will be inevitable, but there should be no discussion about the overall design.
#### Phase 4: expand, polish and stabilize the API
**Goals**:
- The API should be expanded to include all features supported by the current API.
For the most part, it should reach feature parity with the current API.
- Include all the major required features not available in the current API: [we can drop here the list of all the issues not included in the previous phases - I expect we will update it after the first phase].
- Towards the end of this phase, the API should be mostly stable, based on community feedback.
Small changes will be inevitable, but there should be no discussion about the overall design.

Contributor Author

@imiric imiric Apr 13, 2023

> At this point, I think we would be better served by proofs of concept. Iterating on code instead of on more lengthy text seems like it would be the more productive way forward.

Agreed. I started working on a PoC weeks ago, but Ned then suggested it was better to discuss the design first, so I abandoned it. At this point, it doesn't even feel like we agree on phase 1 of the current proposal, so I don't think we're any more ready to work on the PoC than we were back then.

The reason to split this proposal into several more focused ones is to address Mihail's feedback that the scope of this proposal has expanded beyond just an HTTP API. And particularly to get rid of any mentions of UX improvements and the Fetch API, which seems to be controversial.

If we don't want to split it, then I guess everyone is fine with most of the sections here being incomplete, and the proposal "tiring to read"? It's frustrating trying to address some feedback, and then getting mixed signals about the way to proceed. 😓

In order to work on the PoC again, does everyone agree with the current phase 1?

That is: we won't be exposing a Sockets API, and network/transport options won't be configurable. The only goal is to expose a Client.request() that fixes a minor issue like #936 or #970. Although, in practice, both #936 and #970 will probably require configuring the transport, so maybe another issue would be a better fit.

If you agree, please approve the PR and let's merge it as is. If not, please suggest improvements to phase 1 only. We can iterate on and flesh out the other phases later.



#### Phase 5: merge into k6-core, more testing

At this point, the extension should be featureful and stable enough to be useful to all k6 users.

**Goals**:

- Merge the extension into k6 core, and make it available to k6 Cloud users.

- Continue to gather and address feedback from users, test thoroughly, and polish the API.


#### Phase 6: deprecate `k6/http`

As the final phase, we should add deprecation warnings when `k6/http` is used, and point users to the new API.
Eventually, months down the line, we can consider replacing `k6/http` altogether with the new module.


## Potential risks

* Long implementation time.

Not so much of a risk, but more of a necessary side-effect of spreading the work in phases, and over several development cycles. We need this approach in order to have ample time for community feedback, to implement any unplanned features, and to make sure the new API fixes all existing issues.
Given this, it's likely that the entire process will take many months, possibly more than a year to finalize.


## Technical decisions

TBD after team discussion. In the meantime, see the "Proposed solution(s)" section.