Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On-demand copy for Response.Request.Body #2395

Closed
codebien opened this issue Feb 22, 2022 · 7 comments
Closed

On-demand copy for Response.Request.Body #2395

codebien opened this issue Feb 22, 2022 · 7 comments
Labels
evaluation needed proposal needs to be validated or tested before fully implementing it in k6 refactor

Comments

@codebien
Copy link
Contributor

As part of #2311 investigation and as previously reported in #1931 (comment) the Response.Request.Body field as a string type requires executing a dedicated copy for each request.
This problem creates noise for better profiling of #2311 and it should be required from any solution applied to #2311.

type Request struct {
Method string `json:"method"`
URL string `json:"url"`
Headers map[string][]string `json:"headers"`
Body string `json:"body"`

Suggestion

The current string field is really required only when the JS property is accessed directly. The proposal is to switch to []byte and generate the string only on-demand when the property is accessed from JS. It should allow reusing the original slice used on the HTTP parsing.

The copy should be executed in the Get function for a goja.Proxy object. The Proxy wraps the k6/http.Response object.

Example

import http from "k6/http";
import { sleep } from "k6";

const binFile = open("1MB.bin", "b");

export const options = {
        systemTags: [],
	iterations: 1000,
	vus: 200,
};

export default function () {
    const data = {
	field: Math.random(),
	file: http.file(binFile, "testdata.bin"),
    };

    const res = http.post("http://test:8080/upload", data);
    sleep(1);
}

profiling

@codebien codebien added refactor evaluation needed proposal needs to be validated or tested before fully implementing it in k6 labels Feb 22, 2022
@mstoykov
Copy link
Contributor

The predominant problem with this it will require quite a big refactor, making access to everything slower to get the benefit only in the cases that it's useful.

Additionally, golang will return the body as io.ReadCloser not as byte slice so a byte slice still need to be made. And then when the body is requested it will need to be transformed from []byte to string.

I would argue that given all of the above this likely will not have that good of a performance improvement. The best case scenario is that literally nothing is ever requested, which is kind of the most likely scenario. But in all other cases it will be worse(by some amount).

Arguably something like #845 will be much easier to implement and likely will have better final performance.

IMO (as I tried to elude in #2311) the correct fix for those are to move towards an API that can make all of these problems ... not a problem ;). Like in this case unless you request the body (through some way) we won't actually keep it. We also probably should require the whole body in order to make the request and then also not keep it around.

This of course means a lot of work and a smaller temporary workaround might be a good stop gap solution, as long as it isn't very complex to implement - like this one IMO.

@na--
Copy link
Member

na-- commented Feb 22, 2022

@mstoykov, this is about the copy of the request body that is saved in Response.request.body, not the response body. So io.ReadCloser has nothing to do with it and some sort of an on-demand lazy copy seems like it should be easy to do.

Moreover, this probably is actually a bug, not just very inefficient:

respReq.Body = preq.Body.String()

For binary bodies, this will likely mangle them... So, I am fine with trying to use a proxy. I am also fine with making a minor breaking change and actually make the property return a lazy-loaded ArrayBuffer since it would basically be a bugfix 🤷‍♂️ Or have it purely on the JS side and reference literally the exact same variable that was passed to http.request(), so no copying takes place.

@codebien
Copy link
Contributor Author

codebien commented Feb 23, 2022

Or have it purely on the JS side and reference literally the exact same variable that was passed to http.request(), so no copying takes place.

@na-- Do you mean totally dropping the response.request object? If not, can you provide an example of the expected API, please? For example, how should be the following use case?

export default function () {
  const url = 'http://test.k6.io/login'

  const payload = JSON.stringify({
    email: 'aaa',
    password: 'bbb',
  })

  const params = {
    headers: {
      'Content-Type': 'application/json',
    },
  }
  let res = http.post(url, payload, params)
  console.log(res.request.headers)
}

@na--
Copy link
Member

na-- commented Feb 23, 2022

Or have it purely on the JS side and reference literally the exact same variable that was passed to http.request(), so no copying takes place.

@na-- Do you mean totally dropping the response.request object?

No, I mean the exact same goja.Value that is potentially given as args[0] here:

func (c *Client) Request(method string, url goja.Value, args ...goja.Value) (*Response, error) {
state := c.moduleInstance.vu.State()
if state == nil {
return nil, ErrHTTPForbiddenInInitContext
}
var body interface{}
var params goja.Value
if len(args) > 0 {
body = args[0].Export()

Can be threaded through the code and returned by the Proxy (or whatever) as Response.request.body. This will mean literally no copying on the Go side for this specific usage of the request body.

If not, can you provide an example of the expected API, please? For example, how should be the following use case?
...

Exactly what is currently returned - if technically possible, I propose no changes to anything in Response.request besides body.

@mstoykov
Copy link
Contributor

@mstoykov, this is about the copy of the request body that is saved in Response.request.body, not the response body. So io.ReadCloser has nothing to do with it and some sort of an on-demand lazy copy seems like it should be easy to do.

My mistake 🤦

This still doesn't really work for all cases.

http.post(url, {something : here, file: http.file(...)}...) should the response.request.body in this case be the given object?
What is the point of having the response.request.body.

And an example that I kind of give in #2311 (but not as heads on as I remember):

const data = new ArrayBuffer(8);
const view = new Int32Array(buffer);
view[0] = 20;
let response1 = http.post(url, data)
// some more code
view[2] =23;
let response2 = http.post(url, data)
// some more code
// What is the value of `response1.request.body` here? 

Also while we still don't have asynchronous code - we soon should be able to have one in which case this is even a potential race condition

...
http.asyncPromise(url, data).then(....)
view[3]=12
http.asyncPromise(url2, data).then(....)
...

Again this feels like we are trying to patch the problem while IMO the problem is systemic in the API and the fix should be to move to a different API that doesn't have those problems. This also will be a really good promotion of the new API - "it works way better than the previous one"

@na--
Copy link
Member

na-- commented Feb 24, 2022

should the response.request.body in this case be the given object?

Ideally no, but what I said in #2395 (comment) is that I would be fine with that compromise if we couldn't implement something better, e.g. a no-copy Proxy or a direct ArrayBuffer reference to the []byte sent or something along these lines...

What is the point of having the response.request.body.

There is probably very little point in Response.request.body in general and should have probably never been included. But now that it's there, it may allow someone to more easily correlate responses? 🤷‍♂️

Again this feels like we are trying to patch the problem while IMO the problem is systemic in the API and the fix should be to move to a different API that doesn't have those problems. This also will be a really good promotion of the new API - "it works way better than the previous one"

Maybe, but fixing this unnecessary and wrong string copying won't make all of these efforts worse (esp. if the Proxy can be made read-only), while it will immediately and substantially reduce memory usage of existing scripts. So I suggest we do both and we start with this fix.

@codebien
Copy link
Contributor Author

Closing as we prefer to directly address the core problem in the new HTTP API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
evaluation needed proposal needs to be validated or tested before fully implementing it in k6 refactor
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants