Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/json support context in marshal/unmarshal. like MarshalWithContext(ctx context.Context,v interface{}) #50422

Closed
armstrongli opened this issue Jan 4, 2022 · 7 comments

Comments

@armstrongli
Copy link

armstrongli commented Jan 4, 2022

Proposals

Proposing to support a context.Context as an argument to encoding/json package to do interaction while marshalling/unmarshalling payloads. e.g.

func MarshalWithContext(ctx context.Context, v any) ([]byte,error) ....
func UnmarshalWithContext(ctx context.Context, data []byte, v any) error ....

Use Cases

kubernetes response sometimes can be very large(e.g. >300MB) when listing cluster wide objects. the response can be very heavy for server to encode/decode. there are timeout configurations to exit the request with timeout failure because the object encoding doesn't finish within the time limitation. the response already returned but the marshalling still goes on and consumes CPU/Memory.
with a context parsed to marshal/unmarshal, it is possible to respect the context and cancel the encoding/decoding as soon as possible rather than exhausting unnecessary resources.

@gopherbot gopherbot added this to the Proposal milestone Jan 4, 2022
@mvdan
Copy link
Member

mvdan commented Jan 4, 2022

Usually, a context or timeout/deadline is required for asynchronous work where it's practically impossible to know how long an operation will take.

I also imagine it would be useful for libraries that interpret or execute code, such as interpreting a script that might loop forever.

My first impression is that encoding/json falls under neither of those categories. The amount of time a Marshal or Unmarshal call takes is fairly predictable, as it only scales with the size of the input, and shouldn't block or loop for a long time unless you're using some special MarshalJSON/UnmarshalJSON methods.

For example, for Unmarshal, why is it not enough to have a limit on the number of bytes being decoded?

I realise that for Marshal it's not quite as easy as len(data), but still - if you're marshalling a data structure, I imagine you have built that data structure somehow. You should still have some knowledge about how large it is.

@mvdan
Copy link
Member

mvdan commented Jan 4, 2022

@dsnet
Copy link
Member

dsnet commented Jan 4, 2022

This may be related to #20280. I don't think the json package should do anything unless there is a way to cancel an ongoing Read or Write call. Otherwise, I don't see how the json package itself can implement this reliably.

@armstrongli
Copy link
Author

Usually, a context or timeout/deadline is required for asynchronous work where it's practically impossible to know how long an operation will take.

partially agree on this. a context or timeout/deadline is required to control the expectation of some work(sync or async) and the work can continue or drop by how the work respects the context. a Context holds some context of the logic and wants some response or output within the context. the work can end directly if the context is ended except some atomic operations. here we can treat json.MarshalJSON as atomic operation, then the marshal work contiues until it finishes. we can also open the box and make it interactive to respect the context and end the marshal ASAP. e.g. TCP stream is ended directly when context ends rather than continuely receiving data until stream data finish. the TCP stream ends can save server's CPU/Memory/bandwidth/etc. same to client side.

there are customized marshal/unmarshal methods, too, as you described as special methods can consume unexpectly time.

I realise that for Marshal it's not quite as easy as len(data), but still - if you're marshalling a data structure, I imagine you have built that data structure somehow. You should still have some knowledge about how large it is.

right. i know how large the data is, and also i know how much time it takes to do marshal. the problem is the request has 30 seconds as timeout and the response is already done and leaving marshal work as dangling until it finishes the work. it slows down the web server(our k8s control plane a lot).

@mvdan
Copy link
Member

mvdan commented Jan 5, 2022

there are customized marshal/unmarshal methods, too, as you described as special methods can consume unexpectly time.

Those methods do not receive a context though, so if your problem is that they may consume an unexpected amount of time, this proposal won't help.

the response is already done and leaving marshal work as dangling until it finishes the work. it slows down the web server

Okay, that makes more sense to me. I can't say I've ever marshalled or unmarshalled such huge chunks of data, but I can understand wanting to cancel when done. I share @dsnet's thoughts, though - besides MarshalJSON and UnmarshalJSON, which don't take contexts, the only other blocking operation in encoding/json is reading and writing, which is currently not cancellable.

So it seems wrong to add Context to the json APIs. Even if they might improve your situation in some cases, in some others cancelling the context would accomplish nothing at all, and that would be a very confusing and unhelpful API.

@mvdan
Copy link
Member

mvdan commented Jan 5, 2022

For your particular use case, another approach could be to use the Encoder with an io.Writer implementation around a bytes.Buffer, which would return an error when the request has been cancelled. Note that that won't quite work today (see #33714), but at least that solution isn't blocked on io.Reader and io.Writer being cancellable.

@armstrongli
Copy link
Author

So it seems wrong to add Context to the json APIs. Even if they might improve your situation in some cases, in some others cancelling the context would accomplish nothing at all, and that would be a very confusing and unhelpful API.

from this perspective, it makes sense not to add the context parameter for the tidy and cleanliness of package encoding/xxx .

also thank you for the approatch of using io.Writer, buffer solution as we need a way to parse the context into the writer and control it.

@golang golang locked and limited conversation to collaborators Jan 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants