Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decoder: sanity-check the length of slices/maps before allocating them #19

Merged
merged 5 commits into from
Nov 22, 2023

Conversation

2opremio
Copy link

@2opremio 2opremio commented Nov 15, 2023

This is meant to protect against input XDR doctored to cause a heap explosion.

The decoder won't allocate any container (slice/map) with a length larger than the remaining size of the input reader (if available).

This is meant to protect against XDR doctored to cause a heap explosion.

The decoder won't make any allocation larger than the provided maximum.
@2opremio 2opremio changed the title Enforce maximum reflection allocation size decoder: Support maximum reflection allocation size Nov 15, 2023
Copy link
Member

@leighmcculloch leighmcculloch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion, but regardless looks good to me.

xdr3/decode.go Outdated
@@ -509,6 +511,10 @@ func (d *Decoder) decodeArray(v reflect.Value, ignoreOpaque bool, maxSize int, m
// existing slice does not have enough capacity.
sliceLen := int(dataLen)
if v.Cap() < sliceLen {
growth := (sliceLen - v.Cap()) * int(v.Type().Size())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little concerned about the conversion assuming uintptr is the same bitsize as int, although in practice I think that's usually the case it's not guaranteed by the language. The safest thing to do would be to do the conversion then check for overflow before using the value.

Copy link

@tamirms tamirms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the approach taken in the Rust and C++ xdr library where you have a limit on the maximum number of bytes that will be read from the binary payload:

https://github.com/stellar/xdrgen/blob/master/lib/xdrgen/generators/rust/src/types.rs#L191-L290

I think implementing that approach would not be more complex than the code you have here. Also, it should be just as efficient.

In all the places where we decode xdr in horizon and soroban-rpc we know the length of the binary payload. We can also default to the maximum body size allowed by the http server in case for some reason we need to parse xdr from a binary stream.

The reason why I prefer the approach where we keep track of the number of bytes read from the binary payload is because I think you can have a tighter bound compared to capping the limit on each allocation and I think it's nice to have a consistent approach between the C++, Rust, and Go xdr decoders.

@2opremio
Copy link
Author

I think implementing that approach would not be more complex than the code you have here. Also, it should be just as efficient.

I think it would be more complex and inefficient because:

  1. You need to keep track of the consumed bytes.
  2. You need to match serialized sizes with allocation sizes. To protect from allocation explosions, before every allocation you need to estimate whether the remaining buffer size is reasonable. I think this can be tricky.

Also, whilst in horizon and soroban-rpc we know the length of the binary payload, that may not be the case in general. This is less of a problem since I guess it's ok to tailor the library for Stellar's use.

Finally, I am not sure it's much better to limit it by input size. Ergonomically, you still need to pass a size parameter and we still need to limit the size of every HTTP request individually (so the fact that the size is limited independently of allocations doesn't give us a big advantage)

@2opremio
Copy link
Author

2opremio commented Nov 21, 2023

you still need to pass a size parameter

Uhm after thinking about it, we could probably figure out whether the reader passed is an io.Buffer or {bytes/strings}.Reader by checking for a Len() method and use that as the size limit.

This would be both efficient (the usage computation is reused from the input reader) and cleaner (no explicit size parameters) so this would make me lean towards limiting by size (although it would may be too implicit?).

I still don't know how to resolve (2) above nicely though. @leighmcculloch / @graydon how did you do it in Rust/ C++?

@2opremio
Copy link
Author

2opremio commented Nov 21, 2023

I discussed it offline with @tamirms and it seems appropiate to limit the length of allocated containers to the buffer remaining size.

@2opremio
Copy link
Author

@tamirms PTAL

@2opremio 2opremio changed the title decoder: Support maximum reflection allocation size decoder: sanity-check the length of slices/maps before allocating them Nov 21, 2023
@2opremio
Copy link
Author

2opremio commented Nov 21, 2023

I have just realized that base64.Decoder doesn't provide a Len() method ...

@leighmcculloch
Copy link
Member

I have just realized that base64.Decoder doesn't provide a Len() method ...

https://pkg.go.dev/encoding/base64#Encoding.DecodedLen
https://pkg.go.dev/encoding/base64#Encoding.EncodedLen

Does these do what you need?

@2opremio
Copy link
Author

I have just realized that base64.Decoder doesn't provide a Len() method ...

https://pkg.go.dev/encoding/base64#Encoding.DecodedLen https://pkg.go.dev/encoding/base64#Encoding.EncodedLen

Does these do what you need?

Yes, but they are not provided by the reader. In the end I ended up creating LenReader (see the updated PR description)

xdr3/decode.go Outdated
// uses the remaining input length provided by the len reader
// to protect against doctored input XDR which could cause
// allocation explosions.
func NewDecoderFromLenReader(r LenReader) *Decoder {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that we now have two configuration options available to us when decoding:

  1. max depth which is passed into DecodeWithMaxDepth()
  2. max input length which is passed into NewLenReader()

I think it would be better to make these configuration options more explicit:

type DecodeOptions struct {
	MaxDepth      int
	MaxInputLen  int
}

func NewDecoder(r io.Reader) *Decoder {
	return NewDecoderWithOptions(r, DecodeOptions{MaxDepth: DecodeDefaultMaxDepth, MaxInputLen: math.MaxInt32})
}

func NewDecoderWithOptions(r io.Reader, options DecodeOptions) *Decoder {
	...
}

Now that a decoder is always constructed with a MaxInputLen, you can remove the following code which follows maxSize = d.mergeInputLenAndMaxSize(maxSize) :

	if maxSize == 0 {
		maxSize = maxInt32
	}

Copy link
Member

@leighmcculloch leighmcculloch Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw we went with something like this design in the Rust XDR, where we called it "Limits" where you provide a set of limits and that object gets passed around and consumed/modified as it goes.

I considered "Options", and it seemed to work okay with the depth since when it was modified it was always unmodified on the way back up so options were consistent in each function, but it seemed odd that the options would be modified permanently with the length field, which is why I went with "Limits".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will implement @tamirms 's suggestion but I think I will stick with Options.

Copy link

@tamirms tamirms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good! I left one suggestion which you are free to ignore if you think it doesn't make sense

@2opremio 2opremio merged commit b53fb00 into master Nov 22, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants