Implement resource budgets in dagcbor parsing. #85

Merged
merged 1 commit into from
Oct 20, 2020
65 changes: 60 additions & 5 deletions codec/dagcbor/unmarshal.go
@@ -14,14 +14,29 @@ import (
)

var (
ErrInvalidMultibase = errors.New("invalid multibase on IPLD link")
ErrAllocationBudgetExceeded = errors.New("message structure demanded too many resources to process")
)

const (
mapEntryGasScore = 8
Member

How are you deriving these? Is it a rough approximation of the overhead?

Collaborator Author

Yep.

Fairly arbitrarily derived. I figure it takes (very, very roughly) at least this many bytes to add another map entry in memory.

Is this number wrong? Absolutely. I have no idea how many bytes it takes to add a map entry in memory. That's a property of the native Go map implementation, and it varies with the size of the map, which is not something I'm particularly interested in predicting or making specific claims about. And even that applies only to some Nodes: the cost is actually zero in some (e.g. codegen'd structs), and who knows what in other Node implementations. Node is an interface that users can supply their own implementations of, after all.

It's sufficient to have a limit; I'm not sure it's necessarily important for the limit to be easy to intuit.
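To make the accounting in this thread concrete, here is a minimal, standalone sketch of the charging pattern. It is not the code in this PR: `charge`, `chargeMapKeys`, and `errBudgetExceeded` are hypothetical names, and only the two score constants are taken from the diff. Each map key costs its length plus the flat per-entry score, and the error fires once the remaining budget goes negative.

```go
package main

import "errors"

// errBudgetExceeded is a hypothetical stand-in for ErrAllocationBudgetExceeded.
var errBudgetExceeded = errors.New("message structure demanded too many resources to process")

const (
	mapEntryGasScore  = 8 // rough per-entry overhead for maps (value taken from the diff)
	listEntryGasScore = 4 // rough per-entry overhead for lists (value taken from the diff)
)

// charge deducts cost from the remaining gas and reports an error
// once the budget has gone negative.
func charge(gas *int, cost int) error {
	*gas -= cost
	if *gas < 0 {
		return errBudgetExceeded
	}
	return nil
}

// chargeMapKeys charges each key its byte length plus the flat map-entry score,
// stopping at the first key that overdraws the budget.
func chargeMapKeys(gas *int, keys []string) error {
	for _, k := range keys {
		if err := charge(gas, len(k)+mapEntryGasScore); err != nil {
			return err
		}
	}
	return nil
}

// chargeListEntries charges the flat list-entry score for n entries.
func chargeListEntries(gas *int, n int) error {
	for i := 0; i < n; i++ {
		if err := charge(gas, listEntryGasScore); err != nil {
			return err
		}
	}
	return nil
}
```

With a budget of 30, the keys "a", "bb", and "ccc" cost 9, 10, and 11, which uses the budget exactly; any further key fails. The precise scores matter less than the fact that some bound exists.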

listEntryGasScore = 4
)

// This should be identical to the general feature in the parent package,
// except for the `case tok.TBytes` block,
// which has dag-cbor's special sauce for detecting schemafree links.

func Unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource) error {
// Have a gas budget, which is decremented as we allocate memory; an error is returned when it is exceeded (or about to be exceeded).
// This is a DoS defense mechanism.
// It's *roughly* in units of bytes (but only very, VERY roughly): it also treats words as 1 in many cases.
// FUTURE: this ought to be configurable somehow. (How, and at what granularity, though?)
var gas int = 1048576 * 10
return unmarshal1(na, tokSrc, &gas)
}
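The key design point here is that the budget is created once at the entry point and passed by pointer, so every level of recursion draws from the same pool. A toy sketch of that shape, under stated assumptions: `value`, `walk`, `decode`, and `errBudget` are hypothetical and stand in for the real assembler and token source, and the flat cost of 4 per list entry is borrowed from `listEntryGasScore`.

```go
package main

import "errors"

// errBudget is a hypothetical sentinel for this sketch.
var errBudget = errors.New("budget exceeded")

// value is a toy stand-in for a decoded tree:
// a string leaf (children == nil) or a list of child values.
type value struct {
	str      string
	children []value
}

// walk charges every node against one shared budget.
// Because gas is a pointer, nested calls deplete the same counter.
func walk(v value, gas *int) error {
	if v.children == nil {
		*gas -= len(v.str) // leaves cost their byte length
		if *gas < 0 {
			return errBudget
		}
		return nil
	}
	for _, c := range v.children {
		*gas -= 4 // flat per-entry cost for list members
		if *gas < 0 {
			return errBudget
		}
		if err := walk(c, gas); err != nil {
			return err
		}
	}
	return nil
}

// decode is the entry point: it owns the budget and hands out a pointer to it.
func decode(v value, budget int) error {
	gas := budget
	return walk(v, &gas)
}
```

A deeply nested input therefore cannot reset its allowance by recursing; the total across all levels is what gets bounded.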

func unmarshal1(na ipld.NodeAssembler, tokSrc shared.TokenSource, gas *int) error {
var tk tok.Token
done, err := tokSrc.Step(&tk)
if err != nil {
@@ -30,12 +45,12 @@ func Unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource) error {
if done && !tk.Type.IsValue() {
return fmt.Errorf("unexpected eof")
}
return unmarshal(na, tokSrc, &tk)
return unmarshal2(na, tokSrc, &tk, gas)
}

// starts with the first token already primed. Necessary to get recursion
// to flow right without a peek+unpeek system.
func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token) error {
func unmarshal2(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token, gas *int) error {
// FUTURE: check for schema.TypedNodeBuilder that's going to parse a Link (they can slurp any token kind they want).
switch tk.Type {
case tok.TMapOpen:
@@ -44,6 +59,10 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
if tk.Length == -1 {
expectLen = math.MaxInt32
allocLen = 0
} else {
if *gas-allocLen < 0 { // halt early if this will clearly demand too many resources
return ErrAllocationBudgetExceeded
}
}
ma, err := na.BeginMap(allocLen)
if err != nil {
@@ -62,6 +81,10 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
}
return ma.Finish()
case tok.TString:
*gas -= len(tk.Str) + mapEntryGasScore
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
// continue
default:
return fmt.Errorf("unexpected %s token while expecting map key", tk.Type)
@@ -74,7 +97,7 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
if err != nil { // return in error if the key was rejected
return err
}
err = Unmarshal(mva, tokSrc)
err = unmarshal1(mva, tokSrc, gas)
if err != nil { // return in error if some part of the recursion errored
return err
}
@@ -87,6 +110,10 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
if tk.Length == -1 {
expectLen = math.MaxInt32
allocLen = 0
} else {
if *gas-allocLen < 0 { // halt early if this will clearly demand too many resources
return ErrAllocationBudgetExceeded
}
}
la, err := na.BeginList(allocLen)
if err != nil {
@@ -105,11 +132,15 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
}
return la.Finish()
default:
*gas -= listEntryGasScore
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
observedLen++
if observedLen > expectLen {
return fmt.Errorf("unexpected continuation of array elements beyond declared length")
}
err := unmarshal(la.AssembleValue(), tokSrc, tk)
err := unmarshal2(la.AssembleValue(), tokSrc, tk, gas)
if err != nil { // return in error if some part of the recursion errored
return err
}
@@ -120,8 +151,16 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
case tok.TNull:
return na.AssignNull()
case tok.TString:
*gas -= len(tk.Str)
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignString(tk.Str)
case tok.TBytes:
*gas -= len(tk.Bytes)
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
if !tk.Tagged {
return na.AssignBytes(tk.Bytes)
}
@@ -139,12 +178,28 @@ func unmarshal(na ipld.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token)
return fmt.Errorf("unhandled cbor tag %d", tk.Tag)
}
case tok.TBool:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignBool(tk.Bool)
case tok.TInt:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignInt(int(tk.Int)) // FIXME overflow check
case tok.TUint:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignInt(int(tk.Uint)) // FIXME overflow check
case tok.TFloat64:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignFloat(tk.Float64)
default:
panic("unreachable")
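The two `FIXME overflow check` notes on the `TInt` and `TUint` cases are left open by this diff. One possible shape for such a check is sketched below. The helper names `uintToInt` and `int64ToInt` are hypothetical and not part of this PR, the sketch assumes the token fields are `uint64` and `int64`, and `math.MaxInt` requires Go 1.17 or later.

```go
package main

import (
	"fmt"
	"math"
)

// uintToInt converts u to int, rejecting values that int cannot represent.
// Hypothetical helper; not part of the PR.
func uintToInt(u uint64) (int, error) {
	if u > math.MaxInt {
		return 0, fmt.Errorf("uint64 value %d overflows int", u)
	}
	return int(u), nil
}

// int64ToInt converts i to int, rejecting values that do not survive
// the round trip (only possible where int is narrower than 64 bits).
// Hypothetical helper; not part of the PR.
func int64ToInt(i int64) (int, error) {
	if int64(int(i)) != i {
		return 0, fmt.Errorf("int64 value %d overflows int", i)
	}
	return int(i), nil
}
```

Under those assumptions, the `TUint` case could call `uintToInt(tk.Uint)` and return the error before reaching `na.AssignInt`, instead of converting unconditionally.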