up to 20x faster jsonutils deserialization #18183

Merged: 2 commits, Jun 5, 2021


Member

timotheecour commented Jun 5, 2021

nim r -d:danger main
(taking best of 5 runs)

before PR: 0.27 sec

after PR: 0.012 sec

example directly adapted from https://forum.nim-lang.org/t/8074

when true:
  import times, std/json, std/jsonutils

  type OptionContract* = ref object
    id*:               string
    right*:            string
    expiration*:       string
    strike_raw*:       float
    premium_raw*:      float
    data_type*:        string

  type OptionChain* = object
    contracts*: seq[OptionContract]

  proc stub_data(): OptionChain =
    result = OptionChain()
    for _ in 1..6000:
      result.contracts.add OptionContract(
        id: "AMZN CALL 2021-03-19 1460.0 USD",
        right: "call",
        expiration: "2021-03-19",
        strike_raw: 1460.0,
        premium_raw: 1676.03,
        data_type: "some type"
      )
  let json_str = stub_data().toJson.pretty

  let j = json_str.parseJson
  let time = cpuTime()
  for i in 1..5:
    discard j.jsonTo(OptionChain)
  let t2 = cpuTime()
  echo "Time taken: ", t2 - time, " sec"

performance lesson learned

avoid function calls in hotspots (even {.inline.} or --passc:-flto would not help here)

Actually, the real explanation is given in #18183 (comment): templates, unlike procs, allow lazy parameter evaluation, e.g.:

checkJson ok, $(json.len, num, numMatched, $T, json)
  # The `msg` is only evaluated on failure if checkJson is a template, but not if it's a proc
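
To make the difference concrete, here is a minimal sketch (not the jsonutils code) that counts how often a message string is built. The names `describe`, `checkProc` and `checkTmpl` are hypothetical stand-ins for the expensive `$(...)` call and for the proc and template variants of a check.

```nim
# Sketch: eager (proc) vs lazy (template) evaluation of a message argument.
var built = 0  # counts how often the message string is actually built

proc describe(n: int): string =
  ## Hypothetical stand-in for an expensive `$(...)` call.
  inc built
  "unexpected value: " & $n

proc checkProc(cond: bool, msg: string) =
  # Proc arguments are evaluated before the call, even when `cond` is true.
  if not cond:
    raise newException(ValueError, msg)

template checkTmpl(cond: bool, msg: string) =
  # The template substitutes `msg` into the failing branch only.
  if not cond:
    raise newException(ValueError, msg)

for i in 1..1000:
  checkProc(i > 0, describe(i))
let eagerBuilds = built

built = 0
for i in 1..1000:
  checkTmpl(i > 0, describe(i))
let lazyBuilds = built

echo "proc: ", eagerBuilds, " template: ", lazyBuilds
```

With the proc, the message is built on every iteration even though the check never fails; with the template, it is never built.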

# just pick 1 exception type for simplicity; other choices would be:
# JsonError, JsonParser, JsonKindError
raise newException(ValueError, msg)
proc raiseJsonException(condStr: string, msg: string) =
Member

Suggested change
- proc raiseJsonException(condStr: string, msg: string) =
+ proc raiseJsonException(condStr: string, msg: string) {.noinline.} =

Member Author

Done, but can't the C compiler figure out on its own (and without PGO) that it's not worth inlining because it's a noreturn?

Member

With --exceptions:goto it's not a noreturn.

Member

Araq commented Jun 5, 2021

> avoid function calls in hotspots (even {.inline.} or --passc:-flto would not help here)

The lesson seems to me "split code into hot/cold sections", it was a template and thus inlined before.
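
A hot/cold split of this kind could look roughly like the sketch below. It is an illustration only: `raiseCheckFailed`, `check` and `sumPositive` are hypothetical names, not the identifiers used in jsonutils. The template keeps a single branch on the hot path, and the raising code lives in a `{.noinline.}` proc.

```nim
# Sketch of a hot/cold split: cheap branch inline, raising code out of line.
proc raiseCheckFailed(condStr, msg: string) {.noinline.} =
  # Cold path: kept out of line so callers stay small.
  raise newException(ValueError, "check failed: " & condStr & " " & msg)

template check(cond: untyped, msg = "") =
  # Hot path: one branch; `msg` is only built inside the failing branch.
  if not cond:
    raiseCheckFailed(astToStr(cond), msg)

proc sumPositive(xs: openArray[int]): int =
  for x in xs:
    check x > 0, "got " & $x
    result += x

echo sumPositive([1, 2, 3])

var caught = ""
try:
  discard sumPositive([1, -2, 3])
except ValueError as e:
  caught = e.msg
echo caught
```

The failing call reports both the stringified condition and the lazily built message, while the passing call never constructs the message at all.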

Member Author

timotheecour commented Jun 5, 2021

> > avoid function calls in hotspots (even {.inline.} or --passc:-flto would not help here)
>
> The lesson seems to me "split code into hot/cold sections", it was a template and thus inlined before.

Actually, both my original conclusion and yours are off-track. IMO the real explanation here is that template calls allow lazy parameter evaluation (in D, the analog is lazy; see https://dlang.org/spec/function.html#lazy-params):

checkJson ok, $(json.len, num, numMatched, $T, json)

Before the PR, $(json.len, num, numMatched, $T, json) had to be evaluated before calling the checkJsonImpl proc, even though the result is not needed when cond is true.

After the PR, $(json.len, num, numMatched, $T, json) is only evaluated if cond is false (the cold section).

This also explains why {.inline.} or --passc:-flto would have no effect, as the function call semantics would still require evaluating proc arguments.

This applies in other scenarios too. In particular, it's usually OK [1] to have verbose asserts whose failure message requires potentially complex code, so long as the assert/enforce is a template, like doAssert or checkJson.
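
As a small illustration with the standard doAssert template, the sketch below counts how often a verbose message is built while the assertion keeps passing. The helper `verboseReport` is hypothetical.

```nim
# Sketch: a verbose doAssert message costs nothing while the check passes.
var evaluated = 0  # counts how often the message is built

proc verboseReport(data: seq[int]): string =
  ## Hypothetical helper producing a detailed failure message.
  inc evaluated
  "len=" & $data.len & " data=" & $data

let data = @[1, 2, 3]
for _ in 1..100:
  # doAssert is a template, so the message expression sits in the failing branch.
  doAssert data.len == 3, verboseReport(data)

echo "message built ", evaluated, " times"
```

Since the condition holds on every iteration, the counter is expected to stay at zero.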

[1] instruction cache size being a second-order effect

Araq merged commit 9c6259e into nim-lang:devel Jun 5, 2021

Member

Araq commented Jun 5, 2021

> [1] instruction cache size being a second-order effect

Which I care about though...

timotheecour deleted the pr_jsonutils_slow branch June 5, 2021 08:02

Member Author

timotheecour commented Jun 5, 2021

> > [1] instruction cache size being a second-order effect
>
> Which I care about though...

That's essentially what --lean (#14282) is about: it can be used throughout the stdlib (or third-party libraries) via compileOption("lean"), or via push/pop inside a context, to get lean error messages and smaller binaries.
