Accurate performance baseline #45

Merged · 4 commits · Jul 8, 2022
Conversation

@countvajhula (Collaborator)

Summary of Changes

I've addressed your comments on #20, @michaelballantyne, and avoided constructing lists in the benchmarks. This results in a seemingly murderous advantage for Racket over Qi, as you predicted. Although, I'm not really sure what these benchmarks are telling us. These differences only become apparent at all on scales much finer than practical workloads would seem to involve (e.g. benchmarks in libraries that use Qi are unaffected by switching between Qi and Racket). It might be like using bricks vs atoms to build a skyscraper: in practice the choice may not matter to the result, even if the atoms look nicer under a microscope. So I find myself wondering whether these benchmarks are actually useful in representing Qi vs Racket performance, and if not, what would be more useful / representative?

I'm also wondering what kinds of benchmarks would be useful as we undertake performance improvements once the compiler work is underway. Probably the forms benchmarks in profile/forms would be useful here, but those don't reflect non-local interactions.

At this point we probably just want to have some minimally accurate baseline against which future improvements / regressions could be seen with some confidence.

Would love to hear any thoughts you may have on this.

Public Domain Dedication

  • In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

@countvajhula temporarily deployed to test-env · June 30, 2022 22:24
@michaelballantyne (Collaborator)

Although, I'm not really sure what these benchmarks are telling us. These differences only become apparent at all on scales much finer than practical workloads would seem to involve... so I find myself wondering whether these benchmarks are actually useful in representing Qi vs Racket performance, and if not, what would be more useful / representative?

I guess the question is whether there are cases you care about where Qi's performance is relevant to programs that use it. If yes, then micro-benchmarks like these that point out the specific overheads should be helpful as a starting point in thinking about compiler optimizations. If not... maybe there isn't actually a good reason to make Qi faster?

@countvajhula (Collaborator, Author) commented Jul 2, 2022 via email

@countvajhula (Collaborator, Author)

Discussed this with @michaelballantyne, and he said that these numbers are in the right ballpark now, so I'm merging. To summarize his suggestions:

Qi could only exceed Racket's performance in cases where it has a stronger theory of optimization, i.e. more "language equivalences" available to exploit than Racket can employ. This could happen if either:

  1. There are cases where the Racket compiler could do optimizations but, for whatever reason, currently doesn't (e.g. functional pipelines that construct intermediate collections, which Qi might eliminate via a compiler optimization. Such pipelines are much more common in Qi than in Racket, which might explain the relative prioritization of such optimizations in Racket).
  2. Qi introduces "undefined behavior": that is, it drops invariants upheld by the Racket language in order to enable additional optimizations (e.g. making no guarantees about side effects and mutation in esc forms, and providing such guarantees only for explicit uses of effect; this seems reasonable for Qi).
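To make point 1 concrete: the kind of intermediate-collection elimination described there is sometimes called deforestation or fusion. Qi is a Racket DSL, but as a language-agnostic sketch (names and data here are hypothetical, not from the Qi codebase), here is what the rewrite looks like in Python: a naive pipeline materializes a fresh list at every stage, while the fused form makes a single pass with no intermediate collections.

```python
# Hypothetical illustration (not Qi code): a staged functional pipeline
# that allocates an intermediate list per stage, versus a fused version
# a compiler could rewrite it into when the stages are known to be pure.

def pipeline_naive(xs):
    doubled = [x * 2 for x in xs]              # intermediate list #1
    positives = [x for x in doubled if x > 0]  # intermediate list #2
    return sum(positives)

def pipeline_fused(xs):
    # Single traversal, no intermediate lists.
    return sum(x * 2 for x in xs if x * 2 > 0)

assert pipeline_naive([-2, 1, 3]) == pipeline_fused([-2, 1, 3]) == 8
```

Since such pipelines are idiomatic in Qi, a Qi compiler would see many more opportunities to apply this rewrite than Racket's general-purpose compiler typically does.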

Where Qi is slow in the local / general case, some options include:

  1. There may be nonlocal optimizations where instances of these expressions end up matching Racket's performance by virtue of rewrite rules applied at a higher level.
  2. Building a library of metadata about standard-library functions (specifically, their argument arities) to enable rewriting arbitrary-arity flows into known-arity function invocations where possible.
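The second option can be sketched as follows. This is a hypothetical Python illustration (not Qi's actual compiler machinery): a metadata table records the known arities of standard-library functions, and a "compile" step uses it to replace the generic arbitrary-arity invocation path with a specialized fixed-arity call.

```python
# Hypothetical sketch: arity metadata lets a compiler rewrite a generic
# variadic application into a direct fixed-arity call where possible.
import operator

# Stand-in for a library of metadata about standard-library functions.
KNOWN_ARITY = {operator.add: 2, operator.neg: 1}

def apply_generic(f, args):
    # Arbitrary-arity path: every call packs and unpacks an argument list.
    return f(*args)

def compile_call(f):
    # "Rewrite" step: emit a fixed-arity wrapper when the arity is known,
    # falling back to the generic path otherwise.
    arity = KNOWN_ARITY.get(f)
    if arity == 2:
        return lambda a, b: f(a, b)
    if arity == 1:
        return lambda a: f(a)
    return lambda *args: f(*args)

add2 = compile_call(operator.add)
assert add2(3, 4) == apply_generic(operator.add, (3, 4)) == 7
```

The payoff in a real compiler would come from avoiding the argument-list packing/unpacking on every call in a hot flow, at the cost of maintaining the metadata table.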

@countvajhula merged commit a374516 into main on Jul 8, 2022