Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to InsnKind in the program table #533

Closed
4 tasks done
naure opened this issue Nov 2, 2024 · 6 comments · Fixed by #538
Closed
4 tasks done

Switch to InsnKind in the program table #533

naure opened this issue Nov 2, 2024 · 6 comments · Fixed by #538
Assignees
Labels
cleanup Refactors, simplifications, hindsight 20/20 tasks. speed

Comments

@naure
Copy link
Collaborator

naure commented Nov 2, 2024

Idea extracted from discussions with @matthiasgoergens (thread).

All circuits support a single constant opcode/funct3/funct7, so they can be merged into a single constant with enum InsnKind.

This saves 1 fixed column, 1 fixed record field, and cleans up a bit.

Note: One day we might merge some opcode circuits together or split circuits for special cases, to improve the verifier or prover cost respectively. But that will probably not match the original opcode/etc classification. So there’s no drawback in removing it.

  • Double-check that opcode/funct3/funct7 are constants in all circuits.
  • Change the layout of the program table.
  • Change the calls to circuit_builder.lk_fetch.
  • Remove func7 from imm_internal.
@matthiasgoergens
Copy link
Collaborator

matthiasgoergens commented Nov 2, 2024

Thanks!

I see you added a 'speed' label. Specifically, this would speed up proving a bit. But more important is the simplification of the circuits. Speed is just a welcome side effect. :⁠-⁠)

Do we have a label for that as well?

Simpler circuits are easier to understand and audit. Less likely to hide bugs and harbour exploits.

@naure naure added the cleanup Refactors, simplifications, hindsight 20/20 tasks. label Nov 2, 2024
@naure
Copy link
Collaborator Author

naure commented Nov 2, 2024

We do now (cleanup).

@matthiasgoergens matthiasgoergens changed the title Switch to InsnKind in the program table Switch to InsnKind in the program table Nov 2, 2024
@matthiasgoergens
Copy link
Collaborator

matthiasgoergens commented Nov 2, 2024

So my suggestion is to make this table as simple as possible. And then use the same data structure (or as much as the same as possible) to drive both the circuits and the emulator.

See how this table is essentially the same as DecodedInstruction in 'decode statically' PR with the same fields. And together with the 'named struct fields' PR we can have actually the same struct on the level of Rust code.

@naure
Copy link
Collaborator Author

naure commented Nov 2, 2024

use the same data structure (or as much as the same as possible) to drive both the circuits and the emulator.

Of course. It is called DecodedInstruction. The circuits have an extra step, that is a serialization to a list of field elements: DecodedInstruction -> InsnRecord.

this table is essentially the same as DecodedInstruction

My goal is to completely separate the standard and stable emulator logic from weird ZK tricks that change every day.

I had just been postponing some needed cleanup. Now fixed here: #536

Hope that clears it up.

@matthiasgoergens
Copy link
Collaborator

matthiasgoergens commented Nov 2, 2024

My goal is to completely separate the standard and stable emulator logic from weird ZK tricks that change every day.

I suggest that the goal should be to make the emulator use the same logic as the circuits. Because we need that logic anyway, and that way it gets an additional pass of tests (and the emulator is easier to test than the circuits.)

That's especially necessary because (a) as you say the ZK tricks might change, thus need special scrutiny, and (b) emulator speed isn't as crucial as circuit correctness. (Even emulator correctness is less important than circuit correctness.)

@naure
Copy link
Collaborator Author

naure commented Nov 2, 2024

That is not any of this works. Will explain some other time.

@naure naure self-assigned this Nov 2, 2024
@naure naure linked a pull request Nov 2, 2024 that will close this issue
hero78119 pushed a commit that referenced this issue Nov 4, 2024
_Matthias:_

Risc-V instruction are encoded in multiple different 'types'. These
types have various fields. One particularly common field is `opcode`,
which typically (but not always) occupies the 7 least significant bits.

As the name suggests, opcode helps code _which_ operation is intended. I
say 'helps' because some operations share the same opcode.

Our code before this PR assumed that bits `12..=14` in the instruction
word are exactly enough extra information to determine the operation.
(Confusingly enough, our code follows Risc0 convention, and calls that
range of bits `funct3`. Only some operations have a field named `funct3`
in the spec, and not all of them have it in the same place.) That
assumption is wrong, and our circuits are unsound. Specifically, the
distinction between 'signed' (ie 'arithmetic') and 'unsigned' (ie
'logical') immediate right shifts is encoded in bit 30, as laid out in
section 2.4.1 of the
[spec](https://github.com/riscv/riscv-isa-manual/releases/download/20240411/unpriv-isa-asciidoc.pdf).

We could introduce a new column called 'bit 30' in the program table to
fix the symptom. But instead of such a piecemeal approach, we attack the
underlying problem: we switch the program table to record `InsnKind`.
`InsnKind` directly reports the kind of instruction we are dealing with.
We already have that information available in the decoder, and just
re-use it here.

Fixes #539 and
#533

---------

Co-authored-by: Aurélien Nicolas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Refactors, simplifications, hindsight 20/20 tasks. speed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants