Skip to content
This repository has been archived by the owner on Jul 5, 2024. It is now read-only.

Research ways to make usage of ConstraintBuilder more readable #1115

Closed
aguzmant103 opened this issue Jan 26, 2023 · 8 comments
Closed

Research ways to make usage of ConstraintBuilder more readable #1115

aguzmant103 opened this issue Jan 26, 2023 · 8 comments
Labels
T-tech-debt Type: tech-debt generated or cleaned up
Milestone

Comments

@aguzmant103
Copy link
Member

aguzmant103 commented Jan 26, 2023

The focus should be in avoiding code repetition and simplifying the code base

Update: we can use this issue to discuss ways to make the usage of the Constraint Builder more readable.

@aguzmant103 aguzmant103 converted this from a draft issue Jan 26, 2023
@aguzmant103 aguzmant103 added this to the Increase opcode support milestone Jan 26, 2023
@ed255
Copy link
Member

ed255 commented Jan 26, 2023

Here are some ideas to research:

  • Currently we have a CellManager in Keccak and another implementation in EVM. And maybe a new one coming up in MPT. Can we unify them?
  • Brecht introduced some macros in the MPT refactor, could they be applied to the EVM circuit to simplify the code? See

@Brechtpd
Copy link
Collaborator

Wrote some docs about the tools I use in the MPT circuits here: #972 (comment).

I also think there are a lot ways to make writing circuits better, which will automatically lead to more code reuse. I think it makes a lot of sense to use these tools in other circuits as well (I made them so they do not contain any MPT specific code so they can be reused everywhere), especially the more simple circuits that do not yet contain a lot of abstractions to make them easier to understand.

I only focused on the constraints part for now because that's way I needed, but something like a generalized CellManager is I think possible as well so that the circuit layout part is also easier to manage for all circuits.

@leolara
Copy link
Contributor

leolara commented Feb 1, 2023

Here a proposal: https://hackmd.io/@leolara/getting-rid-halo2-meta

@Brechtpd
Copy link
Collaborator

Brechtpd commented Feb 3, 2023

Another nice thing with exposing higher level control flow blocks like if/else and match to the constraint builder is that a CellManager can directly use that information to allocate cells in an optimal way, without having to resort to trying to reverse engineer this from the expressions or having to do this manually. For example, whenever the constraint builder sees this:

if condition_a {
  let value_a = cm.query_cell();
} else {
  let value_b = cm.query_cell();
}

the CellManager automatically knows it can allocate value_a and value_b to the same cell (we do have to be careful that the cell isn't returned by the if however because then it needs to remain allocated of course). And so if the circuit uses these constructions for the whole circuit you will automatically get a layout that is optimal (or as optimal as the code allows at least).

@leolara
Copy link
Contributor

leolara commented Feb 3, 2023

I think my main point is, what makes this code for example difficult to read?

meta.create_gate("Byte to Byte row", |meta| {
    let mut cb = BaseConstraintBuilder::default();

    cb.require_equal(
        "next.length == cur.length",
        meta.query_advice(length, Rotation::next()),
        meta.query_advice(length, Rotation::cur()),
    );

    cb.require_equal(
        "next.index == cur.index + 1",
        meta.query_advice(bytecode_table.index, Rotation::next()),
        meta.query_advice(bytecode_table.index, Rotation::cur()) + 1.expr(),
    );

    cb.require_equal(
        "next.hash == cur.hash",
        meta.query_advice(bytecode_table.code_hash, Rotation::next()),
        meta.query_advice(bytecode_table.code_hash, Rotation::cur()),
    );

    cb.require_equal(
        "next.value_rlc == cur.value_rlc * randomness + next.value",
        meta.query_advice(value_rlc, Rotation::next()),
        meta.query_advice(value_rlc, Rotation::cur()) * challenges.keccak_input()
            + meta.query_advice(value, Rotation::next()),
    );

    cb.require_equal(
        "next.push_data_left == cur.is_code ? cur.push_data_size : cur.push_data_left - 1",
        meta.query_advice(push_data_left, Rotation::next()),
        select::expr(
            meta.query_advice(bytecode_table.is_code, Rotation::cur()),
            meta.query_advice(push_data_size, Rotation::cur()),
            meta.query_advice(push_data_left, Rotation::cur()) - 1.expr(),
        ),
    );

    cb.gate(and::expr(vec![
        meta.query_fixed(q_enable, Rotation::cur()),
        not::expr(meta.query_fixed(q_last, Rotation::cur())),
        is_byte_to_byte(meta),
    ]))
});

In my opinion, it is the column queries and having to refer to these methods in meta and Rotation. If we could write them like this:

meta.create_gate("Byte to Byte row", |meta| {
    let mut cb = BaseConstraintBuilder::default();

    cb.require_equal(
        "next.length == cur.length",
        length.next(),
        length.cur()
    );

    cb.require_equal(
        "next.index == cur.index + 1",
        bytecode_table.index.next(),
        bytecode_table.index.cur() + 1.expr(),
    );

    cb.require_equal(
        "next.hash == cur.hash",
        bytecode_table.code_hash.next(),
        bytecode_table.code_hash.cur(),
    );

    cb.require_equal(
        "next.value_rlc == cur.value_rlc * randomness + next.value",
        value_rlc.next(),
        value_rlc.cur() * challenges.keccak_input() + value.next(),
    );

    cb.require_equal(
        "next.push_data_left == cur.is_code ? cur.push_data_size : cur.push_data_left - 1",
        push_data_left.next(),
        select::expr(
            bytecode_table.is_code.next(),
            push_data_size.cur(),
            push_data_left.cur() - 1.expr(),
        ),
    );

    cb.gate(and::expr(vec![
        q_enable.cur(),
        not::expr(q_last.cur()),
        is_byte_to_byte(),
    ]))
});

That makes the most difference. We can have the macros that wrap contraint expressions to make it even more readable, but for me this change makes all the difference.

Note: we would need a method that wraps meta.create_gate in order to "inject" the gate meta to the column queries.

Because we are wrapping the columns on a new type, we can add to it the constrain operators, like 'equal' or and to them, this that not invalid the possibility of having the operators also outside or with macros.

gb.when("Byte to Byte row",
  is_byte_to_byte.and([q_enable.cur(), not::expr(q_last.cur())]), || {
    [
        (
            "next.length == cur.length",
            length.next().eq(length.cur())
        ),

        (
            "next.index == cur.index + 1",
            bytecode_table.index.next().eq(bytecode_table.index.cur() + 1.expr())
        ),

        (
            "next.hash == cur.hash",
            bytecode_table.code_hash.next().eq(bytecode_table.code_hash.cur())
        ),

        (
            "next.value_rlc == cur.value_rlc * randomness + next.value",
            value_rlc.next().eq(value_rlc.cur() * challenges.keccak_input() + value.next())
        ),

        (
            "next.push_data_left == cur.is_code ? cur.push_data_size : cur.push_data_left - 1",
            push_data_left.next().eq(
                select::expr(
                    bytecode_table.is_code.next(),
                    push_data_size.cur(),
                    push_data_left.cur() - 1.expr(),
                )
            )
        )
    ]
});

In my opinion there is no advance in readability if there is no replacement for the columns queries.

I see in the macros solution there are macros a! and f! that attach this problem. In this case, I would have macros cur!, next!, prev! and things like that. So the solution with macros includes what is my main point, however we should check if this causes problems with debugging and tooling.

@Brechtpd
Copy link
Collaborator

Brechtpd commented Feb 3, 2023

Totally agreed that the querying is one the main problems for readability. The only reason it's necessary to the query in the first place is because of the query index, which is not something you care about while writing the circuits, I believe it should be possible to just set those query indices on the halo2 side when it receives the expressions instead of having to rely on people writing circuits to have to manually query each cell themselves. I briefly tried to modify halo2 to do just that, but it was more work than expected so gave up and just decided to go with the macro route for now. We may want to still wrap things on the zk-evm side, but I think it makes sense to modify halo2 to already do a lot of this stuff (like .cur()/.next()) directly on the halo2 types, otherwise we still have to keep this mutable meta object in mind for querying stuff which is bothersome (and these changes can be done in a backward compatible way).

@ed255 ed255 added the T-tech-debt Type: tech-debt generated or cleaned up label Feb 14, 2023
@ed255 ed255 removed this from the tech-debt-1 milestone Feb 14, 2023
@ed255 ed255 changed the title Research ways to improve constraint builder Research ways to make usage of ConstraintBuilder more readable Feb 14, 2023
@ed255 ed255 added this to the tech-debt-1 milestone Feb 16, 2023
@naure
Copy link
Contributor

naure commented Feb 23, 2023

This issue is not only about readability of a particular block of code, but also about code organization and abstraction boundaries.

Say we want to split a big chip into two smaller chips, each with its bounded responsibilities and some interface between them. That will require the chips to accept or expose Queries (column, rotation), or Expression. But we are not allowed to do that, because these objects are made available inside of the create_gate closure. Many attempts at code organization will result in illegally crossing or nesting this boundary. This forces or encourages giant configure and create_gate blocks responsible for everything at once.

Alternatively, one can steal the queries and expressions from the closure (query_expression(…) for example). This works but does not respect the API and its concept of context (VirtualCells).

Another problem is that the queries are not stored in the configuration and available to synthesis. So developers have to reproduce the layout of rotations using hard-coded offset logic (example: offset + 1). Instead, synthesis could use Cells from the configuration, which abstracts the positions. Similarly, code duplicates many expressions, once with the Expression structure, and once with Field ops (example: all the modules that have an expr and a value). Instead, synthesis could have the opportunity to evaluate Expression trees from the configuration as the simplest way to synthesize (although that is a bigger change to achieve that).

These problems can be solved by not requiring to work in a closure, and making the cells and expressions officially objects to be passed around across chips, functions, modules, config structs, etc. This will let developers organize code in any programming style appropriate to each task.

Here is an experiment about splitting code into decoupled chips. This is done by stealing queries (code). Then chips can accept and return Cells and Expressions (example). This is similar to what the EVM circuit does as well, so that looks like a good direction.

@ChihChengLiang ChihChengLiang moved this from 📋 Sprint Focus to Milestone Tasks in zkEVM Community Edition May 18, 2023
@ed255
Copy link
Member

ed255 commented Jun 23, 2023

Closing due to inactivity. I believe there's been some work related to this issue in the macro system that @Brechtpd made for the MPT circuit. Once it reaches main we can discuss it to see if we would like to extend it's usage to the rest of the circuits to achieve making the ConstraintBuilder more readable.

@ed255 ed255 closed this as completed Jun 23, 2023
@ChihChengLiang ChihChengLiang moved this from Milestone Tasks to ✅ Done in zkEVM Community Edition Jun 26, 2023
lispc pushed a commit that referenced this issue Mar 6, 2024
* doc for eip EIP2930_1559

* add 1559 gadget

* modify md format
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-tech-debt Type: tech-debt generated or cleaned up
Projects
Status: Done
Development

No branches or pull requests

5 participants