Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT]: Not for merge #163

Closed
wants to merge 6 commits into from

Conversation

buffalojoec
Copy link
Collaborator

This is a draft PR outlining a few issues with SolFuzz-Agave that make
conformance testing between a BPF and builtin difficult.

Not for merge, just for discussion.

Each commit corresponds to one issue, and commits with prefix global: are
changes that affect the base harness (not behind feature flags).

Active Features (no commit)

This issue does not correspond to a commit.

The BPF versions of some programs have been built with the assumption that
certain features, such as the Address Lookup Table program's
relax authority checks have already been activated on all clusters.

This means that any fixtures where these features are disabled are never going
to match between builtin and BPF.

I'm assuming fixtures where the feature is both disabled or enabled are
necessary for Firedancer, so if we can't figure out an elegant solution then we
can skip these fixtures in our testing. However, this would be a large portion
of the vector collections. For example, over 3/4 of the vectors for Address
Lookup Table have the relax authority feature disabled.

Syscall SolGetSysvar

BPF versions of programs like Address Lookup Table and Stake require the new
SolGetSysvar syscall to be enabled, since it will allow them to access syvar
information such as SlotHashes and StakeHistory.

Similar to the above issue, any fixture where this feature is disabled will
never succeed when testing BPF against builtin.

The commit for this issue suggests we could just hard-code the feature for the
BPF side only.

No Modified Accounts on Error

When a builtin program returns an error, SolFuzz-Agave still returns modified
accounts. This is because builtins update the accounts in the
TransactionContext instantaneously. However, if the program fails, these
account updates are never committed.

Conversely, when BPF programs fail, the VM's account data regions are not
deserialized back into the TransactionContext, so BPF programs can fail at any
point and the modified_accounts vector will be empty.

Technically, since changes to accounts when a builtin program errors are not
committed, these should not be contained in modified_accounts. I think the
harness and the fixtures may need to be updated here.

Instruction Account Roles

Conformance fixtures provide instruction accounts by defining the index in the
input accounts list as well as the roles (signer, writable). However, when these
are converted to compiled InstructionAccounts, their roles must be
deduplicated. See
https://github.com/anza-xyz/agave/blob/c6e8239843af8e6301cd198e39d0a44add427bef/sdk/program/src/message/legacy.rs#L357.

Without this deduplication of roles, BPF program serialization is invalid, and
some AccountInfo are serialized with the wrong roles.

Trouble is, this change to use account role unification will probably affect all
of the effects.

@ravyu-jump
Copy link
Contributor

ravyu-jump commented Nov 15, 2024

Hey @buffalojoec, thanks for the writeup lemme respond to a couple of these

Active Features (no commit)

Assuming the features you mentioned get cleaned up as part of the Core BPF migration, we have the ability to regenerate fixtures without these features. So that shouldn't be a problem (eventually, there might be some hiccups during transition)

Syscall SolGetSysvar

This is actively being worked on by @mjain-jump, and as mentioned above we can treat this as a "cleaned-up" feature and retroactively regenerate fixtures.

@mjain-jump
Copy link
Contributor

Technically, since changes to accounts when a builtin program errors are not
committed, these should not be contained in modified_accounts. I think the
harness and the fixtures may need to be updated here.

In failure cases, we still capture the in-progress account modification because usually, any differences in that account data could be an indicator of an underlying bug. This way we can more closely match Agave's behavior.

What I recommend you do is using solana-conformance to specify a custom diff function instead (I can show you how to do this over a call or chat). You can then specify to not compare account states on error cases. This way, you don't have to touch the harnesses.

Without this deduplication of roles, BPF program serialization is invalid, and
some AccountInfo are serialized with the wrong roles.

The fuzzer generates all kinds of test cases to test a wide variety of behavior, this included. The easiest thing to do for now is to just reject the input if it contains duplicate mismatching roles. You could probably put it behind the core-bpf feature flag to return None if the account infos are mismatching, and conformance should skip those inputs.

@buffalojoec
Copy link
Collaborator Author

Assuming the features you mentioned get cleaned up as part of the Core BPF migration, we have the ability to regenerate fixtures without these features. So that shouldn't be a problem (eventually, there might be some hiccups during transition)

Yeah that's exactly right. Here's an example: anza-xyz/agave#3540

This is actively being worked on by @mjain-jump, and as mentioned above we can treat this as a "cleaned-up" feature and retroactively regenerate fixtures.

Cool! Conceptually, that works for me. The weird bit is that it isn't activated yet (soon will be), so it won't be cleaned up in Agave for a while. However, only BPF programs use it so far, but the loader still has to implement it behind a feature-gate.

What I recommend you do is using solana-conformance to specify a custom diff function instead (I can show you how to do this over a call or chat). You can then specify to not compare account states on error cases. This way, you don't have to touch the harnesses.

Yep, this would be great. I'll reach out about this.

The easiest thing to do for now is to just reject the input if it contains duplicate mismatching roles. You could probably put it behind the core-bpf feature flag to return None if the account infos are mismatching, and conformance should skip those inputs.

If I do this under the core-bpf feature flag, wouldn't that mean it would only skip these fixtures on the BPF side? The builtin target would still run them and then there would be a mismatch.

@mjain-jump
Copy link
Contributor

If I do this under the core-bpf feature flag, wouldn't that mean it would only skip these fixtures on the BPF side? The builtin target would still run them and then there would be a mismatch.

If any one of the targets rejects the input and returns None, the test case is skipped in conformance.

@kbhargava-jump kbhargava-jump changed the base branch from agave-v2.1.0 to agave-v2.1.3 November 27, 2024 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants