Make the assertions part of a generalized instruction #779

elazarg · 2024-11-07T21:39:15Z

Each GuardedInstruction is a pair (Instruction, vector<Assertion>).

Jump instructions are retained in the nondeterministic graph, since they store the preconditions of the jump

Summary by CodeRabbit

Release Notes

New Features
- Enhanced error handling in marshaling and unmarshaling processes for improved clarity and robustness.
- Introduced a new method for processing assertions within the eBPF domain.
Bug Fixes
- Adjusted error handling logic to ensure proper flow and type safety across various functions.
Documentation
- Updated documentation for return types and parameters to enhance clarity.
Refactor
- Transitioned from using Instruction to Command across multiple files for consistency in instruction representation.
Style
- Improved code readability through formatting adjustments and additional comments.

Each instruction is a pair (Command, vector<Assertion>). Jump instructions are retained in the nondeterministic graph, since they store the preconditions of the jump Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai · 2024-11-07T21:39:23Z

Warning

Rate limit exceeded

@elazarg has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 8 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between bf98ef7 and afd0a48.

Walkthrough

The pull request introduces significant changes across multiple files, primarily focusing on the transition from the Instruction type to the Command type. This involves updating function signatures, modifying error handling, and enhancing the overall structure of the codebase. Key changes include alterations to marshaling and unmarshaling logic, assertion handling, and the introduction of new structures like GuardedInstruction and IncrementLoopCounter. Additionally, the error handling mechanisms have been refined to improve type safety and clarity, particularly in the context of the check_clang-format script and various classes within the eBPF domain.

Changes

File	Change Summary
scripts/format-code	Modified error handling in `check_clang-format` to exit with status 0 if `clang-format` is not installed; commented out version checks.
src/asm_cfg.cpp	Changed handling of instructions from `Instruction` to `Command` in multiple functions; updated error handling logic and assertions.
src/asm_marshal.cpp	Updated function signatures and internal logic to use `Command` instead of `Instruction`; removed `Assert` operator overload, replaced with exceptions.
src/asm_ostream.cpp	Implemented visitor pattern for assertion handling; renamed `InstructionPrinterVisitor` to `CommandPrinterVisitor`; updated print logic for assembly instructions.
src/asm_ostream.hpp	Removed `operator<<` overloads for `Instruction`, added overload for `Assertion`; updated `to_string` function to reflect changes.
src/asm_syntax.hpp	Introduced `IncrementLoopCounter` structure; replaced `Assert` with `GuardedInstruction`; updated `pc_t` type from `size_t` to `uint32_t`.
src/asm_unmarshal.hpp	Updated documentation for the `unmarshal` function return type for clarity.
src/assertions.cpp	Changed return types in `AssertExtractor` methods from `vector<Assert>` to `vector<Assertion>`; updated handling of assertions.
src/crab/cfg.hpp	Modified `basic_block_t` access control and updated `stmt_list_t` to use `GuardedInstruction`; changed return type of `get_assertions` function.
src/crab/ebpf_domain.cpp	Added new method for processing assertions; updated method to iterate over `GuardedInstruction`.
src/crab/ebpf_domain.hpp	Added several new operator overloads; modified existing overloads for const-correctness.
src/crab/split_dbm.cpp	Improved readability and error handling in methods, updated method signature for `diffcsts_of_assign`.
src/crab/wto.cpp	Minor formatting changes to `visit_args_t` constructor; no functional changes.
src/test/test_marshal.cpp	Updated marshaling and unmarshaling logic to use `Command`; enhanced error handling in tests; refined code comments.

Possibly related PRs

Use get_if instead of holds_alternative to avoid repetition #703: Replaces std::holds_alternative with std::get_if, enhancing type safety in handling Instruction types.
Reject binary ops where the source is an uninitialized register #740: Enhances type checking for binary operations, ensuring non-numeric registers are not used.
Enforce upper bound on loop count on all instructions in loops #745: Introduces a BoundedLoopCount struct to enforce loop iteration limits, relevant to control flow and error handling.
Restrict comparisons with pointers to legal operations #758: Refines pointer comparison logic in the AssertExtractor class, aligning with improvements in error handling.
Simplify configuration options #775: Restructures configuration options for clarity and maintainability, relevant to error handling improvements.
Use gsl::narrow in ebpf_domain.cpp #709: Uses gsl::narrow for safer type conversions in ebpf_domain.cpp, enhancing error handling.
Use gsl::narrow in asm files #710: Similar to Use gsl::narrow in ebpf_domain.cpp #709, focuses on safer type conversions in asm_files.cpp.
Integrate format code script #759: Integrates formatting scripts to ensure code quality, indirectly affecting error handling.
Don't permit truncating non-numbers #772: Prevents truncation of non-numeric values, directly related to error handling and type validation.
Use collapsable instructions in README.md #773: Restructures README for clarity, aiding in understanding error handling practices.
Simplify configuration options #775: Simplifies configuration options, aligning with improvements in code organization and clarity.

🐇 In the code where we play,
Commands lead the way,
From Instruction to Command,
We hop and make our stand!
Errors handled with great care,
In this rabbit's code, we share! 🐰

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 18

🧹 Outside diff range comments (2)

src/asm_marshal.hpp (1)
Line range hint 10-12: Add function documentation.

Consider adding documentation that explains:

The purpose of the marshal function

The expected format/content of the Command parameter

The meaning of the pc parameter

The format of the returned ebpf_inst vector
+/**
+ * Marshals a Command into a sequence of eBPF instructions.
+ * @param cmd The command to marshal
+ * @param pc The program counter value
+ * @return Vector of eBPF instructions representing the command
+ */
 std::vector<ebpf_inst> marshal(const Command& cmd, pc_t pc);
src/crab/cfg.hpp (1)
Line range hint 62-68: Update documentation and parameter type to use Command instead of Instruction.

The documentation and method signature should be updated to reflect the transition from Instruction to Command type, as indicated in the PR objectives.
-    /// Insert an Instruction at the front of the basic block.
+    /// Insert a Command at the front of the basic block.
     /// @note Cannot modify entry or exit blocks.
-    void insert_front(const Instruction& arg) {
+    void insert_front(const Command& arg) {

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between bd55b33 and 21add5e.

📒 Files selected for processing (22)

scripts/format-code (1 hunks)
src/asm_cfg.cpp (13 hunks)
src/asm_files.cpp (3 hunks)
src/asm_marshal.cpp (4 hunks)
src/asm_marshal.hpp (1 hunks)
src/asm_ostream.cpp (9 hunks)
src/asm_ostream.hpp (2 hunks)
src/asm_parse.cpp (2 hunks)
src/asm_parse.hpp (1 hunks)
src/asm_syntax.hpp (7 hunks)
src/asm_unmarshal.cpp (7 hunks)
src/asm_unmarshal.hpp (1 hunks)
src/assertions.cpp (6 hunks)
src/crab/cfg.hpp (2 hunks)
src/crab/ebpf_domain.cpp (3 hunks)
src/crab/ebpf_domain.hpp (1 hunks)
src/crab/fwd_analyzer.cpp (1 hunks)
src/crab/split_dbm.cpp (1 hunks)
src/crab/wto.cpp (1 hunks)
src/spec_type_descriptors.hpp (1 hunks)
src/test/ebpf_yaml.cpp (2 hunks)
src/test/test_marshal.cpp (3 hunks)

🧰 Additional context used

📓 Learnings (3)

src/crab/ebpf_domain.cpp (1)

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: src/crab/ebpf_domain.hpp:81-81
Timestamp: 2024-11-05T20:06:15.970Z
Learning: In this codebase, `operator()` method declarations in the `ebpf_domain_t` class typically do not have documentation comments, so adding such comments is unnecessary.

src/crab/ebpf_domain.hpp (1)

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: src/crab/ebpf_domain.hpp:81-81
Timestamp: 2024-11-05T20:06:15.970Z
Learning: In this codebase, `operator()` method declarations in the `ebpf_domain_t` class typically do not have documentation comments, so adding such comments is unnecessary.

src/crab/fwd_analyzer.cpp (1)

Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: src/crab/fwd_analyzer.hpp:14-14
Timestamp: 2024-11-05T20:59:02.521Z
Learning: When reviewing changes, avoid suggesting documentation additions for pre-existing functions (e.g., `run_forward_analyzer` in `src/crab/fwd_analyzer.hpp`) if they are outside the scope of the current PR.

🪛 cppcheck

src/test/test_marshal.cpp

[performance] 292-292: Function parameter 'expected_error_message' should be passed by const reference.

(passedByValue)

🔇 Additional comments (63)

src/asm_parse.hpp (2)

9-9: LGTM! Function signature change aligns with PR objectives.

The change from Instruction to Command return type is consistent with the PR's goal of restructuring instruction representation.

9-9: Verify all callers have been updated.

Since this is a public API change in a header file, we should verify that all callers of parse_instruction have been updated to handle the new Command return type.

✅ Verification successful

Let me analyze the results and generate additional verification scripts to ensure all callers have been properly updated.

Let me verify one more aspect to ensure the Command type is properly handled in all contexts.

All callers have been properly updated to handle the Command return type

The verification shows that:

The parse_instruction function is consistently declared and implemented to return Command type in all locations
All callers properly handle the Command return type:
- In src/asm_parse.cpp: Uses Command ins = parse_instruction(...)
- In src/test/ebpf_yaml.cpp: Uses const Command& ins = parse_instruction(...)
The Command type is a variant type that represents the actual instruction data, and the codebase shows proper type handling with std::holds_alternative checks
The relationship between Command and Instruction is well-defined in asm_syntax.hpp where Instruction contains Command as a member, showing this is an intentional design change

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all callers of parse_instruction to verify they've been updated
# to handle the Command return type instead of Instruction

# Search for function calls to parse_instruction
echo "Searching for parse_instruction callers:"
rg -p "parse_instruction\s*\(" --type cpp

# Search for any remaining references to Instruction type
# that might need updating
echo -e "\nSearching for potential missed Instruction type references:"
rg "\bInstruction\b" --type cpp

Length of output: 1708

Script:

#!/bin/bash
# Let's examine the actual implementation and usage context of parse_instruction
# to verify the return type handling

# First, check the implementation in asm_parse.cpp
echo "=== Implementation of parse_instruction ==="
rg -A 5 "Command parse_instruction.*\{" src/asm_parse.cpp

# Check the context of usage in test/ebpf_yaml.cpp
echo -e "\n=== Usage context in test/ebpf_yaml.cpp ==="
rg -B 2 -A 3 "parse_instruction\(" src/test/ebpf_yaml.cpp

# Check if Command type is properly included/used in these files
echo -e "\n=== Command type usage ==="
rg "Command" src/asm_parse.hpp src/asm_parse.cpp src/test/ebpf_yaml.cpp

Length of output: 1800

Script:

#!/bin/bash
# Let's check the Command type definition and its relationship with Instruction
echo "=== Command type definition ==="
rg -p "Command" --type=hpp

# Check for any type conversions or relationships between Command and Instruction
echo -e "\n=== Command and Instruction relationship ==="
rg -p "Command|Instruction" src/asm_syntax.hpp

Length of output: 987

src/asm_marshal.hpp (1)

10-10: Verify all callers have been updated.

The change from Instruction to Command type is aligned with the PR objectives. Let's verify that all callers have been updated accordingly.

✅ Verification successful

Let me analyze the search results:

The marshal function with Command type is defined in src/asm_marshal.hpp and implemented in src/asm_marshal.cpp.
All usages of marshal function in the codebase are with the new Command type:
- Direct usage in src/asm_marshal.cpp with Command parameter
- Test cases in src/test/test_marshal.cpp using Command and its derived types
- No instances found of using marshal with Instruction type

Let me verify the implementation to ensure it's consistent:

Based on the comprehensive analysis of the search results and implementation check:

All callers have been properly updated to use the new Command type

The verification confirms:

The function signature change from Instruction to Command has been consistently applied across the codebase
All usages of marshal function are with the new Command type
No legacy references to marshal with Instruction type remain
The implementation in src/asm_marshal.cpp correctly uses the new type

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for any remaining references to marshal function with Instruction type
# that might have been missed during the refactoring.

echo "Searching for potential missed marshal function calls..."
rg -l "marshal.*Instruction" --type cpp

echo "Searching for all marshal function calls for manual verification..."
rg -A 2 "marshal\(" --type cpp

Length of output: 19357

Script:

#!/bin/bash
# Check the implementation of marshal function
cat src/asm_marshal.cpp | head -n 20

Length of output: 511

src/spec_type_descriptors.hpp (1)

70-70: LGTM! Verify type consistency.

The comment update aligns with the broader refactoring from Instruction to Command. However, the prog vector still uses ebpf_inst type.

Let's verify if ebpf_inst has been updated elsewhere or if it needs attention:

src/asm_ostream.hpp (2)

42-43: LGTM: Core Command operator declarations

The base operator and conversion function declarations follow good C++ practices and support the transition from Instruction to Command.

70-71: Verify Assertion operator implementations

The declarations look good, but let's verify the corresponding implementations exist.

✅ Verification successful

Let me verify the implementations by checking their content.

Assertion operators are properly implemented

Both operator<< and to_string are implemented in src/asm_ostream.cpp with appropriate functionality:

operator<< uses a visitor pattern (AssertionPrinterVisitor) to handle different assertion types
to_string implementation correctly leverages the operator<< through a stringstream

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for Assertion operator implementations
echo "Searching for Assertion operator implementations..."
rg "operator<<.*Assertion.*\{" --type cpp

echo "Searching for Assertion to_string implementation..."
rg "to_string.*Assertion.*\{" --type cpp

Length of output: 473

Script:

#!/bin/bash
# Get the implementation details
echo "operator<< implementation:"
rg -A 5 "operator<<.*Assertion.*\{" --type cpp

echo -e "\nto_string implementation:"
rg -A 5 "to_string.*Assertion.*\{" --type cpp

Length of output: 784

src/crab/fwd_analyzer.cpp (1)

131-131: Address the design inconsistency highlighted in the TODO comment.

The TODO comment suggests that there's still an explicit call that hasn't been migrated to use the new Command-based approach. Since this PR aims to make assertions part of a generalized instruction using Commands, this inconsistency should be addressed.

Let's verify the impact on termination checking:

scripts/format-code (1)

177-180: ⚠️ Potential issue

Version compatibility check should not be commented out.

Removing the version check could lead to inconsistent formatting across the team if different clang-format versions are used. clang-format is known to produce different results across versions.

Consider either:

Removing the commented code entirely if version check is no longer needed
Restoring the version check with proper justification

Let's check if there are any formatting inconsistencies in recent PRs that might be related to different clang-format versions:

src/crab/ebpf_domain.hpp (2)

70-70: LGTM! Key change for assertion integration

This operator implements the PR's objective of making assertions part of instructions. The change aligns well with the existing current_assertion member variable.

Let's verify the assertion integration:

✅ Verification successful

LGTM! Assertion operator implementation is well integrated

The implementation in ebpf_domain.cpp shows proper handling of assertions:

Sets the current assertion context using to_string(assertion)
Visits the assertion using the visitor pattern
Properly cleans up by clearing the assertion context
Respects configuration through check_require and assume_assertions flags

The operator is also consistently integrated across the codebase:

Printing support in asm_ostream.cpp
Proper marshaling behavior in asm_marshal.cpp
Well-defined visitor pattern implementation

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the integration of assertions with instructions
# Look for assertion-related code patterns

# Search for assertion usage patterns
ast-grep --pattern 'operator()(const Assertion& $_) {
  $$$
}'

# Search for current_assertion usage
rg -p "current_assertion" --type cpp

Length of output: 476

Script:

#!/bin/bash
# Let's verify the assertion integration with more specific patterns

# Look for the implementation of operator()(const Assertion&)
ast-grep --pattern 'void operator()(const Assertion& $_) {
  $$$
}'

# Search for assertion-related code with context
rg -p "class Assertion" --type cpp -A 5
rg -p "struct Assertion" --type cpp -A 5

# Look for assertion usage in the domain implementation
rg -p "operator.*Assertion" --type cpp -A 5

Length of output: 1749

61-61: Verify const correctness of jump handling

Making operator()(const Jmp&) const suggests that jump handling no longer modifies the domain state. This change could impact how state transitions are managed during verification.

Let's verify the usage of this operator:

✅ Verification successful

Making operator()(const Jmp&) const is correct and safe

The verification results show that:

Jmp is a simple data structure (in src/asm_syntax.hpp) containing only a condition and target label
The implementation in src/crab/ebpf_domain.cpp:1396 is empty: void ebpf_domain_t::operator()(const Jmp&) const {}
Other parts of the codebase handle Jmp-related state modifications through different mechanisms (e.g., assertions.cpp has a const operator returning assertions)

Making this operator const is consistent with its empty implementation and the overall design where state changes are handled through the assertion system rather than during jump processing.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the const correctness change affects any existing call sites
# Look for places where the operator might be expected to modify state

# Search for calls to operator() with Jmp
rg -p "operator.*\(.*Jmp.*\)" --type cpp

# Search for Jmp-related state modifications
rg -p "Jmp.*state" --type cpp

Length of output: 476

Script:

#!/bin/bash
# Let's try a more comprehensive search for Jmp-related operations

# Search for the Jmp operator implementation
ast-grep --pattern 'void operator()(const Jmp& $_) {
  $$$
}'

# Search for Jmp class/struct definition and usage
rg -p "class.*Jmp|struct.*Jmp" --type cpp -A 5

# Search for any Jmp-related method calls
rg -p "\..*Jmp|->.*Jmp" --type cpp

# Look for operator() calls in general to understand the pattern
rg -p "operator\(\)" --type cpp

Length of output: 9549

src/crab/wto.cpp (2)

44-44: LGTM! Minor formatting improvement.

The added space before the closing brace in the constructor's initialization list improves code readability.

Line range hint 1-256: Verify if WTO implementation needs updates for Command type.

Since this PR changes the core instruction representation from Instruction to Command, we should verify if the WTO implementation needs any updates to handle the new type.

src/asm_marshal.cpp (3)

297-298: LGTM: Marshal function signature updated correctly.

The function signature change from Instruction to Command maintains the core marshaling functionality while aligning with the new type system.

Line range hint 301-309: LGTM: Size calculation properly adapted for Command type.

The size calculation logic remains correct while transitioning to the new Command type, properly handling special cases for LoadMapFd and lddw instructions.

208-208: LGTM: Assertion operator correctly prevents marshaling.

The implementation correctly prevents assertions from being marshaled into bytecode, as they are compile-time verification constructs.

Let's verify that assertions are only used for verification:

✅ Verification successful

Assertion marshaling implementation is correct

The implementation correctly prevents assertions from being marshaled into bytecode, consistent with other non-executable constructs like assumptions. The code follows the pattern of explicitly throwing exceptions for compile-time verification constructs that should not be converted to runtime bytecode.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any potential bytecode generation of assertions
rg -l "marshal.*assertion"

Length of output: 48

Script:

#!/bin/bash
# Let's examine the actual usage of assertions in the codebase
rg -B2 -A2 "marshal.*assertion" src/asm_marshal.cpp

# Also check for any other assertion-related code
ast-grep --pattern 'struct Assertion { $$$ }'

Length of output: 394

src/asm_syntax.hpp (7)

159-159: LGTM: Consistent terminology update.

The comment update from "instruction" to "Command" aligns with the new structural changes.

248-248: LGTM: Consistent terminology update.

The comment update from "instruction" to "Command" aligns with the new structural changes.

261-261: LGTM: Consistent terminology update.

The comment update from "instruction" to "Command" aligns with the new structural changes.

269-269: LGTM: Consistent terminology update.

The comment update from "instruction" to "Command" aligns with the new structural changes.

279-279: LGTM: Consistent terminology update.

The comment update from "instruction" to "Command" aligns with the new structural changes.

299-299: LGTM: Consistent terminology update.

The comment update from "instruction" to "Command" aligns with the new structural changes.

430-430: Verify the impact of changing pc_t from size_t to uint32_t.

The type change is well-documented for supporting 32-bit PC offsets, but let's verify there are no arithmetic issues.

✅ Verification successful

Let me generate additional verification scripts to check for potential issues with pc_t arithmetic operations more thoroughly.

Based on the code analysis, I can now provide a final response:

The change from size_t to uint32_t for pc_t type is safe

The code inspection shows that pc_t is used safely with proper bounds checking:

All arithmetic is bounded (pc + 1 + offset with bounds check)
No unchecked arithmetic operations or overflows
Used mainly for program counter tracking and jump offset calculations
Target validation occurs before any pc_t arithmetic (e.g., if (new_pc >= insts.size()))

The change to uint32_t aligns with the documented purpose of supporting 32-bit PC offsets in cpu=v4 and the implementation shows proper safeguards against overflow conditions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential issues with pc_t arithmetic
# Look for arithmetic operations, comparisons, or casts involving pc_t
# that might be affected by the type change

echo "Checking for pc_t arithmetic operations..."
rg -l "pc_t" | xargs rg "(\+|-|\*|/|<<|>>|<|>|<=|>=|==|!=)"

echo "Checking for size_t to pc_t conversions..."
rg "static_cast<(size_t|pc_t)>"

echo "Checking for pc_t array indexing..."
rg "pc_t.*\["

Length of output: 66140

Script:

#!/bin/bash
# Look for pc_t usage in arithmetic expressions and assignments
echo "Checking pc_t arithmetic usage..."
ast-grep --pattern 'pc_t $var = $expr;'

echo "Checking pc_t comparisons..."
ast-grep --pattern 'pc_t $var $op $expr'

echo "Checking pc_t function parameters..."
ast-grep --pattern 'void $func($$$, pc_t $param, $$$)'

# Check for any potential overflow checks
echo "Checking for overflow handling..."
rg -A 3 -B 3 "pc_t.*\+\+" 
rg -A 3 -B 3 "pc_t.*\-\-"
rg -A 3 -B 3 "pc_t.*\+"
rg -A 3 -B 3 "pc_t.*\-"

Length of output: 3827

src/assertions.cpp (12)

Line range hint 23-28: LGTM: Return type change aligns with PR objectives

The change from vector<Assert> to vector<Assertion> maintains the existing logic while aligning with the PR's goal of generalizing assertions.

40-47: LGTM: Consistent return type updates

Return type changes from vector<Assert> to vector<Assertion> are applied consistently across basic instruction operators.

50-59: LGTM: Proper return value validation maintained

The Exit operator correctly maintains the critical validation that ensures R0 contains a number when returning from the main function.

88-91: LGTM: Equivalent assertion handling for context pointers

The modified loop maintains the same functionality while adapting to the new Assertion type.

118-125: LGTM: Consistent operator updates

Return type changes are applied consistently while preserving the existing validation logic.

128-132: LGTM: Proper privilege checking maintained

The method correctly maintains privilege checking while adapting to the new assertion type.

179-186: LGTM: Consistent condition validation

The operators maintain proper condition validation while adapting to the new assertion type.

Line range hint 188-215: LGTM: Memory access validation preserved

The operator maintains comprehensive memory access validation, including stack pointer bounds checking and proper type constraints.

217-225: LGTM: Proper atomic operation validation

The operator correctly wraps constraints with the new Assertion type while maintaining all necessary validations.

227-229: LGTM: Proper unary operation validation

The operator correctly maintains number type constraint with the new assertion type.

231-287: LGTM: Comprehensive binary operation validation

The operator maintains proper type constraints and operation-specific validations while transitioning to the new assertion type.

291-305: LGTM: Proper Command type integration

The functions correctly adapt to use the new Command type while maintaining the same assertion extraction and application logic.

src/asm_cfg.cpp (7)

Line range hint 28-41: LGTM: Command type handling is correct.

The function properly handles all cases with the new Command type using appropriate variant access methods.

50-57: LGTM: CallLocal handling updated correctly.

The changes properly adapt the code to use Command type while maintaining the original logic.

76-81: LGTM: Exit and Call variant handling is correct.

The code properly handles both Exit and Call variants with the new Command type.

Line range hint 126-132: LGTM: CallLocal and stack frame handling is correct.

The code properly handles CallLocal variant and maintains important stack frame depth checks.

Line range hint 145-174: LGTM: CFG construction logic updated correctly.

The changes properly adapt the CFG construction logic to use Command type while maintaining all necessary checks and error handling.

Line range hint 276-348: LGTM: Instruction type classification updated correctly.

The function properly handles all instruction variants with the new Command type while maintaining the correct classification logic.

21-26: LGTM: Command type transition looks correct.

The function has been properly updated to use the new Command type while maintaining the same logic.

✅ Verification successful

Let me verify the caller to ensure it's passing a Command type.

Let me check the type of inst to ensure it's properly using Command.

Let me try a different approach to check the struct definition.

Let me check the definition of InstructionSeq and Instruction types.

Transition to Command type is correctly implemented

The verification confirms that:

The get_jump function correctly accepts a Command parameter
The caller passes inst.cmd which is correct as Instruction struct contains a Command cmd member
The type transition maintains type safety throughout the call chain

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify all callers of get_jump have been updated to pass Command
rg -p "get_jump\(" --type cpp

Length of output: 156

Script:

#!/bin/bash
# Check the context around the caller of get_jump
rg -p "get_jump\(" --type cpp -B 5 -A 2

Length of output: 549

Script:

#!/bin/bash
# Check the definition of inst and its type
ast-grep --pattern 'struct $_ {
  $$$
  Command cmd;
  $$$
}'

Length of output: 63

Script:

#!/bin/bash
# Search for struct definitions containing inst or command
rg -p "struct.*\{" --type cpp -B 1 -A 10 src/asm_cfg.cpp

Length of output: 1908

Script:

#!/bin/bash
# Look for the definition of InstructionSeq and Instruction
rg -p "struct (Instruction|InstructionSeq)" --type cpp -B 2 -A 5

Length of output: 367

src/test/ebpf_yaml.cpp (2)

185-185: LGTM! Comment updated for consistency.

The comment has been appropriately updated to use the new Command terminology, maintaining consistency with the codebase changes.

155-155: LGTM! Verify type consistency across codebase.

The type change from Instruction to Command aligns with the PR objective and appears safe.

Let's verify the type consistency:

✅ Verification successful

Let me gather more specific information about the usage context.

Based on the gathered information, I can now provide a final response:

Type change is consistent and correctly implemented

The verification shows that:

parse_instruction in src/asm_parse.cpp correctly returns Command type
The usage in src/test/ebpf_yaml.cpp correctly receives const Command&
The remaining Instruction references in other files are part of different subsystems (asm, crab) and are not related to this change

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all relevant functions have been updated to use Command instead of Instruction

# Test 1: Check for any remaining Instruction type usage
rg -l "Instruction[^a-zA-Z]" --type cpp

# Test 2: Verify Command type usage in parse_instruction
ast-grep --pattern 'Command parse_instruction'

Length of output: 269

Script:

#!/bin/bash
# Check the context of Instruction usage in relevant files
echo "=== Checking parse_instruction signature ==="
rg "parse_instruction" --type cpp -B 1 -A 1

echo -e "\n=== Checking Command type usage ==="
rg "Command\s+\w+" --type cpp

echo -e "\n=== Checking specific file context ==="
cat src/test/ebpf_yaml.cpp

Length of output: 19971

src/crab/cfg.hpp (3)

627-627: LGTM! Function signature aligns with PR objectives.

The updated signature correctly reflects the transition to using Command type and supports the goal of making assertions part of generalized instructions.

632-634: LGTM! Simplified operator declarations.

Good cleanup: removed redundant namespace prefixes since the operators are already in the crab namespace.

Line range hint 29-29: Consider updating remaining Instruction type usages.

The file still contains several uses of the Instruction type, particularly in the basic_block_t class where stmt_list_t is defined as std::vector<Instruction>. These should be updated to use Command type to maintain consistency with the PR objectives.

src/asm_files.cpp (3)

203-203: LGTM: Documentation update aligns with new terminology.

The comment update from "Instruction" to "Command" correctly reflects the architectural changes described in the PR objectives.

253-253: LGTM: Documentation update aligns with new terminology.

The comment update from "Instruction" to "Command" maintains consistency with the architectural changes.

Line range hint 483-490: LGTM: Documentation and line info propagation logic are correct.

The comment update maintains terminology consistency, and the line information propagation logic correctly handles instructions without explicit BTF data by copying from the previous instruction.

src/test/test_marshal.cpp (4)

235-235: LGTM: Command type transition

The change correctly accesses the command through the cmd field, aligning with the transition from Instruction to Command type.

253-253: LGTM: Consistent Command type usage

The change maintains consistency with the Command type transition pattern.

266-271: LGTM: Improved type safety in parsing

The changes enhance type safety by using std::get for parsing the instruction sequence.

279-281: LGTM: Clear documentation

The added comment clearly explains the purpose of the Command marshaling verification.

src/asm_parse.cpp (1)

136-136: ⚠️ Potential issue

Update return type of parse_program to CommandSeq.

The function parse_program still returns InstructionSeq, but it should return CommandSeq to reflect the new Command type introduced in the refactoring. This ensures consistency across the codebase.

Apply this diff to fix the return type:

-static InstructionSeq parse_program(std::istream& is) {
+static CommandSeq parse_program(std::istream& is) {

Likely invalid or redundant comment.

src/asm_unmarshal.cpp (7)

81-83: Function shift32 return type correctly updated to Command

The shift32 function's return type has been changed from Instruction to Command, aligning with the new instruction representation. This update ensures consistency across the codebase.

Line range hint 277-383: Function makeMemOp return type updated to Command with enhanced error handling

The makeMemOp function now returns a Command instead of an Instruction. The updated function includes robust error handling for invalid instructions and unsupported operations, improving the reliability of memory operations parsing.

Line range hint 384-417: Function makeAluOp return type updated to Command; division by zero check added

The makeAluOp function's return type has been updated to Command. It now includes a check for division by zero when allow_division_by_zero is not enabled, enhancing safety by detecting potential runtime errors.

Line range hint 397-413: Correct use of std::visit with overloaded lambdas for operation handling

The use of std::visit with the overloaded pattern effectively handles both Un::Op and Bin::Op variants. This implementation ensures that unary and binary operations are appropriately processed.

Line range hint 420-434: Function makeLddw return type updated to Command; improved instruction validation

The makeLddw function now returns a Command, consistent with the new instruction structure. Additional validation checks have been added to handle platform support and instruction integrity, increasing robustness when loading 64-bit immediate values.

Line range hint 591-703: Function makeJmp return type updated to Command; consistent jump instruction handling

The makeJmp function's return type is updated to Command. The function maintains consistent validation and processing of jump instructions, including calls and exits, in line with the updated instruction representation.

Line range hint 707-768: Variable new_ins type updated to Command in unmarshal function

In the unmarshal function, the variable new_ins has been changed from Instruction to Command. This update aligns with the overall refactoring and ensures that instruction parsing correctly uses the new Command type throughout the function.

src/crab/ebpf_domain.cpp (2)

1396-1398: Function correctly implemented as a no-op.

The operator()(const Jmp&) function appropriately acts as a placeholder for jump preconditions without introducing unnecessary operations.

1462-1463: Correct handling of helper function IDs as singleton intervals.

The code correctly processes helper function IDs when src_interval is a singleton, ensuring proper validation and invocation of helper functions only when the ID is definite and valid.

Also applies to: 1466-1466, 1472-1473

src/asm_marshal.hpp

src/spec_type_descriptors.hpp

src/asm_ostream.hpp

scripts/format-code

src/crab/ebpf_domain.hpp

src/test/test_marshal.cpp

src/asm_parse.cpp

src/asm_ostream.cpp

src/crab/ebpf_domain.cpp

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 21add5e and 01a02e7.

📒 Files selected for processing (2)

src/asm_ostream.hpp (4 hunks)
src/test/test_marshal.cpp (4 hunks)

🔇 Additional comments (7)

src/asm_ostream.hpp (4)

Line range hint 15-23: label_to_offset16 function updated appropriately

The function label_to_offset16 now takes const pc_t pc as a parameter, which is suitable. The lambda correctly calculates the offset for 16-bit jumps.

Line range hint 25-33: label_to_offset32 function updated appropriately

The function label_to_offset32 now accepts const pc_t pc and accurately computes the offset for 32-bit jumps when the offset doesn't fit in an int16_t.

42-43: Operator overloads for Command added correctly

The addition of operator<< and to_string overloads for Command aligns with the shift from Instruction to Command, ensuring proper stream output and string conversion.

57-58: Operator overloads for Assertion added correctly

The new operator<< and to_string functions for Assertion provide necessary functionality for handling assertions, enhancing code clarity and maintainability.

src/test/test_marshal.cpp (3)

253-253: Handle potential errors from unmarshal to prevent exceptions

Similar to the previous comment, in this function, ensure that you check the result of unmarshal before using std::get<InstructionSeq> to avoid exceptions when an error occurs.

266-266: Handle potential errors from unmarshal to prevent exceptions

Again, to prevent exceptions, verify the result of unmarshal before accessing it directly with std::get<InstructionSeq>.

235-235: ⚠️ Potential issue

Handle potential errors from unmarshal to prevent exceptions

The function unmarshal returns a std::variant<InstructionSeq, std::string>. Directly using std::get<InstructionSeq> without checking may throw an exception if unmarshal returns an error message. It's safer to check the result before accessing it to prevent potential exceptions.

Apply this change to safely handle the result:

-auto parsed = std::get<InstructionSeq>(unmarshal(raw_program{"", "", 0, "", {ins, exit, exit}, info}));
+auto result = unmarshal(raw_program{"", "", 0, "", {ins, exit, exit}, info});
+REQUIRE(std::holds_alternative<InstructionSeq>(result));
+InstructionSeq parsed = std::get<InstructionSeq>(result);

Likely invalid or redundant comment.

src/test/test_marshal.cpp

coveralls · 2024-11-07T22:12:20Z

coverage: 90.53% (-0.002%) from 90.532%
when pulling afd0a48 on extended-instruction
into bd55b33 on main.

elazarg · 2024-11-07T22:16:30Z

@Alan-Jowett what do you think about the change? One benefit is having more direct connection between the location of a Command and the location of an ebpf instruction.

In the future, I intend to

Split handling of commands and assertions with ebpf_domain_t into different classes, and keep only domain operations (join, meet, etc) in the ebpf_domain_t.
Potentially add a phase where the preconditions are moved backwards, merged and resolved as a special abstract domains by themselves. I've implemented a POC of this approach in the past, and it yields significant performance improvement, and simpler CFG for debugging (albeit losing track of the origin of each assertion). This is related to Idea: split basic blocks into preconditions+code #258, except I no longer think that basic blocks are particularly special.

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 01a02e7 and 59db668.

📒 Files selected for processing (3)

src/asm_marshal.cpp (3 hunks)
src/asm_ostream.cpp (9 hunks)
src/asm_syntax.hpp (2 hunks)

🔇 Additional comments (12)

src/asm_marshal.cpp (5)

295-297: LGTM: Command type transition looks correct.

The signature change from Instruction to Command is consistent with the PR objectives while maintaining the existing marshaling logic.

Line range hint 299-307: LGTM: Size calculation properly adapted for Command type.

The function correctly maintains its logic while transitioning to the new type system.

316-316: LGTM: Label mapping correctly adapted.

The change to access inst.cmd is consistent with the new instruction structure.

327-332: LGTM: Instruction sequence marshaling properly adapted.

The changes correctly handle Command access while maintaining the existing marshaling logic. The jump instruction processing is properly adapted to the new structure.

Line range hint 295-332: Verify performance impact of Command type transition.

While the changes are functionally correct, let's verify there's no significant performance regression in the marshaling process.

✅ Verification successful

Let me gather more information about the actual code changes and their potential impact.

Based on the code analysis, I can now provide a definitive response about the performance impact of the Command type transition.

No significant performance impact from Command type transition

The code analysis reveals that:

The marshal function is a simple visitor pattern implementation with direct mapping
The transition maintains O(1) complexity for single instruction marshaling
Test coverage shows extensive marshaling/unmarshaling tests with no performance-specific concerns
The Command type is used in a zero-copy manner without additional allocations

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for performance-related tests or benchmarks

# Search for performance tests
echo "Searching for performance tests..."
rg -l "benchmark|perf_test|performance" --type cpp

# Look for existing timing measurements
echo "Searching for timing-related code..."
rg "chrono|duration|timer" --type cpp

Length of output: 1258

Script:

#!/bin/bash
# Check the actual implementation differences
echo "Checking Command vs Instruction implementation..."
rg -B2 -A2 "struct (Command|Instruction)" --type cpp

# Look for any performance-critical loops or processing
echo "Checking marshal usage and processing..."
rg -B2 -A2 "marshal\(" --type cpp

# Look for test cases that might reveal complexity
echo "Checking test coverage..."
fd "test.*\.cpp$" --exec rg -l "marshal"

Length of output: 28540

src/asm_syntax.hpp (1)

430-430: Verify the implications of changing pc_t to uint32_t

Changing pc_t from size_t to uint32_t may have implications on systems where size_t is larger than 32 bits. Please ensure that this change doesn't affect the correctness or portability of the code, especially if there's a possibility of program counters exceeding 32-bit limits.

Run the following script to identify all usages of pc_t and check for potential issues related to its size:

✅ Verification successful

Based on the search results, I can now generate the final response since I have enough context about how pc_t is used throughout the codebase:

The change from size_t to uint32_t for pc_t is safe and intentionally designed

The code explicitly documents this design choice with the comment "cpu=v4 supports 32-bit PC offsets so we need a large enough type." The usage analysis shows that:

pc_t is primarily used for instruction indexing and offset calculations
All offset calculations are done using 16-bit or 32-bit signed integers
The codebase includes bounds checking (e.g., if (new_pc >= insts.size()))
No operations require addresses beyond 32-bit range

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all usages of pc_t and verify compatibility with uint32_t.

# Search for all definitions and usages of pc_t.
rg -A 3 -B 3 '\bpc_t\b'

Length of output: 7347

src/asm_ostream.cpp (6)

108-185: Implementation of AssertionPrinterVisitor is well-structured

The AssertionPrinterVisitor struct effectively organizes the printing logic for various assertion types, enhancing code maintainability and readability.

Line range hint 188-262: CommandPrinterVisitor correctly replaces InstructionPrinterVisitor

The new CommandPrinterVisitor struct accurately handles the printing of different command types, aligning with the updated instruction representation.

364-365: Overloaded operator<< for Command is properly updated

The stream insertion operator for Command now uses the CommandPrinterVisitor, ensuring consistent and correct output formatting.

375-377: Overloaded operator<< for Assertion utilizes AssertionPrinterVisitor effectively

The changes streamline the printing of assertions by using the visitor pattern, improving code organization.

386-386: Verify that size function covers all Command variants

The size function accounts for Bin commands with lddw and LoadMapFd. Ensure that any new or existing Command variants that may affect instruction size are handled appropriately.

434-441: Exception handling in jump instruction printing is appropriate

The logic for handling missing labels in jump instructions correctly throws a runtime error, which aids in identifying issues during runtime.

src/asm_ostream.cpp

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai

Actionable comments posted: 7

🧹 Outside diff range comments (4)

src/assertions.cpp (3)
Line range hint 61-91: Consider enhancing error handling for map_fd_reg assertion.

The implementation is solid, but the assertion on line 81 (assert(map_fd_reg)) could be enhanced with a more descriptive error message.
-                assert(map_fd_reg);
+                assert(map_fd_reg && "map_fd_reg must be set before PTR_TO_MAP_KEY/VALUE");
Line range hint 128-179: Consider extracting condition operation mapping to a lookup table.

The implementation is correct, but the boolean assignment for allow_pointers could be simplified using a lookup table.
static const std::unordered_map<Condition::Op, bool> ALLOWS_POINTERS = {
    {Condition::Op::EQ, true},
    {Condition::Op::NE, true},
    // ... other mappings
};
bool allow_pointers = ALLOWS_POINTERS.at(cond.op) && cond.is64;
Line range hint 188-215: Consider extracting magic number to a named constant.

The implementation is solid, but the stack size check could be more readable.
-            if (offset < -EBPF_SUBPROGRAM_STACK_SIZE || offset + static_cast<int>(width.v) > 0) {
+            const int MAX_NEGATIVE_STACK_OFFSET = -EBPF_SUBPROGRAM_STACK_SIZE;
+            if (offset < MAX_NEGATIVE_STACK_OFFSET || offset + static_cast<int>(width.v) > 0) {
src/asm_cfg.cpp (1)
Line range hint 276-312: Pass Instruction by const reference in instype function

The instype function currently takes Instruction ins by value. Since Instruction may be a non-trivial object, passing it by const reference avoids unnecessary copies and improves performance.

Apply this diff to pass by const reference:
-static std::string instype(Instruction ins) {
+static std::string instype(const Instruction& ins) {

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between 59db668 and bf98ef7.

📒 Files selected for processing (10)

src/asm_cfg.cpp (12 hunks)
src/asm_marshal.cpp (2 hunks)
src/asm_ostream.cpp (10 hunks)
src/asm_ostream.hpp (3 hunks)
src/asm_syntax.hpp (2 hunks)
src/asm_unmarshal.hpp (1 hunks)
src/assertions.cpp (7 hunks)
src/crab/cfg.hpp (3 hunks)
src/crab/ebpf_domain.cpp (3 hunks)
src/test/test_marshal.cpp (5 hunks)

🔇 Additional comments (31)

src/asm_unmarshal.hpp (1)

18-18: LGTM: Grammar fix in documentation.

The grammatical correction from "instruction" to "instructions" improves clarity.

src/asm_ostream.hpp (2)

15-15: LGTM! Improved const-correctness.

The addition of const to the parameter declaration is a good practice, making the immutability of pc explicit.

57-58: LGTM! Consistent with existing patterns.

The new declarations for Assertion type handling follow the established pattern in the file and support the PR's goal of integrating assertions into the instruction set.

Note: The previous suggestion about using templates to reduce operator overload duplication still applies here.

src/assertions.cpp (9)

40-47: LGTM: Consistent type changes across operator overloads.

The return type changes maintain behavioral consistency while aligning with the new type system.

50-58: LGTM: Sound validation logic for packet and exit operations.

The implementation properly handles both context pointer verification and return value type constraints.

118-126: LGTM: Proper constraint handling for call operations.

The implementation correctly enforces type and function constraints for callable entities.

181-186: LGTM: Clean implementation of jump condition handling.

217-223: LGTM: Comprehensive validation for atomic operations.

The implementation properly enforces type constraints and memory access validation for atomic operations.

227-229: LGTM: Clear and concise unary operation handling.

291-292: LGTM: Consistent type system update.

303-304: LGTM: Proper assertion application in CFG traversal.

23-24: LGTM: Type change aligns with PR objectives.

The change from vector<Assert> to vector<Assertion> is consistent with the PR's goal of generalizing instructions.

✅ Verification successful

Type transition from Assert to Assertion is complete and consistent

Based on the search results, the codebase shows a complete transition to the Assertion type. No remaining instances of vector<Assert> were found, and all related code (function returns, parameters, and usage) consistently uses the new Assertion type. The type is properly defined in src/asm_syntax.hpp as a variant type and is used consistently throughout the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify consistent usage of Assertion type
rg -l "vector<Assert>" .

Length of output: 26

Script:

#!/bin/bash
# Search for any remaining usage of Assert type
rg "Assert\b" -A 2 -B 2

Length of output: 25

Script:

#!/bin/bash
# Search for Assertion type usage to verify the transition
rg "Assertion\b" -A 2 -B 2

Length of output: 10817

Script:

#!/bin/bash
# Look for class/struct definition of Assert and Assertion
ast-grep --pattern 'class $_ {
  $$$
}'

ast-grep --pattern 'struct $_ {
  $$$
}'

Length of output: 84

src/crab/cfg.hpp (4)

Line range hint 56-68: LGTM! Method signatures updated consistently.

The insert and insert_front methods have been properly updated to handle GuardedInstruction. The documentation improvements for insert_front are helpful.

632-634: LGTM! Stream operators maintained correctly.

The stream operator declarations have been simplified while maintaining functionality for both block types.

627-627: Verify the implementation of get_assertions.

The function signature has been updated to handle the new assertion model. Please ensure the implementation properly handles all cases with the new parameters.

#!/bin/bash
# Find the implementation of get_assertions
ast-grep --pattern 'vector<Assertion> get_assertions(Instruction $_,$_,$_) {
  $$$
}'

42-42: Verify usage of GuardedInstruction across the codebase.

The change from vector<Instruction> to vector<GuardedInstruction> is fundamental to the PR's goal. This type change needs careful verification to ensure all consumers handle the new type correctly.

✅ Verification successful

Based on the search results, I can now generate the final response:

The type change from vector<Instruction> to vector<GuardedInstruction> appears to be safe.

The codebase shows that:

Instruction remains as a variant type (src/asm_syntax.hpp) and is now wrapped within GuardedInstruction which adds preconditions
All direct usages of Instruction are for processing individual instructions (parsing, marshaling, printing) rather than storing collections
No instances of std::vector<Instruction> were found, indicating this was likely the only occurrence that needed updating

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for Instruction type usage that might need updating
rg --type cpp "std::vector<Instruction>" -A 3
# Search for direct Instruction usage that might need updating
rg --type cpp "\bInstruction\b" -A 3

Length of output: 8692

src/test/test_marshal.cpp (5)

235-235: LGTM: Correctly updated to access the cmd member.

The change correctly reflects the transition from Instruction to Command type, maintaining consistency with the PR's objectives.

253-253: LGTM: Consistent update to access the cmd member.

The change maintains consistency with other similar modifications in the file.

266-271: LGTM: Correctly updated the 64-bit immediate instruction handling.

The changes properly handle the transition to the new Command type while maintaining the existing functionality for 64-bit immediate instructions.

289-289: LGTM: Consistent update to access the cmd member.

The change maintains consistency with other similar modifications in the file.

292-298: LGTM: Enhanced error handling with safer type checking.

The updated error handling now uses std::get_if for safer type checking when retrieving error messages. This is a good improvement in type safety.

src/asm_marshal.cpp (3)

316-316: Correctly updating size calculation with inst.cmd

The change to use inst.cmd in the size calculation aligns with the updated instruction representation.

327-327: Properly adjusting jump targets using ins.cmd

The update correctly modifies the jump target calculations to reflect the new Command structure.

330-330: Marshaling instructions with updated Command representation

The loop now correctly marshals ins.cmd, ensuring that instructions are processed appropriately with the updated structure.

src/asm_cfg.cpp (3)

145-147: Skip processing undefined instructions

The check for Undefined instructions at line 145 ensures that they are skipped during CFG construction. This is a good practice to prevent processing invalid or placeholder instructions.

Line range hint 126-131: Recursive call to add_cfg_nodes with stack frame depth check

The recursive call to add_cfg_nodes at line 129 includes a stack frame depth check to prevent excessive recursion, which could lead to a stack overflow.

370-372: Insertion of IncrementLoopCounter for loop detection

At line 370, inserting IncrementLoopCounter instructions at loop heads is a correct implementation for tracking loop iterations and aiding in termination checks.

src/asm_ostream.cpp (3)

108-185: Refactored assertion printing is correct.

The AssertionPrinterVisitor struct encapsulates the logic for printing different types of assertions using the visitor pattern effectively. This refactoring enhances code organization and readability. The operator overloads for various assertion types are correctly implemented.

Line range hint 188-378: Command printing updates are appropriate and consistent.

The CommandPrinterVisitor struct has been updated to handle the Command type instead of Instruction, reflecting the changes in the codebase. The visitor pattern is consistently applied, and the printing logic for different command types is correctly implemented. This ensures accurate and consistent output of assembly instructions.

375-380: Verify that all operator<< overloads use Command instead of Instruction.

To maintain consistency and prevent potential issues with outdated references, ensure that all overloads of operator<< have been updated to accept Command rather than Instruction.

Run the following script to identify any remaining operator<< overloads that use Instruction:

src/crab/ebpf_domain.cpp (1)

1198-1204: Potential exception safety issue with current_assertion.

If std::visit(*this, assertion); throws an exception, current_assertion may not be cleared, potentially leading to incorrect state in subsequent operations. Consider using a try-finally mechanism or an RAII guard to ensure current_assertion is cleared even if an exception occurs.

src/asm_unmarshal.hpp

src/assertions.cpp

src/asm_syntax.hpp

src/asm_cfg.cpp

src/crab/ebpf_domain.cpp

Signed-off-by: Elazar Gershuni <[email protected]>

Make the assertions part of the generalized instruction

21add5e

Each instruction is a pair (Command, vector<Assertion>). Jump instructions are retained in the nondeterministic graph, since they store the preconditions of the jump Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Nov 7, 2024

View reviewed changes

address review comments

01a02e7

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Nov 7, 2024

View reviewed changes

src/test/test_marshal.cpp Show resolved Hide resolved

src/test/test_marshal.cpp Show resolved Hide resolved

minor

59db668

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Nov 7, 2024

View reviewed changes

src/asm_ostream.cpp Show resolved Hide resolved

elazarg added 6 commits November 8, 2024 14:31

rename: keep Instruction, introduce GuardedInstruction

31ca739

Signed-off-by: Elazar Gershuni <[email protected]>

minor

7862db8

Signed-off-by: Elazar Gershuni <[email protected]>

minor

c72754c

Signed-off-by: Elazar Gershuni <[email protected]>

minor

4f61041

Signed-off-by: Elazar Gershuni <[email protected]>

minor

bf98ef7

Signed-off-by: Elazar Gershuni <[email protected]>

InstructionSeq is Instruction, not GuardedInstruction

9e61378

Signed-off-by: Elazar Gershuni <[email protected]>

coderabbitai bot reviewed Nov 8, 2024

View reviewed changes

elazarg added 2 commits November 8, 2024 14:50

move InstructionSeq close to Instruction

09c297b

Signed-off-by: Elazar Gershuni <[email protected]>

review comments

afd0a48

Signed-off-by: Elazar Gershuni <[email protected]>

elazarg merged commit f07d66a into main Nov 8, 2024
19 checks passed

elazarg deleted the extended-instruction branch November 8, 2024 13:32

This was referenced Nov 9, 2024

Fix bpf2bpf local function calls with stack usage #784

Merged

Split ebpf_domain_t into domain, transformer, checker #787

Merged

Use GuardedInstruction for cfg_t node, instead of basic_block #789

Merged

This was referenced Nov 20, 2024

Simplify invariant output #798

Merged

Print simplified cfg #799

Merged

coderabbitai bot mentioned this pull request Jan 7, 2025

Add support for 64bit immediate with type 2 #820

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make the assertions part of a generalized instruction #779

Make the assertions part of a generalized instruction #779

elazarg commented Nov 7, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 7, 2024 •

edited

Loading

Rate limit exceeded

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot left a comment

coveralls commented Nov 7, 2024 •

edited

Loading

elazarg commented Nov 7, 2024

coderabbitai bot left a comment

coderabbitai bot left a comment

Make the assertions part of a generalized instruction #779

Make the assertions part of a generalized instruction #779

Conversation

elazarg commented Nov 7, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Nov 7, 2024 • edited Loading

Rate limit exceeded

Walkthrough

Changes

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coveralls commented Nov 7, 2024 • edited Loading

elazarg commented Nov 7, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

elazarg commented Nov 7, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 7, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

coveralls commented Nov 7, 2024 •

edited

Loading