-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the assertions part of a generalized instruction #779
Conversation
Each instruction is a pair (Command, vector<Assertion>). Jump instructions are retained in the nondeterministic graph, since they store the preconditions of the jump Signed-off-by: Elazar Gershuni <[email protected]>
Warning Rate limit exceeded@elazarg has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 8 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe pull request introduces significant changes across multiple files, primarily focusing on the transition from the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
🧹 Outside diff range comments (2)
src/asm_marshal.hpp (1)
Line range hint
10-12
: Add function documentation.Consider adding documentation that explains:
- The purpose of the marshal function
- The expected format/content of the Command parameter
- The meaning of the pc parameter
- The format of the returned ebpf_inst vector
+/** + * Marshals a Command into a sequence of eBPF instructions. + * @param cmd The command to marshal + * @param pc The program counter value + * @return Vector of eBPF instructions representing the command + */ std::vector<ebpf_inst> marshal(const Command& cmd, pc_t pc);src/crab/cfg.hpp (1)
Line range hint
62-68
: Update documentation and parameter type to useCommand
instead ofInstruction
.The documentation and method signature should be updated to reflect the transition from
Instruction
toCommand
type, as indicated in the PR objectives.- /// Insert an Instruction at the front of the basic block. + /// Insert a Command at the front of the basic block. /// @note Cannot modify entry or exit blocks. - void insert_front(const Instruction& arg) { + void insert_front(const Command& arg) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (22)
scripts/format-code
(1 hunks)src/asm_cfg.cpp
(13 hunks)src/asm_files.cpp
(3 hunks)src/asm_marshal.cpp
(4 hunks)src/asm_marshal.hpp
(1 hunks)src/asm_ostream.cpp
(9 hunks)src/asm_ostream.hpp
(2 hunks)src/asm_parse.cpp
(2 hunks)src/asm_parse.hpp
(1 hunks)src/asm_syntax.hpp
(7 hunks)src/asm_unmarshal.cpp
(7 hunks)src/asm_unmarshal.hpp
(1 hunks)src/assertions.cpp
(6 hunks)src/crab/cfg.hpp
(2 hunks)src/crab/ebpf_domain.cpp
(3 hunks)src/crab/ebpf_domain.hpp
(1 hunks)src/crab/fwd_analyzer.cpp
(1 hunks)src/crab/split_dbm.cpp
(1 hunks)src/crab/wto.cpp
(1 hunks)src/spec_type_descriptors.hpp
(1 hunks)src/test/ebpf_yaml.cpp
(2 hunks)src/test/test_marshal.cpp
(3 hunks)
🧰 Additional context used
📓 Learnings (3)
src/crab/ebpf_domain.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: src/crab/ebpf_domain.hpp:81-81
Timestamp: 2024-11-05T20:06:15.970Z
Learning: In this codebase, `operator()` method declarations in the `ebpf_domain_t` class typically do not have documentation comments, so adding such comments is unnecessary.
src/crab/ebpf_domain.hpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: src/crab/ebpf_domain.hpp:81-81
Timestamp: 2024-11-05T20:06:15.970Z
Learning: In this codebase, `operator()` method declarations in the `ebpf_domain_t` class typically do not have documentation comments, so adding such comments is unnecessary.
src/crab/fwd_analyzer.cpp (1)
Learnt from: Alan-Jowett
PR: vbpf/ebpf-verifier#745
File: src/crab/fwd_analyzer.hpp:14-14
Timestamp: 2024-11-05T20:59:02.521Z
Learning: When reviewing changes, avoid suggesting documentation additions for pre-existing functions (e.g., `run_forward_analyzer` in `src/crab/fwd_analyzer.hpp`) if they are outside the scope of the current PR.
🪛 cppcheck
src/test/test_marshal.cpp
[performance] 292-292: Function parameter 'expected_error_message' should be passed by const reference.
(passedByValue)
🔇 Additional comments (63)
src/asm_parse.hpp (2)
9-9
: LGTM! Function signature change aligns with PR objectives.
The change from Instruction
to Command
return type is consistent with the PR's goal of restructuring instruction representation.
9-9
: Verify all callers have been updated.
Since this is a public API change in a header file, we should verify that all callers of parse_instruction
have been updated to handle the new Command
return type.
✅ Verification successful
Let me analyze the results and generate additional verification scripts to ensure all callers have been properly updated.
Let me verify one more aspect to ensure the Command type is properly handled in all contexts.
All callers have been properly updated to handle the Command return type
The verification shows that:
- The
parse_instruction
function is consistently declared and implemented to returnCommand
type in all locations - All callers properly handle the
Command
return type:- In
src/asm_parse.cpp
: UsesCommand ins = parse_instruction(...)
- In
src/test/ebpf_yaml.cpp
: Usesconst Command& ins = parse_instruction(...)
- In
- The
Command
type is a variant type that represents the actual instruction data, and the codebase shows proper type handling withstd::holds_alternative
checks - The relationship between
Command
andInstruction
is well-defined inasm_syntax.hpp
whereInstruction
containsCommand
as a member, showing this is an intentional design change
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all callers of parse_instruction to verify they've been updated
# to handle the Command return type instead of Instruction
# Search for function calls to parse_instruction
echo "Searching for parse_instruction callers:"
rg -p "parse_instruction\s*\(" --type cpp
# Search for any remaining references to Instruction type
# that might need updating
echo -e "\nSearching for potential missed Instruction type references:"
rg "\bInstruction\b" --type cpp
Length of output: 1708
Script:
#!/bin/bash
# Let's examine the actual implementation and usage context of parse_instruction
# to verify the return type handling
# First, check the implementation in asm_parse.cpp
echo "=== Implementation of parse_instruction ==="
rg -A 5 "Command parse_instruction.*\{" src/asm_parse.cpp
# Check the context of usage in test/ebpf_yaml.cpp
echo -e "\n=== Usage context in test/ebpf_yaml.cpp ==="
rg -B 2 -A 3 "parse_instruction\(" src/test/ebpf_yaml.cpp
# Check if Command type is properly included/used in these files
echo -e "\n=== Command type usage ==="
rg "Command" src/asm_parse.hpp src/asm_parse.cpp src/test/ebpf_yaml.cpp
Length of output: 1800
Script:
#!/bin/bash
# Let's check the Command type definition and its relationship with Instruction
echo "=== Command type definition ==="
rg -p "Command" --type=hpp
# Check for any type conversions or relationships between Command and Instruction
echo -e "\n=== Command and Instruction relationship ==="
rg -p "Command|Instruction" src/asm_syntax.hpp
Length of output: 987
src/asm_marshal.hpp (1)
10-10
: Verify all callers have been updated.
The change from Instruction
to Command
type is aligned with the PR objectives. Let's verify that all callers have been updated accordingly.
✅ Verification successful
Let me analyze the search results:
- The
marshal
function withCommand
type is defined insrc/asm_marshal.hpp
and implemented insrc/asm_marshal.cpp
. - All usages of
marshal
function in the codebase are with the newCommand
type:- Direct usage in
src/asm_marshal.cpp
withCommand
parameter - Test cases in
src/test/test_marshal.cpp
usingCommand
and its derived types - No instances found of using
marshal
withInstruction
type
- Direct usage in
Let me verify the implementation to ensure it's consistent:
Based on the comprehensive analysis of the search results and implementation check:
All callers have been properly updated to use the new Command
type
The verification confirms:
- The function signature change from
Instruction
toCommand
has been consistently applied across the codebase - All usages of
marshal
function are with the newCommand
type - No legacy references to
marshal
withInstruction
type remain - The implementation in
src/asm_marshal.cpp
correctly uses the new type
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for any remaining references to marshal function with Instruction type
# that might have been missed during the refactoring.
echo "Searching for potential missed marshal function calls..."
rg -l "marshal.*Instruction" --type cpp
echo "Searching for all marshal function calls for manual verification..."
rg -A 2 "marshal\(" --type cpp
Length of output: 19357
Script:
#!/bin/bash
# Check the implementation of marshal function
cat src/asm_marshal.cpp | head -n 20
Length of output: 511
src/spec_type_descriptors.hpp (1)
70-70
: LGTM! Verify type consistency.
The comment update aligns with the broader refactoring from Instruction
to Command
. However, the prog
vector still uses ebpf_inst
type.
Let's verify if ebpf_inst
has been updated elsewhere or if it needs attention:
src/asm_ostream.hpp (2)
42-43
: LGTM: Core Command operator declarations
The base operator and conversion function declarations follow good C++ practices and support the transition from Instruction
to Command
.
70-71
: Verify Assertion operator implementations
The declarations look good, but let's verify the corresponding implementations exist.
✅ Verification successful
Let me verify the implementations by checking their content.
Assertion operators are properly implemented
Both operator<< and to_string are implemented in src/asm_ostream.cpp with appropriate functionality:
- operator<< uses a visitor pattern (AssertionPrinterVisitor) to handle different assertion types
- to_string implementation correctly leverages the operator<< through a stringstream
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for Assertion operator implementations
echo "Searching for Assertion operator implementations..."
rg "operator<<.*Assertion.*\{" --type cpp
echo "Searching for Assertion to_string implementation..."
rg "to_string.*Assertion.*\{" --type cpp
Length of output: 473
Script:
#!/bin/bash
# Get the implementation details
echo "operator<< implementation:"
rg -A 5 "operator<<.*Assertion.*\{" --type cpp
echo -e "\nto_string implementation:"
rg -A 5 "to_string.*Assertion.*\{" --type cpp
Length of output: 784
src/crab/fwd_analyzer.cpp (1)
131-131
: Address the design inconsistency highlighted in the TODO comment.
The TODO comment suggests that there's still an explicit call that hasn't been migrated to use the new Command-based approach. Since this PR aims to make assertions part of a generalized instruction using Commands, this inconsistency should be addressed.
Let's verify the impact on termination checking:
scripts/format-code (1)
177-180
:
Version compatibility check should not be commented out.
Removing the version check could lead to inconsistent formatting across the team if different clang-format versions are used. clang-format is known to produce different results across versions.
Consider either:
- Removing the commented code entirely if version check is no longer needed
- Restoring the version check with proper justification
Let's check if there are any formatting inconsistencies in recent PRs that might be related to different clang-format versions:
src/crab/ebpf_domain.hpp (2)
70-70
: LGTM! Key change for assertion integration
This operator implements the PR's objective of making assertions part of instructions. The change aligns well with the existing current_assertion
member variable.
Let's verify the assertion integration:
✅ Verification successful
LGTM! Assertion operator implementation is well integrated
The implementation in ebpf_domain.cpp
shows proper handling of assertions:
- Sets the current assertion context using
to_string(assertion)
- Visits the assertion using the visitor pattern
- Properly cleans up by clearing the assertion context
- Respects configuration through
check_require
andassume_assertions
flags
The operator is also consistently integrated across the codebase:
- Printing support in
asm_ostream.cpp
- Proper marshaling behavior in
asm_marshal.cpp
- Well-defined visitor pattern implementation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the integration of assertions with instructions
# Look for assertion-related code patterns
# Search for assertion usage patterns
ast-grep --pattern 'operator()(const Assertion& $_) {
$$$
}'
# Search for current_assertion usage
rg -p "current_assertion" --type cpp
Length of output: 476
Script:
#!/bin/bash
# Let's verify the assertion integration with more specific patterns
# Look for the implementation of operator()(const Assertion&)
ast-grep --pattern 'void operator()(const Assertion& $_) {
$$$
}'
# Search for assertion-related code with context
rg -p "class Assertion" --type cpp -A 5
rg -p "struct Assertion" --type cpp -A 5
# Look for assertion usage in the domain implementation
rg -p "operator.*Assertion" --type cpp -A 5
Length of output: 1749
61-61
: Verify const correctness of jump handling
Making operator()(const Jmp&)
const suggests that jump handling no longer modifies the domain state. This change could impact how state transitions are managed during verification.
Let's verify the usage of this operator:
✅ Verification successful
Making operator()(const Jmp&)
const is correct and safe
The verification results show that:
Jmp
is a simple data structure (insrc/asm_syntax.hpp
) containing only a condition and target label- The implementation in
src/crab/ebpf_domain.cpp:1396
is empty:void ebpf_domain_t::operator()(const Jmp&) const {}
- Other parts of the codebase handle Jmp-related state modifications through different mechanisms (e.g.,
assertions.cpp
has a const operator returning assertions)
Making this operator const is consistent with its empty implementation and the overall design where state changes are handled through the assertion system rather than during jump processing.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the const correctness change affects any existing call sites
# Look for places where the operator might be expected to modify state
# Search for calls to operator() with Jmp
rg -p "operator.*\(.*Jmp.*\)" --type cpp
# Search for Jmp-related state modifications
rg -p "Jmp.*state" --type cpp
Length of output: 476
Script:
#!/bin/bash
# Let's try a more comprehensive search for Jmp-related operations
# Search for the Jmp operator implementation
ast-grep --pattern 'void operator()(const Jmp& $_) {
$$$
}'
# Search for Jmp class/struct definition and usage
rg -p "class.*Jmp|struct.*Jmp" --type cpp -A 5
# Search for any Jmp-related method calls
rg -p "\..*Jmp|->.*Jmp" --type cpp
# Look for operator() calls in general to understand the pattern
rg -p "operator\(\)" --type cpp
Length of output: 9549
src/crab/wto.cpp (2)
44-44
: LGTM! Minor formatting improvement.
The added space before the closing brace in the constructor's initialization list improves code readability.
Line range hint 1-256
: Verify if WTO implementation needs updates for Command type.
Since this PR changes the core instruction representation from Instruction
to Command
, we should verify if the WTO implementation needs any updates to handle the new type.
src/asm_marshal.cpp (3)
297-298
: LGTM: Marshal function signature updated correctly.
The function signature change from Instruction
to Command
maintains the core marshaling functionality while aligning with the new type system.
Line range hint 301-309
: LGTM: Size calculation properly adapted for Command type.
The size calculation logic remains correct while transitioning to the new Command type, properly handling special cases for LoadMapFd and lddw instructions.
208-208
: LGTM: Assertion operator correctly prevents marshaling.
The implementation correctly prevents assertions from being marshaled into bytecode, as they are compile-time verification constructs.
Let's verify that assertions are only used for verification:
✅ Verification successful
Assertion marshaling implementation is correct
The implementation correctly prevents assertions from being marshaled into bytecode, consistent with other non-executable constructs like assumptions. The code follows the pattern of explicitly throwing exceptions for compile-time verification constructs that should not be converted to runtime bytecode.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any potential bytecode generation of assertions
rg -l "marshal.*assertion"
Length of output: 48
Script:
#!/bin/bash
# Let's examine the actual usage of assertions in the codebase
rg -B2 -A2 "marshal.*assertion" src/asm_marshal.cpp
# Also check for any other assertion-related code
ast-grep --pattern 'struct Assertion { $$$ }'
Length of output: 394
src/asm_syntax.hpp (7)
159-159
: LGTM: Consistent terminology update.
The comment update from "instruction" to "Command" aligns with the new structural changes.
248-248
: LGTM: Consistent terminology update.
The comment update from "instruction" to "Command" aligns with the new structural changes.
261-261
: LGTM: Consistent terminology update.
The comment update from "instruction" to "Command" aligns with the new structural changes.
269-269
: LGTM: Consistent terminology update.
The comment update from "instruction" to "Command" aligns with the new structural changes.
279-279
: LGTM: Consistent terminology update.
The comment update from "instruction" to "Command" aligns with the new structural changes.
299-299
: LGTM: Consistent terminology update.
The comment update from "instruction" to "Command" aligns with the new structural changes.
430-430
: Verify the impact of changing pc_t from size_t to uint32_t.
The type change is well-documented for supporting 32-bit PC offsets, but let's verify there are no arithmetic issues.
✅ Verification successful
Let me generate additional verification scripts to check for potential issues with pc_t arithmetic operations more thoroughly.
Based on the code analysis, I can now provide a final response:
The change from size_t to uint32_t for pc_t type is safe
The code inspection shows that pc_t is used safely with proper bounds checking:
- All arithmetic is bounded (pc + 1 + offset with bounds check)
- No unchecked arithmetic operations or overflows
- Used mainly for program counter tracking and jump offset calculations
- Target validation occurs before any pc_t arithmetic (e.g.,
if (new_pc >= insts.size())
)
The change to uint32_t aligns with the documented purpose of supporting 32-bit PC offsets in cpu=v4 and the implementation shows proper safeguards against overflow conditions.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for potential issues with pc_t arithmetic
# Look for arithmetic operations, comparisons, or casts involving pc_t
# that might be affected by the type change
echo "Checking for pc_t arithmetic operations..."
rg -l "pc_t" | xargs rg "(\+|-|\*|/|<<|>>|<|>|<=|>=|==|!=)"
echo "Checking for size_t to pc_t conversions..."
rg "static_cast<(size_t|pc_t)>"
echo "Checking for pc_t array indexing..."
rg "pc_t.*\["
Length of output: 66140
Script:
#!/bin/bash
# Look for pc_t usage in arithmetic expressions and assignments
echo "Checking pc_t arithmetic usage..."
ast-grep --pattern 'pc_t $var = $expr;'
echo "Checking pc_t comparisons..."
ast-grep --pattern 'pc_t $var $op $expr'
echo "Checking pc_t function parameters..."
ast-grep --pattern 'void $func($$$, pc_t $param, $$$)'
# Check for any potential overflow checks
echo "Checking for overflow handling..."
rg -A 3 -B 3 "pc_t.*\+\+"
rg -A 3 -B 3 "pc_t.*\-\-"
rg -A 3 -B 3 "pc_t.*\+"
rg -A 3 -B 3 "pc_t.*\-"
Length of output: 3827
src/assertions.cpp (12)
Line range hint 23-28
: LGTM: Return type change aligns with PR objectives
The change from vector<Assert>
to vector<Assertion>
maintains the existing logic while aligning with the PR's goal of generalizing assertions.
40-47
: LGTM: Consistent return type updates
Return type changes from vector<Assert>
to vector<Assertion>
are applied consistently across basic instruction operators.
50-59
: LGTM: Proper return value validation maintained
The Exit operator correctly maintains the critical validation that ensures R0 contains a number when returning from the main function.
88-91
: LGTM: Equivalent assertion handling for context pointers
The modified loop maintains the same functionality while adapting to the new Assertion
type.
118-125
: LGTM: Consistent operator updates
Return type changes are applied consistently while preserving the existing validation logic.
128-132
: LGTM: Proper privilege checking maintained
The method correctly maintains privilege checking while adapting to the new assertion type.
179-186
: LGTM: Consistent condition validation
The operators maintain proper condition validation while adapting to the new assertion type.
Line range hint 188-215
: LGTM: Memory access validation preserved
The operator maintains comprehensive memory access validation, including stack pointer bounds checking and proper type constraints.
217-225
: LGTM: Proper atomic operation validation
The operator correctly wraps constraints with the new Assertion
type while maintaining all necessary validations.
227-229
: LGTM: Proper unary operation validation
The operator correctly maintains number type constraint with the new assertion type.
231-287
: LGTM: Comprehensive binary operation validation
The operator maintains proper type constraints and operation-specific validations while transitioning to the new assertion type.
291-305
: LGTM: Proper Command type integration
The functions correctly adapt to use the new Command
type while maintaining the same assertion extraction and application logic.
src/asm_cfg.cpp (7)
Line range hint 28-41
: LGTM: Command type handling is correct.
The function properly handles all cases with the new Command type using appropriate variant access methods.
50-57
: LGTM: CallLocal handling updated correctly.
The changes properly adapt the code to use Command type while maintaining the original logic.
76-81
: LGTM: Exit and Call variant handling is correct.
The code properly handles both Exit and Call variants with the new Command type.
Line range hint 126-132
: LGTM: CallLocal and stack frame handling is correct.
The code properly handles CallLocal variant and maintains important stack frame depth checks.
Line range hint 145-174
: LGTM: CFG construction logic updated correctly.
The changes properly adapt the CFG construction logic to use Command type while maintaining all necessary checks and error handling.
Line range hint 276-348
: LGTM: Instruction type classification updated correctly.
The function properly handles all instruction variants with the new Command type while maintaining the correct classification logic.
21-26
: LGTM: Command type transition looks correct.
The function has been properly updated to use the new Command type while maintaining the same logic.
✅ Verification successful
Let me verify the caller to ensure it's passing a Command type.
Let me check the type of inst
to ensure it's properly using Command.
Let me try a different approach to check the struct definition.
Let me check the definition of InstructionSeq and Instruction types.
Transition to Command type is correctly implemented
The verification confirms that:
- The
get_jump
function correctly accepts aCommand
parameter - The caller passes
inst.cmd
which is correct asInstruction
struct contains aCommand cmd
member - The type transition maintains type safety throughout the call chain
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify all callers of get_jump have been updated to pass Command
rg -p "get_jump\(" --type cpp
Length of output: 156
Script:
#!/bin/bash
# Check the context around the caller of get_jump
rg -p "get_jump\(" --type cpp -B 5 -A 2
Length of output: 549
Script:
#!/bin/bash
# Check the definition of inst and its type
ast-grep --pattern 'struct $_ {
$$$
Command cmd;
$$$
}'
Length of output: 63
Script:
#!/bin/bash
# Search for struct definitions containing inst or command
rg -p "struct.*\{" --type cpp -B 1 -A 10 src/asm_cfg.cpp
Length of output: 1908
Script:
#!/bin/bash
# Look for the definition of InstructionSeq and Instruction
rg -p "struct (Instruction|InstructionSeq)" --type cpp -B 2 -A 5
Length of output: 367
src/test/ebpf_yaml.cpp (2)
185-185
: LGTM! Comment updated for consistency.
The comment has been appropriately updated to use the new Command
terminology, maintaining consistency with the codebase changes.
155-155
: LGTM! Verify type consistency across codebase.
The type change from Instruction
to Command
aligns with the PR objective and appears safe.
Let's verify the type consistency:
✅ Verification successful
Let me gather more specific information about the usage context.
Based on the gathered information, I can now provide a final response:
Type change is consistent and correctly implemented
The verification shows that:
parse_instruction
insrc/asm_parse.cpp
correctly returnsCommand
type- The usage in
src/test/ebpf_yaml.cpp
correctly receivesconst Command&
- The remaining
Instruction
references in other files are part of different subsystems (asm, crab) and are not related to this change
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that all relevant functions have been updated to use Command instead of Instruction
# Test 1: Check for any remaining Instruction type usage
rg -l "Instruction[^a-zA-Z]" --type cpp
# Test 2: Verify Command type usage in parse_instruction
ast-grep --pattern 'Command parse_instruction'
Length of output: 269
Script:
#!/bin/bash
# Check the context of Instruction usage in relevant files
echo "=== Checking parse_instruction signature ==="
rg "parse_instruction" --type cpp -B 1 -A 1
echo -e "\n=== Checking Command type usage ==="
rg "Command\s+\w+" --type cpp
echo -e "\n=== Checking specific file context ==="
cat src/test/ebpf_yaml.cpp
Length of output: 19971
src/crab/cfg.hpp (3)
627-627
: LGTM! Function signature aligns with PR objectives.
The updated signature correctly reflects the transition to using Command
type and supports the goal of making assertions part of generalized instructions.
632-634
: LGTM! Simplified operator declarations.
Good cleanup: removed redundant namespace prefixes since the operators are already in the crab namespace.
Line range hint 29-29
: Consider updating remaining Instruction
type usages.
The file still contains several uses of the Instruction
type, particularly in the basic_block_t
class where stmt_list_t
is defined as std::vector<Instruction>
. These should be updated to use Command
type to maintain consistency with the PR objectives.
src/asm_files.cpp (3)
203-203
: LGTM: Documentation update aligns with new terminology.
The comment update from "Instruction" to "Command" correctly reflects the architectural changes described in the PR objectives.
253-253
: LGTM: Documentation update aligns with new terminology.
The comment update from "Instruction" to "Command" maintains consistency with the architectural changes.
Line range hint 483-490
: LGTM: Documentation and line info propagation logic are correct.
The comment update maintains terminology consistency, and the line information propagation logic correctly handles instructions without explicit BTF data by copying from the previous instruction.
src/test/test_marshal.cpp (4)
235-235
: LGTM: Command type transition
The change correctly accesses the command through the cmd
field, aligning with the transition from Instruction
to Command
type.
253-253
: LGTM: Consistent Command type usage
The change maintains consistency with the Command type transition pattern.
266-271
: LGTM: Improved type safety in parsing
The changes enhance type safety by using std::get for parsing the instruction sequence.
279-281
: LGTM: Clear documentation
The added comment clearly explains the purpose of the Command marshaling verification.
src/asm_parse.cpp (1)
136-136
:
Update return type of parse_program
to CommandSeq
.
The function parse_program
still returns InstructionSeq
, but it should return CommandSeq
to reflect the new Command
type introduced in the refactoring. This ensures consistency across the codebase.
Apply this diff to fix the return type:
-static InstructionSeq parse_program(std::istream& is) {
+static CommandSeq parse_program(std::istream& is) {
Likely invalid or redundant comment.
src/asm_unmarshal.cpp (7)
81-83
: Function shift32
return type correctly updated to Command
The shift32
function's return type has been changed from Instruction
to Command
, aligning with the new instruction representation. This update ensures consistency across the codebase.
Line range hint 277-383
: Function makeMemOp
return type updated to Command
with enhanced error handling
The makeMemOp
function now returns a Command
instead of an Instruction
. The updated function includes robust error handling for invalid instructions and unsupported operations, improving the reliability of memory operations parsing.
Line range hint 384-417
: Function makeAluOp
return type updated to Command
; division by zero check added
The makeAluOp
function's return type has been updated to Command
. It now includes a check for division by zero when allow_division_by_zero
is not enabled, enhancing safety by detecting potential runtime errors.
Line range hint 397-413
: Correct use of std::visit
with overloaded lambdas for operation handling
The use of std::visit
with the overloaded
pattern effectively handles both Un::Op
and Bin::Op
variants. This implementation ensures that unary and binary operations are appropriately processed.
Line range hint 420-434
: Function makeLddw
return type updated to Command
; improved instruction validation
The makeLddw
function now returns a Command
, consistent with the new instruction structure. Additional validation checks have been added to handle platform support and instruction integrity, increasing robustness when loading 64-bit immediate values.
Line range hint 591-703
: Function makeJmp
return type updated to Command
; consistent jump instruction handling
The makeJmp
function's return type is updated to Command
. The function maintains consistent validation and processing of jump instructions, including calls and exits, in line with the updated instruction representation.
Line range hint 707-768
: Variable new_ins
type updated to Command
in unmarshal
function
In the unmarshal
function, the variable new_ins
has been changed from Instruction
to Command
. This update aligns with the overall refactoring and ensures that instruction parsing correctly uses the new Command
type throughout the function.
src/crab/ebpf_domain.cpp (2)
1396-1398
: Function correctly implemented as a no-op.
The operator()(const Jmp&)
function appropriately acts as a placeholder for jump preconditions without introducing unnecessary operations.
1462-1463
: Correct handling of helper function IDs as singleton intervals.
The code correctly processes helper function IDs when src_interval
is a singleton, ensuring proper validation and invocation of helper functions only when the ID is definite and valid.
Also applies to: 1466-1466, 1472-1473
Signed-off-by: Elazar Gershuni <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (2)
src/asm_ostream.hpp
(4 hunks)src/test/test_marshal.cpp
(4 hunks)
🔇 Additional comments (7)
src/asm_ostream.hpp (4)
Line range hint 15-23
: label_to_offset16
function updated appropriately
The function label_to_offset16
now takes const pc_t pc
as a parameter, which is suitable. The lambda correctly calculates the offset for 16-bit jumps.
Line range hint 25-33
: label_to_offset32
function updated appropriately
The function label_to_offset32
now accepts const pc_t pc
and accurately computes the offset for 32-bit jumps when the offset doesn't fit in an int16_t
.
42-43
: Operator overloads for Command
added correctly
The addition of operator<<
and to_string
overloads for Command
aligns with the shift from Instruction
to Command
, ensuring proper stream output and string conversion.
57-58
: Operator overloads for Assertion
added correctly
The new operator<<
and to_string
functions for Assertion
provide necessary functionality for handling assertions, enhancing code clarity and maintainability.
src/test/test_marshal.cpp (3)
253-253
: Handle potential errors from unmarshal
to prevent exceptions
Similar to the previous comment, in this function, ensure that you check the result of unmarshal
before using std::get<InstructionSeq>
to avoid exceptions when an error occurs.
266-266
: Handle potential errors from unmarshal
to prevent exceptions
Again, to prevent exceptions, verify the result of unmarshal
before accessing it directly with std::get<InstructionSeq>
.
235-235
:
Handle potential errors from unmarshal
to prevent exceptions
The function unmarshal
returns a std::variant<InstructionSeq, std::string>
. Directly using std::get<InstructionSeq>
without checking may throw an exception if unmarshal
returns an error message. It's safer to check the result before accessing it to prevent potential exceptions.
Apply this change to safely handle the result:
-auto parsed = std::get<InstructionSeq>(unmarshal(raw_program{"", "", 0, "", {ins, exit, exit}, info}));
+auto result = unmarshal(raw_program{"", "", 0, "", {ins, exit, exit}, info});
+REQUIRE(std::holds_alternative<InstructionSeq>(result));
+InstructionSeq parsed = std::get<InstructionSeq>(result);
Likely invalid or redundant comment.
@Alan-Jowett what do you think about the change? One benefit is having more direct connection between the location of a Command and the location of an ebpf instruction. In the future, I intend to
|
Signed-off-by: Elazar Gershuni <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (3)
src/asm_marshal.cpp
(3 hunks)src/asm_ostream.cpp
(9 hunks)src/asm_syntax.hpp
(2 hunks)
🔇 Additional comments (12)
src/asm_marshal.cpp (5)
295-297
: LGTM: Command type transition looks correct.
The signature change from Instruction
to Command
is consistent with the PR objectives while maintaining the existing marshaling logic.
Line range hint 299-307
: LGTM: Size calculation properly adapted for Command type.
The function correctly maintains its logic while transitioning to the new type system.
316-316
: LGTM: Label mapping correctly adapted.
The change to access inst.cmd
is consistent with the new instruction structure.
327-332
: LGTM: Instruction sequence marshaling properly adapted.
The changes correctly handle Command access while maintaining the existing marshaling logic. The jump instruction processing is properly adapted to the new structure.
Line range hint 295-332
: Verify performance impact of Command type transition.
While the changes are functionally correct, let's verify there's no significant performance regression in the marshaling process.
✅ Verification successful
Let me gather more information about the actual code changes and their potential impact.
Based on the code analysis, I can now provide a definitive response about the performance impact of the Command type transition.
No significant performance impact from Command type transition
The code analysis reveals that:
- The
marshal
function is a simple visitor pattern implementation with direct mapping - The transition maintains O(1) complexity for single instruction marshaling
- Test coverage shows extensive marshaling/unmarshaling tests with no performance-specific concerns
- The Command type is used in a zero-copy manner without additional allocations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Look for performance-related tests or benchmarks
# Search for performance tests
echo "Searching for performance tests..."
rg -l "benchmark|perf_test|performance" --type cpp
# Look for existing timing measurements
echo "Searching for timing-related code..."
rg "chrono|duration|timer" --type cpp
Length of output: 1258
Script:
#!/bin/bash
# Check the actual implementation differences
echo "Checking Command vs Instruction implementation..."
rg -B2 -A2 "struct (Command|Instruction)" --type cpp
# Look for any performance-critical loops or processing
echo "Checking marshal usage and processing..."
rg -B2 -A2 "marshal\(" --type cpp
# Look for test cases that might reveal complexity
echo "Checking test coverage..."
fd "test.*\.cpp$" --exec rg -l "marshal"
Length of output: 28540
src/asm_syntax.hpp (1)
430-430
: Verify the implications of changing pc_t
to uint32_t
Changing pc_t
from size_t
to uint32_t
may have implications on systems where size_t
is larger than 32 bits. Please ensure that this change doesn't affect the correctness or portability of the code, especially if there's a possibility of program counters exceeding 32-bit limits.
Run the following script to identify all usages of pc_t
and check for potential issues related to its size:
✅ Verification successful
Based on the search results, I can now generate the final response since I have enough context about how pc_t
is used throughout the codebase:
The change from size_t
to uint32_t
for pc_t
is safe and intentionally designed
The code explicitly documents this design choice with the comment "cpu=v4 supports 32-bit PC offsets so we need a large enough type." The usage analysis shows that:
pc_t
is primarily used for instruction indexing and offset calculations- All offset calculations are done using 16-bit or 32-bit signed integers
- The codebase includes bounds checking (e.g.,
if (new_pc >= insts.size())
) - No operations require addresses beyond 32-bit range
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all usages of pc_t and verify compatibility with uint32_t.
# Search for all definitions and usages of pc_t.
rg -A 3 -B 3 '\bpc_t\b'
Length of output: 7347
src/asm_ostream.cpp (6)
108-185
: Implementation of AssertionPrinterVisitor
is well-structured
The AssertionPrinterVisitor
struct effectively organizes the printing logic for various assertion types, enhancing code maintainability and readability.
Line range hint 188-262
: CommandPrinterVisitor
correctly replaces InstructionPrinterVisitor
The new CommandPrinterVisitor
struct accurately handles the printing of different command types, aligning with the updated instruction representation.
364-365
: Overloaded operator<<
for Command
is properly updated
The stream insertion operator for Command
now uses the CommandPrinterVisitor
, ensuring consistent and correct output formatting.
375-377
: Overloaded operator<<
for Assertion
utilizes AssertionPrinterVisitor
effectively
The changes streamline the printing of assertions by using the visitor pattern, improving code organization.
386-386
: Verify that size
function covers all Command
variants
The size
function accounts for Bin
commands with lddw
and LoadMapFd
. Ensure that any new or existing Command
variants that may affect instruction size are handled appropriately.
434-441
: Exception handling in jump instruction printing is appropriate
The logic for handling missing labels in jump instructions correctly throws a runtime error, which aids in identifying issues during runtime.
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Outside diff range comments (4)
src/assertions.cpp (3)
Line range hint
61-91
: Consider enhancing error handling for map_fd_reg assertion.The implementation is solid, but the assertion on line 81 (
assert(map_fd_reg)
) could be enhanced with a more descriptive error message.- assert(map_fd_reg); + assert(map_fd_reg && "map_fd_reg must be set before PTR_TO_MAP_KEY/VALUE");
Line range hint
128-179
: Consider extracting condition operation mapping to a lookup table.The implementation is correct, but the boolean assignment for
allow_pointers
could be simplified using a lookup table.static const std::unordered_map<Condition::Op, bool> ALLOWS_POINTERS = { {Condition::Op::EQ, true}, {Condition::Op::NE, true}, // ... other mappings }; bool allow_pointers = ALLOWS_POINTERS.at(cond.op) && cond.is64;
Line range hint
188-215
: Consider extracting magic number to a named constant.The implementation is solid, but the stack size check could be more readable.
- if (offset < -EBPF_SUBPROGRAM_STACK_SIZE || offset + static_cast<int>(width.v) > 0) { + const int MAX_NEGATIVE_STACK_OFFSET = -EBPF_SUBPROGRAM_STACK_SIZE; + if (offset < MAX_NEGATIVE_STACK_OFFSET || offset + static_cast<int>(width.v) > 0) {src/asm_cfg.cpp (1)
Line range hint
276-312
: PassInstruction
by const reference ininstype
functionThe
instype
function currently takesInstruction ins
by value. SinceInstruction
may be a non-trivial object, passing it by const reference avoids unnecessary copies and improves performance.Apply this diff to pass by const reference:
-static std::string instype(Instruction ins) { +static std::string instype(const Instruction& ins) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (10)
src/asm_cfg.cpp
(12 hunks)src/asm_marshal.cpp
(2 hunks)src/asm_ostream.cpp
(10 hunks)src/asm_ostream.hpp
(3 hunks)src/asm_syntax.hpp
(2 hunks)src/asm_unmarshal.hpp
(1 hunks)src/assertions.cpp
(7 hunks)src/crab/cfg.hpp
(3 hunks)src/crab/ebpf_domain.cpp
(3 hunks)src/test/test_marshal.cpp
(5 hunks)
🔇 Additional comments (31)
src/asm_unmarshal.hpp (1)
18-18
: LGTM: Grammar fix in documentation.
The grammatical correction from "instruction" to "instructions" improves clarity.
src/asm_ostream.hpp (2)
15-15
: LGTM! Improved const-correctness.
The addition of const
to the parameter declaration is a good practice, making the immutability of pc
explicit.
57-58
: LGTM! Consistent with existing patterns.
The new declarations for Assertion
type handling follow the established pattern in the file and support the PR's goal of integrating assertions into the instruction set.
Note: The previous suggestion about using templates to reduce operator overload duplication still applies here.
src/assertions.cpp (9)
40-47
: LGTM: Consistent type changes across operator overloads.
The return type changes maintain behavioral consistency while aligning with the new type system.
50-58
: LGTM: Sound validation logic for packet and exit operations.
The implementation properly handles both context pointer verification and return value type constraints.
118-126
: LGTM: Proper constraint handling for call operations.
The implementation correctly enforces type and function constraints for callable entities.
181-186
: LGTM: Clean implementation of jump condition handling.
217-223
: LGTM: Comprehensive validation for atomic operations.
The implementation properly enforces type constraints and memory access validation for atomic operations.
227-229
: LGTM: Clear and concise unary operation handling.
291-292
: LGTM: Consistent type system update.
303-304
: LGTM: Proper assertion application in CFG traversal.
23-24
: LGTM: Type change aligns with PR objectives.
The change from vector<Assert>
to vector<Assertion>
is consistent with the PR's goal of generalizing instructions.
✅ Verification successful
Type transition from Assert
to Assertion
is complete and consistent
Based on the search results, the codebase shows a complete transition to the Assertion
type. No remaining instances of vector<Assert>
were found, and all related code (function returns, parameters, and usage) consistently uses the new Assertion
type. The type is properly defined in src/asm_syntax.hpp
as a variant type and is used consistently throughout the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify consistent usage of Assertion type
rg -l "vector<Assert>" .
Length of output: 26
Script:
#!/bin/bash
# Search for any remaining usage of Assert type
rg "Assert\b" -A 2 -B 2
Length of output: 25
Script:
#!/bin/bash
# Search for Assertion type usage to verify the transition
rg "Assertion\b" -A 2 -B 2
Length of output: 10817
Script:
#!/bin/bash
# Look for class/struct definition of Assert and Assertion
ast-grep --pattern 'class $_ {
$$$
}'
ast-grep --pattern 'struct $_ {
$$$
}'
Length of output: 84
src/crab/cfg.hpp (4)
Line range hint 56-68
: LGTM! Method signatures updated consistently.
The insert
and insert_front
methods have been properly updated to handle GuardedInstruction
. The documentation improvements for insert_front
are helpful.
632-634
: LGTM! Stream operators maintained correctly.
The stream operator declarations have been simplified while maintaining functionality for both block types.
627-627
: Verify the implementation of get_assertions.
The function signature has been updated to handle the new assertion model. Please ensure the implementation properly handles all cases with the new parameters.
#!/bin/bash
# Find the implementation of get_assertions
ast-grep --pattern 'vector<Assertion> get_assertions(Instruction $_,$_,$_) {
$$$
}'
42-42
: Verify usage of GuardedInstruction across the codebase.
The change from vector<Instruction>
to vector<GuardedInstruction>
is fundamental to the PR's goal. This type change needs careful verification to ensure all consumers handle the new type correctly.
✅ Verification successful
Based on the search results, I can now generate the final response:
The type change from vector<Instruction>
to vector<GuardedInstruction>
appears to be safe.
The codebase shows that:
Instruction
remains as a variant type (src/asm_syntax.hpp
) and is now wrapped withinGuardedInstruction
which adds preconditions- All direct usages of
Instruction
are for processing individual instructions (parsing, marshaling, printing) rather than storing collections - No instances of
std::vector<Instruction>
were found, indicating this was likely the only occurrence that needed updating
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for Instruction type usage that might need updating
rg --type cpp "std::vector<Instruction>" -A 3
# Search for direct Instruction usage that might need updating
rg --type cpp "\bInstruction\b" -A 3
Length of output: 8692
src/test/test_marshal.cpp (5)
235-235
: LGTM: Correctly updated to access the cmd
member.
The change correctly reflects the transition from Instruction
to Command
type, maintaining consistency with the PR's objectives.
253-253
: LGTM: Consistent update to access the cmd
member.
The change maintains consistency with other similar modifications in the file.
266-271
: LGTM: Correctly updated the 64-bit immediate instruction handling.
The changes properly handle the transition to the new Command
type while maintaining the existing functionality for 64-bit immediate instructions.
289-289
: LGTM: Consistent update to access the cmd
member.
The change maintains consistency with other similar modifications in the file.
292-298
: LGTM: Enhanced error handling with safer type checking.
The updated error handling now uses std::get_if
for safer type checking when retrieving error messages. This is a good improvement in type safety.
src/asm_marshal.cpp (3)
316-316
: Correctly updating size calculation with inst.cmd
The change to use inst.cmd
in the size calculation aligns with the updated instruction representation.
327-327
: Properly adjusting jump targets using ins.cmd
The update correctly modifies the jump target calculations to reflect the new Command
structure.
330-330
: Marshaling instructions with updated Command
representation
The loop now correctly marshals ins.cmd
, ensuring that instructions are processed appropriately with the updated structure.
src/asm_cfg.cpp (3)
145-147
: Skip processing undefined instructions
The check for Undefined
instructions at line 145 ensures that they are skipped during CFG construction. This is a good practice to prevent processing invalid or placeholder instructions.
Line range hint 126-131
: Recursive call to add_cfg_nodes
with stack frame depth check
The recursive call to add_cfg_nodes
at line 129 includes a stack frame depth check to prevent excessive recursion, which could lead to a stack overflow.
370-372
: Insertion of IncrementLoopCounter
for loop detection
At line 370, inserting IncrementLoopCounter
instructions at loop heads is a correct implementation for tracking loop iterations and aiding in termination checks.
src/asm_ostream.cpp (3)
108-185
: Refactored assertion printing is correct.
The AssertionPrinterVisitor
struct encapsulates the logic for printing different types of assertions using the visitor pattern effectively. This refactoring enhances code organization and readability. The operator overloads for various assertion types are correctly implemented.
Line range hint 188-378
: Command printing updates are appropriate and consistent.
The CommandPrinterVisitor
struct has been updated to handle the Command
type instead of Instruction
, reflecting the changes in the codebase. The visitor pattern is consistently applied, and the printing logic for different command types is correctly implemented. This ensures accurate and consistent output of assembly instructions.
375-380
: Verify that all operator<<
overloads use Command
instead of Instruction
.
To maintain consistency and prevent potential issues with outdated references, ensure that all overloads of operator<<
have been updated to accept Command
rather than Instruction
.
Run the following script to identify any remaining operator<<
overloads that use Instruction
:
src/crab/ebpf_domain.cpp (1)
1198-1204
: Potential exception safety issue with current_assertion
.
If std::visit(*this, assertion);
throws an exception, current_assertion
may not be cleared, potentially leading to incorrect state in subsequent operations. Consider using a try-finally mechanism or an RAII guard to ensure current_assertion
is cleared even if an exception occurs.
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
Each GuardedInstruction is a pair (
Instruction
,vector<Assertion>
).Jump instructions are retained in the nondeterministic graph, since they store the preconditions of the jump
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation
Refactor
Instruction
toCommand
across multiple files for consistency in instruction representation.Style