-
Notifications
You must be signed in to change notification settings - Fork 258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eBPF assembler #10
eBPF assembler #10
Conversation
Supports the same syntax as ubpf, plus optional "64" suffixes for ALU mnemonics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we have the assembler, the doc, the examples, plus nice error handling (better than my code for the VMs by the way)… That's just fantastic! Thanks a lot for this PR!
Minor comments:
-
See inline comments regarding min and max values for immediate and offset operands. I'll merge after having your answer on that.
-
We now have several ways to build eBPF program: slices of bytes, loading from ELF, assembler, and soon a specific API (PR [wip] API to build BPF programs #6). My opinion is that the unit tests coming from uBPF should use the assembly language, so that we can easily stay in sync with uBPF. The other tests could use the API, mostly, once finished. The examples should use a mix of all those methods, to show the users how they can build a program.
-
Thanks for supporting both
add
andadd64
syntax. I think we should useadd
style for all tests coming from uBPF, in order to remain compatible. I suggest that new tests use theadd64
style, though, which I find more explicit? -
Could we have leading
#
for user comments, as in uBPF, in a future commit?
src/assembler.rs
Outdated
return Err(format!("Invalid offset {}", off)); | ||
} | ||
if imm < -2147483648 || imm >= 4294967296 { | ||
return Err(format!("Invalid immediate {}", imm)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be 2147483648 instead of 4294967296?
For all these values, we could maybe use instead std::i16::MIN
(-32768), std::i16::MAX
(32767), std::i32::MIN
(-2147483648) and std::i32::MAX
(2147483647) (or std::u32::MAX
if you got the right value, but in this case could you explain, please?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd been assuming we need to support 32-bit unsigned integer constants. Thinking more about it, constants with the high bit set are going to be sign-extended by most instructions and allowing positive-looking constants that are interpreted by the VM as negative would be confusing. I pushed a commit fixing this.
I initially used std:::i32::MIN etc, but found that they were sign-extended when casting to i64. Rather than doing a cast plus a mask, I thought it would be more readable to write out the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… because we have to parse numbers as i64
… Ok, I see. Well, fine with me, then. Also, forbidding unsigned immediate sounds like a good idea, and we can still revert this later if it appears that it is really needed. Let's merge all this!
off: off as i16, | ||
imm: imm as i32, | ||
}) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole function is pretty close to a simplified version of the program building API @alex-diez has been working on in PR #6. Once we have the API, we could use it here instead to avoid re-defining ways to build instructions.
(LoadImm, Register(dst), Integer(imm), Nil) => insn(opc, dst, 0, 0, (imm << 32) >> 32), | ||
_ => Err(format!("Unexpected operands: {:?}", operands)), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably our major difference in formatting style: I'd definitely align the patterns on something like this :) I find it more readable, but it's not really a good habit since later modifications of a line can force to realign everything (so I'm not asking you to align, that's just for the record).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd normally align it but I've been using rustfmt which would delete the extra spaces.
Err("Failed to encode ja: Invalid offset -32769".to_string())); | ||
assert_eq!(asm("ja 32768"), | ||
Err("Failed to encode ja: Invalid offset 32768".to_string())); | ||
assert_eq!(asm("add r1, 4294967296"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well then, since you use the same value again I suppose this is not a mistake… Let me guess, user can consider the immediate value as a signed or as an unsigned? That would make sense indeed. Is there a reason why we should not allow the same for the offset in that case?
Sure, I can add '#' syntax for comments in a future pull request. I also plan to send a pull request that changes (automated) the ubpf tests to use the assembler. |
Thanks a lot! Merging now |
Supports the same syntax as ubpf, plus "64" suffixes for ALU instructions. Doesn't yet support exotic instructions like TAIL_CALL or ST_W_XADD.
A couple of tests are changed to use the assembler instead of binary as a proof of concept. We can do the rest in another pull request.
Fixes #7.