-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Format improvements (WIP) #63
base: main
Are you sure you want to change the base?
Conversation
@@ -476,14 +476,18 @@ pub enum Expr { | |||
}, | |||
} | |||
|
|||
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)] | |||
pub struct IntFormat { | |||
pub unsigned: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might prefer signed
rather than unsigned
to avoid double negatives
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with unsigned
since it's more of an exception to the norm than anything. I tend to structure my bools with true
as the "do something different" value so that they don't need to be negated whenever being read, but I guess that doesn't matter as much with match
since you have to be explicit in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true
as "do something different" makes sense in C-likes where it's easy to zero-initialize things. It doesn't make so much sense in rust.
@@ -665,7 +679,7 @@ impl AtomRaiser<'_, '_> { | |||
| ArgEncoding::Integer { ty_color: None, format, .. } | |||
=> Ok(ast::Expr::LitInt { value: raw.expect_int(), format: *format }), | |||
|
|||
| ArgEncoding::Padding | |||
| ArgEncoding::Padding { .. } | |||
=> Ok(ast::Expr::from(raw.expect_int())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is unreachable now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely not unreachable since the padding args are only dropped after being raised thanks to some sort of logic with @arg0
. I don't know how important the positioning of that is since you'd left a comment about it being important to reference the original args list, so I chose to leave it there for now. If it's possible to implement the "verify the bytes are 0 or fallback to blob" check at at line 541 before raising all the arguments, then it would be unreachable for sure.
src/formats/ecl.rs
Outdated
@@ -1082,7 +1103,7 @@ impl InstrFormat for OldeEclHooks { | |||
fn read_instr(&self, f: &mut BinReader, emitter: &dyn Emitter) -> ReadResult<ReadInstr> { | |||
let time = f.read_i32()?; | |||
let opcode = f.read_u16()?; | |||
let size = f.read_u16()? as usize; | |||
let size = f.read_i16()? as usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems to me this is just begging for an Out of Memory error without providing any advantages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field is signed though, read with MOVSX. If a size is greater than 0x7FFF, it'll index backwards (and probably explode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter whether it's signed in game because it gets cast immediately to an unsigned type. So unless you plan to add range checking (which you certainly can do), you're literally just trading one incorrect behavior for another...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Range checking would definitely be ideal, though IDK how you'd want the error to be formatted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems similar to the check in anm
@@ -629,7 +629,7 @@ static ECL_08_09: &'static CoreSignatures = &CoreSignatures { | |||
(Th08, 127, Some(("S", None))), | |||
(Th08, 128, Some(("S(imm)f(imm)f(imm)f(imm)f(imm)", None))), // Argument 1 is unread | |||
(Th08, 129, Some(("b(imm)---", None))), | |||
(Th08, 130, Some((r#"s(imm;enum="EclSub")--"#, None))), | |||
(Th08, 130, Some((r#"s(imm;extend;enum="EclSub")--"#, None))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"extend" is too short of a name for something that is probably only ever going to be used once. Probably want something with multiple words, meaning we might need to pick now whether we want snake_case
or kebab-case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There're quite a few places where the game engines will read a full dword from the signature in order to check for variables but then only read the lower bits of the resulting value. Ideally extend
could be replaced with a more general purpose attribute applicable to those situations as well since that would eliminate the need for extended padding anyway. Maybe this could be written as E(imm;range="s")
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking this should probably be a dword-sized value with an attribute that limits its value size. However, I really do not want to see abi letters get used inside string literals.
src/llir/abi.rs
Outdated
@@ -42,7 +42,7 @@ pub enum ArgEncoding { | |||
/// The first argument may have `arg0` if it is two bytes large. This indicates that the argument is | |||
/// stored in the arg0 header field of the instruction in EoSD and PCB ECL. (which is mapped to the | |||
/// `@arg0` pseudo-argument in raw instruction syntax) | |||
Integer { size: u8, ty_color: Option<TypeColor>, arg0: bool, immediate: bool, format: ast::IntFormat }, | |||
Integer { size: u8, ty_color: Option<TypeColor>, arg0: bool, immediate: bool, extend: bool, format: ast::IntFormat }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting out of hand, let's break this up over lines.
@@ -42,7 +42,7 @@ pub enum ArgEncoding { | |||
/// The first argument may have `arg0` if it is two bytes large. This indicates that the argument is | |||
/// stored in the arg0 header field of the instruction in EoSD and PCB ECL. (which is mapped to the | |||
/// `@arg0` pseudo-argument in raw instruction syntax) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document extend
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What sort of formatting should be used for that? The only currently documented attributes are mask
and furibug
, which don't even use the same format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh bugger, I forgot to document enum
. That's the only thing I see missing from here.
It should be in prose in the documentation of Integer
, just like the documentation for arg0
that I put this review comment under. (basically, "put it here!")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, imm
is also missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And hex
isn't included either. There're enough attributes that there should probably be a list with the names at the front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we probably want to make a markdown list
///
/// Supported attributes:
///
/// * **`imm`**: lorem ipsum dolor sit amet
/// i have a barnicle in my backseat
/// * **`arg0`**: blah blah blah blah
etc
@@ -424,7 +426,7 @@ fn abi_to_signature(abi: &InstrAbi, abi_span: Span, ctx: &mut CompilerContext<'_ | |||
|
|||
defs::Signature { | |||
return_ty: sp!(value::ExprType::Void), | |||
params: abi.encodings.iter().enumerate().flat_map(|(index, enc)| { | |||
params: abi.encodings.iter().filter(|enc| !matches!(*enc, ArgEncoding::Padding { .. })).enumerate().flat_map(|(index, enc)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is too noisy. Funnily, this is a flat_map and yet it doesn't even look like anything is currently taking advantage of that; but this is exactly the sort of use case for that, so let's make use of it:
params: abi.encodings.iter().filter(|enc| !matches!(*enc, ArgEncoding::Padding { .. })).enumerate().flat_map(|(index, enc)| { | |
params: abi.encodings.iter().enumerate().flat_map(|(index, enc)| { |
And change the Padding match arm to:
| ArgEncoding::Padding { .. }
=> return None, // hide from signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned something like that would end up inserting a None into the resulting collection. Didn't know it would skip over it. Seems like there could be issues in the future though if index
is ever used for anything though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops I missed the fact that this would change indices, lemme read this over again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned something like that would end up inserting a None into the resulting collection.
flat_map
flattens the iterators being returned by the closures. Option
is an iterator of zero or one items, so None
produces no items. Technically, this should be using .filter_map
instead (you can just change the .flat_map
to .filter_map
and it should still compile); this is just an alias for Option
-based flat_map
s that is clearer in intent.
Oops I missed the fact that this would change indices, lemme read this over again.
So the index is only used to generate arg names. I would pull that out.
// in a line of code outside the loop
let arg_names = (1..).map(|i| ident!("arg_{i}"));
// inside the loop, change this line:
let name = sp!(abi_span => ctx.resolutions.attach_fresh_res(arg_names.next().unwrap()));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and of course get rid of the enumerate()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to just add another function to the InstrAbi
implementation similar to arg_encodings
that returns the filtered version? This seems like the sort of thing that would end up continually popping up throughout the code base otherwise. (I would've done that in the first place, but rust complained about the filter output not being compatible with VeclikeIterator
or something and I don't know how to make it happy.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmmm. Crap, that's a pain.
Writing this function is probably a good idea. It will have to return the slightly weaker DoubleEndedIterator
instead of VeclikeIterator
. (basically because filter
can't be ExactSizeIterator
, which is the other half of VeclikeIterator
)
// Important: we put the shortest iterator (args_iter) first in the zip list | ||
// to ensure that this loop reads an equal number of items from all iters. | ||
assert!(args_iter.len() <= arg_encodings_iter.len()); | ||
for enc in arg_encodings_iter.by_ref() { | ||
if let ArgEncoding::Padding { size } = enc { | ||
match size { | ||
1 => args_blob.write_u8(0).expect("Cursor<Vec> failed?!"), | ||
1 => { | ||
if (args_blob.position() & 3) == 0 { small_padding_value = 0i8 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid one-line ifs. I reserve these for extreme cases (usually only when there are multiple similar ifs in a row and I want to emphasize the similarities between them)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll freely admit the entire logic for handling that was a 5 minute quick fix that bolts a random functionality where it doesn't quite belong. I'd love to replace the logic with something better that doesn't feel as dirty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the complexity of this function increases (which it is no doubt going to, without end), we're going to want to start factoring out pieces of it into separable units.
This thing right here is easily separated out into a Finite State Machine. You could have something like:
/// Finite-state machine helper for the value of `-` padding.
struct SmallPaddingFillHelper {
fill_value: i8,
}
impl SmallPaddingFillHelper {
/// Update the helper in response to the next arg encoding.
///
/// Returns the current fill value for `-` padding.
fn step(&mut self, enc: &ArgEncoding, blob_position: u64) -> i8 {
todo!("update self.fill_value");
self.fill_value
}
}
and then you could put let small_padding_fill = padding_helper.step(enc, args_blob.position());
near the top of the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant more that I'd rather leave the padding bytes as exclusively 0, eliminate the padding fill variable entirely, and let the values be handled by a range
sort of attribute within a single larger value. That way there isn't any more multi-iteration mutable state than there needs to be.
@@ -554,18 +554,25 @@ fn encode_args( | |||
let mut param_mask: raw::ParamMask = 0; | |||
let mut current_param_mask_bit: raw::ParamMask = 1; | |||
|
|||
let mut small_padding_value = 0i8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to see tests for:
- basic functionality of
extend
- padding value resetting to 0 at dword boundaries. (
s(extend)------
) - resetting to zero on positive values
That last one would involve
10 s(extend)b-
ins_10(-2, 1)
args_blob
should compile to blobify![-2i16, 1i16]
. You will need to impl Blobify for i16
here:
truth/tests/integration_impl/mod.rs
Line 227 in c41358e
impl Blobify for i32 { |
No description provided.