Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements : Idiomatic code #4

Open
chris-ha458 opened this issue Sep 24, 2023 · 11 comments
Open

Improvements : Idiomatic code #4

chris-ha458 opened this issue Sep 24, 2023 · 11 comments

Comments

@chris-ha458
Copy link
Contributor

Re : #3

This codebase has been ported from Python and a lot of the design patterns could be improved to be more idiomatic rust code.
Such a move will make it easier to improve speed and maintainability, ensure correct operation from a rust point of view.

Some examples would be avoiding for loops, using matches instead of if chains etc.

Many require deeper consideration.

For example, this codebase has extensive use of f32. Unless using intrinsics, f64 are as fast as or faster than f32 in rust.
Moreover, trying to cast to and back for f32 and f64 can harm performance and make it difficult to ensure correct code. For instance there are instances of exact compare between f32 and f64 variables, and this is very unlikely to operate in the intended way. If it is intended, it would be valuable to have documentation regarding that, suppressing relevant lints as well.
However, if there is a need to maintain ABI compatibility or follow a specification it might be inevitable. Also, on-disk size could be a consideration.
In summary f32 vs f64 handling could serve as both idiomatic code and speed but only if done right.

I will try to prepare some PRs that change some things. Despite my best efforts, I am sure that many of my changes or views might be based on a flawed understanding of the code, so feel free to explain why things were done the way they were.
In such cases I will help with documentation.

@nickspring
Copy link
Owner

nickspring commented Sep 25, 2023

Hi, I don't think f64 is faster than f32. It is the same or slower.
I.e. https://hugotunius.se/2017/12/04/rust-f64-vs-f32.html

Also I believe there is no conversion f64 <-> f32 (I use everywhere f32). I found f64 only in performance.rs and it can be replaced by f32 easily :)

@chris-ha458
Copy link
Contributor Author

If it is the same, I would argue that the added precision would be useful.
But even if that both were true (speed is the same, and added precision is useful) it doesnt seem like a priority so we can revisit when necessary.
There is a case of f64 vs f32 comparison that popped up in clippy so I'll look into that first.

@chris-ha458
Copy link
Contributor Author

The round_float fn makes me think high precision was not critical here.
Can you share what you do with that function?

  1. Why is it necessary in the first place
  2. Can we use an optimized implementation in other crates(such as float-cmp or rust_decimal) for that?

@chris-ha458
Copy link
Contributor Author

with #5 merged, there are still some more low hanging fruit remaining, but not many.
Thanks to @nickspring for swift review and merges.

Next, I will address some function signatures, specifically their types . Hopefully this can enable more refactoring with less unnecessary copies or clones.

Beyond that, I think breaking down large functions in lib.rs such as from_bytes could be a good candidate to enable better testing and optimization.

@chris-ha458
Copy link
Contributor Author

Btw would you prefer PRs to be smaller(focusing on single functions or at most, files) or current level?

@nickspring
Copy link
Owner

The round_float fn makes me think high precision was not critical here. Can you share what you do with that function?

  1. Why is it necessary in the first place
  2. Can we use an optimized implementation in other crates(such as float-cmp or rust_decimal) for that?

this function only rounds floats to some number of digits after the point in the end of processing. I don't think that's good idea to add rust_decimals because of such functionality.

@nickspring
Copy link
Owner

nickspring commented Sep 26, 2023

Btw would you prefer PRs to be smaller(focusing on single functions or at most, files) or current level?

current level is OK, no problem :)

@chris-ha458
Copy link
Contributor Author

this function only rounds floats to some number of digits after the point in the end of processing. I don't think that's good idea to add rust_decimals because of such functionality.

If it is only for output purposes, we might be able to use something like
let rounded = format!("{:.2}", num); while maintaining internal representation with full precision (whether it be f32 or f64).

@chris-ha458
Copy link
Contributor Author

94e7b0c

Seems to indicate that there was a regression.
The way I changed it was to follow the upstream code as closely as a possible
(instead of removing the unreachable path, fixed it so it became reachable)

But it seems to have caused a drop in accuracy.
I am wondering what could have prevented such a regression and allowed for it to become detected earlier.

I think it might have been apparent if I tested with the performance feature flag on.

@nickspring
Copy link
Owner

nickspring commented Oct 1, 2023

94e7b0c

Seems to indicate that there was a regression. The way I changed it was to follow the upstream code as closely as a possible (instead of removing the unreachable path, fixed it so it became reachable)

But it seems to have caused a drop in accuracy. I am wondering what could have prevented such a regression and allowed for it to become detected earlier.

I think it might have been apparent if I tested with the performance feature flag on.

Aha... I believed we should just removed unreachable code, but in original version logic was changed - I didn't realize it. I left a question about actual accuracy in original PR.

@chris-ha458
Copy link
Contributor Author

There are some small things that could be done, but I don't see any big things atm.

The only big thing I had in mind is to write a wrapper around encoding_rs to emulate the current (and unmaintained) encoding crate dependency, but it is likely to be a lot of work, with unclear gains regarding speed or correctness. I might hack around that in my free time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants