-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Patch to version 0.4.0 #257
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## development #257 +/- ##
===============================================
- Coverage 46.44% 45.49% -0.96%
===============================================
Files 85 85
Lines 7247 7254 +7
===============================================
- Hits 3366 3300 -66
- Misses 3881 3954 +73 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@morenol we need to fix Wasm tests |
releasing a new minor for these changes is not completely justified. We need to add some more implementations to this to justify a minor release. |
Also: probably, in accord with the library principles, probably it is better to panic early at initialization than returning a |
To which function are you referring? I get that if a config or something early is flawed, the app should exit with useful feedback. Elsewhere, the user can "opt-in" to a panic by using "unwrap". Here is a link that might be useful https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to-panic.html. I see a line between if "dangerous/harmful" then panic, otherwise Err. Err is a valid answer given the input. Panic is not generally expected behavior. Furthermore "fail early" is not what we are talking about, it's more "at the time a mistake is discovered" what does the lib return. As a user of the library, I would expect to be able to send commands over time. A user (or service), might "get it wrong". Does that mean the user cannot "try again"? In the event I run this in Tokio, I would not want the thread to panic (very bad and tough to navigate), but rather give me the answer - that can include: Err could not process b/c data was wrong, a match branch I can recover from. Runtime errors in py is the reason I moved to Rust. Returning Err is not a runtime error like we experience in py. It's only a decision on what to return - app crash/panic or Err. More generally about runtime errors: by definition of the compile-time checks in Rust, they don't equate to py (incomparable). Right now the lib is fragile because it doesn't have strong typing through and through (likely mostly on the result side of the process). It will get better over time where the app does not instantiate a bad state (like what we have with DenseMatrix in the 0.4). Until then let the user know with an Err. For instance getting results is a fragile process. What do you think? |
I also think that a Result is better than panic for this case. Also agree that we could delay the crates.io release until we have more things that we could add to a breaking change release |
I agree there is more that can be done for the next major release. I have some ideas for how to add to the changes for 0.4. Much of the ideas come from what
I tried to use the coefficients and intercept to output probabilities (not binary classes) and was surprised by how fragile the process was. e.g., One way to go from let coefficients: Vec<_> = lr.coefficients().into();
let intercept: f64 = lr.intercept().into();
// or
let coefficients: Vec<_> = lr.coefficients().iter().collect();
let intercept: f64 = lr.intercept(); // casts to f64 There are other "best practices" for increasing the usability of each method by using traits such as
For the next major after this, or the next after that :)), I would start thinking more about performance. I hit performance issues with 50k x 100 sample size. Part of this would include a review of when we borrow vs take ownership. Ironically, I wonder if there is too much effort to avoid copies. My main point is this, it's still too early to worry about performance in my view. Just note, we have room to improve when the time comes. Unlike python, it will actually make the code both better and more readable, not more convoluted :)) Finally, I've surveyed several packages to accomplish what I need for building stat models. It's a fragmented space with many "half-way" efforts. I like what you all have accomplished. I believe the core strength is assembly: I hope this input is helpful. I spent a many years working in Rust. I'm happy to help however you all like. |
Thanks for your ideas. I am glad you found I believe the library can be very useful in different domains and I am now focusing into using the library for applications to see what it can do. My idea is to keep the library different from the others on multiple dimensions. I think there is already a lot of effort into making things work for "larger" high-level use-cases and that side-tracking this approach can be very interesting in the long term for domains that are maybe not in the "big data" scenario; It would be great to have more idiomatic Rust, I think let's start with the transient types and the builder pattern, after that we want probably to set up good profiling (we have a bench repo but it is very limited and difficult to run, it would be nice to have something like tarpaulin for profiling, that runs with CI/CD). Vlad, that created the library, is also of the idea that we should drop our custom traits system on arrays but this is something we can see later as far as we provide the bindings to Welcome! @EdmundsEcho |
To better understand the design intent was implemented, may I ask
|
|
@morenol please take a look |
I think that we need to run |
Collecting changes for version
0.4.0
New expected behaviour
see #256
Change logs
Result
type toDenseMatrix
constructor by @EdmundsEchoDenseMatrix
constructor, now constructor returns aResult