Selection algorithms #500
Thank you @gareth-nx for these algorithms. My major comment is about the license: is it compatible with the MIT license?
Co-authored-by: Jeremie Vandenplas <[email protected]>
@gareth-nx what is the status of this PR, in particular regarding the license?
@jvdp1 So far the author of the MATLAB code hasn't replied to me. I plan to just rewrite the partition step (the only part that still mirrors the MATLAB code; the rest does not). This should be straightforward. I haven't found the time so far, but I can likely do it in the next week or two.
So my recent changes introduced bugs on some platforms but not others. The code seemed to work on my machine, but I will test with a few different compilation options to try to catch the issue.
I found a few more nits to pick.
I've left some minor comments.
I'm also worried about what happens if an array contains NaNs, and whether they can cause infinite loops. The R sort routine <https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/sort> (which also covers partial sorting) has an option to place NaN at the front or the end. I'm fine with having this PR merged (documenting that no precautions are taken for NaN values) and then discussing potential NaN modifications in a new issue.
A sentence like the one in `stdlib_sorting.fypp` could be added:
"If both the type of `array` is real and at least one of the elements is a `NaN`, then the ordering of the result is undefined."
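To illustrate why the ordering is undefined with NaN values, and what an R-style "NaN at the end" option could look like, here is a minimal Python sketch. It is not part of this PR (which takes no precautions for NaN), and the helper name `nan_last` is hypothetical.

```python
import math

nan = float("nan")

# Every ordered comparison involving NaN is False, so a comparison-based
# partition step has no consistent place to put a NaN.
comparisons = (nan < 1.0, nan > 1.0, nan == nan)


def nan_last(values):
    """Return (reordered copy, count of non-NaN values).

    Non-NaN values keep their original relative order and come first;
    all NaNs are moved to the end. A selection routine could then be
    applied to the first `count` entries only.
    """
    ordered = [v for v in values if not math.isnan(v)]
    nans = [v for v in values if math.isnan(v)]
    return ordered + nans, len(ordered)


reordered, count = nan_last([3.0, nan, 1.0, nan, 2.0])
```

With a preprocessing step like this, selection would only ever compare well-ordered values, at the cost of one extra pass over the array.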
Co-authored-by: Ivan Pribec <[email protected]>
I'm also not sure what would happen if we pass an array with NaN values -- does the loop even exit? Will have to look into it.
I ran a few experiments with NaN values, and the code seems to run fine (it exits).
From experiments, the code runs when |
I think that I've addressed all the recent comments -- thanks all for the suggestions.
I believe the recent commit addresses the remaining comments of @ivan-pi -- ready to merge if there are no other issues. Thanks.
@ivan-pi: are you happy with @gareth-nx's changes? If yes, could you merge this PR, please?
Thank you all, let's merge. @ivan-pi if there are further changes needed, let's address them in a follow-up PR.
This pull request implements quick selection algorithms, which can find the kth-smallest entry of an array in time that scales on average as O(size(array)), and so are faster than sorting the whole array.
See discussion in issue 471.
This is useful for efficient computation of the median and other percentiles, and as a building block for problems such as those mentioned in issue 495, issue 405, and issue 378.
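As a rough illustration of the idea (not the Fortran code in this PR), quickselect with a random pivot and a Lomuto-style partition can be sketched in Python as follows. The function names `select` and `median`, and the 1-based `k`, are assumptions made for this sketch only.

```python
import random


def select(array, k):
    """Return the k-th smallest element (k is 1-based), reordering `array` in place."""
    if not 1 <= k <= len(array):
        raise ValueError("k must satisfy 1 <= k <= len(array)")
    target = k - 1
    left, right = 0, len(array) - 1
    while left < right:
        # Move a randomly chosen pivot to the right end of the active range.
        p = random.randint(left, right)
        array[p], array[right] = array[right], array[p]
        pivot = array[right]
        store = left
        for i in range(left, right):
            if array[i] < pivot:
                array[i], array[store] = array[store], array[i]
                store += 1
        # Place the pivot at its final sorted position.
        array[store], array[right] = array[right], array[store]
        if store == target:
            break
        # Keep only the side of the partition that contains the target index.
        if store < target:
            left = store + 1
        else:
            right = store - 1
    return array[target]


def median(values):
    """Median via selection, without sorting the whole input."""
    data = list(values)
    n = len(data)
    if n == 0:
        raise ValueError("median of empty input")
    upper = select(data, n // 2 + 1)
    if n % 2 == 1:
        return upper
    # After select, every entry left of the target index is <= upper,
    # so the lower middle value is the maximum of that prefix.
    lower = max(data[: n // 2])
    return (lower + upper) / 2
```

Each pass discards one side of the partition, which is why the expected cost is linear rather than the O(n log n) of a full sort. The worst case for this simple sketch is still quadratic, for example with many repeated values.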
This is my first time contributing to stdlib, and one thing I am uncertain about is the structure of the CMake files and Makefiles in the test suite. I copied and modified those from the stats tests, and while it all seems to work, I wonder whether it can be simplified.