-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove allocations in hostrange_find #6259
Conversation
I decided to look back in on this and try it in a microbenchmark today just to see if maybe it was an interaction with sched that made it look like no improvement. After a few more tweaks, I'm getting about a 3x performance improvement for this test: #include <benchmark/benchmark.h>
#include "../src/common/libhostlist/hostlist.h"
static void BM_SomeFunction(benchmark::State& state) {
hostlist *hl = hostlist_decode("host1[1,2-50]");
// Perform setup here
for (auto _ : state) {
// This code gets timed
if (hostlist_find(hl, "host151") >= 0)
exit (1);
if (hostlist_find(hl, "host149") < 0)
exit (2);
}
}
// Register the function as a benchmark
BENCHMARK(BM_SomeFunction); For the sched case, I think it would be an even bigger improvement if I had a way to create a "hostname" handle that sched could keep hold of. In core we could kinda reach into hostlist and do that now, but sched relies on the external stable interface. How would you feel about exposing something for a "hostname" sched could hold on to? It could be opaque just like the hostlist, but not having to re-scan the thing every time would be a massive saving, even if it takes an allocation. |
That sounds reasonable for a performant search or match. What would you think the interface would be, something like: struct hostlist_hostname *hostlist_hostname_create (const char *, size_t len);
int hostlist_find_hostname (struct hostlist *hl, struct hostlist_hostname *hostname); ? |
That's what I was originally thinking yeah, I'm playing with whether I could get away with re-using the existing hostlist and have a |
Nope, not practical, the hostname interface like you posed above is much more feasible because of the way hostrange is laid out. The main thing to keep it cheap is not to have to parse the int, re-validate, and re-strlen the thing. |
Yeah, the only other idea I had was a more generic |
This probably needs some stylistic tweaks, but it's working and I need to drop it for a bit to take care of some admin stuff. Part of me really wants to get rid of having two separate structs for the borrowed, or "stack", hostname but getting rid of the "prefix" member as a null-terminated string that's just the prefix is definitely a big rework. |
This looks good to me besides some cleanup that is probably just a result of this being a WIP. If it is helpful I can do the stylistic changes (mainly just formatting) and add some tests for the new interface. |
Yeah the formatting is definitely messed up in a few places, the main thing I want to make sure gets done (other than tests) is renaming the struct from hostname to |
Ok, I'll wait for updates and plan to work on this Monday if you don't get to it first. thanks @trws! |
87a1943
to
9204fb2
Compare
Ok, I think I got it cleaned up. Happy with this if you are @grondo. |
sigh helps if I actually push all the files, should go much better this time. |
Some trivial cleanup still needed (just formatting afaics, and sorry we don't have a valid clang format for our style 😞) Also, something weird is happening in the asan build. It still seems to be picking up a
Edit: oh, this is probably related to your comment above, nvm! |
Yeah, I still need to get the updated clang-format file merged and set to only apply where we've already applied it. It's definitely possible I missed some things though, I'm not good at catching them (part of why I always try to automate that). |
Going through this one again and I think probably the easiest thing to do here would be to squash all these commits together into one commit like Also, there's still some style issues. Finally, I'd suggest renaming either the @trws, I can take care of that if you are busy (and don't mind me rebasing and force-pushing to your branch) |
I am fine with all of that. Honestly I was hoping I could merge the stack hostname out and just have one struct, but it isn’t feasible without a lot more work. If you would I’d appreciate it, or I can but it will probably have to wait until at least Wednesday. Would actually like to know what format things I missed, they aren’t jumping out at me and that means something is missing or wrong in my mental model.
Get Outlook for iOS<https://aka.ms/o0ukef>
…________________________________
From: Mark Grondona ***@***.***>
Sent: Monday, September 9, 2024 1:51:14 PM
To: flux-framework/flux-core ***@***.***>
Cc: Scogland, Tom ***@***.***>; Mention ***@***.***>
Subject: Re: [flux-framework/flux-core] remove allocations in hostrange_find (PR #6259)
Going through this one again and I think probably the easiest thing to do here would be to squash all these commits together into one commit like libhostlist: improve hostlist_find performance by reducing allocations or similar. Then we still need to add a commit that adds unit tests for the new hostlist_hostname based functions.
Also, there's still some style issues.
Finally, I'd suggest renaming either the struct stack_hostname or the constructors for this struct hostname_stack_create() so they match.
@trws<https://urldefense.us/v3/__https://github.com/trws__;!!G2kpM7uM-TzIFchu!02WQvxuGgT_Vs-Khes2rWk1xvFSM8opWslCmviSD-prIYsJ-6XP3dk3mj4Vte4lWjNj_LYM1Ce-krkv9ZR4ieZjEk8o$>, I can take care of that if you are busy (and don't mind me rebasing and force-pushing to your branch)
—
Reply to this email directly, view it on GitHub<https://urldefense.us/v3/__https://github.com/flux-framework/flux-core/pull/6259*issuecomment-2339065089__;Iw!!G2kpM7uM-TzIFchu!02WQvxuGgT_Vs-Khes2rWk1xvFSM8opWslCmviSD-prIYsJ-6XP3dk3mj4Vte4lWjNj_LYM1Ce-krkv9ZR4ilvlAzdI$>, or unsubscribe<https://urldefense.us/v3/__https://github.com/notifications/unsubscribe-auth/AAFBFNI4NNN4HQ2EFATB6ILZVYC4FAVCNFSM6AAAAABNVJQJW6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMZZGA3DKMBYHE__;!!G2kpM7uM-TzIFchu!02WQvxuGgT_Vs-Khes2rWk1xvFSM8opWslCmviSD-prIYsJ-6XP3dk3mj4Vte4lWjNj_LYM1Ce-krkv9ZR4irsmD5Ck$>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Actually looking back I don't believe you really missed anything related to formatting. I saw some parameters broken onto the next line with one indent, but I think that was because you were just breaking a long line, which seems fine to me. |
6fc817c
to
b229fcc
Compare
Ok @trws - pushed directly here an update that:
|
Looks good to me, anything further you'd like to see here? |
Nothing more from me. Thanks! I can approve since all I made were some comment and very minor whitespace changes, and added unit tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks
struct stack_hostname *hostname_stack_create_from_hostname ( | ||
struct stack_hostname *hn, | ||
struct hostlist_hostname *hn_src); | ||
struct stack_hostname *hostname_stack_create_with_suffix ( | ||
struct stack_hostname *hn, | ||
const char *hostname, | ||
int len, | ||
int idx); | ||
struct stack_hostname *hostname_stack_copy_one_less_digit ( | ||
struct stack_hostname *dst, | ||
struct stack_hostname *src); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: formatting doesn't match current flux-core coding style, first parameter on long list goes on same line as function, the rest of the parameters are one per line, aligned with opening parens. (And so sorry we don't have a good clang format for this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that override the line length? If I join these to the open peren then they stretch to as far as 85 cols. Hence the break in this case, this is actually what clang-format does when told to do what flux core style wants but given too little space. We discussed increasing to ~88 as the break point in the clang-format PR but had never officially agreed to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were old comments I didn't know GitHub was going to post. Sorry! I think your assessment and the formatting is correct here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, I guessed that was it after I read the next couple. I'm going to clean up the clang-format file update PR in a bit, have a bit of time I can use for that this morning. Also have a much better solution to making sure it's consistent now than I used to. Hope jury duty isn't too bad btw. 🤞
src/common/libhostlist/hostname.c
Outdated
struct stack_hostname * hostname_stack_create_with_suffix (struct stack_hostname *hn, | ||
const char *hostname, unsigned long len, int idx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: style, break long parameter lists one per line. Opening function brace should go on its own line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof, that one's awful. I thought I had disabled this, thanks for catching it.
src/common/libhostlist/hostlist.h
Outdated
/* | ||
* Search hostlist hl for the first host matching hostname | ||
* and return position in list if found. | ||
* | ||
* Leaves cursor pointing to the matching host. | ||
* | ||
* Returns -1 if host is not found. | ||
*/ | ||
struct hostlist_hostname *hostlist_hostname_create (const char *); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation for this function is coped from hostlist_find()
and needs to be updated.
src/common/libhostlist/hostname.c
Outdated
static const int pow10[10] = { | ||
1, 10, 100, 1000, 10000, | ||
100000, 1000000, 10000000, 100000000, 1000000000 | ||
}; | ||
static const int pow10[10] = {1, | ||
10, | ||
100, | ||
1000, | ||
10000, | ||
100000, | ||
1000000, | ||
10000000, | ||
100000000, | ||
1000000000}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reformat probably doesn't belong in this commit.
problem: hostrange_hn_within requires an allocated hostname object, which must be re-created for every query. This is very expensive when the queries are frequent or repeated, especially in hostlist_find_host solution: add a new function hostrange_str_within that uses the exact same logic without building a newly allocated hostname object Use the existing string lengths in more places, and instead of re-parsing the suffix find the power of 10 matching the length of the string, and mod by that to chop off the extra digit in the recursive case Finally, add an interface to do all the prep work and return an opaque handle to the hostname object hostlist uses internally so a caller can re-use a pre-built object with the new hostlist_find_hostname interface. The result comes out a bit over twice as fast as directly using the allocation-free interface.
Problem: There are no tests of hostlist_find_hostname(). Add them using the existing hostlist_find() inputs.
b229fcc
to
92f327a
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6259 +/- ##
==========================================
- Coverage 83.35% 83.35% -0.01%
==========================================
Files 521 521
Lines 85723 85793 +70
==========================================
+ Hits 71453 71509 +56
- Misses 14270 14284 +14
|
This is an in-progress PR looking at making hostrange_find allocation-free. It actually does that, and also keeps track of the length of all string components such that it can use comparison functions that don't have to check for a null terminator, but when I was doing tests with this a few weeks ago, it was no faster for the flux-sched case. Maybe worth poking at to see if I did something screwy but something strange is going on.