Incorrect niche optimization explanation for the linked list in the box section #1820
Comments
Okay, I managed to increase my confidence level to 99% by actually creating a linked list that is compatible with niche optimization. The trick is that one has to put the Data type in the book: My data type: In the book, between the Full source code for my linked list:
Output:
The last line shows that the optimization triggers, and Nil is simply represented by a 64-bit all-zero pointer. Also, note that in my representation every item of the linked list is 64 bits + 32 bits (for the data), which is only 96 bits total, while in the book's linked list every item is 128 bits. The only disadvantage of my solution is that it's a list with a mandatory "head item", so there is ZERO data on the stack, and even to read the first data item one needs to jump to the heap. So, we have multiple options regarding the book:
My order of preference:
|
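For readers following along without the stripped code, here is a minimal sketch of the general idea: a list whose links are `Option<Box<Node>>`, so that the empty link can be stored as a null pointer. The `Node` type and the helpers `sum` and `link_is_pointer_sized` are hypothetical names chosen for illustration and are not the data type from the comment above. The size equality checked here is documented for `Option<Box<T>>` with a sized `T`.

```rust
use std::mem::size_of;

// Hypothetical node type for illustration only; it is an assumption,
// not the type posted in the comment above.
struct Node {
    data: i32,
    next: Option<Box<Node>>,
}

/// Sums the data of all nodes by following the `next` links.
fn sum(mut link: &Option<Box<Node>>) -> i32 {
    let mut total = 0;
    while let Some(node) = link {
        total += node.data;
        link = &node.next;
    }
    total
}

/// True when `Option<Box<Node>>` takes no more space than `Box<Node>`,
/// i.e. `None` needs no separate discriminant.
fn link_is_pointer_sized() -> bool {
    size_of::<Option<Box<Node>>>() == size_of::<Box<Node>>()
}

fn main() {
    let list = Some(Box::new(Node {
        data: 1,
        next: Some(Box::new(Node { data: 2, next: None })),
    }));
    println!("sum = {}", sum(&list));
    println!("link is pointer-sized: {}", link_is_pointer_sized());
}
```

In this sketch only the link lives on the stack, and every data item sits behind a heap pointer, which matches the "head item" trade-off described above.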
Good catch. I think option 1 is undesirable - I also think that the current content seems incorrect - AFAIU Option 3 SGTM. (We could spend some time discussing whether option 3 is the best long-term way forward here, but it does seem like a desirable improvement over the status quo, so IMHO we should probably land this first and then just consider discussing other options as a follow-up. For example |
This is a complex topic! It's currently buried under "more to explore", which is a nice way of saying "maybe don't talk about this in the class". I agree that we should get it right, though. I suspect a lot of students will be interested in a slightly larger class of topics, which includes this and #1817, #1819: important implementation details of Rust. I just don't think that the String and Box slides are the right place to get into that. There's a bit of time on the last day, so perhaps we could add a segment for these, following the unsafe section? Something like
The exercise should probably be something quick, similar to the analysis earlier in this issue: give students some unsafe code to see what's going on in the data structure, and then challenge them to experiment and explain what they see somehow. |
Also worth noting that the original example

    enum List<T> {
        Element(T, Box<List<T>>),
        Nil,
    }

is technically an example of niche-optimization because the following types are all the same size (according to

    List<T>
    Option<(T, Box<List<T>>)>
    (T, Box<List<T>>)

This means that values of type

In general though, I agree that the reasoning and diagram in the book aren't correct. Niche value optimization should probably be introduced as just "
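The size claim above can be checked with a small program. This is a sketch under assumptions: it picks `T = i32`, the helper name `sizes` is made up for illustration, and the result reflects current rustc behaviour rather than a language guarantee for user-defined enums. Equal sizes alone also do not reveal which bit pattern encodes `Nil`.

```rust
use std::mem::size_of;

// The enum from the book, repeated so the sketch is self-contained.
#[allow(dead_code)]
enum List<T> {
    Element(T, Box<List<T>>),
    Nil,
}

/// Sizes in bytes of the three types being compared, with `T = i32`
/// (the choice of `i32` is an assumption for this sketch).
fn sizes() -> [usize; 3] {
    [
        size_of::<List<i32>>(),
        size_of::<Option<(i32, Box<List<i32>>)>>(),
        size_of::<(i32, Box<List<i32>>)>(),
    ]
}

fn main() {
    let [list, option, tuple] = sizes();
    println!("List<i32>:                      {list} bytes");
    println!("Option<(i32, Box<List<i32>>)>:  {option} bytes");
    println!("(i32, Box<List<i32>>):          {tuple} bytes");
}
```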
I'm in favor of this approach |
Wow, thank you @djmitche @anforowicz @QnnOkabayashi for the great comments and for taking the time to send me detailed replies to my questions, really appreciated! My findings today:
With these findings, I kinda changed my opinion and now I prefer option 2, let me motivate why. I proposed this new datatype: In your comments you proposed this: At the stage where we are in the book (day 3 only), the best fix in my opinion is:
If this looks like a plan to you all, I will write up a draft PR by the end of next week (hopefully a lot earlier), and then we will have something concrete to discuss. |
Hi all, Wow, thanks a lot for looking deeply into this!
My goals with the slide were:
I think your plan above sounds great: it keeps the non-trivial example but fixes. It would be great to add a test for this: a small Rust program which makes a few assertions with As for the diagram, it's of course very "symbolic". I've mostly drawn them by hand, but https://asciiflow.com/ can be very helpful to get a quick prototype. There is also https://ivanceras.github.io/svgbob-editor/ to visualize things in the browser. I hope that helps!
@djmitche, I love that you look at the big picture here! I guess putting together such a segment is something we could do after fixing the slide here? Assuming there is material and interest for it, of course. |
Yeah, I think a dedicated segment is a good follow-up. Also, we've taken a "circular" approach throughout the course, where concepts come up repeatedly. The first time, they're mentioned without much detail, and then explored in more detail later. So with the new segment perhaps we could move the linked-list example to the new segment and just leave a speaker note here mentioning the niche optimization using |
Thinking on this more, I think we should keep the recursive data type example and the niche optimization example independent, but possibly still on the same page. The whole point of the linked list example is to show how |
I think that makes a lot of sense! It also gives an opportunity to introduce the niche optimization earlier, in the Option slide (in fact, it's already mentioned there -- maybe another sentence or two there, or a link, would be useful!) So maybe the right approach right now is this:
If that last bit is in a PR on its own, then we can hold onto it for a bit until the other slides in that segment are also written. |
@QnnOkabayashi would you be able/willing to do any or all of those steps? |
I usually teach niche optimization when we talk about discriminant representation on the Enum slide. That slide might be worth splitting into two, because the speaker notes include a whole second example that digresses a bit from enums as a way to define data to talking about implementation details of enums. But people do seem happy with getting this information at that point in the class, as enums with payloads are novel and incite curiosity about how everything fits together in memory after leaving the simple C fields-in-sequence-with-padding paradigm. |
OK, that's two mentions of the niche optimization in speaker notes, with a fuller description planned in an "Implementation Details" segment. So, let's not worry about niche optimization in the smart-pointers segment. I'll make a PR to do that bit now, and a new issue to build an "Implementation Details" segment. |
#1946 did some of this work already, so the PR is pretty simple. |
Niche optimization is currently mentioned in three places:
- Enums (User-Defined Types, Day 1 Afternoon)
- Option (Standard Library Types, Day 2 Afternoon)
- Box (Smart Pointers, Day 3 Morning)

This is a tricky thing to get right, and it was just in the speaker notes in each place. google#1820 will introduce a fuller explanation.
Sorry for being absent on this. I appreciate all the work that you guys put into this, and I'm definitely looking forward to checking out the new setup once I have time to learn a bit of Rust again.
Thanks again for all the work!
…On 30 September 2024 20:26:03 GMT+01:00, "Dustin J. Mitchell" ***@***.***> wrote:
#1946 did some of this work already, so the PR is pretty simple.
|
Niche optimization is currently mentioned in three places:
- Enums (User-Defined Types, Day 1 Afternoon)
- Option (Standard Library Types, Day 2 Afternoon)
- Box (Smart Pointers, Day 3 Morning)

This is a tricky thing to get right, and it was just in the speaker notes in each place. #1820 will introduce a fuller explanation. Fixes #1820.
https://google.github.io/comprehensive-rust/smart-pointers/box.html#:~:text=A%20Box%20cannot%20be%20empty%2C%20so%20the%20pointer%20is%20always%20valid%20and%20non%2Dnull.%20This%20allows%20the%20compiler%20to%20optimize%20the%20memory%20layout
I think this is not correct.
I wrote this debugging code:
What I'm trying to do here is to get the representation out of Rust in all 3 steps of the linked list, and this is the output I have:
What we see here (on all 3 architectures) is that the first and second list elements have an enum discriminant of (0x0), while the third one has a discriminant of (0x1).
The discriminants are at the end of the printout, because all 3 architectures are little endian.
And I also have a theoretical argument for why the picture provided for the niche optimization is not possible: linked lists are famous for being a data structure where you can return a reference to the middle, and users can continue to walk the linked list from there, or maybe even share data from there, etc. Imagine how we would implement a find() method for this linked list. It would probably return List<T>, the first item that matched the search criteria, and the caller can then even see the items behind the found item. Now, if no results are found, we have to return Nil, which is at the end of the list anyway, so we can just return that Nil. But if the niche optimization really happened the way the slide explains, then that final closing Nil element would not be there.

I'm reporting an issue here instead of a pull request because I'm only 95% sure. I'm happy to do the work, but wanted to hear confirmation first from someone more experienced. @djmitche Do you have an opinion about this too?
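The find() argument can be made concrete with a short sketch. It assumes that find() returns a shared reference (`&List<T>`) rather than an owned list, and that matching means equality with a target value; neither detail is specified above, so both are illustrative choices.

```rust
// The enum from the book; the derives are added so results can be compared.
#[derive(Debug, PartialEq)]
enum List<T> {
    Element(T, Box<List<T>>),
    Nil,
}

impl<T: PartialEq> List<T> {
    /// Returns the sub-list starting at the first element equal to `target`,
    /// or a reference to the trailing `Nil` when nothing matches.
    /// (Hypothetical method, sketched only to illustrate the argument.)
    fn find(&self, target: &T) -> &List<T> {
        let mut current = self;
        while let List::Element(value, rest) = current {
            if value == target {
                return current;
            }
            // Step from the Box to the list it owns.
            current = &**rest;
        }
        // Reached only when `current` is the closing `Nil`.
        current
    }
}

fn main() {
    let list = List::Element(1, Box::new(List::Element(2, Box::new(List::Nil))));
    println!("find(2) = {:?}", list.find(&2));
    println!("find(7) = {:?}", list.find(&7));
}
```

In the not-found case the returned reference has to point at a real `Nil` value stored somewhere, which is the point of the argument: the closing `Nil` must exist in memory as a `List<T>` of its own.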