-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor libcollections as part of collection reform #18519
Conversation
r? @aturon |
Still waiting on full results from make-check, but there's some stuff that needs to be discussed about this patch (module renaming) anyway. Might as well parallelize that. Also note: I couldn't split up bitv properly, as BitvSet is too coupled to Bitv's internal implementation details. I settled for renaming bitv.rs to bit (so that there could be a different bitv module). |
CC #18424 |
pub use self::set::DifferenceItems; | ||
pub use self::set::UnionItems; | ||
pub use self::set::SymDifferenceItems; | ||
pub use self::set::IntersectionItems; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting design, any particular reason you wanted to remove all the reexports here?
I kinda like how today when I'm looking for a hash set or a hash map I don't have to think about what module to look inside (there's only one). That being said I don't look much farther than Thanks for this! |
@alexcrichton: @aturon can probably give a better rationale, but here's a few points off the top of my head:
|
One observation: in general the module names should be snake cased versions of the type names. That is, whenever you have an embedded capital leader in the type name, it should map to an underscore in the module name. Examples: I realize that some of these are a bit odd/painful, but naming consistency is important. In a lot of these cases, you'll know the type name by heart, and being able to translate directly to the module name is helpful. (We could potentially consider a different convention, but snake case is a long-standing one and we'd need to actually spell out the rules, go through an RFC, etc.) |
Yes, I brought this up on IRC and the overwhelming preference was alllowercase. It was noted that the official recommendation of PEP8 is:
Which I agree with. Underscores are annoying/noisy in modules, especially with things like |
@gankro But that's Python's guidelines, not ours! We have a strong convention toward snake case for "value-level" items like modules and functions. Clearly the proposal here isn't to throw that out in general, or even just for modules. But how would you decide when to use which convention? The conventions lose much of their benefit if there's no consistent way to follow them. (Things like "if it improves readability" immediately mean I can no longer mechanically come up with the right name; it's a subjective choice.) (Not trying to put this all on you, just thinking out loud about the issue.) What would you think about having the type names |
@gankro More broadly, I wouldn't take much away from the IRC discussion (which was basically like: "yeah those seem unfortunate".) We're trying to set up official conventions to help nail down these issues. The basic naming conventions are basically just de facto standards based on long-running practice, but as we're seeing they're not 100% followed. Maybe this suggests that even these basic conventions need to go through a round of RFC, to resolve the kind of issue you're raising. In any case, we certainly don't need to hold up the PR for this, since these problems already existed in the collections. But I would like to have an official resolution. |
@alexcrichton I think @gankro laid out several of the important benefits of this design, but here are a few others:
Basically, I think the fact that hash maps and sets lived in the same module is something you had to learn, even if it now seems convenient, and it only applies in certain cases (where there's both a set and a map implemented in the same way). Consistently splitting up the modules means there's less to learn/remember, and I think it actually improves ergonomics (as @gankro mentioned, making globs more useful). |
@gankro This looks good to me; let's resolve the module naming issue separately. I want to give @alexcrichton a chance to weigh in on the design rationale before I r+. |
I would just push for alwayslowercase on module/file names, and maybe snake case for crates (since that seems more consistent). The snake_case convention on module names doesn't strike me as very consistently implemented. There's these collection modules like smallintmap (which I didn't change, it was always like that), as well as stdio and bitflags. Most modules are a single word, so it's hard to get a lot of data. Module names strike me as a place where writeability is much more important than readability, as they're going to be sprinkled around a lot as namespaces, but their actual meaning is unimportant to the reader. I'm not a huge fan of Btree of Dlist, as they're supposed to be shortenings of the proper names B-Tree and DoublyLinkedList. Although I've typo'd those enough times out of uncertainty to not have a strong resolve. As a not-very-good data point, settling on this convention has less breakage for the collections, as alllowercase is the overwhelming convention there. |
@gankro There are a couple of problems with just going for alwayslowercase. For one, as even its name suggests, it is often not very readable -- hence the PEP recommendation to sprinkle underscores when it "helps readability". Another downside is it means proliferation of more, different naming styles. Why is snake case OK when typing function and method names (which are more commonly multiword) but not modules? You're right that it's hard to get good data here, since most modules are single words. But snake case is definitely followed to some degree throughout (e.g. What about a convention that with snake case you never do |
@aturon @alexcrichton I've pushed a second commit with the suggested changes. |
@gankro Awesome, I feel like that's a clear improvement and am glad to see the name |
ick, the change I made to reference.md is nonsensical. I think the whole example is nonsensical now since hashset and hashmap are separate, but I don't know what to replace it with. Never seen the |
d571304
to
dda3a34
Compare
All squashed. I was able to mostly verify the patch, but I've got a local corruption that's making a few run-pass consistently fail. Everything after that is untested. Need to figure that out, somehow. |
* Moves multi-collection files into their own directory, and splits them into seperate files * Changes exports so that each collection has its own module * Adds underscores to public modules and filenames to match standard naming conventions (that is, treemap::{TreeMap, TreeSet} => tree_map::TreeMap, tree_set::TreeSet) * Renames PriorityQueue to BinaryHeap * Renames SmallIntMap to VecMap * Miscellanious fallout fixes [breaking-change]
rebased for merge conflicts |
RFC on the snake case convention we settled on: rust-lang/rfcs#430 |
Confirmed builds/checks on my lab machine. |
fix: Fix related documents diagnostics
(that is, treemap::{TreeMap, TreeSet} => tree_map::TreeMap, tree_set::TreeSet)
[breaking-change]