API considerations (now & future) #3
@ctavan , thanks for getting this conversation going again. I think your questions are spot on, especially the bit about having a class-based API. The more I think about this, the more I like that idea, so I found myself taking a stab at what it might look like. Main things to notice:
To address your comments (and explain some of the above) ...
💯 agree. Hence, static factory methods.
This makes sense (kind of) for packaged modules, but if this is going to be built-in the community is probably better served by going with a less contentious practice. Hence, single, top-level export.
I think this makes sense. It allows users to work with both the binary and string forms of a UUID. E.g. in the CodePen above, you can create from either form (
With a class-based API, validation in the constructor is a no-brainer, IMHO.
No. UUID creation is generally a very fast process. It also tends to be CPU-bound rather than IO-bound, so I don't think an async API buys much. That said, there was one issue where the
I'd like to figure out how to handle the wonky timestamps in version 1 UUIDs. |
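To make the class-based idea above concrete, here is a minimal sketch of validation in the constructor combined with a static factory method. The original CodePen is not reproduced here, so the class shape, the method names (`fromString`, `toBytes`) and the error types are all assumptions made for illustration, not the API that was actually proposed.

```javascript
// Hypothetical sketch: every name below is an assumption, not the proposed API.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

class UUID {
  #bytes;

  // Validation lives in the constructor: only a 16-byte Uint8Array is accepted.
  constructor(bytes) {
    if (!(bytes instanceof Uint8Array) || bytes.length !== 16) {
      throw new TypeError('UUID requires exactly 16 bytes');
    }
    this.#bytes = Uint8Array.from(bytes); // defensive copy
  }

  // Static factory for the string form; rejects anything that is not
  // the canonical 8-4-4-4-12 hex layout.
  static fromString(text) {
    if (typeof text !== 'string' || !UUID_PATTERN.test(text)) {
      throw new TypeError('Invalid UUID string');
    }
    const hex = text.replaceAll('-', '');
    const bytes = new Uint8Array(16);
    for (let i = 0; i < 16; i++) {
      bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
    }
    return new UUID(bytes);
  }

  // Binary form, returned as a copy so callers cannot mutate internal state.
  toBytes() {
    return Uint8Array.from(this.#bytes);
  }

  // Canonical lowercase string form.
  toString() {
    const hex = Array.from(this.#bytes, (b) => b.toString(16).padStart(2, '0')).join('');
    return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`;
  }
}
```

With a shape like this, a caller can move between the string and binary forms via `UUID.fromString(text).toBytes()` and `new UUID(bytes).toString()`, and malformed input fails early with a `TypeError`.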
👋 digging myself out from a giant mountain of work related to dropping Node 6 at Google, will do my best to pull this conversation into the README we're working on. @littledan's advice was that we concentrate on fleshing things out in the README and not worry about tests or the formal API definition; we can basically start with pseudo code, I think. |
Hmm, I'm wondering about some of the aspects here. I like how the
Why not have a default version which does UUIDv4? Seems like that's what people need most of the time, unless they have a particular need for repeatability (given that we all agree that we'll require a good source of randomness). It would be nice to save people the effort/mistakes by not making them figure out which one to use.
For the native implementation, the idea would be that it's built into JS, so you don't have to worry about that. We could still make a decision based on implementation techniques for polyfills, though, if we need to.
Are users asking for this? I don't think we should provide it just because. Unless developers are really clamoring for more features or messily implementing them themselves, I'd suggest using a function-based API.
BigInt is at Stage 3, shipping in Chrome, and implementation is well underway in Firefox and Safari. I plan to propose it for Stage 4 in June. I think it's fine to depend on it. |
I think this question almost boils down to asking again whether the library should support anything other than v4 UUIDs at all.
First of all, I also assume that the heaviest use case for UUIDs is v4, most likely used as entity identifiers in databases and APIs, and I also assume that most people just work with the string representation of these UUIDs. However, I don't have any actual data that would support these assumptions apart from my own professional working experience over the past 5 years.
Following my assumptions above, I think that a class representation of v4 UUIDs is indeed of rather limited use; after all, there's not more than the version/variant information plus randomness in it. Having a class representation to me really starts making sense when working with v1 UUIDs (and likely v3/v5, although no personal experience here), where the timestamp which is included in these UUIDs is actually useful information that people want to parse and use (the namespace in v3/5 may be of similar interest).
The use case that I had back when I contributed the initial implementation of v1 UUIDs (I believe it was in 2011) was primary keys (=unique timestamps) for time series stored in a Cassandra database. While this particular use case seems to be considered bad practice by now (at least according to https://stackoverflow.com/a/17946236), it was at least for me the use case that made me implement this stuff in JavaScript.
So to approach this question I would ask again: Should this library support all versions of UUID or should we go with just v4? Do we have any data on whether users would miss v1/3/5? How could we gather such data? |
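As an illustration of why the v1 timestamp is worth parsing, here is a rough sketch of extracting it from the string form using BigInt. The function name `v1TimestampToUnixMs` is hypothetical. The layout follows RFC 4122: a 60-bit count of 100-nanosecond intervals since 1582-10-15, split across the first three groups of the UUID.

```javascript
// Number of 100-ns intervals between the Gregorian epoch used by v1 UUIDs
// (1582-10-15) and the Unix epoch (1970-01-01).
const GREGORIAN_TO_UNIX_100NS = 0x01b21dd213814000n;

// Hypothetical helper: returns the embedded v1 timestamp as Unix milliseconds.
function v1TimestampToUnixMs(uuid) {
  const [timeLow, timeMid, timeHiAndVersion] = uuid.split('-');
  if (!timeHiAndVersion || timeHiAndVersion[0] !== '1') {
    throw new TypeError('Not a version 1 UUID');
  }
  // Reassemble the 60-bit timestamp: time_hi (minus the version nibble),
  // then time_mid, then time_low.
  const ticks = BigInt('0x' + timeHiAndVersion.slice(1) + timeMid + timeLow);
  // Shift to the Unix epoch and convert 100-ns ticks to milliseconds.
  return Number((ticks - GREGORIAN_TO_UNIX_100NS) / 10000n);
}
```

The result can be passed to `new Date(...)`. Sub-millisecond precision is discarded in this sketch; a real API would have to decide how to expose it, which is part of what makes these timestamps awkward to handle.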
My suggestion is that we start with an API that has a single default export of a function (called |
I agree that v4 string uuids are the 80-90% use case. But we would be remiss to not consider how other cases (v1, v3, v5, and binary uuids) would dovetail into whatever API we start with. If we start with a default v4 string-uuid function, can we pencil out what a future version of that API that supports v1/v3/v5, parsing, and binary uuids might look like? |
BTW, I just had some fun with the BigQuery GitHub Dataset. Here's what I did:
Here's the result:
So you were spot-on with your 80%-guess @broofa 😉 I can share the gcloud project and/or queries with you if you want to dig deeper. |
@ctavan this is amazing, sorry I didn't respond earlier (ramping up at Google has been more of an avalanche of work than I expected). I'm attending TC39 in Berlin right now, and am going to float the initial work we've done on this specification with some of the delegates. |
@ctavan @broofa @littledan why don't we start with an API that looks something like:
and we can, in a separate section, point out that this could be extended on to:
My thinking is we shouldn't propose an options object out of the gate though, since it could lead to feature creep. |
We have two audiences: The naive user who just wants to get to uuid (v4) strings as quickly as possible, and the more advanced user who's going to want "more". For the first user, I agree that experience should look something like this:
I'm fine with that, as long as it doesn't interfere with providing the "more" part for the latter user. And I don't think it does. E.g. Is there any reason not to expose something like the UUID class I sketched out above as a non-default export (at some future date), thusly:
That's reasonable, right? Not saying this needs to be the future API, only that users wouldn't find this sort of incantation objectionable. Regarding the That will increase demand for the advanced API, but that's not necessarily a bad thing. |
I don't think it's a good idea to bake in the assumption that v4 will be the obvious choice for the rest of time just because it happens to be the option in most common use in the available data sets right now. It seems like import { uuidv4 } from 'std:uuid'; is not significantly harder to use and is both more explicit and more future-proof. (I am not suggesting that other algorithms be supported in the initial proposal, just that the proposal avoid picking one as the default forever.) |
@bakkot Did you see the text in the readme and supporting documents explaining the default? Did you see flaws with that reasoning? Even if we support additional UUID types in the future, this seems like a strong default to recommend. |
@littledan I saw the analysis directory talking about general background and usage statistics, and this readme entry which talks about usage statistics. If there's other docs, I didn't see them. Those seem like they make a compelling case for providing v4 and no other things in the initial version of this proposal. They do not seem like they make a compelling case for assuming uuidv4 will always be the correct default (nor do they even appear to try to make that case), especially since explicitly naming the version provided does not seem to me to add much overhead. |
I think the evidence there gives good reasoning to have v4 be an opinionated default: the uses of v1 tended to be in error, which is probably encouraged by the API shape of the npm uuid module. I agree that we shouldn't rule out these extensions for the future, though. However, I'm fine to be flexible on this aspect of the API shape. |
@bakkot what my analysis of open source projects has shown and what I was trying to summarize in the faq entry is that in fact among the most popular open source projects that were using v1 UUIDs there was only one single project that had an inevitable reason to do so, see this section of the analysis. In all other cases that I investigated it turned out that developers had chosen Now we could of course argue that it is not our duty to ensure that developers read the UUID spec and choose the right algorithm for their purpose. However evidence from the open source project analysis shows that this simply does not happen in practice and that by nudging people into using |
@ctavan I am convinced that we should not make |
@bakkot I agree with your point and I'm confident that we'll be able to further improve our reasoning in the README. In this FAQ entry we tried to argue that the Assuming that this standard library will be restricted to UUIDs as defined per RFC 4122 and will not be extended to support things like flake-id, nanoid, cuid or ulid, and given the arguments about the irreparable flaws of Or would you suggest opening up the discussion to keep this API open to extensions for unique identifiers even beyond RFC 4122 (which would be counter to our current assumptions but maybe worth discussing)? |
It seems like folks' points are being missed. It sounds to me like @bakkot is saying that the very existence of a version FOUR means that there will inevitably be a version FIVE, and that it will be strongly recommended over v4 at that time. That suggests that even if v4 is the only good choice right now, it might be better to have no default at all rather than risking future migrations from N to N + 1 being harder. |
@ljharb So far, I was indeed not reading @bakkot's argument the way you rephrased it. I was under the assumption that RFC 4122 won't change or be extended, but I have to admit that I'm not very familiar with the lifecycle of IETF RFCs and whether that is something to be expected. Apart from that, I believe that your answer also shows why there is so much confusion around UUID version numbering and what these "version" numbers actually mean.
In fact there already are If you take a look back at my initial post in this thread you can see that my original assumption was that the API should indeed be symmetric in the different UUID algorithms. Further discussion and the analysis of Open Source repositories convinced me that the UUID RFC is apparently not widely understood and that enough people tend to not dive deep enough into it to pick the right algorithm. This has led us to the idea of promoting |
I'd hope that, if some later RFC replaces or extends that one with additional subtypes of the same variant of UUIDs defined in 4122 (which there is explicitly room to do), then we would consider adding those new subtypes to this library. But yes, I agree it makes sense to scope it to just that RFC (and any future extensions), and that we do not necessarily need to provide all of the variants in the RFC (in particular, I agree that we really should not expose v1).
I think it is a reasonable default now, but the problem with designing things for standard libraries for languages like JS is that they can never be changed. If, 15 years from now, there is a v7 which is considered to be the best practice, it would be unfortunate if I would like to avoid that situation. One way to avoid it is to say that you have to write
RFCs being updated or obsoleted is a normal thing to happen. (See, for example, the TLS 1.3 RFC.) I don't know if there's a particular reason to expect it to happen in this case (other than the fact that the RFC explicitly reserves four bits for describing the subtype, despite only needing three), but I also don't know of a particular reason to expect it not to happen (though this may, of course, just be ignorance on my part). |
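For readers unfamiliar with where those four bits live, the following sketch reads the version (subtype) nibble and the variant bits out of a UUID string. The function name `describeUuid` and the shape of its return value are assumptions chosen for illustration.

```javascript
// Hypothetical helper: reports the version nibble and whether the variant
// bits match the layout defined in RFC 4122.
function describeUuid(uuid) {
  const hex = uuid.replaceAll('-', '');
  if (!/^[0-9a-f]{32}$/i.test(hex)) {
    throw new TypeError('Invalid UUID string');
  }
  // The version occupies the high four bits of byte 6 (hex digit 12).
  const version = parseInt(hex[12], 16);
  // The RFC 4122 variant is signalled by the top two bits of byte 8 being 10.
  const variantBits = parseInt(hex[16], 16) >> 2;
  return { version, isRfc4122Variant: variantBits === 0b10 };
}
```

Four bits allow sixteen version values, so the field has room for versions beyond the five that RFC 4122 defines.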
Sounds like a good argument for versioning in the standard library. tc39/proposal-built-in-modules#17 |
That issue to me is a good argument to avoid versioning like the plague, and instead strive to design APIs so they never need breaking changes :-) |
I've been mulling this issue over:
I'm agreeing with @bakkot's summary of the problem, but am hoping we potentially figure out a better API surface. |
My proposed solution to this problem is to make this library expose only a function for generating v4 uuids, and have that function be named something with For example, import { uuidv4 } from 'std:uuid'; would accomplish this (assuming This makes |
That doesn't make it the default, that just makes it the only option initially. As soon as there is support for other versions (v1/v3/v5, for example) then it's no longer the default, it's just one of many, and we're back to the same problem we have currently where people use v1 because it's... well... "1". For the record, I'm not concerned about a newer/better version coming along. It's been 14 years since 4122 was finalized and I'm not aware of any interest or activity going into developing a new version. IMHO, we're at least 10 years out from a compelling alternative. (I know, I know... "famous last words"). If/when something does emerge it's debatable whether it would even fall under the purview of 4122. My money would be on it featuring more bits, making it unsuitable as an extension to the current RFC. |
Right, so don't do that. |
Sometimes you're looking for a quick and easy random id that you don't care too much about, where UUIDv4 just happens to be a good answer. A single function API like And sometimes you specifically need UUIDv4. You've got some system that specifies it so you go looking for it. As someone who tends to know/research which UUID type I want, not having v4 in the name would leave me wondering whether I'm getting the version I need. It would also feel weird if I needed something other than v4 and nothing from the built-in UUID implementation could be leveraged. I realize that there's almost nothing reusable between the implementations, it's more of an itch than a practical implication. Maybe I personally like the idea of a UUID class with all the trimmings, but I have to admit that I've never had a use for more than to/from bytes (storage savings in bulk quantities). If other flavours of UUID get added later they could be added along with a class that could include a v4 subclass (or whatever) for completeness. |
@bakkot @broofa @ctavan, I like @rmg's suggestion that we default to version 4, but call the export @waldemarhorwat and another peer have raised the point to me that, by the IETF definition of a UUID, a UUID cannot have more entropy than version 4, and still be considered a UUID (it uses the minimal 6 bits for meta-information, the rest is entropy). I think that, as long as we draw attention to the fact that this is an API for a random UUID, there's no danger that a UUID will come along with more randomness. If at some point a better specification for creating identifiers emerges, this would not be an IETF UUID, I think it would be an outside context problem. |
I would be fine with naming the v4 export I also like the argument, that BTW while looking at other languages I found a few interesting things:
|
To be clear, is the suggestion that the export statement would be as follows:
export default function randomUUID() {...}
... such that:
import uuid from 'std:uuid'; // uuid === randomUUID above
import {randomUUID, ...} from 'std:uuid'; // randomUUID === randomUUID above |
@broofa, yes your summary agrees with what I was thinking:
If we go the route of the global namespace, something like:
randomUUID() // as a super-simple interface, that provides version 4 UUIDs.
// and potentially something like this, as an advanced interface:
const uuid = new UUID({version});
uuid();
If we ended up going the module route, something like:
export class UUID {...}
export default function randomUUID() {...}
// such that:
import uuid from 'std:uuid'; // uuid === randomUUID above
import {randomUUID, UUID} from 'std:uuid'; // randomUUID === randomUUID above
@rmg ☝️ does this fit what you were thinking? I think we could then flesh out the |
@bcoe @broofa yes to both as interpretations of my suggestion, but over time.
export default function randomUUID() {...}
// TODO: export class UUID {...}
I think the The only things I've actually done with UUIDs that isn't directly satisfied by a |
Discussion in #3 yielded the following conclusions as of 2019-11-25:
- We only want to support v4 UUIDs initially.
- Exporting them as uuid() might cause confusion should we ever want to support other algorithms in the future.
- Using "v4" in the name seems to cause more confusion than clarification since the whole "Version" terminology in the RFC is pretty confusing per se.
- Using the term "random" to describe v4 UUIDs yielded generally positive feedback and other languages/libraries seem to use it as well.
- There's general agreement that we should not support algorithms other than v4 random UUIDs for now.
Discussion in #3 showed that there is general agreement for only supporting the v4 algorithm. However, there are concerns that, if we design an API that promotes one algorithm as the default, this assumption might not hold in the future. In order to provide a more future-proof API we may therefore only support certain algorithms, but will likely not treat any of the supported algorithms in a special way.
Having switched to ULIDs from UUIDs really helped during development because of ability to quickly sort data by id. |
In order to kick off some discussion on the API I wanted to start collecting some thoughts:
uuid()
generate a vX UUID by default).uuid
npm module it made sense to allow deep import of the different uuid version methods, e.g. to allow reducing bundle size when used in the browser and only a certain type of UUIDs was needed. Do we already have an idea of what the technical implementation of standard modules will look like?
What am I missing?