Support JSON serialisation of BigInt values #24
Do we want that behavior? Maybe if we return |
Being able to store 64-bit integers in JSON would be nice: https://dev.twitter.com/overview/api/twitter-ids-json-and-snowflake Not sure how to best do that, though. One option is to allow Integer literals in JSON. |
@rwaldron explained to me a while ago that we can't just do that. It'd break the JSON-web, in the sense that if one endpoint starts sending things that are outside the normal JSON grammar that the other endpoint doesn't expect, that could cause things to not parse. For this reason, the Integer proposal doesn't currently add to JSON. A user has to pass in their own serializer function, etc, to get Integer support. |
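That opt-in approach, where the caller supplies the serialization, can be sketched with a replacer function. The decimal-string encoding used here is just one possible convention for illustration, not anything specified:

```js
// Sketch: opting in to BigInt serialization via a replacer function.
// Encoding BigInts as decimal strings is an assumption, not spec behavior.
const payload = { id: 2n ** 64n, name: 'example' };
const json = JSON.stringify(payload, (key, value) =>
  typeof value === 'bigint' ? value.toString() : value
);
// json === '{"id":"18446744073709551616","name":"example"}'
```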
I would prefer |
Note that […]. Also, the coercion to Number doesn’t seem implicit to me, at least no more than in the current situation. I mean, doing […] |
It would really be great if we could support larger values in JSON, but I don't have a full understanding of compatibility risks. Let's see if we can get input from a broader set of experts here. |
But since the spec doesn't limit the length of a number representation in JSON, that is actually the "normal JSON grammar". And as @rauschma mentioned, there are already APIs that use high-precision integers in their JSON responses, and @claudepache reminded us that common parsing implementations just lose precision. Any JSON-parsing method that breaks on them is simply a bad (or incomplete/limited) implementation. As for existing APIs, it's not as if Integers just "happen", like a feature that stops working because its support has been removed: to use them, a deliberate action in the code has to be taken. Numbers won't get coerced to Integers, so existing APIs won't break anyway.
If you want to make the problem of using Integers in JSON evident, I'd prefer if it could throw a […].

On the other hand, I understand your approach here:
That's completely reasonable. Can we say that further study on the impact of JSON support has to be done, rather than cut everything short with |
Yes, that's what this bug is for. The currently proposed semantics are just a starting point. |
Reasonably sure, yes; there are already other values which likewise disappear from serialization. For that to break an endpoint, you'd have to start serializing a value of a type which did not previously exist, and be relying on it appearing in the serialization. But it's already the case that not all values appear in serialization.
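For concreteness, this is the existing behavior being referred to, where certain value types already disappear from serialization:

```js
// Symbol- and function-valued properties are already dropped silently:
const out = JSON.stringify({ a: Symbol('x'), b: () => {}, c: 1 });
// out === '{"c":1}'

// And as a top-level value, an unserializable type yields undefined:
const top = JSON.stringify(Symbol('x'));
// top === undefined
```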
We already did that with symbols. The fundamental problem is that JSON is a protocol which exists outside of JS, which means we can't extend it to new types. In this particular case, we might be able to get around it using the fact that the JSON spec allows arbitrary-width integers, but it's not something we can avoid in general. Currently we have an invariant (unless […]).

Incidentally, "the spec" and "the normal JSON grammar" are kind of ambiguous. RFC 7159, which is the thing I assume we mean, says: "This specification allows implementations to set limits on the range and precision of numbers accepted." It further goes on to say that IEEE doubles are probably good enough. Of course, parsing JSON in practice is already pretty crazy, without broad agreement on corner cases. |
FWIW in TC39-land, we might be talking about ECMA 404. This document does not describe the interpretation of Numbers. |
This is actually exactly where I landed last week when I was first @-mentioned here, so consider this a "second" in support of @littledan's call for more information.
I'd like to piece together a bit of historical information here, as it will be easier to discuss with everything present:
Anyway, don't take any of that as argument or evidence in support of any particular point, I just wanted to make sure we all had all of the same information and historical context. |
I remember Brendan Eich and other people at TC39 suggesting the max-min strategy in different proposals. @rwaldron just mentioned several points to change if we want to simply start writing changes to support a new type in JSON. This is only the start. We also need to identify what a change to the JSON grammar would affect, and that's way beyond what we usually call web reality. It's not limited to browsers, but anything that parses JSON, including other languages: Perl, Python, etc. We'd need to call on everyone to check for compatibility and to set some expectations for implementations.

@littledan my suggestion is to roll with casting Integers to null (as mentioned before), and then we can expect the work on JSON in a separate proposal, which might be bigger than this one. Remember the max-min: cap the proposal to the minimum necessary for us to land this (that's my own interpretation). Further extensions to Integer support should come separately.

If anyone really wants to pursue this JSON support, I suggest working on a proposal that addresses all of the areas where we will need this change. We'll need tests, we'll need impact research, and more. Nice-to-have features might require a lot of extra work, which I'd rather avoid for now while we don't even have the basic new type landed as a feature yet. |
I don't think there's anything wrong with omitting a given type of value from JSON, conceptually. However, I think it would be very unfortunate, unintuitive, and confusing if Integers were silently omitted from JSON. Symbols, functions, regexes, and dates don't have an intuitive or reliable non-string serialization format, so there's no conceptual conflict with omitting them. Integers, however, absolutely do: you just write out the digits. I've often seen non-JS runtimes produce integer values in JSON that are larger than JS itself can support (64-bit tweet IDs, for example); they worked in Java and Ruby, but broke when they landed in JS due to truncation. This means the JSON ecosystem already works fine with numbers that JS itself can't truly support. I think this JSON work should be, if not completed, at least relatively mapped out prior to Integers being introduced. |
@ljharb I understand and agree we should have compatibility with JSON. My suggestion is to land this proposal as-is first, and then work on a separate proposal dedicated to JSON support. |
I think it could be dangerous to ship the first part without understanding the shape of the second part. |
@ljharb From the comments in this thread, it seems like the upgrade to Integers in JSON will be a heavy, breaking change. It might not be possible to migrate the ecosystem to support it at all. I'm not sure what sort of planning we can do ahead of time to mitigate these issues. |
@littledan fail fast is the general guidance in such situations, I guess? |
@littledan right, that's why i don't think it's appropriate to defer the JSON question until "later" - i think a definitive answer is needed now. If integers in JSON just stored a number representation of it - even if JS is unable to parse it accurately - I think the ecosystem would handle it just fine. If you wanted your big integers to be precise, you'd simply need a |
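One way to do that precise handling, sketched with JSON.parse's reviver (the `id` field name and the string encoding are assumptions for illustration):

```js
// Revive a known field from its string form into a BigInt; which fields
// to revive is application-specific knowledge, assumed to be 'id' here.
const parsed = JSON.parse('{"id":"9007199254740993"}', (key, value) =>
  key === 'id' ? BigInt(value) : value
);
// parsed.id === 9007199254740993n
```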
Creating a situation where re-encoding a JSON document is possibly lossy seems like a recipe for disaster IMO. JSON should support integers with different syntax I think (though I make no judgement on how likely this is to work). |
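The lossy round-trip being described can already be demonstrated today with plain Numbers:

```js
// JSON.parse coerces numerals to IEEE doubles, so an integer just past
// 2**53 silently loses precision, and re-encoding bakes that loss in.
const roundTripped = JSON.stringify(JSON.parse('{"id":9007199254740993}'));
// 9007199254740993 is 2**53 + 1; it rounds to 9007199254740992, so:
// roundTripped === '{"id":9007199254740992}'
```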
@bterlson Good point. Throwing here breaks new ground, but it seems justified for future-proofing. I'll make a patch to do this. |
@bterlson that's already always been the case with JSON - JSON can have duplicate keys, for example, and parsing + restringifying it will collapse them. |
@ljharb While that's true (and more relevantly JSON can already contain arbitrarily large integers), we currently do have that |
I think the two acceptable options here are:
I don't think we should do something which loses precision of an Integer over the round-tripping that @bakkot mentions. The whole point of this proposal is to provide precision. Losing precision is somehow worse than having a duplicate key go missing: it's subtly wrong, so the bug is less likely to be caught in simple testing. |
I'm starting to really like the concept of allowing conversion (and serialization) below MAX_SAFE_INTEGER, and throwing/omitting otherwise. By far the common case will be to use safe integers, and if they can't easily convert to Numbers and send them over the wire via JSON, then i think the majority of users will never bother using this type. |
The choice should dovetail with other choices. If Integers are going to be "throwy" for things like possibly lossy math operations, then throwy makes sense here too. I don't mind the "do it if you can without losing precision, otherwise throw" approach, though I can imagine this causing service failures when some application generates an ID over MAX_SAFE_INTEGER after many months of operation :-P FWIW, ignoring doesn't make sense to me in this case. Symbols are clearly not representable in JSON, and ignoring them is what you probably want to do all the time anyway. Integers seem less likely to be so. |
That's a major concern with serializing only when ≤ 2^53. People will try it with small Integers and get no indication that it will fail for larger ones. I'd expect that to be a major source of bugs. |
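The "serialize when within the safe range, otherwise throw" option under discussion could be sketched as a replacer. The function name and the choice of RangeError are illustrative, not proposed semantics:

```js
// Convert BigInts that fit in the safe-integer range; throw otherwise.
function safeIntegerReplacer(key, value) {
  if (typeof value === 'bigint') {
    const max = BigInt(Number.MAX_SAFE_INTEGER);
    if (value > max || value < -max) {
      throw new RangeError('BigInt outside the safe integer range');
    }
    return Number(value);
  }
  return value;
}

const ok = JSON.stringify({ n: 42n }, safeIntegerReplacer);
// ok === '{"n":42}'
// JSON.stringify({ n: 2n ** 64n }, safeIntegerReplacer) throws instead of
// silently succeeding, which is exactly the failure mode described above.
```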
I think it's important to note that any implementation supporting integers would in theory (if they serialized to a numeric integer representation, even for large ints) also support parsing them, so the invariant above would still hold. The difficulty would only be that older unpolyfilled json implementations couldn't parse large ints without losing precision - which is what already happens if you get an int64 from a non-js json source. |
Modifying JSON.parse would be a backwards incompatible change: |
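One concrete reason it would be incompatible: existing code freely mixes parsed values with Number arithmetic, which would break if integer numerals started parsing to a different type:

```js
// Today every JSON numeral parses to a Number, so this just works:
const price = JSON.parse('{"cents": 199}').cents / 100;
// price === 1.99
// If integer numerals instead parsed to BigInt, the division above would
// throw a TypeError, because BigInt and Number cannot be mixed implicitly.
```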
For those who came here seeking to serialize BigInt values:

```js
// Does JSON.stringify, with support for BigInt (irreversible)
function toJson(data) {
    if (data !== undefined) {
        return JSON.stringify(data, (_, v) => typeof v === 'bigint' ? `${v}n` : v)
            .replace(/"(-?\d+)n"/g, (_, a) => a);
    }
}
```

It formats each BigInt as a quoted decimal string with an `n` suffix, and then strips the quotes and the suffix from the resulting JSON. Two things to keep in mind:
But on the whole, this is quite usable in all cases where the reverse operation is not required. For example, within database code - generating JSON with |
FYI there is a trial to make it (partly) reversible: https://github.com/tc39/proposal-json-parse-with-source |
@saschanaz Thanks! In my case I didn't need it to be reversible, so I came up with the simplest solution that I've shared here ;) |
the above is not generally applicable, because it's common for json-data to have random string ids that have a small chance of looking like a bigint:

```js
var jsonData = {
    // a random id could potentially have a bigint signature,
    // e.g. "id": "123456789012n"
    "id": Math.random().toString(16).slice(2),
    "bigint": "1234n"
}
```

there are honestly surprisingly few javascript use-cases where you need to revive bigint values from json. like temporals, bigints are mostly meant for [external] database-logic (e.g. wasm-sqlite3). javascript's role is mainly i/o and baton-passing stringified temporals/bigints/etc. between databases, rather than doing any of that business-logic itself. |
although i agree with most of your point. JavaScript’s role is not mainly i/o, or baton-passing; it absolutely includes business logic - I’m not sure how many dozens of times this has to be repeated before you internalize it. |
wasm-sqlite3 will change that. it's more cost-effective/maintainable to use sql-queries for data aggregation/sorting/joining/etc. than doing it in javascript. |
Whether it will or not (i highly doubt that it will), that's irrelevant to the current state of things - which is that JavaScript is primarily for everything, and minimizing/brushing off use cases merely because you don't have that use case is not productive. |
@kaizhu256 You were repeating what I said about the use of the format already.

@ljharb I've just released my PostgreSQL driver, which is in heavy use by the DEV community, and which now supports BigInt. Because that is a very real, hardcore, hands-on use case for using |
As an update, improving on the safety of the earlier suggested work-around... You can make the work-around safe by counting the number of replacements and checking it against the number of BigInt values seen during serialization:

```js
// Like JSON.stringify, but with safe support for BigInt (irreversible)
function toJson(data) {
    if (data !== undefined) {
        let intCount = 0, repCount = 0;
        const json = JSON.stringify(data, (_, v) => {
            if (typeof v === 'bigint') {
                intCount++;
                return `${v}#bigint`;
            }
            return v;
        });
        const res = json.replace(/"(-?\d+)#bigint"/g, (_, a) => {
            repCount++;
            return a;
        });
        if (repCount > intCount) {
            // You have a string somewhere that looks like "123#bigint";
            throw new Error(`BigInt serialization pattern conflict with a string value.`);
        }
        return res;
    }
}
```

Above I am using the `#bigint` suffix, which is less likely to collide with a real string value than the plain `n` suffix used earlier. |
The spec specifically allows you to integrate with |
@apaprocki You are missing the point. I'm well aware of that spec. The code I provided is there to be able to generate plain JSON numbers from BigInt values, which is what the receiving end expects. Example:

```js
const input = {
    value: 12345n
};

// Needs to become this after serialization:
//=> {"value": 12345}

// And NOT this one:
//=> {"value": "12345"}
```
|
Yes, but personally I would explore whether the receiving end with that restriction is open to supporting a quoted representation. There’s no reason why it couldn’t support both, especially with the restrictions of JSON.parse in JS. |
Are you suggesting that I should pester the PostgreSQL team to support JSON differently? :)))))) They don't care what JavaScript can or cannot do, because the server wasn't developed for JavaScript clients. They use their own JSON parsers that have nothing to do with JavaScript. This is why there are libraries, like mine, that mediate the discrepancies internally. JavaScript, on the other hand, should be flexible enough to allow at least some level of customization in how it generates or parses JSON. I think that would be more reasonable to pursue. As of now, the lack of provision for serializing |
Well, looking at the documentation, they already have a double() jsonpath operator for the purpose of returning a double value from either a number or string — it just seems like no one asked for or submitted the bigint() equivalent. |
If "full" JSON support is needed, here is such a proposal. Replace ES6 with ECMAScript. |
@cyberphone That is rad, if you can make it stick! 😄 That is way beyond my ambitions on this matter. |
@vitaly-t i think you’re misunderstanding me; what I’m saying is unproductive is claiming there’s no use case for what you’re talking about. Happy to see people experimenting in this space. I think https://github.com/tc39/proposal-json-parse-with-source or similar proposals are good directions to explore supporting BigInt in JSON. |
@ljharb From my answer to you earlier, how is that not a use case? -
I explained why and where it is needed, on a very practical note, and you are saying this is not a practical case? Here's the associated research I had to do for this. |
@vitaly-t again, you misunderstand. It is a practical case, i respect your use case. I was never replying to you before, only to @kaizhu256’s attempt to marginalize your use case. |
@ljharb Ok, never mind, I think the way you phrased it confused me completely 😄 So it was kind of double-negation perspective that was lost on me, as such things often do 😄 |
I acknowledge the marginalization came from me, and I apologize for any wrath misdirected at @ljharb rather than at me. |
The abstract operation SerializeJSONProperty needs to be patched, so that `JSON.stringify(1n)` returns `"1"` instead of `undefined`. I don't think there is anything to do for `JSON.parse`, though.
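For readers coming to this thread later: engines ultimately settled on throwing rather than returning undefined, which you can verify directly:

```js
// Shipped behavior in current engines: JSON.stringify throws a TypeError
// when it meets a BigInt and no replacer or toJSON method handles it.
let threw = false;
try {
  JSON.stringify(1n);
} catch (e) {
  threw = e instanceof TypeError;
}
// threw === true
```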