encoding/json: parser ignores the case of member names #14750

cyberphone · 2016-03-10T13:04:44Z

What version of Go are you using? 5.3
What operating system and processor architecture are you using? amd64,windows
What did you do?
Read this: https://mailarchive.ietf.org/arch/msg/json/Ju-bwuRv-bq9IuOGzwqlV3aU9XE
What did you expect to see?
...
What did you see instead?
...


bradfitz · 2016-03-10T15:32:58Z

Playground link: http://play.golang.org/p/9j0ome9HqK

@rsc, @adg, thoughts? This surprised me. I thought we only did the case insensitive thing when there was no struct tag.

But from the ietf link above:

it looks quite diabolical for security as it is trivial to create valid JSON values that will be interpreted differently by different implementations.

Related to "differently by different implementations", we permit JSON keys twice: http://play.golang.org/p/lPgEj1T6Zk (two "alg"). That probably differs between implementations in either its output or whether it's even successful.
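For readers who cannot open the playground links, here is a minimal sketch of the duplicate-key behavior described above. The header type and the decodeAlg helper are hypothetical names chosen for illustration; they are not taken from the linked snippets.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// header is a hypothetical type used only for this illustration.
type header struct {
	Alg string `json:"alg"`
}

// decodeAlg unmarshals data into a header and returns the resulting Alg value.
func decodeAlg(data string) (string, error) {
	var h header
	if err := json.Unmarshal([]byte(data), &h); err != nil {
		return "", err
	}
	return h.Alg, nil
}

func main() {
	// The key "alg" appears twice. Unmarshal reports no error,
	// and the later value overwrites the earlier one.
	alg, err := decodeAlg(`{"alg": "HS256", "alg": "none"}`)
	fmt.Println(alg, err) // prints "none <nil>"
}
```

A parser that keeps the first value, or one that rejects duplicates outright, would see this input differently, which is the cross-implementation concern raised here.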

rsc · 2016-03-10T15:42:13Z

This has been the behavior at least as far back as Go 1.2 (I can't run
earlier on my Mac).
The docs also seem to state quite clearly that this is what happens:

To unmarshal JSON into a struct, Unmarshal matches incoming object keys to
the keys used by Marshal (either the struct field name or its tag),
preferring an exact match but also accepting a case-insensitive match.
Unmarshal will only set exported fields of the struct.

I understand there are security implications if JSON is used in security
contexts, and I was a little surprised too, but the docs are very clear. I
doubt that encoding/json's behavior should be dictated by security
concerns. FWIW, I don't believe we should go out of our way to reject
repeated fields either, especially not now. I might go so far as to argue
that using JSON in security standards is a mistake anyway, but I won't do
that here.

In any event it's too late to change the defaults. If we want to support
the security use case, maybe we could have a UseStrictNames method on
Decoder (like UseNumber).

bradfitz · 2016-03-10T15:45:15Z

UseStrictNames SGTM.

cyberphone · 2016-03-10T15:46:36Z

I would consider addressing a bunch of related issues as an option:
#14749
#14135

rsc · 2016-03-10T16:51:41Z

The other issues are unrelated and shouldn't be mixed in here. For one thing, we were talking about a decoding problem and #14749 is an encoding problem.

manger · 2016-03-13T05:24:03Z

UseStrictNames does look like a decent way to add case-sensitive decoding support in a backward compatible way.

However, while that would make case-sensitive decodes possible, it wouldn't help make this safer mode common or the default.

How about also defining a "strict" field tag option (cf. "omitempty")? That should allow types to be safely used via NewDecoder, or Unmarshal, or when the decoding is within a library. You can define the safety in the type, without finding every place it might be decoded.

cespare · 2016-04-13T18:56:23Z

Shall I send a UseStrictNames CL? Should that apply strict names across the board, whether or not the user supplied a struct tag?

type Foo struct {
        Bar string
        Baz string `json:"baz"`
}

So that would only match exactly "Bar" and "baz".
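For contrast, a minimal sketch of what the current (non-strict) behavior does with the same struct. The decodeFoo helper is a hypothetical name added for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Foo struct {
	Bar string
	Baz string `json:"baz"`
}

// decodeFoo unmarshals data into a Foo using the default encoding/json rules.
func decodeFoo(data string) (Foo, error) {
	var f Foo
	err := json.Unmarshal([]byte(data), &f)
	return f, err
}

func main() {
	// Neither key matches exactly, yet both fields are populated,
	// including the one with an explicit struct tag.
	f, err := decodeFoo(`{"bAR": "x", "BAZ": "y"}`)
	fmt.Printf("%+v %v\n", f, err) // prints "{Bar:x Baz:y} <nil>"
}
```

Under the strict mode discussed here, neither "bAR" nor "BAZ" would match, and both fields would stay empty.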

rsc · 2016-04-15T16:22:59Z

@cespare If it's simple, then yes sure. But after a thought on #15314 I wonder if maybe the UseStrictNames method is the wrong approach and instead it should be a tag attribute on the field. Then it can be enforced by the author of the struct instead of the author of the unmarshal call. Specifically, something like json:"Foo,exactname".

jessfraz · 2016-06-22T23:01:16Z

Would it get super repetitive if the person had to set it on every tag name?

smyrman · 2020-06-03T10:16:29Z

I think the CL is interesting, but it would, from my perspective, be even more useful to explicitly define an option. It's useful to be able to set that a case-insensitive match be treated as an unknown field.

As a comment on #14750 (comment):

I would vote for allowing case-sensitive matching as a decoder option enabled through a method call (UseStrictNames, DisableEqualFold, or some better name) over struct tags.

I want to advocate for this solution not because it's a better solution, but because it's in line with the current design for DisallowUnknownFields and UseNumber. Having one annoying -- but consistent -- API has a value; it means the same work-around can be used for all cases where a custom json.Unmarshaler implementation is needed.

DisallowUnknownFields and UseNumber are, in my opinion, within the same class of problems, and that puts some constraints on how a strictly case-sensitive parser should be implemented.

Some notes:

Projects can easily define their own jsonDecoder(data []byte) *json.Decoder helper functions to reduce the boilerplate in json.Unmarshaler implementations.
Making a new json.Unmarshaler interface that accepts passing along options (or better still, accepts a pre-initialized (sub) json.Decoder with inherited options) is certainly possible, but it's a separate issue. It is not a given that it's always preferred.

smyrman · 2020-06-03T10:45:18Z

PS! The Encoder/Decoder interfaces defined in #12001 (old issue) could serve as a way of passing along decoder options. Perhaps that problem can be discussed there, although the motivation here is different.

icholy · 2021-05-13T17:19:59Z

@rsc this issue seems to have died down and gone off-topic. Would it be appropriate to reboot this as a proposal for UseStrictNames?

zigo101 · 2021-08-26T15:36:53Z

preferring an exact match but also accepting a case-insensitive match

The wording of the first half of this line is somewhat misleading IMHO.
It conveys incorrect information.
I recommend removing this half of the line to match the current behavior.

nemith · 2021-10-20T16:32:11Z

Just learned about this behavior and this screams unintended bugs (no matter how much documentation). I always assumed that explicit struct tags take precedence over field names, but they seem to just be ignored if there is a field that matched.

krigga · 2022-09-09T17:24:01Z

This breaks Binance API parsing (see https://binance-docs.github.io/apidocs/spot/en/#individual-symbol-ticker-streams).
The API returns objects that have many single-character properties, many of which are the same letter in different cases, so roughly half the object's properties get confused by the parser. As far as I understand, six years after this issue became known, there is still no workaround other than parsing into a map[string]interface{} and then converting to the desired object.
I don't mind implementing a decoder option or whatever, but it needs to be decided so that I or someone else can implement it.

Lz-Gustavo · 2022-09-09T18:34:54Z

@krigga wouldn't a different workaround be to declare all possible attribute cases, even those not required by the application?

For example, considering Binance's ticker payload, even if you only wanted to parse the "a" attribute, it seems to me that you could avoid the mismatch with the "A" field by declaring both on your target structure and simply ignoring the unwanted value: https://go.dev/play/p/1_71Bw5FBJz.

Of course, this is just a workaround for the issue. One shouldn't be required to parse unwanted attributes for the sole purpose of avoiding this behavior. In the same snippet, removing either of the struct fields (by commenting out line 14 or 15) leaves the script susceptible to the awkward behavior mentioned above, parsing "a" into an attribute with an "A" field tag.
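A minimal sketch of this workaround and of what happens without it. This is not the linked playground snippet; the type names, the payload, and the two helpers are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// both declares a field for each casing, so every key finds an exact match.
type both struct {
	Lower string `json:"a"`
	Upper string `json:"A"`
}

// onlyUpper declares just "A"; the key "a" then falls back to a
// case-insensitive match against the same field.
type onlyUpper struct {
	Upper string `json:"A"`
}

const payload = `{"A": "upper", "a": "lower"}`

// decodeBoth decodes the payload into the struct with both casings declared.
func decodeBoth() (both, error) {
	var v both
	err := json.Unmarshal([]byte(payload), &v)
	return v, err
}

// decodeOnlyUpper decodes the payload into the struct with a single field.
func decodeOnlyUpper() (onlyUpper, error) {
	var v onlyUpper
	err := json.Unmarshal([]byte(payload), &v)
	return v, err
}

func main() {
	b, _ := decodeBoth()
	fmt.Printf("%+v\n", b) // prints "{Lower:lower Upper:upper}"

	// Both keys land in the single field; the later key "a" wins.
	o, _ := decodeOnlyUpper()
	fmt.Printf("%+v\n", o) // prints "{Upper:lower}"
}
```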

james-d-elliott · 2022-11-29T02:33:13Z

This fork deals with the issue and implements the original proposed change I believe; https://github.com/james-d-elliott/go-json

It's a 1:1 fork of go1.19.3 other than the additional method and usage of said method. If the original proposal ever gets traction and is accepted you're welcome to ping me and/or just use the code as-is/modified with/without crediting me.

dsnet · 2023-10-06T06:14:48Z

Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal.

In v2, we propose that Unmarshal perform a case-sensitive match on the field name by default.

For flexibility, the v2 prototype provides the ability to alter this behavior with:

json.Unmarshal(b, v, json.MatchCaseInsensitiveNames(true)) at the unmarshal-call level OR
struct{ FooField T `json:",nocase"` } at the struct-field level

disconnect3d · 2024-08-30T15:20:37Z

Hi, we stumbled upon this issue at Trail of Bits during one of our security assessments. We found a critical vulnerability due to how two different parsers treated similar JSON keys with matching text but differing capitalization - one of which was a Go program. A bit off-topic, but this is also not the first time that the lack of specification on how to treat duplicate JSON keys caused a critical vuln, as per Parsing JSON is a Minefield 💣:

[Update 2017-11-18] A RCE vulnerability was found in CouchDB because two JSON parsers handle duplicate key differently. The same JSON object, when parsed in JavaScript, contains "roles": [], but when parsed in Erlang it contains "roles": ["_admin"].

Anyway, I am writing here to encourage you to fix this issue - and at least request that this feature be documented better. I’m requesting this because the latest - Go 1.23 - documentation for json.Unmarshal is still incorrect - or at least very unclear:

To unmarshal JSON into a struct, Unmarshal matches incoming object keys to the keys used by Marshal (either the struct field name or its tag), preferring an exact match but also accepting a case-insensitive match. By default, object keys which don't have a corresponding struct field are ignored (see Decoder.DisallowUnknownFields for an alternative).

The doc says it "prefers an exact key match"; however, the json.Unmarshal code, when decoding object keys, simply goes over all the keys and overwrites (through this line in d.literalStore) the value of an exact key match if a non-exact (case-insensitive) key match is found later on. While this was shown in multiple cases already, this one - https://go.dev/play/p/BL8Sw_KEUz9 - shows both cases and confirms that it is always the last key which is used:

[...]
	// we always end up with "last" value
	deserialize([]byte(`{"someName": "first", "SOMEname": "last"}`))
	deserialize([]byte(`{"SOMEname": "first", "someName": "last"}`))

Now, it seems that different solutions were suggested here such as:

Multiple suggestions to add other JSON decoder options such as "UseStrictNames" or "DisallowEqualFold".
There was an attempt to clarify the documentation in 221117: encoding/json: clarify how we decode into structs - which, it seems, wasn't accepted because documenting the behavior could "accept" it and make it harder to change in the future.
There was an attempt to change this behavior in 224079: encoding/json: skip inexact matches after exact matches
There is an encoding/json/v2 discussion, which seems to be an ongoing effort to create a new API that would hopefully fix this. The discussion includes a go-json-experiment/json experiment implementation.

Having written all of this, I still believe we could at least improve the documentation in the current implementation to make this clear once and for all.

dsnet · 2024-08-30T18:20:15Z

Making the documentation more clear sounds good to me. At this point, I don't think we can change the v1 behavior.

The v2 package will be the path forward for fixing this as we're aware that this is a potential security vulnerability. For those interested, here's an explanation of how duplicate names (and case-insensitive names by extension) can be exploited: https://youtu.be/avilmOcHKHE?feature=shared&t=1011

ianlancetaylor · 2024-09-03T00:29:08Z

@dsnet: What is your opinion today on https://go.dev/cl/221117 ?

mvdan · 2024-09-03T07:00:35Z

Yes, if we're not going to change the behavior in v1, I'd also document it. I'm still okay with my wording from four years ago, but happy to get another review.

bradfitz changed the title ~~JSON parser ignores the case of member names~~ encoding/json: parser ignores the case of member names Mar 10, 2016

bradfitz added the Security label Mar 10, 2016

ianlancetaylor added this to the Go1.7 milestone Mar 10, 2016

cyberphone mentioned this issue Mar 13, 2016

encoding/json: marshaller does not provide Base64Url support #14804

Open

rsc mentioned this issue Apr 15, 2016

proposal: encoding/json: reject unknown fields in Decoder #15314

Closed

rsc modified the milestones: Go1.8, Go1.7 May 18, 2016

quentinmit added the NeedsDecision label Oct 6, 2016

rsc modified the milestones: Go1.9Early, Go1.8 Oct 26, 2016

bradfitz modified the milestones: Go1.9Maybe, Go1.9Early May 3, 2017

bradfitz modified the milestones: Go1.9Maybe, Go1.10 Jul 20, 2017

rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

bradfitz mentioned this issue Feb 8, 2018

encoding/json: Sequential Key Casing #23726

Closed

ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 23, 2018

smyrman mentioned this issue Jun 3, 2020

encoding/json: Marshaler/Unmarshaler not stream friendly #12001

Closed

mvdan mentioned this issue Jan 14, 2021

encoding/json: clarify what happens when multiple items match the same field #43664

Open

adeinega mentioned this issue Mar 27, 2021

Polymorphism ietf-wg-gnap/gnap-core-protocol#6

Closed

mvdan mentioned this issue Apr 23, 2021

json.Unmarshal return json.UnmarshalTypeError at certain situation, which should not #45717

Closed

seankhliao mentioned this issue Aug 26, 2021

encoding/json: Unmarshal doesn't use preferred match #47984

Closed

liggitt mentioned this issue Sep 15, 2021

switch from json-iterator to forked stdlib json decoder kubernetes/kubernetes#105030

Merged


zigo101 mentioned this issue Oct 28, 2021

encoding/json: add Decoder.DisallowDuplicateFields #48298

Open

ferretdb-bot mentioned this issue Nov 5, 2021

Duplicate keys in JSON and fuzzing problem FerretDB/FerretDB#31

Closed

This was referenced Aug 9, 2022

proposal: encoding/json: mark field when struct decoding #54351

Closed

encoding/json: Decoding returns error when case-insensitive keys overlap and not all struct fields defined #54404

Closed

seankhliao added this to the Unplanned milestone Aug 20, 2022

keogami mentioned this issue Jan 6, 2023

Test is flaky because go's json parser is case-insensitive quasilevel/ccsu-opendata#38

Open

exageraldo mentioned this issue Feb 27, 2023

SCIM routes pagination google/go-github#2679

Closed

sding3 mentioned this issue Apr 4, 2023

proposal: encoding/json: add decoder option to reject duplicate fields #59414

Closed

fabiante mentioned this issue Feb 2, 2024

Fix unmarshaling of JSON keys which differ only in case DEXPRO-Solutions-GmbH/easclient#27

Merged

seankhliao mentioned this issue May 29, 2024

Bug Report: JSON Deserialization Case Sensitivity Issue in Go #67689

Closed

colega mentioned this issue Aug 1, 2024

encoding/json: parser ignores the case of member names colega/unexpected-go#31

Open
