Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: Unmarshal map[string]interface{} target mishandled when passed as an interface{} #33487

Closed
jexh opened this issue Aug 6, 2019 · 8 comments

Comments

@jexh
Copy link

jexh commented Aug 6, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.6 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN="xxx"
GOCACHE="xxx"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="xxx"
GOPROXY=""
GORACE=""
GOROOT="xxx"
GOTMPDIR=""
GOTOOLDIR="xxx"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build894533395=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Used a reference to the same map in multiple call to json.Unmarshal. See https://play.golang.org/p/bum-Q01e96M

What did you expect to see?

Expected JSON unmarshaling into maps to merge entries between different calls to json.Unmarshal, as indicated by the docs.

The play example should output:

map1: map[key1:data1 key2:data2]
map2: map[key1:data1 key2:data2]
map3: map[key1:map[key1.1:data1 key1.2:data2]]

What did you see instead?

Whenever the type of the unmarshal target is interface{} then a replacement value is created even if the existing entry is of type map[string]interface{} and is a suitable instance for unmarshal to use.

The play example outputs:

result: map[key2:data2]
result: map[key1:data1 key2:data2]
result: map[key1:map[key1.2:data2]]
@smasher164
Copy link
Member

This is similar to, if not a duplicate of #26946, where we considered adding a special case to Unmarshal that looks inside an interface{} for a struct or a map to reuse. However, it was thought that this change would make Unmarshal inconsistent, and it would be better to document this behavior. So I would modify your first example to take the address of the map:

var map1 interface{}
m := map[string]interface{}{
    "key1": "data1",
}
map1 = &m
json.Unmarshal([]byte(`{"key2":"data2"}`), map1)
fmt.Println("map1:", map1) // map1: &map[key1:data1 key2:data2]
fmt.Println("m:", m) // m: map[key1:data1 key2:data2]

@dmitshur dmitshur changed the title Unmarshal map[string]interface{} target mishandled when passed as an interface{} encoding/json: Unmarshal map[string]interface{} target mishandled when passed as an interface{} Aug 6, 2019
@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 6, 2019
@dmitshur dmitshur added this to the Go1.14 milestone Aug 6, 2019
@jexh
Copy link
Author

jexh commented Aug 7, 2019

After reading the linked issue I believe I better understand the decisions that were made, however, I still think there is a strong argument for providing the merging behavior for map[string]interface{} as a special case (and different to the behaviour when unmarshaling into structs).

  • The cases when a developer is using map[string]interface{} typically indicates that the JSON has an unknown structure. Merging entries when multiple calls to unmarshal are made using the same map is useful behaviour. The existing behaviour is unintuitive - merging top-level keys but replacing every nested object.
  • With a merging implementation, the code to avoid merging is easy - use a new map.
  • Also, the code to get the current behaviour is easy - use 2 maps and then add the key-value pairs of the second map to the first.
  • It provides a reasonably simple way of implementing merging for custom struct types: marshal multiple instance hierarchies to multiple JSON strings -> unmarshal into a single merged map -> marshal to a single JSON string -> unmarshal into single instance hierarchy.
  • Implementing merging using the existing json library requires some unintuitive code (playing with this I find that using pointers to a map alias is required to get the behaviour) and inefficient, at least for a simple implementation where every layer requires multiple calls to json.Unmarshal and is O(n^2): https://play.golang.org/p/0-0zOXhCdqI (Although please correct me if there is an easier way.).

@smasher164
Copy link
Member

I think the key here is to not predeclare an interface{} value that you pass into Unmarshal. You get all the properties you ask for if you simply pass in a *map[string]interface{}.

m := map[string]interface{}{
    "key1": "data1",
}
json.Unmarshal([]byte(`{"key2":"data2"}`), &m)
fmt.Println("m:", m) // m: map[key1:data1 key2:data2]

This allows the reflection in Unmarshal to work the way you desire.

@jexh
Copy link
Author

jexh commented Aug 7, 2019

This allows the reflection in Unmarshal to work the way you desire.

No, this just copes with the top level map - if you look at map3 in the initial example you will see that I want merging to happen all the way down the hierarchy (and I think the docs imply it should work like this). Including using an example like the following:

var map3 map[string]interface{}
json.Unmarshal([]byte(`{"key1":{"key1.1":"data2"}}`), &map3)
json.Unmarshal([]byte(`{"key1":{"key1.2":"data2"}}`), &map3)
fmt.Println("map3:", map3)

This should print:
map3: map[key1:map[key1.1:data1 key1.2:data2]]

@smasher164
Copy link
Member

smasher164 commented Aug 7, 2019

I see, sorry I misunderstood your question. Unmarshal does not perform recursive merging, so you'd have to manually add elements from one map to the other, or serialization/reflection-based helper functions to do it for you. https://play.golang.org/p/Iu4P8qgdldb

This is what libraries like https://github.com/peterbourgon/mergemap do, as mentioned here: https://groups.google.com/forum/#!topic/golang-nuts/nLCy75zMlS8.

EDIT: A more detailed answer for why can be found in this issue: #21857.

@jexh
Copy link
Author

jexh commented Aug 8, 2019

Thanks for the issue link, that is another similar request. I see, from the comments on that issue, that this is unlikely to progress and that it could only go forward as new functionality in the json package to preserve backwards compatibility. For now, using something like the mergemap you link to is the best option.

I suppose the key point remaining is that the current json docs indicate that the merge should happen(from https://golang.org/pkg/encoding/json/#Unmarshal):

To unmarshal a JSON object into a map, Unmarshal first establishes a map to use. If the map is nil, Unmarshal allocates a new map. Otherwise Unmarshal reuses the existing map, keeping existing entries. Unmarshal then stores key-value pairs from the JSON object into the map. The map's key type must either be a string, an integer, or implement encoding.TextUnmarshaler.

When I read this, I expected the rules to be applied recursively, which led to this issue.

@smasher164
Copy link
Member

smasher164 commented Aug 9, 2019

It may be worthwhile to document that keeping existing entries does not imply the recursive merging of maps. @gopherbot, add Documentation, remove NeedsInvestigation

@gopherbot gopherbot added Documentation and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 9, 2019
@andybons
Copy link
Member

Let's consolidate documentation needs around this in the aforementioned issue.
Duplicate of #26946

@golang golang locked and limited conversation to collaborators Aug 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants