Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: Fix YAML maps key type #4138

Merged
merged 1 commit into from
Feb 9, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions parser/frontmatter.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ import (
"bytes"
"encoding/json"
"errors"
"fmt"
"io"
"strings"

Expand Down Expand Up @@ -201,9 +202,45 @@ func removeTOMLIdentifier(datum []byte) []byte {
func HandleYAMLMetaData(datum []byte) (interface{}, error) {
m := map[string]interface{}{}
err := yaml.Unmarshal(datum, &m)

// To support boolean keys, the `yaml` package unmarshals maps to
// map[interface{}]interface{}. Here we recurse through the result
// and change all maps to map[string]interface{} like we would've
// gotten from `json`.
if err == nil {
for k, v := range m {
m[k] = stringifyYAMLMapKeys(v)
}
}

return m, err
}

// stringifyKeysMapValue recurses into in and changes all instances of
// map[interface{}]interface{} to map[string]interface{}. This is useful to
// work around the impedence mismatch between JSON and YAML unmarshaling that's
// described here: https://github.com/go-yaml/yaml/issues/139
//
// Inspired by https://github.com/stripe/stripe-mock, MIT licensed
func stringifyYAMLMapKeys(in interface{}) interface{} {
switch in := in.(type) {
case []interface{}:
res := make([]interface{}, len(in))
for i, v := range in {
res[i] = stringifyYAMLMapKeys(v)
}
return res
case map[interface{}]interface{}:
res := make(map[string]interface{})
for k, v := range in {
res[fmt.Sprintf("%v", k)] = stringifyYAMLMapKeys(v)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still assumes that this is a map with string keys. If I, for some reason, use integers, then I don't want Hugo to convert them into strings.

  • We can probably assume that all the keys are of the same type, so check one element before doing a conversion.
  • Also, use cast.ToStringE to convert into a string (much faster)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point but I'd like to point out that converting all keys to strings is a default behaviour for both TOML and JSON parsers in Hugo. What should we do about that? :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bep. Have you thought this through? IMO keeping consistent parsing behaviour between different formats is a better approach, but I leave the decision to you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"default behaviour for both TOML and JSON parsers in Hugo." I think you are right for the top level maps. But here you are doing a deep conversion, which does not make sense. Arrest me if I'm wrong because I have not looked at this code in a while.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review a simple test case I wrote and run on Hugo code without my changes:

package main

import (
	"github.com/gohugoio/hugo/parser"
	"fmt"
)

func main() {
	res, _ := parser.HandleJSONMetaData([]byte("{\"inted\": {\"1\": \"a\"}, \"booled\": { \"true\": \"a\"}}"))
	fmt.Printf("%#v\n", res)
	// map[string]interface {}{"inted":map[string]interface {}{"1":"a"}, "booled":map[string]interface {}{"true":"a"}}

	res, _ = parser.HandleTOMLMetaData([]byte("[inted]\n1 = \"a\"\n[booled]\ntrue = \"a\""))
	fmt.Printf("%#v\n", res)
	// map[string]interface {}{"inted":map[string]interface {}{"1":"a"}, "booled":map[string]interface {}{"true":"a"}}

	res, _ = parser.HandleYAMLMetaData([]byte("inted:\n  1: a\nbooled:\n  true: a"))
	fmt.Printf("%#v\n", res)
	// map[string]interface {}{"inted":map[interface {}]interface {}{1:"a"}, "booled":map[interface {}]interface {}{true:"a"}}
}

It looks that indeed Hugo unmarshals TOML and JSON to map[string] not only for top-level maps. But maybe I did something wrong, I'm not a Go expert :-)

Deep conversion would made no sense for article front matter but it's intended for complex data stored in data directory. YAML seems easier and more readable to structure nested maps and arrays than TOML but because of this inconsistency one cannot sort or range over those maps.

}
return res
default:
return in
}
}

// HandleJSONMetaData unmarshals JSON-encoded datum and returns a Go interface
// representing the encoded data structure.
func HandleJSONMetaData(datum []byte) (interface{}, error) {
Expand Down
38 changes: 37 additions & 1 deletion parser/frontmatter_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -169,7 +169,8 @@ func TestHandleYAMLMetaData(t *testing.T) {
}{
{nil, map[string]interface{}{}, false},
{[]byte("title: test 1"), map[string]interface{}{"title": "test 1"}, false},
{[]byte("a: Easy!\nb:\n c: 2\n d: [3, 4]"), map[string]interface{}{"a": "Easy!", "b": map[interface{}]interface{}{"c": 2, "d": []interface{}{3, 4}}}, false},
{[]byte("a: Easy!\nb:\n c: 2\n d: [3, 4]"), map[string]interface{}{"a": "Easy!", "b": map[string]interface{}{"c": 2, "d": []interface{}{3, 4}}}, false},
{[]byte("a:\n true: 1\n false: 2"), map[string]interface{}{"a": map[string]interface{}{"true": 1, "false": 2}}, false},
// errors
{[]byte("z = not toml"), nil, true},
}
Expand Down Expand Up @@ -320,6 +321,41 @@ func TestRemoveTOMLIdentifier(t *testing.T) {
}
}

func TestStringifyYAMLMapKeys(t *testing.T) {
cases := []struct {
input map[interface{}]interface{}
want map[string]interface{}
}{
{
map[interface{}]interface{}{"a": 1, "b": 2},
map[string]interface{}{"a": 1, "b": 2},
},
{
map[interface{}]interface{}{"a": []interface{}{1, map[interface{}]interface{}{"b": 2}}},
map[string]interface{}{"a": []interface{}{1, map[string]interface{}{"b": 2}}},
},
{
map[interface{}]interface{}{true: 1, "b": false},
map[string]interface{}{"true": 1, "b": false},
},
{
map[interface{}]interface{}{1: "a", 2: "b"},
map[string]interface{}{"1": "a", "2": "b"},
},
{
map[interface{}]interface{}{"a": map[interface{}]interface{}{"b": 1}},
map[string]interface{}{"a": map[string]interface{}{"b": 1}},
},
}

for i, c := range cases {
res := stringifyYAMLMapKeys(c.input)
if !reflect.DeepEqual(res, c.want) {
t.Errorf("[%d] given %q\nwant: %q\n got: %q", i, c.input, c.want, res)
}
}
}

func BenchmarkFrontmatterTags(b *testing.B) {

for _, frontmatter := range []string{"JSON", "YAML", "YAML2", "TOML"} {
Expand Down