add Extract and Unquote functions for JSON. #3353

Merged
merged 28 commits into from
Jun 2, 2017
Changes from 17 commits
106 changes: 106 additions & 0 deletions util/types/json/functions.go
@@ -0,0 +1,106 @@
// Copyright 2017 PingCAP, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.

package json

import "fmt"

// Type returns type of JSON as string.
func (j JSON) Type() string {
	switch j.typeCode {
	case typeCodeObject:
		return "OBJECT"
	case typeCodeArray:
		return "ARRAY"
	case typeCodeLiteral:
		switch byte(j.i64) {
		case jsonLiteralNil:
			return "NULL"
		default:
			return "BOOLEAN"
		}
	case typeCodeInt64:
		return "INTEGER"
	case typeCodeFloat64:
		return "DOUBLE"
	case typeCodeString:
		return "STRING"
	default:
		msg := fmt.Sprintf(unknownTypeCodeErrorMsg, j.typeCode)
		panic(msg)
	}
}

// Extract receives several path expressions as arguments, matches them in j, and returns:
//  ret: the JSON value matched by the path expressions; it may be auto-wrapped in an array.
//  found: true if any path expression matched.
func (j JSON) Extract(pathExprList []PathExpression) (ret JSON, found bool) {
Contributor

How can we tell whether the returned array is the value matched by a single path or an auto-wrapped array?
Does it matter?

Contributor Author

Good question. I will test select json_extract('{"a": [1, 2]}', '$.a') and select json_extract('{"a": [1, 2]}', "$.a[0]", "$.a[1]").

Contributor Author

These two statements return the same value on MySQL 5.7, so it seems we cannot distinguish them, and we don't need to either.

Contributor

ok

	elemList := make([]JSON, 0, len(pathExprList))
	for _, pathExpr := range pathExprList {
		elemList = append(elemList, extract(j, pathExpr)...)
	}
	if len(elemList) == 0 {
		found = false
	} else if len(pathExprList) == 1 && len(elemList) == 1 {
Contributor

It seems len(elemList) will always be 1 here?

Contributor Author

@hicqu hicqu May 31, 2017

If pathExpr contains any asterisks, len(elemList) may not be 1 even if len(pathExprList) equals 1.

		// If pathExpr contains asterisks, len(elemList) may not be 1
		// even if len(pathExprList) equals 1.
		found = true
		ret = elemList[0]
	} else {
		found = true
		ret.typeCode = typeCodeArray
		ret.array = append(ret.array, elemList...)
	}
	return
}
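
The auto-wrapping rule in Extract, and the reviewers' observation that a single-path match and a wrapped multi-path result can look identical, can be illustrated on plain Go values. This is only a sketch: `wrapMatches` and `render` are hypothetical helpers built on `encoding/json`, not part of this PR, and they stand in for the package's own JSON type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wrapMatches mimics the wrapping rule on plain Go values (hypothetical helper):
// no match -> not found; one path with one match -> the match itself;
// anything else -> all matches wrapped in an array.
func wrapMatches(pathCount int, matches []interface{}) (interface{}, bool) {
	switch {
	case len(matches) == 0:
		return nil, false
	case pathCount == 1 && len(matches) == 1:
		return matches[0], true
	default:
		return matches, true
	}
}

// render marshals a result to compact JSON text so results can be compared.
func render(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Like json_extract('{"a": [1, 2]}', '$.a'): one path, one match (the array itself).
	single, _ := wrapMatches(1, []interface{}{[]interface{}{1, 2}})
	// Like json_extract('{"a": [1, 2]}', '$.a[0]', '$.a[1]'): two paths, two scalar matches.
	multi, _ := wrapMatches(2, []interface{}{1, 2})
	fmt.Println(render(single), render(multi)) // prints "[1,2] [1,2]"
}
```

Both calls render the same text, which is why the caller cannot tell the two cases apart from the result alone.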

// Unquote is for JSON_UNQUOTE.
func (j JSON) Unquote() string {
	switch j.typeCode {
	case typeCodeString:
		return j.str
Contributor

Does j.String() return j.str when it's typeCodeString?

Contributor Author

for example,

select json_unquote(`"hello, world"`); -- should return `hello, world`
-- but j.String() will return a json marshal string,
-- in this case will be `"hello, world"`

so I use j.str instead of j.String.

Contributor

Got it.

	default:
		return j.String()
	}
}
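
The difference between the raw string and its marshaled form, which is the reason Unquote returns j.str for strings, can be seen with the standard library alone. The sketch below assumes plain Go values in place of the package's JSON type; `marshalText` and `unquoteText` are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalText returns the JSON text of a Go value; strings keep their quotes.
func marshalText(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// unquoteText mirrors the idea behind Unquote on plain Go values (hypothetical):
// a string is returned raw, anything else falls back to its JSON text.
func unquoteText(v interface{}) string {
	if s, ok := v.(string); ok {
		return s
	}
	return marshalText(v)
}

func main() {
	fmt.Println(marshalText("hello, world")) // prints "hello, world" with the quotes
	fmt.Println(unquoteText("hello, world")) // prints hello, world without quotes
	fmt.Println(unquoteText(true))           // prints true
}
```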

// extract is used by Extract.
// NOTE: the returned values are not deep copies; they share underlying data with j.
func extract(j JSON, pathExpr PathExpression) (ret []JSON) {
	if len(pathExpr.legs) == 0 {
		return []JSON{j}
	}
	var currentLeg = pathExpr.legs[0]
	pathExpr.legs = pathExpr.legs[1:]
	if currentLeg.isArrayIndex && j.typeCode == typeCodeArray {
		if currentLeg.arrayIndex == arrayIndexAsterisk {
			for _, child := range j.array {
				ret = append(ret, extract(child, pathExpr)...)
			}
		} else if currentLeg.arrayIndex < len(j.array) {
			childRet := extract(j.array[currentLeg.arrayIndex], pathExpr)
			ret = append(ret, childRet...)
		}
Contributor

Can arrayIndex be < 0 and != -1?

Contributor Author

No, it won't.

	} else if !currentLeg.isArrayIndex && j.typeCode == typeCodeObject {
		if len(currentLeg.dotKey) == 1 && currentLeg.dotKey[0] == '*' {
			var sortedKeys = getSortedKeys(j.object) // iterate over sorted keys.
			for _, child := range sortedKeys {
				ret = append(ret, extract(j.object[child], pathExpr)...)
			}
		} else if child, ok := j.object[currentLeg.dotKey]; ok {
			childRet := extract(child, pathExpr)
			ret = append(ret, childRet...)
		}
	}
	return
}
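
The recursive, leg-by-leg matching in extract, including why a single path with an asterisk can yield several matches, can be sketched over values decoded by `encoding/json`. Everything here is an assumption made for illustration: the `leg` struct, the use of a negative index and the key "*" as wildcards, and the `walk` and `matchText` helpers are simplified stand-ins, not the PR's PathExpression or JSON types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// leg is a hypothetical, simplified path step: an array index or an object key.
// A negative index or the key "*" stands for a wildcard.
type leg struct {
	isIndex bool
	index   int
	key     string
}

// walk returns every value reachable from v by following legs in order.
func walk(v interface{}, legs []leg) []interface{} {
	if len(legs) == 0 {
		return []interface{}{v}
	}
	cur, rest := legs[0], legs[1:]
	var out []interface{}
	switch node := v.(type) {
	case []interface{}:
		if !cur.isIndex {
			return nil
		}
		if cur.index < 0 { // wildcard: visit every element
			for _, child := range node {
				out = append(out, walk(child, rest)...)
			}
		} else if cur.index < len(node) {
			out = append(out, walk(node[cur.index], rest)...)
		}
	case map[string]interface{}:
		if cur.isIndex {
			return nil
		}
		if cur.key == "*" { // wildcard: visit values in sorted key order
			keys := make([]string, 0, len(node))
			for k := range node {
				keys = append(keys, k)
			}
			sort.Strings(keys)
			for _, k := range keys {
				out = append(out, walk(node[k], rest)...)
			}
		} else if child, ok := node[cur.key]; ok {
			out = append(out, walk(child, rest)...)
		}
	}
	return out
}

// matchText decodes doc, walks legs, and returns the matches as JSON text.
func matchText(doc string, legs []leg) string {
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		panic(err)
	}
	b, err := json.Marshal(walk(v, legs))
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	doc := `{"a": [1, "2", {"aa": "bb"}, 4.0, {"aa": "cc"}], "b": true, "c": ["d"]}`
	// Roughly $.a[*].aa: one path, two matches.
	fmt.Println(matchText(doc, []leg{{key: "a"}, {isIndex: true, index: -1}, {key: "aa"}}))
	// Roughly $.*[0]: the first element of every array-valued member.
	fmt.Println(matchText(doc, []leg{{key: "*"}, {isIndex: true, index: 0}}))
}
```

Iterating object members in sorted key order keeps the result deterministic, since Go map iteration order is not.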
103 changes: 103 additions & 0 deletions util/types/json/functions_test.go
@@ -0,0 +1,103 @@
// Copyright 2017 PingCAP, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.

package json

import (
	"bytes"

	. "github.com/pingcap/check"
)

func (s *testJSONSuite) TestJSONType(c *C) {
	j1 := mustParseFromString(`{"a": "b"}`)
	j2 := mustParseFromString(`["a", "b"]`)
	j3 := mustParseFromString(`3`)
	j4 := mustParseFromString(`3.0`)
	j5 := mustParseFromString(`null`)
	j6 := mustParseFromString(`true`)
	var jList = []struct {
		In  JSON
Member

Using a string as In and parsing it in the loop is clearer.

		Out string
	}{
		{j1, "OBJECT"},
		{j2, "ARRAY"},
		{j3, "INTEGER"},
		{j4, "DOUBLE"},
		{j5, "NULL"},
		{j6, "BOOLEAN"},
	}
	for _, j := range jList {
		c.Assert(j.In.Type(), Equals, j.Out)
	}
}
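
The reviewer's suggestion, keeping raw strings in the table and parsing them inside the loop, might look roughly like the sketch below. It is written against `encoding/json` rather than this package so that it runs on its own; `typeName` is a hypothetical stand-in for JSON.Type and, unlike the real method, it does not distinguish INTEGER from DOUBLE.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typeName classifies JSON text (hypothetical stand-in for JSON.Type).
// All numbers are reported as NUMBER in this simplified version.
func typeName(text string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(text), &v); err != nil {
		return "", err
	}
	switch v.(type) {
	case map[string]interface{}:
		return "OBJECT", nil
	case []interface{}:
		return "ARRAY", nil
	case nil:
		return "NULL", nil
	case bool:
		return "BOOLEAN", nil
	case string:
		return "STRING", nil
	default:
		return "NUMBER", nil
	}
}

// runTable keeps inputs as strings and parses each one inside the loop.
// It returns the number of failing cases.
func runTable() int {
	tests := []struct{ in, out string }{
		{`{"a": "b"}`, "OBJECT"},
		{`["a", "b"]`, "ARRAY"},
		{`3`, "NUMBER"},
		{`null`, "NULL"},
		{`true`, "BOOLEAN"},
	}
	failures := 0
	for _, tt := range tests {
		got, err := typeName(tt.in)
		if err != nil || got != tt.out {
			failures++
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", runTable())
}
```

With string inputs, each table row shows the exact JSON text under test next to its expected result.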

func (s *testJSONSuite) TestJSONExtract(c *C) {
	j1 := mustParseFromString(`{"a": [1, "2", {"aa": "bb"}, 4.0, {"aa": "cc"}], "b": true, "c": ["d"]}`)
	j2 := mustParseFromString(`[{"a": 1, "b": true}, 3, 3.5, "hello, world", null, true]`)

	var tests = []struct {
		j               JSON
		pathExprStrings []string
		expected        JSON
		found           bool
		err             error
	}{
		// test extract with only one path expression.
		{j1, []string{"$.a"}, j1.object["a"], true, nil},
		{j2, []string{"$.a"}, CreateJSON(nil), false, nil},
		{j1, []string{"$[0]"}, CreateJSON(nil), false, nil},
		{j2, []string{"$[0]"}, j2.array[0], true, nil},
		{j1, []string{"$.a[2].aa"}, CreateJSON("bb"), true, nil},
		{j1, []string{"$.a[*].aa"}, mustParseFromString(`["bb", "cc"]`), true, nil},
		{j1, []string{"$.*[0]"}, mustParseFromString(`[1, "d"]`), true, nil},
		{j1, []string{`$.a[*]."aa"`}, mustParseFromString(`["bb", "cc"]`), true, nil},

		// test extract with multi path expressions.
		{j1, []string{"$.a", "$[0]"}, mustParseFromString(`[[1, "2", {"aa": "bb"}, 4.0, {"aa": "cc"}]]`), true, nil},
		{j2, []string{"$.a", "$[0]"}, mustParseFromString(`[{"a": 1, "b": true}]`), true, nil},
	}

	for _, tt := range tests {
		var pathExprList = make([]PathExpression, 0)
		for _, peStr := range tt.pathExprStrings {
			pe, err := ParseJSONPathExpr(peStr)
			c.Assert(err, IsNil)
			pathExprList = append(pathExprList, pe)
		}

		expected, found := tt.j.Extract(pathExprList)
		c.Assert(found, Equals, tt.found)
		if found {
			b1 := Serialize(expected)
			b2 := Serialize(tt.expected)
			c.Assert(bytes.Compare(b1, b2), Equals, 0)
		}
	}
}

func (s *testJSONSuite) TestJSONUnquote(c *C) {
	var tests = []struct {
		j        JSON
Member

Using string type for j is easier to read.

		unquoted string
	}{
		{j: mustParseFromString(`3`), unquoted: "3"},
		{j: mustParseFromString(`"3"`), unquoted: "3"},
		{j: mustParseFromString(`true`), unquoted: "true"},
		{j: mustParseFromString(`null`), unquoted: "null"},
		{j: mustParseFromString(`{"a": [1, 2]}`), unquoted: `{"a":[1,2]}`},
	}
	for _, tt := range tests {
		c.Assert(tt.j.Unquote(), Equals, tt.unquoted)
	}
}
26 changes: 0 additions & 26 deletions util/types/json/json.go
@@ -123,32 +123,6 @@ func (j JSON) String() string {
	return strings.TrimSpace(hack.String(bytes))
}

// Type returns type of JSON as string.
func (j JSON) Type() string {
	switch j.typeCode {
	case typeCodeObject:
		return "OBJECT"
	case typeCodeArray:
		return "ARRAY"
	case typeCodeLiteral:
		switch byte(j.i64) {
		case jsonLiteralNil:
			return "NULL"
		default:
			return "BOOLEAN"
		}
	case typeCodeInt64:
		return "INTEGER"
	case typeCodeFloat64:
		return "DOUBLE"
	case typeCodeString:
		return "STRING"
	default:
		msg := fmt.Sprintf(unknownTypeCodeErrorMsg, j.typeCode)
		panic(msg)
	}
}

var (
	// ErrInvalidJSONText means invalid JSON text.
	ErrInvalidJSONText = terror.ClassJSON.New(mysql.ErrInvalidJSONText, mysql.MySQLErrName[mysql.ErrInvalidJSONText])
101 changes: 33 additions & 68 deletions util/types/json/json_test.go
@@ -14,6 +14,7 @@
package json

import (
	"fmt"
	"testing"

	. "github.com/pingcap/check"
@@ -27,19 +28,30 @@ func TestT(t *testing.T) {
	TestingT(t)
}

// mustParseFromString parses a JSON value from a string.
// It panics if the string is not valid JSON.
func mustParseFromString(s string) JSON {
Contributor

Add a comment for this function.

	j, err := ParseFromString(s)
	if err != nil {
		msg := fmt.Sprintf("ParseFromString(%s) fail", s)
		panic(msg)
	}
	return j
}
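
The same "must" helper pattern, a wrapper that panics instead of returning an error so test fixtures stay on one line, can be sketched with the standard library. `mustDecode` and `panics` are hypothetical names for illustration; the sketch assumes `encoding/json` in place of this package's ParseFromString.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mustDecode decodes JSON text into a generic Go value and panics on
// invalid input. It is intended for test fixtures only.
func mustDecode(text string) interface{} {
	var v interface{}
	if err := json.Unmarshal([]byte(text), &v); err != nil {
		panic(fmt.Sprintf("mustDecode(%s) failed: %v", text, err))
	}
	return v
}

// panics reports whether calling f panics.
func panics(f func()) (didPanic bool) {
	defer func() {
		if recover() != nil {
			didPanic = true
		}
	}()
	f()
	return false
}

func main() {
	fmt.Println(mustDecode(`{"a": [1, 2]}`) != nil)      // valid fixture decodes: prints true
	fmt.Println(panics(func() { mustDecode(`{"a": `) })) // malformed fixture panics: prints true
}
```

A panic is acceptable here because a malformed fixture is a bug in the test itself, not a condition the test needs to handle.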

func (s *testJSONSuite) TestParseFromString(c *C) {
	jstr1 := `{"a": [1, "2", {"aa": "bb"}, 4, null], "b": true, "c": null}`
	jstr2 := mustParseFromString(jstr1).String()
	c.Assert(jstr2, Equals, `{"a":[1,"2",{"aa":"bb"},4,null],"b":true,"c":null}`)
}

func (s *testJSONSuite) TestJSONSerde(c *C) {
Member

Serde is not a commonly used short name.
Please use full words.
And JSON can be removed as the test suite is already named JSON.

	var jsonNilValue = CreateJSON(nil)
	var jsonBoolValue = CreateJSON(true)
	var jsonDoubleValue = CreateJSON(3.24)
	var jsonStringValue = CreateJSON("hello, 世界")

	var jstr1 = `{"aaaaaaaaaaa": [1, "2", {"aa": "bb"}, 4.0], "bbbbbbbbbb": true, "ccccccccc": "d"}`
	j1, err := ParseFromString(jstr1)
	c.Assert(err, IsNil)

	var jstr2 = `[{"a": 1, "b": true}, 3, 3.5, "hello, world", null, true]`
	j2, err := ParseFromString(jstr2)
	c.Assert(err, IsNil)
	j1 := mustParseFromString(`{"aaaaaaaaaaa": [1, "2", {"aa": "bb"}, 4.0], "bbbbbbbbbb": true, "ccccccccc": "d"}`)
	j2 := mustParseFromString(`[{"a": 1, "b": true}, 3, 3.5, "hello, world", null, true]`)

	var testcses = []struct {
		In JSON
@@ -64,65 +76,19 @@ func (s *testJSONSuite) TestJSONSerde(c *C) {
	}
}

func (s *testJSONSuite) TestParseFromString(c *C) {
	var jstr1 = `{"a": [1, "2", {"aa": "bb"}, 4, null], "b": true, "c": null}`

	j1, err := ParseFromString(jstr1)
	c.Assert(err, IsNil)

	var jstr2 = j1.String()
	c.Assert(jstr2, Equals, `{"a":[1,"2",{"aa":"bb"},4,null],"b":true,"c":null}`)
}

func (s *testJSONSuite) TestJSONType(c *C) {
	j1, err := ParseFromString(`{"a": "b"}`)
	c.Assert(err, IsNil)

	j2, err := ParseFromString(`["a", "b"]`)
	c.Assert(err, IsNil)

	j3, err := ParseFromString(`3`)
	c.Assert(err, IsNil)

	j4, err := ParseFromString(`3.0`)
	c.Assert(err, IsNil)

	j5, err := ParseFromString(`null`)
	c.Assert(err, IsNil)

	j6, err := ParseFromString(`true`)
	c.Assert(err, IsNil)

	var jList = []struct {
		In  JSON
		Out string
	}{
		{j1, "OBJECT"},
		{j2, "ARRAY"},
		{j3, "INTEGER"},
		{j4, "DOUBLE"},
		{j5, "NULL"},
		{j6, "BOOLEAN"},
	}

	for _, j := range jList {
		c.Assert(j.In.Type(), Equals, j.Out)
	}
}

func (s *testJSONSuite) TestCompareJSON(c *C) {
	jNull, _ := ParseFromString(`null`)
	jBoolTrue, _ := ParseFromString(`true`)
	jBoolFalse, _ := ParseFromString(`false`)
	jIntegerLarge, _ := ParseFromString(`5`)
	jIntegerSmall, _ := ParseFromString(`3`)
	jStringLarge, _ := ParseFromString(`"hello, world"`)
	jStringSmall, _ := ParseFromString(`"hello"`)
	jArrayLarge, _ := ParseFromString(`["a", "c"]`)
	jArraySmall, _ := ParseFromString(`["a", "b"]`)
	jObject, _ := ParseFromString(`{"a": "b"}`)

	var caseList = []struct {
	jNull := mustParseFromString(`null`)
	jBoolTrue := mustParseFromString(`true`)
	jBoolFalse := mustParseFromString(`false`)
	jIntegerLarge := mustParseFromString(`5`)
	jIntegerSmall := mustParseFromString(`3`)
	jStringLarge := mustParseFromString(`"hello, world"`)
	jStringSmall := mustParseFromString(`"hello"`)
	jArrayLarge := mustParseFromString(`["a", "c"]`)
	jArraySmall := mustParseFromString(`["a", "b"]`)
	jObject := mustParseFromString(`{"a": "b"}`)
Member

I think it's better to define strings in the test cases and parse them in the loop.

Also, our convention is to name the test table tests, and the case element tt in the for range loop.

Contributor Author

Renamed. But I prefer to keep defining the parsed values here, because they are used many times in the test cases.

Member

OK, in this case the variable name is more meaningful than the string value.


	var tests = []struct {
		left  JSON
		right JSON
	}{
@@ -136,8 +102,7 @@ func (s *testJSONSuite) TestCompareJSON(c *C) {
		{jArrayLarge, jBoolFalse},
		{jBoolFalse, jBoolTrue},
	}

	for _, cmpCase := range caseList {
	for _, cmpCase := range tests {
		cmp, err := CompareJSON(cmpCase.left, cmpCase.right)
		c.Assert(err, IsNil)
		c.Assert(cmp < 0, IsTrue)