add Extract and Unquote functions for JSON. #3353

Merged
merged 28 commits into from
Jun 2, 2017
Changes from 11 commits
107 changes: 107 additions & 0 deletions util/types/json/functions.go
@@ -0,0 +1,107 @@
// Copyright 2017 PingCAP, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.

package json

import "fmt"

// Type returns type of JSON as string.
func (j JSON) Type() string {
	switch j.typeCode {
	case typeCodeObject:
		return "OBJECT"
	case typeCodeArray:
		return "ARRAY"
	case typeCodeLiteral:
		switch byte(j.i64) {
		case jsonLiteralNil:
			return "NULL"
		default:
			return "BOOLEAN"
		}
	case typeCodeInt64:
		return "INTEGER"
	case typeCodeFloat64:
		return "DOUBLE"
	case typeCodeString:
		return "STRING"
	default:
		msg := fmt.Sprintf(unknownTypeCodeErrorMsg, j.typeCode)
		panic(msg)
	}
}

// Extract receives several path expressions as arguments, matches them in j, and returns:
// ret: the JSON value matched by the path expressions; it may be auto-wrapped in an array.
// found: true if any path expression matched.
func (j JSON) Extract(pathExprList []PathExpression) (ret JSON, found bool) {
Contributor

How can a caller tell whether the returned array is the value matched by a single path or an auto-wrapped array of several matches? Does it matter?

Contributor Author

Good question. I will test select json_extract('{"a": [1, 2]}', '$.a') and select json_extract('{"a": [1, 2]}', "$.a[0]", "$.a[1]").

Contributor Author

These two statements return the same value on MySQL 5.7, so it seems we cannot distinguish them, and we do not need to.

Contributor

ok

	elemList := make([]JSON, 0, len(pathExprList))
	for _, pathExpr := range pathExprList {
		elemList = append(elemList, extract(j, pathExpr)...)
	}
	if len(elemList) == 0 {
		found = false
	} else if len(pathExprList) == 1 && len(elemList) == 1 {
Contributor

It seems len(elemList) will always be 1 here?

Contributor Author (hicqu, May 31, 2017)

If pathExpr contains any asterisks, len(elemList) won't be 1 even if len(pathExprList) equals 1.

		// If pathExpr contains asterisks, len(elemList) won't be 1
		// even if len(pathExprList) equals 1.
		found = true
		ret = elemList[0]
	} else {
		found = true
		ret.typeCode = typeCodeArray
		ret.array = append(ret.array, elemList...)
	}
	return
}

// Unquote is for JSON_UNQUOTE.
func (j JSON) Unquote() string {
	switch j.typeCode {
	case typeCodeString:
		return j.str
Contributor

Doesn't j.String() return j.str when the type code is typeCodeString?

Contributor Author

For example,

select json_unquote(`"hello, world"`); -- should return `hello, world`
-- but j.String() will return a json marshal string,
-- in this case will be `"hello, world"`

So I use j.str instead of j.String().

Contributor

Got it.

	default:
		return j.String()
	}
}
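
The difference between the raw string and its marshaled form, as explained in the thread above, can be reproduced with the standard library. This is a rough sketch, not the package's code: `unquote` is a hypothetical stand-in for JSON.Unquote, and `encoding/json` stands in for j.String().

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unquote is a hypothetical stand-in for JSON.Unquote on plain Go values:
// strings are returned raw, everything else is returned as its JSON text.
func unquote(v any) string {
	if s, ok := v.(string); ok {
		return s
	}
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Marshaling a string keeps the surrounding double quotes.
	marshaled, _ := json.Marshal("hello, world")
	fmt.Println(string(marshaled))

	// Unquoting returns the bare string content.
	fmt.Println(unquote("hello, world"))

	// Non-string values fall back to their JSON text.
	fmt.Println(unquote(map[string]any{"a": []int{1, 2}}))
}
```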

// extract is used by Extract.
// NOTE: the returned values are not deep copies; they may share underlying data with j.
func extract(j JSON, pathExpr PathExpression) (ret []JSON) {
	if len(pathExpr.legs) == 0 {
		return []JSON{j}
	}
	var currentLeg = pathExpr.legs[0]
	pathExpr.legs = pathExpr.legs[1:]
	if currentLeg.isArrayIndex && j.typeCode == typeCodeArray {
		if currentLeg.arrayIndex == arrayIndexAsterisk {
			for _, child := range j.array {
				ret = append(ret, extract(child, pathExpr)...)
			}
		} else if currentLeg.arrayIndex < len(j.array) {
			childRet := extract(j.array[currentLeg.arrayIndex], pathExpr)
			ret = append(ret, childRet...)
		}
Contributor

Can arrayIndex be less than 0 without being -1?

Contributor Author

No, it won't.

	} else if !currentLeg.isArrayIndex && j.typeCode == typeCodeObject {
		var key = pathExpr.raw[currentLeg.start:currentLeg.end]
		if len(key) == 1 && key[0] == '*' {
			var sortedKeys = getSortedKeys(j.object) // iterate over sorted keys.
			for _, child := range sortedKeys {
				ret = append(ret, extract(j.object[child], pathExpr)...)
			}
		} else if child, ok := j.object[key]; ok {
			childRet := extract(child, pathExpr)
			ret = append(ret, childRet...)
		}
	}
	return
}
102 changes: 102 additions & 0 deletions util/types/json/functions_test.go
@@ -0,0 +1,102 @@
// Copyright 2017 PingCAP, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.

package json

import (
	"bytes"

	. "github.com/pingcap/check"
)

func (s *testJSONSuite) TestJSONType(c *C) {
	j1 := parseFromStringPanic(`{"a": "b"}`)
	j2 := parseFromStringPanic(`["a", "b"]`)
	j3 := parseFromStringPanic(`3`)
	j4 := parseFromStringPanic(`3.0`)
	j5 := parseFromStringPanic(`null`)
	j6 := parseFromStringPanic(`true`)
	var jList = []struct {
		In JSON
Member

Using a string for In and parsing it inside the loop would be clearer.

		Out string
	}{
		{j1, "OBJECT"},
		{j2, "ARRAY"},
		{j3, "INTEGER"},
		{j4, "DOUBLE"},
		{j5, "NULL"},
		{j6, "BOOLEAN"},
	}
	for _, j := range jList {
		c.Assert(j.In.Type(), Equals, j.Out)
	}
}
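
The reviewer's suggestion (keep the raw text in the table and parse inside the loop) might look roughly like the sketch below. It is not the package's code: `typeName` is a hypothetical stand-in for JSON.Type built on `encoding/json`, which decodes every number as float64, so this version cannot separate INTEGER from DOUBLE.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typeName is a hypothetical stand-in for JSON.Type built on encoding/json.
// Note: encoding/json decodes every number as float64, so this sketch
// reports DOUBLE for all numbers.
func typeName(text string) (string, error) {
	var v any
	if err := json.Unmarshal([]byte(text), &v); err != nil {
		return "", err
	}
	switch v.(type) {
	case map[string]any:
		return "OBJECT", nil
	case []any:
		return "ARRAY", nil
	case nil:
		return "NULL", nil
	case bool:
		return "BOOLEAN", nil
	case float64:
		return "DOUBLE", nil
	case string:
		return "STRING", nil
	}
	return "", fmt.Errorf("unexpected value %v", v)
}

func main() {
	// The table holds the input text itself, so each case reads on one line.
	cases := []struct{ In, Out string }{
		{`{"a": "b"}`, "OBJECT"},
		{`["a", "b"]`, "ARRAY"},
		{`null`, "NULL"},
		{`true`, "BOOLEAN"},
	}
	for _, tc := range cases {
		got, err := typeName(tc.In) // parse inside the loop
		fmt.Println(tc.In, got, got == tc.Out, err)
	}
}
```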

func (s *testJSONSuite) TestJSONExtract(c *C) {
	j1 := parseFromStringPanic(`{"a": [1, "2", {"aa": "bb"}, 4.0, {"aa": "cc"}], "b": true, "c": ["d"]}`)
	j2 := parseFromStringPanic(`[{"a": 1, "b": true}, 3, 3.5, "hello, world", null, true]`)

	var caseList = []struct {
		j               JSON
		pathExprStrings []string
		expected        JSON
		found           bool
		err             error
	}{
		// test extract with only one path expression.
		{j1, []string{"$.a"}, j1.object["a"], true, nil},
		{j2, []string{"$.a"}, CreateJSON(nil), false, nil},
		{j1, []string{"$[0]"}, CreateJSON(nil), false, nil},
		{j2, []string{"$[0]"}, j2.array[0], true, nil},
		{j1, []string{"$.a[2].aa"}, CreateJSON("bb"), true, nil},
		{j1, []string{"$.a[*].aa"}, parseFromStringPanic(`["bb", "cc"]`), true, nil},
		{j1, []string{"$.*[0]"}, parseFromStringPanic(`[1, "d"]`), true, nil},

		// test extract with multi path expressions.
		{j1, []string{"$.a", "$[0]"}, parseFromStringPanic(`[[1, "2", {"aa": "bb"}, 4.0, {"aa": "cc"}]]`), true, nil},
		{j2, []string{"$.a", "$[0]"}, parseFromStringPanic(`[{"a": 1, "b": true}]`), true, nil},
	}

	for _, caseItem := range caseList {
		var pathExprList = make([]PathExpression, 0)
		for _, peStr := range caseItem.pathExprStrings {
			pe, err := ParseJSONPathExpr(peStr)
			c.Assert(err, IsNil)
			pathExprList = append(pathExprList, pe)
		}

		expected, found := caseItem.j.Extract(pathExprList)
		c.Assert(found, Equals, caseItem.found)
		if found {
			b1 := Serialize(expected)
			b2 := Serialize(caseItem.expected)
			c.Assert(bytes.Compare(b1, b2), Equals, 0)
		}
	}
}

func (s *testJSONSuite) TestJSONUnquote(c *C) {
	var caseList = []struct {
		j JSON
Member

Using the string type for j would be easier to read.

		unquoted string
	}{
		{j: parseFromStringPanic(`3`), unquoted: "3"},
		{j: parseFromStringPanic(`"3"`), unquoted: "3"},
		{j: parseFromStringPanic(`true`), unquoted: "true"},
		{j: parseFromStringPanic(`null`), unquoted: "null"},
		{j: parseFromStringPanic(`{"a": [1, 2]}`), unquoted: `{"a":[1,2]}`},
	}
	for _, caseItem := range caseList {
		c.Assert(caseItem.j.Unquote(), Equals, caseItem.unquoted)
	}
}
26 changes: 0 additions & 26 deletions util/types/json/json.go
@@ -123,32 +123,6 @@ func (j JSON) String() string {
return strings.TrimSpace(hack.String(bytes))
}

// Type returns type of JSON as string.
func (j JSON) Type() string {
switch j.typeCode {
case typeCodeObject:
return "OBJECT"
case typeCodeArray:
return "ARRAY"
case typeCodeLiteral:
switch byte(j.i64) {
case jsonLiteralNil:
return "NULL"
default:
return "BOOLEAN"
}
case typeCodeInt64:
return "INTEGER"
case typeCodeFloat64:
return "DOUBLE"
case typeCodeString:
return "STRING"
default:
msg := fmt.Sprintf(unknownTypeCodeErrorMsg, j.typeCode)
panic(msg)
}
}

var (
// ErrInvalidJSONText means invalid JSON text.
ErrInvalidJSONText = terror.ClassJSON.New(mysql.ErrInvalidJSONText, mysql.MySQLErrName[mysql.ErrInvalidJSONText])
93 changes: 28 additions & 65 deletions util/types/json/json_test.go
@@ -14,6 +14,7 @@
package json

import (
"fmt"
"testing"

. "github.com/pingcap/check"
@@ -27,19 +28,28 @@ func TestT(t *testing.T) {
TestingT(t)
}

func parseFromStringPanic(s string) JSON {
Contributor

parseFromStringPanic is not a good name; mustParseFromString would conform to Go idiom.

	j, err := ParseFromString(s)
	if err != nil {
		msg := fmt.Sprintf("ParseFromString(%s) fail", s)
		panic(msg)
	}
	return j
}

func (s *testJSONSuite) TestParseFromString(c *C) {
	jstr1 := `{"a": [1, "2", {"aa": "bb"}, 4, null], "b": true, "c": null}`
	jstr2 := parseFromStringPanic(jstr1).String()
	c.Assert(jstr2, Equals, `{"a":[1,"2",{"aa":"bb"},4,null],"b":true,"c":null}`)
}

func (s *testJSONSuite) TestJSONSerde(c *C) {
Member

Serde is not a commonly used abbreviation. Please use full words. JSON can also be dropped from the name, since the test suite is already named JSON.

var jsonNilValue = CreateJSON(nil)
var jsonBoolValue = CreateJSON(true)
var jsonDoubleValue = CreateJSON(3.24)
var jsonStringValue = CreateJSON("hello, 世界")

var jstr1 = `{"aaaaaaaaaaa": [1, "2", {"aa": "bb"}, 4.0], "bbbbbbbbbb": true, "ccccccccc": "d"}`
j1, err := ParseFromString(jstr1)
c.Assert(err, IsNil)

var jstr2 = `[{"a": 1, "b": true}, 3, 3.5, "hello, world", null, true]`
j2, err := ParseFromString(jstr2)
c.Assert(err, IsNil)
j1 := parseFromStringPanic(`{"aaaaaaaaaaa": [1, "2", {"aa": "bb"}, 4.0], "bbbbbbbbbb": true, "ccccccccc": "d"}`)
j2 := parseFromStringPanic(`[{"a": 1, "b": true}, 3, 3.5, "hello, world", null, true]`)

var testcses = []struct {
In JSON
@@ -64,63 +74,17 @@ func (s *testJSONSuite) TestJSONSerde(c *C) {
}
}

func (s *testJSONSuite) TestParseFromString(c *C) {
var jstr1 = `{"a": [1, "2", {"aa": "bb"}, 4, null], "b": true, "c": null}`

j1, err := ParseFromString(jstr1)
c.Assert(err, IsNil)

var jstr2 = j1.String()
c.Assert(jstr2, Equals, `{"a":[1,"2",{"aa":"bb"},4,null],"b":true,"c":null}`)
}

func (s *testJSONSuite) TestJSONType(c *C) {
j1, err := ParseFromString(`{"a": "b"}`)
c.Assert(err, IsNil)

j2, err := ParseFromString(`["a", "b"]`)
c.Assert(err, IsNil)

j3, err := ParseFromString(`3`)
c.Assert(err, IsNil)

j4, err := ParseFromString(`3.0`)
c.Assert(err, IsNil)

j5, err := ParseFromString(`null`)
c.Assert(err, IsNil)

j6, err := ParseFromString(`true`)
c.Assert(err, IsNil)

var jList = []struct {
In JSON
Out string
}{
{j1, "OBJECT"},
{j2, "ARRAY"},
{j3, "INTEGER"},
{j4, "DOUBLE"},
{j5, "NULL"},
{j6, "BOOLEAN"},
}

for _, j := range jList {
c.Assert(j.In.Type(), Equals, j.Out)
}
}

func (s *testJSONSuite) TestCompareJSON(c *C) {
jNull, _ := ParseFromString(`null`)
jBoolTrue, _ := ParseFromString(`true`)
jBoolFalse, _ := ParseFromString(`false`)
jIntegerLarge, _ := ParseFromString(`5`)
jIntegerSmall, _ := ParseFromString(`3`)
jStringLarge, _ := ParseFromString(`"hello, world"`)
jStringSmall, _ := ParseFromString(`"hello"`)
jArrayLarge, _ := ParseFromString(`["a", "c"]`)
jArraySmall, _ := ParseFromString(`["a", "b"]`)
jObject, _ := ParseFromString(`{"a": "b"}`)
jNull := parseFromStringPanic(`null`)
jBoolTrue := parseFromStringPanic(`true`)
jBoolFalse := parseFromStringPanic(`false`)
jIntegerLarge := parseFromStringPanic(`5`)
jIntegerSmall := parseFromStringPanic(`3`)
jStringLarge := parseFromStringPanic(`"hello, world"`)
jStringSmall := parseFromStringPanic(`"hello"`)
jArrayLarge := parseFromStringPanic(`["a", "c"]`)
jArraySmall := parseFromStringPanic(`["a", "b"]`)
jObject := parseFromStringPanic(`{"a": "b"}`)

var caseList = []struct {
left JSON
@@ -136,7 +100,6 @@ func (s *testJSONSuite) TestCompareJSON(c *C) {
{jArrayLarge, jBoolFalse},
{jBoolFalse, jBoolTrue},
}

for _, cmpCase := range caseList {
cmp, err := CompareJSON(cmpCase.left, cmpCase.right)
c.Assert(err, IsNil)