executor: support group_concat under new aggregation evaluation framework #7032
// limitations under the License.

package aggfuncs

import (
Please reorganize the imported packages.
	baseAggFunc
}

func (e *baseGroupConcat4String) AppendFinalResult2Chunk(sctx sessionctx.Context, pr PartialResult, chk *chunk.Chunk) error {
Put the functions belonging to baseGroupConcat4String together.
baseGroupConcat4String only has one function.
			p.buffer.WriteString(s)
		}
	}
	// TODO: if total length is greater than global var group_concat_max_len, truncate it.
Please file an issue for this TODO and put the issue number in this comment.
func (e *groupConcat4String) ResetPartialResult(pr PartialResult) {
	p := (*partialResult4ConcatString)(pr)
	p.sep, p.sepInited, p.buffer = "", false, nil
p.sep should be a field of groupConcat4String, not of the partial result. The separator can only be a constant and should be considered a property of the group_concat function.
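A minimal sketch of this suggestion, using simplified and hypothetical stand-in types rather than the real AggFunc interface: the constant separator is stored once on the function object, and resetting a partial result only clears per-group state.

```go
package main

import (
	"bytes"
	"fmt"
)

// groupConcat is a simplified, hypothetical stand-in for the aggregate
// function object. The separator is a constant, so it is stored here once.
type groupConcat struct {
	sep string
}

// partialResult4GroupConcat carries only per-group state.
type partialResult4GroupConcat struct {
	buffer bytes.Buffer
}

// resetPartialResult clears the per-group buffer and leaves sep untouched.
func (e *groupConcat) resetPartialResult(p *partialResult4GroupConcat) {
	p.buffer.Reset()
}

func main() {
	e := &groupConcat{sep: ","}
	p := &partialResult4GroupConcat{}
	p.buffer.WriteString("a" + e.sep + "b")
	fmt.Println(p.buffer.String())

	// After a reset the buffer is empty, but the separator is still set.
	e.resetPartialResult(p)
	fmt.Println(p.buffer.Len(), e.sep)
}
```

With this layout the sepInited flag is no longer needed, because the separator never changes between groups.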
/run-all-tests
func (e *groupConcat4String) UpdatePartialResult(sctx sessionctx.Context, rowsInGroup []chunk.Row, pr PartialResult) (err error) {
	p := (*partialResult4ConcatString)(pr)
	if !e.sepInited {
Can e.sep be initialized when we create the groupConcat4String object?
executor/aggfuncs/builder.go (outdated)
		}
	}
	base := baseAggFunc{
		args: aggFuncDesc.Args,
How about args: aggFuncDesc.Args[:len(aggFuncDesc.Args)-1]?
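A small sketch of what this slicing does, assuming the last argument of group_concat is the separator. Plain strings stand in for the expression arguments, and splitGroupConcatArgs is a hypothetical helper, not part of the PR.

```go
package main

import "fmt"

// splitGroupConcatArgs is a hypothetical helper. It assumes the last element
// of args is the separator and returns the remaining elements as the values
// to concatenate. ok is false when args is empty.
func splitGroupConcatArgs(args []string) (values []string, sep string, ok bool) {
	if len(args) == 0 {
		return nil, "", false
	}
	last := len(args) - 1
	return args[:last], args[last], true
}

func main() {
	values, sep, ok := splitGroupConcatArgs([]string{"col_a", "col_b", ","})
	fmt.Println(values, sep, ok)
}
```

Keeping only the value arguments in args means the update loop never has to skip the separator explicitly.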
		}
		if isNull {
			continue
		}
This comment was marked as resolved.
PTAL @zz-jason
@XuHuaiyu We need some benchmarks to decide whether it is worth making that change.
/run-all-tests tidb-test=pr/573
@zz-jason
import (
	"bytes"
	"testing"
)

// BenchmarkString copies each value into valBuf before writing it to the buffer.
func BenchmarkString(b *testing.B) {
	buffer := &bytes.Buffer{}
	array := []string{"abc", "bcd", "efg", "hij"}
	b.ResetTimer()
	valBuf := make([]string, len(array))
	for i := 0; i < b.N; i++ {
		for j := range array {
			valBuf[j] = array[j]
		}
		for j := range valBuf {
			buffer.WriteString(valBuf[j])
		}
	}
}

// BenchmarkString2 writes each value to the buffer directly.
func BenchmarkString2(b *testing.B) {
	buffer := &bytes.Buffer{}
	array := []string{"abc", "bcd", "efg", "hij"}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		for j := range array {
			buffer.WriteString(array[j])
		}
	}
}

BenchmarkString-4 20000000 70.1 ns/op 28 B/op 0 allocs/op
(70.1 - 59.8) / 59.8 is about 17%, so I think it is worth refining.
we need to reset |
it's ok to not reset |
			p.buffer.WriteString(e.sep)
		}
	}
	p.buffer.Truncate(p.buffer.Len() - 1)
Can this be removed?
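One way to drop the Truncate call, sketched under the assumption that the separator is currently written after every value: write it before every value except the first. The helper names below are hypothetical. Note also that trimming a trailing separator has to remove len(sep) bytes, so a fixed 1 is only right for single-byte separators.

```go
package main

import (
	"bytes"
	"fmt"
)

// concatTrailing writes sep after every value and then trims the tail.
// It trims len(sep) bytes, which stays correct for multi-byte separators.
func concatTrailing(vals []string, sep string) string {
	var buf bytes.Buffer
	for _, v := range vals {
		buf.WriteString(v)
		buf.WriteString(sep)
	}
	if len(vals) > 0 {
		buf.Truncate(buf.Len() - len(sep))
	}
	return buf.String()
}

// concatLeading writes sep before every value except the first, so no
// truncation is needed afterwards.
func concatLeading(vals []string, sep string) string {
	var buf bytes.Buffer
	for i, v := range vals {
		if i > 0 {
			buf.WriteString(sep)
		}
		buf.WriteString(v)
	}
	return buf.String()
}

func main() {
	vals := []string{"a", "b", "c"}
	fmt.Println(concatTrailing(vals, ", "))
	fmt.Println(concatLeading(vals, ", "))
}
```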
		}
		p.buffer.WriteString(v)
	}
	if isWriteSep {
This comment was marked as resolved.
		if isNull {
			continue
		}
		valsBuf[i] = v
valsBuf can be optimized out.
valsBuf should be kept, because we need all the values here to check for distinctness later.
Can we make valsBuf a bytes.Buffer? Maybe it's faster than strings.Join.
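A rough sketch of the two options for building the key used by the distinct check. keyBuilder and joinWithStrings are hypothetical names, and joining with an empty separator is an assumption of the sketch, not something the PR states. Whether the buffer version is actually faster would need a benchmark.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// joinWithStrings builds the key for the distinct check with strings.Join.
func joinWithStrings(vals []string) string {
	return strings.Join(vals, "")
}

// keyBuilder builds the same key by appending into a reusable buffer.
type keyBuilder struct {
	buf bytes.Buffer
}

// key resets the buffer, appends every value, and returns a copy as a string.
func (k *keyBuilder) key(vals []string) string {
	k.buf.Reset()
	for _, v := range vals {
		k.buf.WriteString(v)
	}
	return k.buf.String()
}

func main() {
	seen := make(map[string]struct{})
	kb := &keyBuilder{}
	rows := [][]string{{"a", "b"}, {"c", "d"}, {"a", "b"}}
	for _, row := range rows {
		key := kb.key(row)
		if _, dup := seen[key]; dup {
			continue // this combination was already seen in the group
		}
		seen[key] = struct{}{}
		fmt.Println(key, key == joinWithStrings(row))
	}
}
```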
/run-all-tests
executor/aggfuncs/builder.go (outdated)
	sep, _, err := c.EvalString(nil, nil)
	// This err will never happen, we check it here for passing the errcheck.
	if err != nil {
		log.Warning("Error happened when buildGroupConcat:", errors.Trace(err).Error())
I think we should return nil or panic here to avoid further execution.
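A minimal sketch of the "return nil" variant. evalSeparator and this simplified buildGroupConcat are hypothetical stand-ins for the real builder code; the point is only that the builder stops as soon as the separator cannot be evaluated.

```go
package main

import (
	"errors"
	"fmt"
)

// groupConcat is a simplified, hypothetical stand-in for the aggregate function.
type groupConcat struct {
	sep string
}

// evalSeparator is a hypothetical stand-in for evaluating the constant
// separator expression. It fails when ok is false.
func evalSeparator(raw string, ok bool) (string, error) {
	if !ok {
		return "", errors.New("cannot evaluate separator")
	}
	return raw, nil
}

// buildGroupConcat returns nil on failure, so callers never continue with a
// function object whose separator was not initialized.
func buildGroupConcat(raw string, ok bool) *groupConcat {
	sep, err := evalSeparator(raw, ok)
	if err != nil {
		fmt.Println("buildGroupConcat failed:", err)
		return nil
	}
	return &groupConcat{sep: sep}
}

func main() {
	fmt.Println(buildGroupConcat(",", true) != nil)
	fmt.Println(buildGroupConcat(",", false) == nil)
}
```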
PTAL @zz-jason
/run-all-tests
executor/aggfuncs/aggfuncs.go (outdated)
@@ -38,6 +38,9 @@ var (
	// All the AggFunc implementations for "MAX" are listed here.
	// All the AggFunc implementations for "MIN" are listed here.
	// All the AggFunc implementations for "GROUP_CONCAT" are listed here.
	_ AggFunc = (*groupConcat4DistinctString)(nil)
	_ AggFunc = (*groupConcat4String)(nil)
group_concat always executes on string inputs, so I think we can remove the 4String suffix. How about groupConcat and groupConcatDistinct?
	buffer *bytes.Buffer
}

type partialResult4ConcatString struct {
How about partialResult4GroupConcat?
	v, isNull := "", false
	for _, row := range rowsInGroup {
		isWriteSep := false
		for i, l := 0, len(e.args); i < l; i++ {
for _, arg := range e.args would be simpler.
	p := (*partialResult4ConcatString)(pr)
	v, isNull := "", false
	for _, row := range rowsInGroup {
		isWriteSep := false
How about s/isWriteSep/writeSep/?
I prefer the current name. 😄
	return nil
}

type partialResult4ConcatDistinctString struct {
How about partialResult4GroupConcatDistinct?
	p := (*partialResult4ConcatDistinctString)(pr)
	v, isNull, valsBuf, joinedVals := "", false, make([]string, len(e.args)), ""
	for _, row := range rowsInGroup {
		for i, l := 0, len(e.args); i < l; i++ {
Ditto.
@AndreMouche is working on the failed case.
Comments addressed, PTAL @zz-jason
LGTM
executor/aggfuncs/builder.go (outdated)
	base := baseAggFunc{
		args:    aggFuncDesc.Args[:len(aggFuncDesc.Args)-1],
		ordinal: ordinal,
	}
How about moving base (line#201~204) to below default (line#209)?
	p := (*partialResult4GroupConcat)(pr)
	v, isNull := "", false
	for _, row := range rowsInGroup {
		isWriteSep := false
How about defining isWriteSep outside the for loop?
LGTM
/run-all-tests
What have you changed? (mandatory)
Implement groupConcat4DistinctString and groupConcat4String.
What is the type of the changes? (mandatory)
Improvement
How has this PR been tested? (mandatory)
The existing test cases.
Does this PR affect documentation (docs/docs-cn) update? (mandatory)
No
Does this PR affect tidb-ansible update? (mandatory)
No
Does this PR need to be added to the release notes? (mandatory)
No
Refer to a related PR or issue link (optional)
#6952
Benchmark result if necessary (optional)
Add a few positive/negative examples (optional)