Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coordinator] Influxdb write endpoint tag copy fix #2126

Merged
merged 3 commits into from
Feb 13, 2020
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 16 additions & 1 deletion src/query/api/v1/handler/influxdb/write.go
Original file line number Diff line number Diff line change
Expand Up @@ -188,11 +188,26 @@ func (ii *ingestIterator) Next() bool {
return false
}

func copyTagsWithNewName(t models.Tags, name []byte) models.Tags {
copiedTags := make([]models.Tag, t.Len())
metricName := t.Opts.MetricName()
nameHandled := false
for i, tag := range t.Tags {
if !nameHandled && bytes.Equal(tag.Name, metricName) {
copiedTags[i] = models.Tag{Name: metricName, Value: name}
nameHandled = true
} else {
copiedTags[i] = tag
}
}
return models.Tags{Tags: copiedTags, Opts: t.Opts}
}

func (ii *ingestIterator) Current() (models.Tags, ts.Datapoints, xtime.Unit, []byte) {
if ii.pointIndex < len(ii.points) && ii.nextFieldIndex > 0 && len(ii.fields) > (ii.nextFieldIndex-1) {
point := ii.points[ii.pointIndex]
field := ii.fields[ii.nextFieldIndex-1]
tags := ii.tags.SetName(field.name)
tags := copyTagsWithNewName(ii.tags, field.name)

return tags, []ts.Datapoint{ts.Datapoint{Timestamp: point.Time(),
Value: field.value}}, xtime.Nanosecond, nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I missed this before, it might be worth trying to choose the most granular precision to improve the M3TSZ compression ratio.

Otherwise there could be high variability in the delta-of-delta encoding when not required (i.e. a delta-of-delta encoding 4000000 nanoseconds takes many more bits than a delta-to-delta encoding of 4 milliseconds).

You could basically divide each time unit by the nanoseconds and see which one falls within exactly each unit then choose the highest occuring one that equally divides (i.e. try seconds first, then millisecond, then so on). If they exactly fall into a unit, then emit with that unit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wondered about that, because we get pretty lousy M3TSZ compression ratio. That would explain it, I will address that too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it in commit, but it's only ~10% space savings in our test load (1/N scale of real thing). However, better than nothing.

Expand Down