Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Mark counter fields as unsupported #99054

Merged
merged 9 commits into from
Sep 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/changelog/99054.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 99054
summary: "ESQL: Mark counter fields as unsupported"
area: ES|QL
type: enhancement
issues: []
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,10 @@ setup:
properties:
tx:
type: long
time_series_metric: counter
rx:
type: long
time_series_metric: counter
- do:
bulk:
refresh: true
Expand Down Expand Up @@ -70,9 +72,9 @@ load everything:
- match: {columns.2.name: "k8s.pod.name"}
- match: {columns.2.type: "keyword"}
- match: {columns.3.name: "k8s.pod.network.rx"}
- match: {columns.3.type: "long"}
- match: {columns.3.type: "unsupported"}
- match: {columns.4.name: "k8s.pod.network.tx"}
- match: {columns.4.type: "long"}
- match: {columns.4.type: "unsupported"}
- match: {columns.5.name: "k8s.pod.uid"}
- match: {columns.5.type: "keyword"}
- match: {columns.6.name: "metricset"}
Expand All @@ -84,14 +86,22 @@ load a document:
- do:
esql.query:
body:
query: 'from test | where k8s.pod.network.tx == 1434577921'
query: 'from test | where @timestamp == "2021-04-28T18:50:23.142Z"'

- length: {values: 1}
- length: {values.0: 7}
- match: {values.0.0: "2021-04-28T18:50:23.142Z"}
- match: {values.0.1: "10.10.55.3"}
- match: {values.0.2: "dog"}
- match: {values.0.3: 530600088}
- match: {values.0.4: 1434577921}
- match: {values.0.3: "<unsupported>"}
- match: {values.0.4: "<unsupported>"}
- match: {values.0.5: "df3145b3-0563-4d3b-a0f7-897eb2876ea9"}
- match: {values.0.6: "pod"}

---
filter on counter:
- do:
catch: /Cannot use field \[k8s.pod.network.tx\] with unsupported type \[counter\]/
esql.query:
body:
query: 'from test | where k8s.pod.network.tx == 1434577921'
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
setup:
- do:
indices.create:
index: test
body:
settings:
index:
mode: standard
mappings:
properties:
"@timestamp":
type: date
metricset:
type: keyword
time_series_dimension: true
k8s:
properties:
pod:
properties:
uid:
type: keyword
time_series_dimension: true
name:
type: keyword
ip:
type: ip
network:
properties:
tx:
type: long
time_series_metric: counter
rx:
type: long
time_series_metric: counter
- do:
bulk:
refresh: true
index: test
body:
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:50:04.467Z", "metricset": "pod", "k8s": {"pod": {"name": "cat", "uid":"947e4ced-1786-4e53-9e0c-5c447e959507", "ip": "10.10.55.1", "network": {"tx": 2001818691, "rx": 802133794}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:50:24.467Z", "metricset": "pod", "k8s": {"pod": {"name": "cat", "uid":"947e4ced-1786-4e53-9e0c-5c447e959507", "ip": "10.10.55.1", "network": {"tx": 2005177954, "rx": 801479970}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:50:44.467Z", "metricset": "pod", "k8s": {"pod": {"name": "cat", "uid":"947e4ced-1786-4e53-9e0c-5c447e959507", "ip": "10.10.55.1", "network": {"tx": 2006223737, "rx": 802337279}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:51:04.467Z", "metricset": "pod", "k8s": {"pod": {"name": "cat", "uid":"947e4ced-1786-4e53-9e0c-5c447e959507", "ip": "10.10.55.2", "network": {"tx": 2012916202, "rx": 803685721}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:50:03.142Z", "metricset": "pod", "k8s": {"pod": {"name": "dog", "uid":"df3145b3-0563-4d3b-a0f7-897eb2876ea9", "ip": "10.10.55.3", "network": {"tx": 1434521831, "rx": 530575198}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:50:23.142Z", "metricset": "pod", "k8s": {"pod": {"name": "dog", "uid":"df3145b3-0563-4d3b-a0f7-897eb2876ea9", "ip": "10.10.55.3", "network": {"tx": 1434577921, "rx": 530600088}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:50:53.142Z", "metricset": "pod", "k8s": {"pod": {"name": "dog", "uid":"df3145b3-0563-4d3b-a0f7-897eb2876ea9", "ip": "10.10.55.3", "network": {"tx": 1434587694, "rx": 530604797}}}}'
- '{"index": {}}'
- '{"@timestamp": "2021-04-28T18:51:03.142Z", "metricset": "pod", "k8s": {"pod": {"name": "dog", "uid":"df3145b3-0563-4d3b-a0f7-897eb2876ea9", "ip": "10.10.55.3", "network": {"tx": 1434595272, "rx": 530605511}}}}'

---
load everything:
- do:
esql.query:
body:
query: 'from test'

- match: {columns.0.name: "@timestamp"}
- match: {columns.0.type: "date"}
- match: {columns.1.name: "k8s.pod.ip"}
- match: {columns.1.type: "ip"}
- match: {columns.2.name: "k8s.pod.name"}
- match: {columns.2.type: "keyword"}
- match: {columns.3.name: "k8s.pod.network.rx"}
- match: {columns.3.type: "long"}
- match: {columns.4.name: "k8s.pod.network.tx"}
- match: {columns.4.type: "long"}
- match: {columns.5.name: "k8s.pod.uid"}
- match: {columns.5.type: "keyword"}
- match: {columns.6.name: "metricset"}
- match: {columns.6.type: "keyword"}
- length: {values: 8}

---
load a document:
- do:
esql.query:
body:
query: 'from test | where @timestamp == "2021-04-28T18:50:23.142Z"'

- length: {values: 1}
- length: {values.0: 7}
- match: {values.0.0: "2021-04-28T18:50:23.142Z"}
- match: {values.0.1: "10.10.55.3"}
- match: {values.0.2: "dog"}
- match: {values.0.3: 530600088}
- match: {values.0.4: 1434577921}
- match: {values.0.5: "df3145b3-0563-4d3b-a0f7-897eb2876ea9"}
- match: {values.0.6: "pod"}

---
filter on counter:
- do:
esql.query:
body:
query: 'from test | where k8s.pod.network.tx == 1434577921'

- length: {values: 1}
- length: {values.0: 7}
- match: {values.0.0: "2021-04-28T18:50:23.142Z"}
- match: {values.0.1: "10.10.55.3"}
- match: {values.0.2: "dog"}
- match: {values.0.3: 530600088}
- match: {values.0.4: 1434577921}
- match: {values.0.5: "df3145b3-0563-4d3b-a0f7-897eb2876ea9"}
- match: {values.0.6: "pod"}
Original file line number Diff line number Diff line change
Expand Up @@ -247,7 +247,7 @@ private static Object valueAt(String dataType, Block block, int offset, BytesRef
*/
private static Page valuesToPage(List<String> dataTypes, List<List<Object>> values) {
List<Block.Builder> results = dataTypes.stream()
.map(c -> LocalExecutionPlanner.toElementType(EsqlDataTypes.fromEs(c)).newBlockBuilder(values.size()))
.map(c -> LocalExecutionPlanner.toElementType(EsqlDataTypes.fromName(c)).newBlockBuilder(values.size()))
.toList();

for (List<Object> row : values) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@

package org.elasticsearch.xpack.esql.type;

import org.elasticsearch.index.mapper.TimeSeriesParams;
import org.elasticsearch.xpack.ql.type.DataType;
import org.elasticsearch.xpack.ql.type.DataTypeConverter;
import org.elasticsearch.xpack.ql.type.DataTypeRegistry;
Expand All @@ -29,8 +30,12 @@ public Collection<DataType> dataTypes() {
}

@Override
public DataType fromEs(String typeName) {
return EsqlDataTypes.fromEs(typeName);
public DataType fromEs(String typeName, TimeSeriesParams.MetricType metricType) {
if (metricType == TimeSeriesParams.MetricType.COUNTER) {
// Counter fields will be a counter type, for now they are unsupported
return DataTypes.UNSUPPORTED;
}
return EsqlDataTypes.fromName(typeName);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ public static DataType fromTypeName(String name) {
return NAME_TO_TYPE.get(name.toLowerCase(Locale.ROOT));
}

public static DataType fromEs(String name) {
public static DataType fromName(String name) {
DataType type = ES_TO_TYPE.get(name);
return type != null ? type : UNSUPPORTED;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ private ColumnInfo randomColumnInfo() {

private Page randomPage(List<ColumnInfo> columns) {
return new Page(columns.stream().map(c -> {
Block.Builder builder = LocalExecutionPlanner.toElementType(EsqlDataTypes.fromEs(c.type())).newBlockBuilder(1);
Block.Builder builder = LocalExecutionPlanner.toElementType(EsqlDataTypes.fromName(c.type())).newBlockBuilder(1);
switch (c.type()) {
case "unsigned_long", "long" -> ((LongBlock.Builder) builder).appendLong(randomLong());
case "integer" -> ((IntBlock.Builder) builder).appendInt(randomInt());
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
package org.elasticsearch.xpack.esql.type;

import org.elasticsearch.action.fieldcaps.FieldCapabilities;
import org.elasticsearch.action.fieldcaps.FieldCapabilitiesResponse;
import org.elasticsearch.index.mapper.TimeSeriesParams;
import org.elasticsearch.test.ESTestCase;
import org.elasticsearch.xpack.ql.index.IndexResolution;
import org.elasticsearch.xpack.ql.index.IndexResolver;
import org.elasticsearch.xpack.ql.type.DataType;
import org.elasticsearch.xpack.ql.type.DataTypes;
import org.elasticsearch.xpack.ql.type.EsField;

import java.util.Map;

import static org.hamcrest.Matchers.equalTo;

public class EsqlDataTypeRegistryTests extends ESTestCase {
public void testCounter() {
resolve("long", TimeSeriesParams.MetricType.COUNTER, DataTypes.UNSUPPORTED);
}

public void testGauge() {
resolve("long", TimeSeriesParams.MetricType.GAUGE, DataTypes.LONG);
}

public void testLong() {
resolve("long", null, DataTypes.LONG);
}

private void resolve(String esTypeName, TimeSeriesParams.MetricType metricType, DataType expected) {
String[] indices = new String[] { "idx-" + randomAlphaOfLength(5) };
FieldCapabilities fieldCap = new FieldCapabilities(
randomAlphaOfLength(3),
esTypeName,
false,
true,
true,
false,
metricType,
indices,
null,
null,
null,
null,
Map.of()
);
FieldCapabilitiesResponse caps = new FieldCapabilitiesResponse(indices, Map.of(fieldCap.getName(), Map.of(esTypeName, fieldCap)));
IndexResolution resolution = IndexResolver.mergedMappings(EsqlDataTypeRegistry.INSTANCE, "idx-*", caps);

EsField f = resolution.get().mapping().get(fieldCap.getName());
assertThat(f.getDataType(), equalTo(expected));
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
import org.elasticsearch.common.util.Maps;
import org.elasticsearch.core.Tuple;
import org.elasticsearch.index.IndexNotFoundException;
import org.elasticsearch.index.mapper.TimeSeriesParams;
import org.elasticsearch.transport.NoSuchRemoteClusterException;
import org.elasticsearch.xpack.ql.QlIllegalArgumentException;
import org.elasticsearch.xpack.ql.type.DataType;
Expand Down Expand Up @@ -453,15 +454,23 @@ private static EsField createField(
// lack of parent implies the field is an alias
if (map == null) {
// as such, create the field manually, marking the field to also be an alias
fieldFunction = s -> createField(typeRegistry, s, OBJECT.esType(), new TreeMap<>(), false, true);
fieldFunction = s -> createField(typeRegistry, s, OBJECT.esType(), null, new TreeMap<>(), false, true);
} else {
Iterator<FieldCapabilities> iterator = map.values().iterator();
FieldCapabilities parentCap = iterator.next();
if (iterator.hasNext() && UNMAPPED.equals(parentCap.getType())) {
parentCap = iterator.next();
}
final FieldCapabilities parentC = parentCap;
fieldFunction = s -> createField(typeRegistry, s, parentC.getType(), new TreeMap<>(), parentC.isAggregatable(), false);
fieldFunction = s -> createField(
typeRegistry,
s,
parentC.getType(),
parentC.getMetricType(),
new TreeMap<>(),
parentC.isAggregatable(),
false
);
}

parent = createField(typeRegistry, parentName, globalCaps, hierarchicalMapping, flattedMapping, fieldFunction);
Expand Down Expand Up @@ -495,11 +504,12 @@ private static EsField createField(
DataTypeRegistry typeRegistry,
String fieldName,
String typeName,
TimeSeriesParams.MetricType metricType,
Map<String, EsField> props,
boolean isAggregateable,
boolean isAlias
) {
DataType esType = typeRegistry.fromEs(typeName);
DataType esType = typeRegistry.fromEs(typeName, metricType);

if (esType == TEXT) {
return new TextEsField(fieldName, props, false, isAlias);
Expand All @@ -514,7 +524,8 @@ private static EsField createField(
return DateEsField.dateEsField(fieldName, props, isAggregateable);
}
if (esType == UNSUPPORTED) {
return new UnsupportedEsField(fieldName, typeName, null, props);
String originalType = metricType == TimeSeriesParams.MetricType.COUNTER ? "counter" : typeName;
return new UnsupportedEsField(fieldName, originalType, null, props);
}

return new EsField(fieldName, esType, props, isAggregateable, isAlias);
Expand Down Expand Up @@ -727,7 +738,15 @@ private static void createField(
indexFields.flattedMapping,
s -> invalidField != null
? invalidField
: createField(typeRegistry, s, typeCap.getType(), emptyMap(), typeCap.isAggregatable(), isAliasFieldType.get())
: createField(
typeRegistry,
s,
typeCap.getType(),
typeCap.getMetricType(),
emptyMap(),
typeCap.isAggregatable(),
isAliasFieldType.get()
)
);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@

package org.elasticsearch.xpack.ql.type;

import org.elasticsearch.index.mapper.TimeSeriesParams;

import java.util.Collection;

/**
Expand All @@ -19,7 +21,7 @@ public interface DataTypeRegistry {
//
Collection<DataType> dataTypes();

DataType fromEs(String typeName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try to make this a bit more generic, ie instead of specifically looking at TimeSeriesParams.MetricType I'd try to use the FieldCapabilities instance itself from where I could extract whatever information necessary to decide the supportability of a data type. Another suggestion is to look into defining a second fromEs method that receives this FieldCapabilities parameter and keep the original one unchanged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the IndexResolver has been created the field_caps API didn't return such additional information and I am not seeing any less intrusive way to support this other than what you suggested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be worried about keeping the original one unchanged because it's kind of dangerous, at least in ESQL it is. But I'm happy to move the whole field caps response there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm stumbling over the same issue - how about encoding the TIME_SERIES_METRIC_PARAM inside the name along with its value, as a suffix for example?
This convention avoids having to extend the static method and can handle in the future more parameter or special types outside the metric ones without having to extend the method yet again.

For example when dealing with a timeseries metric type, append that to the typeName itself and then check in EsqlDataType if the typeName ends up TIME_SERIES_METRIC_PARAM-COUNTER.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding it to the string feels easy to break. It'd be even more likely for folks to accidentally not use the metric param.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at taking the whole FieldCapabilities object but there is a place we don't have it - in the mapping walking code. I'm not sure it's worth trying to get it there when adding the smaller parameter works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a generic object as field context instead of typing it to TimeSeriesParams.MetricType and doing an instanceof check?
It is not OOP-ish but in this case we're looking at a bag properties for fields vs just the name.
For example how about DataType fromEs(String typeName, Object typeContext)
and then in ESQL do:

if (typeContext ==TimeSeriesParams.MetricType.COUNTER)) {
  ...
}

I'd like to avoid us to add a new param for each special type on the method signature and instead use a "bucket" type approach.
If you're on the fence, we can merge the PR and revisit this in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get where your coming from. My preference would be to merge as is and then, when we got to add the next "funny" parameter, we do something like:

record FromEs(String typeName, TimeSeriesParams.MetricType metricType, TheNewThing newThing) {}

DataType fromEs(FromEs fromEs) {

If it feels sensible then. I don't feel like we need something so fancy for two fields. But three or our we really may.

DataType fromEs(String typeName, TimeSeriesParams.MetricType metricType);

DataType fromJava(Object value);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@

package org.elasticsearch.xpack.ql.type;

import org.elasticsearch.index.mapper.TimeSeriesParams;

import java.util.Collection;

public class DefaultDataTypeRegistry implements DataTypeRegistry {
Expand All @@ -21,7 +23,7 @@ public Collection<DataType> dataTypes() {
}

@Override
public DataType fromEs(String typeName) {
public DataType fromEs(String typeName, TimeSeriesParams.MetricType metricType) {
return DataTypes.fromEs(typeName);
}

Expand Down
Loading