Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ES|QL does not consistently handle field names starting with a number #100312

Closed
VimCommando opened this issue Oct 5, 2023 · 2 comments · Fixed by #100740
Closed

ES|QL does not consistently handle field names starting with a number #100312

VimCommando opened this issue Oct 5, 2023 · 2 comments · Fixed by #100740
Assignees
Labels
:Analytics/ES|QL AKA ESQL >bug Team:QL (Deprecated) Meta label for query languages team
Milestone

Comments

@VimCommando
Copy link
Contributor

Elasticsearch Version

8.11.0-SNAPSHOT

Installed Plugins

No response

Java Version

bundled

OS Version

Darwin MacBook-Pro.local 22.6.0 Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:28 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T6000 arm64

Problem Description

If a field name starts with a leading digit, ES|QL commands sometimes accept it and other times error on it.

I expect the field name to be handled consistently between all commands.

For a real-world example, take Elasticsearch's own stack monitoring:

node_stats.os.cpu.load_average.1m
elasticsearch.node.stats.os.cpu.load_avg.1m

The child field 1m is what causes errors.

Steps to Reproduce

Take a mapping that uses fields with leading numbers. Elasticsearch's stack monitoring has *.load_average.1m as a good starting point.

SORT

Use a base query to see the data does exist, and ES|QL can retrieve it:

from .monitoring-es-8-*
| where metricset.name == "node_stats"
| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m

image

Now add a sort clause, and see it error on the .1 portion of the field name:

from .monitoring-es-8-*
| where metricset.name == "node_stats"
| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m
| sort elasticsearch.node.stats.os.cpu.load_avg.1m

image

STATS ... BY

Use stats on node_stats.process.cpu.percent and it works:

from .monitoring-es-8-*
| where metricset.name == "node_stats"
| keep source_node.name, node_stats.process.cpu.percent
| stats avg_cpu = avg(node_stats.process.cpu.percent) by source_node.name

image

Use stats on elasticsearch.node.stats.os.cpu.load_avg.1m and it fails:

from .monitoring-es-8-*
| where metricset.name == "node_stats"
| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m
| stats avg_cpu = avg(elasticsearch.node.stats.os.cpu.load_avg.1m) by source_node.name

image

RENAME

If you first rename the field, sort and stats work:

from .monitoring-es-8-*
| where metricset.name == "node_stats"
| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m
| rename elasticsearch.node.stats.os.cpu.load_avg.1m as load_average
| stats max_load_avg = max(load_average) by source_node.name
| sort max_load_avg desc

image

Autocomplete

The field also breaks autocomplete. If you use autocomplete before the numerical field it works:

image

But after the field, it fails:

image

Logs (if relevant)

Logs for ailure on sort:

2023-10-04 16:47:13 {"@timestamp":"2023-10-05T02:47:13.482Z", "log.level":"DEBUG", "message":"Parsing as statement: from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| sort elasticsearch.node.stats.os.cpu.load_avg.1m", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#5]","log.logger":"org.elasticsearch.xpack.esql.parser.EsqlParser","trace.id":"8ce5c7c44f33d01f60397e2626bbc588","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:47:13 {"@timestamp":"2023-10-05T02:47:13.483Z", "log.level": "INFO", "message":"Failed execution of ESQL query.\nQuery string: [from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| sort elasticsearch.node.stats.os.cpu.load_avg.1m]\nExecution time: [1]ms", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#5]","log.logger":"org.elasticsearch.xpack.esql.action.EsqlResponseListener","trace.id":"8ce5c7c44f33d01f60397e2626bbc588","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}

Logs for failure on stats ... by:

2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.287Z", "log.level":"DEBUG", "message":"ESQL query:\nfrom .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#4]","log.logger":"org.elasticsearch.xpack.esql.session.EsqlSession","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.287Z", "log.level":"DEBUG", "message":"Parsing as statement: from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#4]","log.logger":"org.elasticsearch.xpack.esql.parser.EsqlParser","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.287Z", "log.level": "INFO", "message":"Beginning execution of ESQL query.\nQuery string: [from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][transport_worker][T#2]","log.logger":"org.elasticsearch.xpack.esql.action.RestEsqlQueryAction","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.287Z", "log.level":"DEBUG", "message":"ESQL query:\nfrom .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#3]","log.logger":"org.elasticsearch.xpack.esql.session.EsqlSession","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.288Z", "log.level":"DEBUG", "message":"Parsing as statement: from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#3]","log.logger":"org.elasticsearch.xpack.esql.parser.EsqlParser","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.288Z", "log.level": "INFO", "message":"Failed execution of ESQL query.\nQuery string: [from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name]\nExecution time: [1]ms", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#4]","log.logger":"org.elasticsearch.xpack.esql.action.EsqlResponseListener","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
2023-10-04 16:49:08 {"@timestamp":"2023-10-05T02:49:08.289Z", "log.level": "INFO", "message":"Failed execution of ESQL query.\nQuery string: [from .monitoring-es-8-*\n| where metricset.name == \"node_stats\"\n| keep source_node.name, elasticsearch.node.stats.os.cpu.load_avg.1m\n| stats elasticsearch.node.stats.os.cpu.load_avg.1m by source_node.name]\nExecution time: [1]ms", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[esql-data][esql][T#3]","log.logger":"org.elasticsearch.xpack.esql.action.EsqlResponseListener","trace.id":"9281898aba91c07f83cefe03ed0f8e74","elasticsearch.cluster.uuid":"Z_D3SfJYRfKxdyJ8sQnLPA","elasticsearch.node.id":"7P6LpFHCQXa83mrNEW1X4Q","elasticsearch.node.name":"esql-data","elasticsearch.cluster.name":"esql"}
@VimCommando VimCommando added >bug needs:triage Requires assignment of a team area label :Analytics/ES|QL AKA ESQL labels Oct 5, 2023
@elasticsearchmachine elasticsearchmachine added Team:QL (Deprecated) Meta label for query languages team and removed needs:triage Requires assignment of a team area label labels Oct 5, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@astefan astefan self-assigned this Oct 11, 2023
costin added a commit to costin/elasticsearch that referenced this issue Oct 12, 2023
Extend the unquoted identifier to contain . not just numbers. Without it
 the lexer picks the characters as decimal literal which leads to errors

Fix elastic#100312
@astefan astefan assigned costin and unassigned astefan Oct 12, 2023
@costin costin added this to the 8.12 milestone Nov 16, 2023
costin added a commit that referenced this issue Dec 12, 2023
Extend the unquoted identifier to contain . not just numbers. Without it
 the lexer picks the characters as decimal literal which leads to errors
Additionally fix a bug in quoting identifiers.

Fix #100312
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >bug Team:QL (Deprecated) Meta label for query languages team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants