PromQL query returns unrelated series #865

Closed
sboschman opened this issue Aug 29, 2018 · 19 comments

@sboschman
Contributor

sboschman commented Aug 29, 2018

As mentioned on Gitter, I'm running into a strange issue where queries against M3DB return unrelated series. This results in Grafana throwing errors like "many-to-many matching not allowed" or "Multiple Series Error" on, for example, the default Prometheus node-exporter dashboard.


Prometheus: 2.3.2
M3Coordinator: 0.4.1 (running on the same host as prometheus)
M3DB: 0.4.1 (4 node cluster, 2 replicasets of 2 nodes each with 3 seednodes)
Prometheus config: read_recent: true

PromQL (latest value):

up{job="node-exporter",` instance="m3db-node04:9100"}

Result (omitted the values):

up{instance="m3db-node04:9100",job="node-exporter"}

PromQL (1 day window):

up{job="node-exporter",` instance="m3db-node04:9100"}[1d]

Result:

up{instance="m3db-node04:9100",job="node-exporter"}
node_memory_CmaFree_bytes{instance="m3db-node03:9100",job="node-exporter"}

Didn't expect to get a 'node_memory_CmaFree_bytes' series; it doesn't even match the instance...


PromQL (2 day window):

up{job="node-exporter",` instance="m3db-node04:9100"}[2d]

Result:

    up{instance="m3db-node04:9100",job="node-exporter"}
    node_network_transmit_packets_total{device="lo",instance="m3db-node04:9100",job="node-exporter"}
    node_memory_CmaFree_bytes{instance="m3db-node03:9100",job="node-exporter"}
    go_memstats_alloc_bytes{instance="m3db-node03:9100",job="node-exporter"}

These extra series have values up to 'now', so they are not old series or something.


PromQL (1 week window):
A total of 6 series show up.


PromQL (2 week window):

up

Result:
All the 'up' series show up (with different instance tags, of course), but no other series.


I have two Prometheus instances scraping the same nodes. On the 2nd instance I changed the config to read_recent: false. The 2nd instance only shows the requested series, as expected.

@richardartoul
Contributor

@nikunjgit @robskillington Does one of you have an idea what might be going on here? I'm not really familiar with the Prometheus stuff

@robskillington
Collaborator

I'll look into this, might have something to do with the matchers and special characters (wonder if we're specifying regexp instead of exact match for some of the matchers we're sending to the index).
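
To make that distinction concrete, here is a tiny self-contained Go illustration (standard library only, not the actual matcher code in m3coordinator/m3db) of why an exact-term match and an unanchored regexp match can select very different series:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	names := []string{"up", "node_memory_CmaFree_bytes", "group_up_total"}

	// Exact-term match: only the series whose __name__ is exactly "up".
	for _, n := range names {
		fmt.Printf("term  %-28s %v\n", n, n == "up")
	}

	// Unanchored regexp match: "up" also matches any name merely containing "up".
	re := regexp.MustCompile("up")
	for _, n := range names {
		fmt.Printf("regex %-28s %v\n", n, re.MatchString(n))
	}
}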

@sboschman
Contributor Author

sboschman commented Aug 31, 2018

M3DB 0.4.2

I finally managed to query M3DB directly on the httpClusterListenAddress (port 9003):

curl http://localhost:9003/query -sS -X POST -d '{
  "namespace": "metrics",
  "query": {
     "conjunction": {
       "queries": [
          { "term": { "field": "__name__", "term": "up" } },
          { "term": { "field": "job", "term": "node-exporter" } },
          { "term": { "field": "instance", "term": "m3db-node01:9100" } }
       ]
     }      
  },
  "noData": true,
  "rangeStart": 0,
  "rangeEnd":'"$(date +"%s")"'
}' | jq .

I would expect 1 result, not 2:

{
  "results": [
    {
      "id": "__name__=up,instance=m3db-node01:9100,job=node-exporter,",
      "tags": [
        {
          "name": "job",
          "value": "node-exporter"
        },
        {
          "name": "__name__",
          "value": "up"
        },
        {
          "name": "instance",
          "value": "m3db-node01:9100"
        }
      ],
      "datapoints": null
    },
    {
      "id": "__name__=up,instance=prometheus02:9100,job=node-exporter,",
      "tags": [
        {
          "name": "__name__",
          "value": "up"
        },
        {
          "name": "instance",
          "value": "prometheus02:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        }
      ],
      "datapoints": null
    }
  ],
  "exhaustive": true
}

@sboschman
Contributor Author

Most likely I am not translating the PromQL query exactly the same way m3coordinator does, but here is another finding:

PromQL

node_memory_SwapTotal_bytes{instance="m3db-node01:9100"}[2d]

Result:

    node_xfs_allocation_btree_records_deleted_total{device="sda1",instance="m3db-node01:9100",job="node-exporter"}
    node_memory_SwapTotal_bytes{instance="m3db-node01:9100",job="node-exporter"}
    node_disk_io_now{device="sda",instance="m3db-node01:9100",job="node-exporter"}
    go_memstats_mallocs_total{instance="m3db-node01:9100",job="node-exporter"}

m3ql query:

curl http://localhost:9003/query -sS -X POST -d '{
  "namespace": "metrics",
  "query": {
     "conjunction": {
       "queries": [
          { "term": { "field": "__name__", "term": "node_memory_SwapTotal_bytes" } },
          { "term": { "field": "instance", "term": "m3db-node01:9100" } }
       ]
     }      
  },
  "noData": true,
  "rangeStart": '"$(date -d "2 days ago" +"%s")"',
  "rangeEnd":'"$(date +"%s")"'
}' | jq .

Result:

{
  "results": [
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=alertmanager01:9100,job=node-exporter,",
      "tags": [
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        },
        {
          "name": "instance",
          "value": "alertmanager01:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        }
      ],
      "datapoints": null
    },
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=m3db-node01:9100,job=node-exporter,",
      "tags": [
        {
          "name": "instance",
          "value": "m3db-node01:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        },
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        }
      ],
      "datapoints": null
    },
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=m3db-node03:9100,job=node-exporter,",
      "tags": [
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        },
        {
          "name": "instance",
          "value": "m3db-node03:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        },
      ],
      "datapoints": null
    }
  ],
  "exhaustive": true
}

So, whereas m3coordinator sometimes returns entirely unrelated series, m3db itself returns the correct metric but for field values (instances) it shouldn't.

@sboschman
Contributor Author

Did some digging around and it seems the order of query terms is relevant as well. Probably explains the difference in my previous post between query from Prometheus and my direct m3db query.

Query term order instance - name:

curl http://localhost:9003/query -sS -X POST -d '{
  "namespace": "metrics",
  "query": {
     "conjunction": {
       "queries": [
         { "term": { "field": "instance", "term": "m3db-node01:9100" } },
         { "term": { "field": "__name__", "term": "node_memory_SwapTotal_bytes" } }
       ]
     }      
  },
  "noData": true,
  "rangeStart": '"$(date -d "2 days ago" +"%s")"',
  "rangeEnd":'"$(date +"%s")"'
}' | jq .

Result:

{
  "results": [
    {
      "id": "__name__=node_cpu_seconds_total,cpu=1,instance=m3db-node01:9100,job=node-exporter,mode=system,",
      "tags": [
        {
          "name": "cpu",
          "value": "1"
        },
        {
          "name": "instance",
          "value": "m3db-node01:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        },
        {
          "name": "mode",
          "value": "system"
        },
        {
          "name": "__name__",
          "value": "node_cpu_seconds_total"
        }
      ],
      "datapoints": null
    },
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=m3db-node01:9100,job=node-exporter,",
      "tags": [
        {
          "name": "job",
          "value": "node-exporter"
        },
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        },
        {
          "name": "instance",
          "value": "m3db-node01:9100"
        }
      ],
      "datapoints": null
    }
  ],
  "exhaustive": true
}

Query term order name - instance:

curl http://localhost:9003/query -sS -X POST -d '{
  "namespace": "metrics",
  "query": {
     "conjunction": {
       "queries": [
         { "term": { "field": "__name__", "term": "node_memory_SwapTotal_bytes" } },
         { "term": { "field": "instance", "term": "m3db-node01:9100" } }
       ]
     }
  },
  "noData": true,
  "rangeStart": '"$(date -d "2 days ago" +"%s")"',
  "rangeEnd":'"$(date +"%s")"'
}' | jq .

I would expect to get the same results...

{
  "results": [
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=alertmanager03:9100,job=node-exporter,",
      "tags": [
        {
          "name": "instance",
          "value": "alertmanager03:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        },
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        }
      ],
      "datapoints": null
    },
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=m3db-node01:9100,job=node-exporter,",
      "tags": [
        {
          "name": "job",
          "value": "node-exporter"
        },
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        },
        {
          "name": "instance",
          "value": "m3db-node01:9100"
        }
      ],
      "datapoints": null
    },
    {
      "id": "__name__=node_memory_SwapTotal_bytes,instance=prometheus01:9100,job=node-exporter,",
      "tags": [
        {
          "name": "__name__",
          "value": "node_memory_SwapTotal_bytes"
        },
        {
          "name": "instance",
          "value": "prometheus01:9100"
        },
        {
          "name": "job",
          "value": "node-exporter"
        }
      ],
      "datapoints": null
    }
  ],
  "exhaustive": true
}

@prateek
Collaborator

prateek commented Sep 5, 2018

@sboschman thanks for the detailed report! I'll dig into this today.

@prateek
Collaborator

prateek commented Sep 5, 2018

@sboschman #860 addressed the m3coordinator-related portion, and #883 addresses the m3db bits. There are more details in the PRs if you're interested in the root cause.

I'll land #883 tomorrow and cut a release after.

@prateek
Collaborator

prateek commented Sep 6, 2018

@sboschman I cut release v0.4.3 with all the fixes. Please give it a shot.

@sboschman
Contributor Author

Upgraded everything to 0.4.3, and the Grafana dashboard for the Prometheus node-exporter looks a lot better. The multiple series error is gone. 👍

Just noticed that sometimes a Grafana graph does not render, giving an error like the one below (but after a page refresh it is okay again):

Invalid label name: ��name&promhttp_metric_handler_requests_

The m3coordinator log contains the following errors (no idea if they are related, though):

Sep 06 13:51:36 prometheus01 m3coordinator[59672]: 2018-09-06T13:51:36.007Z        INFO        finished handling request        {"rqID": "6da834ee-bbcb-40cf-b93a-574ffb91747d", "time": "2018-09-06T13:51:36.007Z", "response": "1.05089285s", "url": "/api/v1/prom/remote/read"}
Sep 06 13:53:00 prometheus01 m3coordinator[59672]: 2018-09-06T13:53:00.916Z        ERROR        unable to fetch data        {"rqID": "758514d0-00e8-4712-8c2f-8d3c66919519", "error": "zero width assertions not allowed"}
Sep 06 13:53:04 prometheus01 m3coordinator[59672]: 2018-09-06T13:53:04.460Z        ERROR        unable to fetch data        {"rqID": "f598ae07-7840-4491-a1f4-15458ad448f4", "error": "zero width assertions not allowed"}
Sep 06 13:53:06 prometheus01 m3coordinator[59672]: 2018-09-06T13:53:06.422Z        ERROR        unable to fetch data        {"rqID": "51dd2666-0dd9-49a2-8303-e83795c9b552", "error": "zero width assertions not allowed"}
Sep 06 13:53:39 prometheus01 m3coordinator[59672]: 2018-09-06T13:53:39.569Z        ERROR        unable to fetch data        {"rqID": "1d433827-8085-4a80-8d3d-b0df977534fb", "error": "zero width assertions not allowed"}
Sep 06 13:57:54 prometheus01 m3coordinator[59672]: http: multiple response.WriteHeader calls
Sep 06 13:57:54 prometheus01 m3coordinator[59672]: 2018-09-06T13:57:54.604Z        ERROR        unable to encode read results to snappy        {"rqID": "c43ec429-63e1-449f-a925-c5f0c754fc90", "err": "write tcp [::1]:7201->[::1]:44064: write: broken pipe"}
Sep 06 13:57:56 prometheus01 m3coordinator[59672]: 2018-09-06T13:57:56.364Z        INFO        finished handling request        {"rqID": "4818d464-5742-481c-b0c7-6bd0aa379bd2", "time": "2018-09-06T13:57:56.363Z", "response": "1.715677465s", "url": "/api/v1/prom/remote/read"}
Sep 06 13:57:57 prometheus01 m3coordinator[59672]: 2018-09-06T13:57:57.939Z        INFO        finished handling request        {"rqID": "43d68b1d-4bcb-4009-950c-a0bf411bf60f", "time": "2018-09-06T13:57:57.939Z", "response": "1.077992195s", "url": "/api/v1/prom/remote/read"}
Sep 06 13:58:00 prometheus01 m3coordinator[59672]: 2018-09-06T13:58:00.522Z        INFO        finished handling request        {"rqID": "f234965b-2b03-4723-98ed-be9e0a5e1c52", "time": "2018-09-06T13:58:00.522Z", "response": "1.214589779s", "url": "/api/v1/prom/remote/read"}
Sep 06 13:58:02 prometheus01 m3coordinator[59672]: 2018-09-06T13:58:02.033Z        INFO        finished handling request        {"rqID": "c5347f88-d94d-46e0-8cde-7e55302a1965", "time": "2018-09-06T13:58:02.032Z", "response": "1.035613288s", "url": "/api/v1/prom/remote/read"}
Sep 06 14:00:14 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:14.123Z        INFO        finished handling request        {"rqID": "32f0748c-0fc8-4c9c-b1c5-e06bbbef78b5", "time": "2018-09-06T14:00:14.123Z", "response": "1.176064635s", "url": "/api/v1/prom/remote/read"}
Sep 06 14:00:14 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:14.288Z        ERROR        unable to fetch data        {"rqID": "40eca335-6fdd-4cc0-8197-f3253e833f67", "error": "no clusters can fulfill query"}
Sep 06 14:00:15 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:15.916Z        WARN        connection closed by client        {"rqID": "a35fc94b-fd6e-4282-b9b7-8734f82d8bec"}
Sep 06 14:00:16 prometheus01 m3coordinator[59672]: http: multiple response.WriteHeader calls
Sep 06 14:00:16 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:16.167Z        ERROR        unable to encode read results to snappy        {"rqID": "a35fc94b-fd6e-4282-b9b7-8734f82d8bec", "err": "write tcp [::1]:7201->[::1]:44472: write: broken pipe"}
Sep 06 14:00:16 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:16.167Z        INFO        finished handling request        {"rqID": "a35fc94b-fd6e-4282-b9b7-8734f82d8bec", "time": "2018-09-06T14:00:16.167Z", "response": "1.06986071s", "url": "/api/v1/prom/remote/read"}
Sep 06 14:00:17 prometheus01 m3coordinator[59672]: http: multiple response.WriteHeader calls
Sep 06 14:00:17 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:17.408Z        ERROR        unable to encode read results to snappy        {"rqID": "3ce75095-80a8-488c-adab-8329b20001ce", "err": "write tcp [::1]:7201->[::1]:43382: write: broken pipe"}
Sep 06 14:00:17 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:17.408Z        INFO        finished handling request        {"rqID": "3ce75095-80a8-488c-adab-8329b20001ce", "time": "2018-09-06T14:00:17.408Z", "response": "1.397279004s", "url": "/api/v1/prom/remote/read"}
Sep 06 14:00:17 prometheus01 m3coordinator[59672]: http: multiple response.WriteHeader calls
Sep 06 14:00:17 prometheus01 m3coordinator[59672]: 2018-09-06T14:00:17.411Z        ERROR        unable to encode read results to snappy        {"rqID": "28cb581b-8002-450d-b255-97a378ec5765", "err": "write tcp [::1]:7201->[::1]:44468: write: broken pipe"}

Don't have time next week to look into this, so might take a while before I can add some more info.

@prateek
Collaborator

prateek commented Sep 6, 2018

@sboschman cheers, thanks for the update.

Re the log messages about zero width assertions not allowed: that issue is addressed in #885. Will cut a release once that PR lands.
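
For context on that error message: "zero width assertions" are regexp constructs that match a position rather than any characters (^, $, \b, \B). Below is a small Go sketch using only the standard regexp/syntax package to detect them; it illustrates the concept, it is not how m3 validates queries internally:

package main

import (
	"fmt"
	"regexp/syntax"
)

// hasZeroWidth reports whether the parsed regexp contains any
// zero-width (position-matching) operation such as ^, $ or \b.
func hasZeroWidth(re *syntax.Regexp) bool {
	switch re.Op {
	case syntax.OpBeginLine, syntax.OpEndLine,
		syntax.OpBeginText, syntax.OpEndText,
		syntax.OpWordBoundary, syntax.OpNoWordBoundary:
		return true
	}
	for _, sub := range re.Sub {
		if hasZeroWidth(sub) {
			return true
		}
	}
	return false
}

func main() {
	for _, pattern := range []string{"node_.*", "^node_.*$"} {
		re, err := syntax.Parse(pattern, syntax.Perl)
		if err != nil {
			panic(err)
		}
		fmt.Println(pattern, "contains zero-width assertions:", hasZeroWidth(re))
	}
}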

I'm unsure about the other issues at the moment. Will dig around.

Re: "Invalid label name" - @nikunjgit could this be related to the tag translation coordinator does?

@sboschman
Contributor Author

@prateek About the invalid label name, are you referring to the flow including the function 'FromM3IdentToMetric' in index.go for "tag translation"? Did some poor man's debugging and the raw bytes for the ID of FetchTaggedIDResult_ (rpc.go) look incorrect.

The ID should be like '__name__=...', but sometimes it contains gibberish like 'se_size_bytes_bucket,handler=/query,instance=prometheus01,job=no'.

So I suspect the issue is more m3db-related than coordinator-related, assuming FetchTaggedIDResult_ contains what goes over the wire.
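
To make that hunch concrete, here is a toy Go sketch (not m3db code; all metric names and the layout are hypothetical) of how slicing IDs out of a shared or reused buffer with offsets from the wrong request yields exactly this kind of mid-string gibberish:

package main

import "fmt"

func main() {
	// Two serialized series IDs packed back to back in one shared buffer
	// (hypothetical names and layout, just for the demo).
	id1 := []byte("__name__=some_http_response_size_bytes_bucket,handler=/query,instance=host01:9100,job=node-exporter,")
	id2 := []byte("__name__=up,instance=m3db-node01:9100,job=node-exporter,")
	buf := append(append([]byte{}, id1...), id2...)

	// Correct offset/length: the second ID decodes cleanly.
	fmt.Println(string(buf[len(id1):]))

	// Stale offset (e.g. left over from a different, concurrent request):
	// the "ID" is just a window into unrelated bytes, similar to the
	// gibberish pasted above.
	fmt.Println(string(buf[19 : 19+len(id2)]))
}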

@prateek
Collaborator

prateek commented Sep 21, 2018

@sboschman Not exactly. At write time, m3coordinator generates an ID for a given Prom metric as the concatenation of all its tag name/value pairs (after sorting), link to the code. So you'd expect to see IDs of the form name1=value1;name2=value...
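
A minimal sketch of that ID scheme, for illustration only (this is not the linked m3coordinator code; it just mirrors the shape of the IDs visible in the query results above):

package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesID sorts the tag names and concatenates name=value pairs,
// matching the "__name__=up,instance=...,job=...," shape seen above.
func seriesID(tags map[string]string) string {
	names := make([]string, 0, len(tags))
	for n := range tags {
		names = append(names, n)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, n := range names {
		b.WriteString(n)
		b.WriteString("=")
		b.WriteString(tags[n])
		b.WriteString(",")
	}
	return b.String()
}

func main() {
	fmt.Println(seriesID(map[string]string{
		"__name__": "up",
		"instance": "m3db-node01:9100",
		"job":      "node-exporter",
	}))
	// Output: __name__=up,instance=m3db-node01:9100,job=node-exporter,
}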

Re: FetchTaggedIDResult_, yeah, it's where you'd expect to see the data returned over the wire.

Mind pasting the entire ID you think is corrupted?

@matejzero
Contributor

matejzero commented Sep 22, 2018

I think I have the same problem regarding invalid metric names.

This problem is easiest to reproduce on the Node exporter dashboard I got from grafana.org. When I load it, I get a lot of empty panels with "!" in the upper left corner. When I hover over it, there is an error about an invalid label name, but the name has nothing to do with the panel that should be shown.

For example, I have a CPU busy singlestat with the following query: (((count(count(node_cpu_seconds_total{instance=~"$instance",job=~"node"}) by (cpu))) - avg(sum by (mode)(irate(node_cpu_seconds_total{mode='idle',instance=~"$instance",job=~"node"}[5m])))) * 100) / count(count(node_cpu_seconds_total{instance=~"$instance",job=~"node"}) by (cpu))

The error I get for this panel is: Invalid label name: �id�52�instance�host.exmaple.tld�job�ipmi�name�Drive 7�type Drive Slot��name=fortiwlc_client

The correct query would be ipmi_sensor_state{id="52",instance="host.example.tld",job="ipmi",name="Drive 7",type="Drive Slot"}; fortiwlc_client is a whole different metric from another job (ipmi vs fortiwlc).

Another example, from the Uptime singlestat panel:
Error: Invalid label name: �write�origin�replication_agentj__name__=node_disk_io_time_weighted_seconds_total,device=dm-28,env=prod,instan
Correct: node_disk_io_time_weighted_seconds_total{device="dm-28",env="prod",instance="host.example.tld",job="node"}

Errors on the Prometheus side (they were taken later than the errors above, so they are not directly connected):

Sep 23 17:36:35 host1.example.tlc prometheus[25660]: level=error ts=2018-09-23T15:36:35.148880535Z caller=engine.go:504 component="query engine" msg="error expanding series set" err="Invalid label name: \u0003env\u0004prod\u000benvironment\u000esap_production\u0004host\u000chost.example.tld\u0008instance\u0010host2.example.tld\u0003job\u0004node\u0004name\u0007Yumrep"
Sep 23 17:36:35 host1.example.tlc prometheus[25660]: level=error ts=2018-09-23T15:36:35.157641191Z caller=engine.go:504 component="query engine" msg="error expanding series set" err="Invalid label name: \u0008instance\rhost3.example.tld\u0003job\u0005mysql\u0006schema\twordpress\u0005table\u001bwp_12993_term_relationships\ufffd\u0001__name__=mysql"

Refreshing the dashboard via the button in the upper right corner usually renders all the panels successfully. If not, another 1 or 2 clicks help.

[screenshot, 2018-09-22: Grafana dashboard with multiple panels failing to render]

@sboschman
Contributor Author

Did some debugging and this is what I have found so far (after digging through m3coordinator and realizing the data received was faulty):

Looks like the document contains a lot of fields (100+ instead of the normal <10) when it gets added to the results (https://github.com/m3db/m3/blob/master/src/dbnode/storage/index/results.go#L73). The field name/value pairs in that case are pretty much 'garbage' and correspond to what you see in m3coordinator and Grafana.

So the next thing I want to look into is the offset used to read a document from the raw data. If the offset is wrong, I can imagine it ending up with 100+ fields and the name/values being garbage.

Disclaimer: Just started with go, so baby steps at a time here.

@prateek
Collaborator

prateek commented Sep 25, 2018

h/t to @sboschman for all the debugging; he found a race in the code that would cause this behaviour. Addressed in #938.
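
For readers landing here later, #938 has the actual details; the sketch below only shows the general shape of this class of bug (concurrent workers adding decoded documents to shared results without synchronization) using hypothetical types, not the real m3db code:

package main

import "sync"

// document is a stand-in for a decoded index document (hypothetical type).
type document struct {
	id     string
	fields map[string]string
}

// results collects documents produced by concurrent index query workers.
type results struct {
	mu   sync.Mutex
	docs []document
}

// add serializes access to the shared slice; without the lock, concurrent
// appends race and can interleave/corrupt what each worker wrote.
func (r *results) add(d document) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.docs = append(r.docs, d)
}

func main() {
	r := &results{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.add(document{id: "series", fields: map[string]string{"job": "node-exporter"}})
		}()
	}
	wg.Wait()
}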

@prateek
Collaborator

prateek commented Sep 25, 2018

@sboschman @matejzero landed a fix for the issue in #938. Please give it a shot when you get a chance (v0.4.5: https://github.com/m3db/m3/releases/tag/v0.4.5)

@sboschman
Contributor Author

LGTM, been clicking and refreshing for a while now and haven't spotted any failed graphs yet.

So closing this issue, thanks for the fixes!

@matejzero
Contributor

Good on my end too.

@prateek
Collaborator

prateek commented Sep 25, 2018

Glad to hear it! Thanks again @sboschman & @matejzero!
