Feature/sp prometheus remote rw #1070

Merged
merged 8 commits into from
Feb 19, 2021

Conversation

XSHui
Contributor

@XSHui XSHui commented Jan 15, 2021

What problem does this PR solve?

close #1063
Support writing Prometheus data to remote storage.

What is changed and how it works?

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
tiup cluster edit-config test_tidb_cluster
monitoring_servers:
- host: 10.4.45.162
  ssh_port: 22
  imported: true
  port: 9090
  deploy_dir: /data/deploy
  data_dir: /data/deploy/prometheus2.0.0.data.metrics
  log_dir: /data/deploy/log
  remote_config:
    remote_write:
    - queue_config:
        batch_send_deadline: 5m
        capacity: 100000
        max_samples_per_send: 10000
        max_shards: 300
      url: http://127.0.0.1:8003/write
    remote_read:
    - url: http://127.0.0.1:8003/read
tiup cluster reload test_tidb_cluster -R prometheus
  • No code
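
The manual-test configuration above nests remote_write and remote_read lists under remote_config. As a rough illustration only, that shape could be mirrored in Go as below. The names remoteConfig and checkURLs are hypothetical and are not taken from this PR; the check relies on the Prometheus rule that every remote_write and remote_read entry needs a url.

```go
package main

import (
	"fmt"
)

// remoteConfig is a hypothetical mirror of the remote_config block.
// Entries stay free-form maps so any Prometheus option can be carried.
type remoteConfig struct {
	RemoteWrite []map[string]interface{}
	RemoteRead  []map[string]interface{}
}

// checkURLs reports the first entry that lacks a non-empty string "url".
func checkURLs(c remoteConfig) error {
	sections := []struct {
		name    string
		entries []map[string]interface{}
	}{
		{"remote_write", c.RemoteWrite},
		{"remote_read", c.RemoteRead},
	}
	for _, s := range sections {
		for i, e := range s.entries {
			u, ok := e["url"].(string)
			if !ok || u == "" {
				return fmt.Errorf("%s[%d]: missing url", s.name, i)
			}
		}
	}
	return nil
}

func main() {
	cfg := remoteConfig{
		RemoteWrite: []map[string]interface{}{{
			"url": "http://127.0.0.1:8003/write",
			"queue_config": map[string]interface{}{
				"capacity":   100000,
				"max_shards": 300,
			},
		}},
		RemoteRead: []map[string]interface{}{{"url": "http://127.0.0.1:8003/read"}},
	}
	fmt.Println(checkURLs(cfg)) // prints <nil> when every entry has a url
}
```

Keeping each entry as a map rather than a fixed struct is one way to let users pass through options such as queue_config without the tool having to model every field.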

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release notes:

NONE

@CLAassistant

CLAassistant commented Jan 15, 2021

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 15, 2021
@codecov-io

codecov-io commented Jan 15, 2021

Codecov Report

Merging #1070 (174316a) into master (8ecc546) will decrease coverage by 4.04%.
The diff coverage is 73.33%.

@@            Coverage Diff             @@
##           master    #1070      +/-   ##
==========================================
- Coverage   53.52%   49.48%   -4.05%     
==========================================
  Files         285      285              
  Lines       20258    20272      +14     
==========================================
- Hits        10843    10031     -812     
- Misses       7750     8686     +936     
+ Partials     1665     1555     -110     
Flag Coverage Δ
cluster 38.96% <46.66%> (-5.93%) ⬇️
dm 25.54% <46.66%> (+0.11%) ⬆️
integrate 43.67% <46.66%> (-4.23%) ⬇️
playground 2.93% <ø> (ø)
tiup 16.36% <6.66%> (-0.05%) ⬇️
unittest 22.96% <38.46%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/cluster/spec/prometheus.go 80.72% <50.00%> (-0.76%) ⬇️
pkg/cluster/spec/server_config.go 68.15% <75.00%> (+0.36%) ⬆️
pkg/cluster/embed/autogen_pkger.go 100.00% <100.00%> (ø)
pkg/cluster/template/config/prometheus.go 72.30% <100.00%> (+0.87%) ⬆️
pkg/queue/any_queue.go 0.00% <0.00%> (-83.34%) ⬇️
pkg/cluster/task/limits.go 0.00% <0.00%> (-68.75%) ⬇️
pkg/cluster/task/sysctl.go 0.00% <0.00%> (-66.67%) ⬇️
components/cluster/command/check.go 5.97% <0.00%> (-63.86%) ⬇️
pkg/cluster/task/copy_file.go 0.00% <0.00%> (-54.55%) ⬇️
components/cluster/command/audit.go 27.27% <0.00%> (-54.55%) ⬇️
... and 43 more

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8ecc546...174316a.

@AstroProfundis
Contributor

I'd like Prometheus' config to be fully exported, as is done for tidb/tikv/pd, but I'm also OK with the current approach.

@ti-chi-bot ti-chi-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 25, 2021
@lucklove
Member

And please add some tests.

@ti-chi-bot ti-chi-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 25, 2021
@ti-chi-bot ti-chi-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 27, 2021
@lucklove
Member

lucklove commented Jan 27, 2021

I'm afraid this will not work at all: https://play.golang.org/p/ZQRYbtvhtu0

The output is something like:

a: b
---
c: d
---
e: f
---
g: h
---
i: j
---
k: l

It's invalid.

Suggestion:

Maybe try this: https://play.golang.org/p/VWp4uf_go1V

@lucklove
Member

lucklove commented Jan 27, 2021

And please add a test in server_config_test.go, something like:

func (s *configSuite) TestEncodeRemoteCfg(c *check.C) {
	yamlData := []byte(`remote_write:
  remote_timeout: 30
  url: http://172.16.5.140:8808
  write_relabel_configs:
    source_labels:
    - label1
    - label2
`)

	bs, err := encodeRemoteCfg2Yaml(Remote{
		RemoteWrite: map[string]interface{}{
			"url":            "http://172.16.5.140:8808",
			"remote_timeout": 30,
			"write_relabel_configs": map[string]interface{}{
				"source_labels": []string{"label1", "label2"},
			},
		},
	})

	c.Assert(err, check.IsNil)
	c.Assert(bs, check.BytesEquals, yamlData)
}

@9547
Contributor

9547 commented Jan 27, 2021

> I'm afraid this will not work at all: https://play.golang.org/p/ZQRYbtvhtu0
>
> The output is something like:
>
> a: b
> ---
> c: d
> ---
> e: f
> ---
> g: h
> ---
> i: j
> ---
> k: l
>
> It's invalid.
>
> Suggestion:
>
> Maybe try this: https://play.golang.org/p/VWp4uf_go1V

Based on https://play.golang.org/p/VWp4uf_go1V, here is an example with remote_write as an array: https://play.golang.org/p/dM8bvnOotD3

remote_write:
- g: h
  i: j
  k: l
remote_read:
- a1: b
  c1: d
  e1: f
- a2: b
  c2: d
  e3: f

@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 28, 2021
@lucklove
Member

Thank you very much!

@lucklove
Member

/lgtm

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lucklove

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 28, 2021
@lucklove
Member

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 30a6d47

@XSHui
Contributor Author

XSHui commented Jan 29, 2021

> CLA assistant check
> Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
> 1 out of 2 committers have signed the CLA.
> ✅ ti-chi-bot
> ❌ 熊双辉
> 熊双辉 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
> You have signed the CLA already but the status is still pending? Let us recheck it.
>
> @XSHui It seems that you didn't sign the CLA, the bot can't merge this PR before you sign it.

signed

@9547
Contributor

9547 commented Jan 30, 2021

> CLA assistant check
> Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
> 1 out of 2 committers have signed the CLA.
> ✅ ti-chi-bot
> ❌ 熊双辉
> 熊双辉 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
> You have signed the CLA already but the status is still pending? Let us recheck it.
>
> @XSHui It seems that you didn't sign the CLA, the bot can't merge this PR before you sign it.
>
> signed

It seems the CLA check did not pass; you need to sign it with your GitHub name and email: `[email protected]`

@AstroProfundis AstroProfundis added the category/monitoring Categorizes issue or PR related to monitoring components. label Feb 2, 2021
@ti-chi-bot
Member

@XSHui: Your PR was out of date; I have automatically updated it for you.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Feb 3, 2021
熊双辉 and others added 3 commits February 4, 2021 21:00
add remote storage work

add more remote config

rollback makefile

fix ut

fix comments

fix ut

rm global prom cfg

rm prom cfg already in tmpl

rm marshal & unmarshal

rm dirty code

fix encode yaml failed

fix if condition
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 10, 2021
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 18, 2021
@XSHui
Contributor Author

XSHui commented Feb 18, 2021

@breeswish Could you please help me look into the CI task that was canceled?

@breezewish
Member

@XSHui I have no idea. It seems the TiDB process was killed, according to the log: 8197 tidb signal: killed. However, no log is printed in this task.

BTW, I have a question about this PR: why not use config? Then users could supply more kinds of configuration, such as customizing other scrape targets.

@XSHui
Contributor Author

XSHui commented Feb 18, 2021

> @XSHui I have no idea. It seems the TiDB process was killed, according to the log: 8197 tidb signal: killed. However, no log is printed in this task.
>
> BTW, I have a question about this PR: why not use config? Then users could supply more kinds of configuration, such as customizing other scrape targets.

Other Prometheus configuration is already set in tiup/templates/config/prometheus.yml.tpl

https://prometheus.io/docs/prometheus/latest/configuration/configuration/

@breezewish
Member

breezewish commented Feb 18, 2021

> @XSHui I have no idea. It seems the TiDB process was killed, according to the log: 8197 tidb signal: killed. However, no log is printed in this task.
> BTW, I have a question about this PR: why not use config? Then users could supply more kinds of configuration, such as customizing other scrape targets.
>
> Other Prometheus configuration is already set in tiup/templates/config/prometheus.yml.tpl
>
> https://prometheus.io/docs/prometheus/latest/configuration/configuration/

How about allowing users to combine these configurations? For example, a user could specify their own scrape targets without overwriting TiUP's built-in scrape targets; instead, the two sets would be merged.
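
One possible reading of this merge suggestion, sketched under stated assumptions: built-in jobs are always kept, user jobs are appended, and a user job whose job_name collides with a built-in one is rejected, since Prometheus requires job names to be unique. The scrapeJob type and mergeScrapeJobs function are hypothetical and do not appear in this PR; the thread does not settle on any particular policy.

```go
package main

import "fmt"

// scrapeJob is a hypothetical, reduced stand-in for a scrape_configs entry.
type scrapeJob struct {
	JobName string
	Targets []string
}

// mergeScrapeJobs keeps every built-in job and appends user jobs whose
// job_name is not already taken (an assumed policy, not the PR's).
func mergeScrapeJobs(builtin, user []scrapeJob) ([]scrapeJob, error) {
	seen := make(map[string]bool, len(builtin)+len(user))
	merged := make([]scrapeJob, 0, len(builtin)+len(user))
	for _, j := range builtin {
		seen[j.JobName] = true
		merged = append(merged, j)
	}
	for _, j := range user {
		if seen[j.JobName] {
			return nil, fmt.Errorf("job_name %q already defined", j.JobName)
		}
		seen[j.JobName] = true
		merged = append(merged, j)
	}
	return merged, nil
}

func main() {
	builtin := []scrapeJob{{JobName: "tidb", Targets: []string{"10.0.0.1:10080"}}}
	user := []scrapeJob{{JobName: "my_app", Targets: []string{"10.0.0.9:9100"}}}
	merged, err := mergeScrapeJobs(builtin, user)
	fmt.Println(len(merged), err) // 2 <nil>
}
```

Rejecting collisions rather than silently overriding is only one choice; letting the user entry win or merging target lists would be equally plausible designs.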

@XSHui
Contributor Author

XSHui commented Feb 18, 2021

> @XSHui I have no idea. It seems the TiDB process was killed, according to the log: 8197 tidb signal: killed. However, no log is printed in this task.
> BTW, I have a question about this PR: why not use config? Then users could supply more kinds of configuration, such as customizing other scrape targets.
>
> Other Prometheus configuration is already set in tiup/templates/config/prometheus.yml.tpl
> https://prometheus.io/docs/prometheus/latest/configuration/configuration/
>
> How about allowing users to combine these configurations? For example, a user could specify their own scrape targets without overwriting TiUP's built-in scrape targets; instead, the two sets would be merged.

May I implement this feature in a new branch?

@AstroProfundis
Contributor

The test_playground CI task should be fixed now.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 19, 2021
@ti-chi-bot ti-chi-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 19, 2021
@XSHui
Contributor Author

XSHui commented Feb 19, 2021

> The test_playground CI task should be fixed now.

How do I rerun the CI job integrate-cluster-scale / cluster (test_scale_core_tls) (pull_request)?

@AstroProfundis
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 8b7fcaa

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Feb 19, 2021
@ti-chi-bot ti-chi-bot merged commit b769ff8 into pingcap:master Feb 19, 2021
Labels
category/monitoring Categorizes issue or PR related to monitoring components. first-time-contributor size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT1 Indicates that a PR has LGTM 1.
Development

Successfully merging this pull request may close these issues.

write monitor data to remote storage
9 participants