Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tiup deploy should use given address by default instead of 0.0.0.0 #1082

Closed
BusyJay opened this issue Jan 20, 2021 · 9 comments
Closed

tiup deploy should use given address by default instead of 0.0.0.0 #1082

BusyJay opened this issue Jan 20, 2021 · 9 comments
Labels
status/TODO Categorizes issue as we will do it. type/bug Categorizes issue as related to a bug.

Comments

@BusyJay
Copy link
Contributor

BusyJay commented Jan 20, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?
    Using tiup to deploy a cluster.

  2. What did you expect to see?
    Everything works correctly.

  3. What did you see instead?
    There are warnings in logs:

[2021/01/20 16:34:38.757 +08:00] [WARN] [mod.rs:520] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/01/20 16:34:38.781 +08:00] [WARN] [mod.rs:520] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/01/20 16:34:38.799 +08:00] [WARN] [mod.rs:520] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/01/20 16:34:38.820 +08:00] [WARN] [mod.rs:520] ["failed to register addr to pd"] [body=Body(Streaming)] ["status code"=400]
[2021/01/20 16:34:38.820 +08:00] [WARN] [mod.rs:530] ["failed to register addr to pd after 5 tries"]

That's because tiup use 0.0.0.0 as status address for all nodes, and same address can't be registered multiple times. Actually I have never configure it to use 0.0.0.0, what I set to host is IP like 127.0.0.1. Tiup should utilize the IP instead.

  1. What version of TiUP are you using (tiup --version)?
v1.3.1 tiup
Go Version: go1.13
Git Branch: release-1.3
GitHash: d51bd0c
@BusyJay BusyJay added the type/bug Categorizes issue as related to a bug. label Jan 20, 2021
@lucklove
Copy link
Member

lucklove commented Jan 20, 2021

I deployed a cluster with the following config (with tidb v4.0.9), this issue not reproduced:

tidb_servers:
  - host: 172.16.5.140

pd_servers:
  - host: 172.16.5.140
    name: pd-1

tikv_servers:
  - host: 172.16.5.140
  - host: 172.16.5.139
  - host: 172.16.5.134

monitoring_servers:
  - host: 172.16.5.140

grafana_servers:
  - host: 172.16.5.140

alertmanager_servers:
  - host: 172.16.5.140

I notice that the generated tikv script is:

#!/bin/bash
set -e

# WARNING: This file was auto-generated. Do not edit!
#          All your edit might be overwritten!
cd "/home/tidb/deploy/tikv-20160" || exit 1

echo -n 'sync ... '
stat=$(time sync || sync)
echo ok
echo $stat
exec bin/tikv-server \
    --addr "0.0.0.0:20160" \
    --advertise-addr "172.16.5.140:20160" \
    --status-addr "0.0.0.0:20180" \
    --advertise-status-addr "172.16.5.140:20180" \
    --pd "172.16.5.140:2379" \
    --data-dir "/home/tidb/deploy/tikv-20160/data" \
    --config conf/tikv.toml \
    --log-file "/home/tidb/deploy/tikv-20160/log/tikv.log" 2>> "/home/tidb/deploy/tikv-20160/log/tikv_stderr.log"

Only --addr and --status-addr use 0.0.0.0, but I found that --addr is 0.0.0.0 from the ansible's time. The only difference is that the --status-addr is 0.0.0.0 for TiUP but not for ansible. So is there any reason that TiKV can't use 0.0.0.0 as status address even if advertise-status-addr is set correctly?

@BusyJay
Copy link
Contributor Author

BusyJay commented Jan 20, 2021

I didn't see advertise-status-addr from script:

exec bin/tikv-server \
    --addr "0.0.0.0:20160" \
    --advertise-addr "127.0.0.1:20160" \
    --status-addr "0.0.0.0:20180" \
    --pd "127.0.0.1:2379" \
    --data-dir "/home/tidb/deploy/tikv-20160/data" \
    --config conf/tikv.toml \
    --log-file "/home/tidb/deploy/tikv-20160/log/tikv.log" 2>> "/home/tidb/deploy/tikv-20160/log/tikv_stderr.log"

@lucklove
Copy link
Member

I didn't see advertise-status-addr from script:

exec bin/tikv-server \
    --addr "0.0.0.0:20160" \
    --advertise-addr "127.0.0.1:20160" \
    --status-addr "0.0.0.0:20180" \
    --pd "127.0.0.1:2379" \
    --data-dir "/home/tidb/deploy/tikv-20160/data" \
    --config conf/tikv.toml \
    --log-file "/home/tidb/deploy/tikv-20160/log/tikv.log" 2>> "/home/tidb/deploy/tikv-20160/log/tikv_stderr.log"

So what's the version of tidb you used?

@BusyJay
Copy link
Contributor Author

BusyJay commented Jan 20, 2021

Nightly, the hash of TiKV is 95951d04e6e8c0bb267c6a5591fb336ed5483b41, which is several days ago.

@lucklove
Copy link
Member

Nightly, the hash of TiKV is 95951d04e6e8c0bb267c6a5591fb336ed5483b41, which is several days ago.

The advertise-status-addr only apply on the version after v4.0.1, bug ignore nightly (because the nightly maybe deployed long long ago, where it is v3's time)

#951

@BusyJay
Copy link
Contributor Author

BusyJay commented Jan 21, 2021

I don't understand why utilizing advertise status port is wrong for cluster deployed long time ago. Shouldn't it decide its behavior on what exact version it is? And the cluster was deployed at the beginning of this year.

@lucklove
Copy link
Member

lucklove commented Jan 21, 2021

I don't understand why utilizing advertise status port is wrong for cluster deployed long time ago

TiKV supports --advertise-status-addr from v4.0.1, so TiUP only add this args for TiKV >= v4.0.1.

And nightly is a special version because we don't know if it supports this arg at all.

For example: if I ask, does TiKV nightly support pessimistic lock? I think your answer must be: for nightly released recently, yes, for nightly released two years ago, no.

However, it's also reasonable to assume the nightly must be today's nightly, but this may cause the nightly cluster deployed before can't work any more.

@BusyJay
Copy link
Contributor Author

BusyJay commented Jan 21, 2021

if I ask, does TiKV nightly support pessimistic lock?

For TiKV 5.0-rc and forward, pessimistic lock is supported. I think you miss the point that TiKV doesn't name the version as nightly, it name it follow semver spec. Nightly is a release channel, rather than version.

@lucklove
Copy link
Member

The nightly in TiUP is not a specific version, too. It's a cursor:

"nightly":"v5.0.0-nightly-20210121"

Maybe we can use the underlay version v5.0.0-nightly-20210121 after the user installed any nightly component. That's to say, we should implement nightly as an alias for v5.0.0-nightly-20210121. Then we can deal with this issue.

@lucklove lucklove added the status/TODO Categorizes issue as we will do it. label Jan 22, 2021
lucklove added a commit to lucklove/tiup that referenced this issue Feb 4, 2021
lucklove added a commit to lucklove/tiup that referenced this issue Feb 22, 2021
lucklove added a commit to lucklove/tiup that referenced this issue Mar 4, 2021
…version (pingcap#1128)

* Fix the issue that the default selected version may be a preprelease version

Close pingcap#1119

Partly fix pingcap#1082

* Add test

* Test

* Fix nightly

* Fix test

* Fix nightly clone

Close pingcap#1108

Co-authored-by: Ti Chi Robot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/TODO Categorizes issue as we will do it. type/bug Categorizes issue as related to a bug.
Projects
None yet
Development

No branches or pull requests

2 participants