Tiup fail to start tiflash because of libtiflash_proxy.so #1135

Closed
tohghua opened this issue Feb 10, 2021 · 11 comments · Fixed by #1152
Labels
type/bug Categorizes issue as related to a bug.

@tohghua
Contributor

tohghua commented Feb 10, 2021

  1. What did you do?
    $ tiup tiflash

  2. What did you expect to see?
    No error message.

  3. What did you see instead?
    /home/hgh/.tiup/components/tiflash/v5.0.0-rc/tiflash/tiflash: error while loading shared libraries: libtiflash_proxy.so: cannot open shared object file: No such file or directory
    Error: run /home/hgh/.tiup/components/tiflash/v5.0.0-rc/tiflash/tiflash (wd:/home/hgh/.tiup/data/SOccVcq) failed: exit status 127

  4. What version of TiUP are you using (tiup --version)?
    v1.3.2
    It seems related to the LD_LIBRARY_PATH environment variable.
    BTW, I'm using Ubuntu 20.04 LTS.
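
The loader error above means the dynamic linker could not find libtiflash_proxy.so on its search path. A minimal sketch for confirming which libraries are unresolved, assuming a Linux host where `ldd` is available. The helper names `missing_libs` and `check_binary` are hypothetical and not part of TiUP:

```python
import subprocess
from typing import List


def missing_libs(ldd_output: str) -> List[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libfoo.so => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path: str) -> List[str]:
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libs(result.stdout)
```

Calling `check_binary` on the tiflash binary from the error message should list libtiflash_proxy.so if the linker cannot resolve it. If the library is shipped in the same directory as the binary, pointing LD_LIBRARY_PATH at that directory is one way to test the hypothesis above.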

@tohghua added the type/bug label Feb 10, 2021
@tohghua
Contributor Author

tohghua commented Feb 11, 2021

Starting the playground failed too:

Starting component playground: /home/hgh/.tiup/components/playground/v1.3.2/tiup-playground --host 0.0.0.0
Use the latest stable version: v5.0.0-rc

Specify version manually:   tiup playground <version>
The stable version:         tiup playground v4.0.0
The nightly version:        tiup playground nightly

Playground Bootstrapping...
Start pd instance
Start tikv instance
Start tidb instance
Waiting for tidb instances ready
192.168.1.194:4000 ... Done
Start tiflash instance
Waiting for tiflash instances ready
0.0.0.0:3930 ... Error
CLUSTER START SUCCESSFULLY, Enjoy it ^-^
To connect TiDB: mysql --host 192.168.1.194 --port 4000 -u root
To view the dashboard: http://192.168.1.194:2379/dashboard
To view the Prometheus: http://0.0.0.0:9090
To view the Grafana: http://0.0.0.0:3000

@shhdgit
Member

shhdgit commented Feb 18, 2021

Any updates?

@AstroProfundis
Contributor

AstroProfundis commented Feb 19, 2021

@JaySon-Huang PTAL, I suspect it may be related to the Jenkins job for release packaging.

@JaySon-Huang
Contributor

JaySon-Huang commented Feb 19, 2021

Currently we don't support starting a TiFlash instance without a TiDB cluster, so the failure to run tiup tiflash won't be fixed for now.

But the failure to start tiup playground with TiFlash is not expected. We will take a look at this. cc @LittleFall @birdstorm

@tohghua
Contributor Author

tohghua commented Feb 19, 2021

> Currently we don't support starting a TiFlash instance without a TiDB cluster, so the failure to run tiup tiflash won't be fixed for now.
>
> But the failure to start tiup playground with TiFlash is not expected. We will take a look at this. cc @LittleFall @birdstorm

Thanks!

@birdstorm
Contributor

@tohghua Can you get the log from tiflash when tiup playground failed? It should be under the ~/.tiup/data/ directory

@tohghua
Contributor Author

tohghua commented Feb 19, 2021

> @tohghua Can you get the log from tiflash when tiup playground failed? It should be under the ~/.tiup/data/ directory

tiflash.log
[2021/02/19 13:06:12.225 +08:00] [INFO] [] ["Starting daemon with revision 54381"] [thread_id=1]
[2021/02/19 13:06:12.225 +08:00] [INFO] [] ["TiFlash build info: TiFlash\nRelease Version: v5.0.0-rc\nEdition: Community\nGit Commit Hash: 06fbf2ac0d494a9a567d077623685410e5dfc10d\nGit Branch: heads/refs/tags/v5.0.0-rc\nUTC Build Time: 2021-01-12 06:59:50\nProfile: RELWITHDEBINFO\n"] [thread_id=1]
[2021/02/19 13:06:12.225 +08:00] [INFO] [] ["Application: starting up"] [thread_id=1]
[2021/02/19 13:06:12.231 +08:00] [INFO] [] ["Application: Got jemalloc version: 5.2.1-0-gea6b3e973b477b8061e0076bb257dbd7f3faa756"] [thread_id=1]
[2021/02/19 13:06:12.231 +08:00] [INFO] [] ["Application: Not found environment variable MALLOC_CONF"] [thread_id=1]
[2021/02/19 13:06:12.231 +08:00] [INFO] [] ["Application: Got jemalloc config: opt.background_thread 0, opt.max_background_threads 4"] [thread_id=1]
[2021/02/19 13:06:12.231 +08:00] [INFO] [] ["Application: Try to use background_thread of jemalloc to handle purging asynchronously"] [thread_id=1]
[2021/02/19 13:06:12.231 +08:00] [INFO] [] ["Application: Set jemalloc.max_background_threads 1"] [thread_id=1]
[2021/02/19 13:06:12.232 +08:00] [INFO] [] ["Application: Set jemalloc.background_thread 1"] [thread_id=1]
[2021/02/19 13:06:12.274 +08:00] [INFO] [] ["Application: wait for tiflash proxy initializing"] [thread_id=1]
[2021/02/19 13:06:12.274 +08:00] [INFO] [] ["Application: Start raft store proxy"] [thread_id=2]

tiflash_tikv.log
[2021/02/19 13:06:12.496 +08:00] [INFO] [lib.rs:35] ["Welcome To TiFlash Raft Proxy"]
[2021/02/19 13:06:12.531 +08:00] [INFO] [lib.rs:37] ["Git Commit Hash: 05fb7b423862097b9de024b27699cbbaab7cb908"]
[2021/02/19 13:06:12.531 +08:00] [INFO] [lib.rs:37] ["Git Commit Branch: HEAD"]
[2021/02/19 13:06:12.531 +08:00] [INFO] [lib.rs:37] ["UTC Build Time: 2021-01-12 06:43:53"]
[2021/02/19 13:06:12.532 +08:00] [INFO] [lib.rs:37] ["Rust Version: rustc 1.49.0-nightly (b1496c6e6 2020-10-18)"]
[2021/02/19 13:06:12.532 +08:00] [INFO] [lib.rs:37] ["Storage Engine: tiflash"]
[2021/02/19 13:06:12.532 +08:00] [INFO] [lib.rs:37] ["Prometheus Prefix: tiflash_proxy_"]
[2021/02/19 13:06:12.532 +08:00] [INFO] [lib.rs:37] ["Profile: release"]
[2021/02/19 13:06:12.532 +08:00] [INFO] [mod.rs:64] ["memory limit in bytes: 5118423040, cpu cores quota: 4"]
[2021/02/19 13:06:12.544 +08:00] [WARN] [lib.rs:527] ["environment variable TZ is missing, using /etc/localtime"]
[2021/02/19 13:06:12.552 +08:00] [WARN] [server.rs:956] ["check: kernel"] [err="kernel parameters net.core.somaxconn got 4096, expect 32768"]
[2021/02/19 13:06:12.552 +08:00] [WARN] [server.rs:956] ["check: kernel"] [err="kernel parameters net.ipv4.tcp_syncookies got 1, expect 0"]
[2021/02/19 13:06:12.552 +08:00] [WARN] [server.rs:956] ["check: kernel"] [err="kernel parameters vm.swappiness got 60, expect 0"]
[2021/02/19 13:06:12.598 +08:00] [INFO] [util.rs:395] ["connecting to PD endpoint"] [endpoints=192.168.1.195:2379]
[2021/02/19 13:06:12.620 +08:00] [INFO] [] ["TCP_USER_TIMEOUT is available. TCP_USER_TIMEOUT will be used thereafter"]
[2021/02/19 13:06:12.624 +08:00] [INFO] [] ["New connected subchannel at 0x7fb9efc49150 for subchannel 0x7fb9f3050c40"]
[2021/02/19 13:06:12.631 +08:00] [INFO] [util.rs:395] ["connecting to PD endpoint"] [endpoints=http://192.168.1.195:2379]
[2021/02/19 13:06:12.632 +08:00] [INFO] [] ["New connected subchannel at 0x7fb9ef449150 for subchannel 0x7fb9f3050a80"]
[2021/02/19 13:06:12.635 +08:00] [INFO] [util.rs:395] ["connecting to PD endpoint"] [endpoints=http://192.168.1.195:2379]
[2021/02/19 13:06:12.638 +08:00] [INFO] [] ["New connected subchannel at 0x7fb9eec49150 for subchannel 0x7fb9f3050c40"]
[2021/02/19 13:06:12.639 +08:00] [INFO] [util.rs:459] ["connected to PD leader"] [endpoints=http://192.168.1.195:2379]
[2021/02/19 13:06:12.639 +08:00] [INFO] [util.rs:383] ["all PD endpoints are consistent"] [endpoints="["192.168.1.195:2379"]"]
[2021/02/19 13:06:12.648 +08:00] [INFO] [server.rs:342] ["connect to PD cluster"] [cluster_id=6928386802463615129]
[2021/02/19 13:06:12.648 +08:00] [INFO] [config.rs:1930] ["readpool.storage.use-unified-pool is not set, set to true by default"]
[2021/02/19 13:06:12.649 +08:00] [INFO] [config.rs:1953] ["readpool.coprocessor.use-unified-pool is not set, set to true by default"]
[2021/02/19 13:06:12.651 +08:00] [FATAL] [setup.rs:320] ["invalid configuration: "[src/server/config.rs:205]: invalid advertise-addr: \"0.0.0.0:20170\"""]
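
In logs like the ones above, the decisive line is the single FATAL entry at the end. A minimal sketch for pulling such lines out of a log file, assuming every line follows the `[time] [LEVEL] [source] message` layout seen here. The names `LOG_LINE` and `lines_at_level` are hypothetical:

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches lines shaped like: [time] [LEVEL] [source] rest-of-line
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]*)\] (?P<rest>.*)$"
)


def lines_at_level(
    lines: Iterable[str], levels: Tuple[str, ...] = ("FATAL", "ERROR")
) -> Iterator[Tuple[str, str, str]]:
    """Yield (time, source, rest) for log lines at one of the given levels."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            yield match.group("time"), match.group("source"), match.group("rest")
```

Passing an open file handle for tiflash_tikv.log to `lines_at_level` would yield only the `invalid advertise-addr` entry from the log above. Lines that do not match the assumed layout are skipped silently.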

@JaySon-Huang
Contributor

Something seems to go wrong when starting with "--host 0.0.0.0". TiUP playground converts "0.0.0.0" to a specific IP for the "advertise-addr" of TiKV, but does not convert it for TiFlash-Proxy.

Can you try setting the parameter to "--host 127.0.0.1" or to a specific IP? @tohghua
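
The conversion described in this comment can be sketched as follows. This only illustrates the idea and is not TiUP's actual implementation; `advertise_host` and `fallback_ip` are hypothetical names:

```python
import ipaddress


def advertise_host(listen_host: str, fallback_ip: str) -> str:
    """Pick a host that peers can dial, given the host a service listens on.

    Hypothetical helper: a wildcard listen address such as 0.0.0.0 or ::
    is replaced with fallback_ip; anything else is returned unchanged.
    """
    try:
        if ipaddress.ip_address(listen_host).is_unspecified:
            return fallback_ip
    except ValueError:
        # Not an IP literal (for example a hostname); keep it as given.
        pass
    return listen_host
```

A wildcard address is fine to bind to but cannot be dialed by peers, which would explain why the proxy rejects it as an advertise address. Under this sketch, applying the same helper to every component that advertises an address would avoid the mismatch described above.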

@tohghua
Contributor Author

tohghua commented Feb 22, 2021

@JaySon-Huang
Now tiflash complains "invalid advertise-addr":

[2021/02/22 09:19:25.671 +08:00] [INFO] [lib.rs:35] ["Welcome To TiFlash Raft Proxy"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["Git Commit Hash: 05fb7b423862097b9de024b27699cbbaab7cb908"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["Git Commit Branch: HEAD"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["UTC Build Time: 2021-01-12 06:43:53"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["Rust Version: rustc 1.49.0-nightly (b1496c6e6 2020-10-18)"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["Storage Engine: tiflash"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["Prometheus Prefix: tiflash_proxy_"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [lib.rs:37] ["Profile: release"]
[2021/02/22 09:19:25.672 +08:00] [INFO] [mod.rs:64] ["memory limit in bytes: 5118423040, cpu cores quota: 4"]
[2021/02/22 09:19:25.673 +08:00] [WARN] [lib.rs:527] ["environment variable TZ is missing, using /etc/localtime"]
[2021/02/22 09:19:25.673 +08:00] [WARN] [server.rs:956] ["check: kernel"] [err="kernel parameters net.core.somaxconn got 4096, expect 32768"]
[2021/02/22 09:19:25.673 +08:00] [WARN] [server.rs:956] ["check: kernel"] [err="kernel parameters net.ipv4.tcp_syncookies got 1, expect 0"]
[2021/02/22 09:19:25.673 +08:00] [WARN] [server.rs:956] ["check: kernel"] [err="kernel parameters vm.swappiness got 60, expect 0"]
[2021/02/22 09:19:25.679 +08:00] [INFO] [util.rs:395] ["connecting to PD endpoint"] [endpoints=127.0.0.1:2379]
[2021/02/22 09:19:25.679 +08:00] [INFO] [] ["TCP_USER_TIMEOUT is available. TCP_USER_TIMEOUT will be used thereafter"]
[2021/02/22 09:19:25.701 +08:00] [INFO] [] ["New connected subchannel at 0x7fe5cea49150 for subchannel 0x7fe5d1e50c40"]
[2021/02/22 09:19:25.709 +08:00] [INFO] [util.rs:395] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2021/02/22 09:19:25.710 +08:00] [INFO] [] ["New connected subchannel at 0x7fe5cde49150 for subchannel 0x7fe5d1e50a80"]
[2021/02/22 09:19:25.712 +08:00] [INFO] [util.rs:395] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2021/02/22 09:19:25.713 +08:00] [INFO] [] ["New connected subchannel at 0x7fe5cda49150 for subchannel 0x7fe5d1e50c40"]
[2021/02/22 09:19:25.714 +08:00] [INFO] [util.rs:459] ["connected to PD leader"] [endpoints=http://127.0.0.1:2379]
[2021/02/22 09:19:25.714 +08:00] [INFO] [util.rs:383] ["all PD endpoints are consistent"] [endpoints="["127.0.0.1:2379"]"]
[2021/02/22 09:19:25.730 +08:00] [INFO] [server.rs:342] ["connect to PD cluster"] [cluster_id=6928386802463615129]
[2021/02/22 09:19:25.734 +08:00] [INFO] [config.rs:1930] ["readpool.storage.use-unified-pool is not set, set to true by default"]
[2021/02/22 09:19:25.734 +08:00] [INFO] [config.rs:1953] ["readpool.coprocessor.use-unified-pool is not set, set to true by default"]
[2021/02/22 09:19:25.739 +08:00] [FATAL] [setup.rs:320] ["invalid configuration: "[src/server/config.rs:205]: invalid advertise-addr: \"0.0.0.0:20170\"""]

@JaySon-Huang
Contributor

JaySon-Huang commented Feb 22, 2021

Can you upload your tiflash.toml and tiflash-learner.toml under the ~/.tiup/data/ directory after running tiup playground --host 127.0.0.1? @tohghua

@tohghua
Contributor Author

tohghua commented Feb 22, 2021

@JaySon-Huang
I deleted the data folder and ran "tiup -T v5 playground --host 127.0.0.1 --tiflash 1", and it succeeds now.

So please check the "--host 0.0.0.0" case. Attached are the tiflash.toml and tiflash-learner.toml files from this run.

a.zip
