[Platform] YSQL backups with node-to-node TLS encryption enabled hang forever #6965

Closed
tvesely opened this issue Jan 22, 2021 · 2 comments
Labels
area/platform Yugabyte Platform priority/critical Critical issue

tvesely commented Jan 22, 2021

When node-to-node TLS encryption is enabled, a backup of a YSQL namespace taken through the Yugaware platform hangs indefinitely. The yb_backup.py script calls ysql_dump on a node, and that process never exits:

$ ps -ef wwf | grep -C4 23947
root   23928 1995  0 Jan19 ? Ss 0:00 \_ sshd: ybuser [priv]
ybuser 23932 23928 0 Jan19 ? S  0:00 | \_ sshd: ybuser@notty
ybuser 23933 23932 0 Jan19 ? Ss 0:00 | \_ bash -c cd / && sudo -u yugabyte bash -c '/home/yugabyte/master/postgres/bin/ysql_dump --host=10.35.101.43 --masters=10.35.101.41:7100,10.35.101.42:7100,10.35.101.43:7100 --include-yb-metadata --serializable-deferrable --create --schema-only --dbname=yugabyte --file=/tmp/yb_backup_fwjnxcllppncgfkg/YSQLDump'
root   23945 23933 0 Jan19 ? S  0:00 | \_ sudo -u yugabyte bash -c /home/yugabyte/master/postgres/bin/ysql_dump --host=10.35.101.43 --masters=10.35.101.41:7100,10.35.101.42:7100,10.35.101.43:7100 --include-yb-metadata --serializable-deferrable --create --schema-only --dbname=yugabyte --file=/tmp/yb_backup_fwjnxcllppncgfkg/YSQLDump
yugabyte 23947 23945 0 Jan19 ? Sl 9:00 | \_ /home/yugabyte/master/postgres/bin/ysql_dump --host=10.35.101.43 --masters=10.35.101.41:7100,10.35.101.42:7100,10.35.101.43:7100 --include-yb-metadata --serializable-deferrable --create --schema-only --dbname=yugabyte --file=/tmp/yb_backup_fwjnxcllppncgfkg/YSQLDump

Because node-to-node encryption (use_node_to_node_encryption) is enabled, the ysql_dump utility cannot connect to the master servers. ysql_dump attempts to connect to the master nodes without TLS, and each master drops the connection. Because ysql_dump never gets a response from the master nodes, it retries the connection indefinitely. Running the same command by hand shows the client failing to locate the leader master after 344 attempts in about 60 seconds:

[yugabyte@yb-demo-tvesely-yugabyte-4-n1 ~]$ /home/yugabyte/master/postgres/bin/ysql_dump --host=10.35.101.43 --masters=10.35.101.41:7100,10.35.101.42:7100,10.35.101.43:7100 --include-yb-metadata --serializable-deferrable --create --schema-only --dbname=yugabyte
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0120 21:47:39.557555 8802 ybc_pggate_tool.cc:58] Setting custom master addresses: 10.35.101.41:7100,10.35.101.42:7100,10.35.101.43:7100
I0120 21:47:39.558488 8802 pggate.cc:107] Reset YSQL bind address to 0.0.0.0:5432
I0120 21:47:39.558554 8802 server_base_options.cc:124] Updating master addrs to {10.35.101.41:7100},{10.35.101.42:7100},{10.35.101.43:7100}
I0120 21:47:39.558671 8802 mem_tracker.cc:249] MemTracker: hard memory limit is 53.480759 GB
I0120 21:47:39.558678 8802 mem_tracker.cc:251] MemTracker: soft memory limit is 45.458645 GB
I0120 21:47:39.559253 8802 thread_pool.cc:166] Starting thread pool { name: pggate_ybclient queue_limit: 10000 max_workers: 1024 }
I0120 21:47:39.559726 8809 async_initializer.cc:80] Starting to init ybclient
I0120 21:47:39.559849 8809 client-internal.cc:1976] New master addresses: [10.35.101.41:7100,10.35.101.42:7100,10.35.101.43:7100]
E0120 21:48:39.856043 8809 async_initializer.cc:93] Failed to initialize client: Timed out (yb/rpc/rpc.cc:213): Could not locate the leader master: GetLeaderMasterRpc(addrs: [10.35.101.41:7100, 10.35.101.42:7100, 10.35.101.43:7100], num_attempts: 344) passed its deadline 6672347.447s (passed: 60.294s): Network error (yb/util/net/socket.cc:535): recvmsg got EOF from remote (system error 108)
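
A backup wrapper could recognize this failure from the command output rather than waiting on a stuck process. The following is a minimal sketch, not code from yb_backup.py: the function name `find_master_connect_failure` is hypothetical, the regular expression assumes the glog line layout seen in the output above, and the two substrings are copied from the error line in this report.

```python
import re

# glog prefix: severity letter, MMDD, time, thread id, file:line, then "] message".
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<source>[^\]\s]+:\d+)\] (?P<message>.*)$"
)

# Substrings taken from the error line in the report above.
LEADER_LOOKUP_FAILED = "Could not locate the leader master"
PEER_CLOSED = "recvmsg got EOF from remote"


def find_master_connect_failure(output):
    """Return the first error-level message matching this failure, or None."""
    for line in output.splitlines():
        match = GLOG_LINE.match(line.strip())
        # Skip lines that are not glog lines, and INFO/WARNING lines.
        if match is None or match.group("severity") not in ("E", "F"):
            continue
        message = match.group("message")
        if LEADER_LOOKUP_FAILED in message and PEER_CLOSED in message:
            return message
    return None
```

A match only says that the masters closed the connection during leader lookup. A TLS mismatch is one likely cause when node-to-node encryption is on, but the check cannot prove it.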

ysql_dump can be forced to use TLS by setting the following environment variables:

FLAGS_certs_dir=</path/to/certs/dir>
FLAGS_use_node_to_node_encryption=true

I believe that the Yugaware platform either needs to set these environment variables when calling ysql_dump, or the client itself needs to be made aware of the master's TLS settings.
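
As an illustration of the first option, a caller could add the two variables to the child process environment before launching ysql_dump. This is a sketch under assumptions, not the platform's actual code: the helper names `tls_env`, `ysql_dump_argv`, and `run_dump` are hypothetical, the certificate directory must be supplied by the caller, and the timeout value is arbitrary.

```python
import os
import subprocess


def tls_env(certs_dir, base_env=None):
    """Copy the environment and add the two variables named in this report."""
    env = dict(os.environ if base_env is None else base_env)
    env["FLAGS_certs_dir"] = certs_dir
    env["FLAGS_use_node_to_node_encryption"] = "true"
    return env


def ysql_dump_argv(ysql_dump, host, masters, dbname, out_file):
    """Build the argument list seen in the process listing above."""
    return [
        ysql_dump,
        "--host=" + host,
        "--masters=" + ",".join(masters),
        "--include-yb-metadata",
        "--serializable-deferrable",
        "--create",
        "--schema-only",
        "--dbname=" + dbname,
        "--file=" + out_file,
    ]


def run_dump(argv, certs_dir, timeout_sec=600):
    """Run ysql_dump with the TLS variables set; raise instead of hanging."""
    # timeout_sec is an illustrative value, not one taken from yb_backup.py.
    return subprocess.run(argv, env=tls_env(certs_dir), check=True, timeout=timeout_sec)
```

With a timeout, a connection problem surfaces as `subprocess.TimeoutExpired` instead of a backup task that never finishes.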

tvesely assigned tvesely and iSignal and unassigned tvesely Jan 22, 2021
tvesely added the area/platform Yugabyte Platform label Jan 22, 2021
streddy-yb added this to the 2.5.x milestone Jan 22, 2021
@chirag-yb

@streddy-yb This should be targeted for 2.4.1

iSignal added a commit that referenced this issue Jan 27, 2021
…e TLS enabled universe

Summary:
ysql_dump needs to contact masters for certain metadata. When
node to node TLS is enabled, this means that it needs to be aware of
node certificate dirs and enable a TLS conn. This diff sets those flags
through yb_backup.py

Test Plan:
Backup an SQL db on a node to node TLS enabled univ through YW, verify it fails before and succeeds after this change.

Tested S3 (backup to, restore from) X (node to node tls, non TLS) universe.

Reviewers: arnav, oleg

Reviewed By: oleg

Subscribers: jenkins-bot, yugaware

Differential Revision: https://phabricator.dev.yugabyte.com/D10447
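
The process listing in the report shows ysql_dump being launched through `sudo -u yugabyte bash -c '...'`. Many sudo configurations reset the environment, so one way to pass the flags is to put the variable assignments inside the inner command string. The sketch below illustrates that idea only. It does not reproduce the referenced diff, and the function name `remote_dump_command` is hypothetical.

```python
import shlex


def remote_dump_command(dump_argv, certs_dir, run_as="yugabyte"):
    """Build a shell string that sets the TLS variables inside the sudo'd shell."""
    assignments = [
        "FLAGS_certs_dir=" + shlex.quote(certs_dir),
        "FLAGS_use_node_to_node_encryption=true",
    ]
    # Quote each argument so spaces or shell metacharacters survive the inner shell.
    inner = " ".join(assignments + [shlex.quote(arg) for arg in dump_argv])
    # Quote the whole inner command once more for the outer shell.
    return "cd / && sudo -u {} bash -c {}".format(shlex.quote(run_as), shlex.quote(inner))
```

Because the assignments prefix the command inside `bash -c`, they apply only to that one ysql_dump invocation and do not depend on sudo preserving the caller's environment.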
iSignal added a commit that referenced this issue Feb 4, 2021
…ing up node to node TLS enabled universe

Summary:
ysql_dump needs to contact masters for certain metadata. When
node to node TLS is enabled, this means that it needs to be aware of
node certificate dirs and enable a TLS conn. This diff sets those flags
through yb_backup.py

Original diff: https://phabricator.dev.yugabyte.com/D10447 / 3579c17

Test Plan:
1. run unit tests  ybd --cxx-test tools_yb-backup-test_ent && ybd --java-test org.yb.cql.TestYbBackup && ybd --java-test org.yb.pgsql.TestYbBackup

2. Run a manual SQL backup and restore against a local AWS TLS enabled univ

Reviewers: arnav, oleg

Reviewed By: oleg

Subscribers: bogdan, jenkins-bot, yugaware

Differential Revision: https://phabricator.dev.yugabyte.com/D10524
polarweasel pushed a commit to lizayugabyte/yugabyte-db that referenced this issue Mar 9, 2021
…e to node TLS enabled universe


VijiYB commented Apr 27, 2021

Validated on 2.4.2:
Created a universe with node-to-node TLS enabled.
Ran some YSQL workloads.
Took a backup and restored it.
