-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL][ASH] Include PID in yb_active_session_history view #23070
Comments
@pdvmoto Thanks for submitting the issue. Can you elaborate on the use case when PID will help and ysql_session_id will not? Since the PID is not available for sessions that have ended, I am not sure how to correlate that. |
I can add some cases where PID may be helpful:
|
What Franck Said. The "problem" I have with sql_session_id: it doesnt seem to link to anything (yet) ? and I realize the limitation of "sampled" information. but ASH is a Good Start! |
Summary: yb_active_session_history had a field for the pg client session id, but it's not very useful, because it's an internal counter and the mapping between session id and pid is only present in the log files. This makes it hard for the user to correlate session id with an actual process. This diff replaces the session id with pid which is more readily available. The pid is of the process which is executing the query. For YSQL, it will be pid of the backend which is executing the query, for YCQL and background activities like flushes and compactions, it will be pid of the TServer. This diff also does some refactoring - Rename YbSetAshClientAddrAndPort to YbAshSetOneTimeMetadata - Move YbAshSetOneTimeMetadata to yb_ash.c Upgrade/Rollback safety: - Old node sends the PB to new node -- Old node sets the session_id, but the new node ignores it -- Old node cannot set the pid, so the new node takes the pid as 0, it's incorrect until the nodes are upgraded, and that's fine - New node sends the PB to old node -- New node sets the pid, the old node is not aware of it and ignores it -- New node doesn't set the session_id, old node takes the session id as 0, it's also incorrect and fine The reason that the incorrect values are fine - - The data in the circular buffer is only supposed to be there for a few hours, so it's fine to have incorrect value of one field during the upgrade 'ysql_yb_enable_ash' Will be made an AutoFlag at GA time Test Plan: ./yb_build.sh --java-test TestYsqlUpgrade ./yb_build.sh --java-test TestYbAsh#testYsqlPids Manually tested that the pids are correct ``` yugabyte=# select wait_event_component, pid, client_node_ip, count(*) from yb_active_session_history group by wait_event_component, pid, client_node_ip order by pid, wait_event_component; wait_event_component | pid | client_node_ip | count ----------------------+-------+-----------------+------- TServer | 78409 | | 872 TServer | 78623 | 127.0.0.1:57719 | 17 YSQL | 78623 | 127.0.0.1:57719 | 15 YSQL | 78667 | 127.0.0.1:57736 | 3 TServer | 78672 | 127.0.0.1:57740 | 57 YSQL | 78672 | 127.0.0.1:57740 | 348 TServer | 78673 | 127.0.0.1:57742 | 57 YSQL | 78673 | 127.0.0.1:57742 | 356 TServer | 78674 | 127.0.0.1:57744 | 59 YSQL | 78674 | 127.0.0.1:57744 | 352 TServer | 78675 | 127.0.0.1:57746 | 60 YSQL | 78675 | 127.0.0.1:57746 | 344 TServer | 78676 | 127.0.0.1:57748 | 52 YSQL | 78676 | 127.0.0.1:57748 | 349 TServer | 78677 | 127.0.0.1:57750 | 59 YSQL | 78677 | 127.0.0.1:57750 | 348 TServer | 78678 | 127.0.0.1:57752 | 50 YSQL | 78678 | 127.0.0.1:57752 | 344 TServer | 78679 | 127.0.0.1:57754 | 44 YSQL | 78679 | 127.0.0.1:57754 | 350 TServer | 78680 | 127.0.0.1:57756 | 54 YSQL | 78680 | 127.0.0.1:57756 | 348 TServer | 78681 | 127.0.0.1:57758 | 46 YSQL | 78681 | 127.0.0.1:57758 | 358 YSQL | 78693 | 127.0.0.1:57761 | 2 ``` Output of `ps -A | grep yugabyte` ``` 78672 ?? 0:06.75 postgres: yugabyte yugabyte 127.0.0.1(57740) idle 78673 ?? 0:06.71 postgres: yugabyte yugabyte 127.0.0.1(57742) INSERT 78674 ?? 0:06.64 postgres: yugabyte yugabyte 127.0.0.1(57744) UPDATE 78675 ?? 0:06.66 postgres: yugabyte yugabyte 127.0.0.1(57746) COMMIT 78676 ?? 0:06.55 postgres: yugabyte yugabyte 127.0.0.1(57748) UPDATE 78677 ?? 0:06.59 postgres: yugabyte yugabyte 127.0.0.1(57750) UPDATE 78678 ?? 0:06.58 postgres: yugabyte yugabyte 127.0.0.1(57752) COMMIT 78679 ?? 0:06.52 postgres: yugabyte yugabyte 127.0.0.1(57754) UPDATE 78680 ?? 0:06.56 postgres: yugabyte yugabyte 127.0.0.1(57756) UPDATE 78681 ?? 0:06.48 postgres: yugabyte yugabyte 127.0.0.1(57758) idle 78693 ?? 0:00.18 postgres: yugabyte yugabyte 127.0.0.1(57761) idle 78409 ttys006 4:08.89 /Users/asaha/code/yugabyte-db/build/latest/bin/yb-tserver ... ``` Reviewers: jason, amitanand Reviewed By: jason Subscribers: hsunder, yql, amitanand, hbhanawat, ybase Differential Revision: https://phorge.dev.yugabyte.com/D37081
Hi @pdvmoto @FranckPachot, thanks for raising this issue, we have replaced the ysql_session_id with pid. cc: @hbhanawat |
Thx! Will this be available in any of the docker-images to test ? (I'm unable to compile from source myself) |
Summary: bd1e19e [PLAT-14835]: Add extra transient YCQL index tables in xClusterTableConfig during GET calls and metrics. 2715c58 [docs] updated yb version with shortcode (#23456) a2b5495 [PLAT-14912] docs warning for install root as subdirectory (#23470) 53365b1 [#23422] YSQL: Disable random warmup feature by default for connection manager tests 09d6e96 [#22876][#22835][#22773] CDCSDK: Add null checks & remove optimisation in table removal from CDC stream 69d4052 [#22862] XCluster: Improving XCluster Index Base WAL Retention Policy 706e97d [#23460] DocDB: Read vector index data b1a90b9 [#23428] docdb: Remove non-transactional snapshot code 581648f [PLAT-13957] Update RBAC wrapper for xCluster DR fbaf945 Whitepaper on migration (#23468) f6af2f5 [PLAT-13936] Upgrade Grpc and its dependencies to fix CVEs d7027fe [PLAT-14892] Update PITR configuration step text 92804ac [PLAT-14760] Use new xCluster sync API 3cb8faf [PLAT-11243] Upgrade python requests to latest version Excluded: a036313 [#23070] YSQL, ASH: Replace ysql_session_id with pid 4d2f71f [PLAT-14882] Retrieve userName from the attribute lists in case not found in dn 99489c0 [PLAT-14909] Upgrade YBC version to 2.2.0.0-b4 Test Plan: Jenkins: rebase: pg15-cherrypicks Reviewers: jason, tfoucher Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D37272
…d with pid Summary: - postinit.c -- /* Yugabyte includes */ --- YB de8c183 introduced `#include <arpa/inet.h>` first, and then YB a036313 moved it to yb_ash.c --- PG 55782d5 added the adjacent line `/* Yugabyte includes */` --- Keep the `/* Yugabyte includes */` line and remove the `#include <arpa/inet.h>` line -- static void YbEnsureSysTablePrefetchingStopped(); --- YB de8c183 introduced the function declaration `YbSetAshClientAddrAndPort` and YB a036313 renamed `YbSetAshClientAddrAndPort` to `YbAshSetOneTimeMetadata` and moved in to yb_ash.h/yb_ash.c --- PG 55782d5 introduced the function declaration `InitPostgresImpl` and `YbEnsureSysTablePrefetchingStopped` adjacent to it --- Keep `InitPostgresImpl` and `YbEnsureSysTablePrefetchingStopped` and remove `YbSetAshClientAddrAndPort` The original diff summary: yb_active_session_history had a field for the pg client session id, but it's not very useful, because it's an internal counter and the mapping between session id and pid is only present in the log files. This makes it hard for the user to correlate session id with an actual process. This diff replaces the session id with pid which is more readily available. The pid is of the process which is executing the query. For YSQL, it will be pid of the backend which is executing the query, for YCQL and background activities like flushes and compactions, it will be pid of the TServer. This diff also does some refactoring - Rename YbSetAshClientAddrAndPort to YbAshSetOneTimeMetadata - Move YbAshSetOneTimeMetadata to yb_ash.c Upgrade/Rollback safety: - Old node sends the PB to new node -- Old node sets the session_id, but the new node ignores it -- Old node cannot set the pid, so the new node takes the pid as 0, it's incorrect until the nodes are upgraded, and that's fine - New node sends the PB to old node -- New node sets the pid, the old node is not aware of it and ignores it -- New node doesn't set the session_id, old node takes the session id as 0, it's also incorrect and fine The reason that the incorrect values are fine - - The data in the circular buffer is only supposed to be there for a few hours, so it's fine to have incorrect value of one field during the upgrade 'ysql_yb_enable_ash' Will be made an AutoFlag at GA time Test Plan: ./yb_build.sh --java-test TestYsqlUpgrade ./yb_build.sh --java-test TestYbAsh#testYsqlPids Manually tested that the pids are correct ``` yugabyte=# select wait_event_component, pid, client_node_ip, count(*) from yb_active_session_history group by wait_event_component, pid, client_node_ip order by pid, wait_event_component; wait_event_component | pid | client_node_ip | count ----------------------+-------+-----------------+------- TServer | 78409 | | 872 TServer | 78623 | 127.0.0.1:57719 | 17 YSQL | 78623 | 127.0.0.1:57719 | 15 YSQL | 78667 | 127.0.0.1:57736 | 3 TServer | 78672 | 127.0.0.1:57740 | 57 YSQL | 78672 | 127.0.0.1:57740 | 348 TServer | 78673 | 127.0.0.1:57742 | 57 YSQL | 78673 | 127.0.0.1:57742 | 356 TServer | 78674 | 127.0.0.1:57744 | 59 YSQL | 78674 | 127.0.0.1:57744 | 352 TServer | 78675 | 127.0.0.1:57746 | 60 YSQL | 78675 | 127.0.0.1:57746 | 344 TServer | 78676 | 127.0.0.1:57748 | 52 YSQL | 78676 | 127.0.0.1:57748 | 349 TServer | 78677 | 127.0.0.1:57750 | 59 YSQL | 78677 | 127.0.0.1:57750 | 348 TServer | 78678 | 127.0.0.1:57752 | 50 YSQL | 78678 | 127.0.0.1:57752 | 344 TServer | 78679 | 127.0.0.1:57754 | 44 YSQL | 78679 | 127.0.0.1:57754 | 350 TServer | 78680 | 127.0.0.1:57756 | 54 YSQL | 78680 | 127.0.0.1:57756 | 348 TServer | 78681 | 127.0.0.1:57758 | 46 YSQL | 78681 | 127.0.0.1:57758 | 358 YSQL | 78693 | 127.0.0.1:57761 | 2 ``` Output of `ps -A | grep yugabyte` ``` 78672 ?? 0:06.75 postgres: yugabyte yugabyte 127.0.0.1(57740) idle 78673 ?? 0:06.71 postgres: yugabyte yugabyte 127.0.0.1(57742) INSERT 78674 ?? 0:06.64 postgres: yugabyte yugabyte 127.0.0.1(57744) UPDATE 78675 ?? 0:06.66 postgres: yugabyte yugabyte 127.0.0.1(57746) COMMIT 78676 ?? 0:06.55 postgres: yugabyte yugabyte 127.0.0.1(57748) UPDATE 78677 ?? 0:06.59 postgres: yugabyte yugabyte 127.0.0.1(57750) UPDATE 78678 ?? 0:06.58 postgres: yugabyte yugabyte 127.0.0.1(57752) COMMIT 78679 ?? 0:06.52 postgres: yugabyte yugabyte 127.0.0.1(57754) UPDATE 78680 ?? 0:06.56 postgres: yugabyte yugabyte 127.0.0.1(57756) UPDATE 78681 ?? 0:06.48 postgres: yugabyte yugabyte 127.0.0.1(57758) idle 78693 ?? 0:00.18 postgres: yugabyte yugabyte 127.0.0.1(57761) idle 78409 ttys006 4:08.89 /Users/asaha/code/yugabyte-db/build/latest/bin/yb-tserver ... ``` Reviewers: jason, tfoucher Reviewed By: jason Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D37288
Thx, only see this now. Will check ASAP, early next week. |
Positive! First Tests done. It Works Fine. I can identify processes from ash, and I can find ash-data per-process. |
Description
Asking to include PID in yb_active_session_history. This field is already present in pg_stat_activity, which seems to contain similar data. the current ysql_session_id is just a local-counter, and is not much use "externally", wheras the linux-PID will allow linkup to an actual process.
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: