Skip to content

Commit

Permalink
[BACKPORT pg15-cherrypicks][#23070] YSQL, ASH: Replace ysql_session_i…
Browse files Browse the repository at this point in the history
…d with pid

Summary:
- postinit.c
-- /* Yugabyte includes */
--- YB de8c183 introduced `#include <arpa/inet.h>` first, and then YB a036313 moved it to yb_ash.c
--- PG 55782d5 added the adjacent line `/* Yugabyte includes */`
--- Keep the `/* Yugabyte includes */` line and remove the `#include <arpa/inet.h>` line
-- static void YbEnsureSysTablePrefetchingStopped();
--- YB de8c183 introduced the function declaration `YbSetAshClientAddrAndPort` and YB a036313 renamed `YbSetAshClientAddrAndPort` to `YbAshSetOneTimeMetadata` and moved in to yb_ash.h/yb_ash.c
--- PG 55782d5 introduced the function declaration `InitPostgresImpl` and `YbEnsureSysTablePrefetchingStopped` adjacent to it
--- Keep `InitPostgresImpl` and `YbEnsureSysTablePrefetchingStopped` and remove `YbSetAshClientAddrAndPort`

The original diff summary:

yb_active_session_history had a field for the pg client session id,
but it's not very useful, because it's an internal counter and the
mapping between session id and pid is only present in the log files.
This makes it hard for the user to correlate session id with an
actual process.

This diff replaces the session id with pid which is more readily
available. The pid is of the process which is executing the query.
For YSQL, it will be pid of the backend which is executing the query,
for YCQL and background activities like flushes and compactions, it
will be pid of the TServer.

This diff also does some refactoring
- Rename YbSetAshClientAddrAndPort to YbAshSetOneTimeMetadata
- Move YbAshSetOneTimeMetadata to yb_ash.c

Upgrade/Rollback safety:

- Old node sends the PB to new node
-- Old node sets the session_id, but the new node ignores it
-- Old node cannot set the pid, so the new node takes the pid as 0, it's incorrect until the nodes are upgraded, and that's fine

- New node sends the PB to old node
-- New node sets the pid, the old node is not aware of it and ignores it
-- New node doesn't set the session_id, old node takes the session id as 0, it's also incorrect and fine

The reason that the incorrect values are fine -
- The data in the circular buffer is only supposed to be there for a few hours, so it's fine to have incorrect value of one field during the upgrade

'ysql_yb_enable_ash' Will be made an AutoFlag at GA time

Test Plan:
./yb_build.sh --java-test TestYsqlUpgrade
./yb_build.sh --java-test TestYbAsh#testYsqlPids

Manually tested that the pids are correct

```
yugabyte=# select wait_event_component, pid, client_node_ip, count(*) from yb_active_session_history group by wait_event_component, pid, client_node_ip order by pid, wait_event_component;
 wait_event_component |  pid  | client_node_ip  | count
----------------------+-------+-----------------+-------
 TServer              | 78409 |                 |   872
 TServer              | 78623 | 127.0.0.1:57719 |    17
 YSQL                 | 78623 | 127.0.0.1:57719 |    15
 YSQL                 | 78667 | 127.0.0.1:57736 |     3
 TServer              | 78672 | 127.0.0.1:57740 |    57
 YSQL                 | 78672 | 127.0.0.1:57740 |   348
 TServer              | 78673 | 127.0.0.1:57742 |    57
 YSQL                 | 78673 | 127.0.0.1:57742 |   356
 TServer              | 78674 | 127.0.0.1:57744 |    59
 YSQL                 | 78674 | 127.0.0.1:57744 |   352
 TServer              | 78675 | 127.0.0.1:57746 |    60
 YSQL                 | 78675 | 127.0.0.1:57746 |   344
 TServer              | 78676 | 127.0.0.1:57748 |    52
 YSQL                 | 78676 | 127.0.0.1:57748 |   349
 TServer              | 78677 | 127.0.0.1:57750 |    59
 YSQL                 | 78677 | 127.0.0.1:57750 |   348
 TServer              | 78678 | 127.0.0.1:57752 |    50
 YSQL                 | 78678 | 127.0.0.1:57752 |   344
 TServer              | 78679 | 127.0.0.1:57754 |    44
 YSQL                 | 78679 | 127.0.0.1:57754 |   350
 TServer              | 78680 | 127.0.0.1:57756 |    54
 YSQL                 | 78680 | 127.0.0.1:57756 |   348
 TServer              | 78681 | 127.0.0.1:57758 |    46
 YSQL                 | 78681 | 127.0.0.1:57758 |   358
 YSQL                 | 78693 | 127.0.0.1:57761 |     2

```

Output of `ps -A | grep yugabyte`

```
78672 ??         0:06.75 postgres: yugabyte yugabyte 127.0.0.1(57740) idle
78673 ??         0:06.71 postgres: yugabyte yugabyte 127.0.0.1(57742) INSERT
78674 ??         0:06.64 postgres: yugabyte yugabyte 127.0.0.1(57744) UPDATE
78675 ??         0:06.66 postgres: yugabyte yugabyte 127.0.0.1(57746) COMMIT
78676 ??         0:06.55 postgres: yugabyte yugabyte 127.0.0.1(57748) UPDATE
78677 ??         0:06.59 postgres: yugabyte yugabyte 127.0.0.1(57750) UPDATE
78678 ??         0:06.58 postgres: yugabyte yugabyte 127.0.0.1(57752) COMMIT
78679 ??         0:06.52 postgres: yugabyte yugabyte 127.0.0.1(57754) UPDATE
78680 ??         0:06.56 postgres: yugabyte yugabyte 127.0.0.1(57756) UPDATE
78681 ??         0:06.48 postgres: yugabyte yugabyte 127.0.0.1(57758) idle
78693 ??         0:00.18 postgres: yugabyte yugabyte 127.0.0.1(57761) idle
78409 ttys006    4:08.89 /Users/asaha/code/yugabyte-db/build/latest/bin/yb-tserver ...
```

Reviewers: jason, tfoucher

Reviewed By: jason

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D37288
  • Loading branch information
abhinab-yb committed Aug 16, 2024
1 parent 1468868 commit d051f55
Show file tree
Hide file tree
Showing 23 changed files with 202 additions and 129 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ This view provides a list of wait events and their metadata. The columns of the
| wait_event_aux | text | Additional information for the wait event. For example, tablet ID for TServer wait events. |
| top_level_node_id | UUID | 16-byte TServer UUID of the YSQL/YCQL node where the query is being executed. |
| query_id | bigint | Query ID as seen on the `/statements` endpoint. This can be used to join with [pg_stat_statements](../../query-1-performance/pg-stat-statements/)/[ycql_stat_statements](../../query-1-performance/ycql-stat-statements/). It is set as a known constant for background activities. For example, _flush_ is 2, _compaction_ is 3, and so on. |
| ysql_session_id | bigint | YSQL session identifier. Zero for YCQL and background activities. |
| pid | bigint | PID of the process that is executing the query. For YCQL and background activites, this will be the YB-TServer PID. |
| client_node_ip | text | IP address of the client which sent the query to YSQL/YCQL. Null for background activities. |
| sample_weight | float | If in any sampling interval there are too many events, YugabyteDB only collects `ysql_yb_ash_sample_size` samples/events. Based on how many were sampled, weights are assigned to the collected events. <br><br>For example, if there are 200 events, but only 100 events are collected, each of the collected samples will have a weight of (200 / 100) = 2.0 |

Expand Down
29 changes: 29 additions & 0 deletions java/yb-pgsql/src/test/java/org/yb/pgsql/TestYbAsh.java
Original file line number Diff line number Diff line change
Expand Up @@ -333,4 +333,33 @@ public void testSampleSize() throws Exception {
}
}
}

/**
* Verify that we see the YSQL backend's pid in ASH
*/
@Test
public void testYsqlPids() throws Exception {
setAshConfigAndRestartCluster(100, ASH_SAMPLE_SIZE);

try (Statement statement = connection.createStatement()) {
statement.execute("CREATE TABLE test_table(k INT, v TEXT)");
for (int i = 0; i < 100; ++i) {
statement.execute(String.format("INSERT INTO test_table VALUES(%d, 'v-%d')", i, i));
}
int pid = getSingleRow(statement, "SELECT pg_backend_pid()").getInt(0);
int res = getSingleRow(statement, "SELECT COUNT(*) FROM " + ASH_VIEW +
" WHERE pid = " + pid).getLong(0).intValue();
assertGreaterThan(res, 0);
}

try (Statement statement = connection.createStatement()) {
for (int i = 0; i < 100; ++i) {
statement.execute(String.format("SELECT * FROM test_table WHERE k = %d", i));
}
int pid = getSingleRow(statement, "SELECT pg_backend_pid()").getInt(0);
int res = getSingleRow(statement, "SELECT COUNT(*) FROM " + ASH_VIEW +
" WHERE pid = " + pid).getLong(0).intValue();
assertGreaterThan(res, 0);
}
}
}
78 changes: 1 addition & 77 deletions src/postgres/src/backend/utils/init/postinit.c
Original file line number Diff line number Diff line change
Expand Up @@ -83,16 +83,13 @@
#include "utils/timeout.h"

/* Yugabyte includes */
#include <arpa/inet.h>
#include "pg_yb_utils.h"
#include "catalog/pg_auth_members.h"
#include "catalog/pg_yb_catalog_version.h"
#include "catalog/pg_yb_profile.h"
#include "catalog/pg_yb_role_profile.h"
#include "catalog/pg_yb_tablegroup.h"
#include "catalog/yb_catalog_version.h"
#include "common/ip.h"
#include "utils/builtins.h"
#include "utils/yb_inheritscache.h"

static HeapTuple GetDatabaseTuple(const char *dbname);
Expand All @@ -118,7 +115,6 @@ static void InitPostgresImpl(const char *in_dbname, Oid dboid,
uint64_t *session_id,
bool* yb_sys_table_prefetching_started);
static void YbEnsureSysTablePrefetchingStopped();
static void YbSetAshClientAddrAndPort();

/*** InitPostgres support ***/

Expand Down Expand Up @@ -841,7 +837,7 @@ InitPostgresImpl(const char *in_dbname, Oid dboid,
* bootstrap because it won't have client address anyway.
*/
if (YbAshIsClientAddrSet())
YbSetAshClientAddrAndPort();
YbAshSetOneTimeMetadata();

/* Connect to YugaByte cluster. */
if (bootstrap)
Expand Down Expand Up @@ -1530,75 +1526,3 @@ ThereIsAtLeastOneRole(void)

return result;
}

/*
* Sets the client address and port for ASH metadata.
* If the address family is not AF_INET or AF_INET6, then the PGPPROC ASH metadata
* fields for client address and port don't mean anything. Otherwise, if
* pg_getnameinfo_all returns non-zero value, a warning is printed with the error
* code and ASH keeps working without client address and port for the current PG
* backend.
*
* ASH samples only normal backends and this excludes background workers.
* So it's fine in that case to not set the client address.
*/
static void
YbSetAshClientAddrAndPort()
{
/* Background workers which creates a postgres backend may have null MyProcPort. */
if (MyProcPort == NULL)
{
Assert(MyProc->isBackgroundWorker == true);
return;
}

LWLockAcquire(&MyProc->yb_ash_metadata_lock, LW_EXCLUSIVE);

/* Set the address family and null the client_addr and client_port */
MyProc->yb_ash_metadata.addr_family = MyProcPort->raddr.addr.ss_family;
MemSet(MyProc->yb_ash_metadata.client_addr, 0, 16);
MyProc->yb_ash_metadata.client_port = 0;

switch (MyProcPort->raddr.addr.ss_family)
{
case AF_INET:
#ifdef HAVE_IPV6
case AF_INET6:
#endif
break;
default:
LWLockRelease(&MyProc->yb_ash_metadata_lock);
return;
}

char remote_host[NI_MAXHOST];
int ret;

ret = pg_getnameinfo_all(&MyProcPort->raddr.addr, MyProcPort->raddr.salen,
remote_host, sizeof(remote_host),
NULL, 0,
NI_NUMERICHOST | NI_NUMERICSERV);

if (ret != 0)
{
ereport(WARNING,
(errmsg("pg_getnameinfo_all while setting ash metadata failed"),
errdetail("%s\naddress family: %u",
gai_strerror(ret),
MyProcPort->raddr.addr.ss_family)));

LWLockRelease(&MyProc->yb_ash_metadata_lock);
return;
}

clean_ipv6_addr(MyProcPort->raddr.addr.ss_family, remote_host);

/* Setting ip address */
inet_pton(MyProcPort->raddr.addr.ss_family, remote_host,
MyProc->yb_ash_metadata.client_addr);

/* Setting port */
MyProc->yb_ash_metadata.client_port = atoi(MyProcPort->remote_port);

LWLockRelease(&MyProc->yb_ash_metadata_lock);
}
2 changes: 0 additions & 2 deletions src/postgres/src/backend/utils/misc/pg_yb_utils.c
Original file line number Diff line number Diff line change
Expand Up @@ -936,8 +936,6 @@ YBInitPostgresBackend(
* mapped to PG backends.
*/
yb_pgstat_add_session_info(YBCPgGetSessionID());
if (yb_ash_enable_infra)
YbAshSetSessionId(YBCPgGetSessionID());
}
}

Expand Down
87 changes: 78 additions & 9 deletions src/postgres/src/backend/utils/misc/yb_ash.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,13 @@

#include "yb_ash.h"

#include <arpa/inet.h>

#include "access/hash.h"
#include "common/ip.h"
#include "executor/executor.h"
#include "funcapi.h"
#include "libpq/libpq-be.h"
#include "miscadmin.h"
#include "parser/scansup.h"
#include "pgstat.h"
Expand Down Expand Up @@ -191,14 +195,6 @@ YbAshInstallHooks(void)
ProcessUtility_hook = yb_ash_ProcessUtility;
}

void
YbAshSetSessionId(uint64 session_id)
{
LWLockAcquire(&MyProc->yb_ash_metadata_lock, LW_EXCLUSIVE);
MyProc->yb_ash_metadata.session_id = session_id;
LWLockRelease(&MyProc->yb_ash_metadata_lock);
}

void
YbAshSetDatabaseId(Oid database_id)
{
Expand Down Expand Up @@ -506,6 +502,79 @@ YbAshUnsetMetadata(void)
LWLockRelease(&MyProc->yb_ash_metadata_lock);
}

/*
* Sets the client address, port and pid for ASH metadata.
* If the address family is not AF_INET or AF_INET6, then the PGPPROC ASH metadata
* fields for client address and port don't mean anything. Otherwise, if
* pg_getnameinfo_all returns non-zero value, a warning is printed with the error
* code and ASH keeps working without client address and port for the current PG
* backend.
*
* ASH samples only normal backends and this excludes background workers.
* So it's fine in that case to not set the client address.
*/
void
YbAshSetOneTimeMetadata()
{
/* Background workers which creates a postgres backend may have null MyProcPort. */
if (MyProcPort == NULL)
{
Assert(MyProc->isBackgroundWorker == true);
return;
}

LWLockAcquire(&MyProc->yb_ash_metadata_lock, LW_EXCLUSIVE);

/* Set the address family and null the client_addr and client_port */
MyProc->yb_ash_metadata.addr_family = MyProcPort->raddr.addr.ss_family;
MemSet(MyProc->yb_ash_metadata.client_addr, 0, 16);
MyProc->yb_ash_metadata.client_port = 0;
MyProc->yb_ash_metadata.pid = MyProcPid;

switch (MyProcPort->raddr.addr.ss_family)
{
case AF_INET:
#ifdef HAVE_IPV6
case AF_INET6:
#endif
break;
default:
LWLockRelease(&MyProc->yb_ash_metadata_lock);
return;
}

char remote_host[NI_MAXHOST];
int ret;

ret = pg_getnameinfo_all(&MyProcPort->raddr.addr, MyProcPort->raddr.salen,
remote_host, sizeof(remote_host),
NULL, 0,
NI_NUMERICHOST | NI_NUMERICSERV);

if (ret != 0)
{
ereport(WARNING,
(errmsg("pg_getnameinfo_all while setting ash metadata failed"),
errdetail("%s\naddress family: %u",
gai_strerror(ret),
MyProcPort->raddr.addr.ss_family)));

LWLockRelease(&MyProc->yb_ash_metadata_lock);
return;
}

clean_ipv6_addr(MyProcPort->raddr.addr.ss_family, remote_host);

/* Setting ip address */
inet_pton(MyProcPort->raddr.addr.ss_family, remote_host,
MyProc->yb_ash_metadata.client_addr);

/* Setting port */
MyProc->yb_ash_metadata.client_port = atoi(MyProcPort->remote_port);

LWLockRelease(&MyProc->yb_ash_metadata_lock);
}

/*
* Calculate the query id for utility statements. This takes parts of pgss_store
* from pg_stat_statements.
Expand Down Expand Up @@ -883,7 +952,7 @@ yb_active_session_history(PG_FUNCTION_ARGS)
values[j++] = UUIDPGetDatum(&yql_endpoint_tserver_uuid);

values[j++] = UInt64GetDatum(metadata->query_id);
values[j++] = UInt64GetDatum(metadata->session_id);
values[j++] = Int32GetDatum(metadata->pid);

if (metadata->addr_family == AF_INET || metadata->addr_family == AF_INET6)
{
Expand Down
4 changes: 2 additions & 2 deletions src/postgres/src/include/catalog/pg_proc.dat
Original file line number Diff line number Diff line change
Expand Up @@ -12020,9 +12020,9 @@
proname => 'yb_active_session_history', prorows => '100000',
proretset => 't', provolatile => 'v', proparallel => 'r',
prorettype => 'record', proargtypes => '',
proallargtypes => '{timestamptz,uuid,int8,text,text,text,uuid,int8,int8,text,text,float4,text,oid}',
proallargtypes => '{timestamptz,uuid,int8,text,text,text,uuid,int8,int4,text,text,float4,text,oid}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
proargnames => '{sample_time,root_request_id,rpc_request_id,wait_event_component,wait_event_class,wait_event,top_level_node_id,query_id,ysql_session_id,client_node_ip,wait_event_aux,sample_weight,wait_event_type,ysql_dbid}',
proargnames => '{sample_time,root_request_id,rpc_request_id,wait_event_component,wait_event_class,wait_event,top_level_node_id,query_id,pid,client_node_ip,wait_event_aux,sample_weight,wait_event_type,ysql_dbid}',
prosrc => 'yb_active_session_history'},

{ oid => '8066',
Expand Down
4 changes: 2 additions & 2 deletions src/postgres/src/include/catalog/pg_yb_migration.dat
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
[

# For better version control conflict detection, list latest migration filename
# here: V53__22144__yb_query_diagnostics.sql
{ major => '53', minor => '0', name => '<baseline>', time_applied => '_null_' }
# here: V54__23070__yb_ash_ysql_session_id_to_pid.sql
{ major => '54', minor => '0', name => '<baseline>', time_applied => '_null_' }

]
3 changes: 2 additions & 1 deletion src/postgres/src/include/yb_ash.h
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,6 @@ extern void YbAshRegister(void);
extern void YbAshMain(Datum main_arg);

extern void YbAshInit(void);
extern void YbAshSetSessionId(uint64 session_id);
extern void YbAshSetDatabaseId(Oid database_id);
extern bool YbAshShouldIgnoreWaitEvent(uint32 wait_event_info);

Expand All @@ -69,4 +68,6 @@ extern bool yb_enable_ash_check_hook(bool *newval,
extern void YbAshSetMetadata(void);
extern void YbAshUnsetMetadata(void);

extern void YbAshSetOneTimeMetadata(void);

#endif /* YB_ASH_H */
4 changes: 2 additions & 2 deletions src/postgres/src/test/regress/expected/yb_pg_rules.out
Original file line number Diff line number Diff line change
Expand Up @@ -2348,13 +2348,13 @@ yb_active_session_history| SELECT yb_active_session_history.sample_time,
yb_active_session_history.wait_event,
yb_active_session_history.top_level_node_id,
yb_active_session_history.query_id,
yb_active_session_history.ysql_session_id,
yb_active_session_history.pid,
yb_active_session_history.client_node_ip,
yb_active_session_history.wait_event_aux,
yb_active_session_history.sample_weight,
yb_active_session_history.wait_event_type,
yb_active_session_history.ysql_dbid
FROM yb_active_session_history() yb_active_session_history(sample_time, root_request_id, rpc_request_id, wait_event_component, wait_event_class, wait_event, top_level_node_id, query_id, ysql_session_id, client_node_ip, wait_event_aux, sample_weight, wait_event_type, ysql_dbid);
FROM yb_active_session_history() yb_active_session_history(sample_time, root_request_id, rpc_request_id, wait_event_component, wait_event_class, wait_event, top_level_node_id, query_id, pid, client_node_ip, wait_event_aux, sample_weight, wait_event_type, ysql_dbid);
yb_local_tablets| SELECT yb_local_tablets.tablet_id,
yb_local_tablets.table_id,
yb_local_tablets.table_type,
Expand Down
10 changes: 5 additions & 5 deletions src/yb/ash/wait_state-test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ AshMetadata GenerateRandomMetadata() {
.root_request_id{Uuid::Generate()},
.yql_endpoint_tserver_uuid{Uuid::Generate()},
.query_id = RandomUniformInt<uint64_t>(),
.session_id = RandomUniformInt<uint64_t>(),
.pid = RandomUniformInt<pid_t>(),
.database_id = RandomUniformInt<uint32_t>(),
.rpc_request_id = RandomUniformInt<int64_t>(),
.client_host_port = RandomHostPort()};
Expand All @@ -51,7 +51,7 @@ void testToAndFromPB() {
ASSERT_EQ(meta1.root_request_id, meta2.root_request_id);
ASSERT_EQ(meta1.yql_endpoint_tserver_uuid, meta2.yql_endpoint_tserver_uuid);
ASSERT_EQ(meta1.query_id, meta2.query_id);
ASSERT_EQ(meta1.session_id, meta2.session_id);
ASSERT_EQ(meta1.pid, meta2.pid);
ASSERT_EQ(meta1.database_id, meta2.database_id);
ASSERT_EQ(meta1.rpc_request_id, meta2.rpc_request_id);
ASSERT_EQ(meta1.client_host_port, meta2.client_host_port);
Expand All @@ -69,13 +69,13 @@ TEST(WaitStateTest, TestUpdate) {
auto pb1_root_request_id = Uuid::Generate();
pb1_root_request_id.ToBytes(pb1.mutable_root_request_id());
pb1.set_query_id(RandomUniformInt<uint64_t>());
pb1.set_session_id(RandomUniformInt<uint64_t>());
pb1.set_pid(RandomUniformInt<pid_t>());
HostPortToPB(RandomHostPort(), pb1.mutable_client_host_port());
meta1.UpdateFrom(AshMetadata::FromPB(pb1));
ASSERT_EQ(meta1.root_request_id, pb1_root_request_id);
ASSERT_EQ(meta1.yql_endpoint_tserver_uuid, meta1_copy.yql_endpoint_tserver_uuid);
ASSERT_EQ(meta1.query_id, pb1.query_id());
ASSERT_EQ(meta1.session_id, pb1.session_id());
ASSERT_EQ(meta1.pid, pb1.pid());
ASSERT_EQ(meta1.database_id, meta1_copy.database_id);
ASSERT_EQ(meta1.rpc_request_id, meta1_copy.rpc_request_id);
ASSERT_EQ(meta1.client_host_port, HostPortFromPB(pb1.client_host_port()));
Expand All @@ -90,7 +90,7 @@ TEST(WaitStateTest, TestUpdate) {
ASSERT_EQ(meta1.root_request_id, meta1_copy.root_request_id);
ASSERT_EQ(meta1.yql_endpoint_tserver_uuid, pb2_yql_endpoint_tserver_uuid);
ASSERT_EQ(meta1.query_id, meta1_copy.query_id);
ASSERT_EQ(meta1.session_id, meta1_copy.session_id);
ASSERT_EQ(meta1.pid, meta1_copy.pid);
ASSERT_EQ(meta1.rpc_request_id, pb2.rpc_request_id());
ASSERT_EQ(meta1.client_host_port, meta1_copy.client_host_port);
}
Expand Down
Loading

0 comments on commit d051f55

Please sign in to comment.