Number of open connections locks keep increasing on reporting server postgresql instance #1922

Open
dante-saggin opened this issue Nov 15, 2024 · 4 comments · Fixed by world-federation-of-advertisers/common-jvm#289 or #1949
Assignees
Labels
bug Something isn't working

Comments

dante-saggin commented Nov 15, 2024

Describe the bug
The PostgreSQL instance of the Reporting Server is experiencing a steady increase in open connections and locks over time, necessitating periodic restarts to maintain optimal performance.

Steps to reproduce

  1. Leave postgres-internal-reporting-server-deployment running for a few days.
  2. Check the number of open connections and locks on the PostgreSQL instance; both keep increasing.
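
A minimal sketch of how step 2 could be checked from a `psql` session, assuming sufficient privileges to read the standard `pg_stat_activity` and `pg_locks` views. The grouping and column aliases are illustrative choices, not something taken from the reporter's dashboards:

```sql
-- Connections to the current database, grouped by state
-- (active, idle, idle in transaction, ...).
SELECT state, count(*) AS connections
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state
ORDER BY connections DESC;

-- Locks held or awaited across the instance, grouped by mode
-- and by whether they have been granted.
SELECT mode, granted, count(*) AS locks
FROM pg_locks
GROUP BY mode, granted
ORDER BY locks DESC;
```

Running both queries at intervals and comparing the counts would show whether the growth comes mostly from idle sessions or from active ones.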

Component(s) affected
Reporting Server

Version
v5.11

Environment
Origin production

Additional context
[Three images attached; not reproduced here]

dante-saggin added the bug label on Nov 15, 2024
@SanjayVas
Member

@tristanvuong2021 can you look into this? Did you ever end up implementing connection pooling, and could this be related to that?

@TNATALI
Member

TNATALI commented Dec 5, 2024

Message from Tristan:
When the scenario occurs, can you (Origin team) run this query? It should help identify which queries are leading to the issue.

-- A comparison with NULL must use IS NOT NULL; `!= NULL` never evaluates to true.
SELECT
  wait_event_type, wait_event, query
FROM
  pg_stat_activity
WHERE
  state = 'active' AND wait_event_type IS NOT NULL;

@tristanvuong2021
Contributor

#1949 helps somewhat by preventing connections from getting stuck in the first place. common-jvm#289 closes idle connections that have been open for too long. That PR also includes a change that should shut down active connections, but so far I have only been able to reproduce idle connections remaining open.
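
A minimal sketch for listing the long-lived idle connections described above, assuming access to `pg_stat_activity`. The 10-minute threshold is an arbitrary illustrative value, not the timeout used in either PR:

```sql
-- Sessions that have been idle (or idle inside a transaction)
-- for longer than the assumed threshold, oldest first.
SELECT pid,
       usename,
       application_name,
       state,
       now() - state_change AS time_in_state
FROM pg_stat_activity
WHERE datname = current_database()
  AND state IN ('idle', 'idle in transaction')
  AND now() - state_change > interval '10 minutes'  -- assumed threshold
ORDER BY time_in_state DESC;
```

If the fixes work as intended, this list should stay short after the deployment has been up for several days.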

@SanjayVas
Member

SanjayVas commented Dec 18, 2024

#1971 is causing an error when testing in the cloud. The following trace is from CreateReportingSet.

[DefaultDispatcher-worker-1] gRPC 54cf064bff35e5606906f6e9e7594280 error: UNKNOWN
java.lang.IllegalStateException: The connection is closed
    at io.r2dbc.pool.PooledConnection.assertNotClosed(PooledConnection.java:208)
    at io.r2dbc.pool.PooledConnection.close(PooledConnection.java:110)
    at io.r2dbc.pool.PooledConnection.close(PooledConnection.java:44)
    at org.wfanet.measurement.common.db.r2dbc.ReadContextImpl.close$suspendImpl(ReadContext.kt:56)
    at org.wfanet.measurement.common.db.r2dbc.ReadContextImpl.close(ReadContext.kt)
    at org.wfanet.measurement.common.db.r2dbc.postgres.PostgresWriter.execute(PostgresWriter.kt:76)
    at org.wfanet.measurement.common.db.r2dbc.postgres.PostgresWriter$execute$1.invokeSuspend(PostgresWriter.kt)
    at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
    at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
    at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:811)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:715)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:702)

SanjayVas reopened this on Dec 18, 2024