Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Source Connector] Timestamp-based Querying Gets Stuck in Postgres #54

Closed
NJAldwin opened this issue Oct 2, 2020 · 3 comments
Closed

Comments

@NJAldwin
Copy link

NJAldwin commented Oct 2, 2020

In Postgres, CURRENT_TIMESTAMP (and others) change behavior within a
transaction:

These SQL-standard functions all return values based on the start time
of the current transaction

(see 1)

In 29581b3, the JDBC connection for the
source connector had auto-commit disabled, apparently to force cursor
behavior in some JDBC drivers. When auto-commit is disabled, a
transaction is opened and not committed until commit is called -- and
it's only called in the sink connector.

Because the generic dialect uses CURRENT_TIMESTAMP as its timestamp
query, and the Postgres dialect has no override, the end timestamp used
in timestamp- and timestamp+incrementing-mode is calculated using the
timestamp of the transaction start. The transaction was started when
the connector started up, so this timestamp always reads the same.

Here are some log lines from one of our production connectors illustrating this. I have added annotations and omitted days of logs:

[2020-09-29 22:09:01,019] DEBUG Executing prepared statement with timestamp value = 2020-09-25 19:59:12.238 end time = 2020-09-26 23:07:56.308 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 22:24:01,035] DEBUG Executing prepared statement with timestamp value = 2020-09-25 19:59:12.238 end time = 2020-09-26 23:07:56.308 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 22:39:01,047] DEBUG Executing prepared statement with timestamp value = 2020-09-25 19:59:12.238 end time = 2020-09-26 23:07:56.308 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 22:54:01,072] DEBUG Executing prepared statement with timestamp value = 2020-09-25 19:59:12.238 end time = 2020-09-26 23:07:56.308 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
--- connector restarted ---
[2020-09-29 22:54:59,136] DEBUG Executing prepared statement with timestamp value = 2020-09-25 19:59:12.238 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 23:09:59,531] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 23:24:59,545] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 23:39:59,561] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-29 23:54:59,574] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-30 00:09:59,586] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-30 00:24:59,609] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-09-30 00:39:59,673] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
...
[2020-10-01 23:55:04,836] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 00:10:04,885] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 00:25:04,958] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 00:40:05,139] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 00:55:05,184] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 01:10:05,360] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 01:25:05,415] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 01:40:05,446] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)
[2020-10-02 01:55:05,481] DEBUG Executing prepared statement with timestamp value = 2020-09-28 19:55:07.147 end time = 2020-09-29 22:53:59.135 (io.aiven.connect.jdbc.source.TimestampIncrementingCriteria:170)

In effect, the connector must be continually restarted for
timestamp-based querying to function in Postgres.

There are a couple of possible ways to work around this:

@ivanyu
Copy link
Contributor

ivanyu commented Oct 5, 2020

Hey @NJAldwin
Thank you for the issue report and the PR, I'm looking into them!

@ivanyu
Copy link
Contributor

ivanyu commented Oct 5, 2020

Once again, thank you for raising this concern.

turn auto-commit back on just for postgres

This is not possible: Postgres was the reason to introduce it in the first place.

make the postgres time query return the actual "current time"

This will work and would be the easiest fix.
However, there is more to this than stale timestamps. Long-running (effectively, indefinitely running in this case) transactions are generally a bad idea. So

explicitly commit after each statement

will be the way to go. The fix is incoming.

@ivanyu
Copy link
Contributor

ivanyu commented Oct 5, 2020

Made a PR #57

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants