[Bug]: Checkpointing doesn't trigger records flush to clickhouse #156
Comments
Hi @karpoftea Thanks for the insight!
I seem to have run into the same problem. Have you solved it yet? Could you please tell me how you resolved it? I hope to hear from you. Thank you @karpoftea
@LJSJackson no I haven't. I'm about to launch this connector as-is on production workloads, and if no critical issue arises, I'll then propose a patch for this.
Hello all: I made a temporary fix, sorry for the late submission. Hi @karpoftea:
[fix]: Ensure checkpointing triggers records flush to ClickHouse #156
What happened?
Originally I tested the connector against type-error failures (a type incompatibility between the Kafka source table and the ClickHouse sink table): I selected an integer column from the Kafka table (say, a JSON number field declared as `cnt INTEGER`) and inserted it into a ClickHouse table column of type Int64. With cnt=1 everything works as expected: the value is saved to ClickHouse. But if I change the ClickHouse column type to UInt64 and send cnt=-1, an exception occurs (which is OK), the task restarts, and after several restarts it returns to the RUNNING state, simply leaving the corrupted message behind. That is not the expected behaviour, because data was lost. The expected behaviour is to get stuck and wait for manual resolution (either moving the offset or changing the ClickHouse table schema).

I then dug into the code and found that DynamicTableSink is implemented using OutputFormatProvider/OutputFormat. My guess was that OutputFormat does not call flush() when a checkpoint occurs, so checkpointing always succeeds. To verify, I set the connector's sink.flush-interval to 10 minutes and the Flink checkpoint interval to 1 minute, and observed that ClickHouseBatchOutputFormat.flush() is never triggered by a checkpoint. That seems to confirm my guess.
Can you kindly tell me whether using OutputFormat as the SinkRuntimeProvider was a design choice? If so, what was the reason for not choosing the Sink API (org.apache.flink.api.connector.sink2.Sink) for the implementation?
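For context, here is a minimal sketch of what the SinkV2 route could look like if this connector were ported; `ClickHouseSinkSketch`, `ClickHouseBatchWriter`, and the stubbed insert are hypothetical names, not the connector's actual classes. The point is that Flink calls `SinkWriter#flush` before every checkpoint barrier, so a failed batch would fail the checkpoint instead of being silently dropped:

```java
// Hedged sketch, assuming a hypothetical SinkV2 port of the connector.
import org.apache.flink.api.connector.sink2.Sink;
import org.apache.flink.api.connector.sink2.SinkWriter;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ClickHouseSinkSketch implements Sink<String> {

    @Override
    public SinkWriter<String> createWriter(InitContext context) {
        return new ClickHouseBatchWriter();
    }

    private static final class ClickHouseBatchWriter implements SinkWriter<String> {

        private final List<String> batch = new ArrayList<>();

        @Override
        public void write(String element, Context context) {
            batch.add(element);
        }

        @Override
        public void flush(boolean endOfInput) throws IOException {
            // Flink invokes this before every checkpoint barrier (and at end of
            // input), so a failed insert fails the checkpoint instead of the
            // record being left behind.
            // sendBatchToClickHouse(batch); // hypothetical insert call
            batch.clear();
        }

        @Override
        public void close() {
        }
    }
}
```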
Affects Versions
master/1.16.0
What are you seeing the problem on?
Flink-Table-Api (SQL)
How to reproduce
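Based on the description above, a hypothetical Table API reproduction might look like the following; the connector option names, topic, and connection settings are placeholders rather than verified values:

```java
// Hedged reproduction sketch: sink.flush-interval far larger than the
// checkpoint interval, as described in "What happened?".
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ReproSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.getConfig().set("execution.checkpointing.interval", "1min");

        tEnv.executeSql(
                "CREATE TABLE kafka_src (cnt INTEGER) WITH ("
                        + " 'connector' = 'kafka',"
                        + " 'topic' = 'events',"                          // placeholder
                        + " 'properties.bootstrap.servers' = 'localhost:9092',"
                        + " 'format' = 'json')");

        tEnv.executeSql(
                "CREATE TABLE ch_sink (cnt BIGINT) WITH ("                // ClickHouse column is UInt64
                        + " 'connector' = 'clickhouse',"
                        + " 'url' = 'clickhouse://localhost:8123',"       // placeholder option names
                        + " 'table-name' = 'events',"
                        + " 'sink.flush-interval' = '10min')");

        // Produce a record with cnt = -1 to Kafka: the UInt64 insert fails, the
        // task restarts, yet checkpoints keep succeeding and the record is lost.
        tEnv.executeSql("INSERT INTO ch_sink SELECT cnt FROM kafka_src");
    }
}
```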
Relevant log output
No response
Anything else
The core problem is that checkpointing does not trigger a flush, so even if the sink holds a pending exception (flushException), it still looks healthy to the Flink runtime.
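For illustration, a hedged sketch (not the connector's exact code) of the timer-driven OutputFormat pattern that produces this failure mode: only the scheduled task ever flushes, a failure is merely parked in flushException until the next writeRecord() call, and the OutputFormat contract offers no checkpoint hook at all:

```java
// Hedged sketch of a timer-only flush; class and method names are illustrative.
import org.apache.flink.api.common.io.RichOutputFormat;
import org.apache.flink.configuration.Configuration;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TimerFlushOutputFormatSketch extends RichOutputFormat<String> {

    private final List<String> batch = new ArrayList<>();
    private transient ScheduledExecutorService scheduler;
    private transient volatile Exception flushException;

    @Override
    public void configure(Configuration parameters) {
    }

    @Override
    public void open(int taskNumber, int numTasks) {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        // Only this timer ever flushes; checkpoints never call into this class.
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                flush();
            } catch (Exception e) {
                flushException = e; // parked until the next writeRecord()
            }
        }, 10, 10, TimeUnit.MINUTES);
    }

    @Override
    public void writeRecord(String record) throws IOException {
        if (flushException != null) {
            // The failure only surfaces here, long after the checkpoint that
            // "covered" the lost records has already completed.
            throw new IOException("Flushing to ClickHouse failed.", flushException);
        }
        batch.add(record);
    }

    private synchronized void flush() {
        // sendBatchToClickHouse(batch); // hypothetical insert call
        batch.clear();
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```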
Are you willing to submit a PR?
Code of Conduct