Safekeeper peer recovery preparatory patches #5118

Merged (6 commits) on Aug 29, 2023

@arssher arssher commented Aug 28, 2023

Implements #4875

Now available under GET /tenant/xxx/timeline/yyy for inspection.
@arssher arssher requested review from a team as code owners August 28, 2023 04:40
@arssher arssher requested review from petuhovskiy, knizhnik and koivunej and removed request for a team August 28, 2023 04:40

arssher commented Aug 28, 2023

@petuhovskiy To make this easier to digest, I am splitting it into several smaller commits; the ones pushed here are ready for review.

github-actions bot commented Aug 28, 2023

1624 tests run: 1550 passed, 0 failed, 74 skipped (full report)


Flaky tests (1)

Postgres 14

  • test_crafted_wal_end[last_wal_record_xlog_switch_ends_on_page_boundary]: release
The comment gets automatically updated with the latest test results
8f0ae23 at 2023-08-29T19:38:46.837Z

Slightly refactors init: load_tenant_timelines is now async as well, so that it
can properly initialize each timeline; to keep the global map lock synchronous,
we simply acquire it anew for each timeline.

Recovery task itself is just a stub here.

part of #4875
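
The locking pattern described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `GlobalMap`, `Timeline`, `init_timeline`, `load_timelines`, `count`, and the toy `block_on` executor are hypothetical names introduced here, and the real initialization presumably does actual I/O. The point is only that the synchronous mutex is taken for a single statement per timeline and is never held across an `.await`.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical timeline object; the real one carries much more state.
pub struct Timeline {
    pub id: String,
}

/// Hypothetical global registry guarded by a *synchronous* mutex.
#[derive(Default)]
pub struct GlobalMap {
    timelines: Mutex<HashMap<String, Arc<Timeline>>>,
}

/// Stand-in for asynchronous per-timeline initialization.
async fn init_timeline(id: &str) -> Timeline {
    Timeline { id: id.to_string() }
}

impl GlobalMap {
    /// Async loader: the lock is acquired anew for each timeline and released
    /// before the next `.await`, so the guard never lives across an await point.
    pub async fn load_timelines(&self, ids: &[&str]) {
        for &id in ids {
            let tli = Arc::new(init_timeline(id).await);
            // The guard is a temporary that is dropped at the end of this statement.
            self.timelines.lock().unwrap().insert(tli.id.clone(), tli);
        }
    }

    /// Number of registered timelines.
    pub fn count(&self) -> usize {
        self.timelines.lock().unwrap().len()
    }
}

/// Waker that does nothing; enough for the busy-polling executor below.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor so the sketch needs no external runtime. It busy-polls,
/// which is acceptable only because nothing here actually suspends.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

With these definitions, `block_on(map.load_timelines(&["tli-a", "tli-b"]))` on a fresh `GlobalMap` registers both timelines. Other users of the map can take the lock between iterations, which is the trade-off of re-acquiring it per timeline.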
Add derive Ord for easy comparison of <term, lsn> pairs.

part of #4875
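
As a rough illustration of why deriving `Ord` makes such comparisons easy: Rust's derived ordering on a struct is lexicographic in field declaration order, so declaring `term` before `lsn` compares terms first and uses the LSN only to break ties. The type and function names below (`TermLsn`, `most_advanced`) and the use of plain `u64` fields are assumptions made for this sketch, not taken from the PR.

```rust
/// Hypothetical stand-ins for the real types; plain integers keep the sketch self-contained.
type Term = u64;
type Lsn = u64;

/// Derived ordering is lexicographic in field declaration order:
/// `term` is compared first, and `lsn` only breaks ties.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TermLsn {
    pub term: Term,
    pub lsn: Lsn,
}

/// Pick the most advanced position among peers, or `None` if the slice is empty.
pub fn most_advanced(positions: &[TermLsn]) -> Option<TermLsn> {
    positions.iter().copied().max()
}
```

With this ordering, a position with a higher term wins even if its LSN is lower, and `max()` or the usual comparison operators can be used directly on pairs.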

@knizhnik knizhnik left a comment

I wonder whether this recovery could somehow be combined with retrieving, at the compute, the WAL needed for logical replication. In principle the approach is similar: we need to send WAL up to some boundary (in this case determined by the logical replication slot) to walproposer.

Or is it better not to mix these two things?

arssher commented Aug 28, 2023

Or is it better not to mix these two things?

These are not closely related. The interface for fetching WAL from safekeepers over the pg protocol has existed for a long time and can be used for logical replication as well; in fact, we already have fetching code in walproposer (which I plan to remove soon, but in any case it is trivial). This patchset extends it so that the not-yet-committed part can also be fetched dynamically, but that is not much needed for replication, since the uncommitted part is most often still on the compute that generates it.

I still think the hardest part of logical replication is the persistence of replication slots and historical snapshots...

@petuhovskiy petuhovskiy left a comment

I forgot that this PR is split into commits and reviewed all the changes as usual (all together in the Files changed tab).

Overall LGTM, let's merge and deploy.

safekeeper/src/recovery.rs (resolved)
safekeeper/src/recovery.rs (resolved)
safekeeper/src/timelines_global_map.rs (resolved)
safekeeper/src/send_wal.rs (resolved)
It will be used by safekeeper as well.
Instead of fixed during the start of replication. To this end, create a
term_flush_lsn watch channel similar to the commit_lsn one. This allows recovery
streaming to continue if new data appears.
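
The idea of a sender whose end position follows a watch channel can be sketched as below. The commit message only says "watch channel", so this sketch does not use any particular library's API: `Watch` is a tiny std-only stand-in built on `Mutex` and `Condvar`, and `TermLsn`, `stream_until`, and `stop_lsn` are hypothetical names for illustration. As with watch channels in general, a slow reader sees only the latest value and may skip intermediate ones.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical <term, lsn> pair; plain integers keep the sketch self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TermLsn {
    pub term: u64,
    pub lsn: u64,
}

/// Minimal watch cell: stores the latest value and wakes waiters when it is replaced.
pub struct Watch<T> {
    value: Mutex<T>,
    changed: Condvar,
}

impl<T: Copy + PartialEq> Watch<T> {
    pub fn new(init: T) -> Arc<Self> {
        Arc::new(Self {
            value: Mutex::new(init),
            changed: Condvar::new(),
        })
    }

    /// Publish a new value and wake every waiter.
    pub fn send(&self, v: T) {
        *self.value.lock().unwrap() = v;
        self.changed.notify_all();
    }

    /// Read the current value.
    pub fn get(&self) -> T {
        *self.value.lock().unwrap()
    }

    /// Block until the stored value differs from `seen`, then return it.
    pub fn wait_changed(&self, seen: T) -> T {
        let guard = self.value.lock().unwrap();
        let guard = self.changed.wait_while(guard, |v| *v == seen).unwrap();
        *guard
    }
}

/// Sender loop sketch: "send" up to the current end position, then wait for the
/// position to move instead of stopping at the value seen when streaming started.
/// Returns the end LSNs it streamed up to; `stop_lsn` exists only to end the demo.
pub fn stream_until(watch: &Watch<TermLsn>, stop_lsn: u64) -> Vec<u64> {
    let mut sent = Vec::new();
    let mut end = watch.get();
    loop {
        sent.push(end.lsn); // stand-in for sending WAL up to end.lsn
        if end.lsn >= stop_lsn {
            return sent;
        }
        end = watch.wait_changed(end);
    }
}
```

A writer thread would call `send` whenever the flush position advances, and the streaming thread then picks up the new end position without restarting the stream.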
@arssher arssher changed the title Safekeeper peer recovery Safekeeper peer recovery prep patches Aug 29, 2023
@arssher arssher changed the title Safekeeper peer recovery prep patches Safekeeper peer recovery preparatory patches Aug 29, 2023
@arssher arssher enabled auto-merge (rebase) August 29, 2023 19:34
@arssher arssher merged commit 81b6578 into main Aug 29, 2023
28 checks passed
@arssher arssher deleted the sk-peer-recovery-2 branch August 29, 2023 20:19