Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support dual mode data storage #369

Merged
merged 31 commits into from
Dec 2, 2023
Merged

feat: support dual mode data storage #369

merged 31 commits into from
Dec 2, 2023

Conversation

jrconlin
Copy link
Member

Allow for dual mode, where data is written to the primary, but the
secondary is read in order to 'back fill'. For simplicity, Dual presumes
that the primary data store is Bigtable and the secondary is DynamoDb.

This presumes that the db_settings contains serialized settings for
both data stores.

Closes: SYNC-3452

Blocks on #364

@jrconlin jrconlin marked this pull request as ready for review April 25, 2023 22:52
@jrconlin jrconlin requested a review from pjenvey April 25, 2023 22:52
@jrconlin jrconlin marked this pull request as draft April 26, 2023 18:14
@jrconlin jrconlin marked this pull request as ready for review May 2, 2023 02:29
@jrconlin jrconlin force-pushed the feat/3452-dual_data branch from 7911837 to cb3e5e2 Compare July 12, 2023 21:08
@jrconlin jrconlin marked this pull request as draft July 28, 2023 21:48
@jrconlin jrconlin removed the request for review from pjenvey July 28, 2023 21:49
@jrconlin jrconlin force-pushed the feat/3452-dual_data branch from b5351c2 to fae3f73 Compare July 28, 2023 23:28
Allow for dual mode, where data is written to the primary, but the
secondary is read in order to 'back fill'. For simplicity, Dual presumes
that the primary data store is Bigtable and the secondary is DynamoDb.

This presumes that the `db_settings` contains serialized settings for
both data stores.

Closes: SYNC-3452
@jrconlin jrconlin force-pushed the feat/3452-dual_data branch from 9d65c01 to c00cb44 Compare August 3, 2023 17:23
@jrconlin jrconlin marked this pull request as ready for review August 3, 2023 17:23
This flag makes writing to the secondary an active feature. This is
required due to the potential for clients to be arbitrarily assigned to
the dual mode storage or the older single mode storage. This flag is set
to `true` as the default since we want both data stores active until we
have sufficient cross coverage.
@jrconlin jrconlin marked this pull request as draft October 6, 2023 21:14
@jrconlin jrconlin marked this pull request as ready for review November 3, 2023 21:39
@jrconlin jrconlin requested a review from pjenvey November 3, 2023 21:40
@jrconlin jrconlin force-pushed the feat/3452-dual_data branch from dd93d0c to 036ba4b Compare November 17, 2023 21:45
@jrconlin jrconlin force-pushed the feat/3452-dual_data branch from 036ba4b to 577ca3f Compare November 17, 2023 22:09
Cargo.toml Outdated
Comment on lines 81 to 83
sentry-core = { version = "0.31", default-features = false, features = [
"client",
] }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this client feature is being used anywhere?

Suggested change
sentry-core = { version = "0.31", default-features = false, features = [
"client",
] }
sentry-core = { version = "0.31" }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if i should audit the libraries and features we're pulling in. I know there's a cargo feature that allows for that, but it needs nightly...

Comment on lines 37 to 49
#[derive(Clone, Debug, Default, Deserialize)]
pub struct DualDbSettings {
#[serde(default)]
primary: DbSettings,
#[serde(default)]
secondary: DbSettings,
#[serde(default = "set_true")]
write_to_secondary: bool,
}

fn set_true() -> bool {
true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[derive(Clone, Debug, Default, Deserialize)]
pub struct DualDbSettings {
#[serde(default)]
primary: DbSettings,
#[serde(default)]
secondary: DbSettings,
#[serde(default = "set_true")]
write_to_secondary: bool,
}
fn set_true() -> bool {
true
}
#[derive(Clone, Debug, Deserialize)]
pub struct DualDbSettings {
primary: DbSettings,
secondary: DbSettings,
write_to_secondary: bool,
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, I had originally wanted to do these as optional and resortable, but that's not happened.

}

async fn router_table_exists(&self) -> DbResult<bool> {
self.primary.router_table_exists().await
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not check secondary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debating that.
I wanted the call to be reasonably fast and working through the logic around which sources to poll might be weird. Plus, since we don't pass the UAID, we're kinda arbitrarily picking which data store to read from anyway.

Picking just primary ensures that the prime database is correct, with the secondary being just that, and lesser priority. (Honestly, once we're on bigtable, we could drop this entirely, but there's probably good reasons to keep it in case we ever have to move to another data store).

@jrconlin jrconlin merged commit 6624a19 into master Dec 2, 2023
1 check passed
@jrconlin jrconlin deleted the feat/3452-dual_data branch December 2, 2023 00:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants