Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: add database crates #1407

Merged
merged 35 commits into from
Jan 10, 2023
Merged

refactor: add database crates #1407

merged 35 commits into from
Jan 10, 2023

Conversation

ethowitz
Copy link
Contributor

@ethowitz ethowitz commented Sep 30, 2022

Overview

This PR breaks up syncstorage's and tokenserver's database functionality into a number of new crates:

  • syncserver-db-common: Database-related code that is shared between syncstorage and tokenserver
    • Does not depend on any other database crates
  • syncstorage-db-common: Database-related code that is shared between the MySQL and Spanner backends of syncstorage. Note that this crate is not used in any way by Tokenserver
    • Depends on syncserver-db-common
  • syncstorage-db: Acts as an interface between the two database backends and the syncserver crate
    • Depends on:
      • syncstorage-mysql
      • syncstorage-spanner
      • syncserver-db-common
      • syncstorage-db-common
  • syncstorage-mysql: The code that powers syncstorage's MySQL database backend
    • Depends on:
      • syncstorage-db-common
      • syncserver-db-common
  • syncstoarge-spanner: The code the powers syncstorage's Spanner database backend
    • Depends on:
      • syncstorage-db-common
      • syncserver-db-common
  • tokenserver-db: The code that powers tokenserver's MySQL database backend
    • Depends on:
      • syncserver-db-common

I'd recommend generating the docs using cargo doc --no-deps --workspace to view the public interfaces for each crate in the workspace. I tried to keep the public interfaces for each crate as simple as possible.

Crate Relationships

It might be useful to visualize how the new crates relate to one another (I generated this using cargo depgraph):
graph

The idea is that syncserver should not need to know any details about the underlying database backend. syncstorage-db acts as an interface to the database that reveals as little as possible about what's happening under the hood. syncstorage-db has two features (mysql and spanner) that enable the MySQL and Spanner backends, respectively.

cargo features

I added two cargo features to the top-level syncserver crate: mysql and spanner. These features represent the two syncstorage database backends. This was an important addition because the Spanner backend pulls in protobuf and grpcio, which both contribute very substantially to the time it takes to build syncstorage. Giving developers the ability to omit the code related to Spanner when developing locally substantially improves compile times. By my measurements, compiling with the spanner feature takes about 4.5 minutes, whereas compiling with the mysql feature takes only about 1.5 minutes.

Adding these cargo features also reduces the size of the compiled binary, since we are omitting substantial amounts of code we never use.

Error Handling

The toughest part about this work was figuring out how to handle errors. Originally, we had a single DbError which encapsulated knowledge of both the Spanner and MySQL backends. Decoupling the Spanner and MySQL backend meant that each needed to have its own error type that included only knowledge of its own backend. To resolve this, I defined a number of different error types in different places and nested them using enums.

  • syncserver-db-common:
    • MysqlError: An error representing low-level, MySQL-specific errors (e.g. connection errors, migration errors, errors that arise during queries, etc.). Basically, this wraps a few diesel errors. This type is shared by the syncstorage MySQL backend and the Tokenserver MySQL backend
  • syncstorage-db-common:
    • SyncstorageDbError: An error that represents syncstorage-specific errors that arise during database queries. It represents things like CollectionNotFound, BsoNotFound, quota errors, batch errors, etc. This type doesn't wrap any lower-level errors, and the errors are not specific to MySQL or Spanner.
  • syncstorage-mysql:
    • DbError: This is a top-level database error for the MySQL backend. It encapsulates both MysqlError and SyncstorageDbError, representing any error that may occur when processing a MySQL query for syncstorage.
  • syncstorage-spanner:
    • DbError: This is a top-level database error for the Spanner backend. It encapsulates SyncstorageDbError and lower-level Spanner errors (there was no need for a SpannerError type akin to the MysqlError type, since the Tokenserver backend doesn't use Spanner). It represents any error that may occur when processing a Spanner query for syncstorage.
  • tokenserver-mysql:
    • DbError: This is a top-level database error for tokenserver. It encapsulates MysqlError and any other internal errors that may occur when processing a MySQL query for tokenserver.

I also made the *ErrorKind types private. This was a more opinionated decision on my part, but IMO, it simplifies the public APIs of the error types, making them easier to understand and use by clients. I include a number of helper methods on the base error types to allow clients to access information hidden in the *ErrorKind types.

Use of debug_assertions

Unfortunately, it's not possible to pass through items defined conditionally under the test feature to other crates (i.e. if something is defined with #[cfg(test)] in a crate, it won't be rendered visible to parent crates). There are numerous testing utilities that we define under the test feature, and I thought it would be wrong to include them in release builds. To remedy this, I define testing utilities with the debug_assertions feature instead, which only compiles the testing utilities in debug builds. This feels a little inelegant, but I thought it would be preferable to including testing utilities in release builds or defining our own new testing feature, which would further complicate compiling syncserver.

CI

This PR introduces new cargo features (one for the Spanner backend and one for the MySQL backend), which are mutually exclusive. This makes it impossible to build a single Docker image against which we can run the MySQL and Spanner e2e tests. Instead, we need to build two separate Docker images: one compiled with the MySQL database backend and one compiled with the Spanner database backend. Our Docker image builds are quite lengthy, so I introduced cargo chef to facilitate the caching of build dependencies between CI runs. This dramatically reduces build times, since only our own crates need to be recompiled with code changes -- the dependencies are redownloaded and recompiled only if our Cargo.tomls change.

Additionally, I configured the MySQL and Spanner images/e2e tests to run in parallel, further improving build times. We might want to consider using a larger CircleCI resource class to allow cargo to build more crates in parallel and to improve single-threaded CPU performance.

Testing

Given that no functionality should be changing, ensuring compilation and that the unit and integration tests pass should be sufficient. We should definitely run through the end-to-end QA tests on stage before we deploy this to production.

Issue(s)

Closes #1277

syncstorage-db/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines +39 to +43
#[cfg(all(feature = "mysql", feature = "spanner"))]
compile_error!("only one of the \"mysql\" and \"spanner\" features can be enabled at a time");

#[cfg(not(any(feature = "mysql", feature = "spanner")))]
compile_error!("exactly one of the \"mysql\" and \"spanner\" features must be enabled");
Copy link
Contributor Author

@ethowitz ethowitz Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some notes on mutually exclusive features: https://doc.rust-lang.org/cargo/reference/features.html#mutually-exclusive-features

To me, this seemed like a good use case for them, given that we specifically want to omit dependencies we don't need when compiling with the MySQL backend. An alternative would be to have one database backend be the "default" and then have a single feature to override that default to use the other backend. I opted not to go with that route because I don't think one database backend is obviously the "default", since we use Spanner in prod but MySQL locally (for the most part)


#[async_trait]
pub trait DbPoolTrait: Sync + Send + Debug + GetPoolState {
type Error;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to introduce an associated type here because implementers of this trait have their own DbError types

#[derive(Debug, Error)]
enum DbErrorKind {
#[error("{}", _0)]
Common(SyncstorageDbError),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled a bit with the naming here. Common is supposed to be a Syncstorage database error that is not specific to the database backend

Comment on lines +62 to +72
#[error("Connection expired")]
Expired,

#[error("A database error occurred: {}", _0)]
Grpc(#[from] grpcio::Error),

#[error("Database integrity error: {}", _0)]
Integrity(String),

#[error("Spanner data load too large: {}", _0)]
TooLarge(String),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could have been pulled into a SpannerDbError type akin to the MysqlDbError type, but the reason I made the MysqlDbError type was because it's also used by Tokenserver

@ethowitz
Copy link
Contributor Author

This can't be merged until #1276 is merged

@ethowitz ethowitz force-pushed the refactor/add-settings-crates branch from ca8bdc9 to 6b75585 Compare October 3, 2022 21:58
@ethowitz ethowitz force-pushed the refactor/add-db-crates branch from 7399bb2 to af9875b Compare October 4, 2022 18:52
@ethowitz ethowitz force-pushed the refactor/add-settings-crates branch from 6b75585 to db7d3f5 Compare October 14, 2022 13:15
@ethowitz ethowitz force-pushed the refactor/add-db-crates branch from af9875b to 65497c3 Compare October 14, 2022 13:23
Base automatically changed from refactor/add-settings-crates to master October 19, 2022 21:15
@ethowitz ethowitz force-pushed the refactor/add-db-crates branch from 65497c3 to 21d0c30 Compare October 25, 2022 16:50
@ethowitz ethowitz marked this pull request as ready for review October 25, 2022 16:50
@ethowitz ethowitz requested review from pjenvey and jrconlin and removed request for pjenvey October 25, 2022 16:50
Copy link
Member

@jrconlin jrconlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, aside from the Makefile pointing to your directory.

Dockerfile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
actix-cors = "0.5"
actix-service = "1.0.6"
async-trait = "0.1.40"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the package stuff, we could also boost a bunch of the common dependency crates to up to the parent crates.
https://doc.rust-lang.org/cargo/reference/workspaces.html#the-workspacedependencies-table

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! I created a separate ticket to do this: https://mozilla-hub.atlassian.net/browse/SYNC-3549

@@ -4,5 +4,16 @@ version = "0.13.0"
edition = "2021"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to use some of the more modern crate management stuff here:
https://doc.rust-lang.org/cargo/reference/workspaces.html#the-workspacepackage-table

That way we don't have to muck with updating a dozen versions whenever something changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, I created a ticket to track this: https://mozilla-hub.atlassian.net/browse/SYNC-3550

@@ -98,7 +98,7 @@ pub async fn append_async(db: &SpannerDb, params: params::AppendToBatch) -> Resu
if !exists {
// NOTE: db tests expects this but it doesn't seem necessary w/ the
// handler validating the batch before appends
Err(DbErrorKind::BatchNotFound)?
return Err(DbError::batch_not_found());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that adding the ? does the same thing, but I agree that this is more readable, particularly when fast reading the code.

Copy link
Member

@pjenvey pjenvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Wow on the cargo chef changes, thanks for adding those!

Of course on a huge PR like this I really only have one thing to bikeshed, and it's mostly just food for thought: the Trait suffix for Db/PoolTrait feels odd -- I don't think I've ever seen it used like that in Rust -- and a bit superfluous when there's a dyn beside it (which is most of the time when referencing traits). One alternative could be to keep the trait names as is and rename syncstorage_db::Db and DbPool to DbImpl/DbPoolImpl (I saw a complex trait setup in servo that does this) -- and maybe that Impl suffix kinda helps mark that the type's swapped via the feature flag and backed by a trait?

match &kind {
DbErrorKind::Common(dbe) => Self {
status: dbe.status,
backtrace: Box::new(dbe.backtrace.clone()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about the amount of cloning we're doing of backtraces. At least in mysql, we'd clone it from the Kind and then ApiError eventually clones it again, all for propagating one error.

However the current 0.13.1 on prod isn't far from this, I think it's generating 2 different stack traces for a DbError (in DbError and then another one in ApiError) so maybe this does not matter much.

The other problem with it is the new std::backtrace::Backtrace doesn't implement Clone (which will hopefully replace all uses of the backtrace::Backtrace at some point in the future).

Maybe we can just log an issue to look at having backtrace methods that would return them from the inner types to avoid the cloning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the backtrace method approach sounds good 👍🏻 I created a ticket here

@ethowitz
Copy link
Contributor Author

ethowitz commented Jan 9, 2023

@pjenvey I like your idea about removing the Trait suffix and adding an Impl suffix to the types that implement the traits. I'll make the change!

@ethowitz ethowitz requested review from pjenvey and jrconlin January 9, 2023 20:19
jrconlin
jrconlin previously approved these changes Jan 9, 2023
@@ -16,9 +19,9 @@ lazy_static! {
}

#[tokio::test]
async fn bso_successfully_updates_single_values() -> Result<()> {
async fn bso_successfully_updates_single_values() -> Result<(), DbError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit surprised you didn't create a type for this, just to save keystrokes, but it's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a few different Result/DbResult/DbFuture aliases, and I thought it would make things easier to follow if we were more explicit with our return types but 🤷🏻 perhaps that's just me

@ethowitz
Copy link
Contributor Author

ethowitz commented Jan 10, 2023

CI was stuck so I did a git commit --amend --no-edit and force pushed, though I still see that GitHub appears to be waiting for the status of the e2e-tests job, which no longer exists in the CircleCI file...

@jrconlin
Copy link
Member

I removed the requirement for the missing e2e-test from the master branch protection settings.

I am curious why they're missing.

@ethowitz
Copy link
Contributor Author

@jrconlin The original e2e-tests job was removed and replaced with mysql-e2e-tests and spanner-e2e-tests -- should I update the branch protection to reflect that?

@jrconlin
Copy link
Member

Ah, yeah. Sorry, forgot about that.

@ethowitz
Copy link
Contributor Author

@jrconlin Ok I updated the branch protection settings, and that seems to have fixed the issue. I think this needs another approval before I can merge

@ethowitz ethowitz merged commit b5b7e57 into master Jan 10, 2023
@ethowitz ethowitz deleted the refactor/add-db-crates branch January 10, 2023 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add database crates
3 participants