-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: add database crates #1407
Conversation
#[cfg(all(feature = "mysql", feature = "spanner"))] | ||
compile_error!("only one of the \"mysql\" and \"spanner\" features can be enabled at a time"); | ||
|
||
#[cfg(not(any(feature = "mysql", feature = "spanner")))] | ||
compile_error!("exactly one of the \"mysql\" and \"spanner\" features must be enabled"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some notes on mutually exclusive features: https://doc.rust-lang.org/cargo/reference/features.html#mutually-exclusive-features
To me, this seemed like a good use case for them, given that we specifically want to omit dependencies we don't need when compiling with the MySQL backend. An alternative would be to have one database backend be the "default" and then have a single feature to override that default to use the other backend. I opted not to go with that route because I don't think one database backend is obviously the "default", since we use Spanner in prod but MySQL locally (for the most part)
|
||
#[async_trait] | ||
pub trait DbPoolTrait: Sync + Send + Debug + GetPoolState { | ||
type Error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to introduce an associated type here because implementers of this trait have their own DbError
types
#[derive(Debug, Error)] | ||
enum DbErrorKind { | ||
#[error("{}", _0)] | ||
Common(SyncstorageDbError), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I struggled a bit with the naming here. Common
is supposed to be a Syncstorage database error that is not specific to the database backend
#[error("Connection expired")] | ||
Expired, | ||
|
||
#[error("A database error occurred: {}", _0)] | ||
Grpc(#[from] grpcio::Error), | ||
|
||
#[error("Database integrity error: {}", _0)] | ||
Integrity(String), | ||
|
||
#[error("Spanner data load too large: {}", _0)] | ||
TooLarge(String), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These could have been pulled into a SpannerDbError
type akin to the MysqlDbError
type, but the reason I made the MysqlDbError
type was because it's also used by Tokenserver
This can't be merged until #1276 is merged |
ca8bdc9
to
6b75585
Compare
7399bb2
to
af9875b
Compare
6b75585
to
db7d3f5
Compare
af9875b
to
65497c3
Compare
65497c3
to
21d0c30
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, aside from the Makefile pointing to your directory.
actix-cors = "0.5" | ||
actix-service = "1.0.6" | ||
async-trait = "0.1.40" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the package stuff, we could also boost a bunch of the common dependency crates to up to the parent crates.
https://doc.rust-lang.org/cargo/reference/workspaces.html#the-workspacedependencies-table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I created a separate ticket to do this: https://mozilla-hub.atlassian.net/browse/SYNC-3549
@@ -4,5 +4,16 @@ version = "0.13.0" | |||
edition = "2021" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it makes sense to use some of the more modern crate management stuff here:
https://doc.rust-lang.org/cargo/reference/workspaces.html#the-workspacepackage-table
That way we don't have to muck with updating a dozen versions whenever something changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, I created a ticket to track this: https://mozilla-hub.atlassian.net/browse/SYNC-3550
@@ -98,7 +98,7 @@ pub async fn append_async(db: &SpannerDb, params: params::AppendToBatch) -> Resu | |||
if !exists { | |||
// NOTE: db tests expects this but it doesn't seem necessary w/ the | |||
// handler validating the batch before appends | |||
Err(DbErrorKind::BatchNotFound)? | |||
return Err(DbError::batch_not_found()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that adding the ?
does the same thing, but I agree that this is more readable, particularly when fast reading the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Wow on the cargo chef
changes, thanks for adding those!
Of course on a huge PR like this I really only have one thing to bikeshed, and it's mostly just food for thought: the Trait
suffix for Db/PoolTrait
feels odd -- I don't think I've ever seen it used like that in Rust -- and a bit superfluous when there's a dyn
beside it (which is most of the time when referencing traits). One alternative could be to keep the trait names as is and rename syncstorage_db::Db
and DbPool
to DbImpl/DbPoolImpl
(I saw a complex trait setup in servo that does this) -- and maybe that Impl
suffix kinda helps mark that the type's swapped via the feature flag and backed by a trait?
match &kind { | ||
DbErrorKind::Common(dbe) => Self { | ||
status: dbe.status, | ||
backtrace: Box::new(dbe.backtrace.clone()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried about the amount of cloning we're doing of backtraces. At least in mysql, we'd clone it from the Kind
and then ApiError
eventually clones it again, all for propagating one error.
However the current 0.13.1 on prod isn't far from this, I think it's generating 2 different stack traces for a DbError
(in DbError
and then another one in ApiError
) so maybe this does not matter much.
The other problem with it is the new std::backtrace::Backtrace
doesn't implement Clone
(which will hopefully replace all uses of the backtrace::Backtrace
at some point in the future).
Maybe we can just log an issue to look at having backtrace
methods that would return them from the inner types to avoid the cloning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the backtrace
method approach sounds good 👍🏻 I created a ticket here
@pjenvey I like your idea about removing the |
@@ -16,9 +19,9 @@ lazy_static! { | |||
} | |||
|
|||
#[tokio::test] | |||
async fn bso_successfully_updates_single_values() -> Result<()> { | |||
async fn bso_successfully_updates_single_values() -> Result<(), DbError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit surprised you didn't create a type for this, just to save keystrokes, but it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were a few different Result
/DbResult
/DbFuture
aliases, and I thought it would make things easier to follow if we were more explicit with our return types but 🤷🏻 perhaps that's just me
278b88a
to
e459814
Compare
CI was stuck so I did a |
I removed the requirement for the missing e2e-test from the master branch protection settings. I am curious why they're missing. |
@jrconlin The original |
Ah, yeah. Sorry, forgot about that. |
@jrconlin Ok I updated the branch protection settings, and that seems to have fixed the issue. I think this needs another approval before I can merge |
Overview
This PR breaks up syncstorage's and tokenserver's database functionality into a number of new crates:
syncserver
crateI'd recommend generating the docs using
cargo doc --no-deps --workspace
to view the public interfaces for each crate in the workspace. I tried to keep the public interfaces for each crate as simple as possible.Crate Relationships
It might be useful to visualize how the new crates relate to one another (I generated this using
cargo depgraph
):The idea is that
syncserver
should not need to know any details about the underlying database backend.syncstorage-db
acts as an interface to the database that reveals as little as possible about what's happening under the hood.syncstorage-db
has two features (mysql
andspanner
) that enable the MySQL and Spanner backends, respectively.cargo
featuresI added two
cargo
features to the top-levelsyncserver
crate:mysql
andspanner
. These features represent the two syncstorage database backends. This was an important addition because the Spanner backend pulls inprotobuf
andgrpcio
, which both contribute very substantially to the time it takes to build syncstorage. Giving developers the ability to omit the code related to Spanner when developing locally substantially improves compile times. By my measurements, compiling with thespanner
feature takes about 4.5 minutes, whereas compiling with themysql
feature takes only about 1.5 minutes.Adding these
cargo
features also reduces the size of the compiled binary, since we are omitting substantial amounts of code we never use.Error Handling
The toughest part about this work was figuring out how to handle errors. Originally, we had a single
DbError
which encapsulated knowledge of both the Spanner and MySQL backends. Decoupling the Spanner and MySQL backend meant that each needed to have its own error type that included only knowledge of its own backend. To resolve this, I defined a number of different error types in different places and nested them usingenum
s.MysqlError
: An error representing low-level, MySQL-specific errors (e.g. connection errors, migration errors, errors that arise during queries, etc.). Basically, this wraps a fewdiesel
errors. This type is shared by the syncstorage MySQL backend and the Tokenserver MySQL backendSyncstorageDbError
: An error that represents syncstorage-specific errors that arise during database queries. It represents things likeCollectionNotFound
,BsoNotFound
, quota errors, batch errors, etc. This type doesn't wrap any lower-level errors, and the errors are not specific to MySQL or Spanner.DbError
: This is a top-level database error for the MySQL backend. It encapsulates bothMysqlError
andSyncstorageDbError
, representing any error that may occur when processing a MySQL query for syncstorage.DbError
: This is a top-level database error for the Spanner backend. It encapsulatesSyncstorageDbError
and lower-level Spanner errors (there was no need for aSpannerError
type akin to theMysqlError
type, since the Tokenserver backend doesn't use Spanner). It represents any error that may occur when processing a Spanner query for syncstorage.DbError
: This is a top-level database error for tokenserver. It encapsulatesMysqlError
and any other internal errors that may occur when processing a MySQL query for tokenserver.I also made the
*ErrorKind
types private. This was a more opinionated decision on my part, but IMO, it simplifies the public APIs of the error types, making them easier to understand and use by clients. I include a number of helper methods on the base error types to allow clients to access information hidden in the*ErrorKind
types.Use of
debug_assertions
Unfortunately, it's not possible to pass through items defined conditionally under the
test
feature to other crates (i.e. if something is defined with#[cfg(test)]
in a crate, it won't be rendered visible to parent crates). There are numerous testing utilities that we define under thetest
feature, and I thought it would be wrong to include them in release builds. To remedy this, I define testing utilities with thedebug_assertions
feature instead, which only compiles the testing utilities in debug builds. This feels a little inelegant, but I thought it would be preferable to including testing utilities in release builds or defining our own newtesting
feature, which would further complicate compiling syncserver.CI
This PR introduces new cargo features (one for the Spanner backend and one for the MySQL backend), which are mutually exclusive. This makes it impossible to build a single Docker image against which we can run the MySQL and Spanner e2e tests. Instead, we need to build two separate Docker images: one compiled with the MySQL database backend and one compiled with the Spanner database backend. Our Docker image builds are quite lengthy, so I introduced
cargo chef
to facilitate the caching of build dependencies between CI runs. This dramatically reduces build times, since only our own crates need to be recompiled with code changes -- the dependencies are redownloaded and recompiled only if our Cargo.tomls change.Additionally, I configured the MySQL and Spanner images/e2e tests to run in parallel, further improving build times. We might want to consider using a larger CircleCI resource class to allow
cargo
to build more crates in parallel and to improve single-threaded CPU performance.Testing
Given that no functionality should be changing, ensuring compilation and that the unit and integration tests pass should be sufficient. We should definitely run through the end-to-end QA tests on stage before we deploy this to production.
Issue(s)
Closes #1277