feat: add BrowserID support for Tokenserver #1216
Conversation
Makefile
Outdated
@@ -36,10 +36,10 @@ docker_stop_spanner:
	docker-compose -f docker-compose.spanner.yaml down

run:
-	RUST_LOG=debug RUST_BACKTRACE=full cargo run --features tokenserver_test_mode -- --config config/local.toml
+	RUST_LOG=debug RUST_BACKTRACE=full cargo run --features grpcio/openssl -- --config config/local.toml
The build was failing on my machine because of a conflict between openssl and boringssl. This resolves the issue (and I see that there's a comment relating to this in Cargo.toml). Is it worth leaving this here to prevent issues in the future?
Yes. Along with a comment explaining why you're including the feature.
(Remember, future folks in a hurry will not read everything, so repeating important points is very useful)
Do you have output of this build failure?
I think this ends up disabling boringssl entirely in grpcio. We avoided doing this in the past, but I can't track down why (possibly it wasn't even an option when we originally built syncstorage). There are a couple of concerns about doing this, though:
- we should test it extensively on stage before it rolls out further
- I think it should be the default so all the unit/integration tests are running against this option
This specifically started happening when I added `reqwest` as a new dependency. The first error I ran into was when I tried to create a new reqwest client -- I got this very unhelpful runtime error:

reqwest::Error { kind: Builder, source: Normal(ErrorStack([])) }

I tried a series of random things until I eventually added the `native-tls-vendored` feature to see if vendoring the TLS library would solve my problem. Then I got a series of linker errors suggesting a conflict between openssl and boringssl:
= note: /usr/bin/ld: /home/edonowitz/syncstorage-rs/target/debug/deps/libgrpcio_sys-f6372abe3c9ae8ee.rlib(ssl_lib.cc.o): in function `SSL_free':
/home/edonowitz/.cargo/registry/src/github.com-1ecc6299db9ec823/boringssl-src-0.3.0+688fc5c/boringssl/src/ssl/ssl_lib.cc:744: multiple definition of `SSL_free'; /home/edonowitz/syncstorage-rs/target/debug/deps/libopenssl_sys-917c211f838ee9d5.rlib(ssl_lib.o):/home/edonowitz/syncstorage-rs/target/debug/build/openssl-sys-215daf9bb4881309/out/openssl-build/build/src/ssl/ssl_lib.c:1147: first defined here
I was able to resolve the issue by removing the `native-tls-vendored` feature and adding `--features grpcio/openssl` when compiling, which I took to mean that the conflict between boringssl and openssl was, in fact, the culprit.
Finally tracked down the issue I was thinking of: syncstorage-rs#766 (CC: @jrconlin).
Basically: we enabled openssl narrowed to Ubuntu only. Due to a cargo issue, I believe it became enabled on more platforms, such as the Debian image used in our Docker builds, resulting in "Corruption detected" errors on stage.
This grpc.io issue suggests it could still be a problem. We could try enabling it again and see if the issue persists. Like I mentioned earlier, we would probably want it on by default in Cargo.toml so all tests run against it.
Another potential alternative we could try is using `rustls` for reqwest instead.
@@ -28,6 +28,13 @@ services:
    MYSQL_USER: test
    MYSQL_PASSWORD: test

+  mock-fxa-server:
Tokenserver's "test mode" stubbed out the main token verifiers with fake ones, preventing us from testing the verifiers themselves with integration tests. I removed the test mode in favor of adding a local server that mocks responses from FxA. This way, we don't need to stub out the verifiers in the Rust code.
| `tokenserver.fxa_email_domain` | `"api.accounts.firefox.com"` | the email domain used to construct the FxA email address from the user's FxA UID |
| `tokenserver.fxa_oauth_server_url` | `None` | the URL of the FxA OAuth server to be used to verify the user's OAuth tokens |
| `tokenserver.test_mode_enabled` | `false` | whether to enable Tokenserver's [test mode](#test-mode) |

You can find example settings for Tokenserver in [config/local.example.toml](../../config/local.example.toml). The available settings are described in doc comments [here](../../src/tokenserver/settings.rs).
I figured it made sense not to have to maintain documentation in two places.
// If FxA responds with an invalid response body, report a 503 to the client
let response_body = response
    .json::<VerifyResponse>()
Due to serde and Rust's type system, there is some deviation between how the old and new Tokenserver handle a faulty FxA response. In certain situations (e.g. an `fxa-generation` claim that isn't an integer), the old Tokenserver returns a 401, but in others (e.g. FxA responding with malformed JSON), it returns a 503. I would have had to write a pretty lengthy custom `Deserializer` to handle these cases the same way, and I figured it would be better to handle all of them as 503 errors anyway, since they are errors internal to Sync.
#[derive(Deserialize, Serialize)]
#[serde(tag = "status", rename_all = "lowercase")]
enum VerifyResponse {
    Okay {
The old Tokenserver manually checks whether the `status` field is set to "okay", but I am relying on serde's support for internally-tagged enums to express this check in the type system. A status other than "okay" or "failure" results in a 503 error (which differs from the old Tokenserver, which would return a 401), but the structure of error vs. success responses is well-documented, and it seemed reasonable to me to implement it this way. I think it should be considered an internal error if FxA sends us an errant status.
}

// Approach inspired by: https://github.com/serde-rs/serde/issues/984#issuecomment-314143738
fn strict_deserialize<'de, T, D>(deserializer: D) -> Result<Option<T>, D::Error>
I use a custom deserializer here to differentiate between missing fields and null fields, since these cases are handled differently.
Wouldn't a missing field be an error?
These fields aren't mandatory, and I think older clients may not send them. The legacy Tokenserver is built to handle situations where the fields aren't sent: https://github.com/mozilla-services/tokenserver/blob/master/tokenserver/views.py#L355-L361
Ok, I think I understand the problem. My apologies if I'm stating the obvious.
All of these values are optional. If they are present, they can either be `Null` or a value. There is handling logic present for all of these cases.
As I understand it, serde_json translates `Some(())` as equivalent to `Null`. The problem is that once the value has been deserialized, that bit of information is potentially lost, since typing doesn't have the concept of `Null`. Your solution is to double-wrap the value and let the outside `Option` indicate whether the value is present or not, with the inner `Option` indicating whether the value is set or `Null`. The only other option would be to keep the values as `serde_json::Value` and then do `is_null()` / `as_*()` on demand.
I don't really think there's a great solution here, but I do think we absolutely need to document this so that folks later understand what's going on.
Yup, that's it. I'll definitely add some more comments to make it more obvious what's going on 👍🏻
src/tokenserver/auth/oauth.rs
Outdated
// The PyFxA OAuth client does not offer a way to set a request timeout, so we set one here
// by timing out the future if the verification process blocks this thread for longer
// than the specified number of seconds.
time::timeout(Duration::new(self.timeout, 0), fut)
The old Tokenserver sets a `timeout` attribute on the OAuth client object, but I don't see such an attribute referenced in the code: https://github.com/mozilla/PyFxA/blob/main/fxa/oauth.py#L24
/// Verifies a BrowserID assertion. Returns `VerifyOutput` for valid assertions and a
/// `TokenserverError` for invalid assertions.
async fn verify(&self, assertion: String) -> Result<VerifyOutput, TokenserverError> {
This is comparable to this method on the old Tokenserver. Note that the Python function throws a number of different exceptions, which are turned into errors reported to the client here
@@ -1080,7 +1080,7 @@ mod tests {
    service_id,
    email: email1.to_owned(),
    generation: 1,
    client_state: "616161".to_owned(),
The unit and integration tests previously set the client state equal to "aaa" (which is not valid hex), even though the client state is stored in the database as hex. I updated the client state values throughout the unit and integration tests to improve consistency and make things easier to understand.
src/tokenserver/extractors.rs
Outdated
@@ -72,62 +69,73 @@ impl TokenserverRequest {
/// of the FxA server may not have been sending all the expected fields, and
/// that some clients do not report the `generation` timestamp.
fn validate(&self) -> Result<(), TokenserverError> {
    /// Returns true only if both arguments are Some and opta > optb.
    fn gt(opta: Option<i64>, optb: Option<i64>) -> bool {
Open to suggestions here... I wanted to just compare the options themselves using comparison operators, but the semantics of `Option`'s impl for `PartialOrd` didn't work for my use case: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2073cf8cbebae1806f073b04c3110dd2
I noticed you could also use `matches!` to reduce the nested scopes instead. I think I find the `matches!` version easier to understand, but that's debatable 😃 (cargo fmt definitely disagrees since it's a macro).
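For illustration, the `gt` helper from the diff rewritten with `matches!` as suggested (a sketch; not necessarily the formatting cargo fmt would settle on):

```rust
// Same semantics as the nested-match version: true only when both
// arguments are Some and the first compares greater than the second.
fn gt(opta: Option<i64>, optb: Option<i64>) -> bool {
    matches!((opta, optb), (Some(a), Some(b)) if a > b)
}
```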
I tried my hand at writing my own (very simple) `opt_cmp!` macro to accomplish something similar, but IMO it's a bit more readable. It lets you do something like `opt_cmp!(opta < optb)`, where `opta` and `optb` are both `Option`s. Let me know what you think!
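A hedged sketch of what such a macro might look like (the name comes from the comment above; the exact shape is an assumption):

```rust
// Hypothetical opt_cmp! macro: applies a comparison operator to two
// Option values, yielding false unless both sides are Some.
macro_rules! opt_cmp {
    ($a:ident $op:tt $b:ident) => {
        match ($a, $b) {
            (Some(lhs), Some(rhs)) => lhs $op rhs,
            _ => false,
        }
    };
}
```

Usage would then read close to a plain comparison, e.g. `opt_cmp!(opta > optb)`, with the None-handling tucked inside the macro.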
@@ -63,7 +66,7 @@ pub async fn get_tokenserver_result(
    id: token,
    key: derived_secret,
    uid: updates.uid,
-   api_endpoint: format!("{:}/1.5/{:}", req.user.node, req.user.uid),
+   api_endpoint: format!("{:}/1.5/{:}", req.user.node, updates.uid),
This was a bug :(
But it was caught by the integration tests :)
tools/integration_tests/run.py
Outdated
the_server_subprocess = start_server()
try:
    res = 0
    res |= run_live_functional_tests(TestStorage, sys.argv)
    res |= run_local_tests()
os.environ.setdefault("TOKENSERVER_AUTH_METHOD", "oauth")
We run all the local tests with both BrowserID and OAuth requests to be extra safe here.
@view_config(route_name='mock_oauth_verify', renderer='json')
def _mock_oauth_verify(request):
These mock verify endpoints work the same way as this mock server from the old Tokenserver's load tests. Instead of sending an opaque JWT token, the client sending the request to Tokenserver encodes the response it desires from FxA in the `Authorization` header (e.g. `Authorization: { "body": { .. }, "status": 200 }`).
Note that this server is stubbing out FxA, so the flow is:
User --> Tokenserver --> FxA
Tokenserver is oblivious to the fact that the client is sending it bogus tokens, since it just passes them along to the verification server.
I'll note that I see one exception reported from the mock verifier during the docker compose e2e tests -- is it expected?
That exception is from a test that sends it improperly-formatted JSON -- an exception is thrown by `json.loads` and a 5XX error is returned. Do you think we should catch that exception and return a response explicitly?
@@ -60,20 +71,65 @@ def tearDown(self):

    self.database.close()

-    def _forge_oauth_token(self, generation=None, sub='test', scope='scope'):
+    def _build_oauth_headers(self, generation=None, user='test',
`_build_oauth_headers()` and `_build_browserid_headers()` are responsible for building the bogus auth headers I referenced here
you've got a pattern where you call this, then add a bunch of extra headers. Have you thought about just setting headers, then updating from `kwargs` so you could add the extra headers to the function arguments?
That's way better :) My Python skills are rusty
pub struct RemoteVerifier {
    // Note that we do not need to use an Arc here, since Py is already a reference-counted
    // pointer
    inner: Py<PyAny>,
If `Py` didn't use an Arc internally, we'd need to use one here to maintain keep-alive connections between threads
src/tokenserver/auth/browserid.rs
Outdated
fxa_verifier_url: String,
// We need to use an Arc here because reqwest's blocking client doesn't use one internally,
// and we want to share keep-alive connections across threads
request_client: Arc<ReqwestClient>,
The async request client does use an Arc internally, but I specifically chose to use the blocking client. We don't want these BrowserID requests to FxA happening on Actix's worker threads; instead, I use `web::block` to invoke the requests on threads in the blocking threadpool. If I had used the async client, I would have had to wrap each request invocation with `web::block(futures::executor::block_on(|| { .. }))` instead of simply using `web::block`.
> We don't want these BrowserID requests to FxA happening on Actix's worker threads;

Why not? The `verify` method is async, so it seems we could use the async `reqwest` client -- `.await`ing it instead of the `block`/`block_on` combo.
Ah I think I had myself confused. Since the requests are async, it's okay if they run on the worker threads since they wouldn't be blocking them. Thanks for the catch
will continue review tomorrow, but need to reboot and don't want to lose these.
Dockerfile
Outdated
-cargo install --features tokenserver_test_mode --path . --locked --root /app && \
-cargo install --features tokenserver_test_mode --path . --bin purge_ttl --locked --root /app
+cargo install --path . --locked --root /app --features grpcio/openssl && \
+cargo install --path . --bin purge_ttl --locked --root /app --features grpcio/openssl
half wondering if pairing up the common args might make the difference stand out a bit more.
e.g.
cargo install --path . --locked --root /app --features grpcio/openssl && \
cargo install --path . --locked --root /app --features grpcio/openssl --bin purge_ttl
src/tokenserver/auth/browserid.rs
Outdated
ReqwestClient::builder()
    .timeout(Duration::new(settings.fxa_browserid_request_timeout, 0))
    .build()
    .map_err(|_| Error::from(()))?,
I don't think we want to eat this error, do we?
If the request build failed, shouldn't we pass along the error that is reported rather than just "something went wrong 🤷🏻♀️"
Oof, I think I included this as a temporary measure while I was still working on the feature and forgot to clean it up. Thanks for the catch
Actually after looking a bit more closely, this is just the code that creates the request client on startup -- we're already returning (panicking with) a descriptive error here if we run into problems. Do you still think we'd benefit from returning something here?
@@ -226,6 +226,7 @@ jobs:
- run:
    name: Build Docker image
    command: docker build -t app:build .
+   no_output_timeout: 30m
Might want to fix the error on line 204. That should be a `- run:` instead of `= run:` (it's an old bug.)
    rename = "fxa-generation",
    deserialize_with = "strict_deserialize"
)]
generation: Option<Option<i64>>,
I'm confused why you're wrapping the value in a double `Option`, and it seems a bit confusing. What you're saying is that the value can either be `None` or `None` or `i64`, right?
It might be better to present this as `Result<Option<i64>>`? That would also keep us from potentially eating errors.
Related to this thread.
The second `Option` indicates whether the field was present or not in the JSON returned by FxA. I chose to use an `Option` here instead of a `Result` because it's not actually an error if one of these fields is missing in the response
still working my way through this one, sorry
src/tokenserver/auth/browserid.rs
Outdated
})?;

// If FxA responds with an HTTP status other than 200, report a 503 to the client
if response.status() != StatusCode::OK {
nit: You might be able to kill this block entirely by using `error_for_status`, then checking `e.is_status()` alongside the `e.is_connect()` check.
ooo nice catch!
src/tokenserver/auth/browserid.rs
Outdated
#[actix_rt::test]
async fn test_browserid_verifier_success() {
    let body = r#"{
No request for changes here, but I'll just note I like using serde's `json!` macro for these things (you would need to `to_string()` its result if you really need the text version). The downside however, as with any macro, is potentially added compile times.
docker-compose.e2e.spanner.yaml
Outdated
TOKENSERVER_HOST: http://localhost:8000
entrypoint: >
  /bin/sh -c "
-  sleep 28; pip3 install -r /app/tools/integration_tests/requirements.txt && python3 /app/tools/integration_tests/run.py 'http://localhost:8000#secret0'
+  python3 /app/tools/integration_tests/run.py 'http://localhost:8000#secret0'
Might this still need a sleep like e2e.mysql continues to have?
Ah yeah, my removing that was not intentional 😅
src/tokenserver/extractors.rs
Outdated
let device_id = auth_data
    .device_id
    .clone()
    .unwrap_or_else(|| "none".to_owned());
hash_device_id(&hashed_fxa_uid, &device_id, fxa_metrics_hash_secret)
nit: can save a clone here
Suggested change:
let device_id = auth_data.device_id.as_deref().unwrap_or("none");
hash_device_id(&hashed_fxa_uid, device_id, fxa_metrics_hash_secret)
src/tokenserver/auth/browserid.rs
Outdated
issuer: settings.fxa_browserid_issuer.clone(),
request_client: Arc::new(
    ReqwestClient::builder()
        .timeout(Duration::new(settings.fxa_browserid_request_timeout, 0))
Let's set a `connect_timeout` as well.
.map_err(|e| {
-    if e.is_connect() || e.is_status() {
+    if e.is_connect() {
I had to revert this because it wasn't handling 2XX statuses other than 200 correctly (the old Tokenserver returns a 503 to the client if FxA returns anything other than a 200, not just anything other than a 2XX)
src/tokenserver/auth/browserid.rs
Outdated
.timeout(Duration::new(settings.fxa_browserid_request_timeout, 0))
.connect_timeout(Duration::new(settings.fxa_browserid_connect_timeout, 0))
Suggested change:
.timeout(Duration::from_secs(settings.fxa_browserid_request_timeout))
.connect_timeout(Duration::from_secs(settings.fxa_browserid_connect_timeout))
src/tokenserver/auth/browserid.rs
Outdated
settings::Settings,
};

use core::time::Duration;
Nit: it does live in `core`, but since it's re-exported into `std` it's usually referenced from there instead.
Suggested change:
use std::time::Duration;
src/tokenserver/extractors.rs
Outdated
match token {
    Token::BrowserIdAssertion(assertion) => {
        let verify_output = state.browserid_verifier.clone().verify(assertion).await?;
is this clone needed?
Suggested change:
let verify_output = state.browserid_verifier.verify(assertion).await?;
Ah no, that was from a previous iteration when I was passing the verifier into another thread. Thanks for the catch!
This looks pretty good 👍 with a big caveat on switching to openssl (see my above comment).
@pjenvey Are we reasonably confident that rustls is safe? If so, it's probably easiest just to use that.

In the past foxsec has recommended sticking with more proven options like openssl/necko. I don't recall if they explicitly vetoed rustls usage, though; we could ping them about its usage here. Plus it's had an audit and a significant effort put into it recently. (If this is even an option; I noted in Slack you had issues trying rustls previously w/ reqwest.)
thanks Ethan!
Description
- Moved the `support.rs` file into a new `tokenserver::auth` module
- The `X-Client-State` header contains the hex-encoded client state bytes, and the `X-KeyID` header contains the base64-encoded client state bytes

Testing
The new verifier has very comprehensive unit and integration test coverage.

Issue(s)
Closes #1215