Skip to content
This repository has been archived by the owner on Mar 5, 2020. It is now read-only.

"Invalid state" after resetting password #735

Closed
secretrobotron opened this issue Apr 20, 2015 · 21 comments
Closed

"Invalid state" after resetting password #735

secretrobotron opened this issue Apr 20, 2015 · 21 comments
Assignees
Labels

Comments

@secretrobotron
Copy link

@edrushka and I just ran into an "Invalid State" error on teach-api.herokuapp.com after resetting her password and trying to log in with new credentials.

@toolness
Copy link
Contributor

Is that reproducible anytime anyone resets their password?

@hannahkane
Copy link

I was just able to successfully reset my password and log in with it.

@secretrobotron
Copy link
Author

I'l try right meow.

@secretrobotron
Copy link
Author

Yeah, can't seem to repro atm. No logs, @toolness ?

@toolness
Copy link
Contributor

Nope, "invalid state" isn't actually an internal server error--it just means that the state parameter passed to the OAuth2 callback doesn't match what's stored in the user's server-side session.

It could be caused by any number of things external to the server--it just means that somewhere in between the time that teach-api forwarded the user to id.webmaker.org's OAuth authorize endpoint and the time that id.webmaker.org sent the user back to teach-api, the state argument passed to the callback wasn't what it was expected to be.

That said, there's some things we could do, none of which are mutually exclusive:

  1. Add some metrics to track how often this is happening. If it's happening a lot, then we should really look into this more.
  2. Add some fault tolerance so that users are presented with something a bit more helpful than an inscrutable error message on a completely unfamiliar website. For instance, directing the user back to the teach site with some kind of message like "sorry, an error occurred when trying to log you in; please try again." could be more helpful.
  3. Instead of storing state as an opaque token that is looked up in the user's server-side session, we could have it be a meaningful value signed by teach-api. This could reduce the possibility of certain kinds of errors occurring, though it could also make login less resilient to certain types of attacks if we don't do it properly.

@secretrobotron
Copy link
Author

1 and 2 sound like good things to do right away. Obviously, this problem shouldn't happen (and is, in fact, hard to reproduce), so some metrics will make it easier to see when and how it happens. And 👍 much better to direct user back to teach with error message.

@toolness
Copy link
Contributor

@jbuck just mentioned to me on IRC that @edrushka may have had multiple browser tabs open. If she initiated login in tab A without completing it, then initiated login in tab B without completing it, and then completed login in tab A, the "invalid state" message would appear, because her session cookie would expect the state value set by tab B but receive the state value set by tab A.

I think my solution (3) would fix this, but we'd have to be very careful to avoid creating replay attack vulnerabilities and such. Alternatively, since it's probably rare, we could potentially just live with the error, using the metrics provided by (1) to make sure it's not too common and the UX provided by (2) to make sure we don't totally lose users.

@secretrobotron
Copy link
Author

👍 good idea

@edrushka
Copy link

I am innocent! I only had one browser tab open! However, Bobby and I were mucking around quite a bit trying to break things, so it's unlikely any will reproduce my exact steps.

@toolness
Copy link
Contributor

Hahaha ok, thanks for the explanation erika :)

So to summarize, we'll go with the following solution, which I will create new issue(s) for.

When the "invalid state" error occurs, we will:

  1. Log this in some useful way, to keep an eye on how often it's happening to people.
  2. Redirect the user back to the teach site, passing it some information (e.g. ?loginerror=yup or even a separate /login-error/ page) informing it that login failed, at which point it will inform the user in a friendly way.

If our metrics from (1) indicate that the error is happening more often than we'd like, we'll look into more rigorous solutions.

@secretrobotron
Copy link
Author

👍 where should we log that? Could use GA, but not sure if that's abusive. @adamlofting ?

@toolness
Copy link
Contributor

Yeah, that's a good question. I was thinking we could actually do the steps in reverse order, so that the logging would happen via GA on the "login error" page on the teach site. If that is abuse of GA though we can figure out something else... How are we doing back-end error logging these days?

@toolness
Copy link
Contributor

Ok, so in mozilla/teach-api@b2f21e2 I've made things at least a tiny bit more user friendly by adding a link back to the teach site and telling the user to try again:

2015-04-21_3-29-49

You can view that page "live" at https://teach-api.herokuapp.com/auth/oauth2/callback?code=invalid&state=invalid.

Obviously it's still suboptimal but I'm thinking for v1, we can just add GA to this page, and if we get lots of hits we can make the error more user-friendly (by redirecting directly to the teach site w/ an error message shown on it) or jump straight to option 3.

@adamlofting
Copy link
Contributor

That's a perfectly valid use of GA. 😄
We just need to choose if the error is 'more' a teach.m.o error, or 'more' a login.w.o error and use the corresponding GA account details to log the error there.

@toolness
Copy link
Contributor

Cool! I think it is 'more' a teach.m.o error, since the page would ideally be hosted on teach.m.o. (and we might move it there once we've implemented #585).

@toolness toolness added the v1.1 label Apr 22, 2015
@hannahkane hannahkane removed this from the Inspector Gadget milestone Apr 22, 2015
@toolness
Copy link
Contributor

Ok, since we're using newrelic for server-side monitoring, it was easier to just add the metrics tracking via newrelic instead of GA, which I did in mozilla/teach-api@c085f24.

@toolness
Copy link
Contributor

At this point there isn't really much to do aside from just watching the metrics to see if oauth2 handshake problems like this occur unreasonably often.

@adamlofting
Copy link
Contributor

That works for now, but we won't have NewRelic much longer as it's too expensive for us.

@toolness
Copy link
Contributor

Oh, um, I am just using the free plan. I guess it kind of sucks since it only keeps metrics around for 1 day though.

@hannahkane
Copy link

@toolness - Are we okay to close this ticket now?

@toolness
Copy link
Contributor

Yup I think so!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants