-
Notifications
You must be signed in to change notification settings - Fork 30
feat: Use cryptography based JWT parser for increased speed #854
Conversation
Codecov Report
@@ Coverage Diff @@
## master #854 +/- ##
=====================================
Coverage 100% 100%
=====================================
Files 48 49 +1
Lines 9529 9537 +8
=====================================
+ Hits 9529 9537 +8
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand the crypto enough to comment on that, added some Python style comments.
autopush/jwt.py
Outdated
from pyasn1.codec.der.encoder import encode | ||
|
||
|
||
"""Why hand roll? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Module docs go above the imports
autopush/jwt.py
Outdated
""" | ||
|
||
|
||
def repad(string): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't there another padding altering function somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's in utils, which includes jwt.py, leading to a dependency loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Work around that by moving the function here and changing that file to import it?
autopush/jwt.py
Outdated
return string | ||
|
||
|
||
class VerifyJWT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should inherit from (object) still (in Py3 that's implicit).
autopush/jwt.py
Outdated
|
||
class VerifyJWT: | ||
|
||
def __init__(self): # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is an empty init needed?
autopush/jwt.py
Outdated
def __init__(self): # pragma: no cover | ||
pass | ||
|
||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not @staticmethod
? Since these don't seem to use cls.
autopush/jwt.py
Outdated
new_sig = base64.urlsafe_b64encode(encoded) | ||
return "{}.{}".format(payload, new_sig) | ||
|
||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
staticmethod?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋 new mozillian here, also one of the pyca/cryptography devs. @bbangert asked me to take a look.
autopush/jwt.py
Outdated
).setComponents( | ||
univ.Integer(int(binascii.hexlify(sig[:32]), 16)), | ||
univ.Integer(int(binascii.hexlify(sig[32:]), 16)), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cryptography has a function for taking r
and s
and generating the DER encoded body: cryptography.hazmat.primitives.asymmetric.utils.encode_dss_signature
(path from memory, hopefully correct!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(In addition to being less code for you, it uses a different python asn.1 lib which is faster, so that'll be helpful if perf is a concern)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absolutely! Thanks!
autopush/jwt.py
Outdated
# comparing content values. If the signatures start failing for | ||
# some unknown reason in the future, decode the signature and | ||
# make sure it matches how we're reconstructing it. | ||
verifier = pkey.verifier( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recent cryptography's have a helper pkey.verify()
method, which will save you this comment and a few lines of code :-) https://cryptography.io/en/latest/hazmat/primitives/asymmetric/ec/#cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.verify
autopush/jwt.py
Outdated
# It will be captured externally. | ||
verifier.verify() | ||
return json.loads( | ||
base64.urlsafe_b64decode(repad(sig_material.split('.')[1]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably factor out the base64.urlsafe_b64decode(repad(signature.encode()))
above into a message
var, to safe teh re-decoding.
autopush/jwt.py
Outdated
) | ||
encoded = encode(ds) | ||
new_sig = base64.urlsafe_b64encode(encoded) | ||
return "{}.{}".format(payload, new_sig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the caller immediately calls split()
on this, is there a reason not to return a tuple?
This saves you the splitting, and removes the need to to base64-encode (and then decode) the signature.
2b9da7b
to
b9340d1
Compare
Usage of the cryptography API here LGTM. I don't know anything else about the autopush codebase to say further :-) |
3da65a4
to
033e4e8
Compare
autopush/jwt.py
Outdated
@@ -0,0 +1,84 @@ | |||
"""Why hand roll? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will show up as the module's title in the API docs.... oh, maybe we should add a .rst so that there'll be docs generated for this module?
autopush/jwt.py
Outdated
|
||
import base64 | ||
import binascii | ||
import json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: line-break between module imports from std lib to 3rd party.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
w/ JOSEError gone you can also move the jose and ecdsa dep from requirements to test-requirements
also there's now at least a couple related bogus mocks in the tests: test_post_bad_jwt in test_endpoint, test_invalid_encryption_jwt in test_web_validation
autopush/jwt.py
Outdated
return string | ||
|
||
|
||
class VerifyJWT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you're keeping this as a class to flesh out more later (like the map of algs to signing algorithms, similar to jose's classes)? otherwise these are already nice and isolated and could live as module level functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partly. Mostly I wanted to keep the basic concept the same so I didn't have to go do a massive code rewrite.
autopush/jwt.py
Outdated
ec.ECDSA(hashes.SHA256())) | ||
verifier.update(sig_material.encode()) | ||
# This will raise an InvalidSignature exception if failure. | ||
# It will be captured externally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're not catching it yet --
I think we can overhaul the JOSEError and AssertionError except clause in webpush now for InvalidSignature (IIRC ecdsa had all those asserts we had to catch?)
This func can definitely trigger ValueErrors (from_encoded_point, maybe the rsplits?) which is ok I guess, we're already catching them in that clause.
jose attempts to catch some ValueErrors and convert them to JOSEErrors -- but maybe it wasn't perfect at it because we still had to catch them.
Looking at jose.jws._load, it's also careful about TypeErrors/binascii.Errors, might we trigger any of those too?
autopush/jwt.py
Outdated
univ.Integer(int(binascii.hexlify(sig[:32]), 16)), | ||
univ.Integer(int(binascii.hexlify(sig[32:]), 16)), | ||
) | ||
encoded = encode(ds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this ever raise a PyAsn1Error? (OUTDATED)
autopush/jwt.py
Outdated
# some unknown reason in the future, decode the signature and | ||
# make sure it matches how we're reconstructing it. | ||
verifier = pkey.verifier( | ||
base64.urlsafe_b64decode(repad(signature.encode())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you really want encode() w/ default encoding here and line 97? Should explicitly specify ascii then (OUDATED)
72f8966
to
3e0771b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_endpoint test_post_bad_jwt still bogusly monkeypatches jose.jws.verify
@@ -321,7 +321,7 @@ def validate_auth(self, d): | |||
|
|||
try: | |||
jwt = extract_jwt(token, public_key) | |||
except (AssertionError, ValueError, JOSEError): | |||
except (ValueError, InvalidSignature, Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception here definitely not needed
autopush/jwt.py
Outdated
return payload, encoded | ||
|
||
@staticmethod | ||
def decode(token, key=None, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key should be required and don't need *args/**kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I had those in because I was trying to make the function a drop in replacement and was being paranoid.
autopush/jwt.py
Outdated
return json.loads( | ||
base64.urlsafe_b64decode( | ||
repad(sig_material.split('.')[1]).encode('utf8'))) | ||
except Exception as ex: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to restrict this somewhat but it's a damn chore tracking all the potential exceptions. Maybe we should log totally unexpected ones to sentry?
I'll list the ones I know about, do with it what you want =]
splits: ValueError
base64: (TypeError, binascii.Error)
encode_dss_signature (underneath is pyasn stuff): not sure put probably that PyAsn1Error or whatever it was
verify: InvalidSignature
json.loads: ValueError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
encode_dss_signature
shouldn't ever throw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(if it is, let us know, we need to either fix a bug or update teh docs: https://cryptography.io/en/latest/hazmat/primitives/asymmetric/utils/#cryptography.hazmat.primitives.asymmetric.utils.encode_dss_signature)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alex, encode_dss_signature shouldn't throw, but hexlify absolutely will. Granted, it's caught by the parent function, but it's also still in the critical path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pjenvey I like your idea about having a end Exception that reports to sentry. That will let us spot errors, but still have the code behave properly.
Crap, missed pulling that test. Good catch, thanks! |
9729164
to
5f4b3a9
Compare
autopush/jwt.py
Outdated
InvalidSignature): | ||
raise InvalidSignature() | ||
except Exception as ex: # pragma: no cover | ||
Logger().error("Unexpected error processing JWT: ", repr(ex)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you want failure("Unexpected error processing JWT"), that'll grab exc_info and send to sentry
autopush/jwt.py
Outdated
base64.urlsafe_b64decode( | ||
repad(sig_material.split('.')[1]).encode('utf8'))) | ||
except (ValueError, TypeError, binascii.Error, PyAsn1Error, | ||
InvalidSignature): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick don't need InvalidSignature here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, probably don't need PyAsn1Error either according to alex
One more related thing (I notice as I'm looking at sentry): https://sentry.prod.mozaws.net/operations/autopush-prod/issues/383686/ let's start catching TypeError/binascii.Error in extract_jwt's base64 call |
autopush/jwt.py
Outdated
:return tuple containing the signature material and signature | ||
|
||
""" | ||
payload, asig = auth.rsplit(".", 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing: I think we can just auth.encode('utf-8') here to begin with: then the other 2 later encode calls aren't needed. jose.jws _load does similar. (then you can even utilize our utils.base64_urldecode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes on calling .encode('utf8')
no on using utils.base64_urldecode
. the latter would set up a circular dependency.
Switches to using python cryptography library instead of jose/jwt (which relied on python ecdsa library). Also discovered lots of fun discrepencies between how ecdsa and libssl sign things. closes #785
Switches to using python cryptography library instead of jose/jwt
(which relied on python ecdsa library). Also discovered lots of fun
discrepencies between how ecdsa and libssl sign things.
closes #785