Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

feat: Use cryptography based JWT parser for increased speed #854

Merged
merged 1 commit into from
Apr 4, 2017

Conversation

jrconlin
Copy link
Member

Switches to using python cryptography library instead of jose/jwt
(which relied on python ecdsa library). Also discovered lots of fun
discrepencies between how ecdsa and libssl sign things.

closes #785

@codecov-io
Copy link

codecov-io commented Mar 28, 2017

Codecov Report

Merging #854 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master   #854   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          48     49    +1     
  Lines        9529   9537    +8     
=====================================
+ Hits         9529   9537    +8
Impacted Files Coverage Δ
autopush/tests/test_endpoint.py 100% <100%> (ø) ⬆️
autopush/jwt.py 100% <100%> (ø)
autopush/tests/test_web_validation.py 100% <100%> (ø) ⬆️
autopush/web/webpush.py 100% <100%> (ø) ⬆️
autopush/utils.py 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 76f772e...fe9b776. Read the comment docs.

Copy link
Member

@bbangert bbangert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the crypto enough to comment on that, added some Python style comments.

autopush/jwt.py Outdated
from pyasn1.codec.der.encoder import encode


"""Why hand roll?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Module docs go above the imports

autopush/jwt.py Outdated
"""


def repad(string):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't there another padding altering function somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in utils, which includes jwt.py, leading to a dependency loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Work around that by moving the function here and changing that file to import it?

autopush/jwt.py Outdated
return string


class VerifyJWT:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should inherit from (object) still (in Py3 that's implicit).

autopush/jwt.py Outdated

class VerifyJWT:

def __init__(self): # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is an empty init needed?

autopush/jwt.py Outdated
def __init__(self): # pragma: no cover
pass

@classmethod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not @staticmethod? Since these don't seem to use cls.

autopush/jwt.py Outdated
new_sig = base64.urlsafe_b64encode(encoded)
return "{}.{}".format(payload, new_sig)

@classmethod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

staticmethod?

Copy link
Contributor

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 new mozillian here, also one of the pyca/cryptography devs. @bbangert asked me to take a look.

autopush/jwt.py Outdated
).setComponents(
univ.Integer(int(binascii.hexlify(sig[:32]), 16)),
univ.Integer(int(binascii.hexlify(sig[32:]), 16)),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cryptography has a function for taking r and s and generating the DER encoded body: cryptography.hazmat.primitives.asymmetric.utils.encode_dss_signature (path from memory, hopefully correct!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In addition to being less code for you, it uses a different python asn.1 lib which is faster, so that'll be helpful if perf is a concern)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absolutely! Thanks!

autopush/jwt.py Outdated
# comparing content values. If the signatures start failing for
# some unknown reason in the future, decode the signature and
# make sure it matches how we're reconstructing it.
verifier = pkey.verifier(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recent cryptography's have a helper pkey.verify() method, which will save you this comment and a few lines of code :-) https://cryptography.io/en/latest/hazmat/primitives/asymmetric/ec/#cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.verify

autopush/jwt.py Outdated
# It will be captured externally.
verifier.verify()
return json.loads(
base64.urlsafe_b64decode(repad(sig_material.split('.')[1])))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably factor out the base64.urlsafe_b64decode(repad(signature.encode())) above into a message var, to safe teh re-decoding.

autopush/jwt.py Outdated
)
encoded = encode(ds)
new_sig = base64.urlsafe_b64encode(encoded)
return "{}.{}".format(payload, new_sig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the caller immediately calls split() on this, is there a reason not to return a tuple?

This saves you the splitting, and removes the need to to base64-encode (and then decode) the signature.

@jrconlin jrconlin force-pushed the feat/785 branch 2 times, most recently from 2b9da7b to b9340d1 Compare March 29, 2017 20:34
@alex
Copy link
Contributor

alex commented Mar 29, 2017

Usage of the cryptography API here LGTM. I don't know anything else about the autopush codebase to say further :-)

@jrconlin jrconlin force-pushed the feat/785 branch 2 times, most recently from 3da65a4 to 033e4e8 Compare March 29, 2017 21:47
autopush/jwt.py Outdated
@@ -0,0 +1,84 @@
"""Why hand roll?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will show up as the module's title in the API docs.... oh, maybe we should add a .rst so that there'll be docs generated for this module?

autopush/jwt.py Outdated

import base64
import binascii
import json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: line-break between module imports from std lib to 3rd party.

Copy link
Member

@pjenvey pjenvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w/ JOSEError gone you can also move the jose and ecdsa dep from requirements to test-requirements

also there's now at least a couple related bogus mocks in the tests: test_post_bad_jwt in test_endpoint, test_invalid_encryption_jwt in test_web_validation

autopush/jwt.py Outdated
return string


class VerifyJWT:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you're keeping this as a class to flesh out more later (like the map of algs to signing algorithms, similar to jose's classes)? otherwise these are already nice and isolated and could live as module level functions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partly. Mostly I wanted to keep the basic concept the same so I didn't have to go do a massive code rewrite.

autopush/jwt.py Outdated
ec.ECDSA(hashes.SHA256()))
verifier.update(sig_material.encode())
# This will raise an InvalidSignature exception if failure.
# It will be captured externally.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're not catching it yet --

I think we can overhaul the JOSEError and AssertionError except clause in webpush now for InvalidSignature (IIRC ecdsa had all those asserts we had to catch?)

This func can definitely trigger ValueErrors (from_encoded_point, maybe the rsplits?) which is ok I guess, we're already catching them in that clause.

jose attempts to catch some ValueErrors and convert them to JOSEErrors -- but maybe it wasn't perfect at it because we still had to catch them.

Looking at jose.jws._load, it's also careful about TypeErrors/binascii.Errors, might we trigger any of those too?

autopush/jwt.py Outdated
univ.Integer(int(binascii.hexlify(sig[:32]), 16)),
univ.Integer(int(binascii.hexlify(sig[32:]), 16)),
)
encoded = encode(ds)
Copy link
Member

@pjenvey pjenvey Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this ever raise a PyAsn1Error? (OUTDATED)

autopush/jwt.py Outdated
# some unknown reason in the future, decode the signature and
# make sure it matches how we're reconstructing it.
verifier = pkey.verifier(
base64.urlsafe_b64decode(repad(signature.encode())),
Copy link
Member

@pjenvey pjenvey Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you really want encode() w/ default encoding here and line 97? Should explicitly specify ascii then (OUDATED)

@jrconlin jrconlin force-pushed the feat/785 branch 4 times, most recently from 72f8966 to 3e0771b Compare March 29, 2017 23:31
Copy link
Member

@pjenvey pjenvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test_endpoint test_post_bad_jwt still bogusly monkeypatches jose.jws.verify

@@ -321,7 +321,7 @@ def validate_auth(self, d):

try:
jwt = extract_jwt(token, public_key)
except (AssertionError, ValueError, JOSEError):
except (ValueError, InvalidSignature, Exception):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception here definitely not needed

autopush/jwt.py Outdated
return payload, encoded

@staticmethod
def decode(token, key=None, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key should be required and don't need *args/**kwargs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I had those in because I was trying to make the function a drop in replacement and was being paranoid.

autopush/jwt.py Outdated
return json.loads(
base64.urlsafe_b64decode(
repad(sig_material.split('.')[1]).encode('utf8')))
except Exception as ex:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to restrict this somewhat but it's a damn chore tracking all the potential exceptions. Maybe we should log totally unexpected ones to sentry?

I'll list the ones I know about, do with it what you want =]

splits: ValueError
base64: (TypeError, binascii.Error)
encode_dss_signature (underneath is pyasn stuff): not sure put probably that PyAsn1Error or whatever it was
verify: InvalidSignature
json.loads: ValueError

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encode_dss_signature shouldn't ever throw

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex, encode_dss_signature shouldn't throw, but hexlify absolutely will. Granted, it's caught by the parent function, but it's also still in the critical path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pjenvey I like your idea about having a end Exception that reports to sentry. That will let us spot errors, but still have the code behave properly.

@jrconlin
Copy link
Member Author

Crap, missed pulling that test. Good catch, thanks!

@jrconlin jrconlin force-pushed the feat/785 branch 2 times, most recently from 9729164 to 5f4b3a9 Compare March 30, 2017 18:20
autopush/jwt.py Outdated
InvalidSignature):
raise InvalidSignature()
except Exception as ex: # pragma: no cover
Logger().error("Unexpected error processing JWT: ", repr(ex))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you want failure("Unexpected error processing JWT"), that'll grab exc_info and send to sentry

autopush/jwt.py Outdated
base64.urlsafe_b64decode(
repad(sig_material.split('.')[1]).encode('utf8')))
except (ValueError, TypeError, binascii.Error, PyAsn1Error,
InvalidSignature):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick don't need InvalidSignature here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, probably don't need PyAsn1Error either according to alex

@pjenvey
Copy link
Member

pjenvey commented Mar 30, 2017

One more related thing (I notice as I'm looking at sentry):

https://sentry.prod.mozaws.net/operations/autopush-prod/issues/383686/

let's start catching TypeError/binascii.Error in extract_jwt's base64 call

autopush/jwt.py Outdated
:return tuple containing the signature material and signature

"""
payload, asig = auth.rsplit(".", 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing: I think we can just auth.encode('utf-8') here to begin with: then the other 2 later encode calls aren't needed. jose.jws _load does similar. (then you can even utilize our utils.base64_urldecode)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes on calling .encode('utf8') no on using utils.base64_urldecode. the latter would set up a circular dependency.

Switches to using python cryptography library instead of jose/jwt
(which relied on python ecdsa library). Also discovered lots of fun
discrepencies between how ecdsa and libssl sign things.

closes #785
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert pycrypto to cryptography
5 participants