Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

feat: Add multiple cert handlers for APNs #660

Merged
merged 1 commit into from
Sep 17, 2016
Merged

feat: Add multiple cert handlers for APNs #660

merged 1 commit into from
Sep 17, 2016

Conversation

jrconlin
Copy link
Member

This patch updates APNs handlers to accept platform based cert
configurations. See configs/autopush_shared.ini.sample. In addition,
this patch clarifies some argument references for routers (e.g. less
than useful result is now slightly more descriptive uaid_data)

Custom item names have been normalized to match gcm/fcm.
Document errors cleaned up a bit as well.

BREAKING CHANGE: the APNS configuration options have been altered, see
configs/autopush_shared.ini.sample for new APNS configuration
settings.

Closes #655

@bbangert @pjenvey r?
@kitcambridge (optional r?)

@codecov-io
Copy link

codecov-io commented Sep 14, 2016

Current coverage is 100% (diff: 100%)

Merging #660 into master will not change coverage

@@           master   #660   diff @@
====================================
  Files          45     45          
  Lines        9171   9222    +51   
  Methods         0      0          
  Messages        0      0          
  Branches        0      0          
====================================
+ Hits         9171   9222    +51   
  Misses          0      0          
  Partials        0      0          

Powered by Codecov. Last update b1a7e2d...7eed1ff

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @jrconlin! This looks really good, but I think _route has some concurrency issues.

self.log.debug("Starting APNS router...")
self.ap_settings = ap_settings

def register(self, uaid, router_data, *args, **kwargs):
"""Validate that an APNs instance token is in the ``router_data``"""
def register(self, uaid, router_data, router_token="firefox",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about calling this app_id to match the docs, and throwing instead of having a default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed router_token to app_id for registration, added app_id as a default arg for register


"""
router_token = router_data["token"]
self._channel = router_data["channel"]
Copy link

@ghost ghost Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISTM there's only one instance of APNSRouter per app, so this will clobber self._channel for concurrent requests (if one request comes in, then another one comes in while we're still processing the first one).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved rel_channel to messages hash.

# TODO: Add listener for error handling.
self.apns.gateway_server.register_response_listener(self._error)
self.messages[now] = {"token": router_token, "payload": payload}
apns_client.gateway_server.register_response_listener(self._error)
Copy link

@ghost ghost Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, self.messages and register_response_listener don't seem safe for concurrent use if APNSRouter is a singleton. time.time() is in seconds, so we might also clobber the previous entry if we handle two requests close together.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this one is intentional to track sent messages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the gateway_server need to have the self._error continually registered? The response listeners for each channel can't be regg'd at setup?

Copy link
Member

@bbangert bbangert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kit pointed out some issues that need changes, and I noticed on this read-through existing issues in how outbound messages are tracked and updated.

self._connect()
self.apns = dict()
self._config = router_conf
for channel in self._config.keys():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are APNS channels?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the iOS team have certs assigned to the firefox channel release, so these would map to "firefox", "beta", "gecko" and whatever other channel definition they want to use (e.g. "nightly?")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could that be better expressed by a better variable name or more documentation around this code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing to rel_channel as an abbreviation for "release channel".

:type uaid: str
:param router_data: Dict containing router specific configuration info
:type router_data: dict
:param router_token: The channel identifier for cert info lookup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More tokens? This one is weird because you immediately say that 'router_token' is actually a 'channel_identifier'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, resolving that. There will still be one "router_token" because that's what APNs calls it.

alert=router_data.get("title", config.get('default_title',
'Mozilla Push')),
content_available=1,
custom=custom)
now = int(time.time())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results only in second granularity, if multiple requests occur at the same second (pretty easy), then their now value will all be the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since I'm collapsing key indexes with the expry, made this nanosecond based.

# TODO: Add listener for error handling.
self.apns.gateway_server.register_response_listener(self._error)
self.messages[now] = {"token": router_token, "payload": payload}
apns_client.gateway_server.register_response_listener(self._error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this one is intentional to track sent messages.

# TODO: Add listener for error handling.
self.apns.gateway_server.register_response_listener(self._error)
self.messages[now] = {"token": router_token, "payload": payload}
apns_client.gateway_server.register_response_listener(self._error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the gateway_server need to have the self._error continually registered? The response listeners for each channel can't be regg'd at setup?

self._base_tags)

self.apns.gateway_server.send_notification(token, payload, now)
apns_client.gateway_server.send_notification(router_token, payload,
now)

# cleanup sent messages
if self.messages:
for time_sent in self.messages.keys():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that multiple threads could be iterating on this at once, which means that one of the del calls will throw a KeyError. Would probably be better to toss the message cleanup function back into the event loop so it could clean-up without worry about locking the object.

self.messages[now] = {"token": token, "payload": payload}
# TODO: Add listener for error handling.
self.apns.gateway_server.register_response_listener(self._error)
self.messages[now] = {"token": router_token, "payload": payload}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a router_token, a channel_identifier, or a token? It's not great that its name keeps changing as it moves around to be more and more obscure in meaning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

router_token is now unique and locally defined as the router_data["token"] from above. I'm fine making it more explicit but router_data_token = router_data["token"] feels verbose.

; ... }
; e.g {"firefox": {"cert": "certs/main.cert", "key": "certs/main.key"}, "beta": {"cert": "certs/beta.cert", "key": "certs/beta.key"}}
#apns_creds =

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connection node needs this? Or is there some other reason its now in shared.ini?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consistency, mostly. The GCM/FCM definitions are located in shared. In addition, the router args are currently pulled in by main via add_shared_args().

@jrconlin jrconlin force-pushed the feat/655 branch 4 times, most recently from 055e311 to 2aa7a59 Compare September 15, 2016 22:08
router_data = uaid_data["router_data"]
# Kick the entire notification routing off to a thread
return deferToThread(self._route, notification, router_data)
d = deferToThread(self._route, notification, router_data)
d.addCallback(self._cleanup)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than cleaning up after every single message, maybe have the router creation spawn a recurring task that runs every 5 seconds that does this?

content_available=1,
custom=custom)
now = time.time()
self.messages[now] = {"rel_channel": router_data["rel_channel"],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is rel_channel the same as app_id, or is it the instance_id from {“token”:{instance_id}}? And is router_token also the instance_id?

self._base_tags)

apns_client.gateway_server.send_notification(router_token, payload,
now)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant to pass the ID here, too, instead of now?

if message["time_sent"] < now - expry:
del self.messages[msg_id]
return response

def _error(self, err):
"""Error handler"""
if err['status'] == 0:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: self.messages.pop(err['identifier'], None) might be better; del will throw a KeyError if the entry was already removed from the dict.

Copy link
Member

@bbangert bbangert Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That shouldn't happen now that cleanup is run off the event loop.

self.apns[rel_channel] = self._connect(self._config[rel_channel])
self.apns[rel_channel].gateway_server.register_response_listener(
self._error)
self._connect(router_conf)
Copy link

@ghost ghost Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed, now that the config is a dict?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep the connection in a function, even if it adds some slight overhead. There's a bit of messy history about that with the apns lib.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant you're calling self._connect() for each entry in router_conf, and then you're calling self._connect(router_conf). ISTM that won't work? If it does, I'd like to understand why. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, quite right. context is a good thing for me to have when I reply. I should try it more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah-HA! turns out this was a merge artifact, since it just happened again after a conflict from master. ._.

release channel name.

"""
# Do I still need to call this in _error?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess not. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, yeah, remember my cryptic comment about calling self._connect() before? ...

# "unclean" is used by testing to prevent unclean reactor errors.
# It also prevents the local cleanup loop from running without
# requiring heavy lifting.
if "unclean" not in router_conf:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh. Unclean reactor errors make me sad. 😭

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, thus why I added this to just not call the reactor when we're running tests.

@ghost
Copy link

ghost commented Sep 16, 2016

Great! r+ with nits fixed.

self._config = router_conf
if self._config.get('max_messages'):
self._max_messages = self._config['max_messages']
del(self._config['max_messages'])
Copy link
Member

@bbangert bbangert Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

del the function vs. del the keyword?

self.apns = dict()
self.messages = dict()
self._config = router_conf
if self._config.get('max_messages'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._max_messages = self._config.pop('max_messages', 100)

del(self._config['max_messages'])
else:
self._max_messages = 100
for rel_channel in self._config.keys():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default behavior on iteration of a dict is to iterate through its keys. Usually keys() is only needed if you might mutate the dict during iteration such that you need a list first.

self.apns[rel_channel] = self._connect(self._config[rel_channel])
self.apns[rel_channel].gateway_server.register_response_listener(
self._error)
self._connect(router_conf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, first router_conf is assigned to self._config, then we iterate through keys in self._config (which is now router_conf minux the max_messages key) calling self._connect. But then we also call self._connect with the whole thing again? What's going on here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git blend merge failure.

# It also prevents the local cleanup loop from running without
# requiring heavy lifting.
if "unclean" not in router_conf:
reactor.callLater(10, self._cleanup)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a task looping call used in main.py for functions that should be called regularly.

:param app_id: The release channel identifier for cert info lookup
:type app_id: str

returns a modified router_data dict to be stored
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:returns: ?

@jrconlin jrconlin force-pushed the feat/655 branch 2 times, most recently from f5d79b6 to 167bde7 Compare September 16, 2016 23:36
self.apns = dict()
self.messages = dict()
self._config = router_conf
if self._config.get('max_messages'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._max_messages = self._config.pop('max_messages', 100)

self.ap_settings.metrics.increment(
"updates.client.bridge.apns.succeed",
"updates.client.bridge.apns.%s.attempted" %
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI send_notification appears to never fail w/ enhanced=True (assuming the input is valid for a payload), which if I'm right makes the new separate metrics unnecessary

router_data = uaid_data["router_data"]
# Kick the entire notification routing off to a thread
return deferToThread(self._route, notification, router_data)
d = deferToThread(self._route, notification, router_data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(since I know you're touching this again) could you please restore: return deferToThread..

Copy link
Member

@pjenvey pjenvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r+ pending coverage

This patch updates APNs handlers to accept platform based cert
configurations. See `configs/autopush_shared.ini.sample`. In addition,
this patch clarifies some argument references for routers (e.g. less
than useful `result` is now slightly more descriptive `uaid_data`)

Custom item names have been normalized to match gcm/fcm.
Document errors cleaned up a bit as well.

BREAKING CHANGE: the APNS configuration options have been altered, see
`configs/autopush_shared.ini.sample` for new APNS configuration
settings.

Closes #655
@@ -566,6 +569,9 @@ def endpoint_main(sysargs=None, use_files=True):
# Start the table rotation checker/updater
l = task.LoopingCall(settings.update_rotating_tables)
l.start(60)
if settings.routers.get('apns'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to get it, a key check should be fine. "if 'apns' in ...."

"""
router_token = router_data["token"]
rel_channel = router_data["rel_channel"]
config = self._config[rel_channel]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure hope we never, ever, change valid release channel names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do, thumbscrews. Fortunately ops has the ability to generate more certs. They'll just have to change the app_id they're sending in to use the new cert pair.

@jrconlin jrconlin merged commit 70d0319 into master Sep 17, 2016
@bbangert bbangert deleted the feat/655 branch September 17, 2016 04:08
@bbangert bbangert restored the feat/655 branch September 17, 2016 04:08
@bbangert bbangert deleted the feat/655 branch September 17, 2016 04:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants