Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

feat: add flag to stop table rotation #1179

Merged
merged 1 commit into from
Apr 27, 2018
Merged

feat: add flag to stop table rotation #1179

merged 1 commit into from
Apr 27, 2018

Conversation

jrconlin
Copy link
Member

Closes #1172

@codecov-io
Copy link

codecov-io commented Apr 16, 2018

Codecov Report

Merging #1179 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #1179    +/-   ##
=======================================
  Coverage     100%    100%            
=======================================
  Files          60      60            
  Lines       10079   10637   +558     
=======================================
+ Hits        10079   10637   +558
Impacted Files Coverage Δ
autopush/diagnostic_cli.py 100% <100%> (ø) ⬆️
autopush/main_argparse.py 100% <100%> (ø) ⬆️
autopush/db.py 100% <100%> (ø) ⬆️
autopush/tests/test_db.py 100% <100%> (ø) ⬆️
autopush/websocket.py 100% <100%> (ø) ⬆️
autopush/config.py 100% <100%> (ø) ⬆️
autopush/tests/test_z_main.py 100% <100%> (ø) ⬆️
autopush/tests/test_rs_integration.py 100% <100%> (ø) ⬆️
autopush/tests/test_integration.py 100% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d609561...ea05211. Read the comment docs.

@bbangert
Copy link
Member

This has a conflict with master branch.

@@ -132,3 +132,6 @@ endpoint_port = 8082
; e.g {"firefox":{"cert":"certs/main.cert","key":"certs/main.key","topic":"com.mozilla.org.Firefox","max_retry":2},"beta":{"cert":"certs/beta.cert","key":"certs/beta.key","topic":"com.mozilla.org.FirefoxBeta"}}
#apns_creds =

; With TTL implemented, message table rotation is no longer required.
; This flag determines if table rotation should be allowed to continue:
#allow_table_rotation = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main_argparse isn't handling this yet

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

autopush/db.py Outdated
"""Fetches the name of the last message table"""
response = filter(
lambda name: name.lower().startswith(prefix),
self._resource.meta.client.list_tables().get("TableNames"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to handle list_tables' pagination (maintenance.py in the top level dir has some code for this that might be helpful)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my understanding that list_tables will return a maximum of 100 entries per call. We rotate tables each month. Do we feel that we will be doing table rotation for 8 years?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it lists all tables for "the current account and endpoint" which is all the tables. there's > 100 on dev and stage. I've actually ran into this on dev at one point and fixed all our cases of it, before we got unlucky on stage/prod

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. awesome. Yeah, that makes sense.

autopush/db.py Outdated
@@ -1078,6 +1117,21 @@ def create_initial_message_tables(self):
an entry for tomorrow, if tomorrow is a new month.

"""
if not self.allow_table_rotation:
tablename = self.resource.get_latest_message_tablename(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't tablename here always going to be self._message_conf.tablename? why is the get_latest_message_tablename call needed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, self._message_conf.tablename is the config option to specify the root name for the message table. Here, it's used as a mask to determine the message table from the returned list of tables.

the following table creation methods will probably not be called outside of test routines, since the table will have already been created.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my fault, I misunderstood that we'd be going w/ the final dated table after toggling allow_table_rotation as False

@@ -466,7 +471,7 @@ def process(self, command):
def _check_storage(self, command):
timestamp = None
messages = []
message = Message(command.message_month,
message = Message(self.get_month(command),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w/ allow_table_rotation = False all of these Message constructor calls will call ddb list_tables via get_latest_message_tablename. Can't we just do that once (say in DatabaseManager)?

HelloResponse gets the current message month and passes it back and forth between the rust/python side as command.message_month. Could this month setup be isolated to there?

And also those get_latest_message_tablename calls are made w/ the default table_base_string of "message_". I'm leery of all the defaults used: e.g. stage/prod table names are along the lines of "stage.autopush.message" which wouldn't match "message_".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call on caching the latest message tablename, That can probably go into the DynamoDBResource.

as for the prefix, we already specify "--message_tablename", this is stowed as self._message_conf.tablename and is used as the tablename_base_string= in DatabaseManager.setup_tables(). I'm not sure of a better parameter to use to specify it, but I'm happy to switch to something better.

@@ -519,7 +524,9 @@ class MigrateUserCommand(ProcessorCommand):
def process(self, command):
# type: (MigrateUser) -> MigrateUserResponse
# Get the current channels for this month
message = Message(command.message_month,
if not self.conf.allow_table_rotation:
return MigrateUserResponse(message_month=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a run of test_rs_integration w/ allow_table_rotation = False

message_month = None looks invalid for the rust side, message_month is always a String there (not Option)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added https://github.com/mozilla-services/autopush/pull/1179/files#diff-f37e66b6947f8539b0024d6672bd5ce1R189.

The rust side still calls to python for data operations, so it should never see a TableName of None.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ported over test_webpush_monthly_rotation to test_rs_integration, which will trigger this error:

https://pastebin.mozilla.org/9083770

{"Severity":5,"Timestamp":1524526849884206435,"Logger":"AutopushRust-0.1.0","EnvVersion":"2.0","Type":"autopush_rs:log","Hostname":"ubuntu","Fields":{"message":"Error for shutdown of 127.0.0.1: invalid type: null, expected a string at line 1 column 74"}}

Of course after all that whats occurs to me: do we even want to be in this code in the first place when allow_table_rotation is False? If not, then we should be disabling that flags["rotate_message_table"] = True in the HelloCommand above

Copy link
Member

@bbangert bbangert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear on whether table rotation is halted entirely, or if they move to a table with no month postfix. If they're moved one last time to a non-postfix table, then there'll be a period during deploy before its completed where any reconnects, their record will be invalidated (dropped per _verify_user_record in websocket.py) and they'll be reset to the latest message table that node is aware of.

I think we will need an intermediary version release/deploy cycle so that we can have a set of 'old' nodes for the final migration that are tolerant of message table names that aren't rotating based.

autopush/db.py Outdated
@@ -995,19 +1029,22 @@ class DatabaseManager(object):

router = attrib(default=None) # type: Optional[Router]
message_tables = attrib(default=Factory(list)) # type: List[str]
current_msg_month = attrib(init=False) # type: Optional[str]
current_msg_month = attrib(default=None, init=False) # type: Optional[str]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The rest of these were re-aligned above to match, but not here?

@jrconlin
Copy link
Member Author

The presumption is that they're kept at the last table that we rotate to. Once the flag is flipped, no further rotation happens.

When we decide we no longer need the flag, all the rotation stuff can be stripped out, and we can even config the message table name. The flag tries to maintain as much of the rotation code as possible so that the switch can be maintained or quickly reverted in case of unexpected issues.

@jrconlin jrconlin force-pushed the feat/1172 branch 3 times, most recently from decb493 to b6f13cf Compare April 18, 2018 23:03
autopush/db.py Outdated
paginator = client.get_paginator("list_tables")
tables = []
for table in paginator.paginate().search(
"TableNames[?contains(@,'{}')==`true`]|sort(@)[-1]".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contains? Is this going to require it as the prefix or will it find the prefix anywhere in the table name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now our table prpefix's look like this "{dev/stage/prod}.autopush.message", so it should work ok for us as Contains, but it'd prolly be safer if it required the table to start with the prefix.

autopush/db.py Outdated
max_ttl=MAX_EXPIRY):
def __init__(self, tablename=None, metrics=None, boto_resource=None,
max_ttl=MAX_EXPIRY,
table_base_string="message_"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the table_base_string different from the prefix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's not a huge difference between the two, aside from prefix is used mostly in table rotation, where table_base_string is not.

bbangert
bbangert previously approved these changes Apr 23, 2018
@jrconlin jrconlin force-pushed the feat/1172 branch 2 times, most recently from 752a058 to 7306fde Compare April 24, 2018 21:56
if has_connected_this_month(record):
del record["last_connect"]
else:
record["last_connect"] = generate_last_connect()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we stop updating the last_connect just because we turn off table rotation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be reverted along w/ the similar webpush_server.py change and its accompanying tweak to trigger the table migration (my bad)

autopush/db.py Outdated
@@ -995,19 +1036,23 @@ class DatabaseManager(object):

router = attrib(default=None) # type: Optional[Router]
message_tables = attrib(default=Factory(list)) # type: List[str]
current_msg_month = attrib(init=False) # type: Optional[str]
current_msg_month = attrib(default=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this becomes None w/ allow_table_rotation = False. Instead it should be set to the latest pegged tablename (from get_latest_message_tablename). This should really be where we cache that value.

This shouldn't be None because it propagates everywhere (through all the rust Command/Responses, which right now assumes it's a String vs Option. Also the "current_month" in the router table)

@@ -408,15 +413,16 @@ def lookup_user(self, hello):
flags["check_storage"] = True

# Determine if message table rotation is needed
flags["message_month"] = record["current_month"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still need to continue migrating to the new pegged table, so revert this. my bad!

@@ -348,6 +348,11 @@ def metrics(self):
def process(self, command):
raise NotImplementedError()

def get_month(self, command):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll just be passing through the db.current_msg_month so I think this should go away completely? update_rotating_tables will never change it

autopush/db.py Outdated
boto_resource=self.resource
)
self.message_tables.append(tablename)
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still need the previous 2 months in message_tables, otherwise we'll be dropping older users:

https://github.com/mozilla-services/autopush/blob/a8a7d5/autopush/websocket.py#L773

@jrconlin jrconlin force-pushed the feat/1172 branch 2 times, most recently from c26d4bb to c46aced Compare April 26, 2018 22:35
client = self._resource.meta.client
paginator = client.get_paginator("list_tables")
tables = []
for table in paginator.paginate().search(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echoing ben's previous comment, let's run this through filter(lambda name: name.startswith(prefix), tables) afterwards just to be extra sure. even tempted to re.match the full pattern..

autopush/db.py Outdated
@@ -479,7 +524,8 @@ def __getattr__(self, name):
class Message(object):
"""Create a Message table abstraction on top of a DynamoDB Table object"""
def __init__(self, tablename, metrics=None, boto_resource=None,
max_ttl=MAX_EXPIRY):
max_ttl=MAX_EXPIRY,
table_base_string="message_"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill

autopush/db.py Outdated
@@ -1054,7 +1112,8 @@ def setup_tables(self):
self.create_initial_message_tables()
self._message = Message(self.current_msg_month,
self.metrics,
boto_resource=self.resource)
boto_resource=self.resource,
table_base_string=self._message_conf.tablename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill

autopush/db.py Outdated
previous=3
)
# Create the most recent table if it's not there.
tablename = tablenames[-1:][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick tablenames[-1]

autopush/db.py Outdated
def get_latest_message_tablename(self, prefix="message"):
# type: (Optional[str]) -> str # noqa
"""Fetches the name of the last message table"""
if not self._latest_message_table:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only call these during initialization now, so let's not cache them here

@@ -105,6 +105,10 @@ def add_shared_args(parser):
help="AWS DynamoDB endpoint override",
type=str, default=None,
env_var="AWS_LOCAL_DYNAMODB")
parser.add_argument('--allow_table_rotation',
help="Allow monthly message table rotation",
action="store_true", default=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default to False here?

Also this isn't being passed along to AutopushConfig. Annoyingly enough, this still needs a line in AutopushConfig::from_argparse

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crap, yeah, forgot that.

@jrconlin jrconlin force-pushed the feat/1172 branch 2 times, most recently from 29e840b to 598f3b6 Compare April 27, 2018 18:14
@jrconlin
Copy link
Member Author

changed the name (and a bit of logic) to the argument.

the option --no_table_rotation will set the config allow_table_rotation=False. This will allow the app to continue to work as expected UNTIL the --no_table_rotation flag is set.

(it also makes the code a bit more readable rather than not config.no_table_rotation everywhere)

@@ -297,6 +299,9 @@ def from_argparse(cls, ns, **kwargs):
if not ns.no_aws:
ami_id = get_amid() or "Unknown"

import pdb;pdb.set_trace()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also helps if I merge and push changes instead of wandering off to get a sandwich. ._.

autopush/db.py Outdated
@@ -458,6 +477,7 @@ def __init__(self, **kwargs):
if "region_name" in conf:
del(conf["region_name"])
self.conf = conf
self._latest_message_table = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill

fake_conf.message_table.tablename = "message_bogus"
fake_conf.message_table.read_throughput = 5
fake_conf.message_table.write_throughput = 5
safe = autopush.tests.boto_resource._latest_message_table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill

fake_conf,
resource=autopush.tests.boto_resource)
try:
autopush.tests.boto_resource._latest_message_table = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill

@bbangert bbangert merged commit fb800c4 into master Apr 27, 2018
@bbangert bbangert deleted the feat/1172 branch April 27, 2018 22:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants