Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Too easy to accidentally schedule queries for long periods that cost a lot of money #949

Open
wlach opened this issue May 6, 2019 · 7 comments
Assignees

Comments

@wlach
Copy link

wlach commented May 6, 2019

(filing this on mozilla's redash fork because the details are particular to us)

A couple months ago I did some brainstorming on how to reduce our Athena spend via redash queries:

https://docs.google.com/document/d/1fZDxl-BiB_OXu5NEEMrWmviNy2y3B8iu_MBQHycaAtQ/edit#heading=h.3kudzbqcx32n

As you probably saw, the iodide dashboard (internal only, sorry) + my nagging led to a substantial decrease in cost. But I think we should try to implement some of the suggested ideas in that doc as well, in particular @chutten suggestions about expiry seem valuable.

I think some combination of the following would be super helpful:

  • Warn about scheduling too often (> once per day)
  • Mandatory expiry of queries (with an email warning ala atmo) after some time interval, say 6 months, unless renewed.

There are other things we could also do (e.g. put a dollar figure in the query window, like GCP does) but I think this is a good place to start.

It seems like the latest version of redash already has some notion of query expiry judging from this issue:

getredash#3375

It doesn't seem like that's deployed though? I'm guessing this feature doesn't include the two suggestions above either?

/cc @rafrombrc @jezdez @washort

@jezdez
Copy link

jezdez commented May 9, 2019

So looking at getredash#3375 a bit, this relates to @emtwo's work on porting our feature to let the a query schedule end on a specific date:

Screen-Shot-2019-05-09-12-27-41 88

As the issue states, there is no periodic Celery task yet to clear the schedule field so it becomes less expensive to look for queries to schedule, so this is mostly an optimizing step for reducing the load on the worker system.

What I believe you're describing needs to happen separately, I think both features can happen via our extension redash-stmo:

I think some combination of the following would be super helpful:

* Warn about scheduling too often (> once per day)

Warning about scheduling too often won't work without some code work, but I wonder if we could either:

a) simply remove all query options smaller than 24 hours (via the already existing REDASH_QUERY_REFRESH_INTERVALS env var)

b) add a periodic celery task via redash-stmo that would check the schedule for new queries and send emails to the query owners asking to update it?

* Mandatory expiry of queries (with an email warning ala atmo) after some time interval, say 6 months, unless renewed.

Sure, we could have the a periodic Celery task (via redash-stmo) check new queries and set the expiry and warn the queries authors about it (maybe in the same email about the schedule from the feature above)? I'm not sure if the particual part of the front-end is easy to extend with an actual in-place-warning sadly.

There are other things we could also do (e.g. put a dollar figure in the query window, like GCP does) but I think this is a good place to start.

It seems like the latest version of redash already has some notion of query expiry judging from this issue:

getredash#3375

It doesn't seem like that's deployed though? I'm guessing this feature doesn't include the two suggestions above either?

/cc @rafrombrc @jezdez @washort

@wlach
Copy link
Author

wlach commented May 9, 2019

As the issue states, there is no periodic Celery task yet to clear the schedule field so it becomes less expensive to look for queries to schedule, so this is mostly an optimizing step for reducing the load on the worker system.

Yeah i was just taking that issue as an indication that the feature had landed. Didn't know it was @emtwo who implemented it, that's cool. :) Maybe she would have some thoughts on this area as well...

What I believe you're describing needs to happen separately, I think both features can happen via our extension redash-stmo:

  • Warn about scheduling too often (> once per day)

Warning about scheduling too often won't work without some code work, but I wonder if we could either:

a) simply remove all query options smaller than 24 hours (via the already existing REDASH_QUERY_REFRESH_INTERVALS env var)

b) add a periodic celery task via redash-stmo that would check the schedule for new queries and send emails to the query owners asking to update it?

Unfortunately there are probably legitimate reasons why someone might want to schedule a query more than once a day.

Maybe we could add some text like "Please be mindful that queries can be expensive and slow down the system for others: please only schedule to update as often as you really need."

I could even see a case for making this text customizable per installation, perhaps via the extension mechanism? Can we also set a maximum value for the "expires" field and make it mandatory to expire? Perhaps via some configuration options?

  • Mandatory expiry of queries (with an email warning ala atmo) after some time interval, say 6 months, unless renewed.

Sure, we could have the a periodic Celery task (via redash-stmo) check new queries and set the expiry and warn the queries authors about it (maybe in the same email about the schedule from the feature above)? I'm not sure if the particual part of the front-end is easy to extend with an actual in-place-warning sadly.

Now that I think about it, some generic email when a query is about to expire would be all we would need, if we implemented the above.

@emtwo
Copy link

emtwo commented Aug 8, 2019

The simplest/quickest solution right now would be to flip the sort order of the string that contains schedule options in seconds. This string is stored in the environment variable REDASH_QUERY_REFRESH_INTERVALS

The default value is 60, 300, 600, 900, 1800, 3600, 7200, 10800, 14400, 18000, 21600, 25200, 28800, 32400, 36000, 39600, 43200, 86400, 604800, 1209600, 2592000

I updated it to 2419200, 1209600, 604800, 432000, 172800, 86400, 43200, 39600, 36000, 32400, 28800, 25200, 21600, 18000, 14400, 10800, 7200, 3600, 1800, 900, 600, 300, 60 and the UI now looks like this:

Screen Shot 2019-08-08 at 3 54 17 PM

Note that a strict reverse sort without any changes resulted in days showing up before weeks still (since it had a 30-day option which was the largest). So I switched "30 days" to "4 weeks" instead and added a couple more options for days, as seen in the screenshot

@rafrombrc @jasonthomas If we're all ok with this change, this would be an envar change you can make on your end, right @jasonthomas?

@jezdez
Copy link

jezdez commented Aug 9, 2019

Wow, awesome @emtwo!

@jasonthomas
Copy link
Member

this would be an envar change you can make on your end, right @jasonthomas?

This is what it looks like. I can add it if we are okay to proceed with this change.

@wlach
Copy link
Author

wlach commented Aug 9, 2019

Definitely a change in the right direction, though I still think that atmo-style query expiry is a good idea. It's still too easy to forget about a query after scheduling updates on it.

@emtwo
Copy link

emtwo commented Aug 9, 2019

@wlach Thanks for the reminder, I've opened a separate issue (#982) so we don't forget about it.

I suppose the question is whether we also still want to give a warning for frequently scheduled queries even with this new sort order...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants