cost surge using webrisk docker in the last months #60

Open
shlomitsur opened this issue May 23, 2024 · 8 comments

Comments

@shlomitsur

Hi,
Our company has seen an almost 100% increase in our Web Risk bills over the last 2 months.
We are using the Docker version in Kubernetes.
We are trying to understand what could be causing this.

Thank you

@rvilgalys
Collaborator

Hi @shlomitsur,

Thanks for letting us know about this -- one possibility for the increase is that recent backend changes to Safe Browsing & Web Risk expanded our blocklist coverage to protect against more unsafe URLs. This additional coverage means there are more hash prefixes that need to be looked up via API calls so the blocklist status can be verified, which in turn could mean a higher number of API calls.
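To illustrate why a larger blocklist can mean more billed lookups, here is a minimal sketch of the hash-prefix check. It is not the wrserver implementation: it skips URL canonicalization and expression generation, assumes a fixed 4-byte prefix, and the names `hash_prefix`, `count_api_lookups`, and the sample URLs are hypothetical.

```python
import hashlib

PREFIX_LEN = 4  # bytes; assumed prefix length for this illustration


def hash_prefix(expression: str) -> bytes:
    """Return the first PREFIX_LEN bytes of the SHA-256 digest of a URL expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:PREFIX_LEN]


def count_api_lookups(expressions, local_prefixes) -> int:
    """Count expressions whose prefix is in the local list.

    Only these would need a remote API call to confirm the full hash;
    everything else is answered locally.
    """
    return sum(1 for e in expressions if hash_prefix(e) in local_prefixes)


# Hypothetical traffic and blocklists, for illustration only.
traffic = ["example.com/", "example.org/login", "example.net/a"]
small_list = {hash_prefix("example.com/")}
larger_list = small_list | {hash_prefix("example.org/login")}

print(count_api_lookups(traffic, small_list))
print(count_api_lookups(traffic, larger_list))
```

With the same traffic, the larger prefix list matches more URLs locally, so more of them go on to a remote lookup.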

If you are trying to limit spend, a couple suggestions:

If you only require a subset of these threat types, you can limit your selection by running wrserver with the --threatTypes=... arg. For example, docker run <...> wr-container --threatTypes=SOCIAL_ENGINEERING,SOCIAL_ENGINEERING_EXTENDED_COVERAGE would only look at the two social engineering threat types.

Other tips we could give might be more use-case specific.

We are also looking into ways to make the container more efficient with API calls, particularly by allowing instances to share a cache -- are you running multiple instances of the container that might benefit from this? Or is there a collection of the same URLs that you need to look up repeatedly?

@shlomitsur
Author

Thank you @rvilgalys!
I would also like to know about operating in multiple regions:
We are using AWS and we have Oregon and Frankfurt regions.
Can we run a webrisk Docker container per region using the same API key and somehow share a cache to reduce cost?
Thanks!

@rvilgalys
Collaborator

We don't yet support a shared cache but it's part of the roadmap -- likely we will have an option to set a Redis target as a shared cache, and if set the client will use that alongside its own local cache.
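As a rough sketch of what "shared cache alongside a local cache" could look like, the example below checks a per-instance cache, then a shared store, and only then makes a remote call. This is an assumption about the general pattern, not the planned implementation: `TwoTierCache`, `fake_remote_lookup`, and the 300-second TTL are hypothetical, and a plain dict stands in for a shared store such as Redis.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Entry = Tuple[bool, float]  # (is_unsafe, expiry as a Unix timestamp)


class TwoTierCache:
    """Check the local cache, then the shared store, before calling the API."""

    def __init__(self, shared: Dict[str, Entry], ttl_seconds: float = 300.0):
        self.local: Dict[str, Entry] = {}
        self.shared = shared  # stand-in for a shared store such as Redis
        self.ttl_seconds = ttl_seconds  # arbitrary value chosen for the sketch
        self.api_calls = 0

    @staticmethod
    def _fresh(store: Dict[str, Entry], key: str, now: float) -> Optional[Entry]:
        """Return the entry for key if present and not yet expired."""
        entry = store.get(key)
        if entry is not None and entry[1] > now:
            return entry
        return None

    def lookup(self, key: str, remote_lookup: Callable[[str], bool]) -> bool:
        now = time.time()
        entry = self._fresh(self.local, key, now)
        if entry is None:
            entry = self._fresh(self.shared, key, now)
        if entry is None:
            # Miss in both tiers: this is the only path that costs an API call.
            self.api_calls += 1
            entry = (remote_lookup(key), now + self.ttl_seconds)
            self.shared[key] = entry
        self.local[key] = entry
        return entry[0]


def fake_remote_lookup(url: str) -> bool:
    """Hypothetical stand-in for the billed API call."""
    return url.endswith("/bad")


shared_store: Dict[str, Entry] = {}
oregon = TwoTierCache(shared_store)
frankfurt = TwoTierCache(shared_store)

oregon.lookup("example.com/bad", fake_remote_lookup)
frankfurt.lookup("example.com/bad", fake_remote_lookup)
print(oregon.api_calls, frankfurt.api_calls)  # prints "1 0"
```

In this sketch the second instance reuses the verdict the first one stored, so only one remote call is made. A real cross-region setup would also have to weigh the latency of reaching the shared store against the cost of the call.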

One feature I did just enable is support for maxDatabaseEntries and maxDiffEntries. These were already part of our API but hadn't been exposed in this client. If you want further control over the blocklist size (and, indirectly, over spend on API lookups), you can set an upper limit with maxDatabaseEntries.

See details in https://github.com/google/webrisk?tab=readme-ov-file#configuration

@corey-Robinson1337

I'm having this issue as well. What would be some good values for maxDatabaseEntries and maxDiffEntries to still get relatively up-to-date blocklists while lowering costs, @rvilgalys?

@shlomitsur
Author

Thank you @rvilgalys!

@rvilgalys
Collaborator

@corey-Robinson1337 sorry, I don't have any guidance on this -- we just noticed the API supported these limits but our client here didn't have a way to set them.

These limits are a carryover from the noncommercial Safe Browsing API and were mainly intended for running the Safe Browsing API in resource-limited environments (like on mobile devices). I'm not sure how (or if) the hashes sent in a constrained blocklist response get prioritized.

@rvilgalys
Collaborator

We have a quick update on this:

Our team recently identified about 1.5M URLs we believe were out of date and could be safely removed from the Hash Prefix lists. Over the course of this week those patterns were removed from the threatList diffs.

Hopefully this will also help cut down on spend while offering the same protection.

Thanks again for helping bring this issue to our attention @shlomitsur @corey-Robinson1337

@shlomitsur
Author

thanks @rvilgalys for the update
