Ignite netrisk with HyperLogLog? #3

Open
RubieV opened this issue May 4, 2016 · 3 comments

@RubieV
Contributor

RubieV commented May 4, 2016

@markharwood,

Great work on significant terms, and maybe an even greater visualization of the 4 strategies in your comment!

Working in the same space, yet having access to more detailed data, we have found input carnality to be higher related with attack impact than relative volume, if one vector has to be chosen.

I would love to submit a PR for this, but to demo it in your project the dummy data would have to include URI/UA hashes or values.

By the way, do we see you in Berlin at GOTO?

Ruben

@markharwood
Owner

Thanks for your comments and PR #2 - I just merged it.

> we have found input carnality to be higher related with attack impact than relative volume

I'm guessing you mean "cardinality" of something here? Possibly UAs per subnet? Some time ago I sketched out the typical entities in web interactions and the expected cardinality of each of their relations with reasons for exceptions:
[Image: cardinalities]
This is clearly a more complex model for assessing risk and implementing it requires a different approach (see "entity centric indexing") which is beyond the scope of this project.
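
For the simpler "UAs per subnet" reading of cardinality, a minimal sketch of a search request body might look like the following. It assumes hypothetical field names `subnet` and `user_agent` (neither is confirmed by this thread or by the netrisk mapping) and nests Elasticsearch's `cardinality` aggregation, an approximate HyperLogLog++-based distinct count, under a `terms` aggregation:

```python
import json


def ua_cardinality_per_subnet(subnet_field="subnet", ua_field="user_agent",
                              top_n=10, precision_threshold=1000):
    """Build a search body that counts distinct user agents per subnet.

    The field names are hypothetical placeholders; adjust them to the
    actual index mapping.
    """
    return {
        "size": 0,  # only aggregation results are wanted, no hits
        "aggs": {
            "by_subnet": {
                "terms": {
                    "field": subnet_field,
                    "size": top_n,
                    # rank subnets by their distinct-UA count, highest first
                    "order": {"distinct_uas": "desc"},
                },
                "aggs": {
                    "distinct_uas": {
                        "cardinality": {
                            "field": ua_field,
                            # below this count the result is expected to be
                            # close to exact; above it, it is approximate
                            "precision_threshold": precision_threshold,
                        }
                    }
                },
            }
        },
    }


body = ua_cardinality_per_subnet()
print(json.dumps(body, indent=2))
```

The body is plain JSON, so it can be sent with any client. Raising `precision_threshold` trades memory for accuracy.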

I don't have any plans to develop this "netrisk" project beyond its current simple form - it was built to support a blog post showing how some of the Elasticsearch aggregations can be applied in practice. Feel free to fork it of course if you find it useful :)

@RubieV
Contributor Author

RubieV commented May 4, 2016

Sorry, cardinality is indeed what I meant.

We actually use entity-centric views, but currently don't cache them back to ES. Would this be the same as an ES-backed cache for event sourcing?
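
To make the term concrete, here is a minimal in-memory sketch of an entity-centric roll-up. It assumes hypothetical event fields `ip`, `ua` and `ts`, and folds single events into one summary per entity, which is the kind of document that could be cached back to ES. It is only an illustration, not the implementation used by either project in this thread:

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class EntitySummary:
    """Running summary of all events seen for one entity (here: one IP)."""
    entity_id: str
    event_count: int = 0
    first_seen: Optional[int] = None
    last_seen: Optional[int] = None
    user_agents: Set[str] = field(default_factory=set)

    def apply(self, event: dict) -> None:
        """Fold a single event into the summary (the per-event ETL step)."""
        ts = event["ts"]
        self.event_count += 1
        self.first_seen = ts if self.first_seen is None else min(self.first_seen, ts)
        self.last_seen = ts if self.last_seen is None else max(self.last_seen, ts)
        self.user_agents.add(event["ua"])

    def to_doc(self) -> dict:
        """Flatten the summary into a document that could be indexed."""
        return {
            "entity_id": self.entity_id,
            "event_count": self.event_count,
            "first_seen": self.first_seen,
            "last_seen": self.last_seen,
            "ua_cardinality": len(self.user_agents),  # exact distinct count
        }


def roll_up(events: Iterable[dict], key: str = "ip") -> Dict[str, EntitySummary]:
    """Group events by entity key and maintain one summary per entity."""
    summaries: Dict[str, EntitySummary] = {}
    for event in events:
        entity_id = event[key]
        if entity_id not in summaries:
            summaries[entity_id] = EntitySummary(entity_id)
        summaries[entity_id].apply(event)
    return summaries


# Hypothetical sample events, invented for illustration only.
events = [
    {"ip": "10.0.0.1", "ua": "curl/7.47", "ts": 100},
    {"ip": "10.0.0.1", "ua": "Mozilla/5.0", "ts": 130},
    {"ip": "10.0.0.2", "ua": "Mozilla/5.0", "ts": 110},
]
summaries = roll_up(events)
for summary in summaries.values():
    print(summary.to_doc())
```

An exact set works at this scale. For high-cardinality fields the set would typically be swapped for an approximate sketch such as HyperLogLog.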

It's funny you mention this example, as we've hacked together a browser plugin that ships fingerprints to ES, where application and webserver logs reside. Using significant terms and a graph, it's pretty powerful on very diverse datasets. Depending on cluster size, high-cardinality fields might as well be used instead of significant terms for cached performance.

Do you happen to know an entity-centric indexing / event sourcing framework that supports both ETL (per single event) and ES-backed aggregations (historic aggregated events)?

Browser shipper: https://git.bitsensor.io/ruben/browser/blob/master/src/index.js

@markharwood
Owner

> Do you happen to know an entity-centric indexing / event sourcing framework that supports both ETL (per single event) and ES-backed aggregations (historic aggregated events)?

http://snowplowanalytics.com/product/ is a big project in this area with "trackers" for a variety of client platforms.
