Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/efficient reporting #2516

Merged
merged 40 commits into from
Feb 29, 2024
Merged

Feature/efficient reporting #2516

merged 40 commits into from
Feb 29, 2024

Conversation

Donnype
Copy link
Contributor

@Donnype Donnype commented Feb 15, 2024

Changes

This PR adds several changes that should be backward compatible with current reporting, but opens up the possibility to make significant performance improvements:

  • The parent BaseReport class now has a collect_data method that takes in multiple oois. By default this generates the data through the already existing generate_data method. It returns the data grouped by the input_ooi for easy compatibility.
  • The views have been rewritten to use the collect_data method, moving iteration to the report classes in a backward compatible way.
  • Octopoes now has a /query-many endpoint that you also give a path query, but takes in several sources (references), and returns the same result as running that path on all sources separately.
  • Octopoes and the query module have been updated to be able to handle this kind of query to make sure XTDB does the heavy lifting by using or-clauses. Still for compatibility we had to connect this back to the source through some datalog magic.
  • The Mail report has been rewritten to override the collect_data method to make use of this feature. When the other reports do this as well, the amount of queries to octopoes will remain constant instead of scaling with the number of input objects. This is a TODO for a follow up PR that I will create, including benchmark results.

Note: the logic around the Query object in this endpoint gets a bit complicated at this point. We should reserve some time to refactor this at some point, but this would give quite some changes perhaps.

Issue link

Closes #2446 (partly, follow up PR incoming)

Demo

See tests.

Notes for QA

Follow the regular report generation flow, but now expect a performance boost generating (aggregate) reports.


Code Checklist

  • All the commits in this PR are properly PGP-signed and verified.
  • This PR only contains functionality relevant to the issue; tickets have been created for newly discovered issues.
  • I have written unit tests for the changes or fixes I made.
  • For any non-trivial functionality, I have added integration and/or end-to-end tests.
  • I have performed a self-review of my code and refactored it to the best of my abilities.

Communication

  • I have informed others of any required .env changes files if required and changed the .env-dist accordingly.
  • I have made corresponding changes to the documentation, if necessary.
  • I have included comments in the code to elaborate on what is not self-evident from the code itself, including references to issues and discussions online, or implicit behavior of an interface.

Checklist for code reviewers:

Copy-paste the checklist from the docs/source/templates folder into your comment.


Checklist for QA:

Copy-paste the checklist from the docs/source/templates folder into your comment.

@Donnype Donnype requested a review from a team as a code owner February 15, 2024 21:16
Copy link
Contributor

@dekkers dekkers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I am very happy to see this and it looks good, I have just a few small remarks. I think it is a nice big step towards more efficient report generation.

octopoes/octopoes/api/router.py Show resolved Hide resolved
rocky/reports/report_types/definitions.py Outdated Show resolved Hide resolved
rocky/reports/report_types/ipv6_report/report.py Outdated Show resolved Hide resolved
rocky/reports/report_types/definitions.py Outdated Show resolved Hide resolved
octopoes/octopoes/connector/octopoes.py Outdated Show resolved Hide resolved
octopoes/octopoes/connector/octopoes.py Outdated Show resolved Hide resolved
octopoes/octopoes/xtdb/query.py Outdated Show resolved Hide resolved
octopoes/octopoes/xtdb/query.py Outdated Show resolved Hide resolved
@madelondohmen
Copy link
Contributor

In the meantime, some naming changes have been made to the Mail Report (#2513). I think your code will overwrite this, so it would be good to compare the two and make sure these changes won't be overwritten.

@Donnype
Copy link
Contributor Author

Donnype commented Feb 22, 2024

In the meantime, some naming changes have been made to the Mail Report (#2513). I think your code will overwrite this, so it would be good to compare the two and make sure these changes won't be overwritten.

Yes I noticed those, should be fixed I think 👍

Signed-off-by: Donny Peeters <[email protected]>
@Donnype Donnype requested a review from dekkers February 22, 2024 15:48
Copy link
Contributor

@ammar92 ammar92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good overal. I just have a few suggestions that might improve it even more

@Donnype Donnype requested a review from ammar92 February 23, 2024 14:59
ammar92
ammar92 previously approved these changes Feb 23, 2024
@underdarknl underdarknl added this to the OpenKAT v1.15 milestone Feb 27, 2024
@stephanie0x00
Copy link
Contributor

Checklist for QA:

  • I have checked out this branch, and successfully ran a fresh make reset.
  • I confirmed that there are no unintended functional regressions in this branch:
    • I have managed to pass the onboarding flow
    • Objects and Findings are created properly
    • Tasks are created and completed properly
  • I confirmed that the PR's advertised feature or hotfix works as intended.

What works:

Generated an aggregate report successfully for multiple domains. Tasks are created and seem to complete as expected.

What doesn't work:

n/a

Bug or feature?:

n/a

@dekkers dekkers merged commit aa231a3 into main Feb 29, 2024
26 checks passed
@dekkers dekkers deleted the feature/efficient-reporting branch February 29, 2024 10:14
jpbruinsslot added a commit that referenced this pull request Mar 5, 2024
* main: (85 commits)
  Fix wrong solving of merge conflict (#2585)
  Add metrics collection for scheduler using prometheus (#2468)
  Hotfix for where_in queries for abstract types (#2577)
  Update django (#2587)
  Fix octopoes typing (#2555)
  Create findings report (#2393)
  Raise exception if boefje input OOI has been deleted (#2573)
  Set a timeout on hanging test ssl container (#2560)
  Feature/efficient reporting (#2516)
  Updated findings database. Removed old findings, added Impact, Source… (#2569)
  add unit test for web report (#2528)
  Add pool size config and logs (#2541)
  Quick fix for PDF table overflow (#2562)
  Fix/2527 octopoes unicode (#2558)
  Add return typing to report test fixtures (#2557)
  Sort vulnerabilities in vulnerability report (#2378)
  Disable ruff split-on-trailing-comma and update ruff (#2544)
  Select all oois triggers toggle all (#2536)
  Remove unnecessary toplevel dependencies (#2554)
  Make valid time required parameter in the octopoes API (#2543)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Reduce queries in aggregate reports
6 participants