Enable query caching #756

Draft · wants to merge 11 commits into main
Conversation

FlorianBracq
Collaborator

Hello,

This is a work-in-progress implementation of the caching mechanism, heavily inspired by @rcobb-scwx's work!

To test it, you can add the parameter "cache_path" to your query:
import msticpy as mp

prov: mp.QueryProvider = mp.QueryProvider("LogAnalytics")
prov.connect()
data = prov.Azure.list_aad_signins_for_account(cache_path=<PATH_TO_CACHE>)
If it is executed from a notebook, and PATH_TO_CACHE is the path to that notebook, the cell's output will contain:

  • the HTML representation of the first few rows of the DataFrame generated from the query result
  • metadata of the query result:
    • the timestamp when the cache was generated
    • a string representation of the executed query
    • the name of the function called
    • a dictionary representation of the parameters provided to the query's function
    • a hash of the parameters provided to the query's function (required to return the right cached value)
    • the compressed query result

If it is executed outside a notebook, the same data will be stored in the file provided in cache_path.
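
For illustration, a rough sketch of what one such cached record could look like is shown below. The field names (timestamp, query, function, parameters, parameters_hash, result) and the use of pickle + zlib + base64 for the compressed result are assumptions made for this example, not necessarily what this PR implements.

# Illustrative sketch only -- field names and serialization choices are assumptions.
import base64
import datetime
import hashlib
import json
import pickle
import zlib

import pandas as pd


def build_cache_record(query: str, func_name: str, params: dict, result: pd.DataFrame) -> dict:
    """Bundle a query result and its metadata into a single serializable record."""
    params_repr = json.dumps(params, sort_keys=True, default=str)
    return {
        # timestamp when the cache was generated
        "timestamp": datetime.datetime.utcnow().isoformat(),
        # string representation of the executed query
        "query": query,
        # name of the function called
        "function": func_name,
        # dictionary representation of the parameters provided to the query's function
        "parameters": params,
        # hash of the parameters, used to match a call to the right cached value
        "parameters_hash": hashlib.sha256(params_repr.encode()).hexdigest(),
        # compressed query result
        "result": base64.b64encode(zlib.compress(pickle.dumps(result))).decode(),
    }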

The path to the notebook is required because the kernel does not know which file it is receiving input from, and therefore cannot know which cell output to read to find the cached data.
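
To show why the parameter hash matters for retrieval, here is a hypothetical lookup that picks the matching record from a list of cached records. It mirrors the record sketch above and, like it, is an assumption about the mechanism rather than the PR's actual code.

# Hypothetical cache lookup -- illustrative only; assumes records shaped like build_cache_record() above.
import base64
import hashlib
import json
import pickle
import zlib
from typing import Optional

import pandas as pd


def load_cached_result(records: list, params: dict) -> Optional[pd.DataFrame]:
    """Return the cached DataFrame whose parameter hash matches the current call, if any."""
    params_hash = hashlib.sha256(
        json.dumps(params, sort_keys=True, default=str).encode()
    ).hexdigest()
    for record in records:
        if record.get("parameters_hash") == params_hash:
            # Reverse the compression used when the record was written.
            return pickle.loads(zlib.decompress(base64.b64decode(record["result"])))
    return None  # cache miss: execute the query against the provider instead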

Things to be done:

  • Handle split queries
  • Proper handling of optional parameters
  • Add tests
