Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for iterating over query results #132

Merged
merged 6 commits into from
Oct 22, 2024

Conversation

marevers
Copy link
Contributor

@marevers marevers commented Jul 22, 2024

This PR adds a new function to the exporter which allows you to deal with dynamic database / schema structures.
E.g. you have a multi-tenant PostgreSQL database which has multiple schemas. These schemas contain the same tables and you want to run a collection of queries over each relevant schema without knowing them by name.

Under the job configuration, you can add the following:

  iterator:
    # sql is the SQL to execute to retrieve the list of values to iterate over -
    # query result must be a single column
    sql: "SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE '%my_structure%'"
    # placeholder should be present in the original query and not also used as an environment variable
    # e.g. {{PLACEHOLDER}} - it will be replaced by the values retrieved by the query
    placeholder: SCHEMA_NAME
    # label is the label name to which the iterator value gets assigned
    label: structure_schema

This will retrieve a list of schemas according to the query specified and then run any query in the job that has the given placeholder ({{SCHEMA_NAME}}) in a loop, once for every result of the aforementioned query. Before running the query, the placeholder is replaced by the value retrieved. The value is also added to a Prometheus label configured under iterator.label to ensure non-duplicate results.

It is safe to combine both queries with and without an iterator placeholder in the same job if iterator is specified. Queries without the placeholder will be ran normally.

This is an easier option than using something like dynamic SQL, because it requires no changes to the original query and does not suffer from the performance implications that dynamic SQL can often have.

@marevers
Copy link
Contributor Author

@dewey is this something you want to support? It comes in very handy in our case. If so, please let me know. I will rebase so you can merge.

@dewey
Copy link
Member

dewey commented Oct 22, 2024

Hey @marevers, I unfortunately don't have much time to put towards this project, so I'm focusing on smaller PRs that I can review without much context. Thanks for pinging me again on this one.

I think it would be valuable to have, could you go ahead and rebase please? Then I'll go ahead and merge it as it sounds like it's working in production for you without negative effects.

@marevers
Copy link
Contributor Author

@dewey Thanks, we are running this in production now for a while and it works well. Fixed the conflict and formatting.

@dewey dewey merged commit a72f7b8 into justwatchcom:master Oct 22, 2024
1 check passed
@dewey
Copy link
Member

dewey commented Oct 22, 2024

I've released a new version and the Docker image is pushing right now. Thanks for the contribution!

@marevers marevers deleted the add-query-iterator branch October 22, 2024 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants