Accessing Execution Metrics #6809
Comments
Have you looked at the output of

I don't think we have upload/download bytes, but I do think there are a bunch of parquet-level metrics like

ParquetExec: file_groups={1 group: [[data.parquet]]}, projection=[<cols>], limit=1, metrics=[output_rows=1, elapsed_compute=1ns, bytes_scanned=4763, predicate_evaluation_errors=0, page_index_rows_filtered=0, pushdown_rows_filtered=0, file_open_errors=0, row_groups_pruned=0, file_scan_errors=0, num_predicate_creation_errors=0, time_elapsed_processing=7.350966ms, time_elapsed_opening=1.363919ms, pushdown_eval_time=2ns, time_elapsed_scanning_until_data=6.981077ms, page_index_eval_time=2ns, time_elapsed_scanning_total=6.981147ms]

We can probably add more
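If only the textual plan is available, the `metrics=[...]` section of a line like the one quoted above can be split into name/value pairs. The following is a minimal sketch, not part of DataFusion: `parse_metrics` is a hypothetical helper, and it assumes that pairs are separated by `, ` and that the metrics list ends at the last `]` on the line. Values are kept as strings because they mix plain counts with durations such as `1.363919ms`.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: extracts the `key=value` pairs from the
/// `metrics=[...]` part of one plan line. Returns an empty map when
/// the line carries no metrics section.
fn parse_metrics(plan_line: &str) -> BTreeMap<String, String> {
    let mut out = BTreeMap::new();
    let marker = "metrics=[";
    let Some(start) = plan_line.find(marker) else {
        return out;
    };
    let rest = &plan_line[start + marker.len()..];
    // Assumption: the metrics list ends at the last `]` on the line.
    let Some(end) = rest.rfind(']') else {
        return out;
    };
    // Assumption: pairs are separated by ", " and contain one `=` each.
    for pair in rest[..end].split(", ") {
        if let Some((key, value)) = pair.split_once('=') {
            out.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    out
}

fn main() {
    // Shortened copy of the plan line quoted in the thread.
    let line = "ParquetExec: limit=1, metrics=[output_rows=1, \
                bytes_scanned=4763, time_elapsed_opening=1.363919ms]";
    for (name, value) in &parse_metrics(line) {
        println!("{name} = {value}");
    }
}
```

Parsing the printed plan is fragile because the text format may change between releases, so reading the metrics through the Rust API is usually the better route when it is available.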
@alamb Thanks. How can I get this information using the Rust API?
Additionally, I would like to run queries and return the results of those queries, all while recording the metrics. Is this currently possible?
I think so
You can access the metrics using https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#method.metrics

You can walk the ExecutionPlan (either before, during, or after execution) and find the relevant ParquetExec node.

Here is some code in IOx that walks these metrics and does something with them (in this case, converts them into "tracing spans" format). Perhaps that is helpful: https://github.com/influxdata/influxdb_iox/blob/4a1f8db2546d867c759f76ab2a2b2b7c8f3dac9c/iox_query/src/exec/query_tracing.rs#L93-L214
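The plan walk described above can be sketched without pulling in DataFusion itself. In the sketch below, `PlanNode` is a hypothetical stand-in that only mirrors the general shape of `ExecutionPlan` (a node has children and optionally reports metrics); it is not the real trait, and the real `metrics()` method returns DataFusion's own metrics type rather than a plain map. The numbers are taken from the plan line quoted earlier in the thread, and `GlobalLimitExec` is used purely as an illustrative parent node.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Hypothetical stand-in for the parts of an execution plan used here.
trait PlanNode {
    fn name(&self) -> String;
    fn children(&self) -> Vec<Arc<dyn PlanNode>>;
    /// `None` when the operator records no metrics.
    fn metrics(&self) -> Option<BTreeMap<String, u64>>;
}

struct Node {
    name: &'static str,
    children: Vec<Arc<dyn PlanNode>>,
    metrics: Option<BTreeMap<String, u64>>,
}

impl PlanNode for Node {
    fn name(&self) -> String {
        self.name.to_string()
    }
    fn children(&self) -> Vec<Arc<dyn PlanNode>> {
        self.children.clone()
    }
    fn metrics(&self) -> Option<BTreeMap<String, u64>> {
        self.metrics.clone()
    }
}

/// Depth-first, parent-before-child walk that records `(operator, value)`
/// for every node reporting the requested metric.
fn collect_metric(plan: &Arc<dyn PlanNode>, metric: &str, out: &mut Vec<(String, u64)>) {
    if let Some(value) = plan.metrics().and_then(|m| m.get(metric).copied()) {
        out.push((plan.name(), value));
    }
    for child in plan.children() {
        collect_metric(&child, metric, out);
    }
}

/// Builds a two-node plan: a limit on top of a parquet scan.
fn sample_plan() -> Arc<dyn PlanNode> {
    let scan: Arc<dyn PlanNode> = Arc::new(Node {
        name: "ParquetExec",
        children: vec![],
        metrics: Some(BTreeMap::from([
            ("bytes_scanned".to_string(), 4763),
            ("output_rows".to_string(), 1),
        ])),
    });
    Arc::new(Node {
        name: "GlobalLimitExec",
        children: vec![scan],
        metrics: Some(BTreeMap::from([("output_rows".to_string(), 1)])),
    })
}

fn main() {
    let plan = sample_plan();
    let mut scanned = Vec::new();
    collect_metric(&plan, "bytes_scanned", &mut scanned);
    // Sum across operators, e.g. as input to a quota check.
    let total: u64 = scanned.iter().map(|(_, bytes)| *bytes).sum();
    println!("bytes scanned: {total} across {} operator(s)", scanned.len());
}
```

With the real API, the same idea would presumably mean keeping a handle to the physical plan, executing it to get the query results, and then walking that same plan afterwards to read the recorded metrics.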
I think #9415 covers a more general way to get distributed access. I don't think this ticket is tracking anything actionable now, so closing. Please reopen if you disagree.
Is your feature request related to a problem or challenge?
https://github.com/apache/arrow-datafusion/blob/main/datafusion-examples/examples/query-aws-s3.rs
In the above example, how would I get information like download/upload bytes, execution time (per partition), number of partitions, etc.?
I would like to track query metrics like these to measure quotas.
There doesn't seem to be a simple way of doing this, especially when using the higher-level sql function.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response