Pythonic interface to profile data #389

Closed
wfjm opened this issue Jan 6, 2022 · 4 comments · Fixed by #393
Labels
enhancement New feature or request

Comments

@wfjm

wfjm commented Jan 6, 2022

Proposal:
Pythonic interface to profile data

Current behavior:
Profile data always seems to be written to stdout.

Desired behavior:
Access to profile data via suitable python structures.

Use case:
Automate tests, generate specific and compact summaries.

I'm using influxdb-client mostly for testing Flux queries.
When I use profiling with

query_api = client.query_api(query_options=QueryOptions(profilers=["query", "operator"]))
tables = query_api.query(q)

the profiling data is simply printed to stdout.
I had a brief look at the code and found this in flux_csv_parser.py:

                if self._is_profiler_record(flux_record):
                    self._print_profiler_info(flux_record)
                    continue

I'd love to have a pythonic interface to the profile data, from which I could pick the parts of interest and build compact summaries.
Is that available?
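
As a stopgap for the behavior described above, the printed profiler output could be captured rather than left on the terminal. The following is only a minimal sketch, assuming the client emits the profiler data through plain `print()` calls to `sys.stdout`. `run_and_capture_stdout` and `fake_query` are hypothetical names, and `fake_query` merely stands in for `query_api.query(q)` so the snippet runs without a server; the printed text is made up.

```python
import contextlib
import io


def run_and_capture_stdout(func, *args, **kwargs):
    """Call func and return (result, text printed to stdout during the call)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


# Hypothetical stand-in for query_api.query(q): it prints some text
# (illustrative only, not real profiler output) and returns a result.
def fake_query(q):
    print("profiler/query")
    print("TotalDuration: 123")
    return ["table-0"]


tables, profile_text = run_and_capture_stdout(
    fake_query, 'from(bucket:"my-bucket") |> range(start: -10m)'
)
```

The captured text would still have to be parsed by hand, which is why a structured interface remains the better option.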

@bednar
Contributor

bednar commented Jan 7, 2022

Hi @wfjm,

thanks for using our client.

I'd love to have a pythonic interface to the profile data, from which I could pick the parts of interest and build compact summaries.
Is that available?

Thanks for your suggestion. The client currently doesn't have this type of interface.

Is this something you might be willing to help with?

Regards

@bednar bednar added the enhancement New feature or request label Jan 7, 2022
@wfjm
Author

wfjm commented Jan 7, 2022

Hi @bednar,
thanks for the fast response.
My Python skills are not at the level of a library designer, and I'm certainly not in a position to propose and implement a solution that is compatible with the overall design and usage.

A few thoughts anyway: Currently one can enable profiling, and because the profile data is only written to stdout, everything else stays as it is. The interface must obviously be extended to provide access to profile data at the Python level. One possible solution would be to return a two-element tuple or dict when profiling is enabled, containing the current return value and an appropriate Python structure holding the profile data. That could be done via an additional API, maybe named query_with_profile_api, which always returns such a tuple or dict. The current query_api would stay as it is, to keep backward compatibility.

I'm currently looking into join() performance issues, see https://community.influxdata.com/t/23231.
For use cases like this it would obviously help a lot to have direct Python access to profile data.
That would make it possible to script and automate testing.
Currently this is tedious manual spot, pick, and cut-and-paste work.
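
The tuple-returning idea above might look roughly like the sketch below. Everything in it is hypothetical: `QueryResult`, `query_with_profile`, and the `run_query(q, sink)` signature are illustration-only names, not part of influxdb-client, and the record fields are made up. It assumes some lower layer can hand each profiler record to a sink function instead of printing it.

```python
from typing import Any, Callable, List, NamedTuple


class QueryResult(NamedTuple):
    tables: Any          # whatever the existing query call returns
    profile: List[dict]  # one dict per profiler record


def query_with_profile(
    run_query: Callable[[str, Callable[[dict], None]], Any], q: str
) -> QueryResult:
    """Hypothetical wrapper: run_query reports each profiler record to the sink."""
    profile: List[dict] = []
    tables = run_query(q, profile.append)
    return QueryResult(tables=tables, profile=profile)


# Stand-in backend so the sketch runs without a server (illustrative values).
def fake_run_query(q, sink):
    sink({"_measurement": "profiler/query", "TotalDuration": 1200})
    return ["table-0"]


result = query_with_profile(
    fake_run_query, 'from(bucket:"my-bucket") |> range(start: -10m)'
)
tables, profile = result  # unpacks like a plain two-element tuple
```

A named tuple keeps the two-element unpacking while also allowing `result.profile`, and the existing query call would remain untouched.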

@bednar
Contributor

bednar commented Jan 10, 2022

Hi @wfjm,

thanks for the detailed info. We are thinking about something like this (pseudo code):

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.flux_table import FluxRecord
from influxdb_client.client.query_api import QueryOptions
from influxdb_client.client.write_api import SYNCHRONOUS

with InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org", debug=True) as client:
    write_api = client.write_api(write_options=SYNCHRONOUS)

    # Prepare data
    _point1 = Point("my_measurement").tag("location", "Prague").field("temperature", 25.3)
    _point2 = Point("my_measurement").tag("location", "New York").field("temperature", 24.3)
    write_api.write(bucket="my-bucket", record=[_point1, _point2])

    # Define callback to process profiler results
    def profiler_callback(profiler_result: FluxRecord):
        print(f'Custom processing of profiler result: {profiler_result}')

    # Pass callback to QueryOptions
    query_api = client.query_api(
        query_options=QueryOptions(profilers=["query", "operator"], profiler_callback=profiler_callback))

    # Perform query
    tables = query_api.query('from(bucket:"my-bucket") |> range(start: -10m)')
    for table in tables:
        for record in table.records:
            print(record.values)
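
To show how such a callback could serve the "compact summaries" use case, here is a minimal sketch of a collecting callback. `ProfilerCollector` is a hypothetical helper, not part of the client. It assumes only that each profiler record exposes a dict-like `.values` attribute, as the records in the loop above do. The `SimpleNamespace` objects and their field names are illustrative stand-ins for real records.

```python
from types import SimpleNamespace


class ProfilerCollector:
    """Callable that stores the values of every profiler record it receives."""

    def __init__(self):
        self.records = []

    def __call__(self, profiler_result):
        # Assumes a dict-like `.values` attribute on the record.
        self.records.append(dict(profiler_result.values))

    def summary(self, *keys):
        """Return one compact dict per record, restricted to the given keys."""
        return [{k: r.get(k) for k in keys} for r in self.records]


collector = ProfilerCollector()

# Stand-ins for the records the client would pass to the callback.
collector(SimpleNamespace(values={"_measurement": "profiler/query",
                                  "TotalDuration": 1200,
                                  "CompileDuration": 300}))
collector(SimpleNamespace(values={"_measurement": "profiler/operator",
                                  "Label": "join"}))

compact = collector.summary("_measurement", "TotalDuration")
```

An instance like `collector` could be passed wherever a plain callback function is expected, and the collected records could then be inspected after the query returns.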

Is this suitable for your use case?

Regards

@wfjm
Author

wfjm commented Jan 11, 2022

@bednar,
thanks for the proposal!
Yes, a callback mechanism is certainly a good option.
It is probably also easy to add to the existing code base.
