feat(taps): Add api costs hook #704
@aaronsteers @edgarrmondragon it looks like the CI setup is not fully there yet for PRs: the job on Python 3.10 (py, external) complains about missing credentials for
@laurentS Don't worry about the pre-commit check. The external tests can be retried by an admin (I'll look into making it wait for approval instead of just failing).
Codecov Report
@@            Coverage Diff             @@
##             main     #704      +/-   ##
==========================================
+ Coverage   85.26%   85.31%   +0.05%
==========================================
  Files          34       34
  Lines        3386     3399      +13
==========================================
+ Hits         2887     2900      +13
  Misses        499      499
@laurentS I've added a few suggestions to rename `api` to `sync` in the base class, since I can imagine a more generic cost calculation for files, databases, etc.
Other points that might be good to start a discussion on:
- Which logger should be used. This is using the tap/stream logger, but I wonder if it makes sense to use a different one so metrics and tap behavior are not mixed in logs (related to "Reduce noise from 'ignored properties' warnings" #383 (comment)).
- Adding some tests to validate cost calculation in the general case.
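To make the logger question concrete, here is a minimal sketch of one way to keep cost metrics out of the regular tap log using only the standard `logging` module. The logger names (`tap-example`, `tap-example.metrics`) and the in-memory streams are assumptions for illustration; this is not how the SDK is wired today.

```python
import io
import logging

# In-memory buffers stand in for two separate destinations (e.g. stderr and a file).
tap_output = io.StringIO()
metrics_output = io.StringIO()

# Hypothetical logger for regular tap behavior.
tap_logger = logging.getLogger("tap-example")
tap_logger.setLevel(logging.INFO)
tap_logger.addHandler(logging.StreamHandler(tap_output))
tap_logger.propagate = False

# Hypothetical dedicated logger for metrics such as sync costs.
metrics_logger = logging.getLogger("tap-example.metrics")
metrics_logger.setLevel(logging.INFO)
metrics_logger.addHandler(logging.StreamHandler(metrics_output))
# Without this, records would also bubble up to the parent "tap-example" logger.
metrics_logger.propagate = False

tap_logger.info("Beginning sync of stream 'issues'")
metrics_logger.info("Total sync costs for stream issues: %s", {"rest": 10})
```

With propagation disabled on the child logger, the cost line reaches only the metrics destination, so the two kinds of output can be filtered or routed independently.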
Co-authored-by: Edgar R. M. <[email protected]>
From my perspective, this would be much better. Right now, the costs are logged at
I will try to write something for this.
@edgarrmondragon I've updated the code to match your suggestions, and added some testing to validate the summing code.
@laurentS thanks for the updates! I left a single suggestion. Other than that, this looks good to merge 😄.
Let's leave that for another PR, another day 👍
@laurentS alright, this lgtm. Thank you!
* Add api costs hook
* Correct typing hints for older pythons
* Rename api to sync
  Co-authored-by: Edgar R. M. <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Edgar R. M. <[email protected]>
* Rename cost methods
* Add sync costs calculation test
* Use a single loop for logging costs
  Co-authored-by: Edgar R. M. <[email protected]>
* Update tap_base.py
* Add test for log_sync_costs
  Co-authored-by: Edgar R. M. <[email protected]>
* Add missing import
  Co-authored-by: Edgar R. M. <[email protected]>

Co-authored-by: Eric Boucher <[email protected]>
This PR addresses #348 and adds a simple callback mechanism to the SDK that is:
`calculate_api_request_cost` in their streams.

Here is an example of output (at the very end of the tap run) for `tap-github` with the implementation linked above:

On the downside, the logging above could be improved for better machine handling. `log_api_costs` could be overridden by tap developers for this. And the way I call the logging method at the end feels a bit hacky, but I couldn't think of a better way to do it.
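As a rough illustration of the callback mechanism described above (not the code in this PR), the sketch below shows a per-request cost hook whose results are summed and logged once at the end of a run. Only the names `calculate_api_request_cost` and `log_api_costs` come from the description; the classes `CostTrackingStream` and `PointsStream`, the `update_api_costs` helper, and the dict-shaped response are hypothetical.

```python
import logging
from collections import defaultdict
from typing import Dict, Optional


class CostTrackingStream:
    """Hypothetical stand-in for an SDK stream base class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.logger = logging.getLogger(name)
        # Running totals, keyed by whatever cost categories the tap reports.
        self._api_costs: Dict[str, int] = defaultdict(int)

    def calculate_api_request_cost(self, response: dict) -> Optional[Dict[str, int]]:
        """Hook for tap developers: return the cost of one request, or None."""
        return None

    def update_api_costs(self, response: dict) -> None:
        """Add the cost of a single request to the running totals."""
        cost = self.calculate_api_request_cost(response)
        if cost:
            for category, amount in cost.items():
                self._api_costs[category] += amount

    def log_api_costs(self) -> None:
        """Log the totals once; override for machine-readable output."""
        if self._api_costs:
            self.logger.info(
                "Total API costs for stream %s: %s", self.name, dict(self._api_costs)
            )


class PointsStream(CostTrackingStream):
    """Example tap stream whose API charges 'points' per request (assumed)."""

    def calculate_api_request_cost(self, response: dict) -> Optional[Dict[str, int]]:
        return {"requests": 1, "points": response.get("cost", 0)}


stream = PointsStream("issues")
for fake_response in [{"cost": 2}, {"cost": 3}, {}]:
    stream.update_api_costs(fake_response)
stream.log_api_costs()
```

Streams that do not override the hook accumulate nothing and log nothing, which keeps the feature opt-in for tap developers.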