Add RowCountTrigger to Collector Service #859

magdalenakuhn17
Contributor

@magdalenakuhn17 magdalenakuhn17 commented Nov 19, 2023

For testing:

  • go to examples/integrations/collector_service
  • run evidently ui to run UI service
  • run python src/evidently/collector/app.py to run collector service
  • run python example_report.py to configure collector, workspace and report and start sending data
  • after someone else has tested the logic, I will: 1) remove the unnecessary prints from src/evidently/collector/config.py 2) remove RowsCountTrigger from examples/integrations/collector_service/example_report.py and enable IntervalTrigger again
    Screenshot 2023-11-25 at 09 52 04

Additional thoughts:

  • safeguard IntervalTrigger and RowsCountTrigger against zero or negative values
  • I can add the CronTrigger or other triggers if needed
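
The safeguard idea above could be sketched roughly as follows. This is only an illustration: `SafeRowsCountTrigger` and `SafeIntervalTrigger` are hypothetical stand-ins written as plain dataclasses, not the trigger classes from this PR, and the field names `rows_count` and `interval` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SafeRowsCountTrigger:
    """Hypothetical stand-in for RowsCountTrigger (not the class from this PR)."""

    rows_count: int = 1

    def __post_init__(self) -> None:
        # Reject zero and negative thresholds at construction time.
        if self.rows_count <= 0:
            raise ValueError(f"rows_count must be positive, got {self.rows_count}")


@dataclass
class SafeIntervalTrigger:
    """Hypothetical stand-in for IntervalTrigger (field name is an assumption)."""

    interval: float

    def __post_init__(self) -> None:
        # A zero or negative interval would make the trigger fire constantly.
        if self.interval <= 0:
            raise ValueError(f"interval must be positive, got {self.interval}")
```

Validating in the constructor means a bad configuration fails when the collector is set up, rather than later when the trigger is evaluated.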

@magdalenakuhn17
Contributor Author

Despite formatting on my local machine with black -l 120 -t py37 collector_service and committing the result in 12ca5d1, black still raises an error when executed in the GitHub workflow: https://github.com/evidentlyai/evidently/actions/runs/6921799242/job/18827823124#step:5:28 Does anyone know how I can mimic the exact GitHub workflow environment on my local machine?

@emeli-dral
Contributor

Hi @magdalenakuhn17 ,
first of all thank you so much!
I really like what you did in this PR.

I checked the output of the workflow, and it looks like there is some extra whitespace in src/evidently/collector/config.py.

I think the best way to reproduce the linter workflow run on your local machine is to run black --check --diff -l 120 -t py37 src or black --check --diff -l 120 -t py37 src/evidently/collector, so that you check all the needed files.

print("RowsCountTrigger")
print(f"rows_count: {self.rows_count}")
print(f"buffer length:{len(storage._buffers[config.id])}")
if len(storage._buffers[config.id]) > 0:
Collaborator

storage might not have a _buffers field; only InMemoryStorage has it, and that is just one implementation (there are no other implementations at the moment, but still).
You can introduce a new abstract method on CollectorStorage, like def get_buffer_size(self, id), and use it here instead (and implement it for the in-memory storage with len(self._buffers[id])).
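
A minimal sketch of that suggestion, under stated assumptions: `CollectorStorageSketch` and `InMemoryStorageSketch` are hypothetical, simplified stand-ins for the real classes, and the `append` method exists here only to make the example self-contained. Only `get_buffer_size` and `len(self._buffers[id])` come from the comment above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class CollectorStorageSketch(ABC):
    """Hypothetical, simplified stand-in for CollectorStorage."""

    @abstractmethod
    def append(self, id: str, data: Any) -> None:
        """Add one item to the buffer of the given collector (assumed helper)."""

    @abstractmethod
    def get_buffer_size(self, id: str) -> int:
        """Return the number of buffered items for the given collector."""


class InMemoryStorageSketch(CollectorStorageSketch):
    """Hypothetical, simplified stand-in for InMemoryStorage."""

    def __init__(self) -> None:
        self._buffers: Dict[str, List[Any]] = {}

    def append(self, id: str, data: Any) -> None:
        self._buffers.setdefault(id, []).append(data)

    def get_buffer_size(self, id: str) -> int:
        # The buffer layout stays private to this implementation.
        return len(self._buffers[id])
```

A trigger can then call `storage.get_buffer_size(config.id)` without knowing how a particular storage keeps its data, and any future storage class that omits the method cannot be instantiated.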

Contributor Author

True, it's safer to implement the buffer size as abstract method to force any future storage implementation to provide it. Thanks for the hint :) Done in 7881940 and retested on my local machine.

print(f"rows_count: {self.rows_count}")
print(f"buffer length:{len(storage._buffers[config.id])}")
if len(storage._buffers[config.id]) > 0:
    if len(storage._buffers[config.id]) % self.rows_count == 0:
Collaborator

Are you sure you want to use modulo here instead of just >=? It means that if rows_count is 10, the report will trigger only when the buffer size is 10, 20, 30, etc., but not 11, 15, 27, etc.
Also, you can merge this check with the previous one via and.
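
To make the difference concrete, here is a small sketch comparing the two checks on plain integers. The function names `ready_modulo` and `ready_threshold` are hypothetical and exist only for this illustration.

```python
def ready_modulo(buffer_size: int, rows_count: int) -> bool:
    # Fires only on exact multiples of rows_count.
    return buffer_size > 0 and buffer_size % rows_count == 0


def ready_threshold(buffer_size: int, rows_count: int) -> bool:
    # Fires as soon as the buffer holds at least rows_count rows.
    return buffer_size > 0 and buffer_size >= rows_count


if __name__ == "__main__":
    rows_count = 10
    for size in (9, 10, 11, 15, 20, 27):
        print(size, ready_modulo(size, rows_count), ready_threshold(size, rows_count))
```

With rows_count set to 10, the modulo variant is true only for sizes 10 and 20 in this loop, while the threshold variant is true for every size from 10 upwards. If rows arrive in batches, the buffer can jump past an exact multiple, and the modulo variant would then miss the trigger.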

Contributor Author

Combined both conditions and used >= instead of % in a705857

@mike0sv
Collaborator

mike0sv commented Nov 21, 2023

Hey @magdalenakuhn17 ! Thank you for this PR! I left some comments, and also you'll need to run linters like @emeli-dral suggested. Also, don't forget to remove commented code and debug prints once you are done (but I will remind you if you do forget 😄 )

@magdalenakuhn17
Contributor Author

> Hi @magdalenakuhn17 , first of all thank you so much! I really like what you did in this PR.
>
> I checked the output of the workflow and it looks like there are some extra whitespaces in the src/evidently/collector/config.py
>
> I think the best way to reproduce linter workflow run on the local machine is to run command black --check --diff -l 120 -t py37 src or black --check --diff -l 120 -t py37 src/evidently/collector so that you check all needed files.

True, I only executed black on examples/integrations/collector_service/ :D Thanks!


def is_ready(self, config: "CollectorConfig", storage: "CollectorStorage") -> bool:
    buffer_size = storage.get_buffer_size(config.id)
    if buffer_size > 0 and buffer_size >= self.rows_count:
Collaborator

You can just return the result of this check instead of literals

Contributor Author

I'm not sure I understand what you are pointing to. Can you explain? Is it that you wouldn't return the literal True/False, but instead the integer difference between buffer_size and rows_count, or something else? 🤔

Collaborator

just return buffer_size > 0 and buffer_size >= self.rows_count
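
For illustration, the two forms side by side. `RowsCountTriggerSketch` is a hypothetical, simplified class that takes the buffer size directly, and the verbose variant is an assumption about what the code looked like before this suggestion, since the full diff is not shown here.

```python
class RowsCountTriggerSketch:
    """Hypothetical, simplified stand-in for RowsCountTrigger."""

    def __init__(self, rows_count: int = 1) -> None:
        self.rows_count = rows_count

    def is_ready_verbose(self, buffer_size: int) -> bool:
        # Assumed earlier shape: branch on the condition and return literals.
        if buffer_size > 0 and buffer_size >= self.rows_count:
            return True
        return False

    def is_ready(self, buffer_size: int) -> bool:
        # Suggested shape: the comparison already evaluates to a bool.
        return buffer_size > 0 and buffer_size >= self.rows_count
```

Both methods return the same value for every input; the second one simply drops the redundant branch.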

Contributor Author

alright, adjusted in 9cbbe9e

@emeli-dral emeli-dral merged commit 5c4d0dd into evidentlyai:main Nov 30, 2023
22 checks passed