-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RowCountTrigger to Collector Service #859
Add RowCountTrigger to Collector Service #859
Conversation
Despite formatting on my local machine with |
Hi @magdalenakuhn17 , I checked the output of the workflow and it looks like there are some extra whitespaces in the I think the best way to reproduce linter workflow run on the local machine is to run command |
src/evidently/collector/config.py
Outdated
print("RowsCountTrigger") | ||
print(f"rows_count: {self.rows_count}") | ||
print(f"buffer length:{len(storage._buffers[config.id])}") | ||
if len(storage._buffers[config.id]) > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
storage
might not have _buffers
field, only InMemoryStorage
has it, which is only an implementation (there are no other implementations, but still).
You can introduce a new abstract method to CollectorStorage
like def get_buffer_size(self, id)
and use it here instead (and implement for in-memory storage with len(self._buffers[id])
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, it's safer to implement the buffer size as abstract method to force any future storage implementation to provide it. Thanks for the hint :) Done in 7881940 and retested on my local machine.
src/evidently/collector/config.py
Outdated
print(f"rows_count: {self.rows_count}") | ||
print(f"buffer length:{len(storage._buffers[config.id])}") | ||
if len(storage._buffers[config.id]) > 0: | ||
if len(storage._buffers[config.id]) % self.rows_count == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you want to use modulo here instead of just >=
? It means that if rows_count is 10, report will trigger only when buffer size is 10, 20, 30 etc, but not 11, 15, 27 etc
Also can merge this check with previous one via and
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combined both conditions and used >=
instead of %
in a705857
Hey @magdalenakuhn17 ! Thank you for this PR! I left some comments, and also you'll need to run linters like @emeli-dral suggested. Also, don't forget to remove commented code and debug prints once you are done (but I will remind you if you do forget 😄 ) |
True, I only executed black on |
src/evidently/collector/config.py
Outdated
|
||
def is_ready(self, config: "CollectorConfig", storage: "CollectorStorage") -> bool: | ||
buffer_size = storage.get_buffer_size(config.id) | ||
if buffer_size > 0 and buffer_size >= self.rows_count: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just return the result of this check instead of literals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand what you are pointing to, can you explain? Is it that you wouldn't return literal True/False, but instead the integer difference between buffer_size and rows_count or sth. else? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just return buffer_size > 0 and buffer_size >= self.rows_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright, adjusted in 9cbbe9e
RowCountTrigger
For testing:
evidently ui
to run UI servicepython src/evidently/collector/app.py
to run collector servicepython example_report.py
to configure collector, workspace and report and start sending datasrc/evidently/collector/config.py
2) will removeRowsCountTrigger
fromexamples/integrations/collector_service/example_report.py
and enableIntervalTrigger
againAdditional thoughts:
IntervalTrigger
andRowsCountTrigger
to not receive negative and zero as valueCronTrigger
or other triggers if needed