-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update fix/analytics data export cleanup #3819
Update fix/analytics data export cleanup #3819
Conversation
📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces significant modifications to the Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Warning Tool Failures:Tool Failure Count:📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🧰 Additional context used📓 Learnings (1)src/analytics/api/models/events.py (1)
🔇 Additional comments (5)src/analytics/api/models/events.py (5)
The consolidation of multiple filter parameters into
The change from generic
The dictionary-based table mapping improves code organization and makes frequency-based table selection more maintainable.
The use of Line range hint The implementation consistently uses parameterized queries for filter values across all query paths, which is secure against SQL injection. Also applies to: 173-187, 200-208 Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (3)
src/analytics/api/utils/data_formatters.py (2)
323-323
: Enhance exception chaining for better debugging.I appreciate the improvement in error propagation. To make debugging even easier, let's maintain the exception chain using Python's exception chaining syntax.
- raise RuntimeError(f"Error while filtering non private sites {rex}") + raise RuntimeError(f"Error while filtering non private sites {rex}") from rexAdditionally, consider adding a return type hint to the function signature:
def filter_non_private_sites(sites: List[str]) -> Dict[str, Union[List[str], Any]]:🧰 Tools
🪛 Ruff
323-323: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
351-351
: Enhance exception handling and documentation.Great improvement in error handling! Let's polish it further with these enhancements:
- Maintain exception chain:
- raise RuntimeError(f"Error while filtering non private devices {rex}") + raise RuntimeError(f"Error while filtering non private devices {rex}") from rex
- Fix docstring typo and parameter description:
""" - FilterS out private device IDs from a provided array of device IDs. + Filters out private device IDs from a provided array of device IDs. Args: - entities(List[str]): List of device/site ids to filter against. + devices(List[str]): List of device IDs to filter against.
- Add return type hint:
def filter_non_private_devices(devices: List[str]) -> Dict[str, Union[List[str], Any]]:🧰 Tools
🪛 Ruff
351-351: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
src/analytics/api/views/data.py (1)
115-117
: Remove unused variable 'data_type'The variable
data_type
is assigned but never used. Consider removing it to clean up the code.Apply this diff to remove the unused variable:
- data_type = self._get_valid_option( - json_data.get("datatype"), valid_options["data_types"], "datatype" - )🧰 Tools
🪛 Ruff
115-115: Local variable
data_type
is assigned to but never usedRemove assignment to unused variable
data_type
(F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
src/analytics/api/models/events.py
(8 hunks)src/analytics/api/utils/data_formatters.py
(2 hunks)src/analytics/api/views/data.py
(5 hunks)src/analytics/requirements.txt
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/analytics/requirements.txt
🧰 Additional context used
🪛 Ruff
src/analytics/api/utils/data_formatters.py
323-323: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
351-351: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
src/analytics/api/views/data.py
115-115: Local variable data_type
is assigned to but never used
Remove assignment to unused variable data_type
(F841)
🔇 Additional comments (9)
src/analytics/api/views/data.py (6)
3-3
: Proper initialization of logging
The addition of the logging
import and logger initialization sets up logging correctly for this module.
Also applies to: 37-37
93-97
: Enhanced error handling for filter validation
The new try-except block correctly handles ValueError
exceptions from _get_validated_filter
, returning an appropriate HTTP 400 response.
101-120
: Improved error handling for option validation
The try-except block ensures that invalid options are gracefully handled, providing clear error messages to the user.
🧰 Tools
🪛 Ruff
115-115: Local variable data_type
is assigned to but never used
Remove assignment to unused variable data_type
(F841)
124-126
: Verify default value for 'minimum_output'
The minimum_output
parameter defaults to True
when not provided. Please confirm that this default aligns with the intended behavior and that downstream processes handle this value appropriately.
141-142
: Updated data retrieval with consolidated filters
The call to download_from_bigquery
now correctly uses filter_type
and filter_value
, aligning with the updated filtering mechanism in EventsModel
.
196-235
: Effective validation of filter parameters
The _get_validated_filter
method ensures that exactly one of 'sites', 'devices', or 'airqlouds' is provided, enhancing input validation and preventing ambiguous requests.
src/analytics/api/models/events.py (3)
65-71
: Efficient mapping of data tables based on frequency
The use of a dictionary to map frequencies to their corresponding data tables enhances code readability and maintainability. This dynamic selection simplifies future additions or modifications to frequency options.
72-73
: Improved exception handling for invalid frequency
Raising a ValueError
when an invalid frequency is provided offers clearer feedback and aids in debugging. This is a good practice for robust error handling.
200-201
: Clarify the sorting columns when filtering by airqlouds
Including airqloud_id
in sorting_cols
when filter_type
is 'airqlouds'
ensures the data is sorted correctly. This enhances data consistency and readability in the final output.
if filter_type == "sites": | ||
validated_value = filter_non_private_sites(sites=filter_value).get( | ||
"sites", [] | ||
) | ||
elif filter_type == "devices": | ||
validated_value = filter_non_private_devices(devices=filter_value).get( | ||
"devices", [] | ||
) | ||
else: | ||
# No additional processing is needed for 'airqlouds' | ||
validated_value = filter_value | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Handle cases where filters result in empty lists
After filtering non-private sites or devices, validated_value
may be empty. Consider adding checks to raise a ValueError
if no valid items are found, providing clear feedback to the user.
Apply this diff to add empty checks:
if filter_type == "sites":
validated_value = filter_non_private_sites(sites=filter_value).get(
"sites", []
)
+ if not validated_value:
+ raise ValueError("No valid sites found after filtering.")
elif filter_type == "devices":
validated_value = filter_non_private_devices(devices=filter_value).get(
"devices", []
)
+ if not validated_value:
+ raise ValueError("No valid devices found after filtering.")
else:
# No additional processing is needed for 'airqlouds'
validated_value = filter_value
+ if not validated_value:
+ raise ValueError("No valid airqlouds provided.")
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if filter_type == "sites": | |
validated_value = filter_non_private_sites(sites=filter_value).get( | |
"sites", [] | |
) | |
elif filter_type == "devices": | |
validated_value = filter_non_private_devices(devices=filter_value).get( | |
"devices", [] | |
) | |
else: | |
# No additional processing is needed for 'airqlouds' | |
validated_value = filter_value | |
if filter_type == "sites": | |
validated_value = filter_non_private_sites(sites=filter_value).get( | |
"sites", [] | |
) | |
if not validated_value: | |
raise ValueError("No valid sites found after filtering.") | |
elif filter_type == "devices": | |
validated_value = filter_non_private_devices(devices=filter_value).get( | |
"devices", [] | |
) | |
if not validated_value: | |
raise ValueError("No valid devices found after filtering.") | |
else: | |
# No additional processing is needed for 'airqlouds' | |
validated_value = filter_value | |
if not validated_value: | |
raise ValueError("No valid airqlouds provided.") |
def _get_valid_option(self, option, valid_options, option_name): | ||
""" | ||
Returns a validated option, defaulting to the first valid option if not provided or invalid. | ||
Returns a validated option, raising an error with valid options if invalid. | ||
|
||
Args: | ||
option (str): Option provided in the request. | ||
valid_options (list): List of valid options. | ||
option_name (str): The name of the option being validated. | ||
|
||
Returns: | ||
str: A validated option from the list. | ||
|
||
Raises: | ||
ValueError: If the provided option is invalid. | ||
""" | ||
return ( | ||
option.lower() | ||
if option and option.lower() in valid_options | ||
else valid_options[0] | ||
) | ||
if option and option.lower() in valid_options: | ||
return option.lower() | ||
if option: | ||
raise ValueError( | ||
f"Invalid {option_name}. Valid values are: {', '.join(valid_options)}." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Assign default values when options are missing
The _get_valid_option
method returns None
if an option is not provided, which may lead to issues if downstream code does not handle None
values. Consider assigning default values from valid_options
to ensure consistent behavior.
Modify the method to assign default values:
def _get_valid_option(self, option, valid_options, option_name):
"""
Returns a validated option, raising an error with valid options if invalid.
Args:
option (str): Option provided in the request.
valid_options (list): List of valid options.
option_name (str): The name of the option being validated.
Returns:
str: A validated option from the list.
Raises:
ValueError: If the provided option is invalid.
"""
+ if not option:
+ return valid_options[0]
elif option.lower() in valid_options:
return option.lower()
else:
raise ValueError(
f"Invalid {option_name}. Valid values are: {', '.join(valid_options)}."
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _get_valid_option(self, option, valid_options, option_name): | |
""" | |
Returns a validated option, defaulting to the first valid option if not provided or invalid. | |
Returns a validated option, raising an error with valid options if invalid. | |
Args: | |
option (str): Option provided in the request. | |
valid_options (list): List of valid options. | |
option_name (str): The name of the option being validated. | |
Returns: | |
str: A validated option from the list. | |
Raises: | |
ValueError: If the provided option is invalid. | |
""" | |
return ( | |
option.lower() | |
if option and option.lower() in valid_options | |
else valid_options[0] | |
) | |
if option and option.lower() in valid_options: | |
return option.lower() | |
if option: | |
raise ValueError( | |
f"Invalid {option_name}. Valid values are: {', '.join(valid_options)}." | |
) | |
def _get_valid_option(self, option, valid_options, option_name): | |
""" | |
Returns a validated option, raising an error with valid options if invalid. | |
Args: | |
option (str): Option provided in the request. | |
valid_options (list): List of valid options. | |
option_name (str): The name of the option being validated. | |
Returns: | |
str: A validated option from the list. | |
Raises: | |
ValueError: If the provided option is invalid. | |
""" | |
if not option: | |
return valid_options[0] | |
elif option.lower() in valid_options: | |
return option.lower() | |
else: | |
raise ValueError( | |
f"Invalid {option_name}. Valid values are: {', '.join(valid_options)}." | |
) |
Description
This PR improves error handling and prepares to update the query builder.
Related Issues
Summary by CodeRabbit
New Features
Bug Fixes
Chores