-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(core-clp): get_entry_matching_value should handle ignore-case properly. #690
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe pull request introduces significant modifications to the Changes
Sequence DiagramsequenceDiagram
participant DR as DictionaryReader
participant EVI as EncodedVariableInterpreter
participant Output as Search Output
DR->>DR: Search for entries
DR-->>EVI: Return vector of matching entries
EVI->>EVI: Process multiple entries
EVI-->>Output: Pass encoded variables
Output->>Output: Populate string queries
Possibly related issues
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (9)
components/core/src/clp_s/search/clp_search/EncodedVariableInterpreter.cpp (3)
39-39
: Consider adding more context or logs upon returning false
When the dictionary returns no entries, returning false implies no match. For traceability and debugging, consider logging a brief message indicating the unmatched variable string.
46-51
: Avoid shadowing and preserve clarity
The variableencoded_var
declared inside this block may overshadow theencoded_var
declared above. To improve clarity and maintain scoping hygiene, consider renaming or reusing the existing variable.
53-61
: Confirm efficiency for large sets of matches
Usingstd::unordered_set
for both the IDs and the entries is a good approach for preventing duplicates. If the dictionary can grow large, ensure that insertion and iteration remain performant.components/core/src/clp_s/DictionaryReader.hpp (1)
159-181
: Evaluate case-sensitive early exit logic
Early exit in the case-sensitive branch returns after the first match. This design is acceptable if the business logic requires only one match; otherwise, you may want to continue searching for potential duplicates. Also, consider that repeatedly converting each dictionary entry’s value to uppercase in the else-branch may affect performance on large dictionaries.components/core/src/clp/DictionaryReader.hpp (1)
236-258
: Ensure consistency in case handling
The code pattern for case-sensitive versus case-insensitive matches is consistent. Similar to the otherDictionaryReader
, consider potential performance costs of repeatedly callingboost::algorithm::to_upper_copy()
on each iteration.components/core/tests/test-EncodedVariableInterpreter.cpp (2)
379-379
: Correct typographical error in test name
The word “metching” should be “matching.”- SECTION("Test multiple metching values") { + SECTION("Test multiple matching values") {
382-405
: Comprehensive case-insensitive verification
By storing multiple string variants and confirming that only the case-insensitive retrieval returns all of them, you have validated the updated dictionary logic thoroughly. Consider verifying each returned entry’s content for added rigour.components/core/src/clp_s/search/Output.cpp (2)
935-935
: Consider checking if the returned vector is empty before use.
While it's not strictly necessary, confirming thatentries
is non-empty (or otherwise handling the empty case) can improve clarity and prevent potential unexpected behaviour in future modifications.
940-940
: Reserve space in the set before inserting to avoid repeated rehashing.
Since you already know the approximate number of elements from the size ofentries
, reserving that capacity inmatching_vars
(e.g.,matching_vars.reserve(entries.size())
) can marginally improve performance and reduce memory overhead.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
components/core/src/clp/DictionaryReader.hpp
(2 hunks)components/core/src/clp/EncodedVariableInterpreter.cpp
(1 hunks)components/core/src/clp_s/DictionaryReader.hpp
(2 hunks)components/core/src/clp_s/search/Output.cpp
(1 hunks)components/core/src/clp_s/search/clp_search/EncodedVariableInterpreter.cpp
(1 hunks)components/core/tests/test-EncodedVariableInterpreter.cpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
components/core/src/clp_s/search/clp_search/EncodedVariableInterpreter.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/tests/test-EncodedVariableInterpreter.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp_s/DictionaryReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp/DictionaryReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp_s/search/Output.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp/EncodedVariableInterpreter.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
🔇 Additional comments (9)
components/core/src/clp_s/search/clp_search/EncodedVariableInterpreter.cpp (2)
37-38
: Ensure consistent handling of multiple matched entries
Changing the function call to return a vector works well for retrieving multiple entries, thereby supporting the new requirement for imprecise matching. This approach is clear and logically sound.
45-45
: Verify classification before callingadd_non_double_var
While this is logically consistent with the dictionary path, verify that treating all dictionary-based entries asnon_double
is appropriate in all cases. If some entries are numeric or require special handling, additional checks may be needed.components/core/src/clp_s/DictionaryReader.hpp (1)
67-69
: Accurate docstring update
The revised docstring correctly states that the method returns a vector of matching entries, which may be empty.components/core/src/clp/DictionaryReader.hpp (1)
85-87
: Docstring alignment with new return type
The documentation now accurately states that multiple matching entries can be returned.components/core/src/clp/EncodedVariableInterpreter.cpp (4)
389-390
: Vector-based retrieval
Switching to retrieving all matching entries in a vector offers more flexibility for handling multiple matches. This aligns well with your updated dictionary logic.
391-394
: Early return on empty matches
Returningfalse
when no dictionary entries are found is a clear approach. Ensure callers handle this outcome properly.
398-403
: Single match flow
Dispatching a single dictionary entry and returning immediately keeps the logic straightforward.
404-413
: Handling multiple matches
Encoding and storing all dictionary entries in an imprecise variable set is a practical approach for ambiguous matches. No issues here.components/core/tests/test-EncodedVariableInterpreter.cpp (1)
409-414
: Prudent test clean-up
Deleting the dictionary files at the end of the test prevents workspace pollution.
if (false == ignore_case) { | ||
for (auto const& entry : m_entries) { | ||
if (entry.get_value() == search_string) { | ||
return &entry; | ||
entries.push_back(&entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LinZhihao-723 Should we use emplace_back instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to use push_back
for pointer types. Emplace is only required if the underlying object requires a constructor.
|
||
if (entries.size() == 1) { | ||
clp::VariableDictionaryEntry const* entry = entries[0]; | ||
auto encoded_var = encode_var_dict_id(entry->get_id()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to shadow the encoded_var declared out side of this scope, do we need a new variable here?
Or maybe, we can refactor the code to not have a "globally declared" encoded_var
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was an oversight. It seems to me that remove the "globally declared" encoded_var here may make the first two cases a little more verbose. At the end of the day, this seems clear enough:
encoded_variable_t encoded_var;
if (convert_string_to_representable_integer_var(var_str, encoded_var)) {
LogTypeDictionaryEntry::add_int_var(logtype);
sub_query.add_non_dict_var(encoded_var);
} else if (convert_string_to_representable_float_var(var_str, encoded_var)) {
LogTypeDictionaryEntry::add_float_var(logtype);
sub_query.add_non_dict_var(encoded_var);
} else {
auto const entries = var_dict.get_entry_matching_value(var_str, ignore_case);
...
}
@@ -82,9 +82,9 @@ class DictionaryReader { | |||
* Gets the entry exactly matching the given search string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should update the docstring.
* Gets the entry exactly matching the given search string | |
* Gets the entry matching the given search string |
@kirkrodrigues any suggestions on how to account for multiple possible matching? "entry(entries)"?
vector<clp::variable_dictionary_id_t> var_ids; | ||
clp::LogTypeDictionaryEntry logtype_dict_entry; | ||
for (auto var_str : var_strs) { | ||
EncodedVariableInterpreter::encode_and_add_to_dictionary( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of directly using the var_strs as msg argument for encode_and_add_to_dictionary, can we construct a message with a dummy log type?
I guess what you can do can be similar to
string const msg_template = "here is a string with a dictionary var: ".
for var_str in var_strs:
auto const msg = msg_template + var_id
...
Thanks for opening the PR. I have left a batch of comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
components/core/src/clp_s/DictionaryReader.hpp (2)
69-70
: Consider usingstd::vector<EntryType const*>
on a single line.The return type declaration spans two lines unnecessarily.
- std::vector<EntryType const*> - get_entry_matching_value(std::string const& search_string, bool ignore_case) const; + std::vector<EntryType const*> get_entry_matching_value(std::string const& search_string, bool ignore_case) const;
159-182
: Consider usingemplace_back
instead ofpush_back
.The implementation looks good, but consider using
emplace_back
for potentially better performance when adding entries to the vector.- entries.push_back(&entry); + entries.emplace_back(&entry);Also at:
- entries.push_back(&entry); + entries.emplace_back(&entry);components/core/src/clp/EncodedVariableInterpreter.cpp (3)
389-390
: Consider using auto with explicit type for better readability.The type of
entries
could be made more explicit while still usingauto
.- auto const entries = var_dict.get_entry_matching_value(var_str, ignore_case); + auto const entries = std::vector<VariableDictionaryEntry const*>{var_dict.get_entry_matching_value(var_str, ignore_case)};
398-403
: Consider usingstd::size
instead ofsize()
.The single entry case is handled correctly, but consider using
std::size
for consistency with modern C++.- if (entries.size() == 1) { + if (std::size(entries) == 1) {
405-413
: Consider reserving space in the sets for better performance.When dealing with multiple entries, consider reserving space in both sets to avoid reallocations.
std::unordered_set<encoded_variable_t> encoded_vars; + encoded_vars.reserve(entries.size()); std::unordered_set<clp::VariableDictionaryEntry const*> const entries_set( entries.begin(), entries.end() ); for (auto const& entry : entries) { encoded_vars.insert(encode_var_dict_id(entry->get_id())); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
components/core/src/clp/DictionaryReader.hpp
(2 hunks)components/core/src/clp/EncodedVariableInterpreter.cpp
(1 hunks)components/core/src/clp_s/DictionaryReader.hpp
(2 hunks)components/core/src/clp_s/search/Output.cpp
(1 hunks)components/core/src/clp_s/search/clp_search/EncodedVariableInterpreter.cpp
(1 hunks)components/core/tests/test-EncodedVariableInterpreter.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- components/core/src/clp_s/search/clp_search/EncodedVariableInterpreter.cpp
- components/core/tests/test-EncodedVariableInterpreter.cpp
🧰 Additional context used
📓 Path-based instructions (4)
components/core/src/clp/DictionaryReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp/EncodedVariableInterpreter.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp_s/DictionaryReader.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
components/core/src/clp_s/search/Output.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
🔇 Additional comments (4)
components/core/src/clp_s/DictionaryReader.hpp (1)
67-67
: LGTM! Documentation updated to reflect the new return type.The docstring accurately describes that the function now returns a vector of entries.
components/core/src/clp/DictionaryReader.hpp (1)
236-259
: Same suggestions as in the other DictionaryReader.hpp file apply here.Consider:
- Using
std::vector<EntryType const*>
on a single line- Using
emplace_back
instead ofpush_back
components/core/src/clp/EncodedVariableInterpreter.cpp (1)
391-397
: LGTM! Early return for empty entries.The empty check and early return is well-placed and maintains the function's contract.
components/core/src/clp_s/search/Output.cpp (1)
935-940
: LGTM! Correctly handles multiple entries from the dictionary.The code properly retrieves and processes multiple entries from the variable dictionary.
This pull request implements a fix for the issue #648
Here's a short recap of the main changes:
get_entry_matching_value
now returns a vectorencode_and_search_dictionary
adds imprecise dict var when there are more matching entriesThere are four commits because I worked on it taking advantage of @haiqi96's advices (see the discussion on #648). Should I squash them?
Summary by CodeRabbit
New Features
Bug Fixes
Tests
The changes improve the system's ability to handle complex dictionary searches, allowing for more nuanced and precise variable matching across different components.