Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce MongoDBTextFilterUsage check #649

Merged
merged 6 commits into from
Jul 31, 2023
Merged

Conversation

philleonard
Copy link
Member

@philleonard philleonard commented May 26, 2023

What?

A simple check which discourages the usage of Filters#text API in Mongo queries.

Why?

We have observed at Picnic that using MongoDB to serve full text search queries can lead to major cluster outages (Mong server OOMs) when query strings are tokenised into even just a handful of tokens, invoking an index scan (IXSCAN) for each individual token in a query string. As Mongo does not gracefully handle such queries, and it is not possible to ban the usage of this filter on the (Mongo) server-side, our best effort is to introduce a Bug Check. MongoDB advises against implementing FTS queries on Mongo Atlas, and to use alternative solutions like Atlas Search, and other search engine databases like Elastic Search, or by simply implementing bespoke queries.

Implementation notes

  • Description "Usage of the Mongo full-text search ($text) queries is discouraged" doesn't actually hold true, perhaps there are other reasons why you would need to use the Filters API to obtain the Bson query and not use it in the Java driver - this Bug Check would still flag this. I can add the driver dependencies and match on the execution of a query using the drivers, or we can settle on this being good enough.
  • Named enum value MONGO, like others in ThirdPartyLibrary unsure why we don't refer to the specific name of the FQC but likely we can revisit later when we need to rename in order to encompass more than just the Filters API in this case.
  • Excluding the org.mongodb:bson-record-codec artifact , not required for testing, but contains classes violating compatibility with byte code enforcement limited by the max JDK version (11). The other option is to (see commit) ignore the class via Maven byteCodeEnforcement ( org.bson.codecs.record.* ) but I think this would be more disruptive.
  • Perhaps a more elaborate summary is desired to help others better understand the underlying problem, but typically we keep these pretty succinct?

Suggested commit message

Introduce `MongoFullTextSearchQueryUsage` check (#649) 

Discourage the inefficient usage of `mongodb-driver-core`'s `Filters#text` 
for constructing full-text search queries. 

@github-actions
Copy link

  • Surviving mutants in this change: 1
  • Killed mutants in this change: 5
class surviving killed
🧟tech.picnic.errorprone.bugpatterns.MongoFullTextSearchQueryUsage 1 5

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@philleonard philleonard requested review from Stephan202 and rickie and removed request for Stephan202 May 26, 2023 11:59
@github-actions
Copy link

  • Surviving mutants in this change: 1
  • Killed mutants in this change: 5
class surviving killed
🧟tech.picnic.errorprone.bugpatterns.MongoFullTextSearchQueryUsage 1 5

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

Copy link
Member

@rickie rickie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @philleonard! Just a quick comment :).

|| !TEXT_FILTER_INVOCATION.matches(tree, state)) {
return Description.NO_MATCH;
}
return describeMatch(tree);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we do not have a SuggestedFix we don't need to have this ThirdPartyLibrary.MONGO part. That is to prevent us from suggesting a Mongo fix while someone might not want / need this. Here we are only flagging something, so we can omit that part :).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I misread. I thought the intention of ThirdPartyLibrary was to list all possibly not present dependencies for bug checks even without a suggested fix. Good catch, will update.

@github-actions
Copy link

Looks good. All 4 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoFullTextSearchQueryUsage 0 4

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@philleonard
Copy link
Member Author

If I understood @Stephan202 correctly from an offline discussion we can downgrade this to a SUGGESTION and for our fork of error-prone utilise this upcoming feature to PicnicSupermarket/error-prone#33. However this would apply to 25 other SUGGESTIONs if we were to utilise this flag internally.

Question, would there be an option to add an additional tag from an enum to indicate the Picnic escalation on a per check basis using our fork's processing of the annotation yet still ensuring compatibility with error-prone? Something like:

enum PicnicSeverityEscalation {
    WARNING_TO_ERROR,
    SUGGESTION _TO_ERROR,
    SUGGESTION_TO_WARNING
}

or simply just

enum PicnicSeverity {
    ERROR,
    WARNING,
    SUGGESTION
}

such that for usage in @BugPatterns:

@BugPattern(
    summary =
        "Usage of Mongo for full-text search queries via the `$text` operator is discouraged.",
    link = BUG_PATTERNS_BASE_URL + "MongoFullTextSearchQueryUsage",
    linkType = CUSTOM,
    severity = WARNING,
    tags = {PERFORMANCE, PicnicSeverityEscalation.SUGGESTION_TO_WARNING})
...

then in our fork modify BugCheckerInfo to iterate over the tags and override the BugCheckerInfo#defaultSeverity where necessary?

Interested to hear your thoughts!

@rickie
Copy link
Member

rickie commented Jun 7, 2023

Hi @philleonard,

If I understood @Stephan202 correctly from an offline discussion we can downgrade this to a SUGGESTION and for our fork of error-prone utilise this upcoming feature to PicnicSupermarket/error-prone#33.

Yes we should make this a SUGGESTION. We already use the flag internally (in PSM) so this will already work! The PR you linked is open for reference but it's already included in the release of the fork. It would be nice however if the upstream PR google/error-prone#3301 would be merged. I'll rebase the PR and do another ping there.

W.r.t. your other ideas and suggestions; I think it is easier to just configure/override the severity we want a specific BugChecker to have, so we can use the Error Prone command line flags for that. It would allow us to say, just change the severity of this specific check to WARNING or ERROR, if we do not have the -Xep:AllSuggestionsAsWarnings flag.

Internally, we enable all the checks and only disable the ones that are not applicable on our codebase, or have a high number of false positives for whatever reason 😄.

So what I mentioned above allows us to configure it just like we want to.

@rickie rickie force-pushed the pleonard/mongo-text-search branch from 8242ea9 to 91fad12 Compare June 7, 2023 16:57
@github-actions
Copy link

github-actions bot commented Jun 7, 2023

Looks good. All 4 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoFullTextSearchQueryUsage 0 4

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@github-actions
Copy link

Looks good. All 3 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoFullTextSearchQueryUsage 0 3

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

Copy link
Member

@rickie rickie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased and added a commit. Looks nice!
Only doubt is about the name if it can be shorter, but didn't find a good alternative. Maybe we can drop the Full part?

Suggested commit message:

Introduce `MongoFullTextSearchQueryUsage` check (#649)

@@ -32,6 +32,7 @@ public enum ThirdPartyLibrary {
* @see <a href="https://github.com/google/guava">Guava on GitHub</a>
*/
GUAVA("com.google.common.collect.ImmutableList"),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure to not sneak in any unrelated changes 😉.

public final class MongoFullTextSearchQueryUsage extends BugChecker
implements MethodInvocationTreeMatcher {
private static final long serialVersionUID = 1L;
private static final Matcher<ExpressionTree> TEXT_FILTER_INVOCATION =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually use _METHOD instead of INVOCATION. In this case I made a suggestion to make it a bit clearer what the matches wants to match. Maybe we can drop the MONGO_ prefix, but now its inline with the class and order of matching.

"Usage of Mongo for full-text search queries via the `$text` operator is discouraged.",
link = BUG_PATTERNS_BASE_URL + "MongoFullTextSearchQueryUsage",
linkType = CUSTOM,
severity = WARNING,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change to a SUGGESTION as discussed in the other comments.


@Override
public Description matchMethodInvocation(MethodInvocationTree tree, VisitorState state) {
if (!TEXT_FILTER_INVOCATION.matches(tree, state)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do a ?: here, that'll be readable still. We have one occurrence of this in the MoreMatchersTest :).

"class A {",
"",
" void m() {",
" Bson allowed = Filters.eq(\"a\", \"b\");",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the Bson allowed = and other variables :).

" // BUG: Diagnostic contains:",
" Bson textSearch = Filters.text(\"Some text\");",
" // BUG: Diagnostic contains:",
" Bson textSearch2 = Filters.text(\"Some text\", new TextSearchOptions().caseSensitive(true));",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one extra case, just for completeness but might not be that necessary...

@rickie rickie added this to the 0.12.0 milestone Jun 10, 2023
@rickie
Copy link
Member

rickie commented Jun 13, 2023

An extra thought: we could've also used a Checkstyle module here to flag this method. However, I imagine this check being generalized when we find another method we want to flag.

We can have a check that flags multiple specific methods and have a custom message for every thing that we want to flag. Additionally, we should see if we can provide an optional fix for the methods that do have a simple replacement. So the check will contain methods without possible replacement, or simpler replacements then the ones from the FluxFlatMapUsage check.

@Stephan202 Stephan202 modified the milestones: 0.12.0, 0.13.0 Jun 21, 2023
@rickie rickie force-pushed the pleonard/mongo-text-search branch from b7825c4 to 3a782a7 Compare July 2, 2023 07:44
@rickie
Copy link
Member

rickie commented Jul 2, 2023

Rebased.

@github-actions
Copy link

github-actions bot commented Jul 2, 2023

Looks good. All 3 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoFullTextSearchQueryUsage 0 3

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@Stephan202 Stephan202 force-pushed the pleonard/mongo-text-search branch from 3a782a7 to 81441bb Compare July 23, 2023 11:12
Copy link
Member

@Stephan202 Stephan202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased and added a commit. Indeed we might want to eventually generalize this to a check which flags methods with a custom message.

Tweaked suggested commit message, given the proposed rename:

Introduce `MongoDBTextFilterUsage` check (#649)

import com.sun.source.tree.MethodInvocationTree;

/**
* A {@link BugChecker} that flags usages of Mongo {@code $text} filters used for full text
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere: suggest we spell out MongoDB :)

*/
@AutoService(BugChecker.class)
@BugPattern(
summary = "Avoid the `$text` operator in Mongo's full-text search queries",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
summary = "Avoid the `$text` operator in Mongo's full-text search queries",
summary = "Avoid MongoDB's `$text` filter operator, as it can trigger heavy queries and even cause the server to run out of memory",

linkType = CUSTOM,
severity = SUGGESTION,
tags = PERFORMANCE)
public final class MongoFullTextSearchQueryUsage extends BugChecker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W.r.t. the name: how about MongoDBTextFilterUsage?

pom.xml Outdated
Comment on lines 450 to 453
<!-- XXX: Drop this when we forgo enforcement of
JDK 11 bytecode version compatibility. See follow-ups from:
https://github.com/PicnicSupermarket/error-prone-support/pull/603
https://github.com/PicnicSupermarket/error-prone-support/pull/198. -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For maintainability purposes I propose to omit the links; IIUC even after those changes this exclusion would apply.

pom.xml Outdated
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-core</artifactId>
<version>4.9.1</version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will upgrade.

@github-actions
Copy link

Looks good. All 3 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoDBTextFilterUsage 0 3

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@rickie rickie force-pushed the pleonard/mongo-text-search branch from 81441bb to 94ee99e Compare July 23, 2023 12:36
@rickie
Copy link
Member

rickie commented Jul 23, 2023

Rebased.

@github-actions
Copy link

Looks good. All 3 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoDBTextFilterUsage 0 3

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@rickie rickie force-pushed the pleonard/mongo-text-search branch from 94ee99e to 50030d8 Compare July 24, 2023 08:11
@github-actions
Copy link

Looks good. All 3 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoDBTextFilterUsage 0 3

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@rickie rickie force-pushed the pleonard/mongo-text-search branch from 50030d8 to 521b1b9 Compare July 31, 2023 14:58
@rickie
Copy link
Member

rickie commented Jul 31, 2023

Rebased, will merge!

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

100.0% 100.0% Coverage
0.0% 0.0% Duplication

@github-actions
Copy link

Looks good. All 3 mutations in this change were killed.

class surviving killed
🎉tech.picnic.errorprone.bugpatterns.MongoDBTextFilterUsage 0 3

Mutation testing report by Pitest. Review any surviving mutants by inspecting the line comments under Files changed.

@rickie rickie merged commit e03d0de into master Jul 31, 2023
@rickie rickie deleted the pleonard/mongo-text-search branch July 31, 2023 15:19
@Stephan202 Stephan202 changed the title Introduce MongoFullTextSearchQueryUsage check Introduce MongoDBTextFilterUsage check Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants