[v1.0.0 nightly] - Search misses "easy" hits when there are intervening words or the order is changed #2325

Closed
5 tasks done
jecorn opened this issue Apr 4, 2023 · 6 comments · Fixed by #2351

Comments

@jecorn
Contributor

jecorn commented Apr 4, 2023

First Check

  • This is not a feature request
  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the Mealie documentation, with the integrated search.
  • I already read the docs and didn't find an answer.

What is the issue you are experiencing?

I have a pretty large recipe database (6'000 or so) and have started noticing that search in the nightly release is not returning "easy" hits. It seems almost like search is not able to skip intervening words. For example, I have a recipe (json below) for "Pinto Bean, Beet, and Bulgur Burgers". Searching for "bulgur burger" or "pinto bean" or other combinations of words right next to each other in the recipe text/title/etc returns the recipe. But many other "easy" searches with intervening words or with the words reversed do not. For example, none of the following return the recipe:

  • "burger bulgur" (reverse)
  • "pinto burger" (skip "bean beet and bulgur")
  • "burger pinto"
  • "beet burger" (skip "and bulgur")
  • "pinto beet" (skip "bean")
  • etc

I thought that search was supposed to be robust to these kinds of simple permutations? It's almost as if search ignores stop words and punctuation (e.g. "and" and commas are not required) but otherwise operates like a case-insensitive regex. But not exactly, because a few searches that skip words do work, like “bean burger”.

{
    "id": "7228d1d2-6ae2-4c6c-82a2-05af68c32919",
    "userId": "5dfa0c82-f7fe-4900-bd5b-0215ad2151fa",
    "groupId": "8eb87e39-f2a6-457d-98ef-e16554be8f02",
    "name": "Pinto Bean, Beet, and Bulgur Burgers",
    "slug": "pinto-bean-beet-and-bulgur-burgers",
    "image": "255",
    "recipeYield": "8",
    "totalTime": "1¼ hours",
    "prepTime": null,
    "cookTime": null,
    "performTime": null,
    "description": "For these modern bean burgers, we combined pinto beans with sweet, earthy shredded beets and hearty, chewy bulgur. While the bulgur cooked, we pulsed the rest of the ingredients in a food processor to get just the right consistency. Along with the beets and beans, we added basil for freshness and walnuts for richness and texture. Garlic and mustard deepened the savory flavors. Using carrot baby food (which was already conveniently pureed) as a binder instead of eggs lent the patties a subtle sweetness—and, as an added bonus, it kept the recipe vegan. Panko bread crumbs further bound the mixture and helped the patties to sear up with a nice, crispy crust. Any brand of plain carrot baby food will work here. Use a coarse grater or the shredding disk of a food processor to shred the beets. If using the food processor, you may need to cut the beet into smaller pieces to fit inside the feed tube. ",
    "recipeCategory": [],
    "tags": [
        {
            "id": "a6c8c13c-fc79-47ec-8ccf-6fba22f56814",
            "name": "Main Courses",
            "slug": "main-courses"
        },
        {
            "id": "2691fceb-27e1-4a92-9db9-9d5180365ce6",
            "name": "Vegetables",
            "slug": "vegetables"
        },
        {
            "id": "be753207-823f-4a60-bae3-2ec30afebae7",
            "name": "Grains",
            "slug": "grains"
        },
        {
            "id": "8598b5dd-9590-4f3d-8778-e90dc3d5a04e",
            "name": "Vegan",
            "slug": "vegan"
        },
        {
            "id": "92dfd330-5ba2-4bb5-8908-ef2d8e3bcc8a",
            "name": "Vegetarian",
            "slug": "vegetarian"
        },
        {
            "id": "74361f7e-e7c2-45aa-8741-46ff64b44fcd",
            "name": "Sandwiches",
            "slug": "sandwiches"
        }
    ],
    "tools": [],
    "rating": null,
    "orgURL": "https://www.americastestkitchen.com/recipes/11710-pinto-bean-beet-and-bulgur-burgers",
    "dateAdded": "2023-03-29",
    "dateUpdated": "2023-03-29T09:06:15.124817",
    "createdAt": "2023-03-29T09:06:15.023380",
    "updateAt": "2023-03-29T09:06:15.278474",
    "lastMade": null,
    "recipeIngredient": [
        {
            "title": null,
            "note": "Salt and pepper",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "705d9047-a90b-46da-b723-a68791f8ba74"
        },
        {
            "title": null,
            "note": "⅔ cup medium-grind bulgur, rinsed",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "1e6180ea-e809-4055-b772-e6aec41d340a"
        },
        {
            "title": null,
            "note": "1 large beet (9 ounces), peeled and shredded",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "a946478b-a8b1-4727-bb61-b299e0c066c1"
        },
        {
            "title": null,
            "note": "¾ cup walnuts",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "cd1e8648-c67e-4864-85ca-4f143232c46a"
        },
        {
            "title": null,
            "note": "½ cup fresh basil leaves",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "57cb1ee8-3c05-4b88-9dd4-d94d4d45196b"
        },
        {
            "title": null,
            "note": "2 garlic cloves, minced",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "1804d0c2-0014-4e4c-a903-cb789a68cc60"
        },
        {
            "title": null,
            "note": "1 (15-ounce) can pinto beans, rinsed",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "96118684-4a13-475b-aaac-cf906434b577"
        },
        {
            "title": null,
            "note": "1 (4-ounce) jar carrot baby food",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "7a0c4b98-f1f6-43a2-a74a-e32599d292e5"
        },
        {
            "title": null,
            "note": "1 tablespoon whole-grain mustard",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "8f3d9a56-33b6-4ca2-b874-2de0dd4e7dc0"
        },
        {
            "title": null,
            "note": "1 ½ cups panko bread crumbs",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "14a5da66-6892-4b0f-b55a-e40e38443146"
        },
        {
            "title": null,
            "note": "¼ cup vegetable oil",
            "unit": null,
            "food": null,
            "disableAmount": true,
            "quantity": 1,
            "originalText": null,
            "referenceId": "fd3e4e00-005d-447f-b55e-7c569c06ec65"
        }
    ],
    "recipeInstructions": [
        {
            "id": "0e9ac89d-9fdf-4446-aad4-04b541774d05",
            "title": null,
            "text": "Bring 1½ cups water and ½ teaspoon salt to boil in small saucepan. Off heat, stir in bulgur, cover, and let stand until tender, 15 to 20 minutes. Drain bulgur, spread onto rimmed baking sheet, and let cool slightly. ",
            "ingredientReferences": []
        },
        {
            "id": "f590886d-1047-4fd1-9f49-9bf465ebe5b3",
            "title": null,
            "text": "Meanwhile, pulse beet, walnuts, basil, and garlic together in food processor until finely chopped, about 12 pulses, scraping down sides of bowl as needed. Add beans, carrot baby food, 2 tablespoons water, mustard, 1½ teaspoons salt, and ½ teaspoon pepper and pulse until well combined, about 8 pulses. ",
            "ingredientReferences": []
        },
        {
            "id": "89876f2f-b260-4fc8-8d44-14795966801a",
            "title": null,
            "text": "Transfer mixture to large bowl and stir in panko and cooled bulgur. Divide mixture into 8 equal portions and pack into 4-inch-wide patties. ",
            "ingredientReferences": []
        },
        {
            "id": "015e5ca2-c593-4231-900b-cd7153ca23ac",
            "title": null,
            "text": "Heat 2 tablespoons oil in 12-inch nonstick skillet over medium-high heat until shimmering. Gently lay 4 patties in skillet and cook until crispy and well browned on both sides, 4 to 5 minutes per side, turning gently halfway through cooking and reducing heat if burgers begin to scorch. Transfer burgers to plate and tent with aluminum foil. ",
            "ingredientReferences": []
        },
        {
            "id": "925f49dc-1a16-4129-b158-e887600d87ea",
            "title": null,
            "text": "Wipe skillet clean with paper towels and repeat with remaining 2 tablespoons oil and remaining patties. Serve. ",
            "ingredientReferences": []
        }
    ],
    "nutrition": {
        "calories": null,
        "fatContent": null,
        "proteinContent": null,
        "carbohydrateContent": null,
        "fiberContent": null,
        "sodiumContent": null,
        "sugarContent": null
    },
    "settings": {
        "public": true,
        "showNutrition": false,
        "showAssets": true,
        "landscapeView": true,
        "disableComments": false,
        "disableAmount": true,
        "locked": false
    },
    "assets": [],
    "notes": [],
    "extras": {},
    "isOcrRecipe": false,
    "comments": []
}

Deployment

Docker (Linux)

Deployment Details

No response

@jecorn
Contributor Author

jecorn commented Apr 4, 2023

Almost forgot: this is using an SQLite docker compose stack (separate front and backend, not Omni)

@jecorn
Contributor Author

jecorn commented Apr 4, 2023

If I’m wrong about how fuzzy search is supposed to work in current mealie nightly, and this “bug” is actually a Feature Request, then this might be something that I could tackle. I just wouldn’t quite know how to hook the backend method I write up to the frontend.

If it’s a FR, I would envision using a TheFuzz (aka FuzzyWuzzy) or RapidFuzz Levenshtein-distance token_set_ratio against an FTS5 virtual table of the recipes (maybe pre-processed and updated whenever an entry changes).
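To show what an order-insensitive token-set comparison would buy here, below is a simplified sketch using only the Python standard library. It is an approximation of the idea behind token_set_ratio, not the TheFuzz or RapidFuzz implementation: it uses `difflib` instead of Levenshtein distance, so the scores will differ from those libraries, and `token_set_score` is a hypothetical helper name. The FTS5 table is not covered.

```python
import re
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> float:
    """Similarity of two strings on a 0-100 scale (difflib, not Levenshtein)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def token_set_score(query: str, text: str) -> float:
    """Order-insensitive similarity in the spirit of token_set_ratio (hypothetical helper)."""
    q_tokens = set(re.findall(r"\w+", query.lower()))
    t_tokens = set(re.findall(r"\w+", text.lower()))
    # Sorted token strings: shared tokens, then each side's leftovers.
    common = " ".join(sorted(q_tokens & t_tokens))
    only_q = " ".join(sorted(q_tokens - t_tokens))
    only_t = " ".join(sorted(t_tokens - q_tokens))
    combined_q = f"{common} {only_q}".strip()
    combined_t = f"{common} {only_t}".strip()
    # Take the best of the three pairwise comparisons.
    return max(
        _ratio(common, combined_q),
        _ratio(common, combined_t),
        _ratio(combined_q, combined_t),
    )


title = "Pinto Bean, Beet, and Bulgur Burgers"
for q in ["burger bulgur", "pinto burger", "chicken soup"]:
    print(q, "->", round(token_set_score(q, title), 1))
```

Because the tokens are sorted before comparison, reversed queries such as "burger bulgur" score the same as "bulgur burger"; a threshold on the score would still have to be chosen empirically.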

@fleshgolem
Contributor

fleshgolem commented Apr 4, 2023

I implemented the current search, so let me shed some light on this. Search right now is not fuzzy and is basically a "case-insensitive regex", as you describe. The main reason is that the old way of doing this (retrieving all recipes and filtering client-side) was really bad and would have been WAY worse for a setup like yours, so the main goal here was to get something working that does the job at all.
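As a rough illustration of the behaviour described above, here is a minimal sketch of a case-insensitive substring match. It is not Mealie's code: `normalize` and `substring_search` are hypothetical helpers, and the punctuation stripping is only an assumption based on what the issue reports.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (assumed normalization)."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def substring_search(query: str, fields: list[str]) -> bool:
    """Return True if the whole normalized query occurs contiguously in any field."""
    needle = normalize(query)
    return any(needle in normalize(field) for field in fields)


# Title and the start of the description from the recipe in this issue.
fields = [
    "Pinto Bean, Beet, and Bulgur Burgers",
    "For these modern bean burgers, we combined pinto beans with sweet, earthy shredded beets",
]

for q in ["bulgur burger", "pinto bean", "bean burger", "burger bulgur", "pinto burger"]:
    print(q, "->", substring_search(q, fields))
```

Under these assumptions the first three queries match and the last two do not. "bean burger" only matches because the phrase "bean burgers" happens to appear in the description, which would explain why a few skipped-word searches seem to work.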

This could definitely be improved, but yeah, that would be a feature request. If you want to implement it, you have to keep in mind that the solution has to support both sqlite and postgres. This was a big showstopper because it means you cannot simply implement it once using FTS5.

It would maybe make sense to split this into two different features, since one of them is easier to implement than the other:

  • Split all search terms on whitespace and combine those to a single AND query, just so that it does not depend on ordering like you describe
  • Implement actual fuzzy search to account for typos etc.
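The first bullet could be sketched roughly as follows. This is an illustration only: it uses the standard-library `sqlite3` module and a made-up `recipes` table rather than Mealie's actual models, and `build_and_query` and `search_recipes` are hypothetical helpers. SQLite's `LIKE` is case-insensitive for ASCII by default; on Postgres, `ILIKE` (or lowercasing both sides) would be needed to get the same effect.

```python
import sqlite3


def build_and_query(search: str) -> tuple[str, list[str]]:
    """Build a query requiring every whitespace-separated term to match somewhere."""
    terms = search.split()
    if not terms:
        return "SELECT name FROM recipes", []
    # One parenthesised group per term, all groups combined with AND.
    clause = " AND ".join("(name LIKE ? OR description LIKE ?)" for _ in terms)
    # Note: '%' and '_' inside a term are not escaped in this sketch.
    params = [p for term in terms for p in (f"%{term}%", f"%{term}%")]
    return f"SELECT name FROM recipes WHERE {clause}", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipes (name TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO recipes VALUES (?, ?)",
    [
        ("Pinto Bean, Beet, and Bulgur Burgers", "Modern bean burgers with beets."),
        ("Black Bean Soup", "A simple weeknight soup."),
    ],
)


def search_recipes(search: str) -> list[str]:
    """Run the AND-combined search and return the matching recipe names."""
    sql, params = build_and_query(search)
    return [row[0] for row in conn.execute(sql, params)]


print(search_recipes("burger bulgur"))
print(search_recipes("pinto beet"))
```

Both queries find the burger recipe regardless of word order or intervening words, while the function keeps the same single-string input as before.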

> I just wouldn’t quite know how to hook the backend method I write up to the frontend.

I don't quite understand why that matters. Ideally the new solution should just be a drop-in replacement for the old one, with neither the input parameters nor the output format changing.

@jecorn
Contributor Author

jecorn commented Apr 4, 2023

OK, got it, and thanks for your answer. And thanks for implementing server-side search! Client-side search would indeed be quite painful in my use case.

I assumed fuzzy search was implemented b/c I saw it in a v1.0.0 release note and it’s on the nightly intro page. Good point about postgres cross-compatibility. I’m not so familiar with postgres and don’t have a working install of that db backend, but will do some digging. Levenshtein and trigram extensions seem to be implemented in both sqlite and postgres, so I would start looking there.
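For reference, the trigram idea can be sketched in a few lines of plain Python. This is only an approximation of what a database extension would do: the padding loosely follows the scheme documented for Postgres's pg_trgm (two spaces before each word, one after), the score is a simple Jaccard ratio, and `trigrams` and `trigram_similarity` are hypothetical helper names.

```python
import re


def trigrams(text: str) -> set[str]:
    """Collect padded character trigrams for each word in the text."""
    grams: set[str] = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams, between 0 and 1."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(trigram_similarity("bulgur", "bulgar"))                # typo still shares trigrams
print(trigram_similarity("bulgur burger", "burger bulgur"))  # word order does not matter
print(trigram_similarity("bulgur", "pinto"))                 # unrelated words share none
```

Because trigrams are collected per word, this handles both reordered terms and small typos, which is why it looks like a reasonable starting point for the fuzzy half of the problem.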

Yes, you’re completely right that the solution should be seamless with the old. I’m not very familiar with the mealie internal codebase/API. If you point me to the location of the relevant search function, then I’ll take a look.

@fleshgolem
Contributor

fleshgolem commented Apr 4, 2023

Sure, it's this function here and that is pretty much all there is to it right now
https://github.com/hay-kot/mealie/blob/mealie-next/mealie/repos/repository_recipes.py#L153

Feel free to contact me on the discord server, if you have any more questions

...Also that intro page should probably be changed if it still claims to have this. This was part of 1.0.0 at some point, but only done client-side

@jecorn
Contributor Author

jecorn commented Apr 5, 2023

I just made Feature Request #2335 with two proposed solutions. One of them is yours, which makes a lot of sense, and the other is a bit crazy but surprisingly feasible (IMHO). I'm not well versed in Discord, so I always get lost in the non-threaded flow of the channels. Sorry to always be posting in the GitHub threads.
