Skip to content

v1.10.0 🦩

Compare
Choose a tag to compare
@ManyTheFish ManyTheFish released this 26 Aug 05:56
36d8684

Meilisearch v1.10 introduces federated search. This innovative feature allows you to receive a single list of results for multi-search requests. v1.10 also includes a setting to manually define which language or languages are present in your documents, and two new new experimental features: the CONTAINS filter operator and the ability to update a subset of your dataset with a function.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 to 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Federated search

Use the new federation setting of the /multi-search route to return a single search result object:

curl \
  -X POST 'http://localhost:7700/multi-search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "federation": {
      "offset": 5,
      "limit": 10
    }
    "queries": [
      {
        "q": "Batman",
        "indexUid": "movies"
      },
      {
        "q": "Batman",
        "indexUid": "comics"
      }
    ]
  }'

Response:

{
  "hits": [
    {
      "id": 42,
      "title": "Batman returns",
      "overview": "..",
      "_federation": {
        "indexUid": "movies",
        "queriesPosition": 0
      }
    },
    {
      "comicsId": "batman-killing-joke",
      "description": "..",
      "title": "Batman: the killing joke",
      "_federation": {
        "indexUid": "comics",
        "queriesPosition": 1
      }
    },
    
 ],
  processingTimeMs: 0,
  limit: 20,
  offset: 0,
  estimatedTotalHits: 2,
  semanticHitCount: 0,
}

When performing a federated search, Meilisearch merges the results coming from different sources in descending ranking score order.

If federation is empty ({}), Meilisearch sets offset and limit to 0 and 20 respectively.

If federation is null or missing, multi-search returns one list of search result objects for each index.

Federated results relevancy

When performing federated searches, use federationOptions in the request's queries array to configure the relevancy and the weight of each index:

curl \
 -X POST 'http://localhost:7700/multi-search' \
 -H 'Content-Type: application/json' \
 --data-binary '{
  "federation": {},
  "queries": [
    {
      "q": "apple red",
      "indexUid": "fruits",
      "filter": "BOOSTED = true",
      "_showRankingScore": true,
      "federationOptions": {
        "weight": 3.0
      }
    },
    {
      "q": "apple red",
      "indexUid": "fruits",
      "_showRankingScore": true,
    }
  ]
}'

federationOptions must be an object. It supports a single field, weight, which must be a positive floating-point number:

  • if weight < 1.0, results from this index are less likely to appear in the results
  • if weight > 1.0, results from this index are more likely to appear in the results
  • if not specified, weight defaults to 1.0

📖 Consult the usage page for more information about the merge algorithm.

Done by @dureuill in #4769.

Experimental: CONTAINS filter operator

Enable the containsFilter experimental feature to use the CONTAINS filter operator:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "containsFilter": true
  }'

CONTAINS filters results containing partial matches to the specified string, similar to a SQL LIKE:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "super hero",
    "filter": "synopsis CONTAINS spider"
  }'

🗣️ This is an experimental feature, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @irevoire in #4804.

Language settings

Use the new localizedAttributes index setting and the locales search parameter to explicitly set the languages used in document fields and the search query itself. This is particularly useful for <=v1.9 users who have to occasionally resort to alternative Meilisearch images due to language auto-detect issues in Swedish and Japanese datasets.

Done by @ManyTheFish in #4819.

Set language during indexing with localizedAttributes

Use the newly introduced localizedAttributes setting to explicitly declare which languages correspond to which document fields:

curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "localizedAttributes": [
      {"locales": ["jpn"], "attributePatterns": ["*_ja"]},
      {"locales": ["eng"], "attributePatterns": ["*_en"]},
      {"locales": ["cmn"], "attributePatterns": ["*_zh"]},
      {"locales": ["fra", "ita"], "attributePatterns": ["latin.*"]},
      {"locales": [], "attributePatterns": ["*"]}
    ]
  }'

locales is a list of ISO-639-3 language codes to assign to a pattern. The currently supported languages are: epo, eng, rus, cmn, spa, por, ita, ben, fra, deu, ukr, kat, ara, hin, jpn, heb, yid, pol, amh, jav, kor, nob, dan, swe, fin, tur, nld, hun, ces, ell, bul, bel, mar, kan, ron, slv, hrv, srp, mkd, lit, lav, est, tam, vie, urd, tha, guj, uzb, pan, aze, ind, tel, pes, mal, ori, mya, nep, sin, khm, tuk, aka, zul, sna, afr, lat, slk, cat, tgl, hye.

attributePattern is a pattern that can start or end with a * to match one or several attributes.

If an attribute matches several rules, only the first rule in the list will be applied. If the locales list is empty, then Meilisearch is allowed to auto-detect any language in the matching attributes.

These rules are applied to the searchableAttributes, the filterableAttributes, and the sortableAttributes.

Set language at search time with locales

The /search route accepts a new parameter, locales. Use it to define the language used in the current query:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "進撃の巨人",
    "locales": ["jpn"]
  }'

The locales parameter overrides eventual locales in the index settings.

Experimental: Edit documents with a Rhai function

Use a Rhai function to edit documents in your database directly from Meilisearch:

First, activate the experimental feature:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "editDocumentsByFunction": true
  }'

Then query the /documents/edit route with the editing function:

curl http://localhost:7700/indexes/movies/documents/edit \
  -H 'content-type: application/json' \
  -d '{
   "function": "doc.title = `✨ ${doc.title.to_upper()} ✨`",
   "filter": "id > 3000"
  }'

/documents/edit accepts three parameters in its payload: function, filter, and context.

function must be a string with a Rhai function. filter must be a filter expression.. context must be an object with data you want to make available for the editing function.

📖 More information here.

🗣️ This is an experimental feature and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @Kerollmops in #4626.

Experimental AI-powered search: quality of life improvements

For the purpose of future stabilization of the feature, we are applying changes and quality-of-life improvements.

Done by @dureuill in #4801, #4815, #4818, #4822.

⚠️ Breaking changes: Changing the parameters of the REST API

The old parameters of the REST API are too numerous and confusing.

Removed parameters: query , inputField, inputType, pathToEmbeddings and embeddingObject.
Replaced by:

  • request : A JSON value that represents the request made by Meilisearch to the remote embedder. The text to embed must be replaced by the placeholder value “{{text}}”.
  • response: A JSON value that represents a fragment of the response made by the remote embedder to Meilisearch. The embedding must be replaced by the placeholder value "{{embedding}}".

Before:

// v1.10 version ✅
{
  "source": "rest",
  "url": "https://localhost:10006",
  "request": {
    "model": "minillm",
    "prompt": "{{text}}"
  },
  "response": {
    "embedding": "{{embedding}}"
  }
}
// v1.9 version ❌
{
  "source": "rest",
  "url": "https://localhost:10006",
  "query": {
    "model": "minillm",
  },
  "inputField": ["prompt"],
  "inputType": "text",
  "embeddingObject": ["embedding"]
}

Caution

This is a breaking change to the configuration of REST embedders.
Importing a dump containing a REST embedder configuration will fail in v1.10 with an error: "Error: unknown field query, expected one of source, model, revision, apiKey, dimensions, documentTemplate, url, request, response, distribution at line 1 column 752".

Upgrade procedure:

  1. Remove embedders with source "rest"
  2. Update your Meilisearch Cloud project or self-hosted Meilisearch instance as usual

Add custom headers to REST embedders

When the source of an embedder is set to rest, you may include an optional headers parameter. Use this to configure custom headers you want Meilisearch to include in the requests it sends the embedder.

Embedding requests sent from Meilisearch to a remote REST embedder always contain two headers:

  • Authorization: Bearer <apiKey> (only if apiKey was provided)
  • Content-Type: application/json

When provided, headers should be a JSON object whose keys represent the name of additional headers to send in requests, and the values represent the value of these additional headers.

If headers is missing or null for a rest embedder, only Authorization and Content-Type are sent, as described above.

If headers contains Authorization and Content-Type, the declared values will override the ones that are sent by default.

Using the headers parameter for any other source besides rest results in an invalid_settings_embedder error.

Other quality-of-life improvements

📖 More details here

  • Add url parameter to the OpenAI embedder. url should be an URL to the embedding endpoint (including the v1/embeddingspart) from OpenAI. If url is missing or null for an openAi embedder, the default OpenAI embedding route will be used (https://api.openai.com/v1/embeddings).
  • dimensions is now available as an optional parameter for ollama embedders. Previously it was only available for rest, openAi and userProvided embedders.
  • Previously _vectors.embedder was omitted for documents without at least one embedding for embedder. This was inconsistent and prevented the user from checking the value of regenerate.
  • When a request to a REST embedder fails, the duration of the exponential backoff is now randomized up to twice its base duration
  • Truncate rather than embed by chunk when OpenAI embeddings are bigger than the max number of tokens
  • Improve error message when indexing documents and embeddings are missing for a user-provided embedder
  • Improve error message when a model configuration cannot be loaded and its "architectures" field does not contain "BertModel"

⚠️ Important change regarding the minimal Ubuntu version compatible with Meilisearch

Because the GitHub Actions runner now enforces the usage of a Node version that is not compatible with Ubuntu 18.04 anymore, we had to upgrade the minimal Ubuntu version compatible with Meilisearch. Indeed, we use these GitHub actions to build and provide our binaries.

Now, Meilisearch is only compatible with Ubuntu 20.04 and later and not with Ubuntu 18.4 anymore.

Done by @curquiza in #4783.

Other improvements

Fixes 🐞

  • Fix invalid primary key for big numbers @JWSong in #4725
  • Fix wrong HTTP status and confusing error message on wrong payload by @Karribalu in #4716
  • Fix the missing geo distance when one or both of the lat/lng are string by @irevoire in #4731
  • Fix errors related to OffsetDateTime: use a fixed date format regardless of features by @dureuill in #4850
  • Fix filter that doesn't return valid documents by @dureuill in #4864 & #4858

Misc

❤️ Thanks again to our external contributors: