First version of the LTR guide.
afoucret committed Mar 5, 2024
1 parent 4f2c8ca commit b4627c8
Showing 7 changed files with 337 additions and 0 deletions.
(3 binary image files not displayed: learning-to-rank-overview.png, learning-to-rank-judgment-list.png, learning-to-rank-feature-extraction.png.)
150 changes: 150 additions & 0 deletions docs/reference/search/search-your-data/learning-to-rank-model-training.asciidoc
@@ -0,0 +1,150 @@
[[learning-to-rank-model-training]]
=== Deploy and manage Learning To Rank models
++++
<titleabbrev>Deploy and manage LTR models</titleabbrev>
++++

preview::["The Learning To Rank feature is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but this feature is not subject to the support SLA of official GA features."]

NOTE: This feature is available for Elastic Stack versions 8.12.0 and newer and requires a https://www.elastic.co/pricing[Platinum subscription] or higher.

[discrete]
[[learning-to-rank-model-training-workflow]]
==== Train and deploy a model using Eland

https://xgboost.readthedocs.io/en/stable/[XGBoost^] model training typically leverages a standard Python data science toolkit, such as Pandas and scikit-learn.

We have developed an https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/08-learning-to-rank.ipynb[example notebook^] detailing an end-to-end model training and deployment workflow.

We highly recommend integrating https://eland.readthedocs.io/[eland^] into your workflow, since it provides the key building blocks for integrating Learning To Rank with {es}:

* Configure feature extraction

* Extract features for training

* Deploy the model into {es}

[discrete]
[[learning-to-rank-model-training-feature-definition]]
===== Configure feature extraction in Eland

Feature extractors are defined using templated queries. https://eland.readthedocs.io/[Eland^] provides the `eland.ml.ltr.QueryFeatureExtractor` to define these feature extractors directly in Python:

[source,python]
----
from eland.ml.ltr import QueryFeatureExtractor

feature_extractors = [
    # We want to use the score of the match query for the title field as a feature:
    QueryFeatureExtractor(
        feature_name="title_bm25",
        query={"match": {"title": "{{query}}"}},
    ),
    # We can use a script_score query to read the value of the popularity field directly as a feature:
    QueryFeatureExtractor(
        feature_name="popularity",
        query={
            "script_score": {
                "query": {"exists": {"field": "popularity"}},
                "script": {"source": "return doc['popularity'].value;"},
            }
        },
    ),
    # We can execute a script on the value of the query and use the return value as a feature:
    QueryFeatureExtractor(
        feature_name="query_length",
        query={
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "return params['query'].splitOnToken(' ').length;",
                    "params": {
                        "query": "{{query}}",
                    },
                },
            }
        },
    ),
]
----
// NOTCONSOLE

Once the feature extractors have been defined, they are wrapped in an `eland.ml.ltr.LTRModelConfig` object, which will be used in subsequent steps of the training process:

[source,python]
----
from eland.ml.ltr import LTRModelConfig

ltr_config = LTRModelConfig(feature_extractors)
----
// NOTCONSOLE

[discrete]
[[learning-to-rank-model-training-feature-extraction]]
===== Extracting features for training

One of the most important steps of the training process is building the training dataset by extracting features and adding them to your judgment list. Eland provides another helper class, `eland.ml.ltr.FeatureLogger`, to aid in this process:

[source,python]
----
from eland.ml.ltr import FeatureLogger

# Create a feature logger that will be used to query {es} to retrieve the features:
feature_logger = FeatureLogger(es_client, MOVIE_INDEX, ltr_config)
----
// NOTCONSOLE

The `FeatureLogger` provides an `extract_features` method that lets you extract features for a list of specific documents from your judgment list. At the same time, you can pass the query parameters used by the feature extractors defined earlier:

[source,python]
----
feature_logger.extract_features(
    query_params={"query": "foo"},
    doc_ids=["doc-1", "doc-2"],
)
----
// NOTCONSOLE

Our https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/08-learning-to-rank.ipynb[example notebook^] provides a complete example of how to use the `FeatureLogger` to add features to the judgment list in order to build the training dataset.
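
For illustration, here is a minimal sketch of how a training dataset could be assembled with the `FeatureLogger`. The judgment list layout (`query_id`, `query`, `doc_id`, `grade` columns) is a hypothetical choice for this example, and it assumes, as in the example notebook, that `extract_features` returns a mapping from document ID to the list of extracted feature values:

[source,python]
----
import pandas as pd

# Hypothetical judgment list: one row per (query, document) pair with a grade.
judgments = pd.DataFrame(
    [
        {"query_id": "q1", "query": "star wars", "doc_id": "doc-1", "grade": 4},
        {"query_id": "q1", "query": "star wars", "doc_id": "doc-2", "grade": 0},
        {"query_id": "q2", "query": "jurassic park", "doc_id": "doc-3", "grade": 3},
    ]
)

training_rows = []
# One extract_features call per query keeps the number of search requests low.
for (query_id, query), group in judgments.groupby(["query_id", "query"]):
    features = feature_logger.extract_features(
        query_params={"query": query},
        doc_ids=group["doc_id"].tolist(),
    )
    for _, row in group.iterrows():
        training_rows.append(
            {
                "query_id": query_id,
                "grade": row["grade"],
                # Feature values, in the order declared in ltr_config:
                **dict(zip(ltr_config.feature_names, features[row["doc_id"]])),
            }
        )

training_data = pd.DataFrame(training_rows)
----
// NOTCONSOLE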

[discrete]
[[learning-to-rank-model-training-feature-extraction-notes]]
====== Notes on feature extraction

* We strongly advise against implementing feature extraction on your own. It's crucial to maintain consistency in feature extraction between the training environment and inference in {es}. By using the eland tooling, which is developed and tested in tandem with {es}, you can ensure that training and inference extract features consistently.

* Feature extraction is performed by executing queries on the {es} server, which can put significant stress on your cluster, especially when your judgment list contains many examples or you have many features. Our feature logger implementation is designed to minimize the number of search requests sent to the server and reduce the load. However, it might be best to build the training dataset using an {es} cluster that is isolated from any user-facing, production traffic.

[discrete]
[[learning-to-rank-model-deployment]]
===== Deploy your model into {es}

Once your model is trained, you can deploy it to your {es} cluster. For this purpose, eland provides the `MLModel.import_ltr_model` method:

[source,python]
----
from eland.ml import MLModel

LEARNING_TO_RANK_MODEL_ID = "ltr-model-xgboost"

MLModel.import_ltr_model(
    es_client=es_client,
    model=ranker,
    model_id=LEARNING_TO_RANK_MODEL_ID,
    ltr_model_config=ltr_config,
    es_if_exists="replace",
)
----
// NOTCONSOLE

This method serializes the trained model and the Learning To Rank configuration (including feature extraction) in a format that {es} can understand, and then sends it to {es} using the https://www.elastic.co/guide/en/elasticsearch/reference/current/put-trained-models.html[Create Trained Models API].

The following types of models are supported for Learning To Rank: https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html[`DecisionTreeRegressor`^], https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html[`RandomForestRegressor`^], https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMRegressor.html[`LGBMRegressor`^], https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBRanker[`XGBRanker`^], https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBRegressor[`XGBRegressor`^].

More model types will be supported in the future.
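
For reference, here is a minimal sketch of how the `ranker` object used in the deployment snippet above might be trained. It assumes the `training_data` frame built during feature extraction (with its hypothetical `query_id` and `grade` columns) and reuses the feature names declared in `ltr_config`; the hyperparameters are illustrative only:

[source,python]
----
import xgboost

# XGBRanker expects rows grouped by query, with per-query group sizes.
sorted_data = training_data.sort_values("query_id")
group_sizes = sorted_data.groupby("query_id").size().to_list()

X = sorted_data[ltr_config.feature_names]
y = sorted_data["grade"]

# LambdaMART-style objective optimizing nDCG over each query's result list:
ranker = xgboost.XGBRanker(objective="rank:ndcg", n_estimators=100)
ranker.fit(X, y, group=group_sizes)
----
// NOTCONSOLE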

[discrete]
[[learning-to-rank-model-management]]
==== Learning To Rank model management

Once your model is deployed in {es}, you can manage it using the https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-df-trained-models-apis.html[trained model APIs].
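
For example, a deployed model can be inspected or deleted through the Python client (a minimal sketch, reusing the model ID chosen at import time):

[source,python]
----
# Inspect the deployed LTR model:
es_client.ml.get_trained_models(model_id="ltr-model-xgboost")

# Delete the model once it is no longer needed:
es_client.ml.delete_trained_model(model_id="ltr-model-xgboost")
----
// NOTCONSOLE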
77 changes: 77 additions & 0 deletions docs/reference/search/search-your-data/learning-to-rank-search-usage.asciidoc
@@ -0,0 +1,77 @@
[[learning-to-rank-search-usage]]
=== Search using Learning To Rank
++++
<titleabbrev>Search using Learning To Rank</titleabbrev>
++++

preview::["The Learning To Rank feature is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but this feature is not subject to the support SLA of official GA features."]

NOTE: This feature is available for Elastic Stack versions 8.12.0 and newer and requires a https://www.elastic.co/pricing[Platinum subscription] or higher.

[discrete]
[[learning-to-rank-rescorer]]
==== Learning To Rank as a rescorer

Once your LTR model is trained and deployed in {es}, it can be used as a <<rescore, rescorer>> in the {es} <<search-your-data, search API>>:

[source,console]
----
GET my-index/_search
{
  "query": { <1>
    "multi_match": {
      "fields": ["title", "content"],
      "query": "the quick brown fox"
    }
  },
  "rescore": {
    "learning_to_rank": {
      "model_id": "ltr-model", <2>
      "params": { <3>
        "query_text": "the quick brown fox"
      }
    },
    "window_size": 100 <4>
  }
}
----
// TEST[skip:TBD]
<1> First pass query providing documents to be rescored.
<2> The unique identifier of the trained model uploaded to {es}.
<3> Named parameters to be passed to the query templates used for feature extraction.
<4> The number of documents that should be examined by the rescorer on each shard.

[discrete]
[[learning-to-rank-rescorer-limitations]]
===== Known limitations

[discrete]
[[learning-to-rank-rescorer-limitations-window-size]]
====== Rescore window size

Scores returned by LTR models are usually not comparable with the scores issued by the first-pass query and can be lower than the non-rescored scores. This can cause a non-rescored document to be ranked higher than a rescored one. To prevent this, the `window_size` parameter is mandatory for LTR rescorers and should be greater than or equal to `from + size`.

[discrete]
[[learning-to-rank-rescorer-limitations-pagination]]
====== Pagination

When exposing pagination to users, `window_size` should remain constant as users step through pages by passing different `from` values. Changing `window_size` can alter the top hits, causing results to shift confusingly as the user pages through results.
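
As a minimal sketch (using the Python {es} client and the hypothetical index and model ID from the example above), pagination with a constant `window_size` could look like this:

[source,python]
----
from elasticsearch import Elasticsearch

es_client = Elasticsearch("http://localhost:9200")  # assumed cluster address

PAGE_SIZE = 10
# Constant across pages, and >= from + size for every page we intend to serve:
WINDOW_SIZE = 100

def fetch_page(query_text, page):
    # Only `from` changes between pages; window_size stays fixed so the
    # rescored top hits remain stable while the user paginates.
    return es_client.search(
        index="my-index",
        query={
            "multi_match": {"fields": ["title", "content"], "query": query_text}
        },
        rescore={
            "learning_to_rank": {
                "model_id": "ltr-model",
                "params": {"query_text": query_text},
            },
            "window_size": WINDOW_SIZE,
        },
        from_=page * PAGE_SIZE,
        size=PAGE_SIZE,
    )
----
// NOTCONSOLE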

[discrete]
[[learning-to-rank-rescorer-limitations-negative-scores]]
====== Negative scores

Depending on how your model is trained, it’s possible that the model will return negative scores for documents. While negative scores are not allowed from first-stage retrieval and ranking, it is possible to use them in the LTR rescorer.

[discrete]
[[learning-to-rank-rescorer-limitations-field-collapsing]]
====== Compatibility with field collapsing

LTR rescorers are not compatible with the <<collapse-search-results, collapse feature>>.

[discrete]
[[learning-to-rank-rescorer-limitations-term-statistics]]
====== Term statistics as features

We do not currently support term statistics as features; however, future releases will introduce this capability.

109 changes: 109 additions & 0 deletions docs/reference/search/search-your-data/learning-to-rank.asciidoc
@@ -0,0 +1,109 @@
[[learning-to-rank]]
== Learning To Rank

preview::["The Learning To Rank feature is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but this feature is not subject to the support SLA of official GA features."]

NOTE: This feature is available for Elastic Stack versions 8.12.0 and newer and requires a https://www.elastic.co/pricing[Platinum subscription] or higher.

Learning To Rank (LTR) is a technique used to build a complex, machine learning (ML)-based ranking function for your search engine.

The LTR function takes a list of documents and a search context and outputs ranked documents:

[[learning-to-rank-overview-diagram]]
.Learning To Rank overview
image::images/search/learning-to-rank-overview.png[Learning To Rank overview,align="center"]


[discrete]
[[learning-to-rank-search-context]]
=== Search context

In addition to the list of documents to sort, the LTR function also requires a search context. Typically, this search context includes at least the search terms provided by the user (`text_query` in the example above).

The search context can also be much more complex, providing additional information about the user who submitted the search, such as demographic data (geolocation, age, ...).

[discrete]
[[learning-to-rank-judgement-list]]
=== Judgment list
In the context of LTR, the judgment list is the main input used to train the ML model. It consists of a dataset that contains pairs of queries and documents, along with their corresponding relevance labels.
The relevance label is typically either binary (relevant/irrelevant) or more granular, as in the example below, where it is a grade between 0 (not relevant at all) and 4 (highly relevant).

[[learning-to-rank-judgment-list-example]]
.Judgment list example
image::images/search/learning-to-rank-judgment-list.png[Judgment list example,align="center"]
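
As a minimal illustration, a graded judgment list could be represented as a small table; the queries, documents, and grades below are made up:

[source,python]
----
import pandas as pd

# A toy judgment list: grades range from 0 (not relevant) to 4 (highly relevant).
judgment_list = pd.DataFrame(
    [
        {"query": "star wars", "doc_id": "doc-1", "grade": 4},
        {"query": "star wars", "doc_id": "doc-2", "grade": 1},
        {"query": "jurassic park", "doc_id": "doc-3", "grade": 3},
        {"query": "jurassic park", "doc_id": "doc-1", "grade": 0},
    ]
)
----
// NOTCONSOLE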

[discrete]
[[judgment-list-notes]]
==== Notes on judgment lists

While a judgment list can be created manually by humans, there are techniques available to utilize user engagement data, such as clicks or conversions, to construct such a judgment list automatically.

The quantity and the quality of your judgment list will greatly influence the overall performance of the LTR model. The following aspects should be considered very carefully when building your judgment list:

* Most search engines are used with different query types (e.g., for a movie search engine, users search by title but also by actor or director). It is essential to maintain a balanced number of examples for each query type in your judgment list, to prevent overfitting and allow the model to generalize effectively across all query types.

* Users often provide more positive examples than negative ones. By balancing the number of positive and negative examples, you help the model learn to distinguish between relevant and irrelevant content more accurately.

[discrete]
[[learning-to-rank-feature-extraction]]
=== Feature extraction

The ML models used for LTR cannot understand a query and document pair directly; instead, they require the pair's properties to be transformed into an array of numerical features.

These features fall into one of three main categories:

* Document features:
These features are derived directly from the document properties.
Examples: product price in an eCommerce store

* Query features:
These features are computed directly from the query submitted by the user.
Examples: number of words in the query

* Query-document features:
Features used to provide information about the document in the context of the query.
Examples: BM25 score for the title field, …

To prepare the dataset for training, the features are added to the judgment list:

[[learning-to-rank-judgement-feature-extraction]]
.Judgment list with features
image::images/search/learning-to-rank-feature-extraction.png[Judgment list with features,align="center"]

To do this, we use templated queries to extract features both when building the training dataset and during inference at query time:

[source,json]
----
[
  {
    "query_extractor": {
      "feature_name": "title_bm25",
      "query": { "match": { "title": "{{query}}" } }
    }
  }
]
----
// NOTCONSOLE

[discrete]
[[learning-to-rank-models]]
=== Models

The heart of LTR is of course an ML model. A model is trained using the training data described above in combination with an objective. In the case of LTR, the objective is to rank result documents in an optimal way with respect to a judgment list, given some ranking metric such as https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Discounted_cumulative_gain[nDCG^] or https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Mean_average_precision[MAP^]. The model has access to only the features in the training data, as well as the associated relevance labels which are used in the ranking metric.

Many approaches and model types exist for LTR, and the field continues to evolve; however, LTR inference in {es} relies specifically on gradient boosted decision tree (GBDT) models. {es} also only supports model inference, not the training process itself. As such, training an LTR model needs to happen outside of {es}, using a GBDT model. Among the most popular LTR models used today, LambdaMART provides strong ranking performance with low inference latencies. It relies on GBDT models and is thus a perfect fit for LTR in {es}.

https://xgboost.readthedocs.io/en/stable/[XGBoost^] is a well-known library that provides an https://xgboost.readthedocs.io/en/stable/tutorials/learning_to_rank.html[implementation^] of LambdaMART, making it a popular choice for LTR. We offer helpers in https://eland.readthedocs.io/[eland^] to facilitate the integration of a trained https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBRanker[XGBRanker^] model as your LTR model in {es}.

[discrete]
[[learning-to-rank-in-the-elastic-stack]]
=== Learning To Rank in the Elastic Stack

In the next pages of this guide you will learn to:

* train and deploy an LTR model using eland

* search using your LTR model

include::learning-to-rank-model-training.asciidoc[]
include::learning-to-rank-search-usage.asciidoc[]
1 change: 1 addition & 0 deletions docs/reference/search/search-your-data/search-your-data.asciidoc
@@ -46,6 +46,7 @@ include::search-api.asciidoc[]
include::search-application-overview.asciidoc[]
include::knn-search.asciidoc[]
include::semantic-search.asciidoc[]
include::learning-to-rank.asciidoc[]
include::search-across-clusters.asciidoc[]
include::search-with-synonyms.asciidoc[]
include::behavioral-analytics/behavioral-analytics-overview.asciidoc[]
