[DOCS] Adds NLP folders to ML Guide #1903

Merged: 2 commits, Dec 6, 2021
1 change: 0 additions & 1 deletion docs/en/stack/ml/df-analytics/index.asciidoc
Original file line number Diff line number Diff line change
@@ -4,7 +4,6 @@ include::ml-dfa-overview.asciidoc[leveloffset=+1]
include::ml-dfa-outlier-detection.asciidoc[leveloffset=+1]
include::ml-dfa-regression.asciidoc[leveloffset=+1]
include::ml-dfa-classification.asciidoc[leveloffset=+1]
include::ml-dfa-lang-ident.asciidoc[leveloffset=+1]

include::ml-dfa-concepts.asciidoc[leveloffset=+1]
include::ml-how-dfa-works.asciidoc[leveloffset=+2]
20 changes: 3 additions & 17 deletions docs/en/stack/ml/df-analytics/ml-dfanalytics-apis.asciidoc
@@ -18,40 +18,26 @@ The evaluation API endpoint has the following base:
----
// NOTCONSOLE

All the trained models endpoints have the following base:

[source,js]
----
/_ml/trained_models/
----
// NOTCONSOLE

// CREATE
* {ref}/put-dfanalytics.html[Create {dfanalytics-jobs}]
* {ref}/put-trained-models-aliases.html[Create trained model aliases]
* {ref}/put-trained-model-definition-part.html[Create trained model definition part]
* {ref}/put-trained-models.html[Create trained models]
// DELETE
* {ref}/delete-dfanalytics.html[Delete {dfanalytics-jobs}]
* {ref}/delete-trained-models.html[Delete trained models]
// EVALUATE
* {ref}/evaluate-dfanalytics.html[Evaluate {dfanalytics}]
// EXPLAIN
* {ref}/explain-dfanalytics.html[Explain {dfanalytics}]
// GET
* {ref}/get-dfanalytics.html[Get {dfanalytics-jobs} info]
* {ref}/get-dfanalytics-stats.html[Get {dfanalytics-jobs} statistics]
* {ref}/get-trained-models.html[Get trained models]
* {ref}/get-trained-models-stats.html[Get trained models statistics]
// INFER
* {ref}/infer-trained-model-deployment.html[Infer trained model deployment]
// PREVIEW
* {ref}/preview-dfanalytics.html[Preview {dfanalytics}]
// START
* {ref}/start-dfanalytics.html[Start {dfanalytics-jobs}]
// STOP
* {ref}/stop-dfanalytics.html[Stop {dfanalytics-jobs}]
* {ref}/stop-trained-model-deployment.html[Stop trained model deployment]
// UPDATE
* {ref}/update-dfanalytics.html[Update {dfanalytics-jobs}]

For information about the APIs related to trained models, refer to
<<ml-nlp-apis>>.

2 changes: 2 additions & 0 deletions docs/en/stack/ml/index.asciidoc
@@ -18,4 +18,6 @@ include::anomaly-detection/index.asciidoc[]

include::df-analytics/index.asciidoc[]

include::nlp/index.asciidoc[]

include::redirects.asciidoc[]
5 changes: 5 additions & 0 deletions docs/en/stack/ml/nlp/index.asciidoc
@@ -0,0 +1,5 @@
include::ml-nlp.asciidoc[]
include::ml-nlp-overview.asciidoc[leveloffset=+1]
include::ml-nlp-lang-ident.asciidoc[leveloffset=+2]
include::ml-nlp-apis.asciidoc[leveloffset=+1]

29 changes: 29 additions & 0 deletions docs/en/stack/ml/nlp/ml-nlp-apis.asciidoc
@@ -0,0 +1,29 @@
[[ml-nlp-apis]]
= API quick reference

All the trained models endpoints have the following base:

[source,js]
----
/_ml/trained_models/
----
// NOTCONSOLE

// CREATE
* {ref}/put-trained-models-aliases.html[Create trained model aliases]
* {ref}/put-trained-model-definition-part.html[Create trained model definition part]
* {ref}/put-trained-models.html[Create trained models]
// DELETE
* {ref}/delete-trained-models.html[Delete trained models]
// GET
* {ref}/get-trained-models.html[Get trained models]
* {ref}/get-trained-models-stats.html[Get trained models statistics]
// INFER
* {ref}/infer-trained-model-deployment.html[Infer trained model deployment]
// START
* {ref}/start-trained-model-deployment.html[Start trained model deployment]
// STOP
* {ref}/stop-trained-model-deployment.html[Stop trained model deployment]
// UPDATE
* {ref}/put-trained-models-aliases.html[Update trained model aliases]
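All of the endpoints above share the `/_ml/trained_models/` base. As a rough illustration (a sketch for this guide, not part of the PR or of any Elastic client library; the helper name and model ID are hypothetical), a client can compose the REST paths like this:

```python
# Hypothetical sketch: composing REST paths for the trained-models APIs
# under the /_ml/trained_models/ base shown above.

TRAINED_MODELS_BASE = "/_ml/trained_models"

def model_path(model_id: str, suffix: str = "") -> str:
    """Build a trained-models endpoint path for a given model ID."""
    path = f"{TRAINED_MODELS_BASE}/{model_id}"
    return f"{path}/{suffix}" if suffix else path

# A few of the endpoints listed above, expressed as (method, path) pairs:
create_model = ("PUT", model_path("my_model"))            # Create trained models
get_stats    = ("GET", model_path("my_model", "_stats"))  # Get trained models statistics
delete_model = ("DELETE", model_path("my_model"))         # Delete trained models

print(create_model[1])  # /_ml/trained_models/my_model
```

The point is only that every operation in the list addresses a model resource under the same base path; refer to the linked reference pages for the authoritative request and body formats.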

@@ -1,5 +1,4 @@
[role="xpack"]
[[ml-dfa-lang-ident]]
[[ml-nlp-lang-ident]]
= {lang-ident-cap}

:keywords: {ml-init}, {stack}, {dfanalytics}, {lang-ident}
44 changes: 44 additions & 0 deletions docs/en/stack/ml/nlp/ml-nlp-overview.asciidoc
@@ -0,0 +1,44 @@
[[ml-nlp-overview]]
= Overview

{nlp-cap} (NLP) refers to the way in which we can use software to understand
natural language in spoken word or written text.

Classically, NLP was performed using linguistic rules, dictionaries, regular
expressions, and {ml} for specific tasks such as automatic categorization or
summarization of text. In recent years, however, deep learning techniques have
taken over much of the NLP landscape. Deep learning capitalizes on the
availability of large-scale data sets, cheap computation, and techniques for
learning at scale with less human involvement. Pre-trained language models that
use a transformer architecture have been particularly successful. For example,
BERT is a pre-trained language model that was released by Google in 2018. Since
that time, it has become the inspiration for most of today’s modern NLP
techniques. The {stack} {ml} features are structured around BERT and
transformer models. These features support BERT’s tokenization scheme (called
WordPiece) and transformer models that conform to the standard BERT model
interface.

To incorporate transformer models and make predictions, {es} uses libtorch,
which is an underlying native library for PyTorch. Trained models must be in a
TorchScript representation for use with {stack} {ml} features.
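As a minimal illustration of that requirement (a sketch assuming a local PyTorch install; the toy model below is not from the docs and stands in for a real transformer), a model can be converted to TorchScript with `torch.jit.trace` before it is made available to the cluster:

```python
import torch
import torch.nn as nn

# A toy model standing in for a real transformer. The conversion step is
# what matters here: the {stack} {ml} features consume the serialized
# TorchScript representation, not the original Python module.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

model = TinyModel().eval()
example_input = torch.zeros(1, 4)

# Trace the model into TorchScript and serialize it to disk.
traced = torch.jit.trace(model, example_input)
traced.save("tiny_model.pt")
```

Real transformer models typically need model-specific tracing wrappers; the snippet only shows the TorchScript round trip itself.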

As in the cases of <<ml-dfa-classification,classification>> and
<<ml-dfa-regression,regression>>, after you deploy a model to your cluster, you
can use it to make predictions (also known as _inference_) against incoming data.
You can perform the following NLP tasks:

Extract information::
* _Named entity recognition (NER)_ enables you to identify and categorize entities
in your text.
* _Fill masks_ enable you to predict missing words in text sequences.

Categorize text::
* <<ml-nlp-lang-ident,Language identification>> enables you to determine the
language of text.
* _Text classification_ enables you to classify input text.
* _Zero-shot text classification_ performs classification without requiring a
specialized model.

Search and compare text::
* _Text embedding_ turns content into vectors, which enables you to compare text
by using mathematical functions.
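To make the inference step above concrete, the sketch below builds a request for a deployed model (the model ID, field name, and exact endpoint shape are hypothetical here and depend on the {es} version; see the API quick reference for the authoritative format):

```python
import json

# Hypothetical sketch of an inference request against a deployed model.
# The model ID, text, and body shape are placeholders for illustration.
model_id = "my_ner_model"
endpoint = f"/_ml/trained_models/{model_id}/_infer"

body = {
    "docs": [
        {"text_field": "Elastic was founded in Amsterdam."}
    ]
}

request = {"method": "POST", "path": endpoint, "body": json.dumps(body)}
print(request["path"])  # /_ml/trained_models/my_ner_model/_infer
```

For an NER task, the response would attach recognized entities to the input document; each task type returns results in its own shape.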
16 changes: 16 additions & 0 deletions docs/en/stack/ml/nlp/ml-nlp.asciidoc
@@ -0,0 +1,16 @@
[[ml-nlp]]
= {nlp-cap}

:keywords: {ml-init}, {stack}, {nlp}, overview
:description: An introduction to {ml} {nlp} features.

[partintro]
--

You can use {stack-ml-features} to analyze natural language data and make
predictions.

* <<ml-nlp-overview>>
* <<ml-nlp-apis>>

--
5 changes: 5 additions & 0 deletions docs/en/stack/ml/redirects.asciidoc
@@ -148,3 +148,8 @@ This content has moved. See <<sample-data-forecasts>>.
=== Next steps

This content has moved. See <<sample-data-next>>.

[role="exclude",id="ml-dfa-lang-ident"]
=== Language identification

This content has moved. See <<ml-nlp-lang-ident>>.