[DOCS] Adds custom feature processors description to PUT DFA API (#67424) (#67678)

Co-authored-by: Benjamin Trent <[email protected]>
szabosteve and benwtrent authored Jan 19, 2021
1 parent 4e34671 commit 9f7ca26
Showing 2 changed files with 185 additions and 1 deletion.
105 changes: 105 additions & 0 deletions docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc
@@ -118,6 +118,111 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=feature-bag-fraction]
`feature_processors`::::
(Optional, list)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors]
+
.Properties of `feature_processors`
[%collapsible%open]
======
`frequency_encoding`::::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-frequency]
+
.Properties of `frequency_encoding`
[%collapsible%open]
=======
`feature_name`::::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-feat-name]

`field`::::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-field]

`frequency_map`::::
(Required, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-frequency-map]
=======
`multi_encoding`::::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-multi]
+
.Properties of `multi_encoding`
[%collapsible%open]
=======
`processors`::::
(Required, array)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-multi-proc]
=======
`ngram_encoding`::::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-ngram]
+
.Properties of `ngram_encoding`
[%collapsible%open]
=======
`feature_prefix`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-ngram-feat-pref]

`field`::::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-ngram-field]

`length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-ngram-length]

`n_grams`::::
(Required, array)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-ngram-ngrams]

`start`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-ngram-start]
=======
`one_hot_encoding`::::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-one-hot]
+
.Properties of `one_hot_encoding`
[%collapsible%open]
=======
`field`::::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-field]

`hot_map`::::
(Required, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-one-hot-map]
=======
`target_mean_encoding`::::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-target-mean]
+
.Properties of `target_mean_encoding`
[%collapsible%open]
=======
`default_value`::::
(Required, double)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-target-mean-default]

`feature_name`::::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-feat-name]

`field`::::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-field]

`target_map`::::
(Required, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=dfas-feature-processors-target-mean-map]
=======
======

`gamma`::::
(Optional, double)
81 changes: 80 additions & 1 deletion docs/reference/ml/ml-shared.asciidoc
@@ -554,9 +554,88 @@ A collection of feature preprocessors that modify one or more included fields.
The analysis uses the resulting one or more features instead of the
original document field. Multiple `feature_processors` entries can refer to the
same document fields.
Note that automatic categorical {ml-docs}/ml-feature-encoding.html[feature encoding]
still occurs.
end::dfas-feature-processors[]
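
As a rough illustration of how the properties documented in this change fit together, the following sketch builds a `feature_processors` list as a Python structure and serializes it to JSON. Only the property names come from the descriptions in this commit. The field names, feature names, and map values (`animal_class`, `color`, and so on) are hypothetical, and the rest of the request body is not shown.

```python
import json

# Hypothetical values throughout; only the property names come from the docs.
feature_processors = [
    {
        "frequency_encoding": {
            "feature_name": "animal_class_freq",
            "field": "animal_class",
            "frequency_map": {"cat": 0.6, "dog": 0.4},
        }
    },
    {
        "one_hot_encoding": {
            "field": "color",
            "hot_map": {"red": "color_red", "blue": "color_blue"},
        }
    },
    {
        "target_mean_encoding": {
            "feature_name": "city_target_mean",
            "field": "city",
            "target_map": {"london": 0.8, "paris": 0.3},
            "default_value": 0.5,
        }
    },
    {
        "ngram_encoding": {
            "feature_prefix": "f",
            "field": "product_code",
            "n_grams": [1, 2],
            "start": 0,
            "length": 10,
        }
    },
]

# Serialize the list so it can be pasted into a request body.
print(json.dumps(feature_processors, indent=2))
```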

tag::dfas-feature-processors-feat-name[]
The resulting feature name.
end::dfas-feature-processors-feat-name[]

tag::dfas-feature-processors-field[]
The name of the field to encode.
end::dfas-feature-processors-field[]

tag::dfas-feature-processors-frequency[]
The configuration information necessary to perform frequency encoding.
end::dfas-feature-processors-frequency[]

tag::dfas-feature-processors-frequency-map[]
The resulting frequency map for the field value. If the field value is missing
from the `frequency_map`, the resulting value is `0`.
end::dfas-feature-processors-frequency-map[]
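
The lookup behaviour described for `frequency_map` can be sketched in a few lines of Python. This is only an illustration of the documented rule (a value missing from the map yields `0`), not the actual implementation. The function name `frequency_encode` and the sample document are hypothetical.

```python
def frequency_encode(doc, field, feature_name, frequency_map):
    """Return the frequency feature for one document (illustrative sketch)."""
    value = doc.get(field)
    # Values that are absent from the map encode to 0, as the docs state.
    return {feature_name: frequency_map.get(value, 0)}


frequency_map = {"cat": 0.6, "dog": 0.4}
print(frequency_encode({"animal": "cat"}, "animal", "animal_freq", frequency_map))
print(frequency_encode({"animal": "emu"}, "animal", "animal_freq", frequency_map))
```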

tag::dfas-feature-processors-multi[]
The configuration information necessary to perform multi encoding. It allows
multiple processors to be chained together, so that the output of one processor
can be passed to another as an input.
end::dfas-feature-processors-multi[]

tag::dfas-feature-processors-multi-proc[]
The ordered array of custom processors to execute. Must contain more than one
processor.
end::dfas-feature-processors-multi-proc[]
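
The chaining idea behind `multi_encoding` can be sketched as follows. Everything here is an assumption made for illustration: processors are modelled as plain Python callables, and each processor's output is merged into the running document so that the next processor can read it. The helper names (`run_multi_encoding`, `first_char`, `one_hot_first_char`) are hypothetical.

```python
def run_multi_encoding(doc, processors):
    """Apply processors in order, feeding each one the features so far."""
    if len(processors) < 2:
        raise ValueError("multi_encoding needs more than one processor")
    features = dict(doc)
    for process in processors:
        # Each processor sees the fields produced by the previous ones.
        features.update(process(features))
    return features


def first_char(features):
    # Hypothetical first stage: extract the leading character of `name`.
    return {"name.10": features["name"][0]}


def one_hot_first_char(features):
    # Hypothetical second stage: one-hot encode the first stage's output.
    return {"starts_with_a": int(features["name.10"] == "a")}


print(run_multi_encoding({"name": "apple"}, [first_char, one_hot_first_char]))
```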

tag::dfas-feature-processors-ngram[]
The configuration information necessary to perform ngram encoding. Features
written out by this encoder have the following name format:
`<feature_prefix>.<ngram><string position>`. For example, if the
`feature_prefix` is `f`, the feature name for the second unigram in a string is
`f.11`.
end::dfas-feature-processors-ngram[]

tag::dfas-feature-processors-ngram-feat-pref[]
The feature name prefix. Defaults to `ngram_<start>_<length>`.
end::dfas-feature-processors-ngram-feat-pref[]

tag::dfas-feature-processors-ngram-field[]
The name of the text field to encode.
end::dfas-feature-processors-ngram-field[]

tag::dfas-feature-processors-ngram-length[]
Specifies the length of the ngram substring. Defaults to `50`. Must be greater
than `0`.
end::dfas-feature-processors-ngram-length[]

tag::dfas-feature-processors-ngram-ngrams[]
Specifies which ngrams to gather. It is an array of integer values, where each
value must be at least 1 and at most 5.
end::dfas-feature-processors-ngram-ngrams[]

tag::dfas-feature-processors-ngram-start[]
Specifies the zero-indexed start of the ngram substring. Negative values are
allowed for encoding ngrams of string suffixes. Defaults to `0`.
end::dfas-feature-processors-ngram-start[]
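
The ngram parameters are easier to follow with a small example. The sketch below is an interpretation of the descriptions above, not the actual encoder: it assumes that a negative `start` counts back from the end of the string, that `length` caps the substring, and that the position in the feature name is zero-indexed within that substring. The function name `ngram_encode` is hypothetical.

```python
def ngram_encode(text, n_grams, start=0, length=50, feature_prefix=None):
    """Return {feature_name: ngram} for the selected substring (sketch)."""
    if feature_prefix is None:
        # Documented default prefix: ngram_<start>_<length>.
        feature_prefix = f"ngram_{start}_{length}"
    # Assumption: a negative start is an offset from the end of the string.
    begin = max(len(text) + start, 0) if start < 0 else start
    window = text[begin:begin + length]
    features = {}
    for n in n_grams:
        for pos in range(len(window) - n + 1):
            # Name format: <feature_prefix>.<ngram><string position>
            features[f"{feature_prefix}.{n}{pos}"] = window[pos:pos + n]
    return features


# The second unigram of "hello" is stored under "f.11".
print(ngram_encode("hello", [1], feature_prefix="f")["f.11"])
# Bigrams and unigrams of the three-character suffix.
print(ngram_encode("hello", [1, 2], start=-3))
```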

tag::dfas-feature-processors-one-hot[]
The configuration information necessary to perform one hot encoding.
end::dfas-feature-processors-one-hot[]

tag::dfas-feature-processors-one-hot-map[]
The one hot map that maps each field value to its column name.
end::dfas-feature-processors-one-hot-map[]
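
A minimal sketch of how a `hot_map` could be applied, assuming each column receives `1` when the field value matches its key and `0` otherwise. The function name `one_hot_encode` and the sample values are hypothetical.

```python
def one_hot_encode(doc, field, hot_map):
    """Return one 0/1 column per entry in hot_map (illustrative sketch)."""
    value = doc.get(field)
    # hot_map maps a field value to the name of its output column.
    return {column: int(value == key) for key, column in hot_map.items()}


hot_map = {"red": "color_red", "blue": "color_blue"}
print(one_hot_encode({"color": "red"}, "color", hot_map))
```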

tag::dfas-feature-processors-target-mean[]
The configuration information necessary to perform target mean encoding.
end::dfas-feature-processors-target-mean[]

tag::dfas-feature-processors-target-mean-default[]
The default value to use if the field value is not found in the `target_map`.
end::dfas-feature-processors-target-mean-default[]

tag::dfas-feature-processors-target-mean-map[]
The map from each field value to its target mean.
end::dfas-feature-processors-target-mean-map[]
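
Target mean encoding follows the same lookup pattern as frequency encoding, except that the fallback is the configured `default_value`. The sketch below only illustrates that documented rule. The function name `target_mean_encode` and the sample values are hypothetical.

```python
def target_mean_encode(doc, field, feature_name, target_map, default_value):
    """Return the target mean feature for one document (illustrative sketch)."""
    value = doc.get(field)
    # Values that are absent from target_map fall back to default_value.
    return {feature_name: target_map.get(value, default_value)}


target_map = {"london": 0.8, "paris": 0.3}
print(target_mean_encode({"city": "paris"}, "city", "city_mean", target_map, 0.5))
print(target_mean_encode({"city": "oslo"}, "city", "city_mean", target_map, 0.5))
```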

tag::dfas-iteration[]
The number of iterations on the analysis.
end::dfas-iteration[]