[DOCS] Add links to `flattened` datatype #56794
Changes from 1 commit
9bb9814
94f92b2
f5a5693
d666bc7
e856be4
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,14 +5,13 @@ | |
++++ | ||
|
||
The `nested` type is a specialised version of the <<object,`object`>> datatype | ||
that allows arrays of objects to be indexed in a way that they can be queried | ||
that allows arrays of objects to be indexed in a way that they can be queried | ||
independently of each other. | ||
|
||
==== How arrays of objects are flattened | ||
|
||
Arrays of inner <<object,`object` fields>> do not work the way you may expect. | ||
Lucene has no concept of inner objects, so Elasticsearch flattens object | ||
hierarchies into a simple list of field names and values. For instance, the | ||
Elasticsearch has no concept of inner objects. Therefore, it flattens object | ||
hierarchies into a simple list of field names and values. For instance, consider the | ||
following document: | ||
|
||
[source,console] | ||
|
@@ -35,7 +34,7 @@ PUT my_index/_doc/1 | |
|
||
<1> The `user` field is dynamically added as a field of type `object`. | ||
|
||
would be transformed internally into a document that looks more like this: | ||
The previous document would be transformed internally into a document that looks more like this: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
|
@@ -71,10 +70,15 @@ GET my_index/_search | |
==== Using `nested` fields for arrays of objects | ||
|
||
If you need to index arrays of objects and to maintain the independence of | ||
each object in the array, you should use the `nested` datatype instead of the | ||
<<object,`object`>> datatype. Internally, nested objects index each object in | ||
each object in the array, use the `nested` datatype instead of the | ||
<<object,`object`>> datatype. | ||
|
||
TIP: If you consider creating `nested` objects with two `key` and `value` keyword fields, consider using the <<flattened>> datatype instead. | ||
Maybe we could move this tip to the end of the section? I think it comes right in the middle of an important explanation and breaks the continuity.
||
Because nested documents are indexed as separate documents, they can only be accessed within the scope of the nested query. While a flattened object provides only a single field to search on, the object's contents can still be searched using simple queries and aggregations. | ||
A suggestion for how to restructure this tip:

One other comment -- instead of describing the downside as "only be accessed within the scope of the nested query", I think it'd be clearer to mention that nested documents and queries are generally expensive, and that the flattened datatype is a better fit for this use case.

Thanks @jtibshirani -- I made several changes to this Tip that incorporate your feedback. See my latest commit for specific changes.
||
|
||
Internally, nested objects index each object in | ||
the array as a separate hidden document, meaning that each nested object can be | ||
queried independently of the others, with the <<query-dsl-nested-query,`nested` query>>: | ||
queried independently of the others with the <<query-dsl-nested-query,`nested` query>>: | ||
|
||
[source,console] | ||
-------------------------------------------------- | ||
|
@@ -230,6 +234,3 @@ settings in place to guard against performance problems: | |
|
||
Additional background on these settings, including information on their default values, can be found | ||
in <<mapping-limit-settings>>. | ||
|
||
|
||
|
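To make the flattening behavior discussed in this diff concrete, here is a minimal Python sketch. It is not Elasticsearch code: it only models the idea that object hierarchies become a flat list of field names and values, and it ignores text analysis such as lowercasing. The sample document (the `first`/`last` keys and their values) and the helper names `flatten`, `flat_match`, and `nested_match` are assumptions made up for illustration.

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Collapse nested objects into dotted field names mapped to lists of values."""
    fields = defaultdict(list)
    for key, value in doc.items():
        path = f"{prefix}{key}"
        # Treat a single value like a one-element array.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Recurse into inner objects and merge their values by field name.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    fields[sub_path].extend(sub_values)
            else:
                fields[path].append(item)
    return dict(fields)


def flat_match(doc, criteria):
    """Match against the flattened form: each field is checked on its own."""
    flat = flatten(doc)
    return all(value in flat.get(field, []) for field, value in criteria.items())


def nested_match(doc, path, criteria):
    """Match each object in the array independently, as separate hidden documents would."""
    return any(
        all(obj.get(key) == value for key, value in criteria.items())
        for obj in doc.get(path, [])
    )


# Hypothetical sample document; the names are invented for this sketch.
doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}

print(flatten(doc))
print(flat_match(doc, {"user.first": "Alice", "user.last": "Smith"}))
print(nested_match(doc, "user", {"first": "Alice", "last": "Smith"}))
```

Under these assumptions, the flattened form matches a search for `Alice` and `Smith` together even though no single user has both values, because the association between fields of the same object is lost. The per-object check does not match, which is the independence that the `nested` datatype is meant to preserve.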
To me the phrase "While a flattened object provides only a single field to search on" could be confusing -- it could suggest that you can only search the root field. What would you think of this tweak: "Using the KV Processor can create a large number of field names that you don't control. Consider using the flattened datatype instead, which maps an entire object as a single field and allows for simple searches over its contents."
Agreed! I like your edit, which focuses on the positive (what the user can do) of the `flattened` object, rather than its limitations.