[DOCS] Add links to `flattened` datatype #56794
Pinging @elastic/es-docs (>docs)
Pinging @elastic/es-search (:Search/Search)
Changes for #52239.
This looks like a good start, thank you @lockewritesdocs for addressing it!
I left some suggestions, and also thought of one other place where we could mention flattened. In the 'mappings explosion' section, we could add a tip under "index.mapping.total_fields.limit" saying that if your mappings contain a large, arbitrary set of keys, it could be worth looking into the 'flattened' data type.
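To make the "mappings explosion" point concrete, here is a rough Python sketch (not Elasticsearch code) that counts the distinct leaf fields a batch of documents would add under dynamic mapping and compares the count to the default `index.mapping.total_fields.limit` of 1000. The `labels` field and the `key_N` names are made up for illustration, and the count is only a lower bound because object mappings also count toward the limit.

```python
# Default value of index.mapping.total_fields.limit.
DEFAULT_TOTAL_FIELDS_LIMIT = 1000


def leaf_field_paths(doc, prefix=""):
    """Return the dotted paths of all leaf fields in a (possibly nested) dict."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into sub-objects; each leaf becomes its own mapped field.
            paths |= leaf_field_paths(value, prefix=path + ".")
        else:
            paths.add(path)
    return paths


# Hypothetical input: 1500 documents, each carrying a different arbitrary key.
docs = [{"labels": {f"key_{i}": "v"}} for i in range(1500)]

dynamic_fields = set()
for d in docs:
    dynamic_fields |= leaf_field_paths(d)

# With dynamic object mapping every key is a separate field.
exceeds_limit = len(dynamic_fields) > DEFAULT_TOTAL_FIELDS_LIMIT

# If `labels` were mapped as `flattened`, the whole object is one field.
flattened_field_count = 1
```

In this toy case the dynamic mapping would need 1500 leaf fields, while mapping `labels` as `flattened` keeps it at a single field.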
@@ -17,8 +16,11 @@ For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`
--------------------------------------------------
// NOTCONSOLE

TIP: Using the KV Processor can result in field names that you cannot control. Consider using the <<flattened>> datatype instead, which maps an entire object as a single field.
While a flattened object provides only a single field to search on, the object's contents can still be searched using simple queries and aggregations.
To me the phrase "While a flattened object provides only a single field to search on" could be confusing -- it could suggest that you can only search the root field. What would you think of this tweak: "Using the KV Processor can create a large number of field names that you don't control. Consider using the flattened datatype instead, which maps an entire object as a single field and allows for simple searches over its contents."
Agreed! I like your edit, which focuses on the positive (what the user can do) of the `flattened` object, rather than its limitations.
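As a rough illustration of the tip under discussion, the Python sketch below splits the `ip=1.2.3.4 error=REFUSED` message from the docs into key-value pairs and shows the two document shapes. `parse_kv` is a hypothetical stand-in for the KV Processor, not its implementation (its parameter names only mirror the processor's `field_split` and `value_split` options), and the `kv` field name is an assumption.

```python
def parse_kv(message, field_split=" ", value_split="="):
    """Split a message into key-value pairs (simplified stand-in for the KV Processor)."""
    pairs = {}
    for token in message.split(field_split):
        if value_split in token:
            key, value = token.split(value_split, 1)
            pairs[key] = value
    return pairs


parsed = parse_kv("ip=1.2.3.4 error=REFUSED")

# Shape 1: every parsed key becomes its own top-level field,
# so the set of field names depends on whatever the logs contain.
doc_separate_fields = dict(parsed)

# Shape 2: the whole object sits under one field mapped as `flattened`.
# `kv` is a hypothetical field name.
flattened_index_body = {
    "mappings": {"properties": {"kv": {"type": "flattened"}}}
}
doc_flattened = {"kv": parsed}
```

With the second shape the mapping holds one field regardless of how many distinct keys appear in the logs.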
each object in the array, use the `nested` datatype instead of the
<<object,`object`>> datatype.

TIP: If you consider creating `nested` objects with two `key` and `value` keyword fields, consider using the <<flattened>> datatype instead.
Maybe we could move this tip to the end of the section? I think it comes right in the middle of an important explanation and breaks the continuity.
<<object,`object`>> datatype.

TIP: If you consider creating `nested` objects with two `key` and `value` keyword fields, consider using the <<flattened>> datatype instead.
Because nested documents are indexed as separate documents, they can only be accessed within the scope of the nested query. While a flattened object provides only a single field to search on, the object's contents can still be searched using simple queries and aggregations.
A suggestion for how to restructure this tip:
- We could first give some context, saying that when ingesting key-value pairs with a large arbitrary set of keys, one technique is to model each pair as its own nested document with `key` and `value` fields.
- Instead we'd suggest using the flattened datatype, "which maps an entire object as a single field and allows for simple searches over its contents."
One other comment -- instead of describing the downside as "only be accessed within the scope of the nested query", I think it'd be clearer to mention that nested documents and queries are generally expensive, and that the flattened datatype is a better fit for this use case.
Thanks @jtibshirani -- I made several changes to this tip that incorporate your feedback. See my latest commit for specific changes.
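For readers comparing the two modeling techniques in this thread, a minimal Python sketch of the mappings, documents, and query bodies follows. The `attributes` field name and the sample keys are assumptions made up for illustration. The request bodies are plain dicts and nothing is sent to a cluster.

```python
# Hypothetical key-value data with arbitrary keys.
attributes = {"env": "prod", "region": "eu-west", "tier": "gold"}

# Technique 1: each pair is its own nested document with `key` and `value` fields.
nested_mapping = {
    "properties": {
        "attributes": {
            "type": "nested",
            "properties": {
                "key": {"type": "keyword"},
                "value": {"type": "keyword"},
            },
        }
    }
}
nested_doc = {"attributes": [{"key": k, "value": v} for k, v in attributes.items()]}
nested_query = {
    "nested": {
        "path": "attributes",
        "query": {
            "bool": {
                "must": [
                    {"term": {"attributes.key": "env"}},
                    {"term": {"attributes.value": "prod"}},
                ]
            }
        },
    }
}

# Technique 2: the whole object is a single `flattened` field.
flattened_mapping = {"properties": {"attributes": {"type": "flattened"}}}
flattened_doc = {"attributes": attributes}
flattened_query = {"term": {"attributes.env": "prod"}}
```

The nested version stores one extra hidden document per pair and needs a `nested` query wrapper, while the flattened version keeps the object intact and can be searched with a plain `term` query on a dotted key.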
…ed options in the Mapping page and referencing them in the Nested page.
As described earlier, each nested object is indexed as a separate document under the hood.
Continuing with the example above, if we indexed a single document containing 100 `user` objects,
then 101 Lucene documents would be created -- one for the parent document, and one for each
As described earlier, each nested object is indexed as a separate document.
I think we either need to keep 'under the hood', or say 'a separate Lucene document'. Otherwise this could be interpreted as each nested object corresponding to a new Elasticsearch document.
I think that "indexed as a separate Lucene document" is more descriptive than "indexed as a separate document under the hood".
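The arithmetic behind the "101 Lucene documents" example can be sketched in a few lines of Python. `lucene_docs_for` is a hypothetical helper, and it assumes a single level of nesting (nested objects inside nested objects are not counted).

```python
def lucene_docs_for(document, nested_fields):
    """Count Lucene documents for one Elasticsearch document:
    one for the parent, plus one per object in each nested array.
    Assumes a single level of nesting."""
    return 1 + sum(len(document.get(field, [])) for field in nested_fields)


# One Elasticsearch document containing 100 `user` objects.
doc = {"user": [{"first": f"user_{i}"} for i in range(100)]}

total = lucene_docs_for(doc, nested_fields=["user"])
```

Here `total` is 101: still a single Elasticsearch document from the user's point of view, but 101 Lucene documents in the index.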
@@ -161,6 +165,9 @@ Nested documents can be:
* sorted with <<nested-sorting,nested sorting>>.
* retrieved and highlighted with <<nested-inner-hits,nested inner hits>>.

TIP: When ingesting key-value pairs with a large, arbitrary set of keys, you might consider modeling each key-value pair as its own nested document with `key` and `value` fields. Instead, consider using the <<flattened>> datatype, which maps an entire object as a single field and allows for simple searches over its contents.
For me this tip doesn't really fall under the topic 'interacting with nested documents'. It's actually a suggestion to avoid using nested documents, so I think it'd be good to keep it in the previous section where we introduce what nested objects are.
Alternatively, we could move it to the top of this page right after the initial description so it catches a user's attention. I don't think it would be too distracting, since it's a short tip.
Good point. I think that moving this tip after the initial description makes sense. That way, users can read it and potentially say, "Ah, looks like I should be using `flattened`!" instead of having to read deep into the page to discover that information.
Looks good to me!
Thanks @jtibshirani! Adding @jrodewig for a review.
LGTM. I left some minor, non-blocking suggestions.
As a note, line width for docs is inconsistent, but we try to wrap documentation lines at 80 characters. This is sometimes impossible with links, but I think several of these new lines can be wrapped.
Co-authored-by: James Rodewig <[email protected]>
* Changes for elastic#52239.
* Incorporating review feedback from Julie T. Also single-sourcing nested options in the Mapping page and referencing them in the Nested page.
* Moving tip after the introduction and clarifying limits.
* Update docs/reference/mapping.asciidoc
  Co-authored-by: James Rodewig <[email protected]>
* Update docs/reference/mapping/types/nested.asciidoc
  Co-authored-by: James Rodewig <[email protected]>

Co-authored-by: James Rodewig <[email protected]>
This PR adds notes that link to the `flattened` datatype page from the KV Ingest Processor page and the Nested datatype page. Also clarifies some descriptions about how Elasticsearch flattens objects and fixes minor typos.
Resolves #52239
Backports:
* `flattened` datatype (#56794) #56959 (7.8)
* `flattened` datatype (#56794) #56962 (7.7)
* `flattened` datatype (#56794) #56963 (7.x)