[DOCS] Add links to flattened datatype #56794

Merged

@lockewritesdocs lockewritesdocs commented May 14, 2020

This PR adds notes that link to the flattened datatype page from the KV Ingest Processor page and the Nested datatype page.

Also clarifies some descriptions about how Elasticsearch flattens objects and fixes minor typos.

Resolves #52239

Backports:

@lockewritesdocs lockewritesdocs added >docs General docs changes WIP :Search/Search Search-related issues that do not fall into other categories Team:Docs Meta label for docs team Team:Search Meta label for search team labels May 14, 2020
@lockewritesdocs lockewritesdocs self-assigned this May 14, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-docs (>docs)

@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@lockewritesdocs lockewritesdocs changed the title from "Changes for #52239." to "[DOCS] Add links to flattened datatype" May 14, 2020
Contributor

@jtibshirani jtibshirani left a comment

This looks like a good start, thank you @lockewritesdocs for addressing it!

I left some suggestions, and also thought of one other place where we could mention flattened. In the 'mappings explosion' section, we could add a tip under "index.mapping.total_fields.limit" saying that if your mappings contain a large, arbitrary set of keys, it could be worth looking into the 'flattened' data type.
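
To illustrate the tip proposed above, here is a minimal Python sketch, under simplifying assumptions, of why a large, arbitrary set of keys can run into `index.mapping.total_fields.limit`. The helper `leaf_paths`, the `attrs` field name, and the sample documents are hypothetical. The value 1000 is the documented default for the setting; the real count also includes object mappings and multi-fields, so this sketch underestimates it.

```python
# Simplified model of dynamic mapping growth: every distinct leaf path
# in the indexed documents is treated as one mapped field.
DEFAULT_TOTAL_FIELDS_LIMIT = 1000  # documented default of index.mapping.total_fields.limit


def leaf_paths(obj, prefix=""):
    """Return the set of dotted paths to every non-object value in obj."""
    paths = set()
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths |= leaf_paths(value, path + ".")
        else:
            paths.add(path)
    return paths


# Hypothetical documents, each carrying one previously unseen key under "attrs".
docs = [{"attrs": {f"key_{i}": "x"}} for i in range(1500)]

dynamic_fields = set()
for doc in docs:
    dynamic_fields |= leaf_paths(doc)

# With dynamic mapping, each new key adds a field; with a flattened mapping,
# the whole "attrs" object is a single mapped field.
dynamic_field_count = len(dynamic_fields)
flattened_field_count = 1
exceeds_limit = dynamic_field_count > DEFAULT_TOTAL_FIELDS_LIMIT

# Index body (as a plain dict) that maps "attrs" as flattened instead.
flattened_index_body = {
    "mappings": {"properties": {"attrs": {"type": "flattened"}}}
}
```

In this model, 1500 distinct keys produce 1500 mapped fields, while the flattened mapping stays at one field no matter how many keys arrive.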

@@ -17,8 +16,11 @@ For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`
--------------------------------------------------
// NOTCONSOLE

TIP: Using the KV Processor can result in field names that you cannot control. Consider using the <<flattened>> datatype instead, which maps an entire object as a single field.
While a flattened object provides only a single field to search on, the object's contents can still be searched using simple queries and aggregations.
Contributor

To me the phrase "While a flattened object provides only a single field to search on" could be confusing -- it could suggest that you can only search the root field. What would you think of this tweak: "Using the KV Processor can create a large number of field names that you don't control. Consider using the flattened datatype instead, which maps an entire object as a single field and allows for simple searches over its contents."
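
A rough Python sketch of the trade-off described in this suggestion, assuming the log message from the page's own example. `split_kv` is a hypothetical stand-in that only imitates a key-value split (it is not the KV Processor itself), and the `kv` target field name is an assumption. `field_split`, `value_split`, and `target_field` are existing KV Processor options.

```python
def split_kv(message, field_split=" ", value_split="="):
    """Hypothetical helper: split 'k=v k2=v2' into a dict of keys and values."""
    fields = {}
    for pair in message.split(field_split):
        if value_split in pair:
            key, value = pair.split(value_split, 1)
            fields[key] = value
    return fields


parsed = split_kv("ip=1.2.3.4 error=REFUSED")

# Option 1: keys land at the document root, so each new key in the logs
# becomes a new mapped field that you don't control.
doc_root_fields = {"message": "ip=1.2.3.4 error=REFUSED", **parsed}

# Option 2: write the pairs under one object field ("kv" is an assumed name)
# and map that field as flattened, so the mapping holds a single field.
kv_pipeline = {
    "processors": [
        {
            "kv": {
                "field": "message",
                "field_split": " ",
                "value_split": "=",
                "target_field": "kv",
            }
        }
    ]
}
flattened_mapping = {"mappings": {"properties": {"kv": {"type": "flattened"}}}}
doc_flattened = {"message": "ip=1.2.3.4 error=REFUSED", "kv": parsed}
```

With the second option, the extracted keys still appear in the document, but only `kv` appears in the mapping.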

Contributor Author

Agreed! I like your edit, which focuses on the positive (what the user can do) of the flattened object, rather than its limitations.

each object in the array, use the `nested` datatype instead of the
<<object,`object`>> datatype.

TIP: If you consider creating `nested` objects with two `key` and `value` keyword fields, consider using the <<flattened>> datatype instead.
Contributor

Maybe we could move this tip to the end of the section? I think it comes right in the middle of an important explanation and breaks the continuity.

Contributor Author

@lockewritesdocs lockewritesdocs May 15, 2020

I'm considering a new section, "Interacting with nested documents", that contains the ways users can interact with these documents, this new tip, and the existing Important note. For example:

image

<<object,`object`>> datatype.

TIP: If you consider creating `nested` objects with two `key` and `value` keyword fields, consider using the <<flattened>> datatype instead.
Because nested documents are indexed as separate documents, they can only be accessed within the scope of the nested query. While a flattened object provides only a single field to search on, the object's contents can still be searched using simple queries and aggregations.
Contributor

@jtibshirani jtibshirani May 14, 2020

A suggestion for how to restructure this tip:

  • We could first give some context, saying that when ingesting key-value pairs with a large arbitrary set of keys, one technique is to model each pair as its own nested document with key and value fields.
  • Instead, we'd suggest using the flattened datatype, "which maps an entire object as a single field and allows for simple searches over its contents."

One other comment -- instead of describing the downside as "only be accessed within the scope of the nested query", I think it'd be clearer to mention that nested documents and queries are generally expensive, and that the flattened datatype is a better fit for this use case.
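
The two modeling choices compared in this comment can be sketched in Python as plain request bodies. This is an illustration under assumptions, not text from the PR: the `labels` field name, its sample keys, and the helper `as_nested_pairs` are hypothetical. The query shapes follow the `nested` query and the dotted-key `term` query that the flattened datatype supports.

```python
# Hypothetical key-value data with an arbitrary set of keys.
labels = {"priority": "urgent", "release": "v1.2", "team": "search"}


def as_nested_pairs(obj):
    """Model each key-value pair as its own object with key and value fields."""
    return [{"key": k, "value": v} for k, v in obj.items()]


# Technique 1: nested documents with `key` and `value` keyword fields.
nested_mapping = {
    "properties": {
        "labels": {
            "type": "nested",
            "properties": {
                "key": {"type": "keyword"},
                "value": {"type": "keyword"},
            },
        }
    }
}
nested_doc = {"labels": as_nested_pairs(labels)}
nested_query = {
    "nested": {
        "path": "labels",
        "query": {
            "bool": {
                "must": [
                    {"term": {"labels.key": "priority"}},
                    {"term": {"labels.value": "urgent"}},
                ]
            }
        },
    }
}

# Technique 2: one flattened field; the object is indexed as is.
flattened_mapping = {"properties": {"labels": {"type": "flattened"}}}
flattened_doc = {"labels": labels}
flattened_query = {"term": {"labels.priority": "urgent"}}
```

The nested variant needs one extra object per pair plus a `nested` query wrapper, while the flattened variant keeps the original object and uses a simple `term` query on the dotted key.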

Contributor Author

@lockewritesdocs lockewritesdocs May 15, 2020

Thanks @jtibshirani -- I made several changes to this Tip that incorporate your feedback. See my latest commit for specific changes.

…ed options in the Mapping page and referencing them in the Nested page.
As described earlier, each nested object is indexed as a separate document under the hood.
Continuing with the example above, if we indexed a single document containing 100 `user` objects,
then 101 Lucene documents would be created -- one for the parent document, and one for each
As described earlier, each nested object is indexed as a separate document.
Contributor

I think we either need to keep 'under the hood', or say 'a separate Lucene document'. Otherwise this could be interpreted as each nested object corresponding to a new Elasticsearch document.

Contributor Author

I think that "indexed as a separate Lucene document" is more descriptive than "indexed as a separate document under the hood".
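
The arithmetic in the quoted passage (one document with 100 `user` objects yields 101 Lucene documents) can be sketched in Python as follows. The function `lucene_doc_count` is a hypothetical illustration that assumes a single level of nesting; it is not an Elasticsearch API.

```python
def lucene_doc_count(doc, nested_fields):
    """Count Lucene documents for one Elasticsearch document.

    Assumes a single level of nesting: one Lucene document for the parent,
    plus one for each object stored under a field mapped as `nested`.
    """
    count = 1  # the parent document
    for field in nested_fields:
        value = doc.get(field, [])
        if isinstance(value, dict):  # a single object counts as one
            value = [value]
        count += len(value)
    return count


# One Elasticsearch document containing 100 `user` objects.
doc = {"group": "fans", "user": [{"first": f"user{i}"} for i in range(100)]}
total = lucene_doc_count(doc, nested_fields=["user"])
```

Here `total` is 101, even though only one Elasticsearch document was indexed, which is the distinction the wording "a separate Lucene document" is meant to preserve.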

@@ -161,6 +165,9 @@ Nested documents can be:
* sorted with <<nested-sorting,nested sorting>>.
* retrieved and highlighted with <<nested-inner-hits,nested inner hits>>.

TIP: When ingesting key-value pairs with a large, arbitrary set of keys, you might consider modeling each key-value pair as its own nested document with `key` and `value` fields. Instead, consider using the <<flattened>> datatype, which maps an entire object as a single field and allows for simple searches over its contents.
Contributor

@jtibshirani jtibshirani May 15, 2020

For me this tip doesn't really fall under the topic 'interacting with nested documents'. It's actually a suggestion to avoid using nested documents, so I think it'd be good to keep it in the previous section where we introduce what nested objects are.

Alternatively, we could move it to the top of this page right after the initial description so it catches a user's attention. I don't think it would be too distracting, since it's a short tip.

Contributor Author

Good point. I think that moving this tip after the initial description makes sense. That way, users can read it and potentially say, "Ah, looks like I should be using flattened!" instead of having to read deep into the page to discover that information.

Contributor

@jtibshirani jtibshirani left a comment

Looks good to me!

@lockewritesdocs lockewritesdocs requested a review from jrodewig May 18, 2020 01:34
@lockewritesdocs
Contributor Author

Thanks @jtibshirani! Adding @jrodewig for a review.

Contributor

@jrodewig jrodewig left a comment

LGTM. I left some minor, non-blocking suggestions.

As a note, line width for docs is inconsistent, but we try to wrap documentation lines at 80 characters. This is sometimes impossible with links, but I think several of these new lines can be wrapped.
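
One way to apply the 80-character guideline mechanically is sketched below with Python's standard `textwrap` module. The helper name `wrap_doc_line` is hypothetical, and this is not part of the project's tooling; long tokens such as links are left unbroken, which matches the note that wrapping is sometimes impossible.

```python
import textwrap


def wrap_doc_line(line, width=80):
    """Wrap one long documentation line at `width` characters.

    Long tokens (for example links) are never split, so a line may still
    exceed `width` when a single token is longer than the limit.
    """
    return textwrap.fill(
        line,
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )


tip = (
    "TIP: When ingesting key-value pairs with a large, arbitrary set of keys, "
    "you might consider modeling each key-value pair as its own nested "
    "document with `key` and `value` fields."
)
wrapped = wrap_doc_line(tip)
wrapped_lines = wrapped.split("\n")
```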

docs/reference/mapping/types/nested.asciidoc Outdated
Adam Locke and others added 2 commits May 19, 2020 12:58
@lockewritesdocs lockewritesdocs merged commit d77388f into elastic:master May 19, 2020
@lockewritesdocs lockewritesdocs deleted the docs__link_flattened_field branch May 19, 2020 17:41
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this pull request May 19, 2020
* Changes for elastic#52239.

* Incorporating review feedback from Julie T. Also single-sourcing nexted options in the Mapping page and referencing them in the Nested page.

* Moving tip after the introduction and clarifying limits.

* Update docs/reference/mapping.asciidoc

Co-authored-by: James Rodewig <[email protected]>

* Update docs/reference/mapping/types/nested.asciidoc

Co-authored-by: James Rodewig <[email protected]>

Co-authored-by: James Rodewig <[email protected]>
lockewritesdocs pushed a commit that referenced this pull request May 19, 2020
* Changes for #52239.

* Incorporating review feedback from Julie T. Also single-sourcing nexted options in the Mapping page and referencing them in the Nested page.

* Moving tip after the introduction and clarifying limits.

* Update docs/reference/mapping.asciidoc

Co-authored-by: James Rodewig <[email protected]>

* Update docs/reference/mapping/types/nested.asciidoc

Co-authored-by: James Rodewig <[email protected]>

Co-authored-by: James Rodewig <[email protected]>

Co-authored-by: James Rodewig <[email protected]>
Labels
>docs General docs changes :Search/Search Search-related issues that do not fall into other categories Team:Docs Meta label for docs team Team:Search Meta label for search team
Development

Successfully merging this pull request may close these issues.

Link the flattened field from the kv processor and nested fields
4 participants