Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic alias support for data streams #72613

Merged
merged 6 commits into from
May 11, 2021

Conversation

martijnvg
Copy link
Member

@martijnvg martijnvg commented May 3, 2021

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated. When a data stream alias is used in an api then all the backing indices
of the data streams the alias currently refers to are resolved.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details: #66163 (comment)

@martijnvg martijnvg force-pushed the data_stream_aliases_basics branch from fbaa9b4 to 9732815 Compare May 3, 2021 09:15
@@ -192,8 +192,8 @@ teardown:

- do:
headers: { Authorization: "Basic dGVzdF91c2VyOngtcGFjay10ZXN0LXBhc3N3b3Jk" } # test_user
indices.get_alias:
name: easy*
indices.get:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_alias api now support data streams, so changed this test to use get index api to test
that apis that don't data streams do not include data streams among authorised indices.

@@ -138,6 +138,7 @@ tasks.named("yamlRestCompatTest").configure {
'security/authz/14_cat_indices/Test explicit request while multiple authorized indices',
'security/authz/14_cat_indices/Test explicit request while multiple opened/closed authorized indices',
'security/authz/14_cat_indices/Test wildcard request with multiple authorized indices',
'security/authz/50_data_streams/Test that requests not supporting data streams do not include data streams among authorized indices',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 7.x, this test uses get alias api, but that api now supports data streams,
so this yaml rest combat test needs to be muted.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @martijnvg I am testing in 7.14.0 with wildcard matching on the data streams indices and it does not seem to work. Not sure what I did wrong here
image

image

But not using wild card does work

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @hungnguyen-elastic, thanks for reporting! I checked this locally and ran into the same issue. I will work on a fix for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #75456 to track this bug.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @martijnvg

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
elastic#66163 (comment)

Relates to elastic#66163
@martijnvg martijnvg force-pushed the data_stream_aliases_basics branch from 801c2bf to 63ad677 Compare May 4, 2021 09:26
@martijnvg martijnvg marked this pull request as ready for review May 4, 2021 09:27
@martijnvg martijnvg requested a review from danhermann May 4, 2021 09:27
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label May 4, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

Copy link
Contributor

@danhermann danhermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely a lot of stuff happening in this PR! I have some suggestions, but nothing blocking.

  • I'd suggest getting @jrodewig to look over the doc changes for consistency, wording, etc.
  • I believe the merge conflict in DataStreamMetadata is due to a recent change by Armin that makes that class immutable. I'd suggest doing the same thing with the new dataStreamAliases member that you're adding here.
  • Along the lines of the previous item, I'd suggest making DataStreamAlias immutable as its list of data streams is currently mutable.
  • I wonder if it makes sense to introduce a new IndexAbstraction.Type member since index aliases and data stream aliases are different. As it stands, IndexAbstraction.DataStreamAlias is the only implementation of IndexAbstraction that does not have its own type.
  • More broadly, and perhaps not a question for this PR specifically, but with data stream aliases limited to data streams, they won't support the use case of migrating a Kibana dashboard gradually from indices to data streams. Were we to want to support that use case in the future, it seems like it would require a lot of rework to this implementation rather than just an iterative refinement of it.

@martijnvg
Copy link
Member Author

Thanks for reviewing @danhermann!

I believe the merge conflict in DataStreamMetadata is due to a recent change by Armin that makes that class immutable. I'd suggest doing the same thing with the new dataStreamAliases member that you're adding here.

👍

Along the lines of the previous item, I'd suggest making DataStreamAlias immutable as its list of data streams is currently mutable.

👍

I wonder if it makes sense to introduce a new IndexAbstraction.Type member since index aliases and data stream aliases are different. As it stands, IndexAbstraction.DataStreamAlias is the only implementation of IndexAbstraction that does not have its own type.

Data stream alias are a different kind of alias compared to indices aliases, but both are similar enough that the functionality that these kinds of aliases provide can abstracted behind the IndexAbstraction.Type.ALIAS enum. If we would introduce another IndexAbstraction.Type enum instance then a lot of checks in the codebase would check for IndexAbstraction.Type.ALIAS and IndexAbstraction.Type.DATA_STREAM_ALIAS, which I think in many places would be verbose.

Also for someone setting up aliases, there isn't a real notion of data stream aliases or indices aliases. The update aliases api or get alias api don't make a distinction between the two, it is just that an alias either refers to indices or data streams. It is just that certain alias options (like routing) aren't supported for aliases that refer to data streams. APIs that use aliases just resolve an alias to a set of indices (irregardless what type of alias or whether an alias refers to a data stream and that data stream refers to backing indices).

More broadly, and perhaps not a question for this PR specifically, but with data stream aliases limited to data streams, they won't support the use case of migrating a Kibana dashboard gradually from indices to data streams. Were we to want to support that use case in the future, it seems like it would require a lot of rework to this implementation rather than just an iterative refinement of it.

I don't see data stream aliases play a role in this case. If an alias is used in front of these indices then the migrate data stream api can be used. If not, then currently there isn't much we can do to help migration to data streams. We would need ES to support index renaming (atomically) in order to be of any help here.

@martijnvg martijnvg requested a review from jrodewig May 10, 2021 08:52
Copy link
Contributor

@jrodewig jrodewig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @martijnvg. I left some suggestions, but feel free to ignore them if wanted.

Some additional doc changes are required, but I'll handle those in a separate PR. I'll add you as a reviewer there.

docs/reference/indices/aliases.asciidoc Outdated Show resolved Hide resolved
docs/reference/indices/aliases.asciidoc Outdated Show resolved Hide resolved
@martijnvg
Copy link
Member Author

@elasticmachine update branch

@martijnvg
Copy link
Member Author

Thanks for reviewing @jrodewig!

@martijnvg martijnvg merged commit 6689b8b into elastic:master May 11, 2021
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2021
Backporting elastic#72613 to 7.x.

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
elastic#66163 (comment)

Relates to elastic#66163
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2021
martijnvg added a commit that referenced this pull request May 11, 2021
martijnvg added a commit that referenced this pull request May 11, 2021
Backporting #72613 to 7.x.

Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
and refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream that then the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases the refer to indices. In security privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (irregardless whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
#66163 (comment)

Relates to #66163

* made code compatible with java 8
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2021
martijnvg added a commit that referenced this pull request May 11, 2021
jrodewig added a commit that referenced this pull request May 17, 2021
#72613 adds data stream support to aliases.
This adds an `alias` glossary entry and removes out the current `index alias` entry.
jrodewig added a commit that referenced this pull request May 17, 2021
#72613 adds data stream support to aliases.
This adds an `alias` glossary entry and removes out the current `index alias` entry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Data streams Data streams and their lifecycles >feature Team:Data Management Meta label for data/management team v7.14.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants