Initial data stream commit #53666
Conversation
This commit adds a data stream feature flag, an initial definition of a data stream, and stubs for the data stream create, delete and get APIs. Simple serialization tests are added, as well as a rest test that exercises the data stream API stubs. This is a large amount of mostly mechanical code, but it should be straightforward to review because there isn't any real logic yet. The data stream transport and rest actions are behind the data stream feature flag and are only initialized if the feature flag is enabled. The feature flag is enabled if Elasticsearch is built as a snapshot, or if it is a release build and the 'es.datastreams_feature_flag_registered' system property is set. The integ-test-zip sets the feature flag when building a release build, otherwise the rest tests would fail. Relates to elastic#53100
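For orientation, here is a minimal sketch of how the feature-flag gating described above could look, assuming a dedicated helper class; the class name, method name and default value are illustrative assumptions, not necessarily what this PR does:

```java
import org.elasticsearch.Build;

// Hedged sketch only: the class and method names are assumptions for illustration.
public final class DataStreamsFeatureFlag {

    // System property named in the PR description.
    private static final String PROPERTY = "es.datastreams_feature_flag_registered";

    private DataStreamsFeatureFlag() {}

    /**
     * Enabled for snapshot builds, or for release builds that explicitly set the
     * system property (as the integ-test-zip does for release builds).
     */
    public static boolean isEnabled() {
        return Build.CURRENT.isSnapshot() || Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }
}
```

Under this sketch, the transport and rest actions would only be registered when `isEnabled()` returns true.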
Pinging @elastic/es-core-features (:Core/Features/Data streams)
Before this commit the data stream APIs were indices-based actions, but data streams aren't indices: data streams encapsulate indices without being indices themselves. A data stream is a cluster-level attribute, and therefore a cluster-based action fits best for now. Perhaps in the future we will have data stream based actions, and then that would be the right fit for the data stream CRUD APIs.
LGTM!
super(in);
this.name = in.readString();
this.timestampFieldName = in.readString();
}
Do we need to read the optional `indices` field here for creating a data stream with existing indices?
For now this API will only create new, empty indices. When we add the ability to add existing indices to a data stream, we can add an optional `indices` field here. Makes sense?
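To make that concrete, a hedged sketch of what the stream constructor could look like once attaching existing indices is supported; the request class name, the `indices` field and the `readOptionalStringArray` call are assumptions for illustration only, not part of this PR:

```java
import java.io.IOException;

import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.ActionRequestValidationException;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.io.stream.StreamInput;

// Hedged sketch, not the PR's code.
public class CreateDataStreamRequest extends ActionRequest {

    private final String name;
    private final String timestampFieldName;
    @Nullable
    private final String[] indices; // hypothetical: existing indices to attach, null for now

    public CreateDataStreamRequest(StreamInput in) throws IOException {
        super(in);
        this.name = in.readString();
        this.timestampFieldName = in.readString();
        this.indices = in.readOptionalStringArray(); // only once attaching existing indices is supported
    }

    @Override
    public ActionRequestValidationException validate() {
        return null;
    }
}
```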
…lease build is executed
I added a number of comments, mostly on naming and locations, otherwise looking good.
@@ -0,0 +1,31 @@
{
  "cluster.create_data_stream":{
I wonder if this name should be `data_streams.create`, to be consistent with `indices.create`? Other `cluster` operations seem to be mostly about the cluster state and cluster settings.
After chatting with Henning, I also came to the conclusion that data stream APIs should be considered indices-based operations. (Ideally we should have a data_stream notion, but we need to think more about this and how it would work in security.) So I will revert the commit in this PR, changing the data stream CRUD APIs back to indices-based APIs.
The component template APIs are cluster operations; I suspect those may need to be changed also. Can you explain the reasoning and how it will relate to security?
Just like indices-based APIs, data stream APIs target a namespace. So someone may be allowed to create a data stream for logs-* but not for events-*.
Ideally there should be a data stream privilege that handles this correctly, because data streams aren't indices, and a data stream may have indices that don't share the data stream name as a common prefix. But for now let's stick with `indices:` based action names and group the APIs in the indices client, until there is a better understanding of how security and data streams should integrate.
I think component templates should remain cluster-based APIs. The reason is that these resources don't apply to a namespace. The index templates (based on the specified pattern) that use component templates should be an index-based action/operation, since they apply to an index namespace and soon also to a data stream namespace.
However, index templates are currently treated as a cluster privilege (see ClusterPrivilegeResolver line 196 in master), even though the action names start with the `indices:` prefix. With data streams we should properly fix this.
Okay, I originally had the component template APIs using `indices:` to match the existing template stuff, but I did end up changing it because that requires the request to provide the indices it is going to apply to, and component templates don't apply to indices. +1 to properly fixing the template exception with data streams.
@@ -0,0 +1,26 @@
{
  "cluster.delete_data_stream":{
`data_streams.delete`?
@@ -0,0 +1,33 @@
{
  "cluster.get_data_streams":{
`data_streams.get`?
cluster.create_data_stream:
  name: data-stream2
  body:
    timestamp_field_name: "@timestamp"
I think in most places where we ask for a field we use just "field" and not "field_name". I did not search all usages, but I think this should be just `timestamp_field`.
 * specific language governing permissions and limitations
 * under the License.
 */
package org.elasticsearch.action.admin.cluster.datastream;
I think we should move this outside the `cluster` package, so that `datastream` is a sibling to `indices`. We might also want to follow the same substructure by adding a `create` package now? Probably easiest to split into subpackages early.
public class CreateDataStreamAction extends ActionType<AcknowledgedResponse> {

    public static final CreateDataStreamAction INSTANCE = new CreateDataStreamAction();
    public static final String NAME = "cluster:admin/data_stream/create";
I am in doubt about whether we should add a specific prefix here. No need to look at this now; we can pick that up later.
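For context, a hedged sketch of how an `ActionType` stub like the one above is commonly completed; the private constructor and the use of `AcknowledgedResponse::new` follow the usual pattern and are assumptions, not copied from this PR:

```java
import org.elasticsearch.action.ActionType;
import org.elasticsearch.action.support.master.AcknowledgedResponse;

// Hedged sketch of a completed ActionType stub.
public class CreateDataStreamAction extends ActionType<AcknowledgedResponse> {

    public static final CreateDataStreamAction INSTANCE = new CreateDataStreamAction();
    public static final String NAME = "cluster:admin/data_stream/create";

    private CreateDataStreamAction() {
        // registers the action name and how to deserialize its response
        super(NAME, AcknowledgedResponse::new);
    }
}
```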
@@ -1185,6 +1188,36 @@ public DeleteStoredScriptRequestBuilder prepareDeleteStoredScript(){
    public DeleteStoredScriptRequestBuilder prepareDeleteStoredScript(String id){
        return prepareDeleteStoredScript().setId(id);
    }
Same story as for the action names. We should consider whether to include this under indices or under a new "client" interface. I will add it to our weekly sync; this should not block merging this PR.
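To make the open question concrete, a hedged sketch of what a separate client-side surface could look like; `DataStreamClient` and `CreateDataStreamRequest` are hypothetical names mirroring the request-builder pattern shown above, not part of this PR:

```java
import org.elasticsearch.action.ActionFuture;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.support.master.AcknowledgedResponse;

// Hedged sketch: a dedicated client interface instead of extending the indices client.
public interface DataStreamClient {

    /** Future-returning variant, mirroring the existing admin client style. */
    ActionFuture<AcknowledgedResponse> createDataStream(CreateDataStreamRequest request);

    /** Listener variant. */
    void createDataStream(CreateDataStreamRequest request, ActionListener<AcknowledgedResponse> listener);
}
```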
import java.util.List;
import java.util.Objects;

public final class DataStream extends AbstractDiffable<DataStream> implements ToXContentObject {
Rename to `DataStreamMetaData` (like `IndexMetaData`)?
Data stream definitions will be stored in a custom metadata (like `ComponentTemplateMetadata`). These classes tend to have the `Metadata` suffix, and therefore I think that adding a `MetaData` suffix is confusing here.
Maybe we can rename this to `DataStreamSource`? Like how stored scripts are stored? (Those are also stored inside a custom metadata.)
OK. I think I imagined a more native integration into the `MetaData` (outside `Custom`). I think the primary purpose of `Custom` is extensibility (x-pack/plugins/modules)? In particular, I think `MetaData.getAliasAndIndexLookup` will need some support for data streams? It could still look for the `Custom` metadata pieces, but that feels second-class.
Let us leave this as is for now and tackle the above when we get to adding the metadata and doing the lookups.
> I think MetaData.getAliasAndIndexLookup will need some support for data streams?

We can add a getter on metadata that reads the data streams from a custom, and then it is as if they were stored as a primary field. The reason why I think we should do this is that we get serialization, diffability and versioning for free, without changing the Metadata class.

> It could still look for the Custom metadata pieces, but feels second-class.

Perhaps we should rename this in the future. Scripts, pipelines and other primary concepts are also stored as custom metadata, and those concepts aren't second class either.
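As a toy, self-contained illustration of the "getter over a custom section" idea (none of the class names below are Elasticsearch classes; the real implementation would read a `Metadata.Custom` entry instead):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Toy model only: data streams are exposed through a convenience getter while the
// data itself lives in an extensible "custom" section of the cluster metadata.
final class ToyClusterMetadata {

    interface CustomSection {}

    static final class DataStreamSection implements CustomSection {
        static final String TYPE = "data_stream";
        final Map<String, String> dataStreams; // name -> timestamp field, heavily simplified

        DataStreamSection(Map<String, String> dataStreams) {
            this.dataStreams = Map.copyOf(dataStreams);
        }
    }

    private final Map<String, CustomSection> customs = new HashMap<>();

    void putCustom(String type, CustomSection section) {
        customs.put(type, section);
    }

    // The getter: callers see data streams as if they were a primary field, while
    // serialization, diffing and versioning stay with the custom section.
    Map<String, String> dataStreams() {
        DataStreamSection section = (DataStreamSection) customs.get(DataStreamSection.TYPE);
        return section == null ? Collections.emptyMap() : section.dataStreams;
    }
}
```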
I've changed the base branch to be the master branch instead of the data stream feature branch.
LGTM.
Left a few minor optional comments to consider.
return actionName.startsWith("cluster:") || actionName.startsWith("indices:admin/template/");
return actionName.startsWith("cluster:") ||
    actionName.startsWith("indices:admin/template/") ||
    actionName.startsWith("indices:admin/data_stream/");
Maybe add a todo to help the next reader until resolved? Something like:
// todo: hack until we implement security of data_streams
 * specific language governing permissions and limitations
 * under the License.
 */
package org.elasticsearch.rest.action.admin.cluster;
I think the rest classes should move to `org.elasticsearch.rest.action.admin.datastreams`? Or `org.elasticsearch.rest.action.admin.indices.datastream`?
Yes, I missed that.