Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Consider validating jobs outside of the builder #34899

Open
davidkyle opened this issue Oct 26, 2018 · 3 comments
Open

[ML] Consider validating jobs outside of the builder #34899

davidkyle opened this issue Oct 26, 2018 · 3 comments
Labels
:ml Machine learning team-discuss

Comments

@davidkyle
Copy link
Member

A general issue but the changes in the Job in Index feature branch make it more relevant.

Whenever a job is parsed from xContent it is parsed into a builder then built wherein the build() method validates the config. This has the satisfying result that all job objects must be valid as they can only be constructed by the builder. Job config is stored in the clusterstate as Job objects rather than Job.Builders however, when the clusterstate is parsed (e.g after a node restart) the builder is used and validation will occur potentially throwing an exception.

For job index documents using the builder to parse the document means the job has to be validated every time it is read, in practice this means lots of code like this:

try {
   // need to wrap this is a try in case it throws
    job = builder.build();
} catch (e) {
    // hmm what should I do now?
}

If the validation did change or some bug was introduced and the build method throws then it is not possible to view the actual job config from the GET jobs API.

Job configuration must be validated on PUT, update and when the job is opened but is it not necessary at GET. I'd like to remove the need for all those try { build()..} statements by either not validating in the builder or parsing the document directly into a job. The job will have to be revalidated when it is opened.

I prefer the idea of having a non-throwing Builder.build() as it makes the code easier to reason about and there are a small number of places where the job config must absolutely be validated.

See also #34858

@davidkyle davidkyle added :ml Machine learning team-discuss labels Oct 26, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@droberts195
Copy link
Contributor

Job configuration must be validated on PUT, update and when the job is opened

I think it would also be a good idea to validate the job config on the node it is allocated to when the job persistent task is allocated to a node. This would pick up failures due to different validation in different versions in mixed version clusters. If a job fails this validation we could put the reason for the validation failure in the allocation failure explanation of the persistent task.

I think the problem is worse for datafeeds though. For datafeeds we're using core search classes to store the search, and in a mixed version cluster some nodes might not have appropriate classes to store some aspects of the search. So the only option is to defer not just validation but also parsing to specific objects until late - something like what is proposed in #30084.

@droberts195
Copy link
Contributor

For datafeeds we're using core search classes to store the search, and in a mixed version cluster some nodes might not have appropriate classes to store some aspects of the search.

This part of the problem has been resolved in #36117.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:ml Machine learning team-discuss
Projects
None yet
Development

No branches or pull requests

3 participants