Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor model management to support apis #95

Conversation

jmazanec15
Copy link
Member

Signed-off-by: John Mazanec [email protected]

Description

This PR adds additional metadata to the Model System index, so that we can implement the APIs specified in #70. The metadata that is added is:

  1. State - the lifecycle state the model is in - training, created, failed
  2. Timestamp - when the model was created
  3. Description - field allowing users to add detail to their models
  4. Error - When model is in failed state, error will indicate why it failed

Additonally, this PR adds update functionality in ModelDao so that models can initially start in TRAINING state and later be updated.

Along with this, some minor code quality changes were made as well.

This PR also adds a TODO to fix the bug in clearing the cache in the ModelDao. This will be tracked in #93.

This PR does not add search functionality to the ModelDao. This will need to be done in a future PR.

Issues Resolved

#92

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -114,6 +114,10 @@ public void addKNNBinaryField(FieldInfo field, DocValuesProducer valuesProducer)
String modelId = field.attributes().get(MODEL_ID);
Model model = ModelCache.getInstance().get(modelId);

if (model.getModelBlob() == null) {
throw new RuntimeException("Model cannot be null");
Copy link
Member

@VijayanB VijayanB Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit Model blob cannot be null

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update.

this.modelBlob = Objects.requireNonNull(modelBlob, "modelBlob must not be null");

if (ModelState.CREATED.equals(this.modelMetadata.getState()) && modelBlob == null) {
throw new IllegalArgumentException("Model blob cannot be null when model metadata says model is created");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will user understands what is model blob/ model meta data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we may have had this discussion before. Basically a model is made of a binary model and some metadata about the model. model_blob refers to the binary. I am not sure if there is a better name. Do you have any recs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as we document what this error message means / how to resolve it, it should be fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Vijay. We can put message something like "We could not find model associated to this id xyz"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will update error message.

@@ -56,7 +63,19 @@ public ModelMetadata getModelMetadata() {
* @return length of model blob
*/
public int getLength() {
return modelBlob.length;
if (modelBlob.get() == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be use getter from 56

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have to mask length to zero if object is null? Why not let similar to other java objects like string/array?

Copy link
Member Author

@jmazanec15 jmazanec15 Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be use getter from 56

Will update

do we have to mask length to zero if object is null? Why not let similar to other java objects like string/array?

GetLength is called in ModelCache to determine how large the model is. For model's whose modelBlob is null, we say the length is 0 so cache loading doesnt fail.

@@ -69,13 +88,13 @@ public boolean equals(Object obj) {

EqualsBuilder equalsBuilder = new EqualsBuilder();
equalsBuilder.append(modelMetadata, other.modelMetadata);
equalsBuilder.append(modelBlob, other.modelBlob);
equalsBuilder.append(modelBlob.get(), other.modelBlob.get());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are we accessing modelBlob as variable? shouldn't this be getModelBlolb?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update to use getter


return equalsBuilder.isEquals();
}

@Override
public int hashCode() {
return new HashCodeBuilder().append(modelMetadata).append(modelBlob).toHashCode();
return new HashCodeBuilder().append(modelMetadata).append(modelBlob.get()).toHashCode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getModelBlob() to avoid modelBlob.get()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update to use getter

@Override
public String toString() {
return knnEngine.getName() + DELIMITER + spaceType.getValue() + DELIMITER + dimension;
return knnEngine.getName() + DELIMITER +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may be stringbuilder or String.join()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will update.

@@ -104,13 +189,18 @@ public boolean equals(Object obj) {
equalsBuilder.append(knnEngine, other.knnEngine);
equalsBuilder.append(spaceType, other.spaceType);
equalsBuilder.append(dimension, other.dimension);
equalsBuilder.append(state.get(), other.state.get());
equalsBuilder.append(timestamp, other.timestamp);
Copy link
Member

@VijayanB VijayanB Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, shouldn't we use getter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for timestamp/description/error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will update

@@ -107,6 +141,54 @@ public void testPut_withId() throws InterruptedException, IOException {
assertTrue(inProgressLatch2.await(100, TimeUnit.SECONDS));
}

public void testPut_withoutModel() throws InterruptedException, IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testPutWithoutModel

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use underscore for better readability. First portion says what we are testing. After underscore represents specific case. I could just test all put together but I felt that would be too long.

assertTrue(inProgressLatch1.await(100, TimeUnit.SECONDS));

// User provided model id that already exists - should be able to update
Model model2 = new Model(new ModelMetadata(KNNEngine.DEFAULT, SpaceType.DEFAULT, dimension, ModelState.CREATED,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: duplicateModel

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 requested a review from VijayanB September 15, 2021 23:18
@@ -114,6 +114,10 @@ public void addKNNBinaryField(FieldInfo field, DocValuesProducer valuesProducer)
String modelId = field.attributes().get(MODEL_ID);
Model model = ModelCache.getInstance().get(modelId);

if (model.getModelBlob() == null) {
throw new RuntimeException("Model blob cannot be null");
Copy link
Member

@vamshin vamshin Sep 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets have user friendly message. Something like "There is no model associated to this id xyz"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will update.

this.modelBlob = Objects.requireNonNull(modelBlob, "modelBlob must not be null");

if (ModelState.CREATED.equals(this.modelMetadata.getState()) && modelBlob == null) {
throw new IllegalArgumentException("Model blob cannot be null when model metadata says model is created");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Vijay. We can put message something like "We could not find model associated to this id xyz"?

this.modelMetadata = Objects.requireNonNull(modelMetadata, "modelMetadata must not be null");
this.modelBlob = Objects.requireNonNull(modelBlob, "modelBlob must not be null");

if (ModelState.CREATED.equals(this.modelMetadata.getState()) && modelBlob == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be equalsIgnoreCase instead of equals?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CREATED is an enum not a string.

@@ -56,7 +63,19 @@ public ModelMetadata getModelMetadata() {
* @return length of model blob
*/
public int getLength() {
return modelBlob.length;
if (getModelBlob() == null) {
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Should this be -1 to make it clear to the caller that there is no model?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we use this in the cache, I think it should just be 0, indicating the model has a size of 0.

* @param model new model
* @param listener handles index response
*/
void update(String modelId, Model model, ActionListener<IndexResponse> listener) throws IOException;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we do not want to update model once created? Can some one do it using this function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required because initially we set state as training, then we train, then we update state to created and add the model. This will not be exposed to user.

listener);
// After metadata update finishes, remove item from cache if necessary. If no model id is
// passed then nothing needs to be removed from the cache
//TODO: Bug. Model needs to be removed from all nodes caches, not just local.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we planning to fix this TODO in next PR? Having model updatable would result in these scenarios. We should Ideally be not updating them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a bug that needs to be fixed. The problem is more relevant in DELETE. #93

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok lets also put the issue in the comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 requested a review from vamshin September 22, 2021 16:39
Copy link
Member

@vamshin vamshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks

@jmazanec15 jmazanec15 merged commit 3cc2857 into opensearch-project:feature/faiss-support Sep 22, 2021
jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Oct 22, 2021
jmazanec15 added a commit that referenced this pull request Oct 22, 2021
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 7, 2022
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 7, 2022
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants