Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add global context index and indices handler #43

Closed
wants to merge 6 commits into from

Conversation

jackiehanyang
Copy link
Collaborator

@jackiehanyang jackiehanyang commented Sep 18, 2023

Description

  • Creating an indices handler to create index while if it's absent, or upgrade the mapping if it's needed.
  • Adding use-case template index mapping
  • Adding a class to throw our own custom exceptions.

Integration tests will be added later when the setup for integration test CI is ready.

Issues Resolved

#51

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added the backport 2.x backport PRs to 2.x branch label Sep 18, 2023
Copy link
Member

@dbwiddis dbwiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial pass

public static final String META = "_meta";
public static final String SCHEMA_VERSION_FIELD = "schema_version";
public static final Integer GLOBAL_CONTEXT_INDEX_SCHEMA_VERSION = 1;
public static final String GLOBAL_CONTEXT_INDEX_MAPPING =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put this in src/main/resources and read it into this constant off the class path? See here.

Copy link
Collaborator Author

@jackiehanyang jackiehanyang Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with putting this index mapping in either path; I believe it's just a matter of coding style. However, I do prefer to put it in under src/main/java cause it can save us from setting up methods to read from src/main/resources, something like getMapping() ML-Commons is using this approach as well, see here

build.gradle Outdated
@@ -105,6 +105,7 @@ repositories {
dependencies {
implementation "org.opensearch:opensearch:${opensearch_version}"
implementation 'org.junit.jupiter:junit-jupiter:5.10.0'
implementation 'org.projectlombok:lombok:1.18.22'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a separate discussion/issue/RFC on whether we want to use a code generation tool like Lombok, and figure out what modifications we need to make to our CI to support it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed lombok dependency for now. It's not needed in this pr. I will open a discuss and figure out the XContent question before adding this dependency

*
* @param message message of the exception
*/
public FlowFrameworkException(String message) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of having our own custom exceptions. Given that these will eventually need to be processed in a RestResponse, can we consider including the appropriate return code as part of constructing this exception? For example, "index not found" should return RestStatus.NOT_FOUND etc.

@codecov
Copy link

codecov bot commented Sep 25, 2023

Codecov Report

Merging #43 (bce1b62) into main (1d22bee) will decrease coverage by 22.17%.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##               main      #43       +/-   ##
=============================================
- Coverage     73.87%   51.71%   -22.17%     
  Complexity       65       65               
=============================================
  Files             8       13        +5     
  Lines           245      350      +105     
  Branches         22       29        +7     
=============================================
  Hits            181      181               
- Misses           54      159      +105     
  Partials         10       10               
Files Coverage Δ
.../opensearch/flowframework/constant/CommonName.java 0.00% <0.00%> (ø)
...opensearch/flowframework/constant/CommonValue.java 0.00% <0.00%> (ø)
...lowframework/exception/FlowFrameworkException.java 0.00% <0.00%> (ø)
...arch/flowframework/indices/FlowFrameworkIndex.java 0.00% <0.00%> (ø)
...framework/indices/FlowFrameworkIndicesHandler.java 0.00% <0.00%> (ø)

@jackiehanyang jackiehanyang marked this pull request as ready for review September 25, 2023 16:50
Copy link
Member

@joshpalis joshpalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial pass

@@ -117,8 +117,12 @@ dependencies {

test {
include '**/*Tests.class'
systemProperty 'tests.security.manager', 'false'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand these build.gradle settings better, does this disable security for tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not disable security manager for the plugin. I know it's deprecated but OpenSearch still uses it and doesn't have our security manager yet.

+ "\n"
+ " },\n"
+ " \"properties\": {\n"
+ " \"pipeline_id\": {\n"
Copy link
Member

@joshpalis joshpalis Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's change this pipeline_id to workflow_id to avoid confusion with ingest/search pipelines

public class FlowFrameworkIndicesHandler {

private static final Logger logger = LogManager.getLogger(FlowFrameworkIndicesHandler.class);
private static final Map<String, Object> indexSettings = Map.of("index.auto_expand_replicas", "0-1");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the justification for this auto_expand_replicas setting, just trying to understand why this is necessary

String indexName = index.getIndexName();
String mapping = index.getMapping();

try (ThreadContext.StoredContext threadContext = client.threadPool().getThreadContext().stashContext()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand about security, we only need to stash the thread context if this particular index is managed by the security plugin's list of protected indices. This allows the plugin to do CRUD operations on this protected index. Is the plan to add all flow framework indices to the security plugin's list of protected indices, and if not, why stash the thread context?

CreateIndexRequest request = new CreateIndexRequest(indexName).mapping(mapping).settings(indexSettings);
client.admin().indices().create(request, actionListener);
} else {
logger.debug("index:{} is already created", indexName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer that we keep logging at the info or error level for success and failure cases respectively

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't mind extra debug or even trace logging.

Copy link
Member

@owaiskazi19 owaiskazi19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need tests! Since we both kind of worked on creating the index. I will create a new PR to have the generic way of creating the index and later you can pass the Global Context mapping file to it to create GC index.

@@ -117,8 +117,12 @@ dependencies {

test {
include '**/*Tests.class'
systemProperty 'tests.security.manager', 'false'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not disable security manager for the plugin. I know it's deprecated but OpenSearch still uses it and doesn't have our security manager yet.


filter {
excludeTestsMatching "org.opensearch.flowframework.indices.*Tests"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the integ test code from this PR and add it in the next PR as this PR doesn't supports the framework?

+ " \"type\": \"text\"\n"
+ " }\n"
+ " }\n"
+ "}";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a JSON file for mapping and then parse it to create a String.

Map<String, Object> indexMapping = indexMetaData.mapping().getSourceAsMap();
Object meta = indexMapping.get(META);
if (meta != null && meta instanceof Map) {
@SuppressWarnings("unchecked")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have suppress warnings here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect because all we know is that meta is an Object that an instance of Map but we don't know the types of its key and value when they are cast.

Map<String, Object> indexMapping = indexMetaData.mapping().getSourceAsMap();
Object meta = indexMapping.get(META);
if (meta != null && meta instanceof Map) {
@SuppressWarnings("unchecked")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect because all we know is that meta is an Object that an instance of Map but we don't know the types of its key and value when they are cast.

* this file be licensed under the Apache-2.0 license or a
* compatible open source license.
*/
package org.opensearch.flowframework.constant;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about this choice of package name. Given the class names, seems common might be better?

logger.error("Failed to create index " + indexName, e);
internalListener.onFailure(e);
});
CreateIndexRequest request = new CreateIndexRequest(indexName).mapping(mapping).settings(indexSettings);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the mapping here require a mediatype? Looking at the javadoc it seems it expects JSON but needs a special format with a _doc key. We shold be consistent between our various implementations so all our mapping files look consistent.

CreateIndexRequest request = new CreateIndexRequest(indexName).mapping(mapping).settings(indexSettings);
client.admin().indices().create(request, actionListener);
} else {
logger.debug("index:{} is already created", indexName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't mind extra debug or even trace logging.

client.admin()
.indices()
.putMapping(
new PutMappingRequest().indices(indexName).source(mapping, XContentType.JSON),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this requires a MediaType argument instead of XContentType.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x backport PRs to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants