Add analyze API to high-level rest client #31577

Merged
merged 19 commits into from
Jul 3, 2018
Changes from 7 commits
@@ -23,6 +23,8 @@
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesRequest;
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesResponse;
import org.elasticsearch.action.admin.indices.alias.get.GetAliasesRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeResponse;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheRequest;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheResponse;
import org.elasticsearch.action.admin.indices.close.CloseIndexRequest;
@@ -721,4 +723,32 @@ public void getTemplateAsync(GetIndexTemplatesRequest getIndexTemplatesRequest,
restHighLevelClient.performRequestAsyncAndParseEntity(getIndexTemplatesRequest, RequestConverters::getTemplates,
options, GetIndexTemplatesResponse::fromXContent, listener, emptySet());
}

/**
* Calls the analyze API
*
* See <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html">Analyze API on elastic.co</a>
*
* @param request the request
* @param options the request options (e.g. headers), use {@link RequestOptions#DEFAULT} if nothing needs to be customized
*/
public AnalyzeResponse analyze(AnalyzeRequest request, RequestOptions options) throws IOException {
return restHighLevelClient.performRequestAndParseEntity(request, RequestConverters::analyze, options,
AnalyzeResponse::fromXContent, emptySet());
}

/**
* Asynchronously calls the analyze API
*
* See <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html">Analyze API on elastic.co</a>
*
* @param request the request
* @param options the request options (e.g. headers), use {@link RequestOptions#DEFAULT} if nothing needs to be customized
* @param listener the listener to be notified upon request completion
*/
public void analyzeAsync(AnalyzeRequest request, RequestOptions options,
ActionListener<AnalyzeResponse> listener) {
restHighLevelClient.performRequestAsyncAndParseEntity(request, RequestConverters::analyze, options,
AnalyzeResponse::fromXContent, listener, emptySet());
}
}
@@ -41,6 +41,7 @@
import org.elasticsearch.action.admin.cluster.storedscripts.GetStoredScriptRequest;
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesRequest;
import org.elasticsearch.action.admin.indices.alias.get.GetAliasesRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheRequest;
import org.elasticsearch.action.admin.indices.close.CloseIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
@@ -894,6 +895,18 @@ static Request getTemplates(GetIndexTemplatesRequest getIndexTemplatesRequest) t
return request;
}

static Request analyze(AnalyzeRequest request) throws IOException {
EndpointBuilder builder = new EndpointBuilder();
String index = request.index();
if (index != null) {
builder.addPathPart(index);
}
Member

question: I see in the REST spec that we have also prefer_local and format. I don't see them supported in the corresponding REST action though. Can you double check? Maybe those params should be removed from the SPEC?

Contributor Author

prefer_local was removed by commit cafc707 and I don't think format has ever been supported. I'll open a new PR to change the rest-spec, and include the typo fix in that one too

Member

sounds good thanks!

builder.addPathPartAsIs("_analyze");
Request req = new Request(HttpGet.METHOD_NAME, builder.build());
req.setEntity(createEntity(request, REQUEST_BODY_CONTENT_TYPE));
return req;
}
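
As a rough illustration of the path logic in the `analyze` converter above (the index part is optional, `_analyze` is always appended), here is a minimal self-contained sketch. `AnalyzeEndpointSketch` and `analyzeEndpoint` are hypothetical names, not part of the client, and the sketch assumes the index name needs no URL encoding, which the real `EndpointBuilder` may handle differently.

```java
import java.util.ArrayList;
import java.util.List;

class AnalyzeEndpointSketch {

    // Hypothetical stand-in for the client's EndpointBuilder; not the real class.
    static String analyzeEndpoint(String index) {
        List<String> parts = new ArrayList<>();
        if (index != null) {
            parts.add(index);      // optional index path part (no encoding in this sketch)
        }
        parts.add("_analyze");     // fixed path part, always present
        return "/" + String.join("/", parts);
    }

    public static void main(String[] args) {
        System.out.println(analyzeEndpoint("test_index")); // prints /test_index/_analyze
        System.out.println(analyzeEndpoint(null));         // prints /_analyze
    }
}
```

The two printed endpoints match the ones asserted in `testAnalyzeRequest`.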

static Request getScript(GetStoredScriptRequest getStoredScriptRequest) {
String endpoint = new EndpointBuilder().addPathPartAsIs("_scripts").addPathPart(getStoredScriptRequest.id()).build();
Request request = new Request(HttpGet.METHOD_NAME, endpoint);
@@ -29,6 +29,8 @@
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesRequest.AliasActions;
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesResponse;
import org.elasticsearch.action.admin.indices.alias.get.GetAliasesRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeResponse;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheRequest;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheResponse;
import org.elasticsearch.action.admin.indices.close.CloseIndexRequest;
@@ -1240,4 +1242,20 @@ public void testGetIndexTemplate() throws Exception {
new GetIndexTemplatesRequest().names("the-template-*"), client.indices()::getTemplate, client.indices()::getTemplateAsync));
assertThat(notFound.status(), equalTo(RestStatus.NOT_FOUND));
}

public void testAnalyze() throws Exception {

RestHighLevelClient client = highLevelClient();

AnalyzeRequest noindexRequest = new AnalyzeRequest().text("One two three").analyzer("english");
AnalyzeResponse noindexResponse = execute(noindexRequest, client.indices()::analyze, client.indices()::analyzeAsync);

assertThat(noindexResponse.getTokens(), hasSize(3));

AnalyzeRequest detailsRequest = new AnalyzeRequest().text("One two three").analyzer("english").explain(true);
AnalyzeResponse detailsResponse = execute(detailsRequest, client.indices()::analyze, client.indices()::analyzeAsync);

assertNotNull(detailsResponse.detail());

}
}
@@ -43,6 +43,7 @@
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesRequest;
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesRequest.AliasActions;
import org.elasticsearch.action.admin.indices.alias.get.GetAliasesRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheRequest;
import org.elasticsearch.action.admin.indices.close.CloseIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
@@ -1950,6 +1951,22 @@ public void testGetTemplateRequest() throws Exception {
assertThat(request.getEntity(), nullValue());
}

public void testAnalyzeRequest() throws Exception {
AnalyzeRequest indexAnalyzeRequest = new AnalyzeRequest()
.text("Here is some text")
.index("test_index")
.analyzer("test_analyzer");

Request request = RequestConverters.analyze(indexAnalyzeRequest);
assertThat(request.getEndpoint(), equalTo("/test_index/_analyze"));
assertThat(request.getEntity(), notNullValue());
Member

I think other folks are using assertToXContentBody(validateQueryRequest, request.getEntity());. Might be worth leaving a comment about why you aren't using it so no one gets confused.


AnalyzeRequest analyzeRequest = new AnalyzeRequest()
.text("more text")
.analyzer("test_analyzer");
assertThat(RequestConverters.analyze(analyzeRequest).getEndpoint(), equalTo("/_analyze"));
}

public void testGetScriptRequest() {
GetStoredScriptRequest getStoredScriptRequest = new GetStoredScriptRequest("x-script");
Map<String, String> expectedParams = new HashMap<>();
@@ -27,6 +27,9 @@
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesRequest.AliasActions;
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesResponse;
import org.elasticsearch.action.admin.indices.alias.get.GetAliasesRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeResponse;
import org.elasticsearch.action.admin.indices.analyze.DetailAnalyzeResponse;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheRequest;
import org.elasticsearch.action.admin.indices.cache.clear.ClearIndicesCacheResponse;
import org.elasticsearch.action.admin.indices.close.CloseIndexRequest;
@@ -2211,4 +2214,127 @@ public void onFailure(Exception e) {

assertTrue(latch.await(30L, TimeUnit.SECONDS));
}

public void testAnalyze() throws IOException, InterruptedException {

RestHighLevelClient client = highLevelClient();

{
// tag::analyze-builtin-request
AnalyzeRequest request = new AnalyzeRequest();
request.text("Some text to analyze", "Some more text to analyze"); // <1>
request.analyzer("english"); // <2>
// end::analyze-builtin-request
}

{
// tag::analyze-custom-request
AnalyzeRequest request = new AnalyzeRequest();
request.text("<b>Some text to analyze</b>");
request.addCharFilter("html_strip"); // <1>
request.tokenizer("standard"); // <2>
request.addTokenFilter("lowercase"); // <3>

Map<String, Object> stopFilter = new HashMap<>();
stopFilter.put("type", "stop");
stopFilter.put("stopwords", new String[]{ "to" }); // <4>
request.addTokenFilter(stopFilter); // <5>
// end::analyze-custom-request
}

{
// tag::analyze-custom-normalizer-request
AnalyzeRequest request = new AnalyzeRequest();
request.text("<b>BaR</b>");
request.addCharFilter("html_strip");
request.addTokenFilter("lowercase");
// end::analyze-custom-normalizer-request

// tag::analyze-request-explain
request.explain(true); // <1>
request.attributes("keyword", "type"); // <2>
// end::analyze-request-explain

// tag::analyze-request-sync
AnalyzeResponse response = client.indices().analyze(request, RequestOptions.DEFAULT);
// end::analyze-request-sync

// tag::analyze-response
List<AnalyzeResponse.AnalyzeToken> tokens = response.getTokens(); // <1>
DetailAnalyzeResponse detail = response.detail(); // <2>
// end::analyze-response

assertEquals(1, tokens.size());
assertEquals("bar", tokens.get(0).getTerm());
assertNotNull(detail.tokenizer());
}

CreateIndexRequest req = new CreateIndexRequest("my_index");
CreateIndexResponse resp = client.indices().create(req, RequestOptions.DEFAULT);
assertTrue(resp.isAcknowledged());

PutMappingRequest pmReq = new PutMappingRequest()
.indices("my_index")
.source("my_field", "type=text,analyzer=english");
PutMappingResponse pmResp = client.indices().putMapping(pmReq, RequestOptions.DEFAULT);
assertTrue(pmResp.isAcknowledged());

{
// tag::analyze-index-request
AnalyzeRequest request = new AnalyzeRequest();
request.index("my_index"); // <1>
request.analyzer("my_analyzer"); // <2>
request.text("some text to analyze");
// end::analyze-index-request

// tag::analyze-execute-listener
ActionListener<AnalyzeResponse> listener = new ActionListener<AnalyzeResponse>() {
@Override
public void onResponse(AnalyzeResponse analyzeTokens) {

}

@Override
public void onFailure(Exception e) {

}
};
// end::analyze-execute-listener

// Use a blocking listener in the test
final CountDownLatch latch = new CountDownLatch(1);
final ActionListener<AnalyzeResponse> blockingListener = new LatchedActionListener<>(listener, latch);
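listener = ActionListener.wrap(r -> {
assertThat(r.getTokens(), hasSize(4));
// Count down the latch on success too; otherwise latch.await below times out
blockingListener.onResponse(r);
}, e-> {
blockingListener.onFailure(e);
fail("should not fail");
});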

// tag::analyze-request-async
client.indices().analyzeAsync(request, RequestOptions.DEFAULT, listener);
// end::analyze-request-async

assertTrue(latch.await(30L, TimeUnit.SECONDS));
}

{
// tag::analyze-index-normalizer-request
AnalyzeRequest request = new AnalyzeRequest();
request.index("my_index"); // <1>
Member

Make them line up?

request.normalizer("my_normalizer"); // <2>
request.text("some text to analyze");
// end::analyze-index-normalizer-request
}

{
// tag::analyze-field-request
AnalyzeRequest request = new AnalyzeRequest();
request.index("my_index");
request.field("my_field");
request.text("some text to analyze");
// end::analyze-field-request
}

}
}
112 changes: 112 additions & 0 deletions docs/java-rest/high-level/indices/analyze.asciidoc
@@ -0,0 +1,112 @@
[[java-rest-high-analyze]]
=== Analyze API

[[java-rest-high-analyze-request]]
==== Analyze Request

An `AnalyzeRequest` contains the text to analyze, and one of several options to
specify how the analysis should be performed.

The simplest version uses a built-in analyzer:

["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-builtin-request]
---------------------------------------------------
<1> The text to include. Multiple strings are treated as a multi-valued field
<2> A built-in analyzer

You can configure a custom analyzer:
["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-custom-request]
---------------------------------------------------
<1> Configure char filters
<2> Configure the tokenizer
<3> Add a built-in tokenfilter
<4> Configuration for a custom tokenfilter
<5> Add the custom tokenfilter
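
To make the order of the chain concrete (char filters first, then the tokenizer, then token filters), the following is a simplified plain-Java sketch. It is not how Elasticsearch implements analysis: the regular expressions are crude stand-ins for `html_strip` and the `standard` tokenizer, and `AnalysisChainSketch` is a hypothetical name used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnalysisChainSketch {

    // Simplified stand-in for the analysis chain; not Elasticsearch's implementation.
    static List<String> analyze(String text, Set<String> stopwords) {
        // Char filter: crude tag removal, standing in for html_strip.
        String stripped = text.replaceAll("<[^>]*>", " ");
        List<String> tokens = new ArrayList<>();
        // Tokenizer: split on anything that is not a letter or digit.
        for (String raw : stripped.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading delimiter yields one empty string
            }
            String term = raw.toLowerCase(Locale.ROOT); // token filter: lowercase
            if (!stopwords.contains(term)) {            // token filter: stop
                tokens.add(term);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<>(Arrays.asList("to"));
        // prints [some, text, analyze]
        System.out.println(analyze("<b>Some text to analyze</b>", stopwords));
    }
}
```

The stop filter runs after lowercasing, so `To` and `to` would both be removed, which is why filter order matters when building a custom chain.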

You can also build a custom normalizer, by including only charfilters and
tokenfilters:
["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-custom-normalizer-request]
---------------------------------------------------

You can analyze text using an analyzer defined in an existing index:
["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-index-request]
---------------------------------------------------
<1> The index containing the mappings
<2> The analyzer defined on this index to use

Or you can use a normalizer:
["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-index-normalizer-request]
---------------------------------------------------
<1> The index containing the mappings
<2> The normalizer defined on this index to use

You can analyze text using the mappings for a particular field in an index:
["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-field-request]
---------------------------------------------------

==== Optional arguemnts
Member

typo :)

Contributor Author

argh, well spotted. Am I alright to directly commit a fix, or should I open another PR?

Member

I am fine with pushing this fix directly

The following arguments can also optionally be provided:

["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-request-explain]
---------------------------------------------------
<1> Setting `explain` to true will add further details to the response
<2> Setting `attributes` allows you to return only token attributes that you are
interested in

[[java-rest-high-analyze-sync]]
==== Synchronous Execution

["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-request-sync]
---------------------------------------------------

[[java-rest-high-analyze-async]]
==== Asynchronous Execution

The asynchronous execution of an analyze request requires both the `AnalyzeRequest`
instance and an `ActionListener` instance to be passed to the asynchronous method:

["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-request-async]
---------------------------------------------------

The asynchronous method does not block and returns immediately. Once the request
completes, the `ActionListener` is called back: `onResponse` is invoked if the
execution succeeded, and `onFailure` is invoked if it failed.

A typical listener for `AnalyzeResponse` looks like:

["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-execute-listener]
---------------------------------------------------
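
The callback contract described above can be sketched without the client library. In this minimal example, `Listener` is a hypothetical stand-in with the same two callbacks as `ActionListener`, and `analyzeAsync` is a hypothetical method that merely counts whitespace-separated words on another thread. It also shows one way to wait for the callback with a `CountDownLatch`, under the assumption that both callbacks release the latch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class AsyncListenerSketch {

    // Hypothetical stand-in with the same two callbacks as ActionListener.
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    // Hypothetical async call: returns immediately and notifies the listener later.
    static void analyzeAsync(String text, ExecutorService executor, Listener<Integer> listener) {
        executor.execute(() -> {
            try {
                listener.onResponse(text.trim().split("\\s+").length);
            } catch (Exception e) {
                listener.onFailure(e);
            }
        });
    }

    // Blocks the caller until one of the two callbacks has fired.
    static Integer countTokensBlocking(String text) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<Integer> result = new AtomicReference<>();
        try {
            analyzeAsync(text, executor, new Listener<Integer>() {
                @Override
                public void onResponse(Integer tokenCount) {
                    result.set(tokenCount);
                    latch.countDown(); // release the waiting thread on success
                }

                @Override
                public void onFailure(Exception e) {
                    latch.countDown(); // release it on failure as well
                }
            });
            if (!latch.await(30L, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for the listener");
            }
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countTokensBlocking("some text to analyze")); // prints 4
    }
}
```

If only the failure path released the latch, a successful call would leave the caller waiting until the timeout, so both callbacks count down.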

[[java-rest-high-analyze-response]]
==== Analyze Response

The returned `AnalyzeResponse` allows you to retrieve details of the analysis as
follows:
["source","java",subs="attributes,callouts,macros"]
---------------------------------------------------
include-tagged::{doc-tests}/IndicesClientDocumentationIT.java[analyze-response]
---------------------------------------------------
<1> `AnalyzeToken` holds information about the individual tokens produced by analysis
<2> `DetailAnalyzeResponse` holds more detailed information about tokens produced by
the various substeps in the analysis chain. If `explain` was set to `false` in the
request, this method will return `null`
1 change: 1 addition & 0 deletions docs/java-rest/high-level/supported-apis.asciidoc
@@ -83,6 +83,7 @@ Alias Management::
* <<java-rest-high-exists-alias>>
* <<java-rest-high-get-alias>>

include::indices/analyze.asciidoc[]
include::indices/create_index.asciidoc[]
include::indices/delete_index.asciidoc[]
include::indices/indices_exists.asciidoc[]