[ML] Adding support for Cohere inference service #104559

jonathan-buttner · 2024-01-18T18:38:20Z

This PR adds support for interacting with Cohere's rest APIs in the _inference API. This only adds support for text embedding.

Cohere text embedding docs: https://docs.cohere.com/reference/embed

Not included

No support for tokenization, Cohere does provide tokenization APIs so chunking could leverage that API. That functionality isn't included in this PR because we don't support chunking in the other services yet either
No truncation support in the Cohere service implementation. Their API provides truncation support so we'll leverage that instead

Notable differences from the Cohere API

Cohere allows providing a embedding_types field and users can provide multiple values (an array of strings) and the response of the request will include embeddings for all the ones requested. For example a response could have floats and signed byte, and unsigned byte embeddings. To keep with our simplified API we only allow a single embedding type in a response. So this PR adds embedding_type as a single string field instead of an array.
input_type, Cohere's accepted values are search_document (for ingest) and search_query (for searching). Our API supports ingest and search. I thought that was more intuitive but can easily change it to something else if preferred

Notable Changes

Registers a single CohereService that will handle all inference operations in the future
The TextEmbeddingResults class now supports bytes in addition to floats
Corresponding specification PR

Response formats

Floats

{
    "text_embedding": [
        {
            "embedding": [
                0.7519531,
                0.34326172
            ]
        }
    ]
}

Bytes

{
    "text_embedding": [
        {
            "embedding": [
               1,
                -5
            ]
        }
    ]
}

Testing

Add the service

PUT _inference/text_embedding/cohere
{
    "service": "cohere",
    "service_settings": {
        "api_key": "<api key>"
    },
    "task_settings": {
        "model": "embed-english-v3.0",
        "input_type": "ingest",
        "embedding_type": "float",
        "truncate": "end"
    }
}

Performan inference

POST _inference/text_embedding/cohere
{
    "input": "The food was delicious."
}

…ence-cohere

jonathan-buttner · 2024-01-18T21:12:33Z

server/src/main/java/org/elasticsearch/inference/InputType.java

 import java.util.Locale;

 /**
 * Defines the type of request, whether the request is to ingest a document or search for a document.
 */
-public enum InputType implements Writeable {


I decided to remove these because the cohere task settings need to write it as an optional enum so I figured it'd be better to just leave it up to the caller to handle how it is written.

jonathan-buttner · 2024-01-18T21:13:06Z

server/src/main/java/org/elasticsearch/inference/Model.java

@@ -23,6 +23,20 @@ public Model(ModelConfigurations configurations, ModelSecrets secrets) {
        this.secrets = Objects.requireNonNull(secrets);
    }

+    public Model(Model model, TaskSettings taskSettings) {


These just make it easier to construct the Model when these various settings change but the secrets do not.

jonathan-buttner · 2024-01-18T21:15:16Z

.../core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingResults.java

        public Embedding(StreamInput in) throws IOException {
-            this(in.readCollectionAsImmutableList(StreamInput::readFloat));
+            if (in.getTransportVersion().onOrAfter(TransportVersions.ML_INFERENCE_COHERE_EMBEDDINGS_ADDED)) {


Double check that this is correct. This was added so we can support returning bytes or floats. I thought the best way to do that was to used a named writable but happy to change it to something else if there's a better solution. The writables that we support now are the FloatValue and ByteValue classes above.

Are you able to elaborate a bit on why we need implementations of EmbeddingValue rather than directly using primitives or Numbers?

I could be completely wrong but this was my thinking 😅 :

In this constructor public Embedding(StreamInput in), I don't know if I need to read floats or bytes (since we're allowing either to occur now with the addition of cohere). So I can't simply do in.readByte() or in.readFloat(). So I think the way to "know" is to use a named writable which led me to add the EmbeddingValue and the ByteValue, FloatValue. Because the in.readNamedWriteable() will figure out what type it is and read the appropriate one. That way when the results get serialized and another node needs to read them, it can determine whether to read them as floats or bytes.

rather than directly using primitives or Numbers

It does seem excessive to hold an array of EmbeddingValue objects when a primitive would do the job. TextEmbeddingResults is a named writable one option is create the TextEmbeddingByteResults and ByteEmbeddings classes for the byte variant. That would avoid the conditionals in the streaming functions.

jonathan-buttner · 2024-01-18T21:16:34Z

.../core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingResults.java

+            if (in.getTransportVersion().onOrAfter(TransportVersions.ML_INFERENCE_COHERE_EMBEDDINGS_ADDED)) {
+                values = in.readNamedWriteableCollectionAsList(EmbeddingValue.class);
+            } else {
+                values = convertFloatsToEmbeddingValues(in.readCollectionAsImmutableList(StreamInput::readFloat));


I don't think we actually want nodes converting bytes into floats when sending to other nodes. I believe we'll prevent that by using the cohere service's minimum support version. So I don't think we'll ever need to execute this code.

If we don't actually want to run this code, would it be better to log the values and not actually do the conversion?

Yeah good question. Dave and I ran into this with this class as well: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceAction.java#L215-L224

which should be similar in that a few of those if-blocks shouldn't really happen. We decided for completeness that we'd handle the cases. I'm happy to remove it if we think otherwise though. It just gets complicated with the nodes on separate versions during upgrades. It should be possible but I suppose if we have a bug in the future we could get in a case where it was 🤷‍♂️

I think at least we should be adding in an assertion here so that any tests that hit this code path fail

With Dave's suggestion I think we can get away with the transport version check all together. It'll add duplicate code though. I'll add a comment on the new class.

jonathan-buttner · 2024-01-18T21:17:38Z

x-pack/plugin/inference/src/main/java/module-info.java

@@ -22,10 +22,5 @@
    exports org.elasticsearch.xpack.inference.registry;
    exports org.elasticsearch.xpack.inference.rest;
    exports org.elasticsearch.xpack.inference.services;
-    exports org.elasticsearch.xpack.inference.external.http.sender;


@davidkyle do we need these? I had added them a while ago but now I get all kinds of warnings saying that various classes aren't exported. If I remove them they go away 🤷‍♂️ . I don't think we'd want to export most of this stuff out of the module anyway.

++ we shouldn't need to export that package, thanks for tidying up

jonathan-buttner · 2024-01-18T21:26:15Z

...g/elasticsearch/xpack/inference/external/response/cohere/CohereEmbeddingsResponseEntity.java

+    private static ByteValue parseEmbeddingInt8Entry(XContentParser parser) throws IOException {
+        XContentParser.Token token = parser.currentToken();
+        XContentParserUtils.ensureExpectedToken(XContentParser.Token.VALUE_NUMBER, token, parser);
+        var parsedByte = parser.shortValue();


I didn't see a byteValue() method. Should I add one to the parsing library? That'd probably be a follow up PR.

jonathan-buttner · 2024-01-18T21:28:53Z

...erence/src/main/java/org/elasticsearch/xpack/inference/services/cohere/CohereTruncation.java

+ * 2.0.
+ */
+
+package org.elasticsearch.xpack.inference.services.cohere;


This lives here instead of embeddings because I think a few other cohere APIs support this field as a body parameter

jonathan-buttner · 2024-01-18T21:30:22Z

...e/src/main/java/org/elasticsearch/xpack/inference/services/openai/OpenAiServiceSettings.java

@@ -100,14 +100,6 @@ public OpenAiServiceSettings(
        this(createOptionalUri(uri), organizationId, similarity, dimensions, maxInputTokens);
    }

-    private static URI createOptionalUri(String url) {


Moved to ServiceUtils.

Thanks for the great self-reviewing!

Thanks for the great self-reviewing!

This ^. Thanks

jonathan-buttner · 2024-01-18T21:31:38Z

...nce/src/test/java/org/elasticsearch/xpack/inference/action/InferenceActionResponseTests.java

@@ -46,7 +46,7 @@ protected Writeable.Reader<InferenceAction.Response> instanceReader() {
    @Override
    protected InferenceAction.Response createTestInstance() {
        var result = switch (randomIntBetween(0, 2)) {
-            case 0 -> TextEmbeddingResultsTests.createRandomResults();
+            case 0 -> TextEmbeddingResultsTests.createRandomFloatResults();


@davidkyle just a heads up. I was getting some bwc failures which I think are because of the TextEmbeddingResults conversion to namedwritables. So I'm forcing this to only use floats. Maybe there's an error in the way I've written that logic?

jonathan-buttner · 2024-01-18T21:33:36Z

.../org/elasticsearch/xpack/inference/external/request/cohere/CohereEmbeddingsRequestTests.java

+        var request = createRequest("url", "secret", List.of("abc"), CohereEmbeddingsTaskSettings.EMPTY_SETTINGS);
+
+        var httpRequest = request.createRequest();
+        MatcherAssert.assertThat(httpRequest, instanceOf(HttpPost.class));


I get warnings when using assertThat directly now. It says it's deprecated and to use MatcherAssert.assertThat instead.

elasticsearchmachine · 2024-01-22T13:35:48Z

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine · 2024-01-22T13:37:50Z

Hi @jonathan-buttner, I've created a changelog YAML for you.

…ence-cohere

jonathan-buttner · 2024-01-22T17:51:38Z

I forgot to pass through the InputType from the TransportCoordinatedInferenceAction. I believe that's going to require a change to the InferenceService interface. I'll stack a new PR on this one for that change.

maxhniebergall

LGTM!

What a huge new feature! Amazing how many new bits were needed for this, and how many new tests there are.

I added some comments where I got confused or think there could be some improvement, but overall this seems great.

maxhniebergall · 2024-01-22T18:20:49Z

.../core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingResults.java

        public Embedding(StreamInput in) throws IOException {
-            this(in.readCollectionAsImmutableList(StreamInput::readFloat));
+            if (in.getTransportVersion().onOrAfter(TransportVersions.ML_INFERENCE_COHERE_EMBEDDINGS_ADDED)) {


Are you able to elaborate a bit on why we need implementations of EmbeddingValue rather than directly using primitives or Numbers?

maxhniebergall · 2024-01-22T18:23:46Z

.../core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingResults.java

+            if (in.getTransportVersion().onOrAfter(TransportVersions.ML_INFERENCE_COHERE_EMBEDDINGS_ADDED)) {
+                values = in.readNamedWriteableCollectionAsList(EmbeddingValue.class);
+            } else {
+                values = convertFloatsToEmbeddingValues(in.readCollectionAsImmutableList(StreamInput::readFloat));


If we don't actually want to run this code, would it be better to log the values and not actually do the conversion?

...in/java/org/elasticsearch/xpack/inference/external/action/cohere/CohereEmbeddingsAction.java

maxhniebergall · 2024-01-22T19:02:04Z

...n/inference/src/main/java/org/elasticsearch/xpack/inference/external/action/ActionUtils.java

@@ -35,5 +39,13 @@ public static ElasticsearchStatusException createInternalServerError(Throwable e
        return new ElasticsearchStatusException(message, RestStatus.INTERNAL_SERVER_ERROR, e);
    }

+    public static String getErrorMessage(@Nullable URI uri, String message) {


perhaps could rename this to constructFailedToSendRequestMessage or constructMessageForFailedToSendRequest

maxhniebergall · 2024-01-22T19:16:18Z

...inference/src/main/java/org/elasticsearch/xpack/inference/external/request/RequestUtils.java

+            throw new ElasticsearchStatusException(
+                Strings.format("Failed to construct %s URL", service),
+                RestStatus.INTERNAL_SERVER_ERROR,


Is there a reason we don't log the URL here, but only the service name?

Also, should this be an internal server error, or is it more like an illegal argument exception or an illegal state exception? Are we expecting that the service URL should have already been validated by this point, and ever getting to this point in the code is the exception?

Is there a reason we don't log the URL here, but only the service name?

Yeah we probably should just to be safe.

Are we expecting that the service URL should have already been validated by this point, and ever getting to this point in the code is the exception?

Yeah, if we actually get an exception I would expect it to be a programmatic error. It would be generated by a method like this one:

static URI buildDefaultUri() throws URISyntaxException { return new URIBuilder().setScheme("https") .setHost(CohereUtils.HOST) .setPathSegments(CohereUtils.VERSION_1, CohereUtils.EMBEDDINGS_PATH) .build(); }

Which essentially just uses static strings and calls the apache library. There's no user input. So unless one of those strings is like ^^ which we hardcode in, we really shouldn't get an exception.

Actually, since we're passing in the URISyntaxException into the ElasticsearchStatusException. I believe it'll generate a caused_by section that gets returned which would include the invalid URL.

Here's an example:

{ "error": { "root_cause": [ { "type": "status_exception", "reason": "Failed to construct Cohere URL" } ], "type": "status_exception", "reason": "Failed to construct Cohere URL", "caused_by": { "type": "u_r_i_syntax_exception", "reason": "Illegal character in authority at index 8: https://^^/v1/embed" } }, "status": 500 }

The URI may be private. It definitly should not be logged but it is ok to include in the REST response as the URI is known to the user who made the request in the first place. The URISyntaxException provides enough information for the user to debug.

Serverless will log responses with 500 status codes which is a good reason to avoid a 500 here. The code building the URI is good, it will only throw on bad inputs so this can be a BAD_REQUEST status.

We talked about this more offline. I'll switch this one to a bad request and then create a follow up PR to remove places where we log the URL and try to think of some other information we could log to help us with triaging errors. Maybe we could hash the url or something and log that instead.

...java/org/elasticsearch/xpack/inference/services/cohere/embeddings/CohereEmbeddingsModel.java

maxhniebergall · 2024-01-22T19:43:48Z

...e/src/main/java/org/elasticsearch/xpack/inference/services/openai/OpenAiServiceSettings.java

@@ -100,14 +100,6 @@ public OpenAiServiceSettings(
        this(createOptionalUri(uri), organizationId, similarity, dimensions, maxInputTokens);
    }

-    private static URI createOptionalUri(String url) {


Thanks for the great self-reviewing!

...rence/src/test/java/org/elasticsearch/xpack/inference/results/TextEmbeddingResultsTests.java

...sticsearch/xpack/inference/external/response/cohere/CohereEmbeddingsResponseEntityTests.java

...ence/src/test/java/org/elasticsearch/xpack/inference/services/cohere/CohereServiceTests.java

davidkyle

Looks good.

Left some minor comments, I'll take another pass later as it is a big PR

davidkyle · 2024-01-23T09:27:09Z

x-pack/plugin/inference/src/main/java/module-info.java

@@ -22,10 +22,5 @@
    exports org.elasticsearch.xpack.inference.registry;
    exports org.elasticsearch.xpack.inference.rest;
    exports org.elasticsearch.xpack.inference.services;
-    exports org.elasticsearch.xpack.inference.external.http.sender;


++ we shouldn't need to export that package, thanks for tidying up

davidkyle · 2024-01-23T09:55:01Z

...inference/src/main/java/org/elasticsearch/xpack/inference/external/request/RequestUtils.java

+            throw new ElasticsearchStatusException(
+                Strings.format("Failed to construct %s URL", service),
+                RestStatus.INTERNAL_SERVER_ERROR,


The URI may be private. It definitly should not be logged but it is ok to include in the REST response as the URI is known to the user who made the request in the first place. The URISyntaxException provides enough information for the user to debug.

Serverless will log responses with 500 status codes which is a good reason to avoid a 500 here. The code building the URI is good, it will only throw on bad inputs so this can be a BAD_REQUEST status.

davidkyle · 2024-01-23T10:13:27Z

...org/elasticsearch/xpack/inference/external/request/cohere/CohereEmbeddingsRequestEntity.java

+    /**
+     * Maps the {@link InputType} to the expected value for cohere for the input_type field in the request
+     */
+    private static final Map<InputType, String> INPUT_TYPE_MAPPING = Map.of(


you can avoid the map by using the ordinal value of the enum as the index to an array

private static final String [] INPUT_TYPE_MAPPING = {SEARCH_DOCUMENT, SEARCH_QUERY} // lookup INPUT_TYPE_MAPPING[inputType.ordinal()]

Add an assertion somewhere that the arrays are the same size

assert INPUT_TYPE_MAPPING.length = InputType.values().length

davidkyle · 2024-01-23T11:13:24Z

...g/elasticsearch/xpack/inference/services/cohere/embeddings/CohereEmbeddingsTaskSettings.java

+ * @param truncation Specifies how the API will handle inputs longer than the maximum token length
+ */
+public record CohereEmbeddingsTaskSettings(
+    @Nullable String model,


I wonder if model belongs in Task Settings or Service Settings.

The idea of task settings is that they can be overridden per request, but changing the model would produce incompatible results. The same argument can be applied to embedding type as the results would be incompatible with previously indexed results.

OpenAiEmbeddingsTaskSettings has set the precedent for model name in task settings but it is worth reviewing the idea.

Yeah good point. I'll move them to the service settings specific to text embeddings. If we wanted to move them out of the task settings for openai, what would be the process there? Maybe the easiest thing to do would be leave them in the task settings but also add them to the service settings and let the service settings take priority?

leave them in the task settings but also add them to the service settings and let the service settings take priority?

It might be possible to automatically update the config to do this

davidkyle · 2024-01-23T11:14:05Z

...e/src/main/java/org/elasticsearch/xpack/inference/services/openai/OpenAiServiceSettings.java

@@ -100,14 +100,6 @@ public OpenAiServiceSettings(
        this(createOptionalUri(uri), organizationId, similarity, dimensions, maxInputTokens);
    }

-    private static URI createOptionalUri(String url) {


Thanks for the great self-reviewing!

This ^. Thanks

davidkyle · 2024-01-23T11:20:54Z

.../core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingResults.java

        public Embedding(StreamInput in) throws IOException {
-            this(in.readCollectionAsImmutableList(StreamInput::readFloat));
+            if (in.getTransportVersion().onOrAfter(TransportVersions.ML_INFERENCE_COHERE_EMBEDDINGS_ADDED)) {


rather than directly using primitives or Numbers

It does seem excessive to hold an array of EmbeddingValue objects when a primitive would do the job. TextEmbeddingResults is a named writable one option is create the TextEmbeddingByteResults and ByteEmbeddings classes for the byte variant. That would avoid the conditionals in the streaming functions.

…ence-cohere

jonathan-buttner · 2024-01-24T15:58:58Z

...k/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/results/EmbeddingInt.java

-public interface EmbeddingValue extends NamedWriteable, ToXContentFragment {
-    Number getValue();
+public interface EmbeddingInt {
+    int getSize();


I'm open to other ideas here. I did this to make determining the embedding size easier within the ServiceUtils class. The other option would be a second instanceof check for TextEmbeddingByteResults.

jonathan-buttner · 2024-01-24T15:59:19Z

.../plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbedding.java

+     * @return the size of the text embedding
+     * @throws IllegalStateException if the list of embeddings is empty
+     */
+    int getFirstEmbeddingSize() throws IllegalStateException;


Also needed for simplifying ServiceUtils

jonathan-buttner · 2024-01-24T16:02:28Z

...e/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingByteResults.java

+ *     ]
+ * }
+ */
+public record TextEmbeddingByteResults(List<Embedding> embeddings) implements InferenceServiceResults, TextEmbedding {


Basically a duplicate of TextEmbeddingResults. I was trying to think of an easy way to remove the duplication but I think we'd have to push logic up into some base class but that doesn't fully work because of the constructors that take a stream TextEmbeddingByteResults(StreamInput in). I think we could remove some of this by making the inner Embedding class a named writeable too but I'm not sure that's much better than what I had before with FloatValue and ByteValue.

@davidkyle let me know if this is what you were thinking. I suppose we could put off refactoring this until we support a third type. I think we'd still have to add in a bunch of if-blocks though if we do go the named writeable route.

Fundamentally these results classes are different types (byte vs float), it may be verbose but it's good to have the different classes

jonathan-buttner · 2024-01-24T16:05:01Z

...erence/src/main/java/org/elasticsearch/xpack/inference/InferenceNamedWriteablesProvider.java

@@ -98,6 +98,16 @@ public static List<NamedWriteableRegistry.Entry> getNamedWriteables() {
        namedWriteables.add(
            new NamedWriteableRegistry.Entry(ServiceSettings.class, CohereServiceSettings.NAME, CohereServiceSettings::new)
        );
+        namedWriteables.add(


I'm keeping CohereServiceSettings it's own class because I think we'll use it again for other inference actions for cohere. The way I structured CohereEmbeddingsServiceSettings was to make the common settings be in CohereServiceSettings (e.g. the model name, dimensions etc) and put the embedding_type field within CohereEmbeddingsServiceSettings.

See my comment in CohereEmbeddingsServiceSettings about a different way to serialise CohereServiceSettings that avoids adding this entry.

jonathan-buttner · 2024-01-24T16:06:48Z

.../plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/ServiceUtils.java

@@ -108,25 +110,6 @@ public static String mustBeNonEmptyString(String settingName, String scope) {
        return Strings.format("[%s] Invalid value empty string. [%s] must be a non-empty string", scope, settingName);
    }

-    public static String mustBeNonEmptyList(String settingName, String scope) {


I had added these but they weren't used so removing them.

jonathan-buttner · 2024-01-24T16:10:36Z

.../java/org/elasticsearch/xpack/inference/services/huggingface/HuggingFaceServiceSettings.java

@@ -52,9 +52,7 @@ public static HuggingFaceServiceSettings fromMap(Map<String, Object> map) {

    public static URI extractUri(Map<String, Object> map, String fieldName, ValidationException validationException) {
        String parsedUrl = extractRequiredString(map, fieldName, ModelConfigurations.SERVICE_SETTINGS, validationException);
-        if (parsedUrl == null) {


convertToUri will just return null if passed null now.

davidkyle

LGTM

davidkyle · 2024-01-24T17:50:38Z

...e/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingByteResults.java

+ *     ]
+ * }
+ */
+public record TextEmbeddingByteResults(List<Embedding> embeddings) implements InferenceServiceResults, TextEmbedding {


Fundamentally these results classes are different types (byte vs float), it may be verbose but it's good to have the different classes

davidkyle · 2024-01-24T18:02:16Z

...lasticsearch/xpack/inference/services/cohere/embeddings/CohereEmbeddingsServiceSettings.java

+    }
+
+    public CohereEmbeddingsServiceSettings(StreamInput in) throws IOException {
+        commonSettings = in.readNamedWriteable(CohereServiceSettings.class);


Suggested change

commonSettings = in.readNamedWriteable(CohereServiceSettings.class);

commonSettings = new CohereServiceSettings(in);

And the writeTo() is changed to commonSettings.writeTo(out) then I think you can avoid added the CohereServiceSettings.class entry to the named writeable registry.

davidkyle · 2024-01-24T18:03:53Z

...erence/src/main/java/org/elasticsearch/xpack/inference/InferenceNamedWriteablesProvider.java

@@ -98,6 +98,16 @@ public static List<NamedWriteableRegistry.Entry> getNamedWriteables() {
        namedWriteables.add(
            new NamedWriteableRegistry.Entry(ServiceSettings.class, CohereServiceSettings.NAME, CohereServiceSettings::new)
        );
+        namedWriteables.add(


See my comment in CohereEmbeddingsServiceSettings about a different way to serialise CohereServiceSettings that avoids adding this entry.

…ence-cohere

* Starting cohere * Making progress on cohere * Filling out the embedding types * Working cohere * Fixing tests * Removing rate limit error message * Fixing a few comments * Update docs/changelog/104559.yaml * Addressing most feedback * Using separate named writeables for byte results and floats * Fixing a few comments and adding tests * Fixing mutation issue * Removing cohere service settings from named writeable registry

jonathan-buttner added 5 commits January 10, 2024 14:01

Starting cohere

244da75

Making progress on cohere

e1008cf

Filling out the embedding types

f94c32c

Working cohere

58c707d

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

25dcec1

…ence-cohere

jonathan-buttner added :ml Machine learning Team:ML Meta label for the ML team cloud-deploy Publish cloud docker image for Cloud-First-Testing v8.13.0 labels Jan 18, 2024

jonathan-buttner added 2 commits January 18, 2024 14:23

Fixing tests

153785a

Removing rate limit error message

4a127ff

jonathan-buttner commented Jan 18, 2024

View reviewed changes

Fixing a few comments

3ebed6d

jonathan-buttner requested review from davidkyle and maxhniebergall January 22, 2024 13:35

jonathan-buttner marked this pull request as ready for review January 22, 2024 13:35

jonathan-buttner added the >enhancement label Jan 22, 2024

jonathan-buttner and others added 2 commits January 22, 2024 08:37

Update docs/changelog/104559.yaml

a4e576f

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

b4e2443

…ence-cohere

This was referenced Jan 22, 2024

[ML] Adding signed byte support to support Cohere integration elastic/elasticsearch-specification#2397

Closed

Adding signed byte support from cohere integration elastic/elasticsearch-specification#2398

Merged

maxhniebergall approved these changes Jan 22, 2024

View reviewed changes

jonathan-buttner mentioned this pull request Jan 22, 2024

[ML] Integrating input type field through inference services #104630

Closed

davidkyle reviewed Jan 23, 2024

View reviewed changes

jonathan-buttner added 3 commits January 23, 2024 15:17

Addressing most feedback

82001b2

Using separate named writeables for byte results and floats

83caa87

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

bcf193d

…ence-cohere

Fixing a few comments and adding tests

8102408

jonathan-buttner commented Jan 24, 2024

View reviewed changes

Fixing mutation issue

6c8343f

davidkyle approved these changes Jan 24, 2024

View reviewed changes

jonathan-buttner added 2 commits January 24, 2024 15:16

Removing cohere service settings from named writeable registry

de682b5

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

e841368

…ence-cohere

jonathan-buttner merged commit eb8c73f into elastic:main Jan 24, 2024
16 checks passed

jonathan-buttner deleted the ml-inference-cohere branch January 24, 2024 21:14

	commonSettings = in.readNamedWriteable(CohereServiceSettings.class);
	commonSettings = new CohereServiceSettings(in);

[ML] Adding support for Cohere inference service #104559

[ML] Adding support for Cohere inference service #104559

Conversation

jonathan-buttner commented Jan 18, 2024 • edited Loading

Not included

Notable differences from the Cohere API

Notable Changes

Response formats

Floats

Bytes

Testing

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonathan-buttner Jan 22, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

elasticsearchmachine commented Jan 22, 2024

elasticsearchmachine commented Jan 22, 2024

jonathan-buttner commented Jan 22, 2024

maxhniebergall left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonathan-buttner Jan 22, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidkyle left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidkyle left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonathan-buttner commented Jan 18, 2024 •

edited

Loading

jonathan-buttner Jan 22, 2024 •

edited

Loading

jonathan-buttner Jan 22, 2024 •

edited

Loading