[inference] Add support for request cancelation #200757

Closed
Tracked by #197630
pgayvallet opened this issue Nov 19, 2024 · 4 comments · Fixed by #203108
Labels
Team:AI Infra AppEx AI Infrastructure Team

Comments

@pgayvallet (Contributor) commented Nov 19, 2024

At the moment, the inference APIs (`chatComplete` and `output`) don't provide any way to cancel a running request.

Technically, the genAI stack connectors all support passing an abort signal for their stream sub actions.

E.g. for genAI:

```ts
public async streamApi(
  { body, stream, signal, timeout }: StreamActionParams,
  // ...
```

So it should be possible to leverage that to perform cancelation.
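As a sketch of that plumbing, a caller-supplied signal could simply be threaded through to the connector call. All names below (`StreamActionParams`, `streamApi`, `chatComplete`) mirror the discussion but are illustrative stand-ins, not the real Kibana implementations:

```typescript
// Illustrative sketch: an inference-level API forwarding a caller-supplied
// AbortSignal down to a connector's stream sub action. Names are stand-ins,
// not the actual Kibana connector API.

interface StreamActionParams {
  body: string;
  signal?: AbortSignal;
}

async function streamApi({ body, signal }: StreamActionParams): Promise<string> {
  // A real implementation would hand `signal` to the underlying HTTP client.
  if (signal?.aborted) {
    throw new Error('Request aborted');
  }
  return `streamed: ${body}`;
}

async function chatComplete(opts: {
  prompt: string;
  abortSignal?: AbortSignal;
}): Promise<string> {
  // The inference layer adds no cancelation logic of its own here;
  // it just forwards the signal to the connector call.
  return streamApi({ body: opts.prompt, signal: opts.abortSignal });
}
```

Calling `abort()` on the controller that owns the signal would then cancel the in-flight connector call.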

The main question is how we want to expose this feature.

  • For the normal (non-stream) mode of the APIs, allowing callers to pass an abort signal as a parameter and forwarding it down to the stack connector call seems like a good option.

  • For stream mode, it's less obvious. We could follow the same approach, but that's not really how cancelation is supposed to work for observables. The obs-friendly way would be to perform cancelation on unsubscription. That would require some work to make the internal observable chain compatible with this approach (as we're not using a pure observable as a source). Extracted to [inference] Cancel request in stream mode when unsubscribing #203816
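The "cancel on unsubscription" idea can be illustrated with a minimal hand-rolled observable whose teardown aborts an internal controller. This is illustrative only; the real chain would be RxJS-based and the names here are made up:

```typescript
// Minimal "cancel on unsubscribe" sketch. subscribe() owns an internal
// AbortController; unsubscribing runs the teardown, which aborts the
// signal handed to the producer. Not the actual inference internals.

type Subscription = { unsubscribe: () => void };

function streamWithAbort<T>(
  start: (signal: AbortSignal, emit: (value: T) => void) => void
): { subscribe: (next: (value: T) => void) => Subscription } {
  return {
    subscribe(next) {
      const controller = new AbortController();
      start(controller.signal, next);
      return {
        // Teardown: cancel the upstream request when the consumer leaves.
        unsubscribe: () => controller.abort(),
      };
    },
  };
}

// Usage: a fake producer that stops emitting once the signal fires.
const events = streamWithAbort<number>((signal, emit) => {
  let i = 0;
  const timer = setInterval(() => emit(i++), 10);
  signal.addEventListener('abort', () => clearInterval(timer));
});

const sub = events.subscribe((n) => console.log('chunk', n));
sub.unsubscribe(); // aborts the underlying "request"
```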

@pgayvallet pgayvallet added the Team:AI Infra AppEx AI Infrastructure Team label Nov 19, 2024
@elasticmachine (Contributor) commented

Pinging @elastic/appex-ai-infra (Team:AI Infra)

@legrego (Member) commented Nov 19, 2024

For the normal (non-stream) mode of the APIs, allowing callers to pass an abort signal as a parameter and forwarding it down to the stack connector call seems like a good option.

👍 seems reasonable to me.

For stream mode, it's less obvious. We could follow the same approach, but that's not really how cancelation is supposed to work for observables. The obs-friendly way would be to perform cancelation on unsubscription. That would require some work to make the internal observable chain compatible with this approach (as we're not using a pure observable as a source).

@pgayvallet It seems like you have a preferred approach, but it's a bit more effort. Am I misreading, or are there additional considerations such as time pressure or feasibility?

@pgayvallet (Contributor, Author) commented

There's no time pressure AFAIK.

Regarding feasibility, I'm not 100% sure without doing some testing, but I think the two approaches (stream and non-stream) could coexist.

So hopefully it's just about some more effort, yes.
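One way the two mechanisms could coexist is to funnel both the caller's external signal and the unsubscribe-driven teardown into a single internal controller. A hedged sketch, with made-up names, under the assumption that the connector call accepts one `AbortSignal`:

```typescript
// Sketch of letting an external AbortSignal (non-stream style) and an
// unsubscribe teardown (stream style) coexist: both feed one internal
// controller whose signal is passed to the connector call.
// Illustrative only; not the actual inference client internals.

function makeCancellable(external?: AbortSignal): {
  signal: AbortSignal;
  teardown: () => void;
} {
  const internal = new AbortController();
  const onExternalAbort = () => internal.abort();

  if (external?.aborted) {
    internal.abort();
  } else {
    external?.addEventListener('abort', onExternalAbort);
  }

  return {
    // Hand this signal to the underlying connector call.
    signal: internal.signal,
    // Call this from the observable's unsubscribe/teardown path.
    teardown: () => {
      external?.removeEventListener('abort', onExternalAbort);
      internal.abort();
    },
  };
}
```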

@pgayvallet (Contributor, Author) commented

I created #203816 to isolate the "cancel on unsubscribe" part of this issue.

pgayvallet added a commit that referenced this issue Dec 17, 2024 (#203108)

## Summary

Fix #200757

Add cancelation support for `chatComplete` and `output`, based on an
abort signal.


### Examples

#### response mode

```ts
import { isInferenceRequestAbortedError, MessageRole } from '@kbn/inference-common';

const abortController = new AbortController();

try {
  const chatResponse = await inferenceClient.chatComplete({
    connectorId: 'some-gen-ai-connector',
    abortSignal: abortController.signal,
    messages: [{ role: MessageRole.User, content: 'Do something' }],
  });
} catch (e) {
  if (isInferenceRequestAbortedError(e)) {
    // request was aborted, do something
  } else {
    // was another error, do something else
  }
}

// elsewhere
abortController.abort();
```

#### stream mode

```ts
import { isInferenceRequestAbortedError, MessageRole } from '@kbn/inference-common';

const abortController = new AbortController();
const events$ = inferenceClient.chatComplete({
  stream: true,
  connectorId: 'some-gen-ai-connector',
  abortSignal: abortController.signal,
  messages: [{ role: MessageRole.User, content: 'Do something' }],
});

events$.subscribe({
  next: (event) => {
    // do something
  },
  error: (err) => {
    if (isInferenceRequestAbortedError(err)) {
      // request was aborted, do something
    } else {
      // was another error, do something else
    }
  },
});

abortController.abort();
```
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Dec 17, 2024 (cherry picked from commit 0b74f62)
JoseLuisGJ pushed a commit to JoseLuisGJ/kibana that referenced this issue Dec 19, 2024
benakansara pushed a commit to benakansara/kibana that referenced this issue Jan 2, 2025