
Add a choice of how to end streaming from callback: STOP or CANCEL #1476

Open

sbalandi wants to merge 5 commits into master from callback

Conversation

@sbalandi (Contributor) commented Jan 3, 2025:

No description provided.

@github-actions bot added labels: category: visual language, category: continuous batching, category: LLM, category: speculative decoding, category: GenAI C++ API, no-match-files, category: prompt lookup (Jan 3, 2025)
@sbalandi (Contributor, Author) commented Jan 3, 2025:

TODO: add CANCEL for ContinuousBatching

@ilya-lavrenov added this to the 2025.0 milestone (Jan 4, 2025)
@ilya-lavrenov self-assigned this (Jan 6, 2025)
@sbalandi force-pushed the callback branch 5 times, most recently from 454cdd9 to 1592ed0 (January 8, 2025)
@github-actions bot added labels: category: Python API, category: samples (Jan 8, 2025)
@sbalandi force-pushed the callback branch 3 times, most recently from 10a755b to d18fe16 (January 8, 2025)
@sbalandi (Contributor, Author) commented Jan 8, 2025:

> TODO: add CANCEL for ContinuousBatching

Done.

@sbalandi marked this pull request as ready for review (January 8, 2025)
@sbalandi force-pushed the callback branch 3 times, most recently from 2758f6b to 03ca3ce (January 9, 2025)
@ilya-lavrenov (Contributor) left a comment:

Please add tests for the new functionality.

print(subword, end='', flush=True)
# The returned flag indicates whether generation should be stopped.
# False means continue generation.
return False
Contributor:

BTW, should we also support a callback without a return value? E.g. when the user doesn't care about stop / cancellation.

std::cout << word << std::flush;
// Return flag corresponds whether generation should be stopped.
// false means continue generation.
return false;
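
For illustration only (not part of this PR's diff): a void-returning, print-only callback could be adapted into the bool-returning form shown above, so the pipeline keeps a single signature internally. A minimal C++ sketch, all names hypothetical:

#include <functional>
#include <iostream>
#include <string>

// Hypothetical adapter: wraps a print-only callback into the
// bool-returning form the pipeline expects.
using BoolCallback = std::function<bool(const std::string&)>;
using VoidCallback = std::function<void(const std::string&)>;

BoolCallback adapt(VoidCallback cb) {
    return [cb = std::move(cb)](const std::string& word) {
        cb(word);
        return false;  // a print-only callback never requests a stop
    };
}

int main() {
    BoolCallback streamer = adapt([](const std::string& word) {
        std::cout << word << std::flush;
    });
    streamer("hello");  // prints "hello" and returns false (continue)
}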

return ov::genai::StreamerRunningStatus::RUNNING;
Contributor:

Suggested change:
- return ov::genai::StreamerRunningStatus::RUNNING;
+ return ov::genai::StreamingStatus::CONTINUE;

@@ -30,6 +31,9 @@ struct EncodedGenerationResult {

// Status of generation
GenerationStatus m_status = GenerationStatus::RUNNING;

// Status of streaming
StreamerRunningStatus m_streaming_status = ov::genai::StreamerRunningStatus::UNDEF;
Contributor:

Maybe we can extend GenerationStatus? E.g. DROPPED_BY_HANDLE means STOP in its current implementation, while for CANCEL we can add a new value.

BTW, it looks like we can drop DROPPED_BY_PIPELINE as unused.
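
A minimal sketch of what such an extended enum could look like (the numbering is assumed; FINISHED and IGNORED are existing values mentioned later in this review, STOP/CANCEL are the proposed additions):

enum class GenerationStatus {
    RUNNING = 0,   // generation is in progress
    FINISHED = 1,  // generation completed normally
    IGNORED = 2,   // request was ignored
    STOP = 3,      // would replace DROPPED_BY_HANDLE: stop, keep generated history
    CANCEL = 4     // new: stop and drop the last prompt/answer from history
};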

Contributor:

More thoughts:

  • maybe we should deprecate the drop() method and introduce stop() instead
  • similarly for GenerationStatus
  • and extend both GenerationHandle and GenerationStatus with cancel() functionality

In this case the CB and LLM pipelines' logic / naming will be aligned.

Contributor (Author):

Moved to GenerationStatus.

@@ -15,7 +15,7 @@
RawPerfMetrics,
PerfMetrics,
StreamerBase,
get_version,
Contributor:

Can we keep it?

Contributor (Author):

Added.

@@ -11,12 +11,17 @@ namespace genai {

class TextCallbackStreamer: public StreamerBase {
public:
StreamerRunningStatus streaming_status = StreamerRunningStatus::UNDEF;
Contributor:

As I see it, StreamerBase already contains this field?

Contributor (Author):

Removed.

CANCEL = 3 // Stop generation; drop the last prompt and all generated tokens from history; the KV cache keeps the history except the last step
};

using CallbackTypeVariant = std::variant<bool, StreamerRunningStatus>;
Contributor:

Suggested change:
- using CallbackTypeVariant = std::variant<bool, StreamerRunningStatus>;
+ using CallbackTypeVariant = std::variant<void, bool, StreamerRunningStatus>;

to support a callback which just "prints".
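
One caveat with this suggestion: std::variant cannot hold void, so a compilable form of the idea would typically use std::monostate as the "no value" alternative. A sketch (the STOP value is assumed; the PR's diff shows UNDEF = 0, RUNNING = 1 and CANCEL = 3):

#include <variant>

enum class StreamerRunningStatus { UNDEF = 0, RUNNING = 1, STOP = 2, CANCEL = 3 };

// std::monostate stands in for "the callback returned nothing".
using CallbackTypeVariant = std::variant<std::monostate, bool, StreamerRunningStatus>;

bool should_continue(const CallbackTypeVariant& r) {
    if (std::holds_alternative<std::monostate>(r)) return true;  // no opinion: continue
    if (const bool* stop = std::get_if<bool>(&r)) return !*stop; // false means continue
    return std::get<StreamerRunningStatus>(r) == StreamerRunningStatus::RUNNING;
}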

@@ -22,6 +34,10 @@ class OPENVINO_GENAI_EXPORTS StreamerBase {
/// @brief end is called at the end of generation. It can be used to flush cache if your own streamer has one
virtual void end() = 0;

virtual StreamerRunningStatus get_finish_streaming_reason() {
Contributor:

Suggested change:
- virtual StreamerRunningStatus get_finish_streaming_reason() {
+ StreamingStatus get_streaming_status() {
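
Under the suggested naming, the base class could expose the status roughly as follows (a sketch only: put()/end() mirror the existing StreamerBase interface, while the enum values and the non-virtual accessor are assumed from this review):

#include <cstdint>

enum class StreamingStatus { CONTINUE = 0, STOP = 1, CANCEL = 2 };  // values assumed

class StreamerBase {
public:
    // Existing interface: put() consumes a token, end() flushes at the end of generation.
    virtual bool put(int64_t token) = 0;
    virtual void end() = 0;
    virtual ~StreamerBase() = default;

    // Suggested accessor: reports why streaming finished.
    StreamingStatus get_streaming_status() const { return m_streaming_status; }

protected:
    StreamingStatus m_streaming_status = StreamingStatus::CONTINUE;
};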

...
@typing.overload
- def generate(self, prompts: list[str], generation_config: list[GenerationConfig], streamer: typing.Callable[[str], bool] | StreamerBase | None = None) -> list[GenerationResult]:
+ def generate(self, prompts: list[str], generation_config: list[GenerationConfig], streamer: typing.Callable[[str], bool] | typing.Callable[[str], ...] | StreamerBase | None = None) -> list[GenerationResult]:
Contributor:

Should we propagate StreamingStatus to the Python API, to use an enum instead of str?

@andrei-kochin modified the milestones: 2025.0 → 2025.1 (Jan 13, 2025)
src/cpp/include/openvino/genai/streamer_base.hpp (outdated comments resolved)
src/cpp/src/visual_language/pipeline.cpp (outdated comments resolved)

namespace ov {
namespace genai {

enum class StreamerRunningStatus {
UNDEF = 0, // Streaming has not run
RUNNING = 1, // Continue running inference
Collaborator:

RUNNING and UNDEF seem to be equivalent. In that case you should keep only one of them. Moreover, the callback should never return UNDEF, so merging them fixes the API.

Contributor (Author):

Removed it; moved to GenerationStatus.

Collaborator:

Merging with GenerationStatus allows a callback to return FINISHED and IGNORED, which aren't related to this. I'd guess #1476 (comment) was about aligning the API, not merging. @ilya-lavrenov, is that so?

@sbalandi force-pushed the callback branch 11 times, most recently from 17a9501 to 8975221 (January 21, 2025)