Commit

[DOCS] Adding GenAI Use Cases (openvinotoolkit#27062)
Creating an article with use case scenarios for using OpenVINO GenAI.
This PR addresses the following JIRA ticket: CVS-153319

---------

Co-authored-by: Karol Blaszczak <[email protected]>
2 people authored and CuriousPanCake committed Nov 6, 2024
1 parent 7d34590 commit 5ba4fe9
Showing 2 changed files with 434 additions and 53 deletions.
@@ -9,6 +9,7 @@ Run LLM Inference on OpenVINO with the GenAI Flavor
:hidden:

NPU inference of LLMs <genai-guide-npu>
genai-guide/genai-use-cases


This guide will show you how to integrate the OpenVINO GenAI flavor into your application, covering
@@ -174,59 +175,6 @@ You can also create your custom streamer for more sophisticated processing:
pipe.generate("The Sun is yellow because", ov::genai::streamer(custom_streamer), ov::genai::max_new_tokens(100));
}
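
For reference only, a custom streamer can be implemented by deriving from ``ov::genai::StreamerBase``.
The sketch below is an illustrative assumption based on the GenAI C++ API; the class name, header path,
and callback bodies are not taken from this change:

.. code-block:: cpp

   #include "openvino/genai/streamer_base.hpp"

   // Illustrative custom streamer: put() receives each newly generated token id,
   // end() runs once after generation finishes.
   class CustomStreamer : public ov::genai::StreamerBase {
   public:
       bool put(int64_t token) override {
           // Custom decoding / token-processing logic goes here.
           return false;  // return true to stop generation early
       }

       void end() override {
           // Custom finalization logic, e.g. flushing buffered output.
       }
   };
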
Using GenAI in Chat Scenario
################################

For chat scenarios, where inputs and outputs represent a conversation, maintaining the KV cache across inputs
may prove beneficial. The chat-specific **start_chat** and **finish_chat** methods are used to
mark a conversation session, as shown in these simple examples:

.. tab-set::

   .. tab-item:: Python
      :sync: py

      .. code-block:: python

         import openvino_genai as ov_genai

         pipe = ov_genai.LLMPipeline(model_path)
         pipe.set_generation_config({'max_new_tokens': 100})

         pipe.start_chat()
         # Keep asking questions until the user types 'Stop!'.
         while True:
             print('question:')
             prompt = input()
             if prompt == 'Stop!':
                 break
             print(pipe.generate(prompt))
         pipe.finish_chat()

   .. tab-item:: C++
      :sync: cpp

      .. code-block:: cpp

         #include "openvino/genai/llm_pipeline.hpp"
         #include <iostream>
         #include <string>

         int main(int argc, char* argv[]) {
             std::string prompt;
             std::string model_path = argv[1];

             ov::genai::LLMPipeline pipe(model_path, "CPU");

             ov::genai::GenerationConfig config = pipe.get_generation_config();
             config.max_new_tokens = 100;
             pipe.set_generation_config(config);

             pipe.start_chat();
             // Keep asking questions until the user types "Stop!".
             while (true) {
                 std::cout << "question:\n";
                 std::getline(std::cin, prompt);
                 if (prompt == "Stop!")
                     break;
                 std::cout << pipe.generate(prompt) << std::endl;
             }
             pipe.finish_chat();
         }

Optimizing Generation with Grouped Beam Search
#######################################################
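
As a rough sketch only (the original example in this section is not shown above), grouped beam search
can be configured through ``ov::genai::GenerationConfig``; the specific parameter values below are
illustrative assumptions:

.. code-block:: cpp

   #include "openvino/genai/llm_pipeline.hpp"
   #include <iostream>
   #include <string>

   int main(int argc, char* argv[]) {
       std::string model_path = argv[1];
       ov::genai::LLMPipeline pipe(model_path, "CPU");

       ov::genai::GenerationConfig config = pipe.get_generation_config();
       config.max_new_tokens = 256;
       config.num_beams = 15;           // total beams, split across the groups
       config.num_beam_groups = 3;      // enables diverse (grouped) beam search
       config.diversity_penalty = 1.0f; // penalizes tokens repeated across groups

       std::cout << pipe.generate("The Sun is yellow because", config) << std::endl;
   }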
