
StaticLLMPipeline: Update config #969

Conversation

TolyaTalamanov (Collaborator)

No description provided.

@ilya-lavrenov ilya-lavrenov added the category: LLM LLM pipeline (stateful, static) label Oct 15, 2024
@ilya-lavrenov ilya-lavrenov added this to the 2024.5 milestone Oct 15, 2024
@dmatveev dmatveev (Contributor) left a comment

Okay-ish for now but, you know, this never ends

ov::AnyMap config = {
    { "NPU_USE_NPUW", "YES" },
    { "NPU_COMPILATION_MODE_PARAMS", "compute-layers-with-higher-precision=Sqrt,Power,ReduceMean,Add_RMSNorm" },
Contributor:

So do you remember this Add vs Add_RMSNorm problem?

Collaborator Author:

Yes, it should be just Add perhaps
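For context, the config under discussion is a flat map of string properties. A minimal, self-contained sketch of the baseline builder follows, using std::map<std::string, std::string> as a stand-in for ov::AnyMap (the real pipeline uses OpenVINO's type); the layer list uses plain Add rather than Add_RMSNorm, per the exchange above, so treat the exact values as assumptions:

```cpp
#include <map>
#include <string>

// Stand-in for ov::AnyMap; the actual pipeline uses OpenVINO's AnyMap.
using Config = std::map<std::string, std::string>;

// Sketch of a baseline NPUW config, mirroring the keys visible in the diff.
// "Add" (rather than "Add_RMSNorm") reflects the review discussion above.
Config get_baseline_common_config() {
    return {
        { "NPU_USE_NPUW", "YES" },
        { "NPU_COMPILATION_MODE_PARAMS",
          "compute-layers-with-higher-precision=Sqrt,Power,ReduceMean,Add" },
    };
}
```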

                                     const std::optional<NPUDesc>& desc) {
    auto config = get_baseline_common_config();
    if (desc.has_value() && desc->support_max_mem_alloc_size) {
        config.emplace("NPUW_PMM", "NO");
Contributor:

Disabled PMM can be in the base (common) config, I believe.

Collaborator Author:

Done
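The suggestion above, sketched: NPUW_PMM=NO moves out of the device-conditional branch and into the baseline, so every derived config inherits it. As before, std::map stands in for ov::AnyMap, and the exact key set is an assumption based on the diff:

```cpp
#include <map>
#include <string>

// Stand-in for ov::AnyMap.
using Config = std::map<std::string, std::string>;

// PMM disabled unconditionally in the base (common) config,
// as suggested in the review, instead of only for capable devices.
Config get_baseline_common_config() {
    return {
        { "NPU_USE_NPUW", "YES" },
        { "NPUW_PMM",     "NO"  },
    };
}
```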

ov::AnyMap get_default_common_config(const std::shared_ptr<ov::Model>& model,
                                     const std::optional<NPUDesc>& desc) {
    auto config = get_baseline_common_config();
    if (desc.has_value() && desc->support_max_mem_alloc_size) {
Contributor:

Also, is this check enough? Shouldn't you check the actual value?

Collaborator Author:

This part is removed; it will be added in follow-up PRs.
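What the reviewer is asking about, sketched: rather than only testing that the device reports max_mem_alloc_size support, compare the reported value against what the model actually needs. This logic was deferred to a follow-up PR, so the NPUDesc fields, the helper name, and the threshold handling here are all hypothetical:

```cpp
#include <cstdint>
#include <optional>

// Hypothetical device descriptor; the real NPUDesc lives in the pipeline.
struct NPUDesc {
    bool     support_max_mem_alloc_size = false;
    uint64_t max_mem_alloc_size         = 0;  // bytes, as reported by the device
};

// Check the actual reported value, not just the capability flag.
bool fits_in_device_alloc(const std::optional<NPUDesc>& desc,
                          uint64_t required_bytes) {
    return desc.has_value()
        && desc->support_max_mem_alloc_size
        && desc->max_mem_alloc_size >= required_bytes;
}
```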

    auto config = get_baseline_common_config();
    if (desc.has_value() && desc->support_max_mem_alloc_size) {
        config.emplace("NPUW_PMM", "NO");
        config.emplace("NPUW_FUNCALL_FOR_ALL", "YES");
Contributor:

Once FCFA is on, the PARALLEL_COMPILE can be ON for both models

Collaborator Author:

Will take it into account for the next PR that enables FCFA, thanks!
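The pairing discussed above, sketched: once function-call-for-all (FCFA) is enabled, parallel compilation can be switched on alongside it. The "NPUW_PARALLEL_COMPILE" key name is an assumption inferred from the comment, the helper is hypothetical, and std::map again stands in for ov::AnyMap:

```cpp
#include <map>
#include <string>

// Stand-in for ov::AnyMap.
using Config = std::map<std::string, std::string>;

// Hypothetical helper: enable FCFA and, with it, parallel compilation,
// per the review comment that the two can go together for both models.
void enable_fcfa(Config& config) {
    config.emplace("NPUW_FUNCALL_FOR_ALL",  "YES");
    config.emplace("NPUW_PARALLEL_COMPILE", "YES");  // key name assumed
}
```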

@dmatveev dmatveev (Contributor) left a comment

Go for it and merge. Tomorrow there will be another one. :D

@TolyaTalamanov TolyaTalamanov added this pull request to the merge queue Oct 16, 2024
Merged via the queue into openvinotoolkit:master with commit be23fc6 Oct 16, 2024
48 checks passed
Labels: "category: LLM" (LLM pipeline, stateful/static), "Code Freeze"