[NPU] Support streaming in Python (cpp backend) #12488

Merged

@Oscilloscope98 Oscilloscope98 commented Dec 3, 2024

Description

https://github.com/analytics-zoo/nano/issues/1774

Support streaming in NPU Python (cpp backend)

Usage: same as the transformers streamer
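As a rough illustration of what "same as the transformers streamer" means in practice: transformers streamers expose a `put()` method that receives token ids (the first call carries the prompt) and an `end()` method called once generation finishes. The sketch below mimics that protocol with plain Python lists. `CollectingStreamer` and `toy_generate` are hypothetical names invented for this example, not part of this PR or of the NPU backend.

```python
class CollectingStreamer:
    """Hypothetical stand-in following the put()/end() streamer protocol."""

    def __init__(self, skip_prompt=True):
        self.skip_prompt = skip_prompt
        self._prompt_pending = True
        self.tokens = []
        self.finished = False

    def put(self, value):
        # The first put() carries the prompt ids; later calls carry new tokens.
        if self._prompt_pending:
            self._prompt_pending = False
            if self.skip_prompt:
                return
        self.tokens.extend(value)

    def end(self):
        # Called once, after the last token has been pushed.
        self.finished = True


def toy_generate(prompt_ids, max_new_tokens, streamer=None):
    """Hypothetical decode loop: the 'model' simply emits last id + 1."""
    if streamer is not None:
        streamer.put(list(prompt_ids))
    output = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = output[-1] + 1
        output.append(next_id)
        if streamer is not None:
            streamer.put([next_id])
    if streamer is not None:
        streamer.end()
    return output


streamer = CollectingStreamer(skip_prompt=True)
result = toy_generate([10, 11], max_new_tokens=3, streamer=streamer)
print(result)           # prompt followed by generated ids
print(streamer.tokens)  # only the generated ids, since skip_prompt=True
```

With a real model, the streamer object would presumably be passed as the `streamer` argument of `generate()`, as with stock transformers; that call is not shown here because its exact form on NPU is not spelled out in this PR.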

In next PR:

  • Update examples

@Oscilloscope98 Oscilloscope98 changed the title Support streaming in NPU Python (cpp backend) [NPU] Support streaming in Python (cpp backend) Dec 3, 2024
output = torch.stack(output_tokens, dim=1)
output = torch.cat((inputs, output), dim=1)
time_t3 = time.perf_counter()

if streamer is not None:
    streamer.end()

I put streamer.end() outside the timed section to stay consistent with our BenchmarkWrapper
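The ordering described in this comment can be sketched as follows: the stop timestamp is taken first, and only then is `end()` called, so any flushing work the streamer does in `end()` is not counted in the measured decode time. `TimestampStreamer` and `timed_decode` are hypothetical helpers written for this illustration; they are not the code in this PR, and the loop body is a placeholder for real decoding.

```python
import time


class TimestampStreamer:
    """Hypothetical streamer that records when end() was called."""

    def __init__(self):
        self.ended_at = None
        self.received = []

    def put(self, value):
        self.received.extend(value)

    def end(self):
        self.ended_at = time.perf_counter()


def timed_decode(num_tokens, streamer=None):
    """Placeholder decode loop that stops the clock before streamer.end()."""
    t_start = time.perf_counter()
    tokens = []
    for i in range(num_tokens):
        tokens.append(i)  # stands in for one decode step
        if streamer is not None:
            streamer.put([i])
    t_stop = time.perf_counter()  # timing ends here

    # end() runs after the stop timestamp, so it is excluded from the timing.
    if streamer is not None:
        streamer.end()
    return tokens, t_start, t_stop


streamer = TimestampStreamer()
tokens, t_start, t_stop = timed_decode(4, streamer=streamer)
print(f"decode took {t_stop - t_start:.6f}s, excluding streamer.end()")
```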

@plusbang plusbang left a comment

LGTM

@Oscilloscope98 Oscilloscope98 merged commit 4ac66db into intel-analytics:main Dec 3, 2024
1 check passed