Benchmarks - Add LLaMA-2 Models #668

Status: Open · wants to merge 19 commits into main
Conversation

@dpower4 (Contributor) commented Nov 19, 2024

Added a LLaMA-2 benchmark with both training and inference, implemented in line with the existing PyTorch model benchmarks such as GPT-2 and LSTM; a sketch of the model construction follows the list below.

  • Dropped Python 3.6 and added Python 3.10 for CPU unit tests
  • Updated the base image 20.12 -> 24.03 (CUDA 12.4) for CUDA unit tests
  • Added a LLaMA FP8 unit test for better code coverage while reducing the memory required
  • Updated the transformers requirement to >= 4.28.0 for LlamaConfig
  • Added LLaMA-2 to TensorRT
  • LLaMA-2 tests were not added to test_tensorrt_inference_performance.py because of the large memory requirement on the worker GPU; the tests were validated separately on a GH200
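For reference, the sketch below shows the kind of model setup such a benchmark performs, using the real transformers (>= 4.28.0) LlamaConfig/LlamaModel APIs. The tiny configuration values and variable names are illustrative only, not the values used by pytorch_llama.py.

```python
# Minimal sketch of constructing and driving a LLaMA model, assuming the
# benchmark follows the same pattern as the existing GPT-2 implementation.
# The small config values are placeholders; LLaMA-2-7B would use
# hidden_size=4096, num_hidden_layers=32, num_attention_heads=32, etc.
import torch
from transformers import LlamaConfig, LlamaModel  # requires transformers >= 4.28.0

config = LlamaConfig(
    vocab_size=32000,
    hidden_size=256,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=512,
)
model = LlamaModel(config)

# One forward (inference) step on a random batch, the kind of step a model
# benchmark would time repeatedly.
input_ids = torch.randint(0, config.vocab_size, (1, 128))
with torch.no_grad():
    outputs = model(input_ids)
print(outputs.last_hidden_state.shape)  # torch.Size([1, 128, 256])
```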

@dpower4 requested review from cp5555 and a team as code owners on November 19, 2024 02:53
@abuccts (Member) left a comment

Please run python3 setup.py lint to check the formatting and python3 setup.py format to format the code.

@abuccts changed the title from Feat/llama2 to Benchmarks - Add LLaMA-2 Models on Nov 19, 2024
@dpower4 (Contributor, Author) commented Nov 19, 2024

@abuccts, can I get access to the unit test logs?


codecov bot commented Nov 20, 2024

Codecov Report

Attention: Patch coverage is 36.58537% with 78 lines in your changes missing coverage. Please review.

Project coverage is 84.90%. Comparing base (a8a7bed) to head (0b1da4f).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...bench/benchmarks/model_benchmarks/pytorch_llama.py | 32.75% | 78 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #668      +/-   ##
==========================================
- Coverage   85.77%   84.90%   -0.87%     
==========================================
  Files          97       98       +1     
  Lines        6925     7116     +191     
==========================================
+ Hits         5940     6042     +102     
- Misses        985     1074      +89     
| Flag | Coverage Δ |
|---|---|
| cpu-python3.10-unit-test | 70.95% <36.06%> (?) |
| cpu-python3.7-unit-test | 70.91% <35.77%> (-0.68%) ⬇️ |
| cpu-python3.8-unit-test | 70.95% <35.83%> (-0.67%) ⬇️ |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@guoshzhao (Contributor) commented:

LGTM, thanks! Please fix the UT failures with Python 3.10. And since the CUDA tests are running on a K80, which is a very old GPU, we can skip the "cuda-unit-test" and just make sure "cpu-unit-test" passes.

/__w/1/s/.eggs/setuptools_scm-8.1.0-py3.10.egg/setuptools_scm/_integration/setuptools.py:92: UserWarning: version of superbench already set
  warnings.warn(f"version of {dist_name} already set")
running lint
tests/analyzer/test_summaryop.py:7: error: Module "numpy" has no attribute "NaN"  [attr-defined]
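The lint error above is NumPy 2.0 fallout: the np.NaN alias was removed in NumPy 2.0, and np.nan is the remaining spelling. A minimal sketch of the fix for test_summaryop.py line 7 follows; the surrounding code and the variable name are assumptions, since the file isn't shown here.

```python
import numpy as np

# Before (flagged by mypy; the alias was removed in NumPy 2.0):
#   fill_value = np.NaN
# After: np.nan is the canonical lowercase name and works on all NumPy versions.
fill_value = np.nan
```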
