Llama 2 on CPU, and Mac M1/M2 GPU

This is a fork of https://github.com/facebookresearch/llama that runs on CPU, or on the Mac M1/M2 GPU through PyTorch's `mps` backend when it is available.
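The device fallback described above can be sketched as follows. This is a minimal illustration, not the code used in this fork: the helper name `pick_device_name` is hypothetical, and it assumes only PyTorch's public `torch.backends.mps.is_available()` check, falling back to CPU when PyTorch or the MPS backend is missing.

```python
def pick_device_name() -> str:
    """Return "mps" when PyTorch reports the Apple GPU backend as usable, else "cpu".

    Hypothetical helper for illustration; the fork's own selection logic may differ.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to accelerate with.
        return "cpu"

    # torch.backends.mps is absent in older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device_name()}")
```

The returned string can be passed to `torch.device(...)` or to a tensor's `.to(...)` call to place the model and its inputs on the chosen device.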

Please refer to the official installation and usage instructions as they are exactly the same.


Approximate generation speed on a MacBook Pro M1 with the 7B model:

  • MPS (default): ~4.3 words per second
  • CPU: ~0.67 words per second

During text generation, an extra message reports how many tokens have been generated and how fast they are being produced.
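As a rough sketch of how such a progress message could be produced, the example below counts tokens as they are yielded and prints a throughput line at the end. The names `format_throughput` and `generate_with_report`, and the message format itself, are assumptions made for illustration; they are not taken from this fork.

```python
import time


def format_throughput(n_tokens: int, elapsed_s: float) -> str:
    """Build a status line with the token count and rate (hypothetical format)."""
    # Guard against a zero elapsed time to avoid dividing by zero.
    rate = n_tokens / elapsed_s if elapsed_s > 0 else 0.0
    return f"{n_tokens} tokens, {rate:.2f} tokens/s"


def generate_with_report(token_source):
    """Yield tokens unchanged, then print how many were produced and how fast."""
    start = time.perf_counter()
    count = 0
    for token in token_source:
        count += 1
        yield token
    print(format_throughput(count, time.perf_counter() - start))


if __name__ == "__main__":
    # Stand-in for a real model: any iterable of tokens works here.
    for tok in generate_with_report(["Hello", ",", " world"]):
        pass
```

Wrapping the token stream in a generator keeps the reporting separate from the model code, so the same wrapper works on either CPU or `mps`.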