Add dockerfile #1350
@@ -0,0 +1,64 @@
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04 AS dev

RUN apt-get update -y \
    && apt-get install -y python3-pip python3-venv

WORKDIR /workspace
COPY requirements.txt requirements.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY requirements-dev.txt requirements-dev.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements-dev.txt

FROM dev AS build_wheel

ARG max_jobs=4

COPY csrc csrc
COPY vllm vllm
COPY pyproject.toml pyproject.toml
COPY README.md README.md
COPY MANIFEST.in MANIFEST.in
Ideally we would copy only the csrc folder here and build only the C++ code. If you copy the vllm folder, any change to the Python code triggers a slow rebuild of the C++ code, which is unnecessary most of the time. README.md is also required when building the C++ code, but we can copy an empty README.md for that step.

Makes sense. Do you think it would be possible to build the extensions separately from the wheel, so that the wheel-building step only bundles everything without having to rebuild?

I'm not sure how to build the wheel with already-built extensions, but it might be possible. I don't think we need to build the wheel. If you want this to be used to publish the wheel files to pip, we can add another container path for building the wheel.

OK, I just saw that you added another stage to build the wheel. I think this is fine.

I think for now there is no need to build the wheel in the Dockerfile; we can do that later if we decide to consolidate Docker and CI.

COPY setup.py setup.py

RUN --mount=type=cache,target=/root/.cache/pip \
    MAX_JOBS=$max_jobs python3 -m build

FROM dev AS build

COPY csrc csrc
COPY setup.py setup.py
COPY README.md README.md
skrider marked this conversation as resolved.

COPY requirements.txt requirements.txt
COPY pyproject.toml pyproject.toml
COPY vllm/__init__.py vllm/__init__.py

ENV MAX_JOBS=$max_jobs

RUN python3 setup.py build_ext --inplace

FROM dev AS test

COPY --from=build /workspace/vllm/*.so /workspace/vllm/
COPY tests tests
COPY vllm vllm

ENTRYPOINT ["python3", "-m", "pytest", "tests"]

FROM nvidia/cuda:11.8.0-base-ubuntu22.04 AS api_server

RUN apt-get update -y \
    && apt-get install -y python3-pip libnccl2
WORKDIR /workspace

COPY requirements.txt requirements.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY --from=build /workspace/vllm/*.so /workspace/vllm/
COPY vllm vllm

EXPOSE 8000
ENTRYPOINT ["python3", "-m", "vllm.entrypoints.api_server"]
skrider marked this conversation as resolved.
@@ -0,0 +1,21 @@
.. _deploying_with_docker:

Deploying with Docker
============================

You can build and run vLLM from source via the provided dockerfile. To build vLLM:

.. code-block:: console

    $ DOCKER_BUILDKIT=1 docker build . --target prod --tag vllm --build-arg max_jobs=8

To run vLLM:

.. code-block:: console

    $ docker run --runtime nvidia --gpus all \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        -p 8000:8000 \
        --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
        vllm <args...>
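
Note that the Dockerfile in this diff declares the stages dev, build_wheel, build, test and api_server, and none named prod, so the --target value in the documented build command may need to match one of those. A minimal sketch for checking stage names before starting a long build is shown here; list_stages is a hypothetical helper, not part of vLLM or Docker, and it assumes each stage is declared on a single "FROM ... AS name" line.

```shell
# Hypothetical helper: print the stage names declared in a Dockerfile,
# one per line, so a --target value can be checked before building.
list_stages() {
    # Keep lines of the form "FROM <image> AS <name>" (case-insensitive)
    # and print the last field, which is the stage name.
    grep -iE '^FROM[[:space:]].+[[:space:]]AS[[:space:]]+[^[:space:]]+' "$1" \
        | awk '{print $NF}'
}

# Example, run from the repository root:
#   list_stages Dockerfile
#   DOCKER_BUILDKIT=1 docker build . --target api_server --tag vllm --build-arg max_jobs=8
```

Under that reading, the build command for this revision would presumably use --target api_server, as in the commented example.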

@@ -12,3 +12,6 @@ types-setuptools
pytest
pytest-forked
pytest-asyncio

# distribution
build
Forward declare the argument and set sensible default
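
One way to read this comment, assuming it refers to the ENV MAX_JOBS=$max_jobs line in the build stage: the only ARG max_jobs in the diff is declared inside build_wheel, and build is based on dev rather than build_wheel, so $max_jobs would expand to an empty string there. A minimal sketch of declaring the argument in the stage that uses it follows. The COPY lines from the diff are omitted, and the default of 4 simply mirrors build_wheel; it is an assumption, not the value the PR settled on.

```dockerfile
FROM dev AS build

# COPY lines from the diff omitted for brevity.

# Declare the build argument in this stage so that --build-arg max_jobs=N
# reaches it; the default of 4 is an assumption copied from build_wheel.
ARG max_jobs=4
ENV MAX_JOBS=$max_jobs

RUN python3 setup.py build_ext --inplace
```

An alternative would be a single ARG before the first FROM, redeclared without a value (ARG max_jobs) in each stage that needs it, which keeps the default in one place.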