Update LlaMa notebooks to use HF TGI container (#2475)
* first draft

* llama hf tgi (#2476)

* Update notebook

* update

* update response format, input format, use env vars

* default sharding to true

* update scoring changes and notebook

* update

* update scoring script to use AACS (#2481)

* update scoring script to use AACS

* Add mlflow

* update

* fixes to scoring script

* remove /n

* update scoring script to have system prompt

---------

Co-authored-by: Gaurav Singh <[email protected]>

* black + minor fixes

* update default

* add gen params validation (#2489)

* add top_p in text-gen examples

* score.py changes

* update

* fix

* update scoring to include new aacs key

* add checking for empty string

---------

Co-authored-by: Gaurav Singh <[email protected]>
Co-authored-by: Ayush Mishra <[email protected]>
Co-authored-by: Ayush Mishra <[email protected]>
Co-authored-by: Ke Xu <[email protected]>
Co-authored-by: xuke444 <[email protected]>
6 people authored Jul 28, 2023
1 parent f3b8a39 commit 42b53c8
Showing 7 changed files with 869 additions and 24 deletions.
@@ -0,0 +1,32 @@
# Use the base image that includes the necessary dependencies
FROM ghcr.io/huggingface/text-generation-inference:0.9

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update -y && apt-get install vim openssh-server openssh-client -y

COPY requirements.txt .
RUN pip install -r requirements.txt --no-cache-dir

# List installed packages
RUN pip list

# Remove the requirements file once the packages are installed
RUN rm requirements.txt

# Inference requirements
COPY --from=mcr.microsoft.com/azureml/o16n-base/python-assets:20230419.v1 /artifacts /var/
RUN /var/requirements/install_system_requirements.sh && \
    cp /var/configuration/rsyslog.conf /etc/rsyslog.conf && \
    cp /var/configuration/nginx.conf /etc/nginx/sites-available/app && \
    ln -sf /etc/nginx/sites-available/app /etc/nginx/sites-enabled/app && \
    rm -f /etc/nginx/sites-enabled/default
ENV SVDIR=/var/runit
ENV WORKER_TIMEOUT=3600
EXPOSE 5001 8883 8888

# Clear the base image's entrypoint so the TGI server does not start at container launch.
# The scoring script is responsible for starting the server.
ENTRYPOINT []

CMD ["runsvdir", "/var/runit"]
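
The Dockerfile above clears `ENTRYPOINT` and its comments say the scoring script starts the server. The scoring script is not part of this excerpt, so the following is only a minimal sketch of how such a script might assemble the `text-generation-launcher` command from environment variables. `AZUREML_MODEL_DIR` is set by Azure ML; `TGI_PORT`, `NUM_SHARD`, the default values, and the function name `build_launcher_command` are hypothetical and chosen for illustration.

```python
import os
import shlex


def build_launcher_command(env):
    """Build a text-generation-launcher argv list from environment settings.

    TGI_PORT and NUM_SHARD are hypothetical variable names, not taken from
    the commit; the defaults are placeholders.
    """
    model_dir = env.get("AZUREML_MODEL_DIR", "/var/model")  # placeholder default
    port = env.get("TGI_PORT", "80")                        # hypothetical variable
    num_shard = env.get("NUM_SHARD")                        # hypothetical variable

    cmd = ["text-generation-launcher", "--model-id", model_dir, "--port", str(port)]
    if num_shard:
        # Shard the model across GPUs only when a shard count is supplied.
        cmd += ["--sharded", "true", "--num-shard", str(int(num_shard))]
    return cmd


if __name__ == "__main__":
    # Print the command instead of launching it; a scoring script would
    # typically hand this list to subprocess.Popen inside its init() hook.
    print(shlex.join(build_launcher_command(os.environ)))
```

Building the command as a list (rather than a shell string) avoids quoting problems if the model path contains spaces.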
@@ -0,0 +1,11 @@
azureml-inference-server-http
text-generation
psutil
pandas
numpy
mlflow==2.3.1
azure-ai-contentsafety==1.0.0b1
aiolimiter==1.1.0
azure-ai-mlmonitoring==0.1.0a3
azure-mgmt-cognitiveservices==13.4.0
azure-identity==1.13.0
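
The commit message mentions generation-parameter validation, a `top_p` example parameter, and an empty-string check, but the scoring script that implements them is not shown here. As a rough sketch of what such validation could look like, the function below checks a prompt and a few common parameters. The function name, the accepted keys, and the numeric ranges are assumptions for illustration, not the commit's actual rules.

```python
def validate_generation_params(prompt, params):
    """Return a cleaned copy of params, or raise ValueError.

    Hypothetical helper: the ranges below are assumed, not taken from the commit.
    """
    # Reject missing, non-string, or whitespace-only prompts.
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    cleaned = {}

    top_p = params.get("top_p")
    if top_p is not None:
        top_p = float(top_p)
        if not 0.0 < top_p < 1.0:  # assumed range: strictly between 0 and 1
            raise ValueError("top_p must be strictly between 0 and 1")
        cleaned["top_p"] = top_p

    temperature = params.get("temperature")
    if temperature is not None:
        temperature = float(temperature)
        if temperature <= 0.0:  # assumed rule: must be positive
            raise ValueError("temperature must be positive")
        cleaned["temperature"] = temperature

    max_new_tokens = params.get("max_new_tokens")
    if max_new_tokens is not None:
        max_new_tokens = int(max_new_tokens)
        if max_new_tokens <= 0:  # assumed rule: must be positive
            raise ValueError("max_new_tokens must be a positive integer")
        cleaned["max_new_tokens"] = max_new_tokens

    return cleaned
```

Validating before the request reaches the model server lets the endpoint return a clear error instead of a failure from deeper in the stack.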