Hi, pleased to meet you all.
Today we will deploy the Llama-2-13B open-source LLM (it can be any other Hugging Face LLM) on an AWS EC2 instance equipped with a 24 GB GPU, and we will provision that infrastructure with a CloudFormation YAML template.
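A rough sketch of what such a template provisions is shown below; the resource names, instance type, and AMI are assumptions, and the actual YAML in this repo is the source of truth:

# illustrative CloudFormation fragment, not the real template
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  LlmSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow SSH and the Flask app port
      SecurityGroupIngress:
        - { IpProtocol: tcp, FromPort: 22, ToPort: 22, CidrIp: 0.0.0.0/0 }
        - { IpProtocol: tcp, FromPort: 5000, ToPort: 5000, CidrIp: 0.0.0.0/0 }
  LlmInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.xlarge           # single A10G GPU with 24 GB of GPU memory
      ImageId: ami-xxxxxxxxxxxxxxxxx    # placeholder: a GPU / Deep Learning AMI for your region
      KeyName: llm-key
      SecurityGroupIds:
        - !GetAtt LlmSecurityGroup.GroupId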
We will package, build, and run everything from a single docker-compose file, which lets the GenAI RAG application talk to both the open-source, GPU-powered LLM served via TGI (Text Generation Inference) and the proprietary OpenAI LLM API for comparison.
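A minimal docker-compose sketch of that layout follows; the service names, ports, model id, and the TGI_URL variable are assumptions, and the docker-compose.yml in this repo is authoritative:

# illustrative only; assumes the NVIDIA Container Toolkit is available on the host
services:
  tgi:
    image: ghcr.io/huggingface/text-generation-inference:latest
    # 13B weights in fp16 exceed 24 GB of GPU memory, so a quantized load is likely needed here
    command: --model-id meta-llama/Llama-2-13b-chat-hf
    environment:
      - HUGGING_FACE_HUB_TOKEN=${HF_TOKEN}
    ports:
      - "8080:80"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  rag-app:
    build: .
    ports:
      - "5000:5000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - TGI_URL=http://tgi:80   # assumed name for the variable the Flask app reads
    depends_on:
      - tgi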
This code is released as part of the Speakers' Corner session held on 16 April 2024: https://www.landing.ciklum.com/sc-architecting-scalable-ai
So, let's start.
Create the CloudFormation stack from the template using the AWS console (upload the YAML and follow the wizard).
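If you prefer the AWS CLI to the console, the equivalent call looks roughly like this (the stack name and template file name are assumptions):

aws cloudformation create-stack \
  --stack-name llm-stack \
  --template-body file://cloudformation.yaml \
  --capabilities CAPABILITY_IAM   # only needed if the template creates IAM resources
aws cloudformation wait stack-create-complete --stack-name llm-stack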
Download the EC2 key pair (llm-key.pem) and put it in the project root.
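If the key pair does not exist yet, it can also be created and saved from the CLI (the key name llm-key is assumed to match what the template references):

aws ec2 create-key-pair --key-name llm-key --query 'KeyMaterial' --output text > llm-key.pem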
chmod 400 llm-key.pem
The EC2 public DNS name uses "-" instead of "." inside the IP, so convert it before connecting over SSH:
PUBLIC_IP=X.XXX.XXX.XXX
PUBLIC_IP=$(echo "$PUBLIC_IP" | sed 's/\./-/g')
ssh -i llm-key.pem ec2-user@ec2-${PUBLIC_IP}.compute-1.amazonaws.com
Copy the project to the instance with scp rather than cloning it, to avoid dealing with git credentials during the demo:
scp -r -i llm-flask-rag-aws/llm-key.pem llm-flask-rag-aws ec2-user@ec2-${PUBLIC_IP}.compute-1.amazonaws.com:~/
sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
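Optional sanity check before building; on the usual GPU/Deep Learning AMIs Docker itself is preinstalled, otherwise install it with the distro package manager first:

docker --version
docker-compose --version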
cd llm-flask-rag-aws
docker-compose build
docker-compose up
Once the containers are up, the application is reachable on port 5000 through two endpoints, one per backend:
http://{PUBLIC_IP}:5000/hf      (open-source Llama-2 served by TGI)
http://{PUBLIC_IP}:5000/openai  (proprietary OpenAI API)
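For a quick smoke test from your laptop you can hit both endpoints with curl; note that ${PUBLIC_IP} was rewritten to its dashed form above, so reuse the public DNS name, and check the Flask routes in this repo for the exact request format they expect:

curl "http://ec2-${PUBLIC_IP}.compute-1.amazonaws.com:5000/hf"
curl "http://ec2-${PUBLIC_IP}.compute-1.amazonaws.com:5000/openai"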
When you are finished, delete the AWS CloudFormation stack so the GPU instance stops accruing charges.
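The CLI equivalent, assuming the stack was named llm-stack when it was created:

aws cloudformation delete-stack --stack-name llm-stack
aws cloudformation wait stack-delete-complete --stack-name llm-stack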