Add Inference Spec & CI #23
base: main
Conversation
…rds compatibility with inference box
```python
import os

def torch_device(self):
    # Select the torch device from the deployment environment:
    # "prod" runs on CUDA, everything else falls back to CPU.
    env = os.environ.get("env", "dev")
    torch_device = "cuda" if env == "prod" else "cpu"
    return torch_device
```
If we're bound to this for historical reasons I would let it slide, but I feel like names should be indicative of what they are. Could we call the environment variable 'inference_device' and pass it directly? If 'env' is set to something other than dev for an unrelated reason while I want to run on my local GPU, I might make a bad mistake.
This was left for backwards compatibility, but now that we're fully switched over, this is a good time to phase it out 👍🏼
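A minimal sketch of the suggested rename, assuming the variable is called 'inference_device' as proposed above (the "cpu" default is also an assumption):

```python
import os

def torch_device(self):
    # Hypothetical follow-up to the review suggestion: read the device
    # directly from 'inference_device' instead of inferring it from the
    # deployment environment; defaulting to "cpu" is an assumption here.
    return os.environ.get("inference_device", "cpu")
```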
Added:
Example: Running FastAPI with Inference Spec:
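A minimal sketch of what this might look like, assuming the FastAPI app is exposed at the hypothetical module path `models_host.fastapi:app` and served with uvicorn:

```bash
# Hypothetical invocation: models_host.fastapi:app is an assumed module path;
# substitute the app's actual import path from the models-host repo.
pip install git+https://github.com/guardrails-ai/models-host
uvicorn models_host.fastapi:app --host 0.0.0.0 --port 8000
```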
Example: Running Ray Serve with Inference Spec:
```bash
# Refer to the models-host repo for instructions if you don't have a GPU
# on the machine running Ray Serve.
pip install git+https://github.com/guardrails-ai/models-host@feat/adding-ray-setup
serve run models_host.ray_serve:app
```
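As a quick check that the deployment came up, one can hit Ray Serve's HTTP proxy, which listens on port 8000 by default (the root route below is an assumption; it depends on how `models_host.ray_serve:app` wires its endpoints):

```bash
# Hypothetical smoke test against the default Ray Serve HTTP port;
# the "/" route is an assumption.
curl http://localhost:8000/
```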