Commit
- #38 addressed
- #38 problem addressed
- #38 addressed
- integration of models
- update with catalog services
- update with catalog services
- combined properties and generator
- grammar cleanup
- namespacing added
- addressed error checking, tab forward, Help and display formatting
- addressed error checking, tab forward, Help and display formatting
- model service api
- added model mgt services
- model service api updates
- bugfix services
- using inheritance
- working on bug fix for save
- servicer updates
- added skypilot
- phils changes
- download models
- pytest tests added
- new tests
- updated servicer
- change default port
- updated capabilities
- updated capabilities
- #42 addresses issue of canonical SMILES being overwritten with 2D canonical from PubChem
- testing and fixes
- bugs for launching services
- partially implemented context-sensitive output
- temp feat gpu
- create config from service file
- remove unused grammar
- set default workdir
- version bump servicing
- print config specs
- try to remove service
- rename to cfg
- format config print
- fixes
- skip test until next servicing
- updated tests and servicing
- -c implementation for command-line raw
- implemented api
- implemented api
- reduce interval for spinner
- wait for service to ping
- add status
- cast vars
- bug fix
- updates for merge data
- fix cfg file types
- updated tests
- bug fix status
- scaffold for local service
- more fixes
- high-speed merge of property output
- improvements to handling data-only modes in api
- better status
- major services update
- spinner function hints
- more hints
- add remote service
- added grammar for remote service catalog
- fix NoneType error in returns
- start without gpu
- fix url
- setup pre-commit
- chore: format and lint
- added append and fixed messaging
- fixed adding or displaying molecule not on PubChem
- demonstration testing and new notebooks
- fix return_val parameter typo
- llm model update to latest granite
- llm model update to latest granite
- update on doco and demo
- update notebooks and for demonstration
- working remote service defs
- servicing version bump
- remote service working
- chore: linter
- fix logger
- notebook update
- notebook update
- better logging
- remove lower()
- one more log :-)
- change save
- expand user path
- remove sentence transformer
- fix remote fetch
- service grammar instant refresh
- save when necessary
- caching service definitions
- clean up and debug
- update service defs
- use python cache decorator
- reduce sleep
- fix tests
- timeout on catalog query increased from 1 to 3 seconds
- timeout on catalog query increased from 1 to 3 seconds
- increase linter line length
- fix lru cache update
- readme update
- temporary fix for url fetch
- updates
- readme updates
- readme updates
- linter
- corrected Readme and added plugin loader
- updated pnd files
- updated version
- merge
- merge
- merge
- merge
- llm ollama support
- tuning model output and changing embeddings
- finalise on llama3 for ollama and granite-chat for BAM
- readme update for ollama
- Feat auth api (#47)
  - authentication grammar
  - auth with api key
  - chore: sort imports
  - auth model inference
  - rename headers
  - file lock + optimize + api obfuscate
  - working proxy
  - add headers to proxy
  - can add bearer token in USING quotes

  Co-authored-by: Brian Duenas <[email protected]>
- llm instructions
- chore: lint
- updated version

Co-authored-by: Phil Downey <[email protected]>
Co-authored-by: Phil Downey <[email protected]>
1 parent 7eed012 · commit 0f2b748 · 29 changed files with 1,154 additions and 609 deletions
New file (62 lines added): a SkyPilot task definition for serving Ollama.

```yaml
envs:
  MODEL_NAME: llama3  # mistral, phi, other Ollama-supported models
  EMBEDDINGS_MODEL_NAME: nomic-embed-text  # embeddings model pulled alongside the chat model
  OLLAMA_HOST: 0.0.0.0:8888  # Host and port for Ollama to listen on

resources:
  cpus: 8+
  memory: 16+  # 8 GB+ for 7B models, 16 GB+ for 13B models, 32 GB+ for 33B models
  accelerators: V100:1  # No GPU is required for Ollama, but one runs inference faster
  ports: 8888

service:
  replicas: 2
  # An actual chat request, used as the readiness probe.
  readiness_probe:
    path: /v1/chat/completions
    post_data:
      model: $MODEL_NAME
      messages:
        - role: user
          content: Hello! What is your name?
      max_tokens: 1

setup: |
  # Install Ollama
  if [ "$(uname -m)" = "aarch64" ]; then
    # ARM64 hosts (e.g. Apple-silicon VMs, Graviton)
    sudo curl -L https://ollama.com/download/ollama-linux-arm64 -o /usr/bin/ollama
  else
    sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
  fi
  sudo chmod +x /usr/bin/ollama

  # Start `ollama serve` in the background and capture its PID
  # so it can be killed once the pulls are done.
  ollama serve &
  OLLAMA_PID=$!

  # Wait up to 100 seconds (20 tries x 5 s) for Ollama to be ready.
  IS_READY=false
  for i in {1..20}; do
    ollama list && IS_READY=true && break
    sleep 5
  done
  if [ "$IS_READY" = false ]; then
    echo "Ollama was not ready after 100 seconds. Exiting."
    exit 1
  fi

  # Pull the embeddings model
  ollama pull $EMBEDDINGS_MODEL_NAME
  echo "Model $EMBEDDINGS_MODEL_NAME pulled successfully."

  # Pull the chat model
  ollama pull $MODEL_NAME
  echo "Model $MODEL_NAME pulled successfully."

  # Stop the background `ollama serve` now that the pulls are done.
  kill $OLLAMA_PID

run: |
  # Run `ollama serve` in the foreground
  echo "Serving model $MODEL_NAME"
  ollama serve
```
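The poll-until-ready pattern in the `setup` script above (try a command, sleep, give up after N attempts) can be factored into a small reusable helper. This is a sketch only; `wait_ready`, `ATTEMPTS`, and `DELAY` are hypothetical names, and the config above effectively uses 20 attempts with a 5-second delay against `ollama list`:

```shell
#!/bin/sh
# wait_ready CMD [ARGS...]: retry CMD until it succeeds or ATTEMPTS runs out.
# Mirrors the readiness loop in the setup script; succeeds (exit 0) as soon
# as CMD does, otherwise fails (exit 1) after ATTEMPTS * DELAY seconds.
ATTEMPTS=${ATTEMPTS:-20}
DELAY=${DELAY:-5}

wait_ready() {
  i=1
  while [ "$i" -le "$ATTEMPTS" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$DELAY"
  done
  echo "not ready after $((ATTEMPTS * DELAY)) seconds" >&2
  return 1
}

# Example: `true` succeeds on the first try, so no sleep happens.
wait_ready true && echo "ready"
```

In the setup script this would replace the `IS_READY` flag and the explicit `for`/`sleep` loop with a single `wait_ready ollama list || exit 1` call.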