Feat add local inference #89

codelion · 2024-11-10T02:32:18Z

Ability to use local built-in inference server
Allows logprobs in output responses (this is not supported in ollama)
Allows multiple response sampling (this is not supported in ollama)
Supports multiple LoRAs (this is not supported in ollama)
Supports prompt caching
Supports alternative decoding techniques like cot_decoding and entropy_decoding

- allow loading any model from hf and any lora adapter - support caching - support batchs - support optimized attention - add dynamic temperature

add support for logprobs

fix logprobs return

…n/optillm into feat-add-local-inference

fix multiple loras loading and setting of adapters

…n/optillm into feat-add-local-inference

…server

bump version for new release

codelion added 13 commits November 8, 2024 13:47

initial implementation

23e48ef

- allow loading any model from hf and any lora adapter - support caching - support batchs - support optimized attention - add dynamic temperature

return logprobs and toplogprobs

60a84ef

add support for logprobs

Update README.md

08cdffd

Update README.md

87c5f3a

Add none approach

2c66fc2

fix logprobs return

Merge branch 'feat-add-local-inference' of https://github.com/codelio…

9c2426a

…n/optillm into feat-add-local-inference

Update README.md

795556f

Update inference.py

26030d0

fix multiple loras loading and setting of adapters

Merge branch 'feat-add-local-inference' of https://github.com/codelio…

24b559d

…n/optillm into feat-add-local-inference

add support for cot_decoding and entropy_decoding in local inference …

e6d61b1

…server

Aggregate paths in cot decoding by default

6a3ffa7

Update README.md

3ff58a3

Update setup.py

ad90fd8

bump version for new release

codelion merged commit 7381008 into main Nov 13, 2024

codelion deleted the feat-add-local-inference branch November 13, 2024 02:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat add local inference #89

Feat add local inference #89

codelion commented Nov 10, 2024 •

edited

Loading

Feat add local inference #89

Feat add local inference #89

Conversation

codelion commented Nov 10, 2024 • edited Loading

codelion commented Nov 10, 2024 •

edited

Loading