Releases · Aleph-Alpha/intelligence-layer-sdk
v0.4.1
Fix missing version bump in the packages
Full Changelog: v0.4.0...v0.4.1
v0.4.0
Breaking Changes
- `Evaluator` methods changed to support asynchronous processing for human eval. To run everything at once, change `evaluator.evaluate()` calls to `evaluator.run_and_evaluate()` (a before/after sketch follows this list).
- An evaluation also now returns an `EvaluationOverview`, with much more information about the output of the evaluation.
- `EmbeddingBasedClassify`: init arguments swapped places, from `labels_with_examples, client` to `client, labels_with_examples`.
- `PromptOutput` for `Instruct` tasks now inherits from `CompleteOutput` to make it easier to use more information about the raw completion response.
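A minimal before/after sketch of the evaluator rename above; `my_evaluator` and `my_dataset` are hypothetical placeholders for objects you already construct, since their setup is not part of these notes:

```python
# Hedged sketch of the v0.3.x -> v0.4.0 evaluator migration.

# Before (v0.3.x): one synchronous call that ran and evaluated the dataset.
# evaluation = my_evaluator.evaluate(my_dataset)

# After (v0.4.0): same one-shot behaviour under the new name, now returning an
# EvaluationOverview with much more information about the evaluation output.
overview = my_evaluator.run_and_evaluate(my_dataset)
print(overview)
```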
New Features
- New `IntelligenceApp` builder to quickly spin up a FastAPI server with your `Task`s (a hedged sketch follows this list)
- Integration with Argilla for human evaluation
- `CompleteOutput` and `PromptOutput` now support getting the `generated_tokens` in the completion for downstream calculations
- Summarization use cases now allow for overriding the default model
- New `RecursiveSummarizer` allows for recursively calling one of the `LongContextSummarize` tasks until certain thresholds are reached
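A hedged sketch of the `IntelligenceApp` builder; the import path, the wrapped `FastAPI` instance, the `register_task` method name, and the route argument are all assumptions rather than details taken from these notes:

```python
# Hedged sketch: exposing an existing Task over HTTP with the new IntelligenceApp builder.
from fastapi import FastAPI

from intelligence_layer.core import IntelligenceApp  # module path is an assumption

fast_api = FastAPI()
app = IntelligenceApp(fast_api)             # wrap a FastAPI application (assumed constructor)
app.register_task(my_task, "/my-task")      # my_task is a placeholder for one of your Tasks
# Serving the app (e.g. via uvicorn) then works as for any FastAPI application.
```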
Fixes
- `LimitedConcurrencyClient`'s `from_token` method now supports a custom API host (a hedged example follows)
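A hedged example of the new custom-host support; the keyword name `host` and the import path are assumptions based on the wording of the note:

```python
# Hedged sketch: pointing LimitedConcurrencyClient.from_token at a non-default API host.
from intelligence_layer.connectors import LimitedConcurrencyClient  # import path is an assumption

client = LimitedConcurrencyClient.from_token(
    token="AA_TOKEN",                      # your Aleph Alpha API token
    host="https://inference.example.com",  # previously only the default host could be used
)
```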
Full Changelog: v0.3.0...v0.4.0
v0.3.0
Breaking Changes
- `Dataset` is now a protocol. `SequenceDataset` replaces the old `Dataset` (a hedged migration sketch follows this list).
- The `ident` attribute on `Example` is now `id`.
- The `calculate_bleu` function is removed; BLEU scores are now calculated via a `BleuGrader`.
- The `calculate_rouge` function is removed; ROUGE scores are now calculated via a `RougeGrader`.
- `ClassifyEvaluator` is now called `SingleLabelClassifyEvaluator`.
- `Evaluator`s now take and return `Iterator`s instead of `Sequence`s to allow for streaming datasets (#106, #108).
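A hedged migration sketch covering the renames above; only the class names and the `ident` to `id` rename come from these notes, while the constructor arguments and the grader method names are assumptions inferred from the old free functions:

```python
# Hedged sketch of the v0.3.0 renames; argument names are assumptions.
example = Example(input="2 + 2 =", expected_output="4", id="example-1")  # attribute was `ident`
dataset = SequenceDataset(name="demo", examples=[example])               # replaces the old Dataset class

bleu_score = BleuGrader().calculate_bleu("4", "4")     # replaces the free calculate_bleu() function
rouge_score = RougeGrader().calculate_rouge("4", "4")  # replaces the free calculate_rouge() function
```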
New Features
- `Evaluator`s now have better handling of dataset processing:
  - Errors are handled per example, so you don't lose the entire run because of one failed task.
  - The dataset run now produces an `EvaluationRunOverview`, generated by an `EvaluationRepository`, which better captures the aggregated runs and traces (#109, #112, #115, #131).
  - A `FileEvaluationRepository` and an `InMemoryEvaluationRepository` are available for storing your evaluation results (a hedged usage sketch follows this list).
- Support passing the `Metadata` field through `DocumentIndexClient` (already supported by the Document Index, now also available in the client) (#105).
- New `MultiLabelClassifyEvaluator` to evaluate classification use cases that support multi-label classification (#129, #133).
- `Evaluator`s can now be called via the CLI (#130).
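A hedged sketch of wiring an evaluation repository into an evaluator; the class names and the `evaluate` call come from these notes (the call is renamed to `run_and_evaluate` in v0.4.0), but whether the repository is passed to the constructor, and the exact argument names, are assumptions:

```python
# Hedged sketch (v0.3.0 era): persisting evaluation results via an EvaluationRepository.
repository = FileEvaluationRepository("./eval-results")  # or InMemoryEvaluationRepository(); path arg is assumed
evaluator = SingleLabelClassifyEvaluator(
    task=classify_task,     # placeholder for your classification Task
    repository=repository,  # where runs, traces and aggregates end up (assumed keyword)
)
overview = evaluator.evaluate(dataset)  # produces an EvaluationRunOverview
```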
Fixes
- Fix issue in `EchoTask` where concurrent execution caused overrides in the `PromptTemplate` (#116)
Full Changelog: v0.2.0...v0.3.0
v0.2.0
Breaking Changes
- `SingleLabelClassify` renamed to `PromptBasedClassify`, with a new `SingleLabelClassifyOutput` (#94, #96) (a hedged before/after sketch follows this list)
- `EmbeddingBasedClassify` now outputs `MultiLabelClassifyOutput` to distinguish between the different types of scores produced (#94, #96)
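A hedged before/after sketch of the rename; the constructor arguments are assumptions (the `EmbeddingBasedClassify` argument order shown here matches the pre-v0.4.0 order described in the notes above):

```python
# Hedged sketch of the v0.2.0 classify renames; constructor arguments are assumptions.

# Before (v0.1.0):
# task = SingleLabelClassify(client)

# After (v0.2.0): same prompt-based behaviour, new name and output type.
task = PromptBasedClassify(client)                            # produces SingleLabelClassifyOutput
multi = EmbeddingBasedClassify(labels_with_examples, client)  # produces MultiLabelClassifyOutput
```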
New Features
- New `LimitedConcurrencyClient` to control how many API requests are made concurrently, regardless of where they are called within the `Task` hierarchy (a hedged sketch follows this list)
- New basic `SingleChunkSummarizeEvaluator` and `LongContextSummarizeEvaluator` that can calculate ROUGE and BLEU scores against a "golden summary" (#90, #91)
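A hedged sketch of the concurrency cap; the wrapped `aleph_alpha_client.Client` and the `max_concurrency` keyword are assumptions, only `LimitedConcurrencyClient` itself comes from these notes:

```python
# Hedged sketch: capping concurrent API calls regardless of Task nesting depth.
from aleph_alpha_client import Client  # the underlying Aleph Alpha client (assumed dependency)

limited_client = LimitedConcurrencyClient(Client(token="AA_TOKEN"), max_concurrency=10)
# Hand limited_client to your Tasks in place of the raw client; however deeply the
# Task hierarchy nests, at most 10 requests hit the API at the same time.
```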
Fixes
- Fix issue with Pydantic 2.5 caused by ambiguous ordering of types in the `PydanticSerializable` type (#95)
- Fix possible deadlock with nested calls to `Task.run_concurrently` (#99)
- Allow `EchoTask` to support models whose tokenizers don't contain `pre_tokenizers` (#98)
- Update documentation for including the package in Dockerfiles (#97)
Full Changelog: v0.1.0...v0.2.0
v0.1.0 - Initial Release
Initial Beta Release