Releases: Aleph-Alpha/intelligence-layer-sdk

v0.4.1

13 Dec 11:32

Fix missing version bump in the packages

Full Changelog: v0.4.0...v0.4.1

v0.4.0

13 Dec 11:29

Breaking Changes

  • Evaluator methods changed to support asynchronous processing for human evaluation. To run everything at once, change evaluator.evaluate() calls to evaluator.run_and_evaluate().
    • An evaluation now also returns an EvaluationOverview, with much more information about the output of the evaluation.
  • EmbeddingBasedClassify: the init arguments swapped places, from (labels_with_examples, client) to (client, labels_with_examples)
  • PromptOutput for Instruct tasks now inherits from CompleteOutput to make it easier to use more information about the raw completion response.

New Features

  • New IntelligenceApp builder to quickly spin up a FastAPI server with your Tasks
  • Integration with Argilla for human evaluation
  • CompleteOutput and PromptOutput now support getting the generated_tokens in the completion for downstream calculations.
  • Summarization use cases now allow for overriding the default model
  • New RecursiveSummarizer allows for recursively calling one of the LongContextSummarize tasks until certain thresholds are reached
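
The recursive idea behind RecursiveSummarizer can be sketched roughly as follows. This is not the SDK's implementation: recursive_summarize, keep_first_half, and the max_chars and max_rounds parameters are hypothetical names chosen for illustration, and the word-truncating "summarizer" only stands in for a real summarization task.

```python
from typing import Callable


def recursive_summarize(
    text: str,
    summarize_once: Callable[[str], str],
    max_chars: int,
    max_rounds: int = 10,
) -> str:
    """Re-apply `summarize_once` until the text fits within `max_chars`.

    Hypothetical helper, not part of the SDK. It also stops when a round
    fails to shorten the text, so it cannot loop forever.
    """
    for _ in range(max_rounds):
        if len(text) <= max_chars:
            break
        shorter = summarize_once(text)
        if len(shorter) >= len(text):
            break  # no progress, give up instead of looping
        text = shorter
    return text


def keep_first_half(text: str) -> str:
    # Stand-in for a real summarization call: keep the first half of the words.
    words = text.split()
    return " ".join(words[: max(1, len(words) // 2)])


long_text = " ".join(f"word{i}" for i in range(64))
summary = recursive_summarize(long_text, keep_first_half, max_chars=60)
```

The threshold here is a character count purely for simplicity; a real setup would more plausibly measure tokens or a number of partial summaries.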

Fixes

  • LimitedConcurrencyClient's from_token method now supports a custom API host

Full Changelog: v0.3.0...v0.4.0

v0.3.0

29 Nov 11:15

Breaking Changes

  • Dataset is now a protocol. SequenceDataset replaces the old Dataset.
  • The ident attribute on Example is now id.
  • The calculate_bleu function is removed; its functionality is now provided by BleuGrader
  • The calculate_rouge function is removed; its functionality is now provided by RougeGrader
  • ClassifyEvaluator is now called SingleLabelClassifyEvaluator
  • Evaluators now take and return Iterators instead of Sequences to allow for streaming datasets #106 #108
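
As a rough illustration of what the move from Sequences to Iterators means for calling code, the sketch below uses plain Python generators. The names evaluate_lazily and stream_examples are hypothetical and do not correspond to SDK classes; the length computation merely stands in for a real evaluation.

```python
from typing import Iterable, Iterator


def evaluate_lazily(examples: Iterable[str]) -> Iterator[int]:
    """Yield one result per example as it is consumed (hypothetical evaluator)."""
    for example in examples:
        yield len(example)  # stand-in for a real evaluation


def stream_examples() -> Iterator[str]:
    # Stand-in for a dataset that is read incrementally, e.g. from disk.
    yield from ("short", "a longer example", "x")


results = evaluate_lazily(stream_examples())
first = next(results)      # only the first example has been processed so far
remaining = list(results)  # materialize explicitly when a Sequence is needed
```

Code that previously indexed into the results or called len() on them would need to wrap the iterator in list() first, since an iterator supports neither and can only be consumed once.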

New Features

  • Evaluators now have better handling of dataset processing.
    • Errors are handled for individual examples, so that you don't lose the entire run because of one failed task.
    • The dataset run now produces an EvaluationRunOverview, generated by an EvaluationRepository, which better captures the aggregated runs and traces. #109 #112 #115 #131
    • There is a FileEvaluationRepository and an InMemoryEvaluationRepository available for storing your evaluation results
  • Support passing Metadata field through DocumentIndexClient (already supported in the Document Index, new in client only) #105
  • New MultiLabelClassifyEvaluator to evaluate classification use cases that support multi-label classification #129 #133
  • Evaluators can now be called via the CLI #130
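
The per-example error handling mentioned above follows a common pattern, sketched here in plain Python. ExampleResult and run_all are hypothetical names, not SDK types; the point is only that a failure is recorded next to the example that caused it instead of aborting the whole run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass
class ExampleResult:
    """Outcome for one example: either an output or an error message."""
    example_id: str
    output: Optional[float] = None
    error: Optional[str] = None


def run_all(
    examples: Iterable[Tuple[str, str]],
    task: Callable[[str], float],
) -> Iterator[ExampleResult]:
    for example_id, value in examples:
        try:
            yield ExampleResult(example_id, output=task(value))
        except Exception as exc:
            # Record the failure and continue with the next example.
            yield ExampleResult(example_id, error=f"{type(exc).__name__}: {exc}")


# `float` plays the role of a task that fails on malformed input.
results = list(run_all([("a", "1.5"), ("b", "oops"), ("c", "2")], float))
failed = [r.example_id for r in results if r.error is not None]
```

In this sketch the run finishes with two successful results and one recorded failure, which can then be inspected or retried separately.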

Fixes

  • Fix issue in EchoTask regarding concurrent execution causing overrides in the PromptTemplate #116

Full Changelog: v0.2.0...v0.3.0

v0.2.0

17 Nov 12:51

Breaking Changes

  • SingleLabelClassify renamed to PromptBasedClassify with new SingleLabelClassifyOutput in #94 #96
  • EmbeddingBasedClassify now outputs MultiLabelClassifyOutput to distinguish between the different types of scores produced in #94 #96

New Features

  • New LimitedConcurrencyClient to better control how many API requests are made concurrently, regardless of where they are called within the Task hierarchy
  • Basic new SingleChunkSummarizeEvaluator and LongContextSummarizeEvaluator that can calculate Rouge and Bleu scores when compared with a "golden summary" in #90 #91
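
One way to cap concurrent requests regardless of call depth, as LimitedConcurrencyClient is described as doing, is to share a single semaphore across all callers. The sketch below assumes that approach; the release notes do not say how the SDK implements it, and LimitedConcurrency, fake_request, and the peak counter are hypothetical names used only for this illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class LimitedConcurrency:
    """Hypothetical wrapper: at most `max_concurrent` calls run at once."""

    def __init__(self, max_concurrent: int) -> None:
        self._semaphore = threading.Semaphore(max_concurrent)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous calls observed

    def call(self, fn, *args):
        with self._semaphore:  # blocks while the limit is reached
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return fn(*args)
            finally:
                with self._lock:
                    self._active -= 1


def fake_request(n: int) -> int:
    time.sleep(0.01)  # stand-in for network latency
    return n * 2


limiter = LimitedConcurrency(max_concurrent=2)
# Eight worker threads compete, but the shared limiter admits two at a time.
with ThreadPoolExecutor(max_workers=8) as pool:
    doubled = list(pool.map(lambda n: limiter.call(fake_request, n), range(10)))
```

Because every caller goes through the same limiter object, the cap holds no matter how many thread pools or nested tasks issue requests.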

Fixes

  • Fix issue with Pydantic 2.5 due to ambiguous ordering of types in PydanticSerializable type in #95
  • Fixed possible deadlock with nested calls to Task.run_concurrently in #99
  • Allow EchoTask to support models whose tokenizers don't contain pre_tokenizers in #98
  • Update documentation for including the package in Dockerfiles in #97

Full Changelog: v0.1.0...v0.2.0

v0.1.0 - Initial Release

13 Nov 14:44

Initial Beta Release