v5.1.0: rate limits, refactored settings

@jamesbraza jamesbraza released this 07 Oct 21:58
1816965

Highlights

In-house rate limit management

  • Centers on a moving window algorithm with either Redis or in-memory state
  • Supports dynamically defined rates for different models or providers
  • New bundled configurations for different OpenAI rate limit tiers
  • Accomplished using the new third-party dependencies coredis and limits
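
As a rough illustration of the moving window idea with in-memory state and per-model rates, here is a minimal standard-library sketch. It is not the implementation shipped in this release (which builds on coredis and limits); the class name `MovingWindowLimiter`, the `rates` mapping, and the model names are hypothetical and chosen only for this example.

```python
import time
from collections import deque


class MovingWindowLimiter:
    """Hypothetical in-memory moving-window limiter (illustration only).

    `rates` maps a key (e.g. a model or provider name) to a tuple of
    (max hits, window length in seconds).
    """

    def __init__(self, rates, clock=time.monotonic):
        self.rates = dict(rates)
        self.clock = clock
        self._hits = {}  # key -> deque of hit timestamps, oldest first

    def try_acquire(self, key, weight=1):
        """Record `weight` hits for `key` if they fit in the window."""
        limit, period = self.rates[key]
        now = self.clock()
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= period:
            hits.popleft()
        if len(hits) + weight > limit:
            return False  # over the limit: caller should wait and retry
        hits.extend([now] * weight)
        return True


# Hypothetical rates: model name -> (requests, seconds).
limiter = MovingWindowLimiter({"model-a": (2, 60.0), "model-b": (5, 60.0)})
if limiter.try_acquire("model-a"):
    pass  # safe to send the request
```

Swapping the dictionary of deques for a Redis-backed structure would let several processes share one window, which is the role the Redis state plays in the bullet above.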

Refactored Settings to allow for increased flexibility

  • Indexing
    • Indexes can use relative paths, enabling sharing across machines
    • Paper search no longer rebuilds the index on every invocation
    • Index parameters are now grouped in IndexSettings
      • This release begins a deprecation cycle for the original hyperparameters
    • Index builds now have a rich.Progress bar
  • Parsing
    • Chunking and embedding can now be deferred to inference time
  • Agents
    • Agents now have a max_timesteps parameter to upper-bound trajectory length
    • The default agent is now a simple tool-calling agent (ToolSelector) instead of a deterministic sequence of tool calls (the "fake" agent)

Several bug fixes centered on retryable errors:

  • Flaky SSL errors and connection reset errors from Semantic Scholar and Crossref
  • LLM completions and text embeddings
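
For readers unfamiliar with the pattern, retrying only transient errors can be sketched as below. This is a generic illustration, not the code changed in this release; the helper name `call_with_retries`, its parameters, and the choice of exception types are assumptions made for the example.

```python
import random
import ssl
import time

# Assumed set of transient errors for this sketch.
RETRYABLE = (ssl.SSLError, ConnectionResetError, TimeoutError)


def call_with_retries(fn, retryable=RETRYABLE, attempts=3,
                      base_delay=0.5, sleep=time.sleep):
    """Call `fn`, retrying with exponential backoff on retryable errors.

    Any other exception propagates immediately, and the last retryable
    error is re-raised once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == attempts:
                raise
            # Exponential backoff with a little jitter to avoid bursts.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
```

Keeping the retryable set narrow matters: retrying on every exception would hide genuine bugs such as malformed requests.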

What's Changed

Full Changelog: v5.0.10...v5.1.0