Prep 0.12.1 release
benbrandt committed Apr 26, 2024
1 parent 3edd69b commit 0d6b722
Showing 4 changed files with 11 additions and 4 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Changelog

## v0.12.1

### What's New

- [`rust_tokenizers`](https://crates.io/crates/rust_tokenizers) support has been added to the Rust crate.

## v0.12.0

### What's New
4 changes: 2 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions Cargo.toml
@@ -2,7 +2,7 @@
members = ["bindings/*"]

[workspace.package]
-version = "0.12.0"
+version = "0.12.1"
authors = ["Ben Brandt <[email protected]>"]
edition = "2021"
description = "Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python."
@@ -81,8 +81,8 @@ harness = false
[features]
markdown = ["dep:pulldown-cmark"]
rust-tokenizers = ["dep:rust_tokenizers"]
-tokenizers = ["dep:tokenizers"]
tiktoken-rs = ["dep:tiktoken-rs"]
+tokenizers = ["dep:tokenizers"]

[lints.rust]
future_incompatible = "warn"
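The version bump and feature list above imply how a downstream crate would opt in to the new tokenizer support. A minimal sketch of a consumer's `Cargo.toml`, assuming the crate name `text-splitter` and the dependency version listed in the README table below:

```toml
[dependencies]
# "rust-tokenizers" is the Cargo feature shown in the diff above;
# the underlying crate it pulls in is named rust_tokenizers.
text-splitter = { version = "0.12.1", features = ["rust-tokenizers"] }
rust_tokenizers = "8.1.1"
```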
1 change: 1 addition & 0 deletions README.md
@@ -170,6 +170,7 @@ There are lots of methods of determining sentence breaks, all to varying degrees

| Dependency Feature | Version Supported | Description |
| ------------------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `rust_tokenizers` | `8.1.1` | Enables `(Text/Markdown)Splitter::new` to take any of the provided tokenizers as an argument. |
| `tiktoken-rs` | `0.5.8` | Enables `(Text/Markdown)Splitter::new` to take `tiktoken_rs::CoreBPE` as an argument. This is useful for splitting text for OpenAI models. |
| `tokenizers`       | `0.19.1`          | Enables `(Text/Markdown)Splitter::new` to take `tokenizers::Tokenizer` as an argument. This is useful for splitting text for models that have a Hugging Face-compatible tokenizer. |
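
The new `rust_tokenizers` row can be illustrated with a rough sketch. This is not part of the diff: the exact `TextSplitter` API, the `BertTokenizer` loader arguments, and the vocab path are assumptions based only on the README's claim that the splitter constructor accepts a provided tokenizer as an argument.

```rust
// Sketch only (requires the `rust-tokenizers` feature); API details are assumed.
use rust_tokenizers::tokenizer::BertTokenizer;
use text_splitter::TextSplitter;

fn main() {
    // Hypothetical vocab path; the lowercase/strip-accents flags are illustrative.
    let tokenizer = BertTokenizer::from_file("path/to/vocab.txt", false, false)
        .expect("failed to load vocab");

    // The splitter measures chunk size in tokens produced by this tokenizer.
    let splitter = TextSplitter::new(tokenizer);

    // Split into chunks of at most 100 tokens each.
    let chunks: Vec<&str> = splitter.chunks("Some text to split.", 100).collect();
    println!("{chunks:?}");
}
```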

