Loose tiktoken-rs version requirements (#28)
* Loose tiktoken-rs version requirements

* remove unused flag for update
benbrandt authored Jul 2, 2023
1 parent 1f1f06a commit 2f5f718
Showing 6 changed files with 41 additions and 13 deletions.
8 changes: 4 additions & 4 deletions .github/dependabot.yml
@@ -10,10 +10,10 @@ updates:
     schedule:
       interval: "daily"

-  # - package-ecosystem: "docker"
-  #   directory: "/"
-  #   schedule:
-  #     interval: "daily"
+  - package-ecosystem: "cargo"
+    directory: "/bindings/python"
+    schedule:
+      interval: "daily"

   - package-ecosystem: "github-actions"
     directory: "/"
14 changes: 14 additions & 0 deletions .github/workflows/ci.yml
@@ -97,3 +97,17 @@ jobs:
           toolchain: ${{ matrix.msrv }}
       - name: cargo +${{ matrix.msrv }} check
         run: cargo check
+
+  minimal-versions:
+    name: Check minimal versions
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v3
+      - uses: dtolnay/rust-toolchain@master
+        with:
+          toolchain: nightly
+      - uses: Swatinem/rust-cache@v1
+
+      - run: cargo update --workspace -Zdirect-minimal-versions
+      - run: cargo test --workspace --all-features
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
 # Changelog

+## v0.4.2
+
+### What's New
+
+- Loosen version requirements for peer dependencies (specifically, `tiktoken-rs` now supports `>=0.2.0, <0.6.0`)
+
 ## v0.4.1

 ### What's New
18 changes: 9 additions & 9 deletions Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "text-splitter"
-version = "0.4.1"
+version = "0.4.2"
 authors = ["Ben Brandt <[email protected]>"]
 edition = "2021"
 description = "Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens (when used with large language models)."
@@ -18,21 +18,21 @@ rustdoc-args = ["--cfg", "docsrs"]
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

 [dependencies]
-auto_enums = "0.8.0"
+auto_enums = "0.8.1"
 either = "1.8.1"
-itertools = "0.10.5"
-once_cell = "1.17.2"
-regex = "1.8.3"
-tiktoken-rs = { version = "0.4.2", optional = true }
-tokenizers = { version = "0.13.3", default_features = false, features = [
+itertools = "0.11.0"
+once_cell = "1.18.0"
+regex = "1.8.4"
+tiktoken-rs = { version = ">=0.2.0, <0.6.0", optional = true }
+tokenizers = { version = ">=0.13.3, <0.14.0", default_features = false, features = [
     "onig",
 ], optional = true }
 unicode-segmentation = "1.10.1"

 [dev-dependencies]
 fake = "2.6.1"
-insta = { version = "1.29.0", features = ["glob", "yaml"] }
-tokenizers = { version = "0.13.3", default-features = false, features = [
+insta = { version = "1.30.0", features = ["glob", "yaml"] }
+tokenizers = { version = ">=0.13.3, <0.14.0", default-features = false, features = [
     "onig",
     "http",
 ] }
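
Note on the new requirement: as a rough illustration of which versions a compound requirement such as `>=0.2.0, <0.6.0` accepts, here is a minimal Rust sketch. `parse_version` and `in_range` are hypothetical helpers written for this note; they are not part of Cargo or `text-splitter`. The sketch assumes plain `major.minor.patch` versions and ignores pre-release and build metadata, which Cargo's resolver handles with additional rules.

```rust
/// Hypothetical helper: parse a plain `major.minor.patch` string into a tuple.
/// Tuples compare lexicographically, which matches version ordering here.
/// Pre-release and build metadata are deliberately not handled.
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.trim().split('.').map(|p| p.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    // Reject trailing components such as "0.4.2.1".
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Hypothetical helper: `Some(true)` when `version` satisfies `>=lower, <upper`,
/// `None` when any of the three strings fails to parse.
fn in_range(version: &str, lower: &str, upper: &str) -> Option<bool> {
    let v = parse_version(version)?;
    Some(parse_version(lower)? <= v && v < parse_version(upper)?)
}
```

Under these assumptions, `in_range("0.4.2", "0.2.0", "0.6.0")` yields `Some(true)`, while `0.6.0` itself falls outside the range because the upper bound is exclusive.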
4 changes: 4 additions & 0 deletions README.md
@@ -28,6 +28,8 @@ let chunks = splitter.chunks("your document text", max_characters);

 ### With Huggingface Tokenizer

+Requires the `tokenizers` feature to be activated.
+
 ```rust
 use text_splitter::TextSplitter;
 // Can also use anything else that implements the ChunkSizer
@@ -45,6 +47,8 @@ let chunks = splitter.chunks("your document text", max_tokens);

 ### With Tiktoken Tokenizer

+Requires the `tiktoken-rs` feature to be activated.
+
 ```rust
 use text_splitter::TextSplitter;
 // Can also use anything else that implements the ChunkSizer
4 changes: 4 additions & 0 deletions src/lib.rs
@@ -29,6 +29,8 @@ let chunks = splitter.chunks("your document text", max_characters);

 ### With Huggingface Tokenizer

+Requires the `tokenizers` feature to be activated.
+
 ```rust
 use text_splitter::TextSplitter;
 // Can also use anything else that implements the ChunkSizer
@@ -46,6 +48,8 @@ let chunks = splitter.chunks("your document text", max_tokens);

 ### With Tiktoken Tokenizer

+Requires the `tiktoken-rs` feature to be activated.
+
 ```rust
 use text_splitter::TextSplitter;
 // Can also use anything else that implements the ChunkSizer