Release v0.0.12 #16

Merged 40 commits on May 15, 2024.

Commits:
096c294
swap the function for an implementation of the Display trait in AIlist
nleroy917 Apr 17, 2024
e7f64d0
start with adding anyhow dep
nleroy917 Apr 17, 2024
27e4279
update error handling
nleroy917 Apr 17, 2024
893db11
more error handling
nleroy917 Apr 17, 2024
28e2913
more error handling updates
nleroy917 Apr 17, 2024
1a53724
dynamic file reading
nleroy917 Apr 17, 2024
e01ae47
rearranging
nleroy917 Apr 17, 2024
db362a1
update the python bindings to support new changes
nleroy917 Apr 17, 2024
49205e1
working on more flexibility and traits
nleroy917 Apr 17, 2024
5554b3f
workon tokenized Region set
nleroy917 Apr 17, 2024
6c27aea
a tokenized region is just an id and a universe pointer. a tokenized …
nleroy917 Apr 17, 2024
7847656
store token id on interval for faster retrieval
nleroy917 Apr 17, 2024
8af47e5
work on python bindings a bit
nleroy917 Apr 18, 2024
a266928
getting there...
nleroy917 Apr 18, 2024
79c015e
more APIs
nleroy917 Apr 18, 2024
1def347
update API a bit
nleroy917 Apr 18, 2024
d392736
bump pyo3 stuff
nleroy917 Apr 18, 2024
2bdedf7
getting close...
nleroy917 Apr 19, 2024
5ad3650
update API more
nleroy917 Apr 19, 2024
a9d91db
more work on the API
nleroy917 Apr 19, 2024
573c781
start type stubs
nleroy917 Apr 19, 2024
92ffb5a
type stubs for tokenizers
nleroy917 Apr 19, 2024
bdd5aeb
finish type stubs
nleroy917 Apr 19, 2024
f8f1257
bump version, changelog, typos
nleroy917 Apr 19, 2024
98ac517
implement super basic bbclient in Rust... use it to instantiate token…
nleroy917 Apr 21, 2024
1455ac2
work on from_pretrained API's
nleroy917 Apr 21, 2024
e398f97
work on anndata tokenizer in Rust
nleroy917 Apr 21, 2024
ee655c6
remove SingleCellTokenizer -- we are not ready
nleroy917 Apr 22, 2024
7a8fc9a
add more functions for RegionSet inside genimtools
nleroy917 Apr 22, 2024
f1fda42
add region set class
nleroy917 Apr 22, 2024
2077efe
work on type stubs for bindings
nleroy917 Apr 22, 2024
48d56f8
Merge pull request #13 from databio/tokenization_updates
nleroy917 Apr 22, 2024
ca69aac
remove empty consts module
nleroy917 Apr 22, 2024
06db024
remove polars dependency since it was doing nothing but bloating the …
nleroy917 Apr 22, 2024
6cb0fb6
add From<&[u8]> implementation for in-memory instantiation
nleroy917 Apr 22, 2024
b3af86d
add test for rs from bytes
nleroy917 Apr 22, 2024
8ed0e88
start soft tokenizer
nleroy917 Apr 22, 2024
a25c94f
better logo
nleroy917 Apr 23, 2024
a5fc1f8
tweaks to fit with geniml
nleroy917 May 10, 2024
ef3ac1b
Merge pull request #17 from databio/soft_tokenizer
nleroy917 May 10, 2024
2 changes: 1 addition & 1 deletion README.md
@@ -2,7 +2,7 @@
<img src="genimtools/docs/logo.svg" alt="genimtools logo" height="100px">
</h1>

`genimtools` is a rust crate that provides a set of tools for working with genomic interval data. Its primary goal is to provide processors for our python package, [`geniml`](https:github.com/databio/geniml), a libary for machine learning on genomic intervals. However, it can be used as a standalone library for working with genomic intervals as well.
`genimtools` is a rust crate that provides a set of tools for working with genomic interval data. Its primary goal is to provide processors for our python package, [`geniml`](https:github.com/databio/geniml), a library for machine learning on genomic intervals. However, it can be used as a standalone library for working with genomic intervals as well.

`genimtools` provides three things:

8 changes: 6 additions & 2 deletions bindings/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "genimtools-py"
version = "0.0.11"
version = "0.0.12"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
@@ -9,5 +9,9 @@ name = "genimtools"
crate-type = ["cdylib"]

[dependencies]
anyhow = "1.0.82"
genimtools = { path = "../genimtools" }
pyo3 = "0.20.0"
pyo3 = { version = "0.21", features=["anyhow", "extension-module"] }
numpy = "0.21"
# pyo3-tch = { git = "https://github.com/LaurentMazare/tch-rs" }
# torch-sys = { git = "https://github.com/LaurentMazare/tch-rs" }
2 changes: 1 addition & 1 deletion bindings/genimtools/__init__.py
@@ -1 +1 @@
from .genimtools import *
from .genimtools import * # noqa: F403
12 changes: 1 addition & 11 deletions bindings/genimtools/__init__.pyi
@@ -1,11 +1 @@
PAD_CHR: str
PAD_START: int
PAD_END: int

MASK_CHR: str
MASK_START: int
MASK_END: int

UNKNOWN_CHR: str
UNKNOWN_START: int
UNKNOWN_END: int
__version__: str
2 changes: 1 addition & 1 deletion bindings/genimtools/ailist/__init__.py
@@ -1 +1 @@
from .genimtools.ailist import *
from .genimtools.ailist import * # noqa: F403
47 changes: 36 additions & 11 deletions bindings/genimtools/ailist/__init__.pyi
@@ -1,25 +1,50 @@
from typing import List

class Interval:
start: int
end: int
def __init__(self, start: int, end: int):
"""
Represents a range of values.
"""

def __new__(cls, start: int, end: int) -> Interval:
"""
Create a new Interval object.

:param start: The start of the interval.
:param end: The end of the interval.
"""

@property
def start(self) -> int:
"""
The start of the interval.
"""
Create a new Interval.

@property
def end(self) -> int:
"""
The end of the interval.
"""

def __repr__(self) -> str: ...

class AIList:
def __init__(self, intervals: List[Interval], minimum_coverage_length: int) -> AIList:
"""
Create a new AIList.
"""
The augmented interval list (AILIST) object.

This object will compute region overlaps very efficiently.
"""

:param intervals: The list of intervals to use.
:param minimum_coverage_length: The minimum number of intervals that are covered before being decomposed into another list.
def __new__(cls, intervals: List[Interval], minimum_coverage_length: int = None) -> AIList:
"""
Create a new AIList object.

:param intervals: A list of intervals.
:param minimum_coverage_length: The minimum length of the coverage.
"""

def query(self, interval: Interval) -> List[Interval]:
"""
Query the AIList for the intervals that overlap with the given interval.
Query the AIList object for overlapping intervals.

:param interval: The interval to query.
"""
"""
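
The stub above describes an `Interval` with `start` and `end` properties and an `AIList.query` that returns overlapping intervals. As a rough illustration of that query contract only, here is a pure-Python sketch. It does not use the compiled `genimtools` module and does not implement the augmented interval list decomposition. `NaiveIntervalList` is a hypothetical stand-in, and the half-open `[start, end)` overlap rule is an assumption, since the stub does not state which convention the library uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    """Pure-Python stand-in for the stubbed Interval (not the genimtools class)."""

    start: int
    end: int


class NaiveIntervalList:
    """Hypothetical linear-scan stand-in for AIList.

    It mirrors the shape of `query` from the stub, but checks every
    interval instead of using the AIList decomposition.
    """

    def __init__(self, intervals: List[Interval]) -> None:
        # Sort by start so results come back in a stable order.
        self._intervals = sorted(intervals, key=lambda iv: iv.start)

    def query(self, interval: Interval) -> List[Interval]:
        # Assumption: intervals are half-open, [start, end).
        return [
            iv
            for iv in self._intervals
            if iv.start < interval.end and interval.start < iv.end
        ]


if __name__ == "__main__":
    regions = NaiveIntervalList(
        [Interval(1, 5), Interval(10, 20), Interval(4, 12)]
    )
    print(regions.query(Interval(4, 6)))
```

A linear scan costs O(n) per query. The stub's docstring says the real `AIList` computes overlaps "very efficiently", so treat this sketch as a reference for expected results rather than for performance.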
2 changes: 1 addition & 1 deletion bindings/genimtools/tokenizers/__init__.py
@@ -1 +1 @@
from .genimtools.tokenizers import *
from .genimtools.tokenizers import * # noqa: F403