Merge branch 'main' into tests
alexandra-valeanu authored Dec 8, 2024
2 parents 1e4f231 + 650bff8 commit 1b3cdcf
Showing 6 changed files with 44 additions and 51 deletions.
49 changes: 25 additions & 24 deletions docs/chapters/features/usage.rst
@@ -1,46 +1,47 @@
Import package and initialise Kinex
Import Package and Initialize Kinex
===================================

1. Import kinex
1. **Import Kinex**

.. code:: python
from kinex import Kinex
from kinex import Kinex
2. Read the scoring matrix
2. **Create a Kinex Object**

- With Predefined Matrices:

It will look in the resources to find the matrices. If it doesn't find them, it will download them and save them for future use.

.. code:: python
scoring_matrix_ser_thr = pd.read_csv("https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1", compression="gzip")
scoring_matrix_tyr = pd.read_csv("https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1", compression="gzip")
scoring_matrix_ser_thr
kinex = Kinex()
- With Your Custom Matrices:

.. code:: python
AAK1 ACVR2A ... YSK4 ZAK
0 -11.147481 -6.325340 ... -6.723077 -7.402360
1 -10.421859 -6.178601 ... -6.343452 -7.373478
... ... ... ...
82753 8.074270 7.289390 ... 4.525527 4.837377
82754 8.623180 7.871226 ... 4.869195 5.062391
scoring_matrix_ser_thr = pd.read_csv("path/to/scoring_matrix_ser_thr.csv")
scoring_matrix_tyr = pd.read_csv("path/to/scoring_matrix_tyr.csv")
[82755 rows x 303 columns]
kinex = Kinex(scoring_matrix_ser_thr, scoring_matrix_tyr)
.. note::

You can optionally save the scoring matrix locally for faster use in the future.
The matrix looks like this:

.. code:: python
scoring_matrix_ser_thr.to_csv("scoring_matrix_ser_thr.csv")
scoring_matrix_tyr.to_csv("scoring_matrix_tyr.csv")
AAK1 ACVR2A ... YSK4 ZAK
0 -11.147481 -6.325340 ... -6.723077 -7.402360
1 -10.421859 -6.178601 ... -6.343452 -7.373478
... ... ... ...
82753 8.074270 7.289390 ... 4.525527 4.837377
82754 8.623180 7.871226 ... 4.869195 5.062391
Or just download using the links:
`https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1 <https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1>`_
`https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1 <https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1>`_

3. Create a kinex object
.. note::

.. code:: python
Predefined matrices can be found here:

kinex = Kinex(scoring_matrix_ser_thr, scoring_matrix_tyr)
- `Scoring Matrix for Serine/Threonine <https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1>`_
- `Scoring Matrix for Tyrosine <https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1>`_
6 changes: 1 addition & 5 deletions pyproject.toml
@@ -20,7 +20,7 @@ dependencies = [
"scikit-learn",
"umap-learn",
"importlib-resources",
"tomli",
"requests",
]

[project.optional-dependencies]
@@ -29,10 +29,6 @@ dev = [
"furo"
]

[project.urls]
scoring_matrix_ser_thr = "https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1"
scoring_matrix_tyr = "https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1"


[tool.setuptools]
package-dir = { "" = "src" }
3 changes: 1 addition & 2 deletions requirements.txt
@@ -6,5 +6,4 @@ statsmodels
plotly
scikit-learn
umap-learn
importlib-resources
tomli
importlib-resources
25 changes: 6 additions & 19 deletions src/kinex/kinex.py
@@ -1,19 +1,13 @@
import bisect
from collections import namedtuple
from functools import reduce
import tomli

import numpy as np
import pandas as pd

from kinex.enrichment import Enrichment
from kinex.functions import download_file_to_resource
from kinex.resources import (
get_pssm_ser_thr,
get_pssm_tyr,
get_scoring_matrix_ser_thr,
get_scoring_matrix_tyr,
)
from kinex.resources import get_pssm_ser_thr, get_pssm_tyr, get_scoring_matrix_ser_thr, get_scoring_matrix_tyr, get_configuration_file
from kinex.score import Score
from kinex.enrichment import Enrichment
from kinex.sequence import get_sequence_object, SequenceType
@@ -23,8 +17,7 @@


# Load the pyproject.toml file
with open("pyproject.toml", "rb") as f:
config = tomli.load(f)
config = get_configuration_file()


class Kinex:
@@ -99,21 +92,15 @@ def __init__(
scoring_matrix_ser_thr = get_scoring_matrix_ser_thr()
# Matrix is not provided and not found in the resources, download the default matrix
if scoring_matrix_ser_thr is None:
scoring_matrix_ser_thr_url = config["project"]["urls"][
"scoring_matrix_ser_thr"
]
download_file_to_resource(
scoring_matrix_ser_thr_url, "default_scoring_matrix_ser_thr.csv.gz"
)
scoring_matrix_ser_thr_url = config["urls"]["scoring_matrix_ser_thr"]
download_file_to_resource(scoring_matrix_ser_thr_url, 'default_scoring_matrix_ser_thr.csv.gz')
scoring_matrix_ser_thr = get_scoring_matrix_ser_thr()

if scoring_matrix_tyr is None:
scoring_matrix_tyr = get_scoring_matrix_tyr()
if scoring_matrix_tyr is None:
scoring_matrix_tyr_url = config["project"]["urls"]["scoring_matrix_tyr"]
download_file_to_resource(
scoring_matrix_tyr_url, "default_scoring_matrix_tyr.csv.gz"
)
scoring_matrix_tyr_url = config["urls"]["scoring_matrix_tyr"]
download_file_to_resource(scoring_matrix_tyr_url, 'default_scoring_matrix_tyr.csv.gz')
scoring_matrix_tyr = get_scoring_matrix_tyr()

self.pssm_ser_thr = pssm_ser_thr
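Note on the fallback in `__init__` above: each matrix is looked up in the package resources first and is downloaded only on a miss, after which the lookup is retried. A minimal, stdlib-only sketch of that look-up-then-download pattern is given here. The names `read_cached`, `load_or_fetch`, and `fake_fetch` are hypothetical and not part of kinex; `fake_fetch` merely stands in for `download_file_to_resource` so that the example runs without network access, and the final `FileNotFoundError` is an added safeguard that the diff itself does not show.

```python
import gzip
from pathlib import Path
from typing import Callable, Optional


def read_cached(path: Path) -> Optional[bytes]:
    """Return the decompressed contents of `path`, or None if the file is missing."""
    try:
        with gzip.open(path, "rb") as fh:
            return fh.read()
    except FileNotFoundError:
        return None


def load_or_fetch(path: Path, fetch: Callable[[Path], None]) -> bytes:
    """Try the local copy first; on a miss, call fetch(path) once and retry."""
    data = read_cached(path)
    if data is None:
        fetch(path)  # hypothetical stand-in for download_file_to_resource
        data = read_cached(path)
    if data is None:
        # Assumption: fail loudly if the download did not produce the file.
        raise FileNotFoundError(f"{path} is still missing after download")
    return data


def fake_fetch(path: Path) -> None:
    """Demo downloader: writes a tiny gzip file instead of using the network."""
    with gzip.open(path, "wb") as fh:
        fh.write(b"AAK1,ACVR2A\n-11.1,-6.3\n")
```

With this shape, a second call to `load_or_fetch` for the same path finds the cached file and never invokes the downloader.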
6 changes: 5 additions & 1 deletion src/kinex/resources/__init__.py
@@ -49,4 +49,8 @@ def get_scoring_matrix_tyr() -> pd.DataFrame:
with resources.files("kinex.resources").joinpath("default_scoring_matrix_tyr.csv.gz").open('rb') as file_path:
return pd.read_csv(file_path, compression='gzip')
except FileNotFoundError:
return None
return None

def get_configuration_file() -> dict:
with resources.files("kinex.resources").joinpath("config.json").open() as json_file:
return json.load(json_file)
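Note on `get_configuration_file`: the visible hunk calls `json.load` but shows no `import json`, so the module needs that import somewhere above the hunk for the function to work. The sketch below is a self-contained illustration of reading a JSON file bundled with a package, under some assumptions: it uses the standard-library `importlib.resources` (Python 3.9+), whereas the `resources` name in the diff may refer to the `importlib-resources` backport; the package name `demo_kinex_resources` and the `example.org` URL are placeholders, not kinex values; and the `package` and `name` parameters are added for the demo only.

```python
import importlib
import json
import sys
from importlib import resources
from pathlib import Path


def get_configuration_file(package: str, name: str = "config.json") -> dict:
    """Read a JSON file bundled inside `package` and return it as a dict."""
    with resources.files(package).joinpath(name).open("r", encoding="utf-8") as fh:
        return json.load(fh)


def make_demo_package(root: Path, package: str = "demo_kinex_resources") -> str:
    """Create a throwaway package under `root` that ships a config.json (demo only)."""
    pkg_dir = root / package
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("", encoding="utf-8")
    # Placeholder URL; the real values live in src/kinex/resources/config.json.
    config = {"urls": {"scoring_matrix_tyr": "https://example.org/tyr.csv.gz"}}
    (pkg_dir / "config.json").write_text(json.dumps(config), encoding="utf-8")
    sys.path.insert(0, str(root))  # make the new package importable
    importlib.invalidate_caches()  # let the import system notice the new directory
    return package
```

Because the file is resolved relative to the installed package rather than the current working directory, the lookup does not depend on where the interpreter was started, unlike the removed `open("pyproject.toml", "rb")`.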
6 changes: 6 additions & 0 deletions src/kinex/resources/config.json
@@ -0,0 +1,6 @@
{
"urls": {
"scoring_matrix_ser_thr": "https://zenodo.org/records/13964893/files/scoring_matrix_ser_thr_82k_sorted.csv.gz?download=1",
"scoring_matrix_tyr": "https://zenodo.org/records/13964893/files/scoring_matrix_tyr_7k_sorted.csv.gz?download=1"
}
}
