Python dev #8

Merged
merged 193 commits into from
Sep 20, 2023
193 commits
6eb0285
argparse for required inputes
bkmarzouk Aug 17, 2023
890f3d9
missing required input added
bkmarzouk Aug 17, 2023
35f1c5f
optional args parsing
bkmarzouk Aug 17, 2023
91f5f5f
promote str input to PosixPath where appropriate
bkmarzouk Aug 17, 2023
47ccd75
proto validation check for input paths
bkmarzouk Aug 17, 2023
71580c1
pass parsed args to main as kwargs
bkmarzouk Aug 17, 2023
7cb8d08
simple check to validate input path
bkmarzouk Aug 17, 2023
c1322a0
rename ssb_selection script
bkmarzouk Aug 18, 2023
fbef219
ruff cache
bkmarzouk Aug 18, 2023
43f5489
more type checks on _validate_input_path
bkmarzouk Aug 18, 2023
695b68a
reduce complexity of checks
bkmarzouk Aug 18, 2023
d85ee60
test _validate_input_path
bkmarzouk Aug 18, 2023
5a20464
rename of dest: bed_regions -> target
bkmarzouk Aug 18, 2023
e5bcca6
init prepare coordinates
bkmarzouk Aug 18, 2023
96cca2c
filter_bed_file_transcript dev
bkmarzouk Aug 18, 2023
82d5216
dummy functional methods for stage 2
bkmarzouk Aug 21, 2023
bcaeca1
skeleton functions for STEP 2
bkmarzouk Aug 21, 2023
4a0ae26
rename unit tests dir
bkmarzouk Aug 22, 2023
aaa2037
init setup conda py script
bkmarzouk Aug 22, 2023
4d6738d
init setup conda py script
bkmarzouk Aug 22, 2023
3f192bc
test config dir
bkmarzouk Aug 22, 2023
9e306c2
local env yaml
bkmarzouk Aug 22, 2023
ae9f757
update tests for setup_conda.py
bkmarzouk Aug 22, 2023
9563a71
update condarc methods
bkmarzouk Aug 22, 2023
73754c6
test _create_new_condarc
bkmarzouk Aug 22, 2023
2d4aceb
test _update_condarc
bkmarzouk Aug 22, 2023
2782cf5
add pytest-dependency
bkmarzouk Aug 22, 2023
5ea8f43
use subprocess to determine conda usage
bkmarzouk Aug 22, 2023
12bf2b9
init conda env build tests with dependencies
bkmarzouk Aug 22, 2023
7fe1719
add dependencies for test_prepare_condarc
bkmarzouk Aug 22, 2023
d7b4426
ignore local env build dir
bkmarzouk Aug 22, 2023
7274fc3
(minor style fix) mem issue in formatting
bkmarzouk Aug 22, 2023
d660c69
in test assert conda OR mamba
bkmarzouk Aug 22, 2023
7365781
setup shell script to build conda env and install local editable project
bkmarzouk Aug 23, 2023
af72ae7
local yml file for installation updated. Strip python deps (other tha…
bkmarzouk Aug 23, 2023
7b14ea2
flag removals
bkmarzouk Aug 23, 2023
4904ec6
add bedtools & vep to conda env recipe
bkmarzouk Aug 23, 2023
c03f2ca
bash installers
bkmarzouk Aug 23, 2023
8cb930e
setup.sh update -> source environment constructor from src/
bkmarzouk Aug 23, 2023
cb68dfc
test running soprano env
bkmarzouk Aug 23, 2023
b2f20ea
vep installation check
bkmarzouk Aug 23, 2023
15279ec
bedtools installation check
bkmarzouk Aug 23, 2023
2b6274f
generalise method for detecting whether exec found & add secondary ch…
bkmarzouk Aug 23, 2023
6408b90
rename sh_utils dir -> env_utils
bkmarzouk Aug 25, 2023
001e91f
src/SOPRANO/env_utils/setup_conda.py -> src/SOPRANO/env_utils/config_…
bkmarzouk Aug 25, 2023
4d05848
remove redundent condarc efforts
bkmarzouk Aug 25, 2023
b3422e1
mark dependencies for conda installations
bkmarzouk Aug 25, 2023
f3813c1
add cmd line option for installation (dev or ci). Add deps for ci in …
bkmarzouk Aug 25, 2023
999c5c5
update classifiers
bkmarzouk Aug 25, 2023
05f1d5c
lint test workflow
bkmarzouk Aug 25, 2023
fe49119
update yaml: fix ruff syntax + add branch info
bkmarzouk Aug 25, 2023
c9d44f1
attempt conda env build and testing
bkmarzouk Aug 25, 2023
03ea9e1
missing checkout step added
bkmarzouk Aug 25, 2023
477e777
update yaml: add x perms to shell scripts
bkmarzouk Aug 25, 2023
d5a30a1
avoid shell script: build env using actions
bkmarzouk Aug 25, 2023
6170be9
try micromamba instead of miniconda & ensure env is activated in addi…
bkmarzouk Aug 25, 2023
6c2c01e
omit potentially redundent dependencies in local.yml
bkmarzouk Aug 25, 2023
7719a68
mamba -> micromamba cmd
bkmarzouk Aug 25, 2023
24405bd
add init-shell in context
bkmarzouk Aug 25, 2023
e664d55
ensure source ~/.bashrc & echo ~/.bashrc before (see if written as ex…
bkmarzouk Aug 25, 2023
bb7180a
update shell for pytest steps. Remove activation step: Doesn't persis…
bkmarzouk Aug 25, 2023
0d8a85b
Merge branch 'python' into python-dev
bkmarzouk Aug 25, 2023
d5263ba
Merge branch 'python' into python-dev
bkmarzouk Aug 25, 2023
5703174
minor: dbl quotes in eval
bkmarzouk Aug 25, 2023
b10c0ce
add black style checks
bkmarzouk Aug 25, 2023
604e3f6
add sh utils dir
bkmarzouk Aug 25, 2023
bb04446
init testing for _filter_transcript_file
bkmarzouk Aug 25, 2023
7a22ad4
more details in docstring for _filter_transcript_file
bkmarzouk Aug 25, 2023
66c4a6d
subprocess pipe
bkmarzouk Aug 25, 2023
8a0411f
rename sub_pipe -> subprocess_pipes
bkmarzouk Aug 26, 2023
fe0b5e5
dev and test output handlers for completed subprocesses
bkmarzouk Aug 26, 2023
d66b54a
dev and test and pipe wrapper for subprocesses
bkmarzouk Aug 26, 2023
4b5d173
dev and test _filter_transcript_file method
bkmarzouk Aug 26, 2023
1c0c5bb
dev test filter transcript files
bkmarzouk Aug 26, 2023
9651e8f
tmp ignore tmp dir
bkmarzouk Aug 26, 2023
6f021c7
_define_excluded_regions_for_randomization
bkmarzouk Aug 27, 2023
1cd27f3
init sort excluded regs for rand
bkmarzouk Aug 27, 2023
d55a891
path objects implementation
bkmarzouk Aug 29, 2023
4cddd1f
dev and (mostly) test randomization prep
bkmarzouk Aug 31, 2023
d42c259
entry point
bkmarzouk Aug 31, 2023
ccb18e1
bug fix in filter processing
bkmarzouk Aug 31, 2023
8c451e1
add pytest fixtures for content + helper methods for checking written…
bkmarzouk Aug 31, 2023
3bf0f53
Merge branch 'excl-pos' into python-dev
bkmarzouk Aug 31, 2023
3aa40fc
apply fixtures to test__define_excluded_regions_for_randomization
bkmarzouk Aug 31, 2023
941bdfd
apply fixtures to test__sort_excluded_regions_for_randomization
bkmarzouk Aug 31, 2023
a47f55b
fixtures applied to test__randomize_with_target_file
bkmarzouk Aug 31, 2023
d401ab4
apply fixtures to test__non_randomized
bkmarzouk Aug 31, 2023
7bd7e1e
Merge branch 'excl-pos' into python-dev
bkmarzouk Aug 31, 2023
2cc0b45
exclusion of positively selected genes implemented and tested
bkmarzouk Sep 1, 2023
4ce4172
develop and test get_protein complement
bkmarzouk Sep 1, 2023
6cc1985
prep ssb192
bkmarzouk Sep 1, 2023
5848d9c
transform coordinates implemented
bkmarzouk Sep 4, 2023
5dce060
move examples into src dir. Prepare integration test config. Parsing …
bkmarzouk Sep 4, 2023
6e28a97
introduce Parameters object for pipieline control
bkmarzouk Sep 5, 2023
ac8ecd9
introduce class methods for randomization protocol that depend on Par…
bkmarzouk Sep 5, 2023
d9e9283
update signature and init of transrcipts in Parameters object
bkmarzouk Sep 5, 2023
3a487fc
build Parameters object in main
bkmarzouk Sep 5, 2023
4b7cda6
minor refactoring of existing methods. more complete treatment of inp…
bkmarzouk Sep 5, 2023
10d9405
refactor slightly for improved readability. updated variable names in…
bkmarzouk Sep 5, 2023
9cf834b
update tests with refactored names, vars, funcs, etc
bkmarzouk Sep 5, 2023
792771a
include input annoted file path in AnalysisPaths object
bkmarzouk Sep 5, 2023
b7df74e
clean up cli methods: remove protected method indicators and unify no…
bkmarzouk Sep 5, 2023
d031e1b
integrate seed values for shuffling
bkmarzouk Sep 5, 2023
1aff90d
more explicit line length for isort in toml
bkmarzouk Sep 5, 2023
e8acd62
update defs in integration test following refactoring
bkmarzouk Sep 5, 2023
d56b437
update pre-commit versioning (ruff)
bkmarzouk Sep 5, 2023
e8a3a8a
add excluded genes objects
bkmarzouk Sep 5, 2023
b50c5e1
integrate gene exclusions into pipeline and test
bkmarzouk Sep 5, 2023
ca6d95e
update workflow to include integration tests
bkmarzouk Sep 5, 2023
7e83257
implement class method for filtering
bkmarzouk Sep 5, 2023
f9d4fd1
make application of design pattern more consistent. Apply to gene exc…
bkmarzouk Sep 5, 2023
732bc6d
minor name update
bkmarzouk Sep 5, 2023
03fa944
apply pipeline pattern to complement, ssb and intra epitope methods a…
bkmarzouk Sep 5, 2023
781efd6
update docstrings and tidy comments in prep coords
bkmarzouk Sep 5, 2023
62f5c75
update name of test file for detection at directory level
bkmarzouk Sep 5, 2023
0e5a255
fix indentation levels for easier readiblity
bkmarzouk Sep 5, 2023
39eeb45
methods for decompressing ensemble transcript ids
bkmarzouk Sep 5, 2023
4f08edb
rename integration test since caused conflict with units file name
bkmarzouk Sep 5, 2023
1a037c5
update TranscriptPaths object to include ensembl transcript ids
bkmarzouk Sep 5, 2023
838242f
add paths for *cds.fasta files. add paths for transcript:regions to e…
bkmarzouk Sep 6, 2023
5af9f05
update defs for epitopes and intra epitope paths. Added missing def
bkmarzouk Sep 6, 2023
ae46451
fix incorrectr path def for production of intra_epitopes_cds
bkmarzouk Sep 6, 2023
e496e5d
more path checks on prep coords
bkmarzouk Sep 6, 2023
59aa2ba
refactor test fixtures into seperate file and extract helper methods …
bkmarzouk Sep 6, 2023
741ce16
extract pipeline component object and missing data error into pipelin…
bkmarzouk Sep 6, 2023
d4182c0
integrate ObtainFastaRegions with fiducial test and fixtures
bkmarzouk Sep 6, 2023
1d97cd3
rename integration test case to reference TCGA-05-4396
bkmarzouk Sep 6, 2023
1cee87c
implement GetTranscriptRegionsForSites
bkmarzouk Sep 6, 2023
b9a66d6
fix bug in writing of intra fasta file
bkmarzouk Sep 11, 2023
8a91d5d
estimate subs ssb192
bkmarzouk Sep 11, 2023
f1f4a33
raise runtime error for subprocess steps with non-zero exit
bkmarzouk Sep 11, 2023
5390088
replace buggy method for obtaining transcript:regions
bkmarzouk Sep 11, 2023
1d8b0c1
compute sums over all possible target and non target region sites
bkmarzouk Sep 11, 2023
74cea6e
FixSimulated and ColumnCorrect methods
bkmarzouk Sep 11, 2023
efff3d7
entry point for downloading reference genome files
bkmarzouk Sep 11, 2023
e147154
update parameters to include attribute containing GenomePaths object
bkmarzouk Sep 11, 2023
921cd3c
path fix for ref genomes in frozen data instances
bkmarzouk Sep 11, 2023
7b4503b
Implement context corrections for ssb192
bkmarzouk Sep 11, 2023
d3ad000
add fasta index file for GRCh37
bkmarzouk Sep 11, 2023
f6ecc05
Implement FlagComputations
bkmarzouk Sep 11, 2023
bd58132
Implement TripletCounts
bkmarzouk Sep 11, 2023
329e24d
note in master shell script: code block looks redundent
bkmarzouk Sep 11, 2023
866b824
SiteCorrections
bkmarzouk Sep 12, 2023
fe64f19
typo fix
bkmarzouk Sep 12, 2023
48f16f2
omit integration tests from pipeline (tmp) - these have too many data…
bkmarzouk Sep 12, 2023
d76603c
Merge pull request #3 from instituteofcancerresearch/analysis-ssb192
bkmarzouk Sep 12, 2023
08a7480
IntersectByFrequency implemented
bkmarzouk Sep 12, 2023
0a87a9f
GetSilentCounts
bkmarzouk Sep 12, 2023
937d9db
GetNonSilentCounts
bkmarzouk Sep 12, 2023
7ca5431
GetMissenseCounts
bkmarzouk Sep 12, 2023
aa1a963
GetIntronicCounts
bkmarzouk Sep 12, 2023
ce2112a
fix output of intro mutation count
bkmarzouk Sep 12, 2023
3a2e47b
OnOffCounts
bkmarzouk Sep 12, 2023
5978835
BuildEpitopesDataFile
bkmarzouk Sep 12, 2023
bf05732
Merge pull request #4 from instituteofcancerresearch/step5
bkmarzouk Sep 12, 2023
9451b31
CheckTargetMutations - raise error if None found in target region
bkmarzouk Sep 13, 2023
ab11c8e
ComputeIntronRate
bkmarzouk Sep 13, 2023
602d740
reimplementation of_build_flag_file due to bug in computing GOOD/BAD
bkmarzouk Sep 13, 2023
b79b848
explanation of +1 in awk command
bkmarzouk Sep 13, 2023
623ed25
do not cleanup tmp files for debugging purposes. also added local pat…
bkmarzouk Sep 13, 2023
e7df483
minor notes in e2e test case for testing (tmp)
bkmarzouk Sep 13, 2023
d829b1f
minor note on default behaviour
bkmarzouk Sep 13, 2023
4011ee9
Data prepocessing for intronic dnds calculation
bkmarzouk Sep 14, 2023
bf21283
_compute_mutation_counts
bkmarzouk Sep 14, 2023
fb8db00
ignore output file (used for local tests)
bkmarzouk Sep 14, 2023
dc972bf
_define_variables
bkmarzouk Sep 14, 2023
bd70b3b
implment KaKs computations, define variables and combined quantities,…
bkmarzouk Sep 14, 2023
40e6d84
_compute_conf_interval
bkmarzouk Sep 14, 2023
6874cdf
implement confidence and pval computations
bkmarzouk Sep 14, 2023
7bd81d7
dnds calculations in place
bkmarzouk Sep 15, 2023
09c100c
move methods from src/SOPRANO/calculate_KaKsEpiCorrected_Cl_intron.py…
bkmarzouk Sep 18, 2023
2ce8bae
move path defs for tcga ssb192 test case into fixture
bkmarzouk Sep 18, 2023
2e03fb2
Merge pull request #5 from instituteofcancerresearch/step6
bkmarzouk Sep 18, 2023
f925db6
add pandas to dependencies file
bkmarzouk Sep 18, 2023
034f4cd
pre-computed fai and chrom files for GRCh37/38 for ensembl release 110
bkmarzouk Sep 19, 2023
d85a723
delete old file versions (109)
bkmarzouk Sep 19, 2023
583d8a6
update fixture for change in auxfile which now uses intron file
bkmarzouk Sep 19, 2023
08624dc
provide values if assert fails in check_expected_content(...)
bkmarzouk Sep 19, 2023
a90bbec
update directory config and download procedure / pre-processing of ge…
bkmarzouk Sep 19, 2023
3fabf85
Update capitalization in ref genomes
bkmarzouk Sep 19, 2023
262c8c1
add coreutils (GNU) for osx as dependency
bkmarzouk Sep 19, 2023
dcd1eda
generalise downloader & include option to download primary assembly
bkmarzouk Sep 19, 2023
884c2d0
add 109 GRCh37 fai and chrom files
bkmarzouk Sep 19, 2023
2ed8cb7
add ensemble_transcriptID.translated.fasta index file
bkmarzouk Sep 19, 2023
d2a7998
add fai files for GRCh37/38 rel 110
bkmarzouk Sep 19, 2023
387e2f2
minor note
bkmarzouk Sep 20, 2023
e882256
homo sapiens vcf inputs (for testing)
bkmarzouk Sep 20, 2023
f375c35
init vep annotation script
bkmarzouk Sep 20, 2023
cebad7c
Merge pull request #6 from instituteofcancerresearch/parse-vcf
bkmarzouk Sep 20, 2023
ac073ac
test workflow tests on macos
bkmarzouk Sep 20, 2023
1c6f66e
update tests to run on osx-actions branch (tmp)
bkmarzouk Sep 20, 2023
81bdbbf
update workflow files
bkmarzouk Sep 20, 2023
c52cbaa
Merge pull request #7 from instituteofcancerresearch/osx-actions
bkmarzouk Sep 20, 2023
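
Several commits above (66c4a6d, d66b54a, f1f4a33) describe a pipe wrapper around subprocesses that raises a runtime error on a non-zero exit. The PR's actual implementation is not shown on this page, so the following is only a minimal sketch of that idea. The function name `run_pipeline` is hypothetical, and for simplicity the commands run one after another with each stdout passed to the next stdin, rather than as a truly concurrent shell pipe.

```python
import subprocess
import sys


def run_pipeline(*commands: list[str]) -> str:
    """Run commands in order, feeding each stdout into the next stdin.

    Hypothetical helper: raises RuntimeError as soon as any step
    exits with a non-zero return code.
    """
    data = None  # the first command receives no stdin
    for cmd in commands:
        result = subprocess.run(
            cmd, input=data, capture_output=True, text=True
        )
        if result.returncode != 0:
            raise RuntimeError(
                f"{' '.join(cmd)!r} exited with code "
                f"{result.returncode}: {result.stderr.strip()}"
            )
        data = result.stdout
    return data or ""


if __name__ == "__main__":
    py = sys.executable
    # Emulates `printf 'chr2\nchr1\n' | sort` using only the interpreter.
    emit = [py, "-c", "print('chr2'); print('chr1')"]
    sort = [py, "-c", "import sys; sys.stdout.writelines(sorted(sys.stdin))"]
    print(run_pipeline(emit, sort), end="")
```

Failing early on a non-zero exit keeps a broken intermediate step (for example a bedtools call) from silently producing an empty file for the next stage.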
17 changes: 10 additions & 7 deletions .github/workflows/test.yml → .github/workflows/dev_tests.yml
@@ -1,12 +1,12 @@
-name: SOPRANO Tests
+name: SOPRANO (Dev) Tests

 on:
   push:
     branches:
       - python-dev
   pull_request:
     branches:
-      - python
+      - python-dev

 jobs:
   lint:
@@ -22,17 +22,20 @@ jobs:
         uses: actions/setup-python@v4
         with:
           python-version: ${{ matrix.python-version }}
-      - name: Install ruff
+      - name: Install ruff and black
         run: |
           python -m pip install --upgrade pip
-          pip install ruff
-      - name: Lint with ruff
+          pip install ruff black
+      - name: Lint check with ruff
         run: |
           ruff --format=github --target-version=py311 --line-length 79 .
+      - name: Style check with black
+        run: |
+          black ./ --check --line-length 79
   test:
     strategy:
       matrix:
-        os: ["ubuntu-latest"] # To include macos-latest
+        os: ["ubuntu-latest"]
         python-version: ["3.11"]
     name: test with python ${{ matrix.python-version }} on ${{ matrix.os }}
     runs-on: ${{ matrix.os }}
@@ -56,4 +59,4 @@ jobs:
         shell: bash -l {0}
         run: |
           micromamba activate soprano-dev
-          pytest tests/test_units
+          pytest tests/test_units
69 changes: 69 additions & 0 deletions .github/workflows/main_tests.yml
@@ -0,0 +1,69 @@
+name: SOPRANO (Main) Tests
+
+on:
+  push:
+    branches:
+      - python
+      - master
+  pull_request:
+    branches:
+      - python
+      - master
+
+jobs:
+  lint:
+    strategy:
+      matrix:
+        os: ["ubuntu-latest"]
+        python-version: ["3.11"]
+    name: lint with python ${{ matrix.python-version }} on ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install ruff and black
+        run: |
+          python -m pip install --upgrade pip
+          pip install ruff black
+      - name: Lint check with ruff
+        run: |
+          ruff --format=github --target-version=py311 --line-length 79 .
+      - name: Style check with black
+        run: |
+          black ./ --check --line-length 79
+  test:
+    strategy:
+      matrix:
+        os: ["ubuntu-latest", "macos-latest"]
+        python-version: ["3.11"]
+    name: test with python ${{ matrix.python-version }} on ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v3
+      - uses: mamba-org/setup-micromamba@v1
+        with:
+          environment-file: src/SOPRANO/local.yml
+          init-shell: bash
+      - name: Install SOPRANO
+        shell: bash -l {0}
+        run: |
+          micromamba activate soprano-dev
+          pip install -e .[ci]
+      - name: Test conda environment
+        shell: bash -l {0}
+        run: |
+          micromamba activate soprano-dev
+          pytest tests/test_configuration
+      - name: Test units
+        shell: bash -l {0}
+        run: |
+          micromamba activate soprano-dev
+          pytest tests/test_units
+      # - name: Test integration  # TODO: Need to think about this
+      #   shell: bash -l {0}
+      #   run: |
+      #     micromamba activate soprano-dev
+      #     pytest tests/test_integrations
7 changes: 7 additions & 0 deletions .gitignore
@@ -65,3 +65,10 @@ dmypy.json

 # Ruff linting
 /.ruff_cache/
+src/SOPRANO/tmp/
+
+# Data files
+src/SOPRANO/data/ensemble_transcriptID.fasta
+src/SOPRANO/data/homo_sapiens/**/Homo_sapiens.GRCh*.fa
+src/SOPRANO/data/homo_sapiens/**/Homo_sapiens.GRCh*.fa.gz
+/example_output.txt
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -14,6 +14,6 @@ repos:
     hooks:
       - id: mypy
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.0.284
+    rev: v0.0.287
     hooks:
       - id: ruff
6 changes: 6 additions & 0 deletions pyproject.toml
@@ -40,6 +40,7 @@ dev = [
     "mypy"
 ]
 ci = [
+    "black",
     "pytest",
     "pytest-dependency",
     "ruff"
@@ -50,6 +51,10 @@ Homepage = "https://github.com/instituteofcancerresearch/SOPRANO"
 "Bug Tracker" = "https://github.com/instituteofcancerresearch/SOPRANO/issues"
 Discussions = "https://github.com/bkmarzouk/symflation/discussions"

+[project.scripts]
+RUN_SOPRANO = "SOPRANO.run_local_ssb_selection:main"
+GET_GENOMES = "SOPRANO.run_local_ssb_selection:download_genome"
+
 [tool.hatch]
 version.source = "vcs"
 build.hooks.vcs.version-file = "src/SOPRANO/version.py"
@@ -60,6 +65,7 @@ line-length = 79

 [tool.isort]
 profile = "black"
+line_length = 79

 [tool.ruff]
 line-length = 79
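
The `[project.scripts]` table above maps the `RUN_SOPRANO` command to a `main` callable, and the early commits describe argparse parsing, promotion of string inputs to `PosixPath`, and a `_validate_input_path` check. The module itself is not part of the visible diff, so the following is only a sketch of how such an entry point could be laid out. The flag names (`--input`, `--target`, `--seed`), the defaults, and the function bodies are assumptions made for illustration, not the project's actual CLI.

```python
import argparse
from pathlib import Path


def validate_input_path(path: Path) -> Path:
    """Hypothetical check: the argument must be a Path to an existing file."""
    if not isinstance(path, Path):
        raise TypeError(f"expected a Path, got {type(path).__name__}")
    if not path.is_file():
        raise FileNotFoundError(f"input file not found: {path}")
    return path


def parse_args(argv=None) -> dict:
    """Parse command-line options into a dict suitable for **kwargs."""
    parser = argparse.ArgumentParser(description="illustrative CLI sketch")
    # type=Path promotes the raw string to a Path (PosixPath on Linux/macOS).
    parser.add_argument("--input", dest="input_path", type=Path, required=True)
    parser.add_argument("--target", dest="target", type=Path, required=True)
    parser.add_argument("--seed", dest="seed", type=int, default=-1)
    return vars(parser.parse_args(argv))


def main(argv=None) -> int:
    """Callable that a console-script entry such as 'pkg.module:main' invokes."""
    kwargs = parse_args(argv)
    for key in ("input_path", "target"):
        validate_input_path(kwargs[key])
    print(f"validated inputs: {kwargs}")
    return 0
```

With an entry like the one in the diff installed via `pip install -e .`, the generated command calls `main()` with no arguments, so `parse_args` falls back to `sys.argv`; passing `argv` explicitly, as the signature allows, makes the function easy to exercise from unit tests.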
6 changes: 5 additions & 1 deletion setup.sh
@@ -6,6 +6,9 @@ PROJECT_ROOT="$( cd -- "$(dirname -- "$0")" >/dev/null 2>&1 || exit ; pwd -P )"
 # Installation helpers dir
 INSTALLERS_DIR_PATH="src/SOPRANO/bash_installers"

+# Data directory
+DATA_DIR_PATH="src/SOPRANO/data"
+
 if [ "$#" -eq 0 ]
 then
   echo "-- installing dependencies for dev"
@@ -21,4 +24,5 @@ fi
 _PIP_CMD="pip install -e .[$DEPS]"

 # Configure conda environment and install repository
-source "$INSTALLERS_DIR_PATH/.setup_env.sh"
+source "$INSTALLERS_DIR_PATH/.setup_env.sh"
+source "$INSTALLERS_DIR_PATH/.setup_data.sh"