Development #117

Merged on May 3, 2023 (49 commits)
49 commits
6943de1
Use `shutil` to concatenate file contents.
srgk26 Dec 23, 2022
b5ca9ce
Lower buffer size from 1GB to 16MB.
srgk26 Dec 23, 2022
3930a46
Add new line in bytemode after concatenating each temporary JSON file.
srgk26 Dec 29, 2022
d2c104f
Edit write_output function to add new line when writing list …
srgk26 Dec 29, 2022
dcdfd07
Remove writing new line at EOF for IMGT, tabular, and AIRR formats.
srgk26 Dec 29, 2022
a0af51a
Merge pull request #2 from SyntenyBio/srgk26/concat_json_1
srgk26 Jan 3, 2023
c582ed8
Add support to write parquet files from json files. (#4)
srgk26 Jan 30, 2023
8be8588
by default, now force assignment of J genes to match the locus of the…
briney Feb 2, 2023
bb8a502
If the input sequence is sufficiently long, force alignment to the 5'…
briney Feb 2, 2023
47bd7c6
Update build_germline_dbs.py
briney Feb 2, 2023
7897e4f
More verbose logging when the loci of top-scoring V and J genes don't…
briney Feb 2, 2023
d3628ff
force full-length alignment without using global_alignment()
briney Feb 2, 2023
d28e248
new human germline database
briney Feb 2, 2023
1da768b
Write parquet files in place of temporary JSON files. (#6)
srgk26 Feb 3, 2023
d9959c5
Exclude index when writing parquet from pandas. (#8)
srgk26 Feb 6, 2023
2cc56e8
Create __init__.py
briney Feb 10, 2023
96877e3
Create qc.py
briney Feb 10, 2023
989c40c
Create trimming.py
briney Feb 10, 2023
f80ef4b
Create umi.py
briney Feb 10, 2023
fd058bc
Create pp.py
briney Feb 10, 2023
52368d2
add preprocess directory
briney Feb 10, 2023
d738f3e
new macaque germline database
briney Feb 14, 2023
2940f66
temp reorg of preprocess folder
briney Feb 14, 2023
c3200ad
add light chain V genes to macaque database
briney Feb 14, 2023
7a317bb
pin matplotlib (#9)
ndalchau Feb 17, 2023
7ceb8ac
Fix chunk size (#10)
ndalchau Feb 17, 2023
f611712
Freeze numpy install to version 1.23.4. (#12)
srgk26 Feb 21, 2023
231ea26
Polish codebase prior to merging upstream (#11)
srgk26 Feb 21, 2023
9d60d97
Remove specifying maxtasksperchild when creating multiprocess pool re…
srgk26 Feb 22, 2023
03fa169
Set matplotlib version at 3.6.3. (#14)
srgk26 Feb 23, 2023
9407364
fix preprocess imports
briney Mar 2, 2023
f0c4475
remove scikit-bio
briney Mar 2, 2023
4cafc98
update gapped IMGT alignment to use new abutils pairwise alignment fu…
briney Mar 2, 2023
c63a09e
remove _get_gapped_imgt_substitution_matrix
briney Mar 2, 2023
7296ae2
Merge remote-tracking branch 'upstream/development'
srgk26 Mar 9, 2023
7fbe9fe
Merge pull request #115 from SyntenyBio/master
briney Mar 18, 2023
996194c
Create umi.py
briney Apr 11, 2023
f429200
formatting
briney Apr 12, 2023
92bc36f
check to ensure query isn't empty before performing alignment
briney Apr 27, 2023
3bf71ce
check to ensure query isn't empty before local_alignment
briney Apr 27, 2023
a03b6fc
formatting
briney Apr 27, 2023
5a01fd3
fix type hint
briney Apr 27, 2023
f1a1ff4
get parasail matrix for gapped germline re-alignment
briney Apr 27, 2023
2eb01aa
Update requirements.txt
briney Apr 27, 2023
90ac224
fix matrix creation
briney Apr 27, 2023
4d03bc5
Update requirements.txt
briney May 3, 2023
79119b2
bump version to 0.6.0
briney May 3, 2023
2d2ca3e
Merge branch 'development' into preprocessing
briney May 3, 2023
438812c
Merge pull request #116 from briney/preprocessing
briney May 3, 2023
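
The first commits in the list above describe concatenating temporary JSON files with `shutil`, using a 16 MB buffer and writing a newline in byte mode after each file. The sketch below illustrates that idea only. It is not taken from the abstar source: the function name `concat_json_files`, the constant `BUFFER_SIZE`, and the argument names are hypothetical.

```python
import shutil

# 16 MB copy buffer, the size named in the commit message.
# The constant name is hypothetical.
BUFFER_SIZE = 16 * 1024 * 1024


def concat_json_files(temp_files, output_file, buffer_size=BUFFER_SIZE):
    """Append each temporary file to output_file, one newline after each.

    Files are opened in byte mode so the contents are copied verbatim,
    and shutil.copyfileobj streams them in chunks of buffer_size bytes
    rather than reading a whole file into memory.
    """
    with open(output_file, "wb") as out:
        for path in temp_files:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out, buffer_size)
            # Separate consecutive files so their records do not run together.
            out.write(b"\n")
```

Calling `concat_json_files(["a.json", "b.json"], "merged.json")` would write the bytes of `a.json`, a newline, the bytes of `b.json`, and a final newline. A smaller buffer mainly trades a little throughput for a lower peak memory footprint per worker.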
2 changes: 2 additions & 0 deletions .gitignore
@@ -56,5 +56,7 @@ docs/_build/
# PyBuilder
target/

.vscode

*.sublime-*
.DS_Store
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -32,6 +32,7 @@ include abstar/assigners/germline_dbs/tcr/human/*
include abstar/assigners/germline_dbs/tcr/human/blast/*
include abstar/assigners/germline_dbs/tcr/human/ungapped/*
include abstar/assigners/germline_dbs/tcr/human/imgt_gapped/*
include abstar/preprocess/*
include abstar/test_data/*
include abstar/utils/*
include abstar/utils/queue/*
5 changes: 3 additions & 2 deletions abstar/__init__.py
@@ -1,8 +1,9 @@
 import warnings
 from Bio import BiopythonWarning
-warnings.simplefilter('ignore', BiopythonWarning)

-from .core.abstar import run, run_standalone, main, parse_arguments, validate_args
+warnings.simplefilter("ignore", BiopythonWarning)

+from .core.abstar import run, run_standalone, main, create_parser, validate_args
+from .preprocess import fastqc, adapter_trim, quality_trim

 from .version import __version__
Empty file added abstar/_preprocess/__init__.py
Empty file.
23 changes: 23 additions & 0 deletions abstar/_preprocess/qc.py
@@ -0,0 +1,23 @@
#!/usr/bin/python
# filename: qc.py

#
# Copyright (c) 2023 Bryan Briney
# License: The MIT license (http://opensource.org/licenses/MIT)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software
# and associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#

23 changes: 23 additions & 0 deletions abstar/_preprocess/trimming.py
@@ -0,0 +1,23 @@
#!/usr/bin/python
# filename: trimming.py

#
# Copyright (c) 2023 Bryan Briney
# License: The MIT license (http://opensource.org/licenses/MIT)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software
# and associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
