Unhandled Compound and Simple location operators #1

Open
averagehat opened this issue Jul 25, 2019 · 1 comment

Comments

@averagehat
Collaborator

testdata/adeno.gb uses the order location operator, which we don't handle explicitly if it differs from join.

There are more operators, which may be covered in the Biopython, scikit-bio, and NCBI documentation.
There may also be parsing rules for GenBank files. scikit-bio's vocabulary lists 'join', 'complement', and 'order': https://github.com/biocore/scikit-bio/blob/master/skbio/io/format/_sequence_feature_vocabulary.py#L202

@averagehat
Collaborator Author

There's a specification here:

http://www.insdc.org/files/feature_table.html

The location operator is a prefix that specifies what must be done to the 
indicated sequence to find or construct the location corresponding to the 
feature. A list of operators is given below with their definitions and most 
common format. 

complement(location) 
Find the complement of the presented sequence in the span specified by
"location" (i.e., read the complement of the presented strand in its 5'-to-3'
direction)

join(location,location, ... location) 
The indicated elements should be joined (placed end-to-end) to form one 
contiguous sequence 

order(location,location, ... location) 
The elements can be found in the specified order (5' to 3' direction), but
nothing is implied about the reasonableness about joining them

Note: location operator "complement" can be used in combination with either
"join" or "order" within the same location; combinations of "join" and "order"
within the same location (nested operators) are illegal.
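
To make the nesting rule concrete, here is a rough sketch of a parser for these three operators. It is not this project's parser: `Span`, `Op`, and `parse_location` are hypothetical names, and it assumes only plain `a..b` ranges and single bases (no `<`/`>` partial markers, `^` sites, or remote accessions). It rejects `join` nested with `order`, which is how I read the note above; whether `join` inside `join` should also be rejected is left open.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Op:
    name: str    # "complement", "join", or "order"
    args: tuple  # nested Span / Op values


_TOKEN = re.compile(r"\s*(complement|join|order|\d+|\.\.|[(),])")


def tokenize(text):
    """Split a location string into operator, number, '..', and punctuation tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at offset {pos}: {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def _peek(tokens, i):
    return tokens[i] if i < len(tokens) else None


def _parse(tokens, i, list_op):
    """Parse one location starting at tokens[i]; list_op is the enclosing join/order, if any."""
    tok = _peek(tokens, i)
    if tok in ("complement", "join", "order"):
        # Assumption: only mixing join with order is treated as illegal.
        if tok != "complement" and list_op not in (None, tok):
            raise ValueError(f"{tok!r} nested inside {list_op!r} is illegal")
        if _peek(tokens, i + 1) != "(":
            raise ValueError(f"expected '(' after {tok!r}")
        inner = list_op if tok == "complement" else tok
        args, i = [], i + 2
        while True:
            arg, i = _parse(tokens, i, inner)
            args.append(arg)
            nxt = _peek(tokens, i)
            if nxt == ",":
                i += 1
            elif nxt == ")":
                i += 1
                break
            else:
                raise ValueError(f"expected ',' or ')' but found {nxt!r}")
        if tok == "complement" and len(args) != 1:
            raise ValueError("complement takes exactly one location")
        return Op(tok, tuple(args)), i
    if tok is not None and tok.isdigit():
        if _peek(tokens, i + 1) == ".." and (_peek(tokens, i + 2) or "").isdigit():
            return Span(int(tok), int(tokens[i + 2])), i + 3
        return Span(int(tok), int(tok)), i + 1  # single base
    raise ValueError(f"unexpected token {tok!r}")


def parse_location(text):
    tokens = tokenize(text)
    loc, end = _parse(tokens, 0, None)
    if end != len(tokens):
        raise ValueError(f"trailing tokens: {tokens[end:]}")
    return loc
```

Keeping `order` as its own node (instead of folding it into `join`) leaves the decision about whether to concatenate the pieces to the caller.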

We'll have to check whether we handle the 5'-to-3' direction correctly. Also, the spec gives these two examples:

complement(join(2691..4571,4918..5163))
                          Joins regions 2691 to 4571 and 4918 to 5163, then 
                          complements the joined segments (the feature is on the 
                          strand complementary to the presented strand) 

join(complement(4918..5163),complement(2691..4571))
                          Complements regions 4918 to 5163 and 2691 to 4571, then 
                          joins the complemented segments (the feature is on the 
                          strand complementary to the presented strand)
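
As a sanity check on direction handling, the two forms above should yield the same sequence. A minimal sketch, assuming "complement" means reverse complement (read 5'-to-3' on the other strand), 1-based inclusive coordinates, and an unambiguous ACGT alphabet. The tuple encoding and the names `extract` and `revcomp` are made up for illustration, and `order` is deliberately left out since the spec implies nothing about joining its pieces.

```python
_COMP = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Complement each base, then reverse so the result reads 5'-to-3'."""
    return seq.translate(_COMP)[::-1]


def extract(seq, loc):
    """Return the feature sequence for a nested-tuple location.

    loc is one of:
      ("span", start, end)        1-based, inclusive
      ("join", [loc, loc, ...])
      ("complement", loc)
    """
    kind = loc[0]
    if kind == "span":
        _, start, end = loc
        return seq[start - 1:end]
    if kind == "join":
        return "".join(extract(seq, part) for part in loc[1])
    if kind == "complement":
        return revcomp(extract(seq, loc[1]))
    raise ValueError(f"unsupported location kind: {kind!r}")


# Toy sequence and small coordinates, mirroring the shape of the spec examples.
seq = "AAACCCGGGTTTACGT"
outer = ("complement", ("join", [("span", 2, 4), ("span", 8, 10)]))
inner = ("join", [("complement", ("span", 8, 10)), ("complement", ("span", 2, 4))])

assert extract(seq, outer) == extract(seq, inner)
```

Note that the inner form lists the spans in reverse order; with the same order as the outer form, the two results would differ.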
