Merging in version 1.2 from the devel branch
arvestad committed Aug 30, 2018
2 parents f3e6811 + 35fc9d1 commit 5800bc9
Showing 12 changed files with 164 additions and 134 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@

 # Setuptools distribution folder.
 /dist/
+/build/

 # Python egg metadata, regenerated from source files by setuptools.
 /*.egg-info
6 changes: 6 additions & 0 deletions CHANGES.md
@@ -14,3 +14,9 @@
 * View only the first and last N characters of accessions with option -aa.
 * Read from stdin with the magic file name "-".

+## v1.2.0

+* The option `-l` colors the alignment but does not break it into blocks. Suitable for piping to `less -RS`,
+as suggested by Mark McMullan <[email protected]>.
+* More indices indicated below alignments, and with an up-arrow as a tick mark.
+* Added option `-sm` to allow restricting output to sequences with accessions containing a given string.
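
The `-sm` entry above restricts output by substring match on accessions. A minimal sketch of that filtering, mirroring the loop this commit adds to `alv/alignmentterminal.py`; the function name `select_matching` and the sample accessions are hypothetical and only serve as illustration:

```python
def select_matching(accessions, pattern):
    """Keep accessions that contain `pattern`; keep all of them if no pattern is set."""
    if not pattern:
        # No selection requested (the commit stores False in that case).
        return list(accessions)
    return [acc for acc in accessions if pattern in acc]


# Made-up accessions, not taken from any real alignment file.
accessions = ["ABC1_YEAST", "CFTR_HUMAN", "PDR5_YEAST"]
print(select_matching(accessions, "YEAST"))
print(select_matching(accessions, False))
```

The match is a plain substring test, so `YEAST` matches anywhere in the accession, not only as a suffix.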
11 changes: 7 additions & 4 deletions README.md
@@ -1,5 +1,6 @@
[![PyPI version](https://badge.fury.io/py/alv.svg)](https://badge.fury.io/py/alv)
[![Build Status](https://travis-ci.org/arvestad/alv.svg?branch=master)](https://travis-ci.org/arvestad/alv)

# alv: a command-line alignment viewer

View your DNA or protein multiple-sequence alignments right at your command line. No need to launch a
@@ -56,14 +57,16 @@ Run `python setup.py develop test` for development install and to execute tests.

 All of the sequences in PFAM's seed alignment for PF00005

-![PF00005 seed MSA](https://github.com/arvestad/alv/blob/master/doc/screenshot_PF00005.png)
+![PF00005 seed MSA](https://github.com/arvestad/alv/raw/master/doc/screenshot_PF00005.png)

+### Yeast sequences from PF00005

-### Ten peptide sequences from PF00005
+Using the option `-sm YEAST`, we reduce the alignment to the ones with a matching accession.

-![MSA from PF00005](https://github.com/arvestad/alv/blob/master/doc/screenshot_1.png)
+![MSA from PF00005](https://github.com/arvestad/alv/raw/master/doc/PF00005_yeast.png)

 ### Seven coding DNA sequences

 `alv` is autodetecting that the given DNA sequences are coding and therefore colors codons instead
 of nucleotides.
-![Sample screenshot](https://github.com/arvestad/alv/blob/master/doc/screenshot_2.png)
+![Sample screenshot](https://github.com/arvestad/alv/raw/master/doc/screenshot_2.png)
5 changes: 2 additions & 3 deletions TODO.md
@@ -1,13 +1,12 @@
# Planned fixes and features

* Better screenshots
* Review how colors work in different terminals. The color scheme working for me looked strange when Vilde was
showing me her work.
* Make it possible to color alignments without breaking alignment into blocks. I.e., a "pure"
version of `alv -k -w 99999 msa.fa | less -SR`. Suggested by Mark McMullan <[email protected]>.
* Add support for restricting to a sub-alignment.
* Add option --glimpse.
* Explicitly choose parts of an alignment to view/color.

* Pypi.org does not handle MarkDown in the README. Should look for a solution.

# Considered features

22 changes: 15 additions & 7 deletions alv/alignment.py
@@ -22,6 +22,12 @@ def __init__(self, alignment):
     def _update_seq_index(self):
         self.seq_indices = { r.id : i for i, r in enumerate(self.al)} # Get a dictionary mapping accession to row index in alignment

+    def al_width(self):
+        '''
+        The number of columns in the alignment.
+        '''
+        return self.al.get_alignment_length()

     def trim_accessions(self, start, stop):
         for record in self.al:
             acc = record.id
@@ -61,32 +67,33 @@ def sort_by_identity(self, acc):
                 reverse=True))
         return sorted_accessions

-    def accession_widths(self):
+    def accession_widths(self, accessions=None):
         '''
         Compute the space needed for all accessions, plus one for
         a delimiter.
         '''
         max_accession_length = 5 # initial guess, or minimum
         for record in self.al:
-            if len(record.id) > max_accession_length:
-                max_accession_length = len(record.id)
+            if not accessions or record.id in accessions:
+                if len(record.id) > max_accession_length:
+                    max_accession_length = len(record.id)
         return max_accession_length


-    def block_width(self, terminal_width, args):
+    def block_width(self, terminal_width, args_width):
         '''
         For wide alignments, we need to break it up in blocks.
         This method calculates how many characters to output in a block.
         Take the margin size (for accessions) into account and avoid ending up
         with blocks of size 1.
         '''
-        if args.width == 0:
+        if args_width == 0:
             al_width = self.al.get_alignment_length()
             left_margin = 1 + self.accession_widths() # Add 1 for a space to the right of the accessions
             return self._compute_block_width(terminal_width, al_width, left_margin)
         else:
-            return args.width
+            return args_width

     def _compute_block_width(self, terminal_width, al_width, left_margin):
         '''
@@ -110,7 +117,8 @@ def blocks(self, block_width):
             raise AlvEmptyAlignment()
         else:
             for start in range(0, al_width, block_width):
-                yield AlignmentBlock(start, start + block_width)
+                end = min(al_width, start + block_width)
+                yield AlignmentBlock(start, end)

     def apply_painter(self, acc, block, painter):
         '''
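
The change to `blocks` clamps each block's end to the alignment width, so the final block no longer reports columns past the end of the alignment. A standalone sketch of the same loop follows. It makes some assumptions: plain `(start, end)` tuples stand in for `AlignmentBlock`, `ValueError` stands in for `AlvEmptyAlignment`, and `block_ranges` is a hypothetical name.

```python
def block_ranges(al_width, block_width):
    """Return (start, end) column ranges; the last range is clamped to al_width."""
    if al_width == 0:
        # Stand-in for the AlvEmptyAlignment exception raised in alv.
        raise ValueError("empty alignment")
    return [(start, min(al_width, start + block_width))
            for start in range(0, al_width, block_width)]


# A 130-column alignment shown in blocks of 60 columns.
print(block_ranges(130, 60))
```

Without the `min`, the last range would be `(120, 180)`, which would presumably mislead any code that reads `block.end`, such as the tick-mark bar added in `alv/alignmentterminal.py`.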
83 changes: 72 additions & 11 deletions alv/alignmentterminal.py
@@ -15,14 +15,15 @@ def __init__(self, args):
             self.order = args.sorting_order.split(',')
             if len(self.order) == 0:
                 raise Exception('Bad order specification: no accessions in input')
+        if args.select_matching:
+            self.selection = args.select_matching
+        else:
+            self.selection = False

-    def output_alignment(self, al, painter, args):
+    def output_alignment(self, al, painter, width):
         '''
         Output alignment al to stdout in blocks of width at most w with colors from painter.
         '''
-        self.left_margin = 1 + al.accession_widths()
-        assert self.left_margin < self.width - 10

         if self.sorting == 'alpha':
             accessions = al.sorted_accessions()
         elif self.sorting == 'fixed':
@@ -31,19 +32,79 @@ def output_alignment(self, al, painter, args):
             accessions = al.sort_by_identity(self.order)
         else:
             accessions = al.accessions()
-        accessions = list(accessions)

+        if self.selection:
+            chosen_accessions = []
+            for acc in accessions:
+                if self.selection in acc:
+                    chosen_accessions.append(acc)
+        else:
+            chosen_accessions = list(accessions)

+        self.left_margin = 1 + al.accession_widths(chosen_accessions)
+        assert self.left_margin < self.width - 10

-        columns_per_block = al.block_width(self.width, args)
+        columns_per_block = al.block_width(self.width, width)
         for block in al.blocks(columns_per_block):
-            for acc in accessions:
+            for acc in chosen_accessions:
                 colored_subseq = al.apply_painter(acc, block, painter)
                 print("{0:{width}}{1}".format(acc, colored_subseq, width=self.left_margin))
-            print(' ' * self.left_margin, block.start, sep='')
+            print(make_tick_string(self.left_margin, block.start, block.end, 20, 7))

+            # print(' ' * self.left_margin, '↑', block.start, sep='') # print index of first column

-    def print_one_sequence_block(self, record, left_margin, start, block_width):
-        colored_string = colorize_sequence_string(rec.seq[start : start + block_width])
-        print("{0:{width}}{1}".format(rec.id, colored_string, width=left_margin))

+    # def print_one_sequence_block(self, record, left_margin, start, block_width):
+    #     colored_string = colorize_sequence_string(rec.seq[start : start + block_width])
+    #     print("{0:{width}}{1}".format(rec.id, colored_string, width=left_margin))


+def calc_tick_indices(start, end, distance, min_distance):
+    '''
+    Return a list of indices for which we want a tick mark at the bottom of the alignment.
+    The goal is to have an index for the starting position of a block (leftmost column number),
+    and then a tick mark on even multiples of 20 (or what is given by 'distance'), for example:
+    53 60 80 100
+    Care is needed so that space is left between first and second indices, and min_distance indicates
+    how much.
+    '''
+    first_even_pos = (start // distance + 1) * distance
+    if first_even_pos - start < min_distance:
+        first_even_pos += distance # Compensate a bit
+    positions = range(first_even_pos, end, distance)
+    return positions

+def make_one_tick(position, space):
+    '''
+    Return a string which is 'space' wide and contains a number (the position)
+    followed by an up-arrow.
+    '''

+    return '{0:>{width}}↑'.format(position, width=space-1)

+def make_tick_string(left_margin, start, end, distance, min_distance):
+    '''
+    Construct the index bar which is printed at the bottom of an alignment block.
+    left_margin is how much space is allowed for accessions.
+    start is the column number of the beginning of an alignment block.
+    end is the last column of an alignment block.
+    distance is the desired distance between up-arrows
+    min_distance is the space we allow for position numbers plus an up-arrow
+    '''
+    even_indices = calc_tick_indices(start, end, distance, min_distance)

+    # Initial space
+    index_bar = ' ' * (left_margin - min_distance + 1) # Account for space needed by indices

+    # Add first column index
+    index_bar += make_one_tick(start, min(left_margin+1, min_distance))

+    last_pos = start
+    for pos in even_indices:
+        spacer = pos - last_pos - 1
+        index_block = '{0:>{width}}↑'.format(pos, width=spacer)
+        index_bar += index_block
+        last_pos = pos
+    return index_bar
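
To see how the tick bar lines up with the alignment columns, here is a condensed, self-contained restatement of the two helpers from this file. It is a sketch for experimentation, not the packaged code. The values 20 and 7 are the ones passed in `output_alignment`, while the left margin of 11 and the block 53..113 are arbitrary example inputs.

```python
def calc_tick_indices(start, end, distance, min_distance):
    """Multiples of `distance` after `start`, skipping one that crowds the first label."""
    first_even_pos = (start // distance + 1) * distance
    if first_even_pos - start < min_distance:
        first_even_pos += distance
    return list(range(first_even_pos, end, distance))


def make_tick_string(left_margin, start, end, distance, min_distance):
    """Build the index bar: the block start first, then one arrow per tick index."""
    bar = ' ' * (left_margin - min_distance + 1)
    first_width = min(left_margin + 1, min_distance) - 1
    bar += '{0:>{width}}↑'.format(start, width=first_width)
    last_pos = start
    for pos in calc_tick_indices(start, end, distance, min_distance):
        # Right-align the number so its arrow lands exactly under column `pos`.
        bar += '{0:>{width}}↑'.format(pos, width=pos - last_pos - 1)
        last_pos = pos
    return bar


left_margin, start, end = 11, 53, 113   # example inputs, chosen arbitrarily
bar = make_tick_string(left_margin, start, end, 20, 7)
# Translate each arrow's offset in the string back to an alignment column.
arrow_columns = [start + i - left_margin for i, ch in enumerate(bar) if ch == '↑']
print(bar)
print(arrow_columns)
```

Each arrow sits in the same character column as the alignment position it labels, because every label is right-aligned into exactly the gap since the previous arrow.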
96 changes: 0 additions & 96 deletions alv/main.py

This file was deleted.

2 changes: 1 addition & 1 deletion alv/version.py
@@ -1 +1 @@
-__version__ = '1.1.0'
+__version__ = '1.2.0'