Release 0.0.12 #8

Merged
merged 18 commits into from
Jul 1, 2024
23 changes: 17 additions & 6 deletions README.rst
@@ -1,8 +1,20 @@
=======
emtools
=======

Utilities for CryoEM data manipulation.
.. |logo_image| image:: https://github.com/3dem/emhub/wiki/images/emtools-logo.png
:height: 60px

|logo_image|

**emtools** is a Python package with utilities for manipulating CryoEM images
and metadata such as STAR files or SQLite databases. It also contains
miscellaneous utilities for process handling and monitoring, among others.

The library is composed of several modules that mainly provide classes to
perform specific operations.

For more detailed information, check the documentation at:

https://3dem.github.io/emdocs/emtools/


Installation
------------
@@ -16,8 +28,7 @@ Or for development:
.. code-block:: bash

    git clone git@github.com:3dem/emtools.git
    cd emtools
    pip install -e .
    pip install -e emtools/

Usage
-----
4 changes: 3 additions & 1 deletion docs/conf.py
@@ -1,4 +1,5 @@
import datetime as dt
import emtools

extensions = ['sphinx.ext.autodoc',
'sphinx.ext.autosectionlabel',
@@ -48,7 +49,8 @@
#html_logo = "https://github.com/3dem/emhub/wiki/images/emhub-logo-top-gray.svg"

html_context = {
'last_updated': dt.datetime.now().date()
'last_updated': dt.datetime.now().date(),
'emtools_version': emtools.__version__
}

templates_path = ["templates"]
5 changes: 3 additions & 2 deletions docs/index.rst
@@ -32,9 +32,10 @@ Modules and Classes
.. toctree::
:maxdepth: 2

utils <utils/index>
metadata <metadata/index>
utils
image
jobs <jobs/index>




10 changes: 10 additions & 0 deletions docs/jobs/index.rst
@@ -0,0 +1,10 @@

jobs
====

.. toctree::
:maxdepth: 1

Pipeline <pipeline>


7 changes: 7 additions & 0 deletions docs/jobs/pipeline.rst
@@ -0,0 +1,7 @@

Pipeline
========

.. autoclass:: emtools.jobs.Pipeline
:members:

188 changes: 180 additions & 8 deletions docs/metadata/starfile.rst
@@ -2,10 +2,14 @@
StarFile
========

This class allows to read and write data blocks from STAR files.
It is also designed to inspect the data blocks, columns
and number of elements in an efficient manner without parsing the entire
file.
This class allows you to read and write data blocks from STAR files.
There are some features that make this class useful for the manipulation of data in STAR format:


* Inspect data blocks, columns, and the number of elements without parsing the entire file.
* Iterate over the rows without reading the whole table in memory.
* Read or iterate over a subset of rows only.


Inspecting without fully parsing
--------------------------------
@@ -43,31 +47,199 @@ https://github.com/3dem/em-testdata/blob/main/metadata/run_it025_data.star
#>>> items: 4786
#>>> columns: 25

sf.close()


The methods used in the previous example (`getTableNames`, `getTableSize`, and `getTableInfo`)
all inspect the STAR file without parsing every row, which is much faster
than parsing rows that are not needed. These methods also build an index of where
the data blocks are in the file, so when you later read a data table, the reader
can jump directly to that position in the file.
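
To make the indexing idea concrete, here is a minimal, self-contained sketch of a single pass that records where each data block starts and how many rows it holds, without splitting any row into values. It is not the emtools implementation: the function name `index_star_blocks` and the sample data are hypothetical, and it assumes a simple STAR layout (`data_` lines, an optional `loop_`, labels starting with `_`).

```python
import io


def index_star_blocks(stream):
    """Return {table_name: (byte_offset, row_count)} for a binary STAR stream.

    Only line prefixes are examined; row values are never split or converted.
    """
    index = {}
    name = None
    in_loop = False
    while True:
        offset = stream.tell()          # position of the line about to be read
        line = stream.readline()
        if not line:
            break
        text = line.strip()
        if text.startswith(b"data_"):
            name = text[len(b"data_"):].decode()
            index[name] = [offset, 0]
            in_loop = False
        elif name is None or not text or text.startswith(b"#"):
            continue                    # preamble, blank line, or comment
        elif text == b"loop_":
            in_loop = True
        elif text.startswith(b"_"):
            if not in_loop:
                index[name][1] = 1      # key/value block counts as one row
        else:
            index[name][1] += 1         # a data row of a loop_ table
    return {key: tuple(value) for key, value in index.items()}


# Hypothetical sample content, for illustration only
SAMPLE = b"""\
# version 30001

data_optics

loop_
_rlnOpticsGroup #1
_rlnVoltage #2
1 300.0

data_particles

loop_
_rlnAngleRot #1
_rlnAngleTilt #2
10.0 20.0
30.0 40.0
50.0 60.0
"""

index = index_star_blocks(io.BytesIO(SAMPLE))

# With the index available, a reader can seek straight to a table
stream = io.BytesIO(SAMPLE)
stream.seek(index["particles"][0])
first_line = stream.readline()
```

The sketch reads in binary mode so that `tell()` and `seek()` operate on plain byte offsets.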

Reading a Table
---------------

After opening a StarFile, we can easily read any table using the function `getTable`
as shown in the following example:

.. code-block:: python

    with StarFile(movieStar) as sf:
        # Read the tables in a different order than they appear in the file,
        # and before any getTableNames() call that creates the offsets
        t1 = sf.getTable('local_shift')
        t2 = sf.getTable('general')

This function has some optional arguments such as *guessType* for inferring
the column type from the first row. In some cases this is not desired and one
can pass *guessType=False* and then all columns will be treated as strings.
For example, reading the ``job.star`` file:

.. code-block:: python

    with StarFile('job.star') as sf:
        print(f"Tables: {sf.getTableNames()}")  # ['job', 'joboptions_values']
        t = sf.getTable('joboptions_values', guessType=False)
        values = {row.rlnJobOptionVariable: row.rlnJobOptionValue for row in t}
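
The following sketch illustrates why guessing types from the first row can be undesirable for a table like ``joboptions_values``. It shows one plausible way to infer a column type (try ``int``, then ``float``, else ``str``); this is an assumption for illustration and not necessarily how emtools implements *guessType*. The helper names and the sample rows are hypothetical.

```python
def guess_type(token):
    """Return int, float, or str depending on what the token parses as."""
    for cast in (int, float):
        try:
            cast(token)
            return cast
        except ValueError:
            pass
    return str


def guess_column_types(first_row_tokens):
    """Infer one type per column from the tokens of the first row only."""
    return [guess_type(token) for token in first_row_tokens]


# Made-up rows in the spirit of (rlnJobOptionVariable, rlnJobOptionValue)
rows = [["nr_mpi", "1"], ["do_queue", "No"]]

types = guess_column_types(rows[0])  # the value column looks like an int

# Applying the guessed types to a later row fails for the value column
try:
    [cast(token) for cast, token in zip(types, rows[1])]
    conversion_failed = False
except ValueError:
    conversion_failed = True
```

When a column mixes numbers and text like this, treating every value as a string avoids the failed conversion.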

Iterating over the rows
-----------------------

Iterating over the Table rows
-----------------------------

In some cases, we just want to iterate over the rows and operate on them one by one.
In that case, it is not necessary to load the whole table in memory. Iteration
also allows reading a range of rows instead of all of them. This is especially useful
for visualization purposes, where we can show a limited number of elements and
page through all of them efficiently.

Please check the :ref:`Examples` for practical use cases.
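
As a rough illustration of row-by-row iteration over a range, the sketch below uses a plain Python generator together with ``itertools.islice``. It does not use the emtools API: ``iter_rows`` is a hypothetical helper, the sample text is made up, and all values are left as strings.

```python
import io
from itertools import islice


def iter_rows(stream, table_name):
    """Lazily yield one dict per row of a loop_ table in a text STAR stream."""
    labels = []
    in_table = False
    seen_rows = False
    for line in stream:
        text = line.strip()
        if not in_table:
            in_table = (text == f"data_{table_name}")
        elif text.startswith("_") and not seen_rows:
            labels.append(text.split()[0][1:])   # drop the leading underscore
        elif text.startswith("data_") or (not text and seen_rows):
            return                               # end of the requested table
        elif text and text != "loop_" and not text.startswith("#"):
            seen_rows = True
            yield dict(zip(labels, text.split()))


# Hypothetical sample content, for illustration only
STAR_TEXT = """\
data_particles

loop_
_rlnAngleRot #1
_rlnAngleTilt #2
10.0 1.0
20.0 2.0
30.0 3.0
40.0 4.0
"""

# Take rows 1 and 2 (zero-based) without building a list of every row
page = list(islice(iter_rows(io.StringIO(STAR_TEXT), "particles"), 1, 3))
```

Because the generator yields rows as they are read, only the requested slice is ever materialized, which is the property that makes paging through large tables cheap.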


Writing STAR files
------------------

It is easy to write STAR files using the :class:`StarFile` class. We just need to
open it with write mode enabled. Then we could just write a table header
and then write rows one by one, or we could write an entire table at once.

Please check the :ref:`Examples` for practical use cases.

Examples
--------

Parsing EPU's XML files
.......................

Although the :py:class:`StarFile` class has been used mainly to handle Relion STAR files,
we can use any label and table names. For example, if we want to parse the
XML files from EPU to extract the beam shift per movie, and write an output
STAR file:

Comparison with other libraries
-------------------------------
.. code-block:: python

    out = StarFile(outputStar, 'w')
    t = Table(['movieBaseName', 'beamShiftX', 'beamShiftY'])
    out.writeHeader('Movies', t)

    for base, x, y in EPU.get_beam_shifts(inputDir):
        out.writeRow(t.Row(movieBaseName=base, beamShiftX=x, beamShiftY=y))

    out.close()


Note in this example that we are not storing the whole table in memory. We just
create an empty table with the desired columns and then we write one row for
each XML file parsed.
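
The header-then-rows pattern described above can be sketched with only the standard library. This is a simplified illustration of the streaming idea, not the emtools writer: ``write_header``, ``write_row``, and ``fake_beam_shifts`` are hypothetical names, and the generator stands in for the XML parsing step with made-up values.

```python
import io


def write_header(out, table_name, columns):
    """Write the data block name and the loop_ column labels."""
    out.write(f"\ndata_{table_name}\n\nloop_\n")
    for number, column in enumerate(columns, start=1):
        out.write(f"_{column} #{number}\n")


def write_row(out, values):
    """Write a single row; nothing is kept in memory afterwards."""
    out.write(" ".join(str(value) for value in values) + "\n")


def fake_beam_shifts():
    # Hypothetical stand-in for parsing one XML file per movie
    yield "movie_0001", 0.12, -0.05
    yield "movie_0002", 0.10, -0.07


out = io.StringIO()
write_header(out, "Movies", ["movieBaseName", "beamShiftX", "beamShiftY"])
for base, x, y in fake_beam_shifts():
    write_row(out, (base, x, y))

text = out.getvalue()
```

Memory use stays constant regardless of how many rows are written, since each row goes to the output as soon as it is produced.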


Balancing Particles views based on orientation angles
.....................................................

We could read the Rot and Tilt angles from a particles STAR file into numpy arrays:

.. code-block:: python

    import numpy as np

    with StarFile('particles.star') as sf:
        size = sf.getTableSize('particles')
        info = sf.getTableInfo('particles')
        # Initialize the numpy arrays with zeros, one entry per particle
        anglesRot = np.zeros(size)
        anglesTilt = np.zeros(size)
        # Then iterate over the rows and store only these values
        for i, p in enumerate(sf.iterTable('particles')):
            anglesRot[i] = p.rlnAngleRot
            anglesTilt[i] = p.rlnAngleTilt


Then we can use these arrays to plot the values, identify the denser angular
regions, and select a subset of points that is more evenly distributed.
Let's assume we stored the indices of the particles to remove, in ascending order, in the list *to_remove*.
Now, we can go through the input *particles.star* and create a similar
file, but with some particles removed. We will copy every table into the output
STAR file, except for the *particles* one, where we need to filter out some
particles. We can do it with the following code:

.. code-block:: python

    with StarFile('particles.star') as sf:
        with StarFile('output_particles.star', 'w') as outSf:
            # Preserve all tables, except particles that will be a subset
            for tableName in sf.getTableNames():
                if tableName == 'particles':
                    info = sf.getTableInfo('particles')
                    table = Table(columns=info.getColumns())
                    outSf.writeHeader('particles', table)
                    counter = 0
                    for i, p in enumerate(sf.iterTable('particles')):
                        # to_remove must be sorted; the bounds check avoids an
                        # IndexError once every listed particle has been skipped
                        if counter < len(to_remove) and i == to_remove[counter]:
                            counter += 1
                            continue
                        outSf.writeRow(p)
                else:
                    table = sf.getTable(tableName)
                    outSf.writeTable(tableName, table)
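
The text above takes the *to_remove* list as given. One simple way to compute it, shown here only as a sketch, is to bin the particles by their (Rot, Tilt) angles and cap the number kept per bin. The function name, the bin width, and the cap are hypothetical choices, and equal-width angular bins are a simplification because they do not cover equal areas on the sphere.

```python
import random
from collections import defaultdict


def select_to_remove(rot, tilt, bin_deg=30.0, max_per_bin=2, seed=0):
    """Return sorted indices to drop so no angular bin keeps more than max_per_bin."""
    bins = defaultdict(list)
    for i, (r, t) in enumerate(zip(rot, tilt)):
        bins[(int(r // bin_deg), int(t // bin_deg))].append(i)

    rng = random.Random(seed)           # fixed seed for reproducible subsets
    to_remove = []
    for members in bins.values():
        if len(members) > max_per_bin:
            keep = set(rng.sample(members, max_per_bin))
            to_remove.extend(i for i in members if i not in keep)
    return sorted(to_remove)            # ascending order, as the row filter expects


# Made-up angles: four particles crowd one bin, two are isolated
rot = [5, 6, 7, 8, 100, 200]
tilt = [10, 11, 12, 13, 50, 90]
to_remove = select_to_remove(rot, tilt)
```

Returning the indices in ascending order matters when the filter walks the rows once and compares each row index against the next entry in the list.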

Converting from Scipion to micrographs STAR file
................................................

The following function shows how we can write a *micrographs.star* file
from a Scipion set of CTFs:

.. code-block:: python

    import os

    def write_micrographs_star(micStarFn, ctfs):
        firstCtf = ctfs.getFirstItem()
        firstMic = firstCtf.getMicrograph()
        acq = firstMic.getAcquisition()

        with StarFile(micStarFn, 'w') as sf:
            # Optics table: a single optics group taken from the first micrograph
            optics = Table(['rlnOpticsGroupName',
                            'rlnOpticsGroup',
                            'rlnMicrographOriginalPixelSize',
                            'rlnVoltage',
                            'rlnSphericalAberration',
                            'rlnAmplitudeContrast',
                            'rlnMicrographPixelSize'])
            ps = firstMic.getSamplingRate()
            op = 1
            opName = f"opticsGroup{op}"
            optics.addRowValues(opName, op, ps,
                                acq.getVoltage(),
                                acq.getSphericalAberration(),
                                acq.getAmplitudeContrast(),
                                ps)

            sf.writeLine("# version 30001")
            sf.writeTable('optics', optics)

            # Micrographs table: write the header once, then one row per CTF
            mics = Table(['rlnMicrographName',
                          'rlnOpticsGroup',
                          'rlnCtfImage',
                          'rlnDefocusU',
                          'rlnDefocusV',
                          'rlnCtfAstigmatism',
                          'rlnDefocusAngle',
                          'rlnCtfFigureOfMerit',
                          'rlnCtfMaxResolution',
                          'rlnMicrographMovieName'])
            sf.writeLine("# version 30001")
            sf.writeHeader('micrographs', mics)

            for ctf in ctfs:
                mic = ctf.getMicrograph()
                u, v, a = ctf.getDefocus()
                micName = mic.getMicName()
                movName = os.path.join('data', 'Images-Disc1',
                                       micName.replace('_Data_FoilHole_',
                                                       '/Data/FoilHole_'))
                row = mics.Row(mic.getFileName(), op,
                               ctf.getPsdFile(),
                               u, v, abs(u - v), a,
                               ctf.getFitQuality(),
                               ctf.getResolution(),
                               movName)

                sf.writeRow(row)

Reference
---------
34 changes: 34 additions & 0 deletions docs/templates/sidebar/brand.html
@@ -0,0 +1,34 @@
{#-

Hi there!

You might be interested in https://pradyunsg.me/furo/customisation/sidebar/

Although if you're reading this, chances are that you're either familiar
enough with Sphinx that you know what you're doing, or landed here from that
documentation page.

Hope your day's going well. :)

-#}
<a class="sidebar-brand{% if logo %} centered{% endif %}" href="{{ pathto(master_doc) }}">
{% block brand_content %}
{%- if logo_url %}
<div class="sidebar-logo-container">
<img class="sidebar-logo" src="{{ logo_url }}" alt="Logo"/>
</div>
{%- endif %}
{%- if theme_light_logo and theme_dark_logo %}
<div class="sidebar-logo-container">
<img class="sidebar-logo only-light" src="{{ pathto('_static/' + theme_light_logo, 1) }}" alt="Light Logo"/>
<img class="sidebar-logo only-dark" src="{{ pathto('_static/' + theme_dark_logo, 1) }}" alt="Dark Logo"/>
</div>
{%- endif %}
{% if not theme_sidebar_hide_name %}
<span class="sidebar-brand-text">{{ docstitle if docstitle else project }}</span>
{%- endif %}
<p><small>Version: {{ emtools_version }}</small> </p>
<p><small>Last updated: {{ last_updated }}</small> </p>

{% endblock brand_content %}
</a>
5 changes: 0 additions & 5 deletions docs/utils.rst

This file was deleted.

11 changes: 11 additions & 0 deletions docs/utils/index.rst
@@ -0,0 +1,11 @@

utils
=====

.. toctree::
:maxdepth: 1

Color, Pretty, Timer <misc>
Process, Path, System <process>


18 changes: 18 additions & 0 deletions docs/utils/misc.rst
@@ -0,0 +1,18 @@

Color
=====

.. autoclass:: emtools.utils.Color
:members:

Pretty
======

.. autoclass:: emtools.utils.Pretty
:members:

Timer
=====

.. autoclass:: emtools.utils.Timer
:members:
19 changes: 19 additions & 0 deletions docs/utils/process.rst
@@ -0,0 +1,19 @@

Process
=======

.. autoclass:: emtools.utils.Process
:members:

Path
====

.. autoclass:: emtools.utils.Path
:members:

System
======

.. autoclass:: emtools.utils.System
:members:
