diff --git a/docs/.doctrees/api/seismicrna.cluster.doctree b/docs/.doctrees/api/seismicrna.cluster.doctree
index 5af0b1dc..3ce3695d 100644
Binary files a/docs/.doctrees/api/seismicrna.cluster.doctree and b/docs/.doctrees/api/seismicrna.cluster.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.cluster.tests.doctree b/docs/.doctrees/api/seismicrna.cluster.tests.doctree
new file mode 100644
index 00000000..8f4c0027
Binary files /dev/null and b/docs/.doctrees/api/seismicrna.cluster.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.batch.doctree b/docs/.doctrees/api/seismicrna.core.batch.doctree
index 398889d3..1ad8e5aa 100644
Binary files a/docs/.doctrees/api/seismicrna.core.batch.doctree and b/docs/.doctrees/api/seismicrna.core.batch.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.batch.tests.doctree b/docs/.doctrees/api/seismicrna.core.batch.tests.doctree
new file mode 100644
index 00000000..449c14c2
Binary files /dev/null and b/docs/.doctrees/api/seismicrna.core.batch.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.doctree b/docs/.doctrees/api/seismicrna.core.doctree
index 75e6cc15..9b50720b 100644
Binary files a/docs/.doctrees/api/seismicrna.core.doctree and b/docs/.doctrees/api/seismicrna.core.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.io.doctree b/docs/.doctrees/api/seismicrna.core.io.doctree
index d5cfaf14..36accbc7 100644
Binary files a/docs/.doctrees/api/seismicrna.core.io.doctree and b/docs/.doctrees/api/seismicrna.core.io.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.mu.doctree b/docs/.doctrees/api/seismicrna.core.mu.doctree
index 8d7a1fc0..b18d8ea0 100644
Binary files a/docs/.doctrees/api/seismicrna.core.mu.doctree and b/docs/.doctrees/api/seismicrna.core.mu.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.mu.tests.doctree b/docs/.doctrees/api/seismicrna.core.mu.tests.doctree
index 5661ce71..0f227228 100644
Binary files a/docs/.doctrees/api/seismicrna.core.mu.tests.doctree and b/docs/.doctrees/api/seismicrna.core.mu.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.mu.unbias.doctree b/docs/.doctrees/api/seismicrna.core.mu.unbias.doctree
deleted file mode 100644
index 31d15624..00000000
Binary files a/docs/.doctrees/api/seismicrna.core.mu.unbias.doctree and /dev/null differ
diff --git a/docs/.doctrees/api/seismicrna.core.mu.unbias.tests.doctree b/docs/.doctrees/api/seismicrna.core.mu.unbias.tests.doctree
deleted file mode 100644
index 5dad9080..00000000
Binary files a/docs/.doctrees/api/seismicrna.core.mu.unbias.tests.doctree and /dev/null differ
diff --git a/docs/.doctrees/api/seismicrna.core.seq.doctree b/docs/.doctrees/api/seismicrna.core.seq.doctree
index a03f1264..17716273 100644
Binary files a/docs/.doctrees/api/seismicrna.core.seq.doctree and b/docs/.doctrees/api/seismicrna.core.seq.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.seq.tests.doctree b/docs/.doctrees/api/seismicrna.core.seq.tests.doctree
index 14cf8dc8..69ef81f0 100644
Binary files a/docs/.doctrees/api/seismicrna.core.seq.tests.doctree and b/docs/.doctrees/api/seismicrna.core.seq.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.core.tests.doctree b/docs/.doctrees/api/seismicrna.core.tests.doctree
index 7dd2ca20..8aa0de16 100644
Binary files a/docs/.doctrees/api/seismicrna.core.tests.doctree and b/docs/.doctrees/api/seismicrna.core.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.mask.doctree b/docs/.doctrees/api/seismicrna.mask.doctree
index 7cf0e526..c0fe5cb4 100644
Binary files a/docs/.doctrees/api/seismicrna.mask.doctree and b/docs/.doctrees/api/seismicrna.mask.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.relate.doctree b/docs/.doctrees/api/seismicrna.relate.doctree
index c2577f42..e5c18e2c 100644
Binary files a/docs/.doctrees/api/seismicrna.relate.doctree and b/docs/.doctrees/api/seismicrna.relate.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.relate.py.doctree b/docs/.doctrees/api/seismicrna.relate.py.doctree
index 1e0c023c..cd42929e 100644
Binary files a/docs/.doctrees/api/seismicrna.relate.py.doctree and b/docs/.doctrees/api/seismicrna.relate.py.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.relate.py.tests.doctree b/docs/.doctrees/api/seismicrna.relate.py.tests.doctree
index e0fd569c..f58a19c7 100644
Binary files a/docs/.doctrees/api/seismicrna.relate.py.tests.doctree and b/docs/.doctrees/api/seismicrna.relate.py.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.relate.tests.doctree b/docs/.doctrees/api/seismicrna.relate.tests.doctree
new file mode 100644
index 00000000..6ff263d7
Binary files /dev/null and b/docs/.doctrees/api/seismicrna.relate.tests.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.table.doctree b/docs/.doctrees/api/seismicrna.table.doctree
index 82d21911..adaf565d 100644
Binary files a/docs/.doctrees/api/seismicrna.table.doctree and b/docs/.doctrees/api/seismicrna.table.doctree differ
diff --git a/docs/.doctrees/api/seismicrna.wf.doctree b/docs/.doctrees/api/seismicrna.wf.doctree
index 21719194..d9cc3471 100644
Binary files a/docs/.doctrees/api/seismicrna.wf.doctree and b/docs/.doctrees/api/seismicrna.wf.doctree differ
diff --git a/docs/.doctrees/cli.doctree b/docs/.doctrees/cli.doctree
index 37dfd8a8..7814c564 100644
Binary files a/docs/.doctrees/cli.doctree and b/docs/.doctrees/cli.doctree differ
diff --git a/docs/.doctrees/environment.pickle b/docs/.doctrees/environment.pickle
index 173ed9e0..c8e727f1 100644
Binary files a/docs/.doctrees/environment.pickle and b/docs/.doctrees/environment.pickle differ
diff --git a/docs/.doctrees/howto/run/cluster.doctree b/docs/.doctrees/howto/run/cluster.doctree
index 6782c31b..8f87d1ad 100644
Binary files a/docs/.doctrees/howto/run/cluster.doctree and b/docs/.doctrees/howto/run/cluster.doctree differ
diff --git a/docs/.doctrees/howto/run/mask.doctree b/docs/.doctrees/howto/run/mask.doctree
index c4f8cf2a..3a617fd6 100644
Binary files a/docs/.doctrees/howto/run/mask.doctree and b/docs/.doctrees/howto/run/mask.doctree differ
diff --git a/docs/_sources/api/seismicrna.cluster.rst.txt b/docs/_sources/api/seismicrna.cluster.rst.txt
index c26f8544..3a9081d4 100644
--- a/docs/_sources/api/seismicrna.cluster.rst.txt
+++ b/docs/_sources/api/seismicrna.cluster.rst.txt
@@ -6,6 +6,14 @@ seismicrna.cluster package
    :undoc-members:
    :show-inheritance:
 
+Subpackages
+-----------
+
+.. toctree::
+   :maxdepth: 4
+
+   seismicrna.cluster.tests
+
 Submodules
 ----------
 
diff --git a/docs/_sources/api/seismicrna.cluster.tests.rst.txt b/docs/_sources/api/seismicrna.cluster.tests.rst.txt
new file mode 100644
index 00000000..74f22c08
--- /dev/null
+++ b/docs/_sources/api/seismicrna.cluster.tests.rst.txt
@@ -0,0 +1,16 @@
+seismicrna.cluster.tests package
+================================
+
+.. automodule:: seismicrna.cluster.tests
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Submodules
+----------
+
+
+.. automodule:: seismicrna.cluster.tests.em_test
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/_sources/api/seismicrna.core.batch.rst.txt b/docs/_sources/api/seismicrna.core.batch.rst.txt
index 907cfe75..736fa800 100644
--- a/docs/_sources/api/seismicrna.core.batch.rst.txt
+++ b/docs/_sources/api/seismicrna.core.batch.rst.txt
@@ -6,6 +6,14 @@ seismicrna.core.batch package
    :undoc-members:
    :show-inheritance:
 
+Subpackages
+-----------
+
+.. toctree::
+   :maxdepth: 4
+
+   seismicrna.core.batch.tests
+
 Submodules
 ----------
 
diff --git a/docs/_sources/api/seismicrna.core.batch.tests.rst.txt b/docs/_sources/api/seismicrna.core.batch.tests.rst.txt
new file mode 100644
index 00000000..ab1ccc3b
--- /dev/null
+++ b/docs/_sources/api/seismicrna.core.batch.tests.rst.txt
@@ -0,0 +1,22 @@
+seismicrna.core.batch.tests package
+===================================
+
+.. automodule:: seismicrna.core.batch.tests
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Submodules
+----------
+
+
+.. automodule:: seismicrna.core.batch.tests.count_test
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
+.. automodule:: seismicrna.core.batch.tests.index_test
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/_sources/api/seismicrna.core.mu.rst.txt b/docs/_sources/api/seismicrna.core.mu.rst.txt
index 85f25f7b..72ea149c 100644
--- a/docs/_sources/api/seismicrna.core.mu.rst.txt
+++ b/docs/_sources/api/seismicrna.core.mu.rst.txt
@@ -13,7 +13,6 @@ Subpackages
    :maxdepth: 4
 
    seismicrna.core.mu.tests
-   seismicrna.core.mu.unbias
 
 Submodules
 ----------
@@ -47,3 +46,9 @@ Submodules
    :members:
    :undoc-members:
    :show-inheritance:
+
+
+.. automodule:: seismicrna.core.mu.unbias
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/_sources/api/seismicrna.core.mu.tests.rst.txt b/docs/_sources/api/seismicrna.core.mu.tests.rst.txt
index dc379bef..9a899b56 100644
--- a/docs/_sources/api/seismicrna.core.mu.tests.rst.txt
+++ b/docs/_sources/api/seismicrna.core.mu.tests.rst.txt
@@ -38,3 +38,9 @@ Submodules
    :members:
    :undoc-members:
    :show-inheritance:
+
+
+.. automodule:: seismicrna.core.mu.tests.unbias_test
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/_sources/api/seismicrna.core.mu.unbias.rst.txt b/docs/_sources/api/seismicrna.core.mu.unbias.rst.txt
deleted file mode 100644
index 9290330f..00000000
--- a/docs/_sources/api/seismicrna.core.mu.unbias.rst.txt
+++ /dev/null
@@ -1,30 +0,0 @@
-seismicrna.core.mu.unbias package
-=================================
-
-.. automodule:: seismicrna.core.mu.unbias
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
-Subpackages
------------
-
-.. toctree::
-   :maxdepth: 4
-
-   seismicrna.core.mu.unbias.tests
-
-Submodules
-----------
-
-
-.. automodule:: seismicrna.core.mu.unbias.algo
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
-
-.. automodule:: seismicrna.core.mu.unbias.frame
-   :members:
-   :undoc-members:
-   :show-inheritance:
diff --git a/docs/_sources/api/seismicrna.core.mu.unbias.tests.rst.txt b/docs/_sources/api/seismicrna.core.mu.unbias.tests.rst.txt
deleted file mode 100644
index d6cc5cce..00000000
--- a/docs/_sources/api/seismicrna.core.mu.unbias.tests.rst.txt
+++ /dev/null
@@ -1,22 +0,0 @@
-seismicrna.core.mu.unbias.tests package
-=======================================
-
-.. automodule:: seismicrna.core.mu.unbias.tests
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
-Submodules
-----------
-
-
-.. automodule:: seismicrna.core.mu.unbias.tests.algo_test
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
-
-.. automodule:: seismicrna.core.mu.unbias.tests.frame_test
-   :members:
-   :undoc-members:
-   :show-inheritance:
diff --git a/docs/_sources/api/seismicrna.core.rst.txt b/docs/_sources/api/seismicrna.core.rst.txt
index 50dd5e4d..184e31cd 100644
--- a/docs/_sources/api/seismicrna.core.rst.txt
+++ b/docs/_sources/api/seismicrna.core.rst.txt
@@ -34,6 +34,12 @@ Submodules
    :show-inheritance:
 
 
+.. automodule:: seismicrna.core.dims
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
 .. automodule:: seismicrna.core.header
    :members:
    :undoc-members:
diff --git a/docs/_sources/api/seismicrna.core.tests.rst.txt b/docs/_sources/api/seismicrna.core.tests.rst.txt
index 4063bd1a..f8e87be3 100644
--- a/docs/_sources/api/seismicrna.core.tests.rst.txt
+++ b/docs/_sources/api/seismicrna.core.tests.rst.txt
@@ -10,6 +10,12 @@ Submodules
 ----------
 
 
+.. automodule:: seismicrna.core.tests.dims_test
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
 .. automodule:: seismicrna.core.tests.header_test
    :members:
    :undoc-members:
diff --git a/docs/_sources/api/seismicrna.relate.rst.txt b/docs/_sources/api/seismicrna.relate.rst.txt
index c8646df9..8f263549 100644
--- a/docs/_sources/api/seismicrna.relate.rst.txt
+++ b/docs/_sources/api/seismicrna.relate.rst.txt
@@ -15,6 +15,7 @@ Subpackages
    seismicrna.relate.aux
    seismicrna.relate.c
    seismicrna.relate.py
+   seismicrna.relate.tests
 
 Submodules
 ----------
@@ -56,6 +57,12 @@ Submodules
    :show-inheritance:
 
 
+.. automodule:: seismicrna.relate.sim
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+
 .. automodule:: seismicrna.relate.write
    :members:
    :undoc-members:
diff --git a/docs/_sources/api/seismicrna.relate.tests.rst.txt b/docs/_sources/api/seismicrna.relate.tests.rst.txt
new file mode 100644
index 00000000..3b2cac38
--- /dev/null
+++ b/docs/_sources/api/seismicrna.relate.tests.rst.txt
@@ -0,0 +1,16 @@
+seismicrna.relate.tests package
+===============================
+
+.. automodule:: seismicrna.relate.tests
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Submodules
+----------
+
+
+.. automodule:: seismicrna.relate.tests.sim_test
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/_sources/howto/run/cluster.rst.txt b/docs/_sources/howto/run/cluster.rst.txt
index 95d53e90..1b737a1c 100644
--- a/docs/_sources/howto/run/cluster.rst.txt
+++ b/docs/_sources/howto/run/cluster.rst.txt
@@ -11,17 +11,14 @@ Cluster input file: Mask report
 You can give any number of Mask report files as inputs for the Cluster step.
 See :doc:`../inputs` for ways to list multiple files.
 
-For example, to cluster relation vectors of reads from ``sample-1`` masked over
-reference ``ref-1`` section ``abc``, and from ``sample-2`` masked over reference
-``ref-2`` section ``full``, use the command ::
+Cluster all masked reads in ``out``::
 
-    seismic cluster {out}/sample-1/mask/ref-1/abc {out}/sample-2/mask/ref-2/full
+    seismic cluster out
 
-where ``{out}`` is the path of your output directory from the Relate step.
+Cluster reads from ``sample-1`` masked over reference ``ref-1``,
+section ``abc``::
 
-To cluster all masked relation vectors in ``{out}``, you can use the command ::
-
-    seismic cluster {out}
+    seismic cluster out/sample-1/mask/ref-1/abc
 
 Cluster: Settings
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -105,9 +102,7 @@ You can set the number of independent EM runs using ``--em-runs`` (``-e``).
 Cluster: Output files
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-All output files go into the directory ``{out}/{sample}/cluster/{ref}/{sect}``,
-where ``{out}`` is the output directory, ``{sample}`` is the sample, ``{ref}``
-is the reference, and ``{sect}`` is the section.
+All output files go into the directory ``OUT/SAMPLE/cluster/REFERENCE/SECTION``.
 
 Cluster output file: Batch of cluster memberships
 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
@@ -171,7 +166,7 @@ In your cluster report:
   If all runs converged to identical solutions, then every NRMSD would be 0 and
   every Correlation would be 1.
   Generally, the runs are sufficiently reproducible if runs 1 and 2 have NRMSDs
-  less than 0.1 and Correlations greater than 0.95 with respect to run 0.
+  less than 0.05 and Correlations greater than 0.98 with respect to run 0.
   If not, then you have no evidence that run 0 is the global optimum for
   that number of clusters, so it would be best to rerun clustering using more
   independent runs to increase the chances of finding the global optimum.
diff --git a/docs/_sources/howto/run/mask.rst.txt b/docs/_sources/howto/run/mask.rst.txt
index cfe9a379..d424a948 100644
--- a/docs/_sources/howto/run/mask.rst.txt
+++ b/docs/_sources/howto/run/mask.rst.txt
@@ -110,20 +110,27 @@ Mask setting: Filter reads
 The second substep of masking is filtering reads.
 You can filter reads based on three criteria, in this order:
 
+Filter reads by number of positions covering the section
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+You can require every read to contain a minimum number of bases in the section
+(i.e. set a minimum coverage) using ``--min-ncov-read`` followed by the minimum
+coverage.
+The minimum coverage must be at least 1 because reads that do not cover the
+section at all should always be filtered out.
+Note that this filter considers only positions that were not pre-excluded (see
+:ref:`mask_exclude`).
+
 Filter reads by fraction of informative positions
 ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
 
-For some applications, such as finding alternative structures, every read must
-span the vast majority of positions in the section of the reference.
-You can set a limit on the minimum number of informative bases in the read,
-as a fraction of the number of non-excluded positions in the section, using
-``--min-finfo-read {f}``.
-For example, to require 95% of the non-excluded positions in the section to be
+You can set a limit on the minimum information in each read, as a fraction of
+the number of non-excluded positions in the read, using ``--min-finfo-read``
+followed by the minimum fraction of informative positions.
+For example, to require 95% of the non-excluded positions in the read to be
 informative, use ``--min-finfo-read 0.95``.
-If the section had 296 positions, and 141 remained after excluding positions
-(see :ref:`mask_exclude`), then a read with 137 informative positions would
-have an informed fraction of 97% and be kept, but a read with 133 informative
-positions would have an informed fraction of 94% and be discarded.
+Note that the denominator of this fraction is the number of bases in the read
+that cover the section; it is not just the length of the section or of the read.
 
 Filter reads by fraction of mutated positions
 ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
diff --git a/docs/api/seismicrna.cluster.html b/docs/api/seismicrna.cluster.html
index bc0b9e5b..3cef2512 100644
--- a/docs/api/seismicrna.cluster.html
+++ b/docs/api/seismicrna.cluster.html
@@ -19,7 +19,7 @@
-
+
@@ -56,6 +56,7 @@
  • seismicrna.align package
  • seismicrna.cleanfa package
  • seismicrna.cluster package
  • @@ -113,6 +114,17 @@

    seismicrna.cluster package

    +
    +

    Subpackages

    + +

    Submodules

    @@ -452,12 +464,12 @@

    Cluster Comparison Module

    class seismicrna.cluster.em.EmClustering(uniq_reads: seismicrna.cluster.uniq.UniqReads, order: int, *, min_iter: int, max_iter: int, conv_thresh: float)

    Bases: object

    -

    Run expectation-maximization to cluster the given bit vectors -into the specified number of clusters.

    +

    Run expectation-maximization to cluster the given reads into the +specified number of clusters.

    property bic
    -

    Compute this model’s Bayesian Information Criterion.

    +

    Bayesian Information Criterion of the model.

    @@ -473,6 +485,24 @@

    Cluster Comparison Module +
    +property end3s
    +

    3’ end coordinates (0-indexed).

    +

    + +
    +
    +property end5s
    +

    5’ end coordinates (0-indexed).

    +
    + +
    +
    +get_members(batch_num: int)
    +

    Cluster memberships of the reads in the batch.

    +
    +
    get_mus()
    @@ -485,12 +515,6 @@

    Cluster Comparison Module -
    -get_resps(batch_num: int)
    -

    Responsibilities of the reads in the batch.

    -

    -
    property log_like
    @@ -508,26 +532,37 @@

    Cluster Comparison Module
    property logn_exp
    -

    Log number of expected observations of each bit vector.

    +

    Log number of expected observations of each read.

    -
    -property nreads_total
    -

    Total number of reads, including redundant ones.

    +
    +property masked
    +

    Masked positions (0-indexed).

    -
    -property prop_adj
    -

    Proportion of each cluster, adjusted for observer bias.

    +
    +property n_data
    +

    Number of data points in the model.

    -
    -property prop_obs
    -

    Observed proportion of each cluster, without adjusting for -observer bias.

    +
    +property n_params
    +

    Number of parameters in the model.

    +
    + +
    +
    +property n_pos_total
    +

    Number of positions, including those masked.

    +
    + +
    +
    +property n_pos_unmasked
    +

    Number of unmasked positions.

    @@ -552,41 +587,17 @@

    Cluster Comparison Module -
    -property sparse_mus
    -

    Mutation rates of all positions (row) in each cluster (col), -including positions that are not used for clustering. The rate -for every unused position always remains zero.

    +
    +property section_end5
    +

    5’ end of the section.

    +
    +
    +property unmasked
    +

    Unmasked positions (0-indexed).

    -
    -
    -seismicrna.cluster.em.calc_bic(n_params: int, n_data: int, log_like: float, min_data_param_ratio: float = 10.0)
    -

    Compute the Bayesian Information Criterion (BIC) of a model. -Typically, the model with the smallest BIC is preferred.

    -
    -
    Parameters
    -
      -
    • n_params (int) – Number of parameters that the model estimates

    • -
    • n_data (int) – Number of data points in the sample from which the parameters -were estimated

    • -
    • log_like (float) – Natural logarithm of the likelihood of observing the data given -the parameters

    • -
    • min_data_param_ratio (float = 10.0) – In order for the BIC approximation to be valid, the sample size -must be much larger than the number of estimated parameters. -Issue a warning if the sample size is less than this ratio times -the number of parameters, but still compute and return the BIC.

    • -
    -
    -
    Returns
    -

    Bayesian Information Criterion (BIC)

    -
    -
    Return type
    -

    float

    -
    -
    @@ -615,7 +626,7 @@

    Cluster Comparison Module
    -seismicrna.cluster.main.run(input_path: tuple[str, ...], *, max_clusters: int = 0, em_runs: int = 6, min_em_iter: int = 8, max_em_iter: int = 512, em_thresh: float = 0.6931471805599453, brotli_level: int = 10, max_procs: int = 16, parallel: bool = True, force: bool = False) list[pathlib.Path]
    +seismicrna.cluster.main.run(input_path: tuple[str, ...], *, max_clusters: int = 0, em_runs: int = 6, min_em_iter: int = 8, max_em_iter: int = 512, em_thresh: float = 0.1, brotli_level: int = 10, max_procs: int = 16, parallel: bool = True, force: bool = False) list[pathlib.Path]

    Infer alternative structures by clustering reads’ mutations.

    Parameters
    @@ -624,7 +635,7 @@

    Cluster Comparison Module

  • em_runs (int) – Number of independent runs for each number of clusters [keyword-only, default: 6]

  • min_em_iter (int) – Minimum iterations per clustering run [keyword-only, default: 8]

  • max_em_iter (int) – Maximum iterations per clustering run [keyword-only, default: 512]

  • -
  • em_thresh (float) – Maximum change in log likelihood for convergence [keyword-only, default: 0.6931471805599453]

  • +
  • em_thresh (float) – Maximum change in log likelihood for convergence [keyword-only, default: 0.1]

  • brotli_level (int) – Compression level for brotli (0 - 11) [keyword-only, default: 10]

  • max_procs (int) – Maximum number of simultaneous processes [keyword-only, default: 16]

  • parallel (bool) – Run tasks in parallel [keyword-only, default: True]

  • @@ -677,18 +688,48 @@

    Cluster – Names Module
    -class seismicrna.cluster.uniq.UniqReads(sample: str, section: seismicrna.core.seq.section.Section, min_mut_gap: int, muts_per_pos: list[numpy.ndarray], batch_to_uniq: list[pandas.core.series.Series], counts_per_uniq: numpy.ndarray)
    +class seismicrna.cluster.uniq.UniqReads(sample: str, section: seismicrna.core.seq.section.Section, min_mut_gap: int, ends: tuple[numpy.ndarray, numpy.ndarray], muts_per_pos: list[numpy.ndarray], batch_to_uniq: list[pandas.core.series.Series], counts_per_uniq: numpy.ndarray)

    Bases: object

    Collection of bit vectors of unique reads.

    +
    +
    +property end3s
    +

    3’ end coordinates.

    +
    + +
    +
    +property end3s_zero
    +

    3’ end coordinates (0-indexed from 5’ end of section).

    +
    + +
    +
    +property end5s
    +

    5’ end coordinates.

    +
    + +
    +
    +property end5s_zero
    +

    5’ end coordinates (0-indexed from 5’ end of section).

    +
    +
    classmethod from_dataset(dataset: seismicrna.mask.data.MaskMutsDataset)
    -
    -get_matrix()
    -

    Full boolean matrix of the unique bit vectors.

    +
    +get_cov_matrix()
    +

    Full boolean matrix of the covered positions.

    +
    + +
    +
    +get_mut_matrix()
    +

    Full boolean matrix of the mutations.

    @@ -704,15 +745,9 @@

    Cluster – Names Module

    -
    -property num_pos
    -

    Number of positions in each bit vector.

    -
    - -
    -
    -property num_reads
    -

    Number of total reads (including duplicates).

    +
    +property num_nonuniq: int
    +

    Number of total reads (including non-unique reads).

    @@ -721,11 +756,6 @@

    Cluster – Names Module

    Number of unique reads.

    -
    -
    -property pos_nums
    -
    -
    property ref
    @@ -734,31 +764,11 @@

    Cluster – Names Module

    -
    -
    -seismicrna.cluster.uniq.batch_to_uniq_read_num(read_nums_per_batch: list[numpy.ndarray], uniq_read_nums: Iterable[list])
    -

    Map each read’s number in its own batch to its unique number in -the pool of all batches.

    -
    - -
    -
    -seismicrna.cluster.uniq.count_uniq_reads(uniq_read_nums: Iterable[list])
    -

    Count the occurrences of each unique value in the original.

    -
    -
    seismicrna.cluster.uniq.get_uniq_reads(pos_nums: Iterable[int], pattern: seismicrna.core.rel.pattern.RelPattern, batches: Iterable[seismicrna.core.batch.muts.RefseqMutsBatch])
    -
    -
    -seismicrna.cluster.uniq.uniq_reads_to_mutations(uniq_reads: Iterable[tuple], pos_nums: Iterable[int])
    -

    Map each position to the numbers of the unique reads that are -mutated at the position.

    -
    -
    seismicrna.cluster.write.cluster(mask_report_file: pathlib.Path, max_order: int, n_runs: int, *, min_iter: int, max_iter: int, conv_thresh: float, brotli_level: int, n_procs: int, force: bool)
    @@ -791,7 +801,7 @@

    Cluster – Names Module

diff --git a/docs/api/seismicrna.core.mu.tests.html b/docs/api/seismicrna.core.mu.tests.html
index 9757719f..68baeb4b 100644
--- a/docs/api/seismicrna.core.mu.tests.html
+++ b/docs/api/seismicrna.core.mu.tests.html
@@ -19,7 +19,7 @@
-
+
@@ -952,6 +952,518 @@
+
    +
    +class seismicrna.core.mu.tests.unbias_test.CalcRectangularSum(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +static calc_spanning_sum_slow(array: numpy.ndarray)
    +
    + +
    +
    +test_2d()
    +
    + +
    +
    +test_3d()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestAdjustMinGap(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_gap_le_zero_le_npos()
    +
    + +
    +
    +test_npos_le_zero_le_gap()
    +
    + +
    +
    +test_zero_le_gap_lt_npos()
    +
    + +
    +
    +test_zero_lt_npos_le_gap()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPClust(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_p_clust()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPClustGivenNoClose(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_ncls1()
    +
    + +
    +
    +test_ncls2()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPEnds(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_invert()
    +

    Test the inverse of calc_p_ends_given_noclose.

    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPEndsGivenNoClose(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_npos0_ncls1()
    +
    + +
    +
    +test_npos1_ncls0()
    +
    + +
    +
    +test_npos1_ncls1()
    +
    + +
    +
    +test_npos2_ncls1()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPEndsObserved(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_calc_p_ends_given_observed()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPMutGivenSpan(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_invert()
    +

    Test the inverse of _calc_p_mut_given_span_noclose.

    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPMutGivenSpanNoClose(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_clusters()
    +
    + +
    +
    +test_simulated()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcPNoClose(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_p_noclose()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestCalcParams(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_infer()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestClip(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_with_clip()
    +
    + +
    +
    +test_without_clip()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestNoCloseMuts(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_more_muts()
    +
    + +
    +
    +test_no_muts()
    +
    + +
    +
    +test_one_mut()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestNormalize(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_sum_positive()
    +
    + +
    +
    +test_sum_zero()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestPrivateCalcPNoCloseGivenEnds(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +

    Test _calc_p_nomut_window and _calc_p_noclose_given_ends.

    +
    +
    +test_clusters()
    +
    + +
    +
    +test_length_0()
    +
    + +
    +
    +test_length_1()
    +
    + +
    +
    +test_length_2_min_gap_1()
    +
    + +
    +
    +test_length_3_min_gap_1()
    +
    + +
    +
    +test_length_3_min_gap_2()
    +
    + +
    +
    +test_length_4_min_gap_1()
    +
    + +
    +
    +test_length_4_min_gap_2()
    +
    + +
    +
    +test_length_4_min_gap_3()
    +
    + +
    +
    +test_min_gap_0()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestPublicCalcPNoCloseGivenEnds(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_1_dim()
    +
    + +
    +
    +test_2_dim()
    +
    + +
    +
    +test_invalid_dim()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestTriuAllClose(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_equal()
    +
    + +
    +
    +test_tril()
    +
    + +
    +
    +test_triu()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestTriuDiv(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_1x1()
    +
    + +
    +
    +test_1x1x1()
    +
    + +
    +
    +test_2x2()
    +
    + +
    +
    +test_2x2x2()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestTriuDot(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_1x1()
    +
    + +
    +
    +test_1x1x1()
    +
    + +
    +
    +test_2x2()
    +
    + +
    +
    +test_2x2x2()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestTriuLog(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +compare(result: numpy.ndarray, expect: numpy.ndarray)
    +
    + +
    +
    +test_2d()
    +
    + +
    +
    +test_3d()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestTriuNorm(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_1x1()
    +
    + +
    +
    +test_1x1x1()
    +
    + +
    +
    +test_1x1x2()
    +
    + +
    +
    +test_2x2()
    +
    + +
    +
    +test_2x2x1()
    +
    + +
    +
    +test_2x2x2()
    +
    + +
    +
    +test_2x2x2x2()
    +
    + +
    + +
    +
    +class seismicrna.core.mu.tests.unbias_test.TestTriuSum(methodName='runTest')
    +

    Bases: unittest.case.TestCase

    +
    +
    +test_1x1()
    +
    + +
    +
    +test_1x1x1()
    +
    + +
    +
    +test_1x1x2()
    +
    + +
    +
    +test_2x2()
    +
    + +
    +
    +test_2x2x1()
    +
    + +
    +
    +test_2x2x2()
    +
    + +
    +
    +test_2x2x2x2()
    +
    + +
    +
    +test_all_zero()
    +
    + +
    + +
    +
    +seismicrna.core.mu.tests.unbias_test.label_no_close_muts(muts: numpy.ndarray, min_gap: int)
    +

    Return a 1D vector that is True for every row in muts that +has no two mutations that are too close, otherwise False.

    +
    + +
    +
    +seismicrna.core.mu.tests.unbias_test.no_close_muts(read: numpy.ndarray, min_gap: int)
    +

    Return True if the read has no two mutations separated by fewer +than min_gap non-mutated positions, otherwise False.

    +
    + +
    +
    +seismicrna.core.mu.tests.unbias_test.simulate_params(n_pos: int, n_cls: int, p_mut_max: float = 1.0)
    +

    Return p_mut, p_ends, and p_cls parameters.

    +
    + +
    +
    +seismicrna.core.mu.tests.unbias_test.simulate_reads(n_reads: int, p_mut: numpy.ndarray, p_ends: numpy.ndarray)
    +

    Simulate n_reads reads based on the mutation rates (p_mut) +and the distributions of end coordinates (p_ends).

    +
    +
    @@ -960,7 +1472,7 @@