diff --git a/.github/workflows/documentation.yml b/.github/workflows/documentation.yml index d8b6316..c648a97 100755 --- a/.github/workflows/documentation.yml +++ b/.github/workflows/documentation.yml @@ -18,7 +18,7 @@ jobs: sudo apt-get install python3-distutils - name: Build Documentation working-directory: docs - run: sphinx-build . _build + run: sphinx-build -W --keep-going . _build - name: copy image files run: cp -r docs/assets docs/_build/ - uses: actions/upload-pages-artifact@v3 diff --git a/README.md b/README.md index 54337f3..e41abc9 100755 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ [![PyPI version](https://img.shields.io/pypi/v/crispr-bean)](https://pypi.org/project/crispr-bean/) [![Code style](https://img.shields.io/badge/code%20style-black-black)](https://github.com/psf/black) -`bean` unconfounds variant effect from variable editing outcome of CRISPR screens by considering genotypic outcome from *reporter* sequence. +`bean` improves CRISPR pooled screen analysis by 1) unconfounding variable per-guide editing outcome by considering genotypic outcome from *reporter* sequence and 2) through accurate modeling of screen procedure. Reporter construct diff --git a/docs/ReporterScreen_api.rst b/docs/ReporterScreen_api.rst index d489ffc..54604cf 100755 --- a/docs/ReporterScreen_api.rst +++ b/docs/ReporterScreen_api.rst @@ -1195,7 +1195,7 @@ LFC calculation & Addition sns.pairplot(lfcs) -.. image:: ../imgs/output_20_2.png +.. image:: assets/output_20_2.png LFC can be aggregated for biological replicates. @@ -1599,7 +1599,7 @@ Getting edit rates from allele counts plt.show() -.. image:: ../imgs/output_34_1.png +.. 
image:: assets/output_34_1.png diff --git a/docs/_count.md b/docs/_count.md index 7400cba..b65b98e 100755 --- a/docs/_count.md +++ b/docs/_count.md @@ -3,7 +3,7 @@ -```python +```bash bean count-samples \ --input sample_list.csv `# sample with lines 'R1_filepath,R2_filepath,sample_name\n'` \ -b A `# base that is being edited (A/G)` \ @@ -13,7 +13,7 @@ bean count-samples \ -t 12 `# number of threads` \ --name my_sorting_screen `# name of this sample run` \ ``` -```python +```bash bean count --R1 R1.fq --R2 R2.fq -b A -f sgRNA_info_table.csv -r ``` By default, `bean count[-samples]` assume R1 and R2 are trimmed off of the adapter sequence. You may need to adjust the command arguments according to your read structure. diff --git a/docs/_guild/.doctrees/ReporterScreen_api.doctree b/docs/_guild/.doctrees/ReporterScreen_api.doctree deleted file mode 100644 index 0ecb4df..0000000 Binary files a/docs/_guild/.doctrees/ReporterScreen_api.doctree and /dev/null differ diff --git a/docs/_index.md b/docs/_index.md deleted file mode 100755 index 45e2414..0000000 --- a/docs/_index.md +++ /dev/null @@ -1,4 +0,0 @@ ---- -layout: default -title: CRISPR-BEAN ---- diff --git a/docs/_input.md b/docs/_input.md index 21368b8..0a4b6c6 100755 --- a/docs/_input.md +++ b/docs/_input.md @@ -4,7 +4,7 @@ File should contain following columns. * `name`: gRNA ID column * `sequence`: gRNA sequence * `barcode`: R2 barcode to help match reporter to gRNA, written in the sense direction (as in R1) -* In order to use accessibility in the [variant effect quantification](#bean-run-quantify-variant-effects), provide accessibility information in one of two options. (For non-targeting guides, provide NA values (empty cell).) +* In order to use accessibility in the variant effect quantification downstream (in [`bean run`](https://pinellolab.github.io/crispr-bean/run.html)), provide accessibility information in one of two options. (For non-targeting guides, provide NA values (empty cell).) 
* Option 1: `chrom` & `genomic_pos`: Chromosome (ex. `chr19`) and genomic position of guide sequence. You will have to provide the path to the bigwig file with matching reference version in `bean run`. * Option 2: `accessibility_signal`: ATAC-seq signal value of the target loci of each guide. * For variant library (gRNAs are designed to target specific variants and ignores bystander edits) @@ -16,7 +16,7 @@ File should contain following columns. * `chrom`: Chromosome of gRNA targeted locus. * `start_pos`: gRNA starting position in the genome. Required when you provide `strand` column. Should specify the smaller coordinate value among start and end position regardless of gRNA strandedness. -Also see examples for [variant library](tests/data/test_guide_info.csv) and [tiling library](tests/data/test_guide_info_tiling.csv). +Also see examples for [variant library](https://github.com/pinellolab/crispr-bean/blob/main/tests/data/test_guide_info.csv) and [tiling library](https://github.com/pinellolab/crispr-bean/blob/main/tests/data/test_guide_info_tiling_chrom.csv). ## sample_list.csv File should contain following columns with header. @@ -34,4 +34,4 @@ For proliferation / survival screens: * `time`: Numeric time following the base editing of each sample. -Also see examples for [FACS sorting screen](tests/data/sample_list.csv) and [proliferation / survival screen](tests/data/sample_list_survival.csv). \ No newline at end of file +Also see examples for [FACS sorting screen](https://github.com/pinellolab/crispr-bean/blob/main/tests/data/sample_list.csv) and [proliferation / survival screen](https://github.com/pinellolab/crispr-bean/blob/main/tests/data/sample_list_survival.csv). \ No newline at end of file diff --git a/docs/_ldl_cds.md b/docs/_ldl_cds.md index 90740ed..3913879 100755 --- a/docs/_ldl_cds.md +++ b/docs/_ldl_cds.md @@ -51,10 +51,11 @@ bean run sorting tiling \ --scale-by-acc \ --accessibility-col accessibility ``` + See more details below. ## 1. 
Count gRNA & reporter (:ref:`count_samples`) -``` +```bash screen_id=my_sorting_tiling_screen working_dir=my_workdir @@ -67,11 +68,13 @@ bean count-samples \ -n ${screen_id} `# ID of the screen` \ --tiling ``` -Make sure you follow the [input file format](../../README#input-file-format) for seamless downstream steps. This will produce `./bean_count_${screen_id}.h5ad`. + +Make sure you follow the [input file format](https://pinellolab.github.io/crispr-bean/input.html) for seamless downstream steps. This will produce `./bean_count_${screen_id}.h5ad`. ## 2. QC (:ref:`qc`) -Base editing data will include QC about editing efficiency. As QC uses predefined column names and values, beware to follow the [input file guideline](../../README#input-file-format), but you can change the parameters with the full argument list of [`bean qc`](../../README#bean qc-qc-of-reporter-screen-data). (Common factors you may want to tweak is `--ctrl-cond=bulk` and `--lfc-conds=top,bot` if you have different sample condition labels.) -``` +Base editing data will include QC about editing efficiency. As QC uses predefined column names and values, beware to follow the [input file guideline](https://pinellolab.github.io/crispr-bean/input.html), but you can change the parameters with the full argument list of [bean qc](https://pinellolab.github.io/crispr-bean/qc.html). (Common factors you may want to tweak is `--ctrl-cond=bulk` and `--lfc-conds=top,bot` if you have different sample condition labels.) 
+ +```bash bean qc \ ${working_dir}/bean_count_${screen_id}.h5ad `# Input ReporterScreen .h5ad file path` \ -o ${working_dir}/bean_count_${screen_id}_masked.h5ad `# Output ReporterScreen .h5ad file path` \ @@ -87,13 +90,15 @@ If the data does not include reporter editing data, you can provide `--no-editin As tiling library doesn't have designated per-gRNA target variant, any base edit observed in reporter may be the candidate variant, while having too many variants with very low editing rate significantly decreases the power. Variants are filtered based on multiple criteria in `bean fitler`. If the screen targets coding sequence, it's beneficial to translate edits into coding varaints whenever possible for better power. For translation, provide `--translate` and one of the following: -``` + +```bash [ --translate-gene-name GENE_SYMBOL OR --translate-genes-list path_to_gene_names_file.txt OR --translate-fasta gene_exon.fa, OR --translate-fastas-csv gene_exon_fas.csv] ``` -where `path_to_gene_names_file.txt` has one gene symbol per line, and gene symbol uses its MANE transcript (hg38) coordinates of exons. In order to use other reference versions or transcript ID, you'll need to feed in fasta file. See detailed formatting of fasta file [here](../../README#translating-alleles). + +where `path_to_gene_names_file.txt` has one gene symbol per line, and gene symbol uses its MANE transcript (hg38) coordinates of exons. In order to use other reference versions or transcript ID, you'll need to feed in fasta file. See [detailed formatting of fasta file](https://pinellolab.github.io/crispr-bean/filter.html#translating-alleles). 
Example allele filtering given we're translating based on MANE transcript exons of multiple gene symbols: @@ -106,15 +111,16 @@ bean filter ${working_dir}/bean_count_${screen_id}_masked.h5ad \ --translate --translate-genes-list ${working_dir}/gene_symbols.txt ``` -Ouptut file `` shows number of alleles per guide and number of guides per variant, where we want high enough values for the latter. See the typical output for dataset with good editing coverage & filtering result [here](../example_filtering_ouptut/). +Output file `` shows the number of alleles per guide and the number of guides per variant, where we want high enough values for the latter. See the [typical output](https://github.com/pinellolab/crispr-bean/tree/main/docs/example_filtering_ouptut/) for a dataset with good editing coverage & filtering result. ## 4. Quantify variant effect (:ref:`run`) -By default, `bean run [sorting,survival] tiling` uses most filtered allele counts table for variant identification and quantification of their effects. **Check [allele filtering output](../example_filtering_ouptut/)** and choose alternative filtered allele counts table if necessary. +By default, `bean run [sorting,survival] tiling` uses the most filtered allele counts table for variant identification and quantification of their effects. Check [allele filtering output](https://github.com/pinellolab/crispr-bean/tree/main/docs/example_filtering_ouptut/) and choose an alternative filtered allele counts table if necessary. `bean run` can take 3 run options to quantify editing rate: 1. From **reporter + accessibility** - 1-1. If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA accessibility score, - ``` + 1-1. 
If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA accessibility score, + + ```bash bean run sorting tiling \ ${working_dir}/bean_count_${screen_id}_alleleFiltered.h5ad \ -o $working_dir \ @@ -122,8 +128,10 @@ By default, `bean run [sorting,survival] tiling` uses most filtered allele count --scale-by-acc \ --accessibility-col accessibility ``` - 1-2. If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA chromosome & position and you have bigWig file with accessibility signal, - ``` + + 1-2. If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA chromosome & position and you have bigWig file with accessibility signal, + + ```bash bean run sorting tiling \ ${working_dir}/bean_count_${screen_id}_alleleFiltered.h5ad \ -o $working_dir \ @@ -133,15 +141,18 @@ By default, `bean run [sorting,survival] tiling` uses most filtered allele count ``` 2. From **reporter** - ``` + + ```bash bean run sorting tiling \ ${working_dir}/bean_count_${screen_id}_alleleFiltered.h5ad \ -o $working_dir \ --fit-negctrl ``` + 3. No reporter information, assume the same editing efficiency of all gRNAs. - Use this option if your data don't have editing rate information. - ``` + Use this option if your data don't have editing rate information. + + ```bash bean run sorting tiling \ ${working_dir}/bean_count_${screen_id}_alleleFiltered.h5ad \ -o $working_dir \ diff --git a/docs/_ldl_var.md b/docs/_ldl_var.md index 7eb1af7..f81f714 100755 --- a/docs/_ldl_var.md +++ b/docs/_ldl_var.md @@ -43,6 +43,7 @@ bean-run sorting variant \ --scale-by-acc \ --accessibility-col accessibility ``` + See more details below. ## 1. 
Count gRNA & reporter (:ref:`count_samples`) @@ -58,11 +59,13 @@ bean-count-samples \ -r `# Quantify reporter edits` \ -n ${screen_id} `# ID of the screen to be counted` ``` -Make sure you follow the [input file format](../../README#input-file-format) for seamless downstream steps. This will produce `./bean_count_${screen_id}.h5ad`. + +Make sure you follow the [input file format](https://pinellolab.github.io/crispr-bean/input.html) for seamless downstream steps. This will produce `./bean_count_${screen_id}.h5ad`. ## 2. QC samples & guides (:ref:`qc`) -Base editing data will include QC about editing efficiency. As QC uses predefined column names and values, beware to follow the [input file guideline](../../README#input-file-format), but you can change the parameters with the full argument list of [`bean-qc`](../../README#bean-qc-qc-of-reporter-screen-data). (Common factors you may want to tweak is `--ctrl-cond=bulk` and `--lfc-conds=top,bot` if you have different sample condition labels.) -``` +Base editing data will include QC about editing efficiency. As QC uses predefined column names and values, beware to follow the [input file guideline](https://pinellolab.github.io/crispr-bean/input.html), but you can change the parameters with the full argument list of [bean qc](https://pinellolab.github.io/crispr-bean/qc.html). (Common factors you may want to tweak is `--ctrl-cond=bulk` and `--lfc-conds=top,bot` if you have different sample condition labels.) + +```bash bean-qc \ bean_count_${screen_id}.h5ad `# Input ReporterScreen .h5ad file path` \ -o bean_count_${screen_id}_masked.h5ad `# Output ReporterScreen .h5ad file path` \ @@ -78,8 +81,8 @@ If the data does not include reporter editing data, you can provide `--no-editin `bean-run` can take 3 run options to quantify editing rate: 1. 
From **reporter + accessibility** - If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA accessibility score, - ``` + If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA accessibility score, + ```bash bean-run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o ${working_dir}/ \ @@ -87,8 +90,10 @@ If the data does not include reporter editing data, you can provide `--no-editin --scale-by-acc \ --accessibility-col accessibility ``` - If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA chromosome & position and you have bigWig file with accessibility signal, - ``` + + If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA chromosome & position and you have bigWig file with accessibility signal, + + ```bash bean-run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o ${working_dir}/ \ @@ -99,16 +104,19 @@ If the data does not include reporter editing data, you can provide `--no-editin 2. From **reporter**, without accessibility - This assumes the all target sites have the uniform chromatin accessibility. - ``` + This assumes the all target sites have the uniform chromatin accessibility. + + ```bash bean-run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o ${working_dir}/ \ --fit-negctrl ``` + 3. No reporter information, assume the same editing efficiency of all gRNAs. - Use this option if your data don't have editing outcome information. - ``` + Use this option if your data don't have editing outcome information. 
+ + ```bash bean-run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o ${working_dir}/ \ diff --git a/docs/_profile.md b/docs/_profile.md index cd40cb4..47806a2 100755 --- a/docs/_profile.md +++ b/docs/_profile.md @@ -1,8 +1,10 @@ # `bean profile`: Profile editing patterns + ```bash bean profile my_sorting_screen.h5ad -o output_prefix `# Prefix for editing profile report` ``` + # Output Above command produces `prefix_editing_preference.[html,ipynb]` as editing preferences ([see example](../notebooks/profile_editing_preference.ipynb)). -Allele translation \ No newline at end of file +Editing profiles \ No newline at end of file diff --git a/docs/_prolif_gwas.md b/docs/_prolif_gwas.md index 9af45ee..cf1ded9 100644 --- a/docs/_prolif_gwas.md +++ b/docs/_prolif_gwas.md @@ -8,7 +8,7 @@ GWAS variant screen with per-variant gRNA tiling design, selected based on FACS Selection - Cells are sorted based on FACS signal quantiles
variant library design + Cells are grown and selected based on their fitness. Cells are sampled at multiple timepoints.
variant library design @@ -56,11 +56,11 @@ bean count-samples \ -r `# Quantify reporter edits` \ -n ${screen_id} `# ID of the screen to be counted` ``` -Make sure you follow the [input file format](../../README#input-file-format) for seamless downstream steps. This will produce `./bean_count_${screen_id}.h5ad`. +Make sure you follow the [input file format](https://pinellolab.github.io/crispr-bean/input.html) for seamless downstream steps. This will produce `./bean_count_${screen_id}.h5ad`. ## 2. QC samples & guides (:ref:`qc`) -Base editing data will include QC about editing efficiency. As QC uses predefined column names and values, beware to follow the [input file guideline](../../README#input-file-format), but you can change the parameters with the full argument list of [`bean qc`](../../README#bean qc-qc-of-reporter-screen-data). (Common factors you may want to tweak is `--ctrl-cond=bulk` and `--lfc-conds=top,bot` if you have different sample condition labels.) -``` +Base editing data will include QC about editing efficiency. As QC uses predefined column names and values, beware to follow the [input file guideline](https://pinellolab.github.io/crispr-bean/input.html), but you can change the parameters with the full argument list of [bean qc](https://pinellolab.github.io/crispr-bean/qc.html). (Common factors you may want to tweak is `--ctrl-cond=bulk` and `--lfc-conds=top,bot` if you have different sample condition labels.) +```bash bean qc \ ${working_dir}/bean_count_${screen_id}.h5ad `# Input ReporterScreen .h5ad file path` \ -o ${working_dir}/bean_count_${screen_id}_masked.h5ad `# Output ReporterScreen .h5ad file path` \ @@ -79,8 +79,8 @@ If the data does not include reporter editing data, you can provide `--no-editin `bean run` can take 3 run options to quantify editing rate: 1. 
From **reporter + accessibility** - If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA accessibility score, - ``` + If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA accessibility score, + ```bash bean run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o $working_dir \ @@ -88,8 +88,8 @@ If the data does not include reporter editing data, you can provide `--no-editin --scale-by-acc \ --accessibility-col accessibility ``` - If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA chromosome & position and you have bigWig file with accessibility signal, - ``` + If your gRNA metadata table (`${working_dir}/test_guide_info.csv` above) included per-gRNA chromosome & position and you have bigWig file with accessibility signal, + ```bash bean run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o $working_dir \ @@ -100,16 +100,17 @@ If the data does not include reporter editing data, you can provide `--no-editin 2. From **reporter**, without accessibility - This assumes the all target sites have the uniform chromatin accessibility. - ``` + This assumes the all target sites have the uniform chromatin accessibility. + ```bash bean run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o $working_dir \ --fit-negctrl ``` 3. No reporter information, assume the same editing efficiency of all gRNAs. - Use this option if your data don't have editing outcome information. - ``` + Use this option if your data don't have editing outcome information. 
+ + ```bash bean run sorting variant \ ${working_dir}/bean_count_${screen_id}_masked.h5ad \ -o $working_dir \ diff --git a/docs/_qc.md b/docs/_qc.md index ee89c3e..7d00c86 100755 --- a/docs/_qc.md +++ b/docs/_qc.md @@ -30,65 +30,3 @@ bean qc \ Above command produces * `my_sorting_screen_masked.h5ad` without problematic replicate and guides and with sample masks, and * `qc_report_my_sorting_screen.[html,ipynb]` as QC report. -##### Optional arguments: -* `-o OUT_SCREEN_PATH`, `--out-screen-path OUT_SCREEN_PATH` - Path where quality-filtered ReporterScreen object to be written to -* `-r OUT_REPORT_PREFIX`, `--out-report-prefix OUT_REPORT_PREFIX` - Output prefix of qc report (prefix.html, prefix.ipynb) - -##### QC thresholds: -* `--count-correlation-thres COUNT_CORRELATION_THRES` - Correlation threshold to mask out. -* `--edit-rate-thres EDIT_RATE_THRES` - Mean editing rate threshold per sample to mask out. -* `--lfc-thres LFC_THRES` - Positive guides' correlation threshold to filter out. - -##### Run options: -* `-b`, `--remove-bad-replicates` - Remove replicates with at least two of its samples meet the QC threshold (bean run does not support having only one sorting bin sample for a replicate). -* `-i`, `--ignore-missing-samples` - If the flag is not provided, if the ReporterScreen object does not contain all condiitons for - each replicate, make fake empty samples. If the flag is provided, don't add dummy samples. -* `--no-editing` Ignore QC about editing. Can be used for QC of other editing modalities. -* `--dont-recalculate-edits` - When ReporterScreen.layers['edit_count'] exists, do not recalculate the edit counts from - ReporterScreen.uns['allele_count']. - -##### Input `.h5ad` formatting: -Note that these arguements will change the way the QC metrics are calculated for guides, samples, or replicates. 
-* `--tiling TILING` Specify that the guide library is tiling library without 'n guides per target' design -* `--replicate-label REPLICATE_LABEL` - Label of column in `bdata.samples` that describes replicate ID. -* `--sample-covariates SAMPLE_COVARIATES` - Comma-separated list of column names in `bdata.samples` that describes non-selective - experimental condition. (drug treatment, etc.) -* `--condition-label CONDITION_LABEL` - Label of column in `bdata.samples` that describes experimental condition. (sorting bin, time, - etc.) -###### Editing rate calculation - * `--control-condition CTRL_COND` - Values in of column in `ReporterScreen.samples[condition_label]` for guide-level editing rate - to be calculated. Default is `None`, which considers all samples. - * `--rel-pos-is-reporter` - Specifies whether `edit_start_pos` and `edit_end_pos` are relative to reporter position. If - `False`, those are relative to spacer position. - Editing rate is calculated with following parameters in - * Variant screens: - * `--target-pos-col TARGET_POS_COL` - Target position column in `bdata.guides` specifying target edit position in reporter - * tiling screens: - * `--edit-start-pos EDIT_START_POS` - Edit start position to quantify editing rate on, 0-based inclusive. - * `--edit-end-pos EDIT_END_POS` - Edit end position to quantify editing rate on, 0-based exclusive. -###### LFC of positive controls - * `--posctrl-col POSCTRL_COL` - Column name in ReporterScreen.guides DataFrame that specifies guide category. To use all - gRNAs, feed empty string ''. - * `--posctrl-val POSCTRL_VAL` - Value in ReporterScreen.guides[`posctrl_col`] that specifies guide will be used as the - positive control in calculating log fold change. 
- * `--lfc-conds LFC_CONDS` - Values in of column in `ReporterScreen.samples[condition_label]` for LFC will be calculated - between, delimited by comma \ No newline at end of file diff --git a/docs/cds.rst b/docs/cds.rst index dc4ea75..8b38b09 100755 --- a/docs/cds.rst +++ b/docs/cds.rst @@ -1,5 +1,5 @@ Coding sequence tiling library -*********************** +********************************************** .. mdinclude:: _ldl_cds.md See :ref:`subcommands` for the full details. diff --git a/docs/conf.py b/docs/conf.py index 6b99fd5..58daeb7 100755 --- a/docs/conf.py +++ b/docs/conf.py @@ -7,17 +7,28 @@ # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information project = "bean" -copyright = "2024, Jayoung Ryu" -author = "Jayoung Ryu" +copyright = "2024, Pinello lab" +author = "Jayoung Ryu, Pinello lab" release = "1.0.0" # -- General configuration --------------------------------------------------- # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration -extensions = ["sphinxarg.ext", "m2r"] +extensions = [ + "sphinxarg.ext", + "m2r", + "sphinx.ext.extlinks", +] templates_path = ["_templates"] exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] +extlinks = { + "git_tag": ("https://github.com/sphinx-doc/alabaster/tree/%s", "%s"), + "bug": ("https://github.com/sphinx-doc/alabaster/issues/%s", "#%s"), + "feature": ("https://github.com/sphinx-doc/alabaster/issues/%s", "#%s"), + "issue": ("https://github.com/sphinx-doc/alabaster/issues/%s", "#%s"), +} + root_doc = "index" numpydoc_show_class_members = False @@ -27,3 +38,18 @@ html_theme = "alabaster" html_static_path = ["_static"] +html_logo = "assets/beans.svg" +html_theme_options = { + "description": "Activity-normalized variant effect size estimation from pooled CRISPR screens", + "github_user": "pinellolab", + "github_repo": "crispr-bean", + "github_button": "true", + "github_count": "false", +} +html_sidebars = { + "**": [ + "about.html", + "globaltoc.html", + 
"searchbox.html", + ] +} diff --git a/docs/count_samples.rst b/docs/count_samples.rst index db89732..d445879 100755 --- a/docs/count_samples.rst +++ b/docs/count_samples.rst @@ -1,8 +1,10 @@ .. _count_samples: + `bean count-samples` *********************** .. mdinclude:: _count.md + Full parameters ================== .. argparse:: diff --git a/docs/example_filtering_output/my_sorting_screen_masked.filtered.filter_log.txt b/docs/example_filtering_output/my_sorting_screen_masked.filtered.filter_log.txt new file mode 100644 index 0000000..e58a02d --- /dev/null +++ b/docs/example_filtering_output/my_sorting_screen_masked.filtered.filter_log.txt @@ -0,0 +1,5 @@ +allele_counts 400734 +allele_counts_spacer 210363 +allele_counts_spacer_noindels 207500 +allele_counts_spacer_noindels_translated 193445 +allele_counts_spacer_noindels_translated_prop0.05_0.2 31517 diff --git a/docs/example_filtering_output/my_sorting_screen_masked.filtered.filtered_allele_stats.pdf b/docs/example_filtering_output/my_sorting_screen_masked.filtered.filtered_allele_stats.pdf new file mode 100644 index 0000000..18c94c2 Binary files /dev/null and b/docs/example_filtering_output/my_sorting_screen_masked.filtered.filtered_allele_stats.pdf differ diff --git a/docs/filter.rst b/docs/filter.rst index 3b10bc4..108ae1a 100755 --- a/docs/filter.rst +++ b/docs/filter.rst @@ -1,4 +1,5 @@ .. _filter: + `bean filter` *********************** .. mdinclude:: _filter.md diff --git a/docs/gwas.rst b/docs/gwas.rst index 31e8c68..7134546 100755 --- a/docs/gwas.rst +++ b/docs/gwas.rst @@ -1,3 +1,5 @@ +.. _gwas: + GWAS variant library *********************** .. mdinclude:: _ldl_var.md diff --git a/docs/index.rst b/docs/index.rst index 34d4e8a..d785cc7 100755 --- a/docs/index.rst +++ b/docs/index.rst @@ -4,8 +4,8 @@ contain the root `toctree` directive. Welcome to `bean`'s documentation! 
-================================ -`bean` unconfounds variant effect from variable editing outcome of CRISPR screens by considering genotypic outcome from *reporter* sequence. +================================================================ +`bean` improves CRISPR pooled screen analysis by 1) unconfounding variable per-guide editing outcome by considering genotypic outcome from *reporter* sequence and 2) through accurate modeling of screen procedure. .. image:: assets/summary.png :width: 700 @@ -15,11 +15,9 @@ Welcome to `bean`'s documentation! Workflows =================== .. toctree:: - :maxdepth: 1 + :maxdepth: 2 - gwas - cds - prolif_gwas + tutorials =================== API references @@ -30,17 +28,16 @@ API references input subcommands -=================== +=========================== Screen data structure -=================== +=========================== .. toctree:: - + reporterscreen - ReporterScreen_api -================== +========================= Indices and tables -================== +========================= * :ref:`genindex` * :ref:`modindex` diff --git a/docs/input.rst b/docs/input.rst index 79285ff..780596b 100755 --- a/docs/input.rst +++ b/docs/input.rst @@ -1,4 +1,5 @@ .. _input: + Input file format -*********************** +************************ .. mdinclude:: _input.md \ No newline at end of file diff --git a/docs/profile.rst b/docs/profile.rst index c25a422..7d79464 100755 --- a/docs/profile.rst +++ b/docs/profile.rst @@ -1,3 +1,5 @@ +.. _profile: + `bean profile` *********************** .. mdinclude:: _profile.md diff --git a/docs/prolif_gwas.rst b/docs/prolif_gwas.rst index 0206c15..179c814 100644 --- a/docs/prolif_gwas.rst +++ b/docs/prolif_gwas.rst @@ -1,3 +1,5 @@ +.. _prolif_gwas: + Proliferation screen with GWAS library ********************************************** .. mdinclude:: _prolif_gwas.md diff --git a/docs/qc.rst b/docs/qc.rst index bd233cf..3114df5 100755 --- a/docs/qc.rst +++ b/docs/qc.rst @@ -1,4 +1,5 @@ .. 
_qc: + `bean qc` *********************** .. mdinclude:: _qc.md diff --git a/docs/reporterscreen.rst b/docs/reporterscreen.rst index 687a31b..34c3d5a 100644 --- a/docs/reporterscreen.rst +++ b/docs/reporterscreen.rst @@ -1,3 +1,9 @@ -ReporterScreen object +ReporterScreen *********************** .. mdinclude:: _reporterscreen.md + +====================================== +Data wrangling examples +====================================== +.. toctree:: + ReporterScreen_api \ No newline at end of file diff --git a/docs/run.rst b/docs/run.rst index db09224..a5dcdd3 100755 --- a/docs/run.rst +++ b/docs/run.rst @@ -1,4 +1,5 @@ .. _run: + `bean run` *********************** .. mdinclude:: _run.md diff --git a/docs/subcommands.rst b/docs/subcommands.rst index 56319ca..8842def 100755 --- a/docs/subcommands.rst +++ b/docs/subcommands.rst @@ -1,4 +1,5 @@ .. _subcommands: + =================== Subcommands =================== diff --git a/tests/data/sample_list_survival.csv b/tests/data/sample_list_survival.csv new file mode 100644 index 0000000..41914ce --- /dev/null +++ b/tests/data/sample_list_survival.csv @@ -0,0 +1,10 @@ +,replicate,condition,time +day_7_1,1,D7,7 +day_14_1,1,D14,14 +plasmids_for_library_1,1,D0,0 +day_7_2,2,D7,7 +day_14_2,2,D14,14 +plasmids_for_library_2,2,D0,0 +day_7_3,3,D7,7 +day_14_3,3,D14,14 +plasmids_for_library_3,3,D0,0
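
The new `tests/data/sample_list_survival.csv` follows the proliferation / survival layout described in `docs/_input.md`, where each sample carries a numeric `time` value. Below is a minimal sketch of a sanity check for such a file before it is passed to `bean count-samples`. It assumes only the three columns shown in this diff; `check_survival_sample_list` is a hypothetical helper written for illustration and is not part of `bean`.

```python
import csv
import io

# Contents of tests/data/sample_list_survival.csv as added in this diff.
SAMPLE_LIST = """\
,replicate,condition,time
day_7_1,1,D7,7
day_14_1,1,D14,14
plasmids_for_library_1,1,D0,0
day_7_2,2,D7,7
day_14_2,2,D14,14
plasmids_for_library_2,2,D0,0
day_7_3,3,D7,7
day_14_3,3,D14,14
plasmids_for_library_3,3,D0,0
"""

# Columns visible in the example file; the real tool may accept more.
REQUIRED_COLUMNS = ("replicate", "condition", "time")


def check_survival_sample_list(text):
    """Hypothetical helper: return {sample_name: time} after basic checks."""
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in fieldnames]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    times = {}
    for row in reader:
        sample_name = row[""]  # the first column has an empty header
        # float() raises ValueError when `time` is not numeric.
        times[sample_name] = float(row["time"])
    return times


times = check_survival_sample_list(SAMPLE_LIST)
print(sorted(set(times.values())))
```

A check like this only covers the layout visible in the example file; any further requirements `bean` places on the sample list are outside what this sketch verifies.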