Merge pull request #43 from pinellolab/dev
add docs for sgRNA output
jykr authored Jun 25, 2024
2 parents b864328 + 5214932 commit b0a2d1b
Showing 3 changed files with 39 additions and 15 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -22,7 +22,7 @@
2. [`profile`](https://pinellolab.github.io/crispr-bean/profile.html): Profile editing preferences of your editor.
3. [`qc`](https://pinellolab.github.io/crispr-bean/qc.html): Quality control report and filtering out / masking of aberrant samples and guides.
4. [`filter`](https://pinellolab.github.io/crispr-bean/filter.html): Filter reporter alleles; essential for `tiling` mode that allows for all alleles generated from gRNA.
5. [`run`](https://pinellolab.github.io/crispr-bean/run.html): Quantify targeted variants' effect sizes from screen data. **See more about the model in the link**.
5. [`run`](https://pinellolab.github.io/crispr-bean/run.html): Quantify targeted variants' effect sizes from screen data. **See more about the [model](https://pinellolab.github.io/crispr-bean/model.html) & [output](https://github.com/pinellolab/crispr-bean/tree/main/docs/example_run_output)**
* Screen data is saved as [`ReporterScreen` object](https://pinellolab.github.io/crispr-bean/reporterscreen.html) in the pipeline.
BEAN stores mapped gRNA and allele counts in `ReporterScreen` object which is compatible with [AnnData](https://anndata.readthedocs.io/en/latest/index.html).

51 changes: 37 additions & 14 deletions docs/_run.md
@@ -48,19 +48,42 @@ See full list of parameters [below](#full-parameters).
<img src="/crispr-bean/assets/model_output.png" alt="model" width="700"/>
Above command produces
* `output_prefix/bean_element_result.[model_type].csv` with following columns:
* Estimated variant effect sizes
* `mu` (Effect size): Mean of variant phenotype, given the wild type has standard normal phenotype distribution of `mu = 0, sd = 1`.
* `mu_sd`: Mean of variant phenotype `mu` is modeled as normal distribution. The column shows fitted standard deviation of `mu` that quantify the uncertainty of the variant effect.
* `mu_z`: z-score of `mu`
* `sd`: Standard deviation of variant phenotype, given the wild type has standard normal phenotype distribution of `mu = 0, sd = 1`.
* `CI[0.025`, `0.975]`: Credible interval of `mu`
* **When negative control is provided, above columns with `_adj` suffix are provided, which are the corresponding values adjusted for negative control.**
* Metrics on per-variant evidence provided in input (provided in `tiling` mode)
* `effective_edit_rate`: Sum of per-variant editing rates over all alleles observed in the input. Allele-level editing rate is divided by the number of variants observed in the allele prior to summing up.
* `n_guides`: # of guides covering the variant.
* `n_coocc`: # of cooccurring variants with a given variant in any alleles observed in the input.
* `output_prefix/bean_sgRNA_result.[model_type].csv`:
* `edit_rate`: Estimated editing rate at the target loci.
## `bean_element_result.[model_type].csv`
- Variant ID / grouping
- `edit`: Variant ID.
- `group`: The grouping of the coding variants, assigned as one of nonsense/missense/synonymous.
- `int_pos`: The integer position of the noncoding variants.
- `chrom`: The chromosome of the variant.
- `pos`: The position of the variant. For a coding variant, starts with `A` and gives the 1-based amino acid position. For a noncoding variant, the numeric genomic position.
- `ref`: The reference base/amino acid of the variant.
- `alt`: The alternative base/amino acid of the variant.
- `coding`: A flag indicating whether the element is a coding variant.
- Per-variant summary of variant-producing guides (`tiling` mode)
- `guide_target_group`: Aggregated `target_group` column in the input sgRNA_info.csv file. Lists all unique values from the guides that produced (filtered) edited alleles that include this variant.
- `effective_edit_rate`: The effective editing rate of the element. Calculated as `sum_over_guides(sum_over_alleles(per_guide_allele_editing_rate / # variants in the allele))`.
- `editing_guides`: List of guides that edited the variant.
- `per_guide_editing_rates`: The per-guide editing rates of the variant.
- `n_guides`: The number of guides that edited the variant.
- `n_coocc`: The number of unique co-occurring variants that appeared together with the variant in any allele that contains it.
- Variant effect size: Use `mu_z_adj` whenever available, otherwise `mu_z_scaled`, otherwise `mu_z`.
- `mu`: The mean value of the variant effect size.
- `mu_sd`: The standard deviation of the mean value of the variant effect size.
- `mu_z`: The z-score of the mean value of the variant effect size.
- `sd`: The standard deviation of the phenotype induced by the variant.
- `CI[0.025,0.975]`: The 95% credible interval of the mean value of the variant effect size. Corresponds to `mu_z_adj` when available, otherwise `mu_z_scaled`, otherwise `mu_z`.
- `[]_scaled`: Above values scaled by negative control variants.
- `[]_adj`: Above values scaled by synonymous variants.
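
The `effective_edit_rate` formula above can be read as a nested sum. The following is a minimal sketch of that reading, not BEAN's actual implementation: the input layout, the function name `effective_edit_rates`, and the variant and guide IDs are all hypothetical, and it assumes that only alleles containing a variant contribute to that variant's total.

```python
from collections import defaultdict


def effective_edit_rates(allele_rates):
    """Sum per-guide allele editing rates per variant.

    allele_rates maps a guide ID to a list of (variants_in_allele, rate)
    pairs. Each allele's rate is divided by the number of variants in that
    allele before being added to every variant the allele contains.
    This input layout is an assumption made for illustration.
    """
    totals = defaultdict(float)
    for guide, alleles in allele_rates.items():      # sum over guides
        for variants, rate in alleles:               # sum over alleles
            if not variants:
                continue                             # unedited allele: no variant to credit
            share = rate / len(variants)             # rate / number of variants in the allele
            for variant in variants:
                totals[variant] += share
    return dict(totals)


# Hypothetical example: two guides, two variants.
example = {
    "guide1": [(("varA",), 0.30), (("varA", "varB"), 0.20)],
    "guide2": [(("varB",), 0.10)],
}
rates = effective_edit_rates(example)
# varA: 0.30 + 0.20 / 2, approximately 0.40
# varB: 0.20 / 2 + 0.10, approximately 0.20
```

Dividing by the number of variants in the allele means that an allele carrying several variants adds its rate only once across those variants, split evenly between them.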
## `bean_sgRNA_result.[model_type].csv`
- `name`: sgRNA ID provided in the `name` column of the input.
- `edit_rate`: Effective editing rates
- `accessibility`: (Only if you have used `--scale-by-acc`) Accessibility signal that is used for scaling of the editing rate.
- `scaled_edit_rate`: (Only if you have used `--scale-by-acc`) Endogenous editing rate used for modeling, estimated by scaling reporter editing rate by accessibility signal
- `[cond1]_[cond2].median_lfc`: Raw LFC with pseudocount fed in with `--guide-lfc-pseudocount` argument (default 5).
- For `tiling` mode
- `variants`: Variants generated by this gRNA
- `variant_edit_rates`: Editing rate of this gRNA for each variant it creates.
See the full output file description and example output [here](https://github.com/pinellolab/crispr-bean/tree/main/docs/example_run_output).
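
The column preference stated above (`mu_z_adj`, then `mu_z_scaled`, then `mu_z`) could be applied when reading `bean_element_result.[model_type].csv`. This is a sketch using only the Python standard library. The helper name `pick_z_column` and the inline sample rows are made up for illustration, and a real output file contains more columns than shown here.

```python
import csv
import io

# Preference order taken from the column description above.
PREFERENCE = ("mu_z_adj", "mu_z_scaled", "mu_z")


def pick_z_column(fieldnames):
    """Return the first preferred z-score column present in the header."""
    for name in PREFERENCE:
        if name in fieldnames:
            return name
    raise KeyError("none of %s found in header" % (PREFERENCE,))


# Hypothetical two-row excerpt; in practice, open the real CSV file instead.
sample_csv = (
    "edit,mu,mu_z,mu_z_scaled\n"
    "varA,0.8,2.1,2.4\n"
    "varB,-0.1,-0.3,-0.2\n"
)

reader = csv.DictReader(io.StringIO(sample_csv))
z_col = pick_z_column(reader.fieldnames)
z_scores = {row["edit"]: float(row[z_col]) for row in reader}
```

In this sample there is no `mu_z_adj` column, so the lookup falls back to `mu_z_scaled`.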
1 change: 1 addition & 0 deletions docs/example_run_output/tiling/README.md
@@ -20,6 +20,7 @@ These are the example output of [`bean run`](https://pinellolab.github.io/crispr
- `per_guide_editing_rates`: The per-guide editing rates of the variant.
- `n_guides`: The number of guides that edited the variant.
- `n_coocc`: The number of unique co-occurring variants that appeared together with the variant in any allele that contains it.

- Variant effect size: Use `mu_z_adj` whenever available, otherwise `mu_z_scaled`, otherwise `mu_z`.
- `mu`: The mean value of the variant effect size.
- `mu_sd`: The standard deviation of the mean value of the variant effect size.