batch effects #3

Open
wangjiawen2013 opened this issue Mar 10, 2019 · 7 comments

Comments

@wangjiawen2013

wangjiawen2013 commented Mar 10, 2019

Dear,
In the Lukassen 2018 data, batch1 and batch2 do not align well using DCA (DCA on Lukassen.ipynb), while scVI seems to align the two mice quite well (scvi on Lukassen.ipynb).
Which one should I use?

@vals
Owner

vals commented Mar 12, 2019

Hi Jiawen,

Romain told me that the reason the batches align well in scVI even without batch correction is probably the size-factor scaling that scVI performs.
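
To illustrate why size-factor scaling alone can pull batches together, here is a toy sketch in plain Python. The counts and the helper name `scale_by_size_factor` are made up for illustration. It assumes the two batches differ only in sequencing depth, which is rarely exactly true for real data, and it is not how scVI does it internally: scVI models library size as part of the model rather than dividing raw counts.

```python
def scale_by_size_factor(counts):
    """Divide each cell's gene counts by that cell's total count.

    `counts` is a list of cells, each cell a list of per-gene counts.
    Hypothetical helper for illustration only, not part of scVI.
    """
    scaled = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell has no defined proportions; keep it as zeros.
            scaled.append([0.0 for _ in cell])
        else:
            scaled.append([c / total for c in cell])
    return scaled


# Toy data: the same expression profile, but batch 2 is sequenced 4x deeper.
batch1 = [[10, 30, 60], [5, 15, 30]]
batch2 = [[40, 120, 240], [20, 60, 120]]

scaled1 = scale_by_size_factor(batch1)
scaled2 = scale_by_size_factor(batch2)

# After scaling, cells from both batches have the same gene proportions.
print(scaled1[0])
print(scaled2[0])
```

On raw counts the two batches look four-fold different; after scaling they coincide, so a depth-only batch difference disappears without any explicit batch term.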

I haven't used DCA much since the paper came out, but I use scVI almost every day. I don't remember whether DCA has batch correction methods built in, but this is a feature of scVI that I find works very well.

@wangjiawen2013
Author

I am a newcomer to scVI. I noticed that your scVI pipeline is different from that of the scVI basic tutorial (https://github.com/YosefLab/scVI/blob/master/tests/notebooks/basic_tutorial.ipynb).
What's the difference? Did you make any customizations to obtain better results?

@vals
Owner

vals commented Mar 14, 2019

What do you mean? The only differences I can think of are that I store data in AnnData objects rather than GeneDatasets, and that I use a different library for t-SNE visualization.

@wangjiawen2013
Author

@vals
Owner

vals commented Mar 14, 2019

Oh, the post from last April used an old version of scVI that is now deprecated.

@wangjiawen2013
Author

Dear,
Do you know when to use gene, gene-batch, gene-label, or gene-cell as the `dispersion` parameter of the VAE?

:param dispersion: One of the following
    * ``'gene'`` - dispersion parameter of NB is constant per gene across cells
    * ``'gene-batch'`` - dispersion can differ between different batches
    * ``'gene-label'`` - dispersion can differ between different labels
    * ``'gene-cell'`` - dispersion can differ for every gene in every cell

@vals
Owner

vals commented Mar 18, 2019

Hi,

I typically use gene-batch because, when analyzing data in general, I have noticed that the overdispersion trend in per-batch mean-vs-variance plots of genes tends to differ between batches.
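
A minimal sketch of that kind of check, in plain Python with made-up counts for a single gene. It assumes the common negative binomial parameterization var = mu + phi * mu^2 and uses a simple method-of-moments estimate of phi. The helper name `nb_overdispersion` is hypothetical, and this is not how scVI fits dispersion.

```python
from statistics import mean, variance


def nb_overdispersion(counts):
    """Method-of-moments estimate of phi in var = mu + phi * mu**2.

    Hypothetical helper for illustration only.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    if mu == 0:
        return float("nan")  # dispersion is undefined for an all-zero gene
    # Clip at zero: under-dispersed data would give a negative estimate.
    return max((var - mu) / mu**2, 0.0)


# Toy counts for one gene: same mean in both batches, different spread.
gene_counts_by_batch = {
    "batch1": [0, 2, 4, 6, 8],
    "batch2": [0, 0, 1, 3, 16],
}

for batch, counts in gene_counts_by_batch.items():
    print(batch, nb_overdispersion(counts))
```

If estimates like these differ systematically between batches across many genes, a single per-gene dispersion is a poor fit and gene-batch is the more natural choice.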

I haven't used the supervised mode of scVI much, so I can't comment on the effect of gene-label. The gene-cell option is interesting, but I haven't tried it much.

2 participants