[BUG] continuous_factor errors out in _build_contrast #313

Closed
jeffhsu3 opened this issue Sep 11, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@jeffhsu3

Describe the bug
Listing a factor only in continuous_factors of a DeseqDataSet, and not in design_factors, causes an error in DeseqStats when building a contrast. However, if the same factor is also included in design_factors, it is converted to a Categorical type and works without error.

To Reproduce

dds = DeseqDataSet(
    adata=adf,
    design_factors=["treatment"],
    continuous_factors=["time"],
    ref_level=["treatment", "CTRL"],
)
stat_res_time = DeseqStats(dds, contrast=["time", "", ""])

The adf.obs.time.dtype is int64. This raises the following in the _build_contrast call:

The contrast variable ('time') should be one of the design factors.

Would just changing the check in _build_contrast in DeseqStats be enough?

pydeseq2 version: 0.4.11

Expected behavior
The if statement in _build_contrast should also check whether the factor is in self.dds.continuous_factors.
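
A minimal sketch of the relaxed check proposed here, written as a standalone function rather than as a patch to DeseqStats. The name contrast_factor_is_known is a hypothetical helper introduced for illustration, and the real condition inside _build_contrast may look different.

```python
def contrast_factor_is_known(factor, design_factors, continuous_factors=None):
    """Return True if `factor` is listed as a design or a continuous factor.

    Hypothetical helper: it only illustrates the proposed membership test
    and is not part of pydeseq2.
    """
    continuous_factors = continuous_factors or []
    return factor in design_factors or factor in continuous_factors


# Values taken from the reproduction above.
design_factors = ["treatment"]
continuous_factors = ["time"]

# Check against design_factors only, which is what the error message implies.
strict = "time" in design_factors

# Proposed check: also accept factors listed in continuous_factors.
relaxed = contrast_factor_is_known("time", design_factors, continuous_factors)

print(strict, relaxed)
```

With the values from the report, the strict test rejects "time" while the relaxed one accepts it.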

Desktop
Ubuntu 22.04

@jeffhsu3 jeffhsu3 added the bug Something isn't working label Sep 11, 2024
jeffhsu3 added a commit to jeffhsu3/PyDESeq2 that referenced this issue Sep 11, 2024
@BorisMuzellec
Collaborator

Hi @jeffhsu3,

Continuous factors must indeed also be listed as design factors. (This is the meaning - maybe not so clear - of the error message you get.)

I.e., in your case, the code should be changed to

dds = DeseqDataSet(
    adata=adf,
    design_factors=["treatment", "time"],
    continuous_factors=["time"],
    ref_level=["treatment", "CTRL"],
)
stat_res_time = DeseqStats(dds, contrast=["time", "", ""])
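
To avoid forgetting this, the design factor list can be derived from the continuous factors before the dataset is built. The sketch below uses a hypothetical helper, with_continuous_factors, which is not part of pydeseq2. It only manipulates plain lists, and the result would be passed as design_factors.

```python
def with_continuous_factors(design_factors, continuous_factors):
    """Return design_factors extended with any missing continuous factors.

    Hypothetical helper: the order of design_factors is preserved and
    missing continuous factors are appended at the end.
    """
    merged = list(design_factors)
    for factor in continuous_factors:
        if factor not in merged:
            merged.append(factor)
    return merged


continuous = ["time"]
design = with_continuous_factors(["treatment"], continuous)
print(design)  # ['treatment', 'time']
```

The helper leaves the list unchanged when every continuous factor is already a design factor, so it is safe to call unconditionally.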

You mentioned that this caused time to be treated as a categorical factor. Could you provide an example of this behaviour?

Thanks!

@jeffhsu3
Author

jeffhsu3 commented Sep 12, 2024

Thanks!

The design matrix isn't affected, but the obs df is.

print(adf.obs.time.dtype)  # Output: dtype('int64')

# After creating DeseqDataSet
dds = DeseqDataSet(
    adata=adf,
    design_factors=["treatment", "time"],
    continuous_factors=["time"],
    ref_level=["treatment", "CTRL"],
)
print(dds.obs.time.dtype)  # Output: dtype('O')

print(dds.obsm["design_matrix"].dtypes)  # Output: int64 for the time column
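
A small pandas sketch for spotting this kind of dtype change, using a toy frame in place of adf.obs. The string cast below is only an assumed stand-in that reproduces the reported int64 to object change. Whether DeseqDataSet performs such a cast internally is not confirmed in this thread, and changed_dtypes is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for adf.obs; the values are made up for illustration.
obs_before = pd.DataFrame(
    {"treatment": ["CTRL", "DRUG", "CTRL", "DRUG"], "time": [0, 0, 24, 24]}
)

# Assumption: mimic the reported change with a string cast stored as object.
obs_after = obs_before.copy()
obs_after["time"] = obs_after["time"].astype(str).astype(object)


def changed_dtypes(before, after):
    """Return {column: (old dtype, new dtype)} for columns whose dtype differs."""
    return {
        col: (str(before[col].dtype), str(after[col].dtype))
        for col in before.columns
        if col in after.columns and before[col].dtype != after[col].dtype
    }


report = changed_dtypes(obs_before, obs_after)
print(report)

# If numeric values are needed again downstream, they can be recovered.
restored = pd.to_numeric(obs_after["time"])
print(restored.dtype)
```

Running the same comparison on adf.obs and dds.obs would list the columns whose dtype changed when the dataset was created.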

@BorisMuzellec
Collaborator

Closing this because of #328
