-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
msnset se coercions #284
msnset se coercions #284
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your PR and the great idea to convert an MSnSet
into an SummarizedExperiment
object and vice versa. Instead of a se2msnset
and msnset2se
function we would prefer a as(msnset, "SummarizedExperiment")
and as(se, "MSnSet")
method. You could use setAs("MSnSet", "SummarizedExpirment", function(from) {})
for that.
Additionally, I added some comments to your code (mostly about the coding style).
R/msnset2se.R
Outdated
#' @export | ||
se2msnset <- function(se) { | ||
# Check input | ||
assertthat::assert_that(inherits(se, "SummarizedExperiment")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we don't use assertthat
for evaluation of input variables. While it is very convenient it don't think we are accepting a new dependency for just an easy if (inherits(se, "SummarizedExperiment"))
check. BTW: assertthat
is not listed in the DESCRIPTION
file.
R/msnset2se.R
Outdated
assertthat::assert_that(inherits(se, "SummarizedExperiment")) | ||
|
||
# Extract expression, feature and pheno data | ||
raw <- SummarizedExperiment::assay(se) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we also have no dependency on SummarizedExperiment
. That means you have to add SummarizedExperiment
to the DESCRIPTION
file. I would suggest to add it into Suggests:
and use if (requireNamespace("SummarizedExperiment"))
here to check whether it is available or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ::
should not necessary. You only need it if you have two functions with the same name in two imported packages. Beside adding SummarizedExperiment
to the DESCRIPTION
file you have to import the class and the methods you use via NAMESPACE
: importClassesFrom(SummarizedExperiment, SummarizedExperiment)
and importFrom(SummarizedExperiment, assay, colData, rowData)
.
R/msnset2se.R
Outdated
# Extract expression, feature and pheno data | ||
raw <- SummarizedExperiment::assay(se) | ||
feat_data <- data.frame(SummarizedExperiment::rowData(se)) | ||
rownames(feat_data) <- se@NAMES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should not access slots directly. Why not use names(se)
here? I am not very familiar with SummarizedExperiment
yet but are you sure that rownames(rowData(se))
is never set?
R/msnset2se.R
Outdated
pheno_data <- data.frame(SummarizedExperiment::colData(se)) | ||
|
||
# Generate MSnSet | ||
msnset <- MSnbase::MSnSet(exprs = as.matrix(raw), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to use MSnbase::
because there are no other MSnSet
s around and the namespace of the package is always used first.
R/msnset2se.R
Outdated
# Generate MSnSet | ||
msnset <- MSnbase::MSnSet(exprs = as.matrix(raw), | ||
pData = Biobase::AnnotatedDataFrame(pheno_data), | ||
fData = Biobase::AnnotatedDataFrame(feat_data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need Biobase::
as well. We are depending on Biobase
so AnnotatedDataFrame
is available without ::
.
R/msnset2se.R
Outdated
#' @export | ||
msnset2se <- function(msnset) { | ||
# Check input | ||
assertthat::assert_that(inherits(msnset, "MSnSet")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment above regarding assertthat
.
R/msnset2se.R
Outdated
# Extract assay, rowData and colData | ||
raw <- Biobase::exprs(msnset) | ||
row_data <- Biobase::fData(msnset) | ||
col_data <- Biobase::pData(msnset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need Biobase::
here as well.
R/msnset2se.R
Outdated
se <- SummarizedExperiment::SummarizedExperiment(assays = as.matrix(raw), | ||
rowData = row_data, | ||
colData = col_data) | ||
return(se) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return
not needed.
R/msnset2se.R
Outdated
col_data <- Biobase::pData(msnset) | ||
|
||
# Generate SE | ||
se <- SummarizedExperiment::SummarizedExperiment(assays = as.matrix(raw), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If SummarizedExperiment
was attached by requireNamespace
(which is missing here) the ::
operator isn't needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added requireNamespace, however I still need the :: operator.
R/msnset2se.R
Outdated
|
||
# Extract expression, feature and pheno data | ||
raw <- SummarizedExperiment::assay(se) | ||
feat_data <- data.frame(SummarizedExperiment::rowData(se)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use snake_case. We try to follow the bioc coding style and use camelCase
I incorporated the style changes and comments. I kept se2msnset and msnset2se as internal functions, which I call within the setAs functions. Let me know if I overlooked something and/or there are other issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right direction but I have still some (code styling) comments.
R/msnset2se.R
Outdated
stopifnot(inherits(se, "SummarizedExperiment")) | ||
|
||
if(!requireNamespace("SummarizedExperiment")) { | ||
stop("The SummarizedExperiment package is required for SummarizedExperiment to MSnSet conversion.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being petty but could you please use just 4 spaces for indentation (you used ~8) and keep lines shorter than 80 characters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW: if you use setAs
instead of an internal function the stopifnot(inherits(se, "SummarizedExperiment"))
is superfluous.
R/msnset2se.R
Outdated
assertthat::assert_that(inherits(se, "SummarizedExperiment")) | ||
|
||
# Extract expression, feature and pheno data | ||
raw <- SummarizedExperiment::assay(se) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ::
should not necessary. You only need it if you have two functions with the same name in two imported packages. Beside adding SummarizedExperiment
to the DESCRIPTION
file you have to import the class and the methods you use via NAMESPACE
: importClassesFrom(SummarizedExperiment, SummarizedExperiment)
and importFrom(SummarizedExperiment, assay, colData, rowData)
.
R/msnset2se.R
Outdated
fData = Biobase::AnnotatedDataFrame(feat_data)) | ||
return(msnset) | ||
MSnSet(exprs = as.matrix(raw), | ||
pData = AnnotatedDataFrame(phenoData), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix the indentation here as well.
R/msnset2se.R
Outdated
setAs("MSnSet", | ||
"SummarizedExperiment", | ||
function(from) { | ||
msnset2se(from) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functions are very short and not used anywhere else. That's why I don't see the point of an additional internal function. Please put the function body into the setAs
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arnesmits - I believe you use use se2msnset
and msnset2se
in your package and possibly mention them in documentation or vignette. I think we can keep them, at least in a first instance, as they might already be used here and there. But the official as
syntax should probably he preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am against exporting se2msnset
and msnset2se
:
- We prefer the
as
syntax. x2y
is uncommon.- If we add it we would teach the user a "wrong" approach.
- If a second package (
MSnbase
) supports them they become more common and seem more "correct". - To deprecate/defunct a function is effort (we have to remember it for more than one release cycle, documentation, ...) and we maybe won't ever do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arnesmits - are you ok with marking the old forms as Defunct and suggest to use the standard as
form, so we can redirect users that might already have used them to a more standard form?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with both of you that the common as syntax would be preferable. The old form is mentioned in the documentation of my package, although it is not required in the default workflow. I will mark it as Defunct and suggest the as form, once this is incorporated in MSnbase.
@@ -21,6 +21,6 @@ test_that("se2msnset conversion", { | |||
expect_equal(colnames(msnset_back), colnames(se)) | |||
expect_equal(rownames(msnset_back), rownames(se)) | |||
expect_equal(Biobase::exprs(msnset_back), SummarizedExperiment::assay(se)) | |||
expect_equal(Biobase::fData(msnset_back), data.frame(SummarizedExperiment::rowData(se), row.names = se@NAMES)) | |||
expect_equal(Biobase::fData(msnset_back), data.frame(SummarizedExperiment::rowData(se), row.names = names(se))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this one before. Why do you compare against msnset_back
and not directly against msnset
? And wouldn't be expect_equal(msnset, as(se, "MSnSet"))
enough?
And the ::
shouldn't be needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One issue with the current coercion is that it only converts the assayData, phenoData and featureData. The other slots, such as qual, processingData, experimentData, protocolData and annotation are MSnSet specific. The SummarizedExperiment supports a simple list as metadata, so one solution could be to parse all MSnSet specific slots into a list which is stored in the SummarizedExperiment metadata and vice versa for SE to MSnSet conversion. The question is whether it is necessary to transfer this additional information and whether it is preferred to store this information in a simple list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will push the necessary as(x, "list")
to MSnbase
in the coming days, so that you can add these slots to the SummarizedExperiment
's metadata
slot. Will update this comment with more instructions when the code becomes available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the three as
methods in the latest MSnbase
on github. There is also MSnbase:::.reduce_list
that will remove empty elements or those with and empty character (""
) - MSnbase:::.reduce_list(as(processingData(x), "list"))
, so as to avoid adding missing data. The names of the elements could possibly be prefixed with MSnbase
to make it explicit where these metadata originate from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgatto: Thanks. In the end, I didn't use these functions as I could store the data.frame, MSnProcess, MIAPE and AnnotatedDataFrame objects in the metadata list of SummarizedExperiment.
The current implementation is a bit 'MSnSet-focused', because it will correctly coerce all MSnSet slots into SE and back to MSnSet. However, SE metadata that does not fit into one of the MSnSet slots will be lost in the SE to MSnSet coercion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with this is that once you have an object created with
se <- as(msnset, "SummarizedExperiment")
you will still need MSnbase
to use se
because of the MSnProcess
and MIAPE
objects in the metadata (AnnotatedDataFrame
is defined in Biobase
, which is a dependency of SummarizedExperiment
). Having that assumption simplifies the code, but it limits the portability of that se
object. @arnesmits - what is your typical use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will work on this next week, and only transfer some information as a list to the SummarizedExperiment
metadata slot, and document the behaviour in the manual page.
@@ -8,7 +8,7 @@ test_that("msnset2se conversion", { | |||
expect_equal(colnames(msnset), colnames(se)) | |||
expect_equal(rownames(msnset), rownames(se)) | |||
expect_equal(Biobase::exprs(msnset), SummarizedExperiment::assay(se)) | |||
expect_equal(Biobase::fData(msnset), data.frame(SummarizedExperiment::rowData(se), row.names = se@NAMES)) | |||
expect_equal(Biobase::fData(msnset), data.frame(SummarizedExperiment::rowData(se), row.names = names(se))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ::
shouldn't be needed here.
Also, @arnesmits, please add yourself as a contributor in the DESCRIPTION file. |
more SummarizedExperiment from suggests to imports
@arnesmits: Sorry I was on the wrong track. If we just want to "Suggests" @lgatto + @arnesmits : Sry, I wasn't aware that we have to import I am a little bit depressed that we have to add a dependency just for an |
@sgibb - I moved that package to |
Yes, but I am afraid that is not possible. I don't know if there is a way to provide a method for a class without importing this class. |
BTW: it is not a warning but a message and I already get the message: |
Yes, I can live with the message. |
I agree that avoiding a hard dependency is preferred, so changed this back. |
@arnesmits After moving |
Done now. See |
First step towards an MSnSet to SE coercions.