Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msnset se coercions #284

Merged
merged 5 commits into from
Dec 17, 2017
Merged

msnset se coercions #284

merged 5 commits into from
Dec 17, 2017

Conversation

arnesmits
Copy link
Contributor

First step towards an MSnSet to SE coercions.

Copy link
Collaborator

@sgibb sgibb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR and the great idea to convert an MSnSet into an SummarizedExperiment object and vice versa. Instead of a se2msnset and msnset2se function we would prefer a as(msnset, "SummarizedExperiment") and as(se, "MSnSet") method. You could use setAs("MSnSet", "SummarizedExpirment", function(from) {}) for that.

Additionally, I added some comments to your code (mostly about the coding style).

R/msnset2se.R Outdated
#' @export
se2msnset <- function(se) {
# Check input
assertthat::assert_that(inherits(se, "SummarizedExperiment"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we don't use assertthat for evaluation of input variables. While it is very convenient it don't think we are accepting a new dependency for just an easy if (inherits(se, "SummarizedExperiment")) check. BTW: assertthat is not listed in the DESCRIPTION file.

R/msnset2se.R Outdated
assertthat::assert_that(inherits(se, "SummarizedExperiment"))

# Extract expression, feature and pheno data
raw <- SummarizedExperiment::assay(se)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we also have no dependency on SummarizedExperiment. That means you have to add SummarizedExperiment to the DESCRIPTION file. I would suggest to add it into Suggests: and use if (requireNamespace("SummarizedExperiment")) here to check whether it is available or not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The :: should not necessary. You only need it if you have two functions with the same name in two imported packages. Beside adding SummarizedExperiment to the DESCRIPTION file you have to import the class and the methods you use via NAMESPACE: importClassesFrom(SummarizedExperiment, SummarizedExperiment) and importFrom(SummarizedExperiment, assay, colData, rowData).

R/msnset2se.R Outdated
# Extract expression, feature and pheno data
raw <- SummarizedExperiment::assay(se)
feat_data <- data.frame(SummarizedExperiment::rowData(se))
rownames(feat_data) <- se@NAMES
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not access slots directly. Why not use names(se) here? I am not very familiar with SummarizedExperiment yet but are you sure that rownames(rowData(se)) is never set?

R/msnset2se.R Outdated
pheno_data <- data.frame(SummarizedExperiment::colData(se))

# Generate MSnSet
msnset <- MSnbase::MSnSet(exprs = as.matrix(raw),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to use MSnbase:: because there are no other MSnSets around and the namespace of the package is always used first.

R/msnset2se.R Outdated
# Generate MSnSet
msnset <- MSnbase::MSnSet(exprs = as.matrix(raw),
pData = Biobase::AnnotatedDataFrame(pheno_data),
fData = Biobase::AnnotatedDataFrame(feat_data))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need Biobase:: as well. We are depending on Biobase so AnnotatedDataFrame is available without ::.

R/msnset2se.R Outdated
#' @export
msnset2se <- function(msnset) {
# Check input
assertthat::assert_that(inherits(msnset, "MSnSet"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above regarding assertthat.

R/msnset2se.R Outdated
# Extract assay, rowData and colData
raw <- Biobase::exprs(msnset)
row_data <- Biobase::fData(msnset)
col_data <- Biobase::pData(msnset)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need Biobase:: here as well.

R/msnset2se.R Outdated
se <- SummarizedExperiment::SummarizedExperiment(assays = as.matrix(raw),
rowData = row_data,
colData = col_data)
return(se)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return not needed.

R/msnset2se.R Outdated
col_data <- Biobase::pData(msnset)

# Generate SE
se <- SummarizedExperiment::SummarizedExperiment(assays = as.matrix(raw),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If SummarizedExperiment was attached by requireNamespace (which is missing here) the :: operator isn't needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added requireNamespace, however I still need the :: operator.

R/msnset2se.R Outdated

# Extract expression, feature and pheno data
raw <- SummarizedExperiment::assay(se)
feat_data <- data.frame(SummarizedExperiment::rowData(se))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use snake_case. We try to follow the bioc coding style and use camelCase

@arnesmits
Copy link
Contributor Author

I incorporated the style changes and comments. I kept se2msnset and msnset2se as internal functions, which I call within the setAs functions. Let me know if I overlooked something and/or there are other issues.

Copy link
Collaborator

@sgibb sgibb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right direction but I have still some (code styling) comments.

R/msnset2se.R Outdated
stopifnot(inherits(se, "SummarizedExperiment"))

if(!requireNamespace("SummarizedExperiment")) {
stop("The SummarizedExperiment package is required for SummarizedExperiment to MSnSet conversion.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being petty but could you please use just 4 spaces for indentation (you used ~8) and keep lines shorter than 80 characters?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: if you use setAs instead of an internal function the stopifnot(inherits(se, "SummarizedExperiment")) is superfluous.

R/msnset2se.R Outdated
assertthat::assert_that(inherits(se, "SummarizedExperiment"))

# Extract expression, feature and pheno data
raw <- SummarizedExperiment::assay(se)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The :: should not necessary. You only need it if you have two functions with the same name in two imported packages. Beside adding SummarizedExperiment to the DESCRIPTION file you have to import the class and the methods you use via NAMESPACE: importClassesFrom(SummarizedExperiment, SummarizedExperiment) and importFrom(SummarizedExperiment, assay, colData, rowData).

R/msnset2se.R Outdated
fData = Biobase::AnnotatedDataFrame(feat_data))
return(msnset)
MSnSet(exprs = as.matrix(raw),
pData = AnnotatedDataFrame(phenoData),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the indentation here as well.

R/msnset2se.R Outdated
setAs("MSnSet",
"SummarizedExperiment",
function(from) {
msnset2se(from)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions are very short and not used anywhere else. That's why I don't see the point of an additional internal function. Please put the function body into the setAs method.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arnesmits - I believe you use use se2msnset and msnset2se in your package and possibly mention them in documentation or vignette. I think we can keep them, at least in a first instance, as they might already be used here and there. But the official as syntax should probably he preferred.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am against exporting se2msnset and msnset2se:

  1. We prefer the as syntax.
  2. x2y is uncommon.
  3. If we add it we would teach the user a "wrong" approach.
  4. If a second package (MSnbase) supports them they become more common and seem more "correct".
  5. To deprecate/defunct a function is effort (we have to remember it for more than one release cycle, documentation, ...) and we maybe won't ever do it.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arnesmits - are you ok with marking the old forms as Defunct and suggest to use the standard as form, so we can redirect users that might already have used them to a more standard form?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with both of you that the common as syntax would be preferable. The old form is mentioned in the documentation of my package, although it is not required in the default workflow. I will mark it as Defunct and suggest the as form, once this is incorporated in MSnbase.

@@ -21,6 +21,6 @@ test_that("se2msnset conversion", {
expect_equal(colnames(msnset_back), colnames(se))
expect_equal(rownames(msnset_back), rownames(se))
expect_equal(Biobase::exprs(msnset_back), SummarizedExperiment::assay(se))
expect_equal(Biobase::fData(msnset_back), data.frame(SummarizedExperiment::rowData(se), row.names = se@NAMES))
expect_equal(Biobase::fData(msnset_back), data.frame(SummarizedExperiment::rowData(se), row.names = names(se)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this one before. Why do you compare against msnset_back and not directly against msnset? And wouldn't be expect_equal(msnset, as(se, "MSnSet")) enough?

And the :: shouldn't be needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue with the current coercion is that it only converts the assayData, phenoData and featureData. The other slots, such as qual, processingData, experimentData, protocolData and annotation are MSnSet specific. The SummarizedExperiment supports a simple list as metadata, so one solution could be to parse all MSnSet specific slots into a list which is stored in the SummarizedExperiment metadata and vice versa for SE to MSnSet conversion. The question is whether it is necessary to transfer this additional information and whether it is preferred to store this information in a simple list?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will push the necessary as(x, "list") to MSnbase in the coming days, so that you can add these slots to the SummarizedExperiment's metadata slot. Will update this comment with more instructions when the code becomes available.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the three as methods in the latest MSnbase on github. There is also MSnbase:::.reduce_list that will remove empty elements or those with and empty character ("") - MSnbase:::.reduce_list(as(processingData(x), "list")), so as to avoid adding missing data. The names of the elements could possibly be prefixed with MSnbase to make it explicit where these metadata originate from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgatto: Thanks. In the end, I didn't use these functions as I could store the data.frame, MSnProcess, MIAPE and AnnotatedDataFrame objects in the metadata list of SummarizedExperiment.

The current implementation is a bit 'MSnSet-focused', because it will correctly coerce all MSnSet slots into SE and back to MSnSet. However, SE metadata that does not fit into one of the MSnSet slots will be lost in the SE to MSnSet coercion.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue with this is that once you have an object created with

se <- as(msnset, "SummarizedExperiment")

you will still need MSnbase to use se because of the MSnProcess and MIAPE objects in the metadata (AnnotatedDataFrame is defined in Biobase, which is a dependency of SummarizedExperiment). Having that assumption simplifies the code, but it limits the portability of that se object. @arnesmits - what is your typical use case?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will work on this next week, and only transfer some information as a list to the SummarizedExperiment metadata slot, and document the behaviour in the manual page.

@@ -8,7 +8,7 @@ test_that("msnset2se conversion", {
expect_equal(colnames(msnset), colnames(se))
expect_equal(rownames(msnset), rownames(se))
expect_equal(Biobase::exprs(msnset), SummarizedExperiment::assay(se))
expect_equal(Biobase::fData(msnset), data.frame(SummarizedExperiment::rowData(se), row.names = se@NAMES))
expect_equal(Biobase::fData(msnset), data.frame(SummarizedExperiment::rowData(se), row.names = names(se)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The :: shouldn't be needed here.

@lgatto
Copy link
Owner

lgatto commented Dec 8, 2017

Also, @arnesmits, please add yourself as a contributor in the DESCRIPTION file.

arnesmits and others added 2 commits December 11, 2017 10:28
more SummarizedExperiment from suggests to imports
@sgibb
Copy link
Collaborator

sgibb commented Dec 12, 2017

@arnesmits: Sorry I was on the wrong track. If we just want to "Suggests" SummarizedExperiment you have to use SummarizedExperiment::assay after requireNamespace("SummarizedExperiment) (so you were right). Adding it to NAMESPACE would require to "import" or "depend" on it.

@lgatto + @arnesmits : Sry, I wasn't aware that we have to import SummarizedExperiment to provide an as method (otherwise we get a warning: in method for ‘coerce’ with signature ‘"SummarizedExperiment","MSnSet"’: no definition for class “SummarizedExperiment”). So requireNamespace is not necessary at all.

I am a little bit depressed that we have to add a dependency just for an as.

@lgatto
Copy link
Owner

lgatto commented Dec 13, 2017

@sgibb - I moved that package to Imports to fix the error, but I'm happy to leave it Suggests to avoid the hard dependency.

@sgibb
Copy link
Collaborator

sgibb commented Dec 13, 2017

Yes, but I am afraid that is not possible. I don't know if there is a way to provide a method for a class without importing this class.

@sgibb
Copy link
Collaborator

sgibb commented Dec 13, 2017

BTW: it is not a warning but a message and I already get the message:
in method for ‘coerce’ with signature ‘"IBSpectra","MSnSet"’: no definition for class “IBSpectra”
So we could move SummarizedExperiment to Suggests: and add a few more messages about missing class definitions to the MSnbase startup ... 😞

@lgatto
Copy link
Owner

lgatto commented Dec 13, 2017

Yes, I can live with the message.

@arnesmits
Copy link
Contributor Author

I agree that avoiding a hard dependency is preferred, so changed this back.

@sgibb
Copy link
Collaborator

sgibb commented Dec 15, 2017

@arnesmits After moving SummarizedExperiments from Imports to Suggests you also have to remove the entries from NAMESPACE. Could you please add some information to the MSnSet-class manual page.

@lgatto lgatto merged commit 9fe34ff into lgatto:master Dec 17, 2017
@lgatto
Copy link
Owner

lgatto commented Dec 17, 2017

Done now. See ./R/coerce.R and ./man/MSnSet-class.Rd and here for details. The unit tests are in `test_coerce.R.

@arnesmits
Copy link
Contributor Author

arnesmits commented Dec 19, 2017

@lgatto + @sgibb: Thanks a lot for all your work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants