Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaking changes in rliger #161

Closed
theAeon opened this issue Feb 22, 2024 · 57 comments
Closed

Breaking changes in rliger #161

theAeon opened this issue Feb 22, 2024 · 57 comments
Assignees
Labels
bug Something isn't working enhancement Enhance existing function/feature New Feature New feature to add Release Prep Vignette Related Issues with or Updates to vignettes or pkgdown website

Comments

@theAeon
Copy link

theAeon commented Feb 22, 2024

Hi! We're planning on submitting a major rliger upgrade to CRAN in the coming days to weeks. It contains a fairly large number of API changes and (more relevant to your purposes) object structure changes. We will be pushing the updated code to welch-lab/liger@newObj shortly and promoting it to master once CRAN has their way with it. Updated documentation is available at https://mvfki.github.io/liger/. Feel free to reach out if you have any questions!

Best,
-Andrew Robbins

@samuel-marsh
Copy link
Owner

Hi Andrew,

Thanks so much for the heads up! Just quick scroll through the commits and new changes look like fantastic additions to the package. I have just been working on in my current dev branch bunch of updates to liger functionality and conversion functions that definitely look like they would be affected. But based on very quick look at new commits it seems like a lot of new functionality and class structure changes will make implementation of my functions easier and allow me to port even more of them to S3 generics to work on Seurat and Liger objects.

I have also been planning scCustomize release shortly (currently delayed slightly as I work out a few ggplot 3.5.0 issues that crop up during CRAN tests on devel platforms) before it can be approved.

Just couple questions so I can make plans and test things.

  • Is work on the current dev branch "newObj" essentially finished or are there more potentially breaking changes coming?
  • From package development standpoint I know this is hard one to answer sometimes but do you and team have any more info on days vs. weeks timeline in terms of what you are shooting for?
    • The reason I ask is trying to decide whether it is better to hard specify in scCustomize requirements use of rliger v1.0.1 until full release of this new version and then work on changes or whether to delay my release (I'm ok with either as goal of package is to make things easier for end user).

Happy to discuss back and forth here or if you'd rather discuss via email just shoot me message at my email in package description.

Thanks!
Sam

@samuel-marsh samuel-marsh self-assigned this Feb 22, 2024
@samuel-marsh samuel-marsh added bug Something isn't working enhancement Enhance existing function/feature New Feature New feature to add Vignette Related Issues with or Updates to vignettes or pkgdown website Release Prep labels Feb 22, 2024
@theAeon
Copy link
Author

theAeon commented Feb 22, 2024

Just couple questions so I can make plans and test things.

  • Is work on the current dev branch "newObj" essentially finished or are there more potentially breaking changes coming?

It's essentially finished as it stands right now over at Yichen's repo (mvfki/liger@newObj) but it should be merged into the primary one sooner rather than later.

  • From package development standpoint I know this is hard one to answer sometimes but do you and team have any more info on days vs. weeks timeline in terms of what you are shooting for?

    • The reason I ask is trying to decide whether it is better to hard specify in scCustomize requirements use of rliger v1.0.1 until full release of this new version and then work on changes or whether to delay my release (I'm ok with either as goal of package is to make things easier for end user).

No worries! I'm only hedging here because the timeline is entirely dependent on CRAN's whims. I'm probably being overly pessimistic, but I have had a package chilling in inspect for over a month at this point.

@samuel-marsh
Copy link
Owner

Ok great thanks so much! I'll start working on new version and will let you know if I have any questions or run into any issues. I'll leave this issue open until everything is set on my end.

Thanks again!
Sam

@samuel-marsh
Copy link
Owner

One quick question. Will the released package be rliger2 or is that just development name?

Thanks!
Sam

@theAeon
Copy link
Author

theAeon commented Feb 22, 2024 via email

@samuel-marsh
Copy link
Owner

great thanks!

@samuel-marsh
Copy link
Owner

Hi @theAeon (also tagging @mvfki here as bug reporting),

I have found two bugs (you may already be aware) and have one suggestion.

1. The first bug is in ligerToSeurat which does not appear to work.

reprex

pbmc <- pbmc3k.SeuratData::pbmc3k.final
pbmc <- UpdateSeuratObject(pbmc)

pbmc$sample_id <- sample(c("sample1", "sample2", "sample3", "sample4", "sample5", "sample6"), size = ncol(pbmc), replace = TRUE)

pbmc_liger <- as.liger(pbmc)

pbmc_liger_seurat <- ligerToSeurat(pbmc_liger)

Error: Cannot add new cells with [[<-

2. The second bug is in as.liger when converting Seurat object using Assay5 with split layers

pbmc <- pbmc3k.SeuratData::pbmc3k.final

pbmc <- UpdateSeuratObject(pbmc)

pbmc <- scCustomize::Convert_Assay(seurat_object = pbmc, convert_to = "v5")

pbmc$sample_id <- sample(c("sample1", "sample2", "sample3", "sample4", "sample5", "sample6"), size = ncol(pbmc), replace = TRUE)

pbmc_split <- scCustomize::Split_Layers(seurat_object = pbmc, split.by = "sample_id")

pbmc_liger <- as.liger(pbmc_split, datasetVar = "sample_id")

Error in .checkArgLen(datasetVar, ncol(object), repN = TRUE, class = c("factor",  : 
  `datasetVar` has to be a vector of length 445
In addition: Warning message:
In LayerData.Assay5(object = object[[assay]], layer = layer, ...) :
  multiple layers are identified by counts.sample1 counts.sample2 counts.sample3 counts.sample5 counts.sample4 counts.sample6
 only the first layer is used

In my writing of function for current version of Liger I handled this by just adding a check for multiple layers of same type being present in assay and then allowing users to specify group.by = "layers" to use when creating liger object.
See function in dev branch here.

3. The suggestion that I had also pertains to as.liger when converting from Seurat.
I think it would be helpful if the column used for grouping datasets in ligerDatasets to be retained in cellMeta with the rest of [email protected] columns. The rationale being that retention of the name of that column can be important and helpful when plotting and additionally if converting back to Seurat object after Liger it would make have same meta.data columns as original object.

This means that there will be duplicate columns in cellMeta between dataset and whatever column name is but that doesn't present too much overhead (at least in my mind) to be prohibitive).

Best,
Sam

@mvfki
Copy link

mvfki commented Feb 23, 2024

Hi Sam,

These suggestions are super helpful! I apologize that I haven't yet got too much experience dealing with new Seurat architecture. I'll definitely get it fixed and let you know shortly.

Thanks,
Yichen

@samuel-marsh
Copy link
Owner

Hi Yichen,

No worries!! I'm only just getting my head about the Assay5 changes too. Thanks for all work on new updates to Liger. All of the changes look awesome!!

Best,
Sam

@mvfki
Copy link

mvfki commented Feb 28, 2024

Hi Sam,

I just looked into the bugs.

1. For the first one, the failure originated when creating a new liger object where I added <datasetName>_ prefix to the barcodes. I forgot to also have the dimnames of DimRed objects and etc. modified accordingly. This can be simply fixed with an additional step colnames(seurat) <- colnames(ligerWithOnlyRawData) before extracting everything else. The behavior of adding the prefix can be turned off with the option addPrefix = FALSE. By default it is added if the "barcodes", or say cell identifiers, do not start with the dataset name, this is originally intended to deal with 10X barcode duplicates from different runs.

Meanwhile, after running the code block in your example, it is currently not expected to return you an identical Seurat object regardless of the barcode issue, because as.liger does not extract everything from the Seurat object. I made this decision because the Seurat normalized and scaled data is not suitable for iNMF integration and I don't think they should be brought into a liger object. I would suggest doing as.liger(), any analysis you need, and then fetch the final components needed into the original Seurat object. Or otherwise, starting with a LIGER analysis, and then ligerToSeurat() to continue with Seurat. Hope this makes sense. I'm very happy to discuss all these interoperation use cases and I'll definitely add some more documentation after we reach some sort of agreement.

2. The second one about split layers is very interesting. I have been assuming that Seurat keeps everything in one matrix object. Therefore, many of my other Seurat wrappers do split the data by variable first and concatenate things back in the end, with pretty awkward efficiency. If they now do support having things apart, my wrappers should obviously just go with them as we normally do with liger style structure. I'll dig into this part and get back to you in this thread later.

3. Regarding the suggestion in the third point, yes I agree! And it is actually implemented in the same way as you said. Below is what I would see if I just go with your example with the first bug fixed. And you can see "dataset" as the preserved field for the liger structure and "sample_id" is still there. Note that they are not "identical" objects because dataset is coerced to a factor.

pbmc <- pbmc3k.SeuratData::pbmc3k.final
pbmc <- UpdateSeuratObject(pbmc)

pbmc$sample_id <- sample(c("sample1", "sample2", "sample3", "sample4", "sample5", "sample6"), size = ncol(pbmc), replace = TRUE)

pbmc_liger <- as.liger(pbmc, datasetVar = "sample_id")

colnames(cellMeta(pbmc_liger))
[1] "dataset"            "barcode"            "nUMI"               "nGene"             
[5] "mito"               "ribo"               "hemo"               "orig.ident"        
[9] "nCount_RNA"         "nFeature_RNA"       "seurat_annotations" "percent.mt"        
[13] "RNA_snn_res.0.5"    "seurat_clusters"    "sample_id"          "pca"               
[17] "umap"     

pbmc_liger
An object of class liger with 2638 cells
datasets(6): sample1 (448 cells), sample2 (453 cells), sample3 (431 cells), ..., sample6 (452 cells) 
cellMeta(17): dataset, barcode, nUMI, ..., umap 
varFeatures(0):  

Appreciate it!
Yichen

@mvfki
Copy link

mvfki commented Feb 28, 2024

OK, I had a look at the Seurat5 integration vignette. I think Seurat is suggesting we use the objects with two styles: 1. each object represents a batch and can be of either old versions or v5 2. A v5 object with split layers each representing a batch. Although I believe 3. a single object with batches merged can still be found and needs our support.

In the scenario of creating a liger object, we can do createLiger(list(sampl1 = seurat1, sample2 = seurat2)) for "1.", and as.liger(datasetVar = "sample_id") with "3.". I'll have the support for "2." as soon as I can. As for Seurat S3 methods, I feel like it would be the easiest for us to require split layers instead of supporting too many things. The Seurat vignette is recommending this after all.

Appreciate it a lot if you could share some insights on this.

@mvfki
Copy link

mvfki commented Feb 29, 2024

Just made new push with fixes!

@samuel-marsh
Copy link
Owner

samuel-marsh commented Feb 29, 2024

Hi Yichen,

Great, I'll check them out asap!

Yes the Seurat format is now one of 3 versions (V3/4 format, V5 one matrix, V5 split matrices). That is the reason I have the layer check in some of my functions. Whether user has split layers when using V5 structure will depend on what step of analysis they are at and how they are doing analysis or even what qc they do. So even in scenario where user intially had split layers they may not when converting to liger. For instance performing cell cycle scoring in Seurat requires the layers to be joined before running.

Part of it will just depend on what your and teams intentions are in terms of which package versions of Seurat to support. If the intention is to support users with both Seurat package versions 4 and 5+ then it becomes important to have checks as to what the Assay structure is and if layers are split.
Both are pretty simply just checking Assay vs Assay5 (I made internaly function to simplify this for me in scCustomize see here). And whether layers are split by checking length of returned vector when using search parameter in Layers() function to identify whether multiple layers of raw counts are present (see here).

If the layers aren't split then you can just internally run the split functionality as described on V5 cheatsheet vingette. I also made function in scCustomize to do this just because I think intuitively for many users the syntax can be confusing and easy to make mistake and return only the assay and not whole object Split_Layers.

If intention is to support Seurat package V4 and V5 then the other important thing as package requirement is to require SeuratObject (>= 5.0.0) in DESCRIPTION. The reason for this is because that in that version the getter/setter functions will work across for users with Seurat V3/4, Seurat V5 but with V3/4 object structure, and Seurat V5 with V5 object structure (For instance LayerData replaces GetAssayData). Even in V3/4 structure where "layers" don't technically exist named as such the function will pull the correct data. If only supporting V5 then simply specify Seurat (>= 5.0.0) will be sufficient to ensure correct SeuratObject version.

Unrelated to previous comments:
ICYI as it pertains ligerToSeurat if you are interested in condensing the syntax to be similar to as.liger I recently figured out method by which to extend the Seurat generic as.Seurat without any package warnings about overwriting functionality when loading together.

You first just need to reexport as.Seurat from SeuratObject (I also added note so that when users search for it in manual they can be directed to manual package specific for my packages version (or liger's version etc). Then just as with other generics provide the right roxygen comments and function name and then it will work see here.

As I mention it's not necessary but does simply the format conversion syntax to use as.xxx universally.

Thanks again and I'll let you know if I run into anything else with new updates!

Best,
Sam

@mvfki
Copy link

mvfki commented Mar 1, 2024

Hi Sam,

I strongly agree with that having as.Seurat working with a liger object would be the best for user experience. However, implementing it would require we have Seurat/SeuratObject in the "Imports" (or "Depends") field. This is going to be too much of a dependency for LIGER. All the other functions in rliger that rely on Seurat/SeuratObject functionalities are solely for making LIGER available to Seurat object users, but not transporting Seurat features to LIGER users. So I believe it would be the overall way out to keep them in "Suggests".

Wellllllll, let's keep that in scCustomize for now.

@samuel-marsh
Copy link
Owner

Hi Yichen,

That totally makes sense as Seurat dependency is heavy for sure.

One quick question in playing around I think either I missing something or there is bug. I'm not sure that removeMissing is functioning correctly in createLiger or the pass through from as.liger isn't working as intended.

I'm trying to convert from Seurat to new liger using rliger2's as.liger and maintain all genes for each sample even if cells are missing count for that gene:

liger_obj <- as.liger(seurat_obj, datasetVar = "sample_id", removeMissing = FALSE)

However, following conversion the new object has diffeing dimensions for each ligerDataset appearing to filter out genes not expressed.

> dim(liger_obj@datasets$SAMPLE001)
[1] 16903   583
> dim(liger_obj@datasets$SAMPLE002)
[1] 15262   188
> dim(liger_obj@datasets$SAMPLE004)
[1] 13100   108

Is there something else I'm missing or just a bug?

Thanks!
Sam

@samuel-marsh
Copy link
Owner

samuel-marsh commented Mar 11, 2024

One other thing to pose and this is really just a question as it may not be a lift that you/team are interested in doing.
But is there any thought about changing the capitalization of liger class?

For instance when Seurat moved from Seurat 2 to Seurat3 while back they went from "seurat" to "Seurat". This allows for S3 generics and to easily switch between objects from different liger versions. Or at very least makes for checks against old liger versions easy just using inherits.

Just question. Thanks again for all this work on new version of liger. The speed of iNMF is fantastic!! That was always biggest hang up when running liger was just time. 50K cell object now running in half the time! Really great!!

Edit: just saw internal checkObjVersion so ignore this one :)

Best,
Sam

@mvfki
Copy link

mvfki commented Mar 11, 2024

Hi Sam,

For the removeMissing one, it looks pretty like a bug to me. I'll look into it asap, although it is generally recommended to remove missing entries as per the nature of iNMF.

For the class naming, that was actually a good question. I had some discussion about it with team previously, exactly because I saw Seurat team did the smart thing lol. However, we made the decision not to mess around with that, because we don't have that many users that need to look deep into the data structure. I do have an internal function is.newLiger() for that purpose. I'll make it exported then.

Indeed, the speed test of the new iNMF implementation that I've done, not a formal benchmarking though, showed a much higher fold change, like 30x to 50x. It is equipped with OpenMP multi-processing support (not available to Apple silicon unfortunately though), so if you can test with nCores option (by default 2L) tweaked, it should show you something more surprising. Apologize here that I just noticed a couple of days ago that I forgot to expose this to the user-end function so you probably didn't notice it. It is now available on welch-lab/liger@newObj.

Thank you a lot!
Yichen

@samuel-marsh
Copy link
Owner

sounds great for removeMissing

Even better news on OpenMP! Ya I'll try with ncores upped! I'm running some tests in GCP instance so I got plenty of cores to try :)

Thanks!
Sam

@samuel-marsh
Copy link
Owner

Hi Yichen,

Ok yes confirmed that speed up is actually way closer to what your estimates were (realized I had been comparing results on two different machines with different settings too; forgot I changed parameters part way through this analysis).

One question/potential bug though. I'm wondering if possibly nCores it not correctly passing to openmp (I guess that would be within RcppPlanc) or whether there is known bottleneck in terms of performance gains with more cores?

The reason I ask is because when testing on my old intel MBP for dataset with ~50K cells, 40 datasets, 5000ish var genes, k15, lambda5, 1 iteration.
Default nCore setting: 99 seconds
nCores 4: 60 seconds
nCores 8: 60 seconds
nCores24: 60 seconds
nCores64: 60 seconds

Maybe need bigger dataset for more than 4 cores to to make difference?

Thanks!
Sam

@theAeon
Copy link
Author

theAeon commented Mar 12, 2024

For context, what model MBP is this?

@samuel-marsh samuel-marsh mentioned this issue Mar 12, 2024
9 tasks
@samuel-marsh
Copy link
Owner

Here's snapshot of system info. It's Intel i9 from 2019.
image

@theAeon
Copy link
Author

theAeon commented Mar 12, 2024

hm. The RcppPlanc cmakelists doesn't actually link to openmp on osx because Apple Clang/LLVM doesn't ship libomp. I have to imagine the speedup from 2 to 4 is backend BLAS parallelism, but I'd need to know what your install linked against to investigate further. Can you get me a compile log?

@samuel-marsh
Copy link
Owner

sure. How do I get that?

@theAeon
Copy link
Author

theAeon commented Mar 13, 2024

No, you'd need to set that environment variable from outside the R instance before launching it. The easiest way would be to just use Rterm for the moment.

Setting it in RStudio's terminal doesn't affect RStudio's instance.

@samuel-marsh
Copy link
Owner

Ok so here's what I've just run directly in R session.

> Sys.getenv("VECLIB_MAXIMUM_THREADS")
""
> Sys.setenv(VECLIB_MAXIMUM_THREADS = 1)
> Sys.getenv("VECLIB_MAXIMUM_THREADS")
[1] "1"

Seems to have set correctly so no running liger code:

default: ~85 seconds
nCores 4: ~54 seconds
nCores 8: ~52 seconds
nCores 16: ~63 seconds

I can see by watching Activity monitor that is is taking up different amounts of CPU (was doing this both before this last trial and now).
During this trial on this laptop User CPU %s were:
default: ~15%
nCores 4: ~30%
nCores 8: ~55%
nCores 16: ~85%

@theAeon
Copy link
Author

theAeon commented Mar 13, 2024

Awesome. I'm not particularly surprised that going to 16 cores is causing a bit of slowdown as I don't believe that hyperthreading is effective at handling homogenous workloads and RcppPlanc is all double-precision floating point. I'll see if I can't do something about that on the RcppPlanc side of things, but unfortunately I don't know that I can influence veclib at runtime like I can openblas.

@theAeon
Copy link
Author

theAeon commented Mar 13, 2024

As for all-core scaling poorly, I think that's likely to just be thermal throttling.

@samuel-marsh
Copy link
Owner

Gotcha.

I do have update on linux setting through RStudio server in terms of nCores.

So again here I was using data.table as my test because it has nice reporting and get/set mechanism.
At first while it correctly reported the number of cores available it wouldn't allow me to change the omp cores with its setDTthreads function, it would always remain at 2.

So I checked the following:

> Sys.getenv("OMP_NUM_THREADS")
[1] 2
> Sys.setenv(OMP_NUM_THREADS = 32)
> Sys.getenv("OMP_NUM_THREADS")
[1] 32

But still setDTthreads didn't update the data.table threads. But then I noticed that on this VM the OMP_THREAD_LIMIT parameter was set and set to 2.

So I then ran:

> Sys.getenv("OMP_THREAD_LIMIT")
[1] 2
> Sys.setenv(OMP_THREAD_LIMIT = 32)
> Sys.getenv("OMP_THREAD_LIMIT")
[1] 32

Following this when I used setDTthreads(threads = 32) it properly updated the threads data.table said it was using to 32.

So it appears I can properly modify openMP threads in this instance but when I ran nCores = 32 for INMF the timing was identical to when I didn't set nCores.

I will send compilation log info for this instance this afternoon, have meeting starting soon.

Thanks again!!
Sam

@mvfki
Copy link

mvfki commented Mar 13, 2024

@theAeon @samuel-marsh quick note that I'm going to name the package back to rliger within this week. Previously it was named by rliger2 simply because I need to check if my reimplementation can produce identical result as the old version has. Having both versions presented in one session helps, but now I think I'm pretty ready to say bye to the old version.

@samuel-marsh
Copy link
Owner

@mvfki Great!! I'll update my dev code in scCustomize once that change is live. I'm basically making each function I have split so that it can remain functional for either <=1.0.1 or the newer version. Still working through all of the functions but will have more time next week to update those.

@samuel-marsh
Copy link
Owner

@theAeon Here is compilation logs from my GCP instance.

RcppPlanc
> devtools::install_github('welch-lab/RcppPlanc')
Downloading GitHub repo welch-lab/RcppPlanc@HEAD
── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────
✔  checking for file/tmp/RtmpP6JeDd/remotes206cf5b909/welch-lab-RcppPlanc-2718ccb/DESCRIPTION...preparingRcppPlanc:checking DESCRIPTION meta-informationcleaning srcrunningcleanup’
─  checking for LF line-endings in source and make files and shell scriptschecking for empty or unneeded directoriesbuildingRcppPlanc_1.0.0.tar.gzInstalling package into/home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18’
(aslibis unspecified)
* installing *source* packageRcppPlanc...
** using staged installation
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found R: /usr/local/bin/R (found version "4.3.2") 
-- MKL_THREADING = OMP
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of void*
-- Check size of void* - done
-- Looking for sgemm_
-- Looking for sgemm_ - not found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /usr/lib/x86_64-linux-gnu/libopenblas.so  
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /usr/lib/x86_64-linux-gnu/libopenblas.so;-lm;-ldl  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Found R_Rcpp: /usr/local/lib/R/site-library/Rcpp  
-- Found R_RcppArmadillo: /usr/local/lib/R/site-library/RcppArmadillo  
-- Found R_RcppProgress: /usr/local/lib/R/site-library/RcppProgress  
-- A cache variable, namely HWLOC_DIR, has been set to specify the install directory of HWLOC
-- Checking for one of the modules 'hwloc'
-- Looking for HWLOC - found using PkgConfig
-- Found HWLOC: hwloc  
-- Performing Test HAVE_HWLOC_PARENT_MEMBER
-- Performing Test HAVE_HWLOC_PARENT_MEMBER - Success
-- Performing Test HAVE_HWLOC_CACHE_ATTR
-- Performing Test HAVE_HWLOC_CACHE_ATTR - Success
-- Performing Test HAVE_HWLOC_OBJ_PU
-- Performing Test HAVE_HWLOC_OBJ_PU - Success
-- Looking for hwloc_bitmap_free in hwloc
-- Looking for hwloc_bitmap_free in hwloc - found
CMake Warning (dev) at /usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/FetchContent.cmake:1316 (message):
  The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
  not set.  The policy's OLD behavior will be used.  When using a URL
  download, the timestamps of extracted files should preferably be that of
  the time of extraction, otherwise code that depends on the extracted
  contents might not be rebuilt if the URL changes.  The OLD behavior
  preserves the timestamps from the archive instead, but this is usually not
  what you want.  Update your project to the NEW behavior or specify the
  DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
  robustness issue.
Call Stack (most recent call first):
  CMakeLists.txt:158 (FetchContent_Declare)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Deprecation Warning at /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- Found HDF5: /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5.so;/usr/lib/x86_64-linux-gnu/libcrypto.so;/usr/lib/x86_64-linux-gnu/libcurl.so;/usr/lib/x86_64-linux-gnu/libpthread.a;/usr/lib/x86_64-linux-gnu/libsz.so;/usr/lib/x86_64-linux-gnu/libz.so;/usr/lib/x86_64-linux-gnu/libdl.a;/usr/lib/x86_64-linux-gnu/libm.so (found version "1.10.7")  
-- Looking for std::filesystem::exists
-- Looking for std::filesystem::exists - not found
-- Performing Test FINITE_MATH
-- Performing Test FINITE_MATH - Success
-- Performing Test SIGNALING_NANS
-- Performing Test SIGNALING_NANS - Success
-- Performing Test ROUNDING_MATH
-- Performing Test ROUNDING_MATH - Success
-- Performing Test TRAPPING_MATH
-- Performing Test TRAPPING_MATH - Success
-- Performing Test ERRNO_MATH
-- Performing Test ERRNO_MATH - Success
-- Performing Test UNSIGNED_ZERO
-- Performing Test UNSIGNED_ZERO - Success
-- Looking for openblas/cblas.h
-- Looking for openblas/cblas.h - not found
-- Looking for cblas.h
-- Looking for cblas.h - found
-- Configuring done (10.1s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src
** libs
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -S/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools -B/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src --check-build-system CMakeFiles/Makefile.cmake 0
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_progress_start /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
make  -f CMakeFiles/hw_detect.dir/build.make CMakeFiles/hw_detect.dir/depend
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
cd /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src && /usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_depends "Unix Makefiles" /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles/hw_detect.dir/DependInfo.cmake "--color="
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
make  -f CMakeFiles/hw_detect.dir/build.make CMakeFiles/hw_detect.dir/build
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[ 16%] Building C object CMakeFiles/hw_detect.dir/common/hw_detect.c.o
/usr/bin/gcc  -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -std=gnu99 -fPIC -MD -MT CMakeFiles/hw_detect.dir/common/hw_detect.c.o -MF CMakeFiles/hw_detect.dir/common/hw_detect.c.o.d -o CMakeFiles/hw_detect.dir/common/hw_detect.c.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/common/hw_detect.c
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[ 16%] Built target hw_detect
make  -f CMakeFiles/RcppPlanc.dir/build.make CMakeFiles/RcppPlanc.dir/depend
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
cd /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src && /usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_depends "Unix Makefiles" /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles/RcppPlanc.dir/DependInfo.cmake "--color="
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
make  -f CMakeFiles/RcppPlanc.dir/build.make CMakeFiles/RcppPlanc.dir/build
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[ 33%] Building CXX object CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o
/usr/bin/g++ -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o -MF CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o.d -o CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/rcppplanc_nmf.cpp
[ 50%] Building CXX object CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o
/usr/bin/g++ -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o -MF CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o.d -o CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/RcppExports.cpp
[ 66%] Building CXX object CMakeFiles/RcppPlanc.dir/conversion.cpp.o
/usr/bin/g++ -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/conversion.cpp.o -MF CMakeFiles/RcppPlanc.dir/conversion.cpp.o.d -o CMakeFiles/RcppPlanc.dir/conversion.cpp.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/conversion.cpp
[ 83%] Building C object CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o
/usr/bin/gcc -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -std=gnu99 -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o -MF CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o.d -o CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/common/detect_blas.c
[100%] Linking CXX shared library RcppPlanc.so
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/RcppPlanc.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -shared -L/usr/local/lib -shared -Wl,-soname,RcppPlanc.so -o RcppPlanc.so CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o CMakeFiles/RcppPlanc.dir/conversion.cpp.o CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o CMakeFiles/hw_detect.dir/common/hw_detect.c.o   -L/usr/local/lib/R/lib  -Wl,-rpath,/usr/local/lib/R/lib:/usr/lib/x86_64-linux-gnu/hdf5/serial -lstdc++fs -lR /usr/lib/x86_64-linux-gnu/libopenblas.so /usr/lib/x86_64-linux-gnu/libopenblas.so -lm -ldl /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5.so /usr/lib/x86_64-linux-gnu/libcrypto.so /usr/lib/x86_64-linux-gnu/libcurl.so /usr/lib/x86_64-linux-gnu/libpthread.a /usr/lib/x86_64-linux-gnu/libsz.so /usr/lib/x86_64-linux-gnu/libz.so /usr/lib/x86_64-linux-gnu/libdl.a -lm -L/usr/lib/x86_64-linux-gnu -lhwloc 
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E remove /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-build/CMakeFiles/hdf5/cmake_hdf5_test.c
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[100%] Built target RcppPlanc
make[1]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_progress_start /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles 0
installing to /home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18/00LOCK-RcppPlanc/00new/RcppPlanc/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (RcppPlanc)
rliger
> devtools::install_github("welch-lab/liger", ref = "newObj")
Downloading GitHub repo welch-lab/liger@newObj
── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────
✔  checking for file/tmp/RtmpBYJEfV/remotes7f14870a7a9/welch-lab-liger-e23c66e/DESCRIPTION...preparingrliger:checking DESCRIPTION meta-information ...cleaning srcchecking for LF line-endings in source and make files and shell scriptschecking for empty or unneeded directoriesbuildingrliger_1.99.0.tar.gzInstalling package into/home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18’
(aslibis unspecified)
* installing *source* packagerliger...
** using staged installation
** libs
using C++ compiler:g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c ModularityOptimizer.cpp -o ModularityOptimizer.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c RModularityOptimizer.cpp -o RModularityOptimizer.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c cinmf_util.cpp -o cinmf_util.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c data_processing.cpp -o data_processing.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c feature_mat.cpp -o feature_mat.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c quantile_norm.cpp -o quantile_norm.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c snn.cpp -o snn.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include   -DARMA_DONT_USE_OPENMP  -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c wilcoxon.cpp -o wilcoxon.o
g++ -std=gnu++17 -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o rliger.so ModularityOptimizer.o RModularityOptimizer.o RcppExports.o cinmf_util.o data_processing.o feature_mat.o quantile_norm.o snn.o wilcoxon.o -llapack -lblas -lgfortran -lm -lquadmath -L/usr/local/lib/R/lib -lR
installing to /home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18/00LOCK-rliger/00new/rliger/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (rliger)
sessionInfo() output
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] miniUI_0.1.1.1    compiler_4.3.2    promises_1.2.1    Rcpp_1.0.12       stringr_1.5.1    
 [6] callr_3.7.5       later_1.3.2       fastmap_1.1.1     mime_0.12         R6_2.5.1         
[11] curl_5.2.1        htmlwidgets_1.6.4 desc_1.4.3        profvis_0.3.8     shiny_1.8.0      
[16] rlang_1.1.3       cachem_1.0.8      stringi_1.8.3     httpuv_1.6.14     fs_1.6.3         
[21] pkgload_1.3.4     memoise_2.0.1     cli_3.6.2         magrittr_2.0.3    ps_1.7.6         
[26] digest_0.6.35     processx_3.8.3    rstudioapi_0.15.0 xtable_1.8-4      remotes_2.4.2.1  
[31] devtools_2.4.5    lifecycle_1.0.4   vctrs_0.6.5       glue_1.7.0        urlchecker_1.0.1 
[36] sessioninfo_1.2.2 pkgbuild_1.4.3    purrr_1.0.2       tools_4.3.2       usethis_2.2.3    
[41] ellipsis_0.3.2    htmltools_0.5.7

@theAeon
Copy link
Author

theAeon commented Mar 14, 2024

That all looks correct. Can you see what happens if you benchmark it with the proper openmp variable state set beforehand? I don't know whether openmp runtime state carries between extensions.

@theAeon
Copy link
Author

theAeon commented Mar 14, 2024

The other possibility is that this is the same issue where pthreaded BLAS is causing parallelism recursion. RcppPlanc only checks for openmp enablement at attach, so it wouldn't necessarily be aware of the lifted threadcount.

@samuel-marsh
Copy link
Owner

@theAeon Sounds good I will try those first thing tomorrow.

@mvfki I have found another potential bug/issue. While I do realize parts of this may be edge cases I wanted to mention and why it may be better to have separate slot for the dimreducs.

  • First it is possible to overwrite the dimreducs if user accidentally adds single column with the same name to the meta data (this is the edge case really) but it results in error message that may not be as interpretable to some.
> plotByDatasetAndCluster(liger_obj)
✖ Specified variables from cellMeta not found: "UMAP.1" and "UMAP.2"
Error in `[[.default`(plotDF, colorBy) : subscript out of bounds
✖ Plotting feature "dataset" on 46960 cells ... failed

To me at least I think having them in separate slot makes sense and ensures that they aren't accidentally overwritten by user but I don't know if there may be reason I don't understand why adding another slot could cause issues.

  • Second and related is that if user has column in meta.data of Seurat that has same name as a dimreduc then as.liger will fail. Presumably because it's trying to add both the dimreduc and meta.data. (this would also be resolved if dimreducs were stored separately.)
> pbmc_liger <- as.liger(pbmc, datasetVar = "treatment")
✔ calculating QC for dataset "Treatment1" ... done
✔ calculating QC for dataset "Treatment2" ... done
ℹ Removing missing in dataset: "Treatment1"
ℹ Removing missing in dataset: "Treatment2"
Error in `[.DimReduc`(value, barcodes, , drop = FALSE) : 
  None of the requested loadings are present.
  • The third but related issue comes in the form of the dimreducs that can be transfered from Seurat during as.liger. The inherently get named exactly as they were in Seurat object which also leaves potential (depending on naming keys and Seurat version) that they could be overwritten by liger analysis. While some users may not care to retain this info other's may want to keep all reductions.
    • For this part I would suggest maybe adding adding suffix to imported dimreducs during the as.liger step (e.g., UMAP_Seurat) to that they remain intact in the meta data. (would be necessary whether dimreducs are stored in cellMeta or separately).

Thanks again!!
Sam

@mvfki
Copy link

mvfki commented Mar 15, 2024

I do agree with having them in a separate slot and what you found is actually what I've been somewhat worried about. The underlying things are like: a variable is called "UMAP", and it is a matrix. When using ggplot, "fortifying", I have to convert the S4Vectors::DataFrame into a data.frame which expands the matrix into as many columns as it should, and the column names "UMAP.1" "UMAP.2" are generated there. I have to manually get the suffix pasted in my plotting function, which is dirty.

Things are even worse when it comes to container class conversion. I cannot actually tell a difference between an actuall dimred between something comes as a matrix but should be considered as metadata (e.g. SingleR returns), unless I get many more stuffs recorded and make the whole thing even dirtier.

When we were at a pretty early stage, we were thinking in a way that dimreds are somewhat metadata. So I just put them there because from my previous experience with SCEs I know a matrix can be inserted in a DataFrame. To be honest, I'm kinda suffering from this decision now.

Appreciate it that you finally let me made my mind to do it.

Yichen

@samuel-marsh
Copy link
Owner

Hi Yichen (@mvfki ),

Ya it's definitely tricky. So one thing to consider is potentially just modeling the slot on SeuratObject's DimReduc class (or potentially just importing that class as is from SeuratObject) in liger.

I know we discussed heavy dependencies before but I actually checked it out and SeuratObject's dependencies are actually rather light (or at least by comparison to Seurat for sure). In fact Seurat Object as dependency would only add 11 packages to installation.

rliger <- pak::pkg_deps("rliger")
seurat_object <- pak::pkg_deps("SeuratObject")

> rliger_full <- bind_rows(rliger$deps) %>% 
+   filter(type %in% c("imports", "depends")) %>% 
+   pull(package) %>% 
+   unique()
> seurat_object_full <- bind_rows(seurat_object$deps) %>% 
+   filter(type %in% c("imports", "depends")) %>% 
+   pull(package) %>% 
+   unique()

> setdiff(seurat_object_full, rliger_full)
 [1] "sp"           "future"       "future.apply" "progressr"    "tools"        "spam"         "globals"     
 [8] "digest"       "listenv"      "parallelly"   "dotCall64"  

I don't know the ins and outs of the DimReduc class or really setting classes like that but just a thought. It might also make transfer of dimreducs from Seurat easier because they could just be imported.

Either adapting the class or directly importing it would allow for use of Seurat Objects accessor functions (most notably Embeddings, Reductions, and Loadings) which allow for easy pulling of information for plotting or other needs.

Just a thought in case it helps.

Also again want to repeat for both of you how great this new version of liger is and how thankful I am you both and entire rest of the team's work!!!

Finally, as it regards DimReduc storage and accessing/plotting if that is something that you would like help with I'm happy to try and assist if that lightens the load at all, just let me know!

Best,
Sam

@mvfki
Copy link

mvfki commented Mar 15, 2024

I'm currently drafting it in the simplest way: the @dimReds slot would be a list of matrices. And I have checks on cell ID matching across the object. I know that a Seurat DimReduc object can have a lot more information stored for individual dimreds but we haven't yet seen any use case with them in LIGER, rather, we simply just need the embedding coordinates for visualization. One problem I can foresee is that when we convert a liger object to a seurat object, we don't have that much information to fill in the DimReduc object constructor. And information could be lost when converting it back and forth, though this is not really what we expect users to do.

This is not yet overally finished but all unit tests are already passing and it's live in the branch (together with the package name change). Maybe you want to try out shortly after I revisit the Seurat conversion cases later today?

Thank you very much for all the valuable comments and efforts in testing things out. To be honest it could be impossible for us to find out these many edge cases without your help!

Yichen

@samuel-marsh
Copy link
Owner

Hi Yichen,

That's awesome!! Ya I will give it a go later today! Happy to help any way I can, even if it's just finding all the weird things lol.

Bets,
Sam

@samuel-marsh
Copy link
Owner

Hi @mvfki

Got new warning message I've never gotten before today. Running iNMF I got this:

INMF started, niter=30
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|

warning: 
warning: solve(): system is singular (rcond: solve(): system is singular (rcond: 4.25938e-18); attempting approx solution1.94997e-19
); attempting approx solution

warning: solve(): system is singular (rcond: 
warning: 1.94997e-19solve(): system is singular (rcond: ); attempting approx solution
2.54117e-18); attempting approx solution

warning: solve(): system is singular (rcond: 
warning: solve(): system is singular (rcond: 2.60621e-182.54117e-18); attempting approx solution); attempting approx solution


warning: 
warning: solve(): system is singular (rcond: solve(): system is singular (rcond: 2.16777e-182.60621e-18); attempting approx solution); attempting approx solution


warning: solve(): system is singular (rcond: 3.50222e-18); attempting approx solution

warning: solve(): system is singular (rcond: 1.29952e-18); attempting approx solution

warning: solve(): system is singular (rcond: 5.61285e-18); attempting approx solution

warning: solve(): system is singular (rcond: 5.61285e-18); attempting approx solution

warning: solve(): system is singular (rcond: 9.06841e-19); attempting approx solution

warning: solve(): system is singular (rcond: 9.06841e-19); attempting approx solution

warning: solve(): system is singular (rcond: 8.07143e-18); attempting approx solution

warning: solve(): system is singular (rcond: 2.32176e-18); attempting approx solution

warning: solve(): system is singular (rcond: 2.32176e-18); attempting approx solution

warning: solve(): system is singular (rcond: 1.94997e-19); attempting approx solution

warning: solve(): system is singular (rcond: 2.54117e-18); attempting approx solution

warning: solve(): system is singular (rcond: 1.29952e-18); attempting approx solution

warning: solve(): system is singular (rcond: 2.16777e-18); attempting approx solution

warning: solve(): system is singular (rcond: 5.61285e-18); attempting approx solution

warning: solve(): system is singular (rcond: 1.94997e-19); attempting approx solution

warning: solve(): system is singular (rcond: 9.06841e-19); attempting approx solution

warning: 
warning: solve(): system is singular (rcond: solve(): system is singular (rcond: 8.07143e-182.54117e-18); attempting approx solution); attempting approx solution


warning: solve(): system is singular (rcond: 2.60621e-18); attempting approx solution

warning: solve(): system is singular (rcond: 
warning: 5.61285e-18solve(): system is singular (rcond: ); attempting approx solution
2.32176e-18); attempting approx solution

warning: solve(): system is singular (rcond: 9.06841e-19); attempting approx solution

warning: solve(): system is singular (rcond: 2.32176e-18); attempting approx solution

warning: solve(): system is singular (rcond: 1.94997e-19); attempting approx solution

warning: solve(): system is singular (rcond: 2.54117e-18); attempting approx solution

It went on even more than that but saving space here. The warning went away when I lowered the k value. Just wondering what the error is and if there is way to convey more informative error message/abort the function so user can adjust parameters.

Thanks!
Sam

@mvfki
Copy link

mvfki commented Mar 19, 2024

Looks like something mathematical. Could you share the following numbers for this run:

  • lengths(ligerobj) for number of cells for each dataset, I recently just added this S3 method
  • length(varFeatures(ligerobj)) for the other side of the dimensionality
  • the k you got these warnings

@samuel-marsh
Copy link
Owner

Ya looks like probably issue with cell number.

 lengths(piehl_liger)
Piehl_2022_A1 Piehl_2022_A2 Piehl_2022_A4 Piehl_2022_A5 Piehl_2022_A6 Piehl_2022_A7 Piehl_2022_A8 Piehl_2022_B1 Piehl_2022_B2 Piehl_2022_B3 
           95           172           435            97            63           216            29           137           272            75 
Piehl_2022_B4 Piehl_2022_B5 Piehl_2022_B6 Piehl_2022_B7 Piehl_2022_B8 Piehl_2022_C1 Piehl_2022_C2 Piehl_2022_C3 Piehl_2022_C4 Piehl_2022_C5 
          112            92            73           116           290          1172            82           292           146            29 
Piehl_2022_C6 Piehl_2022_C8 Piehl_2022_D1 Piehl_2022_D2 Piehl_2022_D3 Piehl_2022_D4 Piehl_2022_D5 Piehl_2022_D6 Piehl_2022_D8 Piehl_2022_E1 
          340            45            67           581            60           135            27           182            27           338 
Piehl_2022_E2 Piehl_2022_E3 Piehl_2022_E4 Piehl_2022_E5 Piehl_2022_E7 Piehl_2022_E8 Piehl_2022_F1 Piehl_2022_F2 Piehl_2022_F3 Piehl_2022_F4 
          209            96           223           444            62           103           225           164           219           181 
Piehl_2022_F6 Piehl_2022_F7 Piehl_2022_F8 Piehl_2022_G1 Piehl_2022_G2 Piehl_2022_G3 Piehl_2022_G4 Piehl_2022_G5 Piehl_2022_G6 Piehl_2022_G7 
          192           419            31           154           322           142           178            26           372            47 
Piehl_2022_G8 Piehl_2022_H1 Piehl_2022_H2 Piehl_2022_H3 Piehl_2022_H4 Piehl_2022_H5 Piehl_2022_H6 Piehl_2022_H7 Piehl_2022_H8 
           76           874            18           226            70           242            32            13           138 

5031 variable features
I got warnings at 15 and 14 k but not at 13 which is the same as lowest number of cells in dataset.
Maybe add an abort if k > lowest number of cells?

@mvfki
Copy link

mvfki commented Mar 19, 2024

Makes sense! Added!

I've just had some last pieces of remaining work done and made a push with the major version bump. I'd like to let you know that our team plans to submit it to cran this week. Please let me know if you have any more concerns, especially the parts related to the reverse suggests by scCustomize.

Appreciate it!
Yichen

@samuel-marsh
Copy link
Owner

Hooray! Congrats! I think I'm pretty good on scCustomize end. About half-way through making things forward/backward compatible but I think all good. Will let you know if anything else comes up!

@samuel-marsh
Copy link
Owner

@mvfki One thing I just thought of and if it's too late for this CRAN release please don't worry about it but something to think about for next update after 2.0.0.

It would be nice to have a parameter in runCluster to return the knn/snn or both. Either to cellMeta or a dedicated "neighbors" slot. The reason I mention this is that I've been running couple things lately that use neighbor graphs for analysis. For instance, Milo and cna for cluster-free differential abundance.

I can obviously run the internal code from runCluster to replicate the results but would be easy to just have logical parameter in runCluster to return when running first time.

Thanks!
Sam

@mvfki
Copy link

mvfki commented Mar 20, 2024

Doable and it's not too late yet. I'm still dealing with some cran machine specific notes lol. Since liger itself doesn't have much things to do with the SNN but just run clustering, I'm not going to add a slot for it which would cost additional work on sanity checks. I'd say either a variable in cellMeta since S4Vectors::DataFrame should work with it, or just throw it in uns for unstructured miscellaneous information. I prefer the latter because it might result in enormous memory use when we happen to do as.data.frame on cellMeta.

@samuel-marsh
Copy link
Owner

Great!!

Also glad it's not too late because think I just found another bug. With the new dimred slot there is problem when using as.liger from a seurat object that doesn't have any dimreducs:

pbmc <- pbmc3k.SeuratData::pbmc3k.final
pbmc <- UpdateSeuratObject(pbmc)

pbmc$treatment <- sample(c("Treatment1", "Treatment2"), size = ncol(pbmc),
                         replace = TRUE)

> pbmc_liger <- as.liger(pbmc, datasetVar = "treatment")
Error in dimnames(x) <- dn : 'dimnames' applied to non-array

@samuel-marsh
Copy link
Owner

Ya I think adding to uns makes sense so you don't have coerce it to dense structure.

@samuel-marsh
Copy link
Owner

@mvfki Ignore previous comment about bug. It was error occuring due to rliger2 being loaded at same time as newer rliger.

@mvfki
Copy link

mvfki commented Mar 20, 2024

haha I was about to say I couldn't reproduce it.

@samuel-marsh
Copy link
Owner

haha yup. thought I had removed rliger2 but guess not. Doing that now though lol

@mvfki
Copy link

mvfki commented Mar 21, 2024

Hi @samuel-marsh , it is live on CRAN, surprisingly no much trouble from the manual inspection!

Again, I appreciate it so much that you kindly gave us so much help in testing the package.

Best,
Yichen

@samuel-marsh
Copy link
Owner

Awesome congrats!! Happy I could help. I haven't run into any more issues in updating functions so I'll close this issue here and if I run into anything more will open on liger repo. Thanks so much both of you @theAeon and @mvfki for all your help!! The updates are really fantastic, speed was always the main drag for me and the changes are amazing (also having leiden clustering without passing to python which is soooooo slow in Seurat)!!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement Enhance existing function/feature New Feature New feature to add Release Prep Vignette Related Issues with or Updates to vignettes or pkgdown website
Projects
None yet
Development

No branches or pull requests

3 participants