-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Breaking changes in rliger #161
Comments
Hi Andrew, Thanks so much for the heads up! Just quick scroll through the commits and new changes look like fantastic additions to the package. I have just been working on in my current dev branch bunch of updates to liger functionality and conversion functions that definitely look like they would be affected. But based on very quick look at new commits it seems like a lot of new functionality and class structure changes will make implementation of my functions easier and allow me to port even more of them to S3 generics to work on Seurat and Liger objects. I have also been planning scCustomize release shortly (currently delayed slightly as I work out a few ggplot 3.5.0 issues that crop up during CRAN tests on devel platforms) before it can be approved. Just couple questions so I can make plans and test things.
Happy to discuss back and forth here or if you'd rather discuss via email just shoot me message at my email in package description. Thanks! |
It's essentially finished as it stands right now over at Yichen's repo (mvfki/liger@newObj) but it should be merged into the primary one sooner rather than later.
No worries! I'm only hedging here because the timeline is entirely dependent on CRAN's whims. I'm probably being overly pessimistic, but I have had a package chilling in inspect for over a month at this point. |
Ok great thanks so much! I'll start working on new version and will let you know if I have any questions or run into any issues. I'll leave this issue open until everything is set on my end. Thanks again! |
One quick question. Will the released package be rliger2 or is that just development name? Thanks! |
Good catch-thats the development name.
…On February 22, 2024 2:14:51 PM EST, Samuel Marsh ***@***.***> wrote:
One quick question. Will the released package be rliger2 or is that just development name?
Thanks!
Sam
--
Reply to this email directly or view it on GitHub:
#161 (comment)
You are receiving this because you authored the thread.
Message ID: ***@***.***>
|
great thanks! |
Hi @theAeon (also tagging @mvfki here as bug reporting), I have found two bugs (you may already be aware) and have one suggestion. 1. The first bug is in reprex
2. The second bug is in
In my writing of function for current version of Liger I handled this by just adding a check for multiple layers of same type being present in assay and then allowing users to specify 3. The suggestion that I had also pertains to This means that there will be duplicate columns in cellMeta between dataset and whatever column name is but that doesn't present too much overhead (at least in my mind) to be prohibitive). Best, |
Hi Sam, These suggestions are super helpful! I apologize that I haven't yet got too much experience dealing with new Seurat architecture. I'll definitely get it fixed and let you know shortly. Thanks, |
Hi Yichen, No worries!! I'm only just getting my head about the Assay5 changes too. Thanks for all work on new updates to Liger. All of the changes look awesome!! Best, |
Hi Sam, I just looked into the bugs. 1. For the first one, the failure originated when creating a new liger object where I added Meanwhile, after running the code block in your example, it is currently not expected to return you an identical Seurat object regardless of the barcode issue, because 2. The second one about split layers is very interesting. I have been assuming that Seurat keeps everything in one matrix object. Therefore, many of my other Seurat wrappers do split the data by variable first and concatenate things back in the end, with pretty awkward efficiency. If they now do support having things apart, my wrappers should obviously just go with them as we normally do with liger style structure. I'll dig into this part and get back to you in this thread later. 3. Regarding the suggestion in the third point, yes I agree! And it is actually implemented in the same way as you said. Below is what I would see if I just go with your example with the first bug fixed. And you can see
Appreciate it! |
OK, I had a look at the Seurat5 integration vignette. I think Seurat is suggesting we use the objects with two styles: 1. each object represents a batch and can be of either old versions or v5 2. A v5 object with split layers each representing a batch. Although I believe 3. a single object with batches merged can still be found and needs our support. In the scenario of creating a liger object, we can do Appreciate it a lot if you could share some insights on this. |
Just made new push with fixes! |
Hi Yichen, Great, I'll check them out asap! Yes the Seurat format is now one of 3 versions (V3/4 format, V5 one matrix, V5 split matrices). That is the reason I have the layer check in some of my functions. Whether user has split layers when using V5 structure will depend on what step of analysis they are at and how they are doing analysis or even what qc they do. So even in scenario where user intially had split layers they may not when converting to liger. For instance performing cell cycle scoring in Seurat requires the layers to be joined before running. Part of it will just depend on what your and teams intentions are in terms of which package versions of Seurat to support. If the intention is to support users with both Seurat package versions 4 and 5+ then it becomes important to have checks as to what the Assay structure is and if layers are split. If the layers aren't split then you can just internally run the split functionality as described on V5 cheatsheet vingette. I also made function in scCustomize to do this just because I think intuitively for many users the syntax can be confusing and easy to make mistake and return only the assay and not whole object Split_Layers. If intention is to support Seurat package V4 and V5 then the other important thing as package requirement is to require Unrelated to previous comments: You first just need to reexport As I mention it's not necessary but does simply the format conversion syntax to use Thanks again and I'll let you know if I run into anything else with new updates! Best, |
Hi Sam, I strongly agree with that having Wellllllll, let's keep that in scCustomize for now. |
Hi Yichen, That totally makes sense as Seurat dependency is heavy for sure. One quick question in playing around I think either I missing something or there is bug. I'm not sure that I'm trying to convert from Seurat to new liger using rliger2's
However, following conversion the new object has diffeing dimensions for each ligerDataset appearing to filter out genes not expressed.
Is there something else I'm missing or just a bug? Thanks! |
Edit: just saw internal Best, |
Hi Sam, For the For the class naming, that was actually a good question. I had some discussion about it with team previously, exactly because I saw Seurat team did the smart thing lol. However, we made the decision not to mess around with that, because we don't have that many users that need to look deep into the data structure. I do have an internal function Indeed, the speed test of the new iNMF implementation that I've done, not a formal benchmarking though, showed a much higher fold change, like 30x to 50x. It is equipped with OpenMP multi-processing support (not available to Apple silicon unfortunately though), so if you can test with Thank you a lot! |
sounds great for removeMissing Even better news on OpenMP! Ya I'll try with ncores upped! I'm running some tests in GCP instance so I got plenty of cores to try :) Thanks! |
Hi Yichen, Ok yes confirmed that speed up is actually way closer to what your estimates were (realized I had been comparing results on two different machines with different settings too; forgot I changed parameters part way through this analysis). One question/potential bug though. I'm wondering if possibly The reason I ask is because when testing on my old intel MBP for dataset with ~50K cells, 40 datasets, 5000ish var genes, k15, lambda5, 1 iteration. Maybe need bigger dataset for more than 4 cores to to make difference? Thanks! |
For context, what model MBP is this? |
hm. The RcppPlanc cmakelists doesn't actually link to openmp on osx because Apple Clang/LLVM doesn't ship libomp. I have to imagine the speedup from 2 to 4 is backend BLAS parallelism, but I'd need to know what your install linked against to investigate further. Can you get me a compile log? |
sure. How do I get that? |
No, you'd need to set that environment variable from outside the R instance before launching it. The easiest way would be to just use Rterm for the moment. Setting it in RStudio's terminal doesn't affect RStudio's instance. |
Ok so here's what I've just run directly in R session.
Seems to have set correctly so no running liger code: default: ~85 seconds I can see by watching Activity monitor that is is taking up different amounts of CPU (was doing this both before this last trial and now). |
Awesome. I'm not particularly surprised that going to 16 cores is causing a bit of slowdown as I don't believe that hyperthreading is effective at handling homogenous workloads and RcppPlanc is all double-precision floating point. I'll see if I can't do something about that on the RcppPlanc side of things, but unfortunately I don't know that I can influence veclib at runtime like I can openblas. |
As for all-core scaling poorly, I think that's likely to just be thermal throttling. |
Gotcha. I do have update on linux setting through RStudio server in terms of nCores. So again here I was using data.table as my test because it has nice reporting and get/set mechanism. So I checked the following:
But still So I then ran:
Following this when I used So it appears I can properly modify openMP threads in this instance but when I ran nCores = 32 for INMF the timing was identical to when I didn't set nCores. I will send compilation log info for this instance this afternoon, have meeting starting soon. Thanks again!! |
@theAeon @samuel-marsh quick note that I'm going to name the package back to |
@mvfki Great!! I'll update my dev code in scCustomize once that change is live. I'm basically making each function I have split so that it can remain functional for either <=1.0.1 or the newer version. Still working through all of the functions but will have more time next week to update those. |
@theAeon Here is compilation logs from my GCP instance. RcppPlanc> devtools::install_github('welch-lab/RcppPlanc')
Downloading GitHub repo welch-lab/RcppPlanc@HEAD
── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────
✔ checking for file ‘/tmp/RtmpP6JeDd/remotes206cf5b909/welch-lab-RcppPlanc-2718ccb/DESCRIPTION’ ...
─ preparing ‘RcppPlanc’:
✔ checking DESCRIPTION meta-information
─ cleaning src
─ running ‘cleanup’
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘RcppPlanc_1.0.0.tar.gz’
Installing package into ‘/home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18’
(as ‘lib’ is unspecified)
* installing *source* package ‘RcppPlanc’ ...
** using staged installation
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found R: /usr/local/bin/R (found version "4.3.2")
-- MKL_THREADING = OMP
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of void*
-- Check size of void* - done
-- Looking for sgemm_
-- Looking for sgemm_ - not found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /usr/lib/x86_64-linux-gnu/libopenblas.so
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /usr/lib/x86_64-linux-gnu/libopenblas.so;-lm;-ldl
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found R_Rcpp: /usr/local/lib/R/site-library/Rcpp
-- Found R_RcppArmadillo: /usr/local/lib/R/site-library/RcppArmadillo
-- Found R_RcppProgress: /usr/local/lib/R/site-library/RcppProgress
-- A cache variable, namely HWLOC_DIR, has been set to specify the install directory of HWLOC
-- Checking for one of the modules 'hwloc'
-- Looking for HWLOC - found using PkgConfig
-- Found HWLOC: hwloc
-- Performing Test HAVE_HWLOC_PARENT_MEMBER
-- Performing Test HAVE_HWLOC_PARENT_MEMBER - Success
-- Performing Test HAVE_HWLOC_CACHE_ATTR
-- Performing Test HAVE_HWLOC_CACHE_ATTR - Success
-- Performing Test HAVE_HWLOC_OBJ_PU
-- Performing Test HAVE_HWLOC_OBJ_PU - Success
-- Looking for hwloc_bitmap_free in hwloc
-- Looking for hwloc_bitmap_free in hwloc - found
CMake Warning (dev) at /usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/FetchContent.cmake:1316 (message):
The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
not set. The policy's OLD behavior will be used. When using a URL
download, the timestamps of extracted files should preferably be that of
the time of extraction, otherwise code that depends on the extracted
contents might not be rebuilt if the URL changes. The OLD behavior
preserves the timestamps from the archive instead, but this is usually not
what you want. Update your project to the NEW behavior or specify the
DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
robustness issue.
Call Stack (most recent call first):
CMakeLists.txt:158 (FetchContent_Declare)
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Deprecation Warning at /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- Found HDF5: /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5.so;/usr/lib/x86_64-linux-gnu/libcrypto.so;/usr/lib/x86_64-linux-gnu/libcurl.so;/usr/lib/x86_64-linux-gnu/libpthread.a;/usr/lib/x86_64-linux-gnu/libsz.so;/usr/lib/x86_64-linux-gnu/libz.so;/usr/lib/x86_64-linux-gnu/libdl.a;/usr/lib/x86_64-linux-gnu/libm.so (found version "1.10.7")
-- Looking for std::filesystem::exists
-- Looking for std::filesystem::exists - not found
-- Performing Test FINITE_MATH
-- Performing Test FINITE_MATH - Success
-- Performing Test SIGNALING_NANS
-- Performing Test SIGNALING_NANS - Success
-- Performing Test ROUNDING_MATH
-- Performing Test ROUNDING_MATH - Success
-- Performing Test TRAPPING_MATH
-- Performing Test TRAPPING_MATH - Success
-- Performing Test ERRNO_MATH
-- Performing Test ERRNO_MATH - Success
-- Performing Test UNSIGNED_ZERO
-- Performing Test UNSIGNED_ZERO - Success
-- Looking for openblas/cblas.h
-- Looking for openblas/cblas.h - not found
-- Looking for cblas.h
-- Looking for cblas.h - found
-- Configuring done (10.1s)
-- Generating done (0.0s)
-- Build files have been written to: /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src
** libs
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -S/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools -B/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src --check-build-system CMakeFiles/Makefile.cmake 0
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_progress_start /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src//CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
make -f CMakeFiles/hw_detect.dir/build.make CMakeFiles/hw_detect.dir/depend
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
cd /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src && /usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_depends "Unix Makefiles" /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles/hw_detect.dir/DependInfo.cmake "--color="
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
make -f CMakeFiles/hw_detect.dir/build.make CMakeFiles/hw_detect.dir/build
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[ 16%] Building C object CMakeFiles/hw_detect.dir/common/hw_detect.c.o
/usr/bin/gcc -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -std=gnu99 -fPIC -MD -MT CMakeFiles/hw_detect.dir/common/hw_detect.c.o -MF CMakeFiles/hw_detect.dir/common/hw_detect.c.o.d -o CMakeFiles/hw_detect.dir/common/hw_detect.c.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/common/hw_detect.c
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[ 16%] Built target hw_detect
make -f CMakeFiles/RcppPlanc.dir/build.make CMakeFiles/RcppPlanc.dir/depend
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
cd /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src && /usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_depends "Unix Makefiles" /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles/RcppPlanc.dir/DependInfo.cmake "--color="
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
make -f CMakeFiles/RcppPlanc.dir/build.make CMakeFiles/RcppPlanc.dir/build
make[2]: Entering directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[ 33%] Building CXX object CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o
/usr/bin/g++ -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o -MF CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o.d -o CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/rcppplanc_nmf.cpp
[ 50%] Building CXX object CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o
/usr/bin/g++ -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o -MF CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o.d -o CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/RcppExports.cpp
[ 66%] Building CXX object CMakeFiles/RcppPlanc.dir/conversion.cpp.o
/usr/bin/g++ -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/conversion.cpp.o -MF CMakeFiles/RcppPlanc.dir/conversion.cpp.o.d -o CMakeFiles/RcppPlanc.dir/conversion.cpp.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/conversion.cpp
[ 83%] Building C object CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o
/usr/bin/gcc -DHIGHFIVE_HAS_CONCEPTS=0 -DMPI_NO_CPPBIND -DRcppPlanc_EXPORTS -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/common -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nnls -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/tools/../src/nmf -I/usr/local/lib/R/include -I/usr/local/lib/R/site-library/Rcpp/include -I/usr/local/lib/R/site-library/RcppArmadillo/include -I/usr/local/lib/R/site-library/RcppProgress/include -I/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-src/include -isystem /usr/include/hdf5/serial -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -std=gnu99 -fPIC -fopenmp -ffinite-math-only -fno-signaling-nans -fno-rounding-math -fno-trapping-math -fno-math-errno -fno-signed-zeros -MD -MT CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o -MF CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o.d -o CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o -c /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/common/detect_blas.c
[100%] Linking CXX shared library RcppPlanc.so
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/RcppPlanc.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -I/usr/local/include -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -g -DNDEBUG -shared -L/usr/local/lib -shared -Wl,-soname,RcppPlanc.so -o RcppPlanc.so CMakeFiles/RcppPlanc.dir/rcppplanc_nmf.cpp.o CMakeFiles/RcppPlanc.dir/RcppExports.cpp.o CMakeFiles/RcppPlanc.dir/conversion.cpp.o CMakeFiles/RcppPlanc.dir/common/detect_blas.c.o CMakeFiles/hw_detect.dir/common/hw_detect.c.o -L/usr/local/lib/R/lib -Wl,-rpath,/usr/local/lib/R/lib:/usr/lib/x86_64-linux-gnu/hdf5/serial -lstdc++fs -lR /usr/lib/x86_64-linux-gnu/libopenblas.so /usr/lib/x86_64-linux-gnu/libopenblas.so -lm -ldl /usr/lib/x86_64-linux-gnu/hdf5/serial/libhdf5.so /usr/lib/x86_64-linux-gnu/libcrypto.so /usr/lib/x86_64-linux-gnu/libcurl.so /usr/lib/x86_64-linux-gnu/libpthread.a /usr/lib/x86_64-linux-gnu/libsz.so /usr/lib/x86_64-linux-gnu/libz.so /usr/lib/x86_64-linux-gnu/libdl.a -lm -L/usr/lib/x86_64-linux-gnu -lhwloc
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E remove /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/_deps/highfive-build/CMakeFiles/hdf5/cmake_hdf5_test.c
make[2]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
[100%] Built target RcppPlanc
make[1]: Leaving directory '/tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src'
/usr/local/lib/python3.10/dist-packages/cmake/data/bin/cmake -E cmake_progress_start /tmp/Rtmp7wy8Ee/R.INSTALL3831b72f51/RcppPlanc/src/CMakeFiles 0
installing to /home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18/00LOCK-RcppPlanc/00new/RcppPlanc/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (RcppPlanc) rliger> devtools::install_github("welch-lab/liger", ref = "newObj")
Downloading GitHub repo welch-lab/liger@newObj
── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────
✔ checking for file ‘/tmp/RtmpBYJEfV/remotes7f14870a7a9/welch-lab-liger-e23c66e/DESCRIPTION’ ...
─ preparing ‘rliger’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘rliger_1.99.0.tar.gz’
Installing package into ‘/home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18’
(as ‘lib’ is unspecified)
* installing *source* package ‘rliger’ ...
** using staged installation
** libs
using C++ compiler: ‘g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c ModularityOptimizer.cpp -o ModularityOptimizer.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c RModularityOptimizer.cpp -o RModularityOptimizer.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c cinmf_util.cpp -o cinmf_util.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c data_processing.cpp -o data_processing.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c feature_mat.cpp -o feature_mat.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c quantile_norm.cpp -o quantile_norm.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c snn.cpp -o snn.o
g++ -std=gnu++17 -I"/usr/local/lib/R/include" -DNDEBUG -DARMA_64BIT_WORD=1 -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include' -I'/usr/local/lib/R/site-library/RcppProgress/include' -I/usr/local/include -DARMA_DONT_USE_OPENMP -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c wilcoxon.cpp -o wilcoxon.o
g++ -std=gnu++17 -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o rliger.so ModularityOptimizer.o RModularityOptimizer.o RcppExports.o cinmf_util.o data_processing.o feature_mat.o quantile_norm.o snn.o wilcoxon.o -llapack -lblas -lgfortran -lm -lquadmath -L/usr/local/lib/R/lib -lR
installing to /home/rstudio/R/x86_64-pc-linux-gnu-library/4.3-3.18/00LOCK-rliger/00new/rliger/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (rliger) sessionInfo() output> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] miniUI_0.1.1.1 compiler_4.3.2 promises_1.2.1 Rcpp_1.0.12 stringr_1.5.1
[6] callr_3.7.5 later_1.3.2 fastmap_1.1.1 mime_0.12 R6_2.5.1
[11] curl_5.2.1 htmlwidgets_1.6.4 desc_1.4.3 profvis_0.3.8 shiny_1.8.0
[16] rlang_1.1.3 cachem_1.0.8 stringi_1.8.3 httpuv_1.6.14 fs_1.6.3
[21] pkgload_1.3.4 memoise_2.0.1 cli_3.6.2 magrittr_2.0.3 ps_1.7.6
[26] digest_0.6.35 processx_3.8.3 rstudioapi_0.15.0 xtable_1.8-4 remotes_2.4.2.1
[31] devtools_2.4.5 lifecycle_1.0.4 vctrs_0.6.5 glue_1.7.0 urlchecker_1.0.1
[36] sessioninfo_1.2.2 pkgbuild_1.4.3 purrr_1.0.2 tools_4.3.2 usethis_2.2.3
[41] ellipsis_0.3.2 htmltools_0.5.7 |
That all looks correct. Can you see what happens if you benchmark it with the proper openmp variable state set beforehand? I don't know whether openmp runtime state carries between extensions. |
The other possibility is that this is the same issue where pthreaded BLAS is causing parallelism recursion. RcppPlanc only checks for openmp enablement at attach, so it wouldn't necessarily be aware of the lifted threadcount. |
@theAeon Sounds good I will try those first thing tomorrow. @mvfki I have found another potential bug/issue. While I do realize parts of this may be edge cases I wanted to mention and why it may be better to have separate slot for the dimreducs.
To me at least I think having them in separate slot makes sense and ensures that they aren't accidentally overwritten by user but I don't know if there may be reason I don't understand why adding another slot could cause issues.
Thanks again!! |
I do agree with having them in a separate slot and what you found is actually what I've been somewhat worried about. The underlying things are like: a variable is called "UMAP", and it is a matrix. When using ggplot, "fortifying", I have to convert the S4Vectors::DataFrame into a data.frame which expands the matrix into as many columns as it should, and the column names "UMAP.1" "UMAP.2" are generated there. I have to manually get the suffix pasted in my plotting function, which is dirty. Things are even worse when it comes to container class conversion. I cannot actually tell a difference between an actuall dimred between something comes as a matrix but should be considered as metadata (e.g. SingleR returns), unless I get many more stuffs recorded and make the whole thing even dirtier. When we were at a pretty early stage, we were thinking in a way that dimreds are somewhat metadata. So I just put them there because from my previous experience with SCEs I know a matrix can be inserted in a DataFrame. To be honest, I'm kinda suffering from this decision now. Appreciate it that you finally let me made my mind to do it. Yichen |
Hi Yichen (@mvfki ), Ya it's definitely tricky. So one thing to consider is potentially just modeling the slot on SeuratObject's DimReduc class (or potentially just importing that class as is from SeuratObject) in liger. I know we discussed heavy dependencies before but I actually checked it out and SeuratObject's dependencies are actually rather light (or at least by comparison to Seurat for sure). In fact Seurat Object as dependency would only add 11 packages to installation.
I don't know the ins and outs of the DimReduc class or really setting classes like that but just a thought. It might also make transfer of dimreducs from Seurat easier because they could just be imported. Either adapting the class or directly importing it would allow for use of Seurat Objects accessor functions (most notably Just a thought in case it helps. Also again want to repeat for both of you how great this new version of liger is and how thankful I am you both and entire rest of the team's work!!! Finally, as it regards DimReduc storage and accessing/plotting if that is something that you would like help with I'm happy to try and assist if that lightens the load at all, just let me know! Best, |
I'm currently drafting it in the simplest way: the @dimReds slot would be a list of matrices. And I have checks on cell ID matching across the object. I know that a Seurat DimReduc object can have a lot more information stored for individual dimreds but we haven't yet seen any use case with them in LIGER, rather, we simply just need the embedding coordinates for visualization. One problem I can foresee is that when we convert a liger object to a seurat object, we don't have that much information to fill in the DimReduc object constructor. And information could be lost when converting it back and forth, though this is not really what we expect users to do. This is not yet overally finished but all unit tests are already passing and it's live in the branch (together with the package name change). Maybe you want to try out shortly after I revisit the Seurat conversion cases later today? Thank you very much for all the valuable comments and efforts in testing things out. To be honest it could be impossible for us to find out these many edge cases without your help! Yichen |
Hi Yichen, That's awesome!! Ya I will give it a go later today! Happy to help any way I can, even if it's just finding all the weird things lol. Bets, |
Hi @mvfki Got new warning message I've never gotten before today. Running iNMF I got this:
It went on even more than that but saving space here. The warning went away when I lowered the k value. Just wondering what the error is and if there is way to convey more informative error message/abort the function so user can adjust parameters. Thanks! |
Looks like something mathematical. Could you share the following numbers for this run:
|
Ya looks like probably issue with cell number.
5031 variable features |
Makes sense! Added! I've just had some last pieces of remaining work done and made a push with the major version bump. I'd like to let you know that our team plans to submit it to cran this week. Please let me know if you have any more concerns, especially the parts related to the reverse suggests by scCustomize. Appreciate it! |
Hooray! Congrats! I think I'm pretty good on scCustomize end. About half-way through making things forward/backward compatible but I think all good. Will let you know if anything else comes up! |
@mvfki One thing I just thought of and if it's too late for this CRAN release please don't worry about it but something to think about for next update after 2.0.0. It would be nice to have a parameter in I can obviously run the internal code from runCluster to replicate the results but would be easy to just have logical parameter in Thanks! |
Doable and it's not too late yet. I'm still dealing with some cran machine specific notes lol. Since liger itself doesn't have much things to do with the SNN but just run clustering, I'm not going to add a slot for it which would cost additional work on sanity checks. I'd say either a variable in cellMeta since S4Vectors::DataFrame should work with it, or just throw it in uns for unstructured miscellaneous information. I prefer the latter because it might result in enormous memory use when we happen to do as.data.frame on cellMeta. |
Great!! Also glad it's not too late because think I just found another bug. With the new dimred slot there is problem when using
|
Ya I think adding to uns makes sense so you don't have coerce it to dense structure. |
@mvfki Ignore previous comment about bug. It was error occuring due to rliger2 being loaded at same time as newer rliger. |
haha I was about to say I couldn't reproduce it. |
haha yup. thought I had removed rliger2 but guess not. Doing that now though lol |
Hi @samuel-marsh , it is live on CRAN, surprisingly no much trouble from the manual inspection! Again, I appreciate it so much that you kindly gave us so much help in testing the package. Best, |
Awesome congrats!! Happy I could help. I haven't run into any more issues in updating functions so I'll close this issue here and if I run into anything more will open on liger repo. Thanks so much both of you @theAeon and @mvfki for all your help!! The updates are really fantastic, speed was always the main drag for me and the changes are amazing (also having leiden clustering without passing to python which is soooooo slow in Seurat)!!!! |
Hi! We're planning on submitting a major rliger upgrade to CRAN in the coming days to weeks. It contains a fairly large number of API changes and (more relevant to your purposes) object structure changes. We will be pushing the updated code to welch-lab/liger@newObj shortly and promoting it to master once CRAN has their way with it. Updated documentation is available at https://mvfki.github.io/liger/. Feel free to reach out if you have any questions!
Best,
-Andrew Robbins
The text was updated successfully, but these errors were encountered: