Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spatialLIBD #1389

Closed
8 tasks done
lcolladotor opened this issue Feb 27, 2020 · 31 comments
Closed
8 tasks done

spatialLIBD #1389

lcolladotor opened this issue Feb 27, 2020 · 31 comments
Assignees
Labels
3a. accepted will be ingested into Bioconductor daily builder for distribution WARNINGS

Comments

@lcolladotor
Copy link
Contributor

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

  • I understand that by submitting my package to Bioconductor,
    the package source and all review commentary are visible to the
    general public.

  • I have read the Bioconductor Package Submission
    instructions. My package is consistent with the Bioconductor
    Package Guidelines.

  • I understand that a minimum requirement for package acceptance
    is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
    Passing these checks does not result in automatic acceptance. The
    package will then undergo a formal review and recommendations for
    acceptance regarding other Bioconductor standards will be addressed.

  • My package addresses statistical or bioinformatic issues related
    to the analysis and comprehension of high throughput genomic data.

  • I am committed to the long-term maintenance of my package. This
    includes monitoring the support site for issues that users may
    have, subscribing to the bioc-devel mailing list to stay aware
    of developments in the Bioconductor community, responding promptly
    to requests for updates from the Core team in response to changes in
    R or underlying software.

I am familiar with the essential aspects of Bioconductor software
management, including:

  • The 'devel' branch for new packages and features.
  • The stable 'release' branch, made available every six
    months, for bug fixes.
  • Bioconductor version control using Git
    (optionally via GitHub).

For help with submitting your package, please subscribe and post questions
to the bioc-devel mailing list.

@bioc-issue-bot
Copy link
Collaborator

Hi @lcolladotor

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: spatialLIBD
Title: LIBD Visium spatial transcriptomics human pilot data inspector
Version: 0.99.0
Date: 2020-02-23
Authors@R:
    c(
    person("Leonardo", "Collado-Torres", role = c("aut", "cre"), 
    email = "[email protected]", comment = c(ORCID = "0000-0003-2140-308X")),
    person(c("Andrew", "E."), "Jaffe",
    role = "ctb", email = "[email protected]",
    comment = c(ORCID = "0000-0001-6886-1454"))
    )
Description: Inspect interactively the spatial transcriptomics 10x Genomics
    Visium data from Maynard, Collado-Torres et al, 2020 analyzed by Lieber
    Institute for Brain Development researchers and collaborators.
License: Artistic-2.0
Encoding: UTF-8
LazyData: true
Imports: 
    shiny,
    golem,
    ggplot2,
    cowplot,
    plotly,
    viridisLite,
    shinyWidgets,
    Polychrome,
    sessioninfo,
    grid,
    grDevices,
    methods,
    AnnotationHub,
    utils,
    png,
    scater,
    DT,
    ExperimentHub,
    RColorBrewer,
    SummarizedExperiment,
    stats,
    graphics,
    S4Vectors,
    IRanges,
    fields
RoxygenNote: 7.0.2
Roxygen: list(markdown = TRUE)
URL: https://github.com/LieberInstitute/spatialLIBD
BugReports: https://github.com/LieberInstitute/spatialLIBD/issues
Suggests: 
    knitr,
    rmarkdown,
    BiocStyle,
    knitcitations,
    testthat (>= 2.1.0),
    covr,
    here,
    BiocManager
VignetteBuilder: knitr
biocViews: Homo_sapiens_Data, ExperimentHub, SequencingData, SingleCellData,
    ExpressionData, Tissue
Depends:
    SingleCellExperiment,
    R (>= 3.6.1)

Add SSH keys to your GitHub account. SSH keys
will are used to control access to accepted Bioconductor
packages. See these instructions to add SSH keys to
your GitHub account.

@lcolladotor
Copy link
Contributor Author

Hi!

This is the recently promised package that I mentioned at BioC's slack last Friday.

We are working on a new pkg that we’ll submit to BioC, potentially next week. The package will have:

  • Ways to download the 12 Visium data that Andrew mentioned previously.
  • The data is stored in a SingleCellExperiment object with the histology images stored inside metadata(sce)
  • Functions for plotting data from the data.
  • The code for launching a shiny app that lets you annotate spots to a given tissue/spot label of your choosing.
  • Some other functions related to our particular dataset.

The idea is that it could be useful to you if:

  • You simply want some more Visium data.

  • Want to re-shape your data into what ours is structured as, then re-use the visualization functions and/or the shiny app itself.

  • Want to explore our data in more detail.

  • I’ll let you know once we submit it to BioC so that everyone can access it. The shiny app is not meant to compete with iSEE and I guess that iSEE (or other apps) could be updated to handle this type of data. It was designed with feedback from our biologists such that they could annotate the spots. Compared to Loupe from 10x Genomics, it’s a bit more bulky but it does let you explore multiple samples.

Best,
Leo

Having said all that, spatialLIBD has the following features and properties:

  • It is proposed as an Experiment package such that the data will live at ExperimentHub and can be accessed through spatialLIBD::fetch_data(). The function has a backup mechanism which means that it works right now (without the caching featured by ExperimentHub through BiocFileCache).
  • It contains the code for a prototype shiny application that can be run using spatialLIBD::run_app() and is deployed at https://jhubiostatistics.shinyapps.io/spatialLIBD/ . The shiny code works and is definitely not at the level of iSEE as that was not the focus of our project. Basically, the shiny app is good enough for our purposes, although spatialLIBD::run_app() could be used to visualize your own Visium data if you re-shape it to fit our requirements. Given that https://jhubiostatistics.shinyapps.io/spatialLIBD/ at the time that I'm writing this supports max 10 simultaneous users due to the memory constraints, we think that users will want to run the app locally with spatialLIBD::run_app() more frequently than not.
  • We have built a pkgdown documentation website that is updated automatically as is available at http://research.libd.org/spatialLIBD/ thanks to Travis (spatialLIBD build reports). This required a small workaround for the memory limits there.
  • Does not depend on R 4.0.0 as should be the case for new submissions because we (a) are using it actively right now with R 3.6.1 and 3.6.2 coupled with Bioconductor version 3.10 (current release) and (b) because we are deploying a shiny app at https://jhubiostatistics.shinyapps.io/spatialLIBD/. Unless you have a suggestion, we ask that you please ignore this.
  • We are aware that Aaron Lun announced on Bioc-devel that an upcoming version of SingleCellExperiment will change the object internally. So maybe we'll need to run updateObject() in the future once R 4.0.0 is released or something like that (again, if we update the object now we won't be able to use it on R 3.6.1/3.6.2 with BioC 3.10).
  • Part of the documentation points to https://github.com/LieberInstitute/HumanPilot which is our GitHub repository for the analysis code for the project for which we developed spatialLIBD. This repository will be made public within a few days (maybe even tomorrow; we'll flip the switch as soon as we post a public pre-print on bioRxiv). In particular, we link heavily to this repository in the vignette section Re-shaping your data to our structure.
  • Currently has a clean R CMD check output except for a NOTE https://travis-ci.com/LieberInstitute/spatialLIBD/jobs/291570143#L880-L885 about the package size (11.5 Mb). But that's small enough for BiocCheck considering it's an Experiment package.
  • Uses set.seed() inside spatialLIBD::layer_boxplot(seed) because we want our jittered points to be fully reproducible. The user can control this through the seed argument. This is related to the only WARNING BiocCheck reports (see further below).
  • The system() calls mentioned by BiocCheck are from lines of code that I as the developer ran and wanted to save for self-future reference. They are never evaluated by spatialLIBD. This is in part because I'm trying out the golem way of organizing your code with a dev directory.
  • Has no formal unit tests, though all the functions are used at https://jhubiostatistics.shinyapps.io/spatialLIBD/ ^^. This is related to the next paragraph.

Overall, I think this goes beyond a typical Experiment package. It has flavors of a Software package, however, we are well aware that the Bioconductor community is working towards developing infrastructure packages for spatial transcriptomics data and in particular, 10x Genomics Visium data. Given that spatialLIBD provides the first substantial data on human brain for the Visium platform and more challenges than the public mouse 10x Genomics dataset, we think that maybe some of the work we did here could help fuel that development and we would be happy to collaborate/help in any way. For example, over time the visualization functions could either move from spatialLIBD to another package (where we could formally test them) and we would thus likely update spatialLIBD to import the new infraestructure (or something like that).

As always, if you have any questions, please let me know.

I also wanted to let you know that since we are submitting our pre-print soon, likely we'll link to http://research.libd.org/spatialLIBD/ instead of http://bioconductor.org/packages/spatialLIBD unless you have other suggestions.

Best,
Leo

R CMD build/check on R 4.0.0

BiocCheck on R 3.6.2

Below is the output of BiocCheck (locally from my laptop on R 3.6.2 and Bioconductor 3.10 -- current release branch; I know that the single issue builder will provide more details soon).

$ R CMD BiocCheck --new-package spatialLIBD_0.99.0.tar.gz
This is BiocCheck version 1.22.0. BiocCheck is a work in progress.
Output and severity of issues may change. Installing package...
* Checking Package Dependencies...
* Checking if other packages can import this one...
* Checking to see if we understand object initialization...
* Checking for deprecated package usage...
* Checking for remote package usage...
* Checking version number...
* Checking for version number mismatch...
* Checking new package version number...
* Checking R Version dependency...
* Checking package size...
* Checking individual file sizes...
* Checking biocViews...
* Checking that biocViews are present...
* Checking package type based on biocViews...
    ExperimentData
* Checking for non-trivial biocViews...
* Checking that biocViews come from the same category...
* Checking biocViews validity...
* Checking for recommended biocViews...
    * NOTE: Consider adding these automatically suggested biocViews:
      PackageTypeData
See http://bioconductor.org/developers/how-to/biocViews/
* Checking build system compatibility...
* Checking for blank lines in DESCRIPTION...
* Checking if DESCRIPTION is well formatted...
* Checking for whitespace in DESCRIPTION field names...
* Checking that Package field matches directory/tarball name...
* Checking for Version field...
* Checking for valid maintainer...
* Checking DESCRIPTION/NAMESPACE consistency...
    This is not a software package, skipping vignette checks...
* Checking library calls...
* Checking for library/require of spatialLIBD...
* Checking coding practice...
    * NOTE: Avoid sapply(); use vapply()
      Found in files:
        check_modeling_results.R (line 33, column 9)
        check_modeling_results.R (line 40, column 19)
        check_modeling_results.R (line 41, column 13)
        gene_set_enrichment.R (line 80, column 17)
        gene_set_enrichment.R (line 100, column 26)
        gene_set_enrichment.R (line 101, column 28)
        layer_boxplot.R (line 192, column 17)
        layer_stat_cor.R (line 56, column 17)
        sig_genes_extract.R (line 78, column 29)
        sig_genes_extract.R (line 93, column 9)
        sig_genes_extract.R (line 97, column 9)
        sig_genes_extract.R (line 101, column 9)
    * NOTE: Avoid system() ; use system2()
      Found in files:
        inst/scripts/make-data_spatialLIBD.R (line 201)
        inst/scripts/make-data_spatialLIBD.R (line 218)
        inst/doc/spatialLIBD.Rmd (line 192)
        vignettes/spatialLIBD.Rmd (line 192)
    * WARNING: Remove set.seed usage in R code
      Found in R/ directory functions:
        layer_boxplot()
* Checking parsed R code in R directory, examples, vignettes...
* Checking function lengths...................................................
    * NOTE: Recommended function length <= 50 lines.
      There are 14 functions > 50 lines.
      The longest 5 functions are:
        app_server() (R/app_server.R, line 17): 1273 lines
        app_ui() (R/app_ui.R, line 6): 516 lines
        layer_boxplot() (R/layer_boxplot.R, line 104): 123 lines
        check_sce() (R/check_sce.R, line 25): 89 lines
        layer_matrix_plot() (R/layer_matrix_plot.R, line 59): 86 lines
* Checking man page documentation...
* Checking package NEWS...
* Checking unit tests...
* Checking skip_on_bioc() in tests...
* Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source,
  and vignette source...
    * NOTE: Consider shorter lines; 305 lines (5%) are > 80 characters
      long.
    First 6 lines:
      R/app_server.R:37     ## From /dcl02/lieber/ajaffe/SpatialTranscriptomi...
      R/app_server.R:38     # cat(paste0("'", names(cols_layers_martinowich),...
      R/app_server.R:74         } else if (input$cluster %in% c('layer_guess'...
      R/app_server.R:76         } else if (input$cluster %in% c('layer_guess_...
      R/app_server.R:356                 assays(sce_sub)[[assayname]][which(r...
      R/app_server.R:413             scale_fill_manual(values = get_colors(co...
    * NOTE: Consider multiples of 4 spaces for line indents, 118
      lines(2%) are not.
    First 6 lines:
      man/check_sce.Rd:8   sce,
      man/check_sce.Rd:9   variables = c("GraphBased", "Layer", "Maynard", "M...
      man/fetch_data.Rd:8   type = c("sce", "sce_layer", "modeling_results"),
      man/fetch_data.Rd:9   destdir = tempdir(),
      man/fetch_data.Rd:10   eh = ExperimentHub::ExperimentHub()
      man/gene_set_enrichment_plot.Rd:8   enrichment,
    See http://bioconductor.org/developers/how-to/coding-style/
    See FormatR package:
      https://cran.r-project.org/web/packages/formatR/index.html
* Checking if package already exists in CRAN...
* Checking if new package already exists in Bioconductor...
* Checking for bioc-devel mailing list subscription...
    * NOTE: Cannot determine whether maintainer is subscribed to the
      bioc-devel mailing list (requires admin credentials).  Subscribe
      here: https://stat.ethz.ch/mailman/listinfo/bioc-devel
* Checking for support site registration...
    Maintainer is registered at support site.


Summary:
ERROR count: 0
WARNING count: 1
NOTE count: 7
For detailed information about these checks, see the BiocCheck
vignette, available at
https://bioconductor.org/packages/3.10/bioc/vignettes/BiocCheck/inst/doc/BiocCheck.html#interpreting-bioccheck-output

@bioc-issue-bot
Copy link
Collaborator

A reviewer has been assigned to your package. Learn what to expect
during the review process.

IMPORTANT: Please read the instructions for setting
up a push hook on your repository, or further changes to your
repository will NOT trigger a new build.

@bioc-issue-bot bioc-issue-bot added 2. review in progress assign a reviewer and a more thorough review of package code and documentation taking place and removed 1. awaiting moderation labels Mar 2, 2020
@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

Received a valid push; starting a build. Commits are:

7ac24dc v0.99.10 -- update AWS file locations, resolve a b...

@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@lcolladotor
Copy link
Contributor Author

Hi @dvantwisk.

Thank you for reviewing spatialLIBD. Right now my understanding is that I cannot resolve the RccpAnnoy error from the DESCRIPTION since I can't ask the Single Package Builder to have a particular version of RcppAnnoy installed.

If that's not the case, please let me know. This is related to eddelbuettel/rcppannoy#57. In any case, it looks from eddelbuettel/rcppannoy#57 (comment) that RcppAnnoy 0.0.16 that fixes this issue will be available through CRAN by the end of the week.

As for the BiocCheck warnigns about R 3.6 and set.seed(), I already explained that above.

Best,
Leo

@mtmorgan
Copy link
Contributor

mtmorgan commented Mar 3, 2020

For set.seed(), I guess it's as reproducible if it is set outside the function as set inside? so why set it inside? this is generally considered Very Bad Practice.

@bioc-issue-bot
Copy link
Collaborator

Received a valid push; starting a build. Commits are:

5024450 v0.99.11 -- test without listing the version of Rc...

@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

Received a valid push; starting a build. Commits are:

a2c0d08 v0.99.12 -- see NEWS.md for details

@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@lcolladotor
Copy link
Contributor Author

Hi,

So I moved set.seed() out of spatialLIBD::layer_boxplot() but I still need it inside spatialLIBD::app_server() to make the boxplots reproducible, so the warning from BiocCheck remains. It's not like I can call set.seed() outside of app_server() since then individual boxplots wouldn't really be reproducible as they would depend on the order a user made them. That's why BiocCheck still lists 2 warnings: set.seed() and the R version (3.6 not 4.0; because I need to deploy the app).

I did reduce the number of NOTEs from 6 to 3 for BiocCheck. The error by BiocCheckGitClone is now gone since I removed spatialLIBD.Rproj (I'm not sure why that file can't be there).

As for Windows, R CMD check fails due to memory. How much memory does that builder have? Because the package needs about 2.5 GB of RAM to load the data it provides.

Best,
Leo

@mtmorgan
Copy link
Contributor

mtmorgan commented Mar 4, 2020

but generally reproducibility involving random numbers depends on the order in which the user invokes functions, how is this any different?

@lcolladotor
Copy link
Contributor Author

I simply want the boxplots to have points plotted on top of them and to be jittered randomly, but controlled in such a way that you can re-make the same plot that you see in the shiny app through the R console. That way you can make it on the R console and save to a different set of pdf size/height if you want to. Or you can make the sample plot and change the colors as in http://research.libd.org/spatialLIBD/reference/layer_boxplot-5.png and http://research.libd.org/spatialLIBD/reference/layer_boxplot-6.png.

Originally, I was able to do this by having a seed argument in spatialLIBD::layer_boxplot() and using the same seed in every call of layer_boxplot() (whether it was the examples of the package or the shiny app). Now by having the user call set.seed() themselves prior to running layer_boxplot(), then the user can remake all plots they want to in the console as in http://research.libd.org/spatialLIBD/reference/layer_boxplot.html. However, in order for a user to remake the sample plot they see from the shiny app, I also have to use set.seed() inside app_server(). Thus the warning by BiocCheck remains.

If a user (or myself) runs set.seed() before run_app() (which calls app_server() and app_ui()) and thus avoiding the warning all together, then in a way yes, the figures reproducible from the shiny app, but only if they make the same number of boxplots. That is:

## Behavior that would avoid the warning
## but you don't get the same plot each time
## (in this case illustrated with 3 random numbers)
> set.seed(1)
> runif(3)
[1] 0.2655087 0.3721239 0.5728534
> runif(3)
[1] 0.9082078 0.2016819 0.8983897

## unless you run the plots in the same sequence
> set.seed(1)
> runif(3)
[1] 0.2655087 0.3721239 0.5728534
> runif(3)
[1] 0.9082078 0.2016819 0.8983897

## Scenario I want: every plot has the same
## seed and you can then easily re-make it
## later without keeping track of the order
## (well, number) of boxplots you made on
## the app
> set.seed(1)
> runif(3)
[1] 0.2655087 0.3721239 0.5728534
> set.seed(1)
> runif(3)
[1] 0.2655087 0.3721239 0.5728534

@mtmorgan
Copy link
Contributor

The best advice is to not call set.seed in your package code. Continue to do so if you insist, but know that this deviates from best practice.

@dvantwisk
Copy link

dvantwisk commented Mar 26, 2020

I apologize for the silence on my end. Even though we are still deliberating about some of the errors you are having on our builders' check, I have reviewed your package. It is extremely well written and I only have a few small concerns. Please message me back when you have questions or you are finished making changes. I should be able to be more responsive to you now. I should be able to give you more concrete directions on what you should due to solve your build issues shortly.

GENERAL

  • [REQUIRED] We generally want to see repositories that are no larger than
    5MB, however, we will make exceptions if the user makes an effort to reduce the
    size as much as possible. Your repo is currently 47MB large. The .git
    directory makes at 36MB of this. It may be the case that you saved some large
    files and later deleted them, however, they are still present in the version
    control history. I ask that you use the BFG-repo cleaner tool to attempt to
    reduce the size of your package by removing unneeded version control history.
    https://rtyley.github.io/bfg-repo-cleaner/

vignettes/spatialLIBD.R

  • [REQUIRED] There are multiple uses of eval=FALSE in the vignette. We
    generally want to see using facing code run on our builder to ensure it
    functions as intended. However, there are cases when eval=FALSE can be
    used acceptably such as when the code opens something interactive. I assume
    this is the case this your usage here and I just want you to confirm it is
    indeed the case.

@dvantwisk
Copy link

dvantwisk commented Mar 27, 2020

Regarding the issues you've been having with the builder:

  • We will allow you to use set.seed() as you described since you were not able to achieve the reproducible plots without it, even though this is not ideal.

  • With regards to examples failing due to memory usage, our 32-bit windows builder only allows 2.5GB of memory to be used by a package being built at any time. For this, there may be two solutions. One is to simply reduce the amount of memory used by the example. Another option may be to remove larger objects from memory to free up space using rm(), as they may persist between examples otherwise.

@lcolladotor
Copy link
Contributor Author

ohh, it's been 6 days =(

Sorry, I was teaching a course last week and I'm behind on several projects. Thank you for looking into the package!

I'll work on making a smaller SCE object at the spot-level (the 2+ GB one currently) just for some plots and use that for the examples in order to get through the 2.5 GB limit on the Windows builder (I didn't know about that limit, thanks for the info!). I already have one but it was kind of used internally only in the vignette (and not in the R help files) which is why there are some non-shiny chunks with eval = FALSE in the vignette.

Best,
Leo

@dvantwisk
Copy link

Hello,

I just want to check in with the progress of your package submission. I need to send you a reminder that the deadline for your package to be included in the next Bioconductor release is April 21st. The package will need to be accepted by then to be included in the next release. Please message me back if you have any issues making the changes I specified.

@dvantwisk
Copy link

I do need to remind you that the final day to accept packages for the upcoming Bioconductor release is April 22nd. Please message me if you have any other questions.

@bioc-issue-bot
Copy link
Collaborator

Received a valid push; starting a build. Commits are:

8970f02 v0.99.13 -- Switch to GitHub actions as described ...

@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@lcolladotor
Copy link
Contributor Author

lcolladotor commented Apr 22, 2020

Hi,

Sorry for dropping the ball on the communicating here. I got really into GitHub Actions at r-lib/actions#84 since Wednesday last week so I got behind on this project.

Anyway, I'm working on spatialLIBD right now, but I have discovered something that might affect Bioconductor a bit more broadly.

This is in relation to the github repo size that you mentioned in #1389 (comment).

Steps that I followed

  1. I used git rm --cache to remove a ~1 MB Adobe Illustrator file and another small image. Then using bfg I removed anything bigger than 600 Kb not present in the HEAD. The end result was a 46 Mb fresh git clone https://gist.github.com/lcolladotor/c7a74f94c9176982dc4be4ce7c5f1f7e#file-log_step1-txt-L113-L124.
  2. I used git rm --cache to remove some ~500kb images, then ran bfg again this time over 300 Kb. Still 46 Mb on a fresh clone.
  3. I went to LieberInstitute/spatialLIBD/branches and manually deleted the gh-pages branch. This lead to a fresh clone of 22 mb. That branch has the pkgdown files and their history.
  4. After re-deploying (so re-making a gh-pages branch) the pkgdown docs, it went back to 44 Mb. on a fresh clone.
$ git clone [email protected]:LieberInstitute/spatialLIBD.git
Cloning into 'spatialLIBD'...
remote: Enumerating objects: 742, done.
remote: Counting objects: 100% (742/742), done.
remote: Compressing objects: 100% (365/365), done.
remote: Total 1621 (delta 364), reused 675 (delta 325), pack-reused 879
Receiving objects: 100% (1621/1621), 33.50 MiB | 6.28 MiB/s, done.
Resolving deltas: 100% (907/907), done.
$ cd spatialLIBD/
$ du -sh .
 44M	.
  1. I googled a bit and with the solution listed here, I cloned only the master branch which lead to a full size of 18 Mb.

Bioconductor as a whole

Hence, I think that this could affect Bioconductor at large if the SBP and the daily builds just clone the branch needed (so master for the SBP and bioc-devel, RELEASE_X_Y for release) instead of the full repositories. For example:

git clone -b master --single-branch [email protected]:LieberInstitute/spatialLIBD.git

Back to spatialLIBD

I don't think that I can reduce the file size significantly anymore without affecting the package.

The 12 low resolution images (as defined by 10X Genomics) take 5.5 Mb themselves and really essential, plus this being an experiment and not a software package, I thought that it would be ok to have them in it. We have the "high" resolution (~6 Mb) ones and the "raw" (~500 Mb) TIFFs on AWS linked to from the READMEs.

$ du -sh inst/app/www/data/
5.5M	inst/app/www/data/

The 2 images and a file noted by bfg on the 300 Kb run are already reduced in size to what we want them to show on screens (maybe I can save a tiny bit of space here), plus the 385 Kb csv is an example file used by the shiny app.

The rest seems to come from from the git history.

$ du -sh *
4.0K	DESCRIPTION
4.0K	NAMESPACE
4.0K	NEWS.md
196K	R
 20K	README.Rmd
 24K	README.md
4.0K	_pkgdown.yml
4.0K	app.R
4.0K	codecov.yml
168K	data
436K	data-raw
136K	dev
5.7M	inst
2.3M	man
236K	pkgdown
8.0K	tests
 48K	vignettes
$ du -sh .git
8.9M	.git

The pkgdown landing site is what we provided as the main link to reviewers of the paper too given it's guaranteed (as much as it can be) availability over the shiny app (that can crash if you have too many users). Plus that can be dodged by cloning the master branch only.

Best,
Leo

@bioc-issue-bot
Copy link
Collaborator

Received a valid push; starting a build. Commits are:

5aee697 Make the run_app() example a dontrun one
f684bdf v0.99.14 -- add the enough_ram() function; add fet...

@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

Received a valid push; starting a build. Commits are:

05664de v0.99.15 -- fix a NOTE

@lcolladotor
Copy link
Contributor Author

Hi,

I'm happy to report that the package spatialLIBD now builds and checks on both Linux and Windows at http://bioconductor.org/spb_reports/spatialLIBD_buildreport_20200422013847.html. It further passed on macOS * at https://github.com/LieberInstitute/spatialLIBD/actions/runs/84435908 as shown here.

There are still two WARNINGs due to:

  • R 3.6 instead of 4.0 (although the above tests show that it runs in the newest R versions) given the deployment dependency on shinyapps.io. If you want, I can bump the version requirement to 4.0 once shinyapps.io has an R 4.0 version up and running too.
  • Usage of set.seed() as discussed previously. It's only limited to the shiny app code as it's outside of function calls elsewhere, which was the best I could do to minimize the impact of set.seed() in the code while still having reproducible images from the shiny app (same one you see as the one you download on PDF).

This version (0.99.14) includes a new function called enough_ram() that leverages benchmarkme::has_ram() as in drisso/mbkmeans/blocksize.R#L17 with a default of 3GB (3e9 bytes really). It returns a logical(1) and the result is used to condition the running of the examples that require the ~2.1 GB sce object. Thus, these examples don't run on the Bioconductor Single Package Builder (SPB) on Windows. However, they do at GitHub Actions as shown here (they provide machines with 7 GB of RAM). The functions are still tested on Windows in the SPB since now fetch_data(type) supports type = 'sce_example' which downloads the subsetted sce object that I was using behind the scenes on the vignette (hence the eval = FALSE statements that are gone now; except for installing the package and creating the vignette).

Version 0.99.15 running right now resolves a small NOTE on fetch_data().

Best,
Leo

* Using R 4.1 and not 4.0 on both macOS and Windows due to the weird nature of this window of time where there's more versions of R beyond release and devel as discussed here; yes, I know that R 4.1 is not supported right now by BioC but that's the test I was able to setup. This has enabled me to discover some issues in a few of my other packages.

@bioc-issue-bot
Copy link
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@dvantwisk dvantwisk added 3a. accepted will be ingested into Bioconductor daily builder for distribution and removed 2. review in progress assign a reviewer and a more thorough review of package code and documentation taking place labels Apr 23, 2020
@bioc-issue-bot
Copy link
Collaborator

Your package has been accepted. It will be added to the
Bioconductor Git repository and nightly builds. Additional
information will be posed to this issue in the next several
days.

Thank you for contributing to Bioconductor!

@mtmorgan
Copy link
Contributor

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/lcolladotor.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/
https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("spatialLIBD"). The package 'landing page' will be created at

https://bioconductor.org/packages/spatialLIBD

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3a. accepted will be ingested into Bioconductor daily builder for distribution WARNINGS
Projects
None yet
Development

No branches or pull requests

4 participants