acs-intermediate fix to saved fldgen memory bloat #49

Merged
abigailsnyder merged 13 commits into JGCRI:master from acs-fldgen-subsetting-output-fcn
Jul 15, 2020

abigailsnyder

An intermediate fix to #25.

This reduces what we save from a trained emulator to the bare-bones list entries needed for generating new fields. It takes the ISIMIP GFDL trained emulator from 5.6 GB for everything to 2.1 GB. Hopefully this is small enough to work in the Cassandra pipeline.
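As a rough illustration of the idea (not the actual `emulator_reducer` implementation), the sketch below keeps only selected entries of a named list and compares in-memory sizes. The stand-in emulator, its entry names `needed` and `training_only`, and the helper `keep_entries` are all hypothetical.

```r
# Hypothetical stand-in for a trained emulator: a named list with one small
# entry needed for field generation and one large entry used only in training.
emu <- list(needed = rnorm(10), training_only = rnorm(1e5))

# Hypothetical helper: keep only the named entries of a list-like emulator.
keep_entries <- function(emulator, keep) {
    absent <- setdiff(keep, names(emulator))
    if (length(absent) > 0) {
        stop('entries not found: ', paste(absent, collapse = ', '))
    }
    emulator[keep]
}

reduced <- keep_entries(emu, 'needed')

# Compare in-memory sizes before and after the reduction.
print(object.size(emu), units = 'Kb')
print(object.size(reduced), units = 'Kb')
```

Failing loudly on a missing entry is a deliberate choice here: a reduced emulator that silently lacks a required entry would only break later, at field-generation time.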

The scripts in fldgen/inst/scripts are copies of the same files in /pic/projects/GCAM/GE/drought-expt/fldgen-emulators, with fldgen/inst/scripts/train-emulators.R updated to call the new emulator_reducer function and save the smaller emulators. So in theory the emulators can be rebuilt from scratch, or each existing emulator in /pic/projects/GCAM/GE/drought-expt/fldgen-emulators can be loaded, reduced, and saved with a separate script that follows what has been added to fldgen/inst/scripts/train-emulators.R.
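A minimal sketch of the load, reduce, and save route might look like the following. The helper `reduce_saved_emulators` is hypothetical, as is the assumption that the existing emulators are stored as `.rds` files; the reducer is passed in as an argument so the sketch does not assume how fldgen exposes `emulator_reducer`.

```r
# Hypothetical helper: load each saved emulator in `indir`, reduce it, and
# save a smaller copy in `outdir`. Assumes the emulators are .rds files.
reduce_saved_emulators <- function(indir, outdir, reducer) {
    infiles <- list.files(indir, pattern = '\\.rds$', full.names = TRUE)
    outfiles <- character(0)
    for (infile in infiles) {
        emu <- readRDS(infile)
        reducedEmulator <- reducer(emu)
        # Name the output after the input file, with a suffix marking it
        # as reduced.
        stem <- tools::file_path_sans_ext(basename(infile))
        outfilename <- file.path(outdir, paste0(stem, '_reducedEmulator.rds'))
        saveRDS(reducedEmulator, outfilename)
        outfiles <- c(outfiles, outfilename)
    }
    invisible(outfiles)
}
```

With fldgen loaded, this could be called as `reduce_saved_emulators(indir, outdir, emulator_reducer)`. Writing to a separate output directory keeps the original emulators untouched and avoids reducing an already reduced file on a second run.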

Note that when you load an RDS object saved by fldgen/inst/scripts/train-emulators.R, it will come in with the name reducedEmulator and not emu. So either the Python code has to be adjusted, or this piece of code in the training script:

    reducedEmulator <- emulator_reducer(emu)
    outfilename <- paste0('fldgen-', model, '_reducedEmulator.rds')
    saveRDS(reducedEmulator, outfilename)

would have to be rewritten as:

    emu <- emulator_reducer(emu)
    outfilename <- paste0('fldgen-', model, '_reducedEmulator.rds')
    saveRDS(emu, outfilename)

I left the two names distinct for now so that the reduction step is clear in the fldgen package; Cassandra users can amend this according to their own preference.

The pointers in the Cassandra directory on pic will have to be updated to point to these smaller emulators.

  • @crvernon this will probably fail the R-devel tests on macOS, since the ncdf4 package hasn't been updated for that yet. We like to run that test and check the failure to keep an eye on potential issues coming down the road.

  • Passes the local package check at least. Fingers crossed it behaves here.

  • Also removes never-used functions that caused issue #40 ('reading and writing models works' test fails).

@abigailsnyder

The only failing test is R-devel on macOS, due to ncdf4 as usual.

@abigailsnyder

Once this PR is merged, @abigailsnyder will use the new function to create reduced-size versions of all emulators in /pic/projects/GCAM/GE/drought-expt/fldgen-emulators/ with the correct variable names. This does not update the contents of any trained emulator and does not change the science; it just saves a copy of what is already being used, with fewer things in it.


@kdorheim kdorheim left a comment


@abigailsnyder this looks good. My only suggestions are on documentation; feel free to accept or reject them. One question, though: is the idea that this is something you run on a trained emulator before saving it? It is not called internally by some other function, right?

R/generateTPresids.R (resolved)
R/writedata.R (resolved)
R/writedata.R (resolved)
R/writedata.R (outdated, resolved)
R/writedata.R (outdated, resolved)
R/writedata.R (outdated, resolved)
@abigailsnyder abigailsnyder merged commit ca50388 into JGCRI:master Jul 15, 2020
@abigailsnyder abigailsnyder deleted the acs-fldgen-subsetting-output-fcn branch July 15, 2020 17:17