Hard-coded discover paths #1

Open
mathomp4 opened this issue Jul 6, 2022 · 5 comments

@mathomp4
Member

mathomp4 commented Jul 6, 2022

One issue that will eventually need to be fixed (once NAS runs of GEOS are working again) is that this repo currently has hard-coded discover paths in OH_GridComp/OH_instance_OH.rc:

Untuned_GBoostFile: /discover/nobackup/mmanyin/CCM/run/oh26/OH_model/OH_NoTune.bin
XGBoost_0.81_File: /discover/nobackup/mmanyin/CCM/run/oh26/OH_model/OH_Tuned_M01.bin
XGBoost_1.6.0_File: /discover/nobackup/mmanyin/CCM/OH_Boost/XGBoost_1.6.0/xgboh_UpDwnALBUVSZAAll_NoGMIALB_NoScale_NoRegressor_NewXGB_M01.model
XGBoostFile: /discover/nobackup/mmanyin/CCM/OH_Boost/XGBoost_1.6.0/xgboh_UpDwnALBUVSZAAll_NoGMIALB_NoScale_NoRegressor_NewXGB_M01.model

And pretty much everywhere in OH_GridComp/OH_GridComp_ExtData.rc
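As a rough way to audit how many entries are affected, here is a minimal Python sketch that lists every whitespace-delimited token starting with `/discover/` in the text of an rc file. The function name `find_discover_paths` and the sample text are illustrative assumptions, not part of this repo; the two sample entries are copied from the list above.

```python
import re

# Any whitespace-delimited token that begins with /discover/ is treated as a
# hard-coded path. This is a heuristic, not a parser for the rc format.
DISCOVER_PATH = re.compile(r"/discover/\S+")


def find_discover_paths(rc_text):
    """Return (line_number, path) pairs for each /discover/ path found."""
    hits = []
    for lineno, line in enumerate(rc_text.splitlines(), start=1):
        for path in DISCOVER_PATH.findall(line):
            hits.append((lineno, path))
    return hits


# Sample text: two entries quoted in this issue plus one made-up non-path line.
sample = """\
Untuned_GBoostFile: /discover/nobackup/mmanyin/CCM/run/oh26/OH_model/OH_NoTune.bin
XGBoost_0.81_File: /discover/nobackup/mmanyin/CCM/run/oh26/OH_model/OH_Tuned_M01.bin
Some_Other_Key: 42
"""

for lineno, path in find_discover_paths(sample):
    print(f"line {lineno}: {path}")
```

To check the real files, the same function could be run on the contents of `OH_GridComp/OH_instance_OH.rc` and `OH_GridComp/OH_GridComp_ExtData.rc`.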

@mathomp4 mathomp4 added the question Further information is requested label Jul 6, 2022
@mathomp4
Member Author

Eventually, we'll probably want this accessed via ExtData somehow. Need to work out the scripting, etc.

@mmanyin
Collaborator

mmanyin commented Jul 27, 2022

First, regarding the XGBoostFile: I will handle that once there is a full set of 12 monthly files from Dan Anderson. (Previously I have worked with Mark Solomon to move many GMI files to GMAO project space, so I know the process.) Note that the Boost files are not gridded, so ExtData does not apply.

More generally, there are a number of lines in OH_GridComp_ExtData.rc which reference daily output files from "MERRA2/GMI", Luke Oman's full chemistry replay run. I will check with Luke to see if there has been much demand for accessing these at NAS. Copying all of the output to mirrored GMAO space would be prohibitive.

@mathomp4
Member Author

Ohhh. It's only a month? I didn't know that. I thought these were "universal". I suppose it makes sense it's not. I'm learning about ML! :) So these are like monthly climatologies (which isn't the right word, but...) where we expect all Januarys to look like model January?

And I only meant "ExtData" in the sense that that is our "generic model boundary condition" sort of space. If not there, then we probably need some sort of scripting to edit for NAS like we do for MERRA2 Replay.

And, yeah, if you think these runs would only ever be run at NCCS, then I suppose it doesn't matter. 😄
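For illustration only, the "scripting to edit for NAS" idea could look something like the Python sketch below, which swaps one path prefix for another in rc text. The function name `retarget_paths` and the target prefix `/path/to/nas/mirror` are placeholders I made up; this is not the actual MERRA2 Replay scripting, and the real NAS location is not known here.

```python
import re


def retarget_paths(rc_text, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix wherever it starts a path token.

    The prefix must begin a whitespace-delimited token and be followed by
    '/', so longer directory names that merely share the prefix are left
    untouched.
    """
    pattern = re.compile(r"(?<!\S)" + re.escape(old_prefix) + r"(?=/)")
    # A function replacement avoids any backslash interpretation in new_prefix.
    return pattern.sub(lambda _match: new_prefix, rc_text)


# The old prefix comes from the paths quoted in this issue; the new one is a
# hypothetical placeholder.
line = "XGBoostFile: /discover/nobackup/mmanyin/CCM/OH_Boost/XGBoost_1.6.0/example.model"
print(retarget_paths(line, "/discover/nobackup/mmanyin/CCM", "/path/to/nas/mirror"))
```

Anchoring the match at the start of a token is a deliberate choice: a plain string replace would also rewrite a sibling directory such as `.../CCM_old`.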

@mmanyin
Collaborator

mmanyin commented Jul 27, 2022

As I understand it, Dan trains the model for a given month, so yes, like a climatology. He recently trained January using XGBoost 1.6, and now that things are working in GEOS, I will ask him to complete February through December.

Regarding ExtData, I was confused between the directory (which sits under scratch/) and the software for reading/interpolating gridded input. Yes, you are correct, it should be re-located under ExtData/

@mathomp4
Member Author

> Regarding ExtData, I was confused between the directory (which sits under scratch/) and the software for reading/interpolating gridded input. Yes, you are correct, it should be re-located under ExtData/

Or maybe somewhere new? I mean, we use ExtData as a catchall, but there is no requirement we use that.

Then again, these are chemistry files and I'm sure Arlindo would say ExtData/chemistry is where they go! Note that the subdirs in ExtData/chemistry are "semver versioned" as well, so you could version by XGBoost version or some other signifier (v1.0.0, etc.), and when the files need updating, you add a new subdir, and there you go! That way you don't have to version the files themselves (unless that's preferred).
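To make the versioned-subdir idea concrete, here is a small Python sketch that picks the newest `vMAJOR.MINOR.PATCH` name from a list of subdirectory names. The `v1.0.0`-style naming follows the example above; the function name `latest_version_dir` and the sample names are assumptions for illustration.

```python
import re

# Accept only names of the form vMAJOR.MINOR.PATCH (e.g. v1.0.0).
SEMVER_DIR = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def latest_version_dir(names):
    """Return the name with the highest semantic version, or None if none match."""
    versions = []
    for name in names:
        match = SEMVER_DIR.match(name)
        if match:
            # Compare numerically so that v1.10.0 sorts after v1.2.0.
            key = tuple(int(part) for part in match.groups())
            versions.append((key, name))
    if not versions:
        return None
    return max(versions)[1]


print(latest_version_dir(["v1.0.0", "v1.10.0", "v1.2.0", "README"]))
```

In practice the names would come from listing the relevant directory (for example with `os.listdir`), and an rc entry could then point at the chosen subdir instead of a fixed file location.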
