Single Site Test Attempt on Casper (fails) #2293

Open
rgknox opened this issue Dec 15, 2023 · 4 comments
Labels
blocked: dependency (Wait to work on this until dependency is resolved)
bug (something is working incorrectly)
external (issue needs to be addressed elsewhere (submodule); issue here for the sake of project tracking)

Comments

@rgknox
Collaborator

rgknox commented Dec 15, 2023

I attempted to build a single site test on casper and ran into some errors. I had never used this machine before today, mind you, but I just heard that it is the ideal machine for single site runs, so I gave it a quick test.

I have no special environment variables or anything other than the default modules loaded. I executed the create_test in an interactive queue using execcasper, and I also gave it one try on the login node (with the same error).

This uses the following tags:
ctsm: ctsm5.1.dev159
fates: sci.1.69.0_api.31.0.0

./create_test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold --generate /glade/derecho/scratch/rgknox/ctsm5.1.dev159-sci.1.69.0_api.31.0.0 --project P93300041 -o

Testnames: ['SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold']
create_test will do up to 1 tasks simultaneously
create_test will use up to 45 cores simultaneously
Creating test directory /glade/scratch/rgknox/SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold.G.20231215_101759_3lofzp
RUNNING TESTS:
  SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold
Starting CREATE_NEWCASE for test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold with 1 procs
Finished CREATE_NEWCASE for test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold in 2.681000 seconds (PASS)
Starting XML for test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold with 1 procs
Finished XML for test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold in 0.331622 seconds (PASS)
Starting SETUP for test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold with 1 procs
Finished SETUP for test SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold in 0.518615 seconds (FAIL). [COMPLETED 1 of 1]
    Case dir: /glade/scratch/rgknox/SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold.G.20231215_101759_3lofzp
    Errors were:
        ERROR: module command /glade/u/apps/dav/opt/lmod/7.7.29/libexec/lmod python purge  failed with message:
        /glade/u/apps/dav/opt/lua/5.3.4/bin/lua: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory

Waiting for tests to finish
FAIL SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold (phase SETUP)
    Case dir: /glade/scratch/rgknox/SMS_Lm13.1x1_brazil.I2000Clm50FatesCruRsGs.casper_nvhpc.clm-FatesCold.G.20231215_101759_3lofzp
Due to presence of batch system, create_test will exit before tests are complete.
To force create_test to wait for full completion, use --wait
test-scheduler took 4.831060886383057 seconds
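
The error above says the `lua` binary that Lmod relies on cannot load `libreadline.so.6`. A minimal sketch of how one might confirm which shared libraries fail to resolve on a given node, using the standard `ldd` tool, is shown below. The function names `filter_missing` and `missing_libs` are hypothetical helpers introduced here for illustration; the binary path is copied from the error message and is assumed to exist only on the NCAR systems.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the name of each library the
# dynamic loader could not resolve (lines like "libX.so.N => not found").
# Hypothetical helper, not part of CIME or Lmod.
filter_missing() {
    awk '/=> not found/ { print $1 }'
}

# List unresolved shared libraries for the binary given as $1.
# Prints nothing when every dependency resolves.
missing_libs() {
    ldd "$1" 2>/dev/null | filter_missing
}

# Path copied from the error message; skipped when the binary is absent.
LUA_BIN=/glade/u/apps/dav/opt/lua/5.3.4/bin/lua
if [ -x "$LUA_BIN" ]; then
    missing_libs "$LUA_BIN"
fi
```

Run on a node that shows the failure, this would be expected to list `libreadline.so.6`, which would suggest the module setup is pointing at an Lmod installation built for a different OS image rather than at a problem in the test itself.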
@ekluzek ekluzek added bug something is working incorrectly next this should get some attention in the next week or two. Normally each Thursday SE meeting. labels Dec 15, 2023
@ekluzek
Collaborator

ekluzek commented Dec 15, 2023

@rgknox try the intel compiler. Does that fail as well?

@rgknox
Collaborator Author

rgknox commented Dec 15, 2023

Yes, it appears to be the same error:

ERROR: module command /glade/u/apps/dav/opt/lmod/7.7.29/libexec/lmod python purge failed with message:
/glade/u/apps/dav/opt/lua/5.3.4/bin/lua: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory

@ekluzek
Collaborator

ekluzek commented Dec 15, 2023

This is with ccs_config_cesm0.0.84. It looks like work on casper went into ccs_config_cesm0.0.87, so I'll try with that.

@ekluzek
Collaborator

ekluzek commented Dec 15, 2023

OK, that doesn't work out of the box. It might need a change in both ccs_config and in cime.

ESMCI/ccs_config_cesm#138

@samsrabin samsrabin added the external issue needs to be addressed elsewhere (submodule); issue here for the sake of project tracking label Feb 15, 2024
@ekluzek ekluzek added blocked: dependency Wait to work on this until dependency is resolved and removed next this should get some attention in the next week or two. Normally each Thursday SE meeting. labels Feb 15, 2024