Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLM deaths in restFile_enddef on cheyenne in some cases due to overflow of float to integer on restart #165

Closed
ekluzek opened this issue Dec 16, 2017 · 23 comments
Milestone

Comments

@ekluzek
Copy link
Collaborator

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-04-12 10:57:58 -0600
Bugzilla Id: 2442
Bugzilla CC: andre, cacraig, eaton, fischer, jedwards, mvertens, rfisher, sacks,

The following test fails on cheyenne due to overflow in conversion to a 4-byte integer.

ERS_D_Ld5.f09_g16.ICLM45ED.cheyenne_intel.clm-edTest

The traceback is...

1: RTM river discharge into ocean: ICE
1: pionfput_mod.F90 128 1 4 255 1
1: m3/s
1: pionfput_mod.F90 128 1 4 1 1 A
1: pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1508 :
1: Overflow when type cast to 4-byte integer.
73: pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1508 :
73: Overflow when type cast to 4-byte integer.
37: pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1508 :
37: Overflow when type cast to 4-byte integer.
109: pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1508 :
109: Overflow when type cast to 4-byte integer.
37:Image PC Routine Line Source
37:cesm.exe 0000000003A595ED Unknown Unknown Unknown
37:cesm.exe 0000000003498576 pio_support_mp_pi 120 pio_support.F90
37:cesm.exe 0000000003496920 pio_utils_mp_chec 59 pio_utils.F90
37:cesm.exe 0000000003465CDD nf_mod_mp_pio_end 1508 nf_mod.F90
37:cesm.exe 0000000000B9E118 ncdio_pio_mp_ncd_ 386 ncdio_pio.F90.in
37:cesm.exe 0000000000C28127 restfilemod_mp_re 731 restFileMod.F90
37:cesm.exe 0000000000C20C30 restfilemod_mp_re 109 restFileMod.F90
37:cesm.exe 00000000008B18E2 clm_driver_mp_clm 1197 clm_driver.F90

@ekluzek ekluzek added this to the clm5 milestone Dec 16, 2017
@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Ben Andre < andre > - 2017-04-14 11:52:42 -0600

We discussed this on the fates SW call, and the feeling was that the restart interface has change so much that it's not worth spending a lot of time on. If a quick look at this points to a an issue with fates, then please mark as expected fail and cheyenne will be tested and fixed with the upcoming fates reintegration.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-04-18 17:46:51 -0600

In the DDT debugging session I learned more about the problem here. The error happens on the PIO end definition call. I think what the problem is that the dimension of some variables exceed 32-byte integers (multiplying the dimensions together). numCohorts for this case is about 33 million. So multiplying by a thousand you exceed a signed 4-byte integer. It looks like the open statements are supposed to be opening with large file support 64BIT_OFFSET, so this shouldn't be a limitation, and yet it is.

The call it dies in is restFile_enddef in restFile_write...

call subgridRestWrite(bounds, ncid, flag='define' )

call accumulRest( ncid, flag='define' )

call clm_instRest(bounds, ncid, flag='define')

if (present(rdate)) then
   call hist_restart_ncd (bounds, ncid, flag='define', rdate=rdate )
end if

call restFile_enddef( ncid )

I tried building everything with 8 byte integers with "-i8" to the compiler and that failed in trying to resolve overloaded functions based on the data types.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-04-19 00:59:30 -0600

Jim suggested I try opening up 64BIT_DATA rather than 64BIT_OFFSET. So I tried that and it fails the same way.

Index: ../../../components/clm/src/main/ncdio_pio.F90.in

--- ../../../components/clm/src/main/ncdio_pio.F90.in (revision 84459)
+++ ../../../components/clm/src/main/ncdio_pio.F90.in (working copy)
@@ -24,7 +24,7 @@
use perf_mod , only : t_startf, t_stopf
use fileutils , only : getavu, relavu
use mct_mod , only : mct_gsMap, mct_gsMap_lsize, mct_gsMap_gsize, mct_gsMap_orderedPoints

  • use pio , only : file_desc_t, io_desc_t, iosystem_desc_t, pio_64bit_offset
  • use pio , only : file_desc_t, io_desc_t, iosystem_desc_t, pio_64bit_data
    use pio , only : pio_bcast_error, pio_char, pio_clobber, pio_closefile, pio_createfile, pio_def_dim
    use pio , only : pio_def_var, pio_double, pio_enddef, pio_get_att, pio_get_var, pio_global, pio_initdecomp
    use pio , only : pio_inq_att, pio_inq_dimid, pio_inq_dimlen, pio_inq_dimname, pio_inq_vardimid, pio_inq_varid
    @@ -259,7 +259,7 @@
    end if
    end if
  • ierr = pio_createfile(pio_subsystem, file, my_io_type, fname, ior(PIO_CLOBBER,PIO_64BIT_OFFSET))
  • ierr = pio_createfile(pio_subsystem, file, my_io_type, fname, ior(PIO_CLOBBER,PIO_64BIT_DATA))

    if(ierr/= PIO_NOERR) then
    call shr_sys_abort( ' ncd_pio_createfile ERROR: Failed to open file to write: '//trim(fname))

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Ben Andre < andre > - 2017-04-25 18:09:18 -0600

TL,DR: A work around for this problem is to change the edTest to use netcdf instead of pnetcdf.


I followed up on Jim and Erik discussion in Comment 3 by setting both PIO_64BIT_OFFSET and PIO_64BIT_DATA, but it didn't fix the error:

  • ierr = pio_createfile(pio_subsystem, file, my_io_type, fname, ior(PIO_CLOBBER,PIO_64BIT_OFFSET))
  • ierr = pio_createfile(pio_subsystem, file, my_io_type, fname, &
  •   ior(ior(PIO_CLOBBER,PIO_64BIT_OFFSET),PIO_64BIT_DATA))
    

This case uses pio1 and pnetcdf by default. Replace pnetcdf for netcdf then this will run to completion:

$ ./xmlquery PIO_TYPENAME
PIO_TYPENAME: ['CPL:pnetcdf', 'ATM:netcdf', 'LND:netcdf', 'ICE:pnetcdf', 'OCN:pnetcdf', 'ROF:pnetcdf', 'GLC:pnetcdf', 'WAV:pnetcdf', 'ESP:pnetcdf']

It looks like yellowstone and cheyenne are using:

  • cheyenne - pnetcdf 1.8.0

  • yellowstone - pnetcdf 1.6.1

Looking at the release notes for pnetcdf:

https://trac.mcs.anl.gov/projects/parallel-netcdf/wiki/ReleaseNotes-1.7.0

Other updates:

* Conform with netCDF on the maximal dimension size for CDF-2 file format to be (2^32 - 4)

According to the netcdf doc

http://www.unidata.ucar.edu/software/netcdf/docs/netcdf_introduction.html

NetCDF Classic Format
...
For these users, 64-bit offset format is a natural choice. It greatly eases the size restrictions of netCDF classic files (see 64 bit Offset Limitations).

Files with the 64-bit offsets are identified with a “CDF\002” at the beginning of the file. In this documentation this format is called “64-bit offset format.”

Looking at the hexdump of the file that I think is being written seems to indicate that this is a 64bit file:

$ hexdump -C ERS_D_Ld5.f09_g16.ICLM45ED.cheyenne_intel.clm-edTest.junk-2442.clm2.rh0.0001-01-04-00000.nc 
00000000  43 44 46 02 00 00 00 00  00 00 00 0a 00 00 00 1c  |CDF.............|

So I guess I don't really understand the netcdf file formats, but it seems like a contradiction that pnetcdf is now consistent with netcdf v2 format, but limits the size to 32 bits. Maybe something changed between pnetcdf 1.6 and 1.8 that we need to update the internals of the I/O stack?

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-04-26 16:06:01 -0600

I talked to Ben about this afternoon. Jim has an idea about what is happening, and something bad in our definitions for ED is causing it to output in NetCDF2 format. This is also due to bad behavior on pnetcdf's, which is why going to netcdf works.

What I realized in 2445, is that even on yellowstone some of the files are being produced but I can't read them (a ncdump tells me the file is in a unrecognized format). So this problem affects yellowstone as well as it's creating files you can't read. cheyenne is actually being more courteous and stopping rather than creating mystery files that are unreadable. Although CLM is apparently able to read them because restart tests seem to work.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Jim Edwards < jedwards > - 2017-04-26 16:09:44 -0600

I think that you'll find that the files are unreadable only in certain versions of ncdump. You may need to load a netcdf mopdule so that you point to a newer version of this tool.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-04-26 17:43:26 -0600

(In reply to Jim Edwards from comment #6)

I think that you'll find that the files are unreadable only in certain
versions of ncdump. You may need to load a netcdf mopdule so that you point
to a newer version of this tool.

OK, I was able to find a version of ncdump that would work. Although only the version 4.4 versions would work, all of the older ones wouldn't. They give this...

ncdump -h $MYSCR/SMS_D_Ld5.f45_f45.ICLM45ED.yellowstone_intel.clm-edTest.20170424_133019_2sh1t7/run/SMS_D_Ld5.f45_f45.ICLM45ED.yellowstone_intel.clm-edTest.20170424_133019_2sh1t7.clm2.r.0001-01-06-00000.nc
ncdump: /glade/scratch/erik/SMS_D_Ld5.f45_f45.ICLM45ED.yellowstone_intel.clm-edTest.20170424_133019_2sh1t7/run/SMS_D_Ld5.f45_f45.ICLM45ED.yellowstone_intel.clm-edTest.20170424_133019_2sh1t7.clm2.r.0001-01-06-00000.nc: NetCDF: Unknown file format

But, then tools built on top of netcdf -- like ncview -- won't necessarily work presumably depending on the version of netcdf they were built with.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Ben Andre < andre > - 2017-04-28 19:18:14 -0600

TL,DR : My recommendation is to remove this test and mark the issue as won't fix.


Jim pointed out that the CDF2 format issue I was seeing was because 64bit offset and data are mutually exclusive. I reverted to 64bit data and it is correctly generating a CDF5 format file. So that was a red herring.

I verified that this is a problem with ED specific code by trying a couple of permutations on the test without ED and commenting out the calls to ED restart.

More debugging led me further into the ED restart stack which has been massively refactored in the upcoming fates tag. I ran the failing test on my incoming branch and it passes.

I'm running the full ED/fates test suite on cheyenne this weekend to verify my branch is OK, and I'll ensure ED/Fates tests work on cheyenne when my tag comes in. My recommendation is to remove this test and mark the issue as won't fix.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Ben Andre < andre > - 2017-05-02 11:38:13 -0600

Update: I ran all ed and clm tests over the weekend and the results were as follows:

yellowstone:
aux_clm :
pass - ERS_D_Ld5.f09_g16.ICLM45ED.yellowstone_intel.clm-edTest

ed :
pass - ERS_D_Ld5.f09_g16.ICLM45ED.yellowstone_gnu.clm-edTest
pass - ERS_D_Ld5.f09_g16.ICLM45ED.yellowstone_intel.clm-edTest
pass - ERS_D_Ld5.f09_g16.ICLM45ED.yellowstone_pgi.clm-edTest
pass - ERS_D_Ld5.f09_g16.ICLM50FATES.yellowstone_gnu.clm-edTest
pass - ERS_D_Ld5.f09_g16.ICLM50FATES.yellowstone_intel.clm-edTest
pass - ERS_D_Ld5.f09_g16.ICLM50FATES.yellowstone_pgi.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM45ED.yellowstone_gnu.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM45ED.yellowstone_intel.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM45ED.yellowstone_pgi.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM50FATES.yellowstone_gnu.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM50FATES.yellowstone_intel.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM50FATES.yellowstone_pgi.clm-edTest

cheyenne:
aux_clm :
pass - ERS_D_Ld5.f09_g16.ICLM45ED.cheyenne_intel.clm-edTest
ed :
fail - ERS_D_Ld5.f09_g16.ICLM50FATES.cheyenne_intel.clm-edTest
fail - ERS_D_Ld5.f19_g16.ICLM50FATES.cheyenne_intel.clm-edTest
pass - ERS_D_Ld5.f19_g16.ICLM45ED.cheyenne_intel.clm-edTest
pass - ERS_D_Ld5.f09_g16.ICLM45ED.cheyenne_intel.clm-edTest

So, there's still a problem on cheyenne with high res, but only with clm5fates, not clm45ed.

Differences between clm45 and clm5 from the same test run:

hexdump -C ../ERS_D_Ld5.f09_g16.ICLM45ED.cheyenne_intel.clm-edTest.C.04281953-edi/run/ERS_D_Ld5.f09_g16.ICLM45ED.cheyenne_intel.clm-edTest.C.04281953-edi.clm2.r.0001-01-04-00000.nc | head
00000000 43 44 46 02 00 00 00 00 00 00 00 0a 00 00 00 10 |CDF.............|

hexdump -C ../ERS_D_Ld5.f09_g16.ICLM50FATES.cheyenne_intel.clm-edTest.C.04281953-edi/run/ERS_D_Ld5.f09_g16.ICLM50FATES.cheyenne_intel.clm-edTest.C.04281953-edi.clm2.r.0001-01-04-00000.nc | head
00000000 43 44 46 05 00 00 00 00 00 00 00 00 00 00 00 0a |CDF.............|

Diff-ing the cases, it looks like with clm45 is using netcdf by default, while clm5 is using pnetcdf by default. Reviewing old fates history to figure out why, we encounter an issue a year ago that resulted in just setting all ED compsets to default to netcdf instead of pnetcdf. So, this is probably still broken for all ED/FATES compsets in the upcoming branch.

I spend a all day yesterday trying to track this down further by isolating the variables that cause the problem. I identified a couple of them, but haven't figured out yet why they are different.

I'll try to look at this more, but I think I'm going to need a debug build of the I/O stack to understand what pnetcdf is unhappy with.

I need to do some other cleanup stuff to stay on track with the fates reintegration.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-06-02 11:52:20 -0600

Chris is now seeing this same problem in F compsets...

ERP_Ln9.f09_f09.F1850_DONOTUSE.cheyenne_intel.cam-outfrq9s_clm5

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Chris Fischer < fischer > - 2017-06-12 14:39:50 -0600

The only place I'm seeing this error now is in

ERP_Ln9.f09_f09_mg17.FHIST_DEV.cheyenne_intel.cam-outfrq9s_clm5

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Chris Fischer < fischer > - 2017-06-12 14:40:32 -0600

I should also add this is for cesm2_0_alpha06o.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Cheryl Craig < cacraig > - 2017-06-12 15:40:28 -0600

So did CLM come up with a fix for this bug, or did it just "go away". Looking back over this exchange, it's not sounding like its an easy bug to find and fix.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Ben Andre < andre > - 2017-06-12 15:49:47 -0600

It went away.

It was originally only occurring when ED/fates was on. My original thought was that ED was doing something wrong with the indexing or data to cause the overflow, but after debugging it for a while and talking with Jim, we started thinking it was another issue where a struct was being incorrectly copied and something was getting corrupted or dropped. At some point when I was merging this stopped being reproducible, so I assumed it was actually a fates problem that was fixed by the fates restart refactor.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Jim Edwards < jedwards > - 2017-06-12 20:40:23 -0600

It's there in the F compset, I've been able to reproduce with the simpler test
SMS_Ln9.f09_f09_mg17.FHIST_DEV.cheyenne_intel.cam-outfrq9s_clm5

An error in the enddef statement indicates that there was something wrong with an attribute.
It appears to be located in the call to

   call soilbiogeochem_nitrogenstate_inst%restart(bounds, ncid, flag=flag)                            

at around lines 527 of file clm_InstMod.F90 commenting out that line causes the enddef to complete.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Chris Fischer < fischer > - 2017-06-14 14:45:12 -0600

I was able to get FHIST_DEV to run by changing the number of nodes being used from 4 to 10. This matches the layout being used on yellowstone.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-06-14 15:03:40 -0600

(In reply to Chris Fischer from comment #16)

I was able to get FHIST_DEV to run by changing the number of nodes being
used from 4 to 10. This matches the layout being used on yellowstone.

This sounds like we just aren't giving enough processors for these cases on cheyenne, since memory (mostly) scales with processor count for CLM. So I'm wondering if the FATES case, might run if we ran it with more processors?

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Chris Fischer < fischer > - 2017-06-15 15:49:08 -0600

Brian E. was able to get f09_f09_mg17.FHIST_DEV.cheyenne_intel to fail using
10 nodes with NTASKS=120 and NTHRDS=3. This is running on 10 nodes.

37: pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1520 :
37: Overflow when type cast to 4-byte integer.
37:Image PC Routine Line Source
37:cesm.exe 000000000321971D Unknown Unknown Unknown
37:cesm.exe 0000000002E96271 pio_support_mp_pi 120 pio_support.F90
37:cesm.exe 0000000002E947CA pio_utils_mp_chec 60 pio_utils.F90
37:cesm.exe 0000000002E7F00C nf_mod_mp_pio_end 1520 nf_mod.F90
37:cesm.exe 00000000023FD0EA restfilemod_mp_re 732 restFileMod.F90
37:cesm.exe 000000000230C3ED clm_driver_mp_clm 1168 clm_driver.F90
37:cesm.exe 00000000022FA863 lnd_comp_mct_mp_l 444 lnd_comp_mct.F90
37:cesm.exe 000000000043071B component_mod_mp_ 681 component_mod.F90
37:cesm.exe 00000000004195DE cesm_comp_mod_mp_ 2639 cesm_comp_mod.F90
37:cesm.exe 0000000000430436 MAIN__ 68 cesm_driver.F90

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Erik Kluzek < erik > - 2017-06-15 15:54:56 -0600

(In reply to Chris Fischer from comment #18)

Brian E. was able to get f09_f09_mg17.FHIST_DEV.cheyenne_intel to fail using
10 nodes with NTASKS=120 and NTHRDS=3. This is running on 10 nodes.

37: pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90:
1520 :
37: Overflow when type cast to 4-byte integer.
37:Image PC Routine Line
Source
.
.
.
cesm_comp_mod.F90
37:cesm.exe 0000000000430436 MAIN__ 68
cesm_driver.F90

Chris if this is a size/memory issue using threads would run into it quicker. Can you try with 30 nodes NTASKS=360 and NTHRDS=3? If that works, it might be useful to see if it seems to consistently fail for smaller layouts and work for larger ones. And then see at what point it starts to fail...

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Chris Fischer < fischer > - 2017-06-15 16:19:31 -0600

NTASKS=360 and NTHRDS=3 worked.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Bill Sacks < sacks > - 2017-07-12 19:25:14 -0600

Renaming this issue to reflect its more general incarnation

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 16, 2017

Bill Sacks < sacks > - 2017-07-12 19:29:32 -0600

It sounds like Jim E was on the track of this bug earlier (comment 15). There are only 10 or so variables in the offending routine. Would it make sense to comment them out one by one to see which one is the problem?

@ekluzek
Copy link
Collaborator Author

ekluzek commented Nov 28, 2018

I'm going to close this, because I think we must have fixed it, or else we wouldn't be running now.

@ekluzek ekluzek closed this as completed Nov 28, 2018
billsacks added a commit that referenced this issue Jan 14, 2023
82d3b247f Merge pull request #178 from johnpaulalex/test_doc
3223f49ea Additional documentation of system tests - global variables, method descriptions.
45b7c01c3 Merge pull request #177 from jedwards4b/git_workflow
ace90b2c2 try setting credentials this way
f4d6aa933 try setting credentials this way
1d61a6944 use this to set git credentials
7f9d330e1 use this to set git credentials
5ac731b85 add tmate code
836847be7 get git workflow working
dcd462d71 Merge pull request #176 from jedwards4b/add_github_testing
2d2479e9d Merge pull request #175 from johnpaulalex/fix
711a53fdf add github testing of prs and automatic tagging of main
cfe0f888a fix typos
5665d6140 Fix broken checkout behavior introduced by PR #172.
27909e255 Merge pull request #173 from johnpaulalex/readall
00ad0440b Further tiny refactorings and docs of checkout API (no-op).    Remove unused load_all param in _External.checkout().    Rename _External.checkout_externals() to checkout_subexternals(), to remove the ambiguity about whether the main external pointed to by the _External is itelf checked out (it is not)    Clarify load_all documentation - it’s always recursive, but applies different criteria at each level.    Rename variables in checkout.py (e.g. ext_description)  to match the equivalent code in sourcetree.py.
2ea3d1a3a Merge pull request #172 from johnpaulalex/fixit
43bf8092c Merge pull request #171 from johnpaulalex/docstatus
e6aa7d21e Merge pull request #170 from johnpaulalex/printdir
adbd71557 On checkout, refresh locally installed optional packages regardless of whether -o is passed in.
add074593 Comment tweaks, and fix 'ppath' typo
696527cb8 Document the format of various status dictionaries, and the various paths and path components within an _External.
c677b9403 When processing an external, print out its path in addition to the base filename (to disambiguate all the externals.cfg's)
975d7fd5a Merge pull request #169 from johnpaulalex/docfix_branch
09709e36d Document _Externals.status().  The original comment was apparently copy-pasted from checkout().
1d880e090 Merge pull request #167 from billsacks/fix_svn_on_windows
3510da848 Tweak a unit test to improve coverage
eb7fc1368 Handle the possibility that the URL already ends with '/'
02ea87e3d Fix svn URLs on Windows
b1c02ab54 Merge pull request #165 from gold2718/doc_fix
9f4be8c7b Add documentation about externals = None feature

git-subtree-dir: manage_externals
git-subtree-split: 82d3b247fc8fb3b70458bf2b4570ee6678754c4d
ekluzek added a commit that referenced this issue Dec 16, 2023
0f884bfec Merge pull request #205 from jedwards4b/sunset_svn_git_access
82a5edf79 merge in billsacks:svn_testing_no_github
17532c160 Use a local svn repo for testing
9c904341a different method to determine if in tests
539952ebd remove debug print statement
cc5434fa7 fix submodule testing
1d7f28840 remove broken tests
04e94a519 provide a meaningful error message
38bcc0a8c Merge pull request #201 from jedwards4b/partial_match
b4466a5aa remove debug print statement
c3cf3ec35 fix issue with partial branch match
7b6d92ef6 Merge pull request #198 from johnpaulalex/gitdir
927ce3a98 Merge pull request #197 from johnpaulalex/testpath
a04f1148f Merge pull request #196 from johnpaulalex/readmod
d9c14bf25 Change the rest of the methods to use -C. Still some usage of getcwd in test_unit_repository_git.
332b10640 Fix incorrect logged path of checkout_externals in test_sys_checkout: it was basically the parent of the current directory, which varies throughout the test. (it called abspath with '{0}/../../', which adds arbitrary and not-interpolated subdir '{0}' to the path, then removes it and removes one more level).
932a7499b Remove printlog from read_gitmodules_file since read_externals_description_file() already has a nearly-the-same printlog (but add it to the other caller).
5d13719ed Merge pull request #195 from johnpaulalex/check_repo
423395449 Update utest to mock _git_remote_verbose in a new way, since it is now called via the GitRepository class rather than on the specific GitRepository instance.
d7a42ae96 Check that desired repo was actually checked out.
71596bbc1 Merge pull request #194 from johnpaulalex/manic2
4c96e824e Make the MANIC_TEST_BARE_REPO_ROOT env var special - give it a constant for easy tracking, and automatically tear it down after each test.
259bfc04d test_sys_checkout: use actual paths in on-the-fly configs rather than MANIC_TEST_BARE_REPO_ROOT env var. This will make it easier to test (in the near future) that checkout_externals actually checked out the desired repo dir.
557bbd6eb Merge pull request #193 from johnpaulalex/manic
5314eede1 Remove MANIC_TEST_TMP_REPO_ROOT environment variable in favor of module-level variable.
345fc1e14 Merge pull request #191 from johnpaulalex/test_doc12
2117b843c test_sys_checkout: verify that basic by-tag/branch/hash tests actually take us to the correct git tag/branch/hash.
94d6e5f2b Merge pull request #190 from johnpaulalex/test_doc11
3ff33a6a8 Inline local-path-creation methods
47dea7f64 Merge pull request #189 from johnpaulalex/test_doc10
9ea75cbf8 Grab-bag of renamings: Remove redundant _NAME from repo constants, and consistently add _REPO suffix (This causes the majority of diffs).
c0c847ec8 Merge pull request #188 from johnpaulalex/test_doc9
2dd5ce0f7 test_sys_checkout.py: only check for correct 'required' or 'optional' state in the test that exercises required vs optional behavior. Removed a lot of boilerplate.
eb3085984 Merge pull request #187 from johnpaulalex/test_doc8
1832e1f84 test_sys_checkout: Simplify many tests to only use a single external.
8689d61ec Merge pull request #186 from johnpaulalex/test_doc7
fbee4253e Grab bag of test_sys_checkout cleanups:    Doc inside of each test more clearly/consistently.    TestSysCheckoutSVN didn’t get the inlining-of-helper-methods treatment, now it has that.    Move various standalone repo helper methods (like create_branch) into a RepoUtils class.    README.md was missing newlines when rendered as markdown.    Doc the return value of checkout.main    Fix test_container_exclude_component - it was looking for the wrong key (which is never present); now it looks for the correct key.
f0ed44a6e Merge pull request #185 from johnpaulalex/test_doc6
a3d59f5f2 Merge pull request #184 from johnpaulalex/test_doc5
5329c8ba7 test_sys_checkout: Inline config generation functions that are only called once.
464f2c7a7 test_sys_checkout: Inline another layer (per-config-file checks). Rename the 4 methods that are used multiple times, to reflect what they do rather than what they're called.
8872c0df6 Merge pull request #183 from johnpaulalex/doc_test4
c045335f6 Merge pull request #182 from johnpaulalex/doc_test3
c583b956e Merge pull request #181 from johnpaulalex/doc_test2
e01cfe278 test_sys_checkout: less confusing handling of return values from checkout_externals. Specifically, when doing a checkout, don't return tree_status from _before_ the checkout. Make a new wrapper to call checkout_externals a second time, to calculate the new status after a checkout (very frequent pattern).
23286818c test_sys_checkout: Remove another layer (which generates test component names)
c3717b6bc Merge pull request #180 from johnpaulalex/doc_test
36d7a4434 test_sys_checkout.py: remove one layer of functions (that check for local status enums). No-op.
2c4584bf7 More documentation about tests: * contents of test repositories (n a new README.md) * various constants in test_sys_checkout.py that point to those contents, and terminology like container/simple/mixed. * in each test method, the scenarios being tested. * The coupling between test methods.
55e74bd0a Merge pull request #179 from johnpaulalex/circ
66be84290 Remove circular dependency by making _External stop doing tricky things with sourcetrees.
82d3b247f Merge pull request #178 from johnpaulalex/test_doc
3223f49ea Additional documentation of system tests - global variables, method descriptions.
45b7c01c3 Merge pull request #177 from jedwards4b/git_workflow
ace90b2c2 try setting credentials this way
f4d6aa933 try setting credentials this way
1d61a6944 use this to set git credentials
7f9d330e1 use this to set git credentials
5ac731b85 add tmate code
836847be7 get git workflow working
dcd462d71 Merge pull request #176 from jedwards4b/add_github_testing
2d2479e9d Merge pull request #175 from johnpaulalex/fix
711a53fdf add github testing of prs and automatic tagging of main
cfe0f888a fix typos
5665d6140 Fix broken checkout behavior introduced by PR #172.
27909e255 Merge pull request #173 from johnpaulalex/readall
00ad0440b Further tiny refactorings and docs of checkout API (no-op).    Remove unused load_all param in _External.checkout().    Rename _External.checkout_externals() to checkout_subexternals(), to remove the ambiguity about whether the main external pointed to by the _External is itelf checked out (it is not)    Clarify load_all documentation - it’s always recursive, but applies different criteria at each level.    Rename variables in checkout.py (e.g. ext_description)  to match the equivalent code in sourcetree.py.
2ea3d1a3a Merge pull request #172 from johnpaulalex/fixit
43bf8092c Merge pull request #171 from johnpaulalex/docstatus
e6aa7d21e Merge pull request #170 from johnpaulalex/printdir
adbd71557 On checkout, refresh locally installed optional packages regardless of whether -o is passed in.
add074593 Comment tweaks, and fix 'ppath' typo
696527cb8 Document the format of various status dictionaries, and the various paths and path components within an _External.
c677b9403 When processing an external, print out its path in addition to the base filename (to disambiguate all the externals.cfg's)
975d7fd5a Merge pull request #169 from johnpaulalex/docfix_branch
09709e36d Document _Externals.status().  The original comment was apparently copy-pasted from checkout().
1d880e090 Merge pull request #167 from billsacks/fix_svn_on_windows
3510da848 Tweak a unit test to improve coverage
eb7fc1368 Handle the possibility that the URL already ends with '/'
02ea87e3d Fix svn URLs on Windows
b1c02ab54 Merge pull request #165 from gold2718/doc_fix
9f4be8c7b Add documentation about externals = None feature
a3b3a0373 Merge pull request #162 from ESMCI/fischer/python3
d4f1b1e8d Change shebang lines to python3
2fd941abc Merge pull request #158 from billsacks/modified_solution
de08dc2ee Add another option for when an external is in a modified state
e954582d0 Merge pull request #156 from billsacks/onbranch_show_hash
952e44d51 Change output: put tag/hash before branch name
10288430f Fix pre-existing pylint issues
01b13f78f When on a branch, show tag/hash, too
39ad53263 Merge pull request #150 from gold2718/fix_combo_config
75f8f02f5 Merge pull request #152 from jedwards4b/sort_by_local_path
42687bd53 remove commented code
29e26af81 fix pylint issues
7c9f3c613 add a test for nested repo checkout
75c5353d2 fix spacing
24a3726a1 improve sorting, checkout externals with each comp
29f45b086 remove py2 test and fix super call
880a4e765 remove decode
1c53be854 no need for set call
36c56dbac simplier fix for issue
dc67cc682 simpler solution
b32c6fca9 fix to allow submodule name different from path
5b5e1c2b0 Merge pull request #144 from billsacks/improve_errmsg
c983863c4 Add another option for dealing with modified externals
59ce252cf Add some details to the error message when externals are modified
be5a1a4d7 Merge pull request #143 from jedwards4b/add_exclude
2aa014a1b fix lint issue
49cd5e890 fix lint issues
418173ffd Added tests for ExternalsDescriptionDict
afab352c8 fix lint issue
be85b7d1b fix the test
a580a570b push test
d43710864 add a test
21affe33c fix formatting issue
72e6b64ae add an exclude option

git-subtree-dir: manage_externals
git-subtree-split: 0f884bfec8e43d0c02261de858d6ec3f6d855e51
samsrabin pushed a commit to samsrabin/CTSM that referenced this issue May 3, 2024
Corrects CIME path in buildnml scripts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant