Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create IO library for less coupling and easier interfacing with FATES #2006

Open
adrifoster opened this issue May 16, 2023 · 5 comments
Open
Labels
code health improving internal code structure to make easier to maintain (sustainability)

Comments

@adrifoster
Copy link
Collaborator

adrifoster commented May 16, 2023

Right now the subroutines/modules that are responsible for IO in CTSM are not very centralized. Most of it happens through ncdio_pio but, there are other important modules, and these modules also depend on a number of other, non-strictly-IO modules.

Creating an IO package/library/centralized set of modules would help with decoupling and would allow FATES to more easily use IO (via an interface, etc.) with more flexibility. This would also make FATES unit tests that use inputs much easier to create.

Some notes:

  • the main IO module (that calls pio, the external parallel IO library) is ncdio_pio
    • ncdio_pio uses some shr modules (this is fine)
    • it also uses:
      • spmdMod: for global variables like masterproc, mpicom, iam, etc.
      • clm_varcon: for global variables like spval, ispval, grldn
      • clm_varpar: global variables numrad, nlevgrnd, nlevlak, nlevdecomp_ful, etc.
      • clm_varctl: control variables single_column and iulog
      • decompMod: a few subroutines that are only used within ncdio_pio, however they essentially map one input argument into another
      • perf_mod: t_startf and t_stopf - never actually used
      • fileutils: does file IO things
      • array_utils: method convert_to_logical, only used in ncdio_pio
  • this is quite coupled but I think there are a few options to decouple things

@billsacks and @ekluzek thoughts? I have some ideas that could depend on "abstractions" vs "concretions" for these variables and methods

@ekluzek
Copy link
Collaborator

ekluzek commented May 16, 2023

I think we need to meet to really go over this.

If I understand what you are bringing up is the internal dependencies of ncdio_pio will link those dependencies into FATES. This is also further complicated by the fact that we have ELM and CTSM to support. The dependencies above are for CTSM, and the ELM list is likely slightly different than above, and most certainly will evolve separately. So one way to do this would be to have some abstract interfaces that would be shared by ELM and CTSM that would have different details between the two. The other part of this would be to have a IO library for CTSM that had the smallest set of dependencies possible so that you could easily link it to FATES unit tests and full FATES. Am I understanding what you are getting at here?

@billsacks
Copy link
Member

Thanks for opening this issue, @adrifoster and for laying out your initial thought. I'm unfortunately not going to have much time to devote to this over the next month or two, but my initial thought on this is that the problem might be more tractable if we reframe it – so that, rather than thinking of this as an I/O library (which raises questions of why some of these lower-level dependencies are part of the I/O library), we instead think about it as a lower-level utility library. For starters the public API (which FATES will depend on) can be I/O focused, but I feel like this framing might give us the liberty to include many, if not all, of the ncdio_pio dependencies in this separate library. The key practical consideration, as far as I can tell, is that nothing in this new library can depend on FATES (to avoid circular dependencies), but beyond that, I'm thinking that it can theoretically include any other pieces of CTSM (though ideally it would have some reasonable level of cohesion). @adrifoster do you agree with that?

@adrifoster
Copy link
Collaborator Author

Thanks @ekluzek and @billsacks! Happy to meet to chat about this at some point. I don't think this is of any immediate concern so we can just think about it for now since folks are quite busy. Or I can work on prototyping/design, etc.

If I understand what you are bringing up is the internal dependencies of ncdio_pio will link those dependencies into FATES.

Actually I would like to remove as many dependencies as possible from ncdio_pio so that we can introduce as few as possible to FATES/FATES API. This also feels like it would improve the CTSM code in general by decreasing coupling and having our modules have a single responsibility.

This is also further complicated by the fact that we have ELM and CTSM to support. The dependencies above are for CTSM, and the ELM list is likely slightly different than above, and most certainly will evolve separately. So one way to do this would be to have some abstract interfaces that would be shared by ELM and CTSM that would have different details between the two.

Yes, likely the dependencies will differ so someone on the ELM side can update their version if they want separately. I agree that we should have some kind of abstract interface (probably the existing FATES interface) to call these modules so that either can change without a bunch of stuff breaking or being affected.

The other part of this would be to have a IO library for CTSM that had the smallest set of dependencies possible so that you could easily link it to FATES unit tests and full FATES. Am I understanding what you are getting at here?

Yes that's what I was thinking!

my initial thought on this is that the problem might be more tractable if we reframe it – so that, rather than thinking of this as an I/O library (which raises questions of why some of these lower-level dependencies are part of the I/O library), we instead think about it as a lower-level utility library.

I like the idea of a lower-level library(ies). But it seems like it might be most useful if we split up the modules/libraries based on responsibility (I/O vs. model functionality, etc.). That way, if we just want to do a simple I/O, we only need the I/O module/library, not a large utility library. Additionally, this makes things easier to change by reducing the dependencies between the various utilities. Right now, it is difficult to make these kinds of changes because everything is tied together, but if we reduce that we can more easily make changes, swap in one module for another, etc. Happy to chat more about this, though.

@billsacks
Copy link
Member

@adrifoster thanks for explaining this. I see your points here. Could some of this breaking of dependencies be achieved by passing a number of parameters (like the ones from clm_varpar and elsewhere) into an initialization routine for ncdio_pio (either by turning it into a class and storing variables in the class, or just storing module-level variables)? Is that the kind of thing you're thinking about?

@adrifoster
Copy link
Collaborator Author

@billsacks yes that was definitely one of the ideas I was thinking of!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code health improving internal code structure to make easier to maintain (sustainability)
Projects
None yet
Development

No branches or pull requests

3 participants