ClimatExPrep

A repository of software for preprocessing climatex data for the deep learning pipeline.

Requirements

Miniconda with Python >= 3.7
xESMF

💽 Installation

xESMF is only available through Conda, so you will have to be able to install conda on your system. Unfortunately, this is limiting because certain HPCs don't allow conda. There may be workarounds but I have not explored them.

### 📋 Configuration
Please see `ClimatExPrep/conf/config.yml` for an example configuration.

```yaml
hr:
  path: /home/nannau/jabbar/data/wrf/*.nc
  is_west_negative: true
  output_path: /home/nannau/gom/wrf
lr:
  path: /home/nannau/jabbar/data/era5/*.nc
  is_west_negative: false
  output_path: /home/nannau/gom/era5

# If the variables are the same, but are named differently in
# the netCDF files, use the yaml key as the "homogenizing" key, and
# list the alternative name in "alternative_names".
# This pipeline will rename the alternative name to the yaml key if found.
# E.g. if the lr and hr datasets have different
# numbers of covariates, e.g. Annau et al. 2023, then
# you can specify if a variable only occurs in the hr or lr
# dataset respectively with hr_only or lr_only: true
# The same logic above applies for the dimensions, dims, and coordinates, coords.
vars:
  U10:
    alternative_names: ["u10", "uas"]
    standardize: true
    standardize_style: "normal"
  V10:
    alternative_names: ["v10", "vas"]
    standardize: true
    standardize_style: "normal"
dims:
  time:
    alternative_names: ["Time", "Times", "times"]
  rlat:
    alternative_names: ["rotated_latitude"]
    hr_only: true
  rlon:
    alternative_names: ["rotated_longitude"]
    hr_only: true
coords:
  lat:
    alternative_names: ["latitude", "Lat", "Latitude"]
  lon:
    alternative_names: ["longitude", "Long", "Lon", "Longitude"]

# Time indexing for subsets
time:
  # Crop to the dataset with the shortest run
  # this defines the full dataset from which to subset
  full:
    start: "20001001T01:00:00"
    end: "20150929T23:00:00"
  # use this to select which years to reserve for testing
  # the remaining years in full will be used for training
  test_years: [2000, 2009, 2014]

# sets the scale factor and index slices of the rotated coordinates
spatial:
  scale_factor: 8
  x:
    first_index: 0
    last_index: 632
  y:
    first_index: 2
    last_index: 634

# xarray netcdf engine
engine: h5netcdf

# dask client parameters
compute:
  threads_per_worker: 2
  memory_limit: "20GB"
  dashboard_address: 8787

🚀 Running

You can just edit the configuration file above and run

python ClimatExPrep/preprocess.py

High-level workflow

Name		Name	Last commit message	Last commit date
Latest commit History 99 Commits
.github/workflows		.github/workflows
.idea		.idea
ClimatExPrep		ClimatExPrep
.coveragerc		.coveragerc
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
environment.yml		environment.yml
nohup.out		nohup.out
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ClimatExPrep

Requirements

💽 Installation

🚀 Running

About

Releases

Packages

Languages

License

sbeale007/ClimatExPrep

Folders and files

Latest commit

History

Repository files navigation

ClimatExPrep

Requirements

💽 Installation

🚀 Running

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages