Generalization results #105

Merged on Dec 20, 2023 (106 commits)
Commits
103dc38
remove numpy version pin
rogerkuou Sep 16, 2023
5bc85d0
add dssa1 sentinel-1 example
rogerkuou Sep 16, 2023
1a52c50
pin pandas, see #76
SarahAlidoost Sep 20, 2023
af8346b
add jupyterlab and xarray to env file
SarahAlidoost Sep 21, 2023
fc8e621
add nb for first example
SarahAlidoost Sep 21, 2023
6aa7ef2
fix a comment
SarahAlidoost Sep 21, 2023
14fb8b9
Merge pull request #77 from VegeWaterDynamics/add_nb_example_1
rogerkuou Sep 28, 2023
0442320
solve the conflict in the environment.yml
rogerkuou Sep 29, 2023
6e9ed4a
added sklearn example
Sep 28, 2023
925eeee
Merge pull request #81 from VegeWaterDynamics/save_models
rogerkuou Sep 29, 2023
79b4e58
add xarray dataset extension
rogerkuou Sep 21, 2023
e64c3e1
add unit tests for is_valid
rogerkuou Sep 21, 2023
960c71f
add a notebook for second example
SarahAlidoost Sep 29, 2023
979d6ec
update notebooks to include zarr storage
rogerkuou Oct 2, 2023
eb827b7
Merge pull request #82 from VegeWaterDynamics/add_nb_example_2
rogerkuou Oct 2, 2023
ab5799e
solve the conflict in the environment.yml
rogerkuou Sep 29, 2023
460d35f
added sklearn example
Sep 28, 2023
c514813
add a notebook for second example
SarahAlidoost Sep 29, 2023
2cb805f
update notebooks to include zarr storage
rogerkuou Oct 2, 2023
d31b698
initial implementation of model split
rogerkuou Oct 4, 2023
9eaa4a4
add value checks and docs
rogerkuou Oct 4, 2023
d8d5f1e
add dask and xarray as dependency
rogerkuou Oct 5, 2023
22096e7
fix string separation
rogerkuou Oct 5, 2023
8bd63b8
fix dask import
rogerkuou Oct 5, 2023
c852e08
add tests for model split
rogerkuou Oct 5, 2023
5ff70ac
fix model split tests
rogerkuou Oct 5, 2023
841b73c
add split example
rogerkuou Oct 5, 2023
b97b790
Merge branch 'generalization' into 69_model_split
rogerkuou Oct 5, 2023
67dcb2b
add zarr into dependency
rogerkuou Oct 12, 2023
a2c9689
Apply suggestions from code review
rogerkuou Oct 12, 2023
2f9ae06
change the name to dataset_split
rogerkuou Oct 12, 2023
df9d1bc
Merge pull request #78 from VegeWaterDynamics/69_model_split
rogerkuou Oct 12, 2023
2b4939e
Merge pull request #84 from VegeWaterDynamics/dask_exploration
rogerkuou Oct 16, 2023
1d29a10
reorganize the example folder
rogerkuou Oct 16, 2023
ca51ea3
update path in the data aliging example
rogerkuou Oct 16, 2023
ed6ae7a
remove environment, requirements and setup.cfg files
SarahAlidoost Nov 10, 2023
e3a5db0
add pyproject.toml file, pin versions
SarahAlidoost Nov 10, 2023
bbbaabd
update build workflow
SarahAlidoost Nov 10, 2023
d7fe6c2
drop python 3.12 because numpy is pinned, see #80
SarahAlidoost Nov 10, 2023
33c8c25
Fix the name in pyproject.toml
SarahAlidoost Nov 10, 2023
269fbdc
Merge pull request #86 from VegeWaterDynamics/fix_dependecies
SarahAlidoost Nov 10, 2023
d3393a7
make a generalization_workflow botebook
rogerkuou Nov 16, 2023
2ebabed
update generalization workflow notebook
rogerkuou Nov 21, 2023
f7d87ba
rename motrainer.py to spliter.py
rogerkuou Nov 22, 2023
a0a5dbb
change splitter structure
rogerkuou Nov 22, 2023
e0ae446
rename spliter test
rogerkuou Nov 22, 2023
08bb524
make data_split work for dimensions without coords
rogerkuou Nov 22, 2023
990a6c4
reduce function complexity
rogerkuou Nov 23, 2023
5af1a06
add more tests for identifier validator
rogerkuou Nov 23, 2023
a973f4f
add dask-ml as a demo dependency
rogerkuou Nov 23, 2023
024aa74
update generalization workflow
rogerkuou Nov 23, 2023
2a39d95
use learning_rate insteadof lr because it is deprecated
SarahAlidoost Nov 24, 2023
02af9ec
use index.get_level_values, comment metedata lat/lon
SarahAlidoost Nov 24, 2023
298b8b8
draft nb for dnn training
SarahAlidoost Nov 24, 2023
dc39649
restore index.year
SarahAlidoost Nov 24, 2023
6b0d757
Merge pull request #96 from VegeWaterDynamics/93_improve_data_split
rogerkuou Dec 1, 2023
ea0d2ad
add train test splitter
rogerkuou Dec 4, 2023
a78a6bf
add unit tests
rogerkuou Dec 4, 2023
a700345
add generalization workflow
rogerkuou Dec 4, 2023
ba3151a
Merge branch 'generalization' into add_dnn_nb
SarahAlidoost Dec 5, 2023
7cd8d20
fix the index year, add dask client info
SarahAlidoost Dec 5, 2023
9ee1328
fix this year, fix coordinates name in jackknife
SarahAlidoost Dec 5, 2023
cd60f48
disable warning logs
SarahAlidoost Dec 5, 2023
afac0e3
add dask-labextension to demo dependencies
SarahAlidoost Dec 5, 2023
7ce24e0
add dask implementations, run notebook
SarahAlidoost Dec 5, 2023
2d05e23
add a comment
SarahAlidoost Dec 5, 2023
bce1de1
remove unused imports, fix comments
SarahAlidoost Dec 6, 2023
c060a69
Apply suggestions from code review
rogerkuou Dec 6, 2023
430a1da
update notebook
rogerkuou Dec 6, 2023
a4f94c4
update test dnn
rogerkuou Dec 6, 2023
03fe46d
update tests
rogerkuou Dec 6, 2023
7b94632
check if lat and lon exists in the training data before exporting to …
rogerkuou Dec 6, 2023
e9255a0
fix deprecated import
rogerkuou Dec 6, 2023
567f308
Merge pull request #98 from VegeWaterDynamics/93_train_test_split
rogerkuou Dec 6, 2023
9bd74b5
Merge pull request #97 from VegeWaterDynamics/add_dnn_nb
SarahAlidoost Dec 6, 2023
44de650
Apply suggestions from code review
rogerkuou Dec 8, 2023
26cae02
solve conficts
rogerkuou Dec 8, 2023
26c3a2b
add check metadata file in test
rogerkuou Dec 8, 2023
f5db577
setup documentation structure
rogerkuou Dec 11, 2023
75f3694
add ruff settings to pyproject
SarahAlidoost Dec 11, 2023
c5f840d
add ruff lint workflow using ruff
SarahAlidoost Dec 11, 2023
85914b5
fix simple linter errors
SarahAlidoost Dec 11, 2023
354c674
add parameter strict to zip making it consistent across Python versions
SarahAlidoost Dec 11, 2023
1ab4406
Do not use mutable for argument defaults
SarahAlidoost Dec 11, 2023
36c826a
fix deprecated namespaces
SarahAlidoost Dec 11, 2023
73faf8c
Merge pull request #100 from VegeWaterDynamics/95_refactor_unit_tests
rogerkuou Dec 11, 2023
4545a88
Merge branch 'generalization' into 101_ruff
SarahAlidoost Dec 12, 2023
4c0a7e8
fix remaining linter errors
SarahAlidoost Dec 12, 2023
d7abb9c
add general intro
rogerkuou Dec 12, 2023
be16b1f
update general descriptions
rogerkuou Dec 12, 2023
1c3c8e8
update installation docs
rogerkuou Dec 12, 2023
7bad043
Merge pull request #102 from VegeWaterDynamics/101_ruff
SarahAlidoost Dec 12, 2023
11c1a79
add more notebooks to docs
rogerkuou Dec 12, 2023
0292371
add usage page
rogerkuou Dec 12, 2023
78288f1
Merge branch 'generalization' into 91_setup_documentation
rogerkuou Dec 16, 2023
4da0143
add dimension index check to 2d split
rogerkuou Dec 16, 2023
b5261b5
remove keys check in train test split to make the function more flexi…
rogerkuou Dec 16, 2023
f026da2
update usage
rogerkuou Dec 18, 2023
5642308
reorg usage
rogerkuou Dec 18, 2023
cb1d5a0
update usage split
rogerkuou Dec 19, 2023
f03c22c
update daskml usage
rogerkuou Dec 19, 2023
2c3d7fb
update dnn usage
rogerkuou Dec 19, 2023
2e8ca7e
update notebooks
rogerkuou Dec 19, 2023
7147292
other doc files
rogerkuou Dec 19, 2023
19a4231
update readme
rogerkuou Dec 19, 2023
09c5129
Merge pull request #103 from VegeWaterDynamics/91_setup_documentation
rogerkuou Dec 20, 2023
33 changes: 12 additions & 21 deletions .github/workflows/build.yml
@@ -1,37 +1,28 @@
-name: Build
+name: Build and test Python package
 
 on: [push, pull_request]
 
 jobs:
-
   build:
     name: Build for ${{ matrix.python-version }}
-    runs-on: 'ubuntu-latest'
+    runs-on: ${{ matrix.os }}
     strategy:
       fail-fast: false
       matrix:
-        python-version: ['3.9', '3.10']
+        os: ['ubuntu-latest', 'macos-latest', 'windows-latest']
+        python-version: ['3.10', '3.11']
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v2
+        uses: actions/setup-python@v3
         with:
           python-version: ${{ matrix.python-version }}
-      - name: Python info
-        shell: bash -l {0}
-        run: |
-          which python
-          python --version
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install -r requirements.txt
-      - name: Build
-        shell: bash -l {0}
-        run: |
-          python setup.py build
-      - name: Test
-        shell: bash -l {0}
-        run: |
-          pip install pytest pytest-cov pycodestyle
-          pytest --cov --cov-report term --cov-report xml --junitxml=xunit-result.xml
+          python -m pip install build
+          python -m pip install .[dev]
+      - name: Build the package
+        run: python -m build
+      - name: Run tests
+        run: pytest tests/
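To make the effect of the new `strategy.matrix` concrete, the short Python sketch below enumerates the combinations it describes. It assumes the usual GitHub Actions behaviour of running one job per element of the cross product; the function name `expand_matrix` is made up for this illustration and is not part of the PR.

```python
from itertools import product

# Values copied from the updated workflow's strategy.matrix.
OS_LIST = ["ubuntu-latest", "macos-latest", "windows-latest"]
PYTHON_VERSIONS = ["3.10", "3.11"]


def expand_matrix(os_list, python_versions):
    """Return one (os, python-version) pair per job: the full cross product."""
    return list(product(os_list, python_versions))


jobs = expand_matrix(OS_LIST, PYTHON_VERSIONS)
for os_name, py_version in jobs:
    print(f"{os_name} / Python {py_version}")
print(f"{len(jobs)} jobs in total")
```

With three operating systems and two Python versions, the build now covers six combinations, compared with two (one OS, two Python versions) before this change.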
8 changes: 8 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,8 @@
name: Ruff lint
on: [push, pull_request]
jobs:
  ruff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: chartboost/ruff-action@v1
22 changes: 22 additions & 0 deletions README.md
@@ -0,0 +1,22 @@
# MOTrainer: Measurement Operator Trainer

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7540443.svg)](https://doi.org/10.5281/zenodo.7540443)

Measurement Operator Trainer is a Python package for training measurement operators
(MO) for data assimilation purposes. It is specifically designed for applications where one needs to split large spatio-temporal data into independent partitions, and then train separate ML models for each partition.

Please refer to the MOTrainer documentation for more details.

Copyright (c) 2021,

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

## Credits

This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [NLeSC/python-template](https://github.com/NLeSC/python-template).
79 changes: 0 additions & 79 deletions README.rst

This file was deleted.

17 changes: 17 additions & 0 deletions docs/CHANGELOG.md
@@ -0,0 +1,17 @@
# Change Log

All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](http://semver.org/).


## [0.1.0] - 2023-12-21

### Added


The first version of the MOTrainer package. The following functionalities are implemented:
- Spatio-temporal dataset split;
- Jackknife GPI implementation;
- Documentation and examples for distributed training;
- Relevant tests.
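
For readers unfamiliar with the jackknife technique named in the list above, the sketch below shows the generic leave-one-out idea: each group is held out in turn while the remaining groups are used for fitting. Holding out one year at a time is an assumption made for this example, as are the data and function names; the sketch does not reproduce MOTrainer's jackknife implementation.

```python
from statistics import mean

# Hypothetical yearly values of one variable at a single grid point.
values_by_year = {2015: 1.0, 2016: 2.0, 2017: 3.0, 2018: 6.0}


def jackknife_splits(years):
    """Yield (held_out_year, training_years), leaving one year out each time."""
    ordered = sorted(years)
    for held_out in ordered:
        yield held_out, [year for year in ordered if year != held_out]


def leave_one_out_means(data):
    """Compute the mean of the training years for every held-out year."""
    return {
        held_out: mean(data[year] for year in training)
        for held_out, training in jackknife_splits(data)
    }


loo_means = leave_one_out_means(values_by_year)
for year, value in loo_means.items():
    print(f"held out {year}: training mean = {value:.3f}")
```

In a real workflow the mean would be replaced by model training on the retained years and evaluation on the held-out year.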
76 changes: 76 additions & 0 deletions docs/CODE_OF_CONDUCT.md
@@ -0,0 +1,76 @@
# Code of Conduct

This code of conduct is adapted from the
[Git Code of Conduct](https://github.com/git/git/blob/master/CODE_OF_CONDUCT.md).

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting

## Enforcement Responsibilities

Project maintainers are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Project maintainers have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies both within project spaces and in public spaces when
an individual is representing the project or its community. Examples of representing
a project or community include using an official project e-mail address, posting via
an official social media account, or acting as an appointed representative at an
online or offline event. Representation of a project may be further defined and
clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at [email protected].

All complaints will be reviewed and investigated promptly and fairly.

All Project maintainers are obligated to respect the privacy and security of the
reporter of any incident.

## Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
45 changes: 45 additions & 0 deletions docs/CONTRIBUTING.md
@@ -0,0 +1,45 @@

# MOTrainer Contributing Guidelines


We welcome any kind of contribution to our software, from a simple comment
or question to a full-fledged [pull request](https://help.github.com/articles/about-pull-requests/).
Please read and follow our [Code of Conduct](./CODE_OF_CONDUCT.md).

A contribution can be one of the following cases:

- you have a question;
- you think you may have found a bug (including unexpected behavior);
- you want to make some kind of change to the code base (e.g. to fix a bug, to add a new feature, to update documentation).

The sections below outline the steps in each case.

## You have a question

- use the search functionality [here](https://github.com/VegeWaterDynamics/motrainer/issues)
to see if someone already filed the same issue;
- if your issue search did not yield any relevant results, make a new issue;
- apply the "question" label; apply other labels when relevant.

## You think you may have found a bug

- use the search functionality [here](https://github.com/VegeWaterDynamics/motrainer/issues) to see if someone already filed the same issue;
- if your issue search did not yield any relevant results, make a new issue, making sure to provide enough information to the rest of the community to understand the cause and context of the problem. Depending on the issue, you may want to include:
    - the [SHA hashcode](https://help.github.com/articles/autolinked-references-and-urls/#commit-shas) of the commit that is causing your problem;
    - some identifying information (name and version number) for dependencies you're using;
    - information about the operating system;
- apply relevant labels to the newly created issue.

## You want to make some kind of change to the code base

- (**important**) announce your plan to the rest of the community *before you start working*. This announcement should be in the form of a (new) issue;
- (**important**) wait until some kind of consensus is reached about your idea being a good idea;
- if needed, fork the repository to your own GitHub profile and create your own feature branch off of the latest master commit. While working on your feature branch, make sure to stay up to date with the master branch by pulling in changes, possibly from the 'upstream' repository (follow the instructions [here](https://help.github.com/articles/configuring-a-remote-for-a-fork/) and [here](https://help.github.com/articles/syncing-a-fork/));
- make sure the existing tests still work. First, install the development dependencies as `pip install .[dev]`, and then run `pytest tests`;
- add your own tests (if necessary);
- update or expand the documentation. Make sure the documentation builds successfully: first, install the documentation dependencies with `pip install .[docs]`, and then run `mkdocs build`;
- make sure the linting tests pass by running `ruff` in the project root directory: `ruff check .`;
- [push](http://rogerdudler.github.io/git-guide/) your feature branch to (your fork of) the MOTrainer repository on GitHub;
- create the pull request, e.g. following the instructions [here](https://help.github.com/articles/creating-a-pull-request/).

In case you feel like you've made a valuable contribution, but you don't know how to write or run tests for it, or how to generate the documentation: don't let this discourage you from making the pull request; we can help you! Just go ahead and submit the pull request, but keep in mind that you might be asked to append additional commits to your pull request.
20 changes: 0 additions & 20 deletions docs/Makefile

This file was deleted.

13 changes: 0 additions & 13 deletions docs/_static/theme_overrides.css

This file was deleted.

Empty file removed docs/_templates/.gitignore
Empty file.