Better NaN handling and dataframe support #200

Merged on Aug 27, 2024 (15 commits)

Conversation

JannisHoch
Owner

  • first steps towards supporting a regression model
  • better handling of NaN values: data points with one or more NaN values are removed, i.e., only polygons that are covered by all features remain in the analysis
  • dataframe support: inputs X and Y are now dataframes instead of np.arrays
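
The NaN handling described above can be sketched roughly as follows. This is a minimal illustration with pandas, not the code from this PR: the function name `drop_incomplete_polygons`, the column names, and the use of `poly_id` as the index are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def drop_incomplete_polygons(X: pd.DataFrame, Y: pd.DataFrame):
    """Keep only rows (polygons) for which every feature in X is available.

    Hypothetical helper for illustration; not part of the package API.
    """
    # Boolean mask: True where a row has no NaN in any feature column.
    complete = X.notna().all(axis=1)
    X_clean = X.loc[complete]
    # Select the same index labels in Y so samples and targets stay paired.
    Y_clean = Y.loc[X_clean.index]
    return X_clean, Y_clean


# Made-up example data: four polygons, two features, one target column.
X = pd.DataFrame(
    {"precipitation": [1.2, np.nan, 0.7, 3.1], "gdp": [10.0, 12.0, np.nan, 9.5]},
    index=pd.Index([101, 102, 103, 104], name="poly_id"),
)
Y = pd.DataFrame({"conflict": [0, 1, 0, 1]}, index=X.index)

X_clean, Y_clean = drop_incomplete_polygons(X, Y)
print(X_clean.index.tolist())  # [101, 104]
```

Keeping X and Y as dataframes with a shared index is what makes this filtering straightforward: rows are dropped by label, so the targets cannot drift out of step with the features the way positionally indexed arrays can.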

@JannisHoch JannisHoch merged commit 338021e into dev Aug 27, 2024
JannisHoch added a commit that referenced this pull request Dec 11, 2024
* Create pythonpublish.yml

* updated to 0.1.0 including pip support

* Create .pypirc

* removed token

* corrected file name

* removed GDAL dep

* updated file content

* corrected folder name

* version 0.1.1, now REALLY with pip and github actions

* updated pypi installation instruction

* added quantiles

* added sanity checks

* writing geojson instead of shp

* Bump pip from 20.0.2 to 21.1

Bumps [pip](https://github.com/pypa/pip) from 20.0.2 to 21.1.
- [Release notes](https://github.com/pypa/pip/releases)
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](pypa/pip@20.0.2...21.1)

---
updated-dependencies:
- dependency-name: pip
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump ipython from 7.13.0 to 7.16.3 in /docs

Bumps [ipython](https://github.com/ipython/ipython) from 7.13.0 to 7.16.3.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](ipython/ipython@7.13.0...7.16.3)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump ipython from 7.13.0 to 7.16.3

Bumps [ipython](https://github.com/ipython/ipython) from 7.13.0 to 7.16.3.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](ipython/ipython@7.13.0...7.16.3)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump numpy from 1.18.1 to 1.21.0

Bumps [numpy](https://github.com/numpy/numpy) from 1.18.1 to 1.21.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt)
- [Commits](numpy/numpy@v1.18.1...v1.21.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump numpy from 1.18.1 to 1.21.0 in /docs

Bumps [numpy](https://github.com/numpy/numpy) from 1.18.1 to 1.21.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt)
- [Commits](numpy/numpy@v1.18.1...v1.21.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump numpy from 1.21.0 to 1.22.0 in /docs (#150)

Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](numpy/numpy@v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump numpy from 1.21.0 to 1.22.0 (#151)

Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](numpy/numpy@v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert from 5.6.1 to 6.3.0 (#153)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 5.6.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](jupyter/nbconvert@5.6.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert from 5.6.1 to 6.3.0 in /docs (#152)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 5.6.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](jupyter/nbconvert@5.6.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump setuptools from 49.6 to 65.5.1 in /docs (#158)

Bumps [setuptools](https://github.com/pypa/setuptools) from 49.6 to 65.5.1.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/CHANGES.rst)
- [Commits](pypa/setuptools@v49.6.0...v65.5.1)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert from 6.3.0 to 6.5.1 (#154)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.3.0 to 6.5.1.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](jupyter/nbconvert@6.3.0...6.5.1)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert from 6.3.0 to 6.5.1 in /docs (#155)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.3.0 to 6.5.1.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](jupyter/nbconvert@6.3.0...6.5.1)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump wheel from 0.33.6 to 0.38.1 in /docs (#156)

Bumps [wheel](https://github.com/pypa/wheel) from 0.33.6 to 0.38.1.
- [Release notes](https://github.com/pypa/wheel/releases)
- [Changelog](https://github.com/pypa/wheel/blob/main/docs/news.rst)
- [Commits](pypa/wheel@0.33.6...0.38.1)

---
updated-dependencies:
- dependency-name: wheel
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump setuptools from 49.6 to 65.5.1 (#157)

Bumps [setuptools](https://github.com/pypa/setuptools) from 49.6 to 65.5.1.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/CHANGES.rst)
- [Commits](pypa/setuptools@v49.6.0...v65.5.1)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump ipython from 7.16.3 to 8.10.0 in /docs (#159)

Bumps [ipython](https://github.com/ipython/ipython) from 7.16.3 to 8.10.0.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](ipython/ipython@7.16.3...8.10.0)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump ipython from 7.16.3 to 8.10.0 (#160)

Bumps [ipython](https://github.com/ipython/ipython) from 7.16.3 to 8.10.0.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](ipython/ipython@7.16.3...8.10.0)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated dependencies (#161)

* v0.1.2 (#162)

* updated dependencies

* correct dtype of list with selection conflict type

* got rid of fiona requirement

* improved downloading of example data

* deprecated plotting for now

as sklearn API has changed

* v0.1.2

* Bump pip from 21.1 to 23.3 (#183)

Bumps [pip](https://github.com/pypa/pip) from 21.1 to 23.3.
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](pypa/pip@21.1...23.3)

---
updated-dependencies:
- dependency-name: pip
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump jinja2 from 2.11.3 to 3.1.3 in /docs (#184)

Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.3 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@2.11.3...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update and freeze deps (#186)

* some cleaning up

* precommit hooks

* models.py as object-based

* object based machine_learning.py

* moving functions here

* object based models.py

* removed some unnecessary functions

* updates because of moved function

* starting to clean up data.py mess

* Object based and more (#187)

* cleaning up evaluation.py

* parsing and collecting of settings/configurations in separate file

* separate file for I/O functions

* clean up

* removed outdated tests

* improving selection.py

* Update CODE_OF_CONDUCT.md

* updated and moved functions in conflicts.py

* cosmetic changes to machine_learning.py

* put all neighbours related functions in one file

* updating data.py, resp. what is now called xydata.py

* moved run_prediction func from pipeline.py to models.py

* updated cli script

* Debug copro runner (#188)

* works up to ML part

* debugging some eval functions

* reference run works via command line; removed feature importance for now

* loop over n_runs now part of models.py

* rename

* minor improvements

* defining projection period works now (again)

* projections successfully run

* completed docstrings and typehints

* improving console out

* watprovID now always a kwarg (#189)

* Fix rtd (#190)

* restart doc

* updated dependencies

* removed out-dated duplicate

* next take

* added pydata-sphinx-theme dep

* using conda for rtd

* bla

* add mock

* sphinx works locally at least

* splitting docs in source and make

* blabla

* no copro dep in conf.py

* add installation instructions

* added static data

* Fix rtd (#191)

* restart doc

* updated dependencies

* removed out-dated duplicate

* next take

* added pydata-sphinx-theme dep

* using conda for rtd

* bla

* add mock

* sphinx works locally at least

* splitting docs in source and make

* blabla

* no copro dep in conf.py

* add installation instructions

* added static data

* added API partially

* updated installation

* GridsearchCV (#192)

* possible to tune hyperparameters of each RFC instance

* n_jobs and verbose as click options

* applying gridsearchcv works as expected

* updated example settings

* extended documentation

* computing permutation importance for each run n with 10 permutations (#193)

* lag time as constant; removed distutils

* Bump setuptools from 69.0.3 to 70.0.0 (#198)

Bumps [setuptools](https://github.com/pypa/setuptools) from 69.0.3 to 70.0.0.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](pypa/setuptools@v69.0.3...v70.0.0)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump scikit-learn from 1.3.2 to 1.5.0 (#197)

Bumps [scikit-learn](https://github.com/scikit-learn/scikit-learn) from 1.3.2 to 1.5.0.
- [Release notes](https://github.com/scikit-learn/scikit-learn/releases)
- [Commits](scikit-learn/scikit-learn@1.3.2...1.5.0)

---
updated-dependencies:
- dependency-name: scikit-learn
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add ACLED support and use "yaml" file type for model settings (#199)

* first steps

* Switch from cfg to yaml (#194)

* parse YAML file

* parsing settings from YAML-file works

* improved conflict property selection based on YAML-file input

* fix _clip_to_extent

* reading indicator data works, also with options

* adding poly_id and conflict_id

* added todos

* fixed problem with perm. importance and debugged workflow to be compatible with yaml input

* add KFold to GridSearchCV

* Better NaN handling and dataframe support (#200)

* option for either Regression or Classification model

* fine tuned selection of conflict data points

* option to provide ML target variable

* use simulation_name as output dir

* data extraction from netCDF files in a separate function

* target variable and estimator used determined in main script

* should belong to previous commit

* first step towards flexible target_vars when extracting conflict (Y) data

* added class docstrings

* constructing X and Y data as dataframes instead of arrays

* no log-scale support for now, more consistent treatment of polygons w/o feature data

* removing all polygons with 1 or more NaNs

* fully pd.dataframe support implemented

* save only selected conflicts which fall in simulation period

* finetuning print output

* Fix/projections (#201)

* option for either Regression or Classification model

* fine tuned selection of conflict data points

* option to provide ML target variable

* use simulation_name as output dir

* data extraction from netCDF files in a separate function

* target variable and estimator used determined in main script

* should belong to previous commit

* first step towards flexible target_vars when extracting conflict (Y) data

* added class docstrings

* constructing X and Y data as dataframes instead of arrays

* no log-scale support for now, more consistent treatment of polygons w/o feature data

* removing all polygons with 1 or more NaNs

* fully pd.dataframe support implemented

* save only selected conflicts which fall in simulation period

* finetuning print output

* no random state set for Kfold in GridSearchCV to ensure all n models are fitted on different data

* better handling of cores via command line

* remove content of output dir to avoid conflicts with expected files

* settings parsed for projections with new yaml file structure

* updated docstring for load_estimators

* fixed definition of projection period

* reinstated initiate_X_data function

* saving files as GPKG instead of GeoJSON

* no output for None output from rasterstats

* make run_prediction() work

* apply isort

* no ML target var for now

* fix Geopandas driver

* corrected number of function arguments

* Bump scikit-learn from 0.22.1 to 1.5.0 in /docs (#205)

Bumps [scikit-learn](https://github.com/scikit-learn/scikit-learn) from 0.22.1 to 1.5.0.
- [Release notes](https://github.com/scikit-learn/scikit-learn/releases)
- [Commits](scikit-learn/scikit-learn@0.22.1...1.5.0)

---
updated-dependencies:
- dependency-name: scikit-learn
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump setuptools from 65.5.1 to 70.0.0 in /docs (#204)

Bumps [setuptools](https://github.com/pypa/setuptools) from 65.5.1 to 70.0.0.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](pypa/setuptools@v65.5.1...v70.0.0)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump jinja2 from 3.1.3 to 3.1.4 in /docs (#203)

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.3...3.1.4)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>