- Template directory for data science competitions.
- Data is stored in PostgreSQL running in a Docker🐳 container, so the data is reproducible and reusable 😄🎉
git clone https://github.com/kiccho1101/kaggle-base.git
cd kaggle-base
Recommended:
make pull
or
make build
make jupyter
- Copy the token and open localhost:${JUPYTER_PORT} (default: 9000).
make start-db
- Then you can open localhost:${PGWEB_PORT} (default: 9002) to view the database.
make kfold CONFIG_NAME (default: lightgbm_0)
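As a rough illustration of what a k-fold step typically produces, here is a minimal sketch in plain Python. It does not come from kaggle-base: the helper `assign_folds` and its parameters are hypothetical, and the repository's own `make kfold` may split the data differently (for example with stratification, or by writing fold indices to the database).

```python
import random
from collections import Counter
from typing import List


def assign_folds(n_rows: int, n_splits: int = 5, seed: int = 0) -> List[int]:
    """Return a fold index (0..n_splits-1) for each row.

    Hypothetical helper for illustration; not taken from kaggle-base.
    """
    if n_splits < 2 or n_splits > n_rows:
        raise ValueError("n_splits must be between 2 and n_rows")
    # Round-robin labels give fold sizes that differ by at most one.
    folds = [i % n_splits for i in range(n_rows)]
    # A seeded shuffle keeps the assignment random but reproducible.
    random.Random(seed).shuffle(folds)
    return folds


folds = assign_folds(n_rows=10, n_splits=5, seed=0)
print(sorted(Counter(folds).items()))
# [(0, 2), (1, 2), (2, 2), (3, 2), (4, 2)]
```

Fixing the seed is what lets every later run (cross-validation, training) reuse exactly the same split.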
- Create all features.
make feature
- Create only the specified feature.
make feature FEATURE_NAME
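The "all features or one named feature" behaviour of the two commands above can be sketched with a small registry. This is an assumption about how such a step might be organised, not the repository's actual code: `FEATURES`, `register`, `build_features`, and the two example features are all hypothetical names.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, float]
FeatureFn = Callable[[List[Row]], List[float]]

# Hypothetical registry mapping a feature name to the function that builds it.
FEATURES: Dict[str, FeatureFn] = {}


def register(name: str) -> Callable[[FeatureFn], FeatureFn]:
    """Decorator that records a feature function under the given name."""
    def decorator(fn: FeatureFn) -> FeatureFn:
        FEATURES[name] = fn
        return fn
    return decorator


@register("family_size")
def family_size(rows: List[Row]) -> List[float]:
    return [r["sibsp"] + r["parch"] + 1 for r in rows]


@register("fare_per_person")
def fare_per_person(rows: List[Row]) -> List[float]:
    return [r["fare"] / (r["sibsp"] + r["parch"] + 1) for r in rows]


def build_features(
    rows: List[Row], feature_name: Optional[str] = None
) -> Dict[str, List[float]]:
    """Build every registered feature, or only `feature_name` if given."""
    if feature_name is None:
        names = list(FEATURES)
    elif feature_name in FEATURES:
        names = [feature_name]
    else:
        raise KeyError(f"unknown feature: {feature_name}")
    return {name: FEATURES[name](rows) for name in names}


rows = [
    {"sibsp": 1, "parch": 0, "fare": 10.0},
    {"sibsp": 0, "parch": 0, "fare": 8.0},
]
all_features = build_features(rows)
one_feature = build_features(rows, "family_size")
print(one_feature)
# {'family_size': [2, 1]}
```

Keeping each feature behind its own name makes it cheap to rebuild a single feature without recomputing the rest.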
make cv CONFIG_NAME
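To make the cross-validation step concrete, the sketch below computes an out-of-fold RMSE from a list of fold indices. It deliberately swaps in a trivial "predict the training mean" baseline instead of the model named by `CONFIG_NAME`, and the function name `cross_validate_mean_baseline` is hypothetical; the scoring logic in kaggle-base itself may differ.

```python
import math
from typing import List


def cross_validate_mean_baseline(y: List[float], folds: List[int]) -> float:
    """Return the out-of-fold RMSE of a predict-the-training-mean baseline."""
    oof = [0.0] * len(y)  # out-of-fold predictions, one per row
    for k in sorted(set(folds)):
        # "Train" on every row that is not in fold k.
        train = [value for value, fold in zip(y, folds) if fold != k]
        mean = sum(train) / len(train)
        # Predict the held-out rows of fold k.
        for i, fold in enumerate(folds):
            if fold == k:
                oof[i] = mean
    squared_errors = [(pred - true) ** 2 for pred, true in zip(oof, y)]
    return math.sqrt(sum(squared_errors) / len(y))


y = [1.0, 2.0, 3.0, 4.0]
fold_ids = [0, 1, 0, 1]
score = cross_validate_mean_baseline(y, fold_ids)
print(round(score, 4))
# 1.4142
```

Because each row is predicted only by a model that never saw it, the score approximates performance on unseen data.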
make stats
make train-and-predict CONFIG_NAME
- Then submit your output file!🙆
./output/submission_xxx.csv
make format
make check
make reset-db
Recommended:
make shell
python xxx.py
or
make run python xxx.py
A slide deck I found just when I was worn out by feature management. It is very easy to follow.
The way the classes are written is a useful reference.