ToDos:
- Business/Data Understanding
- Data Prep - Lisa
- cleaning, e.g. drop unnecessary columns (or columns that would be problematic for ethics & data privacy)
- test model performance with all columns available?
- feature engineering?
- create centralized test set from all banks' data - Fabian
- fix a seed for the val-set split for reproducibility
- create data_prep.py (library; also usable for single records at deployment? see the data_prep sketch after this list)
- Optimize Models?
- Logistic Regression Training - Daniel (sklearn baseline sketch after this list, shared with the SVC)
- Centralized - Fabian
- Individual for each Bank (for comparison to FML)
- (Optional) Federated - Fabian (SVC is the better model, but only slightly)
- Explainability
- SVM Training - Daniel
- Centralized - Fabian
- Individual for each Bank (for comparison to FML)
- Federated - Fabian
- Explainability
- NN Model Training (TensorFlow) - Vanessa (Keras + W&B sketch after this list)
- Centralized
- Individual for each Bank (for comparison to FML) - Fabian
- Optimize - Vanessa did this well 😉
- Federated - Fabian (FedAvg sketch after this list)
- Explainability
- (Optional) Connection to W&B (upload stats)
- Explainability (SHAP sketch after this list)
- Performance
- Eval Accuracy
- (Optional) Streamlit/Gradio test interface for the model = deployment? (Gradio sketch after this list)
- Code Cleaning
- Security issues?!
- Model Card?
- Presentation (align with CRISP-DM)
- Results of Business/Data Understanding
- Results of Feature Engineering
- FL Model Performance (Centralized vs. FML vs 3x NN)
- Comparison to centralized Model (per bank?)
- Comparison to simpler non-NN model(s) (performance & accuracy)
- Performance Criteria (Precision?)
- Fairness & Explainability
- Security issues?!
- Model Card?
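Sketches (rough drafts; all names, parameters, and schemas in them are assumptions, not the final implementation):

Minimal sketch of data_prep.py as referenced above, assuming one pandas DataFrame per bank; the dropped columns and the target label are placeholders, not the real schema.

```python
# data_prep.py -- minimal sketch; column names and the target label are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split

SEED = 42  # fixed seed so the val/test splits are reproducible across runs

# columns assumed unnecessary or problematic for ethics / data privacy (hypothetical names)
DROP_COLUMNS = ["customer_id", "name", "zip_code"]

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop unnecessary / privacy-critical columns and rows with missing values."""
    return df.drop(columns=DROP_COLUMNS, errors="ignore").dropna()

def split(df: pd.DataFrame, target: str = "default"):
    """Seeded, stratified 70/15/15 train/val/test split, reusable per bank and centrally."""
    X, y = df.drop(columns=[target]), df[target]
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.30, random_state=SEED, stratify=y)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.50, random_state=SEED, stratify=y_tmp)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```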
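A possible baseline setup for the Logistic Regression and SVC training (centralized and per bank); the scaler and hyperparameters are assumptions, not the tuned configuration.

```python
# Baseline sketch for Logistic Regression / SVC; scaler and hyperparameters are assumptions.
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score

def train_and_eval(model, train, test):
    """Fit a scaled pipeline and report accuracy/precision on the given test split."""
    (X_train, y_train), (X_test, y_test) = train, test
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    return {"accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred)}

# usage sketch: once on the pooled (centralized) data, once per bank for the FML comparison
# central_lr   = train_and_eval(LogisticRegression(max_iter=1000), central_train, central_test)
# per_bank_svc = {bank: train_and_eval(SVC(), splits[0], splits[2])
#                 for bank, splits in bank_splits.items()}
```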
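Sketch of the centralized TensorFlow/Keras model plus the optional W&B logging; layer sizes, epochs, and the project name are assumptions, not Vanessa's tuned architecture.

```python
# Centralized Keras model + optional W&B logging; sizes and project name are assumptions.
import tensorflow as tf
import wandb

def build_model(n_features: int) -> tf.keras.Model:
    """Small binary classifier for the tabular credit data."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.Precision(name="precision")])
    return model

# usage sketch (X_train etc. from data_prep.split above):
# run = wandb.init(project="fml-credit")                     # hypothetical project name
# model = build_model(X_train.shape[1])
# history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20)
# wandb.log({"val_accuracy": max(history.history["val_accuracy"])})
# run.finish()
```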
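A hand-rolled FedAvg sketch for the federated NN training (no FL framework assumed): each bank trains locally from the current global weights, then the weights are averaged, weighted by local dataset size.

```python
# FedAvg sketch; bank_data is a hypothetical dict: bank name -> (X_local, y_local).
import numpy as np
import tensorflow as tf

def fedavg_round(global_model: tf.keras.Model, bank_data: dict, local_epochs: int = 1):
    local_weights, sizes = [], []
    for X_local, y_local in bank_data.values():
        local = tf.keras.models.clone_model(global_model)     # same architecture, fresh copy
        local.compile(optimizer="adam", loss="binary_crossentropy")
        local.set_weights(global_model.get_weights())          # start from the global weights
        local.fit(X_local, y_local, epochs=local_epochs, verbose=0)
        local_weights.append(local.get_weights())
        sizes.append(len(X_local))
    # average each weight tensor across banks, weighted by local dataset size
    averaged = [np.average([w[i] for w in local_weights], axis=0, weights=sizes)
                for i in range(len(local_weights[0]))]
    global_model.set_weights(averaged)
    return global_model

# training loop sketch: repeat for a number of communication rounds
# for _ in range(20):
#     fedavg_round(global_model, bank_data)
```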
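Model-agnostic explainability sketch using SHAP's KernelExplainer, usable for the sklearn baselines as well as the NN; the background and explanation sample sizes are assumptions.

```python
# SHAP sketch for the explainability items; sample sizes are assumptions.
import numpy as np
import shap

def explain(model, X_background, X_explain):
    """Summary plot of SHAP values for X_explain, using X_background as reference data."""
    predict_fn = lambda x: np.ravel(model.predict(x))          # one scalar score per row
    explainer = shap.KernelExplainer(predict_fn, shap.sample(X_background, 100))
    shap_values = explainer.shap_values(X_explain)
    shap.summary_plot(shap_values, X_explain)

# usage sketch: explain(trained_model, X_train, X_test[:50])
```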
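Minimal Gradio sketch for the optional test interface, assuming a trained model that outputs a default probability; the feature names are hypothetical and would come from data_prep in practice.

```python
# Gradio test-interface sketch; feature names are hypothetical placeholders.
import gradio as gr
import pandas as pd

FEATURES = ["age", "income", "loan_amount"]                    # hypothetical feature names

def build_demo(model) -> gr.Interface:
    def predict_fn(*values):
        row = pd.DataFrame([values], columns=FEATURES)
        p_default = float(model.predict(row).ravel()[0])        # assumed probability of default
        return {"default": p_default, "no default": 1.0 - p_default}
    return gr.Interface(fn=predict_fn,
                        inputs=[gr.Number(label=name) for name in FEATURES],
                        outputs=gr.Label(num_top_classes=2))

# usage sketch: build_demo(trained_model).launch()
```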
Thoughts: Compare the separately trained per-bank performance with the federated performance (on global & bank-specific data) = what could a single bank achieve vs. FML? Compare the central model to FML? (comparison sketched below)
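A sketch of that comparison: each bank's locally trained model vs. the federated model, evaluated on the bank's own test data and on the central test set; bank_models, fed_model, and the test splits are hypothetical names from the steps above.

```python
# Comparison sketch: local per-bank models vs. the federated model.
import pandas as pd
from sklearn.metrics import accuracy_score

def compare(bank_models: dict, fed_model, bank_test_sets: dict, central_test):
    def acc(model, X, y):
        pred = (model.predict(X).ravel() > 0.5).astype(int)    # threshold probability outputs
        return accuracy_score(y, pred)
    X_c, y_c = central_test
    rows = [{"bank": bank,
             "local_on_bank": acc(local, *bank_test_sets[bank]),
             "federated_on_bank": acc(fed_model, *bank_test_sets[bank]),
             "federated_on_central": acc(fed_model, X_c, y_c)}
            for bank, local in bank_models.items()]
    return pd.DataFrame(rows)
```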
Thursday ToDos:
- Centralized NN (Track?)
- inference time per element = 707 µs/step (measurement sketch at the end of this list)
- eval on each bank's data
- 3 separate NNs (one per bank)
- Logistic Regression
- inference time per element
- NN Explainability
- (Optional) Model Cards
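Rough per-element inference timing (for numbers like the 707 µs/step above); model and X stand for whichever trained model and test split is being timed.

```python
# Inference-time sketch: best-of-N wall-clock time for one predict() pass per row.
import time

def time_per_element(model, X, repeats: int = 5) -> float:
    """Return microseconds per row, taking the fastest of several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model.predict(X)
        best = min(best, time.perf_counter() - start)
    return best / len(X) * 1e6

# usage sketch: print(f"{time_per_element(trained_model, X_test):.1f} µs per element")
```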