optimization

Repository of the code (Python 3) for the homework of Optimization for Data Science, academic year 2019/2020, UNIPD.

Packages required:

numpy, matplotlib, sklearn, argparse, json, time, tabulate, tqdm

Main scripts (src)

In src it is possible to find the main scripts used for this work:

loss.py: containing the LogisticLoss class
optimizer.py: containing the abstract Optimizer class and the main three optimization algorithms (Gradient Descent, SGD and SVRG)
model.py: containing the Model class, which is a simple Linear Classifier with the fit and predict functions. It also contains a main that demonstrates how the model works on some 1D and 2D data with the different optimization algorithms.
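The pieces listed above can be illustrated with a minimal NumPy sketch. It is not the code in src: the function names, the absence of a bias term, the labels in {-1, +1} and the L2 form of the regularizer are all assumptions made for illustration. It shows a regularized logistic loss with its gradient, and one possible form of the Gradient Descent, SGD and SVRG updates.

```python
import numpy as np


def logistic_loss(w, X, y, reg_coeff=0.001):
    """Mean logistic loss plus an L2 penalty (assumed form); y is in {-1, +1}."""
    margins = y * (X @ w)
    # logaddexp(0, -m) equals log(1 + exp(-m)) and avoids overflow
    return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * reg_coeff * (w @ w)


def logistic_grad(w, X, y, reg_coeff=0.001):
    """Gradient of logistic_loss with respect to w."""
    margins = y * (X @ w)
    # 0.5 * (1 - tanh(m / 2)) equals 1 / (1 + exp(m)), computed stably
    coeffs = -y * 0.5 * (1.0 - np.tanh(0.5 * margins))
    return X.T @ coeffs / len(y) + reg_coeff * w


def gradient_descent(X, y, lr=0.05, epochs=10, reg_coeff=0.001, tol=0.001):
    """Full-batch gradient descent; stops early when the gradient norm is below tol."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = logistic_grad(w, X, y, reg_coeff)
        if np.linalg.norm(grad) < tol:
            break
        w -= lr * grad
    return w


def sgd(X, y, lr=0.05, epochs=10, reg_coeff=0.001, seed=None):
    """Stochastic gradient descent: one example per step, reshuffled every epoch."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w -= lr * logistic_grad(w, X[i:i + 1], y[i:i + 1], reg_coeff)
    return w


def svrg(X, y, lr=0.05, epochs=10, iter_epoch=5000, reg_coeff=0.001, seed=None):
    """SVRG: each epoch takes a snapshot and its full gradient, then runs
    iter_epoch variance-reduced stochastic steps."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = logistic_grad(w_snap, X, y, reg_coeff)
        for _ in range(iter_epoch):
            i = rng.integers(len(y))
            xi, yi = X[i:i + 1], y[i:i + 1]
            # stochastic gradient corrected by the same example's snapshot gradient
            step = (logistic_grad(w, xi, yi, reg_coeff)
                    - logistic_grad(w_snap, xi, yi, reg_coeff)
                    + full_grad)
            w -= lr * step
    return w
```

A prediction under these assumptions would be np.sign(X @ w). The default values mirror the ones documented for the test script, but how the actual classes use them may differ.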

Test script (src)

In src there is the script test.py, which makes it possible to test the model with the chosen optimizer on three different datasets (IRIS, MNIST, a9a).
The script has the following arguments:

data: String. The dataset to use (mnist, a9a, iris). Default is iris.
optim: String. The optimization algorithm to use (gd, sgd, svrg). Default is gd
tollerance: Float. The gradient-norm threshold at which the optimization algorithm stops. Default is 0.001
iter_epoch: Int. The number of examples to take in every step of SVRG. Default 5000. Ignored if not SVRG.
epochs: Int. The number of epochs. Default is 10.
lr: Float. The step size. Default is 0.05
reg_coeff: Float. The regularization lambda coefficient. Default is 0.001
init: String. The weights initialization. Either "zeros" or "random" (normal distribution). Default is zeros
verbose: 0/1. Print some information on every step (1) or don't (0). Default is 0.
seed: Int. Define a random seed. Default is None
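The argument list above could be declared with argparse roughly as follows. This is a hedged sketch, not the parser in test.py: build_parser is a hypothetical name, and the double-dash flag spelling is inferred from the example command in this README, which only confirms it for data, optim, epochs, lr, reg_coeff, init and seed.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(
        description="Test the linear classifier with a chosen optimizer.")
    parser.add_argument("--data", type=str, default="iris",
                        choices=["mnist", "a9a", "iris"], help="dataset to use")
    parser.add_argument("--optim", type=str, default="gd",
                        choices=["gd", "sgd", "svrg"], help="optimization algorithm")
    # Spelling kept as documented in this README
    parser.add_argument("--tollerance", type=float, default=0.001,
                        help="gradient-norm threshold for stopping")
    parser.add_argument("--iter_epoch", type=int, default=5000,
                        help="examples per SVRG step (ignored otherwise)")
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs")
    parser.add_argument("--lr", type=float, default=0.05, help="step size")
    parser.add_argument("--reg_coeff", type=float, default=0.001,
                        help="regularization lambda coefficient")
    parser.add_argument("--init", type=str, default="zeros",
                        choices=["zeros", "random"], help="weights initialization")
    parser.add_argument("--verbose", type=int, default=0, choices=[0, 1],
                        help="print information on every step (1) or not (0)")
    parser.add_argument("--seed", type=int, default=None, help="random seed")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Restricting data, optim and init with choices makes a mistyped value fail at parse time rather than deep inside the training loop; whether the actual script validates its inputs this way is not stated here.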

Example:

cd C:\Users\fgrim\Desktop\Optimization\optimization
python src\model.py --data "mnist" --optim "gd" --epochs 300 --lr 0.2 --reg_coeff 0.01 --init "random" --seed 42

Result notebook:

result.ipynb is a notebook (set up in a Colab environment, but easy to modify to run locally) that fits different models (different datasets and optimizers) and returns some results.

Results (results)

In the results folder there are some json files containing results, plus two scripts, display.py and reduce.py, written to process and visualize them.

Datasets (data)

In the data folder there are the MNIST dataset and the a9a dataset. The IRIS dataset is loaded through sklearn.datasets.
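For reference, IRIS can be loaded from scikit-learn as sketched below. The load_iris call is the standard sklearn.datasets loader; the conversion to binary labels in {-1, +1} (setosa against the other two species) is only an assumption for illustration, since this README does not say how the three IRIS classes are handled.

```python
import numpy as np
from sklearn.datasets import load_iris

# X has one row per flower and four features; target holds class ids 0, 1, 2
X, target = load_iris(return_X_y=True)

# Assumed binary encoding: class 0 (setosa) against the rest
y = np.where(target == 0, 1.0, -1.0)

print(X.shape, y.shape)
```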

Images (images)

In the images folder there are some images of the results.

