Train test ratio #43

tab1tha · 2020-03-10T20:09:22Z

This pull request addresses #3

The vehicles.csv dataset is used instead of the generated.csv dataset so that variation could easily be observed. It plots a graph of accuracy against test size and yields a table containing progressive values of performance indices (precision, f1_score, recall), as the test size is changed.

This commit uses modules created in #28 and so it incorporates those changes since the said pull request has not yet been merged to Upstream/master.
The main file here is train_test_ratio.py which imports the load_dataset and test_size vary modules.
The k_nn module created in the aforementioned issue is modified slightly and imported as part of this solution. The train_test_ratio.ipynb file shows the results obtained by implementing the changes in this pull request.

get main.yml

tab1tha · 2020-03-10T20:23:02Z

Multi-line comments will be replaced with docstrings, following the recommendations of the maintainer concerning pull request 28, and according to PEP conventions.

tab1tha · 2020-03-11T06:02:02Z

The principal commit is "Effect of split ratio on performance"(sha 599990c). It shows the progressive changes that have been made to address this particular issue #3.
The first commit was just included so that modules created before (in pull request #28) can be reused and the second commit was made to synchronize my fork with upstream/master.

tab1tha · 2020-03-11T09:17:55Z

@dzeber @mlopatka Should I continue in this way? that is, reusing the modules I created in the previous pull request for the next?. This will therefore necessitate that the branch of the next pull request will be 'branched out' from this one in order to make these modules available for reuse as I did here.
However, this will pose a problem when the original branch is updated, because it would not synchronise with this one.
On the other hand, if I treat each pull request independently, I will have to recreate the modules whenever I need them. Considering that I made initial changes to my master branch which are pending in pull request #28, this second option means that I will have to refork upstream and reclone my repo in order that origin master will be in synchrony with upstream master. However, doing this will cause all the unmerged changes in origin master now (#28) to be lost.
Please how do you think I should proceed? what do you recommend?

tab1tha · 2020-03-18T12:24:59Z

@mlopatka please can you merge this? I do not have merging rights.

mlopatka · 2020-03-20T17:20:05Z

@tab1tha apologies for the delay.
Can you please resolve conflicts and ping me for merge.

tab1tha · 2020-03-20T19:03:04Z

@mlopatka I have resolved all the merge conflicts but one. I resolved locally and when I pushed the changes there is one conflict that is present her in k_nn.py
lines 39 to 43 need to be left there while
lines 45 to 47 need to be deleted
However, when I do this on github, the "mark as resolved" button is deactivated so I cannot save the changes. Please can you resolve this last conflict and merge?

mlopatka · 2020-03-20T19:09:31Z

@tab1tha can you verify that knn.py looks good to you?
Then please pull down master so we can use it as a base for resolving other conflicts.

tab1tha added 3 commits March 8, 2020 20:54

KNN model trained and tested on generated.csv dataset

4344921

Merge branch 'master' of https://github.com/mozilla/PRESC in order to

3f32805

get main.yml

Effect of split ratio on performance

599990c

Hyperparameter tuning and cross validation implemented

d2336c5

tab1tha mentioned this pull request Mar 13, 2020

KNN model trained and tested on generated.csv dataset #28

Merged

mlopatka approved these changes Mar 16, 2020

View reviewed changes

Black formatting applied

d33f907

Merge conflicts resolved

0063bf0

Merge branch 'master' into train-test-ratio

19b12bd

mlopatka merged commit dde72c1 into mozilla:master Mar 20, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Train test ratio #43

Train test ratio #43

tab1tha commented Mar 10, 2020 •

edited

Loading

tab1tha commented Mar 10, 2020

tab1tha commented Mar 11, 2020

tab1tha commented Mar 11, 2020 •

edited

Loading

tab1tha commented Mar 18, 2020

mlopatka commented Mar 20, 2020

tab1tha commented Mar 20, 2020

mlopatka commented Mar 20, 2020

Train test ratio #43

Train test ratio #43

Conversation

tab1tha commented Mar 10, 2020 • edited Loading

tab1tha commented Mar 10, 2020

tab1tha commented Mar 11, 2020

tab1tha commented Mar 11, 2020 • edited Loading

tab1tha commented Mar 18, 2020

mlopatka commented Mar 20, 2020

tab1tha commented Mar 20, 2020

mlopatka commented Mar 20, 2020

tab1tha commented Mar 10, 2020 •

edited

Loading

tab1tha commented Mar 11, 2020 •

edited

Loading