Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Survey Assist Using RF #938
base: master
Are you sure you want to change the base?
Survey Assist Using RF #938
Changes from 1 commit
6471a96
9d6a1af
9bd9b18
1b9ece0
e7d2a14
59633e0
0adb5fe
e9abd51
3820d87
5b2572e
bf7f406
1d7be5a
b3d0db2
c514fe0
3b038a9
94fc848
87f109c
33cdaab
01fcb2a
f5fec64
585cc90
61bbe3f
a32ce4f
052cb08
104dd9a
1b523ed
35a1346
19bb394
450094c
ad968de
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still see this form of import
Already flagged in #938 (comment)
and #938 (comment)
but not yet fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIxed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this just boilerplate code to copy parameters from one datastructure to another? Why not use something like
kwargs
instead?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I have used
kwargs
now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is
trip_grouper
and where did it come from? Why do we need to save it?Why are we serializing attribute-by-attribute instead of saving a JSON representation of the model, which would be be obvious fit for our document-based storage system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even for JSON serialization, we'll have to work attribute by attribute, since each instance here contain instances of other classes (
ForestClassifierModel
hasForestClassifier
which has 3RandomForestClassifiers
and 2OneHotWrappers
( this further has oneHotEncoders). We'll have to add for every class its own serialization method ( since JSON only serializes basic data structures by default) so thatjson.dumps (obj)
can work. The current version was written with least disturbance to other parts of codebase in mind. However, prioritizing the way serialization fits our DB is an important thought. Do you want me to try the JSON way?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted for this file because it was used in one other tests for greedySimilarityBinning. However, for the user mentioned (above), there are just 6 trips. I wanted to confirm with you if I am free to use other files from this location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other files in this location are used in other tests so I don't see any reason why they cannot be used. As you point out, these are daily snapshots, so they are unlikely to give very high quality results, but there is no restriction on their use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is questionable. I wanted to know if dumping ( writing predictions of first run ever to be used as ground truth) in a JSON here is a good practice or not?
Other option is to store the past predictions in db. But I think the tear-down will clean it up while exiting. So I opted for JSON. Is there a way to protect the writing. Let me know if I am missing something here.