-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/replace csv parser by papa parse #14
Conversation
Processing the CSV as is currently done is taking considerably more time than with the old method. I'm going to see what I can do to improve the efficiency of the parsing. I would wait to review this until I have found a better solution. |
I have updated the code to run MUCH MUCH faster. Instead of processing each row individually within Papa parse, I parse the entire CSV then process each row. This increased the speed by a factor of 10. I tested on WINDON, and previously it was taking ~550secs to run CLE. Now, it only takes ~45secs. You can now review 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatively small thing but necessary. Everything else looks good! 🙌
} | ||
|
||
row = row.trim(); | ||
const trimmedRow = row.map(value => value.trim()); | ||
const gtfsRow = new GtfsRow(trimmedRow); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're recreating the GtfsRow for each row, which defeats the memory saving purpose (creating the class n time + object n time, instead of class 1 time + object n time). The keys will be copied the same way than before the optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point. I will update this
Feature
gtfsNodeLib nows uses Papa Parse to parse through the CSV instead of the internal CSV parser.