oh and in case you're wondering why this matters to me, it's because i wrote this tool called undertime which gives you different times in different zones, as a one-shot command-line tool. most of its time is spent building those regexes it doesn't use. :)
i'm now lazily loading dateparser itself, but the user can definitely "feel" when it hits that corner case.
We use a lot of data objects in our libraries that usually load from json and moving them to lazy-load instead of load-on-import has been helpful. It's not too hard and it's been reliable for us.
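For illustration, here is a minimal sketch of that load-on-first-use pattern. The names (`_RAW_JSON`, `load_offsets`, `offset_for`) and the inline JSON string are hypothetical stand-ins for a data file shipped with a library; this is not code from dateparser or from our libraries.

```python
import json
from functools import lru_cache

# Hypothetical stand-in for a JSON data file shipped with a library.
_RAW_JSON = '{"UTC": 0, "EST": -18000, "CET": 3600}'


@lru_cache(maxsize=None)
def load_offsets():
    """Parse the JSON on the first call only; later calls reuse the cached dict."""
    return json.loads(_RAW_JSON)


def offset_for(name):
    """Look up an offset, triggering the one-time load if it has not happened yet."""
    return load_offsets()[name]
```

Importing a module written this way costs nothing beyond defining the functions; the parsing cost is paid once, by the first caller that actually needs the data.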
hi!
first, thanks for this awesome project, it's really useful and powerful and i am grateful to not have to write this stuff myself. :)
i'm opening this issue because i feel there's an inherent performance cost to be paid whenever we even load the dateparser library:
compare with similar libraries:
a quick profiling seems to show it spends an inordinate amount of time compiling regular expressions:
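For anyone who wants to take this kind of measurement themselves, a rough sketch follows. The helper name `import_seconds` is made up for this example; it times a single import in a fresh interpreter so that already-loaded modules don't skew the number. CPython's `-X importtime` option and the `cProfile` module give a more detailed breakdown.

```python
import subprocess
import sys


def import_seconds(module_name):
    """Return how long `import module_name` takes in a fresh Python process."""
    code = (
        "import time; t0 = time.perf_counter(); "
        f"import {module_name}; "
        "print(time.perf_counter() - t0)"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    return float(result.stdout.strip())


# Example (assumes the named module is installed):
# print(import_seconds("dateparser"))
```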
basically, it seems we're spending a lot of time compiling regular expressions. individually, those don't matter so much (percall=1ms) but we seem to be doing hundreds of those. I think it might be related to the `timezone_parser.py` file (`build_tz_offsets`?) but i stopped digging there.
the exact source is a little beside the point: shouldn't just importing the module be safe enough, performance wise? i know we load a default parser, but that's not what's eating us here, but rather a bunch of globals in `timezone_parser.py`... it seems to me those could be lazily loaded, at least?