Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

See how much new version of python-benedict impacts us #6423

Closed
skshetry opened this issue Aug 13, 2021 · 8 comments · Fixed by #6706
Closed

See how much new version of python-benedict impacts us #6423

skshetry opened this issue Aug 13, 2021 · 8 comments · Fixed by #6706
Labels
A: experiments Related to dvc exp optimize Optimizes DVC research

Comments

@skshetry
Copy link
Member

skshetry commented Aug 13, 2021

The recent release of python-benedict has monkeypatched JSONEncoder to always use python-based encoder (instead of C-based encoder.

This monkeypatching happens during import.

from benedict import benedict

As we only use it during exp run --set-params, we need to see how much it affects us. Also how important is the performance for the exp run? Do we save large directories and files during exp run?


Note: Simple benchmark shows it causes 2x-4x degradation.

@efiop
Copy link
Contributor

efiop commented Aug 13, 2021

@skshetry Should we put an upper cap for it for now?

also, for the record: this is clearly +1 to the microbenchmarks that you've proposed before.

@isidentical isidentical added optimize Optimizes DVC research A: experiments Related to dvc exp labels Aug 15, 2021
@skshetry
Copy link
Member Author

µbenchmarks do show a 2-4x slowdown, but it still feels theoretical. We lack benchmarks for pipelines, so it's hard to tell.

@skshetry
Copy link
Member Author

skshetry commented Aug 30, 2021

Discussed with @efiop to get rid of the dependency, we only use it in --set-params. We need to see what it'll take to replace it.

@pmrowla
Copy link
Contributor

pmrowla commented Aug 30, 2021

May be worth revisiting #4883

@mattlbeck
Copy link
Contributor

mattlbeck commented Sep 14, 2021

Benedict's merge module appears to be fairly self-contained https://github.com/fabiocaccamo/python-benedict/blob/c98c471065ae84b4752a87f1bd63fe3987783663/benedict/core/merge.py . A possible stopgap could be to remove the bendict import and copy this module into the DVC codebase instead. I have done something similar for keypath parsing currently in #6521 (8d219dc4c4be678f8e12a57f766d282d13997443).

@skshetry
Copy link
Member Author

skshetry commented Sep 14, 2021

I think I'm more inclined on working around benedict than replacing it. We should rollback this particular change that it does on import:

https://github.com/fabiocaccamo/python-benedict/blob/c98c471065ae84b4752a87f1bd63fe3987783663/benedict/dicts/__init__.py#L283-L285

@daavoo
Copy link
Contributor

daavoo commented Sep 14, 2021

It is worth having a dependency that requires workarounds if we only use ~20 lines of that dependency?

Considering it causes other side effects/bugs like #6378 #6476

@skshetry
Copy link
Member Author

skshetry commented Sep 14, 2021

@mattlbeck, we still need to unflatten those and it comes from keylist_util.py.

I have not been able to work on this, and will not be able to work for few days at least
Easier way is to workaround benedict, that should help us move ahead #6476 and #6521.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: experiments Related to dvc exp optimize Optimizes DVC research
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants