-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
command: Show dvc list output in pretty JSON format #6647
Conversation
e4a8eff
to
fff3317
Compare
dvc/command/ls/__init__.py
Outdated
@@ -72,6 +84,11 @@ def add_parser(subparsers, parent_parser): | |||
list_parser.add_argument( | |||
"--show-json", action="store_true", help="Show output in JSON format." | |||
) | |||
list_parser.add_argument( | |||
"--show-pretty-json", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just prettify the JSON by default by checking if we are in isatty
mode and then prettify that output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! The patch was updated
Btw, why not just do |
requirements/default.txt
Outdated
@@ -44,3 +44,4 @@ typing_extensions>=3.10.0.2 | |||
fsspec[http]>=2021.8.1 | |||
aiohttp-retry==2.4.5 | |||
diskcache>=5.2.1 | |||
pygments>=2.10.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We depend on rich
, so you could just do something like following:
from rich.json import JSON
obj = JSON.from_data(data)
ui.write(obj, styled=True)
fff3317
to
a268c79
Compare
dvc/command/ls/__init__.py
Outdated
if sys.stdout.isatty(): | ||
from rich.json import JSON | ||
|
||
ui.write(JSON.from_data(entries), styled=True) | ||
else: | ||
ui.write(json.dumps(entries)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this to dvc/ui/__init__.py
as ui.write_json
, so that we can reuse it in other parts of dvc.
Also, support for indent=
and prettify=True|False
that keeps the old behaviour would be nice to have. π
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we need to bump rich
requirement, as from_data
is only available since 10.9.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that makes sense. patch updated.
a268c79
to
486c1b3
Compare
The PR looks good. @iterative/dvc, if you have any opinions if dvc should format json. Otherwise, we'll be formatting in other places as well. |
dvc/command/ls/__init__.py
Outdated
import json | ||
|
||
ui.write(json.dumps(entries)) | ||
if sys.stdout.isatty(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could hide this check under ui.write_json (if rich doesn't do that already for us) CC @skshetry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though that would get a little confusing with prettify
, as it is not clear if it should enforce it or be more like "prettify, but only if it is a tty". Still, it feels like tty
check belongs to ui.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, is there any significant performace hit for prettifying? If not, maybe we should indeed consider just showing pretty json by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments!. It is hard to tell the performance gap between json.dumps and prettifying. I can do some evaluation on it later. Besides that, I tend to believe the prettifying is not the bottleneck since dvc list should fetch repo info from remote first which may spend more time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did two simple tests on it.
-
A python array has 100000 element, each element is a dict which contains 3 key-value. json.dump takes about 0.05s, rich.json.from_date takes about 2.6s
-
dvc list takes about 1.8s to fetch https://github.com/iterative/example-get-started while prettifying takes 0.03s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think performance should be a concern. We could choose a certain threshold like 1000 or so entries where we fallback to json.dumps(indent=4)
instead of prettifying it. No one should expect us to prettify the 100000 many entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regretfully dvc list
(and others) are not yet fully optimized themselves and having 100K entries in the resulting json is not unusual for our use case, so looks like prettifying is kinda expensive to do always. Plus we have other projects depending on this json output, so that migth be slowing them down too. But isatty
check kinda solves that, because if someone will be using the json output - they will redirect it somewhere anyway, so we should be safe and efficient, while being pretty. That should make the threshold approach unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments! I move the isatty into ui init.py and the patch updated. :)
1. update ls __init__.py to output pretty JSON format 2. introduce write_json function into ui __init__.py
486c1b3
to
40f04df
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this today in the team, and everyone is okay with this getting merged. For consistency, we'd like this to be used in all of the other commands that has --json
flags (though in a separate PR).
@eggqq007, would you like to create a PR (or, issue) for #6647 (review)? Thanks for this PR. |
Sure, new issue in #6712 |
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Closes iterative/dvc.org#2838
Closes #6646
Thank you for the contribution - we'll try to review it as soon as possible. π