top-level plot collections: fail on empty dict #10233

skshetry · 2024-01-12T02:58:47Z

If the top-level plots section contains an empty dict, as follows:

# dvc.yaml
plots:
- {}

dvc fails with a traceback that looks like this:

StopIteration                             Traceback (most recent call last)
Cell In[3], line 1
----> 1 repo.index._plot_sources

File ~/Projects/dvc/.venv/lib/python3.12/site-packages/funcy/objects.py:25, in cached_property.__get__(self, instance, type)
     23 if instance is None:
     24     return self
---> 25 res = instance.__dict__[self.fget.__name__] = self.fget(instance)
     26 return res

File ~/Projects/dvc/dvc/repo/index.py:448, in Index._plot_sources(self)
    445 from dvc.repo.plots import _collect_pipeline_files
    447 sources: List[str] = []
--> 448 for data in _collect_pipeline_files(self.repo, [], {}).values():
    449     for plot_id, props in data.get("data", {}).items():
    450         if isinstance(props.get("y"), dict):

File ~/Projects/dvc/dvc/repo/plots/__init__.py:490, in _collect_pipeline_files(repo, targets, props, onerror)
    488         dvcfile_defs_dict[elem] = None
    489     else:
--> 490         k, v = next(iter(elem.items()))
    491         dvcfile_defs_dict[k] = v
    493 resolved = _resolve_definitions(
    494     repo.dvcfs,
    495     targets,
   (...)
    499     onerror=onerror,
    500 )

StopIteration:

The empty dict is not supported by dvc, and we should fail validation in this case.
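
To make the failure mode concrete, here is a minimal stand-alone sketch of the loop shown in the traceback, together with a hypothetical up-front check. The names `collect_defs` and `validate_entries` are illustrative and not part of dvc, and the `isinstance(elem, str)` branch condition is an assumption, since that line is not visible in the traceback.

```python
from typing import Any, Dict, List, Optional, Union

PlotEntry = Union[str, Dict[str, Any]]


def collect_defs(entries: List[PlotEntry]) -> Dict[str, Optional[Any]]:
    """Simplified stand-in for the loop shown in the traceback."""
    defs: Dict[str, Optional[Any]] = {}
    for elem in entries:
        if isinstance(elem, str):  # assumed condition; not visible in the traceback
            defs[elem] = None
        else:
            # An empty dict yields an exhausted iterator, so next() raises
            # StopIteration here.
            k, v = next(iter(elem.items()))
            defs[k] = v
    return defs


def validate_entries(entries: List[PlotEntry]) -> None:
    """Hypothetical check: reject dict entries that carry no key at all."""
    for index, elem in enumerate(entries):
        if isinstance(elem, dict) and not elem:
            raise ValueError(
                f"plots[{index}]: expected a plot ID or path, got an empty dict"
            )


try:
    collect_defs([{}])
except StopIteration:
    print("StopIteration, as in the report")

try:
    validate_entries([{}])
except ValueError as exc:
    print(f"validation error: {exc}")
```

Failing in a validation step like this gives the user a message that points at the offending entry instead of a bare StopIteration from deep inside plot collection.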

codecov · 2024-01-12T03:05:23Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d711ecd) 90.47% compared to head (3a82fba) 90.23%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #10233      +/-   ##
==========================================
- Coverage   90.47%   90.23%   -0.25%     
==========================================
  Files         493      493              
  Lines       37595    37596       +1     
  Branches     5455     5455              
==========================================
- Hits        34014    33924      -90     
- Misses       2953     3022      +69     
- Partials      628      650      +22

shcheklein

Thanks @skshetry for the quick turnaround and for catching the empty YAML case ...

Can we make it a validation error, though? Otherwise it's easy to miss this kind of condition.
I think we need a test for this ... it's easy to add, and it's just as easy to hit the same regression later in this (edge) case.

shcheklein · 2024-01-12T03:20:17Z

Also, please add some description to PR :)

skshetry · 2024-01-12T03:27:40Z

There are just too many edge cases with schema and too many negative cases, that I don't find it worth it to cover it with tests. :)

And the schema is very declarative.

shcheklein · 2024-01-12T03:32:29Z

> There are just too many edge cases with schema and too many negative cases, that I don't find it worth it to cover it with tests. :)

that's exactly what we need I think, especially as we hit more and more edge cases like this. It's totally fine to have 1000s of tests for this. And even consider generating different combinations, etc if needed.

skshetry · 2024-01-12T03:33:00Z

I'd rather look into property-based testing than write 1000s of tests.
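
As a rough illustration of the idea, the sketch below states one property, "every input the validator accepts must parse without raising", and checks it over generated combinations. It uses exhaustive generation with the standard library's `itertools.product` rather than a real property-based testing library such as Hypothesis, and `is_valid`, `parse`, and `CANDIDATES` are hypothetical stand-ins, not dvc's schema or parser.

```python
from itertools import product
from typing import Any, Dict, List, Optional


def is_valid(entries: List[Any]) -> bool:
    """Hypothetical rule: each entry is a string or a dict with exactly one key."""
    return all(
        isinstance(e, str) or (isinstance(e, dict) and len(e) == 1)
        for e in entries
    )


def parse(entries: List[Any]) -> Dict[str, Optional[Any]]:
    """Simplified parser mirroring the loop from the traceback."""
    defs: Dict[str, Optional[Any]] = {}
    for e in entries:
        if isinstance(e, str):
            defs[e] = None
        else:
            k, v = next(iter(e.items()))
            defs[k] = v
    return defs


# A small pool of building blocks, including the problematic empty dict.
CANDIDATES = ["plot.csv", {}, {"plot.csv": {"x": "step"}}, {"a": None, "b": None}]


def check_agreement(max_len: int = 3) -> int:
    """Return how many generated inputs were accepted and parsed cleanly."""
    checked = 0
    for n in range(max_len + 1):
        for combo in product(CANDIDATES, repeat=n):
            entries = list(combo)
            if is_valid(entries):
                parse(entries)  # the property: this call must not raise
                checked += 1
    return checked


check_agreement()
```

If the validator were loosened to accept the empty dict, `check_agreement` would surface the StopIteration without anyone having to write that specific negative case by hand.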

skshetry · 2024-01-12T03:45:09Z

> that's exactly what we need I think, especially as we hit more and more edge cases like this. It's totally fine to have 1000s of tests for this. And even consider generating different combinations, etc if needed.

Negative tests will just test voluptuous. Here, we'll be testing if Required works or not.

shcheklein · 2024-01-12T03:53:38Z

> Negative tests will just test voluptuous. Here, we'll be testing if Required works or not.

I would argue that you are testing the DVC schema in this case (e.g. that someone does not drop the Required). But I agree that if this becomes part of the validation, it's way better and less error-prone.

dvc/repo/plots/__init__.py

shcheklein

I would still add a test. Your call on this.

skshetry · 2024-01-12T04:11:41Z

> I would still add a test. Your call on this.

The problem here is that there is a mismatch between schema validation and parsing/deserialization.
Even if we add a test for schema validation, there is no guarantee that the deserialization logic will be correct.

I don't think they should be two different things. They should be one single thing, and we should just depend on validation to rule those edge cases out.
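
One way to picture "one single thing" is a parser that is also the validator, so the two cannot drift apart. The following is only a sketch under that assumption; `PlotDefinitionError`, `parse_entry`, and `parse_plots` are hypothetical names and do not reflect how dvc actually structures its schema and loading code.

```python
from typing import Any, Dict, List, Optional, Tuple


class PlotDefinitionError(ValueError):
    """Hypothetical error type for malformed top-level plot entries."""


def parse_entry(elem: Any) -> Tuple[str, Optional[Any]]:
    """Validate and deserialize one entry in a single step."""
    if isinstance(elem, str):
        return elem, None
    if isinstance(elem, dict) and len(elem) == 1:
        key, value = next(iter(elem.items()))
        if isinstance(key, str):
            return key, value
    # Anything else, including the empty dict, is rejected here.
    raise PlotDefinitionError(f"invalid plot entry: {elem!r}")


def parse_plots(entries: List[Any]) -> Dict[str, Optional[Any]]:
    """Build the definitions mapping, failing on the first invalid entry."""
    return dict(parse_entry(e) for e in entries)


print(parse_plots(["a.csv", {"b.csv": {"y": "loss"}}]))
```

Because the only way to obtain a parsed definition is through `parse_entry`, any input that reaches later code has necessarily passed the same checks.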

* top-level plot collections: skip empty dict
* tighten schema for plots; require key name

top-level plot collections: skip empty dict

cfeb6bc

shcheklein requested changes Jan 12, 2024

shcheklein assigned skshetry Jan 12, 2024

shcheklein added the bug (Did we break something?) and A: plots (Related to the plots) labels, and added and then removed the product: Studio (Integration with Studio) label, Jan 12, 2024

shcheklein reviewed Jan 12, 2024

tighten schema for plots; require key name

3a82fba

skshetry force-pushed the plot-skip-empty-dict branch from b91ff5d to 3a82fba Compare January 12, 2024 04:00

skshetry changed the title ~~top-level plot collections: skip empty dict~~ top-level plot collections: fail on empty dict Jan 12, 2024

shcheklein approved these changes Jan 12, 2024

skshetry merged commit b46bd9c into iterative:main Jan 12, 2024
22 checks passed

skshetry deleted the plot-skip-empty-dict branch January 12, 2024 04:11

BradyJ27 pushed a commit to BradyJ27/dvc that referenced this pull request Apr 22, 2024

top-level plot collections: fail on empty dict (iterative#10233)

2b9d877

* top-level plot collections: skip empty dict
* tighten schema for plots; require key name
