Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Epic: Improve error handling and configuration validation #10102

Closed
6 tasks
hqdncw opened this issue Nov 22, 2023 · 6 comments
Closed
6 tasks

Epic: Improve error handling and configuration validation #10102

hqdncw opened this issue Nov 22, 2023 · 6 comments
Labels
awaiting response we are waiting for your reply, please respond! :) epic Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc

Comments

@hqdncw
Copy link

hqdncw commented Nov 22, 2023

Summary / Background

Confusing error messages and inadequate configuration validation can lead to a poor user experience when working with remote storage providers. The current error messages are not always clear or actionable, making it difficult for users to diagnose and resolve issues.

Scope

Our goal is to provide users with clearer, more actionable error messages and consistent error handling across all providers. Additionally, we will enhance the DVC configuration validation process to prevent errors from occurring in the first place. By doing so, we can improve the overall user experience and reduce friction when working with remote storage providers.

Assumptions

  • By using YAML instead of INI, we can leverage the existing JSON schema validation capabilities of modern IDEs, which would otherwise require a custom linter for .dvc/config. This means that DVC users can benefit from better configuration file validation and error reporting without having to install or configure any additional tools.
  • Upgrading to YAML allows us to reuse the strictyaml module to provide more informative and helpful error messages when config files contain invalid data types (thanks to @skshetry). These improved error messages can help users quickly identify and fix issues in their configurations, leading to a smoother and more efficient development experience.

Open Questions

  • Should we make YAML the default format for .dvc/config file in future versions of DVC?
  • Should we use pydantic or msgspec?

Blockers / Dependencies

General Approach

Steps

Must have (p1)

Tasks

Preview Give feedback

Optional / followup (p2)

Tasks

Preview Give feedback

Timelines

@hqdncw hqdncw added the epic Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc label Nov 22, 2023
@hqdncw
Copy link
Author

hqdncw commented Nov 22, 2023

Related: #9606, #5531, #4027

@skshetry
Copy link
Member

Confusing error messages and inadequate configuration validation can lead to a poor user experience when working with remote storage providers. The current error messages are not always clear or actionable, making it difficult for users to diagnose and resolve issues.

Could you please share some examples here?

@hqdncw
Copy link
Author

hqdncw commented Nov 22, 2023

Confusing error messages and inadequate configuration validation can lead to a poor user experience when working with remote storage providers. The current error messages are not always clear or actionable, making it difficult for users to diagnose and resolve issues.

Could you please share some examples here?

There may be opportunities to improve configuration validation to ensure that all necessary settings are properly configured before use.

iterative/dvc-s3#26
iterative/dvc-s3#25 (comment)
iterative/dvc-webdav#24

@skshetry
Copy link
Member

There may be opportunities to improve configuration validation to ensure that all necessary settings are properly configured before use.

Most of those are runtime validations. It may be difficult to provide a good error message because of the layers involved (fsspec<>dvc config translation, lack of proper exception on fsspec sides, too many types of auth configs, etc).

I associate config validations to mean the static validations. For runtime validations, take a look at discussions on a separate command: #8235.
For those runtime validations, do we even need to change config format or need pydantic at all?
How would they help?

@skshetry skshetry added the awaiting response we are waiting for your reply, please respond! :) label Nov 24, 2023
@hqdncw
Copy link
Author

hqdncw commented Dec 5, 2023

There may be opportunities to improve configuration validation to ensure that all necessary settings are properly configured before use.

Most of those are runtime validations. It may be difficult to provide a good error message because of the layers involved (fsspec<>dvc config translation, lack of proper exception on fsspec sides, too many types of auth configs, etc).

I associate config validations to mean the static validations. For runtime validations, take a look at discussions on a separate command: #8235. For those runtime validations, do we even need to change config format or need pydantic at all? How would they help?

While we don't necessarily need Pydantic for runtime validations, switching all schemas to Pydantic would still have some benefits. For example, it would simplify our workflow by eliminating the need to manually synchronize the Pydantic schema used in dvcyaml-schema and the Voluptuous schema, saving us time and improving the development experience. However, I consider this change to be low priority.

@skshetry
Copy link
Member

I don't think we need a meta issue, and the problems are being tracked on individual issues. Closing...

@skshetry skshetry closed this as not planned Won't fix, can't repro, duplicate, stale Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting response we are waiting for your reply, please respond! :) epic Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc
Projects
None yet
Development

No branches or pull requests

2 participants