Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add .cfg/.ini params parser #10175

Closed
wants to merge 1 commit into from
Closed

Conversation

percevalw
Copy link

@percevalw percevalw commented Dec 15, 2023

Hi,

Following #7122, this PR adds support for .ini / .cfg files (INI format).
While we don't assume that nested sections of INI files follow the same convention as TOML (i.e., [section.nested]), we still parse them as nested dictionaries because it should not affect non-nested configurations and it allows more complex overrides on files that do present nesting (àla spaCy config.cfg files).

Documentation PR: iterative/dvc.org#5060

Copy link

codecov bot commented Dec 15, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (3e4e143) 90.61% compared to head (945c69c) 90.64%.

Files Patch % Lines
dvc/utils/serialize/_ini.py 97.56% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10175      +/-   ##
==========================================
+ Coverage   90.61%   90.64%   +0.02%     
==========================================
  Files         499      501       +2     
  Lines       37880    37992     +112     
  Branches     5505     5414      -91     
==========================================
+ Hits        34326    34436     +110     
  Misses       2912     2912              
- Partials      642      644       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@skshetry
Copy link
Member

skshetry commented Jan 25, 2024

@percevalw, thank you for taking the time to create the pull request.

Unfortunately, I'm leaning towards declining this PR as it will be difficult for us to maintain as the configuration parsing scope increases. This is because:

  1. There is no single standard ini format.
  2. There is no support for native types, so we'll have to do a custom parsing. This makes it tricky to modify.
  3. Comments are not preserved on the update.
  4. Given the support for TOML, YAML, JSON, Python, and Hydra, there's less demand for .ini files. And I expect it to decline further in the future.

We are a small team, and we have struggled to maintain all the implemented params formats, especially in the case of Python parsing, which has been left unmaintained for an extended period.

Given that, I'm hesitant to introduce one more format, particularly one with a scope that may potentially increase over time at this time.

@percevalw
Copy link
Author

Hi, thank you for your review. I understand the difficulty of maintaining a very specific feature like this one.

Would you be open in this case to considering a third party serialization ecosystem, using entry points for example?
I believe such a mechanism would have a lower impact on the maintenance of dvc, and could accommodate for custom requirements.

For instance, having in a "my-custom-dvc-ini-serializer" package a pyproject.toml field looking like the following:

[project.entry-points."dvc_serializers"]
"cfg" = "my_custom_dvc_ini_serializer"

with my_custom_dvc_ini_serializer/__init__.py containing the changes suggested in this PR, and having dvc look for such entry points when running dvc/utils/serialize/__init__.py to update its dictionaries.
This could look like :

... # content of serialize/__init__.py

import sys
from importlib.metadata import entry_points

for ep in entry_points(group='dvc_serializers'):
    ext = ep.load()
    PARSERS.update(ext.PARSERS)
    LOADERS.update(ext.LOADERS)
    MODIFIERS.update(ext.MODIFIERS)
    DUMPERS.update(ext.DUMPERS)

@skshetry
Copy link
Member

@percevalw, unfortunately, we cannot provide a stable API/hook for the config API. The PARSERS/LOADERS, et al. are utilities meant for internal usage and were not designed thinking about extensibility. They are not in a very good state either.

While I do understand your use case, we don't see the demand or interests in #7122. There has been discussion about providing an API in #6506 before, and reusing the current utilities might lock us in when we come to improve parameters' workflow in the future.

@skshetry
Copy link
Member

Closing as stale.

@skshetry skshetry closed this Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants