Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User comments in config file lost after writing to config via pgcli #1240

Closed
ghost opened this issue Feb 4, 2021 · 12 comments
Closed

User comments in config file lost after writing to config via pgcli #1240

ghost opened this issue Feb 4, 2021 · 12 comments

Comments

@ghost
Copy link

ghost commented Feb 4, 2021

Description

First of all, thanks for a great tool, I use it a lot.

User comments to the pgcli config file are getting lost after the config file gets modified via pgcli itself (e.g. adding a named query). According to #1162 (comment) they should remain. Additionally, all whitelines from at least the [named queries] section onward get lost as well.

I can reproduce this on a clean virtualenv and with a fresh config file.

Reproduce

Log from my reproduction attempts:

  • Removed previous config file
  • Created new temp virtualenv (using the system Python 3.6.9), installed new pgcli (pip install pgcli)
  • Started pgcli, this created a new config file
  • Config file (from [named queries] section onward; same goes for the config file snippets below) at this point:
    # Named queries are queries you can execute by name.
    [named queries]
    
    # DSN to call by -D option
    [alias_dsn]
    # example_dsn = postgresql://[user[:password]@][netloc][:port][/dbname]
    
    # Format for number representation
    # for decimal "d" - 12345678, ",d" - 12,345,678
    # for float "g" - 123456.78, ",g" - 123,456.78
    [data_formats]
    decimal = ""
    float = ""
    
  • Save new named query from pgcli: \ns foo bar. Config file snippet after this:
    [named queries]
    foo = bar
    [alias_dsn]
    [data_formats]
    decimal = ""
    float = ""
    • NOTE: the "# Named queries are queries you can execute by name." comment (and the preceding whiteline) above the [named query] section header is now gone as well (while the "# style classes for colored table output" comment above output.header = "#00ff5f bold" remained in tact. As did all other comment and whitelines earlier in the file)
  • Close pgcli (just in case), add user comment above the foo = bar named query entry. Config file snippet after this:
    [named queries]
    # My comment
    foo = bar
    [alias_dsn]
    [data_formats]
    decimal = ""
    float = ""
  • Start pgcli again, add new named query: \ns bar SELECT 1. Config file snippet after this:
    [named queries]
    foo = bar
    bar = SELECT 1
    [alias_dsn]
    [data_formats]
    decimal = ""
    float = ""
  • Close pgcli (just in case), add record and comment to [alias_dsn]. Config file snippet after this:
    [named queries]
    foo = bar
    bar = SELECT 1
    [alias_dsn]
    # I feel like I'm getting lost
    baz = postgresql://foo@localhost/postgres
    [data_formats]
    decimal = ""
    float = ""
  • Start pgcli again, add new named query: \ns baz SELECT 2. Config file snippet after this:
    [named queries]
    foo = bar
    bar = SELECT 1
    baz = SELECT 2
    [alias_dsn]
    baz = postgresql://foo@localhost/postgres
    [data_formats]
    decimal = ""
    float = ""

Did another quick test (only some of the steps from above) with a Python 3.9.0 (via pyenv) temp virtualenv, same results. Tried this one with both a pip install pgcli and a pip install git+https://github.com/dbcli/pgcli@master.

The same happens in my proper config file using my proper pgcli install, which contains many more named queries and DSN aliases.

Environment, verison info

  • OS: Ubuntu 18.04.5 (x64)

  • pgcli:

    • In most of the tests above:
      Server: PostgreSQL 9.6.13
      Version: 3.1.0
      
    • My proper install:
      Version: 3.0.0
      
  • pip freeze of my proper install:

``` cffi==1.14.0shared libraries cli-helpers==1.2.1 click==7.1.1 configobj==5.0.6 cryptography==2.9.2 dbus-python==1.2.16 humanize==2.3.0 importlib-metadata==1.7.0 jeepney==0.4.3 keyring==21.2.1 pgcli==3.0.0 pgspecial==1.11.9 pkg-resources==0.0.0 prompt-toolkit==3.0.8 psycopg2==2.8.5 pycparser==2.20 Pygments==2.6.1 SecretStorage==3.1.2 setproctitle==1.1.10 six==1.14.0 sqlparse==0.3.1 tabulate==0.8.7 terminaltables==3.1.0 wcwidth==0.1.9 zipp==3.1.0 ```
@ERYoung11
Copy link
Contributor

ERYoung11 commented Feb 23, 2021

I have looked at this some and am still puzzled about what's going on. I'm just not advanced enough in python to get all the way at this point. I'll leave some notes here in case a smarter developer can see what I cannot.

I can tell you that somehow the default pgclirc that comes with pgcli and the changed configuration file are getting mixed together. I suspect that is what is causing issues.

Here are the salient tests I have done so far:

  • When running just ConfigObj, it appears to work just fine.

    config['named queries']['foo']='bar'
    print(config['named queries'])
    config.write()

  • Running just NamedQueries by itself works fine.

    NamedQueries.instance = NamedQueries.from_config(config)
    NamedQueries.instance.list()
    NamedQueries.instance.save("alice","bob")

  • if you have a valid configuration file, then you add a section to the default pgclirc file that section shows up in the data structure that holds the configuration in the NamedQueries object. I would expect that it wouldn't because I would expect that pgcli would ONLY read from the user configuration file after the original creation of that user configuration file.

Steps to reproduce:

  • make sure user config file is == default config file.
  • I changed the default config file (pgclirc, which comes with the software) by renaming the section [alias_dsn] to be called [lias_dsn].
  • Inserting a print(NamedQueries.instance.config) command at the top of the ioscommand/list_named_queries function, produces a dump of the config (not including comments) that ends with this:
    'named queries': {'blah': 'asdf'}, 'lias_dsn': {}, 'data_formats': {'decimal': '', 'float': ''}, 'alias_dsn': {}}
  • (note that it includes the section alias_dsn from the user config file AND the lias_dsn from the default pgclirc that comes with pgclirc

Note that I also saw this sort of reaction when I took the user config file and truncated most of it, leaving just the [named queries] section and the sections that follow it. When I would start the program and do a save of a named query, ALL of the file would reappear, even the things I had cut out.

@ERYoung11
Copy link
Contributor

@pasenor @j-bennet (I see you're active so I'll ask you for your guidance).

This issue is caused by a known issue in ConfigObj.py (DiffSK/configobj#171, submitted in 2018)

There is no more recent published version of ConfigObj than what you are using and it has this bug. There are more recent updates to their github.

This can be fixed by removing the merges in config.py/load_config() and just pulling in the user config file (which will be the default file if they had nothing at startup). However if you do this and the user were to truncate their config file you would not get most of the default settings (colors, etc).

So, recommendations on how to move forward?

  • Move the default configurations into the pgcli code, rather than re-reading it from a file every time?
  • Try a replacement for ConfigObj?
  • Work with ConfigObj maintainers to get a newer version published?
  • Do your own merge of configurations?
  • Other options?

@j-bennet
Copy link
Contributor

I looked at the ConfigObj repo, and yeah, not only it has issues opened in 2014, but it has PRs open in 2016 still hanging around. I bumped the issue though, we will see if that helps at all. A couple other things we could try:

  • Submit a PR into their repo and hope that it's handled quicker than the issue.
  • Fork their repo and fix the issue. Point our requirements to the fork. We can't release it to pypi though.

If ConfigObj maintainers remain unresponsive, looking into ConfigObj replacements would be my preferred solution.

@gfrlv
Copy link
Member

gfrlv commented Feb 24, 2021

Do we really need this fancy merge-with-comments feature? It is difficult to do right and ties us to the few libraries like ConfigObj, and then it doesn't work. Everything will be simpler if we decouple configuration objects (that can be merged) from files (that can only be copied or edited directly).
If I'm not missing anything, we only rewrite config on two occasions:

  • when the user's config is empty -- in this case, why not just copy the file
  • when adding a named query. But then, it's very easy to just edit the file: locate the [named queries] section, find its end and append the query. One line added, the rest of the files is guaranteed to remain intact.

Then we can replace ConfigObj with either toml or configparser from the standard library

@ERYoung11
Copy link
Contributor

Either way, I'm willing to try helping if you can point me in the direction you want. I'm just a beginner with a keyboard so I don't feel qualified to make architectural decisions for your product.

At this point, I don't really want to take on fixing ConfigObj if that were even possible.

@j-bennet
Copy link
Contributor

j-bennet commented Feb 24, 2021

It's quite possible we're doing some config merging with default when it's not needed, just because it seemed easy and "why not". I'd have to look through the code more closely to see why we even need that merge-then-write. We do need to merge-on-read in case user provided --pgclirc when running cli.

Update: scratch the above, we have a reasonable case for merging configs. We always want to merge existing config with default pgclirc, because it's possible that pgclirc had some newly added settings. Argh.

@gfrlv
Copy link
Member

gfrlv commented Feb 25, 2021

We certainly always want to merge configs in-memory, as dicts. But I don't think we write the results into user's config file if it already exists: https://github.com/dbcli/pgcli/blob/master/pgcli/config.py#L34 the parameter overwrite here is False throughout the whole code base.

And that's good, overwriting user's configuration without a trace is not something I would want any program to do to me. We can always show warning and let the user fix things manually. At least, if there are any automated edits, the old values should remain as comments, but that's also not something ConfigObj knows how to do.

@ERYoung11
Copy link
Contributor

How about I try this:

  • move the default pgclirc into code.
  • generate a user config file if it doesn't exist, from the in mem
  • otherwise, pull in the user config and merge in mem with the in mem defaults
  • implement some methods to allow update to user config only for CRUD of named queries

Do you think that would work and be maintainable?

@gfrlv
Copy link
Member

gfrlv commented Feb 25, 2021

move the default pgclirc into code.

What do you mean, remove pgclirc from the source tree and store it as a python object, without comments? But then how are we going to initialize the user's config?

I'd start with your last bullet point, make a more gentle config editing function and fix the actual bug with disappearing comments. Then we can see how it looks and decide if we want to refactor things further.

Also, shouldn't all this config juggling move to cli_helpers? It's quite similar in pgcli and mycli, and can be improved in both

@ERYoung11
Copy link
Contributor

move the default pgclirc into code.

What do you mean, remove pgclirc from the source tree and store it as a python object, without comments? But then how are we going to initialize the user's config?

My thought, perhaps naively, was that we could have both comments and configurations in the code so that we could generate the default file on demand for the user. Then as the code runs we basically ignore the comments and pull in any user configs to overwrite the default configuration sections that they have changed.

I'd start with your last bullet point, make a more gentle config editing function and fix the actual bug with disappearing comments. Then we can see how it looks and decide if we want to refactor things further.

The issue with doing just this part is that today, the code doesn't know what the default settings are as it pulls them in from the default file. So, if a user has a full config file, the code is fine. However if the user has (for example) only 2 entries in the config file (and we're not merging in the default file, which is where the error is) then the code won't have values for anything else.

Also, shouldn't all this config juggling move to cli_helpers? It's quite similar in pgcli and mycli, and can be improved in both

Maybe :) I haven't looked at mycli or cli_helpers. Again, if you all can provide me direction, I'll go there.

@j-bennet
Copy link
Contributor

j-bennet commented Feb 26, 2021

Fixed via #1251.

This is by no means a perfect solution. I wish that configparser in stdlib supported comments on write, unfortunately it does not. I'd like to try and submit a PR to https://github.com/DiffSK/configobj to fix this issue, but it will not happen immediately.

@j-bennet
Copy link
Contributor

I submitted a PR into ConfigObj repo. We'll see what happens with that.

DiffSK/configobj#212

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants