Add lens for reading termcap-style databases #274

bodgit · 2015-07-31T15:05:55Z

This is my stab at fixing #271. It seems to parse everything I've thrown at it, including a ~900 KB termcap file, and I can also add new records to files. However, unless I disable type checking, I get:

Syntax error in lens definition
./getcap.aug:34.2-.62:Failed to compile record
./getcap.aug:34.43-.54:exception: ambiguous iteration
      Iterated regexp: /(([^:\n]|\\\\:)+|[^:\n]*(\\"[^\n]+\\"[^:\n]*)+)(:|:[ \t]*\\\\\n[ \t]*:)/
      '\\:\\:\\::' can be split into
      '\\:|=|\\:\\::'

     and
      '\\:\\:\\::|=|'

    Iterated lens: ./getcap.aug:33.19-.61:

Any advice on cleaning that up would be appreciated.
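
For readers following the error above, here is a rough Python sketch (not Augeas code) of why the type checker sees an ambiguity. It assumes the field and separator patterns translate to Python's `re` roughly as shown and leaves out the quoted-string branch. The names `LOOSE_FIELD`, `STRICT_FIELD` and `splits` are made up for illustration. The `STRICT_FIELD` variant is one common way to remove this kind of ambiguity, not necessarily what the lens ended up doing.

```python
import re

# Assumed Python renderings of the lens patterns (quoted-string branch omitted).
LOOSE_FIELD = r"(?:[^:\n]|\\:)+"      # a lone backslash also counts as ordinary text
STRICT_FIELD = r"(?:[^:\\\n]|\\.)+"   # hypothetical fix: a backslash must escape something


def splits(field_pattern, text):
    """Return every way to cut text into consecutive 'field:' items."""
    item = re.compile(field_pattern + ":")
    if text == "":
        return [[]]
    found = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if item.fullmatch(head):
            # Keep this head only if the remainder can also be fully split.
            for rest in splits(field_pattern, text[end:]):
                found.append([head] + rest)
    return found


sample = r"\:\:\::"                   # the string from the error message
loose = splits(LOOSE_FIELD, sample)
strict = splits(STRICT_FIELD, sample)
print(len(loose), len(strict))
```

With the loose pattern, `\:` can be read either as an escaped colon inside a field or as a field consisting of `\` followed by the separator, so the same input has more than one valid split. Once a bare backslash is excluded from the "ordinary character" class, only one reading remains.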

I've tried writing some basic tests, but I'm getting stuck wherever there should be a literal \ in the output. If I try \-escaping it, I get something like the following:

Syntax error in lens definition
tests/test_getcap.aug:102:17: Unexpected character #
tests/test_getcap.aug:101.17-102.8:syntax error, unexpected LIDENT, expecting '}'
tests/test_getcap.aug:syntax error

Any ideas?

bodgit · 2015-07-31T17:51:18Z

One annoyance I have is that new records will be added without any whitespace like so:

foo:bar:baz:

which is fine for small records, but long ones will wrap and look a bit meh so I made the following change to my lens:

diff --git a/lenses/getcap.aug b/lenses/getcap.aug
index bd002de..c38893a 100644
--- a/lenses/getcap.aug
+++ b/lenses/getcap.aug
@@ -28,10 +28,9 @@ module Getcap =
   (* field must not contain ':' unless quoted or '\'-escaped  *)
   let field      = /([^:\n]|\\\\:)+|[^:\n]*(\"[^\n]+\"[^:\n]*)+/

-  let sep        = del /:|:[ \t]*\\\\\n[ \t]*:/ ":"
-  let name       = store field . sep
-  let capability = [ label "capability" . store field . sep ]
-  let record     = [ seq "record" . name . capability+ . eol ]
+  let sep        = del /:([ \t]*\\\\\n[ \t]*:)?/ ":\\\n\t:"
+  let capability = [ label "capability" . store field ]
+  let record     = [ seq "record" . store field . sep . capability . ( sep . capability )* . Sep.colon . eol ]

   let lns = ( empty | comment | record )*

which in theory should mean new records are added like so:

foo:\
        :bar:\
        :baz:

however what gets written to the file with augtool is:

foo:
        :bar:
        :baz:

The \'s aren't written. This, together with the problem with my tests, suggests something weird is going on with Augeas and literal \ characters.

raphink · 2015-08-03T11:52:38Z

lenses/tests/test_getcap.aug

+"
+
+test Getcap.lns get getcap =
+  { "1" = "example|an example of binding multiple values to names"


This test doesn't match what your lens is doing.

raphink · 2015-08-03T11:55:05Z

I've added a CSV lens this morning, which is very similar to what you're trying to achieve.

In fact, CSV.lns_generic ":" parses your examples (although not the way you want them parsed).

You might want to check it out.

bodgit · 2015-08-03T12:08:29Z

I realised the structure I was parsing records into didn't easily allow me to write idempotent changes for use with Puppet, so I've changed how records are parsed.

I still have a problem with my tests in that I can't give some sample test output containing literal \ characters and then test that they are present in the parsed structures.

I will take a look at the CSV lens and see how that works.

raphink · 2015-12-22T11:13:02Z

Any news on this lens @bodgit ?

zachfi · 2016-10-05T19:54:54Z

Oooh yes, this would be quite useful.

bodgit · 2017-09-29T21:54:54Z

Still not working 100%, but now that we know there are some bugs with backslashes in general, it makes sense to get those ironed out first.

raphink · 2017-10-02T07:26:29Z

@bodgit typechecking doesn't pass yet though

lutter · 2017-10-02T20:50:02Z

@bodgit nice work .. one minor nit: I now consider seq a design mistake since it just confuses people when they try to add to the tree. Would you object to changing seq "record" to label "record" ? If you don't have time to make that change, I'd be more than happy to do that after merging this PR.

bodgit · 2017-10-03T08:47:28Z

Ironically I think it was label "record" originally before I noticed some other lenses use seq and figured it might make things more usable. I don't mind either way, I will change it back to use label again and update the tests to match.

bodgit · 2017-10-03T10:31:21Z

Other than that, the only thing to point out is that I've dropped reading the actual termcap file. Two reasons:

1. It doesn't seem to follow its own conventions w.r.t. escape sequences. AFAICT capabilities should not contain literal : characters anywhere; these should be replaced with either \c/\C or the octal value \072, yet I found lots of entries that try to use \:. ^ and \ should also be escaped as \^ and \\, and often they're not. This is partly due to ^X being used as an escape sequence to represent control-X (FSVO X). If I make the lens follow the escape sequence rules then it doesn't parse termcap; if I make it aware of the bugs then the lens becomes mostly impossible to pass type checking, as there's no consistency.
2. rtadvd.conf, despite using the same file format, doesn't actually use the same library routines for reading the file, hence it contains bare IPv6 addresses, complete with literal : characters. Trying to parse that with getcap will split addr="2001:dead::beef" into four pieces (addr="2001, dead, an empty string, and beef"), which isn't terribly useful from an Augeas user's PoV.

In theory I should just be able to use a simple regex such as /[^:]*/, but in order to cater for the most useful files I've used something similar to /("[^"]*"|[^:"]*)/, so it matches either a quoted string containing anything, or a string containing neither colons nor quotes. That seems to work for anything except the termcap file.
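
As a rough illustration of the difference in Python (not Augeas), the sketch below compares a naive split on `:` with a pattern in the spirit of the one described above. The `CAPABILITY` pattern is an assumed variant that allows a quoted string to appear inside a capability such as `addr="..."`, and the sample record is made up rather than taken from a real rtadvd.conf.

```python
import re

# Assumed variant of the pattern above: a capability is a run of quoted
# strings and characters that are neither ':' nor '"'.
CAPABILITY = re.compile(r'(?:"[^"]*"|[^:"\n])+')

# Made-up record in the style of rtadvd.conf.
record = 'ether:addr="2001:dead::beef":maxinterval#10:'

naive = record.split(":")            # treats every ':' as a separator
quoted = CAPABILITY.findall(record)  # keeps quoted text in one piece

print(naive)
print(quoted)
```

The naive split cuts the quoted IPv6 address at each colon, while the quote-aware pattern returns the whole `addr="2001:dead::beef"` capability as one item.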

raphink · 2017-10-03T12:37:20Z

@lutter seq is still useful so long as #244 is not merged, or am I missing something?

bodgit · 2017-10-03T15:52:56Z

Actually, thinking about this, I can split this into two lenses with one lens written mostly in terms of the other with just a different regexp for matching the capabilities. Bear with me 😄

lutter · 2017-10-03T16:29:49Z

@bodgit sounds good .. awaiting the refactored, much expanded lens anxiously ;) Looks like you've stumbled on one of the great things about config files: the discrepancy of the documented format and the format tools actually understand; in a lot of cases, I've had to read the parsing code of whatever consumes a file to discover that. Seems termcap is no different in that regard.

@raphink yes, #244 would still be useful, but I'd like to get a little more feedback on syntax etc. So if you have thoughts, would love to discuss more on the PR

bodgit · 2017-10-05T08:24:07Z

The termcap lens still doesn't work on an OpenBSD /etc/termcap but I think that's a bug with their tic which generates it; it doesn't escape bare ^ characters whereas if I use tic on a RHEL7 host then it generates a file with \136 instead which the lens parses fine. I will try and raise that with them at some point.
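
For reference, the octal escapes mentioned in this thread (`\072` for `:` and `\136` for `^`) are just the characters' ASCII codes written in base 8. The small Python sketch below shows the mapping. The helper names `octal_escape` and `escape_value` are hypothetical, and escaping all three special characters in octal form is an assumption made for illustration, not a description of what tic does.

```python
def octal_escape(ch: str) -> str:
    """Return a backslash followed by the three-digit octal code of ch."""
    return "\\%03o" % ord(ch)


def escape_value(value: str) -> str:
    """Octal-escape the characters that are special in a capability value."""
    special = {":", "^", "\\"}
    return "".join(octal_escape(c) if c in special else c for c in value)


print(octal_escape(":"))         # \072
print(octal_escape("^"))         # \136
print(escape_value("a^b:c"))     # a\136b\072c
```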

lutter · 2017-10-05T23:34:56Z

Awesome ! I am merging this for now; when you resolve the tic issues, we can always amend/improve the lens.

bodgit mentioned this pull request Jul 31, 2015

Support for termcap-style capability databases #271

Closed

raphink reviewed Aug 3, 2015

raphink added the improvement label Aug 3, 2015

bodgit force-pushed the getcap branch from 36e2c1a to a0c23d9 Compare August 3, 2015 12:04

bodgit mentioned this pull request Sep 28, 2017

How to write out a literal backslash character? #507

Closed

bodgit force-pushed the getcap branch from a0c23d9 to 119ad9c Compare September 29, 2017 21:41

bodgit force-pushed the getcap branch from 18f5c40 to 3ebb28a Compare October 2, 2017 17:32

Add lenses for reading termcap-style databases

d1b6351

bodgit force-pushed the getcap branch from 3ebb28a to d1b6351 Compare October 4, 2017 19:33

lutter merged commit d1b6351 into hercules-team:master Oct 5, 2017
