SAT solver cost #820

ampli · 2018-10-02T18:12:50Z

Fix the SAT parser disjunct cost to take into account null expression cost. See (1) in this post in issue #188. (It seems I didn't open a new issue for that and instead continued the discussion in the same issue.)

Also fix the total cost per disjunct so that it follows the calculation done in the classic parser, including the cost-cutoff calculation (see issue #783).

The result is that the reported costs (which can be inspected with !disjuncts) should now be identical to those reported by the classic parser. Note that duplicate disjuncts are not discarded and may have different costs (the classic parser retains only the one with the lowest cost), so the same parse can be shown multiple times with different costs.
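To compare such output against the classic parser, the duplicates can be collapsed so that only the lowest cost per parse remains. The following is a minimal sketch of that idea, not code from this PR; `ParseResult` and `keep_lowest_cost()` are hypothetical names, and a parse is assumed to be available as a plain text rendering.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record: a parse rendered as text plus its reported total cost.
struct ParseResult
{
	std::string text;
	double cost;
};

// Collapse duplicate parses, retaining the lowest cost seen for each one.
std::map<std::string, double>
keep_lowest_cost(const std::vector<ParseResult>& results)
{
	std::map<std::string, double> best;
	for (const ParseResult& r : results)
	{
		auto it = best.find(r.text);
		if (it == best.end())
			best.emplace(r.text, r.cost);   // first time this parse is seen
		else if (r.cost < it->second)
			it->second = r.cost;            // a cheaper duplicate replaces it
	}
	return best;
}
```

The resulting map has one entry per distinct parse, which is the form that can be lined up with the classic parser's results.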

I had difficulty wording the ChangeLog description.
I also used "Modify" there rather than "Fix", because I am not yet convinced that the cost-cutoff calculation method that is currently used is "correct" (or "optimal").

I ran comparison tests (SAT vs. classic parser) in which I produced all the possible solutions for each input sentence. The order of the parses is of course different, and the SAT parser also produces many duplicates. But by using md5 hashes of each parse I validated that both parsers produce exactly the same parses. I didn't try to match the reported disjunct costs programmatically; I only inspected them manually, and they seem identical (if you compare against the linkage with the lowest total cost among the duplicates).
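The comparison described above amounts to checking that the two parsers yield the same set of parses once order and duplicates are ignored. A minimal sketch of that check follows. It is an illustration only: it compares the parse strings directly through `std::set` instead of md5 hashes, and `same_parse_set()` is a hypothetical helper, not part of link-grammar.

```cpp
#include <set>
#include <string>
#include <vector>

// Return true if both parsers produced the same parses, ignoring the
// order of the parses and any duplicates.
bool same_parse_set(const std::vector<std::string>& sat_parses,
                    const std::vector<std::string>& classic_parses)
{
	// Building a std::set both sorts the parses and drops duplicates.
	const std::set<std::string> a(sat_parses.begin(), sat_parses.end());
	const std::set<std::string> b(classic_parses.begin(), classic_parses.end());
	return a == b;
}
```

Hashing each parse first, as was done in the actual tests, only shortens the keys; the set comparison stays the same.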

If you find different costs please tell me and I will try to investigate/fix that.

Some remaining problems (in addition to the open one from issue #188):

I suspect that the cost-guided solution preference is currently broken (the intention is that the SAT solver will produce solutions with lower cost first). I still don't know how to fix that.
The reported connector order (as displayed by !disjuncts) is not correct (so the original disjuncts cannot be correctly reconstructed and used). I know how to fix that.
Panic timeout is not implemented (will be fixed once "Fixing panic mode" #785 is done).
"Robust parsing" (parsing with nulls) is not implemented. I started this work but still need to learn some more SAT-related material in order to make an efficient implementation.
Improving the speed in various ways (there is a lot of room for that).

When distributing over AND_type, attribute the cost to the first and'ed term only (instead of to each of them). The cost calculation in generate_satisfaction_for_expression() is left intact for now, because changing it interferes with the cost cutoff calculation.
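The attribution rule above can be illustrated with a small sketch. It is a simplified model and not the code in sat-encoder.cpp: `Term`, `attribute_and_cost()` and `total_cost()` are hypothetical names, and the presumed motivation (charging every and'ed term would count the same cost several times) is an assumption.

```cpp
#include <vector>

// Hypothetical simplified term: only the cost attributed to it.
struct Term
{
	double cost;
};

// Attribute the cost of an AND node to its first operand only,
// instead of adding it to every operand.
void attribute_and_cost(std::vector<Term>& operands, double and_cost)
{
	if (operands.empty()) return;
	operands.front().cost += and_cost;
}

// Sum the costs of all operands.
double total_cost(const std::vector<Term>& operands)
{
	double sum = 0.0;
	for (const Term& t : operands) sum += t.cost;
	return sum;
}
```

With this rule the sum over the operands equals the cost of the AND node exactly once, regardless of how many terms are and'ed together.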

Their cost is to be recovered during linkage extraction if they happen to reside on a participating disjunct.

The SAT-parser already fills in this info.

This fix allows an unintended linkage to be discarded. It is to be used to discard linkages with cost > cost_max (not actually done, at least for now). While at it, a malloc() of the linkage struct is eliminated by allocating the struct on the stack. Also discard the FIXME on create_linkage, because compute_chosen_disjuncts() no longer exists and its replacement, process_linkages(), is too different.

Not actually used because it is currently incompatible with the classic parser (in which a total linkage cost can be greater than cost_cutoff).

linas · 2018-10-02T19:22:10Z

link-grammar/sat-solver/sat-encoder.cpp

  /* Loop until a good linkage is found.
   * Insane (mixed alternatives) linkages are always ignored.
   * Disconnected linkages are normally ignored, unless
   * !test=linkage-disconnected is used (and they are sane) */
+  bool linkage_ok;


Should almost certainly initialize here, e.g. linkage_ok = false.

linas · 2018-10-02T19:30:35Z

The changelog entry is fine. A 'corporate-speak' variation of it might be:

Revise the SAT parser cost model to align it with the classic parser.

I'm still nervous about the finer details of how the costs are being computed and handled. I'm working on the theoretical side (you've seen the skippy.pdf, I'm sure) and it's not obvious that what I'm thinking of there actually matches what the code is doing. However, I don't have time to think about this now. Just be aware that, in the long run, costs need to behave as if they were log-probabilities, or rather, as if they were mutual-information. More or less. Details TBD.

ampli · 2018-10-02T20:21:00Z

It must be reassigned inside the loop on each iteration anyway (line 1584).

However, I admit it is not clear enough. I'm not sure that adding an initial assignment will make it clearer,
since it is not actually needed.

So the question remains: How to make it clearer?
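One possible answer, sketched here under assumptions and not taken from sat-encoder.cpp, is to declare the flag at the point where it is first assigned inside the loop, so that it can never be read before it is set and needs no dummy initial value. `Candidate` and `first_good_linkage()` are hypothetical stand-ins; the acceptance rule follows the comment in the reviewed diff (insane linkages are always ignored, disconnected ones unless explicitly allowed).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one candidate solution produced by the solver.
struct Candidate
{
	bool sane;       // no mixed alternatives
	bool connected;  // the linkage is connected
};

// Return the index of the first acceptable candidate, or -1 if there is none.
int first_good_linkage(const std::vector<Candidate>& candidates,
                       bool allow_disconnected)
{
	for (std::size_t i = 0; i < candidates.size(); i++)
	{
		const Candidate& c = candidates[i];
		// Declared where it is assigned: one value per iteration.
		const bool linkage_ok =
			c.sane && (c.connected || allow_disconnected);
		if (linkage_ok) return static_cast<int>(i);
	}
	return -1;
}
```

Whether this shape fits the real loop depends on its structure; if the flag is tested in a do-while condition, it has to stay declared before the loop.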

linas · 2018-10-02T20:41:51Z

It's fine. I'm reading the diffs, not the full code, so things like loop structure are not particularly visible.

ampli added 9 commits October 2, 2018 19:59

WordTag::insert_connectors(): Improve debugging

922ab93

WordTag::insert_connectors(): Record costly-null expressions

7daae32

Their cost is to be recovered during linkage extraction if they happen to reside on a participating disjunct.

SAT-parser: Extract costs due to costly-null expressions

509679a

sentence_link_cost(): Remove old comment

b403ec2

The SAT-parser already fills in this info.

sat_extract_links(): Improve debug messages

57e24b0

sat_extract_links(): Prepare for after-linkage cost-cutoff

a7b0904

Not actually used because it is currently incompatible with the classic parser (in which a total linkage cost can be greater than cost_cutoff).

ChangeLog: Notify about the SAT parser disjunct cost modification

23e9f27

linas reviewed Oct 2, 2018


linas merged commit 2141564 into opencog:master Oct 2, 2018

ampli mentioned this pull request Oct 2, 2018

Add a strict connector name check #821

Merged
