-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SAT solver cost #820
SAT solver cost #820
Conversation
When distributing over AND_type, attribute the cost to the first and'ed term only (instead of to each of them). The cost calculation in generate_satisfaction_for_expression() is left intact for now, because changing it interferes with the cost cutoff calculation.
There cost is to be recovered on linkage extraction if they happen to reside on a participating disjunct.
The SAT-parser already fills in this info.
This fix allows to discard an unintended linkage. It is to be used to discard linkages with cost > cost_max (not actually done, at least for now). In this opportunity an malloc() of the linkage struct is eliminated by allocating it on the stack. Also discard the FIXME on create_linkage, because compute_chosen_disjuncts() doesn't exists now and its replacement process_linkages() is too different.
Not actually used because it is currently incompatible to the classic parser (in which a total linkage cost can be greater than cost_cutoff).
/* Loop until a good linkage is found. | ||
* Insane (mixed alternatives) linkages are always ignored. | ||
* Disconnected linkages are normally ignored, unless | ||
* !test=linkage-disconnected is used (and they are sane) */ | ||
bool linkage_ok; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should almost certainly initialize here, e.g. linkage_ok = false
.
The changelog entry is fine. A 'corporate-speak' variation of if it might be
I'm still nervous about the finer details of how the costs are being computed and handled. I'm working on the theoretical side (you've seen the |
However, I admit it is not clear enough. I'm not sure that adding an initial assignment will make it clearer, So the question remains: How to make it clearer? |
It's fine. I'm reading the diffs, not the full code, so things like loop structure are not particularly visible. |
Fix the SAT parser disjunct cost to take into account null expression cost. See (1) in this post in issue #188. (It seems I didn't open a new issue for that and instead continued the discussion in the same issue.)
Also fix the total cost per disjunct to be according to the calculation that is done in the classic parser, including the cost-cutoff calculation (see issue #783).
The result is that the reported costs (can be inspected with
!disjunct
) should now be identical to those reported by the classic parser. Note that because duplicate disjuncts are not discarded, and they may have different costs (in the classic parser only the one with the lowest cost is retained) the same parse can be shown multiple times, with different costs.I had a difficulty to word the ChangeLog description.
I also used there "Modify" and not "fixed" because I am not convinced yet that the cost-cutoff calculation method that is currently used is "correct" (or "optimal").
I did comparison tests (SAT vs classic parsers) when I produce all the possible solutions for each input sentence. The order of the parses is of course different, and also the SAT parser produces many duplicates. But by using md5 hashes of each parse I validated that they produce exactly the same parses. I didn't try to programmatically match the reported disjunct costs - I only inspected them manually and they seem identical (if you compare to the linkage with the lower total cost among the duplicates).
If you find different costs please tell me and I will try to investigate/fix that.
Some remaining problems (in addition to the open one from issue #188):
!disjuncts
) is not correct (so the original disjuncts cannot be correctly reconstructed and used). I know how to fix that.