Get rid of null_count>0 parsing (#1449)
I just tried the experiment. It gives no change for one test corpus; for another, the results improve. This says Dennis was right, and we've been carrying a cost-max bug for a decade!

I have a new Russian text that you don't, some WWI vainglorious suffering; it drops from 156 errors to 152. Should this, and maybe pandp, be checked into git?
Dennis had it right. See issue opencog#1449. This improves parsing of real sentences, without causing damage elsewhere.
If #1450 is merged, then we could always add the optional XXX connectors. The more I think about this, the more I like it. I'm infatuated with it now.
See the long discussion in the still-open issue #783. I don't remember (and didn't compare now) if my proposed change is the same as your change here.
In issue #783 I referred to an article that uses cost logic identical to the original (current) code.
I will answer that there.
Please indicate if I missed referring to anything else here.
One problem is that we will get the parses of any null count, from 1 to (sent->length-1). Due to the exploding number of linkages, sampling will be needed. Even if there is a linkage with one null, you could never sample enough links to be sure, unless you encounter it by chance. And even if you encountered one, it may be very hard to find additional ones with different null words (or even to find out that they exist). I strongly suspect that your proposal is equivalent to the current algorithm of parsing with null links if you just remove the null-link counting from it. You would then get all the linkages with any number of null words, with the exact same complexity. If you look at the original paper that discusses parsing with nulls, it details how null links are equivalent to implicit links added to each word to ensure a linkage.
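To make the sampling concern concrete, here is a back-of-the-envelope count of the distinct null-word placements for a modest sentence (a sketch only; it counts placements, not the linkages each placement admits, and the sentence length is made up):

```python
from math import comb

sent_length = 20
total = 0
for null_count in range(1, sent_length):
    placements = comb(sent_length, null_count)  # ways to choose which words are null
    total += placements
print(total)  # 2**20 - 2 = 1048574 possible null-word placements
# With null counting removed, all of these compete in one linkage pool, so uniform
# sampling gives no guarantee of ever seeing the (rare) one-null placements.
```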
OK, so #783 is very long and it will take me a while to digest it. In the meantime: you seem to be saying that I'm just reinventing null links. And perhaps that is true. However:
Unless we find a way of enumerating linkages in order, according to #1451 ... if that could be done, then it would not matter how many linkages there are, because we could always get them in sorted order, without sampling. The modification there, to ...
There is no problem if linkages can always be accessed in sorted order. If there is one null link, then cost determines which word is the null-linked word.
Get rid of islands, too:
Yes.
Not at all. The "complexity" of the null_count handling in ... So I cannot see why you may want it. BTW, there is also null_cnt in ...
As you may recall, we need islands for implementing your phantom-word handling idea (for which it seems to magically function).
It seems trivial - just add an array subscript. But I have not implemented it yet, hence my comment.
But it is absolutely needed, since your proposal will not work for finding the parse with the fewest nulls, and the parse with nulls would be many times slower.
The problem is how you extract linkages in sorted order according to any combination of one or more of ...
It was confusing, broken, and prevented good parses from happening. See opencog#1450, opencog#1449, opencog#783 and opencog#1453.
The back-of-the-envelope calculation for this is that, if one adds a single optional XXX connector to every expression, this has the effect of doubling the number of disjuncts (for that word). If a sentence has ...

OK, I can accept that argument. (And I'll close this issue.) However, my thinking still runs in these directions: ...

About this last bullet -- let's think about that some more -- it leads to the possibility that a parse with zero null words might actually be more costly than a parse with one null word, plus the cost of the XXX link. Is it worth the effort to try this?
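A quick sanity check of the doubling argument, treating the parser's work as roughly proportional to the product of per-word disjunct counts (a crude upper bound; the per-word disjunct count and sentence length below are made up):

```python
# One optional XXX connector per expression doubles each word's disjunct count,
# so the product over an N-word sentence grows by a factor of 2**N.
d, N = 30, 15                  # hypothetical disjuncts per word, sentence length
baseline = d ** N
with_xxx = (2 * d) ** N
print(with_xxx // baseline)    # 2**15 = 32768-fold more combinations to consider
```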
Right.
I can think of:
But if you provide such ANY links, even a sentence that has a full parse will also have plenty of parses that involve these ANY links. Is it useful?
Suppose that after a parse that gets null words, we add ANY links to all the chosen disjuncts of all the linkages. Will we get something useful?
I really don't know.
The ANY links have a high cost. Full parses would have much lower costs, and so ranking would eliminate linkages that use ANY links. For the current English dict, a cost of 3.0 or 3.5 or 4.0 seems like a good choice: slightly larger than almost all existing costs in the dict. However, there is still a possibility of a "cost inversion", where a parse with a single ANY link is cheaper than the cheapest parse without an ANY link. This would be unlikely, but perhaps some weird constructions would cause it. It could be used as an indicator of what a "better" collection of disjuncts could be. Note this curious interplay between null-words & islands.
To fully connect the first parse, there are two possibilities:
or
both of which give a total cost of 4.1 (assuming ANY costs 4.0). For the island-parse, a full linkage is possible with either
or
either of which gives a total cost of 5.0. There are probably sentences for which the cost of the island-parse is less than the cost of the null-word parse.
The only places where ANY connectors can be added are as shown in the examples above.
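The arithmetic behind the 4.1-vs-5.0 comparison above, with only the stated totals and the assumed ANY cost (the concrete linkages themselves are not reproduced here):

```python
ANY_COST = 4.0                 # assumed cost of one ANY link, as above

null_word_full_total = 4.1     # stated total for fully connecting the null-word parse
island_full_total    = 5.0     # stated total for fully connecting the island parse

# Residual ordinary-link cost in each case:
print(null_word_full_total - ANY_COST)  # 0.1
print(island_full_total - ANY_COST)     # 1.0
# Here the null-word completion wins, but nothing forces that ordering in general:
# a sentence whose island linkage uses cheaper ordinary links would flip it.
```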
So, to be clear about what I am trying to do: the DL/NN systems proceed with gradient descent. I'm trying to do something "kind of similar", but by counting. I also get high-dimensional vector spaces, and I also see phenomena similar to the early word2vec results -- "King is to Queen what Paris is to London" type stuff, etc. Interpretability is a mixed bag, because ... my high-dimensional vector spaces are populated with lots of floats. Those floats are mutual information values, and are interpretable as such ... but ... it's still lots of numbers, in the end. The other big difference is that I'm just one guy doing something that no one understands, vs. an army of thousands who are fine-tuning both the theory and the GPU code. I've got hundreds of good ideas worth exploring, but it takes me maybe 2-6 months to explore one good idea, so forward progress is on a glacial timescale. I think that what I'm doing can match anything that can be done with DL/NN. It might even be cheaper to compute. But I am working with a large, complex, fragile system in which I create 3 bugs for every two changes. There is a big advantage to working with simple systems, and my current setup is anything but simple.
BTW, for the above example:
so inserting a "phantom comma" would have give a full parse. And that full parse most-closely resembles the island parse; the island parse is "almost correct", in this example. |
My idea was to do the linkage again; then we are supposed to get them only in the places you indicated.
Oh, that would work!
A related idea would be to prune everything first, including power pruning, then add ANY links (and then maybe prune one more time). This would keep the overall size low. It would also allow the detection of "cost inversions", where a parse with a single high-cost ANY link is cheaper than a full linkage without it.
Is there still a reason to proceed with your idea of the Z connectors?
I guess you mean an additional pruning step, since at the start we always prune.
I mean: first do expression lookup as normal, and all pruning as normal, and then add ANY connectors to all the disjuncts that remain after power-pruning. (Then maybe prune one more time?) Then parse as normal.
Except I don't know how to add ANY connectors to disjuncts. They would have to be added as optional connectors, added at the beginning and end of disjuncts, and maybe even in the middle (?!), so this would cause the number of disjuncts to explode by a factor of 2^N or 3^N or more. This would be easier if multi-connectors were zero-or-more instead of one-or-more.
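A toy enumeration of that blow-up, under the assumption (hypothetical) that a disjunct can be represented as a flat list of connector names, with an optional ANY- prepended and an optional ANY+ appended:

```python
from itertools import product

def add_optional_any(disjunct):
    """Enumerate the variants of one disjunct when an optional ANY- may be
    prepended and an optional ANY+ appended."""
    variants = []
    for use_left, use_right in product((False, True), repeat=2):
        v = (["ANY-"] if use_left else []) + list(disjunct) + (["ANY+"] if use_right else [])
        variants.append(v)
    return variants

for v in add_optional_any(["A-", "B+"]):
    print(v)
# Two optional slots already give 4 variants per disjunct; allowing optional
# connectors in the middle as well is what pushes this toward 2^N or 3^N.
```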
My suggestion of adding ANY only after a parse that gets null words is in order to save processing time while getting the same final result.
I cannot see how it can prune anything then, since the added disjuncts with ANY can connect to the other ones, and those w/o ANY have already been checked in the initial pruning.
Yes, I understand. That would work. But after I read this, it gave me another idea: add ANY links after pruning, before parsing. This is much less efficient, but would capture the case of cost-inversion. The ability to play with low-cost ANY links could be interesting.
Would it be useful to add such a "test" or Parse_option, which includes the link cost?
Yes. I now see three cases:
Case 1) is worth a Parse_option, because it really does do something useful: it proposes a linkage, in the same format as a normal linkage, that contains null-linked words. It would be cute if we could print ...

Case 2) is worth a Parse_option, although users might have trouble telling it apart from case 1, because it gives more-or-less the same results. I'm interested in comparing the performance of cases 1 and 2.

Case 3) is ... meh. So a "test". It's obviously slower and uses more memory than 1 or 2, and would only give different results if there was a cost inversion. It would allow us to find out how common such inversions are ...
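For comparison, this is roughly how the existing null-count machinery (the thing these cases would replace or wrap) is driven from the Python bindings today. A minimal sketch; the min_null_count/max_null_count option names are from memory and may differ slightly:

```python
from linkgrammar import Dictionary, ParseOptions, Sentence

en = Dictionary('en')
text = "the bark cat quickly the"   # some sentence with no full parse

# First pass: no null words allowed.
po = ParseOptions(min_null_count=0, max_null_count=0)
linkages = Sentence(text, en, po).parse()

if len(linkages) == 0:
    # Second pass: permit null-linked words, which is the null-count
    # support being discussed in this issue.
    po = ParseOptions(min_null_count=1, max_null_count=len(text.split()))
    linkages = Sentence(text, en, po).parse()
```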
The proposal, in short: get rid of all support for parsing with null words. It seems to be complicated, requiring extra checks and tests in all sorts of delicate locations. It adds complexity to the code. Perhaps it should be gotten rid of?
The replacement is to extend/alter the ZZZ-link mechanism to handle the case of unlinkable words. It seems there are several ways to do this:
1. If there are no parses, then rebuild the expressions for the words, automatically add ({XXX-} & {XXX+}) to every expression, and then re-parse.
2. Always add ({XXX-} & {XXX+})[2.5] to every expression, even from the very beginning. This would avoid the need for a second parse, in all situations: there would always be some linkage, no matter what.

Both of these have problems with cost:

- XXX connectors with zero cost will distort parsing; we want to use them only as a last resort.
- XXX connectors with a high cost also distort parsing: adding them with a cost of 2.6999 will prevent all parses that use any connector with greater than zero cost (given the English rejection of costs above 2.7).

Both of these might (?) be solvable with a minor redesign of how cost-max (aka cost_cutoff) actually works. This comment about Dennis's code: link-grammar/link-grammar/prepare/build-disjuncts.c, lines 204 to 218 at 4672e07.
I think maybe Dennis had it right? That is, if we set: ... then, yes, it will often be the case that max-cost will be less than cost. But that is OK. It will allow expressions such as A-[1.6] & XXX+[2.5] to have a grand total cost of 4.1 = 1.6 + 2.5, and thus get ranked appropriately, while maxcost will be just 2.5, which is still below the cost_cutoff of 2.7, and thus the disjunct gets included in the parse. This would allow option 2) to be used in place of null-counts.

Does this make sense? @ampli, do you like this? I think this would even be an easy experiment to make.
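A minimal sketch of the bookkeeping described above (illustrative only, not the actual build-disjuncts.c code; the names are made up): accumulate the summed cost for ranking, but compare the cutoff against the largest single cost in the clause.

```python
def clause_costs(clause):
    """clause: a list of (connector, cost) pairs making up one disjunct."""
    total = sum(cost for _, cost in clause)    # used for ranking linkages
    maxcost = max(cost for _, cost in clause)  # compared against cost_cutoff when pruning
    return total, maxcost

cost_cutoff = 2.7
total, maxcost = clause_costs([("A-", 1.6), ("XXX+", 2.5)])
print(total)    # 4.1 -- the linkage is ranked as expensive
print(maxcost)  # 2.5 -- still below 2.7, so the disjunct survives the cutoff
```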
FWIW, I like this idea, because it is effectively what I am already doing in the Atomese dict: either I am very certain of a disjunct (so it has a low cost), or I don't really know, so I assemble one from word pairs (at a higher cost), or I really don't know (e.g. an unknown word), in which case a random high-cost link is allowed.
Long-term, there is another issue: the interplay between cost-max and combinatorial explosions. It would be nice (nicer) to have "flood-level" counting: slowly raise cost-max until the count grows greater than zero. But I don't know how to implement this efficiently/automatically. (See, however, issue #1451.)
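A sketch of what "flood-level" counting might look like if driven naively from the outside via the Python bindings (the disjunct_cost and max_null_count names are my best guess at the existing bindings, so treat this as pseudocode; re-parsing from scratch at every level is exactly the inefficiency noted above):

```python
from linkgrammar import Dictionary, ParseOptions, Sentence

def flood_parse(text, start=1.0, step=0.5, limit=4.0):
    """Raise cost-max in small steps until the linkage count becomes non-zero."""
    en = Dictionary('en')
    cutoff = start
    while cutoff <= limit:
        po = ParseOptions(max_null_count=0)
        po.disjunct_cost = cutoff          # slowly raise cost-max
        linkages = Sentence(text, en, po).parse()
        if len(linkages) > 0:
            return cutoff, linkages        # first level at which the count is non-zero
        cutoff += step
    return None, []
```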