Pass truth values from GroundedPredicateNodes to rule engine from pattern matcher. #1868
Several short remarks:
|
FWIW, there is an open task to make sure that the URE can work with formulas for any kind of values, and not just truth values. The formulas are "arbitrary", and user-specified. For example, PLN has a certain set of formulas, but one could also use a different set to implement Bayesian nets, etc. A different open task would be to have the pattern matcher not use truth values at all, but to instead look at a single-bit true/false boolean flag in the atom. That would fully decouple the mechanics of the pattern matcher from the flow of truth through the network. |
There are no formulas whatsoever in the code that you just referenced. Those two files are config files, nothing more. Here is an example of a formula: lines 16-17 of Here's more: Those formulas are specific to PLN. Other rule collections can and do use different formulas. |
In this example, PLN is used to compute the TV of a compound logical expression. It's an absolutely valid usage of PLN/URE. And the problem is: if PredicateNodes are used, then the output is exactly what should be expected from PLN, but if equivalent GroundedPredicates are used, then the result is incorrect, meaning that the computed TVs are lost. |
Hi @Necr0x0Der OK, so you are claiming that this is a PLN bug. I am not familiar with PLN, so I don't know. At any rate, PLN is not a part of the atomspace, it's not a part of the pattern matcher, it's not a part of the URE. So it seems like this is not even the correct github repo to report this bug. I'm going to close this for now. You have to ask Nil to take a look at it. The point is that PLN is a layer on top of the URE, and the URE is a layer on top of the pattern matcher. Suggesting some bizarro hack to the pattern matcher to fix a bug in PLN just sounds wrong to me. There are other users of the pattern matcher besides the URE, and there are other users of the URE besides PLN. You'd have to make a much, much stronger case that the pattern matcher should work in some new or different kind of way. |
No, you didn't understand. This is not specifically a PLN bug. You could call it a URE bug, because whatever rules you use in the URE, the (truth) values calculated by GroundedPredicates will be lost during chaining. However, the problem can be easily fixed if the Pattern Matcher stores the truth values of GroundedPredicates after evaluating them. The URE relies on the PM as an interpreter of Atomese, so either the PM should be extended, or the URE should re-evaluate grounded predicates... |
Look, can you actually explain this in a way that is understandable? To me, this comes off as a big giant ball of confusion about how the system actually works, and I don't feel I have the time to try to figure out who is confused by what. There is nothing that you said, nor is there anything in the initial bug report, that indicates that there is any bug in the atomspace. |
@linas I was writing to explain, but I'm thinking maybe it's better to move to singnet instead, and we'll only ping you if it really ends up concerning the pattern matcher. |
Look, I just now found out about singnet. Cassio made me an explicit promise that there wouldn't be any development happening on singnet, and, as is clearly visible in github, there seems to be significant development happening there, so WTF. I'm really quite rather very unhappy about that. It's good that there are more developers joining the project, but that means that it is time to start acting in a professional manner. Yes, the chaos of children's playgrounds can be fun, but it is not an effective way to build good software. This is especially critical at this point: if we get over-run by chaos and crazy ad-hoc patches, you'll just get a pile of shitty code. You cannot build a skyscraper out of random pieces of wood. Let me be clear: this bug report sounds like some random confusion, with some incoherent proposal to change something for no good reason. That's not engineering, that's hacking. Hacked systems never work; they always fail. Millions of developers have been writing software since before the 1970's, and we know what happens to software when one makes random, incoherent hacks. Doing this in the singnet repo is a highway to failure. |
The issue may look confusing, but that's not a reason to dismiss it; if we can't sort it out here, then what? We can use emails, but github offers a better platform for that, so... I'm all for being super authoritative about merging, but not about discussing; otherwise we'll never make progress... OK, sure, having the issue closed doesn't mean it cannot be discussed; maybe we can discuss it here while it stays closed...
Are you sure? Maybe there was a misunderstanding there... But yeah of course the plan is to have singnet repos, wholly or partly being merged back to opencog. |
though I wonder, can people without admin rights post messages on closed issues? |
I can |
Oh, yeah, of course, if you can create and close you should be able to post to it (it seems closed PRs however cannot receive further commits, but that's independent of rights I suppose). |
OK, we can discuss, but if the proposed change is to the pattern matcher (which seems to be what the proposal suggests) then the example code should invoke the pattern matcher only, and compare it to the existing pattern matcher behavior. Somewhat off-topic, but: it has been nagging me for many many years that the pattern matcher touches truth values. I never really liked that. The thought is now crystallizing that atoms could have a single-bit crisp-logic boolean value, and that this bit could be used for all of the basic pattern matching needs. This way, other formulas, other behaviors could be moved to modules, and it would open the possibility of using other value types, besides truth values, with which to perform formula evaluations. |
@noskill, small comment (till I get to understand the whole thing)
```scheme
(define (redness box)
  (display "redness called with: ")
  (display box)
  (display "\n")
  (stv 0.55 (random 1.0)))
```

can be replaced by

```scheme
(define (redness box)
  (cog-logger-info "redness called with ~a" box)
  (stv 0.55 (random 1.0)))
```

having previously loaded the logger with `(use-modules (opencog logger))`. That message will by default be printed in a log file.
See https://wiki.opencog.org/wikihome/index.php/Scheme#Logger for more info about the logger. |
"trace" is not a logger level, see https://wiki.opencog.org/wikihome/index.php/Scheme#Change_the_log_level for valid log levels. |
@noskill Please, simplify the issue as much as possible, for instance the It also looks like the conjunction could have 2 instead of 3 arguments. Don't hesitate to simplify names as well, like replacing "BoundingBox" by "A", or "Herring" and "Red" by "P" and "Q", I mean, I don't know, do whatever you think is gonna simplify the issue, and make it flow as well as possible to read and understand. |
Don't hesitate to use markdown goodies, like specifying that the code is in scheme (as I've just done), etc. |
Thanks Nil! I simplified the example a bit more, leaving only one predicate - redness. I don't like one-letter names though... |
@linas this is not true:
You saw the plans for the fork, commented on them, and agreed with them in a call we had a couple of months ago. |
@ngeiswei I made a unit test for this issue https://github.com/noskill/atomspace/blob/virt-eval-test/tests/query/CogBindTVUTest.cxxtest |
I'm still exploring, but it seems the problem could be that executing an evaluation link doesn't return the evaluation instance with the TV on it, although evaluating that same link does return the correct TV. So for instance, given

```scheme
(define (sp x) (stv 0.55 0.55))
(define GP (GroundedPredicate "scm:sp"))
(define A (Concept "A"))
(define E (Evaluation GP A))
```

the following returns the correct TV

```scheme
(cog-evaluate! E)
$3 = (stv 0.550000 0.550000)
```

But the following returns `E` unchanged. Do we want |
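To make the distinction concrete, here is a minimal sketch of how the computed TV could be stored back by hand, assuming the standard `(opencog)` and `(opencog exec)` Guile modules are available; `sp` and `E` follow the example above:

```scheme
(use-modules (opencog) (opencog exec))

(define (sp x) (stv 0.55 0.55))
(define E (Evaluation (GroundedPredicate "scm:sp") (Concept "A")))

;; Evaluating runs the grounded predicate and returns the computed TV,
;; but E itself keeps its default TV ...
(cog-evaluate! E)

;; ... unless one stores the result back explicitly:
(cog-set-tv! E (cog-evaluate! E))
```

This is only a manual work-around, not a proposed fix; the question in this thread is whether such storing should happen automatically.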
Although it's true that if one modifies the TV of |
On Wed, Sep 26, 2018 at 6:21 AM Nil Geisweiller ***@***.***> wrote:
@linas <https://github.com/linas>, when you say "the atomspace is NOT
lambda calculus", do you have in mind relational programming languages like
miniKanren?
No. The atomspace is first and foremost a knowledge representation (KR)
system, which means that it implements the axioms of "relational algebra".
Another very famous system that implements the relational algebra is SQL.
Any KR system is going to implement the relational algebra. For example,
prolog. The KR-subsystem of prolog is called "datalog", it is what is left
over when you rip out inferencing, chaining, logic out from prolog.
The miniKanren project is something else; they are trying to create a
pure-scheme implementation of something that looks like prolog. As it
clearly shows, implementing logic and inferencing and KR on top of scheme
is awkward and not scalable and provides no marvelous revelations, other
than that it's an uneasy partnership.
Roughly speaking, its because one is trying to marry together a "no
side-effects" style of thinking, with a "nothing except side-effects" style
of thinking. More generally, one should step back and ask "why would one
ever want to marry KR and lambda calc?" The obvious answer is "cause it's
geeky fun". The next question is: "can it be fast, flexible, efficient,
scalable, easy-to-use?" and the answer is, so far, seems to be "no", from
what I can tell. But maybe I'm wrong. The last question is "what can the
atomspace learn from it?" and right now, the answer seems to be "nothing,
so far". I haven't seen anything appealing, but maybe I missed something.
The axioms of relational algebra differ from the axioms of lambda calculus,
and I want to avoid adding the axioms of lambda calculus to the atomspace,
despite the fact that we've already taken most of the steps to get there.
For example, there's a problem with quotation -- you wrote most of the code
to do quotation, and surely you understand just how ugly and incompatible
it is, when used in the context of knowledge representation? There has to
be a better way.
|
On Thu, Sep 27, 2018 at 8:30 AM Necr0x0Der ***@***.***> wrote:
So, what do you think, should we modify EvaluationLink::do_eval_scratch
or use your hack to explicitly set TV for EvaluationLink?
Guys,
Look, it would be nice if you did not ignore everything that I write, and
continue pressing that an ugly hack be implemented. The goal of good design
is to have a consistent system that is easy to understand and behaves as
expected. Adding ugly hacks to work around bugs located somewhere else is
not good design.
|
On Fri, Sep 28, 2018 at 2:20 AM Nil Geisweiller ***@***.***> wrote:
systematic TV caching is OK.
Systematic TV caching is bad for the following reasons:
1) It's slow. It forces everyone to pay the price for this, even if they do not need it.
2) it bloats the atoms and the atomspace. The default TV takes zero bytes
to store. Any TV that is not the default TV is going to use up space.
3) It bloats disk-usage. If you save the dataset to disk, all of those TVs will get saved, slowing down atom-store performance and atom-fetch performance.
4) What's the point of caching, if you do not also provide a way of
verifying that the cached value is not stale? How do you know that it is
still accurate? What happens if that same evaluation link is used in two
different queries, running in two different threads? Which query "wins"?
Which of the two values is "correct"? How could you ever know?
There are few, very nearly zero users who will even be aware that there is
a cached value being stored. Why should everyone pay a penalty price, just
because some obscure user might be doing something ill-defined and
ambiguous with some computed value?
@linas <https://github.com/linas>, what do you think?
I want to remove truth values from the pattern matcher. More and more, it
is starting to feel like a design mistake.
|
@linas What, then, would be the difference between atomese and prolog's family of languages? I somehow got to thinking that atomese is an extension of logic programming with probabilistic reasoning... |
Can't we just use truth values during computation in the rule engine? That's our use case. Currently we create a child atomspace before a request to the rule engine, and destroy it afterwards... There should be an easier way to make such requests without bloating the atomspace.
Atomese has zero logic programming in it, and zero probabilistic reasoning. Atomese is a knowledge-representation system, plus a collection of tools and layers that allow you to implement logic programming, if you wish. Currently, we do not implement any logic programming system. (That might change someday) We do implement a probabilistic reasoning system, called PLN, but that lives in a different github repo. There's all sorts of neat stuff we should get around to implementing, but it takes .. time and effort. |
I don't understand what this is saying, so I can't answer it. I am trying to replace truth values by just-plain values, which are more general and more flexible. However, since the atomspace is legacy code, there are still a lot of truth values everywhere. Removing all of them will take time.

The rule engine... well, ideally, it would accept, as input, a set of rules, a set of formulas, an atomspace containing the rules and formulas, and maybe a different atomspace containing the data. When a rule fires, it should cause a formula to run. The inputs to a formula should be generic values, not truth values, and the outputs should be generic values as well. PLN is a specific set of rules and formulas, described in the PLN book. Other rules and formulas are possible.

One of the design goals of the atomspace is to allow simple, easy representation of "generic knowledge", and then to allow many different kinds of AI algorithms to manipulate that knowledge. Examples of AI algorithms include: rule-engines, parsers, constraint-solvers, logic solvers, search algos, data-mining algos, neural nets, etc. So far, only a few of these have been implemented, and so far it's been kind-of awkward and kind-of slow. Just about any special-case system, e.g. from the apache foundation, is going to outperform the atomspace+layers on top of it. The atomspace is trying to be a generic, neutral platform for any AI algo, which is something no one else attempts to do. Whether the atomspace is successful in doing this... is open to debate.

I'm pushing back on this bug report because the proposed solution is not generic. For example, people who want to run neural nets on top of the atomspace really wouldn't need or want the solution you propose. |
I think @linas is right (I'm actually glad he shed the light on this, cause frankly I was blind). However I also think that caching (or rather storing) TV in evaluation calls can be a desired feature, I just agree with @linas that this should not be in I think the best would be to create a new link, say called Such |
Maybe better names would be Or if |
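The proposed link type is only named later in the thread (CachedTVEvaluationLink); it does not exist in the atomspace. A purely hypothetical sketch of the intended behavior:

```scheme
;; Hypothetical: CachedTVEvaluationLink would behave like EvaluationLink,
;; except that the TV computed by the grounded predicate is stored back
;; on the link itself after each evaluation.
(define E
  (CachedTVEvaluationLink        ; NOT an existing atom type
     (GroundedPredicate "scm: redness")
     (Concept "B")))

;; After (cog-evaluate! E), (cog-tv E) would return the computed TV
;; rather than the default (stv 1 0).
```

Users who do not want the caching behavior would keep using the plain EvaluationLink, which is the choice-of-behavior argument made above.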
I also agree with @linas that the pattern matcher should not require its For instance instead of
we would have
|
You said that storing the TV in the EvaluationLink would bloat the atomspace but, at least in our use case, we need the EvaluationLink with a TV only during the URE's query computation. Anyway, Nil's solution with CachedTVEvaluationLink would allow the user to choose the desired behavior.
We wrap a neural network in an EvaluationLink with a GroundedPredicate. We populate the atomspace with ConceptNodes carrying data for the neural network to perform classification.

```scheme
(cog-execute! (BindLink
   (VariableList
      (TypedVariableLink (VariableNode "$B") (TypeNode "ConceptNode"))
      (TypedVariableLink (VariableNode "$X") (TypeNode "ConceptNode")))
   (AndLink
      (InheritanceLink (VariableNode "$B") (ConceptNode "BoundingBox"))
      (InheritanceLink (VariableNode "$X") (ConceptNode "color"))
      (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork")
         (ListLink (VariableNode "$B") (ConceptNode "surfboard")))
      (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork")
         (ListLink (VariableNode "$B") (VariableNode "$X"))))
   (ListLink (Variable "$B") (Variable "$X") (ConceptNode "surfboard"))))
```

In order to use the rule engine, we need this evaluation link

```scheme
(EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork")
   (ListLink (VariableNode "$B") (ConceptNode "surfboard")))
```

to have a truth value. @linas Perhaps we took an unexpected path to integrate the atomspace with neural networks. What is the expected one? |
The other option (alternatively to having a For instance, the rule
Let's call this rule
See what I mean? |
Also, if it wasn't clear, let me remind that |
I am not that familiar with the URE yet; is there a way to force the URE to apply this set-grounded-predicate-evaluation-tv-rule to every evaluation link with a grounded predicate node it encounters? I mean, the URE could produce some result without using this rule at all. For now I am quite happy with setting the TV from within the grounded predicate. This would allow us, as Alexey noted, to return a different value to the pattern matcher. This way we can process in the URE evaluation links with low associated truth values. But I still think that a better solution would be to implement CachedEvaluationLink and to provide the threshold to the pattern matcher call. Here "better" means providing more options to change the system's behavior. |
@noskill said
Oh, you mean that it may return something above 0.5 while setting the TV to its real value; that way you have more control over when the PM would allow a match? Regarding forcing the URE, I think yes, it's possible to force the URE using inference control rules. But otherwise, what can be done is simply giving the URE enough evaluations and adding preconditions in the PLN conjunction rule to dismiss conjunctions if not all arguments have non-null confidence. |
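A sketch of the trick being discussed: the grounded predicate records its "real" TV on the corresponding evaluation link, while returning a TV above the matcher's 0.5 strength threshold so the clause still matches. The TV values here are illustrative stand-ins for the actual computation:

```scheme
;; Sketch: atoms are globally unique, so re-constructing the Evaluation
;; link inside the predicate retrieves the very same atom the pattern
;; matcher is testing, and we can set its TV by hand.
(define (redness box)
  (let ((real-tv (stv 0.35 0.45)))  ; stand-in for the real computation
    (cog-set-tv!
       (Evaluation (GroundedPredicate "scm: redness") box)
       real-tv)
    ;; Returned to the pattern matcher: passes the 0.5 threshold
    ;; regardless of the real strength.
    (stv 1 1)))
```

This gives the caller control over matching separately from the stored value, at the cost of the predicate knowing how it is wrapped.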
This would not skip the intermediary inference tree though... It would be nice if we had something that allows that. One way would actually be to gather inference trees and wrap them as rules that you'd then be able to re-use to skip inference steps... |
@ngeiswei Yes, we want to be able to compute the truth value for AndLinks such as:

```scheme
(AndLink
   (EvaluationLink (stv 0.35 0.45)
      (GroundedPredicateNode "scm: redness")
      (ConceptNode "B"))
   (InheritanceLink (stv 0.45 0.45)
      (ConceptNode "red")
      (ConceptNode "color")))
```

The current pattern matcher implementation would throw out the EvaluationLink with (stv 0.35 0.45), but unexpectedly would not throw out the InheritanceLink. I thought the InheritanceLink wouldn't be kept either, but since it works, setting the TV from within the grounded predicate seems to give the same result as providing a threshold to the pattern matcher... I am not sure about other link types... |
Interesting, so what I thought was a bug is actually a feature here... @Necr0x0Der @linas this is another reason to have the pattern matcher support boolean predicates, so that one can choose what threshold to apply over strength or confidence, instead of having a fixed 0.5 threshold over strength. |
I think that the BindLink is mal-formed. I believe that what you want is this:
The pattern matcher simply finds all possible variable groundings that satisfy the premise, and places those groundings into the conclusion. |
Re this comment:
The pattern matcher does NOT look at truth values on atoms. Instead, it is simply looking to see whether an atom exists, or not. If the atom exists, then it's a match. If it does not exist, then... well, it doesn't exist. Existence is just a crisp true/false thing. There are complications. For example, the A "relation", in the strict, mathematical sense, is a function of N arguments that, when evaluated, returns 1 or 0. Relations never return fractions, floats, or any other kind of value. (Off-topic, but important: a "relational algebra" implements a collection of relations. SQL is a practical kind of easy-to-use relational algebra system. Relations are important.) So: In the future, Perhaps we should instead use a
we should instead write
which evaluates to a crisp true/false when it is used in the premise-side of a |
@linas Our intention here is to decrease computational complexity:

```scheme
(AndLink
   (InheritanceLink (VariableNode "$B") (ConceptNode "BoundingBox"))
   (InheritanceLink (VariableNode "$X") (ConceptNode "color"))
   (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork")
      (ListLink (VariableNode "$B") (ConceptNode "surfboard")))
   (EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork")
      (ListLink (VariableNode "$B") (VariableNode "$X"))))
```

Here, if the pattern matcher finds that the evaluation link

```scheme
(EvaluationLink (GroundedPredicateNode "py:runNeuralNetwork")
   (ListLink (ConceptNode "BB-11") (ConceptNode "surfboard")))
```

doesn't match, it won't consider this (ConceptNode "BB-11") anymore. If we pass all the variables to one evaluation link, this optimization won't happen. |
Ahh. Well, the pattern matcher is pretty fast: it's going to be thousands of times faster than any neural net, and ten times faster than calling a python function. Doing this will in fact slow things down: the pattern matcher is still forced to explore all possible groundings, because it cannot guess what your |
I checked, and the pattern matcher does not call one evaluation link if the other doesn't match; it doesn't respect the order in the AndLink though. |
The AndLink is an unordered link. SequentialAnd is ordered. |
... except the pattern matcher doesn't respect SequentialAnds either. What the AndLink is doing is specifying a set of clauses which must be satisfied. The clauses form a graph; graphs do not have an "order". |
Thanks Linas! SequentialAnd seems to be respected by the pattern matcher, at least for Evaluation links. I guess the pattern matcher first considers inheritance links, then evaluation links. This can be useful. |
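For reference, a sketch of the ordered form under discussion, using the names from the earlier example (illustrative only; as the exchange above shows, the degree to which the matcher honors the ordering is disputed):

```scheme
;; SequentialAnd specifies an evaluation order for its clauses, so the
;; cheap Inheritance check can be listed before the expensive
;; neural-net call; a failed cheap clause can then short-circuit the
;; expensive one.
(SequentialAnd
   (Inheritance (Variable "$B") (Concept "BoundingBox"))
   (Evaluation (GroundedPredicate "py:runNeuralNetwork")
      (List (Variable "$B") (Concept "surfboard"))))
```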
Closing; I don't see any work items here, and I think all issues have been resolved. |
We need to use the rule engine along with virtual evaluation links.
The current pattern matcher implementation does not store the truth value returned by a grounded predicate.
Instead, the matched evaluation link keeps its default value of (stv 1 0). Here "matched" means having a mean TV above the hardcoded threshold of 0.5.
Example:
====================================================================
The current implementation gives (stv 1.000000 0.000000) for all AndLinks for the first conjunction (run-bc AndRedBox1):
One possible fix is to store the truth value somewhere at the end of `DefaultPatternMatchCB::eval_term` (atomspace/opencog/query/DefaultPatternMatchCB.cc, lines 679 to 683 at e965598).
Then I get the expected result: (stv 0.55 0.77770132)
======================================================================