Loading data to AtomSpace #12
Comments
Although it is somewhat of a test issue, let me answer it. Ideally you want to load CSV directly into Atomese, to avoid paying the overhead of creating a structure like The completely ideal solution would be to reuse as much as you can from |
Nil, how are you representing tables? Again, I want to draw your attention to the module
and the types of all row atoms are the same, etc., then the matrix API does "neat things". (It does NOT read from a file, though.)
as well. I'm currently experimenting with much more awkward row-column representations. What the matrix (aka vector) API does is let you keep some complicated blob of data in the AtomSpace and declare that some subset of it looks "just like a matrix" or "just like a table"; it then implements a bunch of generic table/matrix methods on it (currently conditional probabilities, mutual information, cosine and Jaccard distances, etc.). It doesn't matter what the actual atoms really are, because all the algos just use the definition of the matrix to find the right atoms. It would be nice if, in some hazy future, MOSES worked the same way. It's probably too early for this right now, but it's an idea. (It would be nice to port the matrix API to C++, for speed, and to port it to R, so that Mike and the biology guys could examine matrix-like slices of the AtomSpace in R. But that's a different project.) |
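To make the "declare a subset to be a matrix" idea concrete, here is a minimal Python sketch. It is not the matrix API itself; the class `MatrixView`, its accessor arguments, and the sample `blob` are all hypothetical names invented for illustration. It only shows the pattern: generic methods that find their data through a user-supplied definition of rows, columns and values, and never look at what the records really are. Jaccard is computed here on the sets of columns present in each row, which is one of several possible definitions.

```python
import math


class MatrixView:
    """Hypothetical view that treats selected records as (row, column, value) entries."""

    def __init__(self, records, is_entry, row_of, col_of, value_of):
        # rows maps a row key to a {column key: value} dictionary.
        self.rows = {}
        for rec in records:
            if is_entry(rec):
                self.rows.setdefault(row_of(rec), {})[col_of(rec)] = value_of(rec)

    def cosine(self, r1, r2):
        """Cosine similarity between two rows, treating missing columns as zero."""
        a, b = self.rows.get(r1, {}), self.rows.get(r2, {})
        dot = sum(v * b[c] for c, v in a.items() if c in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def jaccard(self, r1, r2):
        """Set Jaccard similarity over the columns present in each row."""
        a, b = set(self.rows.get(r1, {})), set(self.rows.get(r2, {}))
        union = a | b
        return len(a & b) / len(union) if union else 0.0


# A mixed "blob": only the records tagged "count" belong to the matrix.
blob = [
    ("count", "r1", "i1", 1.0),
    ("count", "r1", "i2", 1.0),
    ("count", "r2", "i1", 1.0),
    ("other", "x", "y", 5.0),
]

view = MatrixView(
    blob,
    is_entry=lambda rec: rec[0] == "count",
    row_of=lambda rec: rec[1],
    col_of=lambda rec: rec[2],
    value_of=lambda rec: rec[3],
)

print(view.cosine("r1", "r2"))
print(view.jaccard("r1", "r2"))
```

The algorithms never inspect the record layout directly, so swapping in a different representation only means passing different accessor functions.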
I read the README, but it wasn't clear to me how to incorporate that data in the Atomese evaluation; I mean, for instance, evaluating
where What I've been thinking though is to have a column stored as a list of values, such as
it could reuse this column to avoid re-evaluating Having said that, I'm not terribly concerned about efficiency at this point, I just want to attempt to move towards a direction that would foster "holistic" cognitive integration, like reasoning on programs, fitness functions and data. |
@ngeiswei Assuming that we're going to implement this using option 2 (i.e. by converting As described at https://github.com/opencog/as-moses/blob/master/moses/comboreduct/table/table.h#L911
Whereas our current representation in #3 doesn't hold the output and input data separately. Do we need to change it to handle that? |
@Yidnekachew I'm not sure what is best at this point. I suppose you may separate output and input data, like Table, for now. The other problem I'm seeing is that Boolean tables don't have any compact representation offered in #3. Either we come up with one or you use the unfolded representation, such as

(Evaluation (stv 0 1)
  (Predicate "i1")
  (Node "r1"))
(Evaluation (stv 1 1)
  (Predicate "i2")
  (Node "r1"))
(Evaluation (stv 1 1)
  (Predicate "o")
  (Node "r1"))

which has the "advantage" of forcing us to experiment with both representations and weigh their pros and cons, maybe. |
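As a rough illustration of the unfolded representation, the following Python sketch emits one Evaluation per feature of a Boolean row, as Atomese text. The helper name `unfolded_row` and the choice to pass a row as a dictionary are assumptions made for this example; only the output shape follows the snippet in the comment above.

```python
def unfolded_row(row_name, values):
    """Render one Boolean row as unfolded Evaluation links (Atomese text).

    values maps a feature name to a bool; true becomes (stv 1 1) and
    false becomes (stv 0 1).
    """
    atoms = []
    for feature, value in values.items():
        strength = 1 if value else 0
        atoms.append(
            f"(Evaluation (stv {strength} 1)\n"
            f'  (Predicate "{feature}")\n'
            f'  (Node "{row_name}"))'
        )
    return "\n".join(atoms)


# Row r1 from the comment above: i1 is false, i2 and o are true.
text = unfolded_row("r1", {"i1": False, "i2": True, "o": True})
print(text)
```

Each row costs one Evaluation per feature plus a row node, which is the overhead the compact representations try to avoid.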
@ngeiswei If we're doing it both ways, a dataset like this
is going to be represented as: For the Boolean type, Using input and output table
Using the unfolded table
For the Real type, Using input and output table
Using the compact format
Am I right? I will also need to have a look if |
That's correct, @Yidnekachew. |
BTW, it's better if the first feature is the output (as that is MOSES' default assumption; I've corrected #3 accordingly). |
Other representations to consider would be

(List
  (List (Schema "o") (Schema "i1") (Schema "i2"))
  (List (Number 1) (Number 0) (Number 1))
  (List (Number 1) (Number 1) (Number 0))
  (List (Number 0) (Number 0) (Number 0)))

This one is probably the most compact and doesn't need to introduce row nodes. Its drawback is that it has no self-contained semantics. Another option, to avoid having two distinct representations for Boolean and numerical data, is to use TrueLink http://wiki.opencog.org/w/TrueLink and FalseLink http://wiki.opencog.org/w/FalseLink. I'm also thinking of another representation that may have the advantage of the one above (i.e. it doesn't introduce row nodes) yet is semantically self-contained. I'll come back to that later. Meanwhile, here's my suggestion: since we are more or less stepping into the unknown (as far as I am concerned, I don't have a clear-cut idea of what is going to be best), I suggest you implement all |
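A minimal sketch of producing the compact list-of-lists form from CSV text, in Python. The function name `compact_table`, its `output` parameter, and the sample CSV are assumptions for illustration; this is not the AS-MOSES loader. It moves the output column first, following the MOSES default mentioned earlier in the thread, and copies cell values into Number nodes without validating them.

```python
import csv
import io


def compact_table(csv_text, output="o"):
    """Render CSV text as a compact Atomese list of lists, output column first."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    rows = [row for row in reader if row]  # skip blank lines

    # Column order: the output feature first, then inputs in their original order.
    out_idx = header.index(output)
    order = [out_idx] + [i for i in range(len(header)) if i != out_idx]

    lines = ["(List " + " ".join(f'(Schema "{header[i]}")' for i in order) + ")"]
    for row in rows:
        cells = " ".join(f"(Number {row[i].strip()})" for i in order)
        lines.append("(List " + cells + ")")

    return "(List\n" + "\n".join("  " + line for line in lines) + ")"


csv_text = "i1,i2,o\n0,1,1\n1,0,1\n0,0,0\n"
table = compact_table(csv_text)
print(table)
```

Going through an intermediate string keeps the sketch free of any AtomSpace dependency; a real loader would presumably create the atoms directly.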
Obviously, one way to make the compact table representation sort of semantically self-contained is to wrap it in an "AS-MOSES:table" predicate or something, like

(Evaluation
  (Predicate "AS-MOSES:table")
  (List
    (List (Schema "o") (Schema "i1") (Schema "i2"))
    (List (Number 1) (Number 0) (Number 1))
    (List (Number 1) (Number 1) (Number 0))
    (List (Number 0) (Number 0) (Number 0))))

That requires subsequent transformations to reason about it and the axiomatization of |
Port fitnesseval logical_bscore
Question: which approach would be preferred?: