Load CSV tables into AtomSpace #2989
BTW, @ngeiswei @Habush @kasimebrahim @Yidnekachew @Bitseat @eman @behailu04 I'd like to bring your attention to the brand-new demo. The main difference here is that the main atomspace evaluator is used, instead of the as-moses evaluator. That means that all the functions from the AtomSpace are supported, and not just some of them. The functions look very similar to the as-moses atomese/combo trees; they're only a little bit different. There's an extra [...] This opens the possibility for applying moses algos to non-table data, including video and audio data, or any kind of streaming data, or complex data sources. The Value system allows data to flow in from anywhere, in any way. The AS-MOSES system can then explore different kinds of mutations applied to data processing pipelines. I'm getting ready to tackle some of these data sources. Anyway, thanks for your work in as-moses. It's not been in vain. The future is bright, methinks.
Hi Linas,
It is really great to hear the news and also great to hear from you. :)
Congratulations on the big achievement and I thank you for the recognition.
Kind regards,
Bitseat
On Sun, Aug 21, 2022 at 1:32 PM Linas Vepštas wrote:
Merged #2989 into master.
That is impressive, @linas! And thank you for keeping us in the loop.
this is very cool, thanks Linas! how hard would it be to expand this to import SQL db dumps and reproduce the relationships of the connecting keys between tables?
Hi @mjsduncan -- not hard. Not easy. It depends. Let me start with a question. Do you want the SQL data as Values, or as Atoms? The CSV mapping puts an entire column of a table into a single vector value (because this is "natural" for moses). An alternative would have been to take each row of the table, and convert it into a [...] The nice thing about using vectors is that they're fast, compact, uniform. The bad thing is they're not searchable. By contrast, you can search (pattern-match) the EvaluationLinks; but they're slower, bulkier. Long ago, I came up with this idea, never implemented. Tell me what you think. It goes like this: [...]
I don't know if you're interested in the second bullet or not. If you're working with biology databases, then maybe working from dumps is all you want. Maybe the live data connection isn't needed. The live data connection is trickier, harder and more fragile. One "hard part" is coming up with a generic way of allowing the user to specify what the table-to-atomese mapping is. I've got ideas for this (See wiki page for SignatureLink...) but it would take some polishing to get it right. Excuse me. As I write the above, I just realized there are two easy tricks... Just click here. One trick is to create a [...]
thanks for the detailed reply, Linas. i'm definitely thinking of importing data as atoms, ultimately converted into a more compact and semantically meaningful form than the original tables, otherwise what would be the point? what i'm interested in is importing a whole database, tho i can see the value in what would be a SQL interface module so info from a SQL db could be imported as needed for evaluation & inference. my question is motivated by the existence of a relational db schema and related tools that are being used to compile data on model organisms: importing these into an atomspace would be fertile ground for developing automated biological inference systems
Hi Mike, The way to move forward is to open a new issue on github, describe the general desired features, and reference the discussion here. We should continue the discussion there. To build this, make things concrete: pick the one or two schemas that seem to be the most important for you, and copy them into the issue. Then write down the matching AtomSpace structures that these would be converted into. Basically, provide a detailed example. This will allow me to think concretely about how to implement things. Where's the data? Do you just want to import database dumps stored in some compressed files? Or will you set up a server somewhere, running some DB, that will hold the data? If there's some server, what is it? postgres? mariadb? redis? something else? I would need to know, in order to connect to it and interact with it.
This provides the ability to load plain-text tables (comma-separated values, tab-separated values) into the AtomSpace.
The format allows Atomese programs to act on the columns of the table (add, subtract, etc.).
This is one of the important capabilities needed by old-style as-moses.
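To make the column-oriented idea concrete, here is a minimal sketch in plain Python. It does not use the AtomSpace API and is not the loader added by this pull request; the helper names (`load_columns`, `add`, `subtract`) and the sample table are hypothetical, chosen only to illustrate storing each column as one numeric vector and then doing arithmetic on whole columns.

```python
import csv
import io


def load_columns(text, delimiter=","):
    """Parse delimited text with a header row into {column name: list of floats}.

    Hypothetical helper for illustration; not part of the AtomSpace.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = [name.strip() for name in next(reader)]
    columns = {name: [] for name in header}
    for row in reader:
        if not row:  # skip blank lines
            continue
        for name, cell in zip(header, row):
            columns[name].append(float(cell))
    return columns


def add(xs, ys):
    """Element-wise sum of two columns."""
    return [x + y for x, y in zip(xs, ys)]


def subtract(xs, ys):
    """Element-wise difference of two columns."""
    return [x - y for x, y in zip(xs, ys)]


# Made-up sample data: three columns, three rows.
TABLE = "b1,b2,b3\n3,4,5\n1,0,2\n7,7,1\n"

cols = load_columns(TABLE)
total = add(cols["b1"], cols["b2"])
diff = subtract(cols["b3"], cols["b1"])

print(total)  # [7.0, 1.0, 14.0]
print(diff)   # [2.0, 1.0, -6.0]
```

Passing `delimiter="\t"` handles tab-separated input in the same way. Keeping one vector per column makes column-wise operations a single pass over uniform data, which matches the trade-off discussed in the thread: fast and compact, but not searchable row by row.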