Conditional generation of triples #2

Closed
fpompermaier opened this issue Jul 24, 2014 · 12 comments

Comments

@fpompermaier

Hi,
nice work! Do you think it could be possible to manage a condition for generation of triples?
For example, adding a rr:condition (expression) to a rr:subjectMap like:

<#DateTime>
    rml:logicalSource [
        rml:sourceName "example.xml";
        rml:iterator "/notes/note";
        rml:queryLanguage ql:XPath;
    ];
    rr:subjectMap [
        rr:template "http://www.example.com/DateTime/{@year}{@month}{@day}";
        rr:class ex:DateTime;
        rr:condition "{@month}.trim() != ''"
    ].

Thanks,
FP

@andimou
Contributor

andimou commented Jul 28, 2014

Hey Flavio,
XPath has no trim function; the closest equivalent is normalize-space(), so the test becomes normalize-space(@month) != "". The condition would be better written as:

rml:condition "{normalize-space(@month)} != \"\""

I'm looking into it.
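
To make the intended behaviour concrete, here is a minimal sketch in Python of what such a condition would do. It is not how the RML processor works. The sample XML and the names `normalize_space` and `subjects` are assumptions made for illustration, and `normalize_space` only mimics XPath's normalize-space(), since Python's xml.etree.ElementTree does not provide that function.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input shaped like the "/notes/note" iterator above.
SAMPLE = """
<notes>
  <note year="2014" month="07" day="24"/>
  <note year="2014" month="  " day="25"/>
  <note year="2014" day="26"/>
</notes>
"""

TEMPLATE = "http://www.example.com/DateTime/{year}{month}{day}"


def normalize_space(value):
    """Mimic XPath normalize-space(): trim and collapse whitespace runs."""
    return " ".join((value or "").split())


def subjects(xml_text):
    """Yield a subject IRI only for notes whose @month is non-empty."""
    root = ET.fromstring(xml_text)
    for note in root.findall("note"):
        month = normalize_space(note.get("month"))
        # The condition under discussion: normalize-space(@month) != ""
        if month == "":
            continue
        yield TEMPLATE.format(
            year=note.get("year", ""),
            month=month,
            day=note.get("day", ""),
        )


print(list(subjects(SAMPLE)))
```

Only the first note yields a subject; the notes with a blank or missing @month are skipped instead of producing a malformed IRI.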

@RubenVerborgh
Member

In this particular case, we might wonder whether we could solve this on a different level of abstraction. After all, emptiness and whitespace trimming are common functions.

@andimou
Contributor

andimou commented Jul 28, 2014

@RubenVerborgh, that's why I said I'm looking into it (at the RML model level), but the example given seems to me to be about conditional statements rather than data cleansing.

@fpompermaier
Author

That's right. It could be useful to integrate it with a library that evaluates expressions, such as the Google Refine Expression Language (GREL) functions (https://github.com/OpenRefine/OpenRefine/wiki/GREL-Functions) or something similar (JEval, etc.). In my use case it's useful to put conditions on the generation of subjects and predicates.

@andimou
Contributor

andimou commented Jul 28, 2014

GREL is more about data cleansing, though. GREL and regular expressions were both considered when the HTML2RDF mappings were performed.
In that case, we introduced rml:process, accompanied by rml:processFormulation at the Logical Source, which declares the formulation used in each case (regular expressions, GREL, etc.).
@fpompermaier, if that's what you need, let me clean up my code and I'll push it asap.

On the other hand, rml:condition could cover cases where a condition (something like an if statement) needs to be fulfilled for the mapping to be performed. We still need to see whether that's necessary, because so far we have tried to stay as close as possible to the R2RML standard.

@fpompermaier
Author

That would be awesome for my use case; otherwise I'll have to implement it myself.

@fpompermaier
Author

Actually, what I was trying to implement as a PoC was something like rr:guard or rr:expjeval, evaluated with JEval on subjects and predicates, to control when they are created. That is what I need in my use case. Is it so uncommon to have such a requirement?
I think that with this feature RML would be perfect!
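
As an illustration of this requirement, guards at both levels could behave as in the following Python sketch. Everything in it is hypothetical: the record layout, the `ex:` predicates, and the names `non_empty`, `MAPPING` and `generate` are made up for the example, and the guards are plain Python callables standing in for JEval expressions.

```python
def non_empty(field):
    """Build a guard that passes when the record's field is not blank."""
    return lambda record: str(record.get(field, "")).strip() != ""


# Hypothetical mapping: one guarded subject and two predicate-object entries.
MAPPING = {
    "subject_template": "http://www.example.com/Person/{id}",
    "subject_guard": non_empty("id"),
    "predicate_objects": [
        # (predicate, field, guard); a guard of None means "always generate".
        ("ex:name", "name", None),
        ("ex:email", "email", non_empty("email")),
    ],
}


def generate(record, mapping):
    """Return the triples for one record, honouring both guard levels."""
    if not mapping["subject_guard"](record):
        return []  # subject guard failed: no triples at all for this record
    subject = mapping["subject_template"].format(**record)
    triples = []
    for predicate, field, guard in mapping["predicate_objects"]:
        if guard is not None and not guard(record):
            continue  # predicate guard failed: skip only this triple
        triples.append((subject, predicate, record.get(field, "")))
    return triples


print(generate({"id": "1", "name": "Ann", "email": " "}, MAPPING))
print(generate({"id": "", "name": "Bob", "email": "bob@example.com"}, MAPPING))
```

A failed subject guard suppresses every triple for the record, while a failed predicate guard suppresses only its own triple.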

@fpompermaier
Author

I just pushed a merge request with a simple implementation for managing production rules using JEval. This is just a PoC; I just want to know whether it is OK or whether you think it should be done in a different way. I cannot send you the test because it contains private data, but we can keep in touch if you want so I can send it to you!

@andimou
Contributor

andimou commented Sep 3, 2014

The problem I see is that it is case-specific.
It should normally be accompanied by an rml:processFormulation ql:JEval declared at the Logical Source of each triples map. The values to be processed should then be passed to a separate module (for instance, a processing performer, which I'm investigating/developing) that takes care of the post-processing, in your case using JEval.

From the RML model's perspective, this should be able to handle any possible post-processing. In your case it is JEval; I'm using regular expressions; eventually the post-processing will handle GREL. However, an RML processor might be implemented in Python or any other programming language where JEval is not available but processing is still required.

At the moment, I hesitate to use by default something that's not compliant with a standard/specification/recommendation (as that's the way RML works so far; more details at rml.io), let alone something that is programming-language specific. However, it should not be restrictive: if you or anyone else wants to use JEval or any other library, you should always be able to do so by adding it as a new processing performer, in the same way that you add a new type of input source.
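
The pluggable design described here could look roughly like the following Python sketch. This is not the RML processor's API: the registry, the `register_performer` decorator, the `process` function and the formulation names "regex" and "strip" are all hypothetical, chosen only to show how a new performer might be added without touching the dispatch code.

```python
import re

# Registry mapping a formulation name to the callable that performs it.
PERFORMERS = {}


def register_performer(formulation):
    """Decorator that registers a performer under a formulation name."""
    def decorator(func):
        PERFORMERS[formulation] = func
        return func
    return decorator


@register_performer("regex")
def regex_performer(expression, value):
    """Return the first match of the expression in value, or None."""
    match = re.search(expression, value)
    return match.group(0) if match else None


@register_performer("strip")
def strip_performer(expression, value):
    """A second performer added the same way; it ignores the expression."""
    return value.strip()


def process(formulation, expression, value):
    """Dispatch a value to the performer declared for the logical source."""
    try:
        performer = PERFORMERS[formulation]
    except KeyError:
        raise ValueError(f"no performer registered for {formulation!r}") from None
    return performer(expression, value)


print(process("regex", r"\d{4}", "year 2014"))
```

A JEval-backed performer, or one for GREL, would be registered under its own formulation name in the same way, so the core stays independent of any particular library or language.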

@fpompermaier
Author

Thanks for the reply. I agree that this functionality needs to be better formalized and standardized.
My pull request was just a PoC of what I need (and not only me, I think) and a rough implementation from which you can hopefully take some insights.
I'm looking forward to seeing this functionality taken into account in a future release.

@andimou andimou closed this as completed Sep 8, 2014
@andimou
Contributor

andimou commented Oct 15, 2014

@fpompermaier

I cannot send you the test because it contains private data but we can keep in touch if you want so I can send it to you!

Would you like to send me a private e-mail with more info about your use case?

@fpompermaier
Author

Sorry for the late reply. My private mail is [email protected]

In my company (okkam.it) we would like to promote the use of RML, but for
some input sources we need to generate triples or quadruples only
if some condition is true, as explained some time ago.

At the moment we use a custom program but we'd really love to promote the
adoption of a standard if possible!

Best,
Flavio
