XLSX source is not supported #79

Closed
neobernad opened this issue Feb 26, 2024 · 6 comments
@neobernad

Hi @dachafra,

Opening up a new issue as suggested in #77.

I have the following YARRRML file (actually it is longer and I have anonymized it):

prefixes:
  ds_data: https://example.com/data/MyData/
  ds_property: https://example.com/MyProperty/
  rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
  rdfs: http://www.w3.org/2000/01/rdf-schema#
  grel: http://users.ugent.be/~bjdmeest/function/grel.ttl#
  morph-kgc: https://github.com/morph-kgc/morph-kgc/function/built-in.ttl#

sources:
  MyDataSource:
    - data.xlsx

mappings:
  PersonMappings:
    sources: MyDataSource

    s: ds_data:$(person_name)
    po:
      - [rdf:type, ds_property:Person]
      - [rdfs:label, $(person_name), xsd:string]

Translating the file above raises the exception "ERROR: The YARRRML mapping has not been translated":

import yatter
from ruamel.yaml import YAML
yaml = YAML(typ='safe', pure=True)
rml_content = yatter.translate(yaml.load(open("mappings.yml")))

Here, rml_content is None.
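
Since the translation result is None rather than a raised exception, a small guard makes the failure explicit. The following is only a sketch: `require_translation` is a hypothetical helper, not part of yatter, and the hint in its message assumes the cause is a missing referenceFormulation, which is only one possible reason.

```python
from typing import Optional


def require_translation(rml_content: Optional[str], mapping_path: str) -> str:
    """Return the translated RML, or raise if the translator gave back None.

    Hypothetical helper for illustration; it is not part of yatter.
    """
    if rml_content is None:
        raise ValueError(
            f"{mapping_path}: the YARRRML mapping was not translated; "
            "check that every source declares a referenceFormulation"
        )
    return rml_content


# Usage, assuming rml_content came from yatter.translate(...):
# rml_content = require_translation(rml_content, "mappings.yml")
```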

What could be wrong? After a quick look while debugging yatter, the YARRRML file seems to be opened and parsed properly, but perhaps something in the translate or get_non_asserted_mappings methods does not like the structure of the mappings?

Thanks,
José Antonio

@dachafra
Member

Hi José Antonio,

Your mapping is missing the referenceFormulation. I understand it is CSV, right? (https://rml.io/yarrrml/spec/#reference-formulation)

prefixes:
  ds_data: https://example.com/data/MyData/
  ds_property: https://example.com/MyProperty/
  rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
  rdfs: http://www.w3.org/2000/01/rdf-schema#
  grel: http://users.ugent.be/~bjdmeest/function/grel.ttl#
  morph-kgc: https://github.com/morph-kgc/morph-kgc/function/built-in.ttl#

sources:
  MyDataSource:
    - data.xlsx~csv

mappings:
  PersonMappings:
    sources: MyDataSource

    s: ds_data:$(person_name)
    po:
      - [rdf:type, ds_property:Person]
      - [rdfs:label, $(person_name), xsd:string]
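
In the source shorthand data.xlsx~csv, the part before the tilde is the file to access and the part after it is the referenceFormulation. The sketch below only illustrates that split; `split_source` is a hypothetical function and is not how yatter itself parses sources.

```python
from typing import Optional, Tuple


def split_source(shorthand: str) -> Tuple[str, Optional[str]]:
    """Split a YARRRML source shorthand into (access, referenceFormulation).

    Hypothetical illustration; returns None as the formulation when the
    shorthand has no "~" part, which is the case from the original mapping.
    """
    access, sep, formulation = shorthand.rpartition("~")
    if not sep:
        # No tilde: the whole string is the access path.
        return shorthand, None
    return access, formulation.lower()


print(split_source("data.xlsx~csv"))
print(split_source("data.xlsx"))
```

The second call corresponds to the original mapping, where no referenceFormulation was declared.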

@dachafra added the bug label on Feb 26, 2024
@dachafra changed the title from '"ERROR: The YARRRML mapping has not been translated" exception from YARRRML to RML' to 'XLSX source is not supported' on Feb 26, 2024
@dachafra
Member

dachafra commented Feb 27, 2024

Hi @neobernad, is this what you expect from the output?

@prefix ds_data: <https://example.com/data/MyData/>.
@prefix ds_property: <https://example.com/MyProperty/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#>.
@prefix morph-kgc: <https://github.com/morph-kgc/morph-kgc/function/built-in.ttl#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix schema: <http://schema.org/>.
@prefix formats: <http://www.w3.org/ns/formats/>.
@prefix comp: <http://semweb.mmlab.be/ns/rml-compression#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@base <http://example.com/ns#>.


<PersonMappings_0> a rr:TriplesMap;

	rml:logicalSource [
		a rml:LogicalSource;
		rml:source "data.xlsx";
		rml:referenceFormulation ql:CSV
	];
	rr:subjectMap [
		a rr:SubjectMap;
		rr:template "https://example.com/data/MyData/{person_name}";
	];
	rr:predicateObjectMap [
		rr:predicateMap [
			a rr:PredicateMap;
			rr:constant rdf:type;
		];
		rr:objectMap [
			a rr:ObjectMap;
			rr:constant ds_property:Person;
		];
	];
	rr:predicateObjectMap [
		rr:predicateMap [
			a rr:PredicateMap;
			rr:constant rdfs:label;
		];
		rr:objectMap [
			a rr:ObjectMap;
			rml:reference "person_name";
			rr:datatype xsd:string
		];
	].

@neobernad
Author

Hi @dachafra,

Firstly, thanks for the quick response :-).

I forgot to mention that I had removed the referenceFormulation, which was formerly xlsx, because it triggered an error.

The output you suggest is what I would expect! Should I convert any Excel file into a CSV, then?

@dachafra
Member

If you want to use the Excel referenceFormulation, you need to use its engine and extension: https://www.dfki.uni-kl.de/~mschroeder/demo/excel-rml/. But I guess you want to use an XLSX file without transforming it to CSV. The behavior would be the same, i.e. the file is parsed row by row, as if it were a CSV (that is what the referenceFormulation means). If this output is what you expect, it's already solved and I'll push the changes.
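
To make "parsed per row" concrete, here is a rough sketch of what a CSV referenceFormulation implies. Each row is one iteration and each column name is a reference that can fill a template. The sample rows are invented, the standard csv module stands in for whatever reader an engine uses for XLSX, and real engines also IRI-encode the inserted values, which this sketch skips.

```python
import csv
import io

# Hypothetical tabular content; an XLSX sheet would expose the same rows.
SAMPLE = "person_name\nAlice\nBob\n"
TEMPLATE = "https://example.com/data/MyData/{person_name}"


def subjects(rows_text: str, template: str) -> list:
    """Build one subject IRI per row by filling the template from its columns."""
    reader = csv.DictReader(io.StringIO(rows_text))
    return [template.format(**row) for row in reader]


print(subjects(SAMPLE, TEMPLATE))
# ['https://example.com/data/MyData/Alice', 'https://example.com/data/MyData/Bob']
```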

dachafra added a commit that referenced this issue Feb 27, 2024
@dachafra
Member

Now, XLSX is supported with CSV referenceFormulation.

@neobernad
Author

Sorry for the late response, I could not work on this topic until now.

I have tested it with the latest version of the repository and with CSV referenceFormulation. It works like a charm, thank you! :-)
