Commit

Merge pull request #1 from EMMC-ASBL/make_dlite2cuds_usecase_independent
The dlite2cuds and cuds2dlite functionalities from utils have been made use-case independent with a simple molecule example. Note that the example only contains single-value 'values', i.e. strings, ints, and floats, but no lists or arrays. This means that dimensions above 1 cannot be considered yet. Strategies are to be fixed in subsequent PRs, as well as other tests that have been commented out.
francescalb authored Feb 9, 2023
2 parents 5f3db62 + 5246dc0 commit 357369f
Showing 64 changed files with 1,108 additions and 582 deletions.
38 changes: 20 additions & 18 deletions .github/workflows/ci_tests.yml
@@ -25,19 +25,19 @@ jobs:
python -m pip install --upgrade pip
pip install -U setuptools wheel
while IFS="" read -r line || [ -n "${line}" ]; do
if [[ "${line}" =~ ^pre-commit.*$ ]]; then
pre_commit="${line}"
fi
done < requirements_dev.txt
while IFS="" read -r line || [ -n "${line}" ]; do
if [[ "${line}" =~ ^invoke.*$ ]]; then
invoke="${line}"
fi
done < requirements_docs.txt
pip install ${pre_commit} ${invoke}
#while IFS="" read -r line || [ -n "${line}" ]; do
# if [[ "${line}" =~ ^pre-commit.*$ ]]; then
# pre_commit="${line}"
# fi
#done < requirements_dev.txt
#while IFS="" read -r line || [ -n "${line}" ]; do
# if [[ "${line}" =~ ^invoke.*$ ]]; then
# invoke="${line}"
# fi
#done < requirements_docs.txt
pip install -e .[dev]
# pip install ${pre_commit} ${invoke}
- name: Test with pre-commit
run: SKIP=pylint,pylint-tests pre-commit run --all-files
@@ -59,20 +59,22 @@ jobs:
run: |
python -m pip install -U pip
pip install -U setuptools wheel
pip install -U -r requirements.txt -r requirements_dev.txt -r requirements_docs.txt
pip install -e .
pip install -e .[dev]
pip install safety
- name: Run pylint
run: pylint --rcfile=pyproject.toml --ignore-paths=tests/ --extension-pkg-whitelist='pydantic' *.py dlite_cuds

- name: Run pylint - tests
run: pylint --rcfile=pyproject.toml --extension-pkg-whitelist='pydantic' --disable=import-outside-toplevel,redefined-outer-name,import-error tests
run: pylint --rcfile=pyproject.toml --extension-pkg-whitelist='pydantic' --disable=import-outside-toplevel,redefined-outer-name,import-error tests --recursive=yes

# Ignore ID 44715 for now.
# See this NumPy issue for more information: https://github.com/numpy/numpy/issues/19038
# Ignore ID 51668 for now.
# This is a subdependency.
# Ignore ID 48547, because of RDFLIB.
- name: Run safety
run: pip freeze | safety check --stdin --ignore 44715
run: pip freeze | safety check --stdin --ignore 44715 --ignore 51668 --ignore 48547

pytest:
name: pytest (${{ matrix.os[1] }}-py${{ matrix.python-version }})
@@ -159,7 +161,7 @@ jobs:
run: |
python -m pip install -U pip
pip install -U setuptools wheel
pip install -e .[docs]
pip install -e .[doc]
- name: Build
run: |
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -19,13 +19,13 @@ repos:
args: [--markdown-linebreak-ext=md]

- repo: https://github.com/timothycrosley/isort
rev: 5.11.4
rev: 5.12.0
hooks:
- id: isort
args: ["--profile", "black", "--filter-files", "--skip-gitignore"]

- repo: https://github.com/ambv/black
rev: 22.12.0
rev: 23.1.0
hooks:
- id: black

20 changes: 20 additions & 0 deletions README.md
@@ -1,5 +1,25 @@
# DLite2CUDS

## Restrictions on the input CUDS when converting to DLite DataModel

The implementation has some severe restrictions for now:

* An individual can only have one rdf:type, i.e. it can belong to only one class (for instance, it cannot be both rdf:type :Human and rdf:type :Mother).
In theory, this should not be a limitation, but the current implementation does not allow it.

* All individuals of the same type must have the exact same properties defined. Incomplete individuals are not accepted (i.e. individuals missing a property).
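
The two restrictions above can be checked before conversion. The following is a minimal sketch, not part of this repository: the function name `check_cuds_restrictions` is hypothetical, and it assumes the CUDS graph has already been flattened into plain `(subject, predicate, object)` string tuples.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def check_cuds_restrictions(triples):
    """Return a list of human-readable problems found in (s, p, o) triples.

    Hypothetical helper for illustration only; not a dlite_cuds function.
    """
    types = defaultdict(set)       # individual -> classes it is typed as
    properties = defaultdict(set)  # individual -> predicates it defines

    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types[subject].add(obj)
        else:
            properties[subject].add(predicate)

    problems = []

    # Restriction 1: an individual may have only one rdf:type.
    for individual, classes in sorted(types.items()):
        if len(classes) > 1:
            problems.append(f"{individual} has several types: {sorted(classes)}")

    # Restriction 2: individuals of the same class define the same properties.
    by_class = defaultdict(list)
    for individual, classes in sorted(types.items()):
        if len(classes) == 1:
            by_class[next(iter(classes))].append(individual)

    for cls, individuals in sorted(by_class.items()):
        reference = properties[individuals[0]]
        for individual in individuals[1:]:
            if properties[individual] != reference:
                problems.append(
                    f"{individual} does not define the same properties as "
                    f"{individuals[0]} (class {cls})"
                )
    return problems
```

An empty list means the input satisfies both restrictions; each entry otherwise names the offending individual.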

Restrictions on Dlite Models etc:

* Three things are needed: Entity of interest (e.g. the DataModel), collection with data, collection with mappings.

* Not only properties must be mapped, but also the concepts/entities themselves.

* Every piece of data provided needs to be linked to a concept to be added to the graph.

* Only single-value data are supported (dimensionality in the DLite DataModel is not yet implemented).
The type of the values is also limited to standard types (str, int, ...).
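
To illustrate the single-value restriction, here is a minimal sketch of a scalar converter. It is not the repository's `convert_type`; the name `convert_scalar` and the type names `"str"`, `"int"`, and `"float"` are assumptions made for this example, and the type names used by DLite may differ.

```python
# Hypothetical mapping from a type name to a Python constructor.
_SCALAR_TYPES = {"str": str, "int": int, "float": float}


def convert_scalar(value, type_name):
    """Convert a single value to the named standard type.

    Containers are rejected because only single-value data are supported.
    """
    if isinstance(value, (list, tuple, set, dict)):
        raise TypeError("only single-value data are supported")
    try:
        caster = _SCALAR_TYPES[type_name]
    except KeyError:
        raise ValueError(f"unsupported type: {type_name}") from None
    return caster(value)
```

Rejecting containers up front makes the missing support for dimensions explicit instead of letting a list be silently converted to a string.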

An OTEAPI Plugin with OTE strategies.

Further reading:
56 changes: 27 additions & 29 deletions dlite_cuds/strategies/cuds_to_collection_function.py
@@ -51,9 +51,10 @@ class CollectionConfig(AttrDict):
entity_collection_id: Optional[str] = Field(
None,
description="id of the collection that contains the entity and"
"the mapping relations."
"the mapping relations.",
)


class CollectionFunctionConfig(FunctionConfig):
"""Function filter config."""

@@ -69,9 +70,7 @@ class SessionUpdateCollectionFunction(SessionUpdate):
"""Class for returning values when converting from CUDS to Collection."""

graph_cache_key: str = Field(..., description="Cache key to graph.")
collection_id: str = Field(
..., description="Collection uri."
)
collection_id: str = Field(..., description="Collection uri.")


@dataclass
@@ -86,14 +85,16 @@ class CollectionFunctionStrategy:

function_config: CollectionFunctionConfig

def initialize(self, session: "Optional[Dict[str, Any]]" = None) -> SessionUpdate: # pylint: disable=unused-argument

def initialize(
self,
session: "Optional[Dict[str, Any]]" = None, # pylint: disable=unused-argument
) -> SessionUpdate:
"""Initialize."""
return SessionUpdate()

def get(
self, session: "Optional[Dict[str, Any]]" = None # pylint: disable=unused-argument

self,
session: "Optional[Dict[str, Any]]" = None,
) -> SessionUpdateCollectionFunction:
"""Parse CUDS.
Arguments:
@@ -132,13 +133,13 @@ def get(
if self.function_config.configuration.entity_collection_id is None:
key = "entity_collection_id"
if key in session:
entity_collection = dlite.get_instance( \
session.get(key))
entity_collection = dlite.get_instance(session.get(key))
else:
raise DLiteCUDSError(f"Missing {key}")
else:
entity_collection = dlite.get_instance( \
self.function_config.configuration.entity_collection_id)
entity_collection = dlite.get_instance(
self.function_config.configuration.entity_collection_id
)

# get the entity
list_instances = _get_instances(entity_collection.asdict())
@@ -164,38 +165,35 @@ def get(

# the object should normally come from the entity mapping
# but it might be unique so...
cuds_class = get_unique_triple(graph,
entity.uri,
predicate="http://emmo.info/domain-mappings#mapsTo")
cuds_class = get_unique_triple(
graph, entity.uri, predicate="http://emmo.info/domain-mappings#mapsTo"
)

#cuds_class = self.function_config.configuration.cudsClass
cuds_relations = self.function_config.configuration.cudsRelations

# check that the entity is actually mapped to the specified class, missing

# get the list of datum (cuds object isA cuds_class)
listdatums = get_list_class(graph,cuds_class)
listdatums = get_list_class(graph, cuds_class)

# create the collection
coll = dlite.Collection() # not a good idea to use: id='dataset')
coll = dlite.Collection()

# to make it lives longer, to avoid that
# it is freed when exiting that function
coll._incref() # pylint: disable=protected-access

coll._incref() # pylint: disable=protected-access

# loop to create and populate the entities
for idatum,datum in enumerate(listdatums):
# e.g. http://www.osp-core.com/cuds#eb75e4d8-007b-432d-a643-b3a1004b74e1
for idatum, datum in enumerate(listdatums):
# create the instance of the entity
# WARNING with assume that this entity class do not need dimensions
datum0 = entity()
uridatum = datum0.meta.uri

# get the list of properties for that datum cuds object
listprop = get_object_props_uri(graph,datum,cuds_relations)
listprop = get_object_props_uri(graph, datum, cuds_relations)

for propname,propdata in dictprop.items():
for propname, propdata in dictprop.items():
# build the uri of the property
# e.g. http://www.myonto.eu/0.1/Concept#property
uriprop = uridatum + "#" + propname
@@ -204,17 +202,17 @@ def get(
# e.g. http://www.osp-core.com/mycase#property
# Need a test to check that the property is available
# if not we keep the default value from Dlite
concepturi = get_unique_triple(graph,uriprop,predicatemapsto)
concepturi = get_unique_triple(graph, uriprop, predicatemapsto)

# find the uri of the property that is_a propURI
# AND is in relation with datum
# e.g. http://www.osp-core.com/cuds#1130eafc-2fb0-45f2-83ac-72ce9f35e987
propuri = get_unique_prop_fromlist_uri(graph,listprop,concepturi)
propuri = get_unique_prop_fromlist_uri(graph, listprop, concepturi)

# Add a test if propuri is None

# find the property value and unit for that datum
dataprop = get_value_prop(graph,propuri)
dataprop = get_value_prop(graph, propuri)

# assert if the unit are matching
# if dataprop['unit'] != propdata['unit']:
@@ -223,12 +221,12 @@ def get(
# " entity: ",propdata['unit'])

# affect the value to the instance
datum0[propname] = convert_type(dataprop['value'],propdata["type"])
datum0[propname] = convert_type(dataprop["value"], propdata["type"])

# define a label for the Dlite collection
# the label is only an internal reference in the collection
# and so not valid outside. It is then possible to use some simple labels.
label = 'datum_'+str(idatum)
label = "datum_" + str(idatum)

# add the instances to the collection
# it will add a set of relations descripting the instance of Dlite entity
49 changes: 27 additions & 22 deletions dlite_cuds/strategies/cuds_to_entity_function.py
@@ -49,8 +49,10 @@ class EntityConfig(AttrDict):

graph_cache_key: Optional[str] = Field(
None,
description=("Cache key to the graph in the datacache that contains all the cuds"
" and the ontotogy."),
description=(
"Cache key to the graph in the datacache that contains all the cuds"
" and the ontotogy."
),
)

cudsRelations: List[str] = Field(
@@ -64,13 +66,13 @@ class EntityConfig(AttrDict):
)

namespace: str = Field(
"http://www.namespace.no", # Should be changed to DLite default namespace
"http://onto-ns.com/meta",
description=("Namespace of the DLite entity"),
)

version: Optional[str] = Field(
"0.1",
#TODO improve unclear description
# Must improve unclear description
description=("Version of the dlite entity"),
)

@@ -89,17 +91,18 @@ class EntityFunctionConfig(FunctionConfig):
class SessionUpdateEntityFunction(SessionUpdate):
"""Class for returning values when converting from CUDS to DLite Entity."""

triples_key: str = Field(..., description="Key to triples in datacache "
"representing the mapping"
"of the entity properties to the ontology."
)
entity_uri: str = Field(
..., description="uri of the newly created Dlite entity."
triples_key: str = Field(
...,
description="Key to triples in datacache "
"representing the mapping"
"of the entity properties to the ontology.",
)
entity_uri: str = Field(..., description="uri of the newly created Dlite entity.")
# adding the collection id in the session update
entity_collection_id: str = Field(
..., description="id of the collection that contains the entity and"
"the mapping relations."
...,
description="id of the collection that contains the entity and"
"the mapping relations.",
)


@@ -115,7 +118,10 @@ class EntityFunctionStrategy:

function_config: EntityFunctionConfig

def initialize(self, session: "Optional[Dict[str, Any]]" = None) -> SessionUpdate:
def initialize(
self,
session: "Optional[Dict[str, Any]]" = None, # pylint: disable=unused-argument
) -> SessionUpdate:
"""Initialize."""
return SessionUpdate()

@@ -131,7 +137,7 @@ def get(
- uri of the DLite entity.
- uri of the DLite collection containing the entity and mapping relations.
"""

# pylint: disable=too-many-locals
if session is None:
raise DLiteCUDSError("Missing session")

@@ -160,9 +166,7 @@ def get(
cuds_relations = self.function_config.configuration.cudsRelations

if self.function_config.configuration.entityName is None:
self.function_config.configuration.entityName = (
cuds_class.split("#")[1]
)
self.function_config.configuration.entityName = cuds_class.split("#")[1]

# Build the uri of the new DLite entity
uri = (
@@ -177,8 +181,9 @@ def get(
entity, triples = cuds2dlite(graph, cuds_class, cuds_relations, uri)

# Append to the triple the mapping of the entity to the cuds_class
triples.append(spo_to_triple(uri,"http://emmo.info/domain-mappings#mapsTo",
cuds_class))
triples.append(
spo_to_triple(uri, "http://emmo.info/domain-mappings#mapsTo", cuds_class)
)

triples_key = cache.add(triples)

@@ -188,14 +193,14 @@ def get(

# Need to include the relations representing the mapping
for triple in triples:
sub,pred,obj = triple_to_spo(triple)
coll.add_relation(sub,pred,obj)
sub, pred, obj = triple_to_spo(triple)
coll.add_relation(sub, pred, obj)

return SessionUpdateEntityFunction(
**{
"triples_key": triples_key,
"entity_uri": uri,
"entity_collection_id": coll.uuid,
"entity_uuid": entity.uuid
"entity_uuid": entity.uuid,
}
)
1 change: 0 additions & 1 deletion dlite_cuds/strategies/parse.py
@@ -74,7 +74,6 @@ def get(
self,
session: "Optional[Dict[str, Any]]" = None, # pylint: disable=unused-argument
) -> SessionUpdate:

"""Parse CUDS.
Arguments:
session: A session-specific dictionary context.