USE CASE: MSc Programs #17

analice1pt · 2019-05-07T10:03:20Z

Creator: Ana Alice Baptista

Problem statement

An application combines and integrates aggregated Linked Open Data (LOD) about courses from several European universities, processes that data and releases the processed data as LOD. In this case we have LOD as both input and output. The data conforms to three application profiles (AP1, AP2 and AP3). As a result, although the structures of the datasets are similar, some properties and constraints over values differ. For example:
• Property indicating the university of a course: AP1 uses eg1:foo, AP2 uses eg2:bar and AP3 uses eg3:baz.
• Range of that property in each AP: AP1 uses eg1:Foo, AP2 uses eg2:Bar and AP3 uses xsd:string.
• Relationships: in AP1 one course is related to one or more universities; in AP2 one course may be related to zero or more universities; in AP3 one course is related to one and only one university.
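To make the example above concrete, here is a minimal sketch of one way the three profiles could be reconciled. It is only an illustration under stated assumptions: the preferred property name ex:university, the course and university identifiers, and the idea of taking the loosest cardinality are all hypothetical and not part of the use case.

```python
# Hypothetical mapping table: each profile's "university" property is
# rewritten to one preferred property (ex:university is an assumed name).
PREFERRED = "ex:university"
EQUIVALENT = {"eg1:foo": PREFERRED, "eg2:bar": PREFERRED, "eg3:baz": PREFERRED}

# Cardinality of the university property per profile as (min, max);
# None stands for "unbounded".
CARDINALITY = {"AP1": (1, None), "AP2": (0, None), "AP3": (1, 1)}


def normalize(triples):
    """Rewrite equivalent predicates to the preferred one; keep the rest."""
    return [(s, EQUIVALENT.get(p, p), o) for s, p, o in triples]


def loosest(bounds_list):
    """Bounds that data from every listed profile satisfies."""
    bounds_list = list(bounds_list)
    lo = min(b[0] for b in bounds_list)
    unbounded = any(b[1] is None for b in bounds_list)
    hi = None if unbounded else max(b[1] for b in bounds_list)
    return lo, hi


# Made-up sample data, one triple per profile plus an unrelated property.
source = [
    ("course:1", "eg1:foo", "uni:A"),
    ("course:2", "eg2:bar", "uni:B"),
    ("course:3", "eg3:baz", "University C"),
    ("course:3", "dct:title", "MSc Data"),
]
merged = normalize(source)
merged_bounds = loosest(CARDINALITY.values())
```

Note that the sketch leaves the range problem open: after normalization the values are still a mix of resources and plain strings, which is exactly question 3 below.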

The questions are:
1 - How to map data originating from different datasets?
2 - How to deal with different but equivalent properties?
3 - How to deal with different domains and ranges of equivalent properties?
4 - How to deal with different constraints over values?

Stakeholders

Data providers, universities, future university students and other potential users of the application.

Links

Requirements

R1 – To be able to identify, relate and map possibly conflicting application profiles.
R2 – To be able to state preferred properties, classes and related constraints over possibly conflicting possibilities.
R3 – To be able to identify which data sources are related to a given profile.

Comments

We usually think of application profiles for data that we want to make available. What do we do when we want to develop applications that use data made available by others?

philbarker · 2019-05-09T10:22:20Z

One issue for a consuming application that uses an application profile, which I don't think is mentioned here, is what to do with incoming data that conforms to the base specification but does not adhere to the application profile. Options:

1 - keep all data that arrives so that it can be passed on to others
2 - only ingest the data that adheres to the profile

Option 2 might seem the obvious route, but many applications prescribe a minimal profile, i.e. saying something like "you must supply this much data before we will accept any", while encouraging provision of more data.
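A rough sketch of the "minimal profile" behaviour described above, assuming records are simple property-to-value dictionaries. The required property names and the keep_extra flag are hypothetical and only illustrate the two options.

```python
# Hypothetical minimal profile: a record must supply at least these properties.
REQUIRED = {"dct:title", "ex:university"}


def meets_minimum(record):
    """True if the record supplies every required property."""
    return REQUIRED <= record.keys()


def ingest(records, keep_extra=True):
    """Accept records that meet the minimum.

    keep_extra=True keeps additional properties so they can be passed on;
    keep_extra=False keeps only what the profile names.
    """
    accepted = []
    for rec in records:
        if not meets_minimum(rec):
            continue  # below the minimum: rejected in both modes
        if keep_extra:
            accepted.append(dict(rec))
        else:
            accepted.append({k: v for k, v in rec.items() if k in REQUIRED})
    return accepted


# Made-up sample records.
records = [
    {"dct:title": "MSc Data", "ex:university": "uni:A", "ex:fee": "1000"},
    {"dct:title": "MSc Law"},  # missing ex:university
]
kept_all = ingest(records)
kept_profile_only = ingest(records, keep_extra=False)
```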

analice1pt · 2019-05-09T16:35:34Z

@philbarker, I think this might be a new use case. When I wrote this use case, I was not thinking about an already existing base profile. Instead, I was thinking that a base profile might emerge from the input data.

kcoyle · 2019-05-10T05:50:11Z

In SHACL and ShEx these two cases are handled with a statement that the graph being investigated (which could be the same as what we define as a profile) is either OPEN (allow properties that are not included in the validation document) or CLOSED (only allow properties that are included in the validation document). I have included the ShEx property sx:closed as a possible property for the profile in my original attempt at the vocabulary.
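As an illustration only, the open/closed distinction can be sketched in plain Python without a SHACL or ShEx engine. The function name, the allowed property set and the sample record are assumptions made for this sketch; it covers only the closure check, not the validation of the listed properties themselves.

```python
# Hypothetical set of properties named in the validation document.
ALLOWED = {"dct:title", "ex:university"}


def unexpected_properties(record, allowed, closed):
    """Return the properties that a CLOSED shape would reject.

    An OPEN shape tolerates properties it does not mention, so it reports
    nothing here.
    """
    if not closed:
        return []
    return sorted(p for p in record if p not in allowed)


# Made-up record with one property the validation document does not list.
record = {"dct:title": "MSc Data", "ex:university": "uni:A", "ex:fee": "1000"}
open_result = unexpected_properties(record, ALLOWED, closed=False)
closed_result = unexpected_properties(record, ALLOWED, closed=True)
```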

Would one of you add this as a use case? Thanks.

analice1pt · 2019-05-10T08:50:11Z

@kcoyle, I am not sure we are talking about the same thing. I think I should make this use case clearer. I am thinking about how to organize the data in the design phase. I mean, suppose that we want to develop an application that only uses data that others make available (e.g., Eurostat, aggregated data from European hospitals and aggregated transportation data). This data may have different properties, but also identical properties and equivalent properties. My question is: how do we, in the design phase, handle this potential diversity and overlap?

philbarker · 2019-05-10T09:47:54Z

@kcoyle #19 is a use case for what I had in mind

kcoyle · 2019-05-10T12:14:25Z

@analice1pt I wasn't thinking that profiles themselves would handle mapping. Again, perhaps a more detailed example of a single situation (e.g. with just a few elements) would help us think about this.

marianamalta · 2019-05-10T13:57:37Z

@analice1pt the point here is:
Someone else wants to aggregate LOD from providers that don't know each other, did not talk to each other, and each just decided to publish according to their own model. These data will then be published again as LOD by an entity that did not produce the data.
How is this different from defining an application profile "without" having structured data? The kinds of things (entities) might differ from provider to provider, as well as how those things are described (properties); and even if the models have similarities, they may use different vocabularies/terms to describe each domain/property.
The issue here has more to do with the process, or the tracking of things (ProviderA published propertyA using termA but now we publish with termB), than with the application profile itself...
I am sorry if I am not reaching the real issue...

analice1pt · 2019-05-10T15:00:46Z

@marianamalta, I guess that depends on what an application profile is and how it can be used. I like an AP to support the design process, not only to inform others about the data that I am making available (if any).

marianamalta · 2019-05-10T18:28:16Z

Right. But the process of developing an AP can be complex. How deep do you want to go? This is something (the tracking of the process) that might not have an ending!

analice1pt · 2019-05-17T10:52:44Z

I am giving an example of a simple situation that fits in this use case. A more complex situation would involve different ranges or different allowed values for properties.

analice1pt · 2019-05-17T10:58:50Z

Another simple example, since it involves only one property with different ranges in the MAP. I am assuming that we will be able to specify a range in the MAP that will be somehow related to the range in the base schema of the property.

kcoyle · 2019-05-22T08:29:36Z

I'm trying to develop requirements from this. Do either (or both) of these capture the sense?

• An application profile may contain mapping information for data from different sources
• An application profile may allow for more than one value type for a property
• There needs to be a way to define preferred properties in the case where more than one is available in the dataset
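One possible reading of these requirements, sketched as a small data structure. Everything here is hypothetical: the PropertyRule class, the ex:university name, and the convention of tagging each value with its kind ("iri" or "literal") are assumptions for illustration, not a proposal for the vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyRule:
    preferred: str       # property used when republishing
    alternatives: tuple  # equivalent source properties, in order of preference
    value_types: tuple   # accepted value kinds, e.g. ("iri", "literal")


def pick(record, rule):
    """Return (preferred property, value) from the first usable source property.

    Values are assumed to be (kind, value) pairs. Returns None when no
    candidate property is present with an accepted value kind.
    """
    for prop in (rule.preferred, *rule.alternatives):
        if prop in record:
            kind, value = record[prop]
            if kind in rule.value_types:
                return rule.preferred, value
    return None


# Hypothetical rule: prefer eg1:foo over eg3:baz, accept IRIs and literals.
rule = PropertyRule(
    preferred="ex:university",
    alternatives=("eg1:foo", "eg3:baz"),
    value_types=("iri", "literal"),
)
from_literal = pick({"eg3:baz": ("literal", "University C")}, rule)
from_both = pick(
    {"eg1:foo": ("iri", "uni:A"), "eg3:baz": ("literal", "University A")}, rule
)
```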

analice1pt · 2019-05-28T16:49:57Z

@kcoyle, I agree with those requirements. I would just like to point out that mapping between data sources may not capture all the meaning: the idea is mapping between the profiles of different data sources.

kcoyle · 2019-05-29T04:39:43Z

@analice1pt Mapping between profiles would require that the individual statements in each profile are identified as belonging to that profile. As a reminder, all properties / elements in a profile are pre-defined in vocabularies. So when a profile reuses dct:title it is identified with dct:title. How would you indicate that this is the use of dct:title in a particular profile?

analice1pt · 2019-05-29T10:06:11Z

@kcoyle, I am not sure I understood your comment. I meant that we should be able to map between the properties, types of values, and other constraints of the data sources, not between the data sources themselves. When I referred to APs, I meant both implicit and explicit APs. By implicit APs I mean APs that may be inferred from the data.

kcoyle added the use case label May 8, 2019
