Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Group Triples by Common Subject #99

Merged
merged 4 commits into from
Aug 14, 2020

Conversation

tmciver
Copy link
Contributor

@tmciver tmciver commented Aug 14, 2020

This work addresses the bug in #98.

This test reveals a bug where if triples with common subjects are separated by
triples with different subjects, then those triples with common subjects are not
grouped together as expected.
@robstewart57
Copy link
Owner

@tmciver Thanks for this bug report and PR, and your efforts in improving the rdf4h library.

Looks good, I'll merge this.

It'd be good if, in addition to the "triples with the same subject should be grouped" test case (and other unit tests), we can also use property based tests using QuickCheck for far greater coverage of possible inputs.

E.g. for the bug that you report, the specification might be:

In the Turtle serialisation of a graph, every subject node appears in a subject position exactly once.

or

In the Turtle serialisation of a graph, a node can appear in a subject position at most once.

This can test for many more possible interleavings of triples with common subjects, in addition to:

<:something> <dc:title> \"Some title\" .
<:another> <rdf:type> <schema:Document> .
<:something> <rdf:type> <schema:Document> .

We probably can't use the Turtle parser to test this, since it desugars < subject , [ (predicate , object) ] > patterns into < subject , predicate , object > triples.

Hhmm..

@robstewart57
Copy link
Owner

@tmciver we should probably also have QuickCheck tests to validate properties that we believe should hold when parsers and serialisers are composed. E.g.

forall (g::RDF a) . triplesOf g
                    ==
                    triplesOf (parseString (TurtleParser Nothing Nothing)
                                   (writeRdfString (TurtleSerializer Nothing Nothing) g))

There's probably a few such properties we assume hold that might not. E.g. on the number of triples, the order of triples, ensuring namespaces are/aren't lost, etc.

@tmciver
Copy link
Contributor Author

tmciver commented Aug 16, 2020

@robstewart57 I love the idea of using Quickcheck! That's something I can work on but I'd like to tackle #97 first since the produced Turtle is not valid until that one is resolved.

@tmciver tmciver deleted the group-by-subject branch August 23, 2020 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants