Query engine does not correctly compare String literals from two graphs populated differently #1309

Open
E-Babkin opened this issue Aug 8, 2024 · 4 comments

E-Babkin commented Aug 8, 2024

Inspired by the following Stack Overflow discussion:
https://stackoverflow.com/questions/78768869/inside-virtuoso-named-graphs-can-rdf-string-literals-be-represented-in-different?noredirect=1#comment139010223_78768869

Requested details:

Version: 07.20.3240

A test case:

  1. Create a simple TTL file with one triple whose object is a string literal.

    @prefix m: <http://example.str.com> .
    m:ID1111 m:reflect "ROAM-1234".
    
  2. Load that TTL file into the graph urn:model1 using the Virtuoso web console.

  3. Load the same TTL file into the graph urn:model2 using the Java Jena library.

  4. Run the following SPARQL query:

    PREFIX m: <http://example.str.com>

    SELECT DISTINCT ?s ?s2 ?o
    WHERE {
        GRAPH <urn:model1>
        {
            ?s m:reflect ?o.
        }
        GRAPH <urn:model2>
        {
            ?s2 m:reflect ?o.
        }
    }
    

A non-empty result is expected; however, the query returns an empty result.

A small modification (adding a FILTER clause) gives the correct answer:

    PREFIX m: <http://example.str.com>

    SELECT DISTINCT ?s ?s2 ?o
    WHERE {
        GRAPH <urn:model1>
        {
            ?s m:reflect ?o.
        }
        GRAPH <urn:model2>
        {
            ?s2 m:reflect ?o2.
        }

        FILTER(str(?o) = str(?o2))
    }
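
One possible reading of the two results above is that the two loaders stored the same lexical value with different datatype tags. This thread does not confirm that. The sketch below only illustrates the idea in plain Java: a shared variable such as ?o can bind only when both graphs hold the same term, while str() reduces each literal to its lexical form before comparing. The Literal class and the sameTerm and sameStr helpers are hypothetical names made up for this illustration and are not part of Jena or Virtuoso. Note that under RDF 1.1 a simple literal and an xsd:string literal are the same term, so the sketch models a store-level difference, not the specification.

```java
import java.util.Objects;

public class LiteralComparisonSketch {

    public static final String XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";

    /** Hypothetical, simplified RDF literal: a lexical form plus an optional datatype IRI. */
    public static final class Literal {
        final String lexicalForm;
        final String datatypeIri; // null models "stored without a datatype tag"

        public Literal(String lexicalForm, String datatypeIri) {
            this.lexicalForm = lexicalForm;
            this.datatypeIri = datatypeIri;
        }
    }

    /** Models a join on a shared variable: lexical form and datatype tag must both match. */
    public static boolean sameTerm(Literal a, Literal b) {
        return a.lexicalForm.equals(b.lexicalForm)
            && Objects.equals(a.datatypeIri, b.datatypeIri);
    }

    /** Models FILTER(str(?o) = str(?o2)): only the lexical forms are compared. */
    public static boolean sameStr(Literal a, Literal b) {
        return a.lexicalForm.equals(b.lexicalForm);
    }

    public static void main(String[] args) {
        // Assumption: one loader kept the value untagged, the other tagged it as xsd:string.
        Literal fromConsole = new Literal("ROAM-1234", null);
        Literal fromJena = new Literal("ROAM-1234", XSD_STRING);

        System.out.println("same term:  " + sameTerm(fromConsole, fromJena));
        System.out.println("same str(): " + sameStr(fromConsole, fromJena));
    }
}
```

Under these assumptions the first comparison is false and the second is true, which matches the empty result of the first query and the non-empty result of the FILTER variant.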

HughWilliams (Collaborator) commented Aug 8, 2024

What is the Virtuoso version in use?

Do you have a test case for recreating the issue?

E-Babkin (Author) commented Aug 9, 2024

Details were added to the main text.

HughWilliams (Collaborator) commented Aug 9, 2024

How exactly are you loading the data with Jena? Can you provide a runnable program that can be used for loading the data, with the specific method being used?

E-Babkin (Author) commented Aug 9, 2024

Here is a fragment of the actual Java code from the Spring app.

import java.io.InputStream;

import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.commons.lang3.RandomStringUtils;

import org.apache.jena.graph.Node;
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.graph.Triple;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFLanguages;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.ErrorHandlerFactory;
import org.apache.jena.riot.system.StreamRDF;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;

import virtuoso.jena.driver.VirtDataset;
import virtuoso.jena.driver.VirtGraph;
import virtuoso.jena.driver.VirtIsolationLevel;
import virtuoso.jena.driver.VirtModel;
import virtuoso.jena.driver.VirtStreamRDF;
import virtuoso.jena.driver.VirtuosoQueryExecution;
import virtuoso.jena.driver.VirtuosoQueryExecutionFactory;
import virtuoso.jena.driver.VirtuosoUpdateFactory;
import virtuoso.jena.driver.VirtuosoUpdateRequest;


@Service
public class VirtuosoDbDataServiceImpl implements DbDataService {

  private static final Lang FILE_TYPE = RDFLanguages.TTL;
  private static final int BATCH_SIZE = 5000;
  private static final boolean IS_USE_AUTO_COMMIT = false;
  private static final VirtIsolationLevel ISOLATION_LEVEL = VirtIsolationLevel.REPEATABLE_READ;
  private static final int CONCURRENCY = VirtGraph.CONCUR_DEFAULT;


  @Value("${app.host}")
  private String dbHost;

  @Value("${app.user}")
  private String user;

  @Value("${app.password}")
  private String password;
  
  
  public void insertDataFromFile(MultipartFile file, String graphName, Boolean isClearGraph)
      throws Exception {
    // Connect to Virtuoso; the connection is closed in the finally block below.
    VirtDataset virtDataset = new VirtDataset(dbHost, user, password);
    try (InputStream inputStream = file.getInputStream()) {
      virtDataset.setIsolationLevel(ISOLATION_LEVEL);
      VirtModel virtModel = (VirtModel) virtDataset.getNamedModel(graphName);

      // Optionally empty the target graph before loading.
      if (isClearGraph) {
        virtModel.removeAll();
      }

      virtModel.setConcurrencyMode(CONCURRENCY);
      // Stream the parsed triples into the graph in batches of BATCH_SIZE.
      StreamRDF writer = virtModel.getStreamRDF(IS_USE_AUTO_COMMIT, BATCH_SIZE,
          new VirtStreamRDF.DeadLockHandler(0));
      RDFParser parser = RDFParser.create().source(inputStream).lang(FILE_TYPE)
          .errorHandler(ErrorHandlerFactory.errorHandlerWarn).build();

      parser.parse(writer);
    } catch (Exception e) {
      // Keep the original exception as the cause so the stack trace is not lost.
      throw new Exception(e.getMessage(), e);
    } finally {
      virtDataset.close();
    }
  }
  
...
}
