Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ondisk property and BaseFacadeXGraphBuilder.previousTDB2Path #300

Closed
luigi-asprino opened this issue Aug 31, 2022 · 5 comments
Closed

ondisk property and BaseFacadeXGraphBuilder.previousTDB2Path #300

luigi-asprino opened this issue Aug 31, 2022 · 5 comments
Labels
Improvement Doing the same thing but better
Milestone

Comments

@luigi-asprino
Copy link
Member

I would like to discuss a few design choices related to ondisk feature.
This issue relates to the PR #185 issues #280 #295 #296 and #298 (at least).

First, why do we need BaseFacadeXGraphBuilder.previousTDB2Path field?
The BaseFacadeXGraphBuilder is instantiated once for each query execution.
Therefore, in the case that field is static the previous path might point to a TDB used for a previous query, and, we don't want that the execution of a query affects the processing of the following ones.
If not static, the previous path is always "".
Maybe we don't need it, or am I missing something here?

BaseFacadeXGraphBuilder.getDatasetGraph() is supposed to create the DatasetGraph where the triples will be stored. In the case of ondisk property set, the DatasetGraph has to be a TDB2, an in-memory implementation. otherwise.
We may discuss what to do in the case that the TDB already exists (that is, deleting the directory or not, or let the user decide - e.g. with ondisk.reuse), but I would keep the method very simple.

The other thing that I find strange is the ondisk property itself.
The current behaviour is:

  • if the path string passed as a parameter is a directory, then, create a temporary directory inside ondisk path for the TDB.
  • otherwise, create a temporary directory in \tmp
    Well, if we remove the previousTDB2Path field, there is no way to tell the engine to use an existing TDB located at a certain path.

I'd suggest the following.

  • ondisk becomes a true/false option. false is the default value. true, means tells the engine to create a TDB in a temporary directory in /tmp or (preferably) System.getProperty("java.io.tmpdir")
  • introduce ondisk.path to tell the engine the path of the TDB (this will be created in case it doesn't already exist).
  • introduce ondisk.reuse to tell the engine to wipe the directory before creating the new TDB. Default value = false

WDYT?

@justin2004
Copy link
Contributor

justin2004 commented Sep 1, 2022

First, why do we need BaseFacadeXGraphBuilder.previousTDB2Path field?

if the user specifies the ondisk.reuse option we need to know where that TDB2 (used in a previous query) lived on the disk so that we can reuse it. we aren't reusing the whole /tmp directory, we are reusing the specific TDB2 e.g. /tmp/3477575753 and we need to store that directory name somewhere.

i think your suggestions are good. 👍

@enridaga
Copy link
Member

enridaga commented Sep 1, 2022

First, why do we need BaseFacadeXGraphBuilder.previousTDB2Path field?

if the user specifies the ondisk.reuse option we need to know where that TDB2 (used in a previous query) lived on the disk so that we can reuse it. we aren't reusing the whole /tmp directory, we are reusing the specific TDB2 e.g. /tmp/3477575753 and we need to store that directory name somewhere.

i think your suggestions are good. 👍

I agree with removing the static fields and ask the users to point at the TDB2 folder to be reused. Consider also that static fields won't be a solution for the CLI. Also, I would avoid generating temporary folders and ask the user where they want to store the temporary database and if they want it to be deleted at the end of the process or not.

@luigi-asprino
Copy link
Member Author

Ok, so we all agree to remove the static fields. Maybe, we can also avoid ondisk.path and use ondisk for passing the path of the TDB explicitly.
And also avoid the generation of the temporary path for the TDB2 and leave the user to point at the TDB2 folder explicitly, so there is no need to store the previous TDB path.
Great! I proceed with the implementation of these changes.

@luigi-asprino
Copy link
Member Author

luigi-asprino commented Sep 4, 2022

With f1355c0 the creation of the DatasetGraph is as follows.
If the ondisk property is set, a TDB is created at the location passed as a value (in the case that ondisk.reuse is set as false, the location is deleted before creating the TBD).
By default, ondisk.reuse is true (I think that false would cause accidental deletion of folders).

@luigi-asprino
Copy link
Member Author

I close this issue as it is completed via f1355c0

@enridaga enridaga added the Improvement Doing the same thing but better label Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Improvement Doing the same thing but better
Projects
None yet
Development

No branches or pull requests

3 participants