Implementation of the LDBC Social Network Benchmark's Interactive workload in Microsoft SQL Server using its SQL Graph feature.
The recommended environment is that the benchmark scripts (Bash) and the LDBC driver (Java 11) run on the host machine, while the SQL Server database runs in a Docker container. Therefore, the requirements are as follows:
- Bash
- Java 11
- Docker 19+
- enough free space in the `${MSSQL_DATA_DIR}` directory
To build the project, use the `scripts/build.sh` script. This implementation has been tested with the SQL Server 2019 Docker container.
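For example, a quick sanity check of the prerequisites followed by the build could look like this (run from the repository root):

```bash
# Check the prerequisites (Java 11 and Docker 19+ are expected).
java -version
docker --version

# Build the project.
scripts/build.sh
```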
Before running the SQL Server implementation, the following files must be edited:

- `.env`: file containing the environment variables used in the `docker-compose.yml`
- `docker-compose.yml`: edit if the data must be stored persistently
- `driver/benchmark.properties`: properties file used in execute benchmark mode
- `driver/validate.properties`: properties file used in validation mode
- `driver/create-validation-parameters.properties`: properties file used in create validation parameters mode
- Change the `MSSQL_CSV_DIR` variable in the `.env` file to the path where the dataset is located, e.g.:

  ```
  MSSQL_CSV_DIR=`pwd`/social-network-sf1-bi-composite-merged-fk/
  ```

  By default, the dataset is loaded again and existing databases & tables are dropped when the Docker container is restarted. To prevent reloading, set the `MSSQL_RECREATE` variable to `False`, e.g.:

  ```
  MSSQL_RECREATE=False
  ```
- To persist the data by storing the database outside a Docker volume, uncomment the following lines in the `docker-compose.yml` file:

  ```yaml
  - type: bind
    source: ${MSSQL_DATA_DIR}
    target: /var/opt/mssql/data
  - type: bind
    source: ${MSSQL_DATA_LOGS}
    target: /var/opt/mssql/log
  - type: bind
    source: ${MSSQL_DATA_SECRETS}
    target: /var/opt/mssql/secrets
  ```

  The environment variables used in the `source` entries need to be set in the `.env` file to the folders where the data, logs and secrets should be stored, e.g.:

  ```
  MSSQL_DATA_DIR=./scratch/data/
  MSSQL_DATA_LOGS=./scratch/logs/
  MSSQL_DATA_SECRETS=./scratch/secrets
  ```
- To execute the benchmark, change the following properties in `driver/benchmark.properties` (an example sketch is given after this list):
  - `thread_count`: number of threads to use
  - `ldbc.snb.interactive.parameters_dir`: path to the folder with the substitution parameters
  - `ldbc.snb.interactive.updates_dir`: path to the folder with the update streams. Make sure the update streams correspond to the `thread_count`.
  - `ldbc.snb.interactive.scale_factor`: the scale factor to use (must be the same as the substitution parameters and update streams)
- To validate the benchmark, change the following properties in `driver/validate.properties`:
  - `validate_database`: path to the validation parameters CSV file to use
  - `ldbc.snb.interactive.parameters_dir`: path to the folder with the substitution parameters
- To create validation parameters, change the following properties in `driver/create-validation-parameters.properties`:
  - `ldbc.snb.interactive.parameters_dir`: path to the folder with the substitution parameters
  - `ldbc.snb.interactive.updates_dir`: path to the folder with the update streams. Make sure the update streams correspond to the `thread_count`.
  - `ldbc.snb.interactive.scale_factor`: the scale factor to use (must be the same as the substitution parameters and update streams)
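For illustration, the relevant lines of `driver/benchmark.properties` for a hypothetical single-threaded scale factor 1 run could look like the sketch below; the paths are placeholders and must point to the substitution parameters and update streams generated for your dataset:

```
# Example values only: adjust the paths and thread_count to your setup.
thread_count=1
ldbc.snb.interactive.parameters_dir=/path/to/substitution_parameters
ldbc.snb.interactive.updates_dir=/path/to/update_streams
ldbc.snb.interactive.scale_factor=1
```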
This SQL Server implementation uses the `composite-merged-fk` CSV layout, with headers and without quoted fields. To generate data that conforms to this requirement, run Datagen without any layout or formatting arguments (`--explode-*` or `--format-options`).
In Datagen's directory (`ldbc_snb_datagen_spark`), issue the following commands. We assume that the Datagen project is built and `sbt` is available.
```bash
export SF=desired_scale_factor
export LDBC_SNB_DATAGEN_MAX_MEM=available_memory
export LDBC_SNB_DATAGEN_JAR=$(sbt -batch -error 'print assembly / assemblyOutputPath')

rm -rf out-sf${SF}/graphs/parquet/raw
tools/run.py \
    --cores $(nproc) \
    --memory ${LDBC_SNB_DATAGEN_MAX_MEM} \
    -- \
    --format csv \
    --scale-factor ${SF} \
    --mode bi \
    --output-dir out-sf${SF}
```
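For example, for a scale factor 1 dataset the two placeholders above could be set as follows (the memory value is an arbitrary example and depends on your machine):

```bash
export SF=1
export LDBC_SNB_DATAGEN_MAX_MEM=8g   # adjust to the memory you can spare
```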
The dataset is loaded automatically using the db-loader container. To start the SQL Server container and load the data, `docker-compose` is used (see the sketch below):

- `docker-compose build` to build the db-loader container
- `docker-compose up` to start the SQL Server container and the db-loader container
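A minimal loading session could look like the following sketch; running in detached mode and the `db-loader` service name are assumptions, so adjust them to your `docker-compose.yml`:

```bash
# Build the loader image, then start SQL Server and the loader in the background.
docker-compose build
docker-compose up -d

# Follow the loader's output to see when the dataset has finished loading
# (assumes the loader service is named db-loader).
docker-compose logs -f db-loader
```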
- To run the benchmark, run the following command: `driver/benchmark.sh`
- To validate the results, run the following command: `driver/validate.sh`
- To create validation parameters, run the following command: `driver/create-validation-parameters.sh`
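Putting these steps together, one possible order for a full run is sketched below (all scripts assume the configuration described earlier):

```bash
# Create the validation parameters.
driver/create-validation-parameters.sh

# Validate the implementation against the validation parameters.
driver/validate.sh

# Run the benchmark itself.
driver/benchmark.sh
```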