Umbra implementation of the LDBC Social Network Benchmark's BI workload.
The Umbra container is currently available upon request.
The Umbra implementation expects the data to be in the `composite-merged-fk` CSV layout, with headers and without quoted fields. To generate data that conforms to this requirement, run Datagen without any layout or formatting arguments (`--explode-*` or `--format-options`).
In Datagen's directory (`ldbc_snb_datagen_spark`), issue the following commands. We assume that the Datagen project is built and that the `${PLATFORM_VERSION}` and `${DATAGEN_VERSION}` environment variables are set correctly.
```bash
export SF=desired_scale_factor
export LDBC_SNB_DATAGEN_MAX_MEM=available_memory
export LDBC_SNB_DATAGEN_JAR=$(sbt -batch -error 'print assembly / assemblyOutputPath')
rm -rf out-sf${SF}/
tools/run.py \
    --cores $(nproc) \
    --memory ${LDBC_SNB_DATAGEN_MAX_MEM} \
    -- \
    --format csv \
    --scale-factor ${SF} \
    --mode bi \
    --output-dir out-sf${SF}
```
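The layout requirement above (headers, unquoted fields) can be sanity-checked before loading. The sketch below is illustrative and not part of the benchmark scripts; it assumes pipe-delimited CSVs and uses a mock file so that it is self-contained — in practice you would point it at a real Datagen output directory.

```bash
# Illustrative sanity check for the expected CSV layout (header line present,
# no quoted fields). DIR and the file contents are mock stand-ins for a real
# Datagen output directory.
DIR=$(mktemp -d)
printf 'creationDate|id|firstName\n2010-01-03|933|Mahinda\n' > "$DIR/part-0.csv"

for f in "$DIR"/*.csv; do
  # the first line should be a pipe-delimited header
  head -n 1 "$f" | grep -q '|' || echo "missing header delimiter: $f"
  # quoted fields would violate the expected layout
  if grep -q '"' "$f"; then echo "quoted fields found: $f"; fi
done
echo "layout check finished"
```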
Note that unlike Postgres, Umbra does not support directly loading from compressed (`.csv.gz`) files.
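If you do end up with compressed files, the decompression step can be sketched as follows (the repository ships `scripts/decompress-data-set.sh` for this purpose; the mock directory below is only for illustration, and in practice you would operate on the directory `${UMBRA_CSV_DIR}` points to). Note that `gunzip` deletes the original `.csv.gz` archives.

```bash
# Sketch of decompressing a data set in place. DATA_DIR is a mock stand-in
# for the real data directory.
DATA_DIR=$(mktemp -d)
printf 'id|name\n1|Ada\n' > "$DATA_DIR/part-0.csv"
gzip "$DATA_DIR/part-0.csv"   # now only part-0.csv.gz exists

# gunzip restores part-0.csv and removes the .gz archive
find "$DATA_DIR" -name '*.csv.gz' -exec gunzip {} \;
ls "$DATA_DIR"
```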
- Set the `${UMBRA_CSV_DIR}` environment variable to point to the data set.

  - To use a locally generated data set, set the `${LDBC_SNB_DATAGEN_DIR}` and `${SF}` environment variables and run:

    ```bash
    export UMBRA_CSV_DIR=${LDBC_SNB_DATAGEN_DIR}/out-sf${SF}/graphs/csv/bi/composite-merged-fk/
    ```

    Or, simply run:

    ```bash
    . scripts/use-datagen-data-set.sh
    ```
  - To download and use the sample data set, run:

    ```bash
    wget -q https://ldbcouncil.org/ldbc_snb_datagen_spark/social-network-sf0.003-bi-composite-merged-fk.zip
    unzip -q social-network-sf0.003-bi-composite-merged-fk.zip
    export UMBRA_CSV_DIR=`pwd`/social-network-sf0.003-bi-composite-merged-fk/graphs/csv/bi/composite-merged-fk/
    ```

    Or, simply run:

    ```bash
    scripts/get-sample-data-set.sh
    . scripts/use-sample-data-set.sh
    ```
- The data set should consist of uncompressed CSVs. If you retrieved a compressed data set (`.csv.gz` files), set the `${UMBRA_CSV_DIR}` environment variable and uncompress the files (note that doing so deletes the original compressed files):

  ```bash
  scripts/decompress-data-set.sh
  ```
- To start the DBMS, create a database, and load the data, run:

  ```bash
  scripts/load-in-one-step.sh
  ```
- The substitution parameters should be generated using `paramgen`.
To run the queries, issue:

```bash
scripts/queries.sh
```

For a test run, use:

```bash
scripts/queries.sh ${SF} --test
```
To run the queries and the batches alternately, as specified by the benchmark, run:

```bash
scripts/benchmark.sh
```

To connect to the database through the SQL console, use:

```bash
scripts/connect.sh
```
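For longer benchmark sessions, it can be useful to capture a run's output in a timestamped log file. The sketch below shows this generic pattern; the directory and message are illustrative stand-ins, and in practice you would pipe e.g. `scripts/benchmark.sh` into `tee`.

```bash
# Generic output-capture pattern (illustrative; not part of the benchmark scripts).
LOG_DIR=$(mktemp -d)                                # stand-in for a real log directory
LOG_FILE="$LOG_DIR/run-$(date +%Y%m%d-%H%M%S).log"

# In practice: scripts/benchmark.sh | tee "$LOG_FILE"
echo "benchmark output would stream here" | tee "$LOG_FILE"
```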