Important

No symlinks may be added to this repo, for two reasons: 1) this repo can be mirrored to storage platforms that do not support symlinks; 2) Python's zipfile library does not support symlinks.
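One way to check for stray symlinks before committing is sketched below. This is a minimal example and not part of the repo's tooling; the function name `find_symlinks` is hypothetical.

```python
import os
from pathlib import Path


def find_symlinks(root):
    """Return a sorted list of paths under root that are symbolic links."""
    found = []
    # os.walk does not follow symlinked directories by default, so a linked
    # directory is reported here but never descended into.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                found.append(path)
    return sorted(found)
```

Calling `find_symlinks(".")` from the top of a checkout should return an empty list if the rule above is being followed.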

Organization


rdg_datasets
  - <name_of_some_rdg>/
    - [conditional] import.py
    - [conditional] generate.py
    - [conditional] migrate.py
    - [conditional] README.md
    - storage_format_version_N/<rdg_contents>
    - storage_format_version_N+1/<rdg_contents>
    - storage_format_version_N+2/<rdg_contents>
    
csv_datasets
 - <name_of_some_csv_dataset>/
   - <csv_dataset_contents>
  
misc_datasets
  - <name_of_some_misc_dataset>
    - README.md describing what this misc dataset is and what it can be used for

If an RDG can be imported from CSV, its directory should, where possible, have the same name as the corresponding CSV dataset's directory.

misc_datasets should be used only when the dataset does not fall into one of the other categories

The following scripts are conditionally present for each RDG. One of the following options must be present.

import.py

Script to import this RDG from CSV.

When to use: This is the preferred option. If the RDG can be imported from CSV, do this.

generate.py

Script to generate this RDG in a special way.

When to use: To be used if this RDG is not importable from CSV. If generation is difficult to automate, consider using migrate.

migrate.py

Script to migrate this RDG from its current storage_format_version to the latest.

When to use: Some special RDGs are not easy to generate in an automatable fashion, but can be easily migrated to the latest storage_format_version by loading and storing them.

README.md

Describes how to generate the RDG manually if it is not feasible to generate or migrate it. If this RDG cannot be created by any of the above scripts, the steps to create it must be described in detail here. If this is a special RDG, created to cover a specific test case, describe how it is special and the test case(s) here.
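The rule that each RDG directory needs at least one of these files can be checked mechanically. The following is a rough sketch only: the helper names are hypothetical, and it assumes every subdirectory of rdg_datasets is an RDG directory.

```python
from pathlib import Path

# File names taken from the descriptions above; at least one must exist per RDG.
CONDITIONAL_FILES = ("import.py", "generate.py", "migrate.py", "README.md")


def present_options(rdg_dir):
    """Return which of the conditional files exist directly inside rdg_dir."""
    rdg_dir = Path(rdg_dir)
    return [name for name in CONDITIONAL_FILES if (rdg_dir / name).is_file()]


def missing_options(rdg_datasets_dir):
    """Return names of RDG directories that contain none of the conditional files."""
    root = Path(rdg_datasets_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and not present_options(child)
    )
```

An empty result from `missing_options("rdg_datasets")` would mean every RDG directory has at least one way to be recreated.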

Requirements of the conditional scripts

the "main" function must be called uprev to be found by the global uprev script
the uprev function must return the path to where the new rdg can be found
the scripts must keep the organization outlined above
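To make these requirements concrete, here is a bare skeleton of what such a script might look like. It is a sketch under assumptions: the parameters of `uprev` and the `RDG_NAME` variable are invented for illustration, so check an existing script in the repo for the arguments the global uprev script really passes.

```python
from pathlib import Path

# Hypothetical value; a real script would set this to match its own RDG.
RDG_NAME = "example_rdg"


def uprev(rdg_datasets_dir, storage_format_version):
    """Produce the RDG at the requested version and return the path to it.

    Both parameters are assumptions made for this sketch.
    """
    out_dir = (
        Path(rdg_datasets_dir)
        / RDG_NAME
        / f"storage_format_version_{storage_format_version}"
    )
    out_dir.mkdir(parents=True, exist_ok=True)
    # The real import, generate, or migrate work would write the RDG
    # contents directly into out_dir at this point.
    return out_dir
```

The function is named `uprev`, returns the location of the new RDG, and writes into the storage_format_version_N directory layout described in the organization section.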

How to uprev the rdgs in this repo

ensure the most recent master commit of this repo is checked out: git checkout master; git pull
run ./uprev build_tools --build_dir=<katana_build_dir>
run ./uprev rdgs --help to see the required args

ex: ./uprev rdgs --storage_format_version <N> --build_dir <katana_build_dir>

ensure ./uprev validate passes for all rdgs
make a new commit with the message upreved rdgs to storage_format_version_M
create a katana repo PR to bump up the version of this submodule

How to add an rdg to this repo

Required information:

name of your rdg
storage_format_version of your rdg
- you can see which storage_format_version your rdg uses by running grep -rni "storage_format_version" * in the directory containing your rdg
- if there are no matches, your rdg is storage_format_version_1
wherever you see <rdg-name> replace it with the name of your rdg
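The grep lookup above can also be done from Python, for example inside a helper script. This is a sketch only: it assumes the version is stored as the key followed by `:` or `=` and a number, and it returns the highest value found, which may not be the right choice for an RDG with several manifests. The function name is hypothetical.

```python
import re
from pathlib import Path

# Assumption: the key is followed by ':' or '=' and then the version number.
_VERSION_RE = re.compile(r'storage_format_version"?\s*[:=]\s*(\d+)', re.IGNORECASE)


def detect_storage_format_version(rdg_dir):
    """Return the highest storage_format_version found under rdg_dir, or 1 if none."""
    versions = []
    for path in Path(rdg_dir).rglob("*"):
        if not path.is_file():
            continue
        # errors="ignore" lets binary files be scanned without raising.
        text = path.read_text(errors="ignore")
        versions.extend(int(number) for number in _VERSION_RE.findall(text))
    # No match at all means the rdg is storage_format_version_1, as noted above.
    return max(versions, default=1)
```

Because this reads every file in full, it can be slow on large RDGs; the grep command remains the quicker manual check.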

take a look at the organization section above, specifically the rdg_datasets section
create a directory in rdg_datasets called <rdg-name>
create a storage_format_version_# directory in your <rdg-name> directory

ensure the # in storage_format_version_# matches the version in the rdg

put the rdg contents in the storage_format_version_# directory

it is important that the contents of the rdg are directly in the storage_format_version_# directory, and not nested inside another directory

create a README.md in your <rdg-name> directory for your rdg with general notes about what this rdg tests, and how it was created.
copy one of [migrate.py, import.py, generate.py] from another rdg to your <rdg-name> directory

take a look at the definitions of each of these scripts above to see which is appropriate
modify the script's variables to match your rdg

Now your rdg can be easily upreved to the latest storage_format_version
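Before committing a new rdg, the layout from the steps above can be sanity-checked with something like the following. This is an illustrative sketch, not the repo's validate command; the function name is hypothetical and the nesting check is only a heuristic.

```python
import re
from pathlib import Path

_VERSION_DIR_RE = re.compile(r"storage_format_version_(\d+)")
_SCRIPTS = ("migrate.py", "import.py", "generate.py")


def check_rdg_layout(rdg_dir):
    """Return a list of human-readable problems found in one rdg directory."""
    rdg_dir = Path(rdg_dir)
    problems = []
    version_dirs = sorted(
        d for d in rdg_dir.iterdir()
        if d.is_dir() and _VERSION_DIR_RE.fullmatch(d.name)
    )
    if not version_dirs:
        problems.append("no storage_format_version_# directory")
    for vdir in version_dirs:
        # Heuristic: no files directly inside suggests empty or nested contents.
        if not any(entry.is_file() for entry in vdir.iterdir()):
            problems.append(f"{vdir.name}: no files directly inside")
    if not (rdg_dir / "README.md").is_file():
        problems.append("missing README.md")
    if not any((rdg_dir / name).is_file() for name in _SCRIPTS):
        problems.append("missing migrate.py, import.py, or generate.py")
    return problems
```

An empty list means the directory matches the steps listed in this section; it says nothing about whether the rdg contents themselves are valid.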

KatanaGraph/test-datasets
