In the triangles task we generate relatively complex graphs that contain either exactly one or exactly zero cycles of length 3 (triangles).
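To make the triangle property concrete, here is a minimal sketch of counting cycles of length 3 in an undirected graph. The function name `count_triangles` and the edge-list input are assumptions made for illustration; this is not the generator code from the scripts folder.

```python
from itertools import combinations


def count_triangles(edges):
    """Count cycles of length 3 in an undirected simple graph.

    `edges` is an iterable of (u, v) pairs (hypothetical input format).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot be part of a triangle
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    total = 0
    for neighbours in adj.values():
        # Every pair of neighbours that is itself connected closes a triangle.
        for a, b in combinations(neighbours, 2):
            if b in adj[a]:
                total += 1
    # Each triangle was found once from each of its three corners.
    return total // 3
```

A graph from this task should yield a count of either 0 or 1, for example `count_triangles([(0, 1), (1, 2), (2, 0), (2, 3)])` for a single triangle with one extra edge attached.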
In the clique distance task we generate a BA (Barabási-Albert) graph with two cliques attached. We assign class 1 to graphs where the distance between the cliques is larger than a threshold, and class 0 otherwise.
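The labeling rule can be sketched as follows. This is only an illustration under stated assumptions: the distance between cliques is taken to be the shortest-path length (in edges) between any node of one clique and any node of the other, and the names `set_distance`, `clique_distance_label`, and `threshold` are hypothetical. The exact distance definition and threshold value used to build the dataset are not stated here.

```python
from collections import deque


def set_distance(adj, sources, targets):
    """Shortest-path length from any node in `sources` to any node in `targets`.

    `adj` maps each node to an iterable of its neighbours.
    Returns None if no target is reachable.
    """
    targets = set(targets)
    dist = {s: 0 for s in sources}
    queue = deque(dist)  # multi-source breadth-first search
    while queue:
        node = queue.popleft()
        if node in targets:
            return dist[node]
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return None


def clique_distance_label(adj, clique_a, clique_b, threshold):
    """Return 1 if the cliques are farther apart than `threshold`, else 0."""
    d = set_distance(adj, clique_a, clique_b)
    if d is None:
        raise ValueError("cliques are not connected")
    return 1 if d > threshold else 0
```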
- scripts folder contains code for creating the datasets and converting them to other formats. See the readme inside the folder for more information.
- triangles folder contains the triangles dataset: 10k training, 1k test, and 1k validation graphs.
- clique folder contains the clique distance dataset, with the same data distribution as the triangles dataset.
- images folder contains visualizations of the graphs (non-filtered).
Both datasets are stored in our internal format, which consists of two files: sp (adjacency matrices) and cl (graph classification labels). We provide a pair of these files for each split. The scripts folder also contains a converter to the TU dataset format.
The SP file contains the adjacency matrices for each data set. It is a tab-separated file in which each line corresponds to an edge in a graph. The SP file has the following columns:
- Graph ID (arbitrary string)
- Start Node ID
- End Node ID
- Edge weight (always 1)
Graphs corresponding to each ID must be stored successively without interleaving with other graphs, but the order of edges can be arbitrary.
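The SP layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not the loader from the scripts folder: the function name `read_sp` is hypothetical, node IDs are kept as strings because their type is not specified, and the file is assumed to have no header row. Because the edges of a graph are stored contiguously, consecutive rows can be grouped by graph ID.

```python
import csv
from itertools import groupby


def read_sp(lines):
    """Yield (graph_id, edges) pairs from the lines of an SP file.

    Each edge is a (start_node, end_node, weight) tuple.
    """
    rows = (row for row in csv.reader(lines, delimiter="\t") if row)
    seen = set()
    # groupby only merges consecutive rows, which matches the requirement
    # that the edges of one graph are not interleaved with other graphs.
    for graph_id, group in groupby(rows, key=lambda row: row[0]):
        if graph_id in seen:
            raise ValueError(f"graph {graph_id!r} is interleaved with other graphs")
        seen.add(graph_id)
        yield graph_id, [(r[1], r[2], float(r[3])) for r in group]
```

With a real file, pass an open file handle, for example `with open(path, newline="") as f: graphs = list(read_sp(f))`, where `path` points to one of the sp files.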
The CL file contains the graph classification labels. It is also a tab-separated file with two columns:
- Graph ID
- Graph Label
The order of entries must be the same as in the SP file.
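As an illustration of the ordering requirement, the sketch below reads the label file and checks that its graph IDs appear in the same order as in the SP file. The function names `read_cl` and `labels_match_sp_order` are hypothetical, labels are kept as strings, and both files are assumed to have no header row.

```python
import csv


def read_cl(lines):
    """Return a list of (graph_id, label) pairs from the lines of a CL file."""
    return [(row[0], row[1]) for row in csv.reader(lines, delimiter="\t") if row]


def labels_match_sp_order(sp_lines, cl_lines):
    """Check that graph IDs appear in the same order in both files."""
    sp_ids = []
    for row in csv.reader(sp_lines, delimiter="\t"):
        # Record each graph ID once, at the first row of its edge block.
        if row and (not sp_ids or sp_ids[-1] != row[0]):
            sp_ids.append(row[0])
    cl_ids = [graph_id for graph_id, _ in read_cl(cl_lines)]
    return sp_ids == cl_ids
```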
All data is licensed under CC0. All code is licensed under the MIT license. (Or change it to whatever you want)