
Rel2graph

Rel2graph is a library that simplifies the conversion of data in relational format to a graph knowledge database. It relieves you of the cumbersome manual work of writing the conversion code and lets you focus on the conversion schema and data processing.

The library is built specifically for converting data into a neo4j graph and supports extensive customization to clean and remodel data. It uses the py2neo library as its neo4j Python client.

Note: The py2neo library does not support parallel relations of the same type (same source, same target and same type). If your graph requires such parallel relations, please check out the provided py2neo extensions.

Installation

If you have set up a private SSH key for your GitHub account, copy-paste the command below to install the latest version (v0.6.1):

$ pip install git+ssh://git@github.com/sg-dev/rel2graph@v0.6.1

If you don't have ssh set up, download the latest wheel here and install the wheel with:

$ pip install **path-to-wheel**

If you have cloned the repository, you can also build it locally with:

$ pip install **path-to-repository**

The rel2graph library supports Python 3.7+.

Quick Start

The following is a quick example of converting data in a Pandas dataframe into a graph. The full example code can be found under examples. For more details, please check out the full documentation. We first define a conversion schema in a YAML-style config file. In this config file we specify which entities are converted into which nodes and which relations.

`schema.yaml`

ENTITY("Flower"):
    NODE("Flower") flower:
        - sepal_length = Flower.sepal_length
        - sepal_width = Flower.sepal_width
        - petal_length = Flower.petal_length
        - petal_width = append(Flower.petal_width, " milimeters")
    NODE("Species", "BioEntity") species:
        + Name = Flower.species
    RELATION(flower, "is", species):
    
ENTITY("Person"):
    NODE("Person") person:
        + ID = Person.ID
        - FirstName = Person.FirstName
        - LastName = Person.LastName
    RELATION(person, "likes", MATCH("Species", Name=Person.FavoriteFlower)):
        - Since = "4ever"

The library itself has two basic elements that are required for the conversion: the Converter, which handles the conversion itself, and an Iterator, which iterates over the relational data. The iterator can be implemented for arbitrary data in relational format. Rel2graph currently ships with preimplemented iterators under:

rel2graph.relational_modules.odata for OData databases (based on pyodata)
rel2graph.relational_modules.pandas for Pandas dataframes
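To make the role of an iterator more concrete, here is a minimal conceptual sketch using only the standard library. It is not rel2graph's API: `Record` and `table_records` are hypothetical names introduced for illustration. It only shows the idea that every row is handed on together with a type name, and that several tables can be fed one after another as a single stream.

```python
from dataclasses import dataclass
from itertools import chain
from typing import Any, Dict, Iterable, Iterator, List


@dataclass
class Record:
    """Hypothetical stand-in for one relational entity (not a rel2graph class)."""
    type: str               # entity type, e.g. the table name
    values: Dict[str, Any]  # column name -> value


def table_records(rows: Iterable[Dict[str, Any]], type_name: str) -> Iterator[Record]:
    """Yield one typed Record per row of a table."""
    for row in rows:
        yield Record(type_name, dict(row))


# Two tiny in-memory "tables" with made-up rows.
flowers = [{"species": "setosa", "petal_width": 0.2}]
people = [{"ID": 1, "FavoriteFlower": "setosa"}]

# Chain several tables into a single stream of typed records.
records: List[Record] = list(chain(
    table_records(flowers, "Flower"),
    table_records(people, "Person"),
))

for record in records:
    print(record.type, record.values)
```

The type name is what allows a schema entry such as ENTITY("Flower") to be matched against the right rows.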

We will use the PandasDataframeIterator from rel2graph.relational_modules.pandas, together with the IteratorIterator, which wraps multiple iterators so that several dataframes can be handled at once. Since a pandas dataframe has no type or table name associated with it, we need to specify the name when creating a PandasDataframeIterator. We also define a custom function append that can be referred to in the schema file and that appends a string to the attribute value. For an entity with Flower["petal_width"] = 5, the output node will have the attribute petal_width = "5 milimeters".
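The effect of append can be checked in isolation, without a database or the library installed. The sketch below uses a hypothetical stand-in for Attribute (a plain namedtuple with a key and a value), so it only mirrors the logic and does not exercise rel2graph itself. It assumes the value may be numeric and converts it to a string before concatenating.

```python
from collections import namedtuple

# Hypothetical stand-in for rel2graph's Attribute, reduced to a key and a value.
Attribute = namedtuple("Attribute", ["key", "value"])


def append(attribute, append_string):
    """Return a new attribute whose value has append_string added at the end."""
    # str() lets the function handle numeric values such as 5 as well as strings.
    return Attribute(attribute.key, str(attribute.value) + append_string)


result = append(Attribute("petal_width", 5), " milimeters")
print(result.value)  # 5 milimeters
```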

from py2neo import Graph
import pandas as pd 
from rel2graph.relational_modules.pandas import PandasDataframeIterator 
from rel2graph import IteratorIterator, Converter, Attribute, register_attribute_postprocessor

# Create a connection to the neo4j graph with the py2neo Graph object
graph = Graph(scheme="http", host="localhost", port=7474,  auth=('neo4j', 'password')) 

people = ... # a dataframe with peoples data (ID, FirstName, LastName, FavoriteFlower)
people_iterator = PandasDataframeIterator(people, "Person")
iris = ... # a dataframe with the iris dataset
iris_iterator = PandasDataframeIterator(iris, "Flower")

# register a custom data processing function
@register_attribute_postprocessor
def append(attribute, append_string):
    # Convert the value to a string first so numeric values (e.g. 5) can be extended
    new_attribute = Attribute(attribute.key, str(attribute.value) + append_string)
    return new_attribute

# Create IteratorIterator
iterator = IteratorIterator([people_iterator, iris_iterator])

# Create converter instance with schema, the final iterator and the graph
converter = Converter("schema.yaml", iterator, graph)
# Start the conversion
converter()
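The two dataframes are left open in the example above. As a rough sketch of what they might look like, the snippet below builds small stand-ins with made-up rows; the column names are taken from the schema, while the values are purely illustrative. It also checks that every FavoriteFlower names an existing species, on the assumption that the MATCH clause in the schema looks up Species nodes by their Name.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the schema above.
people = pd.DataFrame({
    "ID": [1, 2],
    "FirstName": ["Alice", "Bob"],
    "LastName": ["Example", "Sample"],
    "FavoriteFlower": ["setosa", "virginica"],
})

iris = pd.DataFrame({
    "sepal_length": [5.1, 6.3],
    "sepal_width": [3.5, 3.3],
    "petal_length": [1.4, 6.0],
    "petal_width": [0.2, 2.5],
    "species": ["setosa", "virginica"],
})

# Favorite flowers that do not correspond to any species in the iris data
# (assumption: such rows would have no Species node to relate to).
missing = set(people["FavoriteFlower"]) - set(iris["species"])
print(missing)
```

Running a check like this before the conversion is optional; it is just a cheap way to spot rows that would likely end up without a "likes" relation.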

Known issues

If you encounter a bug or unexpected behavior, please check the known issues list. If your issue is not listed, submit a new one.
