Updated the docs to the new converter api

jkminder · Jul 14, 2022 · e3d3dcd
1 parent fae6a04
commit e3d3dcd
Showing 5 changed files with 28 additions and 9 deletions.
diff --git a/docs/source/api/api.rst b/docs/source/api/api.rst
@@ -7,4 +7,5 @@ API Reference
 
    Core <core>
    Relational modules <relational_modules>
-   Py2neo extensions <py2neo_extensions>
+   Py2neo extensions <py2neo_extensions>
+   Utils <utils>
diff --git a/docs/source/api/utils.rst b/docs/source/api/utils.rst
@@ -0,0 +1,8 @@
+-----------------
+Utils
+-----------------
+
+General utility functions for working with rel2graph.
+
+
+.. autofunction:: rel2graph.utils.load_file
diff --git a/docs/source/converter.rst b/docs/source/converter.rst
@@ -2,26 +2,35 @@ Converter
 =========
 
 The |Converter| handles the main conversion of the relational data. 
-It is initialised with the *conversion schema filename*, the iterator and the graph. 
+It is initialised with the *conversion schema* as a string, the iterator and the graph. 
 
 .. code-block:: python
 
     from rel2graph import Converter
 
-    converter = Converter(config_filename, iterator, graph)
+    converter = Converter(conversion_schema, iterator, graph)
 
 To start the conversion, one simply calls the object. It then iterates twice over the iterator: first to process all the nodes and, secondly, to create all relations. This makes sure that any node a relation refers to is already created first.
 
 .. code-block:: python
 
     converter()
 
+If your conversion schema is saved in a separate file, you can use the provided ``load_file`` function to load it into a string.
+
+.. code-block:: python
+
+    from rel2graph import Converter
+    from rel2graph.utils import load_file
+
+    converter = Converter(load_file(conversion_schema_file), iterator, graph)
+
 The |Converter| can utilise **multithreading**. When initialising you can set the number of parallel workers. Each worker operates in its own thread. 
 Be aware that the committing to the graph is often still serialized, since the semantics require this (e.g. nodes must be committed before any relation or when [merging nodes](#merging-nodes) all the nodes must be serially committed). So the primary use-case of using multiple workers is if your resources are utilising a network connection (e.g. remote database) or if you require a lot of [matching](#match) in the graph (matching is parallelised).
 
 .. code-block:: python
 
-    converter = Converter(config_filename, iterator, graph, num_workers = 20)
+    converter = Converter(conversion_schema, iterator, graph, num_workers = 20)
 
 **Attention:** If you enable more than 1 workers, ensure that all your :doc:`wrappers <wrapper>` support multithreading (add locks if necessary).
 
@@ -40,7 +49,7 @@ In the first cell, you initially have created the converter object and called it
 
 .. code-block:: python
 
-    converter = Converter(config_filename, iterator, graph)
+    converter = Converter(conversion_schema, iterator, graph)
     converter()
 
 Now a ``ConnectionException`` is raised due to network problems. You can now fix the problem and then recall the converter in a new cell:
@@ -58,14 +67,14 @@ You have a small error in your :doc:`conversion schema <conversion_schema>` for
 
 .. code-block:: python
 
-    converter = Converter(config_filename, iterator, graph)
+    converter = Converter(conversion_schema, iterator, graph)
     converter()
 
 Now, e.g. ``KeyError`` is raised since the attribute name was written slightly wrong. Instead of rerunning the whole conversion (which might take hours), you can fix the schema file and reload the schema file and recall the converter:
 
 .. code-block:: python
 
-    converter.reload_config(config_filename)
+    converter.reload_schema(conversion_schema)
     converter()
 
 The converter will just continue where it left off with the new :doc:`conversion schema <conversion_schema>`. 

diff --git a/docs/source/introduction.rst b/docs/source/introduction.rst
@@ -16,7 +16,7 @@ we can keep supplying it with resources without writing more code.
     :alt: rel2graph factory
 
 Since there might be different types of resources, we build a factory per resource type. 
-One specifies all the "blueprints" for all the factories in thr |convschema| file. 
+One specifies all the "blueprints" for all the factories in the |convschema| file. 
 A |Converter|, the main object of *rel2graph*, will take this file and construct all the factories 
 based on your "blueprints". For a set of supplied resources the |Converter| will automatically select 
 the correct factory, use it to produce a graph out of the resource and merge the produced graph with 

diff --git a/docs/source/quick_start.rst b/docs/source/quick_start.rst
@@ -41,6 +41,7 @@ For an entity with ``Flower["petal_width"] = 5``, the outputed node will have th
     import pandas as pd 
     from rel2graph.relational_modules.pandas import PandasDataframeIterator 
     from rel2graph import IteratorIterator, Converter, Attribute, register_attribute_postprocessor
+    from rel2graph.utils import load_file
 
     # Create a connection to the neo4j graph with the py2neo Graph object
     graph = Graph(scheme="http", host="localhost", port=7474,  auth=('neo4j', 'password')) 
@@ -60,7 +61,7 @@ For an entity with ``Flower["petal_width"] = 5``, the outputed node will have th
     iterator = IteratorIterator([pandas_iterator, iris_iterator])
 
     # Create converter instance with schema, the final iterator and the graph
-    converter = Converter("schema.yaml", iterator, graph)
+    converter = Converter(load_file("schema.yaml"), iterator, graph)
     # Start the conversion
     converter()
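
Migration note: after this commit, ``Converter`` and ``reload_schema`` take the schema text itself rather than a filename, so callers that used to pass a path need to read the file first. The sketch below shows that step with a hypothetical stand-in, ``load_schema_text``. It assumes ``rel2graph.utils.load_file`` simply returns the file's contents as a string, as the docs in this diff describe; the real implementation is not shown in this commit and may differ. The schema content is a placeholder, not real schema syntax.

```python
import tempfile
from pathlib import Path


def load_schema_text(path):
    """Hypothetical stand-in for rel2graph.utils.load_file.

    Assumption: the real helper returns the file contents as one string.
    """
    return Path(path).read_text(encoding="utf-8")


# Write a placeholder schema file so the example runs without rel2graph.
with tempfile.TemporaryDirectory() as tmp_dir:
    schema_path = Path(tmp_dir) / "schema.yaml"
    schema_path.write_text("# placeholder schema text\n", encoding="utf-8")

    # Old call style passed the path; the new style passes this string.
    conversion_schema = load_schema_text(schema_path)

# With rel2graph installed, the string would then be handed over as in the diff:
#     converter = Converter(conversion_schema, iterator, graph)
#     converter.reload_schema(conversion_schema)  # after fixing the schema
```

Because the schema is now a plain string, it can also be built in code or fetched from somewhere other than the local filesystem before being passed to the converter.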