The neoads
module implements interoperable AbstractSet, AbstractMap and AbstractDLList
(Doubly Linked List), over a Neo4J backend.
It relies on Neomodel for the OGM functionality and is designed to minimise
round-trips to the database. In other words, if an operation can be carried out
at Server Side, neoads
will execute it as a query rather than attempt to
instantiate the data structure in local memory, modify it and then push it
back to the backend.
neoads
data structures are implemented in a way that is:
- Agnostic to domain-specific data model
- You can create lists, sets or maps of anything in your data model,
simply by making
ElementDomain
the root object of your data model.
- You can create lists, sets or maps of anything in your data model,
simply by making
- Completely transparent to the database backend
- The abstraction layer does not enforce a special organisation or shortcuts and it is still possible to access the data structures in your CYPHER queries.
The main benefit of neoads
is that it makes it possible to store query results in
data structures within the DBMS and feed those data structures to further processing
without re-running the queries themselves.
In addition to classes that model the Set, Map and Doubly Linked Lists, neoads
also introduces a number of other entities such as CompositeString
or SimpleDate
that aim to blur the line between variables held in local RAM and variables held at the
database management system (Neo4j).
The ultimate goal is to be able to make neoads
data structures completely transparent
from the point of view of a high productivity language such as Python.
This means that neoads
data structures could behave as drop-in replacements for data
processing algorithms and interact "naturally" with other objects via operators and functions.
From this point of view, Neo4J becomes a giant heap available to higher level algorithms.
At the moment, it is possible to add neoads
to your virtualenv
directly from
this repository via:
> pip install git+https://...
neoads
relies on neomodel and assumes
that:
- A Neo4j instance with proper access rights is already available via a known Bolt URL
- Neomodel's
db.set_connection()
has already been called prior to any further calls toneoads
. (Or more generally,neomodel
has already been initialised).
This section contains the bear minimum usage examples for a user to get up and
running with neoads
.
For more information, please refer to the
detailed documentation in ReadTheDocs,
or in doc/
.
Create a simple variable called answer
:
u = SimpleNumber(42, "answer").save()
u
is now a data object that provides full access to the SimpleNumber
. To
recall it from the database management system simply use:
v = SimpleNumber.nodes.get(name="answer")
The exact same example applies for SimpleDate
with the exception that the value
argument must be a standard Python datetime
object.
u = CompositeString("Hello World", "greeting").save()
In addition to all other operations that can be applied to u
, it is also possible
to access its contents via:
print(u[2])
u = CompositeArrayString(["Hello", "Hola", "Χαίρετε"], "greetings").save()
A similar initialisation pattern applies to CompositeArrayNumber, CompositeArrayDate.
To work with abstract data structures we need to set up a few things first. Here are a few strings whose use will become apparent shortly:
colours_a = [CompositeString("RED", "A_RED").save(), CompositeString("GREEN", "A_GREEN").save(), CompositeString("BLUE", "A_BLUE").save()] colours_b = [CompositeString("RED", "B_RED").save(), CompositeString("GREEN", "B_GREEN").save(), CompositeString("PURPLE", "B_PURPLE").save()]
Create two sets:
colours_a_set = AbstractSet(name="A_COLOURS").save() colours_b_set = AbstractSet(name="B_COLOURS").save()
Add the items:
for a_colour in colours_a: colours_a_set.add(a_colour) for b_colour in colours_b: colours_b_set.add(b_colour)
Now, sets can be combined with operators as in:
union_of_colour_sets = colours_a_set | colours_b_set
Or in more complex ways, such as this way of evaluating the symmetric difference:
symm_diff_colour = (colours_a_set - colours_b_set) | (colours_b_set-colours_a_set)
It is worth noting that all operators used above do produce intermediate objects with the result of partial evaluations (for example, there are two intermediate sets that are produced with the partial results needed to evaluate the __or__). These can be cleared via garbage collection.
All of these operations have taken place at server side. None of the sets had to travel to the client side, be processed and then be pushed out to the server again.
Similarly, it is possible to create AbstractMap, AbstractDLList objects. For more information please refer to the documentation.
neoads
is completely agnostic to the sort of entities its abstract data structures point to,
so long as these descend from a common ancestor of type ElementDomain
.
The following is a minimal example of how to setup lists of persons living in specific geographical
regions. Notice here the way an arbitrary data model gets integrated with neoads
.
class PersonalRelationship(neomodel.StructredRel): """ A very simple assocation class between entities of type Person that bears the date the acquaintance was made. """ on_date = neomodel.DateTimeProperty(default_now=True) class Country(neoads.ElementDomain): uid = neomodel.UniqueIdProperty() name = neomodel.StringProperty() class Person(neoads.ElementDomain): uid = neomodel.UniqueIdProperty() full_name = neomodel.StringProperty() acquainted_with = neomodel.RelationshipTo("Person", "ACQUAINTED_WITH", model = PersonalRelationship) lives_in = neomodel.RelationshipTo("Country", "LIVES_IN")
Now, given this data model, we can instantiate a double linked list at server side just by running a simple query:
# First of all create the list some_abstract_list = neoads.AbstractDLList(name="EU_27_PERSONS").save() # The populate it some_abstract_list.from_query("MATCH (ListItem:Person)-[LIVES_IN]->(b:Country) " "WHERE b.name IN ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czechia', " "'Denmark', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', " "'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Netherlands', " "'Poland', 'Portugal', 'Romania', 'Slovakia', 'Slovenia', 'Spain', 'Sweden'] ")
At this point, some_abstract_list
will be populated with all Person
that have a connection to
a Country
within the European Union's EU27 countries. This of course was by name.
With neoads
it would also have been possible to first create lists of Country
that belong within
a specific geographical region (e.g. Europe, Asia, Americas, etc) and then create a list (still from
a query) that makes use of a previously defined neoads
list.
In addition to this and since neoads
is completely agnostic to the types of elements its abstract
data structures can hold, it is possible to have abstract data structures point to other abstract
data structures creating higher complexity structures if required.
For example a neoads
abstract list of lists can be accessed via some_list[0][1]["Alpha"][2]
. The
first indexing returns neoads.AbstractDLList
whose indexing operation returns neoads.AbstractDLList
that in turn returns a neoads.AbstractMap
whose value might be another neoads.AbstractDLList
from which we return the element at index 2
.
neoads
data structures remain completely re-usable at server side.
For more information please see the documentation on abstract data structures.
Unnamed entities are saved at the DBMS but unless their name
attribute has been saved (or
they have been renamed) it is impossible for them to be retrieved once a reference to them
has been lost.
This is of course by design, to cover for cases where an intermediate data structure is required, but is not required to be saved.
For those particular cases, neoads
provides a very simple "garbage collector".
The garbage collector is basically a set of scripts that look for specific variables
and erase them.
These variables are as follows:
- Variables whose name conforms to a UUID4 identifier and do not have any connection with any other entity in the system
- Data structure entities that are not connected to any data structure
And this brings us nicely to the Memory Manager.
The objective of the memory manager is to group together a number of operations that might be required for the maintenance of the DBMS state.
At the moment, the only operations that are available via the memory manager are listing objects, getting a reference to an object and performing garbage collection.
Other operations that are planned are:
- Lost+Found / recovery operations
- Optimisation
- Backup and Restoring
MemoryManager objects are straightforward to initialise and work with. The following
example assumes that the NEO4J_BOLT_URL
environment variable has already been set.
import random import neoads mm = neoads.MemoryManager() some_elements = [neoads.SimpleNumber(random.random()).save() for k in range(0,10)] # Let's perform a dir() on the DBMS for the variables we just set objects_in_mem = mm.list_objects() # list_objects() returns a dictionary where the key is the name of a variable and value is # the object of the variable itself. # Let's get a reference to an object some_object = mm.get_object(objects_in_mem[0]) # At this point some_object is of type `SimpleNumber` (or whatever appropriate type) and # we can apply any further operations on it.
Module documentation is available in doc/
as a standard sphinx
documentation project or over at ReadTheDocs