Skip to content

In-memory database of facts (records with attributes) with a predicative searching facility

License

Notifications You must be signed in to change notification settings

yegor256/factbase

Repository files navigation

Single-Table NoSQL-ish In-Memory Database

DevOps By Rultor.com We recommend RubyMine

rake PDD status Gem Version Test Coverage Yard Docs Hits-of-Code License

This Ruby gem manages an in-memory database of facts. A fact is simply a map of properties and values. The values are either atomic literals or non-empty sets of literals. It is possible to delete a fact, but impossible to delete a property from a fact.

ATTENTION: The current implemention is naive and, because of that, very slow. I will be very happy if you suggest a better implementation without the change of the interface. The Factbase::query() method is what mostly needs performance optimization: currently it simply iterates through all facts in the factbase in order to find those that match the provided terms. Obviously, even a simple indexing may significantly increase performance.

Here is how you use it (it's thread-safe, by the way):

fb = Factbase.new
f = fb.insert
f.kind = 'book'
f.title = 'Object Thinking'
fb.query('(eq kind "book")').each do |f|
  f.seen = true
end
fb.insert
fb.query('(not (exists seen))').each do |f|
  f.title = 'Elegant Objects'
end

You can save the factbase to the disc and then load it back:

file = '/tmp/simple.fb'
f1 = Factbase.new
f = f1.insert
f.foo = 42
File.save(file, f1.export)
f2 = Factbase.new
f2.import(File.read(file))
assert(f2.query('(eq foo 42)').each.to_a.size == 1)

There are some boolean terms available in a query (they return either true or false):

  • (always) and (never) are true and false
  • (nil v) is true if v is nil
  • (not b) is the inverse of b
  • (or b1 b2 ...) is true if at least one argument is true
  • (and b1 b2 ...) — if all arguments are true
  • (when b1 b2) — if b1 is true and b2 is true or b1 is false
  • (exists p) — if p property exists
  • (absent p) — if p property is absent
  • (zero v) — if any v equals to zero
  • (eq v1 v2) — if any v1 equals to any v2
  • (lt v1 v2) — if any v1 is less than any v2
  • (gt v1 v2) — if any v1 is greater than any v2
  • (many v) — if v has many values
  • (one v) — if v has one value

There are string manipulators:

  • (concat v1 v2 v3 ...) — concatenates all v
  • (sprintf v v1 v2 ...) — creates a string by v format with params
  • (matches v s) — if any v matches the s regular expression

There are a few terms that return non-boolean values:

  • (at i v) is the i-th value of v
  • (size v) is the cardinality of v (zero if v is nil)
  • (type v) is the type of v ("String", "Integer", "Float", "Time", or "Array")
  • (either v1 v1) is v2 if v1 is nil

It's possible to modify the facts retrieved, on fly:

  • (as p v) adds property p with the value v
  • (join s t) adds properties named by the s mask with the values retrieved by the t term, for example, (join "x<=foo,y<=bar" (gt x 5)) will add x and y properties, setting them to values found in the foo and bar properties in the facts that match (gt x 5)

Also, some simple arithmetic:

  • (plus v1 v2) is a sum of ∑v1 and ∑v2
  • (minus v1 v2) is a deducation of ∑v2 from ∑v1
  • (times v1 v2) is a multiplication of ∏v1 and ∏v2
  • (div v1 v2) is a division of ∏v1 by ∏v2

It's possible to add and deduct string values to time values, like (plus t '2 days') or (minus t '14 hours').

Types may be converted:

  • (to_int v) is an integer of v
  • (to_str v) is a string of v
  • (to_float v) is a float of v

One term is for meta-programming:

  • (defn f "self.to_s") defines a new term using Ruby syntax and returns true
  • (undef f) undefines a term (nothing happens if it's not defined yet), returns true

There are terms that are history of search aware:

  • (prev p) returns the value of p property in the previously seen fact
  • (unique p) returns true if the value of p property hasn't been seen yet

The agg term enables sub-queries by evaluating the first argument (term) over all available facts, passing the entire subset to the second argument, and then returning the result as an atomic value:

  • (lt age (agg (eq gender 'F') (max age))) selects all facts where the age is smaller than the maximum age of all women
  • (eq id (agg (always) (max id))) selects the fact with the largest id
  • (eq salary (agg (eq dept $dept) (avg salary))) selects the facts with the salary average in their departments

There are also terms that match the entire factbase and must be used primarily inside the (agg ..) term:

  • (nth v p) returns the p property of the v-th fact (must be a positive integer)
  • (first p) returns the p property of the first fact
  • (count) returns the tally of facts
  • (max p) returns the maximum value of the p property in all facts
  • (min p) returns the minimum
  • (sum p) returns the arithmetic sum of all values of the p property

It's also possible to use a sub-query in a shorter form than with the agg:

  • (empty q) is true if the subquery q is empty

There are some system-level terms:

  • (env v1 v2) returns the value of environment variable v1 or the string v2 if it's not set

How to contribute

Read these guidelines. Make sure you build is green before you contribute your pull request. You will need to have Ruby 3.2+ and Bundler installed. Then:

bundle update
bundle exec rake

If it's clean and you don't see any error messages, submit your pull request.