This repository has been archived by the owner on Sep 2, 2021. It is now read-only.

Implementing Event storage/search with a timeseries database or a lucene indexed database #110

Open
farodin91 opened this issue Nov 3, 2016 · 9 comments

Comments

@farodin91
Member

Possible Databases

  • Elasticsearch (Lucene)
  • InfluxDB (time series)
  • Cassandra (clustered database)
  • TiKV (key-value store)

I would like to hear your ideas.
To start, we could encapsulate event handling a bit more.

@farodin91 farodin91 changed the title Implementing Event storage and search with a timeseries database or a lucene indexed database Implementing Event storage/search with a timeseries database or a lucene indexed database Nov 20, 2016
@sphinxc0re

Would part of the event lookup then move into ruma/ruma-events?

@farodin91
Member Author

For now, I don't plan to move it into ruma-events, because that repository is meant to define structures that can be used by both clients and servers.

@sphinxc0re

Also, I learned from working with InfluxDB that time-series DBMSs work best when they are fed data at a rate of about one data set every 5 seconds to 5 minutes.

@sphinxc0re

I don't think that is the case with these events, so I find this a little overkill.

@mujx

mujx commented Nov 20, 2016

What would be the benefit of this?

This seems like over-engineering to me, at least at this point. Also, adding an extra dependency will create problems with deployment. Synapse works fine without it.

@farodin91
Member Author

I think using Elasticsearch could massively reduce the complexity of sync. It could also increase performance.

@sphinxc0re

sphinxc0re commented Nov 20, 2016

What about making it possible to choose, either on startup or through the config file, whether event processing is done through Elasticsearch, Redis, etc.?
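
A rough sketch of what such a choice could look like in Rust. Everything here is hypothetical and not taken from the Ruma code base: the `EventBackend` enum, the `BuiltIn` default, the backend names, and the `select_backend` helper are assumptions for illustration only. It assumes a value given on startup takes precedence over the config file.

```rust
use std::str::FromStr;

/// Hypothetical list of event-processing backends.
/// `BuiltIn` stands for whatever the server does today (an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventBackend {
    BuiltIn,
    Elasticsearch,
    Redis,
}

impl FromStr for EventBackend {
    type Err = String;

    /// Parse a backend name case-insensitively, ignoring surrounding whitespace.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "builtin" => Ok(EventBackend::BuiltIn),
            "elasticsearch" => Ok(EventBackend::Elasticsearch),
            "redis" => Ok(EventBackend::Redis),
            other => Err(format!("unknown event backend: {}", other)),
        }
    }
}

/// Pick the backend: a startup argument wins over the config file value,
/// and if neither is set we fall back to the built-in handling.
pub fn select_backend(
    startup_arg: Option<&str>,
    config_value: Option<&str>,
) -> Result<EventBackend, String> {
    match startup_arg.or(config_value) {
        Some(name) => name.parse(),
        None => Ok(EventBackend::BuiltIn),
    }
}
```

With this shape, `select_backend(None, Some("elasticsearch"))` would pick Elasticsearch from the config file, while an unknown name is rejected with an error instead of silently falling back.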

@jimmycuadra
Member

When I was originally trying to decide on the primary data store for Ruma, I was strongly considering RethinkDB, as its concept of a client subscribing to updates on a query seemed like a great fit for Matrix's /sync endpoint. Since then, the company behind RethinkDB has gone out of business, which is a real shame, but there is an effort by the community to keep the project going. I'm definitely supportive of the idea of using a data store that better fits the use case for Ruma. I would prioritize homeserver performance over operational/deployment complexity. We can worry about how to make deployment easy for lay users when we start writing docs about deployment. I'm more concerned with Ruma being able to support a homeserver with a huge number of users than I am about making it easy for a layperson to deploy it in the simplest case.

@skade

skade commented Dec 12, 2016

How would Elasticsearch reduce the complexity of sync? None of the mentioned products are particularly good at syncing.
