[Enhancement] Improve search performance by using external index engine #1731

Closed
To-om opened this issue Jan 5, 2021 · 2 comments
Labels
enhancement TheHive4 TheHive4 related issues

To-om commented Jan 5, 2021

Request Type

Enhancement

Description

Currently, TheHive4 suffers from a performance problem because it relies on the basic index mechanism embedded in JanusGraph. These indexes are simple to use and manage, but they have limitations: they support only equality lookups and cannot be used for sorting. In TheHive, almost all lists (list of cases, list of alerts, ...) are sorted. Sorting therefore requires a scan of all the elements of the list, which has a heavy performance impact (especially if the list is long).
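
To illustrate the limitation, here is a minimal Python sketch. It is not JanusGraph or TheHive code; the record layout and all function names are hypothetical. It contrasts an equality-only index, where a sorted list forces a sort over every matching element, with an ordered index that can be read from the newest end and abandoned once enough results are found.

```python
from collections import defaultdict

# Hypothetical case records: (case_id, status, created_at as epoch seconds)
CASES = [
    (1, "Open", 1609804800),
    (2, "Resolved", 1609891200),
    (3, "Open", 1609977600),
    (4, "Open", 1609718400),
]


def build_equality_index(cases):
    """Map status -> list of cases. It can only answer `status == x`."""
    index = defaultdict(list)
    for case in cases:
        index[case[1]].append(case)
    return index


def newest_with_equality_index(index, status, limit):
    """The lookup is cheap, but sorting has to touch every matching case."""
    matches = index.get(status, [])
    return sorted(matches, key=lambda c: c[2], reverse=True)[:limit]


def build_sorted_index(cases):
    """Keep entries ordered by creation date, as an ordered index would."""
    return sorted(cases, key=lambda c: c[2])


def newest_with_sorted_index(sorted_cases, status, limit):
    """Read from the newest end and stop as soon as `limit` cases are found."""
    result = []
    for case in reversed(sorted_cases):
        if case[1] == status:
            result.append(case)
            if len(result) == limit:
                break
    return result
```

Both functions return the same page of results. The difference is that the first one must process every matching case before it can return the first page, while the second stops early.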

In order to solve this issue, TheHive 4.1 will come with a new index engine.

If TheHive4 is used in cluster mode (more than one server running TheHive), all servers must connect to a common index engine (which may or may not be clustered itself). In this case, a new component must be installed for index management: Elasticsearch will be used as the index engine. Unlike TheHive3, TheHive4 uses Elasticsearch only for indexes (it won't store data). Once this new component is installed, it must be configured on all TheHive nodes (in application.conf).

db.janusgraph {
  storage { ... }
  index.search {
    backend = elasticsearch
    hostname = ["es1.local", "es2.local"] // IP or hostname of elasticsearch nodes
    index-name = thehive
  }
}

If you use TheHive4 in single-server mode, you can use Elasticsearch or a file-based index engine (Lucene). The latter has less infrastructure impact (no component to install) and only requires a configuration setting that indicates where index data is stored on the filesystem. The platform administrator must ensure that there is enough space to hold this data.

db.janusgraph {
  storage { ... }
  index.search {
    backend = lucene
    directory = /opt/thehive/index
  }
}
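
As a rough aid for the disk space check mentioned above, an administrator could run something like the following Python sketch before the first startup. It is not part of TheHive; the function names and the 10 GiB threshold are assumptions chosen for illustration, and the real space requirement depends on the volume of data.

```python
import shutil
from pathlib import Path


def free_space_gib(directory):
    """Return the free space, in GiB, on the filesystem holding `directory`."""
    path = Path(directory)
    # Walk up to the nearest existing parent so the check also works
    # before the index directory has been created.
    while not path.exists() and path != path.parent:
        path = path.parent
    return shutil.disk_usage(path).free / 2**30


def has_enough_space(directory, required_gib):
    """True if at least `required_gib` GiB are free for the index."""
    return free_space_gib(directory) >= required_gib


if __name__ == "__main__":
    index_dir = "/opt/thehive/index"  # the `directory` value from the config
    required_gib = 10  # hypothetical threshold, adjust to your data volume
    print(f"{index_dir}: {free_space_gib(index_dir):.1f} GiB free, "
          f"enough: {has_enough_space(index_dir, required_gib)}")
```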

If the indexes have not been created yet, TheHive indexes all data at startup. This can take a long time, depending on the volume of data.

@To-om To-om added enhancement TheHive4 TheHive4 related issues labels Jan 5, 2021
@To-om To-om added this to the 4.1.0 milestone Jan 5, 2021
@To-om To-om self-assigned this Jan 5, 2021

cugu commented Jan 22, 2021

Will this also speed up insertion / creation of cases?

To-om commented Mar 5, 2021

@cugu No, indexes only speed up data retrieval, not modification
