The Database module is part of X-Road v6 monitor project, which includes modules of Database module (this document), Collector module, Corrector module, Analysis module, Reports module, Opendata module and Networking/Visualizer module.
The Database module provides storage and synchronization between the other modules.
Overall system, its users and rights, processes and directories are designed in a way, that all modules can reside in one server (different users but in same group 'opmon') but also in separate servers.
Overall system is also designed in a way, that allows to monitor data from different X-Road v6 instances (in Estonia ee-dev
, ee-test
, EE
, see also X-Road v6 environments.
Overall system is also designed in a way, that can be used by X-Road Centre for all X-Road members as well as for Member own monitoring (includes possibilities to monitor also members data exchange partners).
The database is implemented with the MongoDB technology: a non-SQL database with replication and sharding capabilities.
This document describes the installation steps for Ubuntu 16.04. For other Linux distribution, please refer to: MongoDB 3.4 documentation
Add the MongoDB repository key and location:
# Key 4096R/A15703C6 2016-01-11 [expires: 2020-01-05]
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6
echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" \
| sudo tee /etc/apt/sources.list.d/mongodb-org-3.4.list
Install MongoDB server and client tools (shell)
sudo apt-get update
sudo apt-get install --yes mongodb-org
Most libraries follow the "MAJOR.MINOR.PATCH" schema, so the guideline is to review and update PATCH versions always (they mostly contain bug fixes). MINOR updates can be applied, as they should keep compatibility, but there is no guarantee for some libraries. A suggestion would be to check if tests are working after MINOR updates and rollback if they stop working. MAJOR updates should not be applied.
This section describes the necessary MongoDB configuration. It assumes that MongoDB is installed and running.
To check if the MongoDB daemon is active, run:
sudo service mongod status
To start MongoDB daemon, run:
sudo service mongod start
To ensure that MongoDB daemon start is enabled after system start, run and check, that /lib/systemd/system/mongod.service; enabled
is present:
sudo service mongod status
# Loaded: loaded (/lib/systemd/system/mongod.service; disabled; ...)
If not, then enable and restart MongoDB daemon:
sudo systemctl enable mongod.service
# Created symlink from /etc/systemd/system/multi-user.target.wants/mongod.service \
# to /lib/systemd/system/mongod.service.
sudo service mongod restart
sudo service mongod status
# Loaded: loaded (/lib/systemd/system/mongod.service; enabled; ...)
The database module requires the creation of the following users:
- A root user to controls the configuration of the database.
- A backup specific user to backup data collections.
- A superuser to personalize access to the configuration of the database.
- Module specific users, to provide access and limit permissions.
- Optionally, creation of a test user for Integration Tests with the database.
- Optionally, creation of a read-only user for reviewing the database.
Enter MongoDB client shell:
mongo
Inside the MongoDB client shell, create the root user in the admin database. Replace ROOT_PWD with the desired root password (keep it in your password safe).
use admin
db.createUser( { user: "root", pwd: "ROOT_PWD", roles: ["root"] })
Inside the MongoDB client shell, create the db_backup user in the admin database. Replace BACKUP_PWD with the desired password (keep it in your password safe).
use admin
db.createUser( { user: "db_backup", pwd: "BACKUP_PWD", roles: ["backup"] })
When applicable, create inside the MongoDB client shell one or more personalized superuser user in the admin database. Replace superuser with the personalized user name in your domain. Replace SUPERUSER_PWD with the desired password (keep it in your password safe).
use admin
db.createUser( { user: "superuser", pwd: "SUPERUSER_PWD", roles: ["root"] })
The instructions below describes the creation of module users for collector, corrector, reports, analyzer, and anonymizer modules.
Note 1: The instructions assume sample
instances. For other instances replace the sample
suffix appropriately.
X-Road instance names in Estonia are case-sensitive (sample: EE
differs from ee
).
Note 2: MongoDB database names and settings in different modules should follow this case-sensitivity as well to guarantee, that publishing and notification scripts in Reports module can use correct path. Please refer to the specific configuration file of every module and set the MongoDB access to match the user and passwords created here, where:
- INSTANCE: is the X-Road v6 instance
- MONGODB_USER: is the instance-specific module user
- MONGODB_PWD: is the MONGODB_USER password
- MONGODB_SERVER: is the database host (example:
opmon
) - MONGODB_SUFFIX: is the database suffix, same as INSTANCE
Inside the MongoDB client shell, create the collector_sample user in the auth_db database. Replace MODULE_PWD with the desired module password. The collector user has "readWrite" permissions to "query_db" and "collector_state" databases (here, query_db_sample and collector_state_sample).
use auth_db
db.createUser( { user: "collector_sample", pwd: "MODULE_PWD", roles: []})
db.grantRolesToUser( "collector_sample", [{ role: "readWrite", db: "query_db_sample"}])
db.grantRolesToUser( "collector_sample", [{ role: "readWrite", db: "collector_state_sample"}])
Inside the MongoDB client shell, create the corrector_sample user in the auth_db database. Replace MODULE_PWD with the desired module password. The corrector user has "readWrite" permissions to "query_db" database (here, query_db_sample).
use auth_db
db.createUser( { user: "corrector_sample", pwd: "MODULE_PWD", roles: []})
db.grantRolesToUser( "corrector_sample", [{ role: "readWrite", db: "query_db_sample"}])
Inside the MongoDB client shell, create the reports_sample user in the auth_db database. Replace MODULE_PWD with the desired module password. The reports user has "read" permissions to "query_db" database and "readWrite" permission to "reports_state" database (here, query_db_sample and reports_state_sample).
use auth_db
db.createUser({ user: "reports_sample", pwd: "MODULE_PWD", roles: [] })
db.grantRolesToUser( "reports_sample", [{ role: "read", db: "query_db_sample" }])
db.grantRolesToUser( "reports_sample", [{ role: "readWrite", db: "reports_state_sample" }])
Analyzer module has two sub-modules, calculation ('training or updateing historic average models' and 'finding anomalies') and web interface. It is suggested to have separate users for them although the access rights are the same. Web interface user needs access to the query database as well because list of queries, that belong to an anomaly are loaded from there.
Inside the MongoDB client shell, create the analyzer_sample user in the auth_db database. Replace MODULE_PWD with the desired module password. The analyzer user has "read" permissions to "query_db" database and "readWrite" permission to "analyzer_database" database (here, query_db_sample and analyzer_database_sample).
use auth_db
db.createUser({ user: "analyzer_sample", pwd: "MODULE_PWD", roles: [] })
db.grantRolesToUser( "analyzer_sample", [{ role: "read", db: "query_db_sample" }])
db.grantRolesToUser( "analyzer_sample", [{ role: "readWrite", db: "analyzer_database_sample" }])
Inside the MongoDB client shell, create the analyzer_interface_sample user in the auth_db database. Replace MODULE_PWD with the desired module password. The analyzer interface user has "read" permissions to "query_db" database and "readWrite" permission to "analyzer_database" database (here, query_db_sample and analyzer_database_sample).
use auth_db
db.createUser({ user: "analyzer_interface_sample", pwd: "MODULE_PWD", roles: [] })
db.grantRolesToUser( "analyzer_interface_sample", [{ role: "read", db: "query_db_sample" }])
db.grantRolesToUser( "analyzer_interface_sample", [{ role: "readWrite", db: "analyzer_database_sample" }])
Inside the MongoDB client shell, create the anonymizer_sample user in the auth_db database. Replace MODULE_PWD with the desired module password. The anonymizer user has "read" permissions to "query_db" database (here, query_db_sample) and "readWrite" permissions to "anonymizer_state" database (here anonymizer_state_sample).
use auth_db
db.createUser({ user: "anonymizer_sample", pwd: "MODULE_PWD", roles: [] })
db.grantRolesToUser( "anonymizer_sample", [{ role: "read", db: "query_db_sample" }])
db.grantRolesToUser( "anonymizer_sample", [{ role: "readWrite", db: "anonymizer_state_sample" }])
The ci_test user is only necessary to run integration tests that uses MongoDB (example, corrector integration tests). The integration tests uses as MONGODB_SUFFIX the value PY-INTEGRATION-TEST
, and this should not be mixed with any module specific user.
Inside the MongoDB client sehll, create the ci_test user in the auth_db database. The default password is also "ci_test". The ci_test user has permissions ONLY to databases:
- CI_query_db
- CI_collector_state
- CI_reports_state
- CI_analyzer_database
Note: The integration database uses a different name convention to avoid conflict with real MONGODB_SUFFIX instances.
use auth_db
db.createUser({ user: "ci_test", pwd: "ci_test", roles: [] })
db.grantRolesToUser( "ci_test", [{ role: "dbOwner", db: "CI_query_db" }])
db.grantRolesToUser( "ci_test", [{ role: "dbOwner", db: "CI_collector_state" }])
db.grantRolesToUser( "ci_test", [{ role: "dbOwner", db: "CI_reports_state" }])
db.grantRolesToUser( "ci_test", [{ role: "dbOwner", db: "CI_analyzer_database" }])
Inside the MongoDB client shell, create the user_read user in the admin database. Replace USER_PWD with the desired password (keep it in your password safe).
use admin
db.createUser( { user: "user_read", pwd: "USER_PWD", roles: ["readAnyDatabase"] })
To check if all users and configurations were properly created, list all users and verify their roles using the following commands:
Inside the MongoDB client shell:
use admin
db.getUsers()
use auth_db
db.getUsers()
Inside mongo client shell, execute the following command to enable log rotation:
use admin
db.runCommand( { logRotate : 1 } )
exit
To ensure, that daily logfiles are kept, we suggest to use logrotate. Please add file /etc/logrotate.d/mongodb
sudo vi /etc/logrotate.d/mongodb
with content:
/var/log/mongodb/mongod.log {
daily
rotate 30
compress
dateext
notifempty
missingok
sharedscripts
postrotate
/bin/kill -SIGUSR1 `pgrep mongod` 2> /dev/null || true
endscript
}
MongoDB default install does not enable authentication. The following steps are used to configure MongoDB security authorization.
NOTE: The root user (database admin) needs to exist already. See section 'Configure root user'.
To enable MongoDB security authorization, edit the mongod.conf configuration file using your favorite text editor (here, vi is used).
sudo vi /etc/mongod.conf
Change the following line in the configuration file:
security:
authorization: enabled
After saving the alterations, the MongoDB service needs to be restarted. This can can be performed with the following command:
sudo service mongod restart
Note: After enabling authentication it will be needed to specify database, user and password when connecting to mongo client shell. For example:
mongo admin --username root --password
# or
mongo admin --username user_read --password
# or
mongo auth_db --username collector_sample --password
To make MongoDB services available in the modules network (see System Architecture)[system_architecture.md], the following configuration is necessary:
Open MongoDB configuration file in your favorite text editor (here, vi is used)
sudo vi /etc/mongod.conf
Add the external IP (the IP seen by other modules in the network) to enable the IP biding.
In this example, if the machine running MongoDB (opmon
) has the Ethernet IP 10.11.22.33
, and therefore, the following line is edited in the configuration file:
bindIp: 127.0.0.1,10.11.22.33
After saving the alterations, the MongoDB service needs to be restarted. This can can be performed with the following command:
sudo service mongod restart
The MongoDB interface is exposed by default in the port 27017. Make sure the port is allowed in the firewall configuration.
The default MongoDB install uses the following folders to store data and logs:
Data folder:
/var/lib/mongodb
Log files:
/var/log/mongodb
Although indexes can improve query performances, indexes also present some operational considerations. See MongoDB Operational Considerations for Indexes for more information.
Our collection holds a large amount of data, and our applications need to be able to access the data while building the index, therefor we consider building the index in the background, as described in Background Construction.
The example here uses the INSTANCE-specific database query_db_sample
, and the same procedure should be used to other / additional instances.
Enter MongoDB client as root:
mongo admin --username root --password
Inside MongoDB client shell, execute the following commands:
use query_db_sample
db.raw_messages.createIndex({'corrected': 1, 'requestInTs': 1})
db.clean_data.createIndex({'clientHash': 1})
db.clean_data.createIndex({'producerHash': 1})
db.clean_data.createIndex({'correctorTime': 1})
db.clean_data.createIndex({'correctorStatus': 1, 'client.requestInTs': 1 })
db.clean_data.createIndex({'correctorStatus': 1, 'producer.requestInTs': 1 })
db.clean_data.createIndex({'messageId': 1, 'client.requestInTs': 1})
db.clean_data.createIndex({'messageId': 1, 'producer.requestInTs': 1})
db.clean_data.createIndex({'client.requestInTs': 1})
db.clean_data.createIndex({'client.serviceCode': 1})
db.clean_data.createIndex({'producer.requestInTs': 1})
db.clean_data.createIndex({'producer.serviceCode': 1})
db.clean_data.createIndex({'client.clientMemberCode': 1, 'client.clientSubsystemCode': 1, 'client.requestInTs': 1})
db.clean_data.createIndex({'client.serviceMemberCode': 1, 'client.serviceSubsystemCode': 1, 'client.requestInTs': 1})
db.clean_data.createIndex({'producer.clientMemberCode': 1, 'producer.clientSubsystemCode': 1, 'producer.requestInTs': 1})
db.clean_data.createIndex({'producer.serviceMemberCode': 1, 'producer.serviceSubsystemCode': 1, 'producer.requestInTs': 1})
use collector_state_sample
db.server_list.createIndex({ 'timestamp': -1})
use reports_state_sample
db.notification_queue.createIndex({'status': 1, 'user_id': 1})
use analyzer_database_sample
db.incident.createIndex({'incident_status': 1, 'incident_creation_timestamp': 1})
Note 1: If planning to select / filter records manually according to different fields, then please consider to create index for every field to allow better results. From other side, if these are not needed, please consider to drop them as the existence reduces overall speed of Corrector module.
Note 2: Additional indexes might required for system scripts in case of functionality changes (eg different reports). Please consider to create them as they speed up significantly the speed of Reports module.
Note 3: Index build might affect availability of cursor for long-running queries. Please review the need of active Collector module and specifically the need of active Corrector module while running long-running queries, specifically Reports module.
Additional helping scripts, directory mongodb_scripts/
has been included into source code repository.
To fetch it:
# If HOME not set, set it to /tmp default.
export TMP_DIR=${HOME:=/tmp}
export PROJECT="X-Road-opmonitor"
export PROJECT_URL="https://github.com/ria-ee/${PROJECT}.git"
export SOURCE="${TMP_DIR}/${PROJECT}"
if [ ! -d "${TMP_DIR}/${PROJECT}" ]; then \
cd ${TMP_DIR}; git clone ${PROJECT_URL}; \
else \
cd ${SOURCE}; git pull ${PROJECT_URL}; \
fi
ls -al ${SOURCE}/mongodb_scripts/
NB! Mentioned appendixes do not log their work and do not keep heartbeat.
Scripts read_speed_test.py
, hash_speed_test.py
are available.
Commands:
export INSTANCE="sample"
cd ${SOURCE}/mongodb_scripts/
python3 read_speed_test.py query_db_${INSTANCE} root --auth admin --host 127.0.0.1:27017
python3 hash_speed_test.py query_db_${INSTANCE} root --auth admin --host 127.0.0.1:27017
It might happen, that under heavy and continuos write activities to database, the indexes corrupt.
In result, read access from database takes long time, it can be monitored also from current log file /var/log/mongodb/mongod.log
, where in COMMANDs the last parameter protocol:op_query in milliseconds (ms) is large even despite usage of indexes (planSummary: IXSCAN { }).
In such cases, dropping and creating indexes again or reIndex() process might be the solution. It may also be worth running if the collection size has changed significantly or if the indexes are consuming a disproportionate amount of disk space. These operations may be expensive for collections that have a large amount of data and/or a large number of indexes. Please consult with MongoDB documentation.
Commands:
mongo admin --username root --password
> use query_db_sample
> db.raw_messages.reIndex()
> db.clean_data.reIndex()
> use collector_state_sample
> db.server_list.reIndex()
> use reports_state_sample
> db.notification_queue.reIndex()
> use analyzer_database_sample
> db.incident.reIndex()
exit
Additional tools create_indexes_query.py
, create_indexes_reports.py
, create_indexes_collector.py
and create_indexes_analyzer.py
are available.
Commands:
export INSTANCE="sample"
cd ${SOURCE}/mongodb_scripts/
python3 create_indexes_query.py query_db_${INSTANCE} root --auth admin --host 127.0.0.1:27017
python3 create_indexes_collector.py collector_state_${INSTANCE} root --auth admin --host 127.0.0.1:27017
python3 create_indexes_reports.py reports_state_${INSTANCE} root --auth admin --host 127.0.0.1:27017
python3 create_indexes_analyzer.py analyzer_database_${INSTANCE} root --auth admin --host 127.0.0.1:27017
To keep MongoDB size under control, save the MongoDB / HDD space, the optional raw_data_archive.py
script can be used.
It archives processed data ({"corrected": true}) from raw data, and remove the documents from database.
Commands:
export INSTANCE="sample"
cd ${SOURCE}/mongodb_scripts/
python3 raw_data_archive.py query_db_${INSTANCE} root --auth admin --host 127.0.0.1:27017
MongoDB runs as a daemon process. It is possible to stop, start and restart the database with the following commands:
- Check stop
sudo service mongod stop
- Check start
sudo service mongod start
- Check restart
sudo service mongod restart
- Check status
sudo service mongod status
It is also possible to monitor MongoDB with a GUI interface using the MongoDB Compass. For specific instructions, please refer to:
https://www.mongodb.com/products/compass
and for a complete list of MongoDB monitoring tools, please refer to:
https://docs.mongodb.com/master/administration/monitoring/
To perform backup of database, it is recommended to use the mongodb tools mongodump and mongorestore
For example, to perform a complete database backup, execute (replace BACKUP_PWD
with the password for backup user set in section 'Configure backup user' and MDB_BKPDIR
is output directory for backup):
export MDB_BKPDIR="/srv/backup/mongodb-`/bin/date '+%Y-%m-%d_%H:%M:%S'`"
mkdir --parents ${MDB_BKPDIR}
mongodump --username db_backup --password 'BACKUP_PWD' --authenticationDatabase admin --oplog --gzip --out ${MDB_BKPDIR}
For example, to perform a database restore, execute (replace BACKUP_PWD
with the password for backup user set in section 'Configure backup user' and MDB_BKPDIR
is directory for backup):
mongorestore --username db_backup --password 'BACKUP_PWD' --authenticationDatabase admin --oplogReplay --gzip ${MDB_BKPDIR}
For additional details and recommendations about MongoDB backup and restore tools, please check:
https://docs.mongodb.com/manual/tutorial/backup-and-restore-tools/
MongoDB supports replication. A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.
Sample to add replication, add the following line in the configuration file:
sudo vi /etc/mongod.conf
replication:
replSetName: rs0
oplogSizeMB: 100
After saving the alterations, the MongoDB service needs to be restarted. This can can be performed with the following command:
sudo service mongod restart
To make mongod instance as master, the following commands are needed in mongod shell (in this example, if the machine running MongoDB (opmon
) has the Ethernet IP 10.11.22.33
):
> rs.initiate()
{
"info2" : "no configuration specified. Using a default configuration for the set",
"me" : "10.11.22.33:27017",
"ok" : 1
}
To build or rebuild indexes for a replica set, see Build Indexes on Replica Sets.
For additional details and recommendations about MongoDB replication set, please check MongoDB Manual Replication
To change the size of oplog, follow the steps provided in manual https://docs.mongodb.com/manual/tutorial/change-oplog-size/
Please note, that performance of MongoDB and its tuning really depends on physical hardware, its drivers, HDD, CPU, RAM, SWP settings, operating system tunings etc.
The performance also depends on size of database, existing indexes and their sizes, how they fit into RAM. The solution with small amount of data (less than 100 millions rows, less than 10G data), speed of different queries is usually very good and satisfactory (less than a second) but might rapidly increase when data amount increases (up to 1 billion rows, up to 2T of data etc).
We cannot predict all the nyances here, please be prepared within your own team.
Some of the tunings that might be important follows:
# mongo admin --username root --password
STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
CONTROL [initandlisten]
CONTROL [initandlisten] ** WARNING: You are running on a NUMA machine.
CONTROL [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
CONTROL [initandlisten] ** numactl --interleave=all mongod [other options]
CONTROL [initandlisten]
CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
CONTROL [initandlisten] ** We suggest setting it to 'never'
CONTROL [initandlisten]
CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
CONTROL [initandlisten] ** We suggest setting it to 'never'
Edit and add into /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Edit and add into /etc/sysctl.conf
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
See also https://en.wikipedia.org/wiki/Paging#Swappiness
Edit and add into /etc/sysctl.conf
vm.swappiness = 1
# or max vm.swappiness = 10
Edit and add into /etc/security/limits.d/mongod.conf
mongod soft nproc 64000
mongod hard nproc 64000
mongod soft nofile 64000
mongod hard nofile 64000