Emerald Moonbeam is a crawler to gather the network information of the Polkadot nodes. The crawler connects to each available node, gets the details about the node (software version, etc) and its peers to continue crawling.
The current implementation is targeting the Kusama network (i.e. uses its bootnodes, follows compatible nodes, executes specific commands, etc), but later it may be adapted for other Polkadot/Substrate based blockchains.
Note
|
The project is in an early stage of development. Configuration, run arguments and output format may be changed without backward compatibility. |
-
Works on the protocol (libp2p/substrate/polkadot) level, emulates behaviour of a node
-
Starting from a bootstrap list, it discovers new nodes and connects to each
-
Extracts each node details, such as used software, supported protocols and current blockchain
-
Exports gathered details as
-
a local log file in JSON format
-
MySQL table
-
AWS S3
-
GCP Storage
-
-
Provides Prometheus compatible monitoring
By default it saves details about found nodes into a file in ./log
directory, and logs progress, as amount of
fully processed peers to the console.
Download a zip with the latest application from https://github.com/emeraldpay/moonbeam/releases (ex. moonbeam-0.3.0.zip
) and unpack it.
wget https://github.com/emeraldpay/moonbeam/releases/download/v0.3.0/moonbeam-0.3.0.zip unzip moonbeam-0.3.0.zip cd moonbeam-0.3.0
./bin/moonbeam
For a detailed demo please see example in the demo
directory. This demo setup of Moonbeam shows basic
integration with Grafana for monitoring of the process, and with Kibana to analyzer results of the crawler.
See the demo: ./demo
Note
|
If you run with Gradle you should pass the arguments with --args . Ex.: --args="--export.file.targetdir=my_results_dir"
|
-
--key=<…>
- private key of the node if available. Otherwise a random Private/Public Key are generated and published to the output, which can be used on the next run to keep the samePeerId
and improve discovery by other peers. -
--port=30100
- port to listen for incoming connections.30100
by default -
--export.file.targetdir=./log
- directory to save results,./log
by default. See Export to JSON file -
--export.mysql.enabled=true
to enable export to MySQL. See Export to MySQL -
--export.s3.enabled=true
to enable export to AWS S3. See Export to AWS S3 -
--export.gs.enabled=true
to enable export to Google Storage. See Export to GCP Storage
Moonbeam produces a log with all the details about the peers it had connected to. The log is JSON items separated by a new line.
The log may contain multiple lines, even for the same remote peer. The main reason for that is because the bot periodically reconnects to peers, to check for availability and updates. Another reason is that the bot is also listening for incoming connections, so another node can decide to connect to the bot based on its logic.
To specify the directory use --export.file.targetdir=/path/to/dir
, see other options below.
Example log file: /val/log/moonbeam.2020-04-08T21-45-21.i5w4gj2r.json.log
, where:
-
moonbeam
standard prefix for all logs -
2020-04-08T21-45-21
timestamp when file was created, asyyyy-MM-dd’T’HH-mm-ss
-
i5w4gj2r
alphanumeric random uniq instance id, to avoid conflicts if many crawlers are running (or when multiple results are stored in an archive)
{
"version":"https://schema.emeraldpay.io/moonbeam",
"timestamp":"2020-03-06T05:07:42.280046Z",
"peerId":"12D3KooWF5pLe2Vvj41GR3B77mwmp5afAQviMDiLYytbS36VSD2o",
"agent":{
"software":"parity-polkadot",
"version":"v0.7.20",
"commit":"3738158",
"platformFull":"x86_64-linux-gnu",
"platform":"linux",
"full":"parity-polkadot/v0.7.20-3738158-x86_64-linux-gnu (unknown)"
},
"host":{
"address":"/ip4/18.196.25.132/tcp/30333",
"hostname":null,
"ip":"24.156.21.132",
"port":30333,
"type":"IP"
},
"connection":{
"connectionType":"OUT",
"connectedAt":"2020-03-06T05:07:27.276719Z",
"disconnectedAt":"2020-03-06T05:07:42.279986Z"
},
"blockchain":{
"height":1328525,
"bestHash":"58e4a0b11edabdbee58c9f5aa430b9b7a7ce344562619b3fb26e2c814d6829aa",
"genesis":"b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe"
},
"protocols":{
"versions":[
{"id":"/substrate/ksmcc3","versions":["6","5","4","3"]},
{"id":"/ipfs/ping","versions":["1.0.0"]},
{"id":"/ipfs/id","versions":["1.0.0"]},
{"id":"/ipfs/kad","versions":["1.0.0"]}
]
}
}
Note
|
The JSON in the actual log would be JSON one-liners, i.e., not pretty-printed and have \n at the end of the line
|
While most of the fields are self-explanatory, some of them need extra description. Please see below.
Field | Example Value | Description |
---|---|---|
|
Version id for this file structure. If the schema of the file changed that breaks compatibility (i.e., fields are moved or renamed) - it gets a new unique id. Please note that at this stage of the development, the schema is not yet finalized. |
|
|
|
The timestamp when log item was written. Please note it can be different from connection time. |
|
|
|
|
List of protocols (in terms of the libp2p) supported by the remote peer |
Option | Default value | Description |
---|---|---|
|
|
Path to store log files |
|
|
Max time period to log into a single file. I.e., by default a new log file will be created every 60 minutes.
If you export to AWS S3/GCP Storage, it’s the time after which the file is closed and uploaded. |
Moonbeam can be configured to export nodes to a MySQL table.
-
The bot only appends a new information, and if you need to clean up the table, you have to run an external scheduled job to do so.
-
The table is going to have duplicate lines, appended each time the bot hit a peer. Use
SELECT DISTINCT
to get uniq peers. -
Table name: moonbeam.
CREATE TABLE `moonbeam` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`found_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`ip` varchar(45) DEFAULT NULL,
`peer_id` varchar(200) DEFAULT NULL,
`agent_full` varchar(128) DEFAULT NULL,
`agent_app` varchar(64) DEFAULT NULL,
`agent_version` varchar(64) DEFAULT NULL,
`genesis` char(66) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=197 DEFAULT CHARSET=utf8;
Column | Example | Description |
---|---|---|
|
|
Timestamp when the peer was found |
|
|
IP address |
|
PeerId |
|
|
|
Full agent name |
|
|
Type of software |
|
|
Software version |
|
|
Hash of the genesis block |
Option | Default value | Description |
---|---|---|
|
|
Enable/disable export to MySQL |
|
|
URL to connect. Format |
|
|
Username |
|
Password |
docker run -v $(pwd)/results:/results emeraldpay/moonbeam \ --export.mysql.enabled=true \ --export.mysql.url=10.0.2.100:3306/moonbeam \ --export.mysql.password=123456
Setup Moonbeam to upload logs to the Amazon AWS S3 bucket. Please note that the files are uploaded once they are finished (i.e. closed) by JSON exporter. By default it’s every 60 minutes. See Export to JSON file
Option | Default value | Description |
---|---|---|
|
|
Enable/disable export to AWS S3 |
|
|
(required) AWS Region |
|
(required) S3 Bucket to upload files |
|
|
(optional) Path prefix, i.e. a directory. Example: |
|
|
(required) AWS credentials |
docker run -v $(pwd)/results:/results emeraldpay/moonbeam \ --export.s3.enabled=true \ --export.s3.accesskey=AKIJF5KA05L1JAF \ --export.s3.secretkey=i85aGTgtzh39t9+h8gka9bkbAEW1lgIYVC811Aoe \ --export.s3.bucket=my-crawler-bucket \ --export.s3.region=us-east-1 \ --export.s3.path=moonbeam/
Setup Moonbeam to upload logs to the Google Cloud Storage bucket. Please note that the files are uploaded once they are finished (i.e. closed) by JSON exporter. By default it’s every 60 minutes. See Export to JSON file
Option | Default value | Description |
---|---|---|
|
|
Enable/disable export to GCP Storage |
|
(required) GCP Bucket name to upload files |
|
|
(optional) Path prefix, i.e. a directory. Example: |
|
|
(optional) Path to JSON file with credentials |
docker run -v $(pwd)/results:/results -v $(pwd)/gcloud.json:/etc/moonbeam/gcloud.json emeraldpay/moonbeam \ --export.gs.enabled=true \ --export.gs.credentials=/etc/moonbeam/gcloud.json \ --export.gs.bucket=my-crawler-bucket \ --export.gs.path=moonbeam/
Application exports Prometheus compatible status at http://127.0.0.1:1234
Option | Default value | Description |
---|---|---|
|
|
Host to bind Prometheus exporter |
|
|
Port to bind Prometheus exporter |
Name | Labels | Type | Description |
---|---|---|---|
|
|
Counter |
Total Bytes transferred by the crawler. |
|
|
Counter |
Total messages transferred |
|
|
Counter |
Connection errors, i.e. timeout, host inaccessible, etc |
|
|
Counter |
Error specific for the libp2p or Polkadot protocol |
|
Counter |
Discovered addresses |
|
|
|
Counter |
Connections to peers |
|
|
Counter |
Successfully finished connections to peers |
|
Histogram |
How many neighbour peers were reported by a connection |
|
|
Summary |
Time spent on a connection. With the following quantiles: 90% (/- 1%), 95% (/- 0.5%), 99% (+/- 0.1%) |
Label | Options | Description |
---|---|---|
|
|
Direction of the connection, i.e. |
|
|
Direction of the bytes transferred, i.e. inside a connection. |
|
|
Type of a connection error |
|
|
Level on which an error happened. |
In addition to the metrics above the application exports JVM metrics (such as memory use, threads, etc),
and process (file descriptors, etc) metrics.
All of those metrics are available under jvm_
namespace, or under process_
.
Instead of passing configuration as command line options, all of them could be specified with a config file and
pass a path to it as --spring.config.location=PATH_TO_FILE
.
moonbeam.properties
## Main Configuration
# Port to listen for connections
port=30100
# Private key to use, otherwise a new random key generated
key=
## Prometheus Monitoring
prometheus.host=127.0.0.1
prometheus.port=1234
## Export ot JSON
# Directory for files
export.file.targetdir=/var/log
# Time limit for a file, after which a new file is created
export.file.timelimit=60m
## Export to MySQL
export.mysql.enabled=false
# Url to the database
export.mysql.url=jdbc:mysql://localhost:3306/moonbeam
# Username
export.mysql.username=moonbeam
# Password
export.mysql.password=
## Export ot AWS S3
export.s3.enabled=false
# AWS region with the bucket
export.s3.region=us-east-1
# Bucket to use
export.s3.bucket=
# Path on the bucket
export.s3.path=
# AWS Auth Key
export.s3.accesskey=
# AWS Auth Key Secret
export.s3.secretkey=
## Export to GCP Storage
export.gs.enabled=false
# Bucket to use
export.gs.bucket=
# Path on the bucket
export.gs.path=
# Path to JSON file with credentials
export.gs.credentials=
-
Java 11+
-
Gradle 5.6+
-
(optional) port 30100 accessible from the internet to accept incoming connections
-
Kotlin
-
Spring Boot + Spring Framework
-
Spring Reactor + Netty
-
io.libp2p:jvm-libp2p-minimal
for libp2p types/structures (provided transport/security are not used, Moonbeam has own implementation) -
Groovy + Spock Framework for testing
-
Uses Spring Reactor and reactive streams idea in general. It allows opening many non-blocking connections with minimal overhead, avoiding threads and state synchronization, which is especially crucial for a crawler to make sure it can process hundreds of peers and thousands of connections in parallel.
-
Because the libp2p library for JVM was not production ready at the moment of the development, the required subset of the Libp2p protocol was implemented from scratch. Moonbeam implementation has only part of the protocol that is specific for bot functionality and may be missing many other features.
-
A similar situation is for SCALE codec, which didn’t have any implementation for JVM. Therefore Moonbeam has its own small unoptimized implementation, which is suitable only for reading some types of messages that bot is accessing.
-
The bot is designed for aggressive use of the protocol, just to gather all important details from remotes. It doesn’t follow some of the Libp2p and Substrate protocols guidelines, it uses many shortcuts and sometimes deliberately ignores or misuses parts of the protocols to get job done.
To build the source code please install Gradle from the website https://gradle.org/, or through SDKMAN https://sdkman.io/
gradle build
gradle jibDockerBuild ... docker run emeraldpay/moonbeam
Want to support the project, prioritize a specific feature, or get commercial help with using Moonbeam in your project? Please contact [email protected] to discuss the possibility
The core project code is released under Apache 2.0 license.
Demo and docs are published under CC0 license + additionally Apache 2.0 for code parts in the examples/demo.
File src/proto/dht.proto
, with the definition of DHT Protobuf messages, is taken from libp2p specification and has
the same license as it specified for the libp2p specification.