doc: Elaborated more on the NEAR Indexer tweaking (#2997)

near · Jul 22, 2020 · 0c0ad30 · 0c0ad30
1 parent 04649e3
commit 0c0ad30

Showing 2 changed files with 53 additions and 13 deletions.
diff --git a/chain/indexer/README.md b/chain/indexer/README.md
@@ -1,16 +1,16 @@
 # NEAR Indexer
 
-NEAR Indexer is a micro-framework, which provides you with a stream of blocks that are recorded on NEAR network.
+NEAR Indexer is a micro-framework that provides you with a stream of blocks recorded on the NEAR network. It is useful for handling real-time "events" on the chain.
 
+## Rationale
 
-NEAR Indexer is useful to handle real-time "events" on the chain.
+As scaling dApps enter NEAR’s mainnet, an issue may arise: how do they quickly and efficiently access state from our deployed smart contracts, and cut out the cruft? Contracts may grow to have complex data structures and querying the network RPC may not be the optimal way to access state data. The NEAR Indexer Framework allows for streams to be captured and indexed in a customized manner. The typical use-case is for this data to make its way to a relational database. Seeing as this is custom per project, there is engineering work involved in using this framework.
 
+NEAR Indexer is already in use in several new projects: we index all the events for NEAR Blockchain Explorer, and we also dig into Access Keys and index all of them for NEAR Wallet passphrase recovery and multi-factor authentication. With NEAR Indexer you can do high-level aggregation as well as low-level introspection of all the events inside the blockchain.
 
-NEAR Indexer is going to be used to build NEAR Explorer, augment NEAR Wallet, and provide overview of events in Rainbow Bridge.
-
-
-See the [example](https://github.com/nearprotocol/nearcore/tree/master/tools/indexer/example) for further details.
+We are going to build more Indexers in the future, and will also consider building Indexer integrations with streaming solutions like Kafka, RabbitMQ, ZeroMQ, and NoSQL databases. Feel free to [join our discussions](https://github.com/nearprotocol/nearcore/issues/2996).
 
+See the [example](https://github.com/nearprotocol/nearcore/tree/master/tools/indexer/example) for further technical details.
 
 ## How to set up and test NEAR Indexer
 
@@ -47,19 +47,16 @@ $ env NEAR_ENV=local near --keyPath ~/.near/localnet/validator_key.json create_a
 
 To run the NEAR Indexer connected to betanet we need to have configs and keys prepopulated, you can get them with the [nearup](https://github.com/near/nearup). Clone it and follow the instruction to run non-validating node (leaving account ID empty).
 
-Configs for betanet are in the `~/.near/betanet` folder. We need to ensure that NEAR Indexer follows all the necessary shards and syncs all the blocks, so `"tracked_shards"` and `"block_fetch_horizon"` parameters in `~/.near/betanet/config.json` need to be configured properly. For example, with a single shared network, you just add the shard #0 to the list:
+Configs for betanet are in the `~/.near/betanet` folder. We need to ensure that NEAR Indexer follows all the necessary shards, so the `"tracked_shards"` parameter in `~/.near/betanet/config.json` needs to be configured properly. For example, with a single-shard network, you just add shard #0 to the list:
+
 ```
 ...
-"consensus": {
-  ...
-  "block_fetch_horizon": 18446744073709551615,
-  ...
-},
-...
 "tracked_shards": [0],
 ...
 ```
 
+Hint: See the Tweaks section below to learn more about further configuration options.
+
 After that we can run NEAR Indexer.
 
 Follow to `nearcore` folder.
@@ -70,3 +67,40 @@ $ cargo run --release ~/.near/betanet
 ```
 
 After the network is synced, you should see logs of every block produced in Betanet. Get back to the code to implement any custom handling of the data flowing into the indexer.
+
+## Tweaks
+
+By default, nearcore is configured to do as little work as possible while still operating on an up-to-date state. Indexers may have different requirements, so no single configuration works for everyone; instead, we provide a set of knobs you can tune to your requirements.
+
+As mentioned earlier in this README, the most common tweak is listing all the shards you want to index data from. To do that, ensure that `"tracked_shards"` in `config.json` lists all the shard IDs, e.g. for the current betanet and testnet, which have a single shard:
+
+```
+...
+"tracked_shards": [0],
+...
+```
+
+Another tweak changes the default "fast" sync process to a "full" sync process. When the node comes online and observes that its state is missing or outdated, it will do state sync, which can follow one of two strategies:
+
+1. ("fast" / default) sync only enough information (block headers) to ensure that the chain is valid; the node won't have transactions, receipts, and execution outcomes, only the proofs, so Indexer will skip these blocks;
+2. ("full" / very slow) sync all the blocks, chunks, transactions, receipts, and execution outcomes starting from genesis.
+
+To force full sync (don't forget to track shards [see the previous tweak]), make the following change to your `config.json`:
+
+```
+...
+"consensus": {
+  ...
+  "block_fetch_horizon": 18446744073709551615,
+  ...
+},
+...
+```
+
+Indexer Framework also exposes access to the internal APIs (see the `Indexer::client_actors` method), so you can fetch data about any block, transaction, etc. However, by default nearcore is configured to remove old data (garbage collection [GC]), so querying data that was observed a few epochs earlier may return an error saying that the data is not found. If you only need block streaming, you don't need this tweak, but if you need access to historical data right from your Indexer, consider setting `"archive"` in `config.json` to `true`:
+
+```
+...
+"archive": true,
+...
+```
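
Reviewer note (not part of the diff): the three `config.json` tweaks described in the Tweaks section could also be applied by a small script rather than by hand. The following is a minimal Python sketch, assuming the config file is plain JSON with the key layout shown above. The helper names `apply_indexer_tweaks` and `patch_config_file` are hypothetical and are not part of nearcore or nearup.

```python
import json

# Value used in the README for "block_fetch_horizon" (the maximum u64).
U64_MAX = 2**64 - 1  # 18446744073709551615


def apply_indexer_tweaks(config, shards, full_sync=False, archive=False):
    """Return a copy of a config dict with the README tweaks applied.

    `config` is assumed to be the parsed content of a nearcore config.json.
    The input dict is left unmodified.
    """
    tweaked = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    tweaked["tracked_shards"] = list(shards)
    if full_sync:
        # Create the "consensus" section if missing, keep other keys intact.
        tweaked.setdefault("consensus", {})["block_fetch_horizon"] = U64_MAX
    if archive:
        tweaked["archive"] = True
    return tweaked


def patch_config_file(path, shards, full_sync=False, archive=False):
    """Read config.json at `path`, apply the tweaks, and write it back."""
    with open(path) as f:
        config = json.load(f)
    tweaked = apply_indexer_tweaks(config, shards, full_sync, archive)
    with open(path, "w") as f:
        json.dump(tweaked, f, indent=2)
    return tweaked
```

For example, calling `patch_config_file` on the expanded path of `~/.near/betanet/config.json` with `shards=[0]` and `full_sync=True` would set `"tracked_shards"` to `[0]` and `"block_fetch_horizon"` to `18446744073709551615`. Python's `json` module keeps that value as an exact integer; tools that parse JSON numbers as 64-bit floats may round it, so it is worth checking the written file.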
diff --git a/tools/indexer/example/README.md b/tools/indexer/example/README.md
@@ -0,0 +1,6 @@
+NEAR Indexer Simple Logger Example
+==================================
+
+This is an example project featuring [NEAR Indexer Framework](https://github.com/nearprotocol/nearcore/tree/master/chain/indexer). This Indexer prints out all the blocks, chunks, transactions, receipts, execution outcomes, and state changes, block by block, as soon as each block is finalized in the network.
+
+Refer to the NEAR Indexer Framework README to learn how to run this example.