-
Notifications
You must be signed in to change notification settings - Fork 62
pipeline_tuning
Thomas Leibovici edited this page Nov 23, 2015
·
23 revisions
When processing incoming information (from scan or changelogs), robinhood periodically dumps internal statistics to its log file (frequency for dumping stats is driven by config parameter 'logs::stats_interval). These log lines starts with 'STATS' tag which makes them easy to 'grep'.
grep STATS /var/log/robinhood.log
The more useful part for analyzing ingest performance is the EntryProcessor Pipeline Stats section. It indicates the time spent for each kind of operation, thus giving an idea of current limitation:
- Filesystem operations: GET_FID, GET_INFO_FS, CHGLOG_CLR.
- Database operations: GET_INFO_DB, DB_APPLY, RM_OLD_ENTRIES.
- Wait: the number of entries waiting to be processed at this stage.
- Curr: the number of entries currently being processed at this stage.
- Done: the number of entries that have been processed at this stage but that cannot go to the next stage because an entry with the same 'id' or 'path' is currently being processed in next steps.
- Total: the total number of entries that were processed at this step since the program started.
- ms/op: the average duration of operations at this stage.
Robinhood is running filesystem scan. Let's analyze the workflow stats and try to improve the scan speed:
==== EntryProcessor Pipeline Stats === Idle threads: 0 Id constraints count: 10000 (hash min=0/max=5/avg=0.6) Name constraints count: 10000 (hash min=0/max=5/avg=0.6) Stage | Wait | Curr | Done | Total | ms/op | 0: GET_FID | 0 | 0 | 0 | 441 | 13.12 | 1: GET_INFO_DB | 9958 | 2 | 0 | 55892 | 0.61 | 2: GET_INFO_FS | 0 | 28 | 10 | 55864 | 17.27 | 3: REPORTING | 0 | 0 | 0 | 55854 | 0.00 | 4: PRE_APPLY | 0 | 0 | 0 | 55854 | 0.00 | 5: DB_APPLY | 0 | 2 | 0 | 55852 | 1.43 | 6: CHGLOG_CLR | 0 | 0 | 0 | 0 | 0.00 | 7: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
Observations:
- Pipeline is full (constraints count=10000) which may slow down the scan speed (scan threads can't push entries in the pipeline when it is full). This indicate the limiting point is currently in the pipeline sprocessing steps.
- Longer operations are filesystem operations: GET_FID (13.12 ms/op) and GET_INFO_FS (17.27 ms/op)
- According to 'Curr' column, most threads (28/32) are currently processing entries in 'GET_INFO_FS' stage, which may starve over pipeline steps.
Let's see what we get then:
==== EntryProcessor Pipeline Stats === Idle threads: 26 Id constraints count: 7 (hash min=0/max=1/avg=0.0) Name constraints count: 8 (hash min=0/max=1/avg=0.0) Stage | Wait | Curr | Done | Total | ms/op | 0: GET_FID | 0 | 0 | 0 | 2011 | 4.25 | 1: GET_INFO_DB | 0 | 2 | 0 | 381537 | 0.47 | 2: GET_INFO_FS | 0 | 3 | 0 | 381534 | 6.90 | 3: REPORTING | 0 | 0 | 0 | 381534 | 0.00 | 4: PRE_APPLY | 0 | 0 | 0 | 381534 | 0.00 | 5: DB_APPLY | 0 | 2 | 0 | 381532 | 1.36 | 6: CHGLOG_CLR | 0 | 0 | 0 | 0 | 0.00 | 7: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
Observations:
- Pipeline workflow looks better: it is no longer full so scan threads are no longer limited by robinhood processing rate to push entries to the pipeline.
- Latency of filesystem operations also descreased (4.25ms and 6.90ms). This may be explained by less concurrency on filesystem operations.
Back to wiki home