Add a way to get entry sizes in XDR query tool.
Also added a flag to iterate all non-dead entries in the bucket list (not just 'current' ones).
dmkozh committed Jul 17, 2024
1 parent d78f48e commit 4fc139f
Showing 15 changed files with 337 additions and 154 deletions.
30 changes: 19 additions & 11 deletions docs/software/ledger_query_examples.md
@@ -12,63 +12,71 @@ ledger to JSON files for further analysis.

* Dump entries modified in the 1000 most recent ledgers:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg --output-file q.json --last-ledgers 1000`
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg --output-file q.json --last-ledgers 1000`

* Dump 1000 recently modified ledger entries (not necessarily the *most* recently modified):
`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg --output-file q.json --limit 1000`
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg --output-file q.json --limit 1000`

* Dump all the ledger entries with provided account ID or trustline ID:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q.json --filter-query
"data.account.accountID == 'GDNG6SVZAJHCFCH65R7SQDLGVR6FDAR67M7YDHEESXKRRZYBWVF4BEC5'
|| data.trustLine.accountID == 'GDNG6SVZAJHCFCH65R7SQDLGVR6FDAR67M7YDHEESXKRRZYBWVF4BEC5'" `

* Dump 1000 account entries that have non-empty `inflationDest` field:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q.json --filter-query "data.account.inflationDest != NULL" --limit 1000`

* Dump all the offer entries that trade lumens for any asset with code `'AABBG'` and have
been modified within the last 1000 ledgers:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q.json --filter-query
"data.offer.selling == 'NATIVE' && data.offer.buying.assetCode == 'AABBG'"
--last-ledgers 1000`

* Dump 100 trustline entries that have buying liabilities lower than selling liabilities:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q.json --filter-query
"data.trustLine.ext.v1.liabilities.buying < data.trustLine.ext.v1.liabilities.selling"
--limit 100`

* Dump 100 account entries that fullfill a more complex filter (this just demonstrates
* Dump 100 account entries that fulfill a more complex filter (this just demonstrates
that filter supports logical expressions):

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q7.json --filter-query
"(data.account.balance < 100000000 || data.account.balance >= 2000000000)
&& data.account.numSubEntries > 2" --limit 100`

* Output 10 entries larger than 200 bytes:

`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q8.json --filter-query "entry_size() > 200" --limit 10`

## Aggregating ledger entries

The following examples demonstrate how to aggregate parts of the ledger into CSV tables.

* Find the count of every ledger entry type starting from the certain ledger seq:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q.csv --filter-query "lastModifiedLedgerSeq >= 37872608"
--group-by "data.type" --agg "count()"`

* Dump the order book stats for the offers that have been modified during the last
100000 ledgers:

`stellar-core.exe dump-ledger --conf ../stellar-core_pubnet.cfg
`./stellar-core dump-ledger --conf ../stellar-core_pubnet.cfg
--output-file q.csv --filter-query "data.type == 'OFFER'"
--group-by "data.offer.selling, data.offer.selling.assetCode,
data.offer.selling.issuer, data.offer.buying, data.offer.buying.assetCode,
data.offer.buying.issuer" --agg "sum(data.offer.amount), avg(data.offer.amount), count()"
--last-ledgers 100000`


* Find the entry size distribution:

`./stellar-core dump-ledger --output-file entry_stats.json --group-by data.type --agg sum(entry_size()),avg(entry_size())`
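The size aggregation in the last example amounts to a per-type sum and average. A minimal sketch of that computation follows, assuming `entry_size()` yields the serialized size of an entry in bytes (the docs above only say "larger than 200 bytes"). `EntryInfo`, `SizeStats` and `aggregateSizes` are hypothetical names for illustration and are not part of stellar-core.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a ledger entry: just its type name and its
// size in bytes (assumed to be what entry_size() reports).
struct EntryInfo
{
    std::string type;
    std::uint64_t sizeBytes;
};

// Running totals for one group, enough to answer sum() and avg().
struct SizeStats
{
    std::uint64_t sum = 0;
    std::uint64_t count = 0;

    double
    avg() const
    {
        return count == 0 ? 0.0 : static_cast<double>(sum) / count;
    }
};

// Equivalent in spirit to:
//   --group-by "data.type" --agg "sum(entry_size()), avg(entry_size())"
std::map<std::string, SizeStats>
aggregateSizes(std::vector<EntryInfo> const& entries)
{
    std::map<std::string, SizeStats> stats;
    for (auto const& e : entries)
    {
        auto& s = stats[e.type];
        s.sum += e.sizeBytes;
        ++s.count;
    }
    return stats;
}
```

In most POSIX shells the `--agg` argument in the command above would need to be quoted, as in the earlier aggregation examples, because of the parentheses.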
3 changes: 2 additions & 1 deletion src/bucket/BucketManager.h
@@ -373,7 +373,8 @@ class BucketManager : NonMovableOrCopyable
virtual void visitLedgerEntries(
HistoryArchiveState const& has, std::optional<int64_t> minLedger,
std::function<bool(LedgerEntry const&)> const& filterEntry,
std::function<bool(LedgerEntry const&)> const& acceptEntry) = 0;
std::function<bool(LedgerEntry const&)> const& acceptEntry,
bool includeAllStates) = 0;

// Schedule a Work class that verifies the hashes of all referenced buckets
// on background threads.
70 changes: 66 additions & 4 deletions src/bucket/BucketManagerImpl.cpp
@@ -1319,7 +1319,7 @@ BucketManagerImpl::mergeBuckets(HistoryArchiveState const& has)
}

static bool
visitEntriesInBucket(std::shared_ptr<Bucket const> b, std::string const& name,
visitLiveEntriesInBucket(std::shared_ptr<Bucket const> b, std::string const& name,
std::optional<int64_t> minLedger,
std::function<bool(LedgerEntry const&)> const& filterEntry,
std::function<bool(LedgerEntry const&)> const& acceptEntry,
@@ -1381,11 +1381,67 @@ visitEntriesInBucket(std::shared_ptr<Bucket const> b, std::string const& name,
return !stopIteration;
}

static bool
visitAllEntriesInBucket(
std::shared_ptr<Bucket const> b, std::string const& name,
std::optional<int64_t> minLedger,
std::function<bool(LedgerEntry const&)> const& filterEntry,
std::function<bool(LedgerEntry const&)> const& acceptEntry)
{
ZoneScoped;

using namespace std::chrono;
medida::Timer timer;

bool stopIteration = false;
timer.Time([&]() {
for (BucketInputIterator in(b); in; ++in)
{
BucketEntry const& e = *in;
if (e.type() == LIVEENTRY || e.type() == INITENTRY)
{
auto const& liveEntry = e.liveEntry();
if (minLedger && liveEntry.lastModifiedLedgerSeq < *minLedger)
{
stopIteration = true;
continue;
}
if (filterEntry(e.liveEntry()))
{
if (!acceptEntry(e.liveEntry()))
{
stopIteration = true;
break;
}
}
}
else
{
if (e.type() != DEADENTRY)
{
std::string err = "Malformed bucket: unexpected "
"non-INIT/LIVE/DEAD entry.";
CLOG_ERROR(Bucket, "{}", err);
throw std::runtime_error(err);
}
}
}
});
nanoseconds ns =
timer.duration_unit() * static_cast<nanoseconds::rep>(timer.max());
milliseconds ms = duration_cast<milliseconds>(ns);
size_t bytesPerSec = (b->getSize() * 1000 / (1 + ms.count()));
CLOG_INFO(Bucket, "Processed {}-byte bucket file '{}' in {} ({}/s)",
b->getSize(), name, ms, formatSize(bytesPerSec));
return !stopIteration;
}

void
BucketManagerImpl::visitLedgerEntries(
HistoryArchiveState const& has, std::optional<int64_t> minLedger,
std::function<bool(LedgerEntry const&)> const& filterEntry,
std::function<bool(LedgerEntry const&)> const& acceptEntry)
std::function<bool(LedgerEntry const&)> const& acceptEntry,
bool includeAllStates)
{
ZoneScoped;

@@ -1413,8 +1469,14 @@ BucketManagerImpl::visitLedgerEntries(
throw std::runtime_error(std::string("missing bucket: ") +
binToHex(pair.first));
}
if (!visitEntriesInBucket(b, pair.second, minLedger, filterEntry,
acceptEntry, deletedEntries))
bool continueIteration =
includeAllStates
? visitAllEntriesInBucket(b, pair.second, minLedger,
filterEntry, acceptEntry)
: visitLiveEntriesInBucket(b, pair.second, minLedger,
filterEntry, acceptEntry,
deletedEntries);
if (!continueIteration)
{
break;
}
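The two traversal modes selected by `includeAllStates` can be illustrated with a toy model. This is only a sketch under stated assumptions: buckets are visited newest first, and the live-only path skips any key that was already seen or deleted in a newer bucket. The diff shows that path threading a `deletedEntries` set through the buckets, but its body is not visible here. All `Toy*` names are hypothetical, and `minLedger` and filter handling are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

enum class ToyEntryType
{
    Init,
    Live,
    Dead
};

// Hypothetical, simplified bucket entry: a string key instead of a LedgerKey.
struct ToyBucketEntry
{
    ToyEntryType type;
    std::string key;
    std::uint32_t lastModifiedLedgerSeq;
};

using ToyBucket = std::vector<ToyBucketEntry>;

// Visits buckets in the given order (assumed newest first) and returns the
// entries that would be handed to the query.
std::vector<ToyBucketEntry>
visitToyEntries(std::vector<ToyBucket> const& buckets, bool includeAllStates)
{
    std::vector<ToyBucketEntry> out;
    std::set<std::string> seenKeys; // only consulted in live-only mode
    for (auto const& bucket : buckets)
    {
        for (auto const& e : bucket)
        {
            if (e.type == ToyEntryType::Dead)
            {
                // Dead entries are never emitted; in live-only mode they
                // also shadow older versions of the same key.
                if (!includeAllStates)
                {
                    seenKeys.insert(e.key);
                }
                continue;
            }
            if (includeAllStates)
            {
                // Every INIT/LIVE version is emitted, even if superseded.
                out.push_back(e);
            }
            else if (seenKeys.insert(e.key).second)
            {
                // First (newest) sighting of this key: the current state.
                out.push_back(e);
            }
        }
    }
    return out;
}
```

With a newer bucket holding a live `A` and a dead `B`, and an older bucket holding earlier versions of `A`, `B` and a `C`, the live-only mode in this model returns the newest `A` and `C`, while the all-states mode returns both versions of `A` plus the old `B` and `C`.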
3 changes: 2 additions & 1 deletion src/bucket/BucketManagerImpl.h
@@ -182,7 +182,8 @@ class BucketManagerImpl : public BucketManager
void visitLedgerEntries(
HistoryArchiveState const& has, std::optional<int64_t> minLedger,
std::function<bool(LedgerEntry const&)> const& filterEntry,
std::function<bool(LedgerEntry const&)> const& acceptEntry) override;
std::function<bool(LedgerEntry const&)> const& acceptEntry,
bool includeAllStates) override;

std::shared_ptr<BasicWork> scheduleVerifyReferencedBucketsWork() override;

4 changes: 3 additions & 1 deletion src/ledger/LedgerTxn.cpp
@@ -25,9 +25,11 @@
#include "xdr/Stellar-ledger-entries.h"
#include "xdrpp/marshal.h"
#include <Tracy.hpp>
#include <algorithm>
#include <soci.h>

#include <algorithm>
#include <numeric>

namespace stellar
{

7 changes: 4 additions & 3 deletions src/main/ApplicationUtils.cpp
@@ -55,7 +55,7 @@ writeLedgerAggregationTable(
std::vector<std::string> keyFields;
if (groupByExtractor)
{
keyFields = groupByExtractor->getFieldNames();
keyFields = groupByExtractor->getColumnNames();
for (auto const& keyField : keyFields)
{
ofs << keyField << ",";
@@ -742,7 +742,7 @@ dumpLedger(Config cfg, std::string const& outputFile,
std::optional<std::string> filterQuery,
std::optional<uint32_t> lastModifiedLedgerCount,
std::optional<uint64_t> limit, std::optional<std::string> groupBy,
std::optional<std::string> aggregate)
std::optional<std::string> aggregate, bool includeAllStates)
{
if (groupBy && !aggregate)
{
@@ -820,7 +820,8 @@ dumpLedger(Config cfg, std::string const& outputFile,
}
++entryCount;
return !limit || entryCount < *limit;
});
},
includeAllStates);
}
catch (xdrquery::XDRQueryError& e)
{
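The accept callback in the hunk above doubles as the `--limit` stop condition: it counts the entry, then returns `false` once the limit is reached so the visitor stops. A minimal sketch of that contract follows, using a hypothetical `visitUntilRejected` helper over plain integers in place of `visitLedgerEntries`. Only the `!limit || entryCount < *limit` expression is taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical visitor: hands each item to `accept` and stops as soon as
// the callback returns false. Returns how many items were visited.
std::size_t
visitUntilRejected(std::vector<int> const& items,
                   std::function<bool(int)> const& accept)
{
    std::size_t visited = 0;
    for (int item : items)
    {
        ++visited;
        if (!accept(item))
        {
            break;
        }
    }
    return visited;
}

// Counts processed items with the same stop condition as the hunk above.
std::uint64_t
countAccepted(std::vector<int> const& items,
              std::optional<std::uint64_t> limit)
{
    std::uint64_t entryCount = 0;
    visitUntilRejected(items, [&](int) {
        ++entryCount;
        // No limit: keep going. Otherwise continue only while below it.
        return !limit || entryCount < *limit;
    });
    return entryCount;
}
```

Because the count is incremented before the comparison, this sketch always processes the item that reaches the limit and stops right after it.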
2 changes: 1 addition & 1 deletion src/main/ApplicationUtils.h
@@ -34,7 +34,7 @@ int dumpLedger(Config cfg, std::string const& outputFile,
std::optional<uint32_t> lastModifiedLedgerCount,
std::optional<uint64_t> limit,
std::optional<std::string> groupBy,
std::optional<std::string> aggregate);
std::optional<std::string> aggregate, bool includeAllStates);
void showOfflineInfo(Config cfg, bool verbose);
int reportLastHistoryCheckpoint(Config cfg, std::string const& outputFile);

34 changes: 21 additions & 13 deletions src/main/CommandLine.cpp
@@ -458,6 +458,13 @@ limitParser(std::optional<std::uint64_t>& limit)
"process only this many recent ledger entries (not *most* recent)");
}

clara::Opt
includeAllStatesParser(bool& include)
{
return clara::Opt{include}["--include-all-states"](
"include all non-dead states of the entry into query results");
}

int
runWithHelp(CommandLineArgs const& args,
std::vector<ParserWithValidation> parsers, std::function<int()> f)
@@ -1188,19 +1195,20 @@ runDumpLedger(CommandLineArgs const& args)
std::optional<uint64_t> limit;
std::optional<std::string> groupBy;
std::optional<std::string> aggregate;
return runWithHelp(args,
{configurationParser(configOption),
outputFileParser(outputFile).required(),
filterQueryParser(filterQuery),
lastModifiedLedgerCountParser(lastModifiedLedgerCount),
limitParser(limit), groupByParser(groupBy),
aggregateParser(aggregate)},
[&] {
return dumpLedger(configOption.getConfig(),
outputFile, filterQuery,
lastModifiedLedgerCount, limit,
groupBy, aggregate);
});
bool includeAllStates = false;
return runWithHelp(
args,
{configurationParser(configOption),
outputFileParser(outputFile).required(),
filterQueryParser(filterQuery),
lastModifiedLedgerCountParser(lastModifiedLedgerCount),
limitParser(limit), groupByParser(groupBy), aggregateParser(aggregate),
includeAllStatesParser(includeAllStates)},
[&] {
return dumpLedger(configOption.getConfig(), outputFile, filterQuery,
lastModifiedLedgerCount, limit, groupBy,
aggregate, includeAllStates);
});
}

int
4 changes: 2 additions & 2 deletions src/util/xdrquery/XDRQuery.cpp
@@ -15,9 +15,9 @@ XDRFieldExtractor::XDRFieldExtractor(std::string const& query) : mQuery(query)
}

std::vector<std::string>
XDRFieldExtractor::getFieldNames() const
XDRFieldExtractor::getColumnNames() const
{
return mFieldList->getFieldNames();
return mFieldList->getColumnNames();
}

XDRAccumulator::XDRAccumulator(std::string const& query) : mQuery(query)