Fast access logging for analytics #537
At the moment there are hundreds of log messages of various levels and generic format.

One more approach is to log only binary data (e.g. integers, floats and nanoseconds since epoch) into a ring buffer and use a log reading tool to process the binary data. This is very close to what HAProxy does: https://www.youtube.com/watch?v=762owEyCI4o

The access log is a separate case: it can be used to compute advanced statistics - a larger log allows longer statistics. E.g. with the current access log we can compute statistics for each return code, so we don't actually need the counters implemented in #2023 (#1454). The access log can also be extended with:

Probably #2023 should be reverted, but maybe we should provide an application layer parser, which will compute the statistics. This probably can be done with the same library.
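To make the binary-only idea above concrete, here is a minimal illustrative sketch of a fixed-size, integers-only record; all field names are invented and nothing in it is specified by this issue:

#include <cstdint>

/*
 * Hypothetical fixed-size binary access log record: only numeric fields,
 * so a reader tool can decode the ring buffer without parsing any text.
 */
struct BinaryAccessRecord {
    uint64_t timestamp_ns;   /* nanoseconds since epoch */
    uint32_t client_addr;    /* IPv4 client address */
    uint16_t status;         /* HTTP response status code */
    uint16_t flags;          /* e.g. blocked/allowed bits */
    uint64_t response_bytes; /* bytes sent to the client */
};

A record like this is what a varnishlog-style reader would decode from the ring buffer instead of parsing text lines.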
The problem is crucial because of the low performance of the kernel log and the absence of analytics abilities for DDoS incident response. To react to a DDoS incident we need to extend the access log with JA5 fingerprints #2052.

The access log must use per-cpu, user-space-mapped ring buffers (see for example how tcp_mmap maps user-space pages). Please check the current state of the generic ring buffers and add a TODO comment, and probably an issue, to use them; also see their implementation to borrow some code. The log records and the whole mmaped buffers must be defined with a C structure, which will later be extended, e.g. with a record type for error and security events. The buffer structures must have two integer fields:

A C++ daemon must spawn N threads, where N is defined as a command line argument. During startup the daemon should define the sets of CPUs which each of the threads should process. The daemon must use the ClickHouse client library to connect to a ClickHouse instance at an address specified in another command line argument. Each thread must process the mmaped buffers of its assigned CPUs in round-robin fashion, prepare ClickHouse batches of configured size and send them to ClickHouse. A code example from ChatGPT:

#include <clickhouse/client.h>
#include <string>
using namespace clickhouse;
int main() {
    // Establish connection to ClickHouse server
    Client client(ClientOptions().SetHost("localhost"));
    // Define batch of data
    Block block;
    block.AppendColumn("column1", std::make_shared<ColumnUInt64>());
    block.AppendColumn("column2", std::make_shared<ColumnString>());
    // Append rows to the block's columns
    for (uint64_t i = 0; i < 1000; ++i) {
        block[0]->As<ColumnUInt64>()->Append(i);
        block[1]->As<ColumnString>()->Append("value" + std::to_string(i));
    }
    // The columns were filled after being appended to the block, so refresh
    // the block's row count before the insert (otherwise it reports 0 rows).
    block.RefreshRowCount();
    // Insert the batch into ClickHouse
    client.Insert("default.my_table", block);
    return 0;
}

If a thread reaches the head of all designated log buffers, it should sleep for a short period of time, e.g. 0.1s or 0.01s (futexes aren't available for us). All the log records must contain the current timestamp. The daemon should use C++23 and Boost, but not asio due to its poor performance. The daemon is supposed to be extended to write to a file, but for now let's just keep the current dmesg implementation for this case.

Testing & doc
Please update the wiki. The testing issue is #2269
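A rough sketch of the per-thread loop described above: round-robin over the assigned CPUs, batching, and sleeping when all buffers are drained. read_records() is a hypothetical placeholder for the future mmapped ring buffer reader, and the flush would build a clickhouse::Block exactly as in the snippet above:

#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

struct Record {};  // placeholder for the future binary log record

// Placeholder: drain up to `max` records from the given CPU's mmapped
// ring buffer into `out` and return how many records were read.
static size_t read_records(int cpu, std::vector<Record> &out, size_t max)
{
    (void)cpu; (void)out; (void)max;
    return 0;
}

static void worker(const std::vector<int> &cpus, size_t batch_size)
{
    std::vector<Record> batch;
    batch.reserve(batch_size);
    for (;;) {
        size_t drained = 0;
        // Round-robin over the CPUs assigned to this thread.
        for (int cpu : cpus) {
            drained += read_records(cpu, batch, batch_size - batch.size());
            if (batch.size() >= batch_size) {
                // Build a clickhouse::Block from `batch` and Insert() it,
                // as in the example above.
                batch.clear();
            }
        }
        // All designated buffers are at their head: sleep for ~0.01s
        // instead of spinning, since futexes aren't available here.
        if (drained == 0)
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main(int argc, char **argv)
{
    // N worker threads, taken from the command line as required above;
    // the CPU-to-thread assignment here is a trivial stand-in.
    size_t n = argc > 1 ? std::stoul(argv[1]) : 1;
    std::vector<std::thread> threads;
    for (size_t i = 0; i < n; ++i)
        threads.emplace_back(worker, std::vector<int>{static_cast<int>(i)}, 1000);
    for (auto &t : threads)
        t.join();
    return 0;
}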
Also, for detection of slow DDoS attacks, let's add the time spent sending the response to the access log.
In #537 we need a way to deliver log data to userspace. Introduce a set of per-cpu ring buffers mapped to userspace. Signed-off-by: Alexander Ivanov <[email protected]>
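As a very rough illustration of the per-cpu, user-space-mapped ring buffers these patches introduce: the issue text only says that the buffer structure carries two integer fields, so the head/tail offsets below, and everything else, are assumptions rather than the actual layout:

#include <cstdint>

/*
 * Hypothetical header of one per-CPU ring buffer shared between the
 * kernel producer and the user-space daemon via mmap. Field names are
 * guesses; the real structure is what this issue asks to define.
 */
struct LogRingBufferHdr {
    uint64_t head;  /* write offset, advanced by the kernel */
    uint64_t tail;  /* read offset, advanced by the user-space daemon */
    /* log records follow this header in the mmaped area */
};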
For access log table creation, connect with the clickhouse client and execute:
@ai-tmpst this table creation should be either in our Wiki installation guide, or in the client handling, or in the Tempesta scripts.
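The actual CREATE TABLE statement isn't shown in this thread; purely as a hypothetical illustration (column names and engine are invented), the DDL could also be issued from the C++ side through the same clickhouse-cpp client:

#include <clickhouse/client.h>

using namespace clickhouse;

int main() {
    Client client(ClientOptions().SetHost("localhost"));
    // Hypothetical schema: the real access log columns should come from
    // the wiki installation guide or the Tempesta scripts, not from here.
    client.Execute(
        "CREATE TABLE IF NOT EXISTS default.access_log ("
        "  timestamp DateTime64(9),"
        "  vhost     String,"
        "  uri       String,"
        "  status    UInt16,"
        "  ja5       String"
        ") ENGINE = MergeTree() ORDER BY timestamp");
    return 0;
}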
Now all information, from config file parsing errors to client blocking, is written to dmesg. Instead, the following logs on top of TempestaDB must be introduced:

TfwHttpReq must be passed as an argument to the tfw_log_access() function. net_warn_ratelimited() is used, so we're good with the system ratelimiter. We should either leave the security log in the kernel log or implement our own rate limiter. TfwHttpReq context; the 2 modes of logging must be implemented:

The logs should be configured by independent configuration options: variables is the list of variables to log, with srvhdr_ or clntcdr_ prefixes and - changed to _, e.g. srvhdr_set_cookie or clnt_user_agent.
In general, Tempesta DB should provide a streaming data processing foundation (#515 and #516) for the logging application. Probably we need to keep all request and response headers with the metadata (e.g. timings, chunk information, TCP/IP and TLS layers, etc.) for a relatively short sliding window. Such data is extremely useful to query when debugging immediate performance issues and DDoS attacks.
TBD: a possible application is to batch events in front of a time-series DB, e.g. ClickHouse, InfluxDB or TimescaleDB.
We need to support per-vhost logging, i.e. thousands of vhosts having several logs each. This could be done either with a secondary index (#733), or we have to be able to scale to thousands of TDB tables.
For better performance the logs must use a simple sequential ring-buffer TDB table format w/o any indexes (#516). Log records must be stored in structured TDB records. Probably we don't even need TDB for this and can just mmap the ring buffer into user space.
The binary log format could be a record where event_type defines the event type and its format (the number of variables and their types, e.g. client address, URI, HTTP Host header, etc.). In this release the format must be simple and hardcoded.

A simple retrieval user-space tool, like varnishlog, must be developed on top of tdbq to print the logs and/or write them in human-readable or JSON formats to files. The tool must also be able to run in daemon mode, read the TDB tables and flush the log records to files or syslogd. The human-readable text format should be compatible with the W3C draft, but should also provide more information.
Also reference TUX and HAProxy, which also use(d) binary logging.
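To illustrate the event_type idea (a tag that selects the record layout, so a varnishlog-like reader on top of tdbq can decode records without text parsing), here is a hypothetical sketch; the tags, fields and formats are invented, since the actual hardcoded format isn't given in this thread:

#include <cstdint>
#include <cstdio>

/* Hypothetical event tags; the real set would be hardcoded in this release. */
enum EventType : uint16_t { EV_ACCESS = 1, EV_ERROR = 2, EV_SECURITY = 3 };

/* Common header: event_type selects which payload follows and how long it is. */
struct RecordHeader {
    uint16_t event_type;
    uint16_t len;          /* total record length in bytes */
    uint64_t timestamp_ns;
};

/* A reader tool dispatches on event_type to decode the variables. */
static void print_record(const RecordHeader *h)
{
    switch (h->event_type) {
    case EV_ACCESS:
        /* decode client address, URI, HTTP Host header, etc. */
        std::printf("access event, %u bytes\n", static_cast<unsigned>(h->len));
        break;
    case EV_ERROR:
    case EV_SECURITY:
        std::printf("error/security event, %u bytes\n", static_cast<unsigned>(h->len));
        break;
    default:
        std::printf("unknown event type %u\n", static_cast<unsigned>(h->event_type));
    }
}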