Kernel-User Space Transport #77

Open
krizhanovsky opened this issue Mar 12, 2015 · 1 comment


krizhanovsky commented Mar 12, 2015

Motivation and architecture

We need to export some logic to user space and/or third-party servers. User-space tasks must run asynchronously with respect to softirq processing, just as NF_QUEUE does for netfilter. Examples are:

FastCGI, uwsgi and ICAP implement their own protocols, different from HTTP. None of the logic above should be considered core or mission-critical.

Thus, we should be able to pass some HTTP requests to user space for complex processing and get appropriate responses back. A configuration option (similar to the HTTP scheduler using http_match):

    match user_space_offload uri prefix "/rest/";

should be used to pass a client request to a user-space processing daemon.
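As a rough illustration of what the `uri prefix` part of such a rule has to decide, the following sketch checks a URI against a configured prefix. The function name `uri_has_prefix` is hypothetical and is not part of Tempesta FW; the URI is passed with an explicit length because a zero-copy parser would not NUL-terminate it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Tempesta FW code): returns true when the URI
 * starts with the configured prefix. The URI is described by pointer and
 * length, since it may point straight into received packet data.
 */
static bool uri_has_prefix(const char *uri, size_t uri_len, const char *prefix)
{
    size_t plen = strlen(prefix);

    return uri_len >= plen && memcmp(uri, prefix, plen) == 0;
}
```

With the rule above, a request for `/rest/users` would be offloaded, while `/static/a.css` would stay on the regular path.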

Zero-copy transport of HTTP messages between kernel and user space must be used. It should be based on an mmap() interface for parsed HTTP messages. The proposed scenario for processing an ingress HTTP request and sending a generated HTTP response is illustrated by the figure below.

[Figure: user_kernel_comm]

  1. The softirq handler receives packets that hold an HTTP request. The Linux TCP/IP stack is patched so that the packet’s payload is always placed in memory pages, which can be mmap()’ed.
  2. The request is parsed and all required data, including the parsing meta-information and the packet’s data, are placed in several memory pages. HTTP messages are processed in a zero-copy fashion, i.e. HTTP fields are not copied. Instead, the parsing meta-information stores pointers into the received packet data, such as the start of an HTTP header field name and value.
  3. Once the memory pages of the HTTP request are mapped into the user-space process’s address space, the softirq handler wakes up the process.
  4. Now the process can run heavy logic on the mmap()’ed request. An example of heavy logic could be data compression.
  5. The advanced classification process can generate a response to the request (e.g. with an HTTP error code). The same memory-mapped region is used to pass the HTTP response to the kernel.
  6. Finally, the softirq handler can send the response to the client.
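The parsing meta-information from step 2 could be sketched as follows. This is only an assumption about a possible layout: the struct and function names are hypothetical, and offsets relative to the start of the mapped region are used instead of raw pointers, because kernel addresses would not be valid in the user-space mapping.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>

/* Hypothetical layout: every field is an (offset, length) pair that
 * refers into the mapped region holding the raw packet data. */
struct hdr_ref {
    uint32_t name_off, name_len;
    uint32_t val_off, val_len;
};

struct req_meta {
    uint32_t uri_off, uri_len;
    uint32_t n_hdrs;
    struct hdr_ref hdrs[8];
};

/*
 * Look up a header by name (case-insensitively). Returns a pointer into
 * the mapped region, so the value is never copied, and stores its length
 * in *len. Returns NULL when the header is absent.
 */
static const char *req_find_hdr(const char *region, const struct req_meta *m,
                                const char *name, uint32_t *len)
{
    size_t nlen = strlen(name);

    for (uint32_t i = 0; i < m->n_hdrs; i++) {
        const struct hdr_ref *h = &m->hdrs[i];

        if (h->name_len == nlen &&
            strncasecmp(region + h->name_off, name, nlen) == 0) {
            *len = h->val_len;
            return region + h->val_off;
        }
    }
    return NULL;
}
```

A user-space process woken up in step 3 would only need the base address of its mapping and a `struct req_meta` to reach any header without parsing the request again.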

GFSM should be used to redirect HTTP messages that satisfy the user_space_offload rule to user space and wait for responses (e.g. modified HTTP requests for further redirection in the ICAP case, or just a response in the RESTful API case).

User-space logic may produce a larger HTTP message than the original, e.g. by adding an HTTP header. We can support this by allocating a page fragment (also in user space) and passing it to the kernel together with the fragment offset, so that the kernel can properly arrange the skb fragments.
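One way to picture the fragment arrangement is a message described as an ordered list of fragments, where inserting new data at a byte offset splits an existing fragment instead of moving payload. The sketch below is a user-space model only: `struct frag`, `struct msg` and `msg_insert_frag()` are hypothetical names and do not mirror the kernel skb API.

```c
#include <stdint.h>
#include <string.h>

#define MAX_FRAGS 16u

/* Hypothetical fragment descriptor: page id, offset in the page, length. */
struct frag { uint32_t page, off, len; };
struct msg { uint32_t n; struct frag f[MAX_FRAGS]; };

/*
 * Insert fragment nf at byte position pos of the message. The fragment
 * containing pos is split in two; no payload bytes are copied.
 * Returns 0 on success, -1 if pos is past the end or the list is full.
 */
static int msg_insert_frag(struct msg *m, uint32_t pos, struct frag nf)
{
    uint32_t i = 0, acc = 0;

    /* Skip fragments that end at or before pos. */
    while (i < m->n && acc + m->f[i].len <= pos) {
        acc += m->f[i].len;
        i++;
    }
    if (i == m->n && acc != pos)
        return -1;

    uint32_t in = pos - acc;        /* offset inside fragment i */
    uint32_t extra = in ? 2 : 1;    /* a split adds one more entry */

    if (m->n + extra > MAX_FRAGS)
        return -1;

    /* Shift the tail of the list; f[i] itself is left in place. */
    memmove(&m->f[i + extra], &m->f[i], (m->n - i) * sizeof(m->f[0]));
    if (in) {
        m->f[i].len = in;           /* head part of the split fragment */
        m->f[i + 2].off += in;      /* tail part of the split fragment */
        m->f[i + 2].len -= in;
    }
    m->f[i + extra - 1] = nf;
    m->n += extra;
    return 0;
}
```

For example, adding a 20-byte header at offset 40 of a 100-byte message yields three fragments of 40, 20 and 60 bytes.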

Since a user-space application may run in a virtual container, the mapping transport must be container-aware and provide configuration for which HTTP messages map to which containers, probably on a per-vhost and per-location basis.

API

A C API must be provided to bind with various programming languages like C, C++, Rust or Python.

io_uring should probably be used for the API; see also the generic ring buffer API proposal for the Linux kernel.
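Independently of whether io_uring or another ring buffer ends up being used, the core idea is a single-producer, single-consumer ring living in shared memory. The following sketch uses C11 atomics; the names `struct ring`, `ring_push()` and `ring_pop()` are hypothetical, and the slots are assumed to carry offsets of message descriptors inside the mapped region.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* must be a power of two */

_Static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0, "power of two required");

/* Hypothetical ring: head is written only by the producer (e.g. the
 * kernel side), tail only by the consumer (e.g. the user-space side). */
struct ring {
    _Atomic uint32_t head;
    _Atomic uint32_t tail;
    uint64_t slot[RING_SLOTS];
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct ring *r, uint64_t v)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return false;
    r->slot[head & (RING_SLOTS - 1)] = v;
    /* Publish the slot only after its content is written. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct ring *r, uint64_t *v)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *v = r->slot[tail & (RING_SLOTS - 1)];
    /* Release the slot back to the producer. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

The free-running 32-bit counters make the full and empty checks unambiguous without sacrificing a slot; a real transport would additionally need a wake-up mechanism for the consumer.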

Asynchronous processing

With event-driven software such as Nginx, a modern HTTP server can process thousands of requests concurrently on modern multi-core machines. However, there are still heavy computational tasks that lead to high response times at high percentiles, e.g. data compression or some security checks, such as parsing and analyzing a DOM tree for a large HTTP response. These tasks are performed on the CPU and cannot be offloaded to a co-processor, which would leave the CPU free to process other HTTP messages. While some tasks, e.g. TLS handshakes, can be offloaded to a GPU, other tasks work with large memory volumes in stream mode, e.g. HTTP POST processing, so it doesn't make sense to offload them to a GPU. Thus, if a server has N CPU cores and gets N HTTP requests with expensive CPU computations, it cannot process other lightweight requests.

This task, offloading some HTTP processing to user space, solves the problem with synchronous processing: now we can offload expensive CPU computations to user space, where they'll be processed with lower scheduler priority while softirq continues to work on other HTTP requests. GFSM is useful here to store an HTTP message processing context for user-space processing.

Synchronous processing

Some logic (e.g. security applications) needs to make a decision (pass/block) or mangle traffic synchronously, so that malicious traffic is not passed to a protected backend server. This type of processing can be done in the same user-space process as the asynchronous one, probably using GFSM or some synchronization mechanism in shared memory.
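As one possible shape of a "synchronization mechanism in shared memory", a per-message verdict slot could be used: the message is held until the user-space program publishes a pass or block decision exactly once. The names below (`enum verdict`, `struct verdict_slot`, `verdict_set()`, `verdict_allows_forward()`) are hypothetical and only sketch the idea.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum verdict { V_PENDING = 0, V_PASS, V_BLOCK };

/* Hypothetical per-message slot placed in the shared region. */
struct verdict_slot {
    _Atomic int state;
};

/* User-space side: publish a verdict. Only the first call succeeds,
 * so a decision cannot be changed after it has been made. */
static bool verdict_set(struct verdict_slot *s, enum verdict v)
{
    int expected = V_PENDING;

    return atomic_compare_exchange_strong_explicit(&s->state, &expected, v,
                                                   memory_order_release,
                                                   memory_order_relaxed);
}

/* Stand-in for the kernel side: a message may be forwarded to the
 * backend only after an explicit V_PASS; pending means "keep holding". */
static bool verdict_allows_forward(struct verdict_slot *s)
{
    return atomic_load_explicit(&s->state, memory_order_acquire) == V_PASS;
}
```

Treating the pending state as "do not forward" keeps the failure mode safe: if the user-space program never answers, the traffic never reaches the protected backend.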

Dynamic programs

The API must allow registering (attaching) new synchronous and asynchronous user-space programs at run time, without restarting Tempesta FW (just like BPF scripts).

Serverless

If we map all the pages with HTTP messages as read-only for user space and use a separate memory area for writing, then this can be an alternative to the modern serverless architecture: an unprivileged user may read their traffic and run some logic in a separate address space.

Failover

A user-space HTTP message handling program can run as a Linux process or as a Docker or LXC container. If the program crashes in a container, then the container infrastructure is responsible for restarting the process. However, in the case of a plain Linux process, Tempesta FW must take care of restarting the process.

This behaviour is inspired by Erlang OTP and will make C/C++ web applications more reliable: in the worst case a user will have a CGI-like application which spawns a new process for each request, but in the normal case we'll have a true application server with neither the risk of a whole-server crash nor the extra cost of FastCGI.
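The restart behaviour for a plain Linux process can be sketched as a small supervisor loop built on fork() and waitpid(). This is a user-space illustration, not how Tempesta FW would do it from the kernel; `supervise()`, `worker_fn` and `flaky_worker()` are hypothetical names, and the demo worker deliberately crashes on its first two attempts.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*worker_fn)(int attempt);

/*
 * Run the worker in a child process and restart it whenever it crashes
 * or exits with a non-zero status. Returns the number of restarts that
 * were needed, or -1 if the restart limit is exhausted or a system call
 * fails.
 */
static int supervise(worker_fn worker, int max_restarts)
{
    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker(attempt));     /* child: run the handler */

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;             /* clean exit, no more restarts */
        /* Killed by a signal or failed: fall through and restart. */
    }
    return -1;
}

/* Demo worker: simulates a crash on the first two attempts. */
static int flaky_worker(int attempt)
{
    if (attempt < 2)
        abort();
    return 0;
}
```

A crash stays confined to the child, which is the property the paragraph above relies on: the supervisor, and with it the rest of the server, keeps running.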

References

An example of a similar solution for Linux zero-copy read via io_uring is in the "Fast ZC Rx Data Plane using io_uring" talk.

@krizhanovsky krizhanovsky added this to the 0.5.0 SSL, Stable milestone Mar 12, 2015
@krizhanovsky krizhanovsky assigned krizhanovsky and unassigned vdmit11 May 3, 2015
@krizhanovsky krizhanovsky modified the milestones: TBD, 0.5.0 SSL & TDB Jun 19, 2015
@krizhanovsky krizhanovsky changed the title User-space/third-party TCP communication interface Kernel-User Space Transport Feb 11, 2017
@krizhanovsky krizhanovsky modified the milestones: backlog, 0.10 Kernel-User Space Transport Jan 14, 2018
krizhanovsky added a commit that referenced this issue Nov 14, 2021
low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (#77)
and Tempesta Language (#102).
@krizhanovsky krizhanovsky modified the milestones: 1.4 TBD (Kernel-User Space Transport), 1.2 TBD Jan 3, 2022
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 21, 2022
Almost literally follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (tempesta-tech#77)
and Tempesta Language (tempesta-tech#102).

Contributes to tempesta-tech#755

Based-on-patch-by: Alexander K <[email protected]>
Signed-off-by: Aleksey Mikhaylov <[email protected]>
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
ttaym added a commit that referenced this issue Feb 24, 2022
@krizhanovsky krizhanovsky removed this from the 1.xx TBD milestone Mar 27, 2023
@krizhanovsky krizhanovsky added this to the 1.0 - GA milestone Mar 27, 2023
@krizhanovsky krizhanovsky removed their assignment Mar 27, 2023
@krizhanovsky krizhanovsky modified the milestones: 1.0 - GA, 1.2 - TBD Nov 12, 2023
@krizhanovsky krizhanovsky self-assigned this Nov 13, 2023
@krizhanovsky krizhanovsky modified the milestones: 1.2 - TBD, 1.0 - GA Sep 27, 2024
@krizhanovsky krizhanovsky removed their assignment Sep 27, 2024

ai-tmpst commented Oct 4, 2024

A ring buffer mapped to user space is being developed in the scope of #537.
It could be useful for this task; see fw/ringbuffer.*.
