
Software architecture overview


Introduction

This page provides an overview of the software architecture and components of IRATI. For a more detailed explanation we direct the reader to the IRATI implementation pages on the Pouzin Society website. The software architecture of IRATI is shown in Figure 1.

Figure 1. Main software components of the RINA implementation by the FP7-IRATI project. Source: S. Vrijders et al., "Prototyping the recursive internet architecture: the IRATI project approach", IEEE Network, Vol. 28 (2), pp. 20-25, March 2014.

The main components of IRATI have been divided into four packages:

  1. Daemons (rinad). This package contains two types of daemons (OS Processes that run in the background), implemented in C++.

    • IPC Manager Daemon (rinad/src/ipcm). The IPC Manager Daemon is the core of IPC Management in the system, acting both as the manager of IPC Processes and as a broker between applications and IPC Processes (enforcing access rights, mapping flow allocation or application registration requests to the right IPC Processes, etc.).
    • IPC Process Daemon (rinad/src/ipcp). The IPC Process Daemons (one per running IPC Process in the system) implement the layer management components of an IPC Process (enrollment, flow allocation, PDU Forwarding table generation or distributed resource allocation functions).
  2. Librina (librina). The librina package contains all IRATI libraries, which abstract the kernel interactions (such as syscalls and Netlink details) away from the user. Librina provides its functionalities to user-space RINA programs via scripting-language extensions or statically/dynamically linkable libraries (i.e. for C/C++ programs). Librina is more a framework/middleware than a library: it has its own memory model (explicit, no garbage collection), its execution model is event-driven, and it uses concurrency mechanisms (its own threads) to do part of its work.

  3. Kernel components (linux/net/rina). The kernel contains the implementation of the data transfer / data transfer control components of normal IPC Processes as well as the implementation of shim DIFs - which usually need to access functionality only available at the kernel. The Kernel IPC Manager (KIPCM) manages the lifetime (creation, destruction, monitoring) of the other component instances in the kernel, as well as their configuration. It also provides coordination at the boundary between the different IPC processes.

  4. Test applications and tools (rina-tools). This package contains test applications and tools to test and debug the RINA prototype. Right now the rina-tools package contains the rina-echo-time application, which can work in “echo” mode (ping-like behaviour between two application instances) or in “performance” mode (iperf-like behaviour).

Common functionalities

The functionalities described below are used by both the IPC Process Daemon and the IPC Manager Daemon.

Event Loop

Since both daemons are event-based distributed applications, a common event-loop class has been developed (rinad::EventLoop). The event-loop class currently intercepts only librina events - e.g. events that inform the daemons about flow allocation/deallocation, IPC process enrollment, application registration/deregistration.

The user of the event-loop class can associate a different handler for each possible librina event type, using the EventLoop::register_event() method. When the event-loop intercepts a librina event, it checks if the user has registered a handler for the type of the received event. If a handler is present, it is invoked, passing as arguments the event received and a user-provided pointer to the data model instance of the specific event consumer - e.g. the IPC Manager daemon or the IPC Process daemon.
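As a rough illustration of this pattern, the following minimal C++ sketch shows how handlers keyed by event type can be registered and dispatched. The class, method and event names are simplified stand-ins for the real rinad/librina types, not the actual API.

```cpp
#include <map>

// Simplified stand-ins for the real librina event types.
enum EventType { FLOW_ALLOCATED, FLOW_DEALLOCATED, APP_REGISTERED };

struct Event {
    EventType type;
    // ... event-specific payload ...
};

// Handler signature: the intercepted event plus the user-provided pointer
// to the data model instance of the event consumer (e.g. the IPCM daemon).
typedef void (*EventHandler)(Event* event, void* data_model);

class EventLoop {
    std::map<EventType, EventHandler> handlers;
    void* data_model;

public:
    EventLoop(void* dm) : data_model(dm) {}

    void register_event(EventType type, EventHandler handler) {
        handlers[type] = handler;
    }

    // Called for every event intercepted from librina.
    void dispatch(Event* event) {
        std::map<EventType, EventHandler>::iterator it =
            handlers.find(event->type);
        if (it != handlers.end())
            it->second(event, data_model);  // invoke the user's handler
        // Events without a registered handler are ignored.
    }
};
```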

Configuration infrastructure

The global configuration for IRATI has a non-trivial nested structure that is reflected in the rinad::RinaConfiguration class. This class has been ported from a Java class (having the same name) defined in the Java IPC Manager code. Since this class contains plenty of information about the stack configuration in general - e.g. information about the IPC processes to be created, configuration of the DIFs - the rinad::RinaConfiguration class has been moved from the IPC Manager to the rinad common functionalities.
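For illustration only, the nested shape of such a configuration might look like the sketch below; all field and class names here are hypothetical simplifications, not the actual rinad::RinaConfiguration definition.

```cpp
#include <list>
#include <string>

// Hypothetical simplifications of the nested configuration structure.
struct DIFConfiguration {
    std::string dif_name;
    std::string dif_type;             // e.g. "normal-ipc" or a shim DIF type
    std::list<std::string> policies;  // per-DIF policy configuration
};

struct IPCProcessToCreate {
    std::string process_name;
    std::string dif_to_assign;                // DIF this IPCP will belong to
    std::list<std::string> difs_to_register;  // N-1 DIFs to register with
};

struct RinaConfiguration {
    std::list<IPCProcessToCreate> ipc_processes_to_create;
    std::list<DIFConfiguration> dif_configurations;
};
```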

IPC Manager Daemon

The rinad::IPCManager class is the main class in the program, which represents the IPC Manager data model and contains all the data and functionalities used by the IPC Manager threads. Since there is only one IPC Manager per processing system, only one rinad::IPCManager object is instantiated. The rinad::IPCManager class exposes a set of methods that can be used to perform configuration operations on IRATI. The most important methods are used for: creation of IPC Processes, destruction of IPC Processes, assigning IPC Processes to DIFs, registering IPC Processes to DIFs and triggering the enrollment of an IPC Process to a DIF.

The first step the IPC Manager has to perform is the parsing of the JSON file that contains the configuration of the RINA prototype for that system. This is done at the very beginning of the program execution, before the script thread, the console thread and the event-loop are started. The purpose of the configuration parsing module is to fill the rinad::RinaConfiguration instance contained in the rinad::IPCManager instance, using the information contained in the configuration file.
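A hedged sketch of that parsing step, building on the hypothetical RinaConfiguration types sketched above and assuming a JSON library such as jsoncpp; the JSON key names are illustrative, not the actual configuration schema.

```cpp
#include <fstream>
#include <string>
#include <json/json.h>  // assumption: a JSON library such as jsoncpp

// Parse the configuration file and fill a RinaConfiguration instance
// (types from the sketch above). The JSON keys are illustrative only.
bool parse_configuration(const std::string& path, RinaConfiguration& config)
{
    std::ifstream file(path.c_str());
    if (!file)
        return false;

    Json::Value root;
    Json::Reader reader;
    if (!reader.parse(file, root))
        return false;

    const Json::Value difs = root["difConfigurations"];
    for (Json::ArrayIndex i = 0; i < difs.size(); i++) {
        DIFConfiguration dif;
        dif.dif_name = difs[i]["difName"].asString();
        dif.dif_type = difs[i]["difType"].asString();
        config.dif_configurations.push_back(dif);
    }
    // ... same pattern for the array of IPC processes to create ...
    return true;
}
```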

The IPC Manager implements a CLI console via a dedicated thread. The currently available commands - which roughly resemble the rinad::IPCManager methods - are the following (a sketch of the command dispatch pattern follows the list):

  • help: Show the list of available commands or the usage of a specific command.
  • exit or quit: Exit the console.
  • create-ipcp: Create a new IPC process.
  • destroy-ipcp: Destroy an existing IPC process.
  • list-ipcps: List the existing IPC processes with associated information.
  • list-ipcp-types: List the IPC process types currently available in the system.
  • assign-to-dif: Assign an IPC process to a DIF.
  • register-at-dif: Register an IPC process within a DIF.
  • unregister-from-dif: Unregister an IPC process from a DIF.
  • enroll-to-dif: Enroll an IPC process to a DIF.
  • query-rib: Display the objects present in the RIB of an IPC Process.
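The dispatch pattern behind such a console can be sketched as a map from command strings to handler methods that call into the rinad::IPCManager. This is a simplified illustration, not the actual console implementation.

```cpp
#include <iostream>
#include <map>
#include <string>

// Simplified stand-in for the real console: command strings are mapped to
// handler methods that would call into the rinad::IPCManager.
class IPCManagerConsole {
    typedef int (IPCManagerConsole::*CommandHandler)(const std::string& args);
    std::map<std::string, CommandHandler> commands;

    int list_ipcps(const std::string&) {
        // would query the IPCManager and print the IPC process list
        return 0;
    }
    int create_ipcp(const std::string& args) {
        // would parse name/type from 'args' and ask the IPCManager to
        // create a new IPC process
        return 0;
    }

public:
    IPCManagerConsole() {
        commands["list-ipcps"]  = &IPCManagerConsole::list_ipcps;
        commands["create-ipcp"] = &IPCManagerConsole::create_ipcp;
        // ... one entry per console command ...
    }

    int execute(const std::string& cmd, const std::string& args) {
        std::map<std::string, CommandHandler>::iterator it = commands.find(cmd);
        if (it == commands.end()) {
            std::cout << "unknown command: " << cmd << std::endl;
            return -1;
        }
        return (this->*(it->second))(args);
    }
};
```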

IPC Process Daemon

The IPCP Daemon is a container of different components that either provide general support functions (e.g. encoding/decoding objects, generating/parsing CDAP messages or the RIB Daemon) or implement layer management functionalities (e.g. enrollment, flow allocation, namespace management, resource allocation or security management). The IPCP can interact with the IPC Manager Daemon (IPCM) via librina’s ExtendedIPCManager proxy class, which translates function invocations to Netlink (NL) messages towards the IPCM OS process. In a similar way the IPCP Daemon interacts with the data transfer components of the IPC Process via librina’s KernelIPCProcess proxy class, which translates function invocations to NL messages or system calls addressed to the kernel.

The IPCM creates the IPCP OS process by forking and executing the IPCP executable (i.e. ipcp). Upon creation, the IPCP Daemon checks if it has enough information to start, initializes librina and creates an instance of each component: both the utility components (Encoder, CDAP Session Manager, RIB Daemon) and the ones implementing layer management functions (Enrollment, Resource Allocator, Namespace Manager, Flow Allocator and Security Manager). After they are successfully initialized, the IPC Process main class invokes the set_ipc_process method on each of them (a minimal sketch of this sequence follows). The internal components of the IPCP Daemon are described below.
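A minimal sketch of that initialization sequence, assuming a simplified common component interface (the real classes differ):

```cpp
#include <list>

class IPCProcess;  // forward declaration

// Assumed common interface of the IPCP components named above.
class IPCProcessComponent {
public:
    virtual ~IPCProcessComponent() {}
    virtual void set_ipc_process(IPCProcess* ipcp) = 0;
};

class IPCProcess {
    std::list<IPCProcessComponent*> components;

public:
    void add_component(IPCProcessComponent* c) { components.push_back(c); }

    // Invoked once all components have been successfully created: hand each
    // component a back pointer to the IPC Process main class.
    void initialize_components() {
        for (std::list<IPCProcessComponent*>::iterator it = components.begin();
             it != components.end(); ++it)
            (*it)->set_ipc_process(this);
    }
};
```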

RIB Daemon. The IPCP mostly leverages librina-rib for implementing its own RIB Daemon.

Enrollment Task. The rinad::EnrollmentTask class is the entry point to the functions related to IPC Process neighbour discovery and management. The logic of the enrollment sequence is implemented in two separate state machines by the rinad::EnrollerStateMachine and the rinad::EnrolleeStateMachine classes (used by the member and joining IPC Processes, respectively).
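The split between the two state machines can be pictured roughly as below; the state names are hypothetical placeholders, not the actual states of rinad::EnrolleeStateMachine.

```cpp
// Hypothetical placeholder states for the joining (enrollee) side; the
// real rinad::EnrolleeStateMachine defines its own states and transitions.
enum EnrolleeState {
    DISCONNECTED,           // no CDAP connection to the member yet
    WAIT_CONNECT_RESPONSE,  // sent CDAP M_CONNECT, waiting for the member
    ENROLLING,              // exchanging DIF information with the member
    ENROLLED                // the IPC Process is now a member of the DIF
};

class EnrolleeStateMachine {
    EnrolleeState state;

public:
    EnrolleeStateMachine() : state(DISCONNECTED) {}

    void start_enrollment() {
        if (state == DISCONNECTED)
            state = WAIT_CONNECT_RESPONSE;  // after sending CDAP M_CONNECT
    }
    void connect_response_received(bool accepted) {
        if (state == WAIT_CONNECT_RESPONSE)
            state = accepted ? ENROLLING : DISCONNECTED;
    }
    // ... one handler per relevant CDAP message, each checking the current
    // state before performing a transition ...
};
```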

Namespace Manager. The rinad::NamespaceManager class is the IPC Process component in charge of managing the DIF’s address namespace and takes part in maintaining the Directory Forwarding Table (DFT). The DFT is a distributed map matching application names (registered with the DIF) to the address of the IPC process they are registered with. The current implementation of the DFT is a fully replicated map whose updates are disseminated to all members of the DIF via a controlled flooding approach.
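Conceptually, the DFT can be pictured as the minimal map below; the flooding of updates is only hinted at in a comment, and the types are simplified assumptions.

```cpp
#include <map>
#include <string>

// The DFT as a fully replicated map: application name -> address of the
// IPC Process the application is registered with. Types are simplified.
class DirectoryForwardingTable {
    std::map<std::string, unsigned int> entries;

public:
    void add_entry(const std::string& app_name, unsigned int ipcp_address) {
        entries[app_name] = ipcp_address;
        // In the real implementation the update would also be disseminated
        // to all other members of the DIF via controlled flooding.
    }

    // Returns the address of the IPCP the application is registered with,
    // or 0 (taken here to mean "unknown") if there is no entry.
    unsigned int lookup(const std::string& app_name) const {
        std::map<std::string, unsigned int>::const_iterator it =
            entries.find(app_name);
        return it == entries.end() ? 0 : it->second;
    }
};
```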

Flow Allocator. The rinad::FlowAllocator class is the IPC Process component responsible for managing the lifecycle of the flows provided by the DIF. Lifecycle operations include creation, destruction and monitoring of key flow parameters to ensure the flow provides an acceptable level of service (the current prototype implements only the creation and destruction operations). The rinad::FlowAllocator class defers the work of managing individual flows to instances of the rinad::FlowAllocatorInstance class.

Resource Allocator. The rinad::ResourceAllocator class provides the entry point to the functions performed by the Resource Allocator. A wide range of functions can be performed by this component, but currently IRATI provides the implementation for just two of them: management of N-1 flows and generation of the PDU Forwarding Table using routing.

The rinad::NMinusOneFlowManager class manages the allocation and deallocation of the N-1 flows used by the IPC Process as well as the registration of the IPC Process to one or more N-1 DIFs.

The PDU Forwarding Table Generator is responsible for the generation of the PDU Forwarding Table (PDUFT), obtained through the use of a specific (routing) policy. The PDUFT maps destination addresses and QoS identifiers to the port identifiers of N-1 flows. These flows connect the IPC Process to its neighbours in the DIF. The PDUFT is used by the RMT – a software component residing in kernel-space – in order to multiplex/demultiplex the PDUs. IRATI implements a link-state based routing approach. It maintains a graph representing the current knowledge of the connectivity of the DIF. Each vertex of the graph represents an IPC Process, while each edge represents an N-1 flow interconnecting the corresponding IPC Processes. An algorithm (e.g. Dijkstra’s shortest path) is applied to the graph in order to compute the routes from a source IPC Process to every other IPC Process in the DIF. These routes are used to fill the PDUFT with entries mapping an {address, QoS} pair to the list of N-1 ports that have to be used to reach the next hop in the path towards the destination.
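The following sketch illustrates this route computation under simplifying assumptions (unit link costs, a single N-1 port per adjacency, one QoS class); the real policy-based implementation is more general.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <utility>
#include <vector>

typedef unsigned int Address;
typedef int PortId;

struct PDUFTEntry {
    Address destination;
    unsigned int qos_id;
    std::vector<PortId> ports;  // N-1 port(s) towards the next hop
};

// graph: adjacency map, node -> (neighbour -> N-1 port towards it).
std::vector<PDUFTEntry> generate_pduft(
        Address source,
        const std::map<Address, std::map<Address, PortId> >& graph,
        unsigned int qos_id)
{
    typedef std::pair<unsigned int, Address> QItem;  // (distance, node)
    std::map<Address, unsigned int> dist;
    std::map<Address, Address> first_hop;  // destination -> neighbour of source
    std::priority_queue<QItem, std::vector<QItem>, std::greater<QItem> > pq;

    dist[source] = 0;
    pq.push(QItem(0, source));
    while (!pq.empty()) {
        QItem top = pq.top(); pq.pop();
        Address u = top.second;
        if (top.first > dist[u])
            continue;  // stale queue entry
        std::map<Address, std::map<Address, PortId> >::const_iterator adj =
            graph.find(u);
        if (adj == graph.end())
            continue;
        for (std::map<Address, PortId>::const_iterator e = adj->second.begin();
             e != adj->second.end(); ++e) {
            Address v = e->first;
            unsigned int nd = dist[u] + 1;  // unit link cost assumed
            if (dist.find(v) == dist.end() || nd < dist[v]) {
                dist[v] = nd;
                // remember which neighbour of 'source' this route leaves by
                first_hop[v] = (u == source) ? v : first_hop[u];
                pq.push(QItem(nd, v));
            }
        }
    }

    // One PDUFT entry per reachable destination: {address, QoS} -> N-1 port.
    std::vector<PDUFTEntry> pduft;
    for (std::map<Address, Address>::iterator it = first_hop.begin();
         it != first_hop.end(); ++it) {
        PDUFTEntry entry;
        entry.destination = it->first;
        entry.qos_id = qos_id;
        entry.ports.push_back(
            graph.find(source)->second.find(it->second)->second);
        pduft.push_back(entry);
    }
    return pduft;
}
```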

Security Manager. The security manager is the IPC Process component responsible for managing all the security-related behaviours (i.e. authentication, access control, SDU Protection, credential management, auditing). Within the scope of IRATI, only a trivial implementation of two access-control related functions is provided.

Librina libraries

IRATI provides the following libraries to support the operation of user-space daemons and applications. All libraries already have an initial, complete implementation unless otherwise stated.

  • librina-application: Provides the APIs that allow an application to use RINA natively, enabling it to allocate and deallocate flows, read and write SDUs on those flows, and register/unregister to one or more DIFs (see the usage sketch after this list).

  • librina-ipc-manager: Provides the APIs that allow the IPC Manager to perform the tasks related to IPC Process creation, deletion and configuration.

  • librina-ipc-process: APIs exposed by this library allow an IPC Process to configure the PDU forwarding table (through Netlink sockets), to create and delete EFCP instances (through Netlink sockets also), to request the allocation of kernel resources to support a flow (through system calls) and so on.

  • librina-faux-sockets: Allows adapting a non-native RINA application (a traditional UNIX-socket-based application) to run over the RINA stack. A proof-of-concept implementation of this library is under development and expected to be ready by the end of the FP7-IRATI project.

  • librina-cdap: Implementation of the CDAP protocol.

  • librina-sdu-protection: APIs and implementation to use the SDU-protection module in user space to protect and unprotect SDUs (add CRCs, encryption, etc).

  • librina-common: Common interfaces and data structures.

  • librina-ipc-daemons: Interfaces and data structures common to the IPC Process and the IPC Manager Daemons.

  • librina-configuration: Classes to model the configuration of the components of an IPC Process and their policies (EFCP, RMT, Enrollment, Resource Allocation, Flow Allocation, etc.).

  • librina-rib: Classes that provide an almost complete implementation of a RIB Daemon, a RIB and the “façade objects” that users of the library have to implement. The RIB Daemon converts incoming CDAP messages to operations on local objects, and operations on remote objects to outgoing CDAP messages.
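To give a feel for the programming model that librina-application enables, here is a mock usage sketch of the concepts above (register with a DIF, allocate a flow, exchange SDUs, deallocate). All names are illustrative stand-ins, not the real librina API; the real library forwards these operations to the IPC Manager and the kernel via Netlink messages and system calls.

```cpp
#include <cstring>
#include <iostream>
#include <string>

// Mock flow object; the real library would move SDUs through the kernel.
class Flow {
public:
    int write_sdu(const void* sdu, int n) {
        std::cout << "wrote " << n << " bytes" << std::endl;  // mock
        return n;
    }
    int read_sdu(void* buffer, int max) {
        int n = max < 5 ? max : 5;       // mock: pretend the peer echoed
        std::memcpy(buffer, "hello", n);
        return n;
    }
};

// Mock entry points; illustrative names only, not the real librina API.
void register_application(const std::string&, const std::string&) {}
Flow* allocate_flow(const std::string&, const std::string&) { return new Flow(); }
void deallocate_flow(Flow* flow) { delete flow; }

int main()
{
    register_application("rina.apps.echo.client", "normal.DIF");
    Flow* flow = allocate_flow("rina.apps.echo.client",
                               "rina.apps.echo.server");

    char buffer[1500];
    flow->write_sdu("hello", 5);             // send one SDU
    flow->read_sdu(buffer, sizeof(buffer));  // wait for the echo

    deallocate_flow(flow);
    return 0;
}
```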

Base Framework

All the kernel-space components of IRATI rely on a base framework composed of the following parts:

  • rmem: This part implements a common memory management layer that provides additional features over the basic primitives available for dynamic memory allocation and de-allocation (i.e. kmalloc, kzalloc and kfree). These features provide additional debugging functionalities such as memory tampering detection (adding pre-defined shims on top of and at the bottom of the memory area to detect out-of-range writes) and poisoning (initializing the object contents with a known value to detect uninitialized usage), specifically tailored to RINA objects; this allows developers to easily spot memory leaks as well as memory corruption problems.

  • rref: The rref component provides reference-counting functionalities to RINA objects and allows implementing lightweight garbage-collection semantics in a per-object way. Developers can opt-in for reference counting in their objects only when needed.

  • rwq, rbmp and rmap: These parts implement façades for the Linux work-queues, bitmaps and hashmaps respectively. These components provide additional functionalities such as easier dynamic allocation, simplified interaction and ad-hoc methods for non-interruptible contexts.

  • rqueue: The rqueue component mainly provides dynamically resizable queues, which are unavailable as part of the default kernel libraries.

  • rtimer: The rtimer component abstracts the native kernel timer. It aims to ease the development of timer-based functionalities as well as to provide the means for time- and cost-effective porting to other platforms.

Kernel/user interface

The kernel/user space interface is a combination of a set of system calls and Netlink sockets.

The RINA Netlink Layer (RNL) is the solution implemented in IRATI to integrate Netlink into the RINA software framework. It presents an abstraction layer that deals with all the tasks related to the configuration, generation and destruction of Netlink sockets and messages, hiding their complexity from the RNL users. RNL defines a Generic Netlink family called NETLINK_RINA for the exclusive use of IRATI. This family describes the set of messages that can be sent or received by the components in the kernel. In addition, the RNL acts as a multi-directional communication hub that demultiplexes the messages received from user space and provides message generation facilities to the different components in the kernel.

Upon initialization, the Kernel IPC Manager (KIPCM) registers a set of handlers towards the RNL, one for each message defined in the stack’s RNL API. The RNL handlers are objects that contain a call-back function and a parameter that is opaquely transferred by the RNL to the call-back once it gets invoked. This approach allows each component of the kernel to register a specific handler whose call-back function is in charge of performing the task associated with the type of the message received.
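The handler-registration pattern can be sketched as follows. The actual kernel code is written in C; this C++ rendering with hypothetical names only illustrates the call-back-plus-opaque-parameter idea.

```cpp
#include <map>

// Parsed NETLINK_RINA message (hypothetical simplification).
struct RNLMessage {
    int type;
    // ... parsed Netlink attributes ...
};

typedef int (*rnl_callback)(RNLMessage* msg, void* opaque);

struct RNLHandler {
    rnl_callback cb;
    void* opaque;  // transferred untouched to the call-back on invocation
};

class RNL {
    std::map<int, RNLHandler> handlers;  // message type -> handler

public:
    int register_handler(int msg_type, rnl_callback cb, void* opaque) {
        RNLHandler h = { cb, opaque };
        handlers[msg_type] = h;
        return 0;
    }

    // Invoked when a NETLINK_RINA message arrives from user space.
    int deliver(RNLMessage* msg) {
        std::map<int, RNLHandler>::iterator it = handlers.find(msg->type);
        if (it == handlers.end())
            return -1;  // no handler registered for this message type
        return it->second.cb(msg, it->second.opaque);
    }
};
```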

Kernel IPC Manager (KIPCM) and Kernel Flow Allocator (KFA)

The KIPCM is the counterpart in kernel-space of the IPC Manager in user-space. Its main tasks are:

  • It manages the lifecycle of several components in the kernel, such as IPC Processes and the KFA.
  • It is in charge of retrieving the corresponding IPC Process instance whenever it has to perform a task triggered by a call through the interface towards user space (i.e. flow allocation, registering an application to an IPC process, etc.).
  • It abstracts the nature of the IPC Process to the applications by providing a unique API for reading and writing SDUs.
  • It is the main hub to the RNL layer presented above, correlating incoming/outgoing (reply/request) messages by means of their sequence numbers (see the sketch below).
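The request/reply correlation mentioned in the last point can be pictured as a map of pending requests keyed by sequence number; the names below are illustrative.

```cpp
#include <map>

// A request waiting for its reply (illustrative fields).
struct PendingRequest {
    int msg_type;
    void* context;
};

class RequestCorrelator {
    std::map<unsigned int, PendingRequest> pending;  // keyed by sequence number

public:
    void request_sent(unsigned int seq_num, const PendingRequest& req) {
        pending[seq_num] = req;
    }

    // Returns true and fills 'req' if the reply matches a pending request.
    bool reply_received(unsigned int seq_num, PendingRequest& req) {
        std::map<unsigned int, PendingRequest>::iterator it =
            pending.find(seq_num);
        if (it == pending.end())
            return false;
        req = it->second;
        pending.erase(it);
        return true;
    }
};
```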

The KFA is in charge of flow management related functionalities such as creation, binding to IPC Processes, de-allocation, etc. These functionalities are offered both to the KIPCM via its northbound interface, and to the IPC Processes via its southbound interface. The management of port identifiers and their association with flow instances is the most important functionality provided by the KFA, through its Port ID Manager (PIDM). Moreover, the KFA binds the KIPCM and the IPC Processes by means of the flows it manages. For this reason, the KFA creates a flow structure for each flow and binds together its port identifier and the identifier of the IPC Process supporting the flow. The flow structure also contains information such as the flow instance allocation state and the queue of SDUs ready to be consumed by user-space applications.
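The per-flow structure described above can be pictured roughly as follows (the real code is kernel C; the names and types here are illustrative).

```cpp
#include <queue>

struct SDU;  // buffer plus length, omitted here

enum FlowState { ALLOCATION_PENDING, ALLOCATED, DEALLOCATED };

// Per-flow state kept by the KFA (illustrative).
struct KFAFlow {
    int port_id;                  // assigned by the Port ID Manager (PIDM)
    int ipcp_id;                  // IPC Process supporting this flow
    FlowState state;              // flow instance allocation state
    std::queue<SDU*> ready_sdus;  // SDUs ready to be read by user space
};
```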

Normal IPC Process Components

The components of the normal IPC Process implemented in the kernel are those related to data transfer and data transfer control, namely: EFCP, the RMT and SDU Protection.

  • Error and Flow Control Protocol (EFCP). Container of the different EFCP instances. EFCP is responsible for the actual data transfer. EFCP is split into the Data Transfer Protocol (DTP) and Data Transfer Control Protocol (DTCP), loosely coupled through a state vector. There is an EFCP instance for each different connection in each IPC process, which passes data to the relaying and multiplexing task (RMT) in the outgoing direction, or to the kernel IPC manager in the incoming direction. The current implementation supports DTP and simple policies for sliding-window flow control and retransmission control in DTCP.

  • Relaying and Multiplexing Task (RMT). Container of the different RMT instances in the system. Each RMT instance (one per IPC process) multiplexes the data from N EFCP connections to M underlying flows, and relays the data coming from underlying flows to EFCP connections or to other underlying flows, using information from the forwarding table generated by the PDU Forwarding Table Generator. It passes data to the kernel IPC manager in the outgoing direction, and to EFCP in the incoming direction. The current RMT implementation provides a simple scheduling policy that uses two FIFO queues per N-1 flow (one for incoming SDUs and one for outgoing SDUs); therefore there is no differential treatment of traffic.

  • SDU Protection. Container of the different service data unit (SDU) protection module instances in the system. At the finest granularity, there can be a different SDU protection module instance for each distinct underlying flow. The SDU protection component in the kernel is called by the RMT component to either protect data (outgoing direction) or expose data (incoming direction). Right now, IRATI is working on two types of SDU Protection policies: lifetime-limiting policies based on hop count - to enforce Maximum PDU Lifetime within a DIF - and CRC-16/32 error-checking policies. These policies are under development and expected to be available by the end of 2014.

Shim IPC Processes

The task of a shim DIF is to put as thin a veneer as possible over a legacy protocol, allowing a RINA DIF to use it unchanged. In other words, because a DIF assumes it has a RINA API below it, the shim DIF allows a DIF to operate over a legacy technology or a physical medium. Shim DIFs are fully implemented in the kernel, since their functionality is simple and they usually interact with APIs exposed by device drivers or by networking-stack code that is only available in the kernel.

  • Shim IPC Process over 802.1Q. This shim DIF wraps a VLAN with the RINA IPC API, allowing a DIF operating on top to allocate and deallocate flows to other IPC Processes. This shim DIF interacts with two other subsystems in the kernel: the device drivers layer (a shim IPC Process over 802.1Q is attached to the virtual device that represents the VLAN), and the ARP implementation developed by the FP7-IRATI consortium (the Linux ARP implementation could not be used since it was designed assuming that IP was its only user). Registration requests add an entry to the ARP tables, mapping the registered IPC Process name to the address of the Ethernet device. Flow allocation requests resolve the shim IPC Process address of the destination IPC Process by performing an ARP request.

  • Shim IPC Process for HV. The existing shared-memory communication mechanisms between a VM and the hypervisor (HV) supporting it are wrapped by the IPC API. In this way, the Shim DIF for HV can operate within IRATI and be deployed in arbitrary RINA setups. This is achieved without the need for complex device emulation, MAC addresses or artificial MTU limitations. The lower-level communication mechanisms between a guest VM and its host can be modelled as a DIF with local scope, completely hidden within the physical machine (i.e. transparent to DIFs on the outside). The transport functionalities necessary to exchange PDUs are provided by the Virtual Message Passing Interface (VMPI), an API for exchanging messages between a VM and the “Host” - the VMPI peers. Currently this shim DIF supports the KVM and Xen virtualization technologies.

  • Shim IPC Process over TCP/UDP. The shim IPC Process over TCP/UDP module makes use of the socket layer and the DNS resolver functionalities available in the kernel. This shim DIF provides two types of flows: reliable (mapped to TCP connections) and unreliable (mapped to UDP). The mapping of registered IPC Process names to IP addresses and port numbers can be performed in two ways: statically (configured at the shim DIF’s initialization time) or dynamically (via the use of DNS SRV records).
