
Libtrace Meta Data API

Shane Alcock edited this page May 1, 2019 · 4 revisions

This API was added in libtrace 4.0.7

This page explains how libtrace can be used to explore meta-data that is either attached to packets or included in the packet stream as separate records.

Note that only a handful of formats support this type of meta-data (most notably pcap-ng and ERF).

What is meta-data?

In the packet capture context, meta-data typically records information about the process that was used to conduct the packet capture. This information may provide useful insight when interpreting certain behaviours observed in the capture, or it may simply act as a form of historical documentation. Meta-data can also be used to annotate packet traces, adding extra context to particular events or tagging interesting packets with extra notes. While this data exists alongside the conventional captured packets in a trace, it does not make sense to read, write, or extend the meta-data using the existing libtrace API functions -- instead, we need new methods and approaches to allow users to interact with this information (which may or may not be valuable for their particular analysis needs).

Meta-data within Libtrace

Firstly, there is an important distinction to make with regard to meta-data. Some meta-data is directly attached to a specific captured packet, i.e. it is included within the capture format that encapsulates the captured packet. This type of meta-data is currently not supported by our API, although this will probably be added in the near future. The more common type of meta-data is stored within separate records alongside the captured packets, and this is what the current API is engineered towards.

To simplify things for downstream developers using the libtrace API, the separate meta-data records are delivered to users as instances of the libtrace_packet_t structure. This means that when writing code, the user doesn't need to worry about whether the next item read from an input source is a packet or a meta-data record: calling a packet-level operation on a meta-data record returns an appropriate NULL result, and the same applies when calling a meta-data analysis function on a captured packet. As long as you have good error detection in your code (which you should have regardless), the appearance of meta-data records in your packet stream should cause no problems.

Disabling meta-data

Having said that, there may be situations where you never want a libtrace read function such as trace_read_packet() to return a meta-data record. For these cases, we have added a trace_config() option called TRACE_OPTION_DISCARD_META: if it is set to a non-zero value prior to calling trace_start(), all meta-data records will be automatically discarded.

Meta-data in parallel libtrace

In parallel libtrace, we decided to deal with meta-data slightly differently. In single-threaded libtrace, meta-data records and captured packets are read and provided to the user via the same function (i.e. trace_read_packet()). In parallel libtrace, the callback model that we use allows us to separate the processing into two distinct callbacks: one for conventional packets and one for meta-data records. This means that you should never receive a meta-data record in your regular packet callback; the meta packet callback will be invoked instead, so your callback code ends up being a bit simpler and tidier.

Libtrace structures for handling meta-data

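typedef struct libtrace_meta_item {
        uint16_t section;                   /* section part of the key (format-specific) */
        uint16_t option;                    /* option part of the key (format-specific) */
        char *option_name;                  /* printable name for the key */
        uint16_t len;                       /* length of the value */
        libtrace_meta_datatype_t datatype;  /* type of the value that data points to */
        void *data;                         /* pointer to the value itself */
} libtrace_meta_item_t;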

typedef struct libtrace_meta_section {
        uint16_t num;
        libtrace_meta_item_t *items;
} libtrace_meta_t;

A meta-data field is best described as a simple key-value pair, which describes a particular aspect of the capture process or acts as an annotation within the packet stream. Most meta-data records will contain multiple meta-data fields. For instance, one field might contain the serial number of the capture device and another might describe the OS that was running on that device at the time. Most existing meta-data systems also tend to divide the fields amongst different sections, which provide additional scope for whatever the field is describing (e.g. whether the meta-data applies to the host that the capture was run on, a particular interface, or perhaps a certain "block" of captured packets).

Within libtrace, each meta-data field is represented using a libtrace_meta_item_t structure. The section and option fields together represent the key for the meta-data field and are numbered according to the specific assignments for the underlying meta-data format. For instance, an ERF meta-data field with a section of ERF_PROV_SECTION_CAPTURE and an option of ERF_PROV_DESCR refers to the description field for the current capture process. The option_name field contains a printable string representation of the key for use in output. The len field contains the length of the value for the meta-data field. The datatype field tells you what type of data is stored in the value, e.g. a string, an unsigned 32 bit integer, etc. Finally, data is simply a pointer to the value itself, which the user must cast to an appropriate type (as indicated by datatype) before attempting to read it.

All libtrace meta-data API functions actually operate on a wrapper structure called a libtrace_meta_t, which is simply a managed array of libtrace_meta_item_t instances. The libtrace_meta_t only contains two members: num which is the number of items in the array and items which is a pointer to the first item in the array.
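As a rough illustration of the structures described above, the following sketch walks a libtrace_meta_t and renders each value according to its datatype. The structure definitions are copied from this page so that the example compiles without libtrace installed; real code would include libtrace.h instead. The enumerator names used for libtrace_meta_datatype_t and the helper names format_meta_item() and print_all_meta() are assumptions made for this example, as is the idea that integer values are stored in host byte order -- check libtrace.h for the authoritative definitions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for libtrace_meta_datatype_t. The enumerator names are assumed;
 * consult libtrace.h for the real ones. */
typedef enum {
        TRACE_META_UINT32,
        TRACE_META_UINT64,
        TRACE_META_STRING,
        TRACE_META_UNKNOWN
} libtrace_meta_datatype_t;

/* Copied from this page so the sketch builds without libtrace. */
typedef struct libtrace_meta_item {
        uint16_t section;
        uint16_t option;
        char *option_name;
        uint16_t len;
        libtrace_meta_datatype_t datatype;
        void *data;
} libtrace_meta_item_t;

typedef struct libtrace_meta_section {
        uint16_t num;
        libtrace_meta_item_t *items;
} libtrace_meta_t;

/* Hypothetical helper: writes "name: value" for one item into buf and
 * returns the value that snprintf() returned. */
int format_meta_item(const libtrace_meta_item_t *item, char *buf, size_t buflen)
{
        uint32_t v32;
        uint64_t v64;

        assert(item != NULL && buf != NULL && buflen > 0);

        switch (item->data == NULL ? TRACE_META_UNKNOWN : item->datatype) {
        case TRACE_META_UINT32:
                if (item->len < sizeof(v32))
                        break;
                /* memcpy avoids assuming that data is suitably aligned. */
                memcpy(&v32, item->data, sizeof(v32));
                return snprintf(buf, buflen, "%s: %" PRIu32,
                                item->option_name, v32);
        case TRACE_META_UINT64:
                if (item->len < sizeof(v64))
                        break;
                memcpy(&v64, item->data, sizeof(v64));
                return snprintf(buf, buflen, "%s: %" PRIu64,
                                item->option_name, v64);
        case TRACE_META_STRING:
                /* Bound the output by len rather than relying on a
                 * terminating NUL byte. */
                return snprintf(buf, buflen, "%s: %.*s", item->option_name,
                                (int)item->len, (const char *)item->data);
        default:
                break;
        }
        /* Unknown type, missing value, or a value that is too short. */
        return snprintf(buf, buflen, "%s: <%u bytes>", item->option_name,
                        (unsigned)item->len);
}

/* Hypothetical helper: prints every item in meta, one per line, and
 * returns the number of items printed (0 if meta is NULL). */
int print_all_meta(const libtrace_meta_t *meta, FILE *out)
{
        char line[256];
        uint16_t i;

        if (meta == NULL || meta->items == NULL)
                return 0;
        for (i = 0; i < meta->num; i++) {
                format_meta_item(&meta->items[i], line, sizeof(line));
                fprintf(out, "%s\n", line);
        }
        return meta->num;
}
```

In a real program, the libtrace_meta_t passed to print_all_meta() would come from one of the API functions described in the next section rather than being built by hand.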

Libtrace API functions for accessing meta-data

The first method for accessing meta-data is to simply ask libtrace to give you all of the meta-data fields present in the meta packet, using the trace_get_all_metadata() function:

libtrace_meta_t *trace_get_all_metadata(libtrace_packet_t *packet);

This function returns a libtrace_meta_t whose internal array contains every meta-data field that could be found within the given packet / record. You can then iterate over all of the fields and act on each one accordingly. Note that if there are no meta-data fields within the provided packet (e.g. you passed in a conventional packet), this function will return NULL.
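If several fields are needed from the same record, one option is to fetch the full list once and search it locally. The sketch below shows such a search over a libtrace_meta_t, including the NULL case described above. The helper name find_meta_item() is hypothetical, and the structure definitions are repeated from this page (with a placeholder for libtrace_meta_datatype_t) only so that the example compiles without libtrace; real code would include libtrace.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder: libtrace.h defines the real libtrace_meta_datatype_t. */
typedef int libtrace_meta_datatype_t;

/* Copied from this page so the sketch builds without libtrace. */
typedef struct libtrace_meta_item {
        uint16_t section;
        uint16_t option;
        char *option_name;
        uint16_t len;
        libtrace_meta_datatype_t datatype;
        void *data;
} libtrace_meta_item_t;

typedef struct libtrace_meta_section {
        uint16_t num;
        libtrace_meta_item_t *items;
} libtrace_meta_t;

/* Hypothetical helper: returns the first item whose key matches
 * (section, option), or NULL if meta is NULL or no item matches. */
const libtrace_meta_item_t *find_meta_item(const libtrace_meta_t *meta,
                uint16_t section, uint16_t option)
{
        uint16_t i;

        /* trace_get_all_metadata() returns NULL for conventional packets. */
        if (meta == NULL)
                return NULL;
        /* A non-empty list is expected to have a valid items pointer. */
        assert(meta->num == 0 || meta->items != NULL);

        for (i = 0; i < meta->num; i++) {
                if (meta->items[i].section == section &&
                                meta->items[i].option == option)
                        return &meta->items[i];
        }
        return NULL;
}
```

The returned pointer refers to an element inside the libtrace_meta_t, so it should not be used after that structure has been released.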

If you have a specific meta-data field that you are looking for, you can instead use:

libtrace_meta_t *trace_get_single_meta_field(libtrace_packet_t *packet, uint32_t section_code, uint16_t option_code);

This will return a libtrace_meta_t where the internal array contains just the meta-data field identified by the combination of the section_code and option_code that were provided as parameters. If no such field is present in the provided packet, NULL is returned.

int trace_destroy_meta(libtrace_meta_t *meta);

With both trace_get_single_meta_field() and trace_get_all_metadata(), you will need to call trace_destroy_meta() when you have finished with the returned libtrace_meta_t structure to ensure that any allocated memory within the structure is returned to the host system.
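This page does not state whether the data pointer of an item remains valid after trace_destroy_meta() has been called, so a cautious approach is to copy any value you want to keep into your own buffer first. The sketch below shows one way to do that for a string value, using the same space / spacelen convention as the convenience functions listed further down. The helper name copy_first_value() is hypothetical, the caller is assumed to have already checked that the item holds a string, and the structure definitions are repeated from this page (with a placeholder for libtrace_meta_datatype_t) only so that the example compiles without libtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder: libtrace.h defines the real libtrace_meta_datatype_t. */
typedef int libtrace_meta_datatype_t;

/* Copied from this page so the sketch builds without libtrace. */
typedef struct libtrace_meta_item {
        uint16_t section;
        uint16_t option;
        char *option_name;
        uint16_t len;
        libtrace_meta_datatype_t datatype;
        void *data;
} libtrace_meta_item_t;

typedef struct libtrace_meta_section {
        uint16_t num;
        libtrace_meta_item_t *items;
} libtrace_meta_t;

/* Hypothetical helper: copies the value of the first item in meta into
 * space as a NUL-terminated string, truncating it if necessary.
 * Returns space on success, or NULL if there is nothing to copy. */
char *copy_first_value(const libtrace_meta_t *meta, char *space,
                size_t spacelen)
{
        size_t n;

        assert(space != NULL);
        if (meta == NULL || meta->num == 0 || meta->items == NULL ||
                        meta->items[0].data == NULL || spacelen == 0)
                return NULL;

        /* Use len as the bound; the value may not be NUL-terminated. */
        n = meta->items[0].len;
        if (n > spacelen - 1)
                n = spacelen - 1;
        memcpy(space, meta->items[0].data, n);
        space[n] = '\0';
        return space;
}
```

With a helper like this, the intended order in real code would be: obtain the libtrace_meta_t from trace_get_single_meta_field(), copy out the value, and only then call trace_destroy_meta().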

For some particularly notable meta-data fields, we have also included API functions which will directly search for and return the value for that field (if present in the meta packet). This allows users to call these functions directly without having to know what the appropriate section and option identifiers are -- this is also useful in situations where the field is available in multiple capture formats (e.g. both pcapng and ERF support the field) but the section and option identifiers may be different for each format.

A list of such functions is provided below:

char *trace_get_interface_name(libtrace_packet_t *packet, char *space, int spacelen, int index);
char *trace_get_interface_mac(libtrace_packet_t *packet, char *space, int spacelen, int index);
uint64_t trace_get_interface_speed(libtrace_packet_t *packet, int index);
uint32_t trace_get_interface_ipv4(libtrace_packet_t *packet, int index);
char *trace_get_interface_ipv4_string(libtrace_packet_t *packet, char* space, int spacelen, int index);
void *trace_get_interface_ipv6(libtrace_packet_t *packet, void *space, int spacelen, int index);
char *trace_get_interface_ipv6_string(libtrace_packet_t *packet, char *space, int spacelen, int index);
char *trace_get_interface_description(libtrace_packet_t *packet, char *space, int spacelen, int index);
char *trace_get_host_os(libtrace_packet_t *packet, char *space, int spacelen);
uint32_t trace_get_interface_fcslen(libtrace_packet_t *packet, int index);
char *trace_get_interface_comment(libtrace_packet_t *packet, char *space, int spacelen, int index);
char *trace_get_capture_application(libtrace_packet_t *packet, char *space, int spacelen);
char *trace_get_erf_dag_card_model(libtrace_packet_t *packet, char *space, int spacelen);
char *trace_get_erf_dag_version(libtrace_packet_t *packet, char *space, int spacelen);
char *trace_get_erf_dag_fw_version(libtrace_packet_t *packet, char *space, int spacelen);

Many of the above functions also have versions which will return a pointer to a libtrace_meta_t structure, if that is what you prefer -- see libtrace.h for more details.
