Tips for Debugging

Michel Machado edited this page Jul 9, 2024 · 25 revisions

This page gives some tips on debugging Gatekeeper using pktgen and gdb.

Generating Packets on DPDK Interfaces

This section describes how to install and use the DPDK pktgen program to generate packets. We provide an example based on our "XIA" server setup.

Server Physical Setup

The XIA server has four DPDK-enabled ports that can be used to send and receive packets. The diagram below shows the four ports (along with their PCI addresses):

    +-----------------------------------------+
    |  -----               -----              |
    |  | o | A (83:00.0)   | o | B (83:00.1)  |
    |  --|--               --|--              |
    |    |                   |                |
    |  --|--               --|--              |
    |  | o | C (85:00.0)   | o | D (85:00.1)  |
    |  -----               -----              |
    +-----------------------------------------+

Note that ports A and C are connected, and ports B and D are connected. The MAC addresses of the four ports are A (e8:ea:6a:06:1f:7c), B (e8:ea:6a:06:1f:7d), C (e8:ea:6a:06:21:b2), and D (e8:ea:6a:06:21:b3). The port MAC addresses may be used for generating specific packets in the following sections.

When you run multiple DPDK applications at the same time (such as two instances of pktgen, or pktgen and another DPDK application), you need to blacklist the PCI devices, or ports in this case, that you don't want to use. Depending on how you blacklist the devices, the port numbers that are assigned by DPDK will change.

For example, if you blacklist ports A, B, and C in one instance of pktgen, then port D is known as port 0 in that instance. If, at the same time, you blacklist ports A, C, and D in another instance of pktgen, then in that instance port B is known as port 0. To avoid ambiguity, in this document we will refer to the ports using these letter designations (A, B, C, D) instead of by numbers.
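
The renumbering described above can be sketched with a small shell function. This is only an illustration under the assumption that DPDK hands out port numbers, starting from 0, to the ports that are not blacklisted, in PCI address order; the function name dpdk_port_ids is hypothetical and is not part of DPDK or pktgen.

```shell
# Print "<port number> <PCI address>" for every port that is not blacklisted.
# Assumption: port numbers follow PCI address order among the remaining ports.
# Usage: dpdk_port_ids "<all ports in PCI order>" "<blacklisted ports>"
dpdk_port_ids() {
    all_ports=$1
    blacklist=$2
    id=0
    for pci in $all_ports; do
        case " $blacklist " in
            *" $pci "*) ;;    # blacklisted: this port gets no number
            *) echo "$id $pci"; id=$((id + 1)) ;;
        esac
    done
}

ALL="83:00.0 83:00.1 85:00.0 85:00.1"             # ports A, B, C, D
dpdk_port_ids "$ALL" "83:00.0 83:00.1 85:00.0"    # only port D is left
dpdk_port_ids "$ALL" "83:00.0 85:00.0 85:00.1"    # only port B is left
```

In both calls the single remaining port is reported as port 0, which is why this page refers to ports by letter.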

Setting up pktgen

First, obtain the pktgen source code and build it:

    $ git clone http://dpdk.org/git/apps/pktgen-dpdk
    $ cd pktgen-dpdk
    $ make

Once it compiles, run the setup script:

    $ sudo -E ./setup.sh

The application should be ready to run at this point, but you may also need to perform the following step (still in the pktgen-dpdk directory):

    $ cp Pktgen.lua app/app/x86_64-native-linuxapp-gcc/

Running Two Instances of pktgen

This demonstration will show how we can generate packets on port A and send them to port C. To begin, open two terminals.

In the first terminal, go to the pktgen-dpdk directory and run:

    $ cd app/app/x86_64-native-linuxapp-gcc/
    $ sudo ./pktgen -l 0-2 --socket-mem 256 --file-prefix pg1 -b 83:00.1 -b 85:00.0 -b 85:00.1 -- -T -P -m "[1:2].0" 

The "-l 0-2" option specifies which lcores are available for use: lcores 0, 1, and 2 (the equivalent hexadecimal core mask would be "-c 7"). The "-m [1:2].0" part specifies that lcores 1 and 2 will handle rx/tx on port 0, and lcore 0 is automatically assigned to the pktgen program for displaying statistics.

The "--socket-mem 256" option puts a limit on the memory used, which is often needed when multiple DPDK applications are run at the same time. The "--file-prefix pg1" option specifies a special file prefix to use for this DPDK application's meta information, which again is needed if there are multiple DPDK applications running (which we will have in the next part).

The "-b" options blacklist ports; in other words, they exclude those ports from being used by the application. In this command, we only want to use port A (at PCI location 83:00.0), so we blacklist the other three ports: B (83:00.1), C (85:00.0), and D (85:00.1).

The "-T" option gives the display statistics some color and the "-P" makes all ports run in promiscuous mode.

In the second terminal, run the same command, but with different lcores and blacklisted ports:

    $ cd app/app/x86_64-native-linuxapp-gcc/
    $ sudo ./pktgen -c 70 --socket-mem 256 --file-prefix pg2 -b 83:00.0 -b 83:00.1 -b 85:00.1 -- -T -P -m "[5:6].0"

This does the same as the first command, but instead allows lcores 4, 5, and 6 to be used, uses a different file prefix, and blacklists all ports except for port C.
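
The value given to "-c" is a hexadecimal bit mask in which bit N selects lcore N. As a rough aid for checking a mask before starting an instance, the sketch below expands a mask into the lcores it selects; mask_to_lcores is a hypothetical helper written for this page, not a DPDK tool.

```shell
# Print the lcore IDs selected by a hexadecimal core mask such as "7" or "70".
mask_to_lcores() {
    mask=$((0x$1))    # interpret the argument as hexadecimal
    lcore=0
    list=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            list="${list:+$list,}$lcore"    # append, comma-separated
        fi
        mask=$((mask >> 1))
        lcore=$((lcore + 1))
    done
    echo "$list"
}

mask_to_lcores 7     # prints 0,1,2
mask_to_lcores 70    # prints 4,5,6
```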

More information about the command-line parameters is available in the pktgen documentation.

Once the applications are running, go to the terminal running pktgen on port A. You can start packets flowing using:

    Pktgen> start 0

Remember that port numbering always starts from 0 within an application, so since we blacklisted all other ports, port 0 means port A.

When you do this, you should see the second terminal's statistics being updated. You could also start packets flowing on the second terminal (port C) and see the packets being received on port A -- on the second terminal, port C will also be called port 0, since that is the only active port in that application.

You can stop packets flowing with:

    Pktgen> stop 0

And quit with:

    Pktgen> quit

Debugging Gatekeeper

Since Gatekeeper is compiled with the -g option, gdb can be directly used with Gatekeeper. After it is compiled, you can run:

    $ sudo gdb -ex 'handle SIG33 nostop noprint' ./build/gatekeeper

From there, you can use gdb as described in its documentation. Once gdb starts, run Gatekeeper with the following command:

    (gdb) run

While the basic debugging information is fine for quick inspections, disabling compiler optimization and having more debugging information is essential for hunting bugs. To compile Gatekeeper with compiler optimization level set to 0 and more debugging information, set the environment variable EXTRA_CFLAGS as follows:

    export EXTRA_CFLAGS='-O0 -ggdb'

Once the variable EXTRA_CFLAGS is exported, just call make as you would to compile Gatekeeper.

Debugging a running Gatekeeper

Due to the restrictions of a production deployment, one may have to debug a running Gatekeeper (or Grantor) instance. The first step to debug a running Gatekeeper instance is to obtain its PID, which can be found in Gatekeeper's log. For example, the following log entry was generated during the initialization of a Gatekeeper instance:

    Main/0 2024-05-15 15:44:37 NOTICE Gatekeeper pid = 12761

With the PID 12761, the following command attaches gdb to Gatekeeper:

    $ sudo gdb -ex 'handle SIG33 nostop noprint' -p 12761

Once the investigation is done, detach gdb from Gatekeeper as follows:

    (gdb) detach
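
When the log is long, the PID lookup can be scripted. The sketch below assumes the log entry keeps the "Gatekeeper pid = N" wording shown above; the function name gatekeeper_pid_from_log is hypothetical, and where the log is stored depends on the deployment.

```shell
# Print the PID announced in a Gatekeeper log read from standard input.
# If the log records several startups, the most recent one is printed.
gatekeeper_pid_from_log() {
    sed -n 's/.*Gatekeeper pid = \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Example with the log entry shown above; prints 12761.
echo 'Main/0 2024-05-15 15:44:37 NOTICE Gatekeeper pid = 12761' | gatekeeper_pid_from_log
```

The printed value can then be passed to the -p option of gdb. It is worth confirming that the process with that PID is still running before attaching, since an old log may refer to an instance that has already exited.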

Changing the log level of a component of DPDK

One may need to increase the log level of DPDK to investigate a bug. However, increasing the log level for all of DPDK may generate so many log entries that finding the useful ones becomes a burden. This burden can be avoided if one is only interested in a component of DPDK since it is possible to increase the log level for only that component. To change the log level of a component of DPDK, pass the parameter --log-level <type:val> to Gatekeeper.

In order to find the type of a component of DPDK, one likely has to read DPDK's source. For example, for the bonding interface, it is pmd.net.bonding. This information can be found by inspecting the output of grep RTE_LOG_REGISTER -r dependencies/dpdk/drivers/net/bonding/. Thus, to set the log level of bonding interfaces to debug, pass --log-level pmd.net.bonding:debug to Gatekeeper.

The parameter --log-level is one of the many EAL parameters.

Compiling DPDK with debug information

Sometimes one has to investigate the code of DPDK with gdb as well, and having DPDK compiled with debug information included is very helpful in these cases. To compile the DPDK and sample applications with debugging information included and the optimization level set to 0, set the build type as follows:

    cd dependencies/dpdk/build
    meson configure --buildtype=debug

Once the build type is set, compile DPDK as described in setup.sh.

For more information on the options available from meson, see the page Compiling the DPDK Target from Source.

Reassigning an interface to the kernel

As one works with Gatekeeper to diagnose an issue, there may be a time when it is necessary to unbind an interface from DPDK and bind it back to the kernel. The first step to do this is to identify the PCI address of the interface and the kernel driver to use. For example:

    $ dependencies/dpdk/usertools/dpdk-devbind.py --status-dev net
    
    Network devices using DPDK-compatible driver
    ============================================
    0000:3b:00.0 'Ethernet Controller 10G X550T 1563' drv=uio_pci_generic unused=ixgbe,igb_uio,vfio-pci
    0000:3b:00.1 'Ethernet Controller 10G X550T 1563' drv=uio_pci_generic unused=ixgbe,igb_uio,vfio-pci
    0000:5e:00.1 'Ethernet Controller 10G X550T 1563' drv=uio_pci_generic unused=ixgbe,igb_uio,vfio-pci
    
    Network devices using kernel driver
    ===================================
    0000:19:00.0 'I350 Gigabit Network Connection 1521' if=eno1 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic *Active*
    0000:19:00.1 'I350 Gigabit Network Connection 1521' if=eno2 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic 
    0000:19:00.2 'I350 Gigabit Network Connection 1521' if=eno3 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic 
    0000:19:00.3 'I350 Gigabit Network Connection 1521' if=eno4 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic 
    0000:5e:00.0 'Ethernet Controller 10G X550T 1563' if=ens2f0 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic *Active*

In this example, we will move back to the kernel the interface whose PCI address is 0000:5e:00.1. The field "unused" of this interface identifies the kernel driver: ixgbe. In practice, one needs to match the information above with some specifics of the occasion to identify the PCI address. In this specific example, the high-level goal was to have all ports of that interface available to the kernel. One of those ports is already in the kernel: 0000:5e:00.0. Notice that the field "drv" of 0000:5e:00.0 matches the kernel driver identified before. Once the PCI address and the kernel driver are known, the process goes as follows:

    $ sudo dependencies/dpdk/usertools/dpdk-devbind.py -u 5e:00.1
    $ sudo dependencies/dpdk/usertools/dpdk-devbind.py -b ixgbe 5e:00.1
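
Reading the "unused" field can be scripted when many interfaces are listed. The sketch below is an illustration only: kernel_driver_for is a hypothetical helper, and it assumes that the first entry of "unused" is the kernel driver, as it is in the example output above. That assumption only makes sense for interfaces currently bound to a DPDK-compatible driver, so the result should still be cross-checked against the "drv" field of a sibling port as described above.

```shell
# Read dpdk-devbind.py status output from standard input and print the first
# driver listed in the "unused=" field of the line matching the PCI address.
kernel_driver_for() {
    grep -F "$1" | sed -n 's/.*unused=\([^, ]*\).*/\1/p' | head -n 1
}

# Two lines copied from the status output above; prints ixgbe.
kernel_driver_for 5e:00.1 <<'EOF'
0000:3b:00.0 'Ethernet Controller 10G X550T 1563' drv=uio_pci_generic unused=ixgbe,igb_uio,vfio-pci
0000:5e:00.1 'Ethernet Controller 10G X550T 1563' drv=uio_pci_generic unused=ixgbe,igb_uio,vfio-pci
EOF
```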

Disassembling BPF programs

Flow entries may have BPF programs associated with them, and one may need to verify that these programs were correctly compiled. To do so, one can disassemble a BPF program as follows:

    $ llvm-objdump -S --no-show-raw-insn --section=init declined.bpf
    
    declined.bpf:	file format ELF64-BPF
    
    Disassembly of section init:
    declined_init:
           0:	r0 = 0
           1:	exit

The option --no-show-raw-insn suppresses the hexadecimal representation of the instructions, which is only useful if one is debugging the BPF VM itself, or generating BPF instructions dynamically.

Gatekeeper BPF programs have two sections: "init" and "pkt". Section "init" is used to initialize a BPF state, whereas the section "pkt" is used to process a packet associated with a flow.
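
To inspect both sections in one go, the llvm-objdump command shown above can be wrapped in a loop. This is a minimal sketch: the function name disassemble_bpf and the OBJDUMP override variable are hypothetical conveniences, and the only options used are the ones already shown on this page.

```shell
# Disassemble the "init" and "pkt" sections of a Gatekeeper BPF program.
# OBJDUMP is a hypothetical override for systems where the binary has a
# different name; it defaults to llvm-objdump.
disassemble_bpf() {
    for section in init pkt; do
        "${OBJDUMP:-llvm-objdump}" -S --no-show-raw-insn --section="$section" "$1"
    done
}

# Usage: disassemble_bpf declined.bpf
```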

Generating a Debian package for debugging

Building a debug-enabled Debian package for Gatekeeper can be useful because it eases the uninstallation procedure, which must be done manually when performing the build process with the setup.sh script.

The following procedure assumes one is at the root of a recursively-cloned gatekeeper repository, i.e. git clone --recursive http://github.com/AltraMayor/gatekeeper.git followed by cd gatekeeper.

1) Edit the Makefile to disable compiler optimizations and enable debug information:

    -EXTRA_CFLAGS += -O3 -g -Wfatal-errors -DALLOW_EXPERIMENTAL_API \
    +EXTRA_CFLAGS += -O0 -ggdb -Wfatal-errors -DALLOW_EXPERIMENTAL_API \

2) Edit the debian/rules file to make the following changes:

  • Replace the -g -O2 flags in the CFLAGS and LDFLAGS environment variables with -ggdb -O0:

    -CFLAGS  += -g -O2 -fno-strict-aliasing -fno-strict-overflow -fPIC
    -LDFLAGS += -g -O2 -fno-strict-aliasing -fno-strict-overflow -fPIC -Wl,-z,defs -Wl,--as-needed
    +CFLAGS  += -ggdb -O0 -fno-strict-aliasing -fno-strict-overflow -fPIC
    +LDFLAGS += -ggdb -O0 -fno-strict-aliasing -fno-strict-overflow -fPIC -Wl,-z,defs -Wl,--as-needed

  • Pass EXTRA_CFLAGS to do the same when building DPDK:

    -               make config T="$(RTE_TARGET)"; \
    +               make config T="$(RTE_TARGET)" EXTRA_CFLAGS="-O0 -ggdb"; \

  • Prevent binary stripping by adding the following line at the end of the file:

    override_dh_strip:

3) Build the package:

    $ tar --exclude-vcs -zcvf ../gatekeeper_1.0.0.orig.tar.gz -C .. gatekeeper
    $ debuild -uc -us

4) Install the gatekeeper package and the corresponding debug symbols package:

    # dpkg -i ../gatekeeper_1.0.0-0_amd64.deb ../gatekeeper-dbgsym_1.0.0-0_amd64.ddeb

The package can then be controlled via systemd with the usual commands. For example:

    # systemctl start gatekeeper   # start the service
    # systemctl stop gatekeeper    # stop the service
    # systemctl enable gatekeeper  # enable startup on boot
    # systemctl disable gatekeeper # disable startup on boot

To run Gatekeeper under GDB, it is first necessary to manually perform a few steps that the systemd unit performs before startup. These assume the GATEKEEPER_INTERFACES environment variable has been set in /etc/gatekeeper/envvars:

    # . /etc/gatekeeper/envvars
    # install -oroot -ggatekeeper -m0770 -d /var/run/gatekeeper /var/run/dpdk/rte
    # /usr/share/gatekeeper/devbind.sh $GATEKEEPER_INTERFACES
    # gdb -ex 'handle SIG33 nostop noprint' --args /usr/sbin/gatekeeper $DPDK_ARGS -- $GATEKEEPER_ARGS

Notice that the current working directory must be the root of a recursively-cloned gatekeeper repository when calling gdb. Otherwise, gdb will not find the source code.

Finally, it's worth remembering that the Debian package of Gatekeeper also installs gkctl. The associated scripts are available at /usr/share/gatekeeper/. For example, to call script show_fib6.lua, enter: gkctl /usr/share/gatekeeper/show_fib6.lua.