Skip to content

Commit

Permalink
mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
Browse files Browse the repository at this point in the history
Lately, a bug was found when many TC filters are added - at some point,
several bugs are printed to dmesg [1] and the switch is crashed with
segmentation fault.

The issue starts when gen_pool_free() fails because of unexpected
behavior - a try to free memory which is already freed, this leads to BUG()
call which crashes the switch and makes many other bugs.

Trying to track down the unexpected behavior led to a bug in eRP code. The
function mlxsw_sp_acl_erp_table_alloc() gets a pointer to the allocated
index, sets the value and returns an error code. When gen_pool_alloc()
fails it returns address 0, we track it and return -ENOBUFS outside, BUT
the call for gen_pool_alloc() already override the index in erp_table
structure. This is a problem when such allocation is done as part of
table expansion. This is not a new table, which will not be used in case
of allocation failure. We try to expand eRP table and override the
current index (non-zero) with zero. Then, it leads to an unexpected
behavior when address 0 is freed twice. Note that address 0 is valid in
erp_table->base_index and indeed other tables use it.

gen_pool_alloc() fails in case that there is no space left in the
pre-allocated pool, in our case, the pool is limited to
ACL_MAX_ERPT_BANK_SIZE, which is read from hardware. When more than max
erp entries are required, we exceed the limit and return an error, this
error leads to "Failed to migrate vregion" print.

Fix this by changing erp_table->base_index only in case of a successful
allocation.

Add a test case for such a scenario. Without this fix it causes
segmentation fault:

$ TESTS="max_erp_entries_test" ./tc_flower.sh
./tc_flower.sh: line 988:  1560 Segmentation fault      tc filter del dev $h2 ingress chain $i protocol ip pref $i handle $j flower &>/dev/null

[1]:
kernel BUG at lib/genalloc.c:508!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 3531 Comm: tc Not tainted 6.7.0-rc5-custom-ga6893f479f5e #1
Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021
RIP: 0010:gen_pool_free_owner+0xc9/0xe0
...
Call Trace:
 <TASK>
 __mlxsw_sp_acl_erp_table_other_dec+0x70/0xa0 [mlxsw_spectrum]
 mlxsw_sp_acl_erp_mask_destroy+0xf5/0x110 [mlxsw_spectrum]
 objagg_obj_root_destroy+0x18/0x80 [objagg]
 objagg_obj_destroy+0x12c/0x130 [objagg]
 mlxsw_sp_acl_erp_mask_put+0x37/0x50 [mlxsw_spectrum]
 mlxsw_sp_acl_ctcam_region_entry_remove+0x74/0xa0 [mlxsw_spectrum]
 mlxsw_sp_acl_ctcam_entry_del+0x1e/0x40 [mlxsw_spectrum]
 mlxsw_sp_acl_tcam_ventry_del+0x78/0xd0 [mlxsw_spectrum]
 mlxsw_sp_flower_destroy+0x4d/0x70 [mlxsw_spectrum]
 mlxsw_sp_flow_block_cb+0x73/0xb0 [mlxsw_spectrum]
 tc_setup_cb_destroy+0xc1/0x180
 fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
 __fl_delete+0x1ac/0x1c0 [cls_flower]
 fl_destroy+0xc2/0x150 [cls_flower]
 tcf_proto_destroy+0x1a/0xa0
...
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion

Fixes: f465261 ("mlxsw: spectrum_acl: Implement common eRP core")
Signed-off-by: Amit Cohen <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Acked-by: Paolo Abeni <[email protected]>
Link: https://lore.kernel.org/r/4cfca254dfc0e5d283974801a24371c7b6db5989.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
  • Loading branch information
Amit Cohen authored and kuba-moo committed Jan 18, 2024
1 parent f1172f3 commit 6d6eeab
Show file tree
Hide file tree
Showing 2 changed files with 56 additions and 4 deletions.
8 changes: 5 additions & 3 deletions drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_erp.c
Original file line number Diff line number Diff line change
Expand Up @@ -301,6 +301,7 @@ mlxsw_sp_acl_erp_table_alloc(struct mlxsw_sp_acl_erp_core *erp_core,
unsigned long *p_index)
{
unsigned int num_rows, entry_size;
unsigned long index;

/* We only allow allocations of entire rows */
if (num_erps % erp_core->num_erp_banks != 0)
Expand All @@ -309,10 +310,11 @@ mlxsw_sp_acl_erp_table_alloc(struct mlxsw_sp_acl_erp_core *erp_core,
entry_size = erp_core->erpt_entries_size[region_type];
num_rows = num_erps / erp_core->num_erp_banks;

*p_index = gen_pool_alloc(erp_core->erp_tables, num_rows * entry_size);
if (*p_index == 0)
index = gen_pool_alloc(erp_core->erp_tables, num_rows * entry_size);
if (!index)
return -ENOBUFS;
*p_index -= MLXSW_SP_ACL_ERP_GENALLOC_OFFSET;

*p_index = index - MLXSW_SP_ACL_ERP_GENALLOC_OFFSET;

return 0;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,8 @@ lib_dir=$(dirname $0)/../../../../net/forwarding
ALL_TESTS="single_mask_test identical_filters_test two_masks_test \
multiple_masks_test ctcam_edge_cases_test delta_simple_test \
delta_two_masks_one_key_test delta_simple_rehash_test \
bloom_simple_test bloom_complex_test bloom_delta_test"
bloom_simple_test bloom_complex_test bloom_delta_test \
max_erp_entries_test"
NUM_NETIFS=2
source $lib_dir/lib.sh
source $lib_dir/tc_common.sh
Expand Down Expand Up @@ -983,6 +984,55 @@ bloom_delta_test()
log_test "bloom delta test ($tcflags)"
}

max_erp_entries_test()
{
# The number of eRP entries is limited. Once the maximum number of eRPs
# has been reached, filters cannot be added. This test verifies that
# when this limit is reached, inserstion fails without crashing.

RET=0

local num_masks=32
local num_regions=15
local chain_failed
local mask_failed
local ret

if [[ "$tcflags" != "skip_sw" ]]; then
return 0;
fi

for ((i=1; i < $num_regions; i++)); do
for ((j=$num_masks; j >= 0; j--)); do
tc filter add dev $h2 ingress chain $i protocol ip \
pref $i handle $j flower $tcflags \
dst_ip 192.1.0.0/$j &> /dev/null
ret=$?

if [ $ret -ne 0 ]; then
chain_failed=$i
mask_failed=$j
break 2
fi
done
done

# We expect to exceed the maximum number of eRP entries, so that
# insertion eventually fails. Otherwise, the test should be adjusted to
# add more filters.
check_fail $ret "expected to exceed number of eRP entries"

for ((; i >= 1; i--)); do
for ((j=0; j <= $num_masks; j++)); do
tc filter del dev $h2 ingress chain $i protocol ip \
pref $i handle $j flower &> /dev/null
done
done

log_test "max eRP entries test ($tcflags). " \
"max chain $chain_failed, mask $mask_failed"
}

setup_prepare()
{
h1=${NETIFS[p1]}
Expand Down

0 comments on commit 6d6eeab

Please sign in to comment.