zebra: fix an issue: dplane failed to limit the maximum length of the… #16067

Open
wants to merge 1 commit into
base: master

Conversation

zice312963205
Contributor

Under certain conditions the dplane fails to limit the number of contexts (ctx) that are temporarily cached. When a ctx has been processed from its work queue, it is moved to another queue (rib_dplane_q), but its memory is not released at that point; it is released only after rib_process_dplane_results has processed it. However, zdplane_info.dg_routes_queued is decremented before that happens, so meta_queue_process mistakenly concludes that space has been freed and allows new ctxs to be created and enqueued. As a result, the number of ctxs in the system is not bounded by the value returned by dplane_get_in_queue_limit, as intended.

The fix changes when zdplane_info.dg_routes_queued is decremented: the counter is decreased only once rib_process_dplane_results has finished processing a ctx and the release of that ctx's memory has begun. This ensures that space is actually available before new ctxs are enqueued.
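
To illustrate the accounting problem described above, here is a minimal toy model, not FRR code. The struct, the function `toy_peak_live_ctxs`, and the per-tick rates are all hypothetical; only the names in the comments come from this description. It assumes the results handler frees ctxs more slowly than the dplane finishes them, and compares decrementing the counter when a ctx reaches the results queue with decrementing it when the ctx is freed. It does not model startup behaviour or any other part of zebra.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: field names mirror the PR text, not real FRR structures. */
struct toy_dplane {
	int routes_queued; /* stands in for zdplane_info.dg_routes_queued */
	int in_queue;      /* ctxs waiting for the dplane to process them */
	int results_q;     /* ctxs parked on the results queue (rib_dplane_q) */
	int live;          /* ctxs allocated and not yet freed */
	int peak_live;     /* highest value of 'live' seen so far */
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Run the toy model for 'ticks' steps and return the peak number of
 * allocated ctxs. If decrement_on_free is false, the counter drops as soon
 * as a ctx reaches the results queue; if true, it drops only when the ctx
 * is freed.
 */
int toy_peak_live_ctxs(int limit, int ticks, bool decrement_on_free)
{
	struct toy_dplane d = {0};
	const int dplane_rate = 4; /* assumed: ctxs the dplane finishes per tick */
	const int result_rate = 1; /* assumed: ctxs the results handler frees per tick */

	for (int t = 0; t < ticks; t++) {
		/* Producer: admit new ctxs while the counter is below the limit. */
		while (d.routes_queued < limit) {
			d.routes_queued++;
			d.in_queue++;
			d.live++;
		}
		if (d.live > d.peak_live)
			d.peak_live = d.live;

		/* dplane: finished ctxs move to the results queue, still allocated. */
		int done = min_int(dplane_rate, d.in_queue);
		d.in_queue -= done;
		d.results_q += done;
		if (!decrement_on_free)
			d.routes_queued -= done;

		/* Results handler: this is where ctx memory is actually released. */
		int freed = min_int(result_rate, d.results_q);
		d.results_q -= freed;
		d.live -= freed;
		if (decrement_on_free)
			d.routes_queued -= freed;

		/* Sanity checks on the model's own bookkeeping. */
		assert(d.routes_queued >= 0);
		assert(d.live == d.in_queue + d.results_q);
	}
	return d.peak_live;
}
```

In this model, `toy_peak_live_ctxs(10, 100, true)` never exceeds the limit of 10, while `toy_peak_live_ctxs(10, 100, false)` keeps growing with the number of ticks, because ctxs pile up on the results queue without being counted.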

@zice312963205
Contributor Author

Related issue: #15016

… queue

Signed-off-by: hanyu.zly <[email protected]>
@donaldsharp
Member

This fundamentally breaks route installation on startup. This is a no-go from my perspective at this moment.

This pull request has conflicts, please resolve those before we can evaluate the pull request.
