vbp: Rework probe state management #4115

Merged: 4 commits merged into varnishcache:master from vbp_rework on Jul 15, 2024

Conversation

@nigoroll (Member) commented Jun 3, 2024

I will be away until July, but I would appreciate reviews and collection of feedback in the meantime. Also, in the unlikely event that everyone is absolutely happy with it, I would not oppose a merge.


Every time I looked at the probe code in the past, my mind ended up twisted and confused. A probe could change the "enabled" state (tracking the temperature) or be removed at any time (unless the mtx is held), yet the code did not seem to reflect this.

We un-twist my mind by implementing probe states:

  • cold: reflects cold backend temperature
  • scheduled: probe is on binheap, waiting for its time to come
  • running: a task has been started to run the probe
  • cooling: running probe disabled
  • deleted: running probe removed (while cooling only)
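
A minimal sketch, purely for illustration, of one way the states above could be modelled in C; the struct layout and the VBP_STATE macro are assumptions of this sketch, not the actual cache_backend_probe.c definitions (the diffs further down only show that a state carries a printable name, vt->state->name):

/* Hypothetical representation of the probe states listed above. */
struct vbp_state {
	const char	*name;
};

#define VBP_STATE(n) \
	static const struct vbp_state vbp_state_ ## n[1] = {{ #n }}

VBP_STATE(cold);	/* reflects cold backend temperature */
VBP_STATE(scheduled);	/* on the binheap, waiting for its due time */
VBP_STATE(running);	/* a task has been started to run the probe */
VBP_STATE(cooling);	/* probe disabled while its task is running */
VBP_STATE(deleted);	/* probe removed while cooling */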

With this new scheme, we can now have (I think) a clean state diagram (see dot file):

(state diagram: doc/graphviz/cache_backend_probe.dot)

  • a probe begins in the cold state
  • from cold, it can either get removed or scheduled via VBP_Control(..., 1)
  • from scheduled, it can go back to cold (via VBP_Control(..., 0)) or be picked up by vbp_thread() to change to running (aka task started)
  • once the task finishes, it normally goes back to scheduled, but in the meantime it could have changed to cooling or further on to deleted, so vbp_task_complete() handles these cases and either transitions to cold or deletes the probe
  • if the task cannot be scheduled, the same handling happens
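
As a rough illustration only, the control-side transitions described above could look roughly like this, reusing the struct and state constants from the earlier sketch; the function and heap helper names are assumptions of this sketch, not the actual VBP_Control() code:

#include <assert.h>

struct vbp_target {			/* minimal stand-in for the real struct */
	const struct vbp_state	*state;
};

/* hypothetical helpers standing in for the binheap operations */
static void probe_heap_insert(struct vbp_target *vt) { (void)vt; }
static void probe_heap_delete(struct vbp_target *vt) { (void)vt; }

static void
probe_control(struct vbp_target *vt, int enable)
{
	/* the caller is assumed to hold the probe mtx */
	if (enable) {
		/* cold -> scheduled */
		assert(vt->state == vbp_state_cold);
		vt->state = vbp_state_scheduled;
		probe_heap_insert(vt);
	} else if (vt->state == vbp_state_scheduled) {
		/* scheduled -> cold */
		probe_heap_delete(vt);
		vt->state = vbp_state_cold;
	} else {
		/* running -> cooling: task completion finishes the transition */
		assert(vt->state == vbp_state_running);
		vt->state = vbp_state_cooling;
	}
}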

We now also remove running probes from the binheap to reduce complexity. I am not entirely sure whether there was a good reason for keeping running probes on the binheap; if there was, we might want to reconsider this change. But it is not obvious to me how deleting and reinserting a probe now, only to delete and reinsert it again later, would be better than simply deleting it now and reinserting it later.

Written to fix #4108 for good

@dridi (Member) left a comment

After a first review this looks good to me.

What I really appreciate is the first commit acting as an assessment of the current informal state machine, with a diff showing the root cause of #4108 in the third commit.

I have one nitpick and a cosmetic suggestion.

(resolved review comments on bin/varnishd/cache/cache_backend_probe.c and doc/graphviz/cache_backend_probe.dot)
@nigoroll (Member, Author) commented Jul 8, 2024

@dridi I took your suggestions for the next force-push, thank you!

(resolved review comments on bin/varnishd/cache/cache_backend_probe.c)
@nigoroll (Member, Author) commented Jul 8, 2024

another dridification round done

@nigoroll (Member, Author) commented Jul 9, 2024

During extended bugwash, @bsdphk suggested adding a cluster subgraph to the graphviz file to mark the possible states while a probe task is running, but I could not turn it into an improvement in clarity:

diff --git a/doc/graphviz/cache_backend_probe.dot b/doc/graphviz/cache_backend_probe.dot
index 9f289b162..6b93e2b6b 100644
--- a/doc/graphviz/cache_backend_probe.dot
+++ b/doc/graphviz/cache_backend_probe.dot
@@ -3,9 +3,11 @@
 digraph cache_backend_probe {
        ALLOC
        scheduled
-       running
+       subgraph cluster_vbptask {
+               running
+               cooling # going cold while task runs
+       }
        cold
-       cooling # going cold while task runs
        deleted # from cooling, removed while task runs
        FREE
 

[rendered graph of the state diagram with the cluster subgraph]

I did add a commit with a rename he suggested.

@nigoroll (Member, Author) commented:

ftr, this was approved by today's bugwash

@nigoroll (Member, Author) commented:

FTR: The first dridification round had introduced a regression. The change was:

@@ -457,20 +457,16 @@ vbp_task_complete(struct vbp_target *vt)
 
        assert(vt->heap_idx == VBH_NOIDX);
 
-       if (vt->state == vbp_state_scheduled) {
-               WRONG("vbp_state_scheduled");
-       } else if (vt->state == vbp_state_running) {
+       if (vt->state == vbp_state_running) {
                vt->state = vbp_state_scheduled;
                vt->due = VTIM_real() + vt->interval;
                vbp_heap_insert(vt);
                vt = NULL;
-       } else if (vt->state == vbp_state_cold) {
-               WRONG("vbp_state_cold");
        } else if (vt->state == vbp_state_cooling) {
                vt->state = vbp_state_cold;
                vt = NULL;
        } else {
-               assert(vt->state == vbp_state_deleted);
+               WRONG(vt->state->name);
        }
        return (vt);
 }

Here, I failed to keep accepting vbp_state_deleted as valid.
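
The fix presumably restores an explicit branch for that state ahead of the WRONG() catch-all; a sketch of how the tail of vbp_task_complete() would then read (the actual follow-up commit may differ):

	} else if (vt->state == vbp_state_cooling) {
		vt->state = vbp_state_cold;
		vt = NULL;
	} else if (vt->state == vbp_state_deleted) {
		/* removed while cooling: return vt so the caller frees it */
	} else {
		WRONG(vt->state->name);
	}
	return (vt);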

Every time I looked at the probe code, my mind ended up twisted and confused. A
probe could change the "enabled" state (tracking the temperature) and be removed
at any time (unless the mtx is held), yet the code did not seem to reflect this.

We un-twist my mind by completing the transition to probe states and adding a
chain of two states for the case that a probe is controlled/deleted while its
task is running:

cooling: running probe disabled
deleted: running probe removed (while cooling only)

With this new scheme, we can now have (I think) a clean state diagram (see dot
file):

- a probe begins in the cold state
- from cold, it can either get removed or scheduled via VBP_Control()
- from scheduled, it can go back to cold (via VBP_Control()) or
  be picked up by vbp_thread() to change to running (aka task started)
- once the task finishes, it normally goes back to scheduled,
  but in the meantime it could have changed to cooling or deleted,
  so vbp_task_complete() handles these cases and either transitions to cold
  or deletes the probe
- if the task cannot be scheduled, the same handling happens

We now also remove running probes from the binheap to remove complexity.

Fixes varnishcache#4108 for good
This background thread does not run the actual probes, but schedules tasks which
do (vbp_task).

Rename suggested by phk.
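
Purely to make that thread/task split concrete, here is a toy dispatcher in plain pthreads; every name in it is made up for this illustration and it is not the Varnish implementation, which pops due probes off a binheap and starts a task for each:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static char backend_name[] = "backend1";

/* stands in for vbp_task(): runs one probe and returns */
static void *
probe_task(void *priv)
{
	printf("probing %s\n", (char *)priv);
	return (NULL);
}

/* stands in for vbp_thread(): only schedules work, never probes itself */
static void *
probe_dispatcher(void *priv)
{
	pthread_t task;
	int i;

	(void)priv;
	for (i = 0; i < 3; i++) {
		sleep(1);	/* the real code waits for the next due time */
		pthread_create(&task, NULL, probe_task, backend_name);
		pthread_detach(task);
	}
	return (NULL);
}

int
main(void)
{
	pthread_t disp;

	pthread_create(&disp, NULL, probe_dispatcher, NULL);
	pthread_join(disp, NULL);
	sleep(1);	/* give the last detached task time to finish */
	return (0);
}
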
@nigoroll merged commit fc2445b into varnishcache:master on Jul 15, 2024
1 of 11 checks passed
@nigoroll deleted the vbp_rework branch on July 15, 2024 18:28
@delthas (Contributor) commented Sep 25, 2024

FWIW, I'm (rarely) getting the following on reload (trying to create a MWE, but it's difficult):

Wrong turn at cache/cache_backend_probe.c:710:
running
Backtrace:
  0x560448ae1412: varnishd(VBT_format+0x72) [0x560448ae1412]
  0x560448a5f3e1: varnishd(pan_ic+545) [0x560448a5f3e1]
  0x560448ae0965: varnishd(VAS_Fail+0x45) [0x560448ae0965]
  0x560448a34b25: varnishd(VBP_Control+583) [0x560448a34b25]
  0x560448a6edab: varnishd(vcl_BackendEvent+107) [0x560448a6edab]
  0x560448a7034d: varnishd(vcl_set_state+589) [0x560448a7034d]
  0x560448a713b9: varnishd(vcl_cli_load+315) [0x560448a713b9]
  0x560448ae31b5: varnishd(cls_exec+629) [0x560448ae31b5]
  0x560448ae37a3: varnishd(VCLS_Poll+0x333) [0x560448ae37a3]
  0x560448a3aec4: varnishd(CLI_Run+0x64) [0x560448a3aec4]
  0x560448a5a3aa: varnishd(child_main+0x22a) [0x560448a5a3aa]
  0x560448aa4cfd: varnishd(mgt_launch_child+1229) [0x560448aa4cfd]
  0x560448aa5cab: varnishd(MCH_Start_Child+0xb) [0x560448aa5cab]
  0x560448a2ce26: varnishd(main+0x2716) [0x560448a2ce26]
  0x7f3d07682b45: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f3d07682b45]
  0x560448a2cf48: varnishd(_start+41) [0x560448a2cf48]

Also got this once (same backtrace, but the state this time is scheduled):

Wrong turn at cache/cache_backend_probe.c:710:
scheduled
Backtrace:
0x55f4660e1412: /usr/sbin/varnishd(VBT_format+0x72) [0x55f4660e1412]
0x55f46605f3e1: /usr/sbin/varnishd(pan_ic+545) [0x55f46605f3e1]
0x55f4660e0965: /usr/sbin/varnishd(VAS_Fail+0x45) [0x55f4660e0965]
0x55f466034b25: /usr/sbin/varnishd(VBP_Control+583) [0x55f466034b25]
0x55f46606edab: /usr/sbin/varnishd(vcl_BackendEvent+107) [0x55f46606edab]
0x55f46607034d: /usr/sbin/varnishd(vcl_set_state+589) [0x55f46607034d]
0x55f4660713b9: /usr/sbin/varnishd(vcl_cli_load+315) [0x55f4660713b9]
0x55f4660e31b5: /usr/sbin/varnishd(cls_exec+629) [0x55f4660e31b5]
0x55f4660e37a3: /usr/sbin/varnishd(VCLS_Poll+0x333) [0x55f4660e37a3]
0x55f46603aec4: /usr/sbin/varnishd(CLI_Run+0x64) [0x55f46603aec4]
0x55f46605a3aa: /usr/sbin/varnishd(child_main+0x22a) [0x55f46605a3aa]
0x55f4660a4cfd: /usr/sbin/varnishd(mgt_launch_child+1229) [0x55f4660a4cfd]
0x55f4660a5cab: /usr/sbin/varnishd(MCH_Start_Child+0xb) [0x55f4660a5cab]
0x55f46602ce26: /usr/sbin/varnishd(main+0x2716) [0x55f46602ce26]
0x7fc59f7f0b45: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fc59f7f0b45]
0x55f46602cf48: /usr/sbin/varnishd(+0x2cf48) [0x55f46602cf48]

(That is, the WRONG(vt->state->name) panic in VBP_Control() fires with the state being running or scheduled, on enable.)

Edit: finally managed to track it down to a MWE. See #4199

Successfully merging this pull request may close these issues.

Assert error in VBP_Control(): Condition(vt->heap_idx == VBH_NOIDX) not true.