
test_start event triggered multiple times on workers #1986

Closed
jacksmith15 opened this issue Jan 27, 2022 · 8 comments · Fixed by #2072
jacksmith15 commented Jan 27, 2022

Describe the bug

When running Locust in distributed mode, the test_start handler gets triggered every time new users are spawned, rather than once at the start of the test.

This only seems to occur if there is a test_start handler which takes longer than the interval between spawn events (I think 1s?).

This is problematic if the test needs to load test data from an external source and share it between the users.

To give an example of a concrete use case:

  • I want the test to access load logs from an external source and replay them.
  • These logs are valid for the lifetime of a test (rather than process or user) - so they can be shared between users on a worker.
  • When I trigger the test the log data-store gets hammered by duplicated requests (load testing the wrong thing! 😅).

Expected behavior

The test_start event handlers should be triggered once when the test begins, allowing e.g. expensive test data loading operations to occur only once.

Actual behavior

The test_start event handlers are triggered multiple times, with all but the final invocation being killed prematurely.

Steps to reproduce

Given the following locustfile:

import time

from locust import HttpUser, events, runners, task


COUNT = 0

@events.test_start.add_listener
def test_start_worker(environment, **kwargs):
    if not isinstance(environment.runner, runners.WorkerRunner):
        return
    global COUNT
    COUNT += 1
    print(f"'test_start' triggered {COUNT} times.")
    time.sleep(1)  # Same result with `gevent.sleep`. Less than 1 second works fine, >=1 second causes the described behaviour.


class HelloWorldUser(HttpUser):
    @task
    def hello_world(self):
        self.client.get("/hello")
        self.client.get("/world")

And the following docker-compose.yml

version: '3'

services:
  master:
    image: locustio/locust:2.5.1
    ports:
    - "8089:8089"
    volumes:
    - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --master -H http://master:8089 --headless -u 10 -r 1 --run-time 1m --expect-workers 1

  worker:
    image: locustio/locust:2.5.1
    volumes:
    - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --worker --master-host master

Run

docker-compose up -d

And the logs for the worker will display:

worker_1  | [2022-01-27 11:56:19,677] 26ff2afeca30/INFO/locust.main: Starting Locust 2.5.1
worker_1  | 'test_start' triggered 1 times.
worker_1  | 'test_start' triggered 2 times.
worker_1  | 'test_start' triggered 3 times.
worker_1  | 'test_start' triggered 4 times.
worker_1  | 'test_start' triggered 5 times.
worker_1  | 'test_start' triggered 6 times.
worker_1  | 'test_start' triggered 7 times.
worker_1  | 'test_start' triggered 8 times.
worker_1  | 'test_start' triggered 9 times.
worker_1  | 'test_start' triggered 10 times.

Environment

  • OS: Ubuntu 21.10
  • Python version: 3.9.9
  • Locust version: 2.5.1
  • Locust command line that you ran: As described by docker-compose.yml
  • Locust file contents (anonymized if necessary): Shown above
  • docker-compose version 1.27.4
  • Docker version 20.10.7

cyberw commented Jan 27, 2022

Interesting. Maybe something for you @mboutet ?


jacksmith15 commented Jan 27, 2022

📝 My current workaround for this is to spawn a background greenlet in the event handler, so that the event handler returns quickly and is not retriggered.

In the above example this would be:

import gevent  # in addition to the imports in the original locustfile

@events.test_start.add_listener
def test_start_worker(environment, **kwargs):
    if not isinstance(environment.runner, runners.WorkerRunner):
        return
    global COUNT
    COUNT += 1
    print(f"'test_start' triggered {COUNT} times.")
    gevent.spawn_later(0, time.sleep, 1)  # return immediately; do the slow work in a background greenlet

Naturally this means any user code which depends on this completing needs to be defensive, since the user will start before this completes.

Leaving this here as it may help others!
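The "defensive user code" this workaround requires can be sketched with a readiness flag: the background loader sets an event once the data is available, and anything depending on the data waits on it first. The sketch below uses threading.Event from the standard library as a stand-in for gevent.event.Event (which exposes the same set/wait interface); names and data are illustrative.

```python
# Sketch: background loader signals readiness via an event; user code
# waits for the signal before touching the shared data. Illustrative
# names; threading.Event stands in for gevent.event.Event.

import threading

DATA_READY = threading.Event()
TEST_DATA = {}

def background_load():
    # The expensive fetch goes here.
    TEST_DATA["logs"] = ["GET /a", "GET /b"]
    DATA_READY.set()

# In the listener: spawn this instead of blocking, e.g. gevent.spawn(background_load).
t = threading.Thread(target=background_load)
t.start()

# In user code: block until the data is actually available.
DATA_READY.wait(timeout=5)
t.join()
print(sorted(TEST_DATA))  # ['logs']
```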


mboutet commented Jan 30, 2022

Sorry for the delay, I'll try to take a look, but this seems strange.


github-actions bot commented Apr 1, 2022

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions bot added the stale label Apr 1, 2022
@github-actions

This issue was closed because it has been stalled for 10 days with no activity. This does not necessarily mean that the issue is bad, but it most likely means that nobody is willing to take the time to fix it. If you have found Locust useful, then consider contributing a fix yourself!

cyberw reopened this Apr 12, 2022

cyberw commented Apr 12, 2022

@mboutet any chance you can have a look?


mboutet commented Apr 13, 2022

This is definitely an undesirable behaviour. However, having a blocking test_start hook running for 1s or more is tricky to handle with the current code:

locust/locust/runners.py

Lines 1097 to 1113 in 300b93e

def start_worker(self, user_classes_count: Dict[str, int], **kwargs):
    """
    Start running a load test as a worker
    :param user_classes_count: Users to run
    """
    self.target_user_classes_count = user_classes_count
    self.target_user_count = sum(user_classes_count.values())
    if self.worker_state != STATE_RUNNING and self.worker_state != STATE_SPAWNING:
        self.stats.clear_all()
        self.exceptions = {}
        self.cpu_warning_emitted = False
        self.worker_cpu_warning_emitted = False
        self.environment._filter_tasks_by_tags()
        self.environment.events.test_start.fire(environment=self.environment)
    self.worker_state = STATE_SPAWNING

This method is called every time a worker receives a spawn message from the master as I explained in #1856 (comment). Therefore, if the test_start hook does not finish before the next spawn message, the line self.worker_state = STATE_SPAWNING is never reached.
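The timing interplay described here can be modeled with a toy, single-threaded simulation (no gevent; all names are made up). It only mirrors the state-guard logic, not the real greenlet-kill mechanics: the SPAWNING state is reached only if the hook finishes before the next spawn message arrives, so a hook that outlives the message interval causes every message to refire the event.

```python
# Toy model of the guard in WorkerRunner.start_worker: refire unless the
# state is already SPAWNING, and only reach SPAWNING if the hook would
# finish before the next spawn message. Purely illustrative.

STATE_INIT, STATE_SPAWNING = "init", "spawning"

class ToyWorker:
    def __init__(self, hook_duration, msg_interval):
        self.state = STATE_INIT
        self.hook_duration = hook_duration  # how long test_start "runs"
        self.msg_interval = msg_interval    # time between spawn messages
        self.fires = 0
        self._hook_done_at = None           # simulated clock time

    def on_spawn_message(self, now):
        # Mirror of the real guard: refire unless state is SPAWNING.
        if self.state != STATE_SPAWNING:
            self.fires += 1
            self._hook_done_at = now + self.hook_duration
        # SPAWNING is only reached if the hook finishes before the next
        # message (in the real code, the next message kills the greenlet).
        if self._hook_done_at is not None and self._hook_done_at <= now + self.msg_interval:
            self.state = STATE_SPAWNING

slow = ToyWorker(hook_duration=1.0, msg_interval=0.5)  # hook outlives the interval
fast = ToyWorker(hook_duration=0.2, msg_interval=0.5)  # hook finishes in time
for i in range(10):
    slow.on_spawn_message(now=i * 0.5)
    fast.on_spawn_message(now=i * 0.5)

print(slow.fires)  # 10 -- refired on every message
print(fast.fires)  # 1  -- fired once, as intended
```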

We might be able to improve the behaviour by relocating the self.environment.events.test_start.fire just before the if self.spawning_greenlet: in:

locust/locust/runners.py

Lines 1194 to 1200 in 300b93e

if self.spawning_greenlet:
    # kill existing spawning greenlet before we launch new one
    self.spawning_greenlet.kill(block=True)
self.spawning_greenlet = self.greenlet.spawn(lambda: self.start_worker(job["user_classes_count"]))
self.spawning_greenlet.link_exception(greenlet_exception_handler)
last_received_spawn_timestamp = job["timestamp"]

However, a long-running test_start hook will still interfere with the ramp-up/down because the spawn messages from the master won't be processed until the hook is done running.

I'll open a draft PR with the change I'm proposing.


mboutet commented Apr 13, 2022

@jacksmith15, thanks for the detailed steps for reproducing the issue. Much appreciated!

github-actions bot removed the stale label Apr 13, 2022