Why does "Stop"-Button not work? #624

Open
Obaq-web opened this issue Feb 12, 2023 · 15 comments

@Obaq-web

Stopping a script with the "Stop" button does not stop my scripts. The terminal says "Stopped by User", but the script keeps on running. I see a "Kill" button where the "Stop" button was before, but the "Kill" button is gray and cannot be clicked.
Can anyone help?

@bugy
Owner

bugy commented Feb 12, 2023

Hi @Obaq-web, your script ignores the SIGTERM signal. Stop just sends a signal to the process, asking it to finish gracefully.
If that doesn't work, you can use Kill. It should become available after 5-10 sec.
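
The stop-then-kill escalation described above can be sketched as follows. This is a minimal illustration, not script-server's actual implementation: `stop_then_kill`, `STUBBORN_CHILD` and `demo` are hypothetical names, and the behavior shown assumes a POSIX system, where `terminate()` sends SIGTERM and `kill()` sends SIGKILL.

```python
import subprocess
import sys


def stop_then_kill(process, grace_seconds=5.0):
    """Ask a process to exit, then force-kill it if it ignores the request."""
    process.terminate()  # SIGTERM on POSIX: a request the process may ignore
    try:
        process.wait(timeout=grace_seconds)
        return "stopped"
    except subprocess.TimeoutExpired:
        process.kill()  # SIGKILL on POSIX: cannot be caught or ignored
        process.wait()
        return "killed"


# Hypothetical child that ignores SIGTERM, standing in for a script
# that keeps running after "Stop" is pressed.
STUBBORN_CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def demo():
    child = subprocess.Popen(
        [sys.executable, "-c", STUBBORN_CHILD],
        stdout=subprocess.PIPE,
        text=True,
    )
    child.stdout.readline()  # block until the child ignores SIGTERM
    result = stop_then_kill(child, grace_seconds=1.0)
    child.stdout.close()
    return result
```

On POSIX, `demo()` goes through the force-kill branch because the child never reacts to SIGTERM. A script that handles SIGTERM, or leaves the default handler in place, would exit within the grace period instead.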

@Obaq-web
Author

Hello bugy, I don't mind stopping my scripts with "kill". However the button stays gray, no matter how long I wait. Any ideas how I can fix this?

@bugy
Owner

bugy commented Feb 12, 2023

Probably it's a bug in the new version. Which version are you using?

@Obaq-web
Author

Version: 1.17.1

I'll test this issue with an older version soon.

@Stjefan

Stjefan commented Feb 14, 2023

I have a similar problem (button stays gray), but in the log I can see that the corresponding kill POST request is sent. However the task is not killed.
The problem in my case is that I am running a subtask in the corresponding script. As I am using Windows, it must be killed via the corresponding kill command. I guess the problem is that the check in
ExecutionService.py:
if execution_id in self._executors: self._executors[execution_id].kill()

is false (which is correct for the main task but not for the subtask).

Maybe someone knows a workaround for this Windows-specific problem.

@bugy
Owner

bugy commented Feb 14, 2023

Hi @Stjefan the button shouldn't stay gray, that's the main issue.

The problem with Windows is that there is no easy way in Python to gracefully kill child processes.
However, a forceful kill (which should be available once the button is enabled) should work even on Windows and kill child processes as well.

Could you try to run a script on Windows and kill it via this command:
taskkill /T /PID process_pid

And check whether the children are killed as well?
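
For anyone who wants to run the same check from Python rather than from a console, here is a minimal sketch. The helper names `build_taskkill_command` and `kill_process_tree` are hypothetical and not part of script-server. `/T` (terminate the process tree) and `/F` (force termination) are standard taskkill options. The command quoted above omits `/F`, so the sketch leaves it off by default.

```python
import subprocess
import sys


def build_taskkill_command(pid, force=False):
    """Build a taskkill command line that targets a process and its children."""
    command = ["taskkill"]
    if force:
        command.append("/F")  # forcefully terminate instead of requesting exit
    command += ["/T", "/PID", str(pid)]  # /T includes child processes
    return command


def kill_process_tree(pid, force=False):
    """Run taskkill for the given PID; returns True if taskkill reported success."""
    if sys.platform != "win32":
        raise OSError("taskkill is only available on Windows")
    completed = subprocess.run(
        build_taskkill_command(pid, force),
        capture_output=True,
        text=True,
    )
    return completed.returncode == 0
```

Calling `kill_process_tree(pid)` with the PID of the top-level process and then checking the task list should show whether the children went away too.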

@bugy
Owner

bugy commented Feb 17, 2023

Hi @Obaq-web @Stjefan, I checked on the latest dev version, and the kill button works fine for me. It is indeed grey, but it's clickable. I will change the button's color to make it more obvious that it's active.
But could you confirm that it's not clickable for you?
Also, after you click "Stop", do you see a timer on the button?

@bugy bugy added the bug label Feb 17, 2023
@Stjefan

Stjefan commented Feb 17, 2023

I changed my machine for a different reason and now it's working fine. I will check it on the other machine soon.
FYI: When it was not working, "Stop" was clickable, then the timer appeared and then the "Kill" button appeared. "Kill" was clickable (I saw the POST request in the backend log), but there was no visible response to the click in the GUI.
taskkill also worked fine, but somehow the server falsely thought that it had already stopped the script and did not execute the taskkill command. In the script I wrote timestamps to a file, which continued after clicking stop, so I verified that the task was still running.

@bugy
Owner

bugy commented Feb 17, 2023

Hi @Stjefan, could you share a script that I could use to reproduce the issue on my machine?
A demo script that is not related to your work would be more than enough for me.

@Stjefan

Stjefan commented Feb 17, 2023

Sure, it's a very simple script:

from time import sleep
from datetime import datetime
import sys

print("This is the name of the script: ", sys.argv[0])
print("Number of arguments: ", len(sys.argv))
print("The arguments are: ", sys.argv)

sleep(1)

if __name__ == '__main__':
    # Loop forever, appending a timestamp to a file so it is easy to
    # check from outside whether the process is still running.
    while True:
        sleep(2)
        print("doing something in a loop ...")
        with open('somefile.txt', 'a') as the_file:
            the_file.write(f'{datetime.now()}\n')

    # Never reached: the loop above has no exit condition.
    print("End of the program. I was killed gracefully :)")

I finally found the difference between my two machines.
On my previous machine I started the script via its path and let Windows decide how to interpret the .py file.
Then stopping does not work.
On my current machine, I start the script via 'py path/2/file'. Then stopping works fine.
When I use 'py path/2/file' on the previous machine, stopping works as well.
So it should be a problem with the default program that's used to start .py files and not your great code :).

@bugy
Owner

bugy commented Feb 17, 2023

Hi @Stjefan, thanks a lot.
I think it could still be improved in script server. If the stop button is there, it should work in all cases :)

By the way, regarding:

taskkill also worked fine but somehow the server falsely thought that it already stopped the script and did not execute the taskkill command.

So you executed taskkill manually, but the real process didn't stop, right? However, script server considered it stopped. Is that the correct understanding?
The /T flag was supposed to kill all the child processes as well :(
According to their docs

@Stjefan

Stjefan commented Feb 17, 2023

No, taskkill worked as expected and /T killed the subtasks as well.
But script-server checks the following before running taskkill:
if execution_id in self._executors: (see ExecutionService.py)
And this condition is false in my case, so the taskkill command is never invoked.

@bugy
Owner

bugy commented Feb 17, 2023

Hi @Stjefan

if execution_id in self._executors:
should always return true, because elements are never removed from self._executors (only on server restart)
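
The point about the membership check can be illustrated with a small sketch. `FakeExecutor` and `ExecutorRegistry` are hypothetical classes written for this comment and are not the actual ExecutionService code. They only assume what is stated above: executors are stored in a dict keyed by execution id and are never removed while the server is running.

```python
class FakeExecutor:
    """Hypothetical stand-in for a running script; not script-server code."""

    def __init__(self):
        self.finished = False
        self.kill_calls = 0

    def kill(self):
        self.kill_calls += 1
        self.finished = True


class ExecutorRegistry:
    """Hypothetical registry that never removes entries once they are added."""

    def __init__(self):
        self._executors = {}

    def start(self, execution_id):
        executor = FakeExecutor()
        self._executors[execution_id] = executor
        return executor

    def kill(self, execution_id):
        # Same shape as the check quoted in this thread.
        if execution_id in self._executors:
            self._executors[execution_id].kill()
            return True
        return False
```

Under that assumption, the check is true for every id that was ever started, including finished executions, and false only for ids the registry has never seen. If the kill request really is skipped, the id arriving with the request may be worth logging and comparing against the keys.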

@jost-balent

jost-balent commented Jun 26, 2023

Hi everybody, I have a similar problem with the STOP button on v1.16.0. I am running this bash script:

#!/bin/bash
container_id=$(docker run -d --rm alpine sh -c 'for i in $(seq 1 100); do echo "$i"; sleep 1; done')
docker attach "$container_id"

If I run this script manually and press Ctrl+C (SIGINT), the docker container is also stopped and everything terminates as expected. If I run the same script via bugy server, pressing the STOP button simply outputs >> STOPPED BY USER, and the script keeps running and its output is still displayed. After a countdown of 5s I can press the KILL button (which is normal for the bugy server). The output then stops and the text " >> KILLED " appears, as expected. The problem is that the child process (in this case docker run) is still running on the server. It seems the KILL button properly terminates the triggering script but does not kill its child processes. This is the output with annotations:
[image: annotated script output]

Thank you for your time and consideration, help would be greatly appreciated.
Best regards,

Jost

@bugy bugy added this to the 1.19.0 milestone Aug 5, 2023
@bugy
Owner

bugy commented Aug 5, 2023

Hi @jost-balent sorry, I missed your question :( I'll leave an answer anyway. There are multiple things here:

  • you are starting the container with the -d option (detached), so the container runs in the "background". It is not a child process from the operating system's point of view; docker run is not linked to a parent
  • sending Ctrl+C to docker attach is debatable, see "Control-C while running docker attach should not terminate the Docker container" (moby/moby#2855)
  • script server sends the SIGTERM signal to a running process. However, docker attach forwards only SIGINT (Ctrl+C). SIGINT would probably be a better option, but for 99.9% of scripts it shouldn't matter. I'm not willing to change this, so as not to break existing installations
  • for me, the "stop" button worked in the sense that the script finished (but the container was still running), so I didn't have to use "kill". However, calling kill pid from a terminal was ignored by docker

To sum it up: this could be fixed in script server by sending SIGINT instead of SIGTERM. However, I think this problem is a fairly rare use case, and it could be worked around by using different docker commands.

If there are more people experiencing this problem, please let me know
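
As a possible workaround on the script side, a small wrapper could translate the SIGTERM it receives into a SIGINT for the wrapped command. This is only a sketch under stated assumptions: the names `make_sigint_forwarder` and `run_translating_sigterm` are hypothetical, it relies on POSIX-only features (`os.killpg`, `start_new_session`), and it has not been verified against docker attach. Whether the container actually stops still depends on how docker handles the forwarded signal.

```python
import os
import signal
import subprocess


def make_sigint_forwarder(child):
    """Return a signal handler that sends SIGINT to the child's process group."""
    def forward(signum, frame):
        if child.poll() is None:  # child has not exited yet
            try:
                os.killpg(child.pid, signal.SIGINT)
            except ProcessLookupError:
                pass  # the group disappeared between the check and the call
    return forward


def run_translating_sigterm(command):
    """Run command and answer SIGTERM by sending SIGINT, like Ctrl+C would."""
    # start_new_session=True makes the child a process-group leader, so
    # child.pid is also the id of the group that receives SIGINT.
    child = subprocess.Popen(command, start_new_session=True)
    previous = signal.signal(signal.SIGTERM, make_sigint_forwarder(child))
    try:
        return child.wait()
    finally:
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)
```

A wrapper script would call `run_translating_sigterm(["./my_script.sh"])` (the script path is a placeholder) and be registered in script server in place of the original script. Signalling the whole process group mimics what a terminal does on Ctrl+C, so commands started by the bash script receive the SIGINT as well.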

@bugy bugy removed this from the 1.19.0 milestone Aug 5, 2023
4 participants