Stop execute_shell going interactive #1327

Closed
1 task done
claytondukes opened this issue Apr 14, 2023 · 24 comments

Comments

@claytondukes

Duplicates

  • I have searched the existing issues

Steps to reproduce 🕹

I ran a setup last night to assist in writing some proposals. It got to a point where it thought it needed to edit files and ran gedit (and it's not even running on a desktop OS), then tried to run nano - neither of which would work since it's not capable of using tools like that.

Is there some way to constrain or tell it that using interactive tools like that won't work?

Current behavior 😯

Apparently json was fixed. 
NEXT ACTION:  COMMAND = write_to_file ARGUMENTS = {'file': 'proposal.txt', 'text': 'Proposal Outline:\n1. Abstract\n2. Background\n3. Methodology\n4. Expected Outcomes\n5. Budget'}
SYSTEM:  Command write_to_file returned: File written to successfully.
SBIR AI THOUGHTS: 
REASONING: 
CRITICISM: 
Attempting to fix JSON by finding outermost brackets 
Apparently json was fixed. 
NEXT ACTION:  COMMAND = execute_shell ARGUMENTS = {'command_line': 'gedit proposal.txt'}
Executing command 'gedit proposal.txt' in working directory '/app/auto_gpt_workspace'
SYSTEM:  Command execute_shell returned: STDOUT: b'' STDERR: b'/bin/sh: 1: gedit: not found\n'
SBIR AI THOUGHTS: 
REASONING: 
CRITICISM: 
Attempting to fix JSON by finding outermost brackets 
Apparently json was fixed. 
NEXT ACTION:  COMMAND = execute_shell ARGUMENTS = {'command_line': 'nano proposal.txt'}
Executing command 'nano proposal.txt' in working directory '/app/auto_gpt_workspace'

Expected behavior 🤔

The AI should know, or even infer, that it can't use programs like that.

Your prompt 📝

ai_goals:
  - a Whitepaper for XXX and save the file
  - Create a proposal for the YYY program on "Foo Bar Baz"
  - Do not create files with placeholder text, use write_to_file when you have actual data to put in them
  - keep track of any files that you have created. If you try to access a file that you think you created but it isn't there, create it. If you are still unable to create the file after 2 tries, log an error and discontinue your tasks.
  - if you fail to parse AI output, show the actual AI output for debugging purposes
  - if you start a GPT agent, keep in mind that the GPT agent does not have information after the year 2021 and does not have internet access
  - Learn AAA using  https://<foo>
  - Learn about BBB using https://<bar>
  - Learn about CCC using https://<baz/file.pdf>
  - Find topics that XXX can solve (or aligns with)
  - Useful information for your research;
    XXX's Documentation, https://docs.x.net
    XXX's Website, https://www.x.net
    XXX's API Docs, https://api.x.io

ai_name: Proposal AI
ai_role: an AI designed to help XXX do blah. Specifically, to offer a training course that helps the <redacted> learn, build, deploy, and test <redacted>
@Slowly-Grokking
Contributor

Slowly-Grokking commented Apr 14, 2023

lol, yeah i just ran into it trying to use nano as well. I forced nano to quit, and was able to resume though.

@JamestheDon

killall nano

@gondar7

gondar7 commented Apr 16, 2023

Ya, a few different interactive shell commands seem to break the loop. Hope there's a solution.

@swarm4it

though it would be nice if it could remote control an editor so we can watch it write the code ;-)

@KapDEK

KapDEK commented Apr 18, 2023

yeah, i just ran into this too, and went to see if it was fixed, I wonder if it is possible to give AutoGPT the ability to use nano

@gondar7

gondar7 commented Apr 18, 2023

I think the new popen shell is for interactive commands now. going to try it out

@gondar7

gondar7 commented Apr 18, 2023

It seems to work, as in it leaves the subprocess open, but it doesn't seem to interact with it after it opens it. In the code, it looks like it doesn't accept output. Not sure this could work for interactive commands; it might need something more like pexpect?

@TheNitzel

Same issue here. So far I told it about the problem, it then installed notepad++ (which requires a Y prompt it did not put in the choco command and so I had to type it in for it) and then it opened Notepad++ and also could not interact with it. Finally it used an echo command to write to the file which did work.......and then resumed attempting to use nano ignoring that it was failing to write to the file. Until this is fixed will test out telling it to either use the write_to_file command exclusively or to use echo to create/append texts.

@bassie661

@TheNitzel: may I ask how you told Auto-GPT about the problem? Via shell? I tried to add "not to use nano or vi" to the initial goals, but that doesn't seem to work. I don't understand how Auto-GPT is not able to write in or create a file; it seems to me it's doing that all the time. I mean, it's successfully creating and updating auto-gpt.json, for instance.

@Slowly-Grokking
Contributor

Slowly-Grokking commented Apr 25, 2023

I don't understand how Auto-GPT is not able to write in or create a file, seems to me it's doing that all the time, I mean it's successfully creating and updating auto-gpt.json for instance.

There isn't a command that AUTOGPT AI is using to update the auto-gpt.json file, that's hardcoded in the script. write_to_file works as does append. Asking it to use interactive shells or GUIs is like asking a blind and deaf person to be a stenographer at this point in time.

Until OCR, mouse and keyboard emulation are supported, it can't interact with any GUI. Everything it can do needs to be done via command line.

how you told Auto-GPT about the problem?

You can try various prompting strategies, but if you see "Next Action: Execute Shell" and it's trying to use nano/vi or anything you don't want it to do, give it human feedback.

@bassie661

OK thanks, I never asked it to use interactive shells or GUIs; it's going on a GUI tour all by itself. Hmm, I will put some more time into reading the issues and documentation. I haven't got my mind around the exact way Auto-GPT works yet, and I probably have a wrong image of it. It looks to me like it's starting up agents for all sorts of tasks, and in my case one of the agents is trying to create a Python script. That initially seems to go well, until it opens nano / vi to write to or append to a file and gets stuck.

Dunno why it doesn't just stick to the command line then. Hope I can somehow let it use only the command line for files. It would probably have saved me 10 restarts :)

@Pwuts changed the title from "Auto-GPT tries (and fails) to run commands that require interactive shells" to "Stop execute_shell going interactive" on Apr 26, 2023
@Pwuts moved this from 🆕 New to 📋 Backlog in AutoGPT development kanban on Apr 26, 2023
@Boostrix
Contributor

though it would be nice if it could remote control an editor so we can watch it write the code ;-)

some editors, like vim, definitely support a client/server mode - it would probably be a dedicated plugin to make that work, and it would have to work via popen probably, and it would benefit from the concept of "channels" to talk to other "server-like" processes.

In the meantime, one could probably use some strace-like script to look for problematic API calls and make such processes return control to Auto-GPT if they are exhibiting "blocking" behavior:

To deal with binaries that may optionally use blocking calls on a Linux system where only non-blocking binaries are permitted, you can take several approaches. One option is to wrap the binary with a script that intercepts any blocking calls and terminates the process if it becomes blocked, using a tool like "strace" to trace system calls. Another option is to redirect blocking calls to non-blocking equivalents using the "fcntl" system call to set file descriptors to non-blocking mode.
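
A lighter-weight take on the "wrap the binary" idea, as a rough sketch and not what Auto-GPT actually does: instead of tracing system calls, give the child no stdin to wait on and enforce a deadline. The helper name `run_non_interactive`, its argument-list interface, and the 10-second default are all assumptions made for illustration.

```python
import subprocess
import sys


def run_non_interactive(argv, timeout=10.0):
    """Run a command with no stdin and a hard time limit.

    Hypothetical helper, not part of Auto-GPT. Returns (exit_code, output);
    exit_code is None when the command was killed for exceeding the timeout.
    """
    try:
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,   # reads from stdin see EOF instead of waiting
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # merge stderr into the captured output
            timeout=timeout,            # run() kills the child when this expires
            text=True,
        )
    except subprocess.TimeoutExpired:
        return None, f"Command timed out after {timeout} seconds"
    return result.returncode, result.stdout


if __name__ == "__main__":
    # A program that asks for input fails fast instead of hanging the loop.
    code, output = run_non_interactive([sys.executable, "-c", "input()"])
    print(code, output)
```

Programs that read plain stdin usually exit with an error once they hit EOF. Full-screen programs that open the terminal directly may still linger, which is what the timeout is for.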

@katmai

katmai commented Apr 30, 2023

why does it need to use an editor though? it can write everything to files on its own according to syntax. what would the use of an editor even be?

@zudsniper

zudsniper commented Apr 30, 2023 via email

@katmai

katmai commented Apr 30, 2023

Fun!
On Sun, Apr 30, 2023 at 07:27 katmai wrote: why does it need to use an editor though? it can write everything to files on its own according to syntax. what would the use of an editor even be?
-- Sincerely, Jason P. McElhenney

makes sense :D

@Boostrix
Contributor

Boostrix commented Apr 30, 2023

I've seen it (successfully!) using sed to edit/patch existing files - then again, "an editor" (nano, vi etc) is probably just lingo for any interactive "app" (#346) here?

It getting stuck inside nano/vim seems to happen for some folks - thus, the heuristics to detect such binaries/situations would potentially still be worthwhile ?

Also, if the sub-agent approach is pursued, a parent-agent would be in a position to monitor what's going on and it could observe/terminate a sub-agent as needed, including if it's obviously got stuck invoking a system call that is blocking.

@tomtom94

tomtom94 commented May 1, 2023

Hi there, I don't like the timeout solution, I am using a special character to detect the input prompt.

You can have a look https://stackoverflow.com/q/76097868/10294022
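
The linked question is not reproduced in this thread, so the following is only a guess at the general idea: read the child's output until it ends with a known prompt marker, and treat that as the signal that the child is waiting for input. The marker `> `, the stand-in child program, and the names `read_until_prompt` and `demo` are all made up for this sketch.

```python
import subprocess
import sys

PROMPT = b"> "  # assumed marker the child prints whenever it wants input

# Stand-in for an interactive tool: prints a prompt, echoes each line in upper case.
CHILD = (
    "import sys\n"
    "while True:\n"
    "    sys.stdout.write('> ')\n"
    "    sys.stdout.flush()\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    sys.stdout.write(line.upper())\n"
)


def read_until_prompt(proc):
    """Collect output byte by byte until it ends with PROMPT or the child exits."""
    data = b""
    while not data.endswith(PROMPT):
        chunk = proc.stdout.read(1)
        if not chunk:  # EOF: the child closed its stdout
            break
        data += chunk
    return data


def demo():
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    first = read_until_prompt(proc)   # the child is now waiting for input
    proc.stdin.write(b"hello\n")
    proc.stdin.flush()
    reply = read_until_prompt(proc)   # echoed line followed by the next prompt
    proc.stdin.close()                # EOF makes the child leave its loop
    proc.wait()
    proc.stdout.close()
    return first, reply


if __name__ == "__main__":
    print(demo())
```

This avoids a timeout only when the prompt is predictable. A marker that also shows up in ordinary output would trigger early, and a program that never prints the marker would still block the reader.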

@Boostrix
Contributor

Boostrix commented May 1, 2023

For future reference, here's what GPT-4 came up with to detect an idle child process that seems to be blocking because it's waiting for I/O without actually changing its RAM/CPU utilization and without sending data to the parent process:

import os
import subprocess
import tempfile
import time

import psutil

# Send the child's output to a temporary file so the parent never blocks in read()
with tempfile.TemporaryFile() as out:
    # Launch the target process using Popen
    process = subprocess.Popen(['ls'], stdout=out, stderr=subprocess.STDOUT)
    stats = psutil.Process(process.pid)
    cpu_history = []
    memory_history = []
    output_history = []

    try:
        # Monitor the process's CPU and RAM utilization and output
        while process.poll() is None:
            cpu_percent = stats.cpu_percent(interval=1)  # samples for one second
            memory_info = stats.memory_info().rss / 1024 / 1024
            output_size = os.fstat(out.fileno()).st_size
            print(f"Process with PID {process.pid} is using {cpu_percent:.2f}% CPU and {memory_info:.2f} MB of RAM, output so far: {output_size} bytes")

            # Add CPU, memory, and output data to history
            cpu_history.append(cpu_percent)
            memory_history.append(memory_info)
            output_history.append(output_size)

            # Check if the process seems blocked: idle CPU, constant memory, no new output
            if (len(cpu_history) >= 5
                    and all(cpu < 1 for cpu in cpu_history[-5:])
                    and len(set(memory_history[-5:])) == 1
                    and len(set(output_history[-5:])) == 1):
                print("Process seems to be blocked")
                timeout = 10  # Grace period in seconds (defaulted to 10)
                start_time = time.time()
                while time.time() - start_time < timeout:
                    # Stop waiting if the process exited or showed signs of life
                    if (process.poll() is not None
                            or stats.cpu_percent(interval=1) >= 1
                            or os.fstat(out.fileno()).st_size != output_size):
                        break
                else:
                    print(f"Process with PID {process.pid} is being killed due to timeout")
                    process.kill()
                    break
    except psutil.Error:
        pass  # The process exited between poll() and a psutil query

    # Wait for the process to finish and print its output
    process.wait()
    out.seek(0)
    print(out.read().decode(errors='replace'))

@Boostrix
Contributor

Boostrix commented May 2, 2023

saw it once again today, despite previously having used sed successfully, it wanted to start nano and vim.
I suppose, the blacklist option mentioned before would be a simple workaround.
So that the .env file can be used to explicitly disable certain shell commands like these.
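
A minimal sketch of what such a blacklist check could look like. The variable name `SHELL_COMMAND_DENYLIST` and both helper functions are hypothetical, chosen only for illustration, and it assumes the `.env` values have already been loaded into the environment.

```python
import os
import shlex

# Hypothetical variable name; assumed to hold a comma-separated list of programs.
DENYLIST_VAR = "SHELL_COMMAND_DENYLIST"


def load_denylist(environ=os.environ):
    """Parse the comma-separated denylist, ignoring blanks and stray spaces."""
    raw = environ.get(DENYLIST_VAR, "")
    return {name.strip() for name in raw.split(",") if name.strip()}


def is_denied(command_line, denylist):
    """Return True if the first word of the command line names a denied program."""
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        return True  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return False
    # Compare by basename so that /usr/bin/nano matches "nano".
    return os.path.basename(tokens[0]) in denylist


if __name__ == "__main__":
    denylist = load_denylist({DENYLIST_VAR: "nano, vim, gedit"})
    print(is_denied("nano proposal.txt", denylist))
    print(is_denied("sed -i s/a/b/ proposal.txt", denylist))
```

Only the first word is checked, so something like `sudo nano` or `cd docs && nano notes.txt` would slip through. A stricter check would have to inspect every command in the line.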

Also, another user is currently working on a new "update_file" command which should hopefully help.
Alternatively, we could introduce a "CLI_EDIT" command that is explicitly constrained by its description.

@bfalans
Contributor

bfalans commented May 4, 2023

I believe ai tries to open editors because it can't find a useful command in its own code. Since I am using my update_file, I have not been getting attempts to open editors unless I specifically ask for it. My PR #3643

@Boostrix
Contributor

Boostrix commented May 4, 2023

We probably need a bunch of aliases to cover all cases (edit, update, rewrite, change, modify etc) and redirect the llm to use a BIF
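
One way to picture the alias idea, purely as a sketch: a lookup table that maps the verbs the model tends to invent onto a single built-in command. The alias names and the `resolve_command` helper are hypothetical; only `write_to_file` appears in the logs above.

```python
# Hypothetical alias table: several invented verbs, one built-in command.
COMMAND_ALIASES = {
    "edit_file": "write_to_file",
    "update_file": "write_to_file",
    "rewrite_file": "write_to_file",
    "change_file": "write_to_file",
    "modify_file": "write_to_file",
}


def resolve_command(name):
    """Map an alias to its built-in command; unknown names pass through unchanged."""
    key = name.strip().lower()
    return COMMAND_ALIASES.get(key, key)


if __name__ == "__main__":
    print(resolve_command("Edit_File"))
    print(resolve_command("execute_shell"))
```

Resolving aliases before dispatch means the model gets a working file command whichever verb it picks, which may leave it less reason to reach for an editor.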

@zudsniper

We probably need a bunch of aliases to cover all cases (edit, update, rewrite, change, modify etc) and redirect the llm to use a BIF

I am ready for a @Boostrix PR; you're all up in these issues, and from what I've seen you have the right idea 83% or more.
If you want any assistance with creating what you want, I'm always too busy, but I will make time to help get all your theories and strategies into code.
Please reach out to me if you are interested in this! All my socials are in my gh profile README, but Discord will be the fastest. Cheers

@lc0rp
Contributor

lc0rp commented Jun 13, 2023

It's now possible to configure which shell commands are permitted or prohibited, which can be useful in preventing the use of interactive commands.

@lc0rp closed this as completed on Jun 13, 2023
@github-project-automation (bot) moved this from 📋 Backlog to ✅ Done in AutoGPT development kanban on Jun 13, 2023
@Boostrix
Contributor

Keep in mind that interactive programs can go non interactive and vice versa...
