Command pre- and post-processing to increase efficiency and efficacy #4045
Summary 💡
Currently, a ton of I/O is caused by the LLM telling Auto-GPT to run some imaginary command/script, then figuring out that it simply doesn't exist, and incrementally adding the missing functionality as needed - this usually causes a relatively long message full of errors/warnings to be exchanged between the agent and the LLM. It's a process of trial & error.
While this kind of exchange is useful and necessary to literally "fix" things in an incremental fashion, it is often not needed to just "experiment" - for instance, to figure out whether a certain tool is available, whether a certain script is valid/executable, or whether a file exists and is accessible.
This applies especially in the context of executing stuff over ssh: #3420 (comment)
In particular, there's a more in-depth analysis posted by @valayDave here: #2987 (comment)
Thus, we probably need to think about a mode to reduce the log level/verbosity, to keep I/O at a sane level and only selectively forward relevant output to the LLM for decision-making purposes.
For example, the mere info that a script/tool is not working (or does not exist) would usually suffice to pass on to the LLM; we should not pass ALL of the surrounding info to the LLM beforehand - we should be able to provide it OPTIONALLY.
This could obviously be done in an adaptive fashion, so if the agent is being too clever by being too selective, it could increase its verbosity to see whether the LLM's response gets better or not.
This could work analogously to an if/true/false expression (think ternary operator), for instance by running a "check" command whose arguments encode what to do in each case.
The agent would then receive the desired true/false behavior from the LLM (possibly just a list of commands to execute), and the LLM could opt in to also receive the exact error message/warning (e.g. in case of an unhandled exception, with a 3rd argument being an exception-handling command) - but it probably should not receive that by default.
This would save/reduce unnecessary API/token use and make exchanges more effective in general.
This way we could maintain an error queue and provide the LLM with a command to traverse that queue to retrieve info it would otherwise not get to see - it should only ever need to use verbose mode when IT wants more information.
If all it needs to know is that something is not available/working, there are better and more compact formats to communicate that to the LLM than passing around tons of error messages.
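A minimal sketch of what such an error queue might look like (all names here are hypothetical, nothing like this exists in the codebase yet): the agent keeps the full error output locally and only hands the LLM a one-line summary plus an index it can use to pull the details on demand.

```python
class ErrorQueue:
    """Buffers full error output on the agent side; the LLM only sees a short summary."""

    def __init__(self):
        self._errors = []

    def push(self, command: str, returncode: int, stderr: str) -> str:
        """Store the full error; return the compact message that goes to the LLM."""
        self._errors.append({"command": command, "returncode": returncode, "stderr": stderr})
        index = len(self._errors) - 1
        return f"'{command}' failed (exit {returncode}); call get_error_details({index}) for the full output."

    def get_error_details(self, index: int) -> str:
        """The command the LLM would invoke only when it actually wants the verbose output."""
        if 0 <= index < len(self._errors):
            return self._errors[index]["stderr"]
        return f"no error stored at index {index}"
```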
In fact, the LLM could be provided with an option to run a regex on the output of the check, so that it only gets to see the regex's result. A setup like this would reduce my token utilization by roughly 70%, because in principle all Auto-GPT is doing is running experiments with my shell and the Python interpreter.
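As a sketch of the regex idea (again purely hypothetical, not an existing Auto-GPT command), the LLM would pass a pattern along with the command and the agent would return only the match instead of the full output:

```python
import re
import subprocess

def run_filtered(command: list[str], pattern: str | None = None) -> str:
    """Run a command, but only return what the LLM asked for."""
    result = subprocess.run(command, capture_output=True, text=True)
    if pattern is None:
        return f"exit code: {result.returncode}"  # most compact default
    match = re.search(pattern, result.stdout + result.stderr)
    return match.group(0) if match else f"no match (exit code {result.returncode})"

# e.g. the LLM only wants the interpreter version, not the whole banner/traceback:
# run_filtered(["python3", "--version"], r"\d+\.\d+\.\d+")
```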
At the very least, we should have options to tell the agent what sort of info the LLM wants to receive.
Basically, for the sake of efficiency (and our API tokens!), the LLM should be able to tell the agent which parts of the log it wants to see/get, preferring error codes / return values whenever possible.
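One way to expose such options (names are illustrative only, nothing here is settled) would be a per-command detail level that the LLM picks when it issues the command:

```python
from dataclasses import dataclass
from enum import Enum

class OutputDetail(Enum):
    RETURN_CODE = "return_code"  # just the exit code / success flag
    SUMMARY = "summary"          # one-line summary, e.g. "tool not found"
    FULL = "full"                # complete stdout/stderr, only on explicit request

@dataclass
class CommandRequest:
    command: str
    detail: OutputDetail = OutputDetail.RETURN_CODE  # compact by default
```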
Thoughts / ideas ?
PS: To some extent, this may apply not just to commands, but also to actions in general - which opens up a whole new can of worms, but also potential opportunities to aggressively optimize the I/O between the LLM and the agent, touching on the idea of dynamic prompting: #3937
Examples 🌈
We could use a new command to prepare execution of a command list (think state machine), reducing/configuring verbosity and forwarding output only as requested.
check <command + params> <command to be executed if true> <command to be executed if false> <exception handling command>
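A rough sketch of how such a check command could be wired up on the agent side (hypothetical helper, assuming shell commands and exit-code semantics):

```python
import subprocess

def check(condition: str, if_true: str | None = None, if_false: str | None = None,
          on_exception: str | None = None) -> str:
    """Run `condition`, run the matching branch command, and report only a compact result."""
    try:
        ok = subprocess.run(condition, shell=True, capture_output=True).returncode == 0
    except OSError:
        if on_exception:
            subprocess.run(on_exception, shell=True)
        return "check: exception while running the condition command"

    follow_up = if_true if ok else if_false
    if follow_up:
        subprocess.run(follow_up, shell=True)
    return f"check: condition was {'true' if ok else 'false'}"

# e.g.: check("test -f results.csv", if_true="wc -l results.csv", if_false="touch results.csv")
```

The point is that the LLM only ever sees the one-line result string; full stdout/stderr would stay on the agent side (e.g. in the error queue above) until the LLM explicitly asks for it.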