Constraints awareness #3466
Comments
this is somewhat related, but not identical: #2237 |
I want to echo my support for this idea. It would be great to be able to constrain resources. Any developer who has worked with AWS or GCP has a nightmare story of running up a huge bill by accident. |
note that this implies being able to specify custom API keys for sub-agents, too (as mentioned yesterday in a PR). Also see: #3313 PS: This should probably be renamed "quotas" instead of "constraints", because "constraints" means a different thing in the GPT/LLM context? |
This posting sums up the typical thinking quite well:
In other words, constraints would be analogous to "quotas", with an option to set soft/hard quotas. Sub-agents running into constraint violations would need to notify their parent agent via inter-agent messaging; the violation would then either be handled internally, or passed on to the next parent agent, and so on, until the human is consulted once again. Obviously, the equivalent of a "project manager agent" could be given some leeway to control its sub-agents with different constraints/API budgets and handle violations gracefully, without interrupting the main loop. |
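To make the soft/hard quota idea a bit more concrete, here is a minimal Python sketch of per-resource quotas plus parent escalation. All names (`Quota`, `Agent.report_usage`, etc.) are hypothetical and not part of the current codebase; this is only an illustration of the escalation chain described above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class QuotaLevel(Enum):
    OK = auto()
    SOFT_EXCEEDED = auto()   # warn and notify, but keep running
    HARD_EXCEEDED = auto()   # stop and escalate


@dataclass
class Quota:
    """Soft/hard limits for a single resource (API dollars, steps, tokens, ...)."""
    resource: str
    soft_limit: float
    hard_limit: float
    used: float = 0.0

    def charge(self, amount: float) -> QuotaLevel:
        self.used += amount
        if self.used >= self.hard_limit:
            return QuotaLevel.HARD_EXCEEDED
        if self.used >= self.soft_limit:
            return QuotaLevel.SOFT_EXCEEDED
        return QuotaLevel.OK


class Agent:
    def __init__(self, name: str, quotas: dict, parent: Optional["Agent"] = None):
        self.name, self.quotas, self.parent = name, quotas, parent

    def report_usage(self, resource: str, amount: float) -> None:
        level = self.quotas[resource].charge(amount)
        if level is not QuotaLevel.OK:
            self.notify_parent(self.name, resource, level)

    def notify_parent(self, origin: str, resource: str, level: QuotaLevel) -> None:
        if self.parent is None:
            # top of the chain: consult the human once again
            print(f"[{self.name}] {origin}: {level.name} on {resource} - human input needed")
        elif level is QuotaLevel.SOFT_EXCEEDED:
            # a "project manager" agent can absorb soft violations gracefully
            print(f"[{self.parent.name}] {origin} exceeded soft {resource} quota - continuing")
        else:
            # hard violation: keep passing it up the chain via inter-agent messaging
            self.parent.notify_parent(origin, resource, level)
```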
Not knowing the architecture here very well, could it be accomplished as a plugin? |
The plugin interface is in the process of being revamped as part of #3652. Thus, while the code might conceptually reside in plug-in space, it would de facto be a core component - since it would need to be called constantly by the core to track the costs of different actions. Then again, if the core devs should end up not being supportive of the idea, that's certainly an option - but based on some comments I've seen in a few PRs, there's related work going on anyway, so what seems more likely is that this will be implemented "soonish". Initially, with a focus on tracking "obvious costs" (API use), but probably with means to extend this over time. From an execution standpoint, unnecessary looping is the one thing that most people find annoying, so tracking looping (which, once you think about it, is a form of taking the same step over and over again) would be highly useful. Consider it a way to not just track the number of steps, but to track the maximum number of identical steps (where "identical" would be determined by hashing the name and arguments of the action to be taken, while always getting the same response from the LLM): #3668 To literally track bandwidth, CPU/RAM utilization and disk space, we'd want to - de facto - wrap a library like psutil to do so in a multi-platform fashion. |
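As a rough illustration of the "max identical steps" idea (hypothetical names, not the actual #3668 implementation), counting repeats of a hashed (command, arguments) pair could look like this:

```python
import hashlib
import json
from collections import Counter


def step_fingerprint(command_name: str, arguments: dict) -> str:
    """Hash the proposed action (name + arguments) so identical steps can be counted."""
    payload = json.dumps({"name": command_name, "args": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


class LoopDetector:
    """Tracks not just the number of steps, but the number of identical steps."""

    def __init__(self, max_identical_steps: int = 3):  # threshold is a made-up default
        self.max_identical_steps = max_identical_steps
        self.seen = Counter()

    def would_exceed(self, command_name: str, arguments: dict) -> bool:
        fp = step_fingerprint(command_name, arguments)
        self.seen[fp] += 1
        return self.seen[fp] > self.max_identical_steps
```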
You raise an interesting point regarding the use of quotas as a means of controlling robots within a software framework. Quotas can be a powerful tool in limiting the behavior of robots, especially when they are tied to a reporting system that allows for effective oversight. However, as you suggest, it may be more effective to implement quotas at the main software level, rather than within a plugin. This would ensure that all sub-agents within the system are subject to the same constraints and reporting mechanisms, and that violations are handled consistently across the board. It's also worth considering the potential trade-offs of using quotas in this way. On the one hand, quotas can help to prevent robots from engaging in harmful or risky behaviors, and can provide a mechanism for catching and correcting errors. On the other hand, overly restrictive quotas could limit the effectiveness of the system and prevent it from achieving its goals. Ultimately, the decision to use quotas in this way will depend on the specific context and goals of the software framework in question. It may be helpful to experiment with different levels of constraint and to solicit feedback from stakeholders and users to determine the optimal balance between control and flexibility. |
I guess I was thinking a plugin would be an easy way to deploy it quickly, get user feedback, and then when demonstrated it could become a core feature. |
For now, it seems API/cost tracking and step tracking are in the pipeline - the rest, we'll see. But given the agility/pace of the project, it's probably just a few weeks until this gets implemented. Regarding tracking of API costs and steps, these are the PRs that I am aware of:
So, in essence, a number of folks have come up with the idea previously - just not with an overly broad focus. But given the current evolution towards a multi-agent system, tracking resource utilization seems logical and important. There's now initial work to track memory per agent: #3844 |
When it comes to human stuff, it may be worth trying to phrase it in the prompt as a "very slow but very powerful AGI model for usage when you get stuck", which when called just sends you a message asking for input. We could integrate that into the quota system with some fake numbers to make it tunable in the same way: "Flat $30 per generation with a latency of 5 minutes". |
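A tiny sketch of how the human could be folded into such a cost model as just another (slow, expensive) backend. The registry, names and the GPT-4 price are invented for illustration; only the "$30 per generation, 5 minutes latency" figures come from the comment above.

```python
from dataclasses import dataclass


@dataclass
class BackendCost:
    """Tunable 'price tag' attached to a backend, whether model or human."""
    dollars_per_call: float
    latency_seconds: float


# hypothetical registry: the human is just a very slow but very powerful model
BACKENDS = {
    "gpt-4": BackendCost(dollars_per_call=0.12, latency_seconds=15),
    "human_oracle": BackendCost(dollars_per_call=30.0, latency_seconds=300),
}


def cheapest_backend_within(latency_budget_seconds: float) -> str:
    """Pick the cheapest backend that still fits the latency budget."""
    candidates = {name: cost for name, cost in BACKENDS.items()
                  if cost.latency_seconds <= latency_budget_seconds}
    if not candidates:
        raise ValueError("no backend fits the latency budget")
    return min(candidates, key=lambda name: candidates[name].dollars_per_call)
```

With a generous latency budget the human oracle is still the most expensive option and would only be chosen when cheaper backends are exhausted or over quota.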
latency is indeed an important consideration, not just network latency, but also "thinking" latency - i.e. the pure process of delegating a task to the [remote] LLM/GPT (and that would even apply if it were local) |
This will be partially mitigated by the introduction of workspaces in the re-arch. Of course a REST API is also good for linking more than one instance of AutoGPT to another |
The following RFE also discusses how actions/commands may have their own associated "costs" and may be subject to constraints, too: #3945
Also see: |
the new budget manager implementation (#4040) is likely to provide a good foundation to experiment with the concept of gathering stats and monitoring/tracking those to comply with some constraints. From an architectural perspective, it would probably make sense to have the equivalent of a StatsProvider (to capture/provide data), and an actual StatsObserver/Monitor to check whether the system remains within some well-defined bounds. While this is straightforward to do for simple metrics like API tokens, number of steps taken, or duration of an execution, system-specific stats would be better captured by coming up with an adaptor class to wrap psutil accordingly. That way, the system could also be told to observe CPU/RAM/disk utilization, etc. |
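For the StatsProvider/StatsObserver split described above, a bare-bones version might look as follows; the class names are hypothetical, only the psutil calls are real.

```python
import psutil  # cross-platform system stats


class SystemStatsProvider:
    """Hypothetical adaptor wrapping psutil as one StatsProvider among others."""

    def snapshot(self) -> dict:
        return {
            "cpu_percent": psutil.cpu_percent(interval=None),
            "ram_percent": psutil.virtual_memory().percent,
            "disk_percent": psutil.disk_usage("/").percent,
        }


class StatsObserver:
    """Checks whether the captured stats stay within well-defined bounds."""

    def __init__(self, bounds: dict):
        self.bounds = bounds

    def violations(self, stats: dict) -> list:
        return [key for key, limit in self.bounds.items()
                if key in stats and stats[key] > limit]


# usage: flag anything above the configured ceilings
observer = StatsObserver({"cpu_percent": 90.0, "ram_percent": 80.0})
print(observer.violations(SystemStatsProvider().snapshot()))
```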
This article goes into detail about the lack of constraint awareness/budget management (beyond just API tokens): https://lorenzopieri.com/autogpt_fix/
|
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days. |
This issue was closed automatically because it has been stale for 10 days with no activity. |
For the record, I don't agree with this being marked stale - it should probably be re-opened and added to some future milestone? |
Duplicates
Summary 💡
There's a bunch of RFEs here using the terms "maximum FOO" (context, tokens, time, memory, space, etc.).
Thus, more broadly, it might make sense to encode support for actual constraints as a first-class concept in the design, so that under the hood, the system is aware of its own resource utilization (execution time, space, traffic, tokens, context, API usage and so on).
As to API usage/billing, the system would apparently currently have to scrape some of the OpenAI pages.
That way, planning would also be better informed/simplified, because the system could take into account the "costs" of its operations.
This sort of thing would also make it possible to constrain the system when it is behind a low-bandwidth connection, and e.g. prioritize other work to reduce bandwidth utilization (a rough sketch of cost-aware command selection follows below).
Could be thought of as the equivalent of Unix/Linux "quotas" - i.e. a way to monitor resource utilization for different types of resources - which would also be a great thing for benchmarking purposes.
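To connect the cost-awareness and low-bandwidth points above, here is a toy sketch of cost-annotated commands being checked against remaining budgets; command names and cost figures are invented for illustration only.

```python
from typing import Optional

# hypothetical: annotate each command with estimated costs so planning can
# weigh them against the remaining budgets (tokens, dollars, bandwidth, ...)
COMMAND_COSTS = {
    "browse_website": {"bandwidth_mb": 2.0, "tokens": 1500},
    "read_local_file": {"bandwidth_mb": 0.0, "tokens": 800},
}


def affordable(command: str, remaining: dict) -> bool:
    """True if the command fits within every remaining budget."""
    return all(cost <= remaining.get(resource, float("inf"))
               for resource, cost in COMMAND_COSTS[command].items())


def pick_command(candidates: list, remaining: dict) -> Optional[str]:
    """Prefer the first candidate that is affordable."""
    for command in candidates:
        if affordable(command, remaining):
            return command
    return None  # nothing affordable: fall back or consult the human
```

With a bandwidth budget near zero, pick_command would skip browse_website and fall back to read_local_file, which is the kind of prioritization a low-bandwidth constraint would enable.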
Examples 🌈
No response
Motivation 🔦
No response