[RFC]: vLLM plugin system #7131
Comments
This looks like a step in the right direction to me (: I have 2 questions regarding this:
I really believe that implementing such a plugin system could make vLLM an even greater technology, and personally it could solve a lot of problems for me by allowing great modularity and customization.
It might not be easy, but it should be possible. Once a plugin is loaded, it has total control to do anything it wants. In the extreme case, it can swap the whole vLLM code for another implementation.
We can consider this a TODO. It requires cleaning up the interface of each component in vLLM, so that users can bring in their own implementations more easily. In the beginning, we can reserve the space for them, e.g. use Currently, we can have
I think this is great! I didn't know about it before. I think it is much better than an env var. The only concern is: if users install many plugins for the same component, e.g. the scheduler, how can they select the one they want? We might need to design a config file format to determine which plugins to use.
I think both the LD_PRELOAD way and the Python entrypoints way are proven patterns. At the current experimental stage, I have two concerns:
Regarding the exact code being executed, I don't have much concern about security; rather, my concern is how the plugins are called and invoked. Will a plugin swap in a class implementation for an abstract class, or some function, or insert some callbacks? It does seem like it needs several use cases to prove out and design over time.
Agree. So this RFC is just a start to explore how we interact with plugins. There are already 2 use cases now: out-of-tree model registration, and user-specified executor registration.
That is the stable state of the plugin system. We don't need to guarantee that at the moment. It is the plugin author's responsibility to keep their plugin up to date. We can see what the community makes of the plugins, and gradually make some parts of the system pluggable with a stable API.
I think we can directly call it
Overall I got positive feedback for this RFC. I will use the entrypoint mechanism mentioned by @NadavShmayo to collect all the installed plugins, and use
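For readers unfamiliar with the entrypoint mechanism mentioned here, this is a minimal sketch of how installed plugins could be collected with the standard library. The group name `vllm.plugins`, the function names, and the convention that each entry point resolves to a zero-argument callable are all assumptions made for illustration; they are not taken from #7130 or #7426.

```python
from importlib.metadata import entry_points

# Hypothetical entry point group; the real group name is not stated in this thread.
PLUGIN_GROUP = "vllm.plugins"


def discover_plugins(group=PLUGIN_GROUP):
    """Return {name: entry point} for every installed package advertising `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict mapping group name -> list of entry points
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def load_plugins(allowed=None, group=PLUGIN_GROUP):
    """Load and call each discovered plugin; `allowed` optionally filters by name."""
    loaded = []
    for name, ep in sorted(discover_plugins(group).items()):
        if allowed is not None and name not in allowed:
            continue
        func = ep.load()  # imports the plugin module and resolves the object
        func()            # assumption: the entry point is a zero-argument callable
        loaded.append(name)
    return loaded
```

The optional `allowed` filter is one possible answer to the earlier question about choosing among several installed plugins: the set of names could come from a config file or an env var, while discovery itself stays automatic.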
I see that you have already implemented the general-purpose plugin system, nice! Would be great if you could have a look at #7438 and give your feedback.
#7426 finished the framework. TODOs:
I use |
Motivation.
There is an increasing need to customize vLLM, including:
ray: [Bug]: Ray distributed backend does not support out-of-tree models via ModelRegistry APIs #5657
Usually, the request is to swap out some functions / classes in vLLM, or call some functions before vLLM runs the model. While implementing them in vLLM is not difficult, the maintenance burden grows.
In order to satisfy the growing need for customization, I propose introducing a vLLM plugin system.
It is inspired by the pytest community, where a plugin is a standalone PyPI package, e.g. https://pypi.org/project/pytest-forked/ .
#7130 is a draft implementation, where I added a new env var VLLM_PLUGINS. It works similarly to the operating system's LD_PRELOAD, taking a colon-separated list of Python modules to import.
One of the most important concerns is guarding against arbitrary code execution. When a user serves a model using vLLM, the endpoint user cannot activate the plugin, so this does not suffer from code injection risk. However, there is indeed a risk if the user runs vLLM in an untrusted environment. In this case:
vllm_, so that vLLM users do not accidentally add irrelevant modules to execute.
With these efforts, the security level should be the same as LD_PRELOAD. And since LD_PRELOAD has existed for so many years, I think VLLM_PLUGINS should be acceptable in terms of security risk.
Proposed Change.
see #7130 for the draft implementation
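As a rough illustration of the env var mechanism described in the Motivation, the following sketch imports each module named in a colon-separated VLLM_PLUGINS value. It is not the code from #7130: the function names are hypothetical, and the `vllm_` prefix check is an assumption based on the prefix remark above.

```python
import importlib
import os

PLUGIN_ENV_VAR = "VLLM_PLUGINS"  # env var name from the RFC
REQUIRED_PREFIX = "vllm_"        # assumption: plugin module names must start with this


def parse_plugin_list(raw):
    """Split a colon-separated module list, dropping empty entries."""
    return [name.strip() for name in raw.split(":") if name.strip()]


def load_plugins_from_env(environ=os.environ):
    """Import every module listed in VLLM_PLUGINS and return the names imported."""
    loaded = []
    for name in parse_plugin_list(environ.get(PLUGIN_ENV_VAR, "")):
        if not name.startswith(REQUIRED_PREFIX):
            # Refuse modules that do not look like vLLM plugins.
            raise ValueError(
                f"plugin module {name!r} must start with {REQUIRED_PREFIX!r}"
            )
        # Importing the module runs its top-level code, which is where a
        # plugin would do its registration or patching.
        importlib.import_module(name)
        loaded.append(name)
    return loaded
```

With this shape, a user would opt in with something like `VLLM_PLUGINS=vllm_foo:vllm_bar` (both module names hypothetical) before starting vLLM, and an unset variable loads nothing.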
Feedback Period.
No response
CC List.
No response
Any Other Things.
No response