How to deploy a RESTful API deepspeed MII on one node? #164

Open
shaoxuefeng opened this issue Apr 10, 2023 · 2 comments

@shaoxuefeng

Following the README, I would like to deploy a RESTful API on a single node,
but I get this error: ValueError: No slot '1' specified on host 'localhost'
The deployment code:

import mii
from mii import DeploymentType

if __name__ == "__main__":
    HOST_FILE_PATH = "./hostfile"
    mii_configs = {
        "tensor_parallel": 8,
        "dtype": "fp16",
        "enable_restful_api": True,
        "restful_api_port": 8080,
        "skip_model_check": True,
        "enable_load_balancing": False,
        "replica_num": 1,
        "hostfile": HOST_FILE_PATH,
    }

    mii.deploy(task="text-generation",
               model="/workspace/workfile/Models/gptj-350m",
               deployment_name="codegen-350m",
               mii_config=mii_configs,
               deployment_type=DeploymentType.LOCAL)

And the hostfile:

localhost slots=8

According to the DeepSpeed issue, it seems a hostfile can't be used when there is only one node. I even updated the deepspeed package to the latest master version, but it still doesn't work.

deepspeed          0.8.3+unknown
deepspeed-mii      0.0.5+unknown

So, how can I start a RESTful API with DeepSpeed MII on one node?
Thank you!

@Wohoholo

I have started it with a hostfile on a single node (my machine has 2 GPUs, but I can only deploy on one of them).
Config:
tensor_parallel: 1
deploy_rank: 0
The other parameters are the same as yours.
My hostfile's content:
127.0.0.1 slots=2

By the way, the node needs passwordless SSH login to itself.
I would also like to know how to deploy on one node with multiple GPUs.
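
For reference, the single-GPU setup described above could be written out roughly as follows. This is only a sketch: `write_hostfile` and `single_gpu_config` are hypothetical helper names, the config keys are the ones already used in this thread, and the `mii.deploy` call is left as a comment because it is assumed to be the same call as in the original post.

```python
import os
import tempfile


def write_hostfile(path, host="127.0.0.1", slots=2):
    """Write a one-line hostfile in the 'host slots=N' format used in this thread."""
    with open(path, "w") as f:
        f.write(f"{host} slots={slots}\n")
    return path


def single_gpu_config(hostfile_path):
    """Config from the original post with the two changes reported above."""
    return {
        "tensor_parallel": 1,  # changed: a single GPU
        "deploy_rank": 0,      # changed: deploy on GPU 0
        "dtype": "fp16",
        "enable_restful_api": True,
        "restful_api_port": 8080,
        "skip_model_check": True,
        "enable_load_balancing": False,
        "replica_num": 1,
        "hostfile": hostfile_path,
    }


if __name__ == "__main__":
    hostfile = write_hostfile(os.path.join(tempfile.mkdtemp(), "hostfile"))
    mii_configs = single_gpu_config(hostfile)
    print(mii_configs)
    # Assumption: deployment then proceeds exactly as in the original post,
    # i.e. mii.deploy(..., mii_config=mii_configs, deployment_type=DeploymentType.LOCAL)
```

The helpers only prepare the inputs; whether the deployment itself succeeds still depends on the MII and DeepSpeed versions installed.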

@Wohoholo

I found some details in the source.

config.py:

@root_validator
def auto_enable_load_balancing(cls, values):
    if values["enable_restful_api"] and not values["enable_load_balancing"]:
        logger.warn("Restful API is enabled, enabling Load Balancing")
        values["enable_load_balancing"] = True
    return values

This sets "enable_load_balancing" to True whenever the RESTful API is enabled, so the False in your config is overridden.
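
To see the effect in isolation, here is a small stand-in for that validator that works on a plain dict instead of the pydantic model. It is a sketch for illustration, not MII's code: the logger name is made up, and only the two keys involved are included.

```python
import logging

logger = logging.getLogger("mii_sketch")  # hypothetical logger name


def auto_enable_load_balancing(values):
    """Plain-dict stand-in for the root validator in config.py."""
    if values["enable_restful_api"] and not values["enable_load_balancing"]:
        logger.warning("Restful API is enabled, enabling Load Balancing")
        values["enable_load_balancing"] = True
    return values


if __name__ == "__main__":
    user_config = {"enable_restful_api": True, "enable_load_balancing": False}
    # The user's False does not survive validation.
    print(auto_enable_load_balancing(dict(user_config)))
```

So with the RESTful API on, the load-balancing code path is always the one that runs.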
server.py:

 1. if mii_configs.enable_load_balancing:
 2.     # Start replica instances
 3.     for i, repl_config in enumerate(lb_config.replica_configs):
 4.         hostfile = tempfile.NamedTemporaryFile(delete=False)
 5.         hostfile.write(
 6.             f'{repl_config.hostname} slots={mii_configs.replica_num}\n'.encode())
 7.         processes.append(
 8.             self._launch_deepspeed(
 9.                 deployment_name,
10.                 model_name,
11.                 model_path,
12.                 ds_optimize,
13.                 ds_zero,
14.                 ds_config,
15.                 mii_configs,
16.                 hostfile.name,
17.                 repl_config.hostname,
18.                 repl_config.tensor_parallel_ports[0],
19.                 mii_configs.torch_dist_port + (100 * i) +
20.                 repl_config.gpu_indices[0],
21.                 repl_config.gpu_indices))

So it writes a temporary hostfile whose slot count is your "replica_num", instead of using your own hostfile.
You can comment out lines 5 and 6 and change line 17 to mii_configs.hostfile.
Also, "tensor_parallel" must equal the length of the "deploy_rank" parameter in mii_configs.
Hope this helps.
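
To illustrate why the generated hostfile leads to the error in the original post, here is a rough reproduction in plain Python. It is a simplified stand-in, not DeepSpeed's or MII's actual code: `temp_hostfile_line`, `check_slots`, and `check_deploy_rank` are hypothetical helpers, and the slot check is assumed to simply reject any GPU index that is not below the slot count.

```python
def temp_hostfile_line(hostname, replica_num):
    """Same format string as lines 5 and 6 of the server.py excerpt."""
    return f"{hostname} slots={replica_num}\n"


def parse_hostfile_line(line):
    """Parse 'host slots=N' into (host, N)."""
    host, slots = line.split()
    key, _, value = slots.partition("=")
    if key != "slots":
        raise ValueError(f"bad hostfile line: {line!r}")
    return host, int(value)


def check_slots(hostfile_line, gpu_indices):
    """Assumed, simplified version of the launcher's slot check."""
    host, slots = parse_hostfile_line(hostfile_line)
    for idx in gpu_indices:
        if idx >= slots:
            raise ValueError(f"No slot '{idx}' specified on host '{host}'")


def check_deploy_rank(tensor_parallel, deploy_rank):
    """tensor_parallel must equal the number of entries in deploy_rank."""
    ranks = deploy_rank if isinstance(deploy_rank, list) else [deploy_rank]
    if len(ranks) != tensor_parallel:
        raise ValueError("tensor_parallel must equal len(deploy_rank)")
    return ranks


if __name__ == "__main__":
    line = temp_hostfile_line("localhost", replica_num=1)  # "localhost slots=1"
    try:
        check_slots(line, range(8))  # tensor_parallel=8 needs slots 0..7
    except ValueError as err:
        print(err)
    check_slots("localhost slots=8", range(8))  # the user's own hostfile passes
```

With "replica_num" set to 1, the temporary hostfile offers a single slot, so a request for 8 GPUs fails at index 1, which is consistent with the reported message.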
