Add compute name and instance type param in sdk and cli #2446

Merged · 7 commits · Jul 12, 2023
@@ -28,7 +28,8 @@ jobs:
path: azureml://registries/azureml/models/bert-base-uncased/versions/4
input_column_names: input_string
label_column_name: title
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/fill-mask/eval-config.json"
type: uri_file
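Assembled from the hunk above, the CLI job spec ends up looking roughly like this (a sketch; keys outside the visible diff are assumptions, and `instance_type` is shown commented out as the assumed serverless alternative):

```yaml
jobs:
  evaluation:
    # ...component, model, and data inputs as in the hunk above...
    input_column_names: input_string
    label_column_name: title
    device: auto
    # run on a pre-created cluster...
    compute_name: gpu-cluster-big
    # ...or go serverless by sizing the job directly (assumed alternative):
    # instance_type: STANDARD_NC24
    evaluation_config:
      path: "../../../../../sdk/python/foundation-models/system/evaluation/fill-mask/eval-config.json"
      type: uri_file
```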
@@ -27,7 +27,8 @@ jobs:
path: azureml://registries/azureml/models/distilbert-base-uncased-distilled-squad/versions/4
input_column_names: context,question
label_column_name: answer_text
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/question-answering/eval-config.json"
type: uri_file
@@ -28,7 +28,8 @@ jobs:
path: azureml://registries/azureml/models/sshleifer-distilbart-cnn-12-6/versions/4
input_column_names: input_string
label_column_name: summary
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/summarization/eval-config.json"
type: uri_file
@@ -28,7 +28,8 @@ jobs:
path: azureml://registries/azureml/models/microsoft-deberta-base-mnli/versions/4
input_column_names: input_string
label_column_name: label_string
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/text-classification/eval-config.json"
type: uri_file
@@ -28,7 +28,8 @@ jobs:
path: azureml://registries/azureml/models/gpt2/versions/4
input_column_names: input_string
label_column_name: ground_truth
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/text-generation/eval-config.json"
type: uri_file
@@ -28,7 +28,8 @@ jobs:
path: azureml://registries/azureml/models/jean-baptiste-camembert-ner/versions/4
input_column_names: input_string
label_column_name: ner_tags_str
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/token-classification/eval-config.json"
type: uri_file
@@ -28,7 +28,8 @@ jobs:
path: azureml://registries/azureml/models/t5-base/versions/4
input_column_names: input_string
label_column_name: ro
device: gpu
device: auto
compute_name: gpu-cluster-big
evaluation_config:
path: "../../../../../sdk/python/foundation-models/system/evaluation/translation/eval-config.json"
type: uri_file
@@ -117,6 +117,28 @@
"registry_ml_client"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Compute\n",
"\n",
"There are two ways to submit a job: on a compute cluster or as a serverless job.\n",
"\n",
"##### Serverless Job:\n",
"\n",
"In a serverless job, there is no need to create a compute explicitly.\n",
"Simply pass the desired instance type value to the `instance_type` parameter while creating a pipeline job.\n",
"This allows for quick and convenient job submission without the need for managing a compute cluster.\n",
"\n",
"##### Compute Job:\n",
"\n",
"To submit a job through a compute, you need to create a compute cluster beforehand.\n",
"The following code demonstrates how to create a GPU compute cluster.\n",
"After creating the compute cluster, pass the name of the compute cluster to the `compute_name` parameter while submitting the pipeline job. This ensures that the job runs on the specified compute cluster, allowing for more control and customization."
]
},
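The two submission modes described above are alternatives, which can be sketched as a plain helper (hypothetical, not part of the SDK; the real parameters are `compute_name` and `instance_type` on the pipeline job):

```python
def compute_settings(compute_name=None, instance_type=None):
    """Build compute-related kwargs for a pipeline job.

    Hypothetical helper: illustrates that a job is sized either by a
    pre-created cluster (compute_name) or serverlessly (instance_type),
    never both.
    """
    if compute_name and instance_type:
        raise ValueError("pass either compute_name or instance_type, not both")
    if compute_name:
        return {"compute_name": compute_name}
    if instance_type:
        return {"instance_type": instance_type}
    raise ValueError("one of compute_name or instance_type is required")
```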
{
"cell_type": "code",
"execution_count": null,
@@ -326,10 +348,7 @@
" print(\"{} - {}\".format(model[\"name\"], test_data_fil_df.shape))\n",
" test_data_file_name = \"small-test-{}.jsonl\".format(model[\"name\"])\n",
" test_data_fil_df.to_json(test_data_file_name, lines=True, orient=\"records\")"
],
"metadata": {
"collapsed": false
}
]
},
{
"attachments": {},
@@ -377,14 +396,18 @@
" # The following parameters map to the dataset fields\n",
" input_column_names=\"input_string\",\n",
" label_column_name=\"title\",\n",
" # compute settings\n",
" compute_name=compute_cluster,\n",
" # specify the instance type for serverless job\n",
" # instance_type= \"STANDARD_NC24\",\n",
" # Evaluation settings\n",
" task=\"fill-mask\",\n",
" # config file containing the details of evaluation metrics to calculate\n",
" evaluation_config=Input(type=AssetTypes.URI_FILE, path=\"./eval-config.json\"),\n",
" # evaluation_config_params=evaluation_config_params,\n",
" # config cluster/device job is running on\n",
"        # set device to GPU/CPU based on whether a GPU was found\n",
" device=\"gpu\" if gpu_count_found else \"cpu\",\n",
" device=\"auto\",\n",
" )\n",
" return {\"evaluation_result\": evaluation_job.outputs.evaluation_result}"
]
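The `device="auto"` value replaces the manual GPU probe in the old cell. A rough sketch of the fallback it implies (illustrative only; the actual resolution happens inside the evaluation component):

```python
def resolve_device(device: str, gpu_available: bool) -> str:
    """Illustrative sketch of what an 'auto' device setting implies:
    fall back to CPU when no GPU is detected."""
    if device == "auto":
        return "gpu" if gpu_available else "cpu"
    if device in ("gpu", "cpu"):
        return device
    raise ValueError(f"unknown device: {device!r}")
```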
@@ -117,6 +117,28 @@
"registry_ml_client"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Compute\n",
"\n",
"There are two ways to submit a job: on a compute cluster or as a serverless job.\n",
"\n",
"##### Serverless Job:\n",
"\n",
"In a serverless job, there is no need to create a compute explicitly.\n",
"Simply pass the desired instance type value to the `instance_type` parameter while creating a pipeline job.\n",
"This allows for quick and convenient job submission without the need for managing a compute cluster.\n",
"\n",
"##### Compute Job:\n",
"\n",
"To submit a job through a compute, you need to create a compute cluster beforehand.\n",
"The following code demonstrates how to create a GPU compute cluster.\n",
"After creating the compute cluster, pass the name of the compute cluster to the `compute_name` parameter while submitting the pipeline job. This ensures that the job runs on the specified compute cluster, allowing for more control and customization."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -363,13 +385,17 @@
" # The following parameters map to the dataset fields\n",
" input_column_names=\"context,question\",\n",
" label_column_name=\"answer_text\",\n",
" # compute settings\n",
" compute_name=compute_cluster,\n",
" # specify the instance type for serverless job\n",
" # instance_type= \"STANDARD_NC24\",\n",
" # Evaluation settings\n",
" task=\"question-answering\",\n",
" # config file containing the details of evaluation metrics to calculate\n",
" evaluation_config=Input(type=AssetTypes.URI_FILE, path=\"./eval-config.json\"),\n",
" # config cluster/device job is running on\n",
"        # set device to GPU/CPU based on whether a GPU was found\n",
" device=\"gpu\" if gpu_count_found else \"cpu\",\n",
" device=\"auto\",\n",
" )\n",
" return {\"evaluation_result\": evaluation_job.outputs.evaluation_result}"
]
@@ -115,6 +115,28 @@
"registry_ml_client"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Compute\n",
"\n",
"There are two ways to submit a job: on a compute cluster or as a serverless job.\n",
"\n",
"##### Serverless Job:\n",
"\n",
"In a serverless job, there is no need to create a compute explicitly.\n",
"Simply pass the desired instance type value to the `instance_type` parameter while creating a pipeline job.\n",
"This allows for quick and convenient job submission without the need for managing a compute cluster.\n",
"\n",
"##### Compute Job:\n",
"\n",
"To submit a job through a compute, you need to create a compute cluster beforehand.\n",
"The following code demonstrates how to create a GPU compute cluster.\n",
"After creating the compute cluster, pass the name of the compute cluster to the `compute_name` parameter while submitting the pipeline job. This ensures that the job runs on the specified compute cluster, allowing for more control and customization."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -271,6 +293,17 @@
"test_data_df[\"summary\"] = test_data_df[\"highlights\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# truncating the data to fit within the model's tokenizer limit\n",
"test_data_df[\"article\"] = test_data_df[\"article\"].str.slice(0, 200)\n",
"test_data_df[\"input_string\"] = test_data_df[\"input_string\"].str.slice(0, 200)"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -359,14 +392,18 @@
" # The following parameters map to the dataset fields\n",
" input_column_names=\"input_string\",\n",
" label_column_name=\"summary\",\n",
" # compute settings\n",
" compute_name=compute_cluster,\n",
" # specify the instance type for serverless job\n",
" # instance_type= \"STANDARD_NC24\",\n",
" # Evaluation settings\n",
" task=\"text-summarization\",\n",
" # config file containing the details of evaluation metrics to calculate\n",
" evaluation_config=Input(type=AssetTypes.URI_FILE, path=\"./eval-config.json\"),\n",
" # evaluation_config_params=evaluation_config_params,\n",
" # config cluster/device job is running on\n",
"        # set device to GPU/CPU based on whether a GPU was found\n",
" device=\"gpu\" if gpu_count_found else \"cpu\",\n",
" device=\"auto\",\n",
" )\n",
" return {\"evaluation_result\": evaluation_job.outputs.evaluation_result}"
]
@@ -122,6 +122,28 @@
"workspace_ml_client"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Compute\n",
"\n",
"There are two ways to submit a job: on a compute cluster or as a serverless job.\n",
"\n",
"##### Serverless Job:\n",
"\n",
"In a serverless job, there is no need to create a compute explicitly.\n",
"Simply pass the desired instance type value to the `instance_type` parameter while creating a pipeline job.\n",
"This allows for quick and convenient job submission without the need for managing a compute cluster.\n",
"\n",
"##### Compute Job:\n",
"\n",
"To submit a job through a compute, you need to create a compute cluster beforehand.\n",
"The following code demonstrates how to create a GPU compute cluster.\n",
"After creating the compute cluster, pass the name of the compute cluster to the `compute_name` parameter while submitting the pipeline job. This ensures that the job runs on the specified compute cluster, allowing for more control and customization."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -356,13 +378,17 @@
" # The following parameters map to the dataset fields\n",
" input_column_names=\"input_string\",\n",
" label_column_name=\"label_string\",\n",
" # compute settings\n",
" compute_name=compute_cluster,\n",
" # specify the instance type for serverless job\n",
" # instance_type= \"STANDARD_NC24\",\n",
" # Evaluation settings\n",
" task=\"text-classification\",\n",
" # config file containing the details of evaluation metrics to calculate\n",
" evaluation_config=Input(type=AssetTypes.URI_FILE, path=\"./eval-config.json\"),\n",
" # config cluster/device job is running on\n",
"        # set device to GPU/CPU based on whether a GPU was found\n",
" device=\"gpu\" if gpu_count_found else \"cpu\",\n",
" device=\"auto\",\n",
" )\n",
" return {\"evaluation_result\": evaluation_job.outputs.evaluation_result}"
]
@@ -117,6 +117,28 @@
"registry_ml_client"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Compute\n",
"\n",
"There are two ways to submit a job: on a compute cluster or as a serverless job.\n",
"\n",
"##### Serverless Job:\n",
"\n",
"In a serverless job, there is no need to create a compute explicitly.\n",
"Simply pass the desired instance type value to the `instance_type` parameter while creating a pipeline job.\n",
"This allows for quick and convenient job submission without the need for managing a compute cluster.\n",
"\n",
"##### Compute Job:\n",
"\n",
"To submit a job through a compute, you need to create a compute cluster beforehand.\n",
"The following code demonstrates how to create a GPU compute cluster.\n",
"After creating the compute cluster, pass the name of the compute cluster to the `compute_name` parameter while submitting the pipeline job. This ensures that the job runs on the specified compute cluster, allowing for more control and customization."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -215,6 +237,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"models = []\n",
@@ -223,21 +248,18 @@
" reg_model = list(registry_ml_client.models.list(name=model[\"name\"]))[0]\n",
" print(reg_model.id)\n",
" models.append({**model, \"version\": reg_model.version})"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"models"
],
"metadata": {
"collapsed": false
}
]
},
{
"attachments": {},
@@ -360,13 +382,17 @@
" # The following parameters map to the dataset fields\n",
" input_column_names=\"input_string\",\n",
" label_column_name=\"ground_truth\",\n",
" # compute settings\n",
" compute_name=compute_cluster,\n",
" # specify the instance type for serverless job\n",
" # instance_type= \"STANDARD_NC24\",\n",
" # Evaluation settings\n",
" task=\"text-generation\",\n",
" # config file containing the details of evaluation metrics to calculate\n",
" evaluation_config=Input(type=AssetTypes.URI_FILE, path=\"./eval-config.json\"),\n",
" # config cluster/device job is running on\n",
"        # set device to GPU/CPU based on whether a GPU was found\n",
" device=\"gpu\" if gpu_count_found else \"cpu\",\n",
" device=\"auto\",\n",
" )\n",
" return {\"evaluation_result\": evaluation_job.outputs.evaluation_result}"
]