Protocol Documentation

api.proto

Katib API

Return generated StudyID.

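| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

Create a Study from Study Config. Generate a unique ID and store the Study to DB.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_config | StudyConfig |  |  |

Return generated TrialID.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trial_id | string |  |  |

Create a Trial from Trial Config. Generate a unique ID and store the Trial to DB.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trial | Trial |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  |  |
| path | string |  |  |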

Return deleted Study ID.

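| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

Delete a Study from DB by Study ID.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |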

Parameter for EarlyStopping service. Key-value format.

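| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of Parameter. |
| value | string |  | Value of Parameter. |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| param_id | string |  |  |
| early_stopping_algorithm | string |  |  |
| early_stopping_parameters | EarlyStoppingParameter | repeated |  |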

Feasible space for optimization. Int and Double types use Max/Min. Discrete and Categorical types use List.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| max | string |  | Max Value |
| min | string |  | Minimum Value |
| list | string | repeated | List of Values. |
| step | string |  | Step for double or int parameter |

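The rule above (Max/Min for Int and Double, List for Discrete and Categorical) can be sketched as a small check. This is only an illustration: the `FeasibleSpace` dataclass, its renamed fields (`max_value`, `min_value`, `values`) and `is_feasible` are hypothetical names, not part of the Katib API, and treating the bounds as inclusive is an assumption. The `step` field is ignored here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeasibleSpace:
    """Hypothetical mirror of the message; every value is carried as a string."""
    max_value: str = ""   # corresponds to `max`
    min_value: str = ""   # corresponds to `min`
    values: List[str] = field(default_factory=list)  # corresponds to `list`


def is_feasible(parameter_type: str, space: FeasibleSpace, value: str) -> bool:
    """Return True if `value` lies in `space` for the given parameter type."""
    try:
        if parameter_type in ("INT", "DOUBLE"):
            cast = int if parameter_type == "INT" else float
            # Assumption: both bounds are inclusive.
            return cast(space.min_value) <= cast(value) <= cast(space.max_value)
        if parameter_type == "DISCRETE":
            # Discrete values are compared as floats.
            return float(value) in {float(v) for v in space.values}
    except ValueError:
        # Unparseable value or bound: treat as not feasible.
        return False
    if parameter_type == "CATEGORICAL":
        # Categorical values are compared as plain strings.
        return value in space.values
    return False
```

For example, `is_feasible("DOUBLE", FeasibleSpace(min_value="0.01", max_value="0.1"), "0.05")` checks a learning rate against a Max/Min range, while a Categorical parameter would set only `values`.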
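
| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| early_stopping_parameter_sets | EarlyStoppingParameterSet | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| early_stopping_parameters | EarlyStoppingParameter | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| param_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| metrics_log_sets | MetricsLogSet | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| worker_ids | string | repeated |  |
| metrics_names | string | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| model | ModelInfo |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_name | string |  |  |
| worker_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| models | ModelInfo | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_name | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| studies | StudyOverview | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| should_stop_worker_ids | string | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| early_stopping_algorithm | string |  |  |
| param_id | string |  |  |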

Return an overview list of Studies.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_overviews | StudyOverview | repeated |  |

Get all Study Configs from DB.

Return a config of specified Study.

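| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_config | StudyConfig |  |  |

Get a Study Config from DB by ID of Study.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| suggestion_parameter_sets | SuggestionParameterSet | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| suggestion_parameters | SuggestionParameter | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| param_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trials | Trial | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| suggestion_algorithm | string |  |  |
| request_number | int32 |  |  |
| log_worker_ids | string | repeated |  |
| param_id | string |  |  |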

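Return a trial configuration by specified trial ID.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trial | Trial |  |  |

Get a trial configuration from DB by trial ID.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trial_id | string |  |  |

Return a trial list in specified Study.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trials | Trial | repeated |  |

Get Trial Configs from DB by ID of Study.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| worker_full_infos | WorkerFullInfo | repeated |  |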

Get full information related to specified Workers. It includes Worker Config, HyperParameters and Metrics Logs.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| trial_id | string |  |  |
| worker_id | string |  |  |
| only_latest_log | bool |  |  |

Return a Worker list by specified condition.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| workers | Worker | repeated |  |

Get Worker configs and status from DB by ID of Study, Trial or Worker.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| trial_id | string |  |  |
| worker_id | string |  |  |

GraphConfig contains a config of DAG

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| num_layers | int32 |  | Number of layers |
| input_size | int32 | repeated | Dimensions of input size |
| output_size | int32 | repeated | Dimensions of output size |

Metrics of a worker

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of metrics. |
| value | string |  | Value of metrics. Double float. |

Metrics logs of a worker

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of metrics. |
| values | MetricsValueTime | repeated | Log of metrics. Ordered by time series. |

Logs of metrics for a worker.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| worker_id | string |  | ID of the corresponding worker. |
| metrics_logs | MetricsLog | repeated | Logs of metrics. |
| worker_status | State |  | Status of the corresponding worker. |

Metrics of a worker with timestamp

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| time | string |  | Timestamp RFC3339 format. |
| value | string |  | Value of metrics. Double float. |

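Because both fields are carried as strings, a client has to decode them before comparing or plotting metrics. A minimal sketch, assuming the timestamp uses a common RFC 3339 form such as `2019-01-01T12:00:00Z`; `parse_metrics_value_time` is a hypothetical helper, not a Katib function, and it does not cover every RFC 3339 variant.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_metrics_value_time(time: str, value: str) -> Tuple[datetime, float]:
    """Decode one (time, value) pair into a datetime and a float."""
    # Older Python versions of fromisoformat() reject a trailing "Z",
    # so it is rewritten as an explicit UTC offset first.
    if time.endswith("Z"):
        time = time[:-1] + "+00:00"
    return datetime.fromisoformat(time), float(value)


timestamp, metric = parse_metrics_value_time("2019-01-01T12:00:00Z", "0.93")
```

Both conversions raise `ValueError` on malformed input, which a caller may want to catch per log entry.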
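
| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_name | string |  |  |
| worker_id | string |  |  |
| parameters | Parameter | repeated |  |
| metrics | Metrics | repeated |  |
| model_path | string |  |  |

NasConfig contains a config of NAS job

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| graph_config | GraphConfig |  | Config of DAG |
| operations | NasConfig.Operations |  | List of Operation |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| operation | Operation | repeated |  |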

Config for operations in DAG

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| operationType | string |  | Type of operation in DAG |
| parameter_configs | Operation.ParameterConfigs |  | List of ParameterConfig |

List of ParameterConfig

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| configs | ParameterConfig | repeated |  |

Value of a Hyper parameter. This will be created from a corresponding Config.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of the parameter. |
| parameter_type | ParameterType |  | Type of the parameter. |
| value | string |  | Value of the parameter. |

Config for a Hyper parameter. Katib will create each Hyper parameter from this config.

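| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of the parameter. |
| parameter_type | ParameterType |  | Type of the parameter. |
| feasible | FeasibleSpace |  | FeasibleSpace for the parameter. |

Return generated WorkerID.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| worker_id | string |  |  |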

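Create a Worker from Worker Config. Generate a unique ID and store the Worker to DB.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| worker | Worker |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| metrics_log_sets | MetricsLogSet | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| model | ModelInfo |  |  |
| data_set | DataSetInfo |  |  |
| tensor_board | bool |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_name | string |  |  |
| owner | string |  |  |
| description | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| param_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| early_stopping_algorithm | string |  |  |
| param_id | string |  |  |
| early_stopping_parameters | EarlyStoppingParameter | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| param_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| suggestion_algorithm | string |  |  |
| param_id | string |  |  |
| suggestion_parameters | SuggestionParameter | repeated |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| study_id | string |  |  |
| worker_ids | string | repeated |  |
| is_complete | bool |  |  |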

Config of a Study. Study represents a single optimization run over a feasible space. Each Study contains a configuration describing the feasible space, as well as a set of Trials. It is assumed that objective function f(x) does not change in the course of a Study.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of Study. |
| owner | string |  | Owner of Study. |
| optimization_type | OptimizationType |  | Optimization type. |
| optimization_goal | double |  | Goal of optimization value. |
| parameter_configs | StudyConfig.ParameterConfigs |  | List of ParameterConfig |
| access_permissions | string | repeated | Access Permission |
| tags | Tag | repeated | Tag for Study |
| objective_value_name | string |  | Name of objective value. |
| metrics | string | repeated | List of metrics name. |
| jobId | string |  | ID of studyjob that is created from this config. |
| nas_config | NasConfig |  | Config for NAS job |
| job_type | string |  | Type of the job, NAS or HP |

List of ParameterConfig

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| configs | ParameterConfig | repeated |  |

Overview of a study. For UI.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of Study. |
| owner | string |  | Owner of Study. |
| id | string |  | Study ID. |
| description | string |  | Description of Study. |

Parameter for Suggestion service. Key-value format.

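| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of Parameter. |
| value | string |  | Value of Parameter. |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| param_id | string |  |  |
| suggestion_algorithm | string |  |  |
| suggestion_parameters | SuggestionParameter | repeated |  |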
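The SetSuggestionParameters description in the service table states that a request with a parameter set ID updates that set, and a request without one creates a new set. The sketch below illustrates that create-or-update rule with an in-memory dictionary. `ParameterSetStore` and its methods are hypothetical and do not reflect how Katib stores data; raising `KeyError` for an unknown ID is an assumption, since the source does not say what happens in that case.

```python
import uuid
from typing import Dict, List, Tuple

# Key-value parameters; both name and value are strings.
Params = List[Tuple[str, str]]


class ParameterSetStore:
    """Hypothetical in-memory stand-in for the parameter set table."""

    def __init__(self) -> None:
        self._sets: Dict[str, dict] = {}

    def set_parameters(self, study_id: str, suggestion_algorithm: str,
                       suggestion_parameters: Params, param_id: str = "") -> str:
        """Update the set named by `param_id`, or create a new one if it is empty."""
        if param_id:
            # Assumption: updating an unknown ID is an error.
            if param_id not in self._sets:
                raise KeyError(param_id)
            self._sets[param_id]["suggestion_parameters"] = list(suggestion_parameters)
            return param_id
        new_id = uuid.uuid4().hex
        self._sets[new_id] = {
            "study_id": study_id,
            "suggestion_algorithm": suggestion_algorithm,
            "suggestion_parameters": list(suggestion_parameters),
        }
        return new_id

    def get_parameters(self, param_id: str) -> Params:
        """Return the stored key-value parameters for `param_id`."""
        return self._sets[param_id]["suggestion_parameters"]
```

A caller would keep the returned ID from the first call and pass it back as `param_id` on later calls to change the same set.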

Tag for each resource.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | string |  | Name of tag. |
| value | string |  | Value of tag. |

A set of hyperparameters. In a Study, multiple Trials are evaluated by Workers. The Suggestion service generates the next Trials. The create time is filled in automatically on the server side, even if the user sets a value.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| trial_id | string |  | Trial ID. |
| study_id | string |  | Study ID. |
| parameter_set | Parameter | repeated | Hyperparameter set |
| objective_value | string |  | Objective Value |
| tags | Tag | repeated | Tags of Trial. |
| create_time | string |  | Trial create timestamp RFC3339 format. |

Update a Status of Worker.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| worker_id | string |  |  |
| status | State |  |  |

A process of evaluation for a trial. Types of worker supported by Katib are k8s Job, TF-Job, and Pytorch-Job.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| worker_id | string |  | Worker ID. |
| study_id | string |  | Study ID. |
| trial_id | string |  | Trial ID. |
| Type | string |  | Type of Worker |
| status | State |  | Status of Worker. |
| TemplatePath | string |  | Path for the manifest template of Worker. |
| tags | Tag | repeated | Tags of Worker. |

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| Worker | Worker |  |  |
| parameter_set | Parameter | repeated |  |
| metrics_logs | MetricsLog | repeated |  |

Direction of optimization. Minimize or Maximize.

| Name | Number | Description |
| ---- | ------ | ----------- |
| UNKNOWN_OPTIMIZATION | 0 | Undefined type and not used. |
| MINIMIZE | 1 | Minimize |
| MAXIMIZE | 2 | Maximize |
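One way a client might combine this enum with the `optimization_goal` field of the Study config is to test whether a Trial's objective value has reached the goal. The sketch below is an assumption about the comparison, not something this document specifies: `OptimizationType` is a local Python mirror of the enum, and `goal_reached` is a hypothetical helper.

```python
from enum import IntEnum


class OptimizationType(IntEnum):
    """Local mirror of the enum values listed above."""
    UNKNOWN_OPTIMIZATION = 0
    MINIMIZE = 1
    MAXIMIZE = 2


def goal_reached(optimization_type: OptimizationType,
                 optimization_goal: float,
                 objective_value: str) -> bool:
    """Return True once the objective value is at least as good as the goal."""
    # objective_value is a string in the Trial message, so decode it first.
    value = float(objective_value)
    if optimization_type == OptimizationType.MINIMIZE:
        return value <= optimization_goal
    if optimization_type == OptimizationType.MAXIMIZE:
        return value >= optimization_goal
    # UNKNOWN_OPTIMIZATION: no direction, so the goal is never considered reached.
    return False
```

For a loss that should fall below 0.1, `goal_reached(OptimizationType.MINIMIZE, 0.1, "0.05")` would report that the goal is met.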

Types of value for HyperParameter.

| Name | Number | Description |
| ---- | ------ | ----------- |
| UNKNOWN_TYPE | 0 | Undefined type and not used. |
| DOUBLE | 1 | Double float type. Use "Max/Min". |
| INT | 2 | Int type. Use "Max/Min". |
| DISCRETE | 3 | Discrete number type. Use "List" as float. |
| CATEGORICAL | 4 | Categorical type. Use "List" as string. |
Status code for worker. This value is stored as TINYINT in MySQL.

| Name | Number | Description |
| ---- | ------ | ----------- |
| PENDING | 0 | Pending. Created but not running. |
| RUNNING | 1 | Running. |
| COMPLETED | 2 | Completed. |
| KILLED | 3 | Killed. Not failed. |
| ERROR | 120 | Error. |

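The status codes are sparse (ERROR is 120, not 4), and all of them must fit the TINYINT column mentioned above. A minimal sketch of a local mirror of the enum follows. The `State` class and both helpers are illustrative names rather than generated code, and treating COMPLETED, KILLED and ERROR as "finished" is an assumption.

```python
from enum import IntEnum


class State(IntEnum):
    """Local mirror of the worker status codes listed above."""
    PENDING = 0
    RUNNING = 1
    COMPLETED = 2
    KILLED = 3
    ERROR = 120


def fits_tinyint(state: State) -> bool:
    """Check that a code fits a signed MySQL TINYINT (-128 to 127)."""
    return -128 <= int(state) <= 127


def is_finished(state: State) -> bool:
    """Assumption: a worker that is completed, killed or errored will not run again."""
    return state in (State.COMPLETED, State.KILLED, State.ERROR)
```

Decoding a stored value is then `State(120)`, which yields `State.ERROR`; an unknown code raises `ValueError`.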

| Method Name | Request Type | Response Type | Description |
| ----------- | ------------ | ------------- | ----------- |
| GetShouldStopWorkers | GetShouldStopWorkersRequest | GetShouldStopWorkersReply |  |

Service for the main Katib API. For each RPC service, we define a mapping to an HTTP REST API method. The mapping includes the URL path, query parameters and request body. See https://cloud.google.com/service-infrastructure/docs/service-management/reference/rpc/google.api#http

| Method Name | Request Type | Response Type | Description |
| ----------- | ------------ | ------------- | ----------- |
| CreateStudy | CreateStudyRequest | CreateStudyReply | Create a Study from Study Config. Generate a unique ID and store the Study to DB. |
| GetStudy | GetStudyRequest | GetStudyReply | Get a Study Config from DB by ID of Study. |
| DeleteStudy | DeleteStudyRequest | DeleteStudyReply | Delete a Study from DB by Study ID. |
| GetStudyList | GetStudyListRequest | GetStudyListReply | Get all Study Configs from DB. |
| CreateTrial | CreateTrialRequest | CreateTrialReply | Create a Trial from Trial Config. Generate a unique ID and store the Trial to DB. |
| GetTrials | GetTrialsRequest | GetTrialsReply | Get Trial Configs from DB by ID of Study. |
| GetTrial | GetTrialRequest | GetTrialReply | Get a Trial Configuration from DB by ID of Trial. |
| RegisterWorker | RegisterWorkerRequest | RegisterWorkerReply | Create a Worker from Worker Config. Generate a unique ID and store the Worker to DB. |
| GetWorkers | GetWorkersRequest | GetWorkersReply | Get Worker Configs and Status from DB by ID of Study, Trial or Worker. |
| UpdateWorkerState | UpdateWorkerStateRequest | UpdateWorkerStateReply | Update a Status of Worker. |
| GetWorkerFullInfo | GetWorkerFullInfoRequest | GetWorkerFullInfoReply | Get full information related to specified Workers. It includes Worker Config, HyperParameters and Metrics Logs. |
| GetSuggestions | GetSuggestionsRequest | GetSuggestionsReply | Get Suggestions from a Suggestion service. |
| GetShouldStopWorkers | GetShouldStopWorkersRequest | GetShouldStopWorkersReply |  |
| GetMetrics | GetMetricsRequest | GetMetricsReply | Get metrics of workers. You can get all logs of metrics since the start of the worker. |
| SetSuggestionParameters | SetSuggestionParametersRequest | SetSuggestionParametersReply | Create or update a parameter set for a suggestion service. If you specify the ID of a parameter set, that set is updated by your request. If you don't specify an ID, a new parameter set is created for the corresponding study and suggestion service. The parameters are in key-value format. |
| GetSuggestionParameters | GetSuggestionParametersRequest | GetSuggestionParametersReply | Get the specified suggestion parameter set from DB. |
| GetSuggestionParameterList | GetSuggestionParameterListRequest | GetSuggestionParameterListReply | Get all suggestion parameter sets from DB. |
| SetEarlyStoppingParameters | SetEarlyStoppingParametersRequest | SetEarlyStoppingParametersReply |  |
| GetEarlyStoppingParameters | GetEarlyStoppingParametersRequest | GetEarlyStoppingParametersReply |  |
| GetEarlyStoppingParameterList | GetEarlyStoppingParameterListRequest | GetEarlyStoppingParameterListReply |  |
| SaveStudy | SaveStudyRequest | SaveStudyReply |  |
| SaveModel | SaveModelRequest | SaveModelReply |  |
| ReportMetricsLogs | ReportMetricsLogsRequest | ReportMetricsLogsReply | Report logs of metrics for workers. The logs for each worker must have timestamps and must be ordered as a time series. Logs that have already been reported are dismissed without error. |
| GetSavedStudies | GetSavedStudiesRequest | GetSavedStudiesReply |  |
| GetSavedModels | GetSavedModelsRequest | GetSavedModelsReply |  |

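The ReportMetricsLogs contract (timestamped entries, ordered in time, repeats silently dismissed) means a worker can safely resend a log window. The sketch below shows one way such idempotent behavior could work. It is not Katib's implementation: `report_metrics_log` is a hypothetical function, rejecting unordered input with `ValueError` is an assumption, and timestamps are compared as strings, which is only valid when they all share one RFC 3339 layout and UTC offset.

```python
from typing import List, Tuple

# One log entry: (RFC 3339 time, value), both strings as in MetricsValueTime.
Point = Tuple[str, str]


def report_metrics_log(stored: List[Point], reported: List[Point]) -> List[Point]:
    """Append new entries from `reported` to `stored`, ignoring repeats."""
    times = [time for time, _ in reported]
    # Assumption: identical timestamp layout, so string order equals time order.
    if times != sorted(times):
        raise ValueError("reported logs must be ordered by time")
    seen = set(stored)
    for point in reported:
        # Entries reported before are dismissed without raising an error.
        if point not in seen:
            stored.append(point)
            seen.add(point)
    return stored
```

Calling it twice with overlapping windows leaves a single copy of each entry in `stored`.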

| Method Name | Request Type | Response Type | Description |
| ----------- | ------------ | ------------- | ----------- |
| GetSuggestions | GetSuggestionsRequest | GetSuggestionsReply |  |

Scalar Value Types

| .proto Type | Notes | C++ Type | Java Type | Python Type |
| ----------- | ----- | -------- | --------- | ----------- |
| double |  | double | double | float |
| float |  | float | float | float |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int32 | int | int |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long |
| sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int |
| sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long |
| sfixed32 | Always four bytes. | int32 | int | int |
| sfixed64 | Always eight bytes. | int64 | long | int/long |
| bool |  | bool | boolean | boolean |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str |