diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 000000000..e69de29bb diff --git a/404.html b/404.html new file mode 100644 index 000000000..295cf2804 --- /dev/null +++ b/404.html @@ -0,0 +1,813 @@ + + + +
+ + + + + + + + + + + + + +TODO: the page is hidden now. If implemented, find all usages and uncomment them.
+This guide helps you learn how to log in and log out using the MedPerf client to access the main production MedPerf server. MedPerf uses passwordless authentication. This means that logging in only requires access to your email in order to complete the login process.
+Follow the steps below to login:
+You will be prompted to enter your email address.
+After entering your email address, you will be provided with a verification URL and a code. A text similar to the following will be printed in your terminal:
+ +Tip
+If you are running the MedPerf client on a machine with no graphical interface, you can use the link on any other device, e.g. your cellphone. Make sure that you trust that device.
+Open the printed URL in your browser. You will be presented with a code, and you will be asked to confirm if that code is the same one printed in your terminal.
+ +Enter the received code in the previous screen.
+To disconnect the MedPerf client, simply run the following command:
+ +Note that when you log in, the MedPerf client will remember you as long as you are using the same profile
. If you switch to another profile by running medperf profile activate <other-profile>
, you may have to log in again. If you switch back again to a profile where you previously logged in, your login state will be restored.
You can always check the current login status by running the following command:
+ + + + + + + + + + + +MedPerf requires some files to be hosted on the cloud when running machine learning pipelines. Submitting MLCubes to the MedPerf server means submitting their metadata, and not, for example, model weights or parameters files. MLCube files such as model weights need to be hosted on the cloud, and the submitted MLCube metadata will only contain URLs (or certain identifiers) for these files. Another example would be benchmark submission, where demo datasets need to be hosted.
+The MedPerf client expects files to be hosted in certain ways. Below are the options for hosting files and how MedPerf identifies them (e.g., by a URL).
+This can be done with any cloud hosting tool/provider you desire (such as GCP, AWS, Dropbox, Google Drive, Github). As long as your file can be accessed through a direct download link, it should work with medperf. Generating a direct download link for your hosted file can be straight-forward when using some providers (e.g. Amazon Web Services, Google Cloud Platform, Microsoft Azure) and can be a bit tricky when using others (e.g. Dropbox, GitHub, Google Drive).
+Note
+Direct download links must be permanent
+Tip
+You can check whether a URL is a direct download link using tools like wget
or curl
. Running wget <URL>
will download the file if the URL is a direct download link. Running wget <URL>
may fail or may download an HTML page if the URL is not a direct download link.
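The same check can be roughly approximated in Python. The sketch below is a heuristic only, not something MedPerf provides: it assumes that a URL answering with an HTML content type is a landing page rather than the file itself. The function names `looks_like_direct_download` and `check_url` are hypothetical.

```python
import urllib.request


def looks_like_direct_download(content_type: str) -> bool:
    """Heuristic: a direct download is normally not served as an HTML page."""
    # Keep only the media type, dropping parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if not media_type:
        # No Content-Type header at all: treat the result as inconclusive.
        return False
    return media_type not in ("text/html", "application/xhtml+xml")


def check_url(url: str) -> bool:
    """Send a HEAD request and apply the heuristic to the Content-Type header."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return looks_like_direct_download(response.headers.get("Content-Type", ""))
```

A result of `False` only suggests that the link deserves a closer look; downloading the file with wget remains the more reliable test.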
When your file is hosted with a direct download link, MedPerf will be able to identify this file using that direct download link. So for example, when you are submitting an MLCube, you would pass your hosted MLCube manifest file as follows:
+ +Warning
+Files in this case are supposed to have anonymous public read access permission.
+It is common practice among current MedPerf users to host files on GitHub. You can learn below how to find the direct download link of a file hosted on GitHub. For other storage providers, you can look up the procedure online.
+It's important though to make sure the files won't be modified after being submitted to medperf, which could happen due to future commits. Because of this, the URLs of the files hosted on GitHub must contain a reference to the current commit hash. Below are the steps to get this URL for a specific file:
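As a rough illustration, a commit-pinned URL can be assembled from its parts. The sketch assumes the `raw.githubusercontent.com/<owner>/<repo>/<commit>/<path>` layout used by GitHub for raw file access; the helper name `github_raw_url` is hypothetical and not part of MedPerf.

```python
import re


def github_raw_url(owner: str, repo: str, commit: str, path: str) -> str:
    """Build a raw-file URL pinned to a specific commit."""
    # Require a full 40-character SHA-1 hash so that a branch name such as
    # "main" (which moves with future commits) is rejected.
    if not re.fullmatch(r"[0-9a-f]{40}", commit):
        raise ValueError("expected a full 40-character commit hash")
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{commit}/{path.lstrip('/')}"
```

Pinning to a commit hash means the URL keeps pointing at the same file contents even if the repository changes later.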
+You can choose the option of hosting with Synapse in cases where privacy is a concern. Please refer to this link for hosting files on the Synapse platform.
+When your file is hosted on Synapse, MedPerf will be able to identify this file using the Synapse ID corresponding to that file. So for example, when you are submitting an MLCube, you would pass your hosted MLCube manifest file as follows (note the prefix):
+ +Note that you need to authenticate with your Synapse credentials if you plan to use a Synapse file with MedPerf. To do so, run medperf auth synapse_login
.
Note
+You must authenticate if using files on Synapse. If this is not necessary, this means the file has anonymous public access read permission. In this case, Synapse allows you to generate a permanent direct download link for your file and you can follow the previous section.
+Once you have built an MLCube ready for MedPerf, you need to host its components somewhere on the cloud so that the MedPerf client can identify and retrieve it on other machines. The following is a description of what needs to be hosted.
+MLCubes execute a container image behind the scenes. This container image is usually hosted on a container registry, like Docker Hub. In cases where this is not possible, medperf provides the option of passing the image file directly (i.e. having the image file hosted somewhere and providing MedPerf with the download link). MLCubes that work with images outside of the docker registry usually store the image inside the <path_to_mlcube>/workspace/.image
folder. MedPerf supports using direct container image files for Singularity only.
Note
+While there is the option of hosting the singularity image directly, it is highly recommended to use a container registry for accessibility and usability purposes. MLCube also has mechanisms for converting containers to other runners, like Docker to Singularity.
+Note
+Docker Images can be on any docker container registry, not necessarily on Docker Hub.
+The following is the list of files that must be hosted separately so they can be used by MedPerf:
+mlcube.yaml
¶Every MLCube is defined by its mlcube.yaml
manifest file. As such, MedPerf needs to have access to this file to recreate the MLCube. This file can be found inside your MLCube at <path_to_mlcube>/mlcube.yaml
.
parameters.yaml
(Optional)¶The parameters.yaml
file specifies additional ways to parametrize your model MLCube using the same container image it is built with. This file can be found inside your MLCube at <path_to_mlcube>/workspace/parameters.yaml
.
additional_files.tar.gz
(Optional)¶MLCubes may require additional files that you may want to keep separate from the model architecture and hosted image, for example, model weights. This allows for testing multiple implementations of the same model without requiring a separate container image for each. If additional files are being used by your MLCube, they need to be compressed into a .tar.gz
file and hosted separately. You can create this tarball file with the following command
To facilitate hosting and interface compatibility validation, MedPerf provides a script that finds all the required assets, compresses them if necessary, and places them in a single location for easy access. To run the script, make sure you have medperf installed and you are in medperf's root directory:
+python scripts/package-mlcube.py \
+ --mlcube path/to/mlcube \
+ --mlcube-types <list-of-comma-separated-strings> \
+ --output path/to/file.tar.gz
+
where:
+ - path/to/mlcube is the path to the MLCube folder containing the manifest file (mlcube.yaml).
+ - --mlcube-types specifies a comma-separated list of MLCube types ('data-preparator' for a data preparation MLCube, 'model' for a model MLCube, and 'metrics' for a metrics MLCube).
+ - path/to/file.tar.gz is the path to the output file where you want to store the compressed version of all assets.
+See python scripts/package-mlcube.py --help for more information.
Once executed, you should be able to find all prepared assets at ./mlcube/assets
, as well as a compressed version of the assets
folder at the output path provided.
Note
+The --output
parameter is optional. The compressed version of the assets
folder can be useful in cases where you don't directly interact with the MedPerf server, but instead you do so through a third party. This is usually the case for challenges and competitions.
+ + + + + + + + + + +In this guide, you will learn how a user can use MedPerf to create a benchmark. The key tasks can be summarized as follows:
+It's assumed that you have already set up the general testing environment as explained in the installation and setup guide.
+The easiest way to try the tutorials is to launch a preinstalled Codespace cloud environment for MedPerf by clicking this link:
+ +To start experimenting with MedPerf through this tutorial on your local machine, you need to start by following these quick steps:
+ +For the purpose of the tutorial, you have to initialize a local MedPerf server with a fresh database and then create the necessary entities that you will be interacting with. To do so, run the following: (make sure you are in MedPerf's root folder)
+ +A script is provided to download all the necessary files so that you follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
+ +This will create a workspace folder medperf_tutorial
where all necessary files are downloaded. The folder contains the following content:
In this tutorial we will create a benchmark that classifies chest X-Ray images.
+The medperf_tutorial/demo_data/
folder contains the demo dataset content.
+ - The images/ folder includes sample images.
+ - labels/labels.csv provides a basic ground truth markup, indicating the class each image belongs to.
+The demo dataset is a sample dataset used for the development of your benchmark and used by Model Owners for the development of their models. More details are available in the section below.
+The medperf_tutorial/data_preparator/
contains a DataPreparator MLCube that you must implement. This MLCube:
+ - Transforms raw data into a format convenient for model consumption, such as converting DICOM images into numpy tensors, cropping patches, normalizing columns, etc. It's up to you to define the format that is handy for future models.
+ - Ensures its output is in a standardized format, allowing Model Owners/Developers to rely on its consistency.
The medperf_tutorial/model_custom_cnn/
is an example of a Model MLCube. You need to implement a reference model which will be used by data owners to test the compatibility of their data with your pipeline. Also, Model Developers joining your benchmark will follow the input/output specifications of this model when building their own models.
The medperf_tutorial/metrics/
houses a Metrics MLCube that processes ground truth data, model predictions, and computes performance metrics - such as classification accuracy, loss, etc. After a Dataset Owner runs the benchmark pipeline on their data, these final metric values will be shared with you as the Benchmark Owner.
In real life all the listed artifacts and files have to be created on your own. However, for tutorial's sake you may use this toy data.
+The local MedPerf server is pre-configured with a dummy local authentication system. Remember that when you are communicating with the real MedPerf server, you should follow the steps in this guide to log in. For the tutorials, you do not need to do anything.
+You are now ready to start!
++The implementation of a valid workflow is accomplished by implementing three MLCubes:
+Data Preparator MLCube: This MLCube will transform raw data into a dataset ready for the AI model execution. All data owners willing to participate in this benchmark will have their data prepared using this MLCube. A guide on how to implement data preparation MLCubes can be found here.
+Reference Model MLCube: This MLCube will contain an example model implementation for the desired AI task. It should be compatible with the data preparation MLCube (i.e., the outputs of the data preparation MLCube can be directly fed as inputs to this MLCube). A guide on how to implement model MLCubes can be found here.
+Metrics MLCube: This MLCube will be responsible for evaluating the performance of a model. It should be compatible with the reference model MLCube (i.e., the outputs of the reference model MLCube can be directly fed as inputs to this MLCube). A guide on how to implement metrics MLCubes can be found here.
+For this tutorial, you are provided with the following three already implemented MLCubes for the task of chest X-ray classification. The implementations can be found in the following links: Data Preparator, Reference Model, Metrics. These MLCubes are set up locally for you and can be found in your workspace folder under data_preparator
, model_custom_cnn
, and metrics
.
+A demo dataset is a small reference dataset. It contains a few data records and their labels, which will be used to test the benchmark's workflow in two scenarios:
+It is used for testing the benchmark's default workflow. The MedPerf client automatically runs a compatibility test of the benchmark's three mlcubes prior to its submission. The test is run using the benchmark's demo dataset as input.
+When a model owner wants to participate in the benchmark, the MedPerf client tests the compatibility of their model with the benchmark's data preparation cube and metrics cube. The test is run using the benchmark's demo dataset as input.
+For this tutorial, you are provided with a demo dataset for the chest X-ray classification workflow. The dataset can be found in your workspace folder under demo_data
. It is a small dataset comprising two chest X-ray images and corresponding thoracic disease labels.
You can test the workflow now that you have the three MLCubes and the demo data. Testing the workflow before submitting any asset to the MedPerf server is usually recommended.
+MedPerf provides a single command to test an inference workflow. To test your workflow with local MLCubes and local data, the following need to be passed to the command:
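+ - Path to the data preparation MLCube manifest file: medperf_tutorial/data_preparator/mlcube/mlcube.yaml.
+ - Path to the model MLCube manifest file: medperf_tutorial/model_custom_cnn/mlcube/mlcube.yaml.
+ - Path to the metrics MLCube manifest file: medperf_tutorial/metrics/mlcube/mlcube.yaml.
+ - Path to the demo dataset data records: medperf_tutorial/demo_data/images.
+ - Path to the demo dataset labels: medperf_tutorial/demo_data/labels.
+Run the following command to execute the test, ensuring you are in MedPerf's root folder: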
+medperf test run \
+ --data_preparation "medperf_tutorial/data_preparator/mlcube/mlcube.yaml" \
+ --model "medperf_tutorial/model_custom_cnn/mlcube/mlcube.yaml" \
+ --evaluator "medperf_tutorial/metrics/mlcube/mlcube.yaml" \
+ --data_path "medperf_tutorial/demo_data/images" \
+ --labels_path "medperf_tutorial/demo_data/labels"
+
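Before launching the test, it can help to confirm that every path passed to the command exists. The following is a minimal sketch of such a pre-flight check; the helper `missing_paths` is hypothetical and not part of the MedPerf client.

```python
from pathlib import Path


def missing_paths(paths: dict) -> list:
    """Return the option names whose paths do not exist on disk."""
    return [name for name, location in paths.items() if not Path(location).exists()]


# The five inputs of the test command, keyed by their option names.
TUTORIAL_PATHS = {
    "data_preparation": "medperf_tutorial/data_preparator/mlcube/mlcube.yaml",
    "model": "medperf_tutorial/model_custom_cnn/mlcube/mlcube.yaml",
    "evaluator": "medperf_tutorial/metrics/mlcube/mlcube.yaml",
    "data_path": "medperf_tutorial/demo_data/images",
    "labels_path": "medperf_tutorial/demo_data/labels",
}
```

Calling `missing_paths(TUTORIAL_PATHS)` from MedPerf's root folder returns an empty list when all inputs are in place.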
Assuming the test passes successfully, you are ready to submit the MLCubes to the MedPerf server.
+The demo dataset should be packaged in a specific way as a compressed tarball file. The folder structure in the workspace currently looks like the following:
+ +The goal is to package the folder demo_data
. You must first create a file called paths.yaml
. This file will provide instructions on how to locate the data records path and the labels path. The paths.yaml
file should specify both the data records path and the labels path.
In your workspace directory (medperf_tutorial
), create a file paths.yaml
and fill it with the following:
Note
+The paths are determined by the Data Preparator MLCube's expected input path.
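For illustration, such a file could also be generated from Python. This sketch rests on assumptions: the key names `data_path` and `labels_path` mirror the command-line options used elsewhere in this guide, and the values are assumed to be relative to the workspace folder. The function `write_paths_yaml` is hypothetical.

```python
from pathlib import Path


def write_paths_yaml(workspace, data_path="demo_data/images", labels_path="demo_data/labels"):
    """Write a two-entry paths.yaml into the workspace and return its location."""
    # Assumed keys: data_path and labels_path (not confirmed by this page).
    target = Path(workspace) / "paths.yaml"
    target.write_text(f"data_path: {data_path}\nlabels_path: {labels_path}\n")
    return target
```

The file is written as plain text, so no YAML library is needed for a mapping this small.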
+After that, the workspace should look like the following:
+ +Finally, compress the required assets (demo_data
and paths.yaml
) into a tarball file by running the following command:
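As an alternative sketch, the same packaging step can be done with Python's standard tarfile module. The function name `package_demo_data` is hypothetical, and the archive layout (both assets at the top level of the tarball) is an assumption based on the description above.

```python
import tarfile
from pathlib import Path


def package_demo_data(workspace, output_name="demo_data.tar.gz"):
    """Compress demo_data/ and paths.yaml from the workspace into one tarball."""
    workspace = Path(workspace)
    output = workspace / output_name
    with tarfile.open(str(output), "w:gz") as tar:
        for member in ("demo_data", "paths.yaml"):
            # arcname keeps the paths inside the archive relative to the workspace.
            tar.add(str(workspace / member), arcname=member)
    return output
```

Using `arcname` avoids embedding the absolute location of the workspace in the archive.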
And that's it! Now you have to host the tarball file (demo_data.tar.gz
) on the internet.
For the tutorial to run smoothly, the file is already hosted at the following URL:
+ +If you wish to host it by yourself, you can find the list of supported options and details about hosting files in this page.
+Finally, with the MLCubes submitted and the demo dataset hosted, you can submit the benchmark to the MedPerf server.
+The MedPerf server registers an MLCube as metadata comprised of a set of files that can be retrieved from the internet. This means that before submitting an MLCube you have to host its files on the internet. The MedPerf client provides a utility to prepare the files of an MLCube that need to be hosted. You can refer to this page if you want to understand what the files are, but using the utility script is enough.
+To prepare the files of the three MLCubes, run the following command ensuring you are in MedPerf's root folder:
+python scripts/package-mlcube.py --mlcube medperf_tutorial/data_preparator/mlcube --mlcube-types data-preparator
+python scripts/package-mlcube.py --mlcube medperf_tutorial/model_custom_cnn/mlcube --mlcube-types model
+python scripts/package-mlcube.py --mlcube medperf_tutorial/metrics/mlcube --mlcube-types metrics
+
For each MLCube, this script will create a new folder named assets
in the MLCube directory. This folder will contain all the files that should be hosted separately.
For the tutorial to run smoothly, the files are already hosted. If you wish to host them by yourself, you can find the list of supported options and details about hosting files in this page.
++For the Data Preparator MLCube, the submission should include:
+The URL to the hosted mlcube manifest file, which is:
+ +The URL to the hosted mlcube parameters file, which is:
+ +Use the following command to submit:
+medperf mlcube submit \
+ --name my-prep-cube \
+ --mlcube-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/data_preparator/mlcube/mlcube.yaml" \
+ --parameters-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/data_preparator/mlcube/workspace/parameters.yaml" \
+ --operational
+
+For the Reference Model MLCube, the submission should include:
+The URL to the hosted mlcube manifest file:
+ +The URL to the hosted mlcube parameters file:
+ +The URL to the hosted additional files tarball file:
+ +Use the following command to submit:
+medperf mlcube submit \
+ --name my-modelref-cube \
+ --mlcube-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_custom_cnn/mlcube/mlcube.yaml" \
+ --parameters-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_custom_cnn/mlcube/workspace/parameters.yaml" \
+ --additional-file "https://storage.googleapis.com/medperf-storage/chestxray_tutorial/cnn_weights.tar.gz" \
+ --operational
+
+For the Metrics MLCube, the submission should include:
+The URL to the hosted mlcube manifest file:
+ +The URL to the hosted mlcube parameters file:
+ +Use the following command to submit:
+medperf mlcube submit \
+ --name my-metrics-cube \
+ --mlcube-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/metrics/mlcube/mlcube.yaml" \
+ --parameters-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/metrics/mlcube/workspace/parameters.yaml" \
+ --operational
+
Each of the three MLCubes will be assigned a server UID. You can check the server UID for each MLCube by running:
+ +Next, you will learn how to host the demo dataset.
++You need to keep at hand the following information:
+ - Data Preparator MLCube UID: 1
+ - Reference Model MLCube UID: 2
+ - Metrics MLCube UID: 3
You can create and submit your benchmark using the following command:
+medperf benchmark submit \
+ --name tutorial_bmk \
+ --description "MedPerf demo bmk" \
+ --demo-url "https://storage.googleapis.com/medperf-storage/chestxray_tutorial/demo_data.tar.gz" \
+ --data-preparation-mlcube 1 \
+ --reference-model-mlcube 2 \
+ --evaluator-mlcube 3 \
+ --operational
+
The MedPerf client will first automatically run a compatibility test between the MLCubes using the demo dataset. If the test is successful, the benchmark will be submitted along with the compatibility test results.
+Note
+The benchmark will stay inactive until the MedPerf server admin approves your submission.
+That's it! You can check your benchmark's server UID by running:
+You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
+To shut down the local MedPerf server, press CTRL+C in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
+As a data owner, you plan to run a benchmark on your own dataset. Using MedPerf, you will prepare your (raw) dataset and submit information about it to the MedPerf server. You may have to consult the benchmark committee to make sure that your raw dataset aligns with the benchmark's expected input format.
+Note
+A key concept of MedPerf is the stringent confidentiality of your data. It remains exclusively on your machine. Only minimal information about your dataset, such as the hash of its contents, is submitted. Once your Dataset is submitted and associated with a benchmark, you can run all benchmark models on your data within your own infrastructure and see the results / predictions.
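To make the idea of a content hash concrete, here is a minimal sketch of hashing a folder with SHA-256. It is an illustration only: the exact hashing scheme used by MedPerf is not described here, and the function `folder_sha256` is hypothetical.

```python
import hashlib
from pathlib import Path


def folder_sha256(folder) -> str:
    """Hash every file under a folder, visiting files in a stable sorted order."""
    root = Path(folder)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative file name so that renaming a file changes the hash.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

A hash like this identifies the dataset contents without revealing them, which is why it can be shared while the data itself stays on your machine.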
+This guide provides you with the necessary steps to use MedPerf as a Data Owner. The key tasks can be summarized as follows:
+It is assumed that you have the general testing environment set up.
+The easiest way to try the tutorials is to launch a preinstalled Codespace cloud environment for MedPerf by clicking this link:
+ +To start experimenting with MedPerf through this tutorial on your local machine, you need to start by following these quick steps:
+ +For the purpose of the tutorial, you have to initialize a local MedPerf server with a fresh database and then create the necessary entities that you will be interacting with. To do so, run the following: (make sure you are in MedPerf's root folder)
+ +A script is provided to download all the necessary files so that you follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
+ +This will create a workspace folder medperf_tutorial
where all necessary files are downloaded. The folder contains the following content:
The medperf_tutorial/sample_raw_data/
folder contains your data for the specified Benchmark. In this tutorial, where the benchmark involves classifying chest X-Ray images, your data comprises:
+ - The images/ folder, which contains your images.
+ - labels/labels.csv, which provides the ground truth markup, specifying the class of each image.
+The format of this data is dictated by the Benchmark Owner, as it must be compatible with the benchmark's Data Preparation MLCube. In a real-world scenario, the expected data format would differ from this toy example. Refer to the Benchmark Owner to get the format specifications and details for your practical case.
+As previously mentioned, your data itself never leaves your machine. During the dataset submission, only basic metadata is transferred, for which you will be prompted to confirm.
+In real life all the listed artifacts and files have to be created on your own. However, for tutorial's sake you may use this toy data.
+The local MedPerf server is pre-configured with a dummy local authentication system. Remember that when you are communicating with the real MedPerf server, you should follow the steps in this guide to log in. For the tutorials, you do not need to do anything.
+You are now ready to start!
++To register your dataset, you need to collect the following information:
+ - The path to the data records (here, medperf_tutorial/sample_raw_data/images).
+ - The path to the labels (here, medperf_tutorial/sample_raw_data/labels).
+Note
+The data_path
and labels_path
are determined according to the input path requirements of the data preparation MLCube. To ensure that your data is structured correctly, it is recommended to check with the Benchmark Committee for specific details or instructions.
In order to find the benchmark ID, you can execute the following command to view the list of available benchmarks.
+ +The target benchmark ID here is 1
.
Note
+You will be submitting general information about the data, not the data itself. The data never leaves your machine.
+Run the following command to register your data (make sure you are in MedPerf's root folder):
+medperf dataset submit \
+ --name "mytestdata" \
+ --description "A tutorial dataset" \
+ --location "My machine" \
+ --data_path "medperf_tutorial/sample_raw_data/images" \
+ --labels_path "medperf_tutorial/sample_raw_data/labels" \
+ --benchmark 1
+
Once you run this command, the information to be submitted will be displayed on the screen and you will be asked to confirm your submission. Once you confirm, your dataset will be successfully registered!
+To prepare and preprocess your dataset, you need to know the server UID of your registered dataset. You can check your datasets' information by running:
+ +In our tutorial, your dataset ID will be 1
. Run the following command to prepare your dataset:
This command will also calculate statistics on your data, as defined by the benchmark owner. These will be submitted to the MedPerf server in the next step upon your approval.
+After successfully preparing your dataset, you can mark it as ready so that it can be associated with benchmarks you want. During preparation, your dataset is considered in the Development
stage, and now you will mark it as operational.
Note
+Once a dataset is marked as operational, it can never be reverted to in-development.
+Run the following command to mark your dataset as operational:
+ +Once you run this command, you will see on your screen the updated information of your dataset along with the statistics mentioned in the previous step. You will be asked to confirm submission of the displayed information. Once you confirm, your dataset will be successfully marked as operational!
+Next, you can proceed to request participation in the benchmark by initiating an association request.
++To submit the results of executing the benchmark models on your data in the future, you must first associate your dataset with the benchmark.
+Once you have submitted your dataset to the MedPerf server, it will be assigned a server UID, which you can find by running medperf dataset ls --mine
. Your dataset's server UID is also 1
.
Run the following command to request associating your dataset with the benchmark:
+ +This command will first run the benchmark's reference model on your dataset to ensure your dataset is compatible with the benchmark workflow. Then, the association request information is printed on the screen, which includes an executive summary of the test mentioned. You will be prompted to confirm sending this information and initiating this association request.
+
+When participating with a real benchmark, you must wait for the Benchmark Committee to approve the association request. You can check the status of your association requests by running medperf association ls
. The association is identified by the server UIDs of your dataset and the benchmark with which you are requesting association.
For the sake of continuing the tutorial only, run the following to simulate the benchmark committee approving your association (make sure you are in the MedPerf's root directory):
+ +You can verify if your association request has been approved by running medperf association ls
.
+MedPerf provides a command that runs all the models of a benchmark effortlessly. You only need to provide two parameters:
+ - The benchmark ID, which is 1.
+ - The dataset ID, which is 1.
+For that, run the following command:
+ +After running the command, you will receive a summary of the executions. You will see something similar to the following:
+ model local result UID partial result from cache error
+------- ------------------ ---------------- ------------ -------
+ 2 b1m2d1 False True
+ 4 b1m4d1 False False
+Total number of models: 2
+ 1 were skipped (already executed), of which 0 have partial results
+ 0 failed
+ 1 ran successfully, of which 0 have partial results
+
+✅ Done!
+
This means that the benchmark has two models:
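+ - Model 2, which was skipped because it had already been executed; its result was loaded from the cache.
+ - Model 4, which ran successfully. Its local result UID is b1m4d1.
+You can view the results by running the following command with the specific local result UID. For example: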
+ +For now, your results are only local. Next, you will learn how to submit the results.
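The local result UIDs in the summary above appear to combine the benchmark, model, and dataset IDs (for example, b1m4d1 for benchmark 1, model 4, dataset 1). That naming pattern is inferred from the two examples shown and is an assumption, not a documented format. Under that assumption, a small hypothetical helper can split a UID into its parts:

```python
import re


def parse_result_uid(uid: str) -> dict:
    """Split a UID such as 'b1m4d1' into benchmark, model, and dataset IDs."""
    # Assumed pattern: b<benchmark>m<model>d<dataset>, inferred from the examples.
    match = re.fullmatch(r"b(\d+)m(\d+)d(\d+)", uid)
    if match is None:
        raise ValueError(f"unrecognized result UID: {uid!r}")
    benchmark, model, dataset = (int(group) for group in match.groups())
    return {"benchmark": benchmark, "model": model, "dataset": dataset}
```

This can be handy for matching a result back to the model row it came from when a benchmark has many models.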
++After executing the benchmark, you will submit a result to the MedPerf server. To do so, you have to find the generated UID of the target result.
+As an example, you will be submitting the result of UID b1m4d1
. To do this, run the following command:
The information that is going to be submitted will be printed to the screen and you will be prompted to confirm that you want to submit.
+You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
+To shut down the local MedPerf server, press CTRL+C in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
+Make sure you have Python 3.9 installed along with pip. To check if they are installed, run:
+ +or, depending on your machine configuration:
+ +We will assume the commands' names are pip
and python
. Use pip3
and python3
if your machine is configured differently.
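The interpreter version can also be checked from within Python itself. This is a minimal sketch that tests for the 3.9 release mentioned above; whether other versions also work is not stated here, so the strict comparison is an assumption. The function name `matches_python_39` is hypothetical.

```python
import sys


def matches_python_39(version_info=sys.version_info) -> bool:
    """Return True when the interpreter's major and minor version are 3.9."""
    return tuple(version_info[:2]) == (3, 9)
```

Calling `matches_python_39()` with no arguments checks the interpreter that is currently running.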
Make sure you have the latest version of Docker or Singularity 3.10 installed.
+To verify docker is installed, run:
+ +To verify singularity is installed, run:
+ +If using Docker, make sure you can run Docker as a non-root user.
+(Optional) It is recommended to install MedPerf in a virtual environment. We recommend using Anaconda. With Anaconda installed, create a virtual environment medperf-env
with the following command:
Then, activate your environment:
+ +Clone the MedPerf repository:
+ +Install MedPerf from source:
+ +Verify the installation:
+ +In this guide, you will learn how a Model Owner can use MedPerf to take part in a benchmark. It's highly recommended that you follow this or this guide first to implement your own model MLCube and use it throughout this tutorial. However, this guide provides an already implemented MLCube if you want to directly proceed to learn how to interact with MedPerf.
+The main tasks of this guide are:
+It's assumed that you have already set up the general testing environment as explained in the setup guide.
+The easiest way to try the tutorials is to launch a preinstalled Codespace cloud environment for MedPerf by clicking this link:
+ +To start experimenting with MedPerf through this tutorial on your local machine, you need to start by following these quick steps:
+ +For the purpose of the tutorial, you have to initialize a local MedPerf server with a fresh database and then create the necessary entities that you will be interacting with. To do so, run the following: (make sure you are in MedPerf's root folder)
+ +A script is provided to download all the necessary files so that you follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
+ +This will create a workspace folder medperf_tutorial
where all necessary files are downloaded. The folder contains the following content:
The medperf_tutorial/model_mobilenetv2/
is a toy Model MLCube. Once you submit your model to the benchmark, all participating Data Owners would be able to run the model within the benchmark pipeline. Therefore, your MLCube must support the specific input/output formats defined by the Benchmark Owners.
For the purposes of this tutorial, you will work with a pre-prepared toy benchmark. In a real-world scenario, you should refer to your Benchmark Owner to get the format specifications and details for your practical case.
+In real life all the listed artifacts and files have to be created on your own. However, for tutorial's sake you may use this toy data.
+The local MedPerf server is pre-configured with a dummy local authentication system. Remember that when you are communicating with the real MedPerf server, you should follow the steps in this guide to login. For the tutorials, you should not do anything.
+You are now ready to start!
++Before submitting your MLCube, it is highly recommended that you test your MLCube compatibility with the benchmarks of interest to avoid later edits and multiple submissions. Your MLCube should be compatible with the benchmark workflow in two main ways:
+These details should usually be acquired by contacting the Benchmark Committee and following their instructions.
+To test your MLCube validity with the benchmark, first run medperf benchmark ls
to identify the benchmark's server UID. In this case, it is going to be 1
.
Next, locate the MLCube. Unless you implemented your own MLCube, the MLCube provided for this tutorial is located in your workspace: medperf_tutorial/model_mobilenetv2/mlcube/mlcube.yaml
.
After that, run the compatibility test:
+medperf test run \
+ --benchmark 1 \
+ --model "medperf_tutorial/model_mobilenetv2/mlcube/mlcube.yaml"
+
Assuming the test passes successfully, you are ready to submit the MLCube to the MedPerf server.
+The MedPerf server registers an MLCube as metadata comprised of a set of files that can be retrieved from the internet. This means that before submitting an MLCube you have to host its files on the internet. The MedPerf client provides a utility to prepare the files of an MLCube that need to be hosted. You can refer to this page if you want to understand what the files are, but using the utility script is enough.
+To prepare the files of the MLCube, run the following command ensuring you are in MedPerf's root folder:
+python scripts/package-mlcube.py --mlcube medperf_tutorial/model_mobilenetv2/mlcube --mlcube-types model
+
This script will create a new folder in the MLCube directory, named assets
, containing all the files that should be hosted separately.
For the tutorial to run smoothly, the files are already hosted. If you wish to host them by yourself, you can find the list of supported options and details about hosting files in this page.
+The submission should include the URLs of all the hosted files. For the MLCube provided for the tutorial:
+https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/mlcube.yaml
+
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/workspace/parameters.yaml
+
Use the following command to submit:
+medperf mlcube submit \
+ --name my-model-cube \
+ --mlcube-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/mlcube.yaml" \
+ --parameters-file "https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/workspace/parameters.yaml" \
+ --additional-file "https://storage.googleapis.com/medperf-storage/chestxray_tutorial/mobilenetv2_weights.tar.gz" \
+ --operational
+
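If you script your submissions, the command above can be assembled programmatically. The following is a minimal sketch, not part of MedPerf: the helper name `build_submit_command` is hypothetical, and it only uses the flags that appear in the command above.

```python
import shlex


def build_submit_command(name, mlcube_url, parameters_url,
                         additional_url=None, operational=True):
    """Assemble the `medperf mlcube submit` argument list shown in this tutorial.

    Hypothetical helper: it only mirrors the flags used in the command above.
    """
    cmd = [
        "medperf", "mlcube", "submit",
        "--name", name,
        "--mlcube-file", mlcube_url,
        "--parameters-file", parameters_url,
    ]
    if additional_url:
        cmd += ["--additional-file", additional_url]
    if operational:
        cmd.append("--operational")
    return cmd


BASE = ("https://raw.githubusercontent.com/mlcommons/medperf/main/"
        "examples/chestxray_tutorial/model_mobilenetv2/mlcube")

cmd = build_submit_command(
    name="my-model-cube",
    mlcube_url=f"{BASE}/mlcube.yaml",
    parameters_url=f"{BASE}/workspace/parameters.yaml",
    additional_url=("https://storage.googleapis.com/medperf-storage/"
                    "chestxray_tutorial/mobilenetv2_weights.tar.gz"),
)

# Print the command for review; to actually submit, pass `cmd` to
# subprocess.run(cmd, check=True) in an environment where medperf is installed.
print(shlex.join(cmd))
```

Building the command as a list (rather than a single string) avoids shell quoting problems when URLs contain special characters.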
The MLCube will be assigned a server UID. You can check it by running:
+ ++Benchmark workflows are run by Data Owners, who will get notified when a new model is added to a benchmark. You must request the association for your model to be part of the benchmark.
+To initiate an association request, you need to collect the following information:
+ - The target benchmark ID, which is 1
 - The server UID of your MLCube, which is 4
Run the following command to request associating your MLCube with the benchmark:
+ +This command will first run the benchmark's workflow on your model to ensure your model is compatible with the benchmark workflow. Then, the association request information is printed on the screen, which includes an executive summary of the test mentioned. You will be prompted to confirm sending this information and initiating this association request.
+
+When participating in a real benchmark, you must wait for the Benchmark Committee to approve the association request. You can check the status of your association requests by running medperf association ls
. The association is identified by the server UIDs of your MLCube and the benchmark with which you are requesting association.
You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
+To shut down the local MedPerf server: press CTRL
+C
in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
+The MedPerf client provides all the necessary tools to run a complete benchmark experiment. Below, you will find a comprehensive breakdown of user roles and the corresponding functionalities they can access and perform using the MedPerf client:
+This setup is only for running the tutorials. If you are using MedPerf with a real benchmark and real experiments, skip to this section to optionally change your container runner. Then, follow the tutorials as general guidance for your real experiments.
+If this is your first time using MedPerf, install the MedPerf client library as described here.
+For this tutorial, you should spawn a local MedPerf server for the MedPerf client to communicate with. Note that this server will be hosted on your localhost
and not on the internet.
Install the server requirements ensuring you are in MedPerf's root folder:
+ +Run the local MedPerf server using the following command:
+ +The local MedPerf server is now ready to receive requests. You can always stop the server by pressing CTRL
+C
in the terminal where you ran the server.
After that, you will be configuring the MedPerf client to communicate with the local MedPerf server. Make sure you continue following the instructions in a new terminal.
+The MedPerf client can be configured by creating or modifying "profiles
". A profile is a set of configuration parameters used by the client during runtime. By default, the profile named default
will be active.
The default
profile is preconfigured so that the client communicates with the main MedPerf server (api.medperf.org). For the purposes of the tutorial, you will be using the local
profile as it is preconfigured so that the client communicates with the local MedPerf server.
To activate the local
profile, run the following command:
You can always check which profile is active by running:
+ +To view the current active profile's configured parameters, you can run the following:
+ +You can configure the MedPerf client to use either Docker or Singularity. The local
profile is configured to use Docker. If you want to use MedPerf with Singularity, modify the local
profile configured parameters by running the following:
This command will modify the platform
parameter of the currently activated profile.
The local MedPerf server is now ready to receive requests, and the MedPerf client is ready to communicate. Depending on your role, you can follow these hands-on tutorials:
You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
+To shut down the local MedPerf server: press CTRL
+C
in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
+The MedPerf server registers an MLCube as metadata comprised of a set of files that can be retrieved from the internet. This means that before submitting an MLCube you have to host its files on the internet. The MedPerf client provides a utility to prepare the files of an MLCube that need to be hosted. You can refer to this page if you want to understand what the files are, but using the utility script is enough.
+ + + + + + + + + + +In this tutorial we will create a benchmark that classifies chest X-Ray images.
+The medperf_tutorial/demo_data/
folder contains the demo dataset content.
 - images/ folder includes sample images.
 - labels/labels.csv provides a basic ground truth markup, indicating the class each image belongs to.
The demo dataset is a sample dataset used for the development of your benchmark and used by Model Owners for the development of their models. More details are available in the section below.
+The medperf_tutorial/data_preparator/
contains a DataPreparator MLCube that you must implement. This MLCube:
+ - Transforms raw data into a format convenient for model consumption, such as converting DICOM images into numpy tensors, cropping patches, normalizing columns, etc. It's up to you to define the format that is handy for future models.
+ - Ensures its output is in a standardized format, allowing Model Owners/Developers to rely on its consistency.
The medperf_tutorial/model_custom_cnn/
is an example of a Model MLCube. You need to implement a reference model which will be used by data owners to test the compatibility of their data with your pipeline. Also, Model Developers joining your benchmark will follow the input/output specifications of this model when building their own models.
The medperf_tutorial/metrics/
houses a Metrics MLCube that processes ground truth data, model predictions, and computes performance metrics - such as classification accuracy, loss, etc. After a Dataset Owner runs the benchmark pipeline on their data, these final metric values will be shared with you as the Benchmark Owner.
The medperf_tutorial/sample_raw_data/
folder contains your data for the specified Benchmark. In this tutorial, where the benchmark involves classifying chest X-Ray images, your data comprises:
 - images/ folder contains your images.
 - labels/labels.csv, which provides the ground truth markup, specifying the class of each image.
The format of this data is dictated by the Benchmark Owner, as it must be compatible with the benchmark's Data Preparation MLCube. In a real-world scenario, the expected data format would differ from this toy example. Refer to the Benchmark Owner for the format specifications and details that apply to your case.
+As previously mentioned, your data itself never leaves your machine. During the dataset submission, only basic metadata is transferred, for which you will be prompted to confirm.
+ + + + + + + + + + +The medperf_tutorial/model_mobilenetv2/
is a toy Model MLCube. Once you submit your model to the benchmark, all participating Data Owners would be able to run the model within the benchmark pipeline. Therefore, your MLCube must support the specific input/output formats defined by the Benchmark Owners.
For the purposes of this tutorial, you will work with a pre-prepared toy benchmark. In a real-world scenario, you should ask your Benchmark Owner for the format specifications and details that apply to your case.
+ + + + + + + + + + +MedPerf uses passwordless authentication. This means that there is no password to set; you only need access to your email in order to complete the signup process.
+Automatic signups are currently disabled. Please contact the MedPerf team in order to provision an account.
+Tip
+You don't need an account to run the tutorials and learn how to use the MedPerf client.
+The tutorials simulate a benchmarking example for the task of detecting thoracic diseases from chest X-ray scans. You can find the description of the used data here. Throughout the tutorials, you will be interacting with a temporary local MedPerf server as described in the setup page. This allows you to freely experiment with the MedPerf client and rerun the tutorials as many times as you want. Please note that these tutorials also serve as general guidance to be followed when using the MedPerf client in a real scenario.
+Before proceeding to the tutorials, make sure you have the general tutorial environment set up.
+To ensure users have the best experience in learning the fundamentals of MedPerf and utilizing the MedPerf client, the following set of tutorials are provided:
+ + + + + + + + + + + + + +Explore our documentation hub to understand everything you need to get started with MedPerf, + including definitions, setup, tutorials, advanced concepts, and more.
+ +Click here to see the documentation specifically for benchmark owners.
+Click here to see the documentation specifically for model owners.
+Click here to see the documentation specifically for data owners.
+The server contains all the metadata necessary to coordinate and execute experiments. No code assets or datasets are stored on the server.
+The backend server is implemented in Django, and it can be found in the server folder in the MedPerf GitHub repository.
+The MedPerf client contains all the necessary tools to interact with the server, prepare datasets for benchmarks, and run experiments on the local machine. It can be found in this folder in the MedPerf GitHub repository.
+The client communicates with the server through the API to, for example, authenticate a user, retrieve benchmarks/MLCubes, and send results.
+The client is currently available to the user through a command-line interface (CLI).
+The auth provider manages MedPerf users' identities, authentication, and authorization to access the MedPerf server. Users will authenticate with the auth provider and authorize their MedPerf client to access the MedPerf server. Upon authorization, the MedPerf client will use access tokens issued by the auth provider in every request to the MedPerf server. The MedPerf server is configured to process only requests authorized by the auth provider.
+Currently, MedPerf uses Auth0 as the auth provider.
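To make the token flow above concrete, here is a minimal sketch of how a client could attach an access token to a request. It is an illustration only, not MedPerf's client code: the `Bearer` scheme, the URL, and the function name `authorized_request` are assumptions.

```python
import urllib.request


def authorized_request(url, access_token):
    """Build (but do not send) a request that carries an access token.

    Assumption: the token is sent in the Authorization header using the
    Bearer scheme, as is common for tokens issued by an auth provider.
    """
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Hypothetical URL and token, for illustration only.
req = authorized_request("http://localhost:8000/hypothetical/endpoint",
                         "example-token")
print(req.get_header("Authorization"))
```

A server configured as described would reject any request whose token was not issued by the auth provider.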
+ + + + + + + + + + +This guide will walk you through how to wrap a model trained using GaNDLF as a MedPerf-compatible MLCube ready to be used for inference (i.e. as a Model MLCube). The steps can be summarized as follows:
+Before proceeding, make sure you have medperf installed and GaNDLF installed.
+A script is provided to download all the necessary files so that you follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
+ +This will create a workspace folder medperf_tutorial
where all necessary files are downloaded. Run cd medperf_tutorial
to switch to this folder.
Train a small GaNDLF model to use for this guide. You can skip this step if you already have a trained model.
+Make sure you are in the workspace folder medperf_tutorial
. Run:
gandlf_run \
+ -c ./config_getting_started_segmentation_rad3d.yaml \
+ -i ./data.csv \
+ -m ./trained_model_output \
+ -t True \
+ -d cpu
+
Note that if you want to train on GPU you can use -d cuda
, but the example used here should take only a few seconds on the CPU.
Warning
+This tutorial assumes the user is using the latest GaNDLF version. The configuration file config_getting_started_segmentation_rad3d.yaml
may cause problems if you are using a different version; in that case, make the necessary changes.
You will now have your trained model and its related files in the folder trained_model_output
. Next, you will start learning how to wrap this trained model within an MLCube.
MedPerf provides a cookiecutter to create an MLCube file that is ready to be consumed by gandlf_deploy
and produces an MLCube ready to be used by MedPerf. To create the MLCube, run: (make sure you are in the workspace folder medperf_tutorial
)
Note
+MedPerf is running CookieCutter under the hood. This medperf command provides additional arguments for handling different scenarios. You can see more information on this by running medperf mlcube create --help
You will be prompted to customize the MLCube creation. Below is an example of what your responses might look like:
+project_name [GaNDLF MLCube]: My GaNDLF MLCube # (1)!
+project_slug [my_gandlf_mlcube]: my_gandlf_mlcube # (2)!
+description [GaNDLF MLCube Template. Provided by MLCommons]: GaNDLF MLCube implementation # (3)!
+author_name [John Smith]: John Smith # (4)!
+accelerator_count [1]: 0 # (5)!
+docker_build_file [Dockerfile-CUDA11.6]: Dockerfile-CPU # (6)!
+docker_image_name [docker/image:latest]: johnsmith/gandlf_model:0.0.1 # (7)!
+
Assuming you chose my_gandlf_mlcube
as the project slug, you will find your MLCube created under the folder my_gandlf_mlcube
. Next, you will use a GaNDLF
utility to build the MLCube.
Note
+You might need to specify additional configurations in the mlcube.yaml
file if you are using a GPU. Check the generated mlcube.yaml
file for more info, as well as the MLCube documentation.
When deploying the GaNDLF model directly as a model MLCube, the default entrypoint will be gandlf_run ...
. You can override the entrypoint with a custom Python script. One use case is described below.
gandlf_run
expects a data.csv
file in the input data folder, which describes the inference test cases and their associated paths (Read more about GaNDLF's csv file conventions here). In case your MLCube will expect a data folder with a predefined data input structure but without this csv file, you can use a custom script that prepares this csv file as an entrypoint. You can find the recommended template and an example here.
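As a rough illustration of such an entrypoint helper, the sketch below generates a CSV from a data folder. It is not the recommended template: the folder layout (one subfolder per subject containing `image.nii.gz`), the function name `write_gandlf_csv`, and the column names `SubjectID` and `Channel_0` are assumptions that you should check against GaNDLF's CSV conventions for your task.

```python
import csv
from pathlib import Path


def write_gandlf_csv(data_dir, csv_path, image_name="image.nii.gz"):
    """Write a CSV listing one inference case per subject subfolder.

    Assumed layout (hypothetical): <data_dir>/<subject_id>/<image_name>.
    Assumed header: SubjectID, Channel_0 (verify against GaNDLF's docs).
    Returns the number of subjects written.
    """
    subjects = sorted(p for p in Path(data_dir).iterdir() if p.is_dir())
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["SubjectID", "Channel_0"])
        for subject in subjects:
            writer.writerow([subject.name, str(subject / image_name)])
    return len(subjects)
```

A custom entrypoint would call a helper like this first and then hand the generated file to the regular inference run.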
To deploy the GaNDLF model as an MLCube, run the following: (make sure you are in the workspace folder medperf_tutorial
)
gandlf_deploy \
+ -c ./config_getting_started_segmentation_rad3d.yaml \
+ -m ./trained_model_output \
+ --target docker \
+ --mlcube-root ./my_gandlf_mlcube \
+ -o ./built_gandlf_mlcube \
+ --mlcube-type model \
+ --entrypoint <(optional) path to your custom entrypoint script> \ # (1)!
+ -g False # (2)!
+
Set -g to True if you want the resulting MLCube to use a GPU for inference.
GaNDLF will use your initial MLCube configuration my_gandlf_mlcube, the GaNDLF experiment configuration file config_getting_started_segmentation_rad3d.yaml, and the trained model trained_model_output to create a ready MLCube built_gandlf_mlcube and build the docker image that will be used by the MLCube. The docker image will have the model weights and the GaNDLF experiment configuration file embedded. You can check that your image was built by running docker image ls. You will see johnsmith/gandlf_model:0.0.1 (or whichever image name you used) created moments ago.
That's it! You have built a MedPerf-compatible MLCube with GaNDLF. If you want to submit your MLCube to MedPerf, you can follow this tutorial.
+Tip
+MLCubes created by GaNDLF have the model weights and configuration file embedded in the docker image. When you want to deploy your MLCube for MedPerf, all you need to do is push the docker image and host the mlcube.yaml file.
+You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
+TODO: Change the structure to align with mlcube_models, to help users wrap their existing code into mlcube
+This guide is one of three designed to assist users in building MedPerf-compatible MLCubes. The other two guides focus on creating a Model MLCube and a Metrics MLCube. Together, these three MLCubes form a complete benchmark workflow for the task of thoracic disease detection from Chest X-rays.
+In summary, a functional MedPerf pipeline includes these steps:
+1. my_raw_data/. If the pipeline is run by another person (Model Owner/Benchmark Owner), a predefined my_benchmark_demo_raw_data/ would be used instead (created and distributed by the Benchmark Owner).
2. my_prepared_dataset/ folder (MLCube is implemented by the Benchmark Owner).
3. my_model_predictions/ folder (MLCube is implemented by the Model Owner; the Benchmark Owner must implement a baseline model MLCube to be used as a mock-up).
4. my_metrics.yaml file (MLCube implemented by the Benchmark Owner).
The aforementioned guides detail steps 2-4. As all steps demonstrate building specific MLCubes, we recommend starting with the Model MLCube guide, which offers a more detailed explanation of the MLCube's concept and structure. Another option is to explore the MLCube basic docs. This guide provides a shortened description of the concepts, focusing on nuances and input/output parameters.
+This guide describes the tasks, structure and input/output parameters of Data Preparator MLCube, allowing users at the end to be able to implement their own MedPerf-compatible MLCube for Benchmark purposes.
+The guide starts with general advice, steps, and the required API for building these MLCubes. Subsequently, it will lead you through creating your MLCube using the Chest X-ray Data Preprocessor MLCube as a practical example.
+It's considered best practice to handle data in various formats. For instance, if the benchmark involves image processing, it's beneficial to support JPEGs, PNGs, BMPs, and other expected image formats; accommodate large and small images, etc. Such flexibility simplifies the process for Dataset Owners, allowing them to export data in their preferred format. The Data Preparator's role is to convert all reasonable input data into a unified format.
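One way to accept several input formats is to collect files by extension before converting them. The sketch below is an illustration under stated assumptions: the extension set and the function name `find_images` are hypothetical, and your benchmark may need a different list.

```python
from pathlib import Path

# Hypothetical set of accepted formats; extend it to match your benchmark.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_images(data_path):
    """Return all supported image files under data_path, sorted by path.

    The extension check is case-insensitive, so `scan.JPG` and `scan.jpg`
    are both accepted. Subfolders are searched recursively.
    """
    root = Path(data_path)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

The prepare task can then load each returned file and write it out in the single unified format that models will consume.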
+Your MLCube must implement three command tasks:
+ - prepare: your main task that transforms raw input data into a unified format.
 - sanity_check: verifies the cleanliness and consistency of prepare outputs (e.g., ensuring no records lack ground truth labels, labels contain only expected values, data fields are reasonable without outliers or NaNs, etc.).
 - statistics: calculates some aggregated statistics on the transformed dataset. Once the Dataset Owner submits their dataset, these statistics will be uploaded to you as the Benchmark Owner.
It's assumed that you already have:
+This guide will help you encapsulate your preparation code within an MLCube. Make sure you extracted each part of your logic, so it can be run independently.
+Each command execution receives specific parameters. While you are flexible in code implementation, keep in mind that your implementation will receive the following input arguments:
+prepare Command
The parameters include:
+- data_path
: the path to the raw data folder (read-only).
+- labels_path
: the path to the ground truth labels folder (read-only).
+- Any other optional extra params that you attach to the MLCube, such as path to .txt
file with acceptable labels. Note: these extra parameters contain values defined by you, the MLCube owner, not the users' data.
+- output_path
: an r/w folder for storing transformed dataset objects.
+- output_labels_path
: an r/w folder for storing transformed labels.
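To show how these arguments fit together, here is a minimal sketch of a prepare function with the same argument order used by the tutorial's `mlcube.py`. Everything inside it is an assumption for illustration: the `labels.csv` layout with `image` and `label` columns, the `accepted_labels` parameter key, and the choice to copy files unchanged rather than convert them.

```python
import csv
import shutil
from pathlib import Path


def prepare_dataset(data_path, labels_path, parameters,
                    output_path, output_labels_path):
    """Copy accepted records into the output folders in a unified layout.

    Assumptions (hypothetical): raw labels live in <labels_path>/labels.csv
    with `image` and `label` columns, and parameters["accepted_labels"]
    lists the labels the benchmark supports. Returns the number of records kept.
    """
    accepted = set(parameters.get("accepted_labels", []))
    out_data, out_labels = Path(output_path), Path(output_labels_path)
    out_data.mkdir(parents=True, exist_ok=True)
    out_labels.mkdir(parents=True, exist_ok=True)

    with open(Path(labels_path) / "labels.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    kept = []
    for row in rows:
        label = row["label"].strip().lower()  # normalize label spelling
        src = Path(data_path) / row["image"]
        # The input folders are read-only, so only the outputs are written.
        if label in accepted and src.is_file():
            shutil.copy2(src, out_data / src.name)
            kept.append({"image": src.name, "label": label})

    with open(out_labels / "labels.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["image", "label"])
        writer.writeheader()
        writer.writerows(kept)
    return len(kept)
```

A real preparator would typically also convert the images themselves (for example, to arrays of a fixed size) instead of copying them as-is.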
sanity_check Command
The parameters include:
+- data_path
: the path to the transformed data folder (read-only).
+- labels_path
: the path to the transformed ground truth labels folder (read-only).
+- Any other optional extra params that you attach to the MLCube - same as for the prepare
command.
The sanity check does not produce outputs; it either completes successfully or fails.
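The "succeed or fail" behavior can be implemented by raising an exception on the first problem found. The following is a sketch only, using the same argument order as the tutorial's `mlcube.py`; the `labels.csv` layout (`image` and `label` columns) and the `accepted_labels` parameter key are hypothetical.

```python
import csv
from pathlib import Path


def perform_sanity_checks(data_path, labels_path, parameters):
    """Raise ValueError if the prepared dataset is inconsistent.

    Assumptions (hypothetical): prepared labels live in
    <labels_path>/labels.csv with `image` and `label` columns, and
    parameters["accepted_labels"] lists the allowed label values.
    Returns None on success; there is no other output.
    """
    accepted = set(parameters.get("accepted_labels", []))

    with open(Path(labels_path) / "labels.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    if not rows:
        raise ValueError("no labelled records found")

    for row in rows:
        if not (Path(data_path) / row["image"]).is_file():
            raise ValueError(f"record has no image file: {row['image']}")
        if row["label"] not in accepted:
            raise ValueError(
                f"unexpected label {row['label']!r} for {row['image']}")
```

An uncaught exception makes the process exit with a non-zero status, which is how the failure becomes visible to whoever runs the task.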
statistics Command
+- data_path: the path to the transformed data folder (read-only).
+- labels_path: the path to the transformed ground truth labels folder (read-only).
+- Any other optional extra params that you attach to the MLCube - same as for the prepare command.
+- output_path: the path to the .yaml file where your code should write the calculated statistics.
While this guide leads you through creating your own MLCube, you can always check a prebuilt example for a better understanding of how it works in an already implemented MLCube. The example is available here:
+The guide uses this implementation to describe concepts.
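For the statistics command described earlier, a minimal sketch could count records per label and write the result to the output file. This is not the tutorial's implementation: the `labels.csv` layout with a `label` column is an assumption, and the YAML is written by hand to keep the sketch free of third-party imports.

```python
import csv
from collections import Counter
from pathlib import Path


def generate_statistics(data_path, labels_path, parameters, out_path):
    """Write aggregated label counts to the YAML file at out_path.

    Assumption (hypothetical): prepared labels live in
    <labels_path>/labels.csv with a `label` column. Only aggregate counts
    are written, never individual records. Returns the counts.
    """
    with open(Path(labels_path) / "labels.csv", newline="") as f:
        counts = Counter(row["label"] for row in csv.DictReader(f))

    # Hand-written YAML is fine for plain alphanumeric keys; labels with
    # special characters would need quoting.
    lines = [f"num_records: {sum(counts.values())}", "label_counts:"]
    lines += [f"  {label}: {n}" for label, n in sorted(counts.items())]
    Path(out_path).write_text("\n".join(lines) + "\n")
    return counts
```

In a real MLCube, where PyYAML is already imported by `mlcube.py`, `yaml.safe_dump` is the more robust way to write this file.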
+First, ensure you have MedPerf installed. Create a Data Preparator MLCube template by running the following command:
+ +You will be prompted to fill in some configuration options through the CLI. Below are the options and their default values:
+project_name [Data Preparator MLCube]: # (1)!
+project_slug [data_preparator_mlcube]: # (2)!
+description [Data Preparator MLCube Template. Provided by MLCommons]: # (3)!
+author_name [John Smith]: # (4)!
+accelerator_count [0]: # (5)!
+docker_image_name [docker/image:latest]: # (6)!
+
After filling the configuration options, the following directory structure will be generated:
+.
+└── data_preparator_mlcube
+ ├── mlcube
+ │ ├── mlcube.yaml
+ │ └── workspace
+ │ └── parameters.yaml
+ └── project
+ ├── Dockerfile
+ ├── mlcube.py
+ └── requirements.txt
+
project Folder
This is where your preprocessing logic will live. It contains a standard Docker image project with a specific API for the entrypoint. mlcube.py
contains the entrypoint and handles all the tasks we've described. Update this template with your code and bind your logic to specified functions for all three commands.
+Refer to the Chest X-ray tutorial example for an example of how it should look:
"""MLCube handler file"""
+import typer
+import yaml
+from prepare import prepare_dataset
+from sanity_check import perform_sanity_checks
+from stats import generate_statistics
+
+app = typer.Typer()
+
+
+@app.command("prepare")
+def prepare(
+ data_path: str = typer.Option(..., "--data_path"),
+ labels_path: str = typer.Option(..., "--labels_path"),
+ parameters_file: str = typer.Option(..., "--parameters_file"),
+ output_path: str = typer.Option(..., "--output_path"),
+ output_labels_path: str = typer.Option(..., "--output_labels_path"),
+):
+ with open(parameters_file) as f:
+ parameters = yaml.safe_load(f)
+
+ prepare_dataset(data_path, labels_path, parameters, output_path, output_labels_path)
+
+
+@app.command("sanity_check")
+def sanity_check(
+ data_path: str = typer.Option(..., "--data_path"),
+ labels_path: str = typer.Option(..., "--labels_path"),
+ parameters_file: str = typer.Option(..., "--parameters_file"),
+):
+ with open(parameters_file) as f:
+ parameters = yaml.safe_load(f)
+
+ perform_sanity_checks(data_path, labels_path, parameters)
+
+
+@app.command("statistics")
+def statistics(
+ data_path: str = typer.Option(..., "--data_path"),
+ labels_path: str = typer.Option(..., "--labels_path"),
+ parameters_file: str = typer.Option(..., "--parameters_file"),
+ out_path: str = typer.Option(..., "--output_path"),
+):
+ with open(parameters_file) as f:
+ parameters = yaml.safe_load(f)
+
+ generate_statistics(data_path, labels_path, parameters, out_path)
+
+
+if __name__ == "__main__":
+ app()
+
mlcube Folder
This folder is primarily for configuring your MLCube and providing additional files the MLCube may interact with, such as parameters or model weights.
+mlcube.yaml MLCube Configuration
The mlcube/mlcube.yaml file contains the metadata and configuration of your MLCube. This file is already populated with the configuration you provided during the template creation step. There is no need to edit anything in this file unless you are specifying extra parameters to the commands (e.g., you want to pass the weights of an sklearn StandardScaler or any other parameters required for data transformation).
name: Chest X-ray Data Preparator
+description: MedPerf Tutorial - Data Preparation MLCube.
+authors:
+ - { name: MLCommons Medical Working Group }
+
+platform:
+ accelerator_count: 0
+
+docker:
+ # Image name
+ image: mlcommons/chestxray-tutorial-prep:0.0.0
+ # Docker build context relative to $MLCUBE_ROOT. Default is `build`.
+ build_context: "../project"
+ # Docker file name within docker build context, default is `Dockerfile`.
+ build_file: "Dockerfile"
+
+tasks:
+ prepare:
+ parameters:
+ inputs:
+ {
+ data_path: input_data,
+ labels_path: input_labels,
+ parameters_file: parameters.yaml,
+ }
+ outputs: { output_path: data/, output_labels_path: labels/ }
+ sanity_check:
+ parameters:
+ inputs:
+ {
+ data_path: data/,
+ labels_path: labels/,
+
+ parameters_file: parameters.yaml,
+ }
+ statistics:
+ parameters:
+ inputs:
+ {
+ data_path: data/,
+ labels_path: labels/,
+
+ parameters_file: parameters.yaml,
+ }
+ outputs: { output_path: { type: file, default: statistics.yaml } }
+
All paths are relative to mlcube/workspace/
folder.
To set up additional inputs, add a key-value pair in the task's inputs
dictionary:
...
+ prepare:
+ parameters:
+ inputs:
+ {
+ data_path: input_data,
+ labels_path: input_labels,
+ parameters_file: parameters.yaml,
+ standardscaler_weights: additional_files/standardscaler.pkl
+ }
+ outputs: { output_path: data/, output_labels_path: labels/ }
+...
+
Considering the note about path locations, this new file should be stored at mlcube/workspace/additional_files/standardscaler.pkl
Your preprocessing logic might depend on certain parameters (e.g., which labels are accepted). It is generally better to pass such parameters when running the MLCube, rather than hardcoding them. This can be done via a parameters.yaml
file that is passed to the MLCube. This file will be available to the previously described commands (if you declare it in the inputs
dict of a specific command). You can parse this file in the mlcube.py
file and pass its contents to your logic.
This file should be placed in the mlcube/workspace
folder.
After you have followed the previous sections and filled the image with your logic, the MLCube is ready to be built and run. Run the command below to build the MLCube. Ensure you are in the mlcube/
subfolder of your Data Preparator.
This command builds your Docker image and prepares the MLCube for use.
+MedPerf will take care of running your MLCube. However, it's recommended to test the MLCube alone before using it with MedPerf for better debugging.
+To run the MLCube, use the command below. Ensure you are located in the mlcube/
subfolder of your Data Preparator.
mlcube run --task prepare data_path=<path_to_raw_data> \
+ labels_path=<path_to_raw_labels> \
+ output_path=<path_to_save_transformed_data> \
+ output_labels_path=<path_to_save_transformed_labels>
+
Relative paths
+Keep in mind that though we are running tasks from mlcube/
, all the paths should be absolute or relative to mlcube/workspace/
.
Default values
+We have declared default values for every path parameter. This allows us to omit these parameters in our commands.
+Consider the following structure: +
.
+└── data_preparator_mlcube
+ ├── mlcube
+ │ ├── mlcube.yaml
+ │ └── workspace
+ │ └── parameters.yaml
+ └── project
+ └── ...
+└── my_data
+ ├── data
+ │ ├── ...
+ └── labels
+ └── ...
+
Now, you can execute the commands below, being located at data_preparator_mlcube/mlcube/
:
+
mlcube run --task prepare data_path=../../my_data/data/ labels_path=../../my_data/labels/
+mlcube run --task sanity_check
+mlcube run --task statistics output_path=../../my_data/statistics.yaml
+
Note that:
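+ - Passed paths are relative to mlcube/workspace rather than to the current working directory.
 - Default values are used for the transformed data, so new folders will appear: mlcube/workspace/data/ and others.
The provided example codebase runs only on CPU. You can modify it to pass a GPU inside the Docker image if your code utilizes it.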
+The general instructions for building an MLCube to work with a GPU are the same as the provided instructions, but with the following slight modifications:
+ - Use a value other than 0 for the accelerator_count that you will be prompted with when creating the MLCube template, or modify the platform.accelerator_count value in the mlcube.yaml configuration.
 - In the docker section of the mlcube.yaml, add a key-value pair: gpu_args: --gpus=all. These gpu_args will be passed to the docker run command by MLCube. You may add more than just --gpus=all.
 - Make sure the Docker image includes the GPU dependencies your code needs, for example by modifying the pip dependencies in the requirements.txt file to download pytorch with cuda, or by changing the base image of the dockerfile.
TODO: Change the structure to align with mlcube_models, to help users wrap their existing code into mlcube
+This guide is one of three designed to assist users in building MedPerf-compatible MLCubes. The other two guides focus on creating a Data Preparator MLCube and a Model MLCube. Together, these three MLCubes form a complete benchmark workflow for the task of thoracic disease detection from Chest X-rays.
+In summary, a functional MedPerf pipeline includes these steps:
+my_raw_data/
. If the pipeline is run by another person (Model Owner/Benchmark Owner), a predefined my_benchmark_demo_raw_data/
would be used instead (created and distributed by the Benchmark Owner).my_prepared_dataset/
folder (MLCube is implemented by the Benchmark Owner).my_model_predictions/
folder (MLCube is implemented by the Model Owner; the Benchmark Owner must implement a baseline model MLCube to be used as a mock-up).my_metrics.yaml
file (MLCube implemented by the Benchmark Owner).
+The aforementioned guides detail steps 2-4. As all steps demonstrate building specific MLCubes, we recommend starting with the Model MLCube guide, which offers a more detailed explanation of the MLCube's concept and structure. Another option is to explore the MLCube basic docs. This guide provides a shortened description of the concepts, focusing on nuances and input/output parameters.
+This guide describes the tasks, structure and input/output parameters of Metrics MLCube, allowing users at the end to be able to implement their own MedPerf-compatible MLCube for Benchmark purposes.
+The guide starts with general advice, steps, and the required API for building these MLCubes. Subsequently, it will lead you through creating your MLCube using the Chest X-ray Metrics MLCube as a practical example.
+Note: As the Dataset Owner would share the output of your metrics evaluation with you as Benchmark Owner, ensure that your metrics are not too specific and do not reveal any Personally Identifiable Information (PII) or other confidential data (including dataset statistics) - otherwise, no Dataset Owners would agree to participate in your benchmark.
+Your MLCube must implement an evaluate
command that calculates your metrics.
It's assumed that you as Benchmark Owner already have:
+my_model_predictions/
folder.
+This guide will help you encapsulate your metrics calculation code within an MLCube. Make sure you have extracted the metric calculation logic so that it can be executed independently.
+During execution, the evaluate
command will receive specific parameters. While you are flexible in code implementation, keep in mind that your implementation will receive the following input arguments:
predictions
: the path to the folder containing your predictions (read-only).labels
: the path to the folder containing transformed ground truth labels (read-only).output_path
: path to .yaml
file where your code should write down calculated metrics.While this guide leads you through creating your own MLCube, you can always check a prebuilt example for a better understanding of how it works in an already implemented MLCube. The example is available here: +
+The guide uses this implementation to describe concepts.
+First, ensure you have MedPerf installed. Create a Metrics MLCube template by running the following command:
+ +You will be prompted to fill in some configuration options through the CLI. Below are the options and their default values:
+project_name [Evaluator MLCube]: # (1)!
+project_slug [evaluator_mlcube]: # (2)!
+description [Evaluator MLCube Template. Provided by MLCommons]: # (3)!
+author_name [John Smith]: # (4)!
+accelerator_count [0]: # (5)!
+docker_image_name [docker/image:latest]: # (6)!
+
After filling the configuration options, the following directory structure will be generated:
+.
+└── evaluator_mlcube
+ ├── mlcube
+ │ ├── mlcube.yaml
+ │ └── workspace
+ │ └── parameters.yaml
+ └── project
+ ├── Dockerfile
+ ├── mlcube.py
+ └── requirements.txt
+
project
Folder¶This is where your metrics logic will live. It contains a standard Docker image project with a specific API for the entrypoint. mlcube.py
contains the entrypoint and handles the evaluate
task. Update this template with your code and bind your logic to the specified command entry-point function.
+Refer to the Chest X-ray tutorial example for an example of how it should look:
"""MLCube handler file"""
+import typer
+import yaml
+from metrics import calculate_metrics
+
+app = typer.Typer()
+
+
+@app.command("evaluate")
+def evaluate(
+ labels: str = typer.Option(..., "--labels"),
+ predictions: str = typer.Option(..., "--predictions"),
+ parameters_file: str = typer.Option(..., "--parameters_file"),
+ output_path: str = typer.Option(..., "--output_path"),
+):
+ with open(parameters_file) as f:
+ parameters = yaml.safe_load(f)
+
+ calculate_metrics(labels, predictions, parameters, output_path)
+
+
+@app.command("hotfix")
+def hotfix():
+ # NOOP command for typer to behave correctly. DO NOT REMOVE OR MODIFY
+ pass
+
+
+if __name__ == "__main__":
+ app()
+
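The handler above imports `calculate_metrics` from a `metrics` module. To make the expected shape of that function concrete, here is a minimal sketch. Everything inside it is an assumption made for illustration: the file names `predictions.json` and `labels.json`, the `threshold` parameter, and the accuracy metric are hypothetical and may differ from the actual tutorial implementation.

```python
import json
import os


def calculate_metrics(labels, predictions, parameters, output_path):
    """Hypothetical metrics logic: thresholded accuracy for multi-label scores."""
    # Assumed layout: each folder holds one JSON file mapping case IDs to a
    # list of values (scores for predictions, 0/1 flags for labels).
    with open(os.path.join(predictions, "predictions.json")) as f:
        preds = json.load(f)
    with open(os.path.join(labels, "labels.json")) as f:
        truth = json.load(f)

    threshold = parameters.get("threshold", 0.5)  # hypothetical parameter

    correct = 0
    total = 0
    for case_id, scores in preds.items():
        for score, label in zip(scores, truth[case_id]):
            correct += int((score >= threshold) == bool(label))
            total += 1

    # Only aggregated values are reported, never per-case information.
    metrics = {
        "accuracy": correct / total if total else 0.0,
        "num_cases": len(preds),
    }

    # A flat mapping of numbers is valid YAML when written as "key: value" lines.
    with open(output_path, "w") as f:
        for name, value in metrics.items():
            f.write(f"{name}: {value}\n")
    return metrics
```

Since `PyYAML` is already imported by the handler, `yaml.safe_dump` would be the more usual way to write the results file; the manual writer above only keeps the sketch free of extra dependencies.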
mlcube
Folder¶This folder is primarily for configuring your MLCube and providing additional files the MLCube may interact with, such as parameters or model weights.
+mlcube.yaml
MLCube Configuration¶The mlcube/mlcube.yaml
file contains metadata and configuration of your mlcube. This file is already populated with the configuration you provided during the template creation step. There is no need to edit anything in this file except if you are specifying extra parameters to the commands.
name: Classification Metrics
+description: MedPerf Tutorial - Metrics MLCube.
+authors:
+ - { name: MLCommons Medical Working Group }
+
+platform:
+ accelerator_count: 0
+
+docker:
+ # Image name
+ image: mlcommons/chestxray-tutorial-metrics:0.0.0
+ # Docker build context relative to $MLCUBE_ROOT. Default is `build`.
+ build_context: "../project"
+ # Docker file name within docker build context, default is `Dockerfile`.
+ build_file: "Dockerfile"
+
+tasks:
+ evaluate:
+ # Computes evaluation metrics on the given predictions and ground truths
+ parameters:
+ inputs:
+ {
+ predictions: predictions,
+ labels: labels,
+ parameters_file: parameters.yaml,
+ }
+ outputs: { output_path: { type: "file", default: "results.yaml" } }
+
All paths are relative to mlcube/workspace/
folder.
To set up additional inputs, add a key-value pair in the task's inputs
dictionary:
...
+ evaluate:
+ parameters:
+ inputs:
+ {
+ predictions: predictions,
+ labels: labels,
+ parameters_file: parameters.yaml,
+ some_additional_file_with_weights: additional_files/my_weights.zip
+ }
+ outputs: { output_path: { type: "file", default: "results.yaml" } }
+...
+
Considering the note about path locations, this new file should be stored at mlcube/workspace/additional_files/my_weights.zip
.
Your metrics evaluation logic might depend on certain parameters (e.g., a probability threshold for classifying predictions). It is generally better to pass such parameters when running the MLCube, rather than hardcoding them. This can be done via a parameters.yaml
file that is passed to the MLCube. You can parse this file in the mlcube.py
file and pass its contents to your logic.
This file should be placed in the mlcube/workspace
folder.
After you follow the previous sections and fill the image with your logic, the MLCube is ready to be built and run. Run the command below to build the MLCube. Ensure you are in the mlcube/
subfolder of your Evaluator.
This command builds your Docker image and prepares the MLCube for use.
+MedPerf will take care of running your MLCube. However, it's recommended to test the MLCube alone before using it with MedPerf for better debugging.
+To run the MLCube, use the command below. Ensure you are located in the mlcube/
subfolder of your Evaluator.
mlcube run --task evaluate predictions=<path_to_predictions> \
+ labels=<path_to_transformed_labels> \
+ output_path=<path_to_yaml_file_to_save>
+
Relative paths
+Keep in mind that though we are running tasks from mlcube/
, all the paths should be absolute or relative to mlcube/workspace/
.
Default values
+Default values are set for every path parameter, allowing for their omission in commands. For example, in the discussed Chest X-Ray example, the predictions
input is defined as follows:
+
If this parameter is omitted (e.g., running MLCube with default parameters by mlcube run --task evaluate
), it's assumed that predictions are stored in the mlcube/workspace/predictions/
folder.
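The effect of default values can be pictured as a simple merge: values given on the command line replace the defaults declared for the task, and everything else falls back to the default. The sketch below is only a model of that behavior under this assumption; `effective_inputs` is a hypothetical helper, not an MLCube function.

```python
def effective_inputs(task_defaults: dict, cli_overrides: dict) -> dict:
    """Return the paths a task would use: defaults, replaced by any override."""
    return {**task_defaults, **cli_overrides}


# Defaults mirroring the inputs of the `evaluate` task shown earlier
defaults = {
    "predictions": "predictions",
    "labels": "labels",
    "parameters_file": "parameters.yaml",
}

# `mlcube run --task evaluate` with no arguments keeps every default
print(effective_inputs(defaults, {}))

# Passing `predictions=<path>` changes only that entry (hypothetical path)
print(effective_inputs(defaults, {"predictions": "/data/my_model_predictions"}))
```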
The provided example codebase runs only on CPU. You can modify it to pass a GPU into the Docker image if your code utilizes it.
+The general instructions for building an MLCube to work with a GPU are the same as the provided instructions, but with the following slight modifications:
+0
for the accelerator_count
that you will be prompted with when creating the MLCube template or modify platform.accelerator_count
value of mlcube.yaml
configuration.docker
section of the mlcube.yaml
, add a key value pair: gpu_args: --gpus=all
. These gpu_args
will be passed to docker run
command by MLCube. You may add more than just --gpus=all
.pip
dependencies in the requirements.txt
file to download pytorch
with cuda, or by changing the base image of the dockerfile.
+This is one of the three guides that help the user build MedPerf-compatible MLCubes. The other two guides are for building a Data Preparator MLCube and a Metrics MLCube. Together, the three MLCube examples constitute a complete benchmark workflow for the task of thoracic disease detection from Chest X-rays.
+This guide will help users familiarize themselves with the expected interface of the Model MLCube and gain a comprehensive understanding of its components. By following this walkthrough, users will gain insights into the structure and organization of a Model MLCube, allowing them at the end to be able to implement their own MedPerf-compatible Model MLCube.
+The guide will start by providing general advice, steps, and hints on building these MLCubes. Then, an example will be presented through which the provided guidance will be applied step-by-step to build a Chest X-ray classifier MLCube. The final MLCube code can be found here.
+It is assumed that you already have a working code that runs inference on data and generates predictions, and what you want to accomplish through this guide is to wrap your inference code within an MLCube.
+MedPerf provides MLCube templates. You should start from a template for faster implementation and to build MLCubes that are compatible with MedPerf.
+First, make sure you have MedPerf installed. You can create a model MLCube template by running the following command:
+ +You will be prompted to fill in some configuration options through the CLI. Below are the options and their default values:
+project_name [Model MLCube]: # (1)!
+project_slug [model_mlcube]: # (2)!
+description [Model MLCube Template. Provided by MLCommons]: # (3)!
+author_name [John Smith]: # (4)!
+accelerator_count [0]: # (5)!
+docker_image_name [docker/image:latest]: # (6)!
+
After filling the configuration options, the following directory structure will be generated:
+.
+└── model_mlcube
+ ├── mlcube
+ │ ├── mlcube.yaml
+ │ └── workspace
+ │ └── parameters.yaml
+ └── project
+ ├── Dockerfile
+ ├── mlcube.py
+ └── requirements.txt
+
The next sections will go through the contents of this directory in detail and customize it.
+project
folder¶This is where your inference logic will live. This folder initially contains three files as shown above. The upcoming sections will cover their use in detail.
+The first thing to do is put your code files in this folder.
+This is done through the mlcube.py
file. This file defines the interface of the MLCube and should be linked to your inference logic.
"""MLCube handler file"""
+import typer
+
+app = typer.Typer()
+
+
+@app.command("infer")
+def infer(
+ data_path: str = typer.Option(..., "--data_path"),
+ parameters_file: str = typer.Option(..., "--parameters_file"),
+ output_path: str = typer.Option(..., "--output_path"),
+ # Provide additional parameters as described in the mlcube.yaml file
+ # e.g. model weights:
+ # weights: str = typer.Option(..., "--weights"),
+):
+ # Modify the infer command as needed
+ raise NotImplementedError("The infer method is not yet implemented")
+
+
+@app.command("hotfix")
+def hotfix():
+ # NOOP command for typer to behave correctly. DO NOT REMOVE OR MODIFY
+ pass
+
+
+if __name__ == "__main__":
+ app()
+
As shown above, this file exposes a command infer
. Its basic arguments are the input data path, the output predictions path, and a parameters file path.
The parameters file, as will be explained in the upcoming sections, gives flexibility to your MLCube. For example, instead of hardcoding the inference batch size in the code, it can be configured by passing a parameters file to your MLCube which contains its value. This way, your same MLCube can be reused with multiple batch sizes by just changing the input parameters file.
+You should ignore the hotfix
command as described in the file.
The infer
command will be automatically called by the MLCube when it's built and run. This command should call your inference logic. Make sure you replace its contents with a code that calls your inference logic. This could be by importing a function from your code files and calling it with the necessary arguments.
The MLCube will execute a docker image whose entrypoint is mlcube.py
. The MLCube will first build this image from the Dockerfile
specified in the project
folder. You can customize the Dockerfile however you want as long as the entrypoint runs the mlcube.py
file.
Make sure you include in your Dockerfile any system dependency your code depends on. It is also common to have pip
dependencies; make sure you install them in the Dockerfile as well.
Below is the docker file provided in the template:
+FROM python:3.9.16-slim
+
+COPY ./requirements.txt /mlcube_project/requirements.txt
+
+RUN pip3 install --no-cache-dir -r /mlcube_project/requirements.txt
+
+ENV LANG C.UTF-8
+
+COPY . /mlcube_project
+
+ENTRYPOINT ["python3", "/mlcube_project/mlcube.py"]
+
As shown above, this docker file makes sure python
is available by using the python base image, installs pip
dependencies using the requirements.txt
file, and sets the entrypoint to run mlcube.py
. Note that the MLCube tool will invoke the Docker build
command from the project
folder, so it will copy all your files found in the project
to the Docker image.
mlcube
folder¶This folder is mainly for configuring your MLCube and providing additional files the MLCube may interact with, such as parameters or model weights.
+Your inference logic may depend on some parameters (e.g. inference batch size). It is usually a more favorable design to not hardcode such parameters, but instead pass them when running the MLCube. This can be done by having a parameters.yaml
file as an input to the MLCube. This file will be available to the infer
command described before. You can parse this file in the mlcube.py
file and pass its contents to your code.
This file should be placed in the mlcube/workspace
folder.
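Because the parameters file is written by hand, it can help to validate its values before handing them to your inference code. The following is a minimal sketch of such a check; the helper `require_positive_int` is hypothetical, and the `parameters` dictionary stands in for the result of parsing `parameters.yaml`.

```python
def require_positive_int(parameters: dict, name: str, default=None) -> int:
    """Fetch an integer parameter, falling back to `default` when it is absent."""
    if name not in parameters:
        if default is None:
            raise KeyError(f"Missing required parameter: {name}")
        return default
    value = parameters[name]
    # bool is a subclass of int in Python, so it is rejected explicitly
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"Parameter {name!r} must be a positive integer, got {value!r}")
    return value


# Stand-in for the dictionary parsed from parameters.yaml
parameters = {"batch_size": 5}

batch_size = require_positive_int(parameters, "batch_size")
num_workers = require_positive_int(parameters, "num_workers", default=1)
print(batch_size, num_workers)
```

Failing early with a clear message is usually easier to debug than an error raised deep inside the inference loop.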
It is a good practice not to ship your model weights within the docker image to reduce the image size and provide flexibility of running the MLCube with different model weights. To do this, model weights path should be provided as a separate parameter to the MLCube. You should place your model weights in a folder named additional_files
inside the mlcube/workspace
folder. This is how MedPerf expects any additional input to your MLCube besides the data path and the parameters file.
After placing your model weights in mlcube/workspace/additional_files
, you have to modify two files:
mlcube.py
: add an argument to the infer
command which will correspond to the path of your model weights. Remember also to pass this argument to your inference logic.mlcube.yaml
: The next section introduces this file and describes it in detail. You should add your extra input arguments to this file as well, as described below.
+The mlcube.yaml
file contains metadata and configuration of your mlcube. This file was already populated with the configuration you provided during the step of creating the template. There is no need to edit anything in this file except if you are specifying extra parameters to the infer
command (e.g., model weights as described in the previous section).
You will be modifying the tasks
section of the mlcube.yaml
file in order to account for additional inputs:
tasks:
+ infer:
+ # Computes predictions on input data
+ parameters:
+ inputs: {
+ data_path: data/,
+ parameters_file: parameters.yaml,
+ # Feel free to include other files required for inference.
+ # These files MUST go inside the additional_files path.
+ # e.g. model weights
+ # weights: additional_files/weights.pt,
+ }
+ outputs: { output_path: { type: directory, default: predictions } }
+
As hinted by the comments as well, you can add the additional parameters by specifying an extra key-value pair in the inputs
dictionary of the infer
task.
After you follow the previous sections, the MLCube is ready to be built and run. Run the command below to build the MLCube. Make sure you are in the folder model_mlcube/mlcube
.
This command will build your docker image and make the MLCube ready to use.
+MedPerf will take care of running your MLCube. However, it's recommended to test the MLCube alone before using it with MedPerf for better debugging.
+Use the command below to run the MLCube. Make sure you are in the folder model_mlcube/mlcube
.
mlcube run --task infer data_path=<absolute path to input data> output_path=<absolute path to a folder where predictions will be saved>
+
Assume you have the codebase below. This code can be used to predict thoracic diseases based on Chest X-ray data. The classification task is modeled as a multi-label classification problem.
+"""
+Taken from MedMNIST/MedMNIST.
+"""
+
+import torch.nn as nn
+
+
+class SimpleCNN(nn.Module):
+ def __init__(self, in_channels, num_classes):
+ super(SimpleCNN, self).__init__()
+
+ self.layer1 = nn.Sequential(
+ nn.Conv2d(in_channels, 16, kernel_size=3), nn.BatchNorm2d(16), nn.ReLU()
+ )
+
+ self.layer2 = nn.Sequential(
+ nn.Conv2d(16, 16, kernel_size=3),
+ nn.BatchNorm2d(16),
+ nn.ReLU(),
+ nn.MaxPool2d(kernel_size=2, stride=2),
+ )
+
+ self.layer3 = nn.Sequential(
+ nn.Conv2d(16, 64, kernel_size=3), nn.BatchNorm2d(64), nn.ReLU()
+ )
+
+ self.layer4 = nn.Sequential(
+ nn.Conv2d(64, 64, kernel_size=3), nn.BatchNorm2d(64), nn.ReLU()
+ )
+
+ self.layer5 = nn.Sequential(
+ nn.Conv2d(64, 64, kernel_size=3, padding=1),
+ nn.BatchNorm2d(64),
+ nn.ReLU(),
+ nn.MaxPool2d(kernel_size=2, stride=2),
+ )
+
+ self.fc = nn.Sequential(
+ nn.Linear(64 * 4 * 4, 128),
+ nn.ReLU(),
+ nn.Linear(128, 128),
+ nn.ReLU(),
+ nn.Linear(128, num_classes),
+ )
+
+ def forward(self, x):
+ x = self.layer1(x)
+ x = self.layer2(x)
+ x = self.layer3(x)
+ x = self.layer4(x)
+ x = self.layer5(x)
+ x = x.view(x.size(0), -1)
+ x = self.fc(x)
+ return x
+
import numpy as np
+import torchvision.transforms as transforms
+import os
+from torch.utils.data import Dataset
+
+
+class CustomImageDataset(Dataset):
+ def __init__(self, data_path):
+ self.transform = transforms.Compose(
+ [transforms.ToTensor(), transforms.Normalize(mean=[0.5], std=[0.5])]
+ )
+ self.files = os.listdir(data_path)
+ self.data_path = data_path
+
+ def __len__(self):
+ return len(self.files)
+
+ def __getitem__(self, idx):
+ img_path = os.path.join(self.data_path, self.files[idx])
+ image = np.load(img_path, allow_pickle=True)
+ image = self.transform(image)
+ # Drop the ".npy" extension (str.strip would remove characters, not the suffix)
+ file_id = os.path.splitext(self.files[idx])[0]
+ return image, file_id
+
import torch
+from models import SimpleCNN
+from tqdm import tqdm
+from torch.utils.data import DataLoader
+from data_loader import CustomImageDataset
+from pprint import pprint
+
+
+data_path = "path/to/data/folder"
+weights = "path/to/weights.pt"
+in_channels = 1
+num_classes = 14
+batch_size = 5
+
+# load model
+model = SimpleCNN(in_channels=in_channels, num_classes=num_classes)
+model.load_state_dict(torch.load(weights))
+model.eval()
+
+# load prepared data
+dataset = CustomImageDataset(data_path)
+dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
+
+# inference
+predictions_dict = {}
+with torch.no_grad():
+ for images, files_ids in tqdm(dataloader):
+ outputs = model(images)
+ outputs = torch.nn.Sigmoid()(outputs)
+ outputs = outputs.detach().numpy()
+
+ for file_id, output in zip(files_ids, outputs):
+ predictions_dict[file_id] = output
+
+pprint(predictions_dict)
+
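The first linear layer of `SimpleCNN` expects `64 * 4 * 4` input features, which fixes the image size the model can consume. The arithmetic can be checked with the standard output-size formulas for unpadded convolutions and max pooling. The sketch below assumes 28×28 input images (the usual MedMNIST size); that size is an assumption, as it is not stated in the code above.

```python
def conv_out(size: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, kernel: int = 2, stride: int = 2) -> int:
    """Spatial output size of max pooling without padding."""
    return (size - kernel) // stride + 1


def simple_cnn_spatial_size(size: int) -> int:
    size = conv_out(size, 3)                       # layer1: 3x3 conv
    size = pool_out(conv_out(size, 3))             # layer2: 3x3 conv + 2x2 pool
    size = conv_out(size, 3)                       # layer3: 3x3 conv
    size = conv_out(size, 3)                       # layer4: 3x3 conv
    size = pool_out(conv_out(size, 3, padding=1))  # layer5: padded conv + pool
    return size


side = simple_cnn_spatial_size(28)  # assumed 28x28 input
print(side, 64 * side * side)       # flattened features fed to the first Linear layer
```

With a 28×28 input the spatial size shrinks 28 → 26 → 24 → 12 → 10 → 8 → 8 → 4, which matches the `64 * 4 * 4` in the model definition. Other input sizes would require changing that layer.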
Throughout the next sections, this code will be wrapped within an MLCube.
+The guidelines listed previously in this section will now be applied to the given codebase. Assume that you were instructed by the benchmark you are participating in to have your MLCube interface as follows:
+It is important to make sure that your MLCube will output an expected predictions format and consume a defined data format, since it will be used in a benchmarking pipeline whose data input is fixed and whose metrics calculation logic expects a fixed predictions format.
+Considering the codebase above, here are the things that should be done before proceeding to build the MLCube:
+infer.py
only prints predictions but doesn't store them. This has to be changed.infer.py
hardcodes some parameters (num_classes
, in_channels
, batch_size
) as well as the path to the trained model weights. Consider making these items configurable parameters. (This is optional but recommended)infer.py
to be a function so that it can be easily called by mlcube.py
.The other files models.py
and data_loader.py
seem to be good already. The data loader expects a folder containing a list of numpy arrays, as instructed.
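The data loader also returns a file ID for each image, which is the file name without its `.npy` extension. A small standalone sketch of that mapping, using hypothetical file names:

```python
import os


def file_id_from_name(filename: str) -> str:
    """Return the file name without its extension, e.g. 'case_001.npy' -> 'case_001'."""
    stem, _extension = os.path.splitext(filename)
    return stem


names = ["case_001.npy", "patient_yn.npy"]  # hypothetical file names
print([file_id_from_name(name) for name in names])
```

Note that `str.strip(".npy")` is not an equivalent way to do this: it removes any of the characters `.`, `n`, `p`, `y` from both ends of the string, so `"patient_yn.npy"` would become `"atient_"`.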
Here is the modified version of infer.py
according to the points listed above:
import torch
+import os
+from models import SimpleCNN
+from tqdm import tqdm
+from torch.utils.data import DataLoader
+from data_loader import CustomImageDataset
+import json
+
+
+def run_inference(data_path, parameters, output_path, weights):
+ in_channels = parameters["in_channels"]
+ num_classes = parameters["num_classes"]
+ batch_size = parameters["batch_size"]
+
+ # load model
+ model = SimpleCNN(in_channels=in_channels, num_classes=num_classes)
+ model.load_state_dict(torch.load(weights))
+ model.eval()
+
+ # load prepared data
+ dataset = CustomImageDataset(data_path)
+ dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
+
+ # inference
+ predictions_dict = {}
+ with torch.no_grad():
+ for images, files_ids in tqdm(dataloader):
+ outputs = model(images)
+ outputs = torch.nn.Sigmoid()(outputs)
+ outputs = outputs.detach().numpy().tolist()
+
+ for file_id, output in zip(files_ids, outputs):
+ predictions_dict[file_id] = output
+
+ # save
+ preds_file = os.path.join(output_path, "predictions.json")
+ with open(preds_file, "w") as f:
+ json.dump(predictions_dict, f, indent=4)
+
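Since the benchmark's metrics logic expects a fixed predictions format, it can be useful to check the generated file before building the MLCube. The sketch below reads a `predictions.json` file such as the one written by `run_inference` and verifies that every entry holds `num_classes` scores between 0 and 1. The helper `check_predictions` is hypothetical and the exact format required by a given benchmark may differ.

```python
import json


def check_predictions(preds_file: str, num_classes: int) -> int:
    """Validate a predictions file and return the number of cases it contains."""
    with open(preds_file) as f:
        predictions = json.load(f)

    for file_id, scores in predictions.items():
        if len(scores) != num_classes:
            raise ValueError(
                f"{file_id}: expected {num_classes} scores, got {len(scores)}"
            )
        # Sigmoid outputs are probabilities, so they must lie in [0, 1]
        if not all(0.0 <= score <= 1.0 for score in scores):
            raise ValueError(f"{file_id}: scores must be within [0, 1]")
    return len(predictions)
```

For example, `check_predictions("predictions/predictions.json", num_classes=14)` would return the number of processed images if the file is well-formed (the path is illustrative).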
Assuming you installed MedPerf, run the following:
+ +You will be prompted to fill in the configuration options. Use the following configuration as a reference:
+project_name [Model MLCube]: Custom CNN Classification Model
+project_slug [model_mlcube]: model_custom_cnn
+description [Model MLCube Template. Provided by MLCommons]: MedPerf Tutorial - Model MLCube.
+author_name [John Smith]: <use your name>
+accelerator_count [0]: 0
+docker_image_name [docker/image:latest]: repository/model-tutorial:0.0.0
+
Note that docker_image_name
is arbitrarily chosen. Use a valid docker image.
Move the three files of the codebase to the project
folder. The directory tree will then look like this:
.
+└── model_custom_cnn
+ ├── mlcube
+ │ ├── mlcube.yaml
+ │ └── workspace
+ │ └── parameters.yaml
+ └── project
+ ├── Dockerfile
+ ├── mlcube.py
+ ├── models.py
+ ├── data_loader.py
+ ├── infer.py
+ └── requirements.txt
+
Since num_classes
, in_channels
, and batch_size
are now parametrized, they should be defined in workspace/parameters.yaml
. Also, the model weights should be placed inside workspace/additional_files
.
Modify parameters.yaml
to include the following:
Download the following model weights to use in this example: Click here to Download
+Extract the file to workspace/additional_files
. The directory tree should look like this:
.
+└── model_custom_cnn
+ ├── mlcube
+ │ ├── mlcube.yaml
+ │ └── workspace
+ │ ├── additional_files
+ │ │ └── cnn_weights.pth
+ │ └── parameters.yaml
+ └── project
+ ├── Dockerfile
+ ├── mlcube.py
+ ├── models.py
+ ├── data_loader.py
+ ├── infer.py
+ └── requirements.txt
+
mlcube.py
¶Next, the inference logic should be triggered from mlcube.py
. The parameters_file
will be read in mlcube.py
and passed as a dictionary to the inference logic. Also, an extra parameter weights
is added to the function signature which will correspond to the model weights path. See below the modified mlcube.py
file.
"""MLCube handler file"""
+import typer
+import yaml
+
+from infer import run_inference
+
+app = typer.Typer()
+
+
+@app.command("infer")
+def infer(
+ data_path: str = typer.Option(..., "--data_path"),
+ parameters_file: str = typer.Option(..., "--parameters_file"),
+ output_path: str = typer.Option(..., "--output_path"),
+ weights: str = typer.Option(..., "--weights"),
+):
+ with open(parameters_file) as f:
+ parameters = yaml.safe_load(f)
+
+ run_inference(data_path, parameters, output_path, weights)
+
+
+@app.command("hotfix")
+def hotfix():
+ # NOOP command for typer to behave correctly. DO NOT REMOVE OR MODIFY
+ pass
+
+
+if __name__ == "__main__":
+ app()
+
The provided Dockerfile in the template is enough and preconfigured to download pip
dependencies from the requirements.txt
file. All that is needed is to modify the requirements.txt
file to include the project's pip dependencies.
typer==0.9.0
+numpy==1.24.3
+PyYAML==6.0
+torch==2.0.1
+torchvision==0.15.2
+tqdm==4.65.0
+--extra-index-url https://download.pytorch.org/whl/cpu
+
mlcube.yaml
¶Since the extra parameter weights
was added to the infer
task in mlcube.py
, this has to be reflected on the defined MLCube interface in the mlcube.yaml
file. Modify the tasks
section to include an extra input parameter: weights: additional_files/cnn_weights.pth
.
Tip
+The MLCube tool interprets these paths as relative to the workspace
.
The tasks
section will then look like this:
tasks:
+ infer:
+ # Computes predictions on input data
+ parameters:
+ inputs:
+ {
+ data_path: data/,
+ parameters_file: parameters.yaml,
+ weights: additional_files/cnn_weights.pth,
+ }
+ outputs: { output_path: { type: directory, default: predictions } }
+
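Every input and output declared for the `infer` task must have a matching argument in the `infer` command of `mlcube.py`. A quick way to catch a mismatch is to compare the two sets of names. The sketch below does this with a plain dictionary that mirrors the `tasks` section above and a stand-in function with the same argument names as the command; both, as well as the helper `missing_arguments`, are written out by hand for illustration.

```python
import inspect


def missing_arguments(task_config: dict, command) -> list:
    """Names declared in the task configuration that `command` does not accept."""
    params = task_config["parameters"]
    declared = set(params["inputs"]) | set(params["outputs"])
    accepted = set(inspect.signature(command).parameters)
    return sorted(declared - accepted)


def infer(data_path, parameters_file, output_path, weights):
    """Stand-in with the same argument names as the `infer` command."""


# Hand-written mirror of the `infer` task from mlcube.yaml
infer_task = {
    "parameters": {
        "inputs": {
            "data_path": "data/",
            "parameters_file": "parameters.yaml",
            "weights": "additional_files/cnn_weights.pth",
        },
        "outputs": {"output_path": {"type": "directory", "default": "predictions"}},
    }
}

print(missing_arguments(infer_task, infer))  # an empty list means the names match
```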
Run the command below to create the MLCube. Make sure you are in the folder model_custom_cnn/mlcube
.
This command will build your docker image and make the MLCube ready to use.
+Tip
+Run docker image ls
to see your built Docker image.
Download sample data to run on: Click here to Download
+Extract the data. You will get a folder sample_prepared_data
containing a list of chest X-ray images as numpy arrays.
Use the command below to run the MLCube. Make sure you are in the folder model_custom_cnn/mlcube
.
mlcube run --task infer data_path=<absolute path to `sample_prepared_data`> output_path=<absolute path to a folder where predictions will be saved>
+
The provided example codebase runs only on CPU. You can modify it to have pytorch
run inference on a GPU.
The general instructions for building an MLCube to work with a GPU are the same as the provided instructions, but with the following slight modifications:
+pip
dependencies in the requirements.txt
file to download pytorch
with cuda, or by changing the base image of the dockerfile.
+For testing your MLCube with GPUs using the MLCube tool as in the previous section, make sure you run the mlcube run
command with a --gpus
argument. Example: mlcube run --gpus=all ...
For testing your MLCube with GPUs using MedPerf, make sure you also pass the --gpus
argument to the MedPerf command. Example: medperf --gpus=all <subcommand> ...
.
Tip
+Run medperf --help
to see the possible options you can use for the --gpus
argument.
MLCube is a set of common conventions for creating Machine Learning (ML) software that can "plug-and-play" on many different systems. It is basically a container image with a simple interface and the correct metadata that allows researchers and developers to easily share and experiment with ML pipelines.
+You can read more about MLCubes here.
+In MedPerf, MLCubes are required for creating the three technical components of a benchmarking experiment: the data preparation flow, the model inference flow, and the evaluation flow. A Benchmark Committee will be required to create three MLCubes that implement these components. A Model Owner will be required to wrap their model code within an MLCube in order to submit it to the MedPerf server and participate in a benchmark.
+MLCubes are general-purpose. MedPerf defines three specific design types of MLCubes according to their purpose: the Data Preparator MLCube, the Model MLCube, and the Metrics MLCube. Each type has a specific MLCube task configuration that defines the MLCube's interface. Users need to follow these design specs when building their MLCubes to conform with MedPerf. We provide below a high-level description of each MLCube type and a link to a guide for building an example for each type.
+The Data Preparator MLCube is used to prepare the data for executing the benchmark. Ideally, it can receive different data standards for the task at hand, transforming them into a single, unified standard. Additionally, it ensures the quality and compatibility of the data and computes statistics and metadata for registration purposes.
+This MLCube's interface should expose the following tasks:
+Prepare: Transforms the input data into the expected output data standard. It receives as input the location of the original data, as well as the location of the labels, and outputs the prepared dataset and accompanying labels.
+Sanity check: Ensures the integrity of the prepared data. It may check for anomalies and data corruption (e.g. blank images, empty test cases). It constitutes a set of conditions the prepared data should comply with.
+Statistics: Computes statistics on the prepared data.
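To give a feel for what a sanity check task might verify, here is a minimal sketch. It works on in-memory dictionaries to stay self-contained, whereas a real task would read the prepared data and labels from disk. The function `sanity_check`, the data layout, and the specific conditions (non-empty dataset, no blank images, one label entry per case) are assumptions chosen for illustration.

```python
def sanity_check(cases: dict, labels: dict) -> None:
    """Raise ValueError if the prepared data violates a basic condition.

    cases:  case ID -> flat list of pixel values
    labels: case ID -> list of label values
    """
    if not cases:
        raise ValueError("The prepared dataset is empty")

    for case_id, pixels in cases.items():
        if len(pixels) == 0:
            raise ValueError(f"{case_id}: empty image")
        # An image whose pixels all share one value carries no information
        if all(value == pixels[0] for value in pixels):
            raise ValueError(f"{case_id}: blank image")
        if case_id not in labels:
            raise ValueError(f"{case_id}: no labels found")


# Hypothetical prepared data with two cases
cases = {"case_001": [0.0, 0.3, 0.9], "case_002": [0.2, 0.2, 0.7]}
labels = {"case_001": [1, 0], "case_002": [0, 0]}
sanity_check(cases, labels)  # passes silently
```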
+Check this guide on how to create a Data Preparation MLCube.
+The Model MLCube contains a pre-trained machine learning model that is going to be evaluated by the benchmark. Its interface should expose the following task:
+Check this guide on how to create a Model MLCube.
+The Metrics MLCube is used for computing metrics on the model predictions by comparing them against the provided labels. Its interface should expose the following task:
+Check this guide on how to create a Metrics MLCube.
+ + + + + + + + + + +The following section will describe how you can create a {{ no such element: dict object['name'] }} from scratch. This documentation goes through the set of commands provided to help during this process, as well as the contents of a {{ no such element: dict object['name'] }}.
+MedPerf provides some cookiecutter templates for all the related MLCubes. Additionally, it provides commands to easily retrieve and use these templates. For that, you need to make sure MedPerf is installed. If you have not done so, please follow the steps below:
+Clone the repository: +
+Install the MedPerf CLI: +
+If you have not done so, create a folder for keeping all MLCubes created in this tutorial: +
+Create a {{ no such element: dict object['name'] }} through MedPerf: +
+ You should be prompted to fill in some configuration options through the CLI. An example of some good options to provide for this specific task is presented below: +Let's have a look at what the previous command generated. First, let's look at the whole folder structure: +
+ + + + + + + + + + +Note
+MedPerf is running CookieCutter under the hood. This medperf command provides additional arguments for handling different scenarios. You can see more information on this by running medperf mlcube create --help
MLCubes rely on containers to work. By default, Medperf provides a functional Dockerfile, which uses ubuntu:18.0.4
and python3.6
. This Dockerfile handles all the required procedures to install your project and redirect commands to the project/mlcube.py
file. You can modify it as you see fit, as long as the entry point behaves as a CLI, as described before.
Running Docker MLCubes with Singularity
+If you are building a Docker MLCube and expect it to be also run using Singularity, you need to keep in mind that Singularity containers built from Docker images ignore the WORKDIR
instruction if used in Dockerfiles. Make sure you also follow their best practices for writing Singularity-compatible Dockerfiles.
Now it's time to run our own implementation. We won't go into much detail, since we covered the basics before, but here are the commands you can run to build and run your MLCube.
+{{ no such element: dict object['slug'] }}_mlcube
, run
+ Build the Docker image using the shortcuts provided by MLCube. Here is how you can do it: +
+Pdocker.build_strategy=always
forces MLCube to build the image from source.
In order to provide a basic example of how MedPerf MLCubes work under the hood, a toy Hello World benchmark is provided. This benchmark implements a pipeline for ingesting people's names and generating greetings for those names given some criteria. Although this is not the most scientific example, it provides a clear idea of all the pieces required to implement your MLCubes for MedPerf.
+You can find the {{ no such element: dict object['name'] }} code here
+ + + + + + + + + + +What is the hotfix
function inside mlcube.py
?
To summarize, this issue is benign and can be safely ignored. It prevents a potential issue with the CLI and does not require further action.
+If you use the typer
/click
library for your command-line interface (CLI) and have only one @app.command
, the command line may not be parsed as expected by mlcube. This is due to a known issue that can be resolved by adding more than one task to the mlcube interface.
To avoid a potential issue with the CLI, we add a dummy typer command to our model cubes that only have one task. If you're not using typer
/click
, you don't need this dummy command.
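As a hedged illustration of the workaround described above, the sketch below defines a typer app with one real task plus a no-op command. The task name `infer`, its `--data_path` option, and the command name `hotfix` are assumptions chosen for this example; only the idea of registering a second, dummy command comes from the text.

```python
import typer
from typer.testing import CliRunner

app = typer.Typer()


@app.command("infer")
def infer(data_path: str = typer.Option(..., "--data_path")):
    """The only real task exposed by this hypothetical MLCube."""
    typer.echo(f"running inference on {data_path}")


@app.command("hotfix")
def hotfix():
    """No-op command. With a single registered command, typer does not
    expect a command name on the command line, so a call such as
    `mlcube.py infer --data_path ...` would not be parsed as intended."""
    pass


# Invoke the app the way a caller that passes the task name would.
runner = CliRunner()
result = runner.invoke(app, ["infer", "--data_path", "/tmp/data"])
```

Because two commands are registered, the task name `infer` is parsed as a subcommand and the option is forwarded to it.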
The provided MLCube template assumes your project is python based. Because of this, it provides a requirements.txt
file to specify the dependencies to run your project. This file is automatically used by the Dockerfile
to install and set up your project. Since some dependencies are necessary, let's add them to the file:
Before digging into the code, let's try manually running the {{ no such element: dict object['name'] }}. During this process, it should be possible to see how MLCube interacts with the folders in the workspace and what is expected to happen during each step:
+Clone the repository: +
+Install mlcube and mlcube-docker using pip: +
+Navigate to the HelloWorld directory within the examples folder with +
+Change to the current example's mlcube
folder with
+
Explore our documentation hub to understand everything you need to get started with MedPerf, + including definitions, setup, tutorials, advanced concepts, and more.
+ +Click here to see the documentation specifically for benchmark owners.
+Click here to see the documentation specifically for model owners.
+Click here to see the documentation specifically for data owners.
+get_medperf_user_data()
+
+¶Return cached medperf user data. Get from the server if not found
+ +cli/medperf/account_management/account_management.py
set_medperf_user_data()
+
+¶Get and cache user data from the MedPerf server
+ +cli/medperf/account_management/account_management.py
execute(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='UID of the desired benchmark'), data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), model_uid=typer.Option(..., '--model_uid', '-m', help='UID of model to execute'), approval=typer.Option(False, '-y', help='Skip approval step'), ignore_model_errors=typer.Option(False, '--ignore-model-errors', help='Ignore failing model cubes, allowing for possibly submitting partial results'), no_cache=typer.Option(False, '--no-cache', help='Ignore existing results. The experiment then will be rerun'))
+
+¶Runs the benchmark execution step for a given benchmark, prepared dataset and model
+ +cli/medperf/cli.py
Approval
+
+
+¶cli/medperf/commands/association/approval.py
run(benchmark_uid, approval_status, dataset_uid=None, mlcube_uid=None)
+
+
+ staticmethod
+
+
+¶Sets approval status for an association between a benchmark and a dataset or mlcube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID. + |
+ + required + | +
approval_status |
+
+ str
+ |
+
+
+
+ Desired approval status to set for the association. + |
+ + required + | +
comms |
+
+ Comms
+ |
+
+
+
+ Instance of Comms interface. + |
+ + required + | +
ui |
+
+ UI
+ |
+
+
+
+ Instance of UI interface. + |
+ + required + | +
dataset_uid |
+
+ int
+ |
+
+
+
+ Dataset UID. Defaults to None. + |
+
+ None
+ |
+
mlcube_uid |
+
+ int
+ |
+
+
+
+ MLCube UID. Defaults to None. + |
+
+ None
+ |
+
cli/medperf/commands/association/approval.py
approve(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), dataset_uid=typer.Option(None, '--dataset', '-d', help='Dataset UID'), mlcube_uid=typer.Option(None, '--mlcube', '-m', help='MLCube UID'))
+
+¶Approves an association between a benchmark and a dataset or model mlcube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID. + |
+
+ typer.Option(..., '--benchmark', '-b', help='Benchmark UID')
+ |
+
dataset_uid |
+
+ int
+ |
+
+
+
+ Dataset UID. + |
+
+ typer.Option(None, '--dataset', '-d', help='Dataset UID')
+ |
+
mlcube_uid |
+
+ int
+ |
+
+
+
+ Model MLCube UID. + |
+
+ typer.Option(None, '--mlcube', '-m', help='MLCube UID')
+ |
+
cli/medperf/commands/association/association.py
list(filter=typer.Argument(None))
+
+¶Display all associations related to the current user.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filter |
+
+ str
+ |
+
+
+
+ Filter associations by approval status. +Defaults to displaying all user associations. + |
+
+ typer.Argument(None)
+ |
+
cli/medperf/commands/association/association.py
reject(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), dataset_uid=typer.Option(None, '--dataset', '-d', help='Dataset UID'), mlcube_uid=typer.Option(None, '--mlcube', '-m', help='MLCube UID'))
+
+¶Rejects an association between a benchmark and a dataset or model mlcube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID. + |
+
+ typer.Option(..., '--benchmark', '-b', help='Benchmark UID')
+ |
+
dataset_uid |
+
+ int
+ |
+
+
+
+ Dataset UID. + |
+
+ typer.Option(None, '--dataset', '-d', help='Dataset UID')
+ |
+
mlcube_uid |
+
+ int
+ |
+
+
+
+ Model MLCube UID. + |
+
+ typer.Option(None, '--mlcube', '-m', help='MLCube UID')
+ |
+
cli/medperf/commands/association/association.py
set_priority(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), mlcube_uid=typer.Option(..., '--mlcube', '-m', help='MLCube UID'), priority=typer.Option(..., '--priority', '-p', help='Priority, an integer'))
+
+¶Updates the priority of a benchmark-model association. Model priorities within +a benchmark define which models need to be executed before others when +this benchmark is run. A model with a higher priority is executed before +a model with lower priority. The order of execution of models of the same priority +is arbitrary.
+ + + +Examples:
+ +Assume there are three models of IDs (1,2,3), associated with a certain benchmark, +all having priority = 0. +- By setting the priority of model (2) to the value of 1, the client will make +sure that model (2) is executed before models (1,3). +- By setting the priority of model (1) to the value of -5, the client will make +sure that models (2,3) are executed before model (1).
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID. + |
+
+ typer.Option(..., '--benchmark', '-b', help='Benchmark UID')
+ |
+
mlcube_uid |
+
+ int
+ |
+
+
+
+ Model MLCube UID. + |
+
+ typer.Option(..., '--mlcube', '-m', help='MLCube UID')
+ |
+
priority |
+
+ int
+ |
+
+
+
+ Priority, an integer + |
+
+ typer.Option(..., '--priority', '-p', help='Priority, an integer')
+ |
+
cli/medperf/commands/association/association.py
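The priority rule above can be sketched in a few lines. This is an illustrative ordering function only, not the MedPerf client's code; the name `execution_order` and the dictionary layout (model ID mapped to priority) are assumptions made for the example.

```python
from typing import Dict, List


def execution_order(priorities: Dict[int, int]) -> List[int]:
    """Return model IDs ordered so that higher priorities run first.

    Ties keep their input order here because sorted() is stable, but the
    text states that the order among equal priorities is arbitrary.
    """
    return sorted(priorities, key=lambda model_id: priorities[model_id], reverse=True)


# Three models associated with a benchmark, all with priority 0.
priorities = {1: 0, 2: 0, 3: 0}

# Raising model 2 to priority 1 puts it ahead of models 1 and 3.
priorities[2] = 1
first_order = execution_order(priorities)

# Lowering model 1 to priority -5 puts it behind models 2 and 3.
priorities[1] = -5
second_order = execution_order(priorities)
```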
ListAssociations
+
+
+¶cli/medperf/commands/association/list.py
run(filter=None)
+
+
+ staticmethod
+
+
+¶Get Pending association requests
+ +cli/medperf/commands/association/list.py
AssociationPriority
+
+
+¶cli/medperf/commands/association/priority.py
run(benchmark_uid, mlcube_uid, priority)
+
+
+ staticmethod
+
+
+¶Sets priority for an association between a benchmark and an mlcube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID. + |
+ + required + | +
mlcube_uid |
+
+ int
+ |
+
+
+
+ MLCube UID. + |
+ + required + | +
priority |
+
+ int
+ |
+
+
+
+ priority value + |
+ + required + | +
cli/medperf/commands/association/priority.py
login(email=typer.Option(None, '--email', '-e', help='The email associated with your account'))
+
+¶Authenticate to be able to access the MedPerf server. A verification link will +be provided and should be opened in a browser to complete the login process.
+ +cli/medperf/commands/auth/auth.py
logout()
+
+¶status()
+
+¶synapse_login(token=typer.Option(None, '--token', '-t', help='Personal Access Token to login with'))
+
+¶Login to the synapse server. +Provide either a username and a password, or a token
+ +cli/medperf/commands/auth/auth.py
Login
+
+
+¶cli/medperf/commands/auth/login.py
run(email=None)
+
+
+ staticmethod
+
+
+¶Authenticate to be able to access the MedPerf server. A verification link will +be provided and should be opened in a browser to complete the login process.
+ +cli/medperf/commands/auth/login.py
Logout
+
+
+¶cli/medperf/commands/auth/logout.py
SynapseLogin
+
+
+¶cli/medperf/commands/auth/synapse_login.py
login_with_token(access_token=None)
+
+
+ classmethod
+
+
+¶Login to the Synapse server. Must be done only once.
+ +cli/medperf/commands/auth/synapse_login.py
run(token=None)
+
+
+ classmethod
+
+
+¶Login to the Synapse server. Must be done only once.
+ +cli/medperf/commands/auth/synapse_login.py
AssociateBenchmark
+
+
+¶cli/medperf/commands/benchmark/associate.py
run(benchmark_uid, model_uid, data_uid, approved=False, no_cache=False)
+
+
+ classmethod
+
+
+¶Associates a dataset or model to the given benchmark
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of benchmark to associate entities with + |
+ + required + | +
model_uid |
+
+ int
+ |
+
+
+
+ UID of model to associate with benchmark + |
+ + required + | +
data_uid |
+
+ int
+ |
+
+
+
+ UID of dataset to associate with benchmark + |
+ + required + | +
comms |
+
+ Comms
+ |
+
+
+
+ Instance of Communications interface + |
+ + required + | +
ui |
+
+ UI
+ |
+
+
+
+ Instance of UI interface + |
+ + required + | +
approved |
+
+ bool
+ |
+
+
+
+ Skip approval step. Defaults to False + |
+
+ False
+ |
+
cli/medperf/commands/benchmark/associate.py
associate(benchmark_uid=typer.Option(..., '--benchmark_uid', '-b', help='UID of benchmark to associate with'), model_uid=typer.Option(None, '--model_uid', '-m', help='UID of model MLCube to associate'), dataset_uid=typer.Option(None, '--data_uid', '-d', help='Server UID of registered dataset to associate'), approval=typer.Option(False, '-y', help='Skip approval step'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'))
+
+¶Associates a benchmark with a given mlcube or dataset. Only one option at a time.
+ +cli/medperf/commands/benchmark/benchmark.py
list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered benchmarks'), mine=typer.Option(False, '--mine', help='Get current-user benchmarks'))
+
+¶List benchmarks
+ +cli/medperf/commands/benchmark/benchmark.py
run(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='UID of the desired benchmark'), data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), file=typer.Option(None, '--models-from-file', '-f', help='A file containing the model UIDs to be executed.\n\n The file should contain a single line as a list of\n\n comma-separated integers corresponding to the model UIDs'), ignore_model_errors=typer.Option(False, '--ignore-model-errors', help='Ignore failing model cubes, allowing for possibly submitting partial results'), no_cache=typer.Option(False, '--no-cache', help='Execute even if results already exist'))
+
+¶Runs the benchmark execution step for a given benchmark, prepared dataset and model
+ +cli/medperf/commands/benchmark/benchmark.py
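The help text for `--models-from-file` describes a file holding a single line of comma-separated integers. A minimal sketch of how such a line could be parsed follows; the function name `parse_model_uids` is hypothetical and this is not taken from the MedPerf source.

```python
from typing import List


def parse_model_uids(content: str) -> List[int]:
    """Parse the first line of the file content as comma-separated model UIDs."""
    lines = content.strip().splitlines()
    first_line = lines[0] if lines else ""
    # int() tolerates surrounding whitespace; empty items are skipped.
    return [int(item) for item in first_line.split(",") if item.strip()]


# Example file content with inconsistent spacing and a trailing newline.
model_uids = parse_model_uids("4, 12,7\n")
```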
submit(name=typer.Option(..., '--name', '-n', help='Name of the benchmark'), description=typer.Option(..., '--description', '-d', help='Description of the benchmark'), docs_url=typer.Option('', '--docs-url', '-u', help='URL to documentation'), demo_url=typer.Option(..., '--demo-url', help='Identifier to download the demonstration dataset tarball file.\n\n See `medperf mlcube submit --help` for more information'), demo_hash=typer.Option('', '--demo-hash', help='Hash of demonstration dataset tarball file'), data_preparation_mlcube=typer.Option(..., '--data-preparation-mlcube', '-p', help='Data Preparation MLCube UID'), reference_model_mlcube=typer.Option(..., '--reference-model-mlcube', '-m', help='Reference Model MLCube UID'), evaluator_mlcube=typer.Option(..., '--evaluator-mlcube', '-e', help='Evaluator MLCube UID'), skip_data_preparation_step=typer.Option(False, '--skip-demo-data-preparation', help='Use this flag if the demo dataset is already prepared'), operational=typer.Option(False, '--operational', help='Submit the Benchmark as OPERATIONAL'))
+
+¶Submits a new benchmark to the platform
+ +cli/medperf/commands/benchmark/benchmark.py
view(entity_id=typer.Argument(None, help='Benchmark ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered benchmarks if benchmark ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user benchmarks if benchmark ID is not provided'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
+
+¶Displays the information of one or more benchmarks
+ +cli/medperf/commands/benchmark/benchmark.py
SubmitBenchmark
+
+
+¶cli/medperf/commands/benchmark/submit.py
get_extra_information()
+
+¶Retrieves information that must be populated automatically, +like hash, generated uid and test results
+ +cli/medperf/commands/benchmark/submit.py
run(benchmark_info, no_cache=True, skip_data_preparation_step=False)
+
+
+ classmethod
+
+
+¶Submits a new benchmark to the MedPerf platform
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_info |
+
+ dict
+ |
+
+
+
+ benchmark information +expected keys: + name (str): benchmark name + description (str): benchmark description + docs_url (str): benchmark documentation url + demo_url (str): benchmark demo dataset url + demo_hash (str): benchmark demo dataset hash + data_preparation_mlcube (int): benchmark data preparation mlcube uid + reference_model_mlcube (int): benchmark reference model mlcube uid + evaluator_mlcube (int): benchmark data evaluator mlcube uid + |
+ + required + | +
cli/medperf/commands/benchmark/submit.py
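To make the expected shape of `benchmark_info` concrete, here is a small, hedged sketch that checks a dictionary against the keys and types listed above. The helper `check_benchmark_info` and all example values are hypothetical; MedPerf's own validation may differ.

```python
from typing import Dict, List

# Keys and types as listed in the parameter description above.
EXPECTED_KEYS: Dict[str, type] = {
    "name": str,
    "description": str,
    "docs_url": str,
    "demo_url": str,
    "demo_hash": str,
    "data_preparation_mlcube": int,
    "reference_model_mlcube": int,
    "evaluator_mlcube": int,
}


def check_benchmark_info(info: dict) -> List[str]:
    """Return a list of problems found; an empty list means the dict looks complete."""
    problems = []
    for key, expected_type in EXPECTED_KEYS.items():
        if key not in info:
            problems.append(f"missing: {key}")
        elif not isinstance(info[key], expected_type):
            problems.append(f"wrong type: {key}")
    return problems


# Placeholder values, for illustration only.
benchmark_info = {
    "name": "example-benchmark",
    "description": "An illustrative benchmark",
    "docs_url": "",
    "demo_url": "https://example.com/demo.tar.gz",
    "demo_hash": "",
    "data_preparation_mlcube": 1,
    "reference_model_mlcube": 2,
    "evaluator_mlcube": 3,
}
problems = check_benchmark_info(benchmark_info)
```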
run_compatibility_test()
+
+¶Runs a compatibility test to ensure elements are compatible, +and to extract additional information required for submission
+ +cli/medperf/commands/benchmark/submit.py
to_permanent_path(bmk_dict)
+
+¶Renames the temporary benchmark submission to a permanent one
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
bmk_dict |
+
+ dict
+ |
+
+
+
+ dictionary containing updated information of the submitted benchmark + |
+ + required + | +
cli/medperf/commands/benchmark/submit.py
list()
+
+¶List previously executed tests reports.
+ +cli/medperf/commands/compatibility_test/compatibility_test.py
run(benchmark_uid=typer.Option(None, '--benchmark', '-b', help='UID of the benchmark to test. Optional'), data_uid=typer.Option(None, '--data_uid', '-d', help='Prepared Dataset UID. Used for dataset testing. Optional. Defaults to benchmark demo dataset.'), demo_dataset_url=typer.Option(None, '--demo_dataset_url', help='Identifier to download the demonstration dataset tarball file.\n\n See `medperf mlcube submit --help` for more information'), demo_dataset_hash=typer.Option(None, '--demo_dataset_hash', help='Hash of the demo dataset, if provided.'), data_path=typer.Option(None, '--data_path', help='Path to raw input data.'), labels_path=typer.Option(None, '--labels_path', help='Path to the labels of the raw input data, if provided.'), data_prep=typer.Option(None, '--data_preparation', '-p', help='UID or local path to the data preparation mlcube. Optional. Defaults to benchmark data preparation mlcube.'), model=typer.Option(None, '--model', '-m', help='UID or local path to the model mlcube. Optional. Defaults to benchmark reference mlcube.'), evaluator=typer.Option(None, '--evaluator', '-e', help='UID or local path to the evaluator mlcube. Optional. Defaults to benchmark evaluator mlcube'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'), offline=typer.Option(False, '--offline', help='Execute the test without connecting to the MedPerf server.'), skip_data_preparation_step=typer.Option(False, '--skip-demo-data-preparation', help='Use this flag if the passed demo dataset or data path is already prepared'))
+
+¶Executes a compatibility test for a determined benchmark. +Can test prepared and unprepared datasets, remote and local models independently.
+ +cli/medperf/commands/compatibility_test/compatibility_test.py
view(entity_id=typer.Argument(None, help='Test report ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
+
+¶Displays the information of one or more test reports
+ +cli/medperf/commands/compatibility_test/compatibility_test.py
CompatibilityTestExecution
+
+
+¶cli/medperf/commands/compatibility_test/run.py
cached_results()
+
+¶Checks the existence of, and retrieves if possible, the compatibility test +result. This method is called prior to the test execution.
+ + + +Returns:
+Type | +Description | +
---|---|
+ dict | None
+ |
+
+
+
+ None if the results do not exist or if self.no_cache is True, + |
+
+ | +
+
+
+ otherwise it returns the found results. + |
+
cli/medperf/commands/compatibility_test/run.py
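The caching behaviour described for `cached_results()` can be sketched as follows. This is a simplified stand-alone function under stated assumptions: results are stored as a JSON file at a known path, and the parameter names `results_path` and `no_cache` are chosen for the example. The actual storage format and location used by MedPerf are not specified here.

```python
import json
import os
import tempfile
from typing import Optional


def cached_results(results_path: str, no_cache: bool) -> Optional[dict]:
    """Return stored results, or None if caching is disabled or nothing is stored."""
    if no_cache or not os.path.exists(results_path):
        return None
    with open(results_path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.json")

    # Nothing stored yet: the test would need to be executed.
    missing = cached_results(path, no_cache=False)

    with open(path, "w") as f:
        json.dump({"accuracy": 0.9}, f)

    # Stored results are returned, unless the caller asks to ignore them.
    found = cached_results(path, no_cache=False)
    ignored = cached_results(path, no_cache=True)
```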
execute()
+
+¶Runs the test execution flow and returns the results
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict | + | +
+
+
+ returns the results of the test execution. + |
+
cli/medperf/commands/compatibility_test/run.py
initialize_report()
+
+¶Initializes an instance of TestReport
to hold the current test information.
cli/medperf/commands/compatibility_test/run.py
prepare_cubes()
+
+¶Prepares the mlcubes. If the provided mlcube is a path, it will create +a temporary uid and link the cube path to the medperf storage path.
+ +cli/medperf/commands/compatibility_test/run.py
prepare_dataset()
+
+¶Assigns the data_uid used for testing and retrieves the dataset. +If the data is not prepared, it calls the data preparation step +on the given local data path or using a remote demo dataset.
+ +cli/medperf/commands/compatibility_test/run.py
process_benchmark()
+
+¶Process the benchmark input if given. Sets the needed parameters from +the benchmark.
+ +cli/medperf/commands/compatibility_test/run.py
run(benchmark=None, data_prep=None, model=None, evaluator=None, data_path=None, labels_path=None, demo_dataset_url=None, demo_dataset_hash=None, data_uid=None, no_cache=False, offline=False, skip_data_preparation_step=False)
+
+
+ classmethod
+
+
+¶Execute a test workflow. Components of a complete workflow should be passed. +When only the benchmark is provided, it implies the following workflow will be used: +- the benchmark's demo dataset is used as the raw data +- the benchmark's data preparation cube is used +- the benchmark's reference model cube is used +- the benchmark's metrics cube is used
+Overriding benchmark's components: +- The data preparation, model, and metrics cubes can be overridden by specifying a cube either +as an integer (registered) or a path (local). The path can refer either to the mlcube config +file or to the mlcube directory containing the mlcube config file. +- Instead of using the demo dataset of the benchmark, the input raw data can be overridden by providing: + - a demo dataset url and its hash + - data path and labels path +- A prepared dataset can be directly used. In this case the data preparation cube is never used. +The prepared data can be provided by either specifying an integer (registered) or a hash of a +locally prepared dataset.
+Whether the benchmark is provided or not, the command will fail either if the user fails to +provide a valid complete workflow, or if the user provided extra redundant parameters.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark |
+
+ int
+ |
+
+
+
+ Benchmark to run the test workflow for + |
+
+ None
+ |
+
data_prep |
+
+ str
+ |
+
+
+
+ data preparation mlcube uid or local path. + |
+
+ None
+ |
+
model |
+
+ str
+ |
+
+
+
+ model mlcube uid or local path. + |
+
+ None
+ |
+
evaluator |
+
+ str
+ |
+
+
+
+ evaluator mlcube uid or local path. + |
+
+ None
+ |
+
data_path |
+
+ str
+ |
+
+
+
+ path to a local raw data + |
+
+ None
+ |
+
labels_path |
+
+ str
+ |
+
+
+
+ path to the labels of the local raw data + |
+
+ None
+ |
+
demo_dataset_url |
+
+ str
+ |
+
+
+
+ Identifier to download the demonstration dataset tarball file. + |
+
+ None
+ |
+
demo_dataset_hash |
+
+ str
+ |
+
+
+
+ The hash of the demo dataset tarball file + |
+
+ None
+ |
+
data_uid |
+
+ str
+ |
+
+
+
+ A prepared dataset UID + |
+
+ None
+ |
+
no_cache |
+
+ bool
+ |
+
+
+
+ Whether to ignore cached results of the test execution. Defaults to False. + |
+
+ False
+ |
+
offline |
+
+ bool
+ |
+
+
+
+ Whether to disable communication to the MedPerf server and rely only on + |
+
+ False
+ |
+
Returns:
+Type | +Description | +
---|---|
+ str
+ |
+
+
+
+ Prepared Dataset UID used for the test. Could be the one provided or a generated one. + |
+
+ dict
+ |
+
+
+
+ Results generated by the test. + |
+
cli/medperf/commands/compatibility_test/run.py
write(results)
+
+¶Writes a report of the test execution to the disk
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
results |
+
+ dict
+ |
+
+
+
+ the results of the test execution + |
+ + required + | +
cli/medperf/commands/compatibility_test/run.py
download_demo_data(dset_url, dset_hash)
+
+¶Retrieves the demo dataset associated to the specified benchmark
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
data_path |
+ str
+ |
+
+
+
+ Location of the downloaded data + |
+
labels_path |
+ str
+ |
+
+
+
+ Location of the downloaded labels + |
+
cli/medperf/commands/compatibility_test/utils.py
prepare_cube(cube_uid)
+
+¶Assigns the attr used for testing according to the initialization parameters. +If the value is a path, it will create a temporary uid and link the cube path to +the medperf storage path.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
attr |
+
+ str
+ |
+
+
+
+ Attribute to check and/or reassign. + |
+ + required + | +
fallback |
+
+ any
+ |
+
+
+
+ Value to assign if attribute is empty. Defaults to None. + |
+ + required + | +
cli/medperf/commands/compatibility_test/utils.py
CompatibilityTestParamsValidator
+
+
+¶Validates the input parameters to the CompatibilityTestExecution class
+ +cli/medperf/commands/compatibility_test/validate_params.py
get_data_source()
+
+¶Parses the input parameters and returns a string, one of: +"prepared", if the source of data is a prepared dataset uid, +"path", if the source of data is a local path to raw data, +"demo", if the source of data is a demo dataset url, +or "benchmark", if the source of data is the demo dataset of a benchmark.
+This function assumes the passed parameters to the constructor have been already +validated.
+ +cli/medperf/commands/compatibility_test/validate_params.py
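A minimal sketch of the decision described for `get_data_source()` is shown below. It is written as a free function with keyword arguments mirroring the `run` parameters; the order of the checks is an assumption, which is harmless if, as the text states, the parameters have already been validated so that at most one data source is set.

```python
from typing import Optional


def get_data_source(
    data_uid: Optional[str] = None,
    data_path: Optional[str] = None,
    demo_dataset_url: Optional[str] = None,
) -> str:
    """Classify where the test data comes from, assuming validated inputs."""
    if data_uid is not None:
        return "prepared"   # a prepared dataset UID was given
    if data_path is not None:
        return "path"       # a local path to raw data was given
    if demo_dataset_url is not None:
        return "demo"       # a demo dataset URL was given
    return "benchmark"      # fall back to the benchmark's demo dataset


sources = [
    get_data_source(data_uid="42"),
    get_data_source(data_path="/data/raw"),
    get_data_source(demo_dataset_url="https://example.com/demo.tar.gz"),
    get_data_source(),
]
```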
validate()
+
+¶Ensures test has been passed a valid combination of parameters.
+Raises medperf.exceptions.InvalidArgumentError
when the parameters are
+invalid.
cli/medperf/commands/compatibility_test/validate_params.py
AssociateDataset
+
+
+¶cli/medperf/commands/dataset/associate.py
run(data_uid, benchmark_uid, approved=False, no_cache=False)
+
+
+ staticmethod
+
+
+¶Associates a registered dataset with a benchmark
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
data_uid |
+
+ int
+ |
+
+
+
+ UID of the registered dataset to associate + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of the benchmark to associate with + |
+ + required + | +
cli/medperf/commands/dataset/associate.py
associate(data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), benchmark_uid=typer.Option(..., '--benchmark_uid', '-b', help='Benchmark UID'), approval=typer.Option(False, '-y', help='Skip approval step'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'))
+
+¶Associate a registered dataset with a specific benchmark. +The dataset and benchmark must share the same data preparation cube.
+ +cli/medperf/commands/dataset/dataset.py
list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered datasets'), mine=typer.Option(False, '--mine', help='Get current-user datasets'), mlcube=typer.Option(None, '--mlcube', '-m', help='Get datasets for a given data prep mlcube'))
+
+¶List datasets
+ +cli/medperf/commands/dataset/dataset.py
prepare(data_uid=typer.Option(..., '--data_uid', '-d', help='Dataset UID'), approval=typer.Option(False, '-y', help='Skip report submission approval step (In this case, it is assumed to be approved)'))
+
+¶Runs the Data preparation step for a raw dataset
+ +cli/medperf/commands/dataset/dataset.py
set_operational(data_uid=typer.Option(..., '--data_uid', '-d', help='Dataset UID'), approval=typer.Option(False, '-y', help='Skip confirmation and statistics submission approval step'))
+
+¶Marks a dataset as Operational
+ +cli/medperf/commands/dataset/dataset.py
submit(benchmark_uid=typer.Option(None, '--benchmark', '-b', help='UID of the desired benchmark'), data_prep_uid=typer.Option(None, '--data_prep', '-p', help='UID of the desired preparation cube'), data_path=typer.Option(..., '--data_path', '-d', help='Path to the data'), labels_path=typer.Option(..., '--labels_path', '-l', help='Path to the labels'), metadata_path=typer.Option(None, '--metadata_path', '-m', help='Metadata folder location (Might be required if the dataset is already prepared)'), name=typer.Option(..., '--name', help='A human-readable name of the dataset'), description=typer.Option(None, '--description', help='A description of the dataset'), location=typer.Option(None, '--location', help='Location or Institution the data belongs to'), approval=typer.Option(False, '-y', help='Skip approval step'), submit_as_prepared=typer.Option(False, '--submit-as-prepared', help='Use this flag if the dataset is already prepared'))
+
+¶Submits a Dataset instance to the backend
+ +cli/medperf/commands/dataset/dataset.py
view(entity_id=typer.Argument(None, help='Dataset ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered datasets if dataset ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user datasets if dataset ID is not provided'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
+
+¶Displays the information of one or more datasets
+ +cli/medperf/commands/dataset/dataset.py
DatasetSetOperational
+
+
+¶cli/medperf/commands/dataset/set_operational.py
generate_uids()
+
+¶Auto-generates dataset UIDs for both input and output paths
+ +cli/medperf/commands/dataset/set_operational.py
todict()
+
+¶Dictionary representation of the update body
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary containing information pertaining to the dataset. + |
+
cli/medperf/commands/dataset/set_operational.py
write()
+
+¶Writes the registration to disk
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filename |
+
+ str
+ |
+
+
+
+ name of the file. Defaults to config.reg_file. + |
+ + required + | +
DataCreation
+
+
+¶cli/medperf/commands/dataset/submit.py
create_dataset_object()
+
+¶generates dataset UIDs for both input path
+ +cli/medperf/commands/dataset/submit.py
to_permanent_path(updated_dataset_dict)
+
+¶Renames the temporary dataset submission to a permanent one
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
updated_dataset_dict |
+
+ dict
+ |
+
+
+
+ dictionary containing updated information of the submitted dataset + |
+ + required + | +
cli/medperf/commands/dataset/submit.py
Execution
+
+
+¶cli/medperf/commands/execution.py
run(dataset, model, evaluator, ignore_model_errors=False)
+
+
+ classmethod
+
+
+¶Benchmark execution flow.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of the desired benchmark + |
+ + required + | +
data_uid |
+
+ str
+ |
+
+
+
+ Registered Dataset UID + |
+ + required + | +
model_uid |
+
+ int
+ |
+
+
+
+ UID of model to execute + |
+ + required + | +
cli/medperf/commands/execution.py
EntityList
+
+
+¶cli/medperf/commands/list.py
run(entity_class, fields, unregistered=False, mine_only=False, **kwargs)
+
+
+ staticmethod
+
+
+¶Lists entities of the given class
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
unregistered |
+
+ bool
+ |
+
+
+
+ Display only local unregistered results. Defaults to False. + |
+
+ False
+ |
+
mine_only |
+
+ bool
+ |
+
+
+
+ Display all registered current-user results. Defaults to False. + |
+
+ False
+ |
+
kwargs |
+
+ dict
+ |
+
+
+
+ Additional parameters for filtering entity lists. + |
+
+ {}
+ |
+
cli/medperf/commands/list.py
AssociateCube
+
+
+¶cli/medperf/commands/mlcube/associate.py
run(cube_uid, benchmark_uid, approved=False, no_cache=False)
+
+
+ classmethod
+
+
+¶Associates a cube with a given benchmark
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_uid |
+
+ int
+ |
+
+
+
+ UID of model MLCube + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of benchmark + |
+ + required + | +
approved |
+
+ bool
+ |
+
+
+
+ Skip validation step. Defaults to False + |
+
+ False
+ |
+
cli/medperf/commands/mlcube/associate.py
CreateCube
+
+
+¶cli/medperf/commands/mlcube/create.py
run(template_name, output_path='.', config_file=None)
+
+
+ classmethod
+
+
+¶Creates a new MLCube based on one of the provided templates
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
template_name |
+
+ str
+ |
+
+
+
+ The name of the template to use + |
+ + required + | +
output_path |
+
+ (str, Optional)
+ |
+
+
+
+ The desired path for the MLCube. Defaults to current path. + |
+
+ '.'
+ |
+
config_file |
+
+ (str, Optional)
+ |
+
+
+
+ Path to a JSON configuration file. If not passed, user is prompted. + |
+
+ None
+ |
+
cli/medperf/commands/mlcube/create.py
associate(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), model_uid=typer.Option(..., '--model_uid', '-m', help='Model UID'), approval=typer.Option(False, '-y', help='Skip approval step'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'))
+
+¶Associates an MLCube to a benchmark
+ +cli/medperf/commands/mlcube/mlcube.py
create(template=typer.Argument(..., help=f'MLCube template name. Available templates: [{' | '.join(config.templates.keys())}]'), output_path=typer.Option('.', '--output', '-o', help='Save the generated MLCube to the specified path'), config_file=typer.Option(None, '--config-file', '-c', help='JSON Configuration file. If not present then user is prompted for configuration'))
+
+¶Creates an MLCube based on one of the specified templates
+ +cli/medperf/commands/mlcube/mlcube.py
list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered mlcubes'), mine=typer.Option(False, '--mine', help='Get current-user mlcubes'))
+
+¶List mlcubes
+ +cli/medperf/commands/mlcube/mlcube.py
submit(name=typer.Option(..., '--name', '-n', help='Name of the mlcube'), mlcube_file=typer.Option(..., '--mlcube-file', '-m', help='Identifier to download the mlcube file. See the description above'), mlcube_hash=typer.Option('', '--mlcube-hash', help='hash of mlcube file'), parameters_file=typer.Option('', '--parameters-file', '-p', help='Identifier to download the parameters file. See the description above'), parameters_hash=typer.Option('', '--parameters-hash', help='hash of parameters file'), additional_file=typer.Option('', '--additional-file', '-a', help='Identifier to download the additional files tarball. See the description above'), additional_hash=typer.Option('', '--additional-hash', help='hash of additional file'), image_file=typer.Option('', '--image-file', '-i', help='Identifier to download the image file. See the description above'), image_hash=typer.Option('', '--image-hash', help='hash of image file'), operational=typer.Option(False, '--operational', help='Submit the MLCube as OPERATIONAL'))
+
+¶Submits a new cube to the platform.
+ +mlcube_file, parameters_file, additional_file, and image_file are expected to be given in the following format: <source_prefix>:<resource_identifier>, where source_prefix instructs the client how to download the resource, and resource_identifier is the identifier used to download the asset. The following are supported:
+A direct link: "direct:<URL>"
+An asset hosted on the Synapse platform: "synapse:<synapse ID>"
+If a URL is given without a source prefix, it will be treated as a direct download link.
+ +cli/medperf/commands/mlcube/mlcube.py
view(entity_id=typer.Argument(None, help='MLCube ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered mlcubes if mlcube ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user mlcubes if mlcube ID is not provided'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
+
+¶Displays the information of one or more mlcubes
+ +cli/medperf/commands/mlcube/mlcube.py
SubmitCube
+
+
+¶cli/medperf/commands/mlcube/submit.py
run(submit_info)
+
+
+ classmethod
+
+
+¶Submits a new cube to the medperf platform
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
submit_info |
+
+ dict
+ |
+
+
+
+ Dictionary containing the cube information. + |
+ + required + | +
cli/medperf/commands/mlcube/submit.py
to_permanent_path(cube_dict)
+
+¶Renames the temporary cube submission to a permanent one using the uid of +the registered cube
+ +cli/medperf/commands/mlcube/submit.py
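The rename from a temporary submission to a permanent one can be pictured with a short sketch. It assumes, purely for illustration, that each submission lives in its own folder and that the permanent folder is named after the UID returned by the server; the function name `rename_to_permanent` and its parameters are hypothetical and are not taken from the MedPerf source.

```python
import os


def rename_to_permanent(tmp_path: str, storage_dir: str, uid: int) -> str:
    """Rename a temporary submission folder to one named after its UID.

    Hypothetical layout: `storage_dir/<uid>` is the permanent location.
    """
    permanent_path = os.path.join(storage_dir, str(uid))
    os.rename(tmp_path, permanent_path)  # fails if the target already exists and is not empty
    return permanent_path
```

A caller would pass the UID found in the dictionary returned by the server after registration.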
activate(profile)
+
+¶Assigns the active profile, which is used by default
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
profile |
+
+ str
+ |
+
+
+
+ Name of the profile to be used. + |
+ + required + | +
cli/medperf/commands/profile.py
create(ctx, name=typer.Option(..., '--name', '-n', help="Profile's name"))
+
+¶Creates a new profile for managing and customizing configuration
+ +cli/medperf/commands/profile.py
delete(profile)
+
+¶Deletes a profile's configuration.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
profile |
+
+ str
+ |
+
+
+
+ Profile to delete. + |
+ + required + | +
cli/medperf/commands/profile.py
list()
+
+¶Lists all available profiles
+ +cli/medperf/commands/profile.py
set_args(ctx)
+
+¶Assign key-value configuration pairs to the current profile.
+ +cli/medperf/commands/profile.py
view(profile=typer.Argument(None))
+
+¶Displays a profile's configuration.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
profile |
+
+ str
+ |
+
+
+
+ Profile to display information from. Defaults to active profile. + |
+
+ typer.Argument(None)
+ |
+
cli/medperf/commands/profile.py
BenchmarkExecution
+
+
+¶cli/medperf/commands/result/create.py
run(benchmark_uid, data_uid, models_uids=None, models_input_file=None, ignore_model_errors=False, ignore_failed_experiments=False, no_cache=False, show_summary=False)
+
+
+ classmethod
+
+
+¶Benchmark execution flow.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of the desired benchmark + |
+ + required + | +
data_uid |
+
+ str
+ |
+
+
+
+ Registered Dataset UID + |
+ + required + | +
models_uids |
+
+ List | None
+ |
+
+
+
+ list of model UIDs to execute. + if None, models_input_file will be used + |
+
+ None
+ |
+
models_input_file |
+
+ Optional[str]
+ |
+
+
+
+ filename to read from + |
+
+ None
+ |
+
cli/medperf/commands/result/create.py
create(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='UID of the desired benchmark'), data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), model_uid=typer.Option(..., '--model_uid', '-m', help='UID of model to execute'), ignore_model_errors=typer.Option(False, '--ignore-model-errors', help='Ignore failing model cubes, allowing for possibly submitting partial results'), no_cache=typer.Option(False, '--no-cache', help='Execute even if results already exist'))
+
+¶Runs the benchmark execution step for a given benchmark, prepared dataset and model
+ +cli/medperf/commands/result/result.py
list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered results'), mine=typer.Option(False, '--mine', help='Get current-user results'), benchmark=typer.Option(None, '--benchmark', '-b', help='Get results for a given benchmark'))
+
+¶List results
+ +cli/medperf/commands/result/result.py
submit(result_uid=typer.Option(..., '--result', '-r', help='Unregistered result UID'), approval=typer.Option(False, '-y', help='Skip approval step'))
+
+¶Submits already obtained results to the server
+ +cli/medperf/commands/result/result.py
view(entity_id=typer.Argument(None, help='Result ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered results if result ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user results if result ID is not provided'), benchmark=typer.Option(None, '--benchmark', '-b', help='Get results for a given benchmark'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
+
+¶Displays the information of one or more results
+ +cli/medperf/commands/result/result.py
ResultSubmission
+
+
+¶cli/medperf/commands/result/submit.py
to_permanent_path(result_dict)
+
+¶Rename the temporary result submission to a permanent one
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
result_dict |
+
+ dict
+ |
+
+
+
+ updated results dictionary + |
+ + required + | +
cli/medperf/commands/result/submit.py
clean()
+
+¶ls()
+
+¶Show the location of the current medperf assets
+ +cli/medperf/commands/storage.py
move(path=typer.Option(..., '--target', '-t', help='Target path'))
+
+¶Moves all storage folders to a target base path. Folders include: +Benchmarks, datasets, mlcubes, results, tests, ...
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
path |
+
+ str
+ |
+
+
+
+ target path + |
+
+ typer.Option(..., '--target', '-t', help='Target path')
+ |
+
cli/medperf/commands/storage.py
EntityView
+
+
+¶cli/medperf/commands/view.py
run(entity_id, entity_class, format='yaml', unregistered=False, mine_only=False, output=None, **kwargs)
+
+
+ staticmethod
+
+
+¶Displays the contents of a single or multiple entities of a given type
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
entity_id |
+
+ Union[int, str]
+ |
+
+
+
+ Entity identifier + |
+ + required + | +
entity_class |
+
+ Entity
+ |
+
+
+
+ Entity type + |
+ + required + | +
unregistered |
+
+ bool
+ |
+
+
+
+ Display only local unregistered entities. Defaults to False. + |
+
+ False
+ |
+
mine_only |
+
+ bool
+ |
+
+
+
+ Display all current-user entities. Defaults to False. + |
+
+ False
+ |
+
format |
+
+ str
+ |
+
+
+
+ What format to use to display the contents. Valid formats: [yaml, json]. Defaults to yaml. + |
+
+ 'yaml'
+ |
+
output |
+
+ str
+ |
+
+
+
+ Path to a file for storing the entity contents. If not provided, the contents are printed. + |
+
+ None
+ |
+
kwargs |
+
+ dict
+ |
+
+
+
+ Additional parameters for filtering entity lists. + |
+
+ {}
+ |
+
cli/medperf/commands/view.py
Auth0
+
+
+¶
+ Bases: Auth
cli/medperf/comms/auth/auth0.py
access_token
+
+
+ property
+
+
+¶Thread and process-safe access token retrieval
+__check_token_email(id_token_payload, email)
+
+¶Checks if the email provided by the user in the terminal matches the +email found in the received id token.
+ +cli/medperf/comms/auth/auth0.py
__get_device_access_token(device_code, polling_interval)
+
+¶Get the access token from the auth0 backend associated with +the device code requested before. This function will keep polling +the access token until the user completes the browser flow part +of the authorization process.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
device_code |
+
+ str
+ |
+
+
+
+ A temporary device code requested by __request_device_code + |
+ + required + | +
polling_interval |
+
+ float
+ |
+
+
+
+ number of seconds to wait between each two polling requests + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
json_res |
+ dict
+ |
+
+
+
+ the response of the successful request, containing the access/refresh tokens pair + |
+
token_issued_at |
+ float
+ |
+
+
+
+ the timestamp when the access token was issued + |
+
cli/medperf/comms/auth/auth0.py
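The polling behaviour described for `__get_device_access_token` can be sketched as a plain loop. This is a minimal illustration, not the MedPerf implementation: `request_token` is a hypothetical callable standing in for one HTTP request to the token endpoint (returning `None` while authorization is pending), and `max_attempts`, `sleep`, and `clock` are assumptions added so the loop is bounded and testable.

```python
import time
from typing import Callable, Optional, Tuple


def poll_for_token(
    request_token: Callable[[], Optional[dict]],
    polling_interval: float,
    max_attempts: int = 100,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> Tuple[dict, float]:
    """Call `request_token` until it returns a response.

    Returns the response and the timestamp taken just before the
    successful request, mirroring the two documented return values.
    """
    for _ in range(max_attempts):
        token_issued_at = clock()  # recorded before each request
        json_res = request_token()
        if json_res is not None:
            return json_res, token_issued_at
        sleep(polling_interval)  # wait before the next polling request
    raise TimeoutError("authorization was not completed in time")
```

Injecting `sleep` and `clock` keeps the sketch easy to exercise without real waiting.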
__raise_errors(res, action)
+
+¶Logs the failed request's response and raises errors.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
res |
+
+ requests.Response
+ |
+
+
+
+ the response of a failed request + |
+ + required + | +
action |
+
+ str
+ |
+
+
+
+ a string for more informative error display + |
+ + required + | +
cli/medperf/comms/auth/auth0.py
__refresh_access_token(refresh_token)
+
+¶Retrieve and store a new access token using a refresh token. +A new refresh token will also be retrieved and stored.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
refresh_token |
+
+ str
+ |
+
+
+
+ the refresh token + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
access_token |
+ str
+ |
+
+
+
+ the new access token + |
+
cli/medperf/comms/auth/auth0.py
__request_device_code()
+
+¶Get a device code from the auth0 backend to be used for the authorization process
+ +cli/medperf/comms/auth/auth0.py
login(email)
+
+¶Retrieves and stores an access token/refresh token pair from the auth0 +backend using the device authorization flow.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
email |
+
+ str
+ |
+
+
+
+ user email. This will be used to validate that the received + id_token contains the same email address. + |
+ + required + | +
cli/medperf/comms/auth/auth0.py
logout()
+
+¶Logs out the user by revoking their refresh token and deleting the +stored tokens.
+ +cli/medperf/comms/auth/auth0.py
Auth
+
+
+¶
+ Bases: ABC
cli/medperf/comms/auth/interface.py
access_token
+
+
+ abstractmethod
+ property
+
+
+¶An access token to authorize requests to the MedPerf server
+__init__()
+
+
+ abstractmethod
+
+
+¶login(email)
+
+
+ abstractmethod
+
+
+¶Local
+
+
+¶
+ Bases: Auth
cli/medperf/comms/auth/local.py
access_token
+
+
+ property
+
+
+¶Reads and returns an access token of the currently logged +in user to be used for authorizing requests to the MedPerf server.
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
access_token |
+ str
+ |
+
+
+
+ the access token + |
+
login(email)
+
+¶Retrieves and stores an access token from a local store json file.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
email |
+
+ str
+ |
+
+
+
+ user email. + |
+ + required + | +
cli/medperf/comms/auth/local.py
This module defines a wrapper around the existing token verifier in the auth0-python library. The library is designed to cache public keys in memory. Since our client is ephemeral, we wrapped the library's JwksFetcher to cache keys in filesystem storage, and wrapped the library's signature verifier to use this new JwksFetcher.
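The idea of moving a key cache from memory to the filesystem can be shown with a small sketch. It does not use the auth0-python classes; `fetch_keys` is a hypothetical callable standing in for the network request to the key endpoint, and the JSON file layout is an assumption.

```python
import json
import os
from typing import Callable


def get_cached_keys(cache_path: str, fetch_keys: Callable[[], dict]) -> dict:
    """Return keys from `cache_path`, fetching and storing them on a miss."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)  # cache hit: no network request needed
    keys = fetch_keys()  # hypothetical network call
    with open(cache_path, "w") as f:
        json.dump(keys, f)
    return keys
```

Because the cache is a file, a second short-lived client process can reuse keys fetched by the first one.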
This module downloads files from the internet. It provides a set of +functions to download common files that are necessary for workflow executions +and are not on the MedPerf server. An example of such files is model weights +of a Model MLCube.
+This module takes care of validating the integrity of the downloaded file +if a hash was specified when requesting the file. It also returns the hash +of the downloaded file, which can be the original specified hash or the +calculated hash of the freshly downloaded file if no hash was specified.
+Additionally, to avoid unnecessary downloads, an existing file +will not be re-downloaded.
+ + + +get_benchmark_demo_dataset(url, expected_hash=None)
+
+¶Downloads and extracts a demo dataset. If the hash is provided, +the file's integrity will be checked upon download.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ URL where the compressed demo dataset file can be downloaded. + |
+ + required + | +
expected_hash |
+
+ str
+ |
+
+
+
+ expected hash of the downloaded file + |
+
+ None
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
output_path |
+ str
+ |
+
+
+
+ location where the uncompressed demo dataset is stored locally. + |
+
hash_value |
+ str
+ |
+
+
+
+ The hash of the downloaded tarball file + |
+
cli/medperf/comms/entity_resources/resources.py
get_cube(url, cube_path, expected_hash=None)
+
+¶Downloads and writes a cube mlcube.yaml file
+ +cli/medperf/comms/entity_resources/resources.py
get_cube_additional(url, cube_path, expected_tarball_hash=None)
+
+¶Retrieves additional files of an MLCube. The additional files +will be in a compressed tarball file. The function will additionally +extract this file.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ URL where the additional_files.tar.gz file can be downloaded. + |
+ + required + | +
cube_path |
+
+ str
+ |
+
+
+
+ Cube location. + |
+ + required + | +
expected_tarball_hash |
+
+ str
+ |
+
+
+
+ expected hash of tarball file + |
+
+ None
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
tarball_hash |
+ str
+ |
+
+
+
+ The hash of the downloaded tarball file + |
+
cli/medperf/comms/entity_resources/resources.py
get_cube_image(url, cube_path, hash_value=None)
+
+¶Retrieves and stores the image file from the server. Stores images +on a shared location, and retrieves a cached image by hash if found locally. +Creates a symbolic link to the cube storage.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ URL where the image file can be downloaded. + |
+ + required + | +
cube_path |
+
+ str
+ |
+
+
+
+ Path to cube. + |
+ + required + | +
hash_value |
+
+ (str, Optional)
+ |
+
+
+
+ File hash to store under shared storage. Defaults to None. + |
+
+ None
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
image_cube_file |
+ str
+ |
+
+
+
+ Location where the image file is stored locally. + |
+
hash_value |
+ str
+ |
+
+
+
+ The hash of the downloaded file + |
+
cli/medperf/comms/entity_resources/resources.py
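The shared-storage behaviour described for `get_cube_image` can be sketched as follows. The directory layout, the function name `store_image`, and the default link name are all hypothetical; the sketch only illustrates keeping one copy per hash and linking it into the cube folder.

```python
import os
import shutil


def store_image(downloaded_file: str, shared_dir: str, cube_path: str,
                hash_value: str, image_name: str = "image.tar.gz") -> str:
    """Keep one copy of an image per hash and link it into the cube folder."""
    os.makedirs(shared_dir, exist_ok=True)
    os.makedirs(cube_path, exist_ok=True)
    shared_file = os.path.join(shared_dir, hash_value)
    if not os.path.exists(shared_file):
        shutil.move(downloaded_file, shared_file)  # first time this hash is seen
    link = os.path.join(cube_path, image_name)
    if os.path.islink(link):
        os.remove(link)  # replace a stale link
    os.symlink(shared_file, link)
    return link
```

Several cubes that reference the same image then share a single file on disk.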
get_cube_params(url, cube_path, expected_hash=None)
+
+¶Downloads and writes a cube parameters.yaml file
+ +cli/medperf/comms/entity_resources/resources.py
DirectLinkSource
+
+
+¶
+ Bases: BaseSource
cli/medperf/comms/entity_resources/sources/direct.py
__download_once(resource_identifier, output_path)
+
+¶Downloads a direct-download-link file by streaming its contents. source: +https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests
+ +cli/medperf/comms/entity_resources/sources/direct.py
download(resource_identifier, output_path)
+
+¶Downloads a direct-download-link file with multiple attempts, because some direct-download servers fail transiently.
+ +cli/medperf/comms/entity_resources/sources/direct.py
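A retry wrapper of the kind described for `download` might look like the sketch below. The attempt count, the wait time, and the choice of `OSError` as the "transient failure" exception are assumptions for illustration; `download_once` is a hypothetical callable that performs a single download attempt.

```python
import time
from typing import Callable


def download_with_retries(download_once: Callable[[], None], attempts: int = 3,
                          wait_seconds: float = 0.0) -> int:
    """Run `download_once` until it succeeds; return the attempts used."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            download_once()
            return attempt
        except OSError:  # treated here as a transient network failure
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(wait_seconds)
    raise AssertionError("unreachable")
```

Re-raising the last error keeps the failure visible to the caller once the attempts are exhausted.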
validate_resource(value)
+
+
+ classmethod
+
+
+¶This class expects a resource string of the form
+direct:<URL>
or only a URL.
Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
resource |
+
+ str
+ |
+
+
+
+ the resource string + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ str | None
+ |
+
+
+
+ The URL if the pattern matches, else None + |
+
cli/medperf/comms/entity_resources/sources/direct.py
BaseSource
+
+
+¶
+ Bases: ABC
cli/medperf/comms/entity_resources/sources/source.py
__init__()
+
+
+ abstractmethod
+
+
+¶authenticate()
+
+
+ abstractmethod
+
+
+¶download(resource_identifier, output_path)
+
+
+ abstractmethod
+
+
+¶Downloads the requested resource to the specified location
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
resource_identifier |
+
+ str
+ |
+
+
+
+ The identifier that is used to download + |
+ + required + | +
output_path |
+
+ str
+ |
+
+
+
+ The path to download the resource to + |
+ + required + | +
cli/medperf/comms/entity_resources/sources/source.py
validate_resource(value)
+
+
+ abstractmethod
+ classmethod
+
+
+¶SynapseSource
+
+
+¶
+ Bases: BaseSource
cli/medperf/comms/entity_resources/sources/synapse.py
validate_resource(value)
+
+
+ classmethod
+
+
+¶This class expects a resource string of the form synapse:<synapse_id>, where <synapse_id> is of the form syn<Integer>.
Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
resource |
+
+ str
+ |
+
+
+
+ the resource string + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ str | None
+ |
+
+
+
+ The synapse ID if the pattern matches, else None + |
+
cli/medperf/comms/entity_resources/sources/synapse.py
__parse_resource(resource)
+
+¶Parses a resource string and returns its identifier and the source class it can be downloaded from. The function iterates over all supported sources and checks which one accepts this resource. A resource is a string that should match a certain pattern to be downloaded by a certain source.
+If the resource pattern does not correspond to any supported source, the function raises an InvalidArgumentError.
Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
resource |
+
+ str
+ |
+
+
+
+ The resource string. Must be in the form <source_prefix>:<resource_identifier> or a URL. + |
+ + required + | +
cli/medperf/comms/entity_resources/utils.py
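The source-matching logic described above (try each supported source, return the first one whose pattern accepts the resource) can be sketched like this. The regular expressions and the names `validate_direct`, `validate_synapse`, `SOURCES`, and `parse_resource` are assumptions for illustration, and `ValueError` stands in for the documented `InvalidArgumentError`.

```python
import re
from typing import Callable, List, Optional, Tuple

Validator = Callable[[str], Optional[str]]


def validate_direct(value: str) -> Optional[str]:
    # Accepts "direct:<URL>" or a bare URL; the regex is an assumption.
    match = re.match(r"^(?:direct:)?(https?://\S+)$", value)
    return match.group(1) if match else None


def validate_synapse(value: str) -> Optional[str]:
    # Accepts "synapse:syn<Integer>"; the regex is an assumption.
    match = re.match(r"^synapse:(syn\d+)$", value)
    return match.group(1) if match else None


SOURCES: List[Tuple[str, Validator]] = [
    ("direct", validate_direct),
    ("synapse", validate_synapse),
]


def parse_resource(resource: str) -> Tuple[str, str]:
    """Return (source name, identifier) for the first source that matches."""
    for source_name, validate in SOURCES:
        identifier = validate(resource)
        if identifier is not None:
            return source_name, identifier
    # The documented function raises InvalidArgumentError; ValueError stands in.
    raise ValueError(f"unsupported resource: {resource!r}")
```

Each validator returns the extracted identifier or `None`, matching the `str | None` return type documented for `validate_resource`.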
download_resource(resource, output_path, expected_hash=None)
+
+¶Downloads a resource/file from the internet. Passing a hash is optional. +If hash is provided, the downloaded file's hash will be checked and an error +will be raised if it is incorrect.
+Upon success, the function returns the hash of the downloaded file.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
resource |
+
+ str
+ |
+
+
+
+ The resource string. Must be in the form <source_prefix>:<resource_identifier> or a URL. + |
+ + required + | +
output_path |
+
+ str
+ |
+
+
+
+ The path to download the resource to + |
+ + required + | +
expected_hash |
+
+ (optional, str)
+ |
+
+
+
+ The expected hash of the file to be downloaded + |
+
+ None
+ |
+
Returns:
+Type | +Description | +
---|---|
+ | +
+
+
+ The hash of the downloaded file (or existing file) + |
+
cli/medperf/comms/entity_resources/utils.py
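The optional integrity check described for `download_resource` can be sketched as below. The hash algorithm is not stated in this reference, so SHA-256 is an assumption; `file_hash` and `verify_download` are hypothetical names, and `ValueError` stands in for whatever error the client raises.

```python
import hashlib
import os
from typing import Optional


def file_hash(path: str) -> str:
    """SHA-256 of a file, read in chunks (the algorithm is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_hash: Optional[str] = None) -> str:
    """Return the file's hash, raising if it differs from `expected_hash`."""
    actual = file_hash(path)
    if expected_hash and actual != expected_hash:
        raise ValueError(f"hash mismatch for {os.path.basename(path)}")
    return actual
```

When no expected hash is given, the computed hash is simply returned, which matches the documented behaviour of returning the hash of the downloaded file either way.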
tmp_download_resource(resource)
+
+¶Downloads a resource to the temporary storage.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
resource |
+
+ str
+ |
+
+
+
+ The resource string. Must be in the form <source_prefix>:<resource_identifier> or a URL. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
tmp_output_path |
+ str
+ |
+
+
+
+ The location where the resource was downloaded + |
+
cli/medperf/comms/entity_resources/utils.py
to_permanent_path(tmp_output_path, output_path)
+
+¶Writes a file from the temporary storage to the desired output path.
+ +cli/medperf/comms/entity_resources/utils.py
Comms
+
+
+¶
+ Bases: ABC
cli/medperf/comms/interface.py
__init__(source)
+
+
+ abstractmethod
+
+
+¶Create an instance of a communication object.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
source |
+
+ str
+ |
+
+
+
+ location of the communication source. Where messages are going to be sent. + |
+ + required + | +
ui |
+
+ UI
+ |
+
+
+
+ Implementation of the UI interface. + |
+ + required + | +
token |
+
+ (str, Optional)
+ |
+
+
+
+ authentication token to be used throughout communication. Defaults to None. + |
+ + required + | +
cli/medperf/comms/interface.py
associate_cube(cube_uid, benchmark_uid, metadata={})
+
+
+ abstractmethod
+
+
+¶Create an MLCube-Benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_uid |
+
+ str
+ |
+
+
+
+ MLCube UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
metadata |
+
+ dict
+ |
+
+
+
+ Additional metadata. Defaults to {}. + |
+
+ {}
+ |
+
cli/medperf/comms/interface.py
associate_dset(data_uid, benchmark_uid, metadata={})
+
+
+ abstractmethod
+
+
+¶Create a Dataset Benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
data_uid |
+
+ int
+ |
+
+
+
+ Registered dataset UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
metadata |
+
+ dict
+ |
+
+
+
+ Additional metadata. Defaults to {}. + |
+
+ {}
+ |
+
cli/medperf/comms/interface.py
get_benchmark(benchmark_uid)
+
+
+ abstractmethod
+
+
+¶Retrieves the benchmark specification file from the server
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ uid for the desired benchmark + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ benchmark specification + |
+
cli/medperf/comms/interface.py
get_benchmark_model_associations(benchmark_uid)
+
+
+ abstractmethod
+
+
+¶Retrieves all the model associations of a benchmark.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of the desired benchmark + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ List[int]
+ |
+
+
+
+ list[int]: List of benchmark model associations + |
+
cli/medperf/comms/interface.py
get_benchmark_results(benchmark_id)
+
+
+ abstractmethod
+
+
+¶Retrieves all results for a given benchmark
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_id |
+
+ int
+ |
+
+
+
+ benchmark ID to retrieve results from + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each result in the specified benchmark + |
+
cli/medperf/comms/interface.py
get_benchmarks()
+
+
+ abstractmethod
+
+
+¶Retrieves all benchmarks in the platform.
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: all benchmarks information. + |
+
get_cube_metadata(cube_uid)
+
+
+ abstractmethod
+
+
+¶Retrieves metadata about the specified cube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_uid |
+
+ int
+ |
+
+
+
+ UID of the desired cube. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Dictionary containing url and hashes for the cube files + |
+
cli/medperf/comms/interface.py
get_cubes()
+
+
+ abstractmethod
+
+
+¶Retrieves all MLCubes in the platform
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List containing the data of all MLCubes + |
+
get_cubes_associations()
+
+
+ abstractmethod
+
+
+¶Get all cube associations related to the current user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List containing all associations information + |
+
get_current_user()
+
+
+ abstractmethod
+
+
+¶
get_dataset(dset_uid)
+
+
+ abstractmethod
+
+
+¶Retrieves a specific dataset
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
dset_uid |
+
+ str
+ |
+
+
+
+ Dataset UID + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Dataset metadata + |
+
get_datasets()
+
+
+ abstractmethod
+
+
+¶Retrieves all datasets in the platform
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List of data from all datasets + |
+
get_datasets_associations()
+
+
+ abstractmethod
+
+
+¶Get all dataset associations related to the current user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List containing all associations information + |
+
get_result(result_uid)
+
+
+ abstractmethod
+
+
+¶Retrieves the data of a specific result
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
result_uid |
+
+ str
+ |
+
+
+
+ Result UID + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Result metadata + |
+
get_results()
+
+
+ abstractmethod
+
+
+¶Retrieves all results
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List of results + |
+
get_user(user_id)
+
+
+ abstractmethod
+
+
+¶Retrieves the specified user. This only succeeds if +the current user has permission to view the requested user, +either by being that same user, an admin, or an owner of a data preparation +mlcube used by the requested user.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
user_id |
+
+ int
+ |
+
+
+
+ User UID + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Requested user information + |
+
cli/medperf/comms/interface.py
get_user_benchmarks()
+
+
+ abstractmethod
+
+
+¶Retrieves all benchmarks created by the user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: Benchmarks data + |
+
get_user_cubes()
+
+
+ abstractmethod
+
+
+¶Retrieves metadata from all cubes registered by the user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List of dictionaries containing the mlcubes registration information + |
+
cli/medperf/comms/interface.py
get_user_datasets()
+
+
+ abstractmethod
+
+
+¶Retrieves all datasets registered by the user
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each dataset registration query + |
+
get_user_results()
+
+
+ abstractmethod
+
+
+¶Retrieves all results registered by the user
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each result registration query + |
+
parse_url(url)
+
+
+ abstractmethod
+ classmethod
+
+
+¶Parse the source URL so that it can be used by the comms implementation. +It should handle protocols and versioning to be able to communicate with the API.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ base URL + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ parsed URL with protocol and version + |
+
cli/medperf/comms/interface.py
set_dataset_association_approval(dataset_uid, benchmark_uid, status)
+
+
+ abstractmethod
+
+
+¶Sets the approval status of a dataset-benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
dataset_uid |
+
+ str
+ |
+
+
+
+ Dataset UID + |
+ + required + | +
benchmark_uid |
+
+ str
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
status |
+
+ str
+ |
+
+
+
+ Approval status to set for the association + |
+ + required + | +
cli/medperf/comms/interface.py
set_mlcube_association_approval(mlcube_uid, benchmark_uid, status)
+
+
+ abstractmethod
+
+
+¶Sets the approval status of an mlcube-benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_uid |
+
+ str
+ |
+
+
+
+ MLCube UID + |
+ + required + | +
benchmark_uid |
+
+ str
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
status |
+
+ str
+ |
+
+
+
+ Approval status to set for the association + |
+ + required + | +
cli/medperf/comms/interface.py
set_mlcube_association_priority(benchmark_uid, mlcube_uid, priority)
+
+
+ abstractmethod
+
+
+¶Sets the priority of an mlcube-benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_uid |
+
+ str
+ |
+
+
+
+ MLCube UID + |
+ + required + | +
benchmark_uid |
+
+ str
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
priority |
+
+ int
+ |
+
+
+
+ priority value to set for the association + |
+ + required + | +
cli/medperf/comms/interface.py
update_dataset(dataset_id, data)
+
+
+ abstractmethod
+
+
+¶Updates the contents of the dataset identified by dataset_id with the new data dictionary. +Updates may be partial.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
dataset_id |
+
+ int
+ |
+
+
+
+ ID of the dataset to update + |
+ + required + | +
data |
+
+ dict
+ |
+
+
+
+ Updated information of the dataset. + |
+ + required + | +
cli/medperf/comms/interface.py
upload_benchmark(benchmark_dict)
+
+
+ abstractmethod
+
+
+¶Uploads a new benchmark to the server.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_dict |
+
+ dict
+ |
+
+
+
+ benchmark_data to be uploaded + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ UID of newly created benchmark + |
+
cli/medperf/comms/interface.py
upload_dataset(reg_dict)
+
+
+ abstractmethod
+
+
+¶Uploads registration data to the server, under the sha name of the file.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
reg_dict |
+
+ dict
+ |
+
+
+
+ Dictionary containing registration information. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ id of the created dataset registration. + |
+
cli/medperf/comms/interface.py
upload_mlcube(mlcube_body)
+
+
+ abstractmethod
+
+
+¶Uploads an MLCube instance to the platform
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_body |
+
+ dict
+ |
+
+
+
+ Dictionary containing all the relevant data for creating mlcubes + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ id of the created mlcube instance on the platform + |
+
cli/medperf/comms/interface.py
upload_result(results_dict)
+
+
+ abstractmethod
+
+
+¶Uploads result to the server.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
results_dict |
+
+ dict
+ |
+
+
+
+ Dictionary containing results information. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ id of the generated results entry + |
+
cli/medperf/comms/interface.py
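The interface above only declares abstract methods; concrete classes such as `REST` supply the behavior. The following is a minimal sketch of that pattern, not MedPerf code: `MiniComms` and `InMemoryComms` are hypothetical names, only two of the documented methods are reproduced, and the in-memory storage and the `id` field are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MiniComms(ABC):
    """Hypothetical, trimmed-down stand-in for the Comms interface."""

    @abstractmethod
    def get_benchmarks(self) -> List[dict]:
        """Retrieve all benchmarks."""

    @abstractmethod
    def upload_benchmark(self, benchmark_dict: dict) -> int:
        """Upload a benchmark and return its UID."""


class InMemoryComms(MiniComms):
    """Assumed test double: keeps benchmarks in a dict instead of a server."""

    def __init__(self) -> None:
        self._benchmarks: Dict[int, dict] = {}

    def get_benchmarks(self) -> List[dict]:
        return list(self._benchmarks.values())

    def upload_benchmark(self, benchmark_dict: dict) -> int:
        # UIDs are assigned sequentially, starting at 1 (illustrative choice).
        uid = len(self._benchmarks) + 1
        self._benchmarks[uid] = {**benchmark_dict, "id": uid}
        return uid
```

Because every method on the base class is abstract, instantiating `MiniComms` directly raises `TypeError`; only subclasses that implement all methods can be created. A stub of this kind can be handy in tests that should not reach a real server.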
REST
+
+
+¶
+ Bases: Comms
cli/medperf/comms/rest.py
__get_list(url, num_elements=None, page_size=config.default_page_size, offset=0, binary_reduction=False)
+
+¶Retrieves a list of elements from a URL by iterating over pages until num_elements is obtained. +If num_elements is None, then iterates until all elements have been retrieved. +If binary_reduction is enabled, errors are assumed to be related to response size. In that case, +the page_size is reduced by half until a successful response is obtained or until page_size can't be +reduced anymore.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ The url to retrieve elements from + |
+ + required + | +
num_elements |
+
+ int
+ |
+
+
+
+ The desired number of elements to be retrieved. Defaults to None. + |
+
+ None
+ |
+
page_size |
+
+ int
+ |
+
+
+
+ Starting page size. Defaults to config.default_page_size. + |
+
+ config.default_page_size
+ |
+
offset |
+
+ int
+ |
+
+
+
+ The starting position for element retrieval. Defaults to 0. + |
+
+ 0
+ |
+
binary_reduction |
+
+ bool
+ |
+
+
+
+ Whether to handle errors by halving the page size. Defaults to False. + |
+
+ False
+ |
+
Returns:
+Type | +Description | +
---|---|
+ | +
+
+
+ List[dict]: A list of dictionaries representing the retrieved elements. + |
+
cli/medperf/comms/rest.py
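The paging behavior described for `__get_list` can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: `get_list_sketch` and the `fetch_page(offset, limit)` callable are hypothetical, the default page size of 32 is arbitrary, and a `RuntimeError` stands in for whatever error a failed request would raise.

```python
from typing import Callable, List, Optional


def get_list_sketch(
    fetch_page: Callable[[int, int], List[dict]],
    num_elements: Optional[int] = None,
    page_size: int = 32,
    offset: int = 0,
    binary_reduction: bool = False,
) -> List[dict]:
    """Collect elements page by page, optionally halving page_size on errors."""
    elements: List[dict] = []
    while num_elements is None or len(elements) < num_elements:
        try:
            page = fetch_page(offset, page_size)
        except RuntimeError:
            # Assume the failure is size-related; give up if halving is
            # disabled or the page cannot shrink any further.
            if not binary_reduction or page_size <= 1:
                raise
            page_size //= 2
            continue
        if not page:
            break  # the source has no more elements
        elements.extend(page)
        offset += len(page)
    # Trim any overshoot from the last page when a limit was requested.
    return elements if num_elements is None else elements[:num_elements]
```

Halving keeps the retry logic simple: the page size converges on a workable value in at most a logarithmic number of attempts, and once found it is reused for the remaining pages.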
__set_approval_status(url, status)
+
+¶Sets the approval status of a resource
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ URL to the resource to update + |
+ + required + | +
status |
+
+ str
+ |
+
+
+
+ approval status to set + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ requests.Response
+ |
+
+
+
+ requests.Response: Response object returned by the update + |
+
cli/medperf/comms/rest.py
associate_cube(cube_uid, benchmark_uid, metadata={})
+
+¶Create an MLCube-Benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_uid |
+
+ int
+ |
+
+
+
+ MLCube UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
metadata |
+
+ dict
+ |
+
+
+
+ Additional metadata. Defaults to {}. + |
+
+ {}
+ |
+
cli/medperf/comms/rest.py
associate_dset(data_uid, benchmark_uid, metadata={})
+
+¶Create a Dataset Benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
data_uid |
+
+ int
+ |
+
+
+
+ Registered dataset UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
metadata |
+
+ dict
+ |
+
+
+
+ Additional metadata. Defaults to {}. + |
+
+ {}
+ |
+
cli/medperf/comms/rest.py
get_benchmark(benchmark_uid)
+
+¶Retrieves the benchmark specification file from the server
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ uid for the desired benchmark + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ benchmark specification + |
+
cli/medperf/comms/rest.py
get_benchmark_model_associations(benchmark_uid)
+
+¶Retrieves all the model associations of a benchmark.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of the desired benchmark + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ List[int]
+ |
+
+
+
+ list[int]: List of benchmark model associations + |
+
cli/medperf/comms/rest.py
get_benchmark_results(benchmark_id)
+
+¶Retrieves all results for a given benchmark
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_id |
+
+ int
+ |
+
+
+
+ benchmark ID to retrieve results from + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each result in the specified benchmark + |
+
cli/medperf/comms/rest.py
get_benchmarks()
+
+¶Retrieves all benchmarks in the platform.
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: all benchmarks information. + |
+
cli/medperf/comms/rest.py
get_cube_metadata(cube_uid)
+
+¶Retrieves metadata about the specified cube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_uid |
+
+ int
+ |
+
+
+
+ UID of the desired cube. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Dictionary containing url and hashes for the cube files + |
+
cli/medperf/comms/rest.py
get_cubes()
+
+¶Retrieves all MLCubes in the platform
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List containing the data of all MLCubes + |
+
cli/medperf/comms/rest.py
get_cubes_associations()
+
+¶Get all cube associations related to the current user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List containing all associations information + |
+
cli/medperf/comms/rest.py
get_current_user()
+
+¶
get_dataset(dset_uid)
+
+¶Retrieves a specific dataset
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
dset_uid |
+
+ int
+ |
+
+
+
+ Dataset UID + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Dataset metadata + |
+
cli/medperf/comms/rest.py
get_datasets()
+
+¶Retrieves all datasets in the platform
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List of data from all datasets + |
+
cli/medperf/comms/rest.py
get_datasets_associations()
+
+¶Get all dataset associations related to the current user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List containing all associations information + |
+
cli/medperf/comms/rest.py
get_mlcube_datasets(mlcube_id)
+
+¶Retrieves all datasets that have the specified mlcube as the prep mlcube
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_id |
+
+ int
+ |
+
+
+
+ mlcube ID to retrieve datasets from + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each dataset + |
+
cli/medperf/comms/rest.py
get_result(result_uid)
+
+¶Retrieves the data of a specific result
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
result_uid |
+
+ int
+ |
+
+
+
+ Result UID + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Result metadata + |
+
cli/medperf/comms/rest.py
get_results()
+
+¶Retrieves all results
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List of results + |
+
cli/medperf/comms/rest.py
get_user(user_id)
+
+¶Retrieves the specified user. This only succeeds if +the current user has permission to view the requested user, +either by being that same user, an admin, or an owner of a data preparation +mlcube used by the requested user.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
user_id |
+
+ int
+ |
+
+
+
+ User UID + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Requested user information + |
+
cli/medperf/comms/rest.py
get_user_benchmarks()
+
+¶Retrieves all benchmarks created by the user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: Benchmarks data + |
+
cli/medperf/comms/rest.py
get_user_cubes()
+
+¶Retrieves metadata from all cubes registered by the user
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[dict]
+ |
+
+
+
+ List[dict]: List of dictionaries containing the mlcubes registration information + |
+
cli/medperf/comms/rest.py
get_user_datasets()
+
+¶Retrieves all datasets registered by the user
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each dataset registration query + |
+
cli/medperf/comms/rest.py
get_user_results()
+
+¶Retrieves all results registered by the user
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ dictionary with the contents of each result registration query + |
+
cli/medperf/comms/rest.py
parse_url(url)
+
+
+ classmethod
+
+
+¶Parse the source URL so that it can be used by the comms implementation. +It should handle protocols and versioning to be able to communicate with the API.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
url |
+
+ str
+ |
+
+
+
+ base URL + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ parsed URL with protocol and version + |
+
cli/medperf/comms/rest.py
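As a rough illustration of what "handling protocols and versioning" could look like, the sketch below adds a protocol when none is given and appends an API version segment. The function name `parse_url_sketch`, the `https` default, and the `/api/v0` suffix are all assumptions for illustration; the actual rules used by `REST.parse_url` are not shown in this reference.

```python
def parse_url_sketch(url: str, api_version: str = "v0") -> str:
    """Hypothetical helper: normalize a base URL into protocol + host + API version."""
    url = url.strip().rstrip("/")
    # Assume HTTPS when the caller did not specify a protocol.
    if "://" not in url:
        url = f"https://{url}"
    return f"{url}/api/{api_version}"
```

For example, `parse_url_sketch("example.org")` yields `https://example.org/api/v0`, while an explicit `http://` prefix is left untouched.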
set_dataset_association_approval(benchmark_uid, dataset_uid, status)
+
+¶Sets the approval status of a dataset-benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
dataset_uid |
+
+ int
+ |
+
+
+
+ Dataset UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
status |
+
+ str
+ |
+
+
+
+ Approval status to set for the association + |
+ + required + | +
cli/medperf/comms/rest.py
set_mlcube_association_approval(benchmark_uid, mlcube_uid, status)
+
+¶Sets the approval status of an mlcube-benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_uid |
+
+ int
+ |
+
+
+
+ MLCube UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
status |
+
+ str
+ |
+
+
+
+ Approval status to set for the association + |
+ + required + | +
cli/medperf/comms/rest.py
set_mlcube_association_priority(benchmark_uid, mlcube_uid, priority)
+
+¶Sets the priority of an mlcube-benchmark association
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_uid |
+
+ int
+ |
+
+
+
+ MLCube UID + |
+ + required + | +
benchmark_uid |
+
+ int
+ |
+
+
+
+ Benchmark UID + |
+ + required + | +
priority |
+
+ int
+ |
+
+
+
+ priority value to set for the association + |
+ + required + | +
cli/medperf/comms/rest.py
upload_benchmark(benchmark_dict)
+
+¶Uploads a new benchmark to the server.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_dict |
+
+ dict
+ |
+
+
+
+ benchmark_data to be uploaded + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ UID of newly created benchmark + |
+
cli/medperf/comms/rest.py
upload_dataset(reg_dict)
+
+¶Uploads registration data to the server, under the sha name of the file.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
reg_dict |
+
+ dict
+ |
+
+
+
+ Dictionary containing registration information. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ id of the created dataset registration. + |
+
cli/medperf/comms/rest.py
upload_mlcube(mlcube_body)
+
+¶Uploads an MLCube instance to the platform
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
mlcube_body |
+
+ dict
+ |
+
+
+
+ Dictionary containing all the relevant data for creating mlcubes + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ id of the created mlcube instance on the platform + |
+
cli/medperf/comms/rest.py
upload_result(results_dict)
+
+¶Uploads result to the server.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
results_dict |
+
+ dict
+ |
+
+
+
+ Dictionary containing results information. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
int |
+ int
+ |
+
+
+
+ id of the generated results entry + |
+
cli/medperf/comms/rest.py
add_inline_parameters(func)
+
+¶Decorator that adds common configuration options to a typer command
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
func |
+
+ Callable
+ |
+
+
+
+ function to be decorated + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
Callable |
+ Callable
+ |
+
+
+
+ decorated function + |
+
cli/medperf/decorators.py
clean_except(func)
+
+¶Decorator for handling errors. It allows logging +and cleaning the project's directory before throwing the error.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
func |
+
+ Callable
+ |
+
+
+
+ Function to handle for errors + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
Callable |
+ Callable
+ |
+
+
+
+ Decorated function + |
+
cli/medperf/decorators.py
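The log, clean up, then re-raise pattern that `clean_except` describes might look like the sketch below. It is not the MedPerf implementation: `clean_except_sketch` is a hypothetical name, and unlike the documented decorator it is written as a factory that takes an explicit `cleanup` callable, an assumption made so the example stays self-contained.

```python
import functools
import logging
from typing import Callable


def clean_except_sketch(cleanup: Callable[[], None]) -> Callable:
    """Hypothetical decorator factory: log the error, clean up, then re-raise."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                logging.exception("%s failed", func.__name__)
                cleanup()
                raise  # the caller still sees the original error

        return wrapper

    return decorator
```

Re-raising after cleanup means callers can still react to the failure, while temporary files or partial state are removed first.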
configurable(func)
+
+¶Decorator that adds common configuration options to a typer command
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
func |
+
+ Callable
+ |
+
+
+
+ function to be decorated + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
Callable |
+ Callable
+ |
+
+
+
+ decorated function + |
+
cli/medperf/decorators.py
Benchmark
+
+
+¶
+ Bases: Entity
, ApprovableSchema
, DeployableSchema
Class representing a Benchmark
+A benchmark is a bundle of assets that enables quantitative +measurement of the performance of AI models for a specific +clinical problem. A Benchmark instance contains information +regarding how to prepare datasets for execution, as well as +what models to run and how to evaluate them.
+ +cli/medperf/entities/benchmark.py
__init__(*args, **kwargs)
+
+¶Creates a new benchmark instance
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
bmk_desc |
+
+ Union[dict, BenchmarkModel]
+ |
+
+
+
+ Benchmark instance description + |
+ + required + | +
get_models_uids(benchmark_uid)
+
+
+ classmethod
+
+
+¶Retrieves the list of models associated with the benchmark
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
benchmark_uid |
+
+ int
+ |
+
+
+
+ UID of the benchmark. + |
+ + required + | +
comms |
+
+ Comms
+ |
+
+
+
+ Instance of the communications interface. + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ List[int]
+ |
+
+
+
+ List[int]: List of mlcube uids + |
+
cli/medperf/entities/benchmark.py
remote_prefilter(filters)
+
+
+ staticmethod
+
+
+¶Applies filtering logic that must be done before retrieving remote entities
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filters |
+
+ dict
+ |
+
+
+
+ filters to apply + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
callable |
+ callable
+ |
+
+
+
+ A function for retrieving remote entities with the applied prefilters + |
+
cli/medperf/entities/benchmark.py
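One way to read `remote_prefilter` is that it chooses which retrieval function to call based on the filters, before any request is made. The sketch below illustrates that idea only; the `owner` filter key and the two callables passed in are assumptions for illustration and are not taken from the MedPerf source.

```python
from typing import Callable, List


def remote_prefilter_sketch(
    filters: dict,
    get_all: Callable[[], List[dict]],
    get_mine: Callable[[], List[dict]],
) -> Callable[[], List[dict]]:
    """Return the retrieval function that matches the given filters."""
    # Hypothetical convention: a truthy "owner" filter narrows the query
    # to entities created by the current user.
    if filters.get("owner"):
        return get_mine
    return get_all
```

Returning a callable instead of the data lets the caller decide when the remote request actually happens.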
Cube
+
+
+¶
+ Bases: Entity
, DeployableSchema
Class representing an MLCube Container
+The MedPerf platform uses MLCube containers for components such as +Dataset Preparation, Evaluation, and Registered Models. MLCube +containers are software containers (e.g., Docker and Singularity) +with standard metadata and a consistent file-system level interface.
+ +cli/medperf/entities/cube.py
__init__(*args, **kwargs)
+
+¶Creates a Cube instance
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_desc |
+
+ Union[dict, CubeModel]
+ |
+
+
+
+ MLCube Instance description + |
+ + required + | +
cli/medperf/entities/cube.py
get(cube_uid, local_only=False)
+
+
+ classmethod
+
+
+¶Retrieves and creates a Cube instance from the comms. If the cube already exists +on the user's computer, it is retrieved from there instead.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
cube_uid |
+
+ str
+ |
+
+
+
+ UID of the cube. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
Cube |
+ Cube
+ |
+
+
+
+ a Cube instance with the retrieved data. + |
+
cli/medperf/entities/cube.py
get_config(identifier)
+
+¶Returns the output parameter specified in the mlcube.yaml file
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
identifier |
+
+ str
+ |
+
+
+
+
|
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str | + | +
+
+
+ the parameter value, None if not found + |
+
cli/medperf/entities/cube.py
get_default_output(task, out_key, param_key=None)
+
+¶Returns the output parameter specified in the mlcube.yaml file
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
task |
+
+ str
+ |
+
+
+
+ the task of interest + |
+ + required + | +
out_key |
+
+ str
+ |
+
+
+
+ key used to identify the desired output in the yaml file + |
+ + required + | +
param_key |
+
+ str
+ |
+
+
+
+ key inside the parameters file that completes the output path. Defaults to None. + |
+
+ None
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ the path as specified in the mlcube.yaml file for the desired +output for the desired task. Defaults to None if out_key not found + |
+
cli/medperf/entities/cube.py
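As a rough illustration of the lookup described for `get_default_output`, the sketch below resolves an output path from an already-parsed `mlcube.yaml`. It is not the MedPerf implementation: the function name `default_output`, the dict-based arguments, and the assumed layout (`tasks -> <task> -> parameters -> outputs`, where an output is either a plain path or a mapping with a `default` key) are assumptions made for this example.

```python
import os
from typing import Optional


def default_output(
    mlcube_cfg: dict,
    task: str,
    out_key: str,
    params: Optional[dict] = None,
    param_key: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: find the default path of a task output."""
    # Assumed layout: tasks -> <task> -> parameters -> outputs -> <out_key>
    outputs = mlcube_cfg["tasks"][task].get("parameters", {}).get("outputs", {})
    if out_key not in outputs:
        return None  # mirrors "Defaults to None if out_key not found"
    entry = outputs[out_key]
    # An output is assumed to be a plain path or a mapping with a "default" path
    path = entry["default"] if isinstance(entry, dict) else entry
    if param_key is not None and params is not None:
        # The parameters-file value completes the output path
        path = os.path.join(path, params[param_key])
    return path
```

Passing parsed dictionaries instead of file paths keeps the sketch independent of any YAML library; a real caller would load `mlcube.yaml` and the parameters file first.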
remote_prefilter(filters)
+
+
+ staticmethod
+
+
+¶Applies filtering logic that must be done before retrieving remote entities
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filters |
+
+ dict
+ |
+
+
+
+ filters to apply + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
callable | + | +
+
+
+ A function for retrieving remote entities with the applied prefilters + |
+
cli/medperf/entities/cube.py
run(task, output_logs=None, string_params={}, timeout=None, read_protected_input=True, **kwargs)
+
+¶Executes a given task on the cube instance
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
task |
+
+ str
+ |
+
+
+
+ task to run + |
+ + required + | +
string_params |
+
+ Dict[str]
+ |
+
+
+
+ Extra parameters that can't be passed as normal function args. + Defaults to {}. + |
+
+ {}
+ |
+
timeout |
+
+ int
+ |
+
+
+
+ timeout for the task in seconds. Defaults to None. + |
+
+ None
+ |
+
read_protected_input |
+
+ bool
+ |
+
+
+
+ Whether to disable write permissions on input volumes. Defaults to True. + |
+
+ True
+ |
+
kwargs |
+
+ dict
+ |
+
+
+
+ additional arguments that are passed directly to the mlcube command + |
+
+ {}
+ |
+
cli/medperf/entities/cube.py
Dataset
+
+
+¶
+ Bases: Entity
, DeployableSchema
Class representing a Dataset
+Datasets are stored locally on the Data Owner's machine. They contain +information regarding the prepared dataset, such as name and description, +general statistics and a UID generated by hashing the contents of the +data preparation output.
+ +cli/medperf/entities/dataset.py
remote_prefilter(filters)
+
+
+ staticmethod
+
+
+¶Applies filtering logic that must be done before retrieving remote entities
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filters |
+
+ dict
+ |
+
+
+
+ filters to apply + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
callable |
+ callable
+ |
+
+
+
+ A function for retrieving remote entities with the applied prefilters + |
+
cli/medperf/entities/dataset.py
Entity
+
+
+¶
+ Bases: MedperfSchema
, ABC
cli/medperf/entities/interface.py
__get_local_dict(uid)
+
+
+ classmethod
+
+
+¶Retrieves a local entity information
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
uid |
+
+ str
+ |
+
+
+
+ uid of the local entity + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ information of the entity + |
+
cli/medperf/entities/interface.py
__local_get(uid)
+
+
+ classmethod
+
+
+¶Retrieves and creates an entity instance from the local storage.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
uid |
+
+ str | int
+ |
+
+
+
+ UID of the entity + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
Entity |
+ EntityType
+ |
+
+
+
+ Specified Entity Instance + |
+
cli/medperf/entities/interface.py
__remote_get(uid)
+
+
+ classmethod
+
+
+¶Retrieves and creates an entity instance from the comms instance.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
uid |
+
+ int
+ |
+
+
+
+ server UID of the entity + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
Entity |
+ EntityType
+ |
+
+
+
+ Specified Entity Instance + |
+
cli/medperf/entities/interface.py
all(unregistered=False, filters={})
+
+
+ classmethod
+
+
+¶Gets a list of all instances of the respective entity. +Whether the list is local or remote depends on the implementation.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
unregistered |
+
+ bool
+ |
+
+
+
+ Whether to retrieve only unregistered local entities. Defaults to False. + |
+
+ False
+ |
+
filters |
+
+ dict
+ |
+
+
+
+ key-value pairs specifying filters to apply to the list of entities. + |
+
+ {}
+ |
+
Returns:
+Type | +Description | +
---|---|
+ List[EntityType]
+ |
+
+
+
+ List[Entity]: a list of entities. + |
+
cli/medperf/entities/interface.py
display_dict()
+
+¶Returns a dictionary of entity properties that can be displayed +to a user interface using a verbose name of the property rather than +the internal names
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ the display dictionary + |
+
cli/medperf/entities/interface.py
get(uid, local_only=False)
+
+
+ classmethod
+
+
+¶Gets an instance of the respective entity. +Whether this requires only local read or remote calls depends +on the implementation.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
uid |
+
+ str
+ |
+
+
+
+ Unique Identifier to retrieve the entity + |
+ + required + | +
local_only |
+
+ bool
+ |
+
+
+
+ If True, the entity will be retrieved locally + |
+
+ False
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
Entity |
+ EntityType
+ |
+
+
+
+ Entity Instance associated to the UID + |
+
cli/medperf/entities/interface.py
remote_prefilter(filters)
+
+
+ staticmethod
+
+
+¶Applies filtering logic that must be done before retrieving remote entities
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filters |
+
+ dict
+ |
+
+
+
+ filters to apply + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
callable |
+ callable
+ |
+
+
+
+ A function for retrieving remote entities with the applied prefilters + |
+
cli/medperf/entities/interface.py
upload()
+
+¶Upload the entity-related information to the communications interface
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
Dict |
+ Dict
+ |
+
+
+
+ Dictionary with the updated entity information + |
+
cli/medperf/entities/interface.py
write()
+
+¶Writes the entity to the local storage
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ Path to the stored entity + |
+
cli/medperf/entities/interface.py
TestReport
+
+
+¶
+ Bases: Entity
Class representing a compatibility test report entry
+A test report consists of the components of a test execution: +- data used, which can be: + - a demo dataset url and its hash, or + - a raw data path and its labels path, or + - a prepared dataset uid +- Data preparation cube if the data used was not already prepared +- model cube +- evaluator cube +- results
+ +However, we still use the same Entity interface used by other entities
+in order to reduce repeated code. Consequently, we mocked a few methods
+and attributes inherited from the Entity interface that are not relevant to
+this entity, such as the name
and id
attributes, and such as
+the get
and all
methods.
cli/medperf/entities/report.py
local_id
+
+
+ property
+
+
+¶A helper that generates a unique hash for a test report.
+get(uid, local_only=False)
+
+
+ classmethod
+
+
+¶Gets an instance of the TestReport. Ignores the inherited local_only flag, as TestReport is always a local entity.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
uid |
+
+ str
+ |
+
+
+
+ Report Unique Identifier + |
+ + required + | +
local_only |
+
+ bool
+ |
+
+
+
+ ignored. Left for aligning with parent Entity class + |
+
+ False
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
TestReport |
+ TestReport
+ |
+
+
+
+ Report Instance associated to the UID + |
+
cli/medperf/entities/report.py
Result
+
+
+¶
+ Bases: Entity
, ApprovableSchema
Class representing a Result entry
+Results are obtained after successfully running a benchmark +execution flow. They contain information regarding the +components involved in obtaining metrics results, as well as the +results themselves. This class provides methods for working with +benchmark results and how to upload them to the backend.
+ +cli/medperf/entities/result.py
__init__(*args, **kwargs)
+
+¶remote_prefilter(filters)
+
+
+ staticmethod
+
+
+¶Applies filtering logic that must be done before retrieving remote entities
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filters |
+
+ dict
+ |
+
+
+
+ filters to apply + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
callable |
+ callable
+ |
+
+
+
+ A function for retrieving remote entities with the applied prefilters + |
+
cli/medperf/entities/result.py
MedperfSchema
+
+
+¶
+ Bases: BaseModel
cli/medperf/entities/schemas.py
__init__(*args, **kwargs)
+
+¶Override the ValidationError procedure so we can +format the error message in our desired way
+ +cli/medperf/entities/schemas.py
dict(*args, **kwargs)
+
+¶Overrides dictionary implementation so it filters out +fields not defined in the pydantic model
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ filtered dictionary + |
+
cli/medperf/entities/schemas.py
todict()
+
+¶Dictionary containing both original and alias fields
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ Extended dictionary representation + |
+
cli/medperf/entities/schemas.py
AuthenticationError
+
+
+¶
+ Bases: MedperfException
Raised when authentication can't be processed
+ + + +CleanExit
+
+
+¶
+ Bases: MedperfException
Raised when MedPerf needs to stop for non-erroneous reasons
+ +cli/medperf/exceptions.py
CommunicationAuthenticationError
+
+
+¶
+ Bases: CommunicationError
Raised when the communication interface can't handle an authentication request
+ + + +CommunicationError
+
+
+¶
+ Bases: MedperfException
Raised when an error happens due to the communication interface
+ + + +CommunicationRequestError
+
+
+¶
+ Bases: CommunicationError
Raised when the communication interface can't handle a request appropriately
+ + + +CommunicationRetrievalError
+
+
+¶
+ Bases: CommunicationError
Raised when the communication interface can't retrieve an element
+ + + +ExecutionError
+
+
+¶
+ Bases: MedperfException
Raised when an execution component fails
+ + + +InvalidArgumentError
+
+
+¶
+ Bases: MedperfException
Raised when an argument or set of arguments is considered invalid
+ + + +InvalidEntityError
+
+
+¶
+ Bases: MedperfException
Raised when an entity is considered invalid
+ + + +CLI
+
+
+¶
+ Bases: UI
cli/medperf/ui/cli.py
hidden_prompt(msg)
+
+¶Displays a prompt to the user and waits for an answer. User input is not displayed
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to use for the prompt + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ user input + |
+
cli/medperf/ui/cli.py
interactive()
+
+¶Context managed interactive session.
+ + + +Yields:
+Name | Type | +Description | +
---|---|---|
CLI | + | +
+
+
+ Yields the current CLI instance with an interactive session initialized + |
+
cli/medperf/ui/cli.py
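The "context managed interactive session" described for `interactive()` can be pictured with the minimal sketch below. The class `MiniUI` and its `is_interactive` attribute are hypothetical names for this example and are not part of MedPerf; only the method names `start_interactive`, `stop_interactive`, and `interactive` follow the reference above.

```python
from contextlib import contextmanager


class MiniUI:
    """Hypothetical stand-in for a UI with an interactive mode."""

    def __init__(self):
        self.is_interactive = False

    def start_interactive(self):
        self.is_interactive = True

    def stop_interactive(self):
        self.is_interactive = False

    @contextmanager
    def interactive(self):
        # Enter interactive mode, hand the instance to the caller, and
        # always leave interactive mode, even if the body raises.
        self.start_interactive()
        try:
            yield self
        finally:
            self.stop_interactive()
```

With this shape, a caller writes `with ui.interactive() as session: ...` and does not have to remember to stop the session on every exit path.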
print(msg='')
+
+¶Display a message on the command line
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to print + |
+
+ ''
+ |
+
print_error(msg)
+
+¶Display an error message on the command line
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ error message to display + |
+ + required + | +
cli/medperf/ui/cli.py
print_highlight(msg='')
+
+¶Display a highlighted message
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to print + |
+
+ ''
+ |
+
print_warning(msg)
+
+¶Display a warning message on the command line
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ warning message to display + |
+ + required + | +
prompt(msg)
+
+¶Displays a prompt to the user and waits for an answer
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to use for the prompt + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ user input + |
+
start_interactive()
+
+¶Start an interactive session where messages can be overwritten +and animations can be displayed
+ + +UI
+
+
+¶
+ Bases: ABC
cli/medperf/ui/interface.py
hidden_prompt(msg)
+
+
+ abstractmethod
+
+
+¶Displays a prompt to the user and waits for an answer. User input is not displayed
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to use for the prompt + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ user input + |
+
cli/medperf/ui/interface.py
interactive()
+
+
+ abstractmethod
+
+
+¶print(msg='')
+
+
+ abstractmethod
+
+
+¶Display a message to the interface. If in an interactive session, overwrites the +previous message
+ + +print_error(msg)
+
+
+ abstractmethod
+
+
+¶print_highlight(msg='')
+
+
+ abstractmethod
+
+
+¶Display a message on the command line with green color
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to print + |
+
+ ''
+ |
+
print_warning(msg)
+
+¶prompt(msg)
+
+
+ abstractmethod
+
+
+¶start_interactive()
+
+
+ abstractmethod
+
+
+¶Initialize an interactive session for animations or overriding messages. +If the UI doesn't support this, the function can be left empty.
+ + +stop_interactive()
+
+
+ abstractmethod
+
+
+¶Terminate an interactive session. +If the UI doesn't support this, the function can be left empty.
+ + +text(msg)
+
+
+ abstractmethod
+
+
+¶Displays a message that overwrites previous messages if they were created +during an interactive session. +If not supported or not in an interactive session, it is expected to fall back +to the UI print function.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ message to display + |
+ + required + | +
cli/medperf/ui/interface.py
StdIn
+
+
+¶
+ Bases: UI
Class for using sys.stdin/sys.stdout exclusively. Used mainly for automating +execution with class-like objects. Using only basic IO methods ensures that +piping from the command line works. Should not be used in normal execution, as +hidden prompts and interactive prints will not work as expected.
+ +cli/medperf/ui/stdin.py
approval_prompt(msg)
+
+¶Helper function for prompting the user for things they have to explicitly approve.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ What message to ask the user for approval. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
bool |
+ bool
+ |
+
+
+
+ Whether the user explicitly approved or not. + |
+
cli/medperf/utils.py
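An explicit-approval prompt of the kind `approval_prompt` describes could look like the sketch below. This is an assumption-laden illustration, not the MedPerf code: the name `ask_for_approval`, the injectable `ask` callable, and the accepted answers (`y`/`n`, case-insensitive) are choices made for this example.

```python
from typing import Callable


def ask_for_approval(msg: str, ask: Callable[[str], str] = input) -> bool:
    """Hypothetical helper: keep asking until the user answers y or n."""
    answer = None
    while answer not in ("y", "n"):
        # Anything other than an explicit y/n triggers another prompt
        answer = ask(msg.strip() + " ").strip().lower()
    return answer == "y"
```

Taking the prompt function as a parameter makes the helper easy to drive from a script or a test, while `input` remains the default for terminal use.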
check_for_updates()
+
+¶Check if the current branch is up-to-date with its remote counterpart using GitPython.
+ +cli/medperf/utils.py
cleanup()
+
+¶Removes clutter and unused files from the medperf folder structure.
+ +cli/medperf/utils.py
combine_proc_sp_text(proc)
+
+¶Combines the output of a process and the spinner. +Joins any string captured from the process with the +spinner current text. Any strings ending with any other +character from the subprocess will be returned later.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
proc |
+
+ spawn
+ |
+
+
+
+ a pexpect spawned child + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ all non-carriage-return-ending string captured from proc + |
+
cli/medperf/utils.py
dict_pretty_print(in_dict, skip_none_values=True)
+
+¶Helper function for distinctively printing dictionaries with yaml format.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
in_dict |
+
+ dict
+ |
+
+
+
+ dictionary to print + |
+ + required + | +
skip_none_values |
+
+ bool
+ |
+
+
+
+ if fields with |
+
+ True
+ |
+
cli/medperf/utils.py
filter_latest_associations(associations, entity_key)
+
+¶Given a list of entity-benchmark associations, this function +retrieves a list containing the latest association of each +entity instance.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
associations |
+
+ list[dict]
+ |
+
+
+
+ the list of associations + |
+ + required + | +
entity_key |
+
+ str
+ |
+
+
+
+ either "dataset" or "model_mlcube" + |
+ + required + | +
Returns:
+Type | +Description | +
---|---|
+ | +
+
+
+ list[dict]: the list containing the latest association of each + entity instance. + |
+
cli/medperf/utils.py
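The idea behind `filter_latest_associations` can be sketched as follows. The function name `latest_per_entity` is hypothetical, and the sketch assumes each association dictionary carries a `created_at` timestamp in a uniform ISO 8601 format; the reference above does not state which field determines recency.

```python
from typing import Dict, List


def latest_per_entity(associations: List[dict], entity_key: str) -> List[dict]:
    """Hypothetical helper: keep only the newest association per entity."""
    # Uniformly formatted ISO 8601 timestamps sort chronologically as strings
    ordered = sorted(associations, key=lambda assoc: assoc["created_at"])
    latest: Dict[object, dict] = {}
    for assoc in ordered:
        # Later (newer) entries overwrite earlier ones for the same entity
        latest[assoc[entity_key]] = assoc
    return list(latest.values())
```

Called with `entity_key="dataset"` or `entity_key="model_mlcube"`, this returns one association per entity instance.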
format_errors_dict(errors_dict)
+
+¶Reformats the error details from a field-error(s) dictionary into a human-readable string for printing
+ +cli/medperf/utils.py
generate_tmp_path()
+
+¶Generates a temporary path by means of getting the current timestamp +with a random salt
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ generated temporary path + |
+
cli/medperf/utils.py
generate_tmp_uid()
+
+¶Generates a temporary uid by means of getting the current timestamp +with a random salt
+ + + +Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ generated temporary uid + |
+
cli/medperf/utils.py
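A temporary identifier built from "the current timestamp with a random salt" might be produced along the lines below. The name `make_tmp_uid`, the `tmp_` prefix, and the exact formatting are assumptions for illustration; the actual format used by `generate_tmp_uid` and `generate_tmp_path` is not shown in this reference.

```python
import random
import time


def make_tmp_uid() -> str:
    """Hypothetical helper: timestamp plus random salt as a temporary ID."""
    timestamp_ms = int(time.time() * 1000)  # current time in milliseconds
    salt = random.randint(0, 999_999)  # lowers the chance of same-instant clashes
    return f"tmp_{timestamp_ms}_{salt:06d}"
```

Such identifiers are only suitable as short-lived placeholders; they are not guaranteed to be unique.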
get_cube_image_name(cube_path)
+
+¶Retrieves the singularity image name of the mlcube by reading its mlcube.yaml file
+ +cli/medperf/utils.py
get_file_hash(path)
+
+¶Calculates the sha256 hash for a given file.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
path |
+
+ str
+ |
+
+
+
+ Location of the file of interest. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ Calculated hash + |
+
cli/medperf/utils.py
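Computing a SHA-256 hash for a file, as `get_file_hash` is documented to do, is commonly done by streaming the file in chunks. The sketch below uses only the standard library; the function name `file_sha256` and the chunk size are choices made for this example and may differ from the MedPerf source.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hex-encoded SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are never fully loaded in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is a 64-character hexadecimal string that can be compared against a hash published alongside a hosted file.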
get_folders_hash(paths)
+
+¶Generates a hash for all the contents of the given folders. This procedure +hashes every file in the passed folders, sorts the resulting hashes, and then hashes that sorted list.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
paths |
+
+ List[str]
+ |
+
+
+
+ Folders to hash. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ sha256 hash that represents all the folders altogether + |
+
cli/medperf/utils.py
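The procedure described for `get_folders_hash` (hash every file, sort the hashes, hash the sorted list) can be sketched as below. The names `folders_sha256` and `_file_sha256` are hypothetical, and details such as how the sorted hashes are fed into the final digest are assumptions.

```python
import hashlib
import os
from typing import List


def _file_sha256(path: str) -> str:
    """SHA-256 hex digest of a single file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folders_sha256(paths: List[str]) -> str:
    """Hypothetical helper: one SHA-256 digest for several folders."""
    file_hashes = []
    for folder in paths:
        for root, _, files in os.walk(folder):
            for name in files:
                file_hashes.append(_file_sha256(os.path.join(root, name)))
    # Sorting makes the result independent of traversal and argument order
    file_hashes.sort()
    combined = hashlib.sha256()
    for file_hash in file_hashes:
        combined.update(file_hash.encode("utf-8"))
    return combined.hexdigest()
```

Because only file contents are hashed in this sketch, renaming a file leaves the result unchanged, while editing any file changes it.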
get_uids(path)
+
+¶Retrieves the UID of all the elements in the specified path.
+ + + +Returns:
+Type | +Description | +
---|---|
+ List[str]
+ |
+
+
+
+ List[str]: UIDs of objects in path. + |
+
cli/medperf/utils.py
pretty_error(msg)
+
+¶Prints an error message with typer protocol
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
msg |
+
+ str
+ |
+
+
+
+ Error message to show to the user + |
+ + required + | +
cli/medperf/utils.py
remove_path(path)
+
+¶Cleans up a clutter object. In case of failure, it is moved to .trash
cli/medperf/utils.py
sanitize_json(data)
+
+¶Makes sure the input data is JSON compliant.
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
data |
+
+ dict
+ |
+
+
+
+ dictionary containing data to be represented as JSON. + |
+ + required + | +
Returns:
+Name | Type | +Description | +
---|---|---|
dict |
+ dict
+ |
+
+
+
+ sanitized dictionary + |
+
cli/medperf/utils.py
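One way data can fail to be JSON compliant is by containing `NaN` or infinite floats, which Python's `json` module emits as non-standard tokens by default. The sketch below replaces such values with strings. Whether `sanitize_json` handles exactly these cases is an assumption, and the name `to_json_compliant` is hypothetical.

```python
import math


def to_json_compliant(value):
    """Hypothetical helper: replace NaN/Infinity floats with strings."""
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return str(value)  # becomes "nan", "inf" or "-inf"
    if isinstance(value, dict):
        return {key: to_json_compliant(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_compliant(item) for item in value]
    return value
```

After sanitizing, `json.dumps(data, allow_nan=False)` succeeds, which confirms that the output contains only standard JSON values.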
untar(filepath, remove=True)
+
+¶Untars and optionally removes the tar.gz file
+ + + +Parameters:
+Name | +Type | +Description | +Default | +
---|---|---|---|
filepath |
+
+ str
+ |
+
+
+
+ Path where the tar.gz file can be found. + |
+ + required + | +
remove |
+
+ bool
+ |
+
+
+
+ Whether to delete the tar.gz file. Defaults to True. + |
+
+ True
+ |
+
Returns:
+Name | Type | +Description | +
---|---|---|
str |
+ str
+ |
+
+
+
+ location where the untarred files can be found. + |
+
cli/medperf/utils.py
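The behavior documented for `untar` (extract a `.tar.gz` file, optionally delete it, and return the extraction location) can be approximated with the standard `tarfile` module. The name `extract_tar_gz` and the choice to extract next to the archive are assumptions for this sketch.

```python
import os
import tarfile


def extract_tar_gz(filepath: str, remove: bool = True) -> str:
    """Hypothetical helper: extract a tar.gz archive next to itself."""
    # Assumption: files are extracted into the archive's own directory
    dest = os.path.dirname(os.path.abspath(filepath))
    with tarfile.open(filepath, "r:gz") as tar:
        tar.extractall(dest)
    if remove:
        os.remove(filepath)  # drop the archive once its contents are extracted
    return dest
```

For archives from untrusted sources, recent Python versions accept a `filter` argument to `extractall` (for example `filter="data"`) that rejects unsafe members; it is omitted here so the sketch also runs on older interpreters.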
Here we introduce user roles at MedPerf. Depending on the objectives and expectations a user may have multiple roles.
+May include healthcare stakeholders (e.g., hospitals, clinicians, patient advocacy groups, payors, etc.), regulatory bodies, data providers and model owners wishing to drive the evaluation of AI models on real-world data. While the Benchmark Committee does not have admin privileges on MedPerf, they have elevated permissions regarding benchmark assets (e.g., task, evaluation metrics, etc.) and policies (e.g., participation of model owners, data providers, anonymization).
+ +May include hospitals, medical practices, research organizations, and healthcare payors that own medical data, register medical data, and execute benchmarks.
+ +May include ML researchers and software vendors that own a trained medical ML model and want to evaluate its performance against a benchmark.
+ +Organizations like MLCommons that operate the MedPerf platform enabling benchmark committees to develop and run benchmarks.
+ + + + + + + + + + + +TODO: the page is hidden now. If implemented, find all usages and uncomment them.
"},{"location":"medperf_components/","title":"MedPerf Components","text":""},{"location":"medperf_components/#medperf-server","title":"MedPerf Server","text":"The server contains all the metadata necessary to coordinate and execute experiments. No code assets or datasets are stored on the server.
The backend server is implemented in Django, and it can be found in the server folder in the MedPerf Github repository.
"},{"location":"medperf_components/#medperf-client","title":"MedPerf Client","text":"The MedPerf client contains all the necessary tools to interact with the server, preparing datasets for benchmarks and running experiments on the local machine. It can be found in this folder in the MedPerf Github repository.
The client communicates to the server through the API to, for example, authenticate a user, retrieve benchmarks/MLcubes and send results.
The client is currently available to the user through a command-line interface (CLI).
"},{"location":"medperf_components/#auth-provider","title":"Auth Provider","text":"The auth provider manages MedPerf users' identities, authentication, and authorization to access the MedPerf server. Users will authenticate with the auth provider and authorize their MedPerf client to access the MedPerf server. Upon authorization, the MedPerf client will use access tokens issued by the auth provider in every request to the MedPerf server. The MedPerf server is configured to process only requests authorized by the auth provider.
Currently, MedPerf uses Auth0 as the auth provider.
"},{"location":"roles/","title":"User Roles and Responsibilities","text":"Here we introduce user roles at MedPerf. Depending on the objectives and expectations a user may have multiple roles.
"},{"location":"roles/#benchmark-committee","title":"Benchmark Committee","text":"May include healthcare stakeholders (e.g., hospitals, clinicians, patient advocacy groups, payors, etc.), regulatory bodies, data providers and model owners wishing to drive the evaluation of AI models on real-world data. While the Benchmark Committee does not have admin privileges on MedPerf, they have elevated permissions regarding benchmark assets (e.g., task, evaluation metrics, etc.) and policies (e.g., participation of model owners, data providers, anonymization).
"},{"location":"roles/#data-providers","title":"Data Providers","text":"May include hospitals, medical practices, research organizations, and healthcare payors that own medical data, register medical data, and execute benchmarks.
"},{"location":"roles/#model-owners","title":"Model Owners","text":"May include ML researchers and software vendors that own a trained medical ML model and want to evaluate its performance against a benchmark.
"},{"location":"roles/#platform-providers","title":"Platform Providers","text":"Organizations like MLCommons that operate the MedPerf platform enabling benchmark committees to develop and run benchmarks.
"},{"location":"what_is_medperf/","title":"What is Medperf?","text":"MedPerf is an open-source framework for benchmarking medical ML models. It uses Federated Evaluation, a method in which medical ML models are securely distributed to multiple global facilities for evaluation, prioritizing patient privacy to mitigate legal and regulatory risks. The goal of Federated Evaluation is to make it simple and reliable to share ML models with many data providers, evaluate those ML models against their data in controlled settings, then aggregate and analyze the findings.
The MedPerf approach empowers healthcare stakeholders through neutral governance to assess and verify the performance of ML models in an efficient and human-supervised process without sharing any patient data across facilities during the process.
Federated evaluation of medical AI model using MedPerf on a hypothetical example"},{"location":"what_is_medperf/#why-medperf","title":"Why MedPerf?","text":"MedPerf aims to identify bias and generalizability issues of medical ML models by evaluating them on diverse medical data across the world. This process allows developers of medical ML to efficiently identify performance and reliability issues on their models while healthcare stakeholders (e.g., hospitals, practices, etc.) can validate such models against clinical efficacy.
Importantly, MedPerf supports technology for neutral governance in order to enable full trust and transparency among participating parties (e.g., AI vendor, data provider, regulatory body, etc.). This is all encapsulated in the benchmark committee which is the overseeing body on a benchmark.
Benchmark committee in MedPerf"},{"location":"what_is_medperf/#benefits-to-healthcare-stakeholders","title":"Benefits to healthcare stakeholders","text":"Anyone who joins our platform can get several benefits, regardless of the role they will assume.
Benefits to healthcare stakeholders using MedPerf. Our paper describes the design philosophy in detail.
"},{"location":"workflow/","title":"Benchmark Workflow","text":"A benchmark in MedPerf is a collection of assets that are developed by the benchmark committee that aims to evaluate medical ML on decentralized data providers.
The process is simple yet effective, which enables scalability.
"},{"location":"workflow/#step-1-establish-benchmark-committee","title":"Step 1. Establish Benchmark Committee","text":"The benchmarking process starts with establishing a benchmark committee of healthcare stakeholders (experts, committee), which will identify a clinical problem where an effective ML-based solution can have a significant clinical impact.
"},{"location":"workflow/#step-2-register-benchmark","title":"Step 2. Register Benchmark","text":"MLCubes are the building blocks of an experiment and are required in order to create a benchmark. Three MLCubes (Data Preparator MLCube, Reference Model MLCube, and Metrics MLCube) need to be submitted. After submitting the three MLCubes, along with a sample reference dataset, the Benchmark Committee can create a benchmark. Once the benchmark is submitted, the MedPerf admin must approve it before it can be seen by other users. Follow our Hands-on Tutorial for detailed step-by-step guidelines.
"},{"location":"workflow/#step-3-register-dataset","title":"Step 3. Register Dataset","text":"Data Providers that want to be part of the benchmark can register their own datasets, prepare them, and associate them with the benchmark. A dataset will be prepared using the benchmark's Data Preparator MLCube and the dataset's metadata is registered within the MedPerf server.
Data Preparation
The data provider can then request to participate in the benchmark with their dataset. Requesting the association will run the benchmark's reference workflow to ensure the compatibility of the prepared dataset structure with the workflow. Once the association request is approved by the Benchmark Committee, the dataset becomes a part of the benchmark.
"},{"location":"workflow/#step-4-register-models","title":"Step 4. Register Models","text":"Once a benchmark is submitted by the Benchmark Committee, any user can submit their own Model MLCubes and request an association with the benchmark. This association request executes the benchmark locally with the given model on the benchmark's reference dataset to ensure workflow validity and compatibility. If the model successfully passes the compatibility test, and its association is approved by the Benchmark Committee, it becomes a part of the benchmark.
"},{"location":"workflow/#step-5-execute-benchmark","title":"Step 5. Execute Benchmark","text":"The Benchmark Committee may notify Data Providers that models are available for benchmarking. Data Providers can then run the benchmark models locally on their data.
This procedure retrieves the model MLCubes associated with the benchmark and runs them on the indicated prepared dataset to generate predictions. The Metrics MLCube of the benchmark is then retrieved to evaluate the predictions. Once the evaluation results are generated, the data provider can submit them to the platform.
"},{"location":"workflow/#step-6-aggregate-and-release-results","title":"Step 6. Aggregate and Release Results","text":"The benchmarking platform aggregates the results of running the models against the datasets and shares them according to the Benchmark Committee's policy.
The sharing policy controls how much of the data is shared, ranging from a single aggregated metric to a more detailed model-data cross product. A public leaderboard is available to Model Owners who produce the best performances.
"},{"location":"concepts/associations/","title":"In Progress","text":"TODO: the page is hidden now. If implemented, find all usages and uncomment them.
"},{"location":"concepts/auth/","title":"Authentication","text":"This guide helps you learn how to log in and log out using the MedPerf client to access the main production MedPerf server. MedPerf uses passwordless authentication. This means that to log in you only need access to your email.
"},{"location":"concepts/auth/#login","title":"Login","text":"Follow the steps below to log in:
medperf auth login\n
You will be prompted to enter your email address.
After entering your email address, you will be provided with a verification URL and a code. A text similar to the following will be printed in your terminal:
Tip
If you are running the MedPerf client on a machine with no graphical interface, you can use the link on any other device, e.g. your cellphone. Make sure that you trust that device.
Open the printed URL in your browser. You will be presented with a code, and you will be asked to confirm if that code is the same one printed in your terminal.
Enter the received code in the previous screen.
To disconnect the MedPerf client, simply run the following command:
medperf auth logout\n
"},{"location":"concepts/auth/#checking-the-authentication-status","title":"Checking the authentication status","text":"Note that when you log in, the MedPerf client will remember you as long as you are using the same profile. If you switch to another profile by running medperf profile activate <other-profile>, you may have to log in again. If you switch back to a profile where you previously logged in, your login state will be restored.
You can always check the current login status by running the following command:
medperf auth status\n
"},{"location":"concepts/hosting_files/","title":"Hosting Files","text":"MedPerf requires some files to be hosted on the cloud when running machine learning pipelines. Submitting MLCubes to the MedPerf server means submitting their metadata, and not, for example, model weights or parameters files. MLCube files such as model weights need to be hosted on the cloud, and the submitted MLCube metadata will only contain URLs (or certain identifiers) for these files. Another example would be benchmark submission, where demo datasets need to be hosted.
The MedPerf client expects files to be hosted in certain ways. Below are the options for hosting files and how MedPerf identifies them (e.g. a URL).
"},{"location":"concepts/hosting_files/#file-hosting","title":"File hosting","text":"This can be done with any cloud hosting tool/provider you desire (such as GCP, AWS, Dropbox, Google Drive, GitHub). As long as your file can be accessed through a direct download link, it should work with MedPerf. Generating a direct download link for your hosted file can be straightforward when using some providers (e.g. Amazon Web Services, Google Cloud Platform, Microsoft Azure) and can be a bit tricky when using others (e.g. Dropbox, GitHub, Google Drive).
Note
Direct download links must be permanent
Tip
You can check whether a URL is a direct download link using tools like wget or curl. Running wget <URL> will download the file if the URL is a direct download link. If it is not, the command may fail or may download an HTML page instead.
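The same check can be scripted. The sketch below is a hypothetical helper and not part of MedPerf. It assumes that a response with an HTML Content-Type is a web page (for example a preview or login page) rather than the file itself, which is a heuristic and not a guarantee. The function names are illustrative only.

```python
from urllib.request import Request, urlopen


def looks_like_direct_download(content_type):
    # Heuristic (an assumption, not a MedPerf rule): an HTML response
    # usually means a preview or login page, not the file itself.
    # A missing Content-Type is treated as possibly a file.
    media_type = content_type.split(';')[0].strip().lower()
    return media_type not in ('text/html', 'application/xhtml+xml')


def check_url(url, timeout=10):
    # Sends a HEAD request, follows redirects, and inspects the
    # Content-Type of the final response. Raises if the request fails.
    request = Request(url, method='HEAD')
    with urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get('Content-Type', '')
    return looks_like_direct_download(content_type)
```

Some hosts reject HEAD requests, so a failure here does not prove the link is unusable; downloading the file with wget or curl remains the definitive test.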
When your file is hosted with a direct download link, MedPerf will be able to identify this file using that direct download link. So for example, when you are submitting an MLCube, you would pass your hosted MLCube manifest file as follows:
--mlcube-file <the-direct-download-link-to-the-file>\n
Warning
In this case, files must have anonymous public read access permission.
"},{"location":"concepts/hosting_files/#direct-download-links-of-files-on-github","title":"Direct download links of files on GitHub","text":"It is common practice among current MedPerf users to host files on GitHub. You can learn below how to find the direct download link of a file hosted on GitHub. You can check online for other storage providers.
It is important, though, to make sure the files won't be modified after being submitted to MedPerf, which could happen due to future commits. Because of this, the URLs of the files hosted on GitHub must contain a reference to the current commit hash. Below are the steps to get this URL for a specific file:
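As an illustration of what a commit-pinned URL looks like, here is a minimal sketch. It assumes the standard raw.githubusercontent.com layout of owner, repository, commit, and file path. The helper name pinned_raw_url is hypothetical and not part of MedPerf.

```python
import re


def pinned_raw_url(owner, repo, commit, path):
    # Require a full 40-character commit hash so the link cannot
    # silently follow a moving reference such as a branch name.
    if not re.fullmatch('[0-9a-f]{40}', commit):
        raise ValueError('expected a full 40-character commit hash')
    return 'https://raw.githubusercontent.com/{}/{}/{}/{}'.format(
        owner, repo, commit, path.lstrip('/')
    )
```

Passing a branch name such as main raises an error, which reflects the requirement above: a branch can receive new commits, while a commit hash always refers to the same file contents.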
You can choose the option of hosting with Synapse in cases where privacy is a concern. Please refer to this link for hosting files on the Synapse platform.
When your file is hosted on Synapse, MedPerf will be able to identify this file using the Synapse ID corresponding to that file. So for example, when you are submitting an MLCube, you would pass your hosted MLCube manifest file as follows (note the prefix):
--mlcube-file synapse:<the-synapse-id-of-the-file>\n
Note that you need to authenticate with your Synapse credentials if you plan to use a Synapse file with MedPerf. To do so, run medperf auth synapse_login.
Note
You must authenticate to use files on Synapse. If authentication is not required, the file has anonymous public read access permission. In this case, Synapse allows you to generate a permanent direct download link for your file and you can follow the previous section.
"},{"location":"concepts/mlcube_files/","title":"MLCube Components: What to Host?","text":"Once you have built an MLCube ready for MedPerf, you need to host its components on the cloud so that the MedPerf client can identify and retrieve it on other machines. The following is a description of what needs to be hosted.
"},{"location":"concepts/mlcube_files/#hosting-your-container-image","title":"Hosting Your Container Image","text":"MLCubes execute a container image behind the scenes. This container image is usually hosted on a container registry, like Docker Hub. In cases where this is not possible, MedPerf provides the option of passing the image file directly (i.e. having the image file hosted somewhere and providing MedPerf with the download link). MLCubes that work with images outside of a container registry usually store the image inside the <path_to_mlcube>/workspace/.image folder. MedPerf supports using direct container image files for Singularity only.
Note
While there is the option of hosting the Singularity image directly, it is highly recommended to use a container registry for accessibility and usability purposes. MLCube also has mechanisms for converting containers to other runners, like Docker to Singularity.
Note
Docker images can be hosted on any Docker container registry, not necessarily on Docker Hub.
"},{"location":"concepts/mlcube_files/#files-to-be-hosted","title":"Files to be hosted","text":"The following is the list of files that must be hosted separately so they can be used by MedPerf:
"},{"location":"concepts/mlcube_files/#mlcubeyaml","title":"mlcube.yaml
","text":"Every MLCube is defined by its mlcube.yaml
manifest file. As such, MedPerf needs to have access to this file to recreate the MLCube. This file can be found inside your MLCube at <path_to_mlcube>/mlcube.yaml.
parameters.yaml
(Optional)","text":"The parameters.yaml
file specifies additional ways to parametrize your model MLCube using the same container image it is built with. This file can be found inside your MLCube at <path_to_mlcube>/workspace/parameters.yaml.
additional_files.tar.gz
(Optional)","text":"MLCubes may require additional files, such as model weights, that you may want to keep separate from the model architecture and the hosted image. This allows for testing multiple implementations of the same model without requiring a separate container image for each. If additional files are being used by your MLCube, they need to be compressed into a .tar.gz file and hosted separately. You can create this tarball file with the following command:
tar -czf additional_files.tar.gz -C <path_to_mlcube>/workspace/additional_files .\n
"},{"location":"concepts/mlcube_files/#preparing-an-mlcube-for-hosting","title":"Preparing an MLCube for hosting","text":"To facilitate hosting and interface compatibility validation, MedPerf provides a script that finds all the required assets, compresses them if necessary, and places them in a single location for easy access. To run the script, make sure you have MedPerf installed and you are in MedPerf's root directory:
python scripts/package-mlcube.py \\\n--mlcube path/to/mlcube \\\n--mlcube-types <list-of-comma-separated-strings> \\\n--output path/to/file.tar.gz\n
where:
path/to/mlcube is the path to the MLCube folder containing the manifest file (mlcube.yaml).
--mlcube-types specifies a comma-separated list of MLCube types ('data-preparator' for a data preparation MLCube, 'model' for a model MLCube, and 'metrics' for a metrics MLCube).
path/to/file.tar.gz is a path to the output file where you want to store the compressed version of all assets.
See python scripts/package-mlcube.py --help for more information.
Once executed, you should be able to find all prepared assets at ./mlcube/assets, as well as a compressed version of the assets folder at the output path provided.
Note
The --output parameter is optional. The compressed version of the assets folder can be useful in cases where you don't directly interact with the MedPerf server, but instead you do so through a third party. This is usually the case for challenges and competitions.
TODO: the page is hidden now. If implemented, find all usages and uncomment them.
"},{"location":"concepts/profiles/","title":"In Progress","text":"TODO: the page is hidden now. If implemented, find all usages and uncomment them.
"},{"location":"concepts/single_run/","title":"In Progress","text":"TODO: the page is hidden now. If implemented, find all usages and uncomment them.
"},{"location":"getting_started/benchmark_owner_demo/","title":"Benchmark Committee","text":""},{"location":"getting_started/benchmark_owner_demo/#hands-on-tutorial-for-bechmark-committee","title":"Hands-on Tutorial for Benchmark Committee","text":""},{"location":"getting_started/benchmark_owner_demo/#overview","title":"Overview","text":"In this guide, you will learn how a user can use MedPerf to create a benchmark. The key tasks can be summarized as follows:
It's assumed that you have already set up the general testing environment as explained in the installation and setup guide.
"},{"location":"getting_started/benchmark_owner_demo/#before-you-start","title":"Before You Start","text":""},{"location":"getting_started/benchmark_owner_demo/#first-steps","title":"First steps","text":""},{"location":"getting_started/benchmark_owner_demo/#running-in-cloud-via-github-codespaces","title":"Running in cloud via GitHub Codespaces","text":"The easiest way to try the tutorials is to launch a preinstalled Codespace cloud environment for MedPerf by clicking this link:
"},{"location":"getting_started/benchmark_owner_demo/#running-in-local-environment","title":"Running in local environment","text":"To start experimenting with MedPerf through this tutorial on your local machine, you need to start by following these quick steps:
For the purpose of the tutorial, you have to initialize a local MedPerf server with a fresh database and then create the necessary entities that you will be interacting with. To do so, run the following: (make sure you are in MedPerf's root folder)
cd server\nsh reset_db.sh\npython seed.py --demo benchmark\ncd ..\n
"},{"location":"getting_started/benchmark_owner_demo/#download-the-necessary-files","title":"Download the Necessary files","text":"A script is provided to download all the necessary files so that you can follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
sh tutorials_scripts/setup_benchmark_tutorial.sh\n
This will create a workspace folder medperf_tutorial where all necessary files are downloaded. The folder contains the following content:
In this tutorial we will create a benchmark that classifies chest X-Ray images.
In a real scenario, you would have to create all the listed artifacts and files yourself. For the sake of the tutorial, however, you may use this toy data.
"},{"location":"getting_started/benchmark_owner_demo/#demo-data","title":"Demo Data","text":"The medperf_tutorial/demo_data/ folder contains the demo dataset content.
The images/ folder includes sample images.
labels/labels.csv provides a basic ground truth markup, indicating the class each image belongs to.
The demo dataset is a sample dataset used for the development of your benchmark and used by Model Owners for the development of their models. More details are available in the section below.
"},{"location":"getting_started/benchmark_owner_demo/#data-preparator-mlcube","title":"Data Preparator MLCube","text":"The medperf_tutorial/data_preparator/ folder contains a Data Preparator MLCube that you must implement. This MLCube:
- Transforms raw data into a format convenient for model consumption, such as converting DICOM images into NumPy tensors, cropping patches, normalizing columns, etc. It's up to you to define the format that is handy for future models.
- Ensures its output is in a standardized format, allowing Model Owners/Developers to rely on its consistency.
The medperf_tutorial/model_custom_cnn/ folder is an example of a Model MLCube. You need to implement a reference model which will be used by data owners to test the compatibility of their data with your pipeline. Also, Model Developers joining your benchmark will follow the input/output specifications of this model when building their own models.
The medperf_tutorial/metrics/ folder houses a Metrics MLCube that processes ground truth data and model predictions, and computes performance metrics such as classification accuracy, loss, etc. After a Dataset Owner runs the benchmark pipeline on their data, these final metric values will be shared with you as the Benchmark Owner.
The local MedPerf server is pre-configured with a dummy local authentication system. Remember that when you are communicating with the real MedPerf server, you should follow the steps in this guide to log in. For the tutorials, you do not need to do anything.
You are now ready to start!
"},{"location":"getting_started/benchmark_owner_demo/#1-implement-a-valid-workflow","title":"1. Implement a Valid Workflow","text":"The implementation of a valid workflow is accomplished by implementing three MLCubes:
Data Preparator MLCube: This MLCube will transform raw data into a dataset ready for the AI model execution. All data owners willing to participate in this benchmark will have their data prepared using this MLCube. A guide on how to implement data preparation MLCubes can be found here.
Reference Model MLCube: This MLCube will contain an example model implementation for the desired AI task. It should be compatible with the data preparation MLCube (i.e., the outputs of the data preparation MLCube can be directly fed as inputs to this MLCube). A guide on how to implement model MLCubes can be found here.
Metrics MLCube: This MLCube will be responsible for evaluating the performance of a model. It should be compatible with the reference model MLCube (i.e., the outputs of the reference model MLCube can be directly fed as inputs to this MLCube). A guide on how to implement metrics MLCubes can be found here.
For this tutorial, you are provided with the following three already-implemented MLCubes for the task of chest X-ray classification. The implementations can be found in the following links: Data Preparator, Reference Model, Metrics. These MLCubes are set up locally for you and can be found in your workspace folder under data_preparator, model_custom_cnn, and metrics.
A demo dataset is a small reference dataset. It contains a few data records and their labels, which will be used to test the benchmark's workflow in two scenarios:
It is used for testing the benchmark's default workflow. The MedPerf client automatically runs a compatibility test of the benchmark's three mlcubes prior to its submission. The test is run using the benchmark's demo dataset as input.
When a model owner wants to participate in the benchmark, the MedPerf client tests the compatibility of their model with the benchmark's data preparation cube and metrics cube. The test is run using the benchmark's demo dataset as input.
For this tutorial, you are provided with a demo dataset for the chest X-ray classification workflow. The dataset can be found in your workspace folder under demo_data. It is a small dataset comprising two chest X-ray images and corresponding thoracic disease labels.
You can test the workflow now that you have the three MLCubes and the demo data. Testing the workflow before submitting any asset to the MedPerf server is usually recommended.
"},{"location":"getting_started/benchmark_owner_demo/#3-test-your-workflow","title":"3. Test your Workflow","text":"MedPerf provides a single command to test an inference workflow. To test your workflow with local MLCubes and local data, the following need to be passed to the command:
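Path to the data preparation MLCube manifest file: medperf_tutorial/data_preparator/mlcube/mlcube.yaml.
Path to the model MLCube manifest file: medperf_tutorial/model_custom_cnn/mlcube/mlcube.yaml.
Path to the metrics MLCube manifest file: medperf_tutorial/metrics/mlcube/mlcube.yaml.
Path to the demo dataset data records: medperf_tutorial/demo_data/images.
Path to the demo dataset labels: medperf_tutorial/demo_data/labels.
Run the following command to execute the test, ensuring you are in MedPerf's root folder: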
medperf test run \\\n--data_preparation \"medperf_tutorial/data_preparator/mlcube/mlcube.yaml\" \\\n--model \"medperf_tutorial/model_custom_cnn/mlcube/mlcube.yaml\" \\\n--evaluator \"medperf_tutorial/metrics/mlcube/mlcube.yaml\" \\\n--data_path \"medperf_tutorial/demo_data/images\" \\\n--labels_path \"medperf_tutorial/demo_data/labels\"\n
Assuming the test passes successfully, you are ready to submit the MLCubes to the MedPerf server.
"},{"location":"getting_started/benchmark_owner_demo/#4-host-the-demo-dataset","title":"4. Host the Demo Dataset","text":"The demo dataset should be packaged in a specific way as a compressed tarball file. The folder structure in the workspace currently looks like the following:
.\n\u2514\u2500\u2500 medperf_tutorial\n \u251c\u2500\u2500 demo_data\n \u2502 \u251c\u2500\u2500 images\n \u2502 \u2514\u2500\u2500 labels\n \u2502\n ...\n
The goal is to package the folder demo_data. You must first create a file called paths.yaml. This file specifies how to locate the data records path and the labels path.
In your workspace directory (medperf_tutorial), create a file paths.yaml and fill it with the following:
data_path: demo_data/images\nlabels_path: demo_data/labels\n
Note
The paths are determined by the Data Preparator MLCube's expected input path.
After that, the workspace should look like the following:
.\n\u2514\u2500\u2500 medperf_tutorial\n \u251c\u2500\u2500 demo_data\n \u2502 \u251c\u2500\u2500 images\n \u2502 \u2514\u2500\u2500 labels\n \u251c\u2500\u2500 paths.yaml\n \u2502\n ...\n
Finally, compress the required assets (demo_data and paths.yaml) into a tarball file by running the following command:
cd medperf_tutorial\ntar -czf demo_data.tar.gz demo_data paths.yaml\ncd ..\n
And that's it! Now you have to host the tarball file (demo_data.tar.gz) on the internet.
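Before uploading, it can be worth confirming that the archive has the expected layout. The following is a minimal sketch using Python's standard tarfile module. The function name missing_entries is hypothetical and not part of MedPerf, and the expected entries are simply the two items packaged in this tutorial.

```python
import tarfile


def missing_entries(tarball_path, required=('paths.yaml', 'demo_data')):
    # Returns the required top-level entries that are absent from the
    # archive. An empty list means the layout looks as expected.
    top_level = set()
    with tarfile.open(tarball_path, 'r:gz') as archive:
        for name in archive.getnames():
            # Some tar invocations prefix member names with './'.
            if name.startswith('./'):
                name = name[2:]
            if name:
                top_level.add(name.split('/')[0])
    return [entry for entry in required if entry not in top_level]
```

For the tarball created above, calling missing_entries('medperf_tutorial/demo_data.tar.gz') from MedPerf's root folder should return an empty list if both demo_data and paths.yaml were included.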
For the tutorial to run smoothly, the file is already hosted at the following URL:
https://storage.googleapis.com/medperf-storage/chestxray_tutorial/demo_data.tar.gz\n
If you wish to host it by yourself, you can find the list of supported options and details about hosting files on this page.
Next, with the demo dataset hosted, you can submit the MLCubes to the MedPerf server.
"},{"location":"getting_started/benchmark_owner_demo/#5-submitting-the-mlcubes","title":"5. Submitting the MLCubes","text":""},{"location":"getting_started/benchmark_owner_demo/#how-does-medperf-recognize-an-mlcube","title":"How does MedPerf Recognize an MLCube?","text":"The MedPerf server registers an MLCube as metadata comprised of a set of files that can be retrieved from the internet. This means that before submitting an MLCube you have to host its files on the internet. The MedPerf client provides a utility to prepare the files of an MLCube that need to be hosted. You can refer to this page if you want to understand what the files are, but using the utility script is enough.
To prepare the files of the three MLCubes, run the following command ensuring you are in MedPerf's root folder:
python scripts/package-mlcube.py --mlcube medperf_tutorial/data_preparator/mlcube --mlcube-types data-preparator\npython scripts/package-mlcube.py --mlcube medperf_tutorial/model_custom_cnn/mlcube --mlcube-types model\npython scripts/package-mlcube.py --mlcube medperf_tutorial/metrics/mlcube --mlcube-types metrics\n
For each MLCube, this script will create a new folder named assets in the MLCube directory. This folder will contain all the files that should be hosted separately.
For the tutorial to run smoothly, the files are already hosted. If you wish to host them by yourself, you can find the list of supported options and details about hosting files on this page.
"},{"location":"getting_started/benchmark_owner_demo/#submit-the-mlcubes","title":"Submit the MLCubes","text":""},{"location":"getting_started/benchmark_owner_demo/#data-preparator-mlcube_1","title":"Data Preparator MLCube","text":"For the Data Preparator MLCube, the submission should include:
The URL to the hosted mlcube manifest file, which is:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/data_preparator/mlcube/mlcube.yaml\n
The URL to the hosted mlcube parameters file, which is:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/data_preparator/mlcube/workspace/parameters.yaml\n
Use the following command to submit:
medperf mlcube submit \\\n--name my-prep-cube \\\n--mlcube-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/data_preparator/mlcube/mlcube.yaml\" \\\n--parameters-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/data_preparator/mlcube/workspace/parameters.yaml\" \\\n--operational\n
"},{"location":"getting_started/benchmark_owner_demo/#reference-model-mlcube","title":"Reference Model MLCube","text":"For the Reference Model MLCube, the submission should include:
The URL to the hosted mlcube manifest file:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_custom_cnn/mlcube/mlcube.yaml\n
The URL to the hosted mlcube parameters file:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_custom_cnn/mlcube/workspace/parameters.yaml\n
The URL to the hosted additional files tarball file:
https://storage.googleapis.com/medperf-storage/chestxray_tutorial/cnn_weights.tar.gz\n
Use the following command to submit:
medperf mlcube submit \\\n--name my-modelref-cube \\\n--mlcube-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_custom_cnn/mlcube/mlcube.yaml\" \\\n--parameters-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_custom_cnn/mlcube/workspace/parameters.yaml\" \\\n--additional-file \"https://storage.googleapis.com/medperf-storage/chestxray_tutorial/cnn_weights.tar.gz\" \\\n--operational\n
"},{"location":"getting_started/benchmark_owner_demo/#metrics-mlcube_1","title":"Metrics MLCube","text":"For the Metrics MLCube, the submission should include:
The URL to the hosted mlcube manifest file:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/metrics/mlcube/mlcube.yaml\n
The URL to the hosted mlcube parameters file:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/metrics/mlcube/workspace/parameters.yaml\n
Use the following command to submit:
medperf mlcube submit \\\n--name my-metrics-cube \\\n--mlcube-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/metrics/mlcube/mlcube.yaml\" \\\n--parameters-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/metrics/mlcube/workspace/parameters.yaml\" \\\n--operational\n
Each of the three MLCubes will be assigned a server UID. You can check the server UID for each MLCube by running:
medperf mlcube ls --mine\n
Finally, with the MLCubes submitted and the demo dataset hosted, you can submit the benchmark to the MedPerf server.
"},{"location":"getting_started/benchmark_owner_demo/#6-submit-your-benchmark","title":"6. Submit your Benchmark","text":"You need to keep at hand the following information:
https://storage.googleapis.com/medperf-storage/chestxray_tutorial/demo_data.tar.gz\n
medperf mlcube ls\n
Data Preparator MLCube UID: 1
Reference Model MLCube UID: 2
Metrics (evaluator) MLCube UID: 3
You can create and submit your benchmark using the following command:
medperf benchmark submit \\\n--name tutorial_bmk \\\n--description \"MedPerf demo bmk\" \\\n--demo-url \"https://storage.googleapis.com/medperf-storage/chestxray_tutorial/demo_data.tar.gz\" \\\n--data-preparation-mlcube 1 \\\n--reference-model-mlcube 2 \\\n--evaluator-mlcube 3 \\\n--operational\n
The MedPerf client will first automatically run a compatibility test between the MLCubes using the demo dataset. If the test is successful, the benchmark will be submitted along with the compatibility test results.
Note
The benchmark will stay inactive until the MedPerf server admin approves your submission.
That's it! You can check your benchmark's server UID by running:
medperf benchmark ls --mine\n
"},{"location":"getting_started/benchmark_owner_demo/#cleanup-optional","title":"Cleanup (Optional)","text":"You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
To shut down the local MedPerf server: press CTRL+C in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
rm -fr medperf_tutorial\n
cd server\nsh reset_db.sh\n
rm -fr ~/.medperf/localhost_8000\n
"},{"location":"getting_started/data_owner_demo/","title":"Data Owners","text":""},{"location":"getting_started/data_owner_demo/#hands-on-tutorial-for-data-owners","title":"Hands-on Tutorial for Data Owners","text":""},{"location":"getting_started/data_owner_demo/#overview","title":"Overview","text":"As a data owner, you plan to run a benchmark on your own dataset. Using MedPerf, you will prepare your (raw) dataset and submit information about it to the MedPerf server. You may have to consult the benchmark committee to make sure that your raw dataset aligns with the benchmark's expected input format.
Note
A key concept of MedPerf is the stringent confidentiality of your data. It remains exclusively on your machine. Only minimal information about your dataset, such as the hash of its contents, is submitted. Once your Dataset is submitted and associated with a benchmark, you can run all benchmark models on your data within your own infrastructure and see the results / predictions.
This guide provides you with the necessary steps to use MedPerf as a Data Owner. The key tasks can be summarized as follows:
It is assumed that you have the general testing environment set up.
"},{"location":"getting_started/data_owner_demo/#before-you-start","title":"Before You Start","text":""},{"location":"getting_started/data_owner_demo/#first-steps","title":"First steps","text":""},{"location":"getting_started/data_owner_demo/#running-in-cloud-via-github-codespaces","title":"Running in cloud via GitHub Codespaces","text":"The easiest way to try the tutorials is to launch a preinstalled Codespace cloud environment for MedPerf by clicking this link:
"},{"location":"getting_started/data_owner_demo/#running-in-local-environment","title":"Running in local environment","text":"To start experimenting with MedPerf through this tutorial on your local machine, you need to start by following these quick steps:
For the purpose of the tutorial, you have to initialize a local MedPerf server with a fresh database and then create the necessary entities that you will be interacting with. To do so, run the following: (make sure you are in MedPerf's root folder)
cd server\nsh reset_db.sh\npython seed.py --demo data\ncd ..\n
"},{"location":"getting_started/data_owner_demo/#download-the-necessary-files","title":"Download the Necessary files","text":"A script is provided to download all the necessary files so that you can follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
sh tutorials_scripts/setup_data_tutorial.sh\n
This will create a workspace folder medperf_tutorial where all necessary files are downloaded. The folder contains the following content:
In a real scenario, you would have to create all the listed artifacts and files yourself. For the sake of the tutorial, however, you may use this toy data.
"},{"location":"getting_started/data_owner_demo/#tutorials-dataset-example","title":"Tutorial's Dataset Example","text":"The medperf_tutorial/sample_raw_data/ folder contains your data for the specified Benchmark. In this tutorial, where the benchmark involves classifying chest X-Ray images, your data comprises:
The images/ folder, which contains your images.
labels/labels.csv, which provides the ground truth markup, specifying the class of each image.
The format of this data is dictated by the Benchmark Owner, as it must be compatible with the benchmark's Data Preparation MLCube. In a real-world scenario, the expected data format would differ from this toy example. Refer to the Benchmark Owner to get the format specifications and details for your practical case.
As previously mentioned, your data itself never leaves your machine. During the dataset submission, only basic metadata is transferred, for which you will be prompted to confirm.
"},{"location":"getting_started/data_owner_demo/#login-to-the-local-medperf-server","title":"Login to the Local MedPerf Server","text":"The local MedPerf server is pre-configured with a dummy local authentication system. Remember that when you are communicating with the real MedPerf server, you should follow the steps in this guide to log in. For the tutorials, no login steps are required.
You are now ready to start!
"},{"location":"getting_started/data_owner_demo/#1-register-your-data-information","title":"1. Register your Data Information","text":"To register your dataset, you need to collect the following information:
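A name, a description, and a location for your dataset.
The path to the data records (here, it is medperf_tutorial/sample_raw_data/images).
The path to the labels of the data (here, it is medperf_tutorial/sample_raw_data/labels).
The ID of the benchmark you wish to participate in.
Note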
The data_path
and labels_path
are determined according to the input path requirements of the data preparation MLCube. To ensure that your data is structured correctly, it is recommended to check with the Benchmark Committee for specific details or instructions.
In order to find the benchmark ID, you can execute the following command to view the list of available benchmarks.
medperf benchmark ls\n
The target benchmark ID here is 1
.
Note
You will be submitting general information about the data, not the data itself. The data never leaves your machine.
Run the following command to register your data (make sure you are in MedPerf's root folder):
medperf dataset submit \\\n--name \"mytestdata\" \\\n--description \"A tutorial dataset\" \\\n--location \"My machine\" \\\n--data_path \"medperf_tutorial/sample_raw_data/images\" \\\n--labels_path \"medperf_tutorial/sample_raw_data/labels\" \\\n--benchmark 1\n
Once you run this command, the information to be submitted will be displayed on the screen and you will be asked to confirm your submission. Once you confirm, your dataset will be successfully registered!
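Before running the submission, it can help to confirm that the two paths exist. The following is a minimal sketch, assuming the tutorial layout described earlier (an images folder and a labels folder containing labels.csv). The function name check_dataset_paths is hypothetical and is not a MedPerf command.

```shell
# Check that the folders passed to --data_path and --labels_path look as the
# tutorial describes. Hypothetical helper; not part of the MedPerf client.
check_dataset_paths() {
    data_path="$1"
    labels_path="$2"
    if [ ! -d "$data_path" ]; then
        echo "missing data folder: $data_path" >&2
        return 1
    fi
    if [ ! -f "$labels_path/labels.csv" ]; then
        echo "missing labels file: $labels_path/labels.csv" >&2
        return 1
    fi
    echo "dataset paths look fine"
}

# Run from MedPerf's root folder; "|| true" keeps the shell going on failure.
check_dataset_paths \
    "medperf_tutorial/sample_raw_data/images" \
    "medperf_tutorial/sample_raw_data/labels" || true
```

A real benchmark may expect a different layout, so the labels.csv check would need to be adapted to the format given by the Benchmark Committee.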
"},{"location":"getting_started/data_owner_demo/#2-prepare-your-data","title":"2. Prepare your Data","text":"To prepare and preprocess your dataset, you need to know the server UID of your registered dataset. You can check your datasets information by running:
medperf dataset ls --mine\n
In our tutorial, your dataset ID will be 1
. Run the following command to prepare your dataset:
medperf dataset prepare --data_uid 1\n
This command will also calculate statistics on your data, as defined by the benchmark owner. These will be submitted to the MedPerf server in the next step upon your approval.
"},{"location":"getting_started/data_owner_demo/#3-mark-your-dataset-as-operational","title":"3. Mark your Dataset as Operational","text":"After successfully preparing your dataset, you can mark it as ready so that it can be associated with benchmarks you want. During preparation, your dataset is considered in the Development
stage, and now you will mark it as operational.
Note
Once a dataset is marked as operational, it can never be marked as in-development again.
Run the following command to mark your dataset as operational:
medperf dataset set_operational --data_uid 1\n
Once you run this command, you will see on your screen the updated information of your dataset along with the statistics mentioned in the previous step. You will be asked to confirm submission of the displayed information. Once you confirm, your dataset will be successfully marked as operational!
Next, you can proceed to request participation in the benchmark by initiating an association request.
"},{"location":"getting_started/data_owner_demo/#4-request-participation","title":"4. Request Participation","text":"To be able to submit the results of executing the benchmark models on your data in the future, you must associate your data with the benchmark.
Once you have submitted your dataset to the MedPerf server, it will be assigned a server UID, which you can find by running medperf dataset ls --mine
. Your dataset's server UID is also 1
.
Run the following command to request associating your dataset with the benchmark:
medperf dataset associate --benchmark_uid 1 --data_uid 1\n
This command will first run the benchmark's reference model on your dataset to ensure your dataset is compatible with the benchmark workflow. Then, the association request information is printed on the screen, which includes an executive summary of the test mentioned. You will be prompted to confirm sending this information and initiating this association request.
"},{"location":"getting_started/data_owner_demo/#how-to-proceed-after-requesting-association","title":"How to proceed after requesting association","text":" When participating with a real benchmark, you must wait for the Benchmark Committee to approve the association request. You can check the status of your association requests by running medperf association ls
. The association is identified by the server UIDs of your dataset and the benchmark with which you are requesting association.
For the sake of continuing the tutorial only, run the following to simulate the benchmark committee approving your association (make sure you are in the MedPerf's root directory):
sh tutorials_scripts/simulate_data_association_approval.sh\n
You can verify if your association request has been approved by running medperf association ls
.
MedPerf provides a command that runs all the models of a benchmark effortlessly. You only need to provide two parameters:
The benchmark ID you want to run, which is 1.
The server UID of your dataset, which is 1.
For that, run the following command:
medperf benchmark run --benchmark 1 --data_uid 1\n
After running the command, you will receive a summary of the executions. You will see something similar to the following:
model local result UID partial result from cache error\n------- ------------------ ---------------- ------------ -------\n 2 b1m2d1 False True\n 4 b1m4d1 False False\nTotal number of models: 2\n 1 were skipped (already executed), of which 0 have partial results\n 0 failed\n 1 ran successfully, of which 0 have partial results\n\n\u2705 Done!\n
This means that the benchmark has two models:
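Model 2, which was skipped because it had already been executed; its result (local result UID b1m2d1) was taken from the cache.
Model 4, which ran now and whose result is stored locally with UID b1m4d1.
You can view the results by running the following command with the specific local result UID. For example: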
medperf result view b1m4d1\n
For now, your results are only local. Next, you will learn how to submit the results.
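If you save the execution summary to a file, the local result UIDs can be pulled out of it with a short awk filter. This is only a sketch: it assumes the UIDs follow the b<benchmark>m<model>d<dataset> pattern seen in the sample output, and the function name list_result_uids is hypothetical.

```shell
# List local result UIDs found in a run summary (read from files or stdin).
# Assumes UIDs look like b1m4d1, as in the sample output of this tutorial.
list_result_uids() {
    awk '$2 ~ /^b[0-9]+m[0-9]+d[0-9]+$/ { print $2 }' "$@"
}

# Example using the summary shown in this tutorial, saved to a temporary file.
summary_file="$(mktemp)"
cat > "$summary_file" <<'EOF'
  model  local result UID    partial result    from cache    error
-------  ------------------  ----------------  ------------  -------
      2  b1m2d1              False             True
      4  b1m4d1              False             False
EOF
list_result_uids "$summary_file"
rm -f "$summary_file"
```

Each printed UID can then be passed to medperf result view.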
"},{"location":"getting_started/data_owner_demo/#6-submit-a-result","title":"6. Submit a Result","text":"After executing the benchmark, you will submit a result to the MedPerf server. To do so, you need the generated local UID of the result you want to submit.
As an example, you will be submitting the result of UID b1m4d1
. To do this, run the following command:
medperf result submit --result b1m4d1\n
The information that is going to be submitted will be printed to the screen and you will be prompted to confirm that you want to submit.
"},{"location":"getting_started/data_owner_demo/#cleanup-optional","title":"Cleanup (Optional)","text":"You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
To shut down the local MedPerf server: press CTRL
+C
in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
rm -fr medperf_tutorial\n
cd server\nsh reset_db.sh\n
rm -fr ~/.medperf/localhost_8000\n
"},{"location":"getting_started/installation/","title":"Installation","text":""},{"location":"getting_started/installation/#prerequisites","title":"Prerequisites","text":""},{"location":"getting_started/installation/#python","title":"Python","text":"Make sure you have Python 3.9 installed along with pip. To check if they are installed, run:
python --version\npip --version\n
or, depending on your machine configuration:
python3 --version\npip3 --version\n
We will assume the commands' names are pip
and python
. Use pip3
and python3
if your machine is configured differently.
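The version check can also be scripted. The sketch below tries both interpreter names and reports whether each one is Python 3.9; the helper name is_python_39 is hypothetical and only inspects the text printed by --version.

```shell
# Succeeds when the given version string reports Python 3.9.x.
# Hypothetical helper; not part of MedPerf.
is_python_39() {
    case "$1" in
        "Python 3.9."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Try both common interpreter names and report what each one provides.
for candidate in python python3; do
    if command -v "$candidate" >/dev/null 2>&1; then
        version="$("$candidate" --version 2>&1)"
        if is_python_39 "$version"; then
            echo "$candidate: $version (OK)"
        else
            echo "$candidate: $version (expected Python 3.9)"
        fi
    fi
done
```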
Make sure you have the latest version of Docker or Singularity 3.10 installed.
To verify docker is installed, run:
docker --version\n
To verify singularity is installed, run:
singularity --version\n
If using Docker, make sure you can run Docker as a non-root user.
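A quick way to see which container runner is usable is sketched below. It is a rough check under stated assumptions: Docker is accepted only if docker info succeeds for the current user (which suggests, but does not prove, that non-root access works), and the function name detect_container_runner is hypothetical.

```shell
# Print which container runner looks usable: docker, singularity, or none.
# Hypothetical helper; prefers Docker when "docker info" works for this user.
detect_container_runner() {
    if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
        echo "docker"
    elif command -v singularity >/dev/null 2>&1; then
        echo "singularity"
    else
        echo "none"
    fi
}

echo "Container runner: $(detect_container_runner)"
```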
"},{"location":"getting_started/installation/#install-medperf","title":"Install MedPerf","text":"(Optional) It is better to install MedPerf in a virtual environment. We recommend using Anaconda. With Anaconda installed, create a virtual environment medperf-env
with the following command:
conda create -n medperf-env python=3.9\n
Then, activate your environment:
conda activate medperf-env\n
Clone the MedPerf repository:
git clone https://github.com/mlcommons/medperf.git\ncd medperf\n
Install MedPerf from source:
pip install -e ./cli\n
Verify the installation:
medperf --version\n
In this guide, you will learn how a Model Owner can use MedPerf to take part in a benchmark. It's highly recommended that you follow this or this guide first to implement your own model MLCube and use it throughout this tutorial. However, this guide provides an already implemented MLCube if you want to directly proceed to learn how to interact with MedPerf.
The main tasks of this guide are:
It's assumed that you have already set up the general testing environment as explained in the setup guide.
"},{"location":"getting_started/model_owner_demo/#before-you-start","title":"Before You Start","text":""},{"location":"getting_started/model_owner_demo/#first-steps","title":"First steps","text":""},{"location":"getting_started/model_owner_demo/#running-in-cloud-via-github-codespaces","title":"Running in cloud via Github Codespaces","text":"The easiest way to try the tutorials is to launch a preinstalled Codespace cloud environment for MedPerf by clicking this link:
"},{"location":"getting_started/model_owner_demo/#running-in-local-environment","title":"Running in local environment","text":"To start experimenting with MedPerf through this tutorial on your local machine, you need to start by following these quick steps:
For the purpose of the tutorial, you have to initialize a local MedPerf server with a fresh database and then create the necessary entities that you will be interacting with. To do so, run the following: (make sure you are in MedPerf's root folder)
cd server\nsh reset_db.sh\npython seed.py --demo model\ncd ..\n
"},{"location":"getting_started/model_owner_demo/#download-the-necessary-files","title":"Download the Necessary files","text":"A script is provided to download all the necessary files so that you follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
sh tutorials_scripts/setup_model_tutorial.sh\n
This will create a workspace folder medperf_tutorial
where all necessary files are downloaded. The folder contains the following content:
In a real scenario, you would have to create all the listed artifacts and files yourself. For the tutorial's sake, however, you may use this toy data.
"},{"location":"getting_started/model_owner_demo/#model-mlcube","title":"Model MLCube","text":"The medperf_tutorial/model_mobilenetv2/
is a toy Model MLCube. Once you submit your model to the benchmark, all participating Data Owners would be able to run the model within the benchmark pipeline. Therefore, your MLCube must support the specific input/output formats defined by the Benchmark Owners.
For the purposes of this tutorial, you will work with a pre-prepared toy benchmark. In a real-world scenario, you should refer to your Benchmark Owner to get the format specifications and details for your practical case.
"},{"location":"getting_started/model_owner_demo/#login-to-the-local-medperf-server","title":"Login to the Local MedPerf Server","text":"The local MedPerf server is pre-configured with a dummy local authentication system. Remember that when you are communicating with the real MedPerf server, you should follow the steps in this guide to log in. For the tutorials, no login steps are required.
You are now ready to start!
"},{"location":"getting_started/model_owner_demo/#1-test-your-mlcube-compatibility","title":"1. Test your MLCube Compatibility","text":"Before submitting your MLCube, it is highly recommended that you test your MLCube compatibility with the benchmarks of interest to avoid later edits and multiple submissions. Your MLCube should be compatible with the benchmark workflow in two main ways:
These details should usually be acquired by contacting the Benchmark Committee and following their instructions.
To test your MLCube validity with the benchmark, first run medperf benchmark ls
to identify the benchmark's server UID. In this case, it is going to be 1
.
Next, locate the MLCube. Unless you implemented your own MLCube, the MLCube provided for this tutorial is located in your workspace: medperf_tutorial/model_mobilenetv2/mlcube/mlcube.yaml
.
After that, run the compatibility test:
medperf test run \\\n--benchmark 1 \\\n--model \"medperf_tutorial/model_mobilenetv2/mlcube/mlcube.yaml\"\n
Assuming the test passes successfully, you are ready to submit the MLCube to the MedPerf server.
"},{"location":"getting_started/model_owner_demo/#2-submit-the-mlcube","title":"2. Submit the MLCube","text":""},{"location":"getting_started/model_owner_demo/#how-does-medperf-recognize-an-mlcube","title":"How does MedPerf Recognize an MLCube?","text":"The MedPerf server registers an MLCube as metadata comprised of a set of files that can be retrieved from the internet. This means that before submitting an MLCube you have to host its files on the internet. The MedPerf client provides a utility to prepare the files of an MLCube that need to be hosted. You can refer to this page if you want to understand what the files are, but using the utility script is enough.
To prepare the files of the MLCube, run the following command ensuring you are in MedPerf's root folder:
python scripts/package-mlcube.py --mlcube medperf_tutorial/model_mobilenetv2/mlcube --mlcube-types model\n
This script will create a new folder in the MLCube directory, named assets
, containing all the files that should be hosted separately.
For the tutorial to run smoothly, the files are already hosted. If you wish to host them by yourself, you can find the list of supported options and details about hosting files in this page.
"},{"location":"getting_started/model_owner_demo/#submit-the-mlcube","title":"Submit the MLCube","text":"The submission should include the URLs of all the hosted files. For the MLCube provided for the tutorial:
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/mlcube.yaml\n
https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/workspace/parameters.yaml\n
https://storage.googleapis.com/medperf-storage/chestxray_tutorial/mobilenetv2_weights.tar.gz\n
Use the following command to submit:
medperf mlcube submit \\\n--name my-model-cube \\\n--mlcube-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/mlcube.yaml\" \\\n--parameters-file \"https://raw.githubusercontent.com/mlcommons/medperf/main/examples/chestxray_tutorial/model_mobilenetv2/mlcube/workspace/parameters.yaml\" \\\n--additional-file \"https://storage.googleapis.com/medperf-storage/chestxray_tutorial/mobilenetv2_weights.tar.gz\" \\\n--operational\n
The MLCube will be assigned a server UID. You can check it by running:
medperf mlcube ls --mine\n
"},{"location":"getting_started/model_owner_demo/#3-request-participation","title":"3. Request Participation","text":"Benchmark workflows are run by Data Owners, who will get notified when a new model is added to a benchmark. You must request the association for your model to be part of the benchmark.
To initiate an association request, you need to collect the following information:
The target benchmark ID, which is 1.
The server UID of your MLCube, which is 4.
Run the following command to request associating your MLCube with the benchmark:
medperf mlcube associate --benchmark 1 --model_uid 4\n
This command will first run the benchmark's workflow on your model to ensure your model is compatible with the benchmark workflow. Then, the association request information is printed on the screen, which includes an executive summary of the test mentioned. You will be prompted to confirm sending this information and initiating this association request.
"},{"location":"getting_started/model_owner_demo/#what-happens-after-requesting-the-association","title":"What Happens After Requesting the Association?","text":" When participating with a real benchmark, you must wait for the Benchmark Committee to approve the association request. You can check the status of your association requests by running medperf association ls
. The association is identified by the server UIDs of your MLCube and the benchmark with which you are requesting association.
You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
To shut down the local MedPerf server: press CTRL
+C
in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
rm -fr medperf_tutorial\n
cd server\nsh reset_db.sh\n
rm -fr ~/.medperf/localhost_8000\n
"},{"location":"getting_started/overview/","title":"Overview","text":"The MedPerf client provides all the necessary tools to run a complete benchmark experiment. Below, you will find a comprehensive breakdown of user roles and the corresponding functionalities they can access and perform using the MedPerf client:
This setup is only for running the tutorials. If you are using MedPerf with a real benchmark and real experiments, skip to this section to optionally change your container runner. Then, follow the tutorials as general guidance for your real experiments.
"},{"location":"getting_started/setup/#install-the-medperf-client","title":"Install the MedPerf Client","text":"If this is your first time using MedPerf, install the MedPerf client library as described here.
"},{"location":"getting_started/setup/#run-a-local-medperf-server","title":"Run a Local MedPerf Server","text":"For this tutorial, you should spawn a local MedPerf server for the MedPerf client to communicate with. Note that this server will be hosted on your localhost
and not on the internet.
Install the server requirements ensuring you are in MedPerf's root folder:
pip install -r server/requirements.txt\npip install -r server/test-requirements.txt\n
Run the local MedPerf server using the following command:
cd server\ncp .env.local.local-auth .env\nsh setup-dev-server.sh\n
The local MedPerf server is now ready to receive requests. You can always stop the server by pressing CTRL
+C
in the terminal where you ran the server.
After that, you will be configuring the MedPerf client to communicate with the local MedPerf server. Make sure you continue following the instructions in a new terminal.
"},{"location":"getting_started/setup/#configure-the-medperf-client","title":"Configure the MedPerf Client","text":"The MedPerf client can be configured by creating or modifying \"profiles
\". A profile is a set of configuration parameters used by the client during runtime. By default, the profile named default
will be active.
The default
profile is preconfigured so that the client communicates with the main MedPerf server (api.medperf.org). For the purposes of the tutorial, you will be using the local
profile as it is preconfigured so that the client communicates with the local MedPerf server.
To activate the local
profile, run the following command:
medperf profile activate local\n
You can always check which profile is active by running:
medperf profile ls\n
To view the current active profile's configured parameters, you can run the following:
medperf profile view\n
"},{"location":"getting_started/setup/#choose-the-container-runner","title":"Choose the Container Runner","text":"You can configure the MedPerf client to use either Docker or Singularity. The local
profile is configured to use Docker. If you want to use MedPerf with Singularity, modify the local
profile configured parameters by running the following:
medperf profile set --platform singularity\n
This command will modify the platform
parameter of the currently activated profile.
The local MedPerf server is now ready to receive requests, and the MedPerf client is ready to communicate. Depending on your role, you can follow these hands-on tutorials:
How a benchmark committee can create and submit a benchmark.
How a model owner can submit a model.
How a data owner can prepare their data and execute a benchmark.
MedPerf uses passwordless authentication. This means that there is no need for a password; you only need access to your email in order to complete the signup process.
Automatic signups are currently disabled. Please contact the MedPerf team in order to provision an account.
Tip
You don't need an account to run the tutorials and learn how to use the MedPerf client.
"},{"location":"getting_started/signup/#whats-next","title":"What's Next?","text":"The tutorials simulate a benchmarking example for the task of detecting thoracic diseases from chest X-ray scans. You can find the description of the used data here. Throughout the tutorials, you will be interacting with a temporary local MedPerf server as described in the setup page. This allows you to freely experiment with the MedPerf client and rerun the tutorials as many times as you want, providing you with an immersive learning experience. Please note that these tutorials also serve as general guidance to be followed when using the MedPerf client in a real scenario.
Before proceeding to the tutorials, make sure you have the general tutorial environment set up.
To ensure users have the best experience in learning the fundamentals of MedPerf and utilizing the MedPerf client, the following set of tutorials is provided:
Benchmark Committee: Click here to see the documentation specifically for benchmark owners.
Model Owner: Click here to see the documentation specifically for model owners.
Data Owner: Click here to see the documentation specifically for data owners.
"},{"location":"getting_started/shared/cleanup/","title":"Cleanup","text":""},{"location":"getting_started/shared/cleanup/#cleanup-optional","title":"Cleanup (Optional)","text":"You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
To shut down the local MedPerf server: press CTRL
+C
in the terminal where the server is running.
To clean up the downloaded files workspace (make sure you are in MedPerf's root directory):
rm -fr medperf_tutorial\n
cd server\nsh reset_db.sh\n
rm -fr ~/.medperf/localhost_8000\n
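The cleanup commands above can be gathered into one script. The sketch below adds a dry-run switch, which is an assumption of this example and not a MedPerf feature: with DRY_RUN=1 (the default here) it only prints what it would do. The names run and cleanup_tutorial are hypothetical.

```shell
# Tutorial cleanup in one place. DRY_RUN=1 (the default here) only prints the
# commands; set DRY_RUN=0 to actually run them from MedPerf's root folder.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

cleanup_tutorial() {
    # Remove the downloaded tutorial workspace.
    run rm -fr medperf_tutorial
    # reset_db.sh is run from inside the server folder, as in the tutorial.
    if [ -d server ]; then
        ( cd server && run sh reset_db.sh )
    else
        echo "server folder not found; skipping database reset"
    fi
    # Remove the client storage used for the local server.
    run rm -fr "$HOME/.medperf/localhost_8000"
}

cleanup_tutorial
```

Review the printed commands first, then rerun with DRY_RUN=0 once they look right.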
"},{"location":"getting_started/shared/mlcube_submission_overview/","title":"Mlcube submission overview","text":"The MedPerf server registers an MLCube as metadata comprised of a set of files that can be retrieved from the internet. This means that before submitting an MLCube you have to host its files on the internet. The MedPerf client provides a utility to prepare the files of an MLCube that need to be hosted. You can refer to this page if you want to understand what the files are, but using the utility script is enough.
"},{"location":"getting_started/shared/redirect_to_hosting_files/","title":"Redirect to hosting files","text":""},{"location":"getting_started/shared/redirect_to_hosting_files/#host-the-files","title":"Host the Files","text":"For the tutorial to run smoothly, the files are already hosted. If you wish to host them by yourself, you can find the list of supported options and details about hosting files in this page.
"},{"location":"getting_started/shared/tutorials_content_overview/benchmark/","title":"Benchmark","text":"In this tutorial we will create a benchmark that classifies chest X-Ray images.
"},{"location":"getting_started/shared/tutorials_content_overview/benchmark/#demo-data","title":"Demo Data","text":"The medperf_tutorial/demo_data/
folder contains the demo dataset content.
The images/ folder includes sample images.
The labels/labels.csv file provides a basic ground truth markup, indicating the class each image belongs to.
The demo dataset is a sample dataset used for the development of your benchmark and used by Model Owners for the development of their models. More details are available in the section below.
"},{"location":"getting_started/shared/tutorials_content_overview/benchmark/#data-preparator-mlcube","title":"Data Preparator MLCube","text":"The medperf_tutorial/data_preparator/
contains a DataPreparator MLCube that you must implement. This MLCube:
- Transforms raw data into a format convenient for model consumption, such as converting DICOM images into numpy tensors, cropping patches, normalizing columns, etc. It's up to you to define the format that is handy for future models.
- Ensures its output is in a standardized format, allowing Model Owners/Developers to rely on its consistency.
The medperf_tutorial/model_custom_cnn/
is an example of a Model MLCube. You need to implement a reference model which will be used by data owners to test the compatibility of their data with your pipeline. Also, Model Developers joining your benchmark will follow the input/output specifications of this model when building their own models.
The medperf_tutorial/metrics/
houses a Metrics MLCube that processes ground truth data, model predictions, and computes performance metrics - such as classification accuracy, loss, etc. After a Dataset Owner runs the benchmark pipeline on their data, these final metric values will be shared with you as the Benchmark Owner.
The medperf_tutorial/sample_raw_data/
folder contains your data for the specified Benchmark. In this tutorial, where the benchmark involves classifying chest X-Ray images, your data comprises:
The images/ folder, which contains your images.
The labels/labels.csv file, which provides the ground truth markup, specifying the class of each image.
The format of this data is dictated by the Benchmark Owner, as it must be compatible with the benchmark's Data Preparation MLCube. In a real-world scenario, the expected data format would differ from this toy example. Refer to the Benchmark Owner to get the format specifications and details for your practical case.
As previously mentioned, your data itself never leaves your machine. During the dataset submission, only basic metadata is transferred, for which you will be prompted to confirm.
"},{"location":"getting_started/shared/tutorials_content_overview/model/","title":"Model","text":""},{"location":"getting_started/shared/tutorials_content_overview/model/#model-mlcube","title":"Model MLCube","text":"The medperf_tutorial/model_mobilenetv2/
is a toy Model MLCube. Once you submit your model to the benchmark, all participating Data Owners would be able to run the model within the benchmark pipeline. Therefore, your MLCube must support the specific input/output formats defined by the Benchmark Owners.
For the purposes of this tutorial, you will work with a pre-prepared toy benchmark. In a real-world scenario, you should refer to your Benchmark Owner to get the format specifications and details for your practical case.
"},{"location":"mlcubes/gandlf_mlcube/","title":"Creating a GaNDLF MLCube","text":""},{"location":"mlcubes/gandlf_mlcube/#overview","title":"Overview","text":"This guide will walk you through how to wrap a model trained using GaNDLF as a MedPerf-compatible MLCube ready to be used for inference (i.e. as a Model MLCube). The steps can be summarized as follows:
Before proceeding, make sure you have medperf installed and GaNDLF installed.
"},{"location":"mlcubes/gandlf_mlcube/#before-we-start","title":"Before We Start","text":""},{"location":"mlcubes/gandlf_mlcube/#download-the-necessary-files","title":"Download the Necessary files","text":"A script is provided to download all the necessary files so that you follow the tutorial smoothly. Run the following: (make sure you are in MedPerf's root folder)
sh tutorials_scripts/setup_GaNDLF_mlcube_tutorial.sh\n
This will create a workspace folder medperf_tutorial
where all necessary files are downloaded. Run cd medperf_tutorial
to switch to this folder.
Train a small GaNDLF model to use for this guide. You can skip this step if you already have a trained model.
Make sure you are in the workspace folder medperf_tutorial
. Run:
gandlf_run \\\n-c ./config_getting_started_segmentation_rad3d.yaml \\\n-i ./data.csv \\\n-m ./trained_model_output \\\n-t True \\\n-d cpu\n
Note that if you want to train on GPU you can use -d cuda
, but the example used here should take only a few seconds using the CPU.
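If you are unsure which value to pass to -d, the choice can be made from whether an NVIDIA GPU is visible. This is a rough heuristic and not a GaNDLF feature: it assumes that a working nvidia-smi implies CUDA is usable, and the function name pick_gandlf_device is hypothetical.

```shell
# Print the value to pass to "gandlf_run -d": "cuda" when nvidia-smi can list
# a GPU, otherwise "cpu". Heuristic only; GaNDLF itself is not consulted.
pick_gandlf_device() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo "cuda"
    else
        echo "cpu"
    fi
}

echo "Suggested device: $(pick_gandlf_device)"
```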
Warning
This tutorial assumes the user is using the latest GaNDLF version. The configuration file config_getting_started_segmentation_rad3d.yaml
may cause problems if you are using a different version; in that case, make the necessary changes to it.
You will now have your trained model and its related files in the folder trained_model_output
. Next, you will start learning how to wrap this trained model within an MLCube.
MedPerf provides a cookiecutter to create an MLCube file that is ready to be consumed by gandlf_deploy
and produces an MLCube ready to be used by MedPerf. To create the MLCube, run: (make sure you are in the workspace folder medperf_tutorial
)
medperf mlcube create gandlf\n
Note
MedPerf is running CookieCutter under the hood. This medperf command provides additional arguments for handling different scenarios. You can see more information on this by running medperf mlcube create --help
You will be prompted to customize the MLCube creation. Below is an example of what your responses might look like:
project_name [GaNDLF MLCube]: My GaNDLF MLCube\nproject_slug [my_gandlf_mlcube]: my_gandlf_mlcube\ndescription [GaNDLF MLCube Template. Provided by MLCommons]: GaNDLF MLCube implementation\nauthor_name [John Smith]: John Smith\naccelerator_count [1]: 0\ndocker_build_file [Dockerfile-CUDA11.6]: Dockerfile-CPU\ndocker_image_name [docker/image:latest]: johnsmith/gandlf_model:0.0.1\n
Assuming you chose my_gandlf_mlcube
as the project slug, you will find your MLCube created under the folder my_gandlf_mlcube
. Next, you will use a GaNDLF
utility to build the MLCube.
Note
You might need to specify additional configurations in the mlcube.yaml
file if you are using a GPU. Check the generated mlcube.yaml
file for more info, as well as the MLCube documentation.
When deploying the GaNDLF model directly as a model MLCube, the default entrypoint will be gandlf_run ...
. You can override the entrypoint with a custom Python script. One of the use cases is described below.
gandlf_run
expects a data.csv
file in the input data folder, which describes the inference test cases and their associated paths (read more about GaNDLF's csv file conventions here). If your MLCube expects a data folder with a predefined input structure but without this csv file, you can use a custom script that prepares this csv file as an entrypoint. You can find the recommended template and an example here.
To deploy the GaNDLF model as an MLCube, run the following: (make sure you are in the workspace folder medperf_tutorial
)
gandlf_deploy \\\n-c ./config_getting_started_segmentation_rad3d.yaml \\\n-m ./trained_model_output \\\n--target docker \\\n--mlcube-root ./my_gandlf_mlcube \\\n-o ./built_gandlf_mlcube \\\n--mlcube-type model \\\n--entrypoint <(optional) path to your custom entrypoint script> \\\n-g False\n
True
if you want the resulting MLCube to use a GPU for inference.
GaNDLF will use your initial MLCube configuration my_gandlf_mlcube
, the GaNDLF experiment configuration file config_getting_started_segmentation_rad3d.yaml
, and the trained model trained_model_output
to create a ready MLCube built_gandlf_mlcube
and build the docker image that will be used by the MLCube. The docker image will have the model weights and the GaNDLF experiment configuration file embedded. You can check that your image was built by running docker image ls
. You will see johnsmith/gandlf_model:0.0.1
(or whatever image name was used) created moments ago.
That's it! You have built a MedPerf-compatible MLCube with GaNDLF. If you want to submit your MLCube to MedPerf, you can follow this tutorial.
Tip
MLCubes created by GaNDLF have the model weights and configuration file embedded in the docker image. When you want to deploy your MLCube for MedPerf, all you need to do is push the docker image and host the mlcube.yaml file.
"},{"location":"mlcubes/gandlf_mlcube/#cleanup-optional","title":"Cleanup (Optional)","text":"You have reached the end of the tutorial! If you are planning to rerun any of the tutorials, don't forget to clean up:
rm -fr medperf_tutorial\n
"},{"location":"mlcubes/gandlf_mlcube/#see-also","title":"See Also","text":"TODO: Change the structure to align with mlcube_models, to help users wrap their existing code into mlcube
"},{"location":"mlcubes/mlcube_data_WIP/#data-preparator-mlcube","title":"Data Preparator MLCube","text":""},{"location":"mlcubes/mlcube_data_WIP/#introduction","title":"Introduction","text":"This guide is one of three designed to assist users in building MedPerf-compatible MLCubes. The other two guides focus on creating a Model MLCube and a Metrics MLCube. Together, these three MLCubes form a complete benchmark workflow for the task of thoracic disease detection from Chest X-rays.
In summary, a functional MedPerf pipeline includes these steps:
my_raw_data/
. If the pipeline is run by another person (Model Owner/Benchmark Owner), a predefined my_benchmark_demo_raw_data/
would be used instead (created and distributed by the Benchmark Owner).
my_prepared_dataset/
folder (MLCube is implemented by the Benchmark Owner).
my_model_predictions/
folder (MLCube is implemented by the Model Owner; the Benchmark Owner must implement a baseline model MLCube to be used as a mock-up).
my_metrics.yaml
file (MLCube implemented by the Benchmark Owner).
The aforementioned guides detail steps 2-4. As all steps demonstrate building specific MLCubes, we recommend starting with the Model MLCube guide, which offers a more detailed explanation of the MLCube's concept and structure. Another option is to explore the MLCube basic docs. This guide provides a shortened description of the concepts, focusing on nuances and input/output parameters.
"},{"location":"mlcubes/mlcube_data_WIP/#about-this-guide","title":"About this Guide","text":"This guide describes the tasks, structure and input/output parameters of the Data Preparator MLCube, so that by the end users can implement their own MedPerf-compatible MLCube for Benchmark purposes.
The guide starts with general advice, steps, and the required API for building these MLCubes. Subsequently, it will lead you through creating your MLCube using the Chest X-ray Data Preprocessor MLCube as a practical example.
It's considered best practice to handle data in various formats. For instance, if the benchmark involves image processing, it's beneficial to support JPEGs, PNGs, BMPs, and other expected image formats; accommodate large and small images, etc. Such flexibility simplifies the process for Dataset Owners, allowing them to export data in their preferred format. The Data Preparator's role is to convert all reasonable input data into a unified format.
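As a small illustration of accepting several input formats, the sketch below collects every supported image file under the raw data folder, regardless of extension case or subfolder depth. The extension set and the function name find_images are hypothetical; adjust them to whatever formats your benchmark expects.

```python
from pathlib import Path

# Hypothetical set of accepted extensions; extend it to match your benchmark.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_images(data_path: str) -> list:
    """Return all supported image files under data_path, sorted by path."""
    root = Path(data_path)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

The prepare task could then convert each file returned here into the single unified format used by the rest of the pipeline.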
"},{"location":"mlcubes/mlcube_data_WIP/#before-building-the-mlcube","title":"Before Building the MLCube","text":"Your MLCube must implement three command tasks:
prepare
: your main task that transforms raw input data into a unified format.
sanity_check
: verifies the cleanliness and consistency of prepare
outputs (e.g., ensuring no records lack ground truth labels, labels contain only expected values, data fields are reasonable without outliers or NaNs, etc.).
statistics
: calculates some aggregated statistics on the transformed dataset. Once the Dataset Owner submits their dataset, these statistics will be uploaded to you as the Benchmark Owner.
It's assumed that you already have:
This guide will help you encapsulate your preparation code within an MLCube. Make sure you have extracted each part of your logic so that it can be run independently.
"},{"location":"mlcubes/mlcube_data_WIP/#required-api","title":"Required API","text":"Each command execution receives specific parameters. While you are free to implement the code as you see fit, keep in mind that your implementation will receive the following input arguments:
"},{"location":"mlcubes/mlcube_data_WIP/#data-preparation-api-prepare-command","title":"Data Preparation API (prepare
Command)","text":"The parameters include:
- data_path: the path to the raw data folder (read-only).
- labels_path: the path to the ground truth labels folder (read-only).
- Any other optional extra params that you attach to the MLCube, such as the path to a .txt file with acceptable labels. Note: these extra parameters contain values defined by you, the MLCube owner, not the users' data.
- output_path: an r/w folder for storing transformed dataset objects.
- output_labels_path: an r/w folder for storing transformed labels.
sanity_check
Command)","text":"The parameters include:
- data_path: the path to the transformed data folder (read-only).
- labels_path: the path to the transformed ground truth labels folder (read-only).
- Any other optional extra params that you attach to the MLCube - same as for the prepare command.
The sanity check does not produce outputs; it either completes successfully or fails.
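A minimal sketch of this pass-or-fail behaviour is shown below: the function returns nothing when the labels look fine and raises an exception otherwise, which makes the task fail. The labels file layout (a CSV with a label column), the function name check_labels, and the accepted label set are assumptions for illustration only.

```python
import csv


def check_labels(labels_csv: str, accepted_labels: set) -> None:
    """Raise ValueError if any record lacks a label or has an unexpected one."""
    with open(labels_csv, newline="") as f:
        # Data rows start on line 2 because line 1 is the header.
        for line_no, row in enumerate(csv.DictReader(f), start=2):
            label = (row.get("label") or "").strip()
            if not label:
                raise ValueError(f"line {line_no}: missing ground truth label")
            if label not in accepted_labels:
                raise ValueError(f"line {line_no}: unexpected label {label!r}")
```

An uncaught exception makes the Python process exit with a non-zero status, so no extra output is needed to signal failure.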
"},{"location":"mlcubes/mlcube_data_WIP/#statistics-api-statistics-command","title":"Statistics API (statistics
Command)","text":"- data_path: the path to the transformed data folder (read-only).
- labels_path: the path to the transformed ground truth labels folder (read-only).
- Any other optional extra params that you attach to the MLCube - same as for the prepare command.
- output_path: the path to the .yaml file where your code should write the calculated statistics.
While this guide leads you through creating your own MLCube, you can always check a prebuilt example for a better understanding of how it works in an already implemented MLCube. The example is available here:
cd examples/chestxray_tutorial/data_preparator/\n
The guide uses this implementation to describe concepts.
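To make the statistics task more concrete, here is a hedged sketch that counts records per label and writes the result to the output .yaml file. It is not the tutorial's implementation: the labels file layout (a CSV with a label column) and the function name write_label_statistics are hypothetical, and the YAML is written by hand on the assumption that labels are plain words. With PyYAML available, yaml.safe_dump would be the more robust choice.

```python
import csv
from collections import Counter


def write_label_statistics(labels_csv: str, output_path: str) -> dict:
    """Count records per label and store the counts as a small YAML file."""
    with open(labels_csv, newline="") as f:
        counts = Counter(row["label"] for row in csv.DictReader(f))

    stats = {
        "num_records": sum(counts.values()),
        "label_counts": dict(sorted(counts.items())),
    }

    # Hand-written YAML; assumes labels contain no special YAML characters.
    with open(output_path, "w") as f:
        f.write(f"num_records: {stats['num_records']}\n")
        f.write("label_counts:\n")
        for label, count in stats["label_counts"].items():
            f.write(f"  {label}: {count}\n")
    return stats
```

Only aggregated values are written, which matches the intent that statistics are shared with the Benchmark Owner.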
"},{"location":"mlcubes/mlcube_data_WIP/#use-an-mlcube-template","title":"Use an MLCube Template","text":"First, ensure you have MedPerf installed. Create a Data Preparator MLCube template by running the following command:
medperf mlcube create data_preparator\n
You will be prompted to fill in some configuration options through the CLI. Below are the options and their default values:
project_name [Data Preparator MLCube]: # (1)!\nproject_slug [data_preparator_mlcube]: # (2)!\ndescription [Data Preparator MLCube Template. Provided by MLCommons]: # (3)!\nauthor_name [John Smith]: # (4)!\naccelerator_count [0]: # (5)!\ndocker_image_name [docker/image:latest]: # (6)!\n
After filling the configuration options, the following directory structure will be generated:
.\n\u2514\u2500\u2500 data_preparator_mlcube\n \u251c\u2500\u2500 mlcube\n \u2502 \u251c\u2500\u2500 mlcube.yaml\n \u2502 \u2514\u2500\u2500 workspace\n \u2502 \u2514\u2500\u2500 parameters.yaml\n \u2514\u2500\u2500 project\n \u251c\u2500\u2500 Dockerfile\n \u251c\u2500\u2500 mlcube.py\n \u2514\u2500\u2500 requirements.txt\n
"},{"location":"mlcubes/mlcube_data_WIP/#the-project-folder","title":"The project
Folder","text":"This is where your preprocessing logic will live. It contains a standard Docker image project with a specific API for the entrypoint. mlcube.py
contains the entrypoint and handles all the tasks we've described. Update this template with your code and bind your logic to the specified functions for all three commands. Refer to the Chest X-ray tutorial example to see how it should look:
\"\"\"MLCube handler file\"\"\"\nimport typer\nimport yaml\nfrom prepare import prepare_dataset\nfrom sanity_check import perform_sanity_checks\nfrom stats import generate_statistics\n\napp = typer.Typer()\n\n\n@app.command(\"prepare\")\ndef prepare(\n    data_path: str = typer.Option(..., \"--data_path\"),\n    labels_path: str = typer.Option(..., \"--labels_path\"),\n    parameters_file: str = typer.Option(..., \"--parameters_file\"),\n    output_path: str = typer.Option(..., \"--output_path\"),\n    output_labels_path: str = typer.Option(..., \"--output_labels_path\"),\n):\n    with open(parameters_file) as f:\n        parameters = yaml.safe_load(f)\n    prepare_dataset(data_path, labels_path, parameters, output_path, output_labels_path)\n\n\n@app.command(\"sanity_check\")\ndef sanity_check(\n    data_path: str = typer.Option(..., \"--data_path\"),\n    labels_path: str = typer.Option(..., \"--labels_path\"),\n    parameters_file: str = typer.Option(..., \"--parameters_file\"),\n):\n    with open(parameters_file) as f:\n        parameters = yaml.safe_load(f)\n    perform_sanity_checks(data_path, labels_path, parameters)\n\n\n@app.command(\"statistics\")\ndef statistics(\n    data_path: str = typer.Option(..., \"--data_path\"),\n    labels_path: str = typer.Option(..., \"--labels_path\"),\n    parameters_file: str = typer.Option(..., \"--parameters_file\"),\n    out_path: str = typer.Option(..., \"--output_path\"),\n):\n    with open(parameters_file) as f:\n        parameters = yaml.safe_load(f)\n    generate_statistics(data_path, labels_path, parameters, out_path)\n\n\nif __name__ == \"__main__\":\n    app()\n
"},{"location":"mlcubes/mlcube_data_WIP/#the-mlcube-folder","title":"The mlcube
Folder","text":"This folder is primarily for configuring your MLCube and providing additional files the MLCube may interact with, such as parameters or model weights.
"},{"location":"mlcubes/mlcube_data_WIP/#mlcubeyaml-mlcube-configuration","title":"mlcube.yaml
MLCube Configuration","text":"The mlcube/mlcube.yaml
file contains metadata and configuration of your mlcube. This file is already populated with the configuration you provided during the template creation step. There is no need to edit anything in this file except if you are specifying extra parameters to the commands (e.g., you want to pass sklearn's StandardScaler
weights or any other parameters required for data transformation).
name: Chest X-ray Data Preparator\ndescription: MedPerf Tutorial - Data Preparation MLCube.\nauthors:\n  - { name: MLCommons Medical Working Group }\n\nplatform:\n  accelerator_count: 0\n\ndocker:\n  # Image name\n  image: mlcommons/chestxray-tutorial-prep:0.0.0\n  # Docker build context relative to $MLCUBE_ROOT. Default is `build`.\n  build_context: \"../project\"\n  # Docker file name within docker build context, default is `Dockerfile`.\n  build_file: \"Dockerfile\"\n\ntasks:\n  prepare:\n    parameters:\n      inputs:\n        {\n          data_path: input_data,\n          labels_path: input_labels,\n          parameters_file: parameters.yaml,\n        }\n      outputs: { output_path: data/, output_labels_path: labels/ }\n  sanity_check:\n    parameters:\n      inputs:\n        {\n          data_path: data/,\n          labels_path: labels/,\n          parameters_file: parameters.yaml,\n        }\n  statistics:\n    parameters:\n      inputs:\n        {\n          data_path: data/,\n          labels_path: labels/,\n          parameters_file: parameters.yaml,\n        }\n      outputs: { output_path: { type: file, default: statistics.yaml } }\n
All paths are relative to mlcube/workspace/
folder.
To set up additional inputs, add a key-value pair in the task's inputs
dictionary:
...\nprepare:\n  parameters:\n    inputs:\n      {\n        data_path: input_data,\n        labels_path: input_labels,\n        parameters_file: parameters.yaml,\n        standardscaler_weights: additional_files/standardscaler.pkl\n      }\n    outputs: { output_path: data/, output_labels_path: labels/ }\n...\n
Considering the note about path locations, this new file should be stored at mlcube/workspace/additional_files/standardscaler.pkl
Your preprocessing logic might depend on certain parameters (e.g., which labels are accepted). It is generally better to pass such parameters when running the MLCube, rather than hardcoding them. This can be done via a parameters.yaml
file that is passed to the MLCube. This file will be available to the previously described commands (if you declare it in the inputs
dict of a specific command). You can parse this file in the mlcube.py
file and pass its contents to your logic.
This file should be placed in the mlcube/workspace
folder.
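One way to use the parsed parameters, sketched under assumptions: merge whatever was loaded from parameters.yaml over a set of defaults and reject unknown keys early, so that typos in the file surface before any data is touched. The parameter names accepted_labels and image_size and the function name resolve_parameters are hypothetical.

```python
# Hypothetical defaults; replace with the parameters your logic really uses.
DEFAULT_PARAMETERS = {"accepted_labels": ["0", "1"], "image_size": 224}


def resolve_parameters(loaded) -> dict:
    """Merge loaded parameters over the defaults, rejecting unknown keys."""
    # yaml.safe_load returns None for an empty file, so treat None as {}.
    loaded = loaded or {}
    unknown = set(loaded) - set(DEFAULT_PARAMETERS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULT_PARAMETERS, **loaded}
```

In mlcube.py, the dictionary returned by yaml.safe_load could be passed through such a function before calling your preparation logic.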
After you follow the previous sections and add your logic to the project, the MLCube is ready to be built and run. Run the command below to build the MLCube. Ensure you are in the mlcube/
subfolder of your Data Preparator.
mlcube configure -Pdocker.build_strategy=always\n
This command builds your Docker image and prepares the MLCube for use.
"},{"location":"mlcubes/mlcube_data_WIP/#run-your-mlcube","title":"Run Your MLCube","text":"MedPerf will take care of running your MLCube. However, it's recommended to test the MLCube alone before using it with MedPerf for better debugging.
To run the MLCube, use the command below. Ensure you are located in the mlcube/
subfolder of your Data Preparator.
mlcube run --task prepare data_path=<path_to_raw_data> \\\nlabels_path=<path_to_raw_labels> \\\noutput_path=<path_to_save_transformed_data> \\\noutput_labels_path=<path_to_save_transformed_labels>\n
Relative paths
Keep in mind that though we are running tasks from mlcube/
, all the paths should be absolute or relative to mlcube/workspace/
.
Default values
We have declared default values for every path parameter. This allows these parameters to be omitted in our commands.
Consider the following structure:
.\n\u2514\u2500\u2500 data_preparator_mlcube\n \u251c\u2500\u2500 mlcube\n \u2502 \u251c\u2500\u2500 mlcube.yaml\n \u2502 \u2514\u2500\u2500 workspace\n \u2502 \u2514\u2500\u2500 parameters.yaml\n \u2514\u2500\u2500 project\n \u2514\u2500\u2500 ...\n\u2514\u2500\u2500 my_data\n \u251c\u2500\u2500 data\n \u2502 \u251c\u2500\u2500 ...\n \u2514\u2500\u2500 labels\n \u2514\u2500\u2500 ...\n
Now, you can execute the commands below while located in data_preparator_mlcube/mlcube/
:
mlcube run --task prepare data_path=../../my_data/data/ labels_path=../../my_data/labels/\nmlcube run --task sanity_check\nmlcube run --task statistics output_path=../../my_data/statistics.yaml\n
Note that:
mlcube/workspace
rather than to the current working directory,
mlcube/workspace/data/
and others.
The provided example codebase runs only on CPU. You can modify it to pass a GPU into the Docker container if your code utilizes it.
The general instructions for building an MLCube to work with a GPU are the same as the provided instructions, but with the following slight modifications:
0
for the accelerator_count
that you will be prompted with when creating the MLCube template or modify platform.accelerator_count
value of mlcube.yaml
configuration.
docker
section of the mlcube.yaml
, add a key value pair: gpu_args: --gpus=all
. These gpu_args
will be passed to docker run
command by MLCube. You may add more than just --gpus=all
.
pip
dependencies in the requirements.txt
file to download pytorch
with cuda, or by changing the base image of the dockerfile.
TODO: Change the structure to align with mlcube_models, to help users wrap their existing code into mlcube
"},{"location":"mlcubes/mlcube_metrics_WIP/#metricsevaluator-mlcube","title":"Metrics/Evaluator MLCube","text":""},{"location":"mlcubes/mlcube_metrics_WIP/#introduction","title":"Introduction","text":"This guide is one of three designed to assist users in building MedPerf-compatible MLCubes. The other two guides focus on creating a Data Preparator MLCube and a Model MLCube. Together, these three MLCubes form a complete benchmark workflow for the task of thoracic disease detection from Chest X-rays.
In summary, a functional MedPerf pipeline includes these steps:
my_raw_data/
. If the pipeline is run by another person (Model Owner/Benchmark Owner), a predefined my_benchmark_demo_raw_data/
would be used instead (created and distributed by the Benchmark Owner).
my_prepared_dataset/
folder (MLCube is implemented by the Benchmark Owner).
my_model_predictions/
folder (MLCube is implemented by the Model Owner; the Benchmark Owner must implement a baseline model MLCube to be used as a mock-up).
my_metrics.yaml
file (MLCube implemented by the Benchmark Owner).
The aforementioned guides detail steps 2-4. As all steps demonstrate building specific MLCubes, we recommend starting with the Model MLCube guide, which offers a more detailed explanation of the MLCube's concept and structure. Another option is to explore the MLCube basic docs. This guide provides a shortened description of the concepts, focusing on nuances and input/output parameters.
"},{"location":"mlcubes/mlcube_metrics_WIP/#about-this-guide","title":"About this Guide","text":"This guide describes the tasks, structure and input/output parameters of the Metrics MLCube, so that by the end users can implement their own MedPerf-compatible MLCube for Benchmark purposes.
The guide starts with general advice, steps, and the required API for building these MLCubes. Subsequently, it will lead you through creating your MLCube using the Chest X-ray Metrics MLCube as a practical example.
Note: As the Dataset Owner would share the output of your metrics evaluation with you as Benchmark Owner, ensure that your metrics are not too specific and do not reveal any Personally Identifiable Information (PII) or other confidential data (including dataset statistics) - otherwise, no Dataset Owners would agree to participate in your benchmark.
"},{"location":"mlcubes/mlcube_metrics_WIP/#before-building-the-mlcube","title":"Before Building the MLCube","text":"Your MLCube must implement an evaluate
command that calculates your metrics.
It's assumed that you as Benchmark Owner already have:
my_model_predictions/
folder.
This guide will help you encapsulate your metrics code within an MLCube. Make sure you have extracted the metric calculation logic so that it can be executed independently.
"},{"location":"mlcubes/mlcube_metrics_WIP/#required-api","title":"Required API","text":"During execution, the evaluate
command will receive specific parameters. While you are free to implement the code as you see fit, keep in mind that your implementation will receive the following input arguments:
- predictions: the path to the folder containing your predictions (read-only).
- labels: the path to the folder containing transformed ground truth labels (read-only).
- output_path: the path to the .yaml file where your code should write the calculated metrics.
While this guide leads you through creating your own MLCube, you can always check a prebuilt example for a better understanding of how it works in an already implemented MLCube. The example is available here:
cd examples/chestxray_tutorial/metrics/\n
The guide uses this implementation to describe concepts.
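For orientation, the sketch below shows one possible shape of the metric logic behind the evaluate command: it reads predictions and labels, computes a single aggregate accuracy, and writes it to the output .yaml file. The file names predictions.csv and labels.csv, their columns, and the function name calculate_accuracy are assumptions for illustration and not the tutorial's actual implementation.

```python
import csv
import os


def _read_column(path: str, key: str, value: str) -> dict:
    """Load a CSV file into a {key column: value column} dictionary."""
    with open(path, newline="") as f:
        return {row[key]: row[value] for row in csv.DictReader(f)}


def calculate_accuracy(predictions_dir: str, labels_dir: str, output_path: str) -> float:
    """Compute accuracy over all labelled ids and write it as YAML."""
    preds = _read_column(os.path.join(predictions_dir, "predictions.csv"), "id", "prediction")
    labels = _read_column(os.path.join(labels_dir, "labels.csv"), "id", "label")
    if not labels:
        raise ValueError("no labels found")
    missing = set(labels) - set(preds)
    if missing:
        raise ValueError(f"{len(missing)} labelled records have no prediction")

    correct = sum(preds[i] == labels[i] for i in labels)
    accuracy = correct / len(labels)

    # Only the aggregate value is written; no per-record information leaves the site.
    with open(output_path, "w") as f:
        f.write(f"accuracy: {accuracy}\n")
    return accuracy
```

Writing only aggregated values keeps the output in line with the note above about not revealing confidential data.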
"},{"location":"mlcubes/mlcube_metrics_WIP/#use-an-mlcube-template","title":"Use an MLCube Template","text":"First, ensure you have MedPerf installed. Create a Metrics MLCube template by running the following command:
medperf mlcube create evaluator\n
You will be prompted to fill in some configuration options through the CLI. Below are the options and their default values:
project_name [Evaluator MLCube]: # (1)!\nproject_slug [evaluator_mlcube]: # (2)!\ndescription [Evaluator MLCube Template. Provided by MLCommons]: # (3)!\nauthor_name [John Smith]: # (4)!\naccelerator_count [0]: # (5)!\ndocker_image_name [docker/image:latest]: # (6)!\n
After filling the configuration options, the following directory structure will be generated:
.\n\u2514\u2500\u2500 evaluator_mlcube\n \u251c\u2500\u2500 mlcube\n \u2502 \u251c\u2500\u2500 mlcube.yaml\n \u2502 \u2514\u2500\u2500 workspace\n \u2502 \u2514\u2500\u2500 parameters.yaml\n \u2514\u2500\u2500 project\n \u251c\u2500\u2500 Dockerfile\n \u251c\u2500\u2500 mlcube.py\n \u2514\u2500\u2500 requirements.txt\n
"},{"location":"mlcubes/mlcube_metrics_WIP/#the-project-folder","title":"The project
Folder","text":"This is where your metrics logic will live. It contains a standard Docker image project with a specific API for the entrypoint. mlcube.py
contains the entrypoint and handles the evaluate
task. Update this template with your code and bind your logic to the specified command entry-point function. Refer to the Chest X-ray tutorial example to see how it should look:
\"\"\"MLCube handler file\"\"\"\nimport typer\nimport yaml\nfrom metrics import calculate_metrics\n\napp = typer.Typer()\n\n\n@app.command(\"evaluate\")\ndef evaluate(\n    labels: str = typer.Option(..., \"--labels\"),\n    predictions: str = typer.Option(..., \"--predictions\"),\n    parameters_file: str = typer.Option(..., \"--parameters_file\"),\n    output_path: str = typer.Option(..., \"--output_path\"),\n):\n    with open(parameters_file) as f:\n        parameters = yaml.safe_load(f)\n    calculate_metrics(labels, predictions, parameters, output_path)\n\n\n@app.command(\"hotfix\")\ndef hotfix():\n    # NOOP command for typer to behave correctly. DO NOT REMOVE OR MODIFY\n    pass\n\n\nif __name__ == \"__main__\":\n    app()\n
"},{"location":"mlcubes/mlcube_metrics_WIP/#the-mlcube-folder","title":"The mlcube
Folder","text":"This folder is primarily for configuring your MLCube and providing additional files the MLCube may interact with, such as parameters or model weights.
"},{"location":"mlcubes/mlcube_metrics_WIP/#mlcubeyaml-mlcube-configuration","title":"mlcube.yaml
MLCube Configuration","text":"The mlcube/mlcube.yaml
file contains metadata and configuration of your mlcube. This file is already populated with the configuration you provided during the template creation step. There is no need to edit anything in this file except if you are specifying extra parameters to the commands.
name: Classification Metrics\ndescription: MedPerf Tutorial - Metrics MLCube.\nauthors:\n  - { name: MLCommons Medical Working Group }\n\nplatform:\n  accelerator_count: 0\n\ndocker:\n  # Image name\n  image: mlcommons/chestxray-tutorial-metrics:0.0.0\n  # Docker build context relative to $MLCUBE_ROOT. Default is `build`.\n  build_context: \"../project\"\n  # Docker file name within docker build context, default is `Dockerfile`.\n  build_file: \"Dockerfile\"\n\ntasks:\n  evaluate:\n    # Computes evaluation metrics on the given predictions and ground truths\n    parameters:\n      inputs:\n        {\n          predictions: predictions,\n          labels: labels,\n          parameters_file: parameters.yaml,\n        }\n      outputs: { output_path: { type: \"file\", default: \"results.yaml\" } }\n
All paths are relative to mlcube/workspace/
folder.
To set up additional inputs, add a key-value pair in the task's inputs
dictionary:
...\nevaluate:\n  parameters:\n    inputs:\n      {\n        predictions: predictions,\n        labels: labels,\n        parameters_file: parameters.yaml,\n        some_additional_file_with_weights: additional_files/my_weights.zip\n      }\n    outputs: { output_path: { type: \"file\", default: \"results.yaml\" } }\n...\n
Considering the note about path locations, this new file should be stored at mlcube/workspace/additional_files/my_weights.zip
.
Your metrics evaluation logic might depend on certain parameters (e.g., a probability threshold for classifying predictions). It is generally better to pass such parameters when running the MLCube, rather than hardcoding them. This can be done via a parameters.yaml
file that is passed to the MLCube. You can parse this file in the mlcube.py
file and pass its contents to your logic.
This file should be placed in the mlcube/workspace
folder.
After you follow the previous sections and add your logic to the project, the MLCube is ready to be built and run. Run the command below to build the MLCube. Ensure you are in the mlcube/
subfolder of your Evaluator.
mlcube configure -Pdocker.build_strategy=always\n
This command builds your Docker image and prepares the MLCube for use.
"},{"location":"mlcubes/mlcube_metrics_WIP/#run-your-mlcube","title":"Run Your MLCube","text":"MedPerf will take care of running your MLCube. However, it's recommended to test the MLCube alone before using it with MedPerf for better debugging.
To run the MLCube, use the command below. Ensure you are located in the mlcube/
subfolder of your Evaluator.
mlcube run --task evaluate predictions=<path_to_predictions> \\\nlabels=<path_to_transformed_labels> \\\noutput_path=<path_to_yaml_file_to_save>\n
Relative paths
Keep in mind that though we are running tasks from mlcube/
, all the paths should be absolute or relative to mlcube/workspace/
.
Default values
Default values are set for every path parameter, allowing for their omission in commands. For example, in the discussed Chest X-Ray example, the predictions
input is defined as follows:
...\ninputs:\n  {\n    predictions: predictions,\n    labels: labels,\n  }\n...\n
If this parameter is omitted (e.g., running MLCube with default parameters by mlcube run --task evaluate
), it's assumed that predictions are stored in the mlcube/workspace/predictions/
folder.
The provided example codebase runs only on CPU. You can modify it to pass a GPU into the Docker container if your code utilizes it.
The general instructions for building an MLCube to work with a GPU are the same as the provided instructions, but with the following slight modifications:
0
for the accelerator_count
that you will be prompted with when creating the MLCube template or modify platform.accelerator_count
value of mlcube.yaml
configuration.
docker
section of the mlcube.yaml
, add a key value pair: gpu_args: --gpus=all
. These gpu_args
will be passed to docker run
command by MLCube. You may add more than just --gpus=all
.
pip
dependencies in the requirements.txt
file to download pytorch
with cuda, or by changing the base image of the dockerfile.
This is one of the three guides that help the user build MedPerf-compatible MLCubes. The other two guides are for building a Data Preparator MLCube and a Metrics MLCube. Together, the three MLCube examples constitute a complete benchmark workflow for the task of thoracic disease detection from Chest X-rays.
"},{"location":"mlcubes/mlcube_models/#about-this-guide","title":"About this Guide","text":"This guide will help users familiarize themselves with the expected interface of the Model MLCube and gain a comprehensive understanding of its components. By following this walkthrough, users will gain insights into the structure and organization of a Model MLCube, allowing them at the end to be able to implement their own MedPerf-compatible Model MLCube.
The guide will start by providing general advice, steps, and hints on building these MLCubes. Then, an example will be presented through which the provided guidance will be applied step-by-step to build a Chest X-ray classifier MLCube. The final MLCube code can be found here.
"},{"location":"mlcubes/mlcube_models/#before-building-the-mlcube","title":"Before Building the MLCube","text":"It is assumed that you already have a working code that runs inference on data and generates predictions, and what you want to accomplish through this guide is to wrap your inference code within an MLCube.
MedPerf provides MLCube templates. You should start from a template for faster implementation and to build MLCubes that are compatible with MedPerf.
First, make sure you have MedPerf installed. You can create a model MLCube template by running the following command:
medperf mlcube create model\n
You will be prompted to fill in some configuration options through the CLI. Below are the options and their default values:
project_name [Model MLCube]: # (1)!\nproject_slug [model_mlcube]: # (2)!\ndescription [Model MLCube Template. Provided by MLCommons]: # (3)!\nauthor_name [John Smith]: # (4)!\naccelerator_count [0]: # (5)!\ndocker_image_name [docker/image:latest]: # (6)!\n
After filling the configuration options, the following directory structure will be generated:
.\n\u2514\u2500\u2500 model_mlcube\n \u251c\u2500\u2500 mlcube\n \u2502 \u251c\u2500\u2500 mlcube.yaml\n \u2502 \u2514\u2500\u2500 workspace\n \u2502 \u2514\u2500\u2500 parameters.yaml\n \u2514\u2500\u2500 project\n \u251c\u2500\u2500 Dockerfile\n \u251c\u2500\u2500 mlcube.py\n \u2514\u2500\u2500 requirements.txt\n
The next sections will go through the contents of this directory in detail and customize it.
"},{"location":"mlcubes/mlcube_models/#the-project-folder","title":"The project
folder","text":"This is where your inference logic will live. This folder initially contains three files as shown above. The upcoming sections will cover their use in detail.
The first thing to do is put your code files in this folder.
"},{"location":"mlcubes/mlcube_models/#how-will-the-mlcube-identify-your-code","title":"How will the MLCube identify your code?","text":"This is done through the mlcube.py
file. This file defines the interface of the MLCube and should be linked to your inference logic.
\"\"\"MLCube handler file\"\"\"\nimport typer\n\napp = typer.Typer()\n\n\n@app.command(\"infer\")\ndef infer(\n    data_path: str = typer.Option(..., \"--data_path\"),\n    parameters_file: str = typer.Option(..., \"--parameters_file\"),\n    output_path: str = typer.Option(..., \"--output_path\"),\n    # Provide additional parameters as described in the mlcube.yaml file\n    # e.g. model weights:\n    # weights: str = typer.Option(..., \"--weights\"),\n):\n    # Modify the prepare command as needed\n    raise NotImplementedError(\"The evaluate method is not yet implemented\")\n\n\n@app.command(\"hotfix\")\ndef hotfix():\n    # NOOP command for typer to behave correctly. DO NOT REMOVE OR MODIFY\n    pass\n\n\nif __name__ == \"__main__\":\n    app()\n
As shown above, this file exposes a command infer
. Its basic arguments are the input data path, the output predictions path, and a parameters file path.
The parameters file, as will be explained in the upcoming sections, gives flexibility to your MLCube. For example, instead of hardcoding the inference batch size in the code, it can be configured by passing a parameters file to your MLCube which contains its value. This way, your same MLCube can be reused with multiple batch sizes by just changing the input parameters file.
You should ignore the hotfix
command as described in the file.
The infer
command will be automatically called by the MLCube when it's built and run. This command should call your inference logic. Make sure you replace its contents with code that calls your inference logic, for example by importing a function from your code files and calling it with the necessary arguments.
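As an illustration only, the function below is the kind of inference logic the infer command could import and call. It reads a batch size from the parsed parameters, walks over the input files in batches, and writes a predictions.csv into the output folder. The function names run_inference and predict_batch, the batch_size parameter, and the output file layout are all hypothetical; predict_batch is a placeholder standing in for your real model.

```python
import csv
from pathlib import Path


def predict_batch(paths: list) -> list:
    """Placeholder model: replace with a call into your real inference code."""
    return [0.0 for _ in paths]


def run_inference(data_path: str, parameters: dict, output_path: str) -> int:
    """Run batched inference over data_path; return the number of files processed."""
    batch_size = int(parameters.get("batch_size", 1))
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    files = sorted(p for p in Path(data_path).iterdir() if p.is_file())
    out_dir = Path(output_path)
    out_dir.mkdir(parents=True, exist_ok=True)

    with open(out_dir / "predictions.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "prediction"])
        for start in range(0, len(files), batch_size):
            batch = files[start:start + batch_size]
            for path, score in zip(batch, predict_batch(batch)):
                writer.writerow([path.name, score])
    return len(files)
```

Inside mlcube.py, the infer command would then load the parameters file and call run_inference(data_path, parameters, output_path).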
The MLCube will execute a docker image whose entrypoint is mlcube.py
. The MLCube will first build this image from the Dockerfile
specified in the project
folder. You can customize the Dockerfile however you want as long as the entrypoint runs the mlcube.py
file.
Make sure you include in your Dockerfile any system dependency your code depends on. It is also common to have pip
dependencies; make sure you install them in the Dockerfile as well.
Below is the docker file provided in the template:
Dockerfile
FROM python:3.9.16-slim\n\nCOPY ./requirements.txt /mlcube_project/requirements.txt\n\nRUN pip3 install --no-cache-dir -r /mlcube_project/requirements.txt\n\nENV LANG C.UTF-8\n\nCOPY . /mlcube_project\n\nENTRYPOINT [\"python3\", \"/mlcube_project/mlcube.py\"]\n
As shown above, this docker file makes sure python
is available by using the python base image, installs pip
dependencies using the requirements.txt
file, and sets the entrypoint to run mlcube.py
. Note that the MLCube tool will invoke the Docker build
command from the project
folder, so it will copy all your files found in the project
folder to the Docker image.
mlcube
folder","text":"This folder is mainly for configuring your MLCube and providing additional files the MLCube may interact with, such as parameters or model weights.
"},{"location":"mlcubes/mlcube_models/#include-additional-input-files","title":"Include additional input files","text":""},{"location":"mlcubes/mlcube_models/#parameters","title":"parameters","text":"Your inference logic may depend on some parameters (e.g. inference batch size). It is usually a more favorable design to not hardcode such parameters, but instead pass them when running the MLCube. This can be done by having a parameters.yaml
file as an input to the MLCube. This file will be available to the infer
command described before. You can parse this file in the mlcube.py
file and pass its contents to your code.
This file should be placed in the mlcube/workspace
folder.
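As a rough illustration of passing parsed parameters to your code, the sketch below validates a parameters dictionary (such as the one yaml.safe_load would return for parameters.yaml) before it is handed to the inference logic. The function name validate_parameters and the required key are assumptions made for this example; they are not part of the MedPerf template.

```python
# Hypothetical helper: check a parsed parameters.yaml dictionary before use.
# The key name below is an illustrative assumption, not a MedPerf requirement.
REQUIRED_KEYS = {'batch_size': int}


def validate_parameters(parameters, required=REQUIRED_KEYS):
    # Return the parameters unchanged if every required key has the expected type.
    if not isinstance(parameters, dict):
        raise TypeError('parameters.yaml should contain a top-level mapping')
    for key, expected_type in required.items():
        if key not in parameters:
            raise KeyError(f'missing parameter: {key}')
        if not isinstance(parameters[key], expected_type):
            raise TypeError(f'parameter {key} should be of type {expected_type.__name__}')
    return parameters


print(validate_parameters({'batch_size': 5}))
```

Failing early in mlcube.py tends to give a clearer error than a failure deep inside the inference code.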
It is a good practice not to ship your model weights within the docker image to reduce the image size and provide flexibility of running the MLCube with different model weights. To do this, model weights path should be provided as a separate parameter to the MLCube. You should place your model weights in a folder named additional_files
inside the mlcube/workspace
folder. This is where MedPerf expects any additional input to your MLCube besides the data path and the parameters file.
After placing your model weights in mlcube/workspace/additional_files
, you have to modify two files:
mlcube.py
: add an argument to the infer
command that corresponds to the path of your model weights. Remember to also pass this argument to your inference logic.
mlcube.yaml
: the next section introduces this file and describes it in detail. You should add your extra input arguments to this file as well, as described below.
The mlcube.yaml
file contains the metadata and configuration of your MLCube. This file was already populated with the configuration you provided when creating the template. There is no need to edit anything in this file unless you are specifying extra parameters to the infer
command (e.g., model weights as described in the previous section).
You will be modifying the tasks
section of the mlcube.yaml
file to account for the additional inputs:
tasks:\n  infer:\n    # Computes predictions on input data\n    parameters:\n      inputs: {\n        data_path: data/,\n        parameters_file: parameters.yaml,\n        # Feel free to include other files required for inference.\n        # These files MUST go inside the additional_files path.\n        # e.g. model weights\n        # weights: additional_files/weights.pt,\n      }\n      outputs: { output_path: { type: directory, default: predictions } }\n
As hinted by the comments as well, you can add the additional parameters by specifying an extra key-value pair in the inputs
dictionary of the infer
task.
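To make the link between the two files concrete: as the option names in the template's mlcube.py suggest, each key declared under inputs and outputs is expected to reach mlcube.py as a command-line option of the same name. The sketch below mimics that interface with the standard library's argparse instead of typer; the weights option and the example paths are assumptions for illustration only.

```python
import argparse


def build_parser():
    # One sub-command per MLCube task; only the infer task is sketched here.
    parser = argparse.ArgumentParser(prog='mlcube.py')
    subparsers = parser.add_subparsers(dest='task', required=True)
    infer = subparsers.add_parser('infer')
    # Option names mirror the keys declared in mlcube.yaml.
    infer.add_argument('--data_path', required=True)
    infer.add_argument('--parameters_file', required=True)
    infer.add_argument('--output_path', required=True)
    # Extra input added under inputs (hypothetical name used in this example).
    infer.add_argument('--weights', required=True)
    return parser


# Example invocation with made-up paths.
example_argv = [
    'infer',
    '--data_path=/data',
    '--parameters_file=/workspace/parameters.yaml',
    '--output_path=/predictions',
    '--weights=/workspace/additional_files/weights.pt',
]
args = build_parser().parse_args(example_argv)
print(args.task, args.weights)
```

If a key is added to mlcube.yaml without a matching option in mlcube.py (or the other way round), the task will typically fail at argument parsing, which is why both files have to be edited together.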
After you follow the previous sections, the MLCube is ready to be built and run. Run the command below to build the MLCube. Make sure you are in the folder model_mlcube/mlcube
.
mlcube configure -Pdocker.build_strategy=always\n
This command will build your docker image and make the MLCube ready to use.
"},{"location":"mlcubes/mlcube_models/#run-your-mlcube","title":"Run your MLCube","text":"MedPerf will take care of running your MLCube. However, it's recommended to test the MLCube alone before using it with MedPerf for better debugging.
Use the command below to run the MLCube. Make sure you are in the folder model_mlcube/mlcube
.
mlcube run --task infer data_path=<absolute path to input data> output_path=<absolute path to a folder where predictions will be saved>\n
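Because the command expects absolute paths, it can help to script the call. The sketch below only builds the argument list for the command shown above; the helper name and the example folder names are assumptions, and actually executing the command (for example with subprocess.run) requires the MLCube tool to be installed.

```python
import os


def build_run_command(data_dir, output_dir):
    # The infer task expects absolute paths, so normalise both arguments.
    return [
        'mlcube', 'run', '--task', 'infer',
        'data_path=' + os.path.abspath(data_dir),
        'output_path=' + os.path.abspath(output_dir),
    ]


command = build_run_command('sample_data', 'predictions')
print(' '.join(command))
```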
"},{"location":"mlcubes/mlcube_models/#a-working-example","title":"A Working Example","text":"Assume you have the codebase below. This code can be used to predict thoracic diseases based on chest X-ray data. The task is modeled as a multi-label classification problem.
models.py
\"\"\"\nTaken from MedMNIST/MedMNIST.\n\"\"\"\nimport torch.nn as nn\n\n\nclass SimpleCNN(nn.Module):\n    def __init__(self, in_channels, num_classes):\n        super(SimpleCNN, self).__init__()\n        self.layer1 = nn.Sequential(\n            nn.Conv2d(in_channels, 16, kernel_size=3), nn.BatchNorm2d(16), nn.ReLU()\n        )\n        self.layer2 = nn.Sequential(\n            nn.Conv2d(16, 16, kernel_size=3),\n            nn.BatchNorm2d(16),\n            nn.ReLU(),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n        )\n        self.layer3 = nn.Sequential(\n            nn.Conv2d(16, 64, kernel_size=3), nn.BatchNorm2d(64), nn.ReLU()\n        )\n        self.layer4 = nn.Sequential(\n            nn.Conv2d(64, 64, kernel_size=3), nn.BatchNorm2d(64), nn.ReLU()\n        )\n        self.layer5 = nn.Sequential(\n            nn.Conv2d(64, 64, kernel_size=3, padding=1),\n            nn.BatchNorm2d(64),\n            nn.ReLU(),\n            nn.MaxPool2d(kernel_size=2, stride=2),\n        )\n        self.fc = nn.Sequential(\n            nn.Linear(64 * 4 * 4, 128),\n            nn.ReLU(),\n            nn.Linear(128, 128),\n            nn.ReLU(),\n            nn.Linear(128, num_classes),\n        )\n\n    def forward(self, x):\n        x = self.layer1(x)\n        x = self.layer2(x)\n        x = self.layer3(x)\n        x = self.layer4(x)\n        x = self.layer5(x)\n        x = x.view(x.size(0), -1)\n        x = self.fc(x)\n        return x\n
data_loader.py
import numpy as np\nimport torchvision.transforms as transforms\nimport os\nfrom torch.utils.data import Dataset\n\n\nclass CustomImageDataset(Dataset):\n    def __init__(self, data_path):\n        self.transform = transforms.Compose(\n            [transforms.ToTensor(), transforms.Normalize(mean=[0.5], std=[0.5])]\n        )\n        self.files = os.listdir(data_path)\n        self.data_path = data_path\n\n    def __len__(self):\n        return len(self.files)\n\n    def __getitem__(self, idx):\n        img_path = os.path.join(self.data_path, self.files[idx])\n        image = np.load(img_path, allow_pickle=True)\n        image = self.transform(image)\n        # Drop only the file extension; str.strip(\".npy\") would also remove\n        # leading or trailing \"n\", \"p\", \"y\" and \".\" characters from the name.\n        file_id = os.path.splitext(self.files[idx])[0]\n        return image, file_id\n
infer.py
import torch\nfrom models import SimpleCNN\nfrom tqdm import tqdm\nfrom torch.utils.data import DataLoader\nfrom data_loader import CustomImageDataset\nfrom pprint import pprint\n\ndata_path = \"path/to/data/folder\"\nweights = \"path/to/weights.pt\"\nin_channels = 1\nnum_classes = 14\nbatch_size = 5\n\n# load model\nmodel = SimpleCNN(in_channels=in_channels, num_classes=num_classes)\nmodel.load_state_dict(torch.load(weights))\nmodel.eval()\n\n# load prepared data\ndataset = CustomImageDataset(data_path)\ndataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)\n\n# inference\npredictions_dict = {}\nwith torch.no_grad():\n    for images, files_ids in tqdm(dataloader):\n        outputs = model(images)\n        outputs = torch.nn.Sigmoid()(outputs)\n        outputs = outputs.detach().numpy()\n        for file_id, output in zip(files_ids, outputs):\n            predictions_dict[file_id] = output\n\npprint(predictions_dict)\n
Throughout the next sections, this code will be wrapped within an MLCube.
"},{"location":"mlcubes/mlcube_models/#before-building-the-mlcube_1","title":"Before Building the MLCube","text":"The guidelines listed previously in this section will now be applied to the given codebase. Assume that the benchmark you are participating in instructed you to have your MLCube interface as follows:
It is important to make sure that your MLCube will output an expected predictions format and consume a defined data format, since it will be used in a benchmarking pipeline whose data input is fixed and whose metrics calculation logic expects a fixed predictions format.
Considering the codebase above, here are the things that should be done before proceeding to build the MLCube:
infer.py
only prints predictions but doesn't store them. This has to be changed.
infer.py
hardcodes some parameters (num_classes
, in_channels
, batch_size
) as well as the path to the trained model weights. Consider making these items configurable parameters. (This is optional but recommended.)
Consider refactoring infer.py
to be a function so that it can be easily called by mlcube.py
.
The other files models.py
and data_loader.py
seem to be good already. The data loader expects a folder containing a list of numpy arrays, as instructed.
Here is the modified version of infer.py
according to the points listed above:
import torch\nimport os\nfrom models import SimpleCNN\nfrom tqdm import tqdm\nfrom torch.utils.data import DataLoader\nfrom data_loader import CustomImageDataset\nimport json\n\n\ndef run_inference(data_path, parameters, output_path, weights):\n    in_channels = parameters[\"in_channels\"]\n    num_classes = parameters[\"num_classes\"]\n    batch_size = parameters[\"batch_size\"]\n\n    # load model\n    model = SimpleCNN(in_channels=in_channels, num_classes=num_classes)\n    model.load_state_dict(torch.load(weights))\n    model.eval()\n\n    # load prepared data\n    dataset = CustomImageDataset(data_path)\n    dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)\n\n    # inference\n    predictions_dict = {}\n    with torch.no_grad():\n        for images, files_ids in tqdm(dataloader):\n            outputs = model(images)\n            outputs = torch.nn.Sigmoid()(outputs)\n            outputs = outputs.detach().numpy().tolist()\n            for file_id, output in zip(files_ids, outputs):\n                predictions_dict[file_id] = output\n\n    # save\n    preds_file = os.path.join(output_path, \"predictions.json\")\n    with open(preds_file, \"w\") as f:\n        json.dump(predictions_dict, f, indent=4)\n
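A quick way to confirm the new behaviour is to read predictions.json back and check its shape. The sketch below assumes the format written by run_inference above (a JSON object mapping each file id to a list of per-class probabilities); the helper name check_predictions is made up for this example.

```python
import json
import os
import tempfile


def check_predictions(preds_file, num_classes):
    # Load the file and verify each entry is a list of num_classes probabilities.
    with open(preds_file) as f:
        predictions = json.load(f)
    for file_id, scores in predictions.items():
        if len(scores) != num_classes:
            raise ValueError(f'{file_id}: expected {num_classes} scores, got {len(scores)}')
        if not all(0.0 <= s <= 1.0 for s in scores):
            raise ValueError(f'{file_id}: scores should lie in [0, 1]')
    return len(predictions)


# Self-contained demo with a tiny hand-written predictions file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'predictions.json')
    with open(path, 'w') as f:
        json.dump({'case_0': [0.1, 0.9], 'case_1': [0.5, 0.5]}, f)
    print(check_predictions(path, num_classes=2))  # prints 2
```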
"},{"location":"mlcubes/mlcube_models/#create-an-mlcube-template","title":"Create an MLCube Template","text":"Assuming you installed MedPerf, run the following:
medperf mlcube create model\n
You will be prompted to fill in the configuration options. Use the following configuration as a reference:
project_name [Model MLCube]: Custom CNN Classification Model\nproject_slug [model_mlcube]: model_custom_cnn\ndescription [Model MLCube Template. Provided by MLCommons]: MedPerf Tutorial - Model MLCube.\nauthor_name [John Smith]: <use your name>\naccelerator_count [0]: 0\ndocker_image_name [docker/image:latest]: repository/model-tutorial:0.0.0\n
Note that docker_image_name
is arbitrarily chosen here. Use a valid Docker image name.
Move the three files of the codebase to the project
folder. The directory tree will then look like this:
.\n\u2514\u2500\u2500 model_custom_cnn\n \u251c\u2500\u2500 mlcube\n \u2502 \u251c\u2500\u2500 mlcube.yaml\n \u2502 \u2514\u2500\u2500 workspace\n \u2502 \u2514\u2500\u2500 parameters.yaml\n \u2514\u2500\u2500 project\n \u251c\u2500\u2500 Dockerfile\n \u251c\u2500\u2500 mlcube.py\n \u251c\u2500\u2500 models.py\n \u251c\u2500\u2500 data_loader.py\n \u251c\u2500\u2500 infer.py\n \u2514\u2500\u2500 requirements.txt\n
"},{"location":"mlcubes/mlcube_models/#add-your-parameters-and-model-weights","title":"Add your parameters and model weights","text":"Since num_classes
, in_channels
, and batch_size
are now parametrized, they should be defined in workspace/parameters.yaml
. Also, the model weights should be placed inside workspace/additional_files
.
Modify parameters.yaml
to include the following:
in_channels: 1\nnum_classes: 14\nbatch_size: 5\n
"},{"location":"mlcubes/mlcube_models/#add-model-weights","title":"Add model weights","text":"Download the following model weights to use in this example: Click here to Download
Extract the file to workspace/additional_files
. The directory tree should look like this:
.\n\u2514\u2500\u2500 model_custom_cnn\n \u251c\u2500\u2500 mlcube\n \u2502 \u251c\u2500\u2500 mlcube.yaml\n \u2502 \u2514\u2500\u2500 workspace\n \u2502 \u251c\u2500\u2500 additional_files\n \u2502 \u2502 \u2514\u2500\u2500 cnn_weights.pth\n \u2502 \u2514\u2500\u2500 parameters.yaml\n \u2514\u2500\u2500 project\n \u251c\u2500\u2500 Dockerfile\n \u251c\u2500\u2500 mlcube.py\n \u251c\u2500\u2500 models.py\n \u251c\u2500\u2500 data_loader.py\n \u251c\u2500\u2500 infer.py\n \u2514\u2500\u2500 requirements.txt\n
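Before moving on, you may want to confirm that the files ended up where the tree above shows them. The following sketch checks for the expected files relative to the MLCube root; the helper name and the list of paths simply mirror this example and are not something MedPerf provides.

```python
from pathlib import Path

# Paths taken from the directory tree of this example.
EXPECTED_FILES = [
    'mlcube/mlcube.yaml',
    'mlcube/workspace/parameters.yaml',
    'mlcube/workspace/additional_files/cnn_weights.pth',
    'project/Dockerfile',
    'project/mlcube.py',
    'project/requirements.txt',
]


def missing_files(root, expected=EXPECTED_FILES):
    # Return the expected files that do not exist under root.
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


# An empty list means the layout matches the tree above.
print(missing_files('model_custom_cnn'))
```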
"},{"location":"mlcubes/mlcube_models/#modify-mlcubepy","title":"Modify mlcube.py
","text":"Next, the inference logic should be triggered from mlcube.py
. The parameters_file
will be read in mlcube.py
and passed as a dictionary to the inference logic. Also, an extra parameter weights
is added to the function signature which will correspond to the model weights path. See below the modified mlcube.py
file.
\"\"\"MLCube handler file\"\"\"\nimport typer\nimport yaml\nfrom infer import run_inference\n\napp = typer.Typer()\n\n\n@app.command(\"infer\")\ndef infer(\n    data_path: str = typer.Option(..., \"--data_path\"),\n    parameters_file: str = typer.Option(..., \"--parameters_file\"),\n    output_path: str = typer.Option(..., \"--output_path\"),\n    weights: str = typer.Option(..., \"--weights\"),\n):\n    with open(parameters_file) as f:\n        parameters = yaml.safe_load(f)\n\n    run_inference(data_path, parameters, output_path, weights)\n\n\n@app.command(\"hotfix\")\ndef hotfix():\n    # NOOP command for typer to behave correctly. DO NOT REMOVE OR MODIFY\n    pass\n\n\nif __name__ == \"__main__\":\n    app()\n
"},{"location":"mlcubes/mlcube_models/#prepare-the-dockerfile","title":"Prepare the Dockerfile","text":"The provided Dockerfile in the template is enough and preconfigured to download pip
dependencies from the requirements.txt
file. All that is needed is to modify the requirements.txt
file to include the project's pip dependencies.
typer==0.9.0\nnumpy==1.24.3\nPyYAML==6.0\ntorch==2.0.1\ntorchvision==0.15.2\ntqdm==4.65.0\n--extra-index-url https://download.pytorch.org/whl/cpu\n
"},{"location":"mlcubes/mlcube_models/#modify-mlcubeyaml","title":"Modify mlcube.yaml
","text":"Since the extra parameter weights
was added to the infer
task in mlcube.py
, this has to be reflected on the defined MLCube interface in the mlcube.yaml
file. Modify the tasks
section to include an extra input parameter: weights: additional_files/cnn_weights.pth
.
Tip
The MLCube tool interprets these paths as relative to the workspace
.
The tasks
section will then look like this:
tasks:\n  infer:\n    # Computes predictions on input data\n    parameters:\n      inputs:\n        {\n          data_path: data/,\n          parameters_file: parameters.yaml,\n          weights: additional_files/cnn_weights.pth,\n        }\n      outputs: { output_path: { type: directory, default: predictions } }\n
"},{"location":"mlcubes/mlcube_models/#build-your-mlcube_1","title":"Build your MLCube","text":"Run the command below to create the MLCube. Make sure you are in the folder model_custom_cnn/mlcube
.
mlcube configure -Pdocker.build_strategy=always\n
This command will build your docker image and make the MLCube ready to use.
Tip
Run docker image ls
to see your built Docker image.
Download sample data to run on: Click here to Download
Extract the data. You will get a folder sample_prepared_data
containing a list of chest X-ray images as numpy arrays.
Use the command below to run the MLCube. Make sure you are in the folder model_custom_cnn/mlcube
.
mlcube run --task infer data_path=<absolute path to `sample_prepared_data`> output_path=<absolute path to a folder where predictions will be saved>\n
"},{"location":"mlcubes/mlcube_models/#using-the-example-with-gpus","title":"Using the Example with GPUs","text":"The provided example codebase runs only on CPU. You can modify it to have pytorch
run inference on a GPU.
The general instructions for building an MLCube to work with a GPU are the same as the provided instructions, but with the following slight modifications:
pip
dependencies in the requirements.txt
file to download pytorch
with CUDA, or by changing the base image of the Dockerfile.
To test your MLCube with GPUs using the MLCube tool as in the previous section, make sure you run the mlcube run
command with a --gpus
argument. Example: mlcube run --gpus=all ...
To test your MLCube with GPUs using MedPerf, make sure you also pass the --gpus
argument to the MedPerf command. Example: medperf --gpus=all <subcommand> ...
.
Tip
Run medperf --help
to see the possible options you can use for the --gpus
argument.
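On the code side, the usual change is to pick the device at runtime and move both the model and each batch to it. The sketch below shows only the device selection and falls back to the CPU when no GPU (or no PyTorch installation) is available; the helper name select_device is an assumption, while torch.cuda.is_available is standard PyTorch.

```python
def select_device():
    # Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'.
    try:
        import torch
    except ImportError:
        return 'cpu'
    return 'cuda' if torch.cuda.is_available() else 'cpu'


device = select_device()
print(device)
# In infer.py one would then call model.to(device) and images.to(device),
# and move the outputs back with outputs.cpu() before converting to numpy.
```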
MLCube is a set of common conventions for creating Machine Learning (ML) software that can \"plug-and-play\" on many different systems. It is basically a container image with a simple interface and the correct metadata that allows researchers and developers to easily share and experiment with ML pipelines.
You can read more about MLCubes here.
In MedPerf, MLCubes are required for creating the three technical components of a benchmarking experiment: the data preparation flow, the model inference flow, and the evaluation flow. A Benchmark Committee will be required to create three MLCubes that implement these components. A Model Owner will be required to wrap their model code within an MLCube in order to submit it to the MedPerf server and participate in a benchmark.
MLCubes are general-purpose. MedPerf defines three specific design types of MLCubes according to their purpose: the Data Preparator MLCube, the Model MLCube, and the Metrics MLCube. Each type has a specific MLCube task configuration that defines the MLCube's interface. Users need to follow these design specs when building their MLCubes so that they conform with MedPerf. Below is a high-level description of each MLCube type, with a link to a guide for building an example of each.
"},{"location":"mlcubes/mlcubes/#data-preparator-mlcube","title":"Data Preparator MLCube","text":"The Data Preparator MLCube is used to prepare the data for executing the benchmark. Ideally, it can receive different data standards for the task at hand, transforming them into a single, unified standard. Additionally, it ensures the quality and compatibility of the data and computes statistics and metadata for registration purposes.
This MLCube's interface should expose the following tasks:
Prepare: Transforms the input data into the expected output data standard. It receives as input the location of the original data, as well as the location of the labels, and outputs the prepared dataset and accompanying labels.
Sanity check: Ensures the integrity of the prepared data. It may check for anomalies and data corruption (e.g. blank images, empty test cases). It constitutes a set of conditions the prepared data should comply with.
Statistics: Computes statistics on the prepared data.
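To give a feel for what a sanity check might contain, here is a minimal sketch that flags empty or blank cases. It works on plain Python lists standing in for images; the function name, the input structure, and the notion of blank (all pixel values equal) are assumptions for illustration, and a real Data Preparator MLCube would apply checks suited to its own data standard.

```python
def sanity_check(cases):
    # cases maps a case id to a flat list of pixel values (a stand-in for an image).
    problems = []
    for case_id, pixels in cases.items():
        if len(pixels) == 0:
            problems.append((case_id, 'empty case'))
        elif min(pixels) == max(pixels):
            problems.append((case_id, 'blank image'))
    return problems


cases = {'a': [0, 12, 255], 'b': [0, 0, 0], 'c': []}
print(sanity_check(cases))  # [('b', 'blank image'), ('c', 'empty case')]
```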
Check this guide on how to create a Data Preparation MLCube.
"},{"location":"mlcubes/mlcubes/#model-mlcube","title":"Model MLCube","text":"The Model MLCube contains a pre-trained machine learning model that is going to be evaluated by the benchmark. Its interface should expose the following task:
Check this guide on how to create a Model MLCube.
"},{"location":"mlcubes/mlcubes/#metricsevaluator-mlcube","title":"Metrics/Evaluator MLCube","text":"The Metrics MLCube is used for computing metrics on the model predictions by comparing them against the provided labels. Its interface should expose the following task:
Check this guide on how to create a Metrics MLCube.
"},{"location":"mlcubes/shared/build/","title":"Build","text":""},{"location":"mlcubes/shared/build/#building-a-no-such-element-dict-objectname","title":"Building a {{ no such element: dict object['name'] }}","text":"The following section will describe how you can create a {{ no such element: dict object['name'] }} from scratch. This documentation goes through the set of commands provided to help during this process, as well as the contents of a {{ no such element: dict object['name'] }}.
"},{"location":"mlcubes/shared/build/#setup","title":"Setup","text":"MedPerf provides some cookiecutter templates for all the related MLCubes. Additionally, it provides commands to easily retrieve and use these templates. For that, you need to make sure MedPerf is installed. If you have not done so, please follow the steps below:
Clone the repository:
git clone https://github.com/mlcommons/medperf\ncd medperf\n
Install the MedPerf CLI:
pip install -e cli\n
If you have not done so, create a folder for keeping all MLCubes created in this tutorial:
mkdir tutorial\ncd tutorial\n
Create a {{ no such element: dict object['name'] }} through MedPerf:
medperf mlcube create {{ no such element: dict object['slug'] }}\n
You should be prompted to fill in some configuration options through the CLI. An example of some good options to provide for this specific task is presented below:
Let's have a look at what the previous command generated. First, let's look at the whole folder structure:
tree
"},{"location":"mlcubes/shared/cookiecutter/","title":"Cookiecutter","text":"Note
MedPerf is running CookieCutter under the hood. This medperf command provides additional arguments for handling different scenarios. You can see more information on this by running medperf mlcube create --help
MLCubes rely on containers to work. By default, Medperf provides a functional Dockerfile, which uses ubuntu:18.0.4
and python3.6
. This Dockerfile handles all the required procedures to install your project and redirect commands to the project/mlcube.py
file. You can modify it as you see fit, as long as the entry point behaves as a CLI, as described before.
Running Docker MLCubes with Singularity
If you are building a Docker MLCube and expect it to be also run using Singularity, you need to keep in mind that Singularity containers built from Docker images ignore the WORKDIR
instruction if used in Dockerfiles. Make sure you also follow their best practices for writing Singularity-compatible Dockerfiles.
Now it's time to run our own implementation. We won't go into much detail, since we covered the basics before, but here are the commands you can run to build and run your MLCube.
{{ no such element: dict object['slug'] }}_mlcube
, run cd mlcube\n
Build the Docker image using the shortcuts provided by MLCube. Here is how you can do it:
mlcube configure -Pdocker.build_strategy=always # (1)!\n
-Pdocker.build_strategy=always
forces MLCube to build the image from source.
In order to provide a basic example of how MedPerf MLCubes work under the hood, a toy Hello World benchmark is provided. This benchmark implements a pipeline for ingesting people's names and generating greetings for those names given some criteria. Although this is not the most scientific example, it provides a clear idea of all the pieces required to implement your MLCubes for MedPerf.
You can find the {{ no such element: dict object['name'] }} code here
"},{"location":"mlcubes/shared/hotfix/","title":"Hotfix","text":"What is the hotfix
function inside mlcube.py
?
To summarize, this issue is benign and can be safely ignored. It prevents a potential issue with the CLI and does not require further action.
If you use the typer
/click
library for your command-line interface (CLI) and have only one @app.command
, the command line may not be parsed as expected by mlcube. This is due to a known issue that can be resolved by adding more than one task to the mlcube interface.
To avoid a potential issue with the CLI, we add a dummy typer command to our model cubes that only have one task. If you're not using typer
/click
, you don't need this dummy command.
The provided MLCube template assumes your project is python based. Because of this, it provides a requirements.txt
file to specify the dependencies to run your project. This file is automatically used by the Dockerfile
to install and set up your project. Since some dependencies are necessary, let's add them to the file:
Before digging into the code, let's try manually running the {{ no such element: dict object['name'] }}. During this process, it should be possible to see how MLCube interacts with the folders in the workspace and what is expected to happen during each step:
"},{"location":"mlcubes/shared/setup/#setup","title":"Setup","text":"Clone the repository:
git clone https://github.com/mlcommons/medperf\ncd medperf\n
Install mlcube and mlcube-docker using pip:
pip install mlcube mlcube-docker\n
Navigate to the HelloWorld directory within the examples folder with
cd examples/HelloWorld\n
Change to the current example's mlcube
folder with
cd {{ no such element: dict object['slug'] }}/mlcube\n
execute(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='UID of the desired benchmark'), data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), model_uid=typer.Option(..., '--model_uid', '-m', help='UID of model to execute'), approval=typer.Option(False, '-y', help='Skip approval step'), ignore_model_errors=typer.Option(False, '--ignore-model-errors', help='Ignore failing model cubes, allowing for possibly submitting partial results'), no_cache=typer.Option(False, '--no-cache', help='Ignore existing results. The experiment then will be rerun'))
","text":"Runs the benchmark execution step for a given benchmark, prepared dataset and model
Source code incli/medperf/cli.py
@app.command(\"run\")\n@clean_except\ndef execute(\nbenchmark_uid: int = typer.Option(\n..., \"--benchmark\", \"-b\", help=\"UID of the desired benchmark\"\n),\ndata_uid: int = typer.Option(\n..., \"--data_uid\", \"-d\", help=\"Registered Dataset UID\"\n),\nmodel_uid: int = typer.Option(\n..., \"--model_uid\", \"-m\", help=\"UID of model to execute\"\n),\napproval: bool = typer.Option(False, \"-y\", help=\"Skip approval step\"),\nignore_model_errors: bool = typer.Option(\nFalse,\n\"--ignore-model-errors\",\nhelp=\"Ignore failing model cubes, allowing for possibly submitting partial results\",\n),\nno_cache: bool = typer.Option(\nFalse,\n\"--no-cache\",\nhelp=\"Ignore existing results. The experiment then will be rerun\",\n),\n):\n\"\"\"Runs the benchmark execution step for a given benchmark, prepared dataset and model\"\"\"\nresult = BenchmarkExecution.run(\nbenchmark_uid,\ndata_uid,\n[model_uid],\nignore_model_errors=ignore_model_errors,\nno_cache=no_cache,\n)[0]\nif result.id: # TODO: use result.is_registered once PR #338 is merged\nconfig.ui.print( # TODO: msg should be colored yellow\n\"\"\"An existing registered result for the requested execution has been\\n\n found. If you wish to submit a new result for the same execution,\\n\n please run the command again with the --no-cache option.\\n\"\"\"\n)\nelse:\nResultSubmission.run(result.local_id, approved=approval)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/config/","title":"Config","text":""},{"location":"reference/decorators/","title":"Decorators","text":""},{"location":"reference/decorators/#decorators.add_inline_parameters","title":"add_inline_parameters(func)
","text":"Decorator that adds common configuration options to a typer command
Parameters:
Name Type Description Defaultfunc
Callable
function to be decorated
requiredReturns:
Name Type DescriptionCallable
Callable
decorated function
Source code incli/medperf/decorators.py
def add_inline_parameters(func: Callable) -> Callable:\n\"\"\"Decorator that adds common configuration options to a typer command\n Args:\n func (Callable): function to be decorated\n Returns:\n Callable: decorated function\n \"\"\"\n# NOTE: changing parameters here should be accompanied\n# by changing config.inline_parameters\n@merge_args(func)\ndef wrapper(\n*args,\nloglevel: str = typer.Option(\nconfig.loglevel,\n\"--loglevel\",\nhelp=\"Logging level [debug | info | warning | error]\",\n),\nprepare_timeout: int = typer.Option(\nconfig.prepare_timeout,\n\"--prepare_timeout\",\nhelp=\"Maximum time in seconds before interrupting prepare task\",\n),\nsanity_check_timeout: int = typer.Option(\nconfig.sanity_check_timeout,\n\"--sanity_check_timeout\",\nhelp=\"Maximum time in seconds before interrupting sanity_check task\",\n),\nstatistics_timeout: int = typer.Option(\nconfig.statistics_timeout,\n\"--statistics_timeout\",\nhelp=\"Maximum time in seconds before interrupting statistics task\",\n),\ninfer_timeout: int = typer.Option(\nconfig.infer_timeout,\n\"--infer_timeout\",\nhelp=\"Maximum time in seconds before interrupting infer task\",\n),\nevaluate_timeout: int = typer.Option(\nconfig.evaluate_timeout,\n\"--evaluate_timeout\",\nhelp=\"Maximum time in seconds before interrupting evaluate task\",\n),\ncontainer_loglevel: str = typer.Option(\nconfig.container_loglevel,\n\"--container-loglevel\",\nhelp=\"Logging level for containers to be run [debug | info | warning | error]\",\n),\nplatform: str = typer.Option(\nconfig.platform,\n\"--platform\",\nhelp=\"Platform to use for MLCube. [docker | singularity]\",\n),\ngpus: str = typer.Option(\nconfig.gpus,\n\"--gpus\",\nhelp=\"\"\"\n What GPUs to expose to MLCube.\n Accepted Values are:\\n\n - \"\" or 0: to expose no GPUs (e.g.: --gpus=\"\")\\n\n - \"all\": to expose all GPUs. (e.g.: --gpus=all)\\n\n - an integer: to expose a certain number of GPUs. 
ONLY AVAILABLE FOR DOCKER\n (e.g., --gpus=2 to expose 2 GPUs)\\n\n - Form \"device=<id1>,<id2>\": to expose specific GPUs.\n (e.g., --gpus=\"device=0,2\")\\n\"\"\",\n),\ncleanup: bool = typer.Option(\nconfig.cleanup,\n\"--cleanup/--no-cleanup\",\nhelp=\"Whether to clean up temporary medperf storage after execution\",\n),\n**kwargs,\n):\nreturn func(*args, **kwargs)\nreturn wrapper\n
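The accepted values listed in the help text above can be read as four cases. The sketch below classifies a --gpus value accordingly; it illustrates the documented forms only and is not MedPerf's actual parsing code. The function name and the return format are invented for this example.

```python
def describe_gpus(value):
    # Classify a --gpus value following the forms listed in the help text.
    value = str(value).strip()
    if value in ('', '0'):
        return ('none', [])
    if value == 'all':
        return ('all', [])
    if value.isdigit():
        # A plain integer is a number of GPUs, not a device id.
        return ('count', [int(value)])
    if value.startswith('device='):
        ids = [part for part in value[len('device='):].split(',') if part]
        if ids:
            return ('devices', ids)
    raise ValueError(f'unsupported --gpus value: {value!r}')


print(describe_gpus('device=0,2'))  # ('devices', ['0', '2'])
```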
"},{"location":"reference/decorators/#decorators.clean_except","title":"clean_except(func)
","text":"Decorator for handling errors. It allows logging and cleaning the project's directory before throwing the error.
Parameters:
Name Type Description Defaultfunc
Callable
Function to handle for errors
requiredReturns:
Name Type DescriptionCallable
Callable
Decorated function
Source code incli/medperf/decorators.py
def clean_except(func: Callable) -> Callable:\n\"\"\"Decorator for handling errors. It allows logging\n and cleaning the project's directory before throwing the error.\n Args:\n func (Callable): Function to handle for errors\n Returns:\n Callable: Decorated function\n \"\"\"\n@functools.wraps(func)\ndef wrapper(*args, **kwargs):\ntry:\nlogging.info(f\"Running function '{func.__name__}'\")\nfunc(*args, **kwargs)\nexcept CleanExit as e:\nlogging.info(str(e))\nconfig.ui.print(str(e))\nsys.exit(e.medperf_status_code)\nexcept MedperfException as e:\nlogging.exception(e)\npretty_error(str(e))\nsys.exit(1)\nexcept Exception as e:\nlogging.error(\"An unexpected error occured. Terminating.\")\nlogging.exception(e)\nraise e\nfinally:\npackage_logs()\ncleanup()\nreturn wrapper\n
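The pattern used by clean_except (wrap the call, translate known errors into an exit code, always run cleanup) can be reproduced in a few lines. The sketch below uses stand-in names (AppError, handle_errors, a cleanup callback) and does not depend on MedPerf; it also returns the wrapped function's result, which is a choice made for this example.

```python
import functools
import logging
import sys


class AppError(Exception):
    '''Stand-in for an application-specific exception type.'''


def handle_errors(cleanup):
    # Build a decorator that logs known errors, exits with status 1,
    # and runs the cleanup callback whether or not the call succeeded.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except AppError as e:
                logging.error(str(e))
                sys.exit(1)
            finally:
                cleanup()
        return wrapper
    return decorator


calls = []


@handle_errors(cleanup=lambda: calls.append('cleaned'))
def add(a, b):
    return a + b


print(add(1, 2), calls)  # 3 ['cleaned']
```

Placing the cleanup in a finally clause is what guarantees it also runs when sys.exit is raised inside the except branch.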
"},{"location":"reference/decorators/#decorators.configurable","title":"configurable(func)
","text":"Decorator that adds common configuration options to a typer command
Parameters:
Name Type Description Defaultfunc
Callable
function to be decorated
requiredReturns:
Name Type DescriptionCallable
Callable
decorated function
Source code incli/medperf/decorators.py
def configurable(func: Callable) -> Callable:\n\"\"\"Decorator that adds common configuration options to a typer command\n Args:\n func (Callable): function to be decorated\n Returns:\n Callable: decorated function\n \"\"\"\n# NOTE: changing parameters here should be accompanied\n# by changing configurable_parameters\n@merge_args(func)\ndef wrapper(\n*args,\nserver: str = typer.Option(\nconfig.server, \"--server\", help=\"URL of a hosted MedPerf API instance\"\n),\nauth_class: str = typer.Option(\nconfig.auth_class,\n\"--auth_class\",\nhelp=\"Authentication interface to use [Auth0]\",\n),\nauth_domain: str = typer.Option(\nconfig.auth_domain, \"--auth_domain\", help=\"Auth0 domain name\"\n),\nauth_jwks_url: str = typer.Option(\nconfig.auth_jwks_url, \"--auth_jwks_url\", help=\"Auth0 Json Web Key set URL\"\n),\nauth_idtoken_issuer: str = typer.Option(\nconfig.auth_idtoken_issuer,\n\"--auth_idtoken_issuer\",\nhelp=\"Auth0 ID token issuer\",\n),\nauth_client_id: str = typer.Option(\nconfig.auth_client_id, \"--auth_client_id\", help=\"Auth0 client ID\"\n),\nauth_audience: str = typer.Option(\nconfig.auth_audience,\n\"--auth_audience\",\nhelp=\"Server's Auth0 API identifier\",\n),\ncertificate: str = typer.Option(\nconfig.certificate, \"--certificate\", help=\"path to a valid SSL certificate\"\n),\nloglevel: str = typer.Option(\nconfig.loglevel,\n\"--loglevel\",\nhelp=\"Logging level [debug | info | warning | error]\",\n),\nprepare_timeout: int = typer.Option(\nconfig.prepare_timeout,\n\"--prepare_timeout\",\nhelp=\"Maximum time in seconds before interrupting prepare task\",\n),\nsanity_check_timeout: int = typer.Option(\nconfig.sanity_check_timeout,\n\"--sanity_check_timeout\",\nhelp=\"Maximum time in seconds before interrupting sanity_check task\",\n),\nstatistics_timeout: int = typer.Option(\nconfig.statistics_timeout,\n\"--statistics_timeout\",\nhelp=\"Maximum time in seconds before interrupting statistics task\",\n),\ninfer_timeout: int = 
typer.Option(\nconfig.infer_timeout,\n\"--infer_timeout\",\nhelp=\"Maximum time in seconds before interrupting infer task\",\n),\nevaluate_timeout: int = typer.Option(\nconfig.evaluate_timeout,\n\"--evaluate_timeout\",\nhelp=\"Maximum time in seconds before interrupting evaluate task\",\n),\ncontainer_loglevel: str = typer.Option(\nconfig.container_loglevel,\n\"--container-loglevel\",\nhelp=\"Logging level for containers to be run [debug | info | warning | error]\",\n),\nplatform: str = typer.Option(\nconfig.platform,\n\"--platform\",\nhelp=\"Platform to use for MLCube. [docker | singularity]\",\n),\ngpus: str = typer.Option(\nconfig.gpus,\n\"--gpus\",\nhelp=\"\"\"\n What GPUs to expose to MLCube.\n Accepted Values are comma separated GPU IDs (e.g \"1,2\"), or \\\"all\\\".\n MLCubes that aren't configured to use GPUs won't be affected by this.\n Defaults to all available GPUs\"\"\",\n),\ncleanup: bool = typer.Option(\nconfig.cleanup,\n\"--cleanup/--no-cleanup\",\nhelp=\"Wether to clean up temporary medperf storage after execution\",\n),\n**kwargs,\n):\nreturn func(*args, **kwargs)\nreturn wrapper\n
"},{"location":"reference/enums/","title":"Enums","text":""},{"location":"reference/exceptions/","title":"Exceptions","text":""},{"location":"reference/exceptions/#exceptions.AuthenticationError","title":"AuthenticationError
","text":" Bases: MedperfException
Raised when authentication can't be processed
Source code in cli/medperf/exceptions.py
class AuthenticationError(MedperfException):\n\"\"\"Raised when authentication can't be processed\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.CleanExit","title":"CleanExit
","text":" Bases: MedperfException
Raised when Medperf needs to stop for non-erroneous reasons
Source code in cli/medperf/exceptions.py
class CleanExit(MedperfException):\n\"\"\"Raised when Medperf needs to stop for non-erroneous reasons\"\"\"\ndef __init__(self, *args, medperf_status_code=0) -> None:\nsuper().__init__(*args)\nself.medperf_status_code = medperf_status_code\n
"},{"location":"reference/exceptions/#exceptions.CommunicationAuthenticationError","title":"CommunicationAuthenticationError
","text":" Bases: CommunicationError
Raised when the communication interface can't handle an authentication request
Source code in cli/medperf/exceptions.py
class CommunicationAuthenticationError(CommunicationError):\n\"\"\"Raised when the communication interface can't handle an authentication request\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.CommunicationError","title":"CommunicationError
","text":" Bases: MedperfException
Raised when an error happens due to the communication interface
Source code in cli/medperf/exceptions.py
class CommunicationError(MedperfException):\n\"\"\"Raised when an error happens due to the communication interface\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.CommunicationRequestError","title":"CommunicationRequestError
","text":" Bases: CommunicationError
Raised when the communication interface can't handle a request appropriately
Source code in cli/medperf/exceptions.py
class CommunicationRequestError(CommunicationError):\n\"\"\"Raised when the communication interface can't handle a request appropriately\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.CommunicationRetrievalError","title":"CommunicationRetrievalError
","text":" Bases: CommunicationError
Raised when the communication interface can't retrieve an element
Source code in cli/medperf/exceptions.py
class CommunicationRetrievalError(CommunicationError):\n\"\"\"Raised when the communication interface can't retrieve an element\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.ExecutionError","title":"ExecutionError
","text":" Bases: MedperfException
Raised when an execution component fails
Source code in cli/medperf/exceptions.py
class ExecutionError(MedperfException):\n\"\"\"Raised when an execution component fails\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.InvalidArgumentError","title":"InvalidArgumentError
","text":" Bases: MedperfException
Raised when an argument or set of arguments is considered invalid
Source code in cli/medperf/exceptions.py
class InvalidArgumentError(MedperfException):\n\"\"\"Raised when an argument or set of arguments is considered invalid\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.InvalidEntityError","title":"InvalidEntityError
","text":" Bases: MedperfException
Raised when an entity is considered invalid
Source code in cli/medperf/exceptions.py
class InvalidEntityError(MedperfException):\n\"\"\"Raised when an entity is considered invalid\"\"\"\n
"},{"location":"reference/exceptions/#exceptions.MedperfException","title":"MedperfException
","text":" Bases: Exception
Medperf base exception
Source code in cli/medperf/exceptions.py
class MedperfException(Exception):\n\"\"\"Medperf base exception\"\"\"\n
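Because every class above derives from MedperfException, a caller can catch the base class once and still tell a clean stop from a real failure. The following is a minimal sketch under stated assumptions: the classes are re-declared locally so the snippet runs on its own, and `run_guarded`, `cancelled`, `unreachable_server` and the exit code 1 for errors are hypothetical names and values chosen for illustration, not part of MedPerf.

```python
class MedperfException(Exception):
    # Local stand-in for the base exception documented above.
    pass


class CommunicationError(MedperfException):
    pass


class CommunicationRetrievalError(CommunicationError):
    pass


class CleanExit(MedperfException):
    def __init__(self, *args, medperf_status_code=0):
        super().__init__(*args)
        self.medperf_status_code = medperf_status_code


def run_guarded(action):
    # Hypothetical wrapper: map any MedPerf exception to a process exit code.
    try:
        action()
    except CleanExit as exc:
        # A clean stop carries its own status code (0 by default).
        return exc.medperf_status_code
    except MedperfException:
        # Assumption for this sketch: every other MedPerf error maps to 1.
        return 1
    return 0


def cancelled():
    raise CleanExit('Operation cancelled by the user')


def unreachable_server():
    raise CommunicationRetrievalError('Could not retrieve the benchmark')


codes = [run_guarded(cancelled), run_guarded(unreachable_server)]
```

The order of the except clauses matters: CleanExit has to be handled before its base class, otherwise the generic branch would swallow it.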
"},{"location":"reference/init/","title":"Init","text":""},{"location":"reference/utils/","title":"Utils","text":""},{"location":"reference/utils/#utils.approval_prompt","title":"approval_prompt(msg)
","text":"Helper function for prompting the user for things they have to explicitly approve.
Parameters:
Name | Type | Description | Default
msg | str | What message to ask the user for approval. | required
Returns:
Name | Type | Description
bool | bool | Whether the user explicitly approved or not.
Source code in cli/medperf/utils.py
def approval_prompt(msg: str) -> bool:\n\"\"\"Helper function for prompting the user for things they have to explicitly approve.\n Args:\n msg (str): What message to ask the user for approval.\n Returns:\n bool: Whether the user explicitly approved or not.\n \"\"\"\nlogging.info(\"Prompting for user's approval\")\nui = config.ui\napproval = None\nwhile approval is None or approval not in \"yn\":\napproval = ui.prompt(msg.strip() + \" \").lower()\nlogging.info(f\"User answered approval with {approval}\")\nreturn approval == \"y\"\n
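The loop condition above is a substring test against the string yn, so an empty reply, or the literal reply yn, also ends the loop and is treated as a refusal. If a stricter prompt is wanted, one possible variant is sketched below. It is an illustration only: `strict_approval_prompt` and its `ask` parameter are hypothetical, with `ask` standing in for `config.ui.prompt`.

```python
from typing import Callable


def strict_approval_prompt(msg: str, ask: Callable[[str], str] = input) -> bool:
    # Hypothetical variant: keep asking until the reply is exactly 'y' or 'n'
    # (case-insensitive, surrounding whitespace ignored).
    while True:
        answer = ask(msg.strip() + ' ').strip().lower()
        if answer in ('y', 'n'):
            return answer == 'y'


# Scripted replies for illustration: the empty reply and 'yn' are rejected,
# the third reply is accepted.
replies = iter(['', 'yn', 'Y'])
approved = strict_approval_prompt('Proceed? [Y/n]', ask=lambda _: next(replies))
```

Comparing against a tuple of allowed answers instead of a string avoids the substring behaviour entirely.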
"},{"location":"reference/utils/#utils.check_for_updates","title":"check_for_updates()
","text":"Check if the current branch is up-to-date with its remote counterpart using GitPython.
Source code in cli/medperf/utils.py
def check_for_updates() -> None:\n\"\"\"Check if the current branch is up-to-date with its remote counterpart using GitPython.\"\"\"\nrepo = Repo(config.BASE_DIR)\nif repo.bare:\nlogging.debug(\"Repo is bare\")\nreturn\nlogging.debug(f\"Current git commit: {repo.head.commit.hexsha}\")\ntry:\nfor remote in repo.remotes:\nremote.fetch()\nif repo.head.is_detached:\nlogging.debug(\"Repo is in detached state\")\nreturn\ncurrent_branch = repo.active_branch\ntracking_branch = current_branch.tracking_branch()\nif tracking_branch is None:\nlogging.debug(\"Current branch does not track a remote branch.\")\nreturn\nif current_branch.commit.hexsha == tracking_branch.commit.hexsha:\nlogging.debug(\"No git branch updates.\")\nreturn\nlogging.debug(\nf\"Git branch updates found: {current_branch.commit.hexsha} -> {tracking_branch.commit.hexsha}\"\n)\nconfig.ui.print_warning(\n\"MedPerf client updates found. Please, update your MedPerf installation.\"\n)\nexcept GitCommandError as e:\nlogging.debug(\n\"Exception raised during updates check. Maybe user checked out repo with git@ and private key\"\n\" or repo is in detached / non-tracked state?\"\n)\nlogging.debug(e)\n
"},{"location":"reference/utils/#utils.cleanup","title":"cleanup()
","text":"Removes clutter and unused files from the medperf folder structure.
Source code in cli/medperf/utils.py
def cleanup():\n\"\"\"Removes clutter and unused files from the medperf folder structure.\"\"\"\nif not config.cleanup:\nlogging.info(\"Cleanup disabled\")\nreturn\nfor path in config.tmp_paths:\nremove_path(path)\ntrash_folder = config.trash_folder\nif os.path.exists(trash_folder) and os.listdir(trash_folder):\nmsg = \"WARNING: Failed to permanently clean up some files. Consider deleting\"\nmsg += f\" '{trash_folder}' manually to avoid unnecessary storage.\"\nconfig.ui.print_warning(msg)\n
"},{"location":"reference/utils/#utils.combine_proc_sp_text","title":"combine_proc_sp_text(proc)
","text":"Combines the output of a process and the spinner. Joins any string captured from the process with the spinner current text. Any strings ending with any other character from the subprocess will be returned later.
Parameters:
Name | Type | Description | Default
proc | spawn | a pexpect spawned child | required
Returns:
Name | Type | Description
str | str | all non-carriage-return-ending string captured from proc
Source code in cli/medperf/utils.py
def combine_proc_sp_text(proc: spawn) -> str:\n\"\"\"Combines the output of a process and the spinner.\n Joins any string captured from the process with the\n spinner current text. Any strings ending with any other\n character from the subprocess will be returned later.\n Args:\n proc (spawn): a pexpect spawned child\n Returns:\n str: all non-carriage-return-ending string captured from proc\n \"\"\"\nui = config.ui\nproc_out = \"\"\nbreak_ = False\nlog_filter = _MLCubeOutputFilter(proc.pid)\nwhile not break_:\nif not proc.isalive():\nbreak_ = True\ntry:\nline = proc.readline()\nexcept TIMEOUT:\nlogging.error(\"Process timed out\")\nlogging.debug(proc_out)\nraise ExecutionError(\"Process timed out\")\nline = line.decode(\"utf-8\", \"ignore\")\nif not line:\ncontinue\n# Always log each line just in case the final proc_out\n# wasn't logged for some reason\nlogging.debug(line)\nproc_out += line\nif not log_filter.check_line(line):\nui.print(f\"{Fore.WHITE}{Style.DIM}{line.strip()}{Style.RESET_ALL}\")\nlogging.debug(\"MLCube process finished\")\nlogging.debug(proc_out)\nreturn proc_out\n
"},{"location":"reference/utils/#utils.dict_pretty_print","title":"dict_pretty_print(in_dict, skip_none_values=True)
","text":"Helper function for distinctively printing dictionaries with yaml format.
Parameters:
Name | Type | Description | Default
in_dict | dict | dictionary to print | required
skip_none_values | bool | if fields with None values should be omitted | True
Source code in cli/medperf/utils.py
def dict_pretty_print(in_dict: dict, skip_none_values: bool = True):\n\"\"\"Helper function for distinctively printing dictionaries with yaml format.\n Args:\n in_dict (dict): dictionary to print\n skip_none_values (bool): if fields with `None` values should be omitted\n \"\"\"\nlogging.debug(f\"Printing dictionary to the user: {in_dict}\")\nui = config.ui\nui.print()\nui.print(\"=\" * 20)\nif skip_none_values:\nin_dict = {k: v for (k, v) in in_dict.items() if v is not None}\nui.print(yaml.dump(in_dict))\nlogging.debug(f\"Dictionary printed to the user: {in_dict}\")\nui.print(\"=\" * 20)\n
"},{"location":"reference/utils/#utils.filter_latest_associations","title":"filter_latest_associations(associations, entity_key)
","text":"Given a list of entity-benchmark associations, this function retrieves a list containing the latest association of each entity instance.
Parameters:
Name | Type | Description | Default
associations | list[dict] | the list of associations | required
entity_key | str | either \"dataset\" or \"model_mlcube\" | required
Returns:
Type | Description
list[dict] | the list containing the latest association of each entity instance.
Source code in cli/medperf/utils.py
def filter_latest_associations(associations, entity_key):\n\"\"\"Given a list of entity-benchmark associations, this function\n retrieves a list containing the latest association of each\n entity instance.\n Args:\n associations (list[dict]): the list of associations\n entity_key (str): either \"dataset\" or \"model_mlcube\"\n Returns:\n list[dict]: the list containing the latest association of each\n entity instance.\n \"\"\"\nassociations.sort(key=lambda assoc: parse_datetime(assoc[\"created_at\"]))\nlatest_associations = {}\nfor assoc in associations:\nentity_id = assoc[entity_key]\nlatest_associations[entity_id] = assoc\nlatest_associations = list(latest_associations.values())\nreturn latest_associations\n
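The function relies on two things: sorting by creation time, and the fact that a later dictionary assignment overwrites an earlier one for the same key. Note that it sorts with `list.sort`, so the caller's list is reordered in place. The sketch below shows the same idea on sample data without mutating the input. It makes a few assumptions: `latest_per_entity` is a hypothetical name, `datetime.fromisoformat` stands in for `parse_datetime`, and the `approval_status` field and its values are invented for the example.

```python
from datetime import datetime


def latest_per_entity(associations, entity_key):
    # sorted() returns a new list, so the caller's list keeps its order.
    ordered = sorted(
        associations, key=lambda assoc: datetime.fromisoformat(assoc['created_at'])
    )
    latest = {}
    for assoc in ordered:
        # Later (newer) associations overwrite earlier ones for the same entity.
        latest[assoc[entity_key]] = assoc
    return list(latest.values())


# Sample data; field names other than 'dataset' and 'created_at' are made up.
associations = [
    {'dataset': 1, 'approval_status': 'PENDING', 'created_at': '2023-01-01T10:00:00'},
    {'dataset': 2, 'approval_status': 'APPROVED', 'created_at': '2023-01-02T09:00:00'},
    {'dataset': 1, 'approval_status': 'APPROVED', 'created_at': '2023-01-03T12:00:00'},
]
latest = latest_per_entity(associations, 'dataset')
status_by_dataset = {assoc['dataset']: assoc['approval_status'] for assoc in latest}
```

Dataset 1 appears twice in the input, and only its newest association survives in the result.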
"},{"location":"reference/utils/#utils.format_errors_dict","title":"format_errors_dict(errors_dict)
","text":"Reformats the error details from a field-error(s) dictionary into a human-readable string for printing
Source code in cli/medperf/utils.py
def format_errors_dict(errors_dict: dict):\n\"\"\"Reformats the error details from a field-error(s) dictionary into a human-readable string for printing\"\"\"\nerror_msg = \"\"\nfor field, errors in errors_dict.items():\nerror_msg += \"\\n\"\nif isinstance(field, tuple):\nfield = field[0]\nerror_msg += f\"- {field}: \"\nif isinstance(errors, str):\nerror_msg += errors\nelif len(errors) == 1:\n# If a single error for a field is given, don't create a sublist\nerror_msg += errors[0]\nelse:\n# Create a sublist otherwise\nfor e_msg in errors:\nerror_msg += \"\\n\"\nerror_msg += f\"\\t- {e_msg}\"\nreturn error_msg\n
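To make the resulting layout concrete, here is a compact re-statement of the same formatting rules applied to a sample dictionary. It is a sketch, not the library function: `format_errors` is a hypothetical name and the field names and messages are invented.

```python
def format_errors(errors_dict):
    # One line per field; multiple errors become an indented sublist.
    lines = []
    for field, errors in errors_dict.items():
        if isinstance(field, tuple):
            # Tuple keys are reduced to their first element.
            field = field[0]
        if isinstance(errors, str):
            errors = [errors]
        if len(errors) == 1:
            lines.append(f'- {field}: {errors[0]}')
        else:
            lines.append(f'- {field}: ')
            lines.extend(f'\t- {error}' for error in errors)
    # Every line is preceded by a newline, as in the original function.
    return ''.join('\n' + line for line in lines)


message = format_errors(
    {
        'name': 'This field is required',
        ('url', 0): ['Enter a valid URL'],
        'tags': ['too long', 'not unique'],
    }
)
```

A single error stays on the same line as its field, while several errors for one field are listed underneath it.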
"},{"location":"reference/utils/#utils.generate_tmp_path","title":"generate_tmp_path()
","text":"Generates a temporary path by means of getting the current timestamp with a random salt
Returns:
Name | Type | Description
str | str | generated temporary path
Source code in cli/medperf/utils.py
def generate_tmp_path() -> str:\n\"\"\"Generates a temporary path by means of getting the current timestamp\n with a random salt\n Returns:\n str: generated temporary path\n \"\"\"\ntmp_path = os.path.join(config.tmp_folder, generate_tmp_uid())\nconfig.tmp_paths.append(tmp_path)\nreturn tmp_path\n
"},{"location":"reference/utils/#utils.generate_tmp_uid","title":"generate_tmp_uid()
","text":"Generates a temporary uid by means of getting the current timestamp with a random salt
Returns:
Name | Type | Description
str | str | generated temporary uid
Source code in cli/medperf/utils.py
def generate_tmp_uid() -> str:\n\"\"\"Generates a temporary uid by means of getting the current timestamp\n with a random salt\n Returns:\n str: generated temporary uid\n \"\"\"\ndt = datetime.utcnow()\nts_int = int(datetime.timestamp(dt))\nsalt = random.randint(-ts_int, ts_int)\nts = str(ts_int + salt)\nreturn ts\n
"},{"location":"reference/utils/#utils.get_cube_image_name","title":"get_cube_image_name(cube_path)
","text":"Retrieves the singularity image name of the mlcube by reading its mlcube.yaml file
Source code in cli/medperf/utils.py
def get_cube_image_name(cube_path: str) -> str:\n\"\"\"Retrieves the singularity image name of the mlcube by reading its mlcube.yaml file\"\"\"\ncube_config_path = os.path.join(cube_path, config.cube_filename)\nwith open(cube_config_path, \"r\") as f:\ncube_config = yaml.safe_load(f)\ntry:\n# TODO: Why do we check singularity only there? Why not docker?\nreturn cube_config[\"singularity\"][\"image\"]\nexcept KeyError:\nmsg = \"The provided mlcube doesn't seem to be configured for singularity\"\nraise MedperfException(msg)\n
"},{"location":"reference/utils/#utils.get_file_hash","title":"get_file_hash(path)
","text":"Calculates the sha256 hash for a given file.
Parameters:
Name | Type | Description | Default
path | str | Location of the file of interest. | required
Returns:
Name | Type | Description
str | str | Calculated hash
Source code in cli/medperf/utils.py
def get_file_hash(path: str) -> str:\n\"\"\"Calculates the sha256 hash for a given file.\n Args:\n path (str): Location of the file of interest.\n Returns:\n str: Calculated hash\n \"\"\"\nlogging.debug(\"Calculating hash for file {}\".format(path))\nBUF_SIZE = 65536\nsha = hashlib.sha256()\nwith open(path, \"rb\") as f:\nwhile True:\ndata = f.read(BUF_SIZE)\nif not data:\nbreak\nsha.update(data)\nsha_val = sha.hexdigest()\nlogging.debug(f\"Hash for file {path}: {sha_val}\")\nreturn sha_val\n
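Reading the file in fixed-size chunks keeps memory use constant regardless of file size, and the digest is the same as hashing the whole content at once. The sketch below checks that on a temporary file. `sha256_of_file` is a hypothetical stand-in written for this example, and the file name and payload are arbitrary.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, buf_size=65536):
    # Feed the file to the hash object one chunk at a time.
    sha = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(buf_size), b''):
            sha.update(chunk)
    return sha.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'weights.bin')
    payload = b'x' * 200_000  # larger than one buffer, so several reads happen
    with open(path, 'wb') as f:
        f.write(payload)
    chunked = sha256_of_file(path)
    one_shot = hashlib.sha256(payload).hexdigest()
```

Both `chunked` and `one_shot` hold the same 64-character hexadecimal digest.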
"},{"location":"reference/utils/#utils.get_folders_hash","title":"get_folders_hash(paths)
","text":"Generates a hash for all the contents of the fiven folders. This procedure hashes all the files in all passed folders, sorts them and then hashes that list.
Parameters:
Name | Type | Description | Default
paths | List[str] | Folders to hash. | required
Returns:
Name | Type | Description
str | str | sha256 hash that represents all the folders altogether
Source code in cli/medperf/utils.py
def get_folders_hash(paths: List[str]) -> str:\n\"\"\"Generates a hash for all the contents of the given folders. This procedure\n hashes all the files in all passed folders, sorts them and then hashes that list.\n Args:\n paths (List[str]): Folders to hash.\n Returns:\n str: sha256 hash that represents all the folders altogether\n \"\"\"\nhashes = []\n# The hash doesn't depend on the order of paths or folders, as the hashes get sorted after the fact\nfor path in paths:\nfor root, _, files in os.walk(path, topdown=False):\nfor file in files:\nlogging.debug(f\"Hashing file {file}\")\nfilepath = os.path.join(root, file)\nhashes.append(get_file_hash(filepath))\nhashes = sorted(hashes)\nsha = hashlib.sha256()\nfor hash in hashes:\nsha.update(hash.encode(\"utf-8\"))\nhash_val = sha.hexdigest()\nlogging.debug(f\"Folder hash: {hash_val}\")\nreturn hash_val\n
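Since only file contents are hashed and the per-file hashes are sorted before being combined, the result depends neither on the order of the folders nor on file names or directory layout. The following sketch illustrates both properties on temporary folders. `file_hash` and `folders_hash` are hypothetical re-statements of the two functions above, and the folder and file names are arbitrary.

```python
import hashlib
import os
import tempfile


def file_hash(path):
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()


def folders_hash(paths):
    # Hash every file, sort the hex digests, then hash the sorted list.
    hashes = []
    for path in paths:
        for root, _, files in os.walk(path):
            for name in files:
                hashes.append(file_hash(os.path.join(root, name)))
    sha = hashlib.sha256()
    for digest in sorted(hashes):
        sha.update(digest.encode('utf-8'))
    return sha.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    a, b, c = (os.path.join(tmp, name) for name in ('a', 'b', 'c'))
    samples = [(a, 'x.txt', b'one'), (b, 'y.txt', b'two'), (c, 'renamed.txt', b'one')]
    for folder, name, data in samples:
        os.makedirs(folder, exist_ok=True)
        with open(os.path.join(folder, name), 'wb') as f:
            f.write(data)
    forward = folders_hash([a, b])
    backward = folders_hash([b, a])
    renamed = folders_hash([c, b])  # same contents as [a, b], different file name
```

All three values are equal, which also means that renaming or moving a file inside the folders is not detected by this hash.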
"},{"location":"reference/utils/#utils.get_uids","title":"get_uids(path)
","text":"Retrieves the UID of all the elements in the specified path.
Returns:
Type | Description
List[str] | UIDs of objects in path.
Source code in cli/medperf/utils.py
def get_uids(path: str) -> List[str]:\n\"\"\"Retrieves the UID of all the elements in the specified path.\n Returns:\n List[str]: UIDs of objects in path.\n \"\"\"\nlogging.debug(\"Retrieving datasets\")\nuids = next(os.walk(path))[1]\nlogging.debug(f\"Found {len(uids)} datasets\")\nlogging.debug(f\"Datasets: {uids}\")\nreturn uids\n
"},{"location":"reference/utils/#utils.pretty_error","title":"pretty_error(msg)
","text":"Prints an error message with typer protocol
Parameters:
Name | Type | Description | Default
msg | str | Error message to show to the user | required
Source code in cli/medperf/utils.py
def pretty_error(msg: str):\n\"\"\"Prints an error message with typer protocol\n Args:\n msg (str): Error message to show to the user\n \"\"\"\nui = config.ui\nlogging.warning(\n\"MedPerf had to stop execution. See logs above for more information\"\n)\nif msg[-1] != \".\":\nmsg = msg + \".\"\nui.print_error(msg)\n
"},{"location":"reference/utils/#utils.remove_path","title":"remove_path(path)
","text":"Cleans up a clutter object. In case of failure, it is moved to .trash
Source code in cli/medperf/utils.py
def remove_path(path):\n\"\"\"Cleans up a clutter object. In case of failure, it is moved to `.trash`\"\"\"\n# NOTE: We assume medperf will always have permissions to unlink\n# and rename clutter paths, since for now they are expected to live\n# in folders owned by medperf\nif not os.path.exists(path):\nreturn\nlogging.info(f\"Removing clutter path: {path}\")\n# Don't delete symlinks\nif os.path.islink(path):\nos.unlink(path)\nreturn\ntry:\nif os.path.isfile(path):\nos.remove(path)\nelse:\nshutil.rmtree(path)\nexcept OSError as e:\nlogging.error(f\"Could not remove {path}: {str(e)}\")\nmove_to_trash(path)\n
"},{"location":"reference/utils/#utils.sanitize_json","title":"sanitize_json(data)
","text":"Makes sure the input data is JSON compliant.
Parameters:
Name | Type | Description | Default
data | dict | dictionary containing data to be represented as JSON. | required
Returns:
Name | Type | Description
dict | dict | sanitized dictionary
Source code in cli/medperf/utils.py
def sanitize_json(data: dict) -> dict:\n\"\"\"Makes sure the input data is JSON compliant.\n Args:\n data (dict): dictionary containing data to be represented as JSON.\n Returns:\n dict: sanitized dictionary\n \"\"\"\njson_string = json.dumps(data)\njson_string = re.sub(r\"\\bNaN\\b\", '\"nan\"', json_string)\njson_string = re.sub(r\"(-?)\\bInfinity\\b\", r'\"\\1Infinity\"', json_string)\ndata = json.loads(json_string)\nreturn data\n
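The function above rewrites the NaN and Infinity tokens on the serialized text, which means string values that happen to contain those words are touched as well. As an alternative illustration (a different technique, not what MedPerf does), the values can be replaced while walking the structure, so only actual floats are affected. `sanitize` is a hypothetical name and the sample keys are invented; unlike the original, this version does not round-trip through JSON, so non-string keys are left as they are.

```python
import json
import math


def sanitize(value):
    # Replace non-finite floats with the same string markers used above.
    if isinstance(value, float):
        if math.isnan(value):
            return 'nan'
        if math.isinf(value):
            return 'Infinity' if value > 0 else '-Infinity'
        return value
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value


results = {
    'dice': float('nan'),
    'bounds': [float('-inf'), 1.5],
    'note': 'NaN values present',  # a plain string, left untouched
}
clean = sanitize(results)
# allow_nan=False makes json.dumps raise if a non-finite float slipped through.
encoded = json.dumps(clean, allow_nan=False)
```

Passing `allow_nan=False` to `json.dumps` is a cheap way to confirm that the sanitized structure is strictly JSON compliant.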
"},{"location":"reference/utils/#utils.untar","title":"untar(filepath, remove=True)
","text":"Untars and optionally removes the tar.gz file
Parameters:
Name | Type | Description | Default
filepath | str | Path where the tar.gz file can be found. | required
remove | bool | Whether to delete the tar.gz file. Defaults to True. | True
Returns:
Name | Type | Description
str | str | location where the untarred files can be found.
Source code in cli/medperf/utils.py
def untar(filepath: str, remove: bool = True) -> str:\n\"\"\"Untars and optionally removes the tar.gz file\n Args:\n filepath (str): Path where the tar.gz file can be found.\n remove (bool): Whether to delete the tar.gz file. Defaults to True.\n Returns:\n str: location where the untarred files can be found.\n \"\"\"\nlogging.info(f\"Uncompressing tar.gz at {filepath}\")\naddpath = str(Path(filepath).parent)\ntar = tarfile.open(filepath)\ntar.extractall(addpath)\ntar.close()\n# OS Specific issue: Mac Creates superfluous files with tarfile library\n[\nremove_path(spurious_file)\nfor spurious_file in glob(addpath + \"/**/._*\", recursive=True)\n]\nif remove:\nlogging.info(f\"Deleting {filepath}\")\nremove_path(filepath)\nreturn addpath\n
"},{"location":"reference/account_management/account_management/","title":"Account management","text":""},{"location":"reference/account_management/account_management/#account_management.account_management.get_medperf_user_data","title":"get_medperf_user_data()
","text":"Return cached medperf user data. Get from the server if not found
Source code in cli/medperf/account_management/account_management.py
def get_medperf_user_data():\n\"\"\"Return cached medperf user data. Get from the server if not found\"\"\"\nconfig_p = read_config()\nif config.credentials_keyword not in config_p.active_profile:\nraise MedperfException(\"You are not logged in\")\nmedperf_user = config_p.active_profile[config.credentials_keyword].get(\n\"medperf_user\", None\n)\nif medperf_user is None:\nmedperf_user = set_medperf_user_data()\nreturn medperf_user\n
"},{"location":"reference/account_management/account_management/#account_management.account_management.set_medperf_user_data","title":"set_medperf_user_data()
","text":"Get and cache user data from the MedPerf server
Source code in cli/medperf/account_management/account_management.py
def set_medperf_user_data():\n\"\"\"Get and cache user data from the MedPerf server\"\"\"\nconfig_p = read_config()\nmedperf_user = config.comms.get_current_user()\nconfig_p.active_profile[config.credentials_keyword][\"medperf_user\"] = medperf_user\nwrite_config(config_p)\nreturn medperf_user\n
"},{"location":"reference/account_management/token_storage/filesystem/","title":"Filesystem","text":""},{"location":"reference/account_management/token_storage/keyring_/","title":"Keyring","text":"Keyring token storage is NOT used. We used it before this commit but users who connect to remote machines through passwordless SSH faced some issues.
"},{"location":"reference/commands/execution/","title":"Execution","text":""},{"location":"reference/commands/execution/#commands.execution.Execution","title":"Execution
","text":"Source code in cli/medperf/commands/execution.py
class Execution:\n@classmethod\ndef run(\ncls, dataset: Dataset, model: Cube, evaluator: Cube, ignore_model_errors=False\n):\n\"\"\"Benchmark execution flow.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n data_uid (str): Registered Dataset UID\n model_uid (int): UID of model to execute\n \"\"\"\nexecution = cls(dataset, model, evaluator, ignore_model_errors)\nexecution.prepare()\nwith execution.ui.interactive():\nexecution.run_inference()\nexecution.run_evaluation()\nexecution_summary = execution.todict()\nreturn execution_summary\ndef __init__(\nself, dataset: Dataset, model: Cube, evaluator: Cube, ignore_model_errors=False\n):\nself.comms = config.comms\nself.ui = config.ui\nself.dataset = dataset\nself.model = model\nself.evaluator = evaluator\nself.ignore_model_errors = ignore_model_errors\ndef prepare(self):\nself.partial = False\nself.preds_path = self.__setup_predictions_path()\nself.model_logs_path, self.metrics_logs_path = self.__setup_logs_path()\nself.results_path = generate_tmp_path()\nlogging.debug(f\"tmp results output: {self.results_path}\")\ndef __setup_logs_path(self):\nmodel_uid = self.model.local_id\neval_uid = self.evaluator.local_id\ndata_uid = self.dataset.local_id\nlogs_path = os.path.join(\nconfig.experiments_logs_folder, str(model_uid), str(data_uid)\n)\nos.makedirs(logs_path, exist_ok=True)\nmodel_logs_path = os.path.join(logs_path, \"model.log\")\nmetrics_logs_path = os.path.join(logs_path, f\"metrics_{eval_uid}.log\")\nreturn model_logs_path, metrics_logs_path\ndef __setup_predictions_path(self):\nmodel_uid = self.model.local_id\ndata_uid = self.dataset.local_id\npreds_path = os.path.join(\nconfig.predictions_folder, str(model_uid), str(data_uid)\n)\nif os.path.exists(preds_path):\nmsg = f\"Found existing predictions for model {self.model.id} on dataset \"\nmsg += f\"{self.dataset.id} at {preds_path}. 
Consider deleting this \"\nmsg += \"folder if you wish to overwrite the predictions.\"\nraise ExecutionError(msg)\nreturn preds_path\ndef run_inference(self):\nself.ui.text = \"Running model inference on dataset\"\ninfer_timeout = config.infer_timeout\npreds_path = self.preds_path\ndata_path = self.dataset.data_path\ntry:\nself.model.run(\ntask=\"infer\",\noutput_logs=self.model_logs_path,\ntimeout=infer_timeout,\ndata_path=data_path,\noutput_path=preds_path,\n)\nself.ui.print(\"> Model execution complete\")\nexcept ExecutionError as e:\nif not self.ignore_model_errors:\nlogging.error(f\"Model MLCube Execution failed: {e}\")\nraise ExecutionError(f\"Model MLCube failed: {e}\")\nelse:\nself.partial = True\nlogging.warning(f\"Model MLCube Execution failed: {e}\")\ndef run_evaluation(self):\nself.ui.text = \"Running model evaluation on dataset\"\nevaluate_timeout = config.evaluate_timeout\npreds_path = self.preds_path\nlabels_path = self.dataset.labels_path\nresults_path = self.results_path\nself.ui.text = \"Evaluating results\"\ntry:\nself.evaluator.run(\ntask=\"evaluate\",\noutput_logs=self.metrics_logs_path,\ntimeout=evaluate_timeout,\npredictions=preds_path,\nlabels=labels_path,\noutput_path=results_path,\n)\nexcept ExecutionError as e:\nlogging.error(f\"Metrics MLCube Execution failed: {e}\")\nraise ExecutionError(\"Metrics MLCube failed\")\ndef todict(self):\nreturn {\n\"results\": self.get_results(),\n\"partial\": self.partial,\n}\ndef get_results(self):\nwith open(self.results_path, \"r\") as f:\nresults = yaml.safe_load(f)\nreturn results\n
"},{"location":"reference/commands/execution/#commands.execution.Execution.run","title":"run(dataset, model, evaluator, ignore_model_errors=False)
classmethod
","text":"Benchmark execution flow.
Parameters:
Name | Type | Description | Default
benchmark_uid | int | UID of the desired benchmark | required
data_uid | str | Registered Dataset UID | required
model_uid | int | UID of model to execute | required
Source code in cli/medperf/commands/execution.py
@classmethod\ndef run(\ncls, dataset: Dataset, model: Cube, evaluator: Cube, ignore_model_errors=False\n):\n\"\"\"Benchmark execution flow.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n data_uid (str): Registered Dataset UID\n model_uid (int): UID of model to execute\n \"\"\"\nexecution = cls(dataset, model, evaluator, ignore_model_errors)\nexecution.prepare()\nwith execution.ui.interactive():\nexecution.run_inference()\nexecution.run_evaluation()\nexecution_summary = execution.todict()\nreturn execution_summary\n
"},{"location":"reference/commands/list/","title":"List","text":""},{"location":"reference/commands/list/#commands.list.EntityList","title":"EntityList
","text":"Source code in cli/medperf/commands/list.py
class EntityList:\n@staticmethod\ndef run(\nentity_class: Type[Entity],\nfields: List[str],\nunregistered: bool = False,\nmine_only: bool = False,\n**kwargs,\n):\n\"\"\"Lists all local datasets\n Args:\n unregistered (bool, optional): Display only local unregistered results. Defaults to False.\n mine_only (bool, optional): Display all registered current-user results. Defaults to False.\n kwargs (dict): Additional parameters for filtering entity lists.\n \"\"\"\nentity_list = EntityList(\nentity_class, fields, unregistered, mine_only, **kwargs\n)\nentity_list.prepare()\nentity_list.validate()\nentity_list.filter()\nentity_list.display()\ndef __init__(\nself,\nentity_class: Type[Entity],\nfields: List[str],\nunregistered: bool,\nmine_only: bool,\n**kwargs,\n):\nself.entity_class = entity_class\nself.fields = fields\nself.unregistered = unregistered\nself.mine_only = mine_only\nself.filters = kwargs\nself.data = []\ndef prepare(self):\nif self.mine_only:\nself.filters[\"owner\"] = get_medperf_user_data()[\"id\"]\nentities = self.entity_class.all(\nunregistered=self.unregistered, filters=self.filters\n)\nself.data = [entity.display_dict() for entity in entities]\ndef validate(self):\nif self.data:\nvalid_fields = set(self.data[0].keys())\nchosen_fields = set(self.fields)\nif not chosen_fields.issubset(valid_fields):\ninvalid_fields = chosen_fields.difference(valid_fields)\ninvalid_fields = \", \".join(invalid_fields)\nraise InvalidArgumentError(f\"Invalid field(s): {invalid_fields}\")\ndef filter(self):\nself.data = [\n{field: entity_dict[field] for field in self.fields}\nfor entity_dict in self.data\n]\ndef display(self):\nheaders = self.fields\ndata_lists = [list(entity_dict.values()) for entity_dict in self.data]\ntab = tabulate(data_lists, headers=headers)\nconfig.ui.print(tab)\n
"},{"location":"reference/commands/list/#commands.list.EntityList.run","title":"run(entity_class, fields, unregistered=False, mine_only=False, **kwargs)
staticmethod
","text":"Lists all local datasets
Parameters:
Name | Type | Description | Default
unregistered | bool | Display only local unregistered results. Defaults to False. | False
mine_only | bool | Display all registered current-user results. Defaults to False. | False
kwargs | dict | Additional parameters for filtering entity lists. | {}
Source code in cli/medperf/commands/list.py
@staticmethod\ndef run(\nentity_class: Type[Entity],\nfields: List[str],\nunregistered: bool = False,\nmine_only: bool = False,\n**kwargs,\n):\n\"\"\"Lists all local datasets\n Args:\n unregistered (bool, optional): Display only local unregistered results. Defaults to False.\n mine_only (bool, optional): Display all registered current-user results. Defaults to False.\n kwargs (dict): Additional parameters for filtering entity lists.\n \"\"\"\nentity_list = EntityList(\nentity_class, fields, unregistered, mine_only, **kwargs\n)\nentity_list.prepare()\nentity_list.validate()\nentity_list.filter()\nentity_list.display()\n
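The validate and filter steps of EntityList amount to checking the requested columns against the keys of the first row and then projecting every row onto those columns. A minimal sketch of that logic on plain dictionaries follows. It rests on assumptions: `project_fields` is a hypothetical helper, `ValueError` stands in for `InvalidArgumentError`, and the field names and rows are invented.

```python
def project_fields(rows, fields):
    # Validate: every requested field must exist in the first row (if any).
    if rows:
        invalid = set(fields) - set(rows[0].keys())
        if invalid:
            raise ValueError('Invalid field(s): ' + ', '.join(sorted(invalid)))
    # Filter: keep only the requested fields, in the requested order.
    return [{field: row[field] for field in fields} for row in rows]


# Sample rows, as an entity's display_dict() might produce (made-up values).
rows = [
    {'UID': 1, 'Name': 'chestxray', 'Registered': True},
    {'UID': 2, 'Name': 'brats', 'Registered': False},
]
table = project_fields(rows, ['Name', 'UID'])

try:
    project_fields(rows, ['Owner'])
except ValueError as exc:
    error = str(exc)
```

As in the original class, an empty result list skips validation, so an unknown field name goes unnoticed when there is nothing to display.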
"},{"location":"reference/commands/profile/","title":"Profile","text":""},{"location":"reference/commands/profile/#commands.profile.activate","title":"activate(profile)
","text":"Assigns the active profile, which is used by default
Parameters:
Name | Type | Description | Default
profile | str | Name of the profile to be used. | required
Source code in cli/medperf/commands/profile.py
@app.command(\"activate\")\n@clean_except\ndef activate(profile: str):\n\"\"\"Assigns the active profile, which is used by default\n Args:\n profile (str): Name of the profile to be used.\n \"\"\"\nconfig_p = read_config()\nif profile not in config_p:\nraise InvalidArgumentError(\"The provided profile does not exist\")\nconfig_p.activate(profile)\nwrite_config(config_p)\n
"},{"location":"reference/commands/profile/#commands.profile.create","title":"create(ctx, name=typer.Option(..., '--name', '-n', help=\"Profile's name\"))
","text":"Creates a new profile for managing and customizing configuration
Source code in cli/medperf/commands/profile.py
@app.command(\"create\")\n@clean_except\n@configurable\ndef create(\nctx: typer.Context,\nname: str = typer.Option(..., \"--name\", \"-n\", help=\"Profile's name\"),\n):\n\"\"\"Creates a new profile for managing and customizing configuration\"\"\"\nargs = ctx.params\nargs.pop(\"name\")\nconfig_p = read_config()\nif name in config_p:\nraise InvalidArgumentError(\"A profile with the same name already exists\")\nconfig_p[name] = args\nwrite_config(config_p)\n
"},{"location":"reference/commands/profile/#commands.profile.delete","title":"delete(profile)
","text":"Deletes a profile's configuration.
Parameters:
Name | Type | Description | Default
profile | str | Profile to delete. | required
Source code in cli/medperf/commands/profile.py
@app.command(\"delete\")\n@clean_except\ndef delete(profile: str):\n\"\"\"Deletes a profile's configuration.\n Args:\n profile (str): Profile to delete.\n \"\"\"\nconfig_p = read_config()\nif profile not in config_p.profiles:\nraise InvalidArgumentError(\"The provided profile does not exist\")\nif profile in [\nconfig.default_profile_name,\nconfig.testauth_profile_name,\nconfig.test_profile_name,\n]:\nraise InvalidArgumentError(\"Cannot delete reserved profiles\")\nif config_p.is_profile_active(profile):\nraise InvalidArgumentError(\"Cannot delete a currently activated profile\")\ndel config_p[profile]\nwrite_config(config_p)\n
"},{"location":"reference/commands/profile/#commands.profile.list","title":"list()
","text":"Lists all available profiles
Source code in cli/medperf/commands/profile.py
@app.command(\"ls\")\n@clean_except\ndef list():\n\"\"\"Lists all available profiles\"\"\"\nui = config.ui\nconfig_p = read_config()\nfor profile in config_p:\nif config_p.is_profile_active(profile):\nui.print_highlight(\"* \" + profile)\nelse:\nui.print(\" \" + profile)\n
"},{"location":"reference/commands/profile/#commands.profile.set_args","title":"set_args(ctx)
","text":"Assign key-value configuration pairs to the current profile.
Source code in cli/medperf/commands/profile.py
@app.command(\"set\")\n@clean_except\n@configurable\ndef set_args(ctx: typer.Context):\n    \"\"\"Assign key-value configuration pairs to the current profile.\"\"\"\n    args = ctx.params\n    config_p = read_config()\n    config_p.active_profile.update(args)\n    write_config(config_p)\n
"},{"location":"reference/commands/profile/#commands.profile.view","title":"view(profile=typer.Argument(None))
","text":"Displays a profile's configuration.
Parameters:
Name Type Description Default
profile
str
Profile to display information from. Defaults to active profile.
typer.Argument(None)
Source code in cli/medperf/commands/profile.py
@app.command(\"view\")\n@clean_except\ndef view(profile: str = typer.Argument(None)):\n\"\"\"Displays a profile's configuration.\n Args:\n profile (str, optional): Profile to display information from. Defaults to active profile.\n \"\"\"\nconfig_p = read_config()\nprofile_config = config_p.active_profile\nif profile:\nprofile_config = config_p[profile]\nprofile_config.pop(config.credentials_keyword, None)\nprofile_name = profile if profile else config_p.active_profile_name\nconfig.ui.print(f\"\\nProfile '{profile_name}':\")\ndict_pretty_print(profile_config, skip_none_values=False)\n
"},{"location":"reference/commands/storage/","title":"Storage","text":""},{"location":"reference/commands/storage/#commands.storage.clean","title":"clean()
","text":"Cleans up clutter paths
Source code in cli/medperf/commands/storage.py
@app.command(\"cleanup\")\ndef clean():\n    \"\"\"Cleans up clutter paths\"\"\"\n    # Force cleanup to be true\n    config.cleanup = True\n    cleanup()\n
"},{"location":"reference/commands/storage/#commands.storage.ls","title":"ls()
","text":"Show the location of the current medperf assets
Source code in cli/medperf/commands/storage.py
@app.command(\"ls\")\n@clean_except\ndef ls():\n    \"\"\"Show the location of the current medperf assets\"\"\"\n    headers = [\"Asset\", \"Location\"]\n    info = []\n    for folder in config.storage:\n        info.append((folder, config.storage[folder][\"base\"]))\n    tab = tabulate(info, headers=headers)\n    config.ui.print(tab)\n
"},{"location":"reference/commands/storage/#commands.storage.move","title":"move(path=typer.Option(..., '--target', '-t', help='Target path'))
","text":"Moves all storage folders to a target base path. Folders include: Benchmarks, datasets, mlcubes, results, tests, ...
Parameters:
Name Type Description Default
path
str
target path
typer.Option(..., '--target', '-t', help='Target path')
Source code in cli/medperf/commands/storage.py
@app.command(\"move\")\n@clean_except\ndef move(path: str = typer.Option(..., \"--target\", \"-t\", help=\"Target path\")):\n    \"\"\"Moves all storage folders to a target base path. Folders include:\n    Benchmarks, datasets, mlcubes, results, tests, ...\n    Args:\n        path (str): target path\n    \"\"\"\n    move_storage(path)\n
"},{"location":"reference/commands/view/","title":"View","text":""},{"location":"reference/commands/view/#commands.view.EntityView","title":"EntityView
","text":"Source code in cli/medperf/commands/view.py
class EntityView:\n@staticmethod\ndef run(\nentity_id: Union[int, str],\nentity_class: Type[Entity],\nformat: str = \"yaml\",\nunregistered: bool = False,\nmine_only: bool = False,\noutput: str = None,\n**kwargs,\n):\n\"\"\"Displays the contents of a single or multiple entities of a given type\n Args:\n entity_id (Union[int, str]): Entity identifies\n entity_class (Entity): Entity type\n unregistered (bool, optional): Display only local unregistered entities. Defaults to False.\n mine_only (bool, optional): Display all current-user entities. Defaults to False.\n format (str, optional): What format to use to display the contents. Valid formats: [yaml, json]. Defaults to yaml.\n output (str, optional): Path to a file for storing the entity contents. If not provided, the contents are printed.\n kwargs (dict): Additional parameters for filtering entity lists.\n \"\"\"\nentity_view = EntityView(\nentity_id, entity_class, format, unregistered, mine_only, output, **kwargs\n)\nentity_view.validate()\nentity_view.prepare()\nif output is None:\nentity_view.display()\nelse:\nentity_view.store()\ndef __init__(\nself,\nentity_id: Union[int, str],\nentity_class: Type[Entity],\nformat: str,\nunregistered: bool,\nmine_only: bool,\noutput: str,\n**kwargs,\n):\nself.entity_id = entity_id\nself.entity_class = entity_class\nself.format = format\nself.unregistered = unregistered\nself.mine_only = mine_only\nself.output = output\nself.filters = kwargs\nself.data = []\ndef validate(self):\nvalid_formats = set([\"yaml\", \"json\"])\nif self.format not in valid_formats:\nraise InvalidArgumentError(\"The provided format is not supported\")\ndef prepare(self):\nif self.entity_id is not None:\nentities = [self.entity_class.get(self.entity_id)]\nelse:\nif self.mine_only:\nself.filters[\"owner\"] = get_medperf_user_data()[\"id\"]\nentities = self.entity_class.all(\nunregistered=self.unregistered, filters=self.filters\n)\nself.data = [entity.todict() for entity in entities]\nif self.entity_id is 
not None:\n# User expects a single entity if id provided\n# Don't output the view as a list of entities\nself.data = self.data[0]\ndef display(self):\nif self.format == \"json\":\nformatter = json.dumps\nif self.format == \"yaml\":\nformatter = yaml.dump\nformatted_data = formatter(self.data)\nconfig.ui.print(formatted_data)\ndef store(self):\nif self.format == \"json\":\nformatter = json.dump\nif self.format == \"yaml\":\nformatter = yaml.dump\nwith open(self.output, \"w\") as f:\nformatter(self.data, f)\n
"},{"location":"reference/commands/view/#commands.view.EntityView.run","title":"run(entity_id, entity_class, format='yaml', unregistered=False, mine_only=False, output=None, **kwargs)
staticmethod
","text":"Displays the contents of a single or multiple entities of a given type
Parameters:
Name Type Description Default
entity_id
Union[int, str]
Entity identifier
required
entity_class
Entity
Entity type
required
unregistered
bool
Display only local unregistered entities. Defaults to False.
False
mine_only
bool
Display all current-user entities. Defaults to False.
False
format
str
What format to use to display the contents. Valid formats: [yaml, json]. Defaults to yaml.
'yaml'
output
str
Path to a file for storing the entity contents. If not provided, the contents are printed.
None
kwargs
dict
Additional parameters for filtering entity lists.
{}
Source code in cli/medperf/commands/view.py
@staticmethod\ndef run(\nentity_id: Union[int, str],\nentity_class: Type[Entity],\nformat: str = \"yaml\",\nunregistered: bool = False,\nmine_only: bool = False,\noutput: str = None,\n**kwargs,\n):\n\"\"\"Displays the contents of a single or multiple entities of a given type\n Args:\n entity_id (Union[int, str]): Entity identifies\n entity_class (Entity): Entity type\n unregistered (bool, optional): Display only local unregistered entities. Defaults to False.\n mine_only (bool, optional): Display all current-user entities. Defaults to False.\n format (str, optional): What format to use to display the contents. Valid formats: [yaml, json]. Defaults to yaml.\n output (str, optional): Path to a file for storing the entity contents. If not provided, the contents are printed.\n kwargs (dict): Additional parameters for filtering entity lists.\n \"\"\"\nentity_view = EntityView(\nentity_id, entity_class, format, unregistered, mine_only, output, **kwargs\n)\nentity_view.validate()\nentity_view.prepare()\nif output is None:\nentity_view.display()\nelse:\nentity_view.store()\n
"},{"location":"reference/commands/association/approval/","title":"Approval","text":""},{"location":"reference/commands/association/approval/#commands.association.approval.Approval","title":"Approval
","text":"Source code in cli/medperf/commands/association/approval.py
class Approval:\n@staticmethod\ndef run(\nbenchmark_uid: int,\napproval_status: str,\ndataset_uid: int = None,\nmlcube_uid: int = None,\n):\n\"\"\"Sets approval status for an association between a benchmark and a dataset or mlcube\n Args:\n benchmark_uid (int): Benchmark UID.\n approval_status (str): Desired approval status to set for the association.\n comms (Comms): Instance of Comms interface.\n ui (UI): Instance of UI interface.\n dataset_uid (int, optional): Dataset UID. Defaults to None.\n mlcube_uid (int, optional): MLCube UID. Defaults to None.\n \"\"\"\ncomms = config.comms\ntoo_many_resources = dataset_uid and mlcube_uid\nno_resource = dataset_uid is None and mlcube_uid is None\nif no_resource or too_many_resources:\nraise InvalidArgumentError(\"Must provide either a dataset or mlcube\")\nif dataset_uid:\ncomms.set_dataset_association_approval(\nbenchmark_uid, dataset_uid, approval_status.value\n)\nif mlcube_uid:\ncomms.set_mlcube_association_approval(\nbenchmark_uid, mlcube_uid, approval_status.value\n)\n
"},{"location":"reference/commands/association/approval/#commands.association.approval.Approval.run","title":"run(benchmark_uid, approval_status, dataset_uid=None, mlcube_uid=None)
staticmethod
","text":"Sets approval status for an association between a benchmark and a dataset or mlcube
Parameters:
Name Type Description Default
benchmark_uid
int
Benchmark UID.
required
approval_status
str
Desired approval status to set for the association.
required
comms
Comms
Instance of Comms interface.
required
ui
UI
Instance of UI interface.
required
dataset_uid
int
Dataset UID. Defaults to None.
None
mlcube_uid
int
MLCube UID. Defaults to None.
None
Source code in cli/medperf/commands/association/approval.py
@staticmethod\ndef run(\nbenchmark_uid: int,\napproval_status: str,\ndataset_uid: int = None,\nmlcube_uid: int = None,\n):\n\"\"\"Sets approval status for an association between a benchmark and a dataset or mlcube\n Args:\n benchmark_uid (int): Benchmark UID.\n approval_status (str): Desired approval status to set for the association.\n comms (Comms): Instance of Comms interface.\n ui (UI): Instance of UI interface.\n dataset_uid (int, optional): Dataset UID. Defaults to None.\n mlcube_uid (int, optional): MLCube UID. Defaults to None.\n \"\"\"\ncomms = config.comms\ntoo_many_resources = dataset_uid and mlcube_uid\nno_resource = dataset_uid is None and mlcube_uid is None\nif no_resource or too_many_resources:\nraise InvalidArgumentError(\"Must provide either a dataset or mlcube\")\nif dataset_uid:\ncomms.set_dataset_association_approval(\nbenchmark_uid, dataset_uid, approval_status.value\n)\nif mlcube_uid:\ncomms.set_mlcube_association_approval(\nbenchmark_uid, mlcube_uid, approval_status.value\n)\n
"},{"location":"reference/commands/association/association/","title":"Association","text":""},{"location":"reference/commands/association/association/#commands.association.association.approve","title":"approve(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), dataset_uid=typer.Option(None, '--dataset', '-d', help='Dataset UID'), mlcube_uid=typer.Option(None, '--mlcube', '-m', help='MLCube UID'))
","text":"Approves an association between a benchmark and a dataset or model mlcube
Parameters:
Name Type Description Default
benchmark_uid
int
Benchmark UID.
typer.Option(..., '--benchmark', '-b', help='Benchmark UID')
dataset_uid
int
Dataset UID.
typer.Option(None, '--dataset', '-d', help='Dataset UID')
mlcube_uid
int
Model MLCube UID.
typer.Option(None, '--mlcube', '-m', help='MLCube UID')
Source code in cli/medperf/commands/association/association.py
@app.command(\"approve\")\n@clean_except\ndef approve(\nbenchmark_uid: int = typer.Option(..., \"--benchmark\", \"-b\", help=\"Benchmark UID\"),\ndataset_uid: int = typer.Option(None, \"--dataset\", \"-d\", help=\"Dataset UID\"),\nmlcube_uid: int = typer.Option(None, \"--mlcube\", \"-m\", help=\"MLCube UID\"),\n):\n\"\"\"Approves an association between a benchmark and a dataset or model mlcube\n Args:\n benchmark_uid (int): Benchmark UID.\n dataset_uid (int, optional): Dataset UID.\n mlcube_uid (int, optional): Model MLCube UID.\n \"\"\"\nApproval.run(benchmark_uid, Status.APPROVED, dataset_uid, mlcube_uid)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/association/association/#commands.association.association.list","title":"list(filter=typer.Argument(None))
","text":"Display all associations related to the current user.
Parameters:
Name Type Description Default
filter
str
Filter associations by approval status. Defaults to displaying all user associations.
typer.Argument(None)
Source code in cli/medperf/commands/association/association.py
@app.command(\"ls\")\n@clean_except\ndef list(filter: Optional[str] = typer.Argument(None)):\n\"\"\"Display all associations related to the current user.\n Args:\n filter (str, optional): Filter associations by approval status.\n Defaults to displaying all user associations.\n \"\"\"\nListAssociations.run(filter)\n
"},{"location":"reference/commands/association/association/#commands.association.association.reject","title":"reject(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), dataset_uid=typer.Option(None, '--dataset', '-d', help='Dataset UID'), mlcube_uid=typer.Option(None, '--mlcube', '-m', help='MLCube UID'))
","text":"Rejects an association between a benchmark and a dataset or model mlcube
Parameters:
Name Type Description Default
benchmark_uid
int
Benchmark UID.
typer.Option(..., '--benchmark', '-b', help='Benchmark UID')
dataset_uid
int
Dataset UID.
typer.Option(None, '--dataset', '-d', help='Dataset UID')
mlcube_uid
int
Model MLCube UID.
typer.Option(None, '--mlcube', '-m', help='MLCube UID')
Source code in cli/medperf/commands/association/association.py
@app.command(\"reject\")\n@clean_except\ndef reject(\nbenchmark_uid: int = typer.Option(..., \"--benchmark\", \"-b\", help=\"Benchmark UID\"),\ndataset_uid: int = typer.Option(None, \"--dataset\", \"-d\", help=\"Dataset UID\"),\nmlcube_uid: int = typer.Option(None, \"--mlcube\", \"-m\", help=\"MLCube UID\"),\n):\n\"\"\"Rejects an association between a benchmark and a dataset or model mlcube\n Args:\n benchmark_uid (int): Benchmark UID.\n dataset_uid (int, optional): Dataset UID.\n mlcube_uid (int, optional): Model MLCube UID.\n \"\"\"\nApproval.run(benchmark_uid, Status.REJECTED, dataset_uid, mlcube_uid)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/association/association/#commands.association.association.set_priority","title":"set_priority(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), mlcube_uid=typer.Option(..., '--mlcube', '-m', help='MLCube UID'), priority=typer.Option(..., '--priority', '-p', help='Priority, an integer'))
","text":"Updates the priority of a benchmark-model association. Model priorities within a benchmark define which models need to be executed before others when this benchmark is run. A model with a higher priority is executed before a model with lower priority. The order of execution of models of the same priority is arbitrary.
Examples:
Assume there are three models of IDs (1,2,3), associated with a certain benchmark, all having priority = 0. - By setting the priority of model (2) to the value of 1, the client will make sure that model (2) is executed before models (1,3). - By setting the priority of model (1) to the value of -5, the client will make sure that models (2,3) are executed before model (1).
Parameters:
Name Type Description Default
benchmark_uid
int
Benchmark UID.
typer.Option(..., '--benchmark', '-b', help='Benchmark UID')
mlcube_uid
int
Model MLCube UID.
typer.Option(..., '--mlcube', '-m', help='MLCube UID')
priority
int
Priority, an integer
typer.Option(..., '--priority', '-p', help='Priority, an integer')
Source code in cli/medperf/commands/association/association.py
@app.command(\"set_priority\")\n@clean_except\ndef set_priority(\nbenchmark_uid: int = typer.Option(..., \"--benchmark\", \"-b\", help=\"Benchmark UID\"),\nmlcube_uid: int = typer.Option(..., \"--mlcube\", \"-m\", help=\"MLCube UID\"),\npriority: int = typer.Option(..., \"--priority\", \"-p\", help=\"Priority, an integer\"),\n):\n\"\"\"Updates the priority of a benchmark-model association. Model priorities within\n a benchmark define which models need to be executed before others when\n this benchmark is run. A model with a higher priority is executed before\n a model with lower priority. The order of execution of models of the same priority\n is arbitrary.\n Examples:\n Assume there are three models of IDs (1,2,3), associated with a certain benchmark,\n all having priority = 0.\n - By setting the priority of model (2) to the value of 1, the client will make\n sure that model (2) is executed before models (1,3).\n - By setting the priority of model (1) to the value of -5, the client will make\n sure that models (2,3) are executed before model (1).\n Args:\n benchmark_uid (int): Benchmark UID.\n mlcube_uid (int): Model MLCube UID.\n priority (int): Priority, an integer\n \"\"\"\nAssociationPriority.run(benchmark_uid, mlcube_uid, priority)\nconfig.ui.print(\"\u2705 Done!\")\n
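The ordering rule described in the docstring above can be sketched as follows. This is a minimal illustration under stated assumptions, not MedPerf's implementation: the function name execution_order and the dict mapping model UID to priority are hypothetical. Ties keep their input order here only because Python's sort is stable, whereas MedPerf treats the order of equal priorities as arbitrary.

```python
def execution_order(priorities):
    # priorities: hypothetical dict mapping model UID -> integer priority.
    # Higher priority runs first; sorted() is stable, so equal priorities
    # keep their input order (MedPerf itself makes no such guarantee).
    return sorted(priorities, key=lambda uid: priorities[uid], reverse=True)


# All three models start at priority 0; raising model 2 to 1 puts it first.
raised = execution_order({1: 0, 2: 1, 3: 0})

# Lowering model 1 to -5 instead puts it last.
lowered = execution_order({1: -5, 2: 0, 3: 0})
```

Only the position of the model whose priority changed is meaningful in either result; the relative order of the remaining two models should not be relied on.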
"},{"location":"reference/commands/association/list/","title":"List","text":""},{"location":"reference/commands/association/list/#commands.association.list.ListAssociations","title":"ListAssociations
","text":"Source code in cli/medperf/commands/association/list.py
class ListAssociations:\n@staticmethod\ndef run(filter: str = None):\n\"\"\"Get Pending association requests\"\"\"\ncomms = config.comms\nui = config.ui\ndset_assocs = comms.get_datasets_associations()\ncube_assocs = comms.get_cubes_associations()\n# Might be worth seeing if creating an association class that encapsulates\n# most of the logic here is useful\nassocs = dset_assocs + cube_assocs\nif filter:\nfilter = filter.upper()\nassocs = [assoc for assoc in assocs if assoc[\"approval_status\"] == filter]\nassocs_info = []\nfor assoc in assocs:\nassoc_info = (\nassoc.get(\"dataset\", None),\nassoc.get(\"model_mlcube\", None),\nassoc[\"benchmark\"],\nassoc[\"initiated_by\"],\nassoc[\"approval_status\"],\nassoc.get(\"priority\", None),\n# NOTE: We should find a better way to show priorities, since a priority\n# is better shown when listing cube associations only, of a specific\n# benchmark. Maybe this is resolved after we add a general filtering\n# feature to list commands.\n)\nassocs_info.append(assoc_info)\nheaders = [\n\"Dataset UID\",\n\"MLCube UID\",\n\"Benchmark UID\",\n\"Initiated by\",\n\"Status\",\n\"Priority\",\n]\ntab = tabulate(assocs_info, headers=headers)\nui.print(tab)\n
"},{"location":"reference/commands/association/list/#commands.association.list.ListAssociations.run","title":"run(filter=None)
staticmethod
","text":"Get Pending association requests
Source code in cli/medperf/commands/association/list.py
@staticmethod\ndef run(filter: str = None):\n\"\"\"Get Pending association requests\"\"\"\ncomms = config.comms\nui = config.ui\ndset_assocs = comms.get_datasets_associations()\ncube_assocs = comms.get_cubes_associations()\n# Might be worth seeing if creating an association class that encapsulates\n# most of the logic here is useful\nassocs = dset_assocs + cube_assocs\nif filter:\nfilter = filter.upper()\nassocs = [assoc for assoc in assocs if assoc[\"approval_status\"] == filter]\nassocs_info = []\nfor assoc in assocs:\nassoc_info = (\nassoc.get(\"dataset\", None),\nassoc.get(\"model_mlcube\", None),\nassoc[\"benchmark\"],\nassoc[\"initiated_by\"],\nassoc[\"approval_status\"],\nassoc.get(\"priority\", None),\n# NOTE: We should find a better way to show priorities, since a priority\n# is better shown when listing cube associations only, of a specific\n# benchmark. Maybe this is resolved after we add a general filtering\n# feature to list commands.\n)\nassocs_info.append(assoc_info)\nheaders = [\n\"Dataset UID\",\n\"MLCube UID\",\n\"Benchmark UID\",\n\"Initiated by\",\n\"Status\",\n\"Priority\",\n]\ntab = tabulate(assocs_info, headers=headers)\nui.print(tab)\n
"},{"location":"reference/commands/association/priority/","title":"Priority","text":""},{"location":"reference/commands/association/priority/#commands.association.priority.AssociationPriority","title":"AssociationPriority
","text":"Source code in cli/medperf/commands/association/priority.py
class AssociationPriority:\n    @staticmethod\n    def run(benchmark_uid: int, mlcube_uid: int, priority: int):\n        \"\"\"Sets priority for an association between a benchmark and an mlcube\n        Args:\n            benchmark_uid (int): Benchmark UID.\n            mlcube_uid (int): MLCube UID.\n            priority (int): priority value\n        \"\"\"\n        associated_cubes = Benchmark.get_models_uids(benchmark_uid)\n        if mlcube_uid not in associated_cubes:\n            raise InvalidArgumentError(\n                \"The given mlcube doesn't exist or is not associated with the benchmark\"\n            )\n        config.comms.set_mlcube_association_priority(\n            benchmark_uid, mlcube_uid, priority\n        )\n
"},{"location":"reference/commands/association/priority/#commands.association.priority.AssociationPriority.run","title":"run(benchmark_uid, mlcube_uid, priority)
staticmethod
","text":"Sets priority for an association between a benchmark and an mlcube
Parameters:
Name Type Description Default
benchmark_uid
int
Benchmark UID.
required
mlcube_uid
int
MLCube UID.
required
priority
int
priority value
required
Source code in cli/medperf/commands/association/priority.py
@staticmethod\ndef run(benchmark_uid: int, mlcube_uid: int, priority: int):\n    \"\"\"Sets priority for an association between a benchmark and an mlcube\n    Args:\n        benchmark_uid (int): Benchmark UID.\n        mlcube_uid (int): MLCube UID.\n        priority (int): priority value\n    \"\"\"\n    associated_cubes = Benchmark.get_models_uids(benchmark_uid)\n    if mlcube_uid not in associated_cubes:\n        raise InvalidArgumentError(\n            \"The given mlcube doesn't exist or is not associated with the benchmark\"\n        )\n    config.comms.set_mlcube_association_priority(\n        benchmark_uid, mlcube_uid, priority\n    )\n
"},{"location":"reference/commands/auth/auth/","title":"Auth","text":""},{"location":"reference/commands/auth/auth/#commands.auth.auth.login","title":"login(email=typer.Option(None, '--email', '-e', help='The email associated with your account'))
","text":"Authenticate to be able to access the MedPerf server. A verification link will be provided and should be opened in a browser to complete the login process.
Source code in cli/medperf/commands/auth/auth.py
@app.command(\"login\")\n@clean_except\ndef login(\nemail: str = typer.Option(\nNone, \"--email\", \"-e\", help=\"The email associated with your account\"\n)\n):\n\"\"\"Authenticate to be able to access the MedPerf server. A verification link will\n be provided and should be open in a browser to complete the login process.\"\"\"\nLogin.run(email)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/auth/auth/#commands.auth.auth.logout","title":"logout()
","text":"Revoke the currently active login state.
Source code in cli/medperf/commands/auth/auth.py
@app.command(\"logout\")\n@clean_except\ndef logout():\n    \"\"\"Revoke the currently active login state.\"\"\"\n    Logout.run()\n    config.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/auth/auth/#commands.auth.auth.status","title":"status()
","text":"Shows the currently logged in user.
Source code in cli/medperf/commands/auth/auth.py
@app.command(\"status\")\n@clean_except\ndef status():\n    \"\"\"Shows the currently logged in user.\"\"\"\n    Status.run()\n
"},{"location":"reference/commands/auth/auth/#commands.auth.auth.synapse_login","title":"synapse_login(token=typer.Option(None, '--token', '-t', help='Personal Access Token to login with'))
","text":"Log in to the Synapse server with a Personal Access Token. If no token is passed, you are prompted for one.
Source code in cli/medperf/commands/auth/auth.py
@app.command(\"synapse_login\")\n@clean_except\ndef synapse_login(\ntoken: str = typer.Option(\nNone, \"--token\", \"-t\", help=\"Personal Access Token to login with\"\n),\n):\n\"\"\"Login to the synapse server.\n Provide either a username and a password, or a token\n \"\"\"\nSynapseLogin.run(token=token)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/auth/login/","title":"Login","text":""},{"location":"reference/commands/auth/login/#commands.auth.login.Login","title":"Login
","text":"Source code in cli/medperf/commands/auth/login.py
class Login:\n@staticmethod\ndef run(email: str = None):\n\"\"\"Authenticate to be able to access the MedPerf server. A verification link will\n be provided and should be open in a browser to complete the login process.\"\"\"\nraise_if_logged_in()\nif not email:\nemail = config.ui.prompt(\"Please type your email: \")\ntry:\nvalidate_email(email, check_deliverability=False)\nexcept EmailNotValidError as e:\nraise InvalidArgumentError(str(e))\nconfig.auth.login(email)\n
"},{"location":"reference/commands/auth/login/#commands.auth.login.Login.run","title":"run(email=None)
staticmethod
","text":"Authenticate to be able to access the MedPerf server. A verification link will be provided and should be opened in a browser to complete the login process.
Source code in cli/medperf/commands/auth/login.py
@staticmethod\ndef run(email: str = None):\n\"\"\"Authenticate to be able to access the MedPerf server. A verification link will\n be provided and should be open in a browser to complete the login process.\"\"\"\nraise_if_logged_in()\nif not email:\nemail = config.ui.prompt(\"Please type your email: \")\ntry:\nvalidate_email(email, check_deliverability=False)\nexcept EmailNotValidError as e:\nraise InvalidArgumentError(str(e))\nconfig.auth.login(email)\n
"},{"location":"reference/commands/auth/logout/","title":"Logout","text":""},{"location":"reference/commands/auth/logout/#commands.auth.logout.Logout","title":"Logout
","text":"Source code in cli/medperf/commands/auth/logout.py
class Logout:\n    @staticmethod\n    def run():\n        \"\"\"Revoke the currently active login state.\"\"\"\n        config.auth.logout()\n
"},{"location":"reference/commands/auth/logout/#commands.auth.logout.Logout.run","title":"run()
staticmethod
","text":"Revoke the currently active login state.
Source code in cli/medperf/commands/auth/logout.py
@staticmethod\ndef run():\n    \"\"\"Revoke the currently active login state.\"\"\"\n    config.auth.logout()\n
"},{"location":"reference/commands/auth/status/","title":"Status","text":""},{"location":"reference/commands/auth/status/#commands.auth.status.Status","title":"Status
","text":"Source code in cli/medperf/commands/auth/status.py
class Status:\n    @staticmethod\n    def run():\n        \"\"\"Shows the currently logged in user.\"\"\"\n        account_info = read_user_account()\n        if account_info is None:\n            config.ui.print(\"You are not logged in\")\n            return\n        email = account_info[\"email\"]\n        config.ui.print(f\"Logged in user email address: {email}\")\n
"},{"location":"reference/commands/auth/status/#commands.auth.status.Status.run","title":"run()
staticmethod
","text":"Shows the currently logged in user.
Source code in cli/medperf/commands/auth/status.py
@staticmethod\ndef run():\n    \"\"\"Shows the currently logged in user.\"\"\"\n    account_info = read_user_account()\n    if account_info is None:\n        config.ui.print(\"You are not logged in\")\n        return\n    email = account_info[\"email\"]\n    config.ui.print(f\"Logged in user email address: {email}\")\n
"},{"location":"reference/commands/auth/synapse_login/","title":"Synapse login","text":""},{"location":"reference/commands/auth/synapse_login/#commands.auth.synapse_login.SynapseLogin","title":"SynapseLogin
","text":"Source code in cli/medperf/commands/auth/synapse_login.py
class SynapseLogin:\n@classmethod\ndef run(cls, token: str = None):\n\"\"\"Login to the Synapse server. Must be done only once.\"\"\"\nif not token:\nmsg = (\n\"Please provide your Synapse Personal Access Token (PAT). \"\n\"You can generate a new PAT at \"\n\"https://www.synapse.org/#!PersonalAccessTokens:0\\n\"\n\"Synapse PAT: \"\n)\ntoken = config.ui.hidden_prompt(msg)\ncls.login_with_token(token)\n@classmethod\ndef login_with_token(cls, access_token=None):\n\"\"\"Login to the Synapse server. Must be done only once.\"\"\"\nsyn = synapseclient.Synapse()\ntry:\nsyn.login(authToken=access_token)\nexcept SynapseAuthenticationError as err:\nraise CommunicationAuthenticationError(\"Invalid Synapse credentials\") from err\n
"},{"location":"reference/commands/auth/synapse_login/#commands.auth.synapse_login.SynapseLogin.login_with_token","title":"login_with_token(access_token=None)
classmethod
","text":"Login to the Synapse server. Must be done only once.
Source code in cli/medperf/commands/auth/synapse_login.py
@classmethod\ndef login_with_token(cls, access_token=None):\n\"\"\"Login to the Synapse server. Must be done only once.\"\"\"\nsyn = synapseclient.Synapse()\ntry:\nsyn.login(authToken=access_token)\nexcept SynapseAuthenticationError as err:\nraise CommunicationAuthenticationError(\"Invalid Synapse credentials\") from err\n
"},{"location":"reference/commands/auth/synapse_login/#commands.auth.synapse_login.SynapseLogin.run","title":"run(token=None)
classmethod
","text":"Login to the Synapse server. Must be done only once.
Source code in cli/medperf/commands/auth/synapse_login.py
@classmethod\ndef run(cls, token: str = None):\n\"\"\"Login to the Synapse server. Must be done only once.\"\"\"\nif not token:\nmsg = (\n\"Please provide your Synapse Personal Access Token (PAT). \"\n\"You can generate a new PAT at \"\n\"https://www.synapse.org/#!PersonalAccessTokens:0\\n\"\n\"Synapse PAT: \"\n)\ntoken = config.ui.hidden_prompt(msg)\ncls.login_with_token(token)\n
"},{"location":"reference/commands/benchmark/associate/","title":"Associate","text":""},{"location":"reference/commands/benchmark/associate/#commands.benchmark.associate.AssociateBenchmark","title":"AssociateBenchmark
","text":"Source code in cli/medperf/commands/benchmark/associate.py
class AssociateBenchmark:\n@classmethod\ndef run(\ncls,\nbenchmark_uid: int,\nmodel_uid: int,\ndata_uid: int,\napproved=False,\nno_cache=False,\n):\n\"\"\"Associates a dataset or model to the given benchmark\n Args:\n benchmark_uid (int): UID of benchmark to associate entities with\n model_uid (int): UID of model to associate with benchmark\n data_uid (int): UID of dataset to associate with benchmark\n comms (Comms): Instance of Communications interface\n ui (UI): Instance of UI interface\n approved (bool): Skip approval step. Defaults to False\n \"\"\"\ntoo_many_resources = data_uid and model_uid\nno_resource = data_uid is None and model_uid is None\nif no_resource or too_many_resources:\nraise InvalidArgumentError(\"Must provide either a dataset or mlcube\")\nif model_uid is not None:\nAssociateCube.run(\nmodel_uid, benchmark_uid, approved=approved, no_cache=no_cache\n)\nif data_uid is not None:\nAssociateDataset.run(\ndata_uid, benchmark_uid, approved=approved, no_cache=no_cache\n)\n
"},{"location":"reference/commands/benchmark/associate/#commands.benchmark.associate.AssociateBenchmark.run","title":"run(benchmark_uid, model_uid, data_uid, approved=False, no_cache=False)
classmethod
","text":"Associates a dataset or model to the given benchmark
Parameters:
Name Type Description Default
benchmark_uid
int
UID of benchmark to associate entities with
required
model_uid
int
UID of model to associate with benchmark
required
data_uid
int
UID of dataset to associate with benchmark
required
comms
Comms
Instance of Communications interface
required
ui
UI
Instance of UI interface
required
approved
bool
Skip approval step. Defaults to False
False
Source code in cli/medperf/commands/benchmark/associate.py
@classmethod\ndef run(\ncls,\nbenchmark_uid: int,\nmodel_uid: int,\ndata_uid: int,\napproved=False,\nno_cache=False,\n):\n\"\"\"Associates a dataset or model to the given benchmark\n Args:\n benchmark_uid (int): UID of benchmark to associate entities with\n model_uid (int): UID of model to associate with benchmark\n data_uid (int): UID of dataset to associate with benchmark\n comms (Comms): Instance of Communications interface\n ui (UI): Instance of UI interface\n approved (bool): Skip approval step. Defaults to False\n \"\"\"\ntoo_many_resources = data_uid and model_uid\nno_resource = data_uid is None and model_uid is None\nif no_resource or too_many_resources:\nraise InvalidArgumentError(\"Must provide either a dataset or mlcube\")\nif model_uid is not None:\nAssociateCube.run(\nmodel_uid, benchmark_uid, approved=approved, no_cache=no_cache\n)\nif data_uid is not None:\nAssociateDataset.run(\ndata_uid, benchmark_uid, approved=approved, no_cache=no_cache\n)\n
"},{"location":"reference/commands/benchmark/benchmark/","title":"Benchmark","text":""},{"location":"reference/commands/benchmark/benchmark/#commands.benchmark.benchmark.associate","title":"associate(benchmark_uid=typer.Option(..., '--benchmark_uid', '-b', help='UID of benchmark to associate with'), model_uid=typer.Option(None, '--model_uid', '-m', help='UID of model MLCube to associate'), dataset_uid=typer.Option(None, '--data_uid', '-d', help='Server UID of registered dataset to associate'), approval=typer.Option(False, '-y', help='Skip approval step'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'))
","text":"Associates a benchmark with a given mlcube or dataset. Only one option at a time.
Source code in cli/medperf/commands/benchmark/benchmark.py
@app.command(\"associate\")\n@clean_except\ndef associate(\nbenchmark_uid: int = typer.Option(\n..., \"--benchmark_uid\", \"-b\", help=\"UID of benchmark to associate with\"\n),\nmodel_uid: int = typer.Option(\nNone, \"--model_uid\", \"-m\", help=\"UID of model MLCube to associate\"\n),\ndataset_uid: int = typer.Option(\nNone, \"--data_uid\", \"-d\", help=\"Server UID of registered dataset to associate\"\n),\napproval: bool = typer.Option(False, \"-y\", help=\"Skip approval step\"),\nno_cache: bool = typer.Option(\nFalse,\n\"--no-cache\",\nhelp=\"Execute the test even if results already exist\",\n),\n):\n\"\"\"Associates a benchmark with a given mlcube or dataset. Only one option at a time.\"\"\"\nAssociateBenchmark.run(\nbenchmark_uid, model_uid, dataset_uid, approved=approval, no_cache=no_cache\n)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/benchmark/benchmark/#commands.benchmark.benchmark.list","title":"list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered benchmarks'), mine=typer.Option(False, '--mine', help='Get current-user benchmarks'))
","text":"List benchmarks
Source code in cli/medperf/commands/benchmark/benchmark.py
@app.command(\"ls\")\n@clean_except\ndef list(\nunregistered: bool = typer.Option(\nFalse, \"--unregistered\", help=\"Get unregistered benchmarks\"\n),\nmine: bool = typer.Option(False, \"--mine\", help=\"Get current-user benchmarks\"),\n):\n\"\"\"List benchmarks\"\"\"\nEntityList.run(\nBenchmark,\nfields=[\"UID\", \"Name\", \"Description\", \"State\", \"Approval Status\", \"Registered\"],\nunregistered=unregistered,\nmine_only=mine,\n)\n
"},{"location":"reference/commands/benchmark/benchmark/#commands.benchmark.benchmark.run","title":"run(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='UID of the desired benchmark'), data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), file=typer.Option(None, '--models-from-file', '-f', help='A file containing the model UIDs to be executed.\\n\\n The file should contain a single line as a list of\\n\\n comma-separated integers corresponding to the model UIDs'), ignore_model_errors=typer.Option(False, '--ignore-model-errors', help='Ignore failing model cubes, allowing for possibly submitting partial results'), no_cache=typer.Option(False, '--no-cache', help='Execute even if results already exist'))
","text":"Runs the benchmark execution step for a given benchmark, prepared dataset and model
Source code in cli/medperf/commands/benchmark/benchmark.py
@app.command(\"run\")\n@clean_except\ndef run(\nbenchmark_uid: int = typer.Option(\n..., \"--benchmark\", \"-b\", help=\"UID of the desired benchmark\"\n),\ndata_uid: int = typer.Option(\n..., \"--data_uid\", \"-d\", help=\"Registered Dataset UID\"\n),\nfile: str = typer.Option(\nNone,\n\"--models-from-file\",\n\"-f\",\nhelp=\"\"\"A file containing the model UIDs to be executed.\\n\n The file should contain a single line as a list of\\n\n comma-separated integers corresponding to the model UIDs\"\"\",\n),\nignore_model_errors: bool = typer.Option(\nFalse,\n\"--ignore-model-errors\",\nhelp=\"Ignore failing model cubes, allowing for possibly submitting partial results\",\n),\nno_cache: bool = typer.Option(\nFalse,\n\"--no-cache\",\nhelp=\"Execute even if results already exist\",\n),\n):\n\"\"\"Runs the benchmark execution step for a given benchmark, prepared dataset and model\"\"\"\nBenchmarkExecution.run(\nbenchmark_uid,\ndata_uid,\nmodels_uids=None,\nno_cache=no_cache,\nmodels_input_file=file,\nignore_model_errors=ignore_model_errors,\nshow_summary=True,\nignore_failed_experiments=True,\n)\nconfig.ui.print(\"\u2705 Done!\")\n
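The --models-from-file option expects a single line of comma-separated integers. A minimal sketch of parsing such a line is shown here; the function name parse_model_uids is hypothetical and the error handling is an assumption, since the real parsing lives inside BenchmarkExecution and is not shown in this listing.

```python
from typing import List


def parse_model_uids(line: str) -> List[int]:
    # Hypothetical helper: turns a line such as '1, 2,3' into [1, 2, 3].
    parts = [part.strip() for part in line.strip().split(',')]
    # Reject empty fields and anything that is not a plain decimal number.
    if any(not part.isdecimal() for part in parts):
        raise ValueError('Expected comma-separated integers, got: ' + repr(line))
    return [int(part) for part in parts]
```

Validating before converting gives one clear error message for the whole line instead of a bare int conversion failure on the first bad field.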
"},{"location":"reference/commands/benchmark/benchmark/#commands.benchmark.benchmark.submit","title":"submit(name=typer.Option(..., '--name', '-n', help='Name of the benchmark'), description=typer.Option(..., '--description', '-d', help='Description of the benchmark'), docs_url=typer.Option('', '--docs-url', '-u', help='URL to documentation'), demo_url=typer.Option(..., '--demo-url', help='Identifier to download the demonstration dataset tarball file.\\n\\n See `medperf mlcube submit --help` for more information'), demo_hash=typer.Option('', '--demo-hash', help='Hash of demonstration dataset tarball file'), data_preparation_mlcube=typer.Option(..., '--data-preparation-mlcube', '-p', help='Data Preparation MLCube UID'), reference_model_mlcube=typer.Option(..., '--reference-model-mlcube', '-m', help='Reference Model MLCube UID'), evaluator_mlcube=typer.Option(..., '--evaluator-mlcube', '-e', help='Evaluator MLCube UID'), skip_data_preparation_step=typer.Option(False, '--skip-demo-data-preparation', help='Use this flag if the demo dataset is already prepared'), operational=typer.Option(False, '--operational', help='Submit the Benchmark as OPERATIONAL'))
","text":"Submits a new benchmark to the platform
Source code in cli/medperf/commands/benchmark/benchmark.py
@app.command(\"submit\")\n@clean_except\ndef submit(\nname: str = typer.Option(..., \"--name\", \"-n\", help=\"Name of the benchmark\"),\ndescription: str = typer.Option(\n..., \"--description\", \"-d\", help=\"Description of the benchmark\"\n),\ndocs_url: str = typer.Option(\"\", \"--docs-url\", \"-u\", help=\"URL to documentation\"),\ndemo_url: str = typer.Option(\n...,\n\"--demo-url\",\nhelp=\"\"\"Identifier to download the demonstration dataset tarball file.\\n\n See `medperf mlcube submit --help` for more information\"\"\",\n),\ndemo_hash: str = typer.Option(\n\"\", \"--demo-hash\", help=\"Hash of demonstration dataset tarball file\"\n),\ndata_preparation_mlcube: int = typer.Option(\n..., \"--data-preparation-mlcube\", \"-p\", help=\"Data Preparation MLCube UID\"\n),\nreference_model_mlcube: int = typer.Option(\n..., \"--reference-model-mlcube\", \"-m\", help=\"Reference Model MLCube UID\"\n),\nevaluator_mlcube: int = typer.Option(\n..., \"--evaluator-mlcube\", \"-e\", help=\"Evaluator MLCube UID\"\n),\nskip_data_preparation_step: bool = typer.Option(\nFalse,\n\"--skip-demo-data-preparation\",\nhelp=\"Use this flag if the demo dataset is already prepared\",\n),\noperational: bool = typer.Option(\nFalse,\n\"--operational\",\nhelp=\"Submit the Benchmark as OPERATIONAL\",\n),\n):\n\"\"\"Submits a new benchmark to the platform\"\"\"\nbenchmark_info = {\n\"name\": name,\n\"description\": description,\n\"docs_url\": docs_url,\n\"demo_dataset_tarball_url\": demo_url,\n\"demo_dataset_tarball_hash\": demo_hash,\n\"data_preparation_mlcube\": data_preparation_mlcube,\n\"reference_model_mlcube\": reference_model_mlcube,\n\"data_evaluator_mlcube\": evaluator_mlcube,\n\"state\": \"OPERATION\" if operational else \"DEVELOPMENT\",\n}\nSubmitBenchmark.run(\nbenchmark_info,\nskip_data_preparation_step=skip_data_preparation_step,\n)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/benchmark/benchmark/#commands.benchmark.benchmark.view","title":"view(entity_id=typer.Argument(None, help='Benchmark ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered benchmarks if benchmark ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user benchmarks if benchmark ID is not provided'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
","text":"Displays the information of one or more benchmarks
Source code in cli/medperf/commands/benchmark/benchmark.py
@app.command(\"view\")\n@clean_except\ndef view(\nentity_id: Optional[int] = typer.Argument(None, help=\"Benchmark ID\"),\nformat: str = typer.Option(\n\"yaml\",\n\"-f\",\n\"--format\",\nhelp=\"Format to display contents. Available formats: [yaml, json]\",\n),\nunregistered: bool = typer.Option(\nFalse,\n\"--unregistered\",\nhelp=\"Display unregistered benchmarks if benchmark ID is not provided\",\n),\nmine: bool = typer.Option(\nFalse,\n\"--mine\",\nhelp=\"Display current-user benchmarks if benchmark ID is not provided\",\n),\noutput: str = typer.Option(\nNone,\n\"--output\",\n\"-o\",\nhelp=\"Output file to store contents. If not provided, the output will be displayed\",\n),\n):\n\"\"\"Displays the information of one or more benchmarks\"\"\"\nEntityView.run(entity_id, Benchmark, format, unregistered, mine, output)\n
"},{"location":"reference/commands/benchmark/submit/","title":"Submit","text":""},{"location":"reference/commands/benchmark/submit/#commands.benchmark.submit.SubmitBenchmark","title":"SubmitBenchmark
","text":"Source code in cli/medperf/commands/benchmark/submit.py
class SubmitBenchmark:\n@classmethod\ndef run(\ncls,\nbenchmark_info: dict,\nno_cache: bool = True,\nskip_data_preparation_step: bool = False,\n):\n\"\"\"Submits a new cube to the medperf platform\n Args:\n benchmark_info (dict): benchmark information\n expected keys:\n name (str): benchmark name\n description (str): benchmark description\n docs_url (str): benchmark documentation url\n demo_url (str): benchmark demo dataset url\n demo_hash (str): benchmark demo dataset hash\n data_preparation_mlcube (int): benchmark data preparation mlcube uid\n reference_model_mlcube (int): benchmark reference model mlcube uid\n evaluator_mlcube (int): benchmark data evaluator mlcube uid\n \"\"\"\nui = config.ui\nsubmission = cls(benchmark_info, no_cache, skip_data_preparation_step)\nwith ui.interactive():\nui.text = \"Getting additional information\"\nsubmission.get_extra_information()\nui.print(\"> Completed benchmark registration information\")\nui.text = \"Submitting Benchmark to MedPerf\"\nupdated_benchmark_body = submission.submit()\nui.print(\"Uploaded\")\nsubmission.to_permanent_path(updated_benchmark_body)\nsubmission.write(updated_benchmark_body)\ndef __init__(\nself,\nbenchmark_info: dict,\nno_cache: bool = True,\nskip_data_preparation_step: bool = False,\n):\nself.ui = config.ui\nself.bmk = Benchmark(**benchmark_info)\nself.no_cache = no_cache\nself.skip_data_preparation_step = skip_data_preparation_step\nself.bmk.metadata[\"demo_dataset_already_prepared\"] = skip_data_preparation_step\nconfig.tmp_paths.append(self.bmk.path)\ndef get_extra_information(self):\n\"\"\"Retrieves information that must be populated automatically,\n like hash, generated uid and test results\n \"\"\"\nbmk_demo_url = self.bmk.demo_dataset_tarball_url\nbmk_demo_hash = self.bmk.demo_dataset_tarball_hash\ntry:\n_, demo_hash = resources.get_benchmark_demo_dataset(\nbmk_demo_url, bmk_demo_hash\n)\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"Demo dataset {bmk_demo_url}: 
{e}\")\nself.bmk.demo_dataset_tarball_hash = demo_hash\ndemo_uid, results = self.run_compatibility_test()\nself.bmk.demo_dataset_generated_uid = demo_uid\nself.bmk.metadata[\"results\"] = results\ndef run_compatibility_test(self):\n\"\"\"Runs a compatibility test to ensure elements are compatible,\n and to extract additional information required for submission\n \"\"\"\nself.ui.print(\"Running compatibility test\")\nself.bmk.write()\ndata_uid, results = CompatibilityTestExecution.run(\nbenchmark=self.bmk.local_id,\nno_cache=self.no_cache,\nskip_data_preparation_step=self.skip_data_preparation_step,\n)\nreturn data_uid, results\ndef submit(self):\nupdated_body = self.bmk.upload()\nreturn updated_body\ndef to_permanent_path(self, bmk_dict: dict):\n\"\"\"Renames the temporary benchmark submission to a permanent one\n Args:\n bmk_dict (dict): dictionary containing updated information of the submitted benchmark\n \"\"\"\nold_bmk_loc = self.bmk.path\nupdated_bmk = Benchmark(**bmk_dict)\nnew_bmk_loc = updated_bmk.path\nremove_path(new_bmk_loc)\nos.rename(old_bmk_loc, new_bmk_loc)\ndef write(self, updated_body):\nbmk = Benchmark(**updated_body)\nbmk.write()\n
"},{"location":"reference/commands/benchmark/submit/#commands.benchmark.submit.SubmitBenchmark.get_extra_information","title":"get_extra_information()
","text":"Retrieves information that must be populated automatically, like hash, generated uid and test results
Source code in cli/medperf/commands/benchmark/submit.py
def get_extra_information(self):\n\"\"\"Retrieves information that must be populated automatically,\n like hash, generated uid and test results\n \"\"\"\nbmk_demo_url = self.bmk.demo_dataset_tarball_url\nbmk_demo_hash = self.bmk.demo_dataset_tarball_hash\ntry:\n_, demo_hash = resources.get_benchmark_demo_dataset(\nbmk_demo_url, bmk_demo_hash\n)\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"Demo dataset {bmk_demo_url}: {e}\")\nself.bmk.demo_dataset_tarball_hash = demo_hash\ndemo_uid, results = self.run_compatibility_test()\nself.bmk.demo_dataset_generated_uid = demo_uid\nself.bmk.metadata[\"results\"] = results\n
"},{"location":"reference/commands/benchmark/submit/#commands.benchmark.submit.SubmitBenchmark.run","title":"run(benchmark_info, no_cache=True, skip_data_preparation_step=False)
classmethod
","text":"Submits a new benchmark to the MedPerf platform
Parameters:
benchmark_info (dict): benchmark information. Required. Expected keys:
name (str): benchmark name
description (str): benchmark description
docs_url (str): benchmark documentation url
demo_url (str): benchmark demo dataset url
demo_hash (str): benchmark demo dataset hash
data_preparation_mlcube (int): benchmark data preparation mlcube uid
reference_model_mlcube (int): benchmark reference model mlcube uid
evaluator_mlcube (int): benchmark data evaluator mlcube uid
Source code in cli/medperf/commands/benchmark/submit.py
@classmethod\ndef run(\ncls,\nbenchmark_info: dict,\nno_cache: bool = True,\nskip_data_preparation_step: bool = False,\n):\n\"\"\"Submits a new benchmark to the MedPerf platform\n Args:\n benchmark_info (dict): benchmark information\n expected keys:\n name (str): benchmark name\n description (str): benchmark description\n docs_url (str): benchmark documentation url\n demo_url (str): benchmark demo dataset url\n demo_hash (str): benchmark demo dataset hash\n data_preparation_mlcube (int): benchmark data preparation mlcube uid\n reference_model_mlcube (int): benchmark reference model mlcube uid\n evaluator_mlcube (int): benchmark data evaluator mlcube uid\n \"\"\"\nui = config.ui\nsubmission = cls(benchmark_info, no_cache, skip_data_preparation_step)\nwith ui.interactive():\nui.text = \"Getting additional information\"\nsubmission.get_extra_information()\nui.print(\"> Completed benchmark registration information\")\nui.text = \"Submitting Benchmark to MedPerf\"\nupdated_benchmark_body = submission.submit()\nui.print(\"Uploaded\")\nsubmission.to_permanent_path(updated_benchmark_body)\nsubmission.write(updated_benchmark_body)\n
"},{"location":"reference/commands/benchmark/submit/#commands.benchmark.submit.SubmitBenchmark.run_compatibility_test","title":"run_compatibility_test()
","text":"Runs a compatibility test to ensure elements are compatible, and to extract additional information required for submission
Source code in cli/medperf/commands/benchmark/submit.py
def run_compatibility_test(self):\n\"\"\"Runs a compatibility test to ensure elements are compatible,\n and to extract additional information required for submission\n \"\"\"\nself.ui.print(\"Running compatibility test\")\nself.bmk.write()\ndata_uid, results = CompatibilityTestExecution.run(\nbenchmark=self.bmk.local_id,\nno_cache=self.no_cache,\nskip_data_preparation_step=self.skip_data_preparation_step,\n)\nreturn data_uid, results\n
"},{"location":"reference/commands/benchmark/submit/#commands.benchmark.submit.SubmitBenchmark.to_permanent_path","title":"to_permanent_path(bmk_dict)
","text":"Renames the temporary benchmark submission to a permanent one
Parameters:
bmk_dict (dict): dictionary containing updated information of the submitted benchmark. Required.
Source code in cli/medperf/commands/benchmark/submit.py
def to_permanent_path(self, bmk_dict: dict):\n\"\"\"Renames the temporary benchmark submission to a permanent one\n Args:\n bmk_dict (dict): dictionary containing updated information of the submitted benchmark\n \"\"\"\nold_bmk_loc = self.bmk.path\nupdated_bmk = Benchmark(**bmk_dict)\nnew_bmk_loc = updated_bmk.path\nremove_path(new_bmk_loc)\nos.rename(old_bmk_loc, new_bmk_loc)\n
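The rename step above clears whatever already sits at the destination and then moves the temporary submission into place. A standalone sketch of that pattern, using only the standard library, might look like this. The helper name move_to_permanent is hypothetical, and shutil.rmtree stands in for the remove_path utility, whose implementation is not shown here.

```python
import os
import shutil


def move_to_permanent(tmp_path: str, final_path: str) -> None:
    # Hypothetical helper: drop any stale destination, then rename.
    if os.path.isdir(final_path):
        shutil.rmtree(final_path)
    elif os.path.exists(final_path):
        os.remove(final_path)
    # os.rename only works within one filesystem; that is assumed here.
    os.rename(tmp_path, final_path)
```

Removing the destination first matters because renaming a directory onto an existing non-empty directory fails on most platforms.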
"},{"location":"reference/commands/compatibility_test/compatibility_test/","title":"Compatibility test","text":""},{"location":"reference/commands/compatibility_test/compatibility_test/#commands.compatibility_test.compatibility_test.list","title":"list()
","text":"List previously executed tests reports.
Source code in cli/medperf/commands/compatibility_test/compatibility_test.py
@app.command(\"ls\")\n@clean_except\ndef list():\n\"\"\"List previously executed test reports.\"\"\"\nEntityList.run(\nTestReport,\nfields=[\"UID\", \"Data Source\", \"Model\", \"Evaluator\"],\nunregistered=True,\n)\n
"},{"location":"reference/commands/compatibility_test/compatibility_test/#commands.compatibility_test.compatibility_test.run","title":"run(benchmark_uid=typer.Option(None, '--benchmark', '-b', help='UID of the benchmark to test. Optional'), data_uid=typer.Option(None, '--data_uid', '-d', help='Prepared Dataset UID. Used for dataset testing. Optional. Defaults to benchmark demo dataset.'), demo_dataset_url=typer.Option(None, '--demo_dataset_url', help='Identifier to download the demonstration dataset tarball file.\\n\\n See `medperf mlcube submit --help` for more information'), demo_dataset_hash=typer.Option(None, '--demo_dataset_hash', help='Hash of the demo dataset, if provided.'), data_path=typer.Option(None, '--data_path', help='Path to raw input data.'), labels_path=typer.Option(None, '--labels_path', help='Path to the labels of the raw input data, if provided.'), data_prep=typer.Option(None, '--data_preparation', '-p', help='UID or local path to the data preparation mlcube. Optional. Defaults to benchmark data preparation mlcube.'), model=typer.Option(None, '--model', '-m', help='UID or local path to the model mlcube. Optional. Defaults to benchmark reference mlcube.'), evaluator=typer.Option(None, '--evaluator', '-e', help='UID or local path to the evaluator mlcube. Optional. Defaults to benchmark evaluator mlcube'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'), offline=typer.Option(False, '--offline', help='Execute the test without connecting to the MedPerf server.'), skip_data_preparation_step=typer.Option(False, '--skip-demo-data-preparation', help='Use this flag if the passed demo dataset or data path is already prepared'))
","text":"Executes a compatibility test for a determined benchmark. Can test prepared and unprepared datasets, remote and local models independently.
Source code in cli/medperf/commands/compatibility_test/compatibility_test.py
@app.command(\"run\")\n@clean_except\ndef run(\nbenchmark_uid: int = typer.Option(\nNone, \"--benchmark\", \"-b\", help=\"UID of the benchmark to test. Optional\"\n),\ndata_uid: str = typer.Option(\nNone,\n\"--data_uid\",\n\"-d\",\nhelp=\"Prepared Dataset UID. Used for dataset testing. Optional. Defaults to benchmark demo dataset.\",\n),\ndemo_dataset_url: str = typer.Option(\nNone,\n\"--demo_dataset_url\",\nhelp=\"\"\"Identifier to download the demonstration dataset tarball file.\\n\n See `medperf mlcube submit --help` for more information\"\"\",\n),\ndemo_dataset_hash: str = typer.Option(\nNone, \"--demo_dataset_hash\", help=\"Hash of the demo dataset, if provided.\"\n),\ndata_path: str = typer.Option(None, \"--data_path\", help=\"Path to raw input data.\"),\nlabels_path: str = typer.Option(\nNone,\n\"--labels_path\",\nhelp=\"Path to the labels of the raw input data, if provided.\",\n),\ndata_prep: str = typer.Option(\nNone,\n\"--data_preparation\",\n\"-p\",\nhelp=\"UID or local path to the data preparation mlcube. Optional. Defaults to benchmark data preparation mlcube.\",\n),\nmodel: str = typer.Option(\nNone,\n\"--model\",\n\"-m\",\nhelp=\"UID or local path to the model mlcube. Optional. Defaults to benchmark reference mlcube.\",\n),\nevaluator: str = typer.Option(\nNone,\n\"--evaluator\",\n\"-e\",\nhelp=\"UID or local path to the evaluator mlcube. Optional. 
Defaults to benchmark evaluator mlcube\",\n),\nno_cache: bool = typer.Option(\nFalse, \"--no-cache\", help=\"Execute the test even if results already exist\"\n),\noffline: bool = typer.Option(\nFalse,\n\"--offline\",\nhelp=\"Execute the test without connecting to the MedPerf server.\",\n),\nskip_data_preparation_step: bool = typer.Option(\nFalse,\n\"--skip-demo-data-preparation\",\nhelp=\"Use this flag if the passed demo dataset or data path is already prepared\",\n),\n):\n\"\"\"\n Executes a compatibility test for a determined benchmark.\n Can test prepared and unprepared datasets, remote and local models independently.\n \"\"\"\nCompatibilityTestExecution.run(\nbenchmark_uid,\ndata_prep,\nmodel,\nevaluator,\ndata_path,\nlabels_path,\ndemo_dataset_url,\ndemo_dataset_hash,\ndata_uid,\nno_cache=no_cache,\noffline=offline,\nskip_data_preparation_step=skip_data_preparation_step,\n)\nconfig.ui.print(\"\u2705 Done!\")\n
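Several options above fall back to the benchmark's own components when they are not given. A minimal sketch of that override-with-fallback idea follows. The function resolve_components and its dictionary keys are hypothetical names chosen for illustration, and the sketch deliberately ignores the prepared-dataset case, where no data preparation cube is needed.

```python
from typing import Dict, Optional


def resolve_components(
    benchmark_defaults: Dict[str, str],
    data_prep: Optional[str] = None,
    model: Optional[str] = None,
    evaluator: Optional[str] = None,
) -> Dict[str, str]:
    # Hypothetical helper: an explicit value wins, otherwise the
    # benchmark default for that component is used.
    overrides = {'data_prep': data_prep, 'model': model, 'evaluator': evaluator}
    return {
        name: value if value is not None else benchmark_defaults[name]
        for name, value in overrides.items()
    }
```

For example, passing only a local model path tests that model against the benchmark's registered data preparation and evaluator cubes.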
"},{"location":"reference/commands/compatibility_test/compatibility_test/#commands.compatibility_test.compatibility_test.view","title":"view(entity_id=typer.Argument(None, help='Test report ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
","text":"Displays the information of one or more test reports
Source code in cli/medperf/commands/compatibility_test/compatibility_test.py
@app.command(\"view\")\n@clean_except\ndef view(\nentity_id: Optional[str] = typer.Argument(None, help=\"Test report ID\"),\nformat: str = typer.Option(\n\"yaml\",\n\"-f\",\n\"--format\",\nhelp=\"Format to display contents. Available formats: [yaml, json]\",\n),\noutput: str = typer.Option(\nNone,\n\"--output\",\n\"-o\",\nhelp=\"Output file to store contents. If not provided, the output will be displayed\",\n),\n):\n\"\"\"Displays the information of one or more test reports\"\"\"\nEntityView.run(entity_id, TestReport, format, unregistered=True, output=output)\n
"},{"location":"reference/commands/compatibility_test/run/","title":"Run","text":""},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution","title":"CompatibilityTestExecution
","text":"Source code in cli/medperf/commands/compatibility_test/run.py
class CompatibilityTestExecution:\n@classmethod\ndef run(\ncls,\nbenchmark: int = None,\ndata_prep: str = None,\nmodel: str = None,\nevaluator: str = None,\ndata_path: str = None,\nlabels_path: str = None,\ndemo_dataset_url: str = None,\ndemo_dataset_hash: str = None,\ndata_uid: str = None,\nno_cache: bool = False,\noffline: bool = False,\nskip_data_preparation_step: bool = False,\n) -> (str, dict):\n\"\"\"Execute a test workflow. Components of a complete workflow should be passed.\n When only the benchmark is provided, it implies the following workflow will be used:\n - the benchmark's demo dataset is used as the raw data\n - the benchmark's data preparation cube is used\n - the benchmark's reference model cube is used\n - the benchmark's metrics cube is used\n Overriding benchmark's components:\n - The data preparation, model, and metrics cubes can be overridden by specifying a cube either\n as an integer (registered) or a path (local). The path can refer either to the mlcube config\n file or to the mlcube directory containing the mlcube config file.\n - Instead of using the demo dataset of the benchmark, the input raw data can be overridden by providing:\n - a demo dataset url and its hash\n - data path and labels path\n - A prepared dataset can be directly used. 
In this case the data preparator cube is never used.\n The prepared data can be provided by either specifying an integer (registered) or a hash of a\n locally prepared dataset.\n Whether the benchmark is provided or not, the command will fail either if the user fails to\n provide a valid complete workflow, or if the user provided extra redundant parameters.\n Args:\n benchmark (int, optional): Benchmark to run the test workflow for\n data_prep (str, optional): data preparation mlcube uid or local path.\n model (str, optional): model mlcube uid or local path.\n evaluator (str, optional): evaluator mlcube uid or local path.\n data_path (str, optional): path to a local raw data\n labels_path (str, optional): path to the labels of the local raw data\n demo_dataset_url (str, optional): Identifier to download the demonstration dataset tarball file.\\n\n See `medperf mlcube submit --help` for more information\n demo_dataset_hash (str, optional): The hash of the demo dataset tarball file\n data_uid (str, optional): A prepared dataset UID\n no_cache (bool): Whether to ignore cached results of the test execution. Defaults to False.\n offline (bool): Whether to disable communication to the MedPerf server and rely only on\n local copies of the server assets. Defaults to False.\n Returns:\n (str): Prepared Dataset UID used for the test. 
Could be the one provided or a generated one.\n (dict): Results generated by the test.\n \"\"\"\nlogging.info(\"Starting test execution\")\ntest_exec = cls(\nbenchmark,\ndata_prep,\nmodel,\nevaluator,\ndata_path,\nlabels_path,\ndemo_dataset_url,\ndemo_dataset_hash,\ndata_uid,\nno_cache,\noffline,\nskip_data_preparation_step,\n)\ntest_exec.validate()\ntest_exec.set_data_source()\ntest_exec.process_benchmark()\ntest_exec.prepare_cubes()\ntest_exec.prepare_dataset()\ntest_exec.initialize_report()\nresults = test_exec.cached_results()\nif results is None:\nresults = test_exec.execute()\ntest_exec.write(results)\nelse:\nlogging.info(\"Existing results are found. Test would not be re-executed.\")\nlogging.debug(f\"Existing results: {results}\")\nreturn test_exec.data_uid, results\ndef __init__(\nself,\nbenchmark: int = None,\ndata_prep: str = None,\nmodel: str = None,\nevaluator: str = None,\ndata_path: str = None,\nlabels_path: str = None,\ndemo_dataset_url: str = None,\ndemo_dataset_hash: str = None,\ndata_uid: str = None,\nno_cache: bool = False,\noffline: bool = False,\nskip_data_preparation_step: bool = False,\n):\nself.benchmark_uid = benchmark\nself.data_prep = data_prep\nself.model = model\nself.evaluator = evaluator\nself.data_path = data_path\nself.labels_path = labels_path\nself.demo_dataset_url = demo_dataset_url\nself.demo_dataset_hash = demo_dataset_hash\nself.data_uid = data_uid\nself.no_cache = no_cache\nself.offline = offline\nself.skip_data_preparation_step = skip_data_preparation_step\n# This property will be set to either \"path\", \"demo\", \"prepared\", or \"benchmark\"\nself.data_source = None\nself.dataset = None\nself.model_cube = None\nself.evaluator_cube = None\nself.validator = CompatibilityTestParamsValidator(\nself.benchmark_uid,\nself.data_prep,\nself.model,\nself.evaluator,\nself.data_path,\nself.labels_path,\nself.demo_dataset_url,\nself.demo_dataset_hash,\nself.data_uid,\n)\ndef validate(self):\nself.validator.validate()\ndef 
set_data_source(self):\nself.data_source = self.validator.get_data_source()\ndef process_benchmark(self):\n\"\"\"Process the benchmark input if given. Sets the needed parameters from\n the benchmark.\"\"\"\nif not self.benchmark_uid:\nreturn\nbenchmark = Benchmark.get(self.benchmark_uid, local_only=self.offline)\nif self.data_source != \"prepared\":\nself.data_prep = self.data_prep or benchmark.data_preparation_mlcube\nself.model = self.model or benchmark.reference_model_mlcube\nself.evaluator = self.evaluator or benchmark.data_evaluator_mlcube\nif self.data_source == \"benchmark\":\nself.demo_dataset_url = benchmark.demo_dataset_tarball_url\nself.demo_dataset_hash = benchmark.demo_dataset_tarball_hash\nself.skip_data_preparation_step = benchmark.metadata.get(\n\"demo_dataset_already_prepared\", False\n)\ndef prepare_cubes(self):\n\"\"\"Prepares the mlcubes. If the provided mlcube is a path, it will create\n a temporary uid and link the cube path to the medperf storage path.\"\"\"\nif self.data_source != \"prepared\":\nlogging.info(f\"Establishing the data preparation cube: {self.data_prep}\")\nself.data_prep = prepare_cube(self.data_prep)\nlogging.info(f\"Establishing the model cube: {self.model}\")\nself.model = prepare_cube(self.model)\nlogging.info(f\"Establishing the evaluator cube: {self.evaluator}\")\nself.evaluator = prepare_cube(self.evaluator)\nself.model_cube = get_cube(self.model, \"Model\", local_only=self.offline)\nself.evaluator_cube = get_cube(\nself.evaluator, \"Evaluator\", local_only=self.offline\n)\ndef prepare_dataset(self):\n\"\"\"Assigns the data_uid used for testing and retrieves the dataset.\n If the data is not prepared, it calls the data preparation step\n on the given local data path or using a remote demo dataset.\"\"\"\nlogging.info(\"Establishing data_uid for test execution\")\nif self.data_source != \"prepared\":\nif self.data_source == \"path\":\ndata_path, labels_path = self.data_path, self.labels_path\n# TODO: this has to be 
redesigned. Compatibility tests command\n# is starting to have a lot of input arguments. For now\n# let's not support accepting a metadata path\nmetadata_path = None\nelse:\ndata_path, labels_path, metadata_path = download_demo_data(\nself.demo_dataset_url, self.demo_dataset_hash\n)\nself.data_uid = create_test_dataset(\ndata_path,\nlabels_path,\nmetadata_path,\nself.data_prep,\nself.skip_data_preparation_step,\n)\nself.dataset = Dataset.get(self.data_uid, local_only=self.offline)\ndef initialize_report(self):\n\"\"\"Initializes an instance of `TestReport` to hold the current test information.\"\"\"\nreport_data = {\n\"demo_dataset_url\": self.demo_dataset_url,\n\"demo_dataset_hash\": self.demo_dataset_hash,\n\"data_path\": self.data_path,\n\"labels_path\": self.labels_path,\n\"prepared_data_hash\": self.data_uid,\n\"data_preparation_mlcube\": self.data_prep,\n\"model\": self.model,\n\"data_evaluator_mlcube\": self.evaluator,\n}\nself.report = TestReport(**report_data)\ndef cached_results(self):\n\"\"\"Checks the existence of, and retrieves if possible, the compatibility test\n result. 
This method is called prior to the test execution.\n Returns:\n (dict|None): None if the results do not exist or if self.no_cache is True,\n otherwise it returns the found results.\n \"\"\"\nif self.no_cache:\nreturn\nuid = self.report.local_id\ntry:\nreport = TestReport.get(uid)\nexcept InvalidArgumentError:\nreturn\nlogging.info(f\"Existing report {uid} was detected.\")\nlogging.info(\"The compatibility test will not be re-executed.\")\nreturn report.results\ndef execute(self):\n\"\"\"Runs the test execution flow and returns the results\n Returns:\n dict: returns the results of the test execution.\n \"\"\"\nexecution_summary = Execution.run(\ndataset=self.dataset,\nmodel=self.model_cube,\nevaluator=self.evaluator_cube,\nignore_model_errors=False,\n)\nreturn execution_summary[\"results\"]\ndef write(self, results):\n\"\"\"Writes a report of the test execution to the disk\n Args:\n results (dict): the results of the test execution\n \"\"\"\nself.report.set_results(results)\nself.report.write()\n
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.cached_results","title":"cached_results()
","text":"Checks the existence of, and retrieves if possible, the compatibility test result. This method is called prior to the test execution.
Returns:
(dict | None): None if the results do not exist or if self.no_cache is True, otherwise the found results.
Source code in cli/medperf/commands/compatibility_test/run.py
def cached_results(self):\n\"\"\"checks the existance of, and retrieves if possible, the compatibility test\n result. This method is called prior to the test execution.\n Returns:\n (dict|None): None if the results does not exist or if self.no_cache is True,\n otherwise it returns the found results.\n \"\"\"\nif self.no_cache:\nreturn\nuid = self.report.local_id\ntry:\nreport = TestReport.get(uid)\nexcept InvalidArgumentError:\nreturn\nlogging.info(f\"Existing report {uid} was detected.\")\nlogging.info(\"The compatibilty test will not be re-executed.\")\nreturn report.results\n
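The reuse-or-execute pattern described above can be sketched in isolation. This is a minimal illustration, not MedPerf's implementation: the in-memory `_REPORTS` store and the `run_once` helper are hypothetical stand-ins for the `TestReport` storage and the calling code.

```python
from typing import Callable, Dict, Optional

# Hypothetical in-memory stand-in for stored test reports (assumption, not MedPerf's storage).
_REPORTS: Dict[str, dict] = {}


def cached_results(uid: str, no_cache: bool = False) -> Optional[dict]:
    """Return stored results for `uid`, or None when caching is disabled or nothing is stored."""
    if no_cache:
        return None
    return _REPORTS.get(uid)


def run_once(uid: str, execute: Callable[[], dict], no_cache: bool = False) -> dict:
    """Reuse cached results when available; otherwise execute the test and store its results."""
    results = cached_results(uid, no_cache)
    if results is None:
        results = execute()
        _REPORTS[uid] = results
    return results
```

With this shape, passing `no_cache=True` always forces a fresh execution, which matches the documented meaning of the flag.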
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.execute","title":"execute()
","text":"Runs the test execution flow and returns the results
Returns:
dict: the results of the test execution.
Source code in cli/medperf/commands/compatibility_test/run.py
def execute(self):\n\"\"\"Runs the test execution flow and returns the results\n Returns:\n dict: returns the results of the test execution.\n \"\"\"\nexecution_summary = Execution.run(\ndataset=self.dataset,\nmodel=self.model_cube,\nevaluator=self.evaluator_cube,\nignore_model_errors=False,\n)\nreturn execution_summary[\"results\"]\n
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.initialize_report","title":"initialize_report()
","text":"Initializes an instance of TestReport
to hold the current test information.
Source code in cli/medperf/commands/compatibility_test/run.py
def initialize_report(self):\n\"\"\"Initializes an instance of `TestReport` to hold the current test information.\"\"\"\nreport_data = {\n\"demo_dataset_url\": self.demo_dataset_url,\n\"demo_dataset_hash\": self.demo_dataset_hash,\n\"data_path\": self.data_path,\n\"labels_path\": self.labels_path,\n\"prepared_data_hash\": self.data_uid,\n\"data_preparation_mlcube\": self.data_prep,\n\"model\": self.model,\n\"data_evaluator_mlcube\": self.evaluator,\n}\nself.report = TestReport(**report_data)\n
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.prepare_cubes","title":"prepare_cubes()
","text":"Prepares the mlcubes. If the provided mlcube is a path, it will create a temporary uid and link the cube path to the medperf storage path.
Source code in cli/medperf/commands/compatibility_test/run.py
def prepare_cubes(self):\n\"\"\"Prepares the mlcubes. If the provided mlcube is a path, it will create\n a temporary uid and link the cube path to the medperf storage path.\"\"\"\nif self.data_source != \"prepared\":\nlogging.info(f\"Establishing the data preparation cube: {self.data_prep}\")\nself.data_prep = prepare_cube(self.data_prep)\nlogging.info(f\"Establishing the model cube: {self.model}\")\nself.model = prepare_cube(self.model)\nlogging.info(f\"Establishing the evaluator cube: {self.evaluator}\")\nself.evaluator = prepare_cube(self.evaluator)\nself.model_cube = get_cube(self.model, \"Model\", local_only=self.offline)\nself.evaluator_cube = get_cube(\nself.evaluator, \"Evaluator\", local_only=self.offline\n)\n
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.prepare_dataset","title":"prepare_dataset()
","text":"Assigns the data_uid used for testing and retrieves the dataset. If the data is not prepared, it calls the data preparation step on the given local data path or using a remote demo dataset.
Source code in cli/medperf/commands/compatibility_test/run.py
def prepare_dataset(self):\n\"\"\"Assigns the data_uid used for testing and retrieves the dataset.\n If the data is not prepared, it calls the data preparation step\n on the given local data path or using a remote demo dataset.\"\"\"\nlogging.info(\"Establishing data_uid for test execution\")\nif self.data_source != \"prepared\":\nif self.data_source == \"path\":\ndata_path, labels_path = self.data_path, self.labels_path\n# TODO: this has to be redesigned. Compatibility tests command\n# is starting to have a lot of input arguments. For now\n# let's not support accepting a metadata path\nmetadata_path = None\nelse:\ndata_path, labels_path, metadata_path = download_demo_data(\nself.demo_dataset_url, self.demo_dataset_hash\n)\nself.data_uid = create_test_dataset(\ndata_path,\nlabels_path,\nmetadata_path,\nself.data_prep,\nself.skip_data_preparation_step,\n)\nself.dataset = Dataset.get(self.data_uid, local_only=self.offline)\n
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.process_benchmark","title":"process_benchmark()
","text":"Process the benchmark input if given. Sets the needed parameters from the benchmark.
Source code in cli/medperf/commands/compatibility_test/run.py
def process_benchmark(self):\n\"\"\"Process the benchmark input if given. Sets the needed parameters from\n the benchmark.\"\"\"\nif not self.benchmark_uid:\nreturn\nbenchmark = Benchmark.get(self.benchmark_uid, local_only=self.offline)\nif self.data_source != \"prepared\":\nself.data_prep = self.data_prep or benchmark.data_preparation_mlcube\nself.model = self.model or benchmark.reference_model_mlcube\nself.evaluator = self.evaluator or benchmark.data_evaluator_mlcube\nif self.data_source == \"benchmark\":\nself.demo_dataset_url = benchmark.demo_dataset_tarball_url\nself.demo_dataset_hash = benchmark.demo_dataset_tarball_hash\nself.skip_data_preparation_step = benchmark.metadata.get(\n\"demo_dataset_already_prepared\", False\n)\n
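The fallback rule in `process_benchmark` (explicit arguments win, the benchmark fills the gaps) can be shown as a standalone sketch. The `BenchmarkDefaults` dataclass and the `resolve_cubes` function are hypothetical names introduced for illustration; only the attribute names come from the listing above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

CubeRef = Union[int, str, None]


@dataclass
class BenchmarkDefaults:
    """Hypothetical stand-in holding the three cube references of a benchmark."""
    data_preparation_mlcube: int
    reference_model_mlcube: int
    data_evaluator_mlcube: int


def resolve_cubes(
    data_prep: CubeRef,
    model: CubeRef,
    evaluator: CubeRef,
    benchmark: Optional[BenchmarkDefaults],
    data_source: str,
) -> Tuple[CubeRef, CubeRef, CubeRef]:
    """Fill in missing cubes from the benchmark; user-provided values take precedence."""
    if benchmark is None:
        return data_prep, model, evaluator
    # A prepared dataset never needs a data preparation cube.
    if data_source != "prepared":
        data_prep = data_prep or benchmark.data_preparation_mlcube
    model = model or benchmark.reference_model_mlcube
    evaluator = evaluator or benchmark.data_evaluator_mlcube
    return data_prep, model, evaluator
```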
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.run","title":"run(benchmark=None, data_prep=None, model=None, evaluator=None, data_path=None, labels_path=None, demo_dataset_url=None, demo_dataset_hash=None, data_uid=None, no_cache=False, offline=False, skip_data_preparation_step=False)
classmethod
","text":"Execute a test workflow. Components of a complete workflow should be passed. When only the benchmark is provided, it implies the following workflow will be used: - the benchmark's demo dataset is used as the raw data - the benchmark's data preparation cube is used - the benchmark's reference model cube is used - the benchmark's metrics cube is used
Overriding benchmark's components: - The data preparation, model, and metrics cubes can be overridden by specifying a cube either as an integer (registered) or a path (local). The path can refer either to the mlcube config file or to the mlcube directory containing the mlcube config file. - Instead of using the demo dataset of the benchmark, the input raw data can be overridden by providing: - a demo dataset url and its hash - data path and labels path - A prepared dataset can be directly used. In this case the data preparator cube is never used. The prepared data can be provided by either specifying an integer (registered) or a hash of a locally prepared dataset.
Whether the benchmark is provided or not, the command will fail either if the user fails to provide a valid complete workflow, or if the user provided extra redundant parameters.
Parameters:
benchmark (int, optional): Benchmark to run the test workflow for. Defaults to None.
data_prep (str, optional): Data preparation mlcube uid or local path. Defaults to None.
model (str, optional): Model mlcube uid or local path. Defaults to None.
evaluator (str, optional): Evaluator mlcube uid or local path. Defaults to None.
data_path (str, optional): Path to local raw data. Defaults to None.
labels_path (str, optional): Path to the labels of the local raw data. Defaults to None.
demo_dataset_url (str, optional): Identifier to download the demonstration dataset tarball file. See `medperf mlcube submit --help` for more information. Defaults to None.
demo_dataset_hash (str, optional): The hash of the demo dataset tarball file. Defaults to None.
data_uid (str, optional): A prepared dataset UID. Defaults to None.
no_cache (bool): Whether to ignore cached results of the test execution. Defaults to False.
offline (bool): Whether to disable communication to the MedPerf server and rely only on local copies of the server assets. Defaults to False.
Returns:
(str): Prepared Dataset UID used for the test. Could be the one provided or a generated one.
(dict): Results generated by the test.
Source code in cli/medperf/commands/compatibility_test/run.py
@classmethod\ndef run(\ncls,\nbenchmark: int = None,\ndata_prep: str = None,\nmodel: str = None,\nevaluator: str = None,\ndata_path: str = None,\nlabels_path: str = None,\ndemo_dataset_url: str = None,\ndemo_dataset_hash: str = None,\ndata_uid: str = None,\nno_cache: bool = False,\noffline: bool = False,\nskip_data_preparation_step: bool = False,\n) -> (str, dict):\n\"\"\"Execute a test workflow. Components of a complete workflow should be passed.\n When only the benchmark is provided, it implies the following workflow will be used:\n - the benchmark's demo dataset is used as the raw data\n - the benchmark's data preparation cube is used\n - the benchmark's reference model cube is used\n - the benchmark's metrics cube is used\n Overriding benchmark's components:\n - The data prepration, model, and metrics cubes can be overriden by specifying a cube either\n as an integer (registered) or a path (local). The path can refer either to the mlcube config\n file or to the mlcube directory containing the mlcube config file.\n - Instead of using the demo dataset of the benchmark, The input raw data can be overriden by providing:\n - a demo dataset url and its hash\n - data path and labels path\n - A prepared dataset can be directly used. 
In this case the data preparator cube is never used.\n The prepared data can be provided by either specifying an integer (registered) or a hash of a\n locally prepared dataset.\n Whether the benchmark is provided or not, the command will fail either if the user fails to\n provide a valid complete workflow, or if the user provided extra redundant parameters.\n Args:\n benchmark (int, optional): Benchmark to run the test workflow for\n data_prep (str, optional): data preparation mlcube uid or local path.\n model (str, optional): model mlcube uid or local path.\n evaluator (str, optional): evaluator mlcube uid or local path.\n data_path (str, optional): path to a local raw data\n labels_path (str, optional): path to the labels of the local raw data\n demo_dataset_url (str, optional): Identifier to download the demonstration dataset tarball file.\\n\n See `medperf mlcube submit --help` for more information\n demo_dataset_hash (str, optional): The hash of the demo dataset tarball file\n data_uid (str, optional): A prepared dataset UID\n no_cache (bool): Whether to ignore cached results of the test execution. Defaults to False.\n offline (bool): Whether to disable communication to the MedPerf server and rely only on\n local copies of the server assets. Defaults to False.\n Returns:\n (str): Prepared Dataset UID used for the test. 
Could be the one provided or a generated one.\n (dict): Results generated by the test.\n \"\"\"\nlogging.info(\"Starting test execution\")\ntest_exec = cls(\nbenchmark,\ndata_prep,\nmodel,\nevaluator,\ndata_path,\nlabels_path,\ndemo_dataset_url,\ndemo_dataset_hash,\ndata_uid,\nno_cache,\noffline,\nskip_data_preparation_step,\n)\ntest_exec.validate()\ntest_exec.set_data_source()\ntest_exec.process_benchmark()\ntest_exec.prepare_cubes()\ntest_exec.prepare_dataset()\ntest_exec.initialize_report()\nresults = test_exec.cached_results()\nif results is None:\nresults = test_exec.execute()\ntest_exec.write(results)\nelse:\nlogging.info(\"Existing results are found. Test would not be re-executed.\")\nlogging.debug(f\"Existing results: {results}\")\nreturn test_exec.data_uid, results\n
"},{"location":"reference/commands/compatibility_test/run/#commands.compatibility_test.run.CompatibilityTestExecution.write","title":"write(results)
","text":"Writes a report of the test execution to the disk
Parameters:
results (dict): the results of the test execution. Required.
Source code in cli/medperf/commands/compatibility_test/run.py
def write(self, results):\n\"\"\"Writes a report of the test execution to the disk\n Args:\n results (dict): the results of the test execution\n \"\"\"\nself.report.set_results(results)\nself.report.write()\n
"},{"location":"reference/commands/compatibility_test/utils/","title":"Utils","text":""},{"location":"reference/commands/compatibility_test/utils/#commands.compatibility_test.utils.download_demo_data","title":"download_demo_data(dset_url, dset_hash)
","text":"Retrieves the demo dataset associated to the specified benchmark
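Returns:
data_path (str): Location of the downloaded data
labels_path (str): Location of the downloaded labels
metadata_path (str | None): Location of the downloaded metadata, or None if the demo dataset's paths file does not define one
Source code in cli/medperf/commands/compatibility_test/utils.py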
def download_demo_data(dset_url, dset_hash):\n\"\"\"Retrieves the demo dataset associated to the specified benchmark\n Returns:\n data_path (str): Location of the downloaded data\n labels_path (str): Location of the downloaded labels\n \"\"\"\ntry:\ndemo_dset_path, _ = resources.get_benchmark_demo_dataset(dset_url, dset_hash)\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"Demo dataset {dset_url}: {e}\")\n# It is assumed that all demo datasets contain a file\n# which specifies the input of the data preparation step\npaths_file = os.path.join(demo_dset_path, config.demo_dset_paths_file)\nwith open(paths_file, \"r\") as f:\npaths = yaml.safe_load(f)\ndata_path = os.path.join(demo_dset_path, paths[\"data_path\"])\nlabels_path = os.path.join(demo_dset_path, paths[\"labels_path\"])\nmetadata_path = None\nif \"metadata_path\" in paths:\nmetadata_path = os.path.join(demo_dset_path, paths[\"metadata_path\"])\nreturn data_path, labels_path, metadata_path\n
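The path resolution at the end of `download_demo_data` can be sketched without the download or the YAML parsing. The function below is a hypothetical helper that takes the already-parsed contents of the paths file as a plain dict; the key names (`data_path`, `labels_path`, `metadata_path`) are the ones used in the listing above.

```python
import os
from typing import Dict, Optional, Tuple


def resolve_demo_paths(
    demo_dset_path: str, paths: Dict[str, str]
) -> Tuple[str, str, Optional[str]]:
    """Join the demo dataset root with the relative paths declared in its paths file."""
    data_path = os.path.join(demo_dset_path, paths["data_path"])
    labels_path = os.path.join(demo_dset_path, paths["labels_path"])
    # The metadata entry is optional; None signals that the dataset does not provide one.
    metadata_path = None
    if "metadata_path" in paths:
        metadata_path = os.path.join(demo_dset_path, paths["metadata_path"])
    return data_path, labels_path, metadata_path
```

Note that callers receive three values, so they need to unpack all of them even when no metadata is present.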
"},{"location":"reference/commands/compatibility_test/utils/#commands.compatibility_test.utils.prepare_cube","title":"prepare_cube(cube_uid)
","text":"Assigns the attr used for testing according to the initialization parameters. If the value is a path, it will create a temporary uid and link the cube path to the medperf storage path.
Parameters:
cube_uid (str): MLCube UID, or a local path to the mlcube config file or to the directory containing it. Required.
Source code in cli/medperf/commands/compatibility_test/utils.py
def prepare_cube(cube_uid: str):\n\"\"\"Assigns the attr used for testing according to the initialization parameters.\n If the value is a path, it will create a temporary uid and link the cube path to\n the medperf storage path.\n Arguments:\n attr (str): Attribute to check and/or reassign.\n fallback (any): Value to assign if attribute is empty. Defaults to None.\n \"\"\"\n# Test if value looks like an mlcube_uid, if so skip path validation\nif str(cube_uid).isdigit():\nlogging.info(f\"MLCube value {cube_uid} resembles an mlcube_uid\")\nreturn cube_uid\n# Check if value is a local mlcube\npath = Path(cube_uid)\nif path.is_file():\npath = path.parent\npath = path.resolve()\nif os.path.exists(path):\nmlcube_yaml_path = os.path.join(path, config.cube_filename)\nif os.path.exists(mlcube_yaml_path):\nlogging.info(\"local path provided. Creating symbolic link\")\ntemp_uid = prepare_local_cube(path)\nreturn temp_uid\nlogging.error(f\"mlcube {cube_uid} was not found as an existing mlcube\")\nraise InvalidArgumentError(\nf\"The provided mlcube ({cube_uid}) could not be found as a local or remote mlcube\"\n)\n
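The decision `prepare_cube` makes between a registered UID and a local path can be illustrated with a small classifier. This is a sketch under assumptions: `CUBE_FILENAME` is a placeholder for `config.cube_filename` (its real value is not shown in the source), `classify_cube_reference` is a hypothetical name, and `ValueError` stands in for `InvalidArgumentError`. It does not create the temporary UID or symbolic link.

```python
from pathlib import Path

# Assumption: placeholder for config.cube_filename, whose actual value is not shown here.
CUBE_FILENAME = "mlcube.yaml"


def classify_cube_reference(value) -> str:
    """Return "uid" for digit-only values and "local" for paths that contain a cube config."""
    # Values made only of digits are treated as registered UIDs; no path check is done.
    if str(value).isdigit():
        return "uid"
    path = Path(value)
    # A path to the config file itself is accepted; use its parent directory.
    if path.is_file():
        path = path.parent
    path = path.resolve()
    if (path / CUBE_FILENAME).exists():
        return "local"
    raise ValueError(f"The provided mlcube ({value}) could not be found")
```

One consequence of the digit check is that a local directory whose name consists only of digits would be read as a UID unless it is passed with a path prefix such as `./`.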
"},{"location":"reference/commands/compatibility_test/validate_params/","title":"Validate params","text":""},{"location":"reference/commands/compatibility_test/validate_params/#commands.compatibility_test.validate_params.CompatibilityTestParamsValidator","title":"CompatibilityTestParamsValidator
","text":"Validates the input parameters to the CompatibilityTestExecution class
Source code in cli/medperf/commands/compatibility_test/validate_params.py
class CompatibilityTestParamsValidator:\n\"\"\"Validates the input parameters to the CompatibilityTestExecution class\"\"\"\ndef __init__(\nself,\nbenchmark: int = None,\ndata_prep: str = None,\nmodel: str = None,\nevaluator: str = None,\ndata_path: str = None,\nlabels_path: str = None,\ndemo_dataset_url: str = None,\ndemo_dataset_hash: str = None,\ndata_uid: str = None,\n):\nself.benchmark_uid = benchmark\nself.data_prep = data_prep\nself.model = model\nself.evaluator = evaluator\nself.data_path = data_path\nself.labels_path = labels_path\nself.demo_dataset_url = demo_dataset_url\nself.demo_dataset_hash = demo_dataset_hash\nself.data_uid = data_uid\ndef __validate_cubes(self):\nif not self.model and not self.benchmark_uid:\nraise InvalidArgumentError(\n\"A model mlcube or a benchmark should at least be specified\"\n)\nif not self.evaluator and not self.benchmark_uid:\nraise InvalidArgumentError(\n\"A metrics mlcube or a benchmark should at least be specified\"\n)\ndef __raise_redundant_data_source(self):\nmsg = \"Make sure you pass only one data source: \"\nmsg += \"either a prepared dataset, a data path and labels path, or a demo dataset url\"\nraise InvalidArgumentError(msg)\ndef __validate_prepared_data_source(self):\nif any(\n[\nself.data_path,\nself.labels_path,\nself.demo_dataset_url,\nself.demo_dataset_hash,\n]\n):\nself.__raise_redundant_data_source()\nif self.data_prep:\nraise InvalidArgumentError(\n\"A data preparation cube is not needed when specifying a prepared dataset\"\n)\ndef __validate_data_path_source(self):\nif not self.labels_path:\nraise InvalidArgumentError(\n\"Labels path should be specified when providing data path\"\n)\nif any([self.demo_dataset_url, self.demo_dataset_hash, self.data_uid]):\nself.__raise_redundant_data_source()\nif not self.data_prep and not self.benchmark_uid:\nraise InvalidArgumentError(\n\"A data preparation cube should be passed when specifying raw data input\"\n)\ndef __validate_demo_data_source(self):\nif not 
self.demo_dataset_hash:\nraise InvalidArgumentError(\n\"The hash of the provided demo dataset should be specified\"\n)\nif any([self.data_path, self.labels_path, self.data_uid]):\nself.__raise_redundant_data_source()\nif not self.data_prep and not self.benchmark_uid:\nraise InvalidArgumentError(\n\"A data preparation cube should be passed when specifying raw data input\"\n)\ndef __validate_data_source(self):\nif self.data_uid:\nself.__validate_prepared_data_source()\nreturn\nif self.data_path:\nself.__validate_data_path_source()\nreturn\nif self.demo_dataset_url:\nself.__validate_demo_data_source()\nreturn\nif self.benchmark_uid:\nreturn\nmsg = \"A data source should at least be specified, either by providing\"\nmsg += \" a prepared data uid, a demo dataset url, data path, or a benchmark\"\nraise InvalidArgumentError(msg)\ndef __validate_redundant_benchmark(self):\nif not self.benchmark_uid:\nreturn\nredundant_bmk_demo = any([self.data_uid, self.data_path, self.demo_dataset_url])\nredundant_bmk_model = self.model is not None\nredundant_bmk_evaluator = self.evaluator is not None\nredundant_bmk_preparator = (\nself.data_prep is not None or self.data_uid is not None\n)\nif all(\n[\nredundant_bmk_demo,\nredundant_bmk_model,\nredundant_bmk_evaluator,\nredundant_bmk_preparator,\n]\n):\nraise InvalidArgumentError(\"The provided benchmark will not be used\")\ndef validate(self):\n\"\"\"Ensures test has been passed a valid combination of parameters.\n Raises `medperf.exceptions.InvalidArgumentError` when the parameters are\n invalid.\n \"\"\"\nself.__validate_cubes()\nself.__validate_data_source()\nself.__validate_redundant_benchmark()\ndef get_data_source(self):\n\"\"\"Parses the input parameters and returns a string, one of:\n \"prepared\", if the source of data is a prepared dataset uid,\n \"path\", if the source of data is a local path to raw data,\n \"demo\", if the source of data is a demo dataset url,\n or \"benchmark\", if the source of data is the demo dataset of a 
benchmark.\n This function assumes the passed parameters to the constructor have been already\n validated.\n \"\"\"\nif self.data_uid:\nreturn \"prepared\"\nif self.data_path:\nreturn \"path\"\nif self.demo_dataset_url:\nreturn \"demo\"\nif self.benchmark_uid:\nreturn \"benchmark\"\nraise MedperfException(\n\"Ensure calling the `validate` method before using this method\"\n)\n
"},{"location":"reference/commands/compatibility_test/validate_params/#commands.compatibility_test.validate_params.CompatibilityTestParamsValidator.get_data_source","title":"get_data_source()
","text":"Parses the input parameters and returns a string, one of: \"prepared\", if the source of data is a prepared dataset uid, \"path\", if the source of data is a local path to raw data, \"demo\", if the source of data is a demo dataset url, or \"benchmark\", if the source of data is the demo dataset of a benchmark.
This function assumes the passed parameters to the constructor have been already validated.
Source code in cli/medperf/commands/compatibility_test/validate_params.py
def get_data_source(self):\n\"\"\"Parses the input parameters and returns a string, one of:\n \"prepared\", if the source of data is a prepared dataset uid,\n \"path\", if the source of data is a local path to raw data,\n \"demo\", if the source of data is a demo dataset url,\n or \"benchmark\", if the source of data is the demo dataset of a benchmark.\n This function assumes the passed parameters to the constructor have been already\n validated.\n \"\"\"\nif self.data_uid:\nreturn \"prepared\"\nif self.data_path:\nreturn \"path\"\nif self.demo_dataset_url:\nreturn \"demo\"\nif self.benchmark_uid:\nreturn \"benchmark\"\nraise MedperfException(\n\"Ensure calling the `validate` method before using this method\"\n)\n
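The precedence among the four data sources can be seen more easily as a standalone function. This sketch mirrors the order of checks in the listing above; the free-function form and the use of `ValueError` in place of `MedperfException` are assumptions made for illustration.

```python
from typing import Optional


def get_data_source(
    data_uid: Optional[str] = None,
    data_path: Optional[str] = None,
    demo_dataset_url: Optional[str] = None,
    benchmark_uid: Optional[int] = None,
) -> str:
    """Return the data source label, checking the most specific source first."""
    if data_uid:
        return "prepared"
    if data_path:
        return "path"
    if demo_dataset_url:
        return "demo"
    if benchmark_uid:
        return "benchmark"
    raise ValueError("No data source was provided; validate the parameters first")
```

Because the benchmark is checked last, a benchmark passed together with an explicit data path yields `"path"`: the benchmark then only supplies the cubes that were not given.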
"},{"location":"reference/commands/compatibility_test/validate_params/#commands.compatibility_test.validate_params.CompatibilityTestParamsValidator.validate","title":"validate()
","text":"Ensures test has been passed a valid combination of parameters. Raises medperf.exceptions.InvalidArgumentError
when the parameters are invalid.
Source code in cli/medperf/commands/compatibility_test/validate_params.py
def validate(self):\n\"\"\"Ensures test has been passed a valid combination of parameters.\n Raises `medperf.exceptions.InvalidArgumentError` when the parameters are\n invalid.\n \"\"\"\nself.__validate_cubes()\nself.__validate_data_source()\nself.__validate_redundant_benchmark()\n
"},{"location":"reference/commands/dataset/associate/","title":"Associate","text":""},{"location":"reference/commands/dataset/associate/#commands.dataset.associate.AssociateDataset","title":"AssociateDataset
","text":"Source code in cli/medperf/commands/dataset/associate.py
class AssociateDataset:\n@staticmethod\ndef run(data_uid: int, benchmark_uid: int, approved=False, no_cache=False):\n\"\"\"Associates a registered dataset with a benchmark\n Args:\n data_uid (int): UID of the registered dataset to associate\n benchmark_uid (int): UID of the benchmark to associate with\n \"\"\"\ncomms = config.comms\nui = config.ui\ndset = Dataset.get(data_uid)\nif dset.id is None:\nmsg = \"The provided dataset is not registered.\"\nraise InvalidArgumentError(msg)\nbenchmark = Benchmark.get(benchmark_uid)\nif dset.data_preparation_mlcube != benchmark.data_preparation_mlcube:\nraise InvalidArgumentError(\n\"The specified dataset wasn't prepared for this benchmark\"\n)\nresult = BenchmarkExecution.run(\nbenchmark_uid,\ndata_uid,\n[benchmark.reference_model_mlcube],\nno_cache=no_cache,\n)[0]\nui.print(\"These are the results generated by the compatibility test. \")\nui.print(\"This will be sent along the association request.\")\nui.print(\"They will not be part of the benchmark.\")\ndict_pretty_print(result.results)\nmsg = \"Please confirm that you would like to associate\"\nmsg += f\" the dataset {dset.name} with the benchmark {benchmark.name}.\"\nmsg += \" [Y/n]\"\napproved = approved or approval_prompt(msg)\nif approved:\nui.print(\"Generating dataset benchmark association\")\nmetadata = {\"test_result\": result.results}\ncomms.associate_dset(dset.id, benchmark_uid, metadata)\nelse:\nui.print(\"Dataset association operation cancelled.\")\n
"},{"location":"reference/commands/dataset/associate/#commands.dataset.associate.AssociateDataset.run","title":"run(data_uid, benchmark_uid, approved=False, no_cache=False)
staticmethod
","text":"Associates a registered dataset with a benchmark
Parameters:
data_uid (int): UID of the registered dataset to associate. Required.
benchmark_uid (int): UID of the benchmark to associate with. Required.
Source code in cli/medperf/commands/dataset/associate.py
@staticmethod\ndef run(data_uid: int, benchmark_uid: int, approved=False, no_cache=False):\n\"\"\"Associates a registered dataset with a benchmark\n Args:\n data_uid (int): UID of the registered dataset to associate\n benchmark_uid (int): UID of the benchmark to associate with\n \"\"\"\ncomms = config.comms\nui = config.ui\ndset = Dataset.get(data_uid)\nif dset.id is None:\nmsg = \"The provided dataset is not registered.\"\nraise InvalidArgumentError(msg)\nbenchmark = Benchmark.get(benchmark_uid)\nif dset.data_preparation_mlcube != benchmark.data_preparation_mlcube:\nraise InvalidArgumentError(\n\"The specified dataset wasn't prepared for this benchmark\"\n)\nresult = BenchmarkExecution.run(\nbenchmark_uid,\ndata_uid,\n[benchmark.reference_model_mlcube],\nno_cache=no_cache,\n)[0]\nui.print(\"These are the results generated by the compatibility test. \")\nui.print(\"This will be sent along the association request.\")\nui.print(\"They will not be part of the benchmark.\")\ndict_pretty_print(result.results)\nmsg = \"Please confirm that you would like to associate\"\nmsg += f\" the dataset {dset.name} with the benchmark {benchmark.name}.\"\nmsg += \" [Y/n]\"\napproved = approved or approval_prompt(msg)\nif approved:\nui.print(\"Generating dataset benchmark association\")\nmetadata = {\"test_result\": result.results}\ncomms.associate_dset(dset.id, benchmark_uid, metadata)\nelse:\nui.print(\"Dataset association operation cancelled.\")\n
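Before running the compatibility test, `AssociateDataset.run` checks two preconditions: the dataset must be registered, and it must have been prepared with the benchmark's data preparation cube. A minimal sketch of those checks follows; `DatasetInfo`, `BenchmarkInfo`, and `check_association` are hypothetical names, and `ValueError` stands in for `InvalidArgumentError`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetInfo:
    """Hypothetical stand-in for the dataset fields the checks rely on."""
    id: Optional[int]  # None means the dataset is not registered on the server
    data_preparation_mlcube: int


@dataclass
class BenchmarkInfo:
    """Hypothetical stand-in for the benchmark field the checks rely on."""
    data_preparation_mlcube: int


def check_association(dset: DatasetInfo, benchmark: BenchmarkInfo) -> None:
    """Raise ValueError if the dataset cannot be associated with the benchmark."""
    if dset.id is None:
        raise ValueError("The provided dataset is not registered.")
    if dset.data_preparation_mlcube != benchmark.data_preparation_mlcube:
        raise ValueError("The specified dataset wasn't prepared for this benchmark")
```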
"},{"location":"reference/commands/dataset/dataset/","title":"Dataset","text":""},{"location":"reference/commands/dataset/dataset/#commands.dataset.dataset.associate","title":"associate(data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), benchmark_uid=typer.Option(..., '--benchmark_uid', '-b', help='Benchmark UID'), approval=typer.Option(False, '-y', help='Skip approval step'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'))
","text":"Associate a registered dataset with a specific benchmark. The dataset and benchmark must share the same data preparation cube.
Source code in cli/medperf/commands/dataset/dataset.py
@app.command(\"associate\")\n@clean_except\ndef associate(\ndata_uid: int = typer.Option(\n..., \"--data_uid\", \"-d\", help=\"Registered Dataset UID\"\n),\nbenchmark_uid: int = typer.Option(\n..., \"--benchmark_uid\", \"-b\", help=\"Benchmark UID\"\n),\napproval: bool = typer.Option(False, \"-y\", help=\"Skip approval step\"),\nno_cache: bool = typer.Option(\nFalse,\n\"--no-cache\",\nhelp=\"Execute the test even if results already exist\",\n),\n):\n\"\"\"Associate a registered dataset with a specific benchmark.\n The dataset and benchmark must share the same data preparation cube.\n \"\"\"\nui = config.ui\nAssociateDataset.run(data_uid, benchmark_uid, approved=approval, no_cache=no_cache)\nui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/dataset/dataset/#commands.dataset.dataset.list","title":"list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered datasets'), mine=typer.Option(False, '--mine', help='Get current-user datasets'), mlcube=typer.Option(None, '--mlcube', '-m', help='Get datasets for a given data prep mlcube'))
","text":"List datasets
Source code in cli/medperf/commands/dataset/dataset.py
@app.command(\"ls\")\n@clean_except\ndef list(\nunregistered: bool = typer.Option(\nFalse, \"--unregistered\", help=\"Get unregistered datasets\"\n),\nmine: bool = typer.Option(False, \"--mine\", help=\"Get current-user datasets\"),\nmlcube: int = typer.Option(\nNone, \"--mlcube\", \"-m\", help=\"Get datasets for a given data prep mlcube\"\n),\n):\n\"\"\"List datasets\"\"\"\nEntityList.run(\nDataset,\nfields=[\"UID\", \"Name\", \"Data Preparation Cube UID\", \"State\", \"Status\", \"Owner\"],\nunregistered=unregistered,\nmine_only=mine,\nmlcube=mlcube,\n)\n
"},{"location":"reference/commands/dataset/dataset/#commands.dataset.dataset.prepare","title":"prepare(data_uid=typer.Option(..., '--data_uid', '-d', help='Dataset UID'), approval=typer.Option(False, '-y', help='Skip report submission approval step (In this case, it is assumed to be approved)'))
","text":"Runs the Data preparation step for a raw dataset
Source code in cli/medperf/commands/dataset/dataset.py
@app.command(\"prepare\")\n@clean_except\ndef prepare(\ndata_uid: str = typer.Option(..., \"--data_uid\", \"-d\", help=\"Dataset UID\"),\napproval: bool = typer.Option(\nFalse,\n\"-y\",\nhelp=\"Skip report submission approval step (In this case, it is assumed to be approved)\",\n),\n):\n\"\"\"Runs the Data preparation step for a raw dataset\"\"\"\nui = config.ui\nDataPreparation.run(data_uid, approve_sending_reports=approval)\nui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/dataset/dataset/#commands.dataset.dataset.set_operational","title":"set_operational(data_uid=typer.Option(..., '--data_uid', '-d', help='Dataset UID'), approval=typer.Option(False, '-y', help='Skip confirmation and statistics submission approval step'))
","text":"Marks a dataset as Operational
Source code in cli/medperf/commands/dataset/dataset.py
@app.command(\"set_operational\")\n@clean_except\ndef set_operational(\ndata_uid: str = typer.Option(..., \"--data_uid\", \"-d\", help=\"Dataset UID\"),\napproval: bool = typer.Option(\nFalse, \"-y\", help=\"Skip confirmation and statistics submission approval step\"\n),\n):\n\"\"\"Marks a dataset as Operational\"\"\"\nui = config.ui\nDatasetSetOperational.run(data_uid, approved=approval)\nui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/dataset/dataset/#commands.dataset.dataset.submit","title":"submit(benchmark_uid=typer.Option(None, '--benchmark', '-b', help='UID of the desired benchmark'), data_prep_uid=typer.Option(None, '--data_prep', '-p', help='UID of the desired preparation cube'), data_path=typer.Option(..., '--data_path', '-d', help='Path to the data'), labels_path=typer.Option(..., '--labels_path', '-l', help='Path to the labels'), metadata_path=typer.Option(None, '--metadata_path', '-m', help='Metadata folder location (Might be required if the dataset is already prepared)'), name=typer.Option(..., '--name', help='A human-readable name of the dataset'), description=typer.Option(None, '--description', help='A description of the dataset'), location=typer.Option(None, '--location', help='Location or Institution the data belongs to'), approval=typer.Option(False, '-y', help='Skip approval step'), submit_as_prepared=typer.Option(False, '--submit-as-prepared', help='Use this flag if the dataset is already prepared'))
","text":"Submits a Dataset instance to the backend
Source code in cli/medperf/commands/dataset/dataset.py
@app.command(\"submit\")\n@clean_except\ndef submit(\nbenchmark_uid: int = typer.Option(\nNone, \"--benchmark\", \"-b\", help=\"UID of the desired benchmark\"\n),\ndata_prep_uid: int = typer.Option(\nNone, \"--data_prep\", \"-p\", help=\"UID of the desired preparation cube\"\n),\ndata_path: str = typer.Option(..., \"--data_path\", \"-d\", help=\"Path to the data\"),\nlabels_path: str = typer.Option(\n..., \"--labels_path\", \"-l\", help=\"Path to the labels\"\n),\nmetadata_path: str = typer.Option(\nNone,\n\"--metadata_path\",\n\"-m\",\nhelp=\"Metadata folder location (Might be required if the dataset is already prepared)\",\n),\nname: str = typer.Option(\n..., \"--name\", help=\"A human-readable name of the dataset\"\n),\ndescription: str = typer.Option(\nNone, \"--description\", help=\"A description of the dataset\"\n),\nlocation: str = typer.Option(\nNone, \"--location\", help=\"Location or Institution the data belongs to\"\n),\napproval: bool = typer.Option(False, \"-y\", help=\"Skip approval step\"),\nsubmit_as_prepared: bool = typer.Option(\nFalse,\n\"--submit-as-prepared\",\nhelp=\"Use this flag if the dataset is already prepared\",\n),\n):\n\"\"\"Submits a Dataset instance to the backend\"\"\"\nui = config.ui\nDataCreation.run(\nbenchmark_uid,\ndata_prep_uid,\ndata_path,\nlabels_path,\nmetadata_path,\nname=name,\ndescription=description,\nlocation=location,\napproved=approval,\nsubmit_as_prepared=submit_as_prepared,\n)\nui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/dataset/dataset/#commands.dataset.dataset.view","title":"view(entity_id=typer.Argument(None, help='Dataset ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered datasets if dataset ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user datasets if dataset ID is not provided'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
","text":"Displays the information of one or more datasets
Source code in cli/medperf/commands/dataset/dataset.py
@app.command(\"view\")\n@clean_except\ndef view(\nentity_id: Optional[str] = typer.Argument(None, help=\"Dataset ID\"),\nformat: str = typer.Option(\n\"yaml\",\n\"-f\",\n\"--format\",\nhelp=\"Format to display contents. Available formats: [yaml, json]\",\n),\nunregistered: bool = typer.Option(\nFalse,\n\"--unregistered\",\nhelp=\"Display unregistered datasets if dataset ID is not provided\",\n),\nmine: bool = typer.Option(\nFalse,\n\"--mine\",\nhelp=\"Display current-user datasets if dataset ID is not provided\",\n),\noutput: str = typer.Option(\nNone,\n\"--output\",\n\"-o\",\nhelp=\"Output file to store contents. If not provided, the output will be displayed\",\n),\n):\n\"\"\"Displays the information of one or more datasets\"\"\"\nEntityView.run(entity_id, Dataset, format, unregistered, mine, output)\n
"},{"location":"reference/commands/dataset/prepare/","title":"Prepare","text":""},{"location":"reference/commands/dataset/set_operational/","title":"Set operational","text":""},{"location":"reference/commands/dataset/set_operational/#commands.dataset.set_operational.DatasetSetOperational","title":"DatasetSetOperational
","text":"Source code in cli/medperf/commands/dataset/set_operational.py
class DatasetSetOperational:\n# TODO: this will be refactored when merging entity edit PR\n@classmethod\ndef run(cls, dataset_id: int, approved: bool = False):\npreparation = cls(dataset_id, approved)\npreparation.validate()\npreparation.generate_uids()\npreparation.set_statistics()\npreparation.set_operational()\npreparation.update()\npreparation.write()\nreturn preparation.dataset.id\ndef __init__(self, dataset_id: int, approved: bool):\nself.ui = config.ui\nself.dataset = Dataset.get(dataset_id)\nself.approved = approved\ndef validate(self):\nif self.dataset.state == \"OPERATION\":\nraise InvalidArgumentError(\"The dataset is already operational\")\nif not self.dataset.is_ready():\nraise InvalidArgumentError(\"The dataset is not checked\")\ndef generate_uids(self):\n\"\"\"Auto-generates dataset UIDs for both input and output paths\"\"\"\nraw_data_path, raw_labels_path = self.dataset.get_raw_paths()\nprepared_data_path = self.dataset.data_path\nprepared_labels_path = self.dataset.labels_path\nin_uid = get_folders_hash([raw_data_path, raw_labels_path])\ngenerated_uid = get_folders_hash([prepared_data_path, prepared_labels_path])\nself.dataset.input_data_hash = in_uid\nself.dataset.generated_uid = generated_uid\ndef set_statistics(self):\nwith open(self.dataset.statistics_path, \"r\") as f:\nstats = yaml.safe_load(f)\nself.dataset.generated_metadata = stats\ndef set_operational(self):\nself.dataset.state = \"OPERATION\"\ndef update(self):\nbody = self.todict()\ndict_pretty_print(body)\nmsg = \"Do you approve sending the presented data to MedPerf? 
[Y/n] \"\nself.approved = self.approved or approval_prompt(msg)\nif self.approved:\nconfig.comms.update_dataset(self.dataset.id, body)\nreturn\nraise CleanExit(\"Setting Dataset as operational was cancelled\")\ndef todict(self) -> dict:\n\"\"\"Dictionary representation of the update body\n Returns:\n dict: dictionary containing information pertaining the dataset.\n \"\"\"\nreturn {\n\"input_data_hash\": self.dataset.input_data_hash,\n\"generated_uid\": self.dataset.generated_uid,\n\"generated_metadata\": self.dataset.generated_metadata,\n\"state\": self.dataset.state,\n}\ndef write(self) -> str:\n\"\"\"Writes the registration into disk\n Args:\n filename (str, optional): name of the file. Defaults to config.reg_file.\n \"\"\"\nself.dataset.write()\n
"},{"location":"reference/commands/dataset/set_operational/#commands.dataset.set_operational.DatasetSetOperational.generate_uids","title":"generate_uids()
","text":"Auto-generates dataset UIDs for both input and output paths
Source code in cli/medperf/commands/dataset/set_operational.py
def generate_uids(self):\n\"\"\"Auto-generates dataset UIDs for both input and output paths\"\"\"\nraw_data_path, raw_labels_path = self.dataset.get_raw_paths()\nprepared_data_path = self.dataset.data_path\nprepared_labels_path = self.dataset.labels_path\nin_uid = get_folders_hash([raw_data_path, raw_labels_path])\ngenerated_uid = get_folders_hash([prepared_data_path, prepared_labels_path])\nself.dataset.input_data_hash = in_uid\nself.dataset.generated_uid = generated_uid\n
"},{"location":"reference/commands/dataset/set_operational/#commands.dataset.set_operational.DatasetSetOperational.todict","title":"todict()
","text":"Dictionary representation of the update body
Returns:
| Name | Type | Description |
| --- | --- | --- |
| dict | dict | dictionary containing information pertaining to the dataset. |
Source code in cli/medperf/commands/dataset/set_operational.py
def todict(self) -> dict:\n\"\"\"Dictionary representation of the update body\n Returns:\n dict: dictionary containing information pertaining the dataset.\n \"\"\"\nreturn {\n\"input_data_hash\": self.dataset.input_data_hash,\n\"generated_uid\": self.dataset.generated_uid,\n\"generated_metadata\": self.dataset.generated_metadata,\n\"state\": self.dataset.state,\n}\n
"},{"location":"reference/commands/dataset/set_operational/#commands.dataset.set_operational.DatasetSetOperational.write","title":"write()
","text":"Writes the registration into disk
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| filename | str | name of the file. Defaults to config.reg_file. | required |
Source code in cli/medperf/commands/dataset/set_operational.py
def write(self) -> str:\n\"\"\"Writes the registration into disk\n Args:\n filename (str, optional): name of the file. Defaults to config.reg_file.\n \"\"\"\nself.dataset.write()\n
"},{"location":"reference/commands/dataset/submit/","title":"Submit","text":""},{"location":"reference/commands/dataset/submit/#commands.dataset.submit.DataCreation","title":"DataCreation
","text":"Source code in cli/medperf/commands/dataset/submit.py
class DataCreation:\n@classmethod\ndef run(\ncls,\nbenchmark_uid: int,\nprep_cube_uid: int,\ndata_path: str,\nlabels_path: str,\nmetadata_path: str = None,\nname: str = None,\ndescription: str = None,\nlocation: str = None,\napproved: bool = False,\nsubmit_as_prepared: bool = False,\nfor_test: bool = False,\n):\npreparation = cls(\nbenchmark_uid,\nprep_cube_uid,\ndata_path,\nlabels_path,\nmetadata_path,\nname,\ndescription,\nlocation,\napproved,\nsubmit_as_prepared,\nfor_test,\n)\npreparation.validate()\npreparation.validate_prep_cube()\npreparation.create_dataset_object()\nif submit_as_prepared:\npreparation.make_dataset_prepared()\nupdated_dataset_dict = preparation.upload()\npreparation.to_permanent_path(updated_dataset_dict)\npreparation.write(updated_dataset_dict)\nreturn updated_dataset_dict[\"id\"]\ndef __init__(\nself,\nbenchmark_uid: int,\nprep_cube_uid: int,\ndata_path: str,\nlabels_path: str,\nmetadata_path: str,\nname: str,\ndescription: str,\nlocation: str,\napproved: bool,\nsubmit_as_prepared: bool,\nfor_test: bool,\n):\nself.ui = config.ui\nself.data_path = str(Path(data_path).resolve())\nself.labels_path = str(Path(labels_path).resolve())\nself.metadata_path = metadata_path\nself.name = name\nself.description = description\nself.location = location\nself.benchmark_uid = benchmark_uid\nself.prep_cube_uid = prep_cube_uid\nself.approved = approved\nself.submit_as_prepared = submit_as_prepared\nself.for_test = for_test\ndef validate(self):\nif not os.path.exists(self.data_path):\nraise InvalidArgumentError(\"The provided data path doesn't exist\")\nif not os.path.exists(self.labels_path):\nraise InvalidArgumentError(\"The provided labels path doesn't exist\")\nif not self.submit_as_prepared and self.metadata_path:\nraise InvalidArgumentError(\n\"metadata path should only be provided when the dataset is submitted as prepared\"\n)\nif self.metadata_path:\nself.metadata_path = str(Path(self.metadata_path).resolve())\nif not 
os.path.exists(self.metadata_path):\nraise InvalidArgumentError(\"The provided metadata path doesn't exist\")\n# TODO: should we check the prep mlcube and accordingly check if metadata path\n# is required? For now, we will anyway create an empty metadata folder\n# (in self.make_dataset_prepared)\ntoo_many_resources = self.benchmark_uid and self.prep_cube_uid\nno_resource = self.benchmark_uid is None and self.prep_cube_uid is None\nif no_resource or too_many_resources:\nraise InvalidArgumentError(\n\"Must provide either a benchmark or a preparation mlcube\"\n)\ndef validate_prep_cube(self):\nif self.prep_cube_uid is None:\nbenchmark = Benchmark.get(self.benchmark_uid)\nself.prep_cube_uid = benchmark.data_preparation_mlcube\nCube.get(self.prep_cube_uid)\ndef create_dataset_object(self):\n\"\"\"generates dataset UIDs for both input path\"\"\"\nin_uid = get_folders_hash([self.data_path, self.labels_path])\ndataset = Dataset(\nname=self.name,\ndescription=self.description,\nlocation=self.location,\ndata_preparation_mlcube=self.prep_cube_uid,\ninput_data_hash=in_uid,\ngenerated_uid=in_uid,\nsplit_seed=0,\ngenerated_metadata={},\nstate=\"DEVELOPMENT\",\nsubmitted_as_prepared=self.submit_as_prepared,\nfor_test=self.for_test,\n)\ndataset.write()\nconfig.tmp_paths.append(dataset.path)\ndataset.set_raw_paths(\nraw_data_path=self.data_path,\nraw_labels_path=self.labels_path,\n)\nself.dataset = dataset\ndef make_dataset_prepared(self):\nshutil.copytree(self.data_path, self.dataset.data_path)\nshutil.copytree(self.labels_path, self.dataset.labels_path)\nif self.metadata_path:\nshutil.copytree(self.metadata_path, self.dataset.metadata_path)\nelse:\n# Create an empty folder. 
The statistics logic should\n# also expect an empty folder to accommodate for users who\n# have prepared datasets with no metadata information\nos.makedirs(self.dataset.metadata_path, exist_ok=True)\ndef upload(self):\nsubmission_dict = self.dataset.todict()\ndict_pretty_print(submission_dict)\nmsg = \"Do you approve the registration of the presented data to MedPerf? [Y/n] \"\nwarning = (\n\"Upon submission, your email address will be visible to the Data Preparation\"\n+ \" Owner for traceability and debugging purposes.\"\n)\nself.ui.print_warning(warning)\nself.approved = self.approved or approval_prompt(msg)\nif self.approved:\nupdated_body = self.dataset.upload()\nreturn updated_body\nraise CleanExit(\"Dataset submission operation cancelled\")\ndef to_permanent_path(self, updated_dataset_dict: dict):\n\"\"\"Renames the temporary dataset submission to a permanent one\n Args:\n updated_dataset_dict (dict): dictionary containing updated information of the submitted dataset\n \"\"\"\nold_dataset_loc = self.dataset.path\nupdated_dataset = Dataset(**updated_dataset_dict)\nnew_dataset_loc = updated_dataset.path\nremove_path(new_dataset_loc)\nos.rename(old_dataset_loc, new_dataset_loc)\ndef write(self, updated_dataset_dict):\ndataset = Dataset(**updated_dataset_dict)\ndataset.write()\n
"},{"location":"reference/commands/dataset/submit/#commands.dataset.submit.DataCreation.create_dataset_object","title":"create_dataset_object()
","text":"generates dataset UIDs for both input path
Source code in cli/medperf/commands/dataset/submit.py
def create_dataset_object(self):\n\"\"\"generates dataset UIDs for both input path\"\"\"\nin_uid = get_folders_hash([self.data_path, self.labels_path])\ndataset = Dataset(\nname=self.name,\ndescription=self.description,\nlocation=self.location,\ndata_preparation_mlcube=self.prep_cube_uid,\ninput_data_hash=in_uid,\ngenerated_uid=in_uid,\nsplit_seed=0,\ngenerated_metadata={},\nstate=\"DEVELOPMENT\",\nsubmitted_as_prepared=self.submit_as_prepared,\nfor_test=self.for_test,\n)\ndataset.write()\nconfig.tmp_paths.append(dataset.path)\ndataset.set_raw_paths(\nraw_data_path=self.data_path,\nraw_labels_path=self.labels_path,\n)\nself.dataset = dataset\n
"},{"location":"reference/commands/dataset/submit/#commands.dataset.submit.DataCreation.to_permanent_path","title":"to_permanent_path(updated_dataset_dict)
","text":"Renames the temporary benchmark submission to a permanent one
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| updated_dataset_dict | dict | dictionary containing updated information of the submitted dataset | required |
Source code in cli/medperf/commands/dataset/submit.py
def to_permanent_path(self, updated_dataset_dict: dict):\n\"\"\"Renames the temporary dataset submission to a permanent one\n Args:\n updated_dataset_dict (dict): dictionary containing updated information of the submitted dataset\n \"\"\"\nold_dataset_loc = self.dataset.path\nupdated_dataset = Dataset(**updated_dataset_dict)\nnew_dataset_loc = updated_dataset.path\nremove_path(new_dataset_loc)\nos.rename(old_dataset_loc, new_dataset_loc)\n
"},{"location":"reference/commands/mlcube/associate/","title":"Associate","text":""},{"location":"reference/commands/mlcube/associate/#commands.mlcube.associate.AssociateCube","title":"AssociateCube
","text":"Source code in cli/medperf/commands/mlcube/associate.py
class AssociateCube:\n@classmethod\ndef run(\ncls,\ncube_uid: int,\nbenchmark_uid: int,\napproved=False,\nno_cache=False,\n):\n\"\"\"Associates a cube with a given benchmark\n Args:\n cube_uid (int): UID of model MLCube\n benchmark_uid (int): UID of benchmark\n approved (bool): Skip validation step. Defaults to False\n \"\"\"\ncomms = config.comms\nui = config.ui\ncube = Cube.get(cube_uid)\nbenchmark = Benchmark.get(benchmark_uid)\n_, results = CompatibilityTestExecution.run(\nbenchmark=benchmark_uid, model=cube_uid, no_cache=no_cache\n)\nui.print(\"These are the results generated by the compatibility test. \")\nui.print(\"This will be sent along the association request.\")\nui.print(\"They will not be part of the benchmark.\")\ndict_pretty_print(results)\nmsg = \"Please confirm that you would like to associate \"\nmsg += f\"the MLCube '{cube.name}' with the benchmark '{benchmark.name}' [Y/n]\"\napproved = approved or approval_prompt(msg)\nif approved:\nui.print(\"Generating mlcube benchmark association\")\nmetadata = {\"test_result\": results}\ncomms.associate_cube(cube_uid, benchmark_uid, metadata)\nelse:\nui.print(\"MLCube association operation cancelled\")\n
"},{"location":"reference/commands/mlcube/associate/#commands.mlcube.associate.AssociateCube.run","title":"run(cube_uid, benchmark_uid, approved=False, no_cache=False)
classmethod
","text":"Associates a cube with a given benchmark
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cube_uid | int | UID of model MLCube | required |
| benchmark_uid | int | UID of benchmark | required |
| approved | bool | Skip validation step. Defaults to False | False |
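The approved flag follows a skip-or-ask pattern: when it is already true the prompt is never shown, otherwise the user is asked. The sketch below illustrates that pattern only. The confirm helper and its ask parameter are hypothetical names, and accepting only an explicit y or yes is an assumption; the real approval_prompt may treat input differently.

```python
# Hypothetical illustration of the skip-or-ask pattern; not MedPerf code.
def confirm(approved, message, ask=input):
    '''Return True when already approved, otherwise ask the user.'''
    if approved:
        # Equivalent to passing -y on the command line: no prompt is shown.
        return True
    answer = ask(message).strip().lower()
    # Assumption: only an explicit y or yes counts as approval here.
    return answer in ('y', 'yes')


if __name__ == '__main__':
    # A canned answer stands in for interactive input.
    print(confirm(False, 'Associate? [Y/n] ', ask=lambda _: 'y'))
```

Passing the input function as a parameter keeps the sketch testable without a terminal.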
Source code in cli/medperf/commands/mlcube/associate.py
@classmethod\ndef run(\ncls,\ncube_uid: int,\nbenchmark_uid: int,\napproved=False,\nno_cache=False,\n):\n\"\"\"Associates a cube with a given benchmark\n Args:\n cube_uid (int): UID of model MLCube\n benchmark_uid (int): UID of benchmark\n approved (bool): Skip validation step. Defaults to False\n \"\"\"\ncomms = config.comms\nui = config.ui\ncube = Cube.get(cube_uid)\nbenchmark = Benchmark.get(benchmark_uid)\n_, results = CompatibilityTestExecution.run(\nbenchmark=benchmark_uid, model=cube_uid, no_cache=no_cache\n)\nui.print(\"These are the results generated by the compatibility test. \")\nui.print(\"This will be sent along the association request.\")\nui.print(\"They will not be part of the benchmark.\")\ndict_pretty_print(results)\nmsg = \"Please confirm that you would like to associate \"\nmsg += f\"the MLCube '{cube.name}' with the benchmark '{benchmark.name}' [Y/n]\"\napproved = approved or approval_prompt(msg)\nif approved:\nui.print(\"Generating mlcube benchmark association\")\nmetadata = {\"test_result\": results}\ncomms.associate_cube(cube_uid, benchmark_uid, metadata)\nelse:\nui.print(\"MLCube association operation cancelled\")\n
"},{"location":"reference/commands/mlcube/create/","title":"Create","text":""},{"location":"reference/commands/mlcube/create/#commands.mlcube.create.CreateCube","title":"CreateCube
","text":"Source code in cli/medperf/commands/mlcube/create.py
class CreateCube:\n@classmethod\ndef run(cls, template_name: str, output_path: str = \".\", config_file: str = None):\n\"\"\"Creates a new MLCube based on one of the provided templates\n Args:\n template_name (str): The name of the template to use\n output_path (str, Optional): The desired path for the MLCube. Defaults to current path.\n config_file (str, Optional): Path to a JSON configuration file. If not passed, user is prompted.\n \"\"\"\ntemplate_dirs = config.templates\nif template_name not in template_dirs:\ntemplates = list(template_dirs.keys())\nraise InvalidArgumentError(\nf\"Invalid template name. Available templates: [{' | '.join(templates)}]\"\n)\nno_input = False\nif config_file is not None:\nno_input = True\n# Get package parent path\npath = abspath(Path(__file__).parent.parent.parent)\ntemplate_dir = template_dirs[template_name]\ncookiecutter(\npath,\ndirectory=template_dir,\noutput_dir=output_path,\nconfig_file=config_file,\nno_input=no_input,\n)\n
"},{"location":"reference/commands/mlcube/create/#commands.mlcube.create.CreateCube.run","title":"run(template_name, output_path='.', config_file=None)
classmethod
","text":"Creates a new MLCube based on one of the provided templates
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| template_name | str | The name of the template to use | required |
| output_path | (str, Optional) | The desired path for the MLCube. Defaults to current path. | '.' |
| config_file | (str, Optional) | Path to a JSON configuration file. If not passed, user is prompted. | None |
Source code in cli/medperf/commands/mlcube/create.py
@classmethod\ndef run(cls, template_name: str, output_path: str = \".\", config_file: str = None):\n\"\"\"Creates a new MLCube based on one of the provided templates\n Args:\n template_name (str): The name of the template to use\n output_path (str, Optional): The desired path for the MLCube. Defaults to current path.\n config_file (str, Optional): Path to a JSON configuration file. If not passed, user is prompted.\n \"\"\"\ntemplate_dirs = config.templates\nif template_name not in template_dirs:\ntemplates = list(template_dirs.keys())\nraise InvalidArgumentError(\nf\"Invalid template name. Available templates: [{' | '.join(templates)}]\"\n)\nno_input = False\nif config_file is not None:\nno_input = True\n# Get package parent path\npath = abspath(Path(__file__).parent.parent.parent)\ntemplate_dir = template_dirs[template_name]\ncookiecutter(\npath,\ndirectory=template_dir,\noutput_dir=output_path,\nconfig_file=config_file,\nno_input=no_input,\n)\n
"},{"location":"reference/commands/mlcube/mlcube/","title":"Mlcube","text":""},{"location":"reference/commands/mlcube/mlcube/#commands.mlcube.mlcube.associate","title":"associate(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='Benchmark UID'), model_uid=typer.Option(..., '--model_uid', '-m', help='Model UID'), approval=typer.Option(False, '-y', help='Skip approval step'), no_cache=typer.Option(False, '--no-cache', help='Execute the test even if results already exist'))
","text":"Associates an MLCube to a benchmark
Source code in cli/medperf/commands/mlcube/mlcube.py
@app.command(\"associate\")\n@clean_except\ndef associate(\nbenchmark_uid: int = typer.Option(..., \"--benchmark\", \"-b\", help=\"Benchmark UID\"),\nmodel_uid: int = typer.Option(..., \"--model_uid\", \"-m\", help=\"Model UID\"),\napproval: bool = typer.Option(False, \"-y\", help=\"Skip approval step\"),\nno_cache: bool = typer.Option(\nFalse,\n\"--no-cache\",\nhelp=\"Execute the test even if results already exist\",\n),\n):\n\"\"\"Associates an MLCube to a benchmark\"\"\"\nAssociateCube.run(model_uid, benchmark_uid, approved=approval, no_cache=no_cache)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/mlcube/mlcube/#commands.mlcube.mlcube.create","title":"create(template=typer.Argument(..., help=f'MLCube template name. Available templates: [{' | '.join(config.templates.keys())}]'), output_path=typer.Option('.', '--output', '-o', help='Save the generated MLCube to the specified path'), config_file=typer.Option(None, '--config-file', '-c', help='JSON Configuration file. If not present then user is prompted for configuration'))
","text":"Creates an MLCube based on one of the specified templates
Source code in cli/medperf/commands/mlcube/mlcube.py
@app.command(\"create\")\n@clean_except\ndef create(\ntemplate: str = typer.Argument(\n...,\nhelp=f\"MLCube template name. Available templates: [{' | '.join(config.templates.keys())}]\",\n),\noutput_path: str = typer.Option(\n\".\", \"--output\", \"-o\", help=\"Save the generated MLCube to the specified path\"\n),\nconfig_file: str = typer.Option(\nNone,\n\"--config-file\",\n\"-c\",\nhelp=\"JSON Configuration file. If not present then user is prompted for configuration\",\n),\n):\n\"\"\"Creates an MLCube based on one of the specified templates\"\"\"\nCreateCube.run(template, output_path, config_file)\n
"},{"location":"reference/commands/mlcube/mlcube/#commands.mlcube.mlcube.list","title":"list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered mlcubes'), mine=typer.Option(False, '--mine', help='Get current-user mlcubes'))
","text":"List mlcubes
Source code in cli/medperf/commands/mlcube/mlcube.py
@app.command(\"ls\")\n@clean_except\ndef list(\nunregistered: bool = typer.Option(\nFalse, \"--unregistered\", help=\"Get unregistered mlcubes\"\n),\nmine: bool = typer.Option(False, \"--mine\", help=\"Get current-user mlcubes\"),\n):\n\"\"\"List mlcubes\"\"\"\nEntityList.run(\nCube,\nfields=[\"UID\", \"Name\", \"State\", \"Registered\"],\nunregistered=unregistered,\nmine_only=mine,\n)\n
"},{"location":"reference/commands/mlcube/mlcube/#commands.mlcube.mlcube.submit","title":"submit(name=typer.Option(..., '--name', '-n', help='Name of the mlcube'), mlcube_file=typer.Option(..., '--mlcube-file', '-m', help='Identifier to download the mlcube file. See the description above'), mlcube_hash=typer.Option('', '--mlcube-hash', help='hash of mlcube file'), parameters_file=typer.Option('', '--parameters-file', '-p', help='Identifier to download the parameters file. See the description above'), parameters_hash=typer.Option('', '--parameters-hash', help='hash of parameters file'), additional_file=typer.Option('', '--additional-file', '-a', help='Identifier to download the additional files tarball. See the description above'), additional_hash=typer.Option('', '--additional-hash', help='hash of additional file'), image_file=typer.Option('', '--image-file', '-i', help='Identifier to download the image file. See the description above'), image_hash=typer.Option('', '--image-hash', help='hash of image file'), operational=typer.Option(False, '--operational', help='Submit the MLCube as OPERATIONAL'))
","text":"Submits a new cube to the platform.
The following assets:
- mlcube_file
- parameters_file
- additional_file
- image_file
are expected to be given in the following format: <source_prefix:resource_identifier>, where source_prefix instructs the client how to download the resource, and resource_identifier is the identifier used to download the asset. The following are supported:
1. A direct link: \"direct:<URL>\"
2. An asset hosted on the Synapse platform: \"synapse:<synapse ID>\"
If a URL is given without a source prefix, it will be treated as a direct download link.
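A minimal sketch of how an identifier in this format could be split into its two parts. The helper split_asset_identifier and the constant SUPPORTED_PREFIXES are hypothetical names used for illustration; they are not taken from the MedPerf client. Because a bare URL also contains a colon, the sketch treats the text before the first colon as a source prefix only when it is one of the two supported values.

```python
# Hypothetical helper for illustration; not the MedPerf client's own parser.
SUPPORTED_PREFIXES = ('direct', 'synapse')


def split_asset_identifier(identifier):
    '''Return a (source_prefix, resource_identifier) pair.'''
    prefix, sep, rest = identifier.partition(':')
    if sep and prefix in SUPPORTED_PREFIXES:
        return prefix, rest
    # No recognised prefix: treat the whole string as a direct download link.
    return 'direct', identifier


if __name__ == '__main__':
    # The Synapse ID and URL below are placeholders.
    print(split_asset_identifier('synapse:syn0000'))
    print(split_asset_identifier('https://example.com/mlcube.yaml'))
```

Checking the prefix against a fixed set, rather than splitting on the first colon unconditionally, is what lets an unprefixed https URL fall through to the direct case.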
Source code in cli/medperf/commands/mlcube/mlcube.py
@app.command(\"submit\")\n@clean_except\ndef submit(\nname: str = typer.Option(..., \"--name\", \"-n\", help=\"Name of the mlcube\"),\nmlcube_file: str = typer.Option(\n...,\n\"--mlcube-file\",\n\"-m\",\nhelp=\"Identifier to download the mlcube file. See the description above\",\n),\nmlcube_hash: str = typer.Option(\"\", \"--mlcube-hash\", help=\"hash of mlcube file\"),\nparameters_file: str = typer.Option(\n\"\",\n\"--parameters-file\",\n\"-p\",\nhelp=\"Identifier to download the parameters file. See the description above\",\n),\nparameters_hash: str = typer.Option(\n\"\", \"--parameters-hash\", help=\"hash of parameters file\"\n),\nadditional_file: str = typer.Option(\n\"\",\n\"--additional-file\",\n\"-a\",\nhelp=\"Identifier to download the additional files tarball. See the description above\",\n),\nadditional_hash: str = typer.Option(\n\"\", \"--additional-hash\", help=\"hash of additional file\"\n),\nimage_file: str = typer.Option(\n\"\",\n\"--image-file\",\n\"-i\",\nhelp=\"Identifier to download the image file. See the description above\",\n),\nimage_hash: str = typer.Option(\"\", \"--image-hash\", help=\"hash of image file\"),\noperational: bool = typer.Option(\nFalse,\n\"--operational\",\nhelp=\"Submit the MLCube as OPERATIONAL\",\n),\n):\n\"\"\"Submits a new cube to the platform.\\n\n The following assets:\\n\n - mlcube_file\\n\n - parameters_file\\n\n - additional_file\\n\n - image_file\\n\n are expected to be given in the following format: <source_prefix:resource_identifier>\n where `source_prefix` instructs the client how to download the resource, and `resource_identifier`\n is the identifier used to download the asset. The following are supported:\\n\n 1. A direct link: \"direct:<URL>\"\\n\n 2. 
An asset hosted on the Synapse platform: \"synapse:<synapse ID>\"\\n\\n\n If a URL is given without a source prefix, it will be treated as a direct download link.\n \"\"\"\nmlcube_info = {\n\"name\": name,\n\"git_mlcube_url\": mlcube_file,\n\"git_mlcube_hash\": mlcube_hash,\n\"git_parameters_url\": parameters_file,\n\"parameters_hash\": parameters_hash,\n\"image_tarball_url\": image_file,\n\"image_tarball_hash\": image_hash,\n\"additional_files_tarball_url\": additional_file,\n\"additional_files_tarball_hash\": additional_hash,\n\"state\": \"OPERATION\" if operational else \"DEVELOPMENT\",\n}\nSubmitCube.run(mlcube_info)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/mlcube/mlcube/#commands.mlcube.mlcube.view","title":"view(entity_id=typer.Argument(None, help='MLCube ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered mlcubes if mlcube ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user mlcubes if mlcube ID is not provided'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
","text":"Displays the information of one or more mlcubes
Source code in cli/medperf/commands/mlcube/mlcube.py
@app.command(\"view\")\n@clean_except\ndef view(\nentity_id: Optional[int] = typer.Argument(None, help=\"MLCube ID\"),\nformat: str = typer.Option(\n\"yaml\",\n\"-f\",\n\"--format\",\nhelp=\"Format to display contents. Available formats: [yaml, json]\",\n),\nunregistered: bool = typer.Option(\nFalse,\n\"--unregistered\",\nhelp=\"Display unregistered mlcubes if mlcube ID is not provided\",\n),\nmine: bool = typer.Option(\nFalse,\n\"--mine\",\nhelp=\"Display current-user mlcubes if mlcube ID is not provided\",\n),\noutput: str = typer.Option(\nNone,\n\"--output\",\n\"-o\",\nhelp=\"Output file to store contents. If not provided, the output will be displayed\",\n),\n):\n\"\"\"Displays the information of one or more mlcubes\"\"\"\nEntityView.run(entity_id, Cube, format, unregistered, mine, output)\n
"},{"location":"reference/commands/mlcube/submit/","title":"Submit","text":""},{"location":"reference/commands/mlcube/submit/#commands.mlcube.submit.SubmitCube","title":"SubmitCube
","text":"Source code in cli/medperf/commands/mlcube/submit.py
class SubmitCube:\n@classmethod\ndef run(cls, submit_info: dict):\n\"\"\"Submits a new cube to the medperf platform\n Args:\n submit_info (dict): Dictionary containing the cube information.\n \"\"\"\nui = config.ui\nsubmission = cls(submit_info)\nwith ui.interactive():\nui.text = \"Validating MLCube can be downloaded\"\nsubmission.download()\nui.text = \"Submitting MLCube to MedPerf\"\nupdated_cube_dict = submission.upload()\nsubmission.to_permanent_path(updated_cube_dict)\nsubmission.write(updated_cube_dict)\ndef __init__(self, submit_info: dict):\nself.comms = config.comms\nself.ui = config.ui\nself.cube = Cube(**submit_info)\nconfig.tmp_paths.append(self.cube.path)\ndef download(self):\nself.cube.download_config_files()\nself.cube.download_run_files()\ndef upload(self):\nupdated_body = self.cube.upload()\nreturn updated_body\ndef to_permanent_path(self, cube_dict):\n\"\"\"Renames the temporary cube submission to a permanent one using the uid of\n the registered cube\n \"\"\"\nold_cube_loc = self.cube.path\nupdated_cube = Cube(**cube_dict)\nnew_cube_loc = updated_cube.path\nremove_path(new_cube_loc)\nos.rename(old_cube_loc, new_cube_loc)\ndef write(self, updated_cube_dict):\ncube = Cube(**updated_cube_dict)\ncube.write()\n
"},{"location":"reference/commands/mlcube/submit/#commands.mlcube.submit.SubmitCube.run","title":"run(submit_info)
classmethod
","text":"Submits a new cube to the medperf platform
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| submit_info | dict | Dictionary containing the cube information. | required |
Source code in cli/medperf/commands/mlcube/submit.py
@classmethod\ndef run(cls, submit_info: dict):\n\"\"\"Submits a new cube to the medperf platform\n Args:\n submit_info (dict): Dictionary containing the cube information.\n \"\"\"\nui = config.ui\nsubmission = cls(submit_info)\nwith ui.interactive():\nui.text = \"Validating MLCube can be downloaded\"\nsubmission.download()\nui.text = \"Submitting MLCube to MedPerf\"\nupdated_cube_dict = submission.upload()\nsubmission.to_permanent_path(updated_cube_dict)\nsubmission.write(updated_cube_dict)\n
"},{"location":"reference/commands/mlcube/submit/#commands.mlcube.submit.SubmitCube.to_permanent_path","title":"to_permanent_path(cube_dict)
","text":"Renames the temporary cube submission to a permanent one using the uid of the registered cube
Source code in cli/medperf/commands/mlcube/submit.py
def to_permanent_path(self, cube_dict):\n\"\"\"Renames the temporary cube submission to a permanent one using the uid of\n the registered cube\n \"\"\"\nold_cube_loc = self.cube.path\nupdated_cube = Cube(**cube_dict)\nnew_cube_loc = updated_cube.path\nremove_path(new_cube_loc)\nos.rename(old_cube_loc, new_cube_loc)\n
"},{"location":"reference/commands/result/create/","title":"Create","text":""},{"location":"reference/commands/result/create/#commands.result.create.BenchmarkExecution","title":"BenchmarkExecution
","text":"Source code in cli/medperf/commands/result/create.py
class BenchmarkExecution:\n@classmethod\ndef run(\ncls,\nbenchmark_uid: int,\ndata_uid: int,\nmodels_uids: Optional[List[int]] = None,\nmodels_input_file: Optional[str] = None,\nignore_model_errors=False,\nignore_failed_experiments=False,\nno_cache=False,\nshow_summary=False,\n):\n\"\"\"Benchmark execution flow.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n data_uid (str): Registered Dataset UID\n models_uids (List|None): list of model UIDs to execute.\n if None, models_input_file will be used\n models_input_file: filename to read from\n if models_uids and models_input_file are None, use all benchmark models\n \"\"\"\nexecution = cls(\nbenchmark_uid,\ndata_uid,\nmodels_uids,\nmodels_input_file,\nignore_model_errors,\nignore_failed_experiments,\n)\nexecution.prepare()\nexecution.validate()\nexecution.prepare_models()\nif not no_cache:\nexecution.load_cached_results()\nwith execution.ui.interactive():\nresults = execution.run_experiments()\nif show_summary:\nexecution.print_summary()\nreturn results\ndef __init__(\nself,\nbenchmark_uid: int,\ndata_uid: int,\nmodels_uids,\nmodels_input_file: str = None,\nignore_model_errors=False,\nignore_failed_experiments=False,\n):\nself.benchmark_uid = benchmark_uid\nself.data_uid = data_uid\nself.models_uids = models_uids\nself.models_input_file = models_input_file\nself.ui = config.ui\nself.evaluator = None\nself.ignore_model_errors = ignore_model_errors\nself.ignore_failed_experiments = ignore_failed_experiments\nself.cached_results = {}\nself.experiments = []\ndef prepare(self):\nself.benchmark = Benchmark.get(self.benchmark_uid)\nself.ui.print(f\"Benchmark Execution: {self.benchmark.name}\")\nself.dataset = Dataset.get(self.data_uid)\nevaluator_uid = self.benchmark.data_evaluator_mlcube\nself.evaluator = self.__get_cube(evaluator_uid, \"Evaluator\")\ndef validate(self):\ndset_prep_cube = self.dataset.data_preparation_mlcube\nbmark_prep_cube = self.benchmark.data_preparation_mlcube\nif self.dataset.id is 
None:\nmsg = \"The provided dataset is not registered.\"\nraise InvalidArgumentError(msg)\nif self.dataset.state != \"OPERATION\":\nmsg = \"The provided dataset is not operational.\"\nraise InvalidArgumentError(msg)\nif dset_prep_cube != bmark_prep_cube:\nmsg = \"The provided dataset is not compatible with the specified benchmark.\"\nraise InvalidArgumentError(msg)\ndef prepare_models(self):\nif self.models_input_file:\nself.models_uids = self.__get_models_from_file()\nif self.models_uids == [self.benchmark.reference_model_mlcube]:\n# avoid the need of sending a request to the server for\n# finding the benchmark's associated models\nreturn\nbenchmark_models = Benchmark.get_models_uids(self.benchmark_uid)\nbenchmark_models.append(self.benchmark.reference_model_mlcube)\nif self.models_uids is None:\nself.models_uids = benchmark_models\nelse:\nself.__validate_models(benchmark_models)\ndef __get_models_from_file(self):\nif not os.path.exists(self.models_input_file):\nraise InvalidArgumentError(\"The given file does not exist\")\nwith open(self.models_input_file) as f:\ntext = f.read()\nmodels = text.strip().split(\",\")\ntry:\nreturn list(map(int, models))\nexcept ValueError as e:\nmsg = f\"Could not parse the given file: {e}. 
\"\nmsg += \"The file should contain a list of comma-separated integers\"\nraise InvalidArgumentError(msg)\ndef __validate_models(self, benchmark_models):\nmodels_set = set(self.models_uids)\nbenchmark_models_set = set(benchmark_models)\nnon_assoc_cubes = models_set.difference(benchmark_models_set)\nif non_assoc_cubes:\n# use the plural message only when more than one model is unassociated\nif len(non_assoc_cubes) > 1:\nmsg = f\"Models of UIDs {non_assoc_cubes} are not associated with the specified benchmark.\"\nelse:\nmsg = f\"Model of UID {non_assoc_cubes} is not associated with the specified benchmark.\"\nraise InvalidArgumentError(msg)\ndef load_cached_results(self):\nuser_id = get_medperf_user_data()[\"id\"]\nresults = Result.all(filters={\"owner\": user_id})\nresults += Result.all(unregistered=True)\nbenchmark_dset_results = [\nresult\nfor result in results\nif result.benchmark == self.benchmark_uid\nand result.dataset == self.data_uid\n]\nself.cached_results = {\nresult.model: result for result in benchmark_dset_results\n}\ndef __get_cube(self, uid: int, name: str) -> Cube:\nself.ui.text = f\"Retrieving {name} cube\"\ncube = Cube.get(uid)\ncube.download_run_files()\nself.ui.print(f\"> {name} cube download complete\")\nreturn cube\ndef run_experiments(self):\nfor model_uid in self.models_uids:\nif model_uid in self.cached_results:\nself.experiments.append(\n{\n\"model_uid\": model_uid,\n\"result\": self.cached_results[model_uid],\n\"cached\": True,\n\"error\": \"\",\n}\n)\ncontinue\ntry:\nmodel_cube = self.__get_cube(model_uid, \"Model\")\nexecution_summary = Execution.run(\ndataset=self.dataset,\nmodel=model_cube,\nevaluator=self.evaluator,\nignore_model_errors=self.ignore_model_errors,\n)\nexcept MedperfException as e:\nself.__handle_experiment_error(model_uid, e)\nself.experiments.append(\n{\n\"model_uid\": model_uid,\n\"result\": None,\n\"cached\": False,\n\"error\": str(e),\n}\n)\ncontinue\npartial = execution_summary[\"partial\"]\nresults = execution_summary[\"results\"]\nresult = self.__write_result(model_uid, results, 
partial)\nself.experiments.append(\n{\n\"model_uid\": model_uid,\n\"result\": result,\n\"cached\": False,\n\"error\": \"\",\n}\n)\nreturn [experiment[\"result\"] for experiment in self.experiments]\ndef __handle_experiment_error(self, model_uid, exception):\nif isinstance(exception, InvalidEntityError):\nconfig.ui.print_error(\nf\"There was an error when retrieving the model mlcube {model_uid}: {exception}\"\n)\nelif isinstance(exception, ExecutionError):\nconfig.ui.print_error(\nf\"There was an error when executing the benchmark with the model {model_uid}: {exception}\"\n)\nelse:\nraise exception\nif not self.ignore_failed_experiments:\nraise exception\ndef __result_dict(self, model_uid, results, partial):\nreturn {\n\"name\": f\"b{self.benchmark_uid}m{model_uid}d{self.data_uid}\",\n\"benchmark\": self.benchmark_uid,\n\"model\": model_uid,\n\"dataset\": self.data_uid,\n\"results\": results,\n\"metadata\": {\"partial\": partial},\n}\ndef __write_result(self, model_uid, results, partial):\nresults_info = self.__result_dict(model_uid, results, partial)\nresult = Result(**results_info)\nresult.write()\nreturn result\ndef print_summary(self):\nheaders = [\"model\", \"local result UID\", \"partial result\", \"from cache\", \"error\"]\ndata_lists_for_display = []\nnum_total = len(self.experiments)\nnum_success_run = 0\nnum_failed = 0\nnum_skipped = 0\nnum_partial_skipped = 0\nnum_partial_run = 0\nfor experiment in self.experiments:\n# populate display data\nif experiment[\"result\"]:\ndata_lists_for_display.append(\n[\nexperiment[\"model_uid\"],\nexperiment[\"result\"].local_id,\nexperiment[\"result\"].metadata[\"partial\"],\nexperiment[\"cached\"],\nexperiment[\"error\"],\n]\n)\nelse:\ndata_lists_for_display.append(\n[experiment[\"model_uid\"], \"\", \"\", \"\", experiment[\"error\"]]\n)\n# statistics\nif experiment[\"error\"]:\nnum_failed += 1\nelif experiment[\"cached\"]:\nnum_skipped += 1\nif experiment[\"result\"].metadata[\"partial\"]:\nnum_partial_skipped += 
1\nelif experiment[\"result\"]:\nnum_success_run += 1\nif experiment[\"result\"].metadata[\"partial\"]:\nnum_partial_run += 1\ntab = tabulate(data_lists_for_display, headers=headers)\nmsg = f\"Total number of models: {num_total}\\n\"\nmsg += f\"\\t{num_skipped} were skipped (already executed), \"\nmsg += f\"of which {num_partial_skipped} have partial results\\n\"\nmsg += f\"\\t{num_failed} failed\\n\"\nmsg += f\"\\t{num_success_run} ran successfully, \"\nmsg += f\"of which {num_partial_run} have partial results\\n\"\nconfig.ui.print(tab)\nconfig.ui.print(msg)\n
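The interplay between `load_cached_results` and `run_experiments` can be illustrated with a minimal sketch. It assumes results are plain dictionaries rather than MedPerf `Result` objects, and `plan_experiments` is a hypothetical helper that is not part of MedPerf.

```python
from typing import Dict, List


def plan_experiments(models_uids: List[int], cached_results: Dict[int, dict]) -> List[dict]:
    """Mark each model as served from cache or still pending execution."""
    experiments = []
    for model_uid in models_uids:
        cached = model_uid in cached_results
        experiments.append(
            {
                "model_uid": model_uid,
                # Cached models reuse the stored result; the others have none yet.
                "result": cached_results.get(model_uid),
                "cached": cached,
            }
        )
    return experiments


# Hypothetical data: model 2 already has a result for this benchmark and dataset.
plan = plan_experiments([1, 2], {2: {"results": {"acc": 0.9}}})
```

Keying the cache by model UID, as the class does with `result.model`, makes the per-model lookup a constant-time dictionary check.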
"},{"location":"reference/commands/result/create/#commands.result.create.BenchmarkExecution.run","title":"run(benchmark_uid, data_uid, models_uids=None, models_input_file=None, ignore_model_errors=False, ignore_failed_experiments=False, no_cache=False, show_summary=False)
classmethod
","text":"Benchmark execution flow.
Parameters:
benchmark_uid (int, required): UID of the desired benchmark
data_uid (int, required): Registered Dataset UID
models_uids (List | None, default None): list of model UIDs to execute. If None, models_input_file will be used
models_input_file (Optional[str], default None): filename to read from
Source code in cli/medperf/commands/result/create.py
@classmethod\ndef run(\ncls,\nbenchmark_uid: int,\ndata_uid: int,\nmodels_uids: Optional[List[int]] = None,\nmodels_input_file: Optional[str] = None,\nignore_model_errors=False,\nignore_failed_experiments=False,\nno_cache=False,\nshow_summary=False,\n):\n\"\"\"Benchmark execution flow.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n data_uid (int): Registered Dataset UID\n models_uids (List|None): list of model UIDs to execute.\n if None, models_input_file will be used\n models_input_file: filename to read from\n if models_uids and models_input_file are None, use all benchmark models\n \"\"\"\nexecution = cls(\nbenchmark_uid,\ndata_uid,\nmodels_uids,\nmodels_input_file,\nignore_model_errors,\nignore_failed_experiments,\n)\nexecution.prepare()\nexecution.validate()\nexecution.prepare_models()\nif not no_cache:\nexecution.load_cached_results()\nwith execution.ui.interactive():\nresults = execution.run_experiments()\nif show_summary:\nexecution.print_summary()\nreturn results\n
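How the list of models to run is chosen can be sketched as follows. This is a simplified illustration based on the `prepare_models` logic shown in the class source: it takes the file contents as a string instead of a path, raises `ValueError` instead of `InvalidArgumentError`, and omits the reference-model shortcut. The names `parse_models_text` and `resolve_models` are hypothetical.

```python
from typing import List, Optional


def parse_models_text(text: str) -> List[int]:
    """Parse a comma-separated list of integers, e.g. the contents of a models file."""
    try:
        return [int(item) for item in text.strip().split(",")]
    except ValueError as e:
        raise ValueError(f"Could not parse the given text: {e}") from e


def resolve_models(
    models_uids: Optional[List[int]],
    models_file_text: Optional[str],
    benchmark_models: List[int],
) -> List[int]:
    """Pick the models to execute, mirroring the order of checks in prepare_models."""
    if models_file_text:
        # A models file, when given, replaces any explicit list.
        models_uids = parse_models_text(models_file_text)
    if models_uids is None:
        # Nothing specified: fall back to every model associated with the benchmark.
        return list(benchmark_models)
    unknown = set(models_uids) - set(benchmark_models)
    if unknown:
        raise ValueError(f"Models not associated with the benchmark: {sorted(unknown)}")
    return models_uids
```

For example, `resolve_models(None, "1, 3", [1, 2, 3])` selects models 1 and 3, while passing neither a list nor file contents selects all three.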
"},{"location":"reference/commands/result/result/","title":"Result","text":""},{"location":"reference/commands/result/result/#commands.result.result.create","title":"create(benchmark_uid=typer.Option(..., '--benchmark', '-b', help='UID of the desired benchmark'), data_uid=typer.Option(..., '--data_uid', '-d', help='Registered Dataset UID'), model_uid=typer.Option(..., '--model_uid', '-m', help='UID of model to execute'), ignore_model_errors=typer.Option(False, '--ignore-model-errors', help='Ignore failing model cubes, allowing for possibly submitting partial results'), no_cache=typer.Option(False, '--no-cache', help='Execute even if results already exist'))
","text":"Runs the benchmark execution step for a given benchmark, prepared dataset and model
Source code in cli/medperf/commands/result/result.py
@app.command(\"create\")\n@clean_except\ndef create(\nbenchmark_uid: int = typer.Option(\n..., \"--benchmark\", \"-b\", help=\"UID of the desired benchmark\"\n),\ndata_uid: int = typer.Option(\n..., \"--data_uid\", \"-d\", help=\"Registered Dataset UID\"\n),\nmodel_uid: int = typer.Option(\n..., \"--model_uid\", \"-m\", help=\"UID of model to execute\"\n),\nignore_model_errors: bool = typer.Option(\nFalse,\n\"--ignore-model-errors\",\nhelp=\"Ignore failing model cubes, allowing for possibly submitting partial results\",\n),\nno_cache: bool = typer.Option(\nFalse,\n\"--no-cache\",\nhelp=\"Execute even if results already exist\",\n),\n):\n\"\"\"Runs the benchmark execution step for a given benchmark, prepared dataset and model\"\"\"\nBenchmarkExecution.run(\nbenchmark_uid,\ndata_uid,\n[model_uid],\nno_cache=no_cache,\nignore_model_errors=ignore_model_errors,\n)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/result/result/#commands.result.result.list","title":"list(unregistered=typer.Option(False, '--unregistered', help='Get unregistered results'), mine=typer.Option(False, '--mine', help='Get current-user results'), benchmark=typer.Option(None, '--benchmark', '-b', help='Get results for a given benchmark'))
","text":"List results
Source code in cli/medperf/commands/result/result.py
@app.command(\"ls\")\n@clean_except\ndef list(\nunregistered: bool = typer.Option(\nFalse, \"--unregistered\", help=\"Get unregistered results\"\n),\nmine: bool = typer.Option(False, \"--mine\", help=\"Get current-user results\"),\nbenchmark: int = typer.Option(\nNone, \"--benchmark\", \"-b\", help=\"Get results for a given benchmark\"\n),\n):\n\"\"\"List results\"\"\"\nEntityList.run(\nResult,\nfields=[\"UID\", \"Benchmark\", \"Model\", \"Dataset\", \"Registered\"],\nunregistered=unregistered,\nmine_only=mine,\nbenchmark=benchmark,\n)\n
"},{"location":"reference/commands/result/result/#commands.result.result.submit","title":"submit(result_uid=typer.Option(..., '--result', '-r', help='Unregistered result UID'), approval=typer.Option(False, '-y', help='Skip approval step'))
","text":"Submits already obtained results to the server
Source code in cli/medperf/commands/result/result.py
@app.command(\"submit\")\n@clean_except\ndef submit(\nresult_uid: str = typer.Option(\n..., \"--result\", \"-r\", help=\"Unregistered result UID\"\n),\napproval: bool = typer.Option(False, \"-y\", help=\"Skip approval step\"),\n):\n\"\"\"Submits already obtained results to the server\"\"\"\nResultSubmission.run(result_uid, approved=approval)\nconfig.ui.print(\"\u2705 Done!\")\n
"},{"location":"reference/commands/result/result/#commands.result.result.view","title":"view(entity_id=typer.Argument(None, help='Result ID'), format=typer.Option('yaml', '-f', '--format', help='Format to display contents. Available formats: [yaml, json]'), unregistered=typer.Option(False, '--unregistered', help='Display unregistered results if result ID is not provided'), mine=typer.Option(False, '--mine', help='Display current-user results if result ID is not provided'), benchmark=typer.Option(None, '--benchmark', '-b', help='Get results for a given benchmark'), output=typer.Option(None, '--output', '-o', help='Output file to store contents. If not provided, the output will be displayed'))
","text":"Displays the information of one or more results
Source code in cli/medperf/commands/result/result.py
@app.command(\"view\")\n@clean_except\ndef view(\nentity_id: Optional[str] = typer.Argument(None, help=\"Result ID\"),\nformat: str = typer.Option(\n\"yaml\",\n\"-f\",\n\"--format\",\nhelp=\"Format to display contents. Available formats: [yaml, json]\",\n),\nunregistered: bool = typer.Option(\nFalse,\n\"--unregistered\",\nhelp=\"Display unregistered results if result ID is not provided\",\n),\nmine: bool = typer.Option(\nFalse,\n\"--mine\",\nhelp=\"Display current-user results if result ID is not provided\",\n),\nbenchmark: int = typer.Option(\nNone, \"--benchmark\", \"-b\", help=\"Get results for a given benchmark\"\n),\noutput: str = typer.Option(\nNone,\n\"--output\",\n\"-o\",\nhelp=\"Output file to store contents. If not provided, the output will be displayed\",\n),\n):\n\"\"\"Displays the information of one or more results\"\"\"\nEntityView.run(\nentity_id, Result, format, unregistered, mine, output, benchmark=benchmark\n)\n
"},{"location":"reference/commands/result/submit/","title":"Submit","text":""},{"location":"reference/commands/result/submit/#commands.result.submit.ResultSubmission","title":"ResultSubmission
","text":"Source code in cli/medperf/commands/result/submit.py
class ResultSubmission:\n@classmethod\ndef run(cls, result_uid, approved=False):\nsub = cls(result_uid, approved=approved)\nsub.get_result()\nupdated_result_dict = sub.upload_results()\nsub.to_permanent_path(updated_result_dict)\nsub.write(updated_result_dict)\ndef __init__(self, result_uid, approved=False):\nself.result_uid = result_uid\nself.comms = config.comms\nself.ui = config.ui\nself.approved = approved\ndef get_result(self):\nself.result = Result.get(self.result_uid)\ndef request_approval(self):\ndict_pretty_print(self.result.results)\nself.ui.print(\"Above are the results generated by the model\")\napproved = approval_prompt(\n\"Do you approve uploading the presented results to the MedPerf? [Y/n]\"\n)\nreturn approved\ndef upload_results(self):\napproved = self.approved or self.request_approval()\nif not approved:\nraise CleanExit(\"Results upload operation cancelled\")\nupdated_result_dict = self.result.upload()\nreturn updated_result_dict\ndef to_permanent_path(self, result_dict: dict):\n\"\"\"Rename the temporary result submission to a permanent one\n Args:\n result_dict (dict): updated results dictionary\n \"\"\"\nold_result_loc = self.result.path\nupdated_result = Result(**result_dict)\nnew_result_loc = updated_result.path\nremove_path(new_result_loc)\nos.rename(old_result_loc, new_result_loc)\ndef write(self, updated_result_dict):\nresult = Result(**updated_result_dict)\nresult.write()\n
"},{"location":"reference/commands/result/submit/#commands.result.submit.ResultSubmission.to_permanent_path","title":"to_permanent_path(result_dict)
","text":"Rename the temporary result submission to a permanent one
Parameters:
result_dict (dict, required): updated results dictionary
Source code in cli/medperf/commands/result/submit.py
def to_permanent_path(self, result_dict: dict):\n\"\"\"Rename the temporary result submission to a permanent one\n Args:\n result_dict (dict): updated results dictionary\n \"\"\"\nold_result_loc = self.result.path\nupdated_result = Result(**result_dict)\nnew_result_loc = updated_result.path\nremove_path(new_result_loc)\nos.rename(old_result_loc, new_result_loc)\n
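The remove-then-rename pattern used here can be tried in isolation. The sketch below assumes results are stored as directories and uses `shutil.rmtree`/`os.remove` as a stand-in for MedPerf's `remove_path` helper, whose exact behavior is not shown in this reference. The path names are hypothetical.

```python
import os
import shutil
import tempfile


def move_to_permanent_path(old_path: str, new_path: str) -> None:
    """Replace whatever exists at new_path with the entry at old_path."""
    # Stand-in for remove_path: clear the destination if it already exists.
    if os.path.isdir(new_path):
        shutil.rmtree(new_path)
    elif os.path.exists(new_path):
        os.remove(new_path)
    os.rename(old_path, new_path)


with tempfile.TemporaryDirectory() as root:
    tmp_result = os.path.join(root, "b1m2d3")  # hypothetical temporary name
    final_result = os.path.join(root, "42")    # hypothetical server-assigned UID
    os.makedirs(tmp_result)
    os.makedirs(final_result)                  # stale copy from an earlier run
    move_to_permanent_path(tmp_result, final_result)
    moved = os.path.isdir(final_result) and not os.path.exists(tmp_result)
```

Clearing the destination first matters because `os.rename` can fail when the target already exists, for instance on Windows or when the target is a non-empty directory.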
"},{"location":"reference/comms/factory/","title":"Factory","text":""},{"location":"reference/comms/interface/","title":"Interface","text":""},{"location":"reference/comms/interface/#comms.interface.Comms","title":"Comms
","text":" Bases: ABC
Source code in cli/medperf/comms/interface.py
class Comms(ABC):\n@abstractmethod\ndef __init__(self, source: str):\n\"\"\"Create an instance of a communication object.\n Args:\n source (str): location of the communication source. Where messages are going to be sent.\n ui (UI): Implementation of the UI interface.\n token (str, Optional): authentication token to be used throughout communication. Defaults to None.\n \"\"\"\n@classmethod\n@abstractmethod\ndef parse_url(self, url: str) -> str:\n\"\"\"Parse the source URL so that it can be used by the comms implementation.\n It should handle protocols and versioning to be able to communicate with the API.\n Args:\n url (str): base URL\n Returns:\n str: parsed URL with protocol and version\n \"\"\"\n@abstractmethod\ndef get_current_user(self):\n\"\"\"Retrieve the currently-authenticated user information\"\"\"\n@abstractmethod\ndef get_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks in the platform.\n Returns:\n List[dict]: all benchmarks information.\n \"\"\"\n@abstractmethod\ndef get_benchmark(self, benchmark_uid: int) -> dict:\n\"\"\"Retrieves the benchmark specification file from the server\n Args:\n benchmark_uid (int): uid for the desired benchmark\n Returns:\n dict: benchmark specification\n \"\"\"\n@abstractmethod\ndef get_benchmark_model_associations(self, benchmark_uid: int) -> List[int]:\n\"\"\"Retrieves all the model associations of a benchmark.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n Returns:\n list[int]: List of benchmark model associations\n \"\"\"\n@abstractmethod\ndef get_user_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks created by the user\n Returns:\n List[dict]: Benchmarks data\n \"\"\"\n@abstractmethod\ndef get_cubes(self) -> List[dict]:\n\"\"\"Retrieves all MLCubes in the platform\n Returns:\n List[dict]: List containing the data of all MLCubes\n \"\"\"\n@abstractmethod\ndef get_cube_metadata(self, cube_uid: int) -> dict:\n\"\"\"Retrieves metadata about the specified cube\n Args:\n cube_uid 
(int): UID of the desired cube.\n Returns:\n dict: Dictionary containing url and hashes for the cube files\n \"\"\"\n@abstractmethod\ndef get_user_cubes(self) -> List[dict]:\n\"\"\"Retrieves metadata from all cubes registered by the user\n Returns:\n List[dict]: List of dictionaries containing the mlcubes registration information\n \"\"\"\n@abstractmethod\ndef upload_benchmark(self, benchmark_dict: dict) -> int:\n\"\"\"Uploads a new benchmark to the server.\n Args:\n benchmark_dict (dict): benchmark_data to be uploaded\n Returns:\n int: UID of newly created benchmark\n \"\"\"\n@abstractmethod\ndef upload_mlcube(self, mlcube_body: dict) -> int:\n\"\"\"Uploads an MLCube instance to the platform\n Args:\n mlcube_body (dict): Dictionary containing all the relevant data for creating mlcubes\n Returns:\n int: id of the created mlcube instance on the platform\n \"\"\"\n@abstractmethod\ndef get_datasets(self) -> List[dict]:\n\"\"\"Retrieves all datasets in the platform\n Returns:\n List[dict]: List of data from all datasets\n \"\"\"\n@abstractmethod\ndef get_dataset(self, dset_uid: str) -> dict:\n\"\"\"Retrieves a specific dataset\n Args:\n dset_uid (str): Dataset UID\n Returns:\n dict: Dataset metadata\n \"\"\"\n@abstractmethod\ndef get_user_datasets(self) -> dict:\n\"\"\"Retrieves all datasets registered by the user\n Returns:\n dict: dictionary with the contents of each dataset registration query\n \"\"\"\n@abstractmethod\ndef upload_dataset(self, reg_dict: dict) -> int:\n\"\"\"Uploads registration data to the server, under the sha name of the file.\n Args:\n reg_dict (dict): Dictionary containing registration information.\n Returns:\n int: id of the created dataset registration.\n \"\"\"\n@abstractmethod\ndef get_results(self) -> List[dict]:\n\"\"\"Retrieves all results\n Returns:\n List[dict]: List of results\n \"\"\"\n@abstractmethod\ndef get_result(self, result_uid: str) -> dict:\n\"\"\"Retrieves a specific result data\n Args:\n result_uid (str): Result UID\n 
Returns:\n dict: Result metadata\n \"\"\"\n@abstractmethod\ndef get_user_results(self) -> dict:\n\"\"\"Retrieves all results registered by the user\n Returns:\n dict: dictionary with the contents of each dataset registration query\n \"\"\"\n@abstractmethod\ndef get_benchmark_results(self, benchmark_id: int) -> dict:\n\"\"\"Retrieves all results for a given benchmark\n Args:\n benchmark_id (int): benchmark ID to retrieve results from\n Returns:\n dict: dictionary with the contents of each result in the specified benchmark\n \"\"\"\n@abstractmethod\ndef upload_result(self, results_dict: dict) -> int:\n\"\"\"Uploads result to the server.\n Args:\n results_dict (dict): Dictionary containing results information.\n Returns:\n int: id of the generated results entry\n \"\"\"\n@abstractmethod\ndef associate_dset(self, data_uid: int, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create a Dataset Benchmark association\n Args:\n data_uid (int): Registered dataset UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. Defaults to {}.\n \"\"\"\n@abstractmethod\ndef associate_cube(self, cube_uid: str, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create an MLCube-Benchmark association\n Args:\n cube_uid (str): MLCube UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. 
Defaults to {}.\n \"\"\"\n@abstractmethod\ndef set_dataset_association_approval(\nself, dataset_uid: str, benchmark_uid: str, status: str\n):\n\"\"\"Approves a dataset association\n Args:\n dataset_uid (str): Dataset UID\n benchmark_uid (str): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\n@abstractmethod\ndef set_mlcube_association_approval(\nself, mlcube_uid: str, benchmark_uid: str, status: str\n):\n\"\"\"Approves an mlcube association\n Args:\n mlcube_uid (str): Dataset UID\n benchmark_uid (str): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\n@abstractmethod\ndef get_datasets_associations(self) -> List[dict]:\n\"\"\"Get all dataset associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\n@abstractmethod\ndef get_cubes_associations(self) -> List[dict]:\n\"\"\"Get all cube associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\n@abstractmethod\ndef set_mlcube_association_priority(\nself, benchmark_uid: str, mlcube_uid: str, priority: int\n):\n\"\"\"Sets the priority of an mlcube-benchmark association\n Args:\n mlcube_uid (str): MLCube UID\n benchmark_uid (str): Benchmark UID\n priority (int): priority value to set for the association\n \"\"\"\n@abstractmethod\ndef update_dataset(self, dataset_id: int, data: dict):\n\"\"\"Updates the contents of a datasets identified by dataset_id to the new data dictionary.\n Updates may be partial.\n Args:\n dataset_id (int): ID of the dataset to update\n data (dict): Updated information of the dataset.\n \"\"\"\n@abstractmethod\ndef get_user(self, user_id: int) -> dict:\n\"\"\"Retrieves the specified user. 
This will only return if\n the current user has permission to view the requested user,\n either by being himself, an admin or an owner of a data preparation\n mlcube used by the requested user\n Args:\n user_id (int): User UID\n Returns:\n dict: Requested user information\n \"\"\"\n
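Because `Comms` is an abstract base class, a concrete backend must implement every abstract method before it can be instantiated. The sketch below shows the mechanism on a deliberately reduced, hypothetical interface (`MiniComms`) with an in-memory implementation. It is not the MedPerf REST implementation, and the benchmark fields are made up for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class MiniComms(ABC):
    """Reduced, hypothetical stand-in for the Comms interface."""

    @abstractmethod
    def get_benchmarks(self) -> List[dict]:
        """Retrieve all benchmarks."""

    @abstractmethod
    def get_benchmark(self, benchmark_uid: int) -> dict:
        """Retrieve one benchmark by UID."""


class InMemoryComms(MiniComms):
    """Serves benchmarks from a dictionary, e.g. for unit tests."""

    def __init__(self, benchmarks: List[dict]):
        self._benchmarks = {b["id"]: b for b in benchmarks}

    def get_benchmarks(self) -> List[dict]:
        return list(self._benchmarks.values())

    def get_benchmark(self, benchmark_uid: int) -> dict:
        return self._benchmarks[benchmark_uid]


comms = InMemoryComms([{"id": 1, "name": "demo"}])
```

Instantiating `MiniComms` directly, or a subclass that leaves one of the methods unimplemented, raises `TypeError`, which is what keeps alternative backends in step with the interface.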
"},{"location":"reference/comms/interface/#comms.interface.Comms.__init__","title":"__init__(source)
abstractmethod
","text":"Create an instance of a communication object.
Parameters:
source (str, required): location of the communication source, where messages are going to be sent.
ui (UI, required): Implementation of the UI interface.
token (str, optional): authentication token to be used throughout communication. Defaults to None.
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef __init__(self, source: str):\n\"\"\"Create an instance of a communication object.\n Args:\n source (str): location of the communication source. Where messages are going to be sent.\n ui (UI): Implementation of the UI interface.\n token (str, Optional): authentication token to be used throughout communication. Defaults to None.\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.associate_cube","title":"associate_cube(cube_uid, benchmark_uid, metadata={})
abstractmethod
","text":"Create an MLCube-Benchmark association
Parameters:
cube_uid (str, required): MLCube UID
benchmark_uid (int, required): Benchmark UID
metadata (dict, default {}): Additional metadata.
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef associate_cube(self, cube_uid: str, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create an MLCube-Benchmark association\n Args:\n cube_uid (str): MLCube UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. Defaults to {}.\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.associate_dset","title":"associate_dset(data_uid, benchmark_uid, metadata={})
abstractmethod
","text":"Create a Dataset Benchmark association
Parameters:
data_uid (int, required): Registered dataset UID
benchmark_uid (int, required): Benchmark UID
metadata (dict, default {}): Additional metadata.
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef associate_dset(self, data_uid: int, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create a Dataset Benchmark association\n Args:\n data_uid (int): Registered dataset UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. Defaults to {}.\n \"\"\"\n
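The signature above uses `metadata: dict = {}` as a default. In Python a default value is created once, when the function is defined, so a dictionary default is shared between calls. This is harmless as long as an implementation never mutates it. A minimal sketch of the usual defensive alternative follows. `build_association` and its payload keys are hypothetical and are not MedPerf's request format.

```python
from typing import Optional


def build_association(data_uid: int, benchmark_uid: int, metadata: Optional[dict] = None) -> dict:
    """Build an association payload without sharing a default dict between calls."""
    # Create a fresh dictionary per call instead of relying on a shared default.
    metadata = {} if metadata is None else metadata
    return {"dataset": data_uid, "benchmark": benchmark_uid, "metadata": metadata}


first = build_association(1, 2)
first["metadata"]["note"] = "edited"  # mutating one payload...
second = build_association(3, 4)      # ...does not leak into the next call
```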
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_benchmark","title":"get_benchmark(benchmark_uid)
abstractmethod
","text":"Retrieves the benchmark specification file from the server
Parameters:
benchmark_uid (int, required): uid for the desired benchmark
Returns:
dict: benchmark specification
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_benchmark(self, benchmark_uid: int) -> dict:\n\"\"\"Retrieves the benchmark specification file from the server\n Args:\n benchmark_uid (int): uid for the desired benchmark\n Returns:\n dict: benchmark specification\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_benchmark_model_associations","title":"get_benchmark_model_associations(benchmark_uid)
abstractmethod
","text":"Retrieves all the model associations of a benchmark.
Parameters:
benchmark_uid (int, required): UID of the desired benchmark
Returns:
List[int]: List of benchmark model associations
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_benchmark_model_associations(self, benchmark_uid: int) -> List[int]:\n\"\"\"Retrieves all the model associations of a benchmark.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n Returns:\n list[int]: List of benchmark model associations\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_benchmark_results","title":"get_benchmark_results(benchmark_id)
abstractmethod
","text":"Retrieves all results for a given benchmark
Parameters:
benchmark_id (int, required): benchmark ID to retrieve results from
Returns:
dict: dictionary with the contents of each result in the specified benchmark
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_benchmark_results(self, benchmark_id: int) -> dict:\n\"\"\"Retrieves all results for a given benchmark\n Args:\n benchmark_id (int): benchmark ID to retrieve results from\n Returns:\n dict: dictionary with the contents of each result in the specified benchmark\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_benchmarks","title":"get_benchmarks()
abstractmethod
","text":"Retrieves all benchmarks in the platform.
Returns:
List[dict]: all benchmarks information.
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks in the platform.\n Returns:\n List[dict]: all benchmarks information.\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_cube_metadata","title":"get_cube_metadata(cube_uid)
abstractmethod
","text":"Retrieves metadata about the specified cube
Parameters:
cube_uid (int, required): UID of the desired cube.
Returns:
dict: Dictionary containing url and hashes for the cube files
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_cube_metadata(self, cube_uid: int) -> dict:\n\"\"\"Retrieves metadata about the specified cube\n Args:\n cube_uid (int): UID of the desired cube.\n Returns:\n dict: Dictionary containing url and hashes for the cube files\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_cubes","title":"get_cubes()
abstractmethod
","text":"Retrieves all MLCubes in the platform
Returns:
List[dict]: List containing the data of all MLCubes
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_cubes(self) -> List[dict]:\n\"\"\"Retrieves all MLCubes in the platform\n Returns:\n List[dict]: List containing the data of all MLCubes\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_cubes_associations","title":"get_cubes_associations()
abstractmethod
","text":"Get all cube associations related to the current user
Returns:
List[dict]: List containing all associations information
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_cubes_associations(self) -> List[dict]:\n\"\"\"Get all cube associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_current_user","title":"get_current_user()
abstractmethod
","text":"Retrieve the currently-authenticated user information
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_current_user(self):\n\"\"\"Retrieve the currently-authenticated user information\"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_dataset","title":"get_dataset(dset_uid)
abstractmethod
","text":"Retrieves a specific dataset
Parameters:
dset_uid (str, required): Dataset UID
Returns:
dict: Dataset metadata
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_dataset(self, dset_uid: str) -> dict:\n\"\"\"Retrieves a specific dataset\n Args:\n dset_uid (str): Dataset UID\n Returns:\n dict: Dataset metadata\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_datasets","title":"get_datasets()
abstractmethod
","text":"Retrieves all datasets in the platform
Returns:
List[dict]: List of data from all datasets
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_datasets(self) -> List[dict]:\n\"\"\"Retrieves all datasets in the platform\n Returns:\n List[dict]: List of data from all datasets\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_datasets_associations","title":"get_datasets_associations()
abstractmethod
","text":"Get all dataset associations related to the current user
Returns:
List[dict]: List containing all associations information
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_datasets_associations(self) -> List[dict]:\n\"\"\"Get all dataset associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_result","title":"get_result(result_uid)
abstractmethod
","text":"Retrieves the data of a specific result
Parameters:
result_uid (str, required): Result UID
Returns:
dict: Result metadata
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_result(self, result_uid: str) -> dict:\n\"\"\"Retrieves a specific result data\n Args:\n result_uid (str): Result UID\n Returns:\n dict: Result metadata\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_results","title":"get_results()
abstractmethod
","text":"Retrieves all results
Returns:
List[dict]: List of results
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_results(self) -> List[dict]:\n\"\"\"Retrieves all results\n Returns:\n List[dict]: List of results\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_user","title":"get_user(user_id)
abstractmethod
","text":"Retrieves the specified user. This will only return if the current user has permission to view the requested user, either by being himself, an admin or an owner of a data preparation mlcube used by the requested user
Parameters:
Name | Type | Description | Default
user_id | int | User UID | required
Returns:
Name | Type | Description
dict | dict | Requested user information
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_user(self, user_id: int) -> dict:\n\"\"\"Retrieves the specified user. This will only return if\n the current user has permission to view the requested user,\n either by being himself, an admin or an owner of a data preparation\n mlcube used by the requested user\n Args:\n user_id (int): User UID\n Returns:\n dict: Requested user information\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_user_benchmarks","title":"get_user_benchmarks()
abstractmethod
","text":"Retrieves all benchmarks created by the user
Returns:
Type | Description
List[dict] | Benchmarks data
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_user_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks created by the user\n Returns:\n List[dict]: Benchmarks data\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_user_cubes","title":"get_user_cubes()
abstractmethod
","text":"Retrieves metadata from all cubes registered by the user
Returns:
Type | Description
List[dict] | List of dictionaries containing the mlcubes registration information
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_user_cubes(self) -> List[dict]:\n\"\"\"Retrieves metadata from all cubes registered by the user\n Returns:\n List[dict]: List of dictionaries containing the mlcubes registration information\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_user_datasets","title":"get_user_datasets()
abstractmethod
","text":"Retrieves all datasets registered by the user
Returns:
Name | Type | Description
dict | dict | dictionary with the contents of each dataset registration query
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_user_datasets(self) -> dict:\n\"\"\"Retrieves all datasets registered by the user\n Returns:\n dict: dictionary with the contents of each dataset registration query\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.get_user_results","title":"get_user_results()
abstractmethod
","text":"Retrieves all results registered by the user
Returns:
Name | Type | Description
dict | dict | dictionary with the contents of each result registration query
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef get_user_results(self) -> dict:\n\"\"\"Retrieves all results registered by the user\n Returns:\n dict: dictionary with the contents of each result registration query\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.parse_url","title":"parse_url(url)
abstractmethod
classmethod
","text":"Parse the source URL so that it can be used by the comms implementation. It should handle protocols and versioning to be able to communicate with the API.
Parameters:
Name | Type | Description | Default
url | str | base URL | required
Returns:
Name | Type | Description
str | str | parsed URL with protocol and version
Source code in cli/medperf/comms/interface.py
@classmethod\n@abstractmethod\ndef parse_url(cls, url: str) -> str:\n\"\"\"Parse the source URL so that it can be used by the comms implementation.\n It should handle protocols and versioning to be able to communicate with the API.\n Args:\n url (str): base URL\n Returns:\n str: parsed URL with protocol and version\n \"\"\"\n
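As an illustration of what an implementation of parse_url is expected to do, here is a minimal standalone sketch modelled on the REST implementation documented further down. The `major_version` argument is a hypothetical stand-in for `config.major_version`, and the example host names are assumptions.

```python
def parse_url(url: str, major_version: int = 0) -> str:
    """Normalize a base URL into an HTTPS API root (illustrative sketch)."""
    api_path = f"/api/v{major_version}"
    url_sections = url.split("://")
    # Drop any protocol the caller supplied; HTTPS is always enforced
    if len(url_sections) > 1:
        url = "".join(url_sections[1:])
    return f"https://{url}{api_path}"


# A plain-HTTP or protocol-less source both end up as HTTPS API roots
local_root = parse_url("http://localhost:8000")
hosted_root = parse_url("example.org", major_version=1)
```

Note that any protocol passed by the caller is discarded rather than honoured, so a server reachable only over plain HTTP would not work with this scheme.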
"},{"location":"reference/comms/interface/#comms.interface.Comms.set_dataset_association_approval","title":"set_dataset_association_approval(dataset_uid, benchmark_uid, status)
abstractmethod
","text":"Approves a dataset association
Parameters:
Name | Type | Description | Default
dataset_uid | str | Dataset UID | required
benchmark_uid | str | Benchmark UID | required
status | str | Approval status to set for the association | required
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef set_dataset_association_approval(\nself, dataset_uid: str, benchmark_uid: str, status: str\n):\n\"\"\"Approves a dataset association\n Args:\n dataset_uid (str): Dataset UID\n benchmark_uid (str): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.set_mlcube_association_approval","title":"set_mlcube_association_approval(mlcube_uid, benchmark_uid, status)
abstractmethod
","text":"Approves an mlcube association
Parameters:
Name | Type | Description | Default
mlcube_uid | str | MLCube UID | required
benchmark_uid | str | Benchmark UID | required
status | str | Approval status to set for the association | required
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef set_mlcube_association_approval(\nself, mlcube_uid: str, benchmark_uid: str, status: str\n):\n\"\"\"Approves an mlcube association\n Args:\n mlcube_uid (str): MLCube UID\n benchmark_uid (str): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.set_mlcube_association_priority","title":"set_mlcube_association_priority(benchmark_uid, mlcube_uid, priority)
abstractmethod
","text":"Sets the priority of an mlcube-benchmark association
Parameters:
Name | Type | Description | Default
mlcube_uid | str | MLCube UID | required
benchmark_uid | str | Benchmark UID | required
priority | int | priority value to set for the association | required
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef set_mlcube_association_priority(\nself, benchmark_uid: str, mlcube_uid: str, priority: int\n):\n\"\"\"Sets the priority of an mlcube-benchmark association\n Args:\n mlcube_uid (str): MLCube UID\n benchmark_uid (str): Benchmark UID\n priority (int): priority value to set for the association\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.update_dataset","title":"update_dataset(dataset_id, data)
abstractmethod
","text":"Updates the contents of the dataset identified by dataset_id with the new data dictionary. Updates may be partial.
Parameters:
Name | Type | Description | Default
dataset_id | int | ID of the dataset to update | required
data | dict | Updated information of the dataset. | required
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef update_dataset(self, dataset_id: int, data: dict):\n\"\"\"Updates the contents of the dataset identified by dataset_id with the new data dictionary.\n Updates may be partial.\n Args:\n dataset_id (int): ID of the dataset to update\n data (dict): Updated information of the dataset.\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.upload_benchmark","title":"upload_benchmark(benchmark_dict)
abstractmethod
","text":"Uploads a new benchmark to the server.
Parameters:
Name | Type | Description | Default
benchmark_dict | dict | benchmark_data to be uploaded | required
Returns:
Name | Type | Description
int | int | UID of newly created benchmark
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef upload_benchmark(self, benchmark_dict: dict) -> int:\n\"\"\"Uploads a new benchmark to the server.\n Args:\n benchmark_dict (dict): benchmark_data to be uploaded\n Returns:\n int: UID of newly created benchmark\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.upload_dataset","title":"upload_dataset(reg_dict)
abstractmethod
","text":"Uploads registration data to the server, under the sha name of the file.
Parameters:
Name | Type | Description | Default
reg_dict | dict | Dictionary containing registration information. | required
Returns:
Name | Type | Description
int | int | id of the created dataset registration.
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef upload_dataset(self, reg_dict: dict) -> int:\n\"\"\"Uploads registration data to the server, under the sha name of the file.\n Args:\n reg_dict (dict): Dictionary containing registration information.\n Returns:\n int: id of the created dataset registration.\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.upload_mlcube","title":"upload_mlcube(mlcube_body)
abstractmethod
","text":"Uploads an MLCube instance to the platform
Parameters:
Name | Type | Description | Default
mlcube_body | dict | Dictionary containing all the relevant data for creating mlcubes | required
Returns:
Name | Type | Description
int | int | id of the created mlcube instance on the platform
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef upload_mlcube(self, mlcube_body: dict) -> int:\n\"\"\"Uploads an MLCube instance to the platform\n Args:\n mlcube_body (dict): Dictionary containing all the relevant data for creating mlcubes\n Returns:\n int: id of the created mlcube instance on the platform\n \"\"\"\n
"},{"location":"reference/comms/interface/#comms.interface.Comms.upload_result","title":"upload_result(results_dict)
abstractmethod
","text":"Uploads result to the server.
Parameters:
Name | Type | Description | Default
results_dict | dict | Dictionary containing results information. | required
Returns:
Name | Type | Description
int | int | id of the generated results entry
Source code in cli/medperf/comms/interface.py
@abstractmethod\ndef upload_result(self, results_dict: dict) -> int:\n\"\"\"Uploads result to the server.\n Args:\n results_dict (dict): Dictionary containing results information.\n Returns:\n int: id of the generated results entry\n \"\"\"\n
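Because every method on the interface is abstract, a concrete comms class has to override each one before it can be instantiated. The sketch below shows the pattern with a trimmed, hypothetical two-method interface (`TrimmedComms`) and an in-memory implementation (`InMemoryComms`) of the kind that might be useful in tests. Neither class is part of MedPerf, and the way ids are assigned is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Dict


class TrimmedComms(ABC):
    """Hypothetical two-method stand-in for the Comms interface."""

    @abstractmethod
    def upload_dataset(self, reg_dict: dict) -> int:
        """Store a dataset registration and return its id."""

    @abstractmethod
    def get_dataset(self, dset_uid: int) -> dict:
        """Return the registration stored under dset_uid."""


class InMemoryComms(TrimmedComms):
    """Keeps registrations in a dict instead of calling a server."""

    def __init__(self) -> None:
        self._datasets: Dict[int, dict] = {}

    def upload_dataset(self, reg_dict: dict) -> int:
        uid = len(self._datasets) + 1  # assumed id scheme: 1, 2, 3, ...
        self._datasets[uid] = dict(reg_dict, id=uid)
        return uid

    def get_dataset(self, dset_uid: int) -> dict:
        return self._datasets[dset_uid]


comms = InMemoryComms()
new_uid = comms.upload_dataset({"name": "demo"})
stored = comms.get_dataset(new_uid)
```

Instantiating `TrimmedComms` directly raises `TypeError`, which is how Python's `abc` machinery enforces that implementations cover the whole interface.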
"},{"location":"reference/comms/rest/","title":"Rest","text":""},{"location":"reference/comms/rest/#comms.rest.REST","title":"REST
","text":" Bases: Comms
cli/medperf/comms/rest.py
class REST(Comms):\ndef __init__(self, source: str):\nself.server_url = self.parse_url(source)\nself.cert = config.certificate\nif self.cert is None:\n# No certificate provided, default to normal verification\nself.cert = True\n@classmethod\ndef parse_url(cls, url: str) -> str:\n\"\"\"Parse the source URL so that it can be used by the comms implementation.\n It should handle protocols and versioning to be able to communicate with the API.\n Args:\n url (str): base URL\n Returns:\n str: parsed URL with protocol and version\n \"\"\"\nurl_sections = url.split(\"://\")\napi_path = f\"/api/v{config.major_version}\"\n# Remove protocol if passed\nif len(url_sections) > 1:\nurl = \"\".join(url_sections[1:])\nreturn f\"https://{url}{api_path}\"\ndef __auth_get(self, url, **kwargs):\nreturn self.__auth_req(url, requests.get, **kwargs)\ndef __auth_post(self, url, **kwargs):\nreturn self.__auth_req(url, requests.post, **kwargs)\ndef __auth_put(self, url, **kwargs):\nreturn self.__auth_req(url, requests.put, **kwargs)\ndef __auth_req(self, url, req_func, **kwargs):\ntoken = config.auth.access_token\nreturn self.__req(\nurl, req_func, headers={\"Authorization\": f\"Bearer {token}\"}, **kwargs\n)\ndef __req(self, url, req_func, **kwargs):\nlogging.debug(f\"Calling {req_func}: {url}\")\nif \"json\" in kwargs:\nlogging.debug(f\"Passing JSON contents: {kwargs['json']}\")\nkwargs[\"json\"] = sanitize_json(kwargs[\"json\"])\ntry:\nreturn req_func(url, verify=self.cert, **kwargs)\nexcept requests.exceptions.SSLError as e:\nlogging.error(f\"Couldn't connect to {self.server_url}: {e}\")\nraise CommunicationError(\n\"Couldn't connect to server through HTTPS. 
If running locally, \"\n\"remember to provide the server certificate through --certificate\"\n)\ndef __get_list(\nself,\nurl,\nnum_elements=None,\npage_size=config.default_page_size,\noffset=0,\nbinary_reduction=False,\n):\n\"\"\"Retrieves a list of elements from a URL by iterating over pages until num_elements is obtained.\n If num_elements is None, then iterates until all elements have been retrieved.\n If binary_reduction is enabled, errors are assumed to be related to response size. In that case,\n the page_size is reduced by half until a successful response is obtained or until page_size can't be\n reduced anymore.\n Args:\n url (str): The url to retrieve elements from\n num_elements (int, optional): The desired number of elements to be retrieved. Defaults to None.\n page_size (int, optional): Starting page size. Defaults to config.default_page_size.\n start_limit (int, optional): The starting position for element retrieval. Defaults to 0.\n binary_reduction (bool, optional): Wether to handle errors by halfing the page size. Defaults to False.\n Returns:\n List[dict]: A list of dictionaries representing the retrieved elements.\n \"\"\"\nel_list = []\nif num_elements is None:\nnum_elements = float(\"inf\")\nwhile len(el_list) < num_elements:\npaginated_url = f\"{url}?limit={page_size}&offset={offset}\"\nres = self.__auth_get(paginated_url)\nif res.status_code != 200:\nif not binary_reduction:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"there was an error retrieving the current list: {details}\"\n)\nlog_response_error(res, warn=True)\ndetails = format_errors_dict(res.json())\nif page_size <= 1:\nraise CommunicationRetrievalError(\nf\"Could not retrieve list. 
Minimum page size achieved without success: {details}\"\n)\npage_size = page_size // 2\ncontinue\nelse:\ndata = res.json()\nel_list += data[\"results\"]\noffset += len(data[\"results\"])\nif data[\"next\"] is None:\nbreak\nif isinstance(num_elements, int):\nreturn el_list[:num_elements]\nreturn el_list\ndef __set_approval_status(self, url: str, status: str) -> requests.Response:\n\"\"\"Sets the approval status of a resource\n Args:\n url (str): URL to the resource to update\n status (str): approval status to set\n Returns:\n requests.Response: Response object returned by the update\n \"\"\"\ndata = {\"approval_status\": status}\nres = self.__auth_put(url, json=data)\nreturn res\ndef get_current_user(self):\n\"\"\"Retrieve the currently-authenticated user information\"\"\"\nres = self.__auth_get(f\"{self.server_url}/me/\")\nreturn res.json()\ndef get_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks in the platform.\n Returns:\n List[dict]: all benchmarks information.\n \"\"\"\nbmks = self.__get_list(f\"{self.server_url}/benchmarks/\")\nreturn bmks\ndef get_benchmark(self, benchmark_uid: int) -> dict:\n\"\"\"Retrieves the benchmark specification file from the server\n Args:\n benchmark_uid (int): uid for the desired benchmark\n Returns:\n dict: benchmark specification\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/benchmarks/{benchmark_uid}\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"the specified benchmark doesn't exist: {details}\"\n)\nreturn res.json()\ndef get_benchmark_model_associations(self, benchmark_uid: int) -> List[int]:\n\"\"\"Retrieves all the model associations of a benchmark.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n Returns:\n list[int]: List of benchmark model associations\n \"\"\"\nassocs = self.__get_list(f\"{self.server_url}/benchmarks/{benchmark_uid}/models\")\nreturn filter_latest_associations(assocs, 
\"model_mlcube\")\ndef get_user_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks created by the user\n Returns:\n List[dict]: Benchmarks data\n \"\"\"\nbmks = self.__get_list(f\"{self.server_url}/me/benchmarks/\")\nreturn bmks\ndef get_cubes(self) -> List[dict]:\n\"\"\"Retrieves all MLCubes in the platform\n Returns:\n List[dict]: List containing the data of all MLCubes\n \"\"\"\ncubes = self.__get_list(f\"{self.server_url}/mlcubes/\")\nreturn cubes\ndef get_cube_metadata(self, cube_uid: int) -> dict:\n\"\"\"Retrieves metadata about the specified cube\n Args:\n cube_uid (int): UID of the desired cube.\n Returns:\n dict: Dictionary containing url and hashes for the cube files\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/mlcubes/{cube_uid}/\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"the specified cube doesn't exist {details}\"\n)\nreturn res.json()\ndef get_user_cubes(self) -> List[dict]:\n\"\"\"Retrieves metadata from all cubes registered by the user\n Returns:\n List[dict]: List of dictionaries containing the mlcubes registration information\n \"\"\"\ncubes = self.__get_list(f\"{self.server_url}/me/mlcubes/\")\nreturn cubes\ndef upload_benchmark(self, benchmark_dict: dict) -> int:\n\"\"\"Uploads a new benchmark to the server.\n Args:\n benchmark_dict (dict): benchmark_data to be uploaded\n Returns:\n int: UID of newly created benchmark\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/benchmarks/\", json=benchmark_dict)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(f\"Could not upload benchmark: {details}\")\nreturn res.json()\ndef upload_mlcube(self, mlcube_body: dict) -> int:\n\"\"\"Uploads an MLCube instance to the platform\n Args:\n mlcube_body (dict): Dictionary containing all the relevant data for creating mlcubes\n Returns:\n int: id of the created mlcube 
instance on the platform\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/mlcubes/\", json=mlcube_body)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(f\"Could not upload the mlcube: {details}\")\nreturn res.json()\ndef get_datasets(self) -> List[dict]:\n\"\"\"Retrieves all datasets in the platform\n Returns:\n List[dict]: List of data from all datasets\n \"\"\"\ndsets = self.__get_list(f\"{self.server_url}/datasets/\")\nreturn dsets\ndef get_dataset(self, dset_uid: int) -> dict:\n\"\"\"Retrieves a specific dataset\n Args:\n dset_uid (int): Dataset UID\n Returns:\n dict: Dataset metadata\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/datasets/{dset_uid}/\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"Could not retrieve the specified dataset from server: {details}\"\n)\nreturn res.json()\ndef get_user_datasets(self) -> dict:\n\"\"\"Retrieves all datasets registered by the user\n Returns:\n dict: dictionary with the contents of each dataset registration query\n \"\"\"\ndsets = self.__get_list(f\"{self.server_url}/me/datasets/\")\nreturn dsets\ndef upload_dataset(self, reg_dict: dict) -> int:\n\"\"\"Uploads registration data to the server, under the sha name of the file.\n Args:\n reg_dict (dict): Dictionary containing registration information.\n Returns:\n int: id of the created dataset registration.\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/datasets/\", json=reg_dict)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not upload the dataset: {details}\")\nreturn res.json()\ndef get_results(self) -> List[dict]:\n\"\"\"Retrieves all results\n Returns:\n List[dict]: List of results\n \"\"\"\n# __get_list returns the parsed list and raises on HTTP errors itself\nresults = self.__get_list(f\"{self.server_url}/results\")\nreturn results\n
def get_result(self, result_uid: int) -> dict:\n\"\"\"Retrieves a specific result data\n Args:\n result_uid (int): Result UID\n Returns:\n dict: Result metadata\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/results/{result_uid}/\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"Could not retrieve the specified result: {details}\"\n)\nreturn res.json()\ndef get_user_results(self) -> dict:\n\"\"\"Retrieves all results registered by the user\n Returns:\n dict: dictionary with the contents of each result registration query\n \"\"\"\nresults = self.__get_list(f\"{self.server_url}/me/results/\")\nreturn results\ndef get_benchmark_results(self, benchmark_id: int) -> dict:\n\"\"\"Retrieves all results for a given benchmark\n Args:\n benchmark_id (int): benchmark ID to retrieve results from\n Returns:\n dict: dictionary with the contents of each result in the specified benchmark\n \"\"\"\nresults = self.__get_list(\nf\"{self.server_url}/benchmarks/{benchmark_id}/results\"\n)\nreturn results\ndef upload_result(self, results_dict: dict) -> int:\n\"\"\"Uploads result to the server.\n Args:\n results_dict (dict): Dictionary containing results information.\n Returns:\n int: id of the generated results entry\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/results/\", json=results_dict)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not upload the results: {details}\")\nreturn res.json()\ndef associate_dset(self, data_uid: int, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create a Dataset Benchmark association\n Args:\n data_uid (int): Registered dataset UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, 
optional): Additional metadata. Defaults to {}.\n \"\"\"\ndata = {\n\"dataset\": data_uid,\n\"benchmark\": benchmark_uid,\n\"approval_status\": Status.PENDING.value,\n\"metadata\": metadata,\n}\nres = self.__auth_post(f\"{self.server_url}/datasets/benchmarks/\", json=data)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not associate dataset to benchmark: {details}\"\n)\ndef associate_cube(self, cube_uid: int, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create an MLCube-Benchmark association\n Args:\n cube_uid (int): MLCube UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. Defaults to {}.\n \"\"\"\ndata = {\n\"approval_status\": Status.PENDING.value,\n\"model_mlcube\": cube_uid,\n\"benchmark\": benchmark_uid,\n\"metadata\": metadata,\n}\nres = self.__auth_post(f\"{self.server_url}/mlcubes/benchmarks/\", json=data)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not associate mlcube to benchmark: {details}\"\n)\ndef set_dataset_association_approval(\nself, benchmark_uid: int, dataset_uid: int, status: str\n):\n\"\"\"Approves a dataset association\n Args:\n dataset_uid (int): Dataset UID\n benchmark_uid (int): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\nurl = f\"{self.server_url}/datasets/{dataset_uid}/benchmarks/{benchmark_uid}/\"\nres = self.__set_approval_status(url, status)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not approve association between dataset {dataset_uid} and benchmark {benchmark_uid}: {details}\"\n)\ndef set_mlcube_association_approval(\nself, benchmark_uid: int, mlcube_uid: int, status: str\n):\n\"\"\"Approves an mlcube association\n Args:\n mlcube_uid (int): Dataset UID\n benchmark_uid 
(int): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\nurl = f\"{self.server_url}/mlcubes/{mlcube_uid}/benchmarks/{benchmark_uid}/\"\nres = self.__set_approval_status(url, status)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not approve association between mlcube {mlcube_uid} and benchmark {benchmark_uid}: {details}\"\n)\ndef get_datasets_associations(self) -> List[dict]:\n\"\"\"Get all dataset associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\nassocs = self.__get_list(f\"{self.server_url}/me/datasets/associations/\")\nreturn filter_latest_associations(assocs, \"dataset\")\ndef get_cubes_associations(self) -> List[dict]:\n\"\"\"Get all cube associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\nassocs = self.__get_list(f\"{self.server_url}/me/mlcubes/associations/\")\nreturn filter_latest_associations(assocs, \"model_mlcube\")\ndef set_mlcube_association_priority(\nself, benchmark_uid: int, mlcube_uid: int, priority: int\n):\n\"\"\"Sets the priority of an mlcube-benchmark association\n Args:\n mlcube_uid (int): MLCube UID\n benchmark_uid (int): Benchmark UID\n priority (int): priority value to set for the association\n \"\"\"\nurl = f\"{self.server_url}/mlcubes/{mlcube_uid}/benchmarks/{benchmark_uid}/\"\ndata = {\"priority\": priority}\nres = self.__auth_put(url, json=data)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not set the priority of mlcube {mlcube_uid} within the benchmark {benchmark_uid}: {details}\"\n)\ndef update_dataset(self, dataset_id: int, data: dict):\nurl = f\"{self.server_url}/datasets/{dataset_id}/\"\nres = self.__auth_put(url, json=data)\nif res.status_code != 
200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not update dataset: {details}\")\nreturn res.json()\ndef get_mlcube_datasets(self, mlcube_id: int) -> dict:\n\"\"\"Retrieves all datasets that have the specified mlcube as the prep mlcube\n Args:\n mlcube_id (int): mlcube ID to retrieve datasets from\n Returns:\n dict: dictionary with the contents of each dataset\n \"\"\"\ndatasets = self.__get_list(f\"{self.server_url}/mlcubes/{mlcube_id}/datasets/\")\nreturn datasets\ndef get_user(self, user_id: int) -> dict:\n\"\"\"Retrieves the specified user. This will only return if\n the current user has permission to view the requested user,\n either by being himself, an admin or an owner of a data preparation\n mlcube used by the requested user\n Args:\n user_id (int): User UID\n Returns:\n dict: Requested user information\n \"\"\"\nurl = f\"{self.server_url}/users/{user_id}/\"\nres = self.__auth_get(url)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not retrieve user: {details}\")\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.__get_list","title":"__get_list(url, num_elements=None, page_size=config.default_page_size, offset=0, binary_reduction=False)
","text":"Retrieves a list of elements from a URL by iterating over pages until num_elements is obtained. If num_elements is None, then iterates until all elements have been retrieved. If binary_reduction is enabled, errors are assumed to be related to response size. In that case, the page_size is reduced by half until a successful response is obtained or until page_size can't be reduced anymore.
Parameters:
Name | Type | Description | Default
url | str | The url to retrieve elements from | required
num_elements | int | The desired number of elements to be retrieved. Defaults to None. | None
page_size | int | Starting page size. Defaults to config.default_page_size. | config.default_page_size
offset | int | The starting position for element retrieval. Defaults to 0. | 0
binary_reduction | bool | Whether to handle errors by halving the page size. Defaults to False. | False
Returns:
Type | Description
List[dict] | A list of dictionaries representing the retrieved elements.
Source code in cli/medperf/comms/rest.py
def __get_list(\nself,\nurl,\nnum_elements=None,\npage_size=config.default_page_size,\noffset=0,\nbinary_reduction=False,\n):\n\"\"\"Retrieves a list of elements from a URL by iterating over pages until num_elements is obtained.\n If num_elements is None, then iterates until all elements have been retrieved.\n If binary_reduction is enabled, errors are assumed to be related to response size. In that case,\n the page_size is reduced by half until a successful response is obtained or until page_size can't be\n reduced anymore.\n Args:\n url (str): The url to retrieve elements from\n num_elements (int, optional): The desired number of elements to be retrieved. Defaults to None.\n page_size (int, optional): Starting page size. Defaults to config.default_page_size.\n offset (int, optional): The starting position for element retrieval. Defaults to 0.\n binary_reduction (bool, optional): Whether to handle errors by halving the page size. Defaults to False.\n Returns:\n List[dict]: A list of dictionaries representing the retrieved elements.\n \"\"\"\nel_list = []\nif num_elements is None:\nnum_elements = float(\"inf\")\nwhile len(el_list) < num_elements:\npaginated_url = f\"{url}?limit={page_size}&offset={offset}\"\nres = self.__auth_get(paginated_url)\nif res.status_code != 200:\nif not binary_reduction:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"there was an error retrieving the current list: {details}\"\n)\nlog_response_error(res, warn=True)\ndetails = format_errors_dict(res.json())\nif page_size <= 1:\nraise CommunicationRetrievalError(\nf\"Could not retrieve list. Minimum page size achieved without success: {details}\"\n)\npage_size = page_size // 2\ncontinue\nelse:\ndata = res.json()\nel_list += data[\"results\"]\noffset += len(data[\"results\"])\nif data[\"next\"] is None:\nbreak\nif isinstance(num_elements, int):\nreturn el_list[:num_elements]\nreturn el_list\n
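The pagination loop above can be exercised without a server. The following sketch mirrors the same control flow under stated assumptions: `fetch_page` is a hypothetical callable standing in for the authenticated GET request, `make_fake_server` is an invented test double that rejects pages above a size limit, and `RuntimeError` replaces the MedPerf-specific exception.

```python
from typing import Callable, List, Optional, Tuple


def get_list(
    fetch_page: Callable[[int, int], Tuple[int, dict]],
    num_elements: Optional[int] = None,
    page_size: int = 32,
    offset: int = 0,
    binary_reduction: bool = False,
) -> List[dict]:
    """Collect paginated results, optionally halving page_size on errors."""
    elements: List[dict] = []
    limit = float("inf") if num_elements is None else num_elements
    while len(elements) < limit:
        status, data = fetch_page(page_size, offset)
        if status != 200:
            if not binary_reduction or page_size <= 1:
                raise RuntimeError(f"could not retrieve page (status {status})")
            page_size //= 2  # assume the response was too large; retry smaller
            continue
        elements += data["results"]
        offset += len(data["results"])
        if data["next"] is None:
            break
    return elements if num_elements is None else elements[:num_elements]


def make_fake_server(total: int, max_page: int):
    """Hypothetical server that answers 500 to pages larger than max_page."""
    items = [{"id": i} for i in range(total)]

    def fetch_page(limit: int, offset: int) -> Tuple[int, dict]:
        if limit > max_page:
            return 500, {}
        page = items[offset:offset + limit]
        next_link = None if offset + limit >= total else "more"
        return 200, {"results": page, "next": next_link}

    return fetch_page


fetch = make_fake_server(total=10, max_page=4)
everything = get_list(fetch, page_size=16, binary_reduction=True)
first_five = get_list(fetch, num_elements=5, page_size=16, binary_reduction=True)
```

With a starting page size of 16 and a server limit of 4, the loop fails twice (16, then 8) before succeeding at 4, and the offset only advances on successful pages, so no element is skipped or fetched twice.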
"},{"location":"reference/comms/rest/#comms.rest.REST.__set_approval_status","title":"__set_approval_status(url, status)
","text":"Sets the approval status of a resource
Parameters:
Name | Type | Description | Default
url | str | URL to the resource to update | required
status | str | approval status to set | required
Returns:
Type | Description
requests.Response | Response object returned by the update
Source code in cli/medperf/comms/rest.py
def __set_approval_status(self, url: str, status: str) -> requests.Response:\n\"\"\"Sets the approval status of a resource\n Args:\n url (str): URL to the resource to update\n status (str): approval status to set\n Returns:\n requests.Response: Response object returned by the update\n \"\"\"\ndata = {\"approval_status\": status}\nres = self.__auth_put(url, json=data)\nreturn res\n
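To make the request shape concrete, the sketch below builds the URL and JSON body that a dataset approval update would send through this helper, following the URL pattern used by set_dataset_association_approval in the class source. The function name `dataset_approval_request`, the example server root, and the `"APPROVED"` status string are assumptions for illustration; only the pending status appears in the documented source.

```python
from typing import Tuple


def dataset_approval_request(
    server_url: str, dataset_uid: int, benchmark_uid: int, status: str
) -> Tuple[str, dict]:
    """Return the (url, json_body) pair for a dataset approval update."""
    url = f"{server_url}/datasets/{dataset_uid}/benchmarks/{benchmark_uid}/"
    return url, {"approval_status": status}


# "APPROVED" is an assumed status value, used here only as an example
url, body = dataset_approval_request("https://example.org/api/v0", 3, 7, "APPROVED")
```

The resulting pair would then be sent as an authenticated PUT, with the caller checking for a 200 status code as the public methods do.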
"},{"location":"reference/comms/rest/#comms.rest.REST.associate_cube","title":"associate_cube(cube_uid, benchmark_uid, metadata={})
","text":"Create an MLCube-Benchmark association
Parameters:
Name | Type | Description | Default
cube_uid | int | MLCube UID | required
benchmark_uid | int | Benchmark UID | required
metadata | dict | Additional metadata. Defaults to {}. | {}
Source code in cli/medperf/comms/rest.py
def associate_cube(self, cube_uid: int, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create an MLCube-Benchmark association\n Args:\n cube_uid (int): MLCube UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. Defaults to {}.\n \"\"\"\ndata = {\n\"approval_status\": Status.PENDING.value,\n\"model_mlcube\": cube_uid,\n\"benchmark\": benchmark_uid,\n\"metadata\": metadata,\n}\nres = self.__auth_post(f\"{self.server_url}/mlcubes/benchmarks/\", json=data)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not associate mlcube to benchmark: {details}\"\n)\n
"},{"location":"reference/comms/rest/#comms.rest.REST.associate_dset","title":"associate_dset(data_uid, benchmark_uid, metadata={})
","text":"Create a Dataset Benchmark association
Parameters:
Name | Type | Description | Default
data_uid | int | Registered dataset UID | required
benchmark_uid | int | Benchmark UID | required
metadata | dict | Additional metadata. Defaults to {}. | {}
Source code in cli/medperf/comms/rest.py
def associate_dset(self, data_uid: int, benchmark_uid: int, metadata: dict = {}):\n\"\"\"Create a Dataset Benchmark association\n Args:\n data_uid (int): Registered dataset UID\n benchmark_uid (int): Benchmark UID\n metadata (dict, optional): Additional metadata. Defaults to {}.\n \"\"\"\ndata = {\n\"dataset\": data_uid,\n\"benchmark\": benchmark_uid,\n\"approval_status\": Status.PENDING.value,\n\"metadata\": metadata,\n}\nres = self.__auth_post(f\"{self.server_url}/datasets/benchmarks/\", json=data)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not associate dataset to benchmark: {details}\"\n)\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_benchmark","title":"get_benchmark(benchmark_uid)
","text":"Retrieves the benchmark specification file from the server
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| benchmark_uid | int | uid for the desired benchmark | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | benchmark specification |
Source code in cli/medperf/comms/rest.py
def get_benchmark(self, benchmark_uid: int) -> dict:\n\"\"\"Retrieves the benchmark specification file from the server\n Args:\n benchmark_uid (int): uid for the desired benchmark\n Returns:\n dict: benchmark specification\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/benchmarks/{benchmark_uid}\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"the specified benchmark doesn't exist: {details}\"\n)\nreturn res.json()\n
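Most retrieval methods on this page follow the same shape: compare the status code to the expected one, and raise with the server's details otherwise. A minimal sketch of that pattern, assuming a hypothetical `json_or_raise` helper and a stand-in exception class rather than the client's own `CommunicationRetrievalError`:

```python
class RetrievalError(Exception):
    '''Stand-in for CommunicationRetrievalError (assumption for this sketch).'''


def json_or_raise(status_code: int, payload: dict, expected: int = 200) -> dict:
    '''Return payload when the status matches, otherwise raise.'''
    if status_code != expected:
        raise RetrievalError(f'unexpected status {status_code}: {payload}')
    return payload
```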
"},{"location":"reference/comms/rest/#comms.rest.REST.get_benchmark_model_associations","title":"get_benchmark_model_associations(benchmark_uid)
","text":"Retrieves all the model associations of a benchmark.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| benchmark_uid | int | UID of the desired benchmark | required |
Returns:
| Type | Description |
|---|---|
| List[int] | list[int]: List of benchmark model associations |
Source code in cli/medperf/comms/rest.py
def get_benchmark_model_associations(self, benchmark_uid: int) -> List[int]:\n\"\"\"Retrieves all the model associations of a benchmark.\n Args:\n benchmark_uid (int): UID of the desired benchmark\n Returns:\n list[int]: List of benchmark model associations\n \"\"\"\nassocs = self.__get_list(f\"{self.server_url}/benchmarks/{benchmark_uid}/models\")\nreturn filter_latest_associations(assocs, \"model_mlcube\")\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_benchmark_results","title":"get_benchmark_results(benchmark_id)
","text":"Retrieves all results for a given benchmark
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| benchmark_id | int | benchmark ID to retrieve results from | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | dictionary with the contents of each result in the specified benchmark |
Source code in cli/medperf/comms/rest.py
def get_benchmark_results(self, benchmark_id: int) -> dict:\n\"\"\"Retrieves all results for a given benchmark\n Args:\n benchmark_id (int): benchmark ID to retrieve results from\n Returns:\n dict: dictionary with the contents of each result in the specified benchmark\n \"\"\"\nresults = self.__get_list(\nf\"{self.server_url}/benchmarks/{benchmark_id}/results\"\n)\nreturn results\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_benchmarks","title":"get_benchmarks()
","text":"Retrieves all benchmarks in the platform.
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: all benchmarks information. |
Source code in cli/medperf/comms/rest.py
def get_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks in the platform.\n Returns:\n List[dict]: all benchmarks information.\n \"\"\"\nbmks = self.__get_list(f\"{self.server_url}/benchmarks/\")\nreturn bmks\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_cube_metadata","title":"get_cube_metadata(cube_uid)
","text":"Retrieves metadata about the specified cube
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cube_uid | int | UID of the desired cube. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Dictionary containing url and hashes for the cube files |
Source code in cli/medperf/comms/rest.py
def get_cube_metadata(self, cube_uid: int) -> dict:\n\"\"\"Retrieves metadata about the specified cube\n Args:\n cube_uid (int): UID of the desired cube.\n Returns:\n dict: Dictionary containing url and hashes for the cube files\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/mlcubes/{cube_uid}/\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"the specified cube doesn't exist {details}\"\n)\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_cubes","title":"get_cubes()
","text":"Retrieves all MLCubes in the platform
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: List containing the data of all MLCubes |
Source code in cli/medperf/comms/rest.py
def get_cubes(self) -> List[dict]:\n\"\"\"Retrieves all MLCubes in the platform\n Returns:\n List[dict]: List containing the data of all MLCubes\n \"\"\"\ncubes = self.__get_list(f\"{self.server_url}/mlcubes/\")\nreturn cubes\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_cubes_associations","title":"get_cubes_associations()
","text":"Get all cube associations related to the current user
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: List containing all associations information |
Source code in cli/medperf/comms/rest.py
def get_cubes_associations(self) -> List[dict]:\n\"\"\"Get all cube associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\nassocs = self.__get_list(f\"{self.server_url}/me/mlcubes/associations/\")\nreturn filter_latest_associations(assocs, \"model_mlcube\")\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_current_user","title":"get_current_user()
","text":"Retrieve the currently-authenticated user information
Source code in cli/medperf/comms/rest.py
def get_current_user(self):\n\"\"\"Retrieve the currently-authenticated user information\"\"\"\nres = self.__auth_get(f\"{self.server_url}/me/\")\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_dataset","title":"get_dataset(dset_uid)
","text":"Retrieves a specific dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dset_uid | int | Dataset UID | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Dataset metadata |
Source code in cli/medperf/comms/rest.py
def get_dataset(self, dset_uid: int) -> dict:\n\"\"\"Retrieves a specific dataset\n Args:\n dset_uid (int): Dataset UID\n Returns:\n dict: Dataset metadata\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/datasets/{dset_uid}/\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"Could not retrieve the specified dataset from server: {details}\"\n)\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_datasets","title":"get_datasets()
","text":"Retrieves all datasets in the platform
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: List of data from all datasets |
Source code in cli/medperf/comms/rest.py
def get_datasets(self) -> List[dict]:\n\"\"\"Retrieves all datasets in the platform\n Returns:\n List[dict]: List of data from all datasets\n \"\"\"\ndsets = self.__get_list(f\"{self.server_url}/datasets/\")\nreturn dsets\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_datasets_associations","title":"get_datasets_associations()
","text":"Get all dataset associations related to the current user
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: List containing all associations information |
Source code in cli/medperf/comms/rest.py
def get_datasets_associations(self) -> List[dict]:\n\"\"\"Get all dataset associations related to the current user\n Returns:\n List[dict]: List containing all associations information\n \"\"\"\nassocs = self.__get_list(f\"{self.server_url}/me/datasets/associations/\")\nreturn filter_latest_associations(assocs, \"dataset\")\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_mlcube_datasets","title":"get_mlcube_datasets(mlcube_id)
","text":"Retrieves all datasets that have the specified mlcube as the prep mlcube
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mlcube_id | int | mlcube ID to retrieve datasets from | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | dictionary with the contents of each dataset |
Source code in cli/medperf/comms/rest.py
def get_mlcube_datasets(self, mlcube_id: int) -> dict:\n\"\"\"Retrieves all datasets that have the specified mlcube as the prep mlcube\n Args:\n mlcube_id (int): mlcube ID to retrieve datasets from\n Returns:\n dict: dictionary with the contents of each dataset\n \"\"\"\ndatasets = self.__get_list(f\"{self.server_url}/mlcubes/{mlcube_id}/datasets/\")\nreturn datasets\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_result","title":"get_result(result_uid)
","text":"Retrieves a specific result data
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result_uid | int | Result UID | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Result metadata |
Source code in cli/medperf/comms/rest.py
def get_result(self, result_uid: int) -> dict:\n\"\"\"Retrieves a specific result data\n Args:\n result_uid (int): Result UID\n Returns:\n dict: Result metadata\n \"\"\"\nres = self.__auth_get(f\"{self.server_url}/results/{result_uid}/\")\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(\nf\"Could not retrieve the specified result: {details}\"\n)\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_results","title":"get_results()
","text":"Retrieves all results
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: List of results |
Source code in cli/medperf/comms/rest.py
def get_results(self) -> List[dict]:\n\"\"\"Retrieves all results\n Returns:\n List[dict]: List of results\n \"\"\"\n# __get_list returns the collected items, not a response object\nresults = self.__get_list(f\"{self.server_url}/results\")\nreturn results\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_user","title":"get_user(user_id)
","text":"Retrieves the specified user. This will only return if the current user has permission to view the requested user, either by being himself, an admin or an owner of a data preparation mlcube used by the requested user
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| user_id | int | User UID | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Requested user information |
Source code in cli/medperf/comms/rest.py
def get_user(self, user_id: int) -> dict:\n\"\"\"Retrieves the specified user. This will only return if\n the current user has permission to view the requested user,\n either by being himself, an admin or an owner of a data preparation\n mlcube used by the requested user\n Args:\n user_id (int): User UID\n Returns:\n dict: Requested user information\n \"\"\"\nurl = f\"{self.server_url}/users/{user_id}/\"\nres = self.__auth_get(url)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not retrieve user: {details}\")\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_user_benchmarks","title":"get_user_benchmarks()
","text":"Retrieves all benchmarks created by the user
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: Benchmarks data |
Source code in cli/medperf/comms/rest.py
def get_user_benchmarks(self) -> List[dict]:\n\"\"\"Retrieves all benchmarks created by the user\n Returns:\n List[dict]: Benchmarks data\n \"\"\"\nbmks = self.__get_list(f\"{self.server_url}/me/benchmarks/\")\nreturn bmks\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_user_cubes","title":"get_user_cubes()
","text":"Retrieves metadata from all cubes registered by the user
Returns:
| Type | Description |
|---|---|
| List[dict] | List[dict]: List of dictionaries containing the mlcubes registration information |
Source code in cli/medperf/comms/rest.py
def get_user_cubes(self) -> List[dict]:\n\"\"\"Retrieves metadata from all cubes registered by the user\n Returns:\n List[dict]: List of dictionaries containing the mlcubes registration information\n \"\"\"\ncubes = self.__get_list(f\"{self.server_url}/me/mlcubes/\")\nreturn cubes\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_user_datasets","title":"get_user_datasets()
","text":"Retrieves all datasets registered by the user
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | dictionary with the contents of each dataset registration query |
Source code in cli/medperf/comms/rest.py
def get_user_datasets(self) -> dict:\n\"\"\"Retrieves all datasets registered by the user\n Returns:\n dict: dictionary with the contents of each dataset registration query\n \"\"\"\ndsets = self.__get_list(f\"{self.server_url}/me/datasets/\")\nreturn dsets\n
"},{"location":"reference/comms/rest/#comms.rest.REST.get_user_results","title":"get_user_results()
","text":"Retrieves all results registered by the user
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | dictionary with the contents of each result registration query |
Source code in cli/medperf/comms/rest.py
def get_user_results(self) -> dict:\n\"\"\"Retrieves all results registered by the user\n Returns:\n dict: dictionary with the contents of each result registration query\n \"\"\"\nresults = self.__get_list(f\"{self.server_url}/me/results/\")\nreturn results\n
"},{"location":"reference/comms/rest/#comms.rest.REST.parse_url","title":"parse_url(url)
classmethod
","text":"Parse the source URL so that it can be used by the comms implementation. It should handle protocols and versioning to be able to communicate with the API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| url | str | base URL | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | parsed URL with protocol and version |
Source code in cli/medperf/comms/rest.py
@classmethod\ndef parse_url(cls, url: str) -> str:\n\"\"\"Parse the source URL so that it can be used by the comms implementation.\n It should handle protocols and versioning to be able to communicate with the API.\n Args:\n url (str): base URL\n Returns:\n str: parsed URL with protocol and version\n \"\"\"\nurl_sections = url.split(\"://\")\napi_path = f\"/api/v{config.major_version}\"\n# Remove protocol if passed\nif len(url_sections) > 1:\nurl = \"\".join(url_sections[1:])\nreturn f\"https://{url}{api_path}\"\n
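The URL handling in `parse_url` can be tried in isolation. The sketch below mirrors the logic shown, with one assumption: the API version is passed in as a hypothetical `major_version` parameter instead of being read from `config.major_version`. Note that any protocol in the input is dropped and `https` is always used in the result.

```python
def parse_url(url: str, major_version: int = 0) -> str:
    '''Standalone mirror of the logic, with the version passed in explicitly.'''
    api_path = f'/api/v{major_version}'
    sections = url.split('://')
    # Drop the protocol, if any was passed.
    if len(sections) > 1:
        url = ''.join(sections[1:])
    return f'https://{url}{api_path}'


print(parse_url('http://example.org'))            # https://example.org/api/v0
print(parse_url('example.org', major_version=2))  # https://example.org/api/v2
```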
"},{"location":"reference/comms/rest/#comms.rest.REST.set_dataset_association_approval","title":"set_dataset_association_approval(benchmark_uid, dataset_uid, status)
","text":"Approves a dataset association
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_uid | int | Dataset UID | required |
| benchmark_uid | int | Benchmark UID | required |
| status | str | Approval status to set for the association | required |
Source code in cli/medperf/comms/rest.py
def set_dataset_association_approval(\nself, benchmark_uid: int, dataset_uid: int, status: str\n):\n\"\"\"Approves a dataset association\n Args:\n dataset_uid (int): Dataset UID\n benchmark_uid (int): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\nurl = f\"{self.server_url}/datasets/{dataset_uid}/benchmarks/{benchmark_uid}/\"\nres = self.__set_approval_status(url, status)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not approve association between dataset {dataset_uid} and benchmark {benchmark_uid}: {details}\"\n)\n
"},{"location":"reference/comms/rest/#comms.rest.REST.set_mlcube_association_approval","title":"set_mlcube_association_approval(benchmark_uid, mlcube_uid, status)
","text":"Approves an mlcube association
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mlcube_uid | int | MLCube UID | required |
| benchmark_uid | int | Benchmark UID | required |
| status | str | Approval status to set for the association | required |
Source code in cli/medperf/comms/rest.py
def set_mlcube_association_approval(\nself, benchmark_uid: int, mlcube_uid: int, status: str\n):\n\"\"\"Approves an mlcube association\n Args:\n mlcube_uid (int): MLCube UID\n benchmark_uid (int): Benchmark UID\n status (str): Approval status to set for the association\n \"\"\"\nurl = f\"{self.server_url}/mlcubes/{mlcube_uid}/benchmarks/{benchmark_uid}/\"\nres = self.__set_approval_status(url, status)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not approve association between mlcube {mlcube_uid} and benchmark {benchmark_uid}: {details}\"\n)\n
"},{"location":"reference/comms/rest/#comms.rest.REST.set_mlcube_association_priority","title":"set_mlcube_association_priority(benchmark_uid, mlcube_uid, priority)
","text":"Sets the priority of an mlcube-benchmark association
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mlcube_uid | int | MLCube UID | required |
| benchmark_uid | int | Benchmark UID | required |
| priority | int | priority value to set for the association | required |
Source code in cli/medperf/comms/rest.py
def set_mlcube_association_priority(\nself, benchmark_uid: int, mlcube_uid: int, priority: int\n):\n\"\"\"Sets the priority of an mlcube-benchmark association\n Args:\n mlcube_uid (int): MLCube UID\n benchmark_uid (int): Benchmark UID\n priority (int): priority value to set for the association\n \"\"\"\nurl = f\"{self.server_url}/mlcubes/{mlcube_uid}/benchmarks/{benchmark_uid}/\"\ndata = {\"priority\": priority}\nres = self.__auth_put(url, json=data)\nif res.status_code != 200:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(\nf\"Could not set the priority of mlcube {mlcube_uid} within the benchmark {benchmark_uid}: {details}\"\n)\n
"},{"location":"reference/comms/rest/#comms.rest.REST.upload_benchmark","title":"upload_benchmark(benchmark_dict)
","text":"Uploads a new benchmark to the server.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| benchmark_dict | dict | benchmark_data to be uploaded | required |
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | UID of newly created benchmark |
Source code in cli/medperf/comms/rest.py
def upload_benchmark(self, benchmark_dict: dict) -> int:\n\"\"\"Uploads a new benchmark to the server.\n Args:\n benchmark_dict (dict): benchmark_data to be uploaded\n Returns:\n int: UID of newly created benchmark\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/benchmarks/\", json=benchmark_dict)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(f\"Could not upload benchmark: {details}\")\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.upload_dataset","title":"upload_dataset(reg_dict)
","text":"Uploads registration data to the server, under the sha name of the file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reg_dict | dict | Dictionary containing registration information. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | id of the created dataset registration. |
Source code in cli/medperf/comms/rest.py
def upload_dataset(self, reg_dict: dict) -> int:\n\"\"\"Uploads registration data to the server, under the sha name of the file.\n Args:\n reg_dict (dict): Dictionary containing registration information.\n Returns:\n int: id of the created dataset registration.\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/datasets/\", json=reg_dict)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not upload the dataset: {details}\")\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.upload_mlcube","title":"upload_mlcube(mlcube_body)
","text":"Uploads an MLCube instance to the platform
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mlcube_body | dict | Dictionary containing all the relevant data for creating mlcubes | required |
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | id of the created mlcube instance on the platform |
Source code in cli/medperf/comms/rest.py
def upload_mlcube(self, mlcube_body: dict) -> int:\n\"\"\"Uploads an MLCube instance to the platform\n Args:\n mlcube_body (dict): Dictionary containing all the relevant data for creating mlcubes\n Returns:\n int: id of the created mlcube instance on the platform\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/mlcubes/\", json=mlcube_body)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRetrievalError(f\"Could not upload the mlcube: {details}\")\nreturn res.json()\n
"},{"location":"reference/comms/rest/#comms.rest.REST.upload_result","title":"upload_result(results_dict)
","text":"Uploads result to the server.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results_dict | dict | Dictionary containing results information. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | id of the generated results entry |
Source code in cli/medperf/comms/rest.py
def upload_result(self, results_dict: dict) -> int:\n\"\"\"Uploads result to the server.\n Args:\n results_dict (dict): Dictionary containing results information.\n Returns:\n int: id of the generated results entry\n \"\"\"\nres = self.__auth_post(f\"{self.server_url}/results/\", json=results_dict)\nif res.status_code != 201:\nlog_response_error(res)\ndetails = format_errors_dict(res.json())\nraise CommunicationRequestError(f\"Could not upload the results: {details}\")\nreturn res.json()\n
"},{"location":"reference/comms/auth/auth0/","title":"Auth0","text":""},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0","title":"Auth0
","text":" Bases: Auth
cli/medperf/comms/auth/auth0.py
class Auth0(Auth):\ndef __init__(self):\nself.domain = config.auth_domain\nself.client_id = config.auth_client_id\nself.audience = config.auth_audience\nself._lock = threading.Lock()\ndef login(self, email):\n\"\"\"Retrieves and stores an access token/refresh token pair from the auth0\n backend using the device authorization flow.\n Args:\n email (str): user email. This will be used to validate that the received\n id_token contains the same email address.\n \"\"\"\ndevice_code_response = self.__request_device_code()\ndevice_code = device_code_response[\"device_code\"]\nuser_code = device_code_response[\"user_code\"]\nverification_uri_complete = device_code_response[\"verification_uri_complete\"]\ninterval = device_code_response[\"interval\"]\nconfig.ui.print(\n\"\\nPlease go to the following link to complete your login request:\\n\"\nf\"\\t{verification_uri_complete}\\n\\n\"\n\"Make sure that you will be presented with the following code:\\n\"\nf\"\\t{user_code}\\n\\n\"\n)\nconfig.ui.print_warning(\n\"Keep this terminal open until you complete your login request. \"\n\"The command will exit on its own once you complete the request. 
\"\n\"If you wish to stop the login request anyway, press Ctrl+C.\"\n)\ntoken_response, token_issued_at = self.__get_device_access_token(\ndevice_code, interval\n)\naccess_token = token_response[\"access_token\"]\nid_token = token_response[\"id_token\"]\nrefresh_token = token_response[\"refresh_token\"]\ntoken_expires_in = token_response[\"expires_in\"]\nid_token_payload = verify_token(id_token)\nself.__check_token_email(id_token_payload, email)\nset_credentials(\naccess_token,\nrefresh_token,\nid_token_payload,\ntoken_issued_at,\ntoken_expires_in,\nlogin_event=True,\n)\ndef __request_device_code(self):\n\"\"\"Get a device code from the auth0 backend to be used for the authorization process\"\"\"\nurl = f\"https://{self.domain}/oauth/device/code\"\nheaders = {\"content-type\": \"application/x-www-form-urlencoded\"}\nbody = {\n\"client_id\": self.client_id,\n\"audience\": self.audience,\n\"scope\": \"offline_access openid email\",\n}\nres = requests.post(url=url, headers=headers, data=body)\nif res.status_code != 200:\nself.__raise_errors(res, \"Login\")\nreturn res.json()\ndef __get_device_access_token(self, device_code, polling_interval):\n\"\"\"Get the access token from the auth0 backend associated with\n the device code requested before. 
This function will keep polling\n the access token until the user completes the browser flow part\n of the authorization process.\n Args:\n device_code (str): A temporary device code requested by `__request_device_code`\n polling_interval (float): number of seconds to wait between each two polling requests\n Returns:\n json_res (dict): the response of the successful request, containing the access/refresh tokens pair\n token_issued_at (float): the timestamp when the access token was issued\n \"\"\"\nurl = f\"https://{self.domain}/oauth/token\"\nheaders = {\"content-type\": \"application/x-www-form-urlencoded\"}\nbody = {\n\"grant_type\": \"urn:ietf:params:oauth:grant-type:device_code\",\n\"device_code\": device_code,\n\"client_id\": self.client_id,\n}\nwhile True:\ntime.sleep(polling_interval)\ntoken_issued_at = time.time()\nres = requests.post(url=url, headers=headers, data=body)\nif res.status_code == 200:\njson_res = res.json()\nreturn json_res, token_issued_at\ntry:\njson_res = res.json()\nexcept requests.exceptions.JSONDecodeError:\njson_res = {}\nerror = json_res.get(\"error\", None)\nif error not in [\"slow_down\", \"authorization_pending\"]:\nself.__raise_errors(res, \"Login\")\ndef __check_token_email(self, id_token_payload, email):\n\"\"\"Checks if the email provided by the user in the terminal matches the\n email found in the received id token.\"\"\"\nemail_in_token = id_token_payload[\"email\"]\nif email.lower() != email_in_token:\nraise CommunicationError(\n\"The email provided in the terminal does not match the one provided during login\"\n)\ndef logout(self):\n\"\"\"Logs out the user by revoking their refresh token and deleting the\n stored tokens.\"\"\"\ncreds = read_credentials()\nrefresh_token = creds[\"refresh_token\"]\nurl = f\"https://{self.domain}/oauth/revoke\"\nheaders = {\"content-type\": \"application/json\"}\nbody = {\n\"client_id\": self.client_id,\n\"token\": refresh_token,\n}\nres = requests.post(url=url, headers=headers, json=body)\nif 
res.status_code != 200:\nself.__raise_errors(res, \"Logout\")\ndelete_credentials()\n@property\ndef access_token(self):\n\"\"\"Thread and process-safe access token retrieval\"\"\"\n# In case of multiple threads are using the same connection object,\n# keep the thread lock, otherwise the database will throw\n# errors of starting a transaction within a transaction.\n# In case of each thread is using a different connection object,\n# keep the thread lock to avoid the OperationalError when\n# multiple threads want to access the database.\nwith self._lock:\n# TODO: This is temporary. Use a cleaner solution.\ndb = sqlite3.connect(config.tokens_db, isolation_level=None, timeout=60)\ntry:\ndb.execute(\"BEGIN EXCLUSIVE TRANSACTION\")\nexcept sqlite3.OperationalError:\nmsg = \"Another process is using the database. Try again later\"\nraise CommunicationError(msg)\ntoken = self._access_token\n# Sqlite will automatically execute COMMIT and close the connection\n# if an exception is raised during the retrieval of the access token.\ndb.execute(\"COMMIT\")\ndb.close()\nreturn token\n@property\ndef _access_token(self):\n\"\"\"Reads and returns an access token of the currently logged\n in user to be used for authorizing requests to the MedPerf server.\n Refresh the token if necessary.\n Returns:\n access_token (str): the access token\n \"\"\"\ncreds = read_credentials()\naccess_token = creds[\"access_token\"]\nrefresh_token = creds[\"refresh_token\"]\ntoken_expires_in = creds[\"token_expires_in\"]\ntoken_issued_at = creds[\"token_issued_at\"]\nlogged_in_at = creds[\"logged_in_at\"]\n# token_issued_at and expires_in are for the access token\nsliding_expiration_time = (\ntoken_issued_at + token_expires_in - config.token_expiration_leeway\n)\nabsolute_expiration_time = (\nlogged_in_at\n+ config.token_absolute_expiry\n- config.refresh_token_expiration_leeway\n)\ncurrent_time = time.time()\nif current_time < sliding_expiration_time:\n# Access token not expired. 
No need to refresh.\nreturn access_token\n# So we need to refresh.\nif current_time > absolute_expiration_time:\n# Expired refresh token. Force logout and ask the user to re-authenticate\nlogging.debug(\nf\"Refresh token expired: {absolute_expiration_time=} <> {current_time=}\"\n)\nself.logout()\nraise AuthenticationError(\"Token expired. Please re-authenticate\")\n# Expired access token and not expired refresh token. Refresh.\naccess_token = self.__refresh_access_token(refresh_token)\nreturn access_token\ndef __refresh_access_token(self, refresh_token):\n\"\"\"Retrieve and store a new access token using a refresh token.\n A new refresh token will also be retrieved and stored.\n Args:\n refresh_token (str): the refresh token\n Returns:\n access_token (str): the new access token\n \"\"\"\nurl = f\"https://{self.domain}/oauth/token\"\nheaders = {\"content-type\": \"application/x-www-form-urlencoded\"}\nbody = {\n\"grant_type\": \"refresh_token\",\n\"client_id\": self.client_id,\n\"refresh_token\": refresh_token,\n}\ntoken_issued_at = time.time()\nlogging.debug(\"Refreshing access token.\")\nres = requests.post(url=url, headers=headers, data=body)\nif res.status_code != 200:\nself.__raise_errors(res, \"Token refresh\")\njson_res = res.json()\naccess_token = json_res[\"access_token\"]\nid_token = json_res[\"id_token\"]\nrefresh_token = json_res[\"refresh_token\"]\ntoken_expires_in = json_res[\"expires_in\"]\nid_token_payload = verify_token(id_token)\nset_credentials(\naccess_token,\nrefresh_token,\nid_token_payload,\ntoken_issued_at,\ntoken_expires_in,\n)\nreturn access_token\ndef __raise_errors(self, res, action):\n\"\"\"log the failed request's response and raise errors.\n Args:\n res (requests.Response): the response of a failed request\n action (str): a string for more informative error display\n to the user.\n \"\"\"\nlog_response_error(res)\nif res.status_code == 429:\nraise CommunicationError(\"Too many requests. 
Try again later.\")\ntry:\njson_res = res.json()\nexcept requests.exceptions.JSONDecodeError:\njson_res = {}\ndescription = json_res.get(\"error_description\", \"\")\nmsg = f\"{action} failed.\"\nif description:\nmsg += f\" {description}\"\nraise CommunicationError(msg)\n
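The `access_token` property in the class above guards token retrieval with an exclusive SQLite transaction, so a second process cannot read or refresh tokens at the same time. The sketch below shows that locking behaviour on a throwaway database file. The helper names `try_exclusive` and `demo` are hypothetical, and a timeout of 0 is used so the second attempt fails immediately rather than waiting as the 60 second timeout in the class would.

```python
import os
import sqlite3
import tempfile


def try_exclusive(path: str, timeout: float = 0.0):
    '''Open path and try to take an exclusive lock; return the connection or None.'''
    db = sqlite3.connect(path, isolation_level=None, timeout=timeout)
    try:
        db.execute('BEGIN EXCLUSIVE TRANSACTION')
    except sqlite3.OperationalError:
        # Another connection holds the lock.
        db.close()
        return None
    return db


def demo() -> tuple:
    '''Return whether the first, second and third lock attempts succeeded.'''
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'tokens.db')
        first = try_exclusive(path)
        second = try_exclusive(path)  # fails while first holds the lock
        first.execute('COMMIT')
        first.close()
        third = try_exclusive(path)   # succeeds once the lock is released
        got_third = third is not None
        if third is not None:
            third.execute('COMMIT')
            third.close()
        return first is not None, second is not None, got_third
```

Calling `demo()` is expected to report that only the second attempt was refused.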
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.access_token","title":"access_token
property
","text":"Thread and process-safe access token retrieval
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.__check_token_email","title":"__check_token_email(id_token_payload, email)
","text":"Checks if the email provided by the user in the terminal matches the email found in the recieved id token.
Source code in cli/medperf/comms/auth/auth0.py
def __check_token_email(self, id_token_payload, email):\n\"\"\"Checks if the email provided by the user in the terminal matches the\n email found in the received id token.\"\"\"\nemail_in_token = id_token_payload[\"email\"]\nif email.lower() != email_in_token:\nraise CommunicationError(\n\"The email provided in the terminal does not match the one provided during login\"\n)\n
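The check above lowercases only the address typed in the terminal, which relies on the email in the token already being lowercase. As an illustration only, a variant that normalizes both sides could look like the following. `emails_match` is a hypothetical helper and is not part of the client.

```python
def emails_match(entered: str, email_in_token: str) -> bool:
    '''Compare two addresses ignoring case and surrounding whitespace.'''
    return entered.strip().lower() == email_in_token.strip().lower()
```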
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.__get_device_access_token","title":"__get_device_access_token(device_code, polling_interval)
","text":"Get the access token from the auth0 backend associated with the device code requested before. This function will keep polling the access token until the user completes the browser flow part of the authorization process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device_code | str | A temporary device code requested by __request_device_code | required |
| polling_interval | float | number of seconds to wait between each two polling requests | required |
Returns:
| Name | Type | Description |
|---|---|---|
| json_res | dict | the response of the successful request, containing the access/refresh tokens pair |
| token_issued_at | float | the timestamp when the access token was issued |
Source code in cli/medperf/comms/auth/auth0.py
def __get_device_access_token(self, device_code, polling_interval):\n\"\"\"Get the access token from the auth0 backend associated with\n the device code requested before. This function will keep polling\n the access token until the user completes the browser flow part\n of the authorization process.\n Args:\n device_code (str): A temporary device code requested by `__request_device_code`\n polling_interval (float): number of seconds to wait between each two polling requests\n Returns:\n json_res (dict): the response of the successful request, containing the access/refresh tokens pair\n token_issued_at (float): the timestamp when the access token was issued\n \"\"\"\nurl = f\"https://{self.domain}/oauth/token\"\nheaders = {\"content-type\": \"application/x-www-form-urlencoded\"}\nbody = {\n\"grant_type\": \"urn:ietf:params:oauth:grant-type:device_code\",\n\"device_code\": device_code,\n\"client_id\": self.client_id,\n}\nwhile True:\ntime.sleep(polling_interval)\ntoken_issued_at = time.time()\nres = requests.post(url=url, headers=headers, data=body)\nif res.status_code == 200:\njson_res = res.json()\nreturn json_res, token_issued_at\ntry:\njson_res = res.json()\nexcept requests.exceptions.JSONDecodeError:\njson_res = {}\nerror = json_res.get(\"error\", None)\nif error not in [\"slow_down\", \"authorization_pending\"]:\nself.__raise_errors(res, \"Login\")\n
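The polling loop can be exercised without a network by injecting the request and the sleep function. The sketch below is a simplified model under stated assumptions: `poll_for_token`, `request_token` and `sleep` are hypothetical names, the replies are simulated, the status code 400 is an arbitrary non-200 value, and `RuntimeError` stands in for the client's `CommunicationError`.

```python
import time
from typing import Callable, Tuple

# Errors that mean: keep waiting for the user to finish in the browser.
PENDING_ERRORS = ('slow_down', 'authorization_pending')


def poll_for_token(request_token: Callable[[], Tuple[int, dict]],
                   polling_interval: float,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    '''Call request_token until it succeeds or returns a fatal error.'''
    while True:
        sleep(polling_interval)
        status_code, body = request_token()
        if status_code == 200:
            return body
        if body.get('error') not in PENDING_ERRORS:
            raise RuntimeError(f'Login failed: {body}')


# Simulated server: not ready twice, then a token.
replies = iter([
    (400, {'error': 'authorization_pending'}),
    (400, {'error': 'slow_down'}),
    (200, {'access_token': 'abc'}),
])
result = poll_for_token(lambda: next(replies), 5, sleep=lambda seconds: None)
print(result)  # {'access_token': 'abc'}
```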
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.__raise_errors","title":"__raise_errors(res, action)
","text":"Log the failed request's response and raise errors.
Parameters:
Name Type Description Defaultres
requests.Response
the response of a failed request
requiredaction
str
a string for more informative error display
required Source code incli/medperf/comms/auth/auth0.py
def __raise_errors(self, res, action):\n\"\"\"log the failed request's response and raise errors.\n Args:\n res (requests.Response): the response of a failed request\n action (str): a string for more informative error display\n to the user.\n \"\"\"\nlog_response_error(res)\nif res.status_code == 429:\nraise CommunicationError(\"Too many requests. Try again later.\")\ntry:\njson_res = res.json()\nexcept requests.exceptions.JSONDecodeError:\njson_res = {}\ndescription = json_res.get(\"error_description\", \"\")\nmsg = f\"{action} failed.\"\nif description:\nmsg += f\" {description}\"\nraise CommunicationError(msg)\n
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.__refresh_access_token","title":"__refresh_access_token(refresh_token)
","text":"Retrieve and store a new access token using a refresh token. A new refresh token will also be retrieved and stored.
Parameters:
Name Type Description Defaultrefresh_token
str
the refresh token
requiredReturns:
Name Type Descriptionaccess_token
str
the new access token
Source code incli/medperf/comms/auth/auth0.py
def __refresh_access_token(self, refresh_token):\n\"\"\"Retrieve and store a new access token using a refresh token.\n A new refresh token will also be retrieved and stored.\n Args:\n refresh_token (str): the refresh token\n Returns:\n access_token (str): the new access token\n \"\"\"\nurl = f\"https://{self.domain}/oauth/token\"\nheaders = {\"content-type\": \"application/x-www-form-urlencoded\"}\nbody = {\n\"grant_type\": \"refresh_token\",\n\"client_id\": self.client_id,\n\"refresh_token\": refresh_token,\n}\ntoken_issued_at = time.time()\nlogging.debug(\"Refreshing access token.\")\nres = requests.post(url=url, headers=headers, data=body)\nif res.status_code != 200:\nself.__raise_errors(res, \"Token refresh\")\njson_res = res.json()\naccess_token = json_res[\"access_token\"]\nid_token = json_res[\"id_token\"]\nrefresh_token = json_res[\"refresh_token\"]\ntoken_expires_in = json_res[\"expires_in\"]\nid_token_payload = verify_token(id_token)\nset_credentials(\naccess_token,\nrefresh_token,\nid_token_payload,\ntoken_issued_at,\ntoken_expires_in,\n)\nreturn access_token\n
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.__request_device_code","title":"__request_device_code()
","text":"Get a device code from the auth0 backend to be used for the authorization process
Source code incli/medperf/comms/auth/auth0.py
def __request_device_code(self):\n\"\"\"Get a device code from the auth0 backend to be used for the authorization process\"\"\"\nurl = f\"https://{self.domain}/oauth/device/code\"\nheaders = {\"content-type\": \"application/x-www-form-urlencoded\"}\nbody = {\n\"client_id\": self.client_id,\n\"audience\": self.audience,\n\"scope\": \"offline_access openid email\",\n}\nres = requests.post(url=url, headers=headers, data=body)\nif res.status_code != 200:\nself.__raise_errors(res, \"Login\")\nreturn res.json()\n
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.login","title":"login(email)
","text":"Retrieves and stores an access token/refresh token pair from the auth0 backend using the device authorization flow.
Parameters:
Name Type Description Defaultemail
str
user email. This will be used to validate that the received id_token contains the same email address.
required Source code incli/medperf/comms/auth/auth0.py
def login(self, email):\n\"\"\"Retrieves and stores an access token/refresh token pair from the auth0\n backend using the device authorization flow.\n Args:\n email (str): user email. This will be used to validate that the received\n id_token contains the same email address.\n \"\"\"\ndevice_code_response = self.__request_device_code()\ndevice_code = device_code_response[\"device_code\"]\nuser_code = device_code_response[\"user_code\"]\nverification_uri_complete = device_code_response[\"verification_uri_complete\"]\ninterval = device_code_response[\"interval\"]\nconfig.ui.print(\n\"\\nPlease go to the following link to complete your login request:\\n\"\nf\"\\t{verification_uri_complete}\\n\\n\"\n\"Make sure that you will be presented with the following code:\\n\"\nf\"\\t{user_code}\\n\\n\"\n)\nconfig.ui.print_warning(\n\"Keep this terminal open until you complete your login request. \"\n\"The command will exit on its own once you complete the request. \"\n\"If you wish to stop the login request anyway, press Ctrl+C.\"\n)\ntoken_response, token_issued_at = self.__get_device_access_token(\ndevice_code, interval\n)\naccess_token = token_response[\"access_token\"]\nid_token = token_response[\"id_token\"]\nrefresh_token = token_response[\"refresh_token\"]\ntoken_expires_in = token_response[\"expires_in\"]\nid_token_payload = verify_token(id_token)\nself.__check_token_email(id_token_payload, email)\nset_credentials(\naccess_token,\nrefresh_token,\nid_token_payload,\ntoken_issued_at,\ntoken_expires_in,\nlogin_event=True,\n)\n
"},{"location":"reference/comms/auth/auth0/#comms.auth.auth0.Auth0.logout","title":"logout()
","text":"Logs out the user by revoking their refresh token and deleting the stored tokens.
Source code incli/medperf/comms/auth/auth0.py
def logout(self):\n\"\"\"Logs out the user by revoking their refresh token and deleting the\n stored tokens.\"\"\"\ncreds = read_credentials()\nrefresh_token = creds[\"refresh_token\"]\nurl = f\"https://{self.domain}/oauth/revoke\"\nheaders = {\"content-type\": \"application/json\"}\nbody = {\n\"client_id\": self.client_id,\n\"token\": refresh_token,\n}\nres = requests.post(url=url, headers=headers, json=body)\nif res.status_code != 200:\nself.__raise_errors(res, \"Logout\")\ndelete_credentials()\n
"},{"location":"reference/comms/auth/interface/","title":"Interface","text":""},{"location":"reference/comms/auth/interface/#comms.auth.interface.Auth","title":"Auth
","text":" Bases: ABC
cli/medperf/comms/auth/interface.py
class Auth(ABC):\n@abstractmethod\ndef __init__(self):\n\"\"\"Initialize the class\"\"\"\n@abstractmethod\ndef login(self, email):\n\"\"\"Log in a user\"\"\"\n@abstractmethod\ndef logout(self):\n\"\"\"Log out a user\"\"\"\n@property\n@abstractmethod\ndef access_token(self):\n\"\"\"An access token to authorize requests to the MedPerf server\"\"\"\n
"},{"location":"reference/comms/auth/interface/#comms.auth.interface.Auth.access_token","title":"access_token
abstractmethod
property
","text":"An access token to authorize requests to the MedPerf server
"},{"location":"reference/comms/auth/interface/#comms.auth.interface.Auth.__init__","title":"__init__()
abstractmethod
","text":"Initialize the class
Source code incli/medperf/comms/auth/interface.py
@abstractmethod\ndef __init__(self):\n\"\"\"Initialize the class\"\"\"\n
"},{"location":"reference/comms/auth/interface/#comms.auth.interface.Auth.login","title":"login(email)
abstractmethod
","text":"Log in a user
Source code incli/medperf/comms/auth/interface.py
@abstractmethod\ndef login(self, email):\n\"\"\"Log in a user\"\"\"\n
"},{"location":"reference/comms/auth/interface/#comms.auth.interface.Auth.logout","title":"logout()
abstractmethod
","text":"Log out a user
Source code incli/medperf/comms/auth/interface.py
@abstractmethod\ndef logout(self):\n\"\"\"Log out a user\"\"\"\n
"},{"location":"reference/comms/auth/local/","title":"Local","text":""},{"location":"reference/comms/auth/local/#comms.auth.local.Local","title":"Local
","text":" Bases: Auth
cli/medperf/comms/auth/local.py
class Local(Auth):\ndef __init__(self):\nwith open(config.local_tokens_path) as f:\nself.tokens = json.load(f)\ndef login(self, email):\n\"\"\"Retrieves and stores an access token from a local store json file.\n Args:\n email (str): user email.\n \"\"\"\ntry:\naccess_token = self.tokens[email]\nexcept KeyError:\nraise InvalidArgumentError(\n\"The provided email does not exist for testing. \"\n\"Make sure you activated the right profile.\"\n)\nrefresh_token = \"refresh token\"\nid_token_payload = {\"email\": email}\ntoken_issued_at = 0\ntoken_expires_in = 10**10\nset_credentials(\naccess_token,\nrefresh_token,\nid_token_payload,\ntoken_issued_at,\ntoken_expires_in,\nlogin_event=True,\n)\ndef logout(self):\n\"\"\"Logs out the user by deleting the stored tokens.\"\"\"\ndelete_credentials()\n@property\ndef access_token(self):\n\"\"\"Reads and returns an access token of the currently logged\n in user to be used for authorizing requests to the MedPerf server.\n Returns:\n access_token (str): the access token\n \"\"\"\ncreds = read_credentials()\naccess_token = creds[\"access_token\"]\nreturn access_token\n
"},{"location":"reference/comms/auth/local/#comms.auth.local.Local.access_token","title":"access_token
property
","text":"Reads and returns an access token of the currently logged in user to be used for authorizing requests to the MedPerf server.
Returns:
Name Type Descriptionaccess_token
str
the access token
"},{"location":"reference/comms/auth/local/#comms.auth.local.Local.login","title":"login(email)
","text":"Retrieves and stores an access token from a local store json file.
Parameters:
Name Type Description Defaultemail
str
user email.
required Source code incli/medperf/comms/auth/local.py
def login(self, email):\n\"\"\"Retrieves and stores an access token from a local store json file.\n Args:\n email (str): user email.\n \"\"\"\ntry:\naccess_token = self.tokens[email]\nexcept KeyError:\nraise InvalidArgumentError(\n\"The provided email does not exist for testing. \"\n\"Make sure you activated the right profile.\"\n)\nrefresh_token = \"refresh token\"\nid_token_payload = {\"email\": email}\ntoken_issued_at = 0\ntoken_expires_in = 10**10\nset_credentials(\naccess_token,\nrefresh_token,\nid_token_payload,\ntoken_issued_at,\ntoken_expires_in,\nlogin_event=True,\n)\n
"},{"location":"reference/comms/auth/local/#comms.auth.local.Local.logout","title":"logout()
","text":"Logs out the user by deleting the stored tokens.
Source code incli/medperf/comms/auth/local.py
def logout(self):\n\"\"\"Logs out the user by deleting the stored tokens.\"\"\"\ndelete_credentials()\n
"},{"location":"reference/comms/auth/token_verifier/","title":"Token verifier","text":"This module defines a wrapper around the existing token verifier in the auth0-python library. The library is designed to cache public keys in memory. Since our client is ephemeral, we wrapped the library's JwksFetcher
to cache keys in filesystem storage, and wrapped the library's signature verifier to use this new JwksFetcher.
This module downloads files from the internet. It provides a set of functions to download common files that are necessary for workflow executions and are not on the MedPerf server. An example of such files is model weights of a Model MLCube.
This module takes care of validating the integrity of the downloaded file if a hash was specified when requesting the file. It also returns the hash of the downloaded file, which can be the original specified hash or the calculated hash of the freshly downloaded file if no hash was specified.
Additionally, to avoid unnecessary downloads, an existing file will not be re-downloaded.
"},{"location":"reference/comms/entity_resources/resources/#comms.entity_resources.resources.get_benchmark_demo_dataset","title":"get_benchmark_demo_dataset(url, expected_hash=None)
","text":"Downloads and extracts a demo dataset. If the hash is provided, the file's integrity will be checked upon download.
Parameters:
Name Type Description Defaulturl
str
URL where the compressed demo dataset file can be downloaded.
requiredexpected_hash
str
expected hash of the downloaded file
None
Returns:
Name Type Descriptionoutput_path
str
location where the uncompressed demo dataset is stored locally.
hash_value
str
The hash of the downloaded tarball file
Source code incli/medperf/comms/entity_resources/resources.py
def get_benchmark_demo_dataset(url: str, expected_hash: str = None) -> str:\n\"\"\"Downloads and extracts a demo dataset. If the hash is provided,\n the file's integrity will be checked upon download.\n Args:\n url (str): URL where the compressed demo dataset file can be downloaded.\n expected_hash (str, optional): expected hash of the downloaded file\n Returns:\n output_path (str): location where the uncompressed demo dataset is stored locally.\n hash_value (str): The hash of the downloaded tarball file\n \"\"\"\n# TODO: at some point maybe it is better to download demo datasets in\n# their benchmark folder. Doing this, we should then modify\n# the compatibility test command and remove the option of directly passing\n# demo datasets. This would look cleaner.\n# Possible cons: if multiple benchmarks use the same demo dataset.\ndemo_storage = config.demo_datasets_folder\nif expected_hash:\n# If the folder exists, return\ndemo_dataset_folder = os.path.join(demo_storage, expected_hash)\nif os.path.exists(demo_dataset_folder):\nreturn demo_dataset_folder, expected_hash\n# make sure files are uncompressed while in tmp storage, to avoid any clutter\n# objects if uncompression fails for some reason.\ntmp_output_folder = generate_tmp_path()\noutput_tarball_path = os.path.join(tmp_output_folder, config.tarball_filename)\nhash_value = download_resource(url, output_tarball_path, expected_hash)\nuntar(output_tarball_path)\ndemo_dataset_folder = os.path.join(demo_storage, hash_value)\nif os.path.exists(demo_dataset_folder):\n# handle the possibility of having clutter uncompressed files\nremove_path(demo_dataset_folder)\nos.rename(tmp_output_folder, demo_dataset_folder)\nreturn demo_dataset_folder, hash_value\n
"},{"location":"reference/comms/entity_resources/resources/#comms.entity_resources.resources.get_cube","title":"get_cube(url, cube_path, expected_hash=None)
","text":"Downloads and writes a cube mlcube.yaml file
Source code incli/medperf/comms/entity_resources/resources.py
def get_cube(url: str, cube_path: str, expected_hash: str = None):\n\"\"\"Downloads and writes a cube mlcube.yaml file\"\"\"\noutput_path = os.path.join(cube_path, config.cube_filename)\nreturn _get_regular_file(url, output_path, expected_hash)\n
"},{"location":"reference/comms/entity_resources/resources/#comms.entity_resources.resources.get_cube_additional","title":"get_cube_additional(url, cube_path, expected_tarball_hash=None)
","text":"Retrieves additional files of an MLCube. The additional files will be in a compressed tarball file. The function will additionally extract this file.
Parameters:
Name Type Description Defaulturl
str
URL where the additional_files.tar.gz file can be downloaded.
requiredcube_path
str
Cube location.
requiredexpected_tarball_hash
str
expected hash of tarball file
None
Returns:
Name Type Descriptiontarball_hash
str
The hash of the downloaded tarball file
Source code incli/medperf/comms/entity_resources/resources.py
def get_cube_additional(\nurl: str,\ncube_path: str,\nexpected_tarball_hash: str = None,\n) -> str:\n\"\"\"Retrieves additional files of an MLCube. The additional files\n will be in a compressed tarball file. The function will additionally\n extract this file.\n Args:\n url (str): URL where the additional_files.tar.gz file can be downloaded.\n cube_path (str): Cube location.\n expected_tarball_hash (str, optional): expected hash of tarball file\n Returns:\n tarball_hash (str): The hash of the downloaded tarball file\n \"\"\"\nadditional_files_folder = os.path.join(cube_path, config.additional_path)\nmlcube_cache_file = os.path.join(cube_path, config.mlcube_cache_file)\nif not _should_get_cube_additional(\nadditional_files_folder, expected_tarball_hash, mlcube_cache_file\n):\nreturn expected_tarball_hash\n# Download the additional files. Make sure files are extracted in tmp storage\n# to avoid any clutter objects if uncompression fails for some reason.\ntmp_output_folder = generate_tmp_path()\noutput_tarball_path = os.path.join(tmp_output_folder, config.tarball_filename)\ntarball_hash = download_resource(url, output_tarball_path, expected_tarball_hash)\nuntar(output_tarball_path)\nparent_folder = os.path.dirname(os.path.normpath(additional_files_folder))\nos.makedirs(parent_folder, exist_ok=True)\nif os.path.exists(additional_files_folder):\n# handle the possibility of having clutter uncompressed files\nremove_path(additional_files_folder)\nos.rename(tmp_output_folder, additional_files_folder)\n# Store the downloaded tarball hash to be used later for verifying that the\n# local cache is up to date\nwith open(mlcube_cache_file, \"w\") as f: # assumes parent folder already exists\ncontents = {\"additional_files_cached_hash\": tarball_hash}\nyaml.dump(contents, f)\nreturn tarball_hash\n
"},{"location":"reference/comms/entity_resources/resources/#comms.entity_resources.resources.get_cube_image","title":"get_cube_image(url, cube_path, hash_value=None)
","text":"Retrieves and stores the image file from the server. Stores images on a shared location, and retrieves a cached image by hash if found locally. Creates a symbolic link to the cube storage.
Parameters:
Name Type Description Defaulturl
str
URL where the image file can be downloaded.
requiredcube_path
str
Path to cube.
requiredhash_value
(str, Optional)
File hash to store under shared storage. Defaults to None.
None
Returns:
Name Type Descriptionimage_cube_file
str
Location where the image file is stored locally.
hash_value
str
The hash of the downloaded file
Source code incli/medperf/comms/entity_resources/resources.py
def get_cube_image(url: str, cube_path: str, hash_value: str = None) -> str:\n\"\"\"Retrieves and stores the image file from the server. Stores images\n on a shared location, and retrieves a cached image by hash if found locally.\n Creates a symbolic link to the cube storage.\n Args:\n url (str): URL where the image file can be downloaded.\n cube_path (str): Path to cube.\n hash_value (str, Optional): File hash to store under shared storage. Defaults to None.\n Returns:\n image_cube_file: Location where the image file is stored locally.\n hash_value (str): The hash of the downloaded file\n \"\"\"\nimage_path = config.image_path\nimage_name = get_cube_image_name(cube_path)\nimage_cube_path = os.path.join(cube_path, image_path)\nos.makedirs(image_cube_path, exist_ok=True)\nimage_cube_file = os.path.join(image_cube_path, image_name)\nif os.path.islink(image_cube_file): # could be a broken link\n# Remove existing links\nos.unlink(image_cube_file)\nimgs_storage = config.images_folder\nif not hash_value:\n# No hash provided, we need to download the file first\ntmp_output_path = generate_tmp_path()\nhash_value = download_resource(url, tmp_output_path)\nimg_storage = os.path.join(imgs_storage, hash_value)\nshutil.move(tmp_output_path, img_storage)\nelse:\nimg_storage = os.path.join(imgs_storage, hash_value)\nif not os.path.exists(img_storage):\n# If image doesn't exist locally, download it normally\ndownload_resource(url, img_storage, hash_value)\n# Create a symbolic link to individual cube storage\nos.symlink(img_storage, image_cube_file)\nreturn image_cube_file, hash_value\n
"},{"location":"reference/comms/entity_resources/resources/#comms.entity_resources.resources.get_cube_params","title":"get_cube_params(url, cube_path, expected_hash=None)
","text":"Downloads and writes a cube parameters.yaml file
Source code incli/medperf/comms/entity_resources/resources.py
def get_cube_params(url: str, cube_path: str, expected_hash: str = None):\n\"\"\"Downloads and writes a cube parameters.yaml file\"\"\"\noutput_path = os.path.join(cube_path, config.workspace_path, config.params_filename)\nreturn _get_regular_file(url, output_path, expected_hash)\n
"},{"location":"reference/comms/entity_resources/utils/","title":"Utils","text":""},{"location":"reference/comms/entity_resources/utils/#comms.entity_resources.utils.__parse_resource","title":"__parse_resource(resource)
","text":"Parses a resource string and returns its identifier and the source class it can be downloaded from. The function iterates over all supported sources and checks which one accepts this resource. A resource is a string that should match a certain pattern to be downloaded by a certain source.
If the resource pattern does not correspond to any supported source, the function raises an InvalidArgumentError.
Parameters:
Name Type Description Defaultresource
str
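The resource string. Must be in the form <source_prefix>:<resource_identifier> or a URL. The latter case will be interpreted as a direct download link.
required Source code in cli/medperf/comms/entity_resources/utils.py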
def __parse_resource(resource: str):\n\"\"\"Parses a resource string and returns its identifier and the source class\n it can be downloaded from.\n The function iterates over all supported sources and checks which one accepts\n this resource. A resource is a string that should match a certain pattern to be\n downloaded by a certain source.\n If the resource pattern does not correspond to any supported source, the\n function raises an `InvalidArgumentError`\n Args:\n resource (str): The resource string. Must be in the form <source_prefix>:<resource_identifier>\n or a url. The latter case will be interpreted as a direct download link.\n \"\"\"\nfor source_class in supported_sources:\nresource_identifier = source_class.validate_resource(resource)\nif resource_identifier:\nreturn source_class, resource_identifier\n# In this case the input format is not compatible with any source\nmsg = f\"\"\"Invalid resource input: {resource}. A Resource must be a url or\n in the following format: '<source_prefix>:<resource_identifier>'. Run\n `medperf mlcube submit --help` for more details.\"\"\"\nraise InvalidArgumentError(msg)\n
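As a rough illustration of the resource string formats described above, the following sketch maps a string to a source name and identifier. It is a simplified stand-in, not the real dispatch: `parse_resource` is a hypothetical function, the URL check is a crude regular expression rather than the `validators.url` call used by `DirectLinkSource`, and the order in which sources are tried is an assumption.

```python
import re


def parse_resource(resource):
    """Hypothetical, simplified version of source matching.

    Returns a (source_name, identifier) tuple, or raises ValueError
    when no known pattern matches.
    """
    synapse_prefix = "synapse:"
    direct_prefix = "direct:"
    if resource.startswith(synapse_prefix):
        identifier = resource[len(synapse_prefix):]
        # Synapse IDs have the form syn<Integer>
        if re.match(r"syn\d+$", identifier):
            return "synapse", identifier
    candidate = resource
    if candidate.startswith(direct_prefix):
        candidate = candidate[len(direct_prefix):]
    # Crude URL check, used here only to keep the sketch dependency-free
    if re.match(r"https?://\S+$", candidate):
        return "direct", candidate
    raise ValueError(f"Invalid resource input: {resource}")
```

For example, `parse_resource("synapse:syn123")` yields the Synapse ID, while a bare URL or a `direct:`-prefixed URL is treated as a direct download link.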
"},{"location":"reference/comms/entity_resources/utils/#comms.entity_resources.utils.download_resource","title":"download_resource(resource, output_path, expected_hash=None)
","text":"Downloads a resource/file from the internet. Passing a hash is optional. If a hash is provided, the downloaded file's hash will be checked and an error will be raised if it does not match.
Upon success, the function returns the hash of the downloaded file.
Parameters:
Name Type Description Defaultresource
str
The resource string. Must be in the form <source_prefix>:<resource_identifier> or a URL.
required output_path
str
The path to download the resource to
requiredexpected_hash
(optional, str)
The expected hash of the file to be downloaded
None
Returns:
Type DescriptionThe hash of the downloaded file (or existing file)
Source code incli/medperf/comms/entity_resources/utils.py
def download_resource(\nresource: str, output_path: str, expected_hash: Optional[str] = None\n):\n\"\"\"Downloads a resource/file from the internet. Passing a hash is optional.\n If hash is provided, the downloaded file's hash will be checked and an error\n will be raised if it is incorrect.\n Upon success, the function returns the hash of the downloaded file.\n Args:\n resource (str): The resource string. Must be in the form <source_prefix>:<resource_identifier>\n or a url.\n output_path (str): The path to download the resource to\n expected_hash (optional, str): The expected hash of the file to be downloaded\n Returns:\n The hash of the downloaded file (or existing file)\n \"\"\"\ntmp_output_path = tmp_download_resource(resource)\ncalculated_hash = get_file_hash(tmp_output_path)\nif expected_hash and calculated_hash != expected_hash:\nlogging.debug(f\"{resource}: Expected {expected_hash}, found {calculated_hash}.\")\nraise InvalidEntityError(f\"Hash mismatch: {resource}\")\nto_permanent_path(tmp_output_path, output_path)\nreturn calculated_hash\n
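The integrity check in `download_resource` can be illustrated with a small standalone sketch. The hash algorithm behind `get_file_hash` is not shown in this reference, so SHA-256 is an assumption here; `file_sha256` and `check_hash` are hypothetical names, and `ValueError` stands in for `InvalidEntityError`.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_hash(path, expected_hash=None):
    """Return the file's hash; raise if it differs from expected_hash.

    SHA-256 is an assumption made for this sketch only.
    """
    calculated = file_sha256(path)
    if expected_hash and calculated != expected_hash:
        raise ValueError(
            f"Hash mismatch: expected {expected_hash}, found {calculated}"
        )
    return calculated
```

As in the function above, the calculated hash is returned whether or not an expected hash was supplied, so callers can record it for later cache checks.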
"},{"location":"reference/comms/entity_resources/utils/#comms.entity_resources.utils.tmp_download_resource","title":"tmp_download_resource(resource)
","text":"Downloads a resource to the temporary storage.
Parameters:
Name Type Description Defaultresource
str
The resource string. Must be in the form <source_prefix>:<resource_identifier> or a URL.
required
Returns:
Name Type Descriptiontmp_output_path
str
The location where the resource was downloaded
Source code incli/medperf/comms/entity_resources/utils.py
def tmp_download_resource(resource):\n\"\"\"Downloads a resource to the temporary storage.\n Args:\n resource (str): The resource string. Must be in the form <source_prefix>:<resource_identifier>\n or a url.\n Returns:\n tmp_output_path (str): The location where the resource was downloaded\n \"\"\"\ntmp_output_path = generate_tmp_path()\nsource_class, resource_identifier = __parse_resource(resource)\nsource = source_class()\nsource.authenticate()\nsource.download(resource_identifier, tmp_output_path)\nreturn tmp_output_path\n
"},{"location":"reference/comms/entity_resources/utils/#comms.entity_resources.utils.to_permanent_path","title":"to_permanent_path(tmp_output_path, output_path)
","text":"Writes a file from the temporary storage to the desired output path.
Source code incli/medperf/comms/entity_resources/utils.py
def to_permanent_path(tmp_output_path, output_path):\n\"\"\"Writes a file from the temporary storage to the desired output path.\"\"\"\noutput_folder = os.path.dirname(os.path.abspath(output_path))\nos.makedirs(output_folder, exist_ok=True)\nos.rename(tmp_output_path, output_path)\n
"},{"location":"reference/comms/entity_resources/sources/direct/","title":"Direct","text":""},{"location":"reference/comms/entity_resources/sources/direct/#comms.entity_resources.sources.direct.DirectLinkSource","title":"DirectLinkSource
","text":" Bases: BaseSource
cli/medperf/comms/entity_resources/sources/direct.py
class DirectLinkSource(BaseSource):\nprefix = \"direct:\"\n@classmethod\ndef validate_resource(cls, value: str):\n\"\"\"This class expects a resource string of the form\n `direct:<URL>` or only a URL.\n Args:\n resource (str): the resource string\n Returns:\n (str|None): The URL if the pattern matches, else None\n \"\"\"\nprefix = cls.prefix\nif value.startswith(prefix):\nprefix_len = len(prefix)\nvalue = value[prefix_len:]\nif validators.url(value):\nreturn value\ndef __init__(self):\npass\ndef authenticate(self):\npass\ndef __download_once(self, resource_identifier: str, output_path: str):\n\"\"\"Downloads a direct-download-link file by streaming its contents. source:\n https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests\n \"\"\"\nwith requests.get(resource_identifier, stream=True) as res:\nif res.status_code != 200:\nlog_response_error(res)\nmsg = (\n\"There was a problem retrieving the specified file at \"\n+ resource_identifier\n)\nraise CommunicationRetrievalError(msg)\nwith open(output_path, \"wb\") as f:\nfor chunk in res.iter_content(chunk_size=config.ddl_stream_chunk_size):\n# NOTE: if the response is chunk-encoded, this may not work\n# check whether this is common.\nf.write(chunk)\ndef download(self, resource_identifier: str, output_path: str):\n\"\"\"Downloads a direct-download-link file with multiple attempts. This is\n done due to facing transient network failure from some direct download\n link servers.\"\"\"\nattempt = 0\nwhile attempt < config.ddl_max_redownload_attempts:\ntry:\nself.__download_once(resource_identifier, output_path)\nreturn\nexcept CommunicationRetrievalError:\nif os.path.exists(output_path):\nremove_path(output_path)\nattempt += 1\nraise CommunicationRetrievalError(f\"Could not download {resource_identifier}\")\n
"},{"location":"reference/comms/entity_resources/sources/direct/#comms.entity_resources.sources.direct.DirectLinkSource.__download_once","title":"__download_once(resource_identifier, output_path)
","text":"Downloads a direct-download-link file by streaming its contents. source: https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests
Source code incli/medperf/comms/entity_resources/sources/direct.py
def __download_once(self, resource_identifier: str, output_path: str):\n\"\"\"Downloads a direct-download-link file by streaming its contents. source:\n https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests\n \"\"\"\nwith requests.get(resource_identifier, stream=True) as res:\nif res.status_code != 200:\nlog_response_error(res)\nmsg = (\n\"There was a problem retrieving the specified file at \"\n+ resource_identifier\n)\nraise CommunicationRetrievalError(msg)\nwith open(output_path, \"wb\") as f:\nfor chunk in res.iter_content(chunk_size=config.ddl_stream_chunk_size):\n# NOTE: if the response is chunk-encoded, this may not work\n# check whether this is common.\nf.write(chunk)\n
"},{"location":"reference/comms/entity_resources/sources/direct/#comms.entity_resources.sources.direct.DirectLinkSource.download","title":"download(resource_identifier, output_path)
","text":"Downloads a direct-download-link file with multiple attempts. Retries are used because some direct download link servers exhibit transient network failures.
Source code incli/medperf/comms/entity_resources/sources/direct.py
def download(self, resource_identifier: str, output_path: str):\n\"\"\"Downloads a direct-download-link file with multiple attempts. This is\n done due to facing transient network failure from some direct download\n link servers.\"\"\"\nattempt = 0\nwhile attempt < config.ddl_max_redownload_attempts:\ntry:\nself.__download_once(resource_identifier, output_path)\nreturn\nexcept CommunicationRetrievalError:\nif os.path.exists(output_path):\nremove_path(output_path)\nattempt += 1\nraise CommunicationRetrievalError(f\"Could not download {resource_identifier}\")\n
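The bounded retry loop above follows a common pattern that can be sketched independently of MedPerf. In this illustration `download_with_retries` and its `download_once` callable are hypothetical, and the built-in `ConnectionError` stands in for `CommunicationRetrievalError`; cleanup of partial files is omitted.

```python
def download_with_retries(download_once, max_attempts):
    """Hypothetical helper: try `download_once` up to `max_attempts` times.

    Returns the first successful result, or raises ConnectionError once
    every attempt has failed.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return download_once()
        except ConnectionError as exc:
            # Remember the failure and try again
            last_error = exc
    raise ConnectionError("Could not download the resource") from last_error
```

Chaining the final exception to the last failure keeps the underlying cause visible in tracebacks, which the original loop does not do.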
"},{"location":"reference/comms/entity_resources/sources/direct/#comms.entity_resources.sources.direct.DirectLinkSource.validate_resource","title":"validate_resource(value)
classmethod
","text":"This class expects a resource string of the form direct:<URL>
or only a URL.
Parameters:
Name Type Description Defaultresource
str
the resource string
requiredReturns:
Type Descriptionstr | None
The URL if the pattern matches, else None
Source code incli/medperf/comms/entity_resources/sources/direct.py
@classmethod\ndef validate_resource(cls, value: str):\n\"\"\"This class expects a resource string of the form\n `direct:<URL>` or only a URL.\n Args:\n resource (str): the resource string\n Returns:\n (str|None): The URL if the pattern matches, else None\n \"\"\"\nprefix = cls.prefix\nif value.startswith(prefix):\nprefix_len = len(prefix)\nvalue = value[prefix_len:]\nif validators.url(value):\nreturn value\n
"},{"location":"reference/comms/entity_resources/sources/source/","title":"Source","text":""},{"location":"reference/comms/entity_resources/sources/source/#comms.entity_resources.sources.source.BaseSource","title":"BaseSource
","text":" Bases: ABC
cli/medperf/comms/entity_resources/sources/source.py
class BaseSource(ABC):\n@classmethod\n@abstractmethod\ndef validate_resource(cls, value: str):\n\"\"\"Checks if an input resource can be downloaded by this class\"\"\"\n@abstractmethod\ndef __init__(self):\n\"\"\"Initialize\"\"\"\n@abstractmethod\ndef authenticate(self):\n\"\"\"Authenticates with the source server, if needed.\"\"\"\n@abstractmethod\ndef download(self, resource_identifier: str, output_path: str):\n\"\"\"Downloads the requested resource to the specified location\n Args:\n resource_identifier (str): The identifier that is used to download\n the resource (e.g. URL, asset ID, ...) It is the parsed output\n by `validate_resource`\n output_path (str): The path to download the resource to\n \"\"\"\n
"},{"location":"reference/comms/entity_resources/sources/source/#comms.entity_resources.sources.source.BaseSource.__init__","title":"__init__()
abstractmethod
","text":"Initialize
Source code incli/medperf/comms/entity_resources/sources/source.py
@abstractmethod\ndef __init__(self):\n\"\"\"Initialize\"\"\"\n
"},{"location":"reference/comms/entity_resources/sources/source/#comms.entity_resources.sources.source.BaseSource.authenticate","title":"authenticate()
abstractmethod
","text":"Authenticates with the source server, if needed.
Source code in cli/medperf/comms/entity_resources/sources/source.py
@abstractmethod\ndef authenticate(self):\n\"\"\"Authenticates with the source server, if needed.\"\"\"\n
"},{"location":"reference/comms/entity_resources/sources/source/#comms.entity_resources.sources.source.BaseSource.download","title":"download(resource_identifier, output_path)
abstractmethod
","text":"Downloads the requested resource to the specified location
Parameters:
Name Type Description Defaultresource_identifier
str
The identifier that is used to download the resource (e.g. URL, asset ID). It is the parsed output of validate_resource
requiredoutput_path
str
The path to download the resource to
required Source code in cli/medperf/comms/entity_resources/sources/source.py
@abstractmethod\ndef download(self, resource_identifier: str, output_path: str):\n\"\"\"Downloads the requested resource to the specified location\n Args:\n resource_identifier (str): The identifier that is used to download\n the resource (e.g. URL, asset ID, ...) It is the parsed output\n by `validate_resource`\n output_path (str): The path to download the resource to\n \"\"\"\n
"},{"location":"reference/comms/entity_resources/sources/source/#comms.entity_resources.sources.source.BaseSource.validate_resource","title":"validate_resource(value)
abstractmethod
classmethod
","text":"Checks if an input resource can be downloaded by this class
Source code in cli/medperf/comms/entity_resources/sources/source.py
@classmethod\n@abstractmethod\ndef validate_resource(cls, value: str):\n\"\"\"Checks if an input resource can be downloaded by this class\"\"\"\n
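As a rough sketch of how a concrete source might satisfy this interface, the example below redeclares a stand-in `BaseSource` with the same four abstract members and implements a hypothetical `LocalCopySource` that "downloads" by copying a local file. The class name and the `local:` prefix are assumptions made for illustration and are not part of MedPerf.

```python
import os
import shutil
from abc import ABC, abstractmethod
from typing import Optional


class BaseSource(ABC):
    """Stand-in mirroring the abstract interface documented above."""

    @classmethod
    @abstractmethod
    def validate_resource(cls, value: str):
        """Checks if an input resource can be downloaded by this class"""

    @abstractmethod
    def __init__(self):
        """Initialize"""

    @abstractmethod
    def authenticate(self):
        """Authenticates with the source server, if needed."""

    @abstractmethod
    def download(self, resource_identifier: str, output_path: str):
        """Downloads the requested resource to the specified location"""


class LocalCopySource(BaseSource):
    """Hypothetical source that copies a file from the local filesystem."""

    prefix = "local:"  # hypothetical prefix, chosen for this sketch only

    @classmethod
    def validate_resource(cls, value: str) -> Optional[str]:
        # Return the parsed identifier, or None if this class cannot handle it.
        if value.startswith(cls.prefix):
            return value[len(cls.prefix):]
        return None

    def __init__(self):
        pass  # no client object is needed for local copies

    def authenticate(self):
        pass  # nothing to authenticate against

    def download(self, resource_identifier: str, output_path: str):
        # Create the target folder if needed, then copy the file into place.
        os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
        shutil.copyfile(resource_identifier, output_path)
```

The identifier passed to `download` is the value returned by `validate_resource`, which matches the contract stated in the `download` docstring.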
"},{"location":"reference/comms/entity_resources/sources/synapse/","title":"Synapse","text":""},{"location":"reference/comms/entity_resources/sources/synapse/#comms.entity_resources.sources.synapse.SynapseSource","title":"SynapseSource
","text":" Bases: BaseSource
cli/medperf/comms/entity_resources/sources/synapse.py
class SynapseSource(BaseSource):\nprefix = \"synapse:\"\n@classmethod\ndef validate_resource(cls, value: str):\n\"\"\"This class expects a resource string of the form\n `synapse:<synapse_id>`, where <synapse_id> is in the form `syn<Integer>`.\n Args:\n resource (str): the resource string\n Returns:\n (str|None): The synapse ID if the pattern matches, else None\n \"\"\"\nprefix = cls.prefix\nif not value.startswith(prefix):\nreturn\nprefix_len = len(prefix)\nvalue = value[prefix_len:]\nif re.match(r\"syn\\d+$\", value):\nreturn value\ndef __init__(self):\nself.client = synapseclient.Synapse()\ndef authenticate(self):\ntry:\nself.client.login(silent=True)\nexcept SynapseNoCredentialsError:\nmsg = \"There was an attempt to download resources from the Synapse \"\nmsg += \"platform, but couldn't find Synapse credentials.\"\nmsg += \"\\nDid you run 'medperf auth synapse_login' before?\"\nraise CommunicationAuthenticationError(msg)\ndef download(self, resource_identifier: str, output_path: str):\n# we can specify target folder only. File name depends on how it was stored\ndownload_location = os.path.dirname(output_path)\nos.makedirs(download_location, exist_ok=True)\ntry:\nresource_file = self.client.get(\nresource_identifier, downloadLocation=download_location\n)\nexcept (SynapseHTTPError, SynapseUnmetAccessRestrictions) as e:\nraise CommunicationRetrievalError(str(e))\nresource_path = os.path.join(download_location, resource_file.name)\n# synapseclient may only throw a warning in some cases\n# (e.g. read permissions but no download permissions)\nif not os.path.exists(resource_path):\nraise CommunicationRetrievalError(\n\"There was a problem retrieving a file from Synapse\"\n)\nshutil.move(resource_path, output_path)\n
"},{"location":"reference/comms/entity_resources/sources/synapse/#comms.entity_resources.sources.synapse.SynapseSource.validate_resource","title":"validate_resource(value)
classmethod
","text":"This class expects a resource string of the form synapse:<synapse_id>
, where <synapse_id> is in the form syn<Integer>
.
Parameters:
Name Type Description Defaultresource
str
the resource string
requiredReturns:
Type Descriptionstr | None
The synapse ID if the pattern matches, else None
Source code in cli/medperf/comms/entity_resources/sources/synapse.py
@classmethod\ndef validate_resource(cls, value: str):\n\"\"\"This class expects a resource string of the form\n `synapse:<synapse_id>`, where <synapse_id> is in the form `syn<Integer>`.\n Args:\n resource (str): the resource string\n Returns:\n (str|None): The synapse ID if the pattern matches, else None\n \"\"\"\nprefix = cls.prefix\nif not value.startswith(prefix):\nreturn\nprefix_len = len(prefix)\nvalue = value[prefix_len:]\nif re.match(r\"syn\\d+$\", value):\nreturn value\n
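The matching rule above can be tried out on its own. The following is a minimal sketch that reuses the same prefix and regular expression as the source; the function name `parse_synapse_resource` is hypothetical.

```python
import re
from typing import Optional

PREFIX = "synapse:"  # same prefix as SynapseSource.prefix in the source above


def parse_synapse_resource(value: str) -> Optional[str]:
    """Return the Synapse ID from `synapse:syn<Integer>`, else None."""
    # Unlike the direct source, the prefix is mandatory here.
    if not value.startswith(PREFIX):
        return None
    candidate = value[len(PREFIX):]
    # "syn" followed by one or more digits, anchored at the end of the string.
    if re.match(r"syn\d+$", candidate):
        return candidate
    return None


print(parse_synapse_resource("synapse:syn12345"))
print(parse_synapse_resource("syn12345"))
```

Note that a bare `syn12345` without the prefix is rejected, so the prefix is what routes a resource string to this source.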
"},{"location":"reference/config_management/config_management/","title":"Config management","text":""},{"location":"reference/entities/benchmark/","title":"Benchmark","text":""},{"location":"reference/entities/benchmark/#entities.benchmark.Benchmark","title":"Benchmark
","text":" Bases: Entity
, ApprovableSchema
, DeployableSchema
Class representing a Benchmark
A benchmark is a bundle of assets that enables quantitative measurement of the performance of AI models for a specific clinical problem. A Benchmark instance contains information regarding how to prepare datasets for execution, as well as what models to run and how to evaluate them.
Source code in cli/medperf/entities/benchmark.py
class Benchmark(Entity, ApprovableSchema, DeployableSchema):\n\"\"\"\n Class representing a Benchmark\n a benchmark is a bundle of assets that enables quantitative\n measurement of the performance of AI models for a specific\n clinical problem. A Benchmark instance contains information\n regarding how to prepare datasets for execution, as well as\n what models to run and how to evaluate them.\n \"\"\"\ndescription: Optional[str] = Field(None, max_length=20)\ndocs_url: Optional[HttpUrl]\ndemo_dataset_tarball_url: str\ndemo_dataset_tarball_hash: Optional[str]\ndemo_dataset_generated_uid: Optional[str]\ndata_preparation_mlcube: int\nreference_model_mlcube: int\ndata_evaluator_mlcube: int\nmetadata: dict = {}\nuser_metadata: dict = {}\nis_active: bool = True\n@staticmethod\ndef get_type():\nreturn \"benchmark\"\n@staticmethod\ndef get_storage_path():\nreturn config.benchmarks_folder\n@staticmethod\ndef get_comms_retriever():\nreturn config.comms.get_benchmark\n@staticmethod\ndef get_metadata_filename():\nreturn config.benchmarks_filename\n@staticmethod\ndef get_comms_uploader():\nreturn config.comms.upload_benchmark\ndef __init__(self, *args, **kwargs):\n\"\"\"Creates a new benchmark instance\n Args:\n bmk_desc (Union[dict, BenchmarkModel]): Benchmark instance description\n \"\"\"\nsuper().__init__(*args, **kwargs)\n@property\ndef local_id(self):\nreturn self.name\n@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_benchmarks\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_benchmarks\nreturn comms_fn\n@classmethod\ndef get_models_uids(cls, benchmark_uid: int) -> List[int]:\n\"\"\"Retrieves the list of models associated to the 
benchmark\n Args:\n benchmark_uid (int): UID of the benchmark.\n comms (Comms): Instance of the communications interface.\n Returns:\n List[int]: List of mlcube uids\n \"\"\"\nassociations = config.comms.get_benchmark_model_associations(benchmark_uid)\nmodels_uids = [\nassoc[\"model_mlcube\"]\nfor assoc in associations\nif assoc[\"approval_status\"] == \"APPROVED\"\n]\nreturn models_uids\ndef display_dict(self):\nreturn {\n\"UID\": self.identifier,\n\"Name\": self.name,\n\"Description\": self.description,\n\"Documentation\": self.docs_url,\n\"Created At\": self.created_at,\n\"Data Preparation MLCube\": int(self.data_preparation_mlcube),\n\"Reference Model MLCube\": int(self.reference_model_mlcube),\n\"Data Evaluator MLCube\": int(self.data_evaluator_mlcube),\n\"State\": self.state,\n\"Approval Status\": self.approval_status,\n\"Registered\": self.is_registered,\n}\n
"},{"location":"reference/entities/benchmark/#entities.benchmark.Benchmark.__init__","title":"__init__(*args, **kwargs)
","text":"Creates a new benchmark instance
Parameters:
Name Type Description Defaultbmk_desc
Union[dict, BenchmarkModel]
Benchmark instance description
required Source code in cli/medperf/entities/benchmark.py
def __init__(self, *args, **kwargs):\n\"\"\"Creates a new benchmark instance\n Args:\n bmk_desc (Union[dict, BenchmarkModel]): Benchmark instance description\n \"\"\"\nsuper().__init__(*args, **kwargs)\n
"},{"location":"reference/entities/benchmark/#entities.benchmark.Benchmark.get_models_uids","title":"get_models_uids(benchmark_uid)
classmethod
","text":"Retrieves the list of models associated to the benchmark
Parameters:
Name Type Description Defaultbenchmark_uid
int
UID of the benchmark.
requiredcomms
Comms
Instance of the communications interface.
requiredReturns:
Type DescriptionList[int]
List of MLCube UIDs
Source code in cli/medperf/entities/benchmark.py
@classmethod\ndef get_models_uids(cls, benchmark_uid: int) -> List[int]:\n\"\"\"Retrieves the list of models associated to the benchmark\n Args:\n benchmark_uid (int): UID of the benchmark.\n comms (Comms): Instance of the communications interface.\n Returns:\n List[int]: List of mlcube uids\n \"\"\"\nassociations = config.comms.get_benchmark_model_associations(benchmark_uid)\nmodels_uids = [\nassoc[\"model_mlcube\"]\nfor assoc in associations\nif assoc[\"approval_status\"] == \"APPROVED\"\n]\nreturn models_uids\n
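The filtering step can be illustrated without a server. The sketch below applies the same list comprehension to hand-written association records; the helper name `approved_model_uids`, the sample UIDs, and the non-approved status strings are illustrative assumptions, while the `model_mlcube` and `approval_status` keys come from the source above.

```python
from typing import Dict, List


def approved_model_uids(associations: List[Dict]) -> List[int]:
    """Keep only the model UIDs whose association has been approved."""
    return [
        assoc["model_mlcube"]
        for assoc in associations
        if assoc["approval_status"] == "APPROVED"
    ]


# Hand-written sample records standing in for the server response.
sample = [
    {"model_mlcube": 3, "approval_status": "APPROVED"},
    {"model_mlcube": 7, "approval_status": "PENDING"},
    {"model_mlcube": 9, "approval_status": "APPROVED"},
]
print(approved_model_uids(sample))
```

Only the records marked APPROVED contribute a UID, and the order of the input list is preserved.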
"},{"location":"reference/entities/benchmark/#entities.benchmark.Benchmark.remote_prefilter","title":"remote_prefilter(filters)
staticmethod
","text":"Applies filtering logic that must be done before retrieving remote entities
Parameters:
Name Type Description Defaultfilters
dict
filters to apply
requiredReturns:
Name Type Descriptioncallable
callable
A function for retrieving remote entities with the applied prefilters
Source code in cli/medperf/entities/benchmark.py
@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_benchmarks\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_benchmarks\nreturn comms_fn\n
"},{"location":"reference/entities/cube/","title":"Cube","text":""},{"location":"reference/entities/cube/#entities.cube.Cube","title":"Cube
","text":" Bases: Entity
, DeployableSchema
Class representing an MLCube Container
The MedPerf platform uses MLCube containers for components such as dataset preparation, evaluation, and registered models. MLCube containers are software containers (e.g., Docker and Singularity) with standard metadata and a consistent file-system-level interface.
Source code in cli/medperf/entities/cube.py
class Cube(Entity, DeployableSchema):\n\"\"\"\n Class representing an MLCube Container\n Medperf platform uses the MLCube container for components such as\n Dataset Preparation, Evaluation, and the Registered Models. MLCube\n containers are software containers (e.g., Docker and Singularity)\n with standard metadata and a consistent file-system level interface.\n \"\"\"\ngit_mlcube_url: str\nmlcube_hash: Optional[str]\ngit_parameters_url: Optional[str]\nparameters_hash: Optional[str]\nimage_tarball_url: Optional[str]\nimage_tarball_hash: Optional[str]\nimage_hash: Optional[str]\nadditional_files_tarball_url: Optional[str] = Field(None, alias=\"tarball_url\")\nadditional_files_tarball_hash: Optional[str] = Field(None, alias=\"tarball_hash\")\nmetadata: dict = {}\nuser_metadata: dict = {}\n@staticmethod\ndef get_type():\nreturn \"cube\"\n@staticmethod\ndef get_storage_path():\nreturn config.cubes_folder\n@staticmethod\ndef get_comms_retriever():\nreturn config.comms.get_cube_metadata\n@staticmethod\ndef get_metadata_filename():\nreturn config.cube_metadata_filename\n@staticmethod\ndef get_comms_uploader():\nreturn config.comms.upload_mlcube\ndef __init__(self, *args, **kwargs):\n\"\"\"Creates a Cube instance\n Args:\n cube_desc (Union[dict, CubeModel]): MLCube Instance description\n \"\"\"\nsuper().__init__(*args, **kwargs)\nself.cube_path = os.path.join(self.path, config.cube_filename)\nself.params_path = None\nif self.git_parameters_url:\nself.params_path = os.path.join(self.path, config.params_filename)\n@property\ndef local_id(self):\nreturn self.name\n@staticmethod\ndef remote_prefilter(filters: dict):\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_cubes\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = 
config.comms.get_user_cubes\nreturn comms_fn\n@classmethod\ndef get(cls, cube_uid: Union[str, int], local_only: bool = False) -> \"Cube\":\n\"\"\"Retrieves and creates a Cube instance from the comms. If cube already exists\n inside the user's computer then retrieves it from there.\n Args:\n cube_uid (str): UID of the cube.\n Returns:\n Cube : a Cube instance with the retrieved data.\n \"\"\"\ncube = super().get(cube_uid, local_only)\nif not cube.is_valid:\nraise InvalidEntityError(\"The requested MLCube is marked as INVALID.\")\ncube.download_config_files()\nreturn cube\ndef download_mlcube(self):\nurl = self.git_mlcube_url\npath, file_hash = resources.get_cube(url, self.path, self.mlcube_hash)\nself.cube_path = path\nself.mlcube_hash = file_hash\ndef download_parameters(self):\nurl = self.git_parameters_url\nif url:\npath, file_hash = resources.get_cube_params(\nurl, self.path, self.parameters_hash\n)\nself.params_path = path\nself.parameters_hash = file_hash\ndef download_additional(self):\nurl = self.additional_files_tarball_url\nif url:\nfile_hash = resources.get_cube_additional(\nurl, self.path, self.additional_files_tarball_hash\n)\nself.additional_files_tarball_hash = file_hash\ndef download_image(self):\nurl = self.image_tarball_url\ntarball_hash = self.image_tarball_hash\nif url:\n_, local_hash = resources.get_cube_image(url, self.path, tarball_hash)\nself.image_tarball_hash = local_hash\nelse:\nif config.platform == \"docker\":\n# For docker, image should be pulled before calculating its hash\nself._get_image_from_registry()\nself._set_image_hash_from_registry()\nelif config.platform == \"singularity\":\n# For singularity, we need the hash first before trying to convert\nself._set_image_hash_from_registry()\nimage_folder = os.path.join(config.cubes_folder, config.image_path)\nif os.path.exists(image_folder):\nfor file in os.listdir(image_folder):\nif file == self._converted_singularity_image_name:\nreturn\nremove_path(os.path.join(image_folder, 
file))\nself._get_image_from_registry()\nelse:\n# TODO: such a check should happen on commands entrypoints, not here\nraise InvalidArgumentError(\"Unsupported platform\")\n@property\ndef _converted_singularity_image_name(self):\nreturn f\"{self.image_hash}.sif\"\ndef _set_image_hash_from_registry(self):\n# Retrieve image hash from MLCube\nlogging.debug(f\"Retrieving {self.id} image hash\")\ntmp_out_yaml = generate_tmp_path()\ncmd = f\"mlcube --log-level {config.loglevel} inspect --mlcube={self.cube_path} --format=yaml\"\ncmd += f\" --platform={config.platform} --output-file {tmp_out_yaml}\"\nlogging.info(f\"Running MLCube command: {cmd}\")\nwith spawn_and_kill(cmd, timeout=config.mlcube_inspect_timeout) as proc_wrapper:\nproc = proc_wrapper.proc\ncombine_proc_sp_text(proc)\nif proc.exitstatus != 0:\nraise ExecutionError(\"There was an error while inspecting the image hash\")\nwith open(tmp_out_yaml) as f:\nmlcube_details = yaml.safe_load(f)\nremove_path(tmp_out_yaml)\nlocal_hash = mlcube_details[\"hash\"]\nif self.image_hash and local_hash != self.image_hash:\nraise InvalidEntityError(\nf\"Hash mismatch. 
Expected {self.image_hash}, found {local_hash}.\"\n)\nself.image_hash = local_hash\ndef _get_image_from_registry(self):\n# Retrieve image from image registry\nlogging.debug(f\"Retrieving {self.id} image\")\ncmd = f\"mlcube --log-level {config.loglevel} configure --mlcube={self.cube_path} --platform={config.platform}\"\nif config.platform == \"singularity\":\ncmd += f\" -Psingularity.image={self._converted_singularity_image_name}\"\nlogging.info(f\"Running MLCube command: {cmd}\")\nwith spawn_and_kill(\ncmd, timeout=config.mlcube_configure_timeout\n) as proc_wrapper:\nproc = proc_wrapper.proc\ncombine_proc_sp_text(proc)\nif proc.exitstatus != 0:\nraise ExecutionError(\"There was an error while retrieving the MLCube image\")\ndef download_config_files(self):\ntry:\nself.download_mlcube()\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"MLCube {self.name} manifest file: {e}\")\ntry:\nself.download_parameters()\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"MLCube {self.name} parameters file: {e}\")\ndef download_run_files(self):\ntry:\nself.download_additional()\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"MLCube {self.name} additional files: {e}\")\ntry:\nself.download_image()\nexcept InvalidEntityError as e:\nraise InvalidEntityError(f\"MLCube {self.name} image file: {e}\")\ndef run(\nself,\ntask: str,\noutput_logs: str = None,\nstring_params: Dict[str, str] = {},\ntimeout: int = None,\nread_protected_input: bool = True,\n**kwargs,\n):\n\"\"\"Executes a given task on the cube instance\n Args:\n task (str): task to run\n string_params (Dict[str], optional): Extra parameters that can't be passed as normal function args.\n Defaults to {}.\n timeout (int, optional): timeout for the task in seconds. Defaults to None.\n read_protected_input (bool, optional): Wether to disable write permissions on input volumes. 
Defaults to True.\n kwargs (dict): additional arguments that are passed directly to the mlcube command\n \"\"\"\nkwargs.update(string_params)\ncmd = f\"mlcube --log-level {config.loglevel} run\"\ncmd += f' --mlcube=\"{self.cube_path}\" --task={task} --platform={config.platform} --network=none'\nif config.gpus is not None:\ncmd += f\" --gpus={config.gpus}\"\nif read_protected_input:\ncmd += \" --mount=ro\"\nfor k, v in kwargs.items():\ncmd_arg = f'{k}=\"{v}\"'\ncmd = \" \".join([cmd, cmd_arg])\ncontainer_loglevel = config.container_loglevel\n# TODO: we should override run args instead of what we are doing below\n# we shouldn't allow arbitrary run args unless our client allows it\nif config.platform == \"docker\":\n# use current user\ncpu_args = self.get_config(\"docker.cpu_args\") or \"\"\ngpu_args = self.get_config(\"docker.gpu_args\") or \"\"\ncpu_args = \" \".join([cpu_args, \"-u $(id -u):$(id -g)\"]).strip()\ngpu_args = \" \".join([gpu_args, \"-u $(id -u):$(id -g)\"]).strip()\ncmd += f' -Pdocker.cpu_args=\"{cpu_args}\"'\ncmd += f' -Pdocker.gpu_args=\"{gpu_args}\"'\nif container_loglevel:\ncmd += f' -Pdocker.env_args=\"-e MEDPERF_LOGLEVEL={container_loglevel.upper()}\"'\nelif config.platform == \"singularity\":\n# use -e to discard host env vars, -C to isolate the container (see singularity run --help)\nrun_args = self.get_config(\"singularity.run_args\") or \"\"\nrun_args = \" \".join([run_args, \"-eC\"]).strip()\ncmd += f' -Psingularity.run_args=\"{run_args}\"'\n# set image name in case of running docker image with singularity\n# Assuming we only accept mlcube.yamls with either singularity or docker sections\n# TODO: make checks on submitted mlcubes\nsingularity_config = self.get_config(\"singularity\")\nif singularity_config is None:\ncmd += (\nf' -Psingularity.image=\"{self._converted_singularity_image_name}\"'\n)\n# TODO: pass logging env for singularity also there\nelse:\nraise InvalidArgumentError(\"Unsupported platform\")\n# set accelerator count to zero 
to avoid unexpected behaviours and\n# force mlcube to only use --gpus to figure out GPU config\ncmd += \" -Pplatform.accelerator_count=0\"\nlogging.info(f\"Running MLCube command: {cmd}\")\nwith spawn_and_kill(cmd, timeout=timeout) as proc_wrapper:\nproc = proc_wrapper.proc\nproc_out = combine_proc_sp_text(proc)\nif output_logs is not None:\nwith open(output_logs, \"w\") as f:\nf.write(proc_out)\nif proc.exitstatus != 0:\nraise ExecutionError(\"There was an error while executing the cube\")\nlog_storage()\nreturn proc\ndef get_default_output(self, task: str, out_key: str, param_key: str = None) -> str:\n\"\"\"Returns the output parameter specified in the mlcube.yaml file\n Args:\n task (str): the task of interest\n out_key (str): key used to identify the desired output in the yaml file\n param_key (str): key inside the parameters file that completes the output path. Defaults to None.\n Returns:\n str: the path as specified in the mlcube.yaml file for the desired\n output for the desired task. 
Defaults to None if out_key not found\n \"\"\"\nout_path = self.get_config(f\"tasks.{task}.parameters.outputs.{out_key}\")\nif out_path is None:\nreturn\nif isinstance(out_path, dict):\n# output is specified as a dict with type and default values\nout_path = out_path[\"default\"]\ncube_loc = str(Path(self.cube_path).parent)\nout_path = os.path.join(cube_loc, \"workspace\", out_path)\nif self.params_path is not None and param_key is not None:\nwith open(self.params_path, \"r\") as f:\nparams = yaml.safe_load(f)\nout_path = os.path.join(out_path, params[param_key])\nreturn out_path\ndef get_config(self, identifier):\n\"\"\"\n Returns the output parameter specified in the mlcube.yaml file\n Args:\n identifier (str): `.` separated keys to traverse the mlcube dict\n Returns:\n str: the parameter value, None if not found\n \"\"\"\nwith open(self.cube_path, \"r\") as f:\ncube = yaml.safe_load(f)\nkeys = identifier.split(\".\")\nfor key in keys:\nif key not in cube:\nreturn\ncube = cube[key]\nreturn cube\ndef display_dict(self):\nreturn {\n\"UID\": self.identifier,\n\"Name\": self.name,\n\"Config File\": self.git_mlcube_url,\n\"State\": self.state,\n\"Created At\": self.created_at,\n\"Registered\": self.is_registered,\n}\n
"},{"location":"reference/entities/cube/#entities.cube.Cube.__init__","title":"__init__(*args, **kwargs)
","text":"Creates a Cube instance
Parameters:
Name Type Description Defaultcube_desc
Union[dict, CubeModel]
MLCube Instance description
required Source code in cli/medperf/entities/cube.py
def __init__(self, *args, **kwargs):\n\"\"\"Creates a Cube instance\n Args:\n cube_desc (Union[dict, CubeModel]): MLCube Instance description\n \"\"\"\nsuper().__init__(*args, **kwargs)\nself.cube_path = os.path.join(self.path, config.cube_filename)\nself.params_path = None\nif self.git_parameters_url:\nself.params_path = os.path.join(self.path, config.params_filename)\n
"},{"location":"reference/entities/cube/#entities.cube.Cube.get","title":"get(cube_uid, local_only=False)
classmethod
","text":"Retrieves and creates a Cube instance from the comms. If cube already exists inside the user's computer then retrieves it from there.
Parameters:
Name Type Description Defaultcube_uid
str
UID of the cube.
requiredReturns:
Name Type DescriptionCube
Cube
a Cube instance with the retrieved data.
Source code in cli/medperf/entities/cube.py
@classmethod\ndef get(cls, cube_uid: Union[str, int], local_only: bool = False) -> \"Cube\":\n\"\"\"Retrieves and creates a Cube instance from the comms. If cube already exists\n inside the user's computer then retrieves it from there.\n Args:\n cube_uid (str): UID of the cube.\n Returns:\n Cube : a Cube instance with the retrieved data.\n \"\"\"\ncube = super().get(cube_uid, local_only)\nif not cube.is_valid:\nraise InvalidEntityError(\"The requested MLCube is marked as INVALID.\")\ncube.download_config_files()\nreturn cube\n
"},{"location":"reference/entities/cube/#entities.cube.Cube.get_config","title":"get_config(identifier)
","text":"Returns the output parameter specified in the mlcube.yaml file
Parameters:
Name Type Description Defaultidentifier
str
Dot-separated keys used to traverse the mlcube dict
Returns:
Name Type Descriptionstr
the parameter value, None if not found
Source code in cli/medperf/entities/cube.py
def get_config(self, identifier):\n\"\"\"\n Returns the output parameter specified in the mlcube.yaml file\n Args:\n identifier (str): `.` separated keys to traverse the mlcube dict\n Returns:\n str: the parameter value, None if not found\n \"\"\"\nwith open(self.cube_path, \"r\") as f:\ncube = yaml.safe_load(f)\nkeys = identifier.split(\".\")\nfor key in keys:\nif key not in cube:\nreturn\ncube = cube[key]\nreturn cube\n
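The dotted-key traversal can be sketched on a plain dictionary, without reading a YAML file. The helper name `get_nested` and the sample configuration values are hypothetical. As a deliberate difference from the source above, this sketch also returns None when the path tries to descend into a non-dict value.

```python
from typing import Any, Optional


def get_nested(config: dict, identifier: str) -> Optional[Any]:
    """Walk `config` along the dot-separated keys in `identifier`."""
    node = config
    for key in identifier.split("."):
        # Added guard (not in the source): stop if we hit a leaf too early.
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical content resembling a parsed mlcube.yaml file.
cfg = {
    "docker": {"image": "example/image:0.0.1"},
    "tasks": {"infer": {"parameters": {"outputs": {"predictions": "predictions"}}}},
}
print(get_nested(cfg, "docker.image"))
print(get_nested(cfg, "singularity"))
```

A missing section such as `singularity` yields None, which is the condition `run` checks before setting the Singularity image name.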
"},{"location":"reference/entities/cube/#entities.cube.Cube.get_default_output","title":"get_default_output(task, out_key, param_key=None)
","text":"Returns the output parameter specified in the mlcube.yaml file
Parameters:
Name Type Description Defaulttask
str
the task of interest
requiredout_key
str
key used to identify the desired output in the yaml file
requiredparam_key
str
key inside the parameters file that completes the output path. Defaults to None.
None
Returns:
Name Type Descriptionstr
str
The path specified in the mlcube.yaml file for the desired output of the desired task. Returns None if out_key is not found.
Source code in cli/medperf/entities/cube.py
def get_default_output(self, task: str, out_key: str, param_key: str = None) -> str:\n\"\"\"Returns the output parameter specified in the mlcube.yaml file\n Args:\n task (str): the task of interest\n out_key (str): key used to identify the desired output in the yaml file\n param_key (str): key inside the parameters file that completes the output path. Defaults to None.\n Returns:\n str: the path as specified in the mlcube.yaml file for the desired\n output for the desired task. Defaults to None if out_key not found\n \"\"\"\nout_path = self.get_config(f\"tasks.{task}.parameters.outputs.{out_key}\")\nif out_path is None:\nreturn\nif isinstance(out_path, dict):\n# output is specified as a dict with type and default values\nout_path = out_path[\"default\"]\ncube_loc = str(Path(self.cube_path).parent)\nout_path = os.path.join(cube_loc, \"workspace\", out_path)\nif self.params_path is not None and param_key is not None:\nwith open(self.params_path, \"r\") as f:\nparams = yaml.safe_load(f)\nout_path = os.path.join(out_path, params[param_key])\nreturn out_path\n
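The path composition can be shown as a pure function. In the sketch below, the function name `default_output_path` and the sample paths are hypothetical; the output specification and the parameter value are passed in directly rather than being read from the mlcube.yaml and parameters files.

```python
import os
from pathlib import Path
from typing import Optional, Union


def default_output_path(
    cube_path: str,
    out_spec: Union[str, dict, None],
    param_value: Optional[str] = None,
) -> Optional[str]:
    """Compose the workspace path for a task output."""
    if out_spec is None:
        return None
    if isinstance(out_spec, dict):
        # Output given as a dict with type and default values.
        out_spec = out_spec["default"]
    # Outputs live in the "workspace" folder next to the mlcube.yaml file.
    cube_loc = str(Path(cube_path).parent)
    out_path = os.path.join(cube_loc, "workspace", out_spec)
    if param_value is not None:
        # A value from the parameters file may complete the path.
        out_path = os.path.join(out_path, param_value)
    return out_path


print(default_output_path("/cubes/1/mlcube.yaml", {"type": "directory", "default": "predictions"}))
```

Both the plain-string and the dict form of the output specification resolve to a path under the workspace folder beside the mlcube.yaml file.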
"},{"location":"reference/entities/cube/#entities.cube.Cube.remote_prefilter","title":"remote_prefilter(filters)
staticmethod
","text":"Applies filtering logic that must be done before retrieving remote entities
Parameters:
Name Type Description Defaultfilters
dict
filters to apply
requiredReturns:
Name Type Descriptioncallable
A function for retrieving remote entities with the applied prefilters
Source code in cli/medperf/entities/cube.py
@staticmethod\ndef remote_prefilter(filters: dict):\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_cubes\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_cubes\nreturn comms_fn\n
"},{"location":"reference/entities/cube/#entities.cube.Cube.run","title":"run(task, output_logs=None, string_params={}, timeout=None, read_protected_input=True, **kwargs)
","text":"Executes a given task on the cube instance
Parameters:
Name Type Description Defaulttask
str
task to run
requiredstring_params
Dict[str]
Extra parameters that can't be passed as normal function args. Defaults to {}.
{}
timeout
int
timeout for the task in seconds. Defaults to None.
None
read_protected_input
bool
Whether to disable write permissions on input volumes. Defaults to True.
True
kwargs
dict
additional arguments that are passed directly to the mlcube command
{}
Source code in cli/medperf/entities/cube.py
def run(\nself,\ntask: str,\noutput_logs: str = None,\nstring_params: Dict[str, str] = {},\ntimeout: int = None,\nread_protected_input: bool = True,\n**kwargs,\n):\n\"\"\"Executes a given task on the cube instance\n Args:\n task (str): task to run\n string_params (Dict[str], optional): Extra parameters that can't be passed as normal function args.\n Defaults to {}.\n timeout (int, optional): timeout for the task in seconds. Defaults to None.\n read_protected_input (bool, optional): Wether to disable write permissions on input volumes. Defaults to True.\n kwargs (dict): additional arguments that are passed directly to the mlcube command\n \"\"\"\nkwargs.update(string_params)\ncmd = f\"mlcube --log-level {config.loglevel} run\"\ncmd += f' --mlcube=\"{self.cube_path}\" --task={task} --platform={config.platform} --network=none'\nif config.gpus is not None:\ncmd += f\" --gpus={config.gpus}\"\nif read_protected_input:\ncmd += \" --mount=ro\"\nfor k, v in kwargs.items():\ncmd_arg = f'{k}=\"{v}\"'\ncmd = \" \".join([cmd, cmd_arg])\ncontainer_loglevel = config.container_loglevel\n# TODO: we should override run args instead of what we are doing below\n# we shouldn't allow arbitrary run args unless our client allows it\nif config.platform == \"docker\":\n# use current user\ncpu_args = self.get_config(\"docker.cpu_args\") or \"\"\ngpu_args = self.get_config(\"docker.gpu_args\") or \"\"\ncpu_args = \" \".join([cpu_args, \"-u $(id -u):$(id -g)\"]).strip()\ngpu_args = \" \".join([gpu_args, \"-u $(id -u):$(id -g)\"]).strip()\ncmd += f' -Pdocker.cpu_args=\"{cpu_args}\"'\ncmd += f' -Pdocker.gpu_args=\"{gpu_args}\"'\nif container_loglevel:\ncmd += f' -Pdocker.env_args=\"-e MEDPERF_LOGLEVEL={container_loglevel.upper()}\"'\nelif config.platform == \"singularity\":\n# use -e to discard host env vars, -C to isolate the container (see singularity run --help)\nrun_args = self.get_config(\"singularity.run_args\") or \"\"\nrun_args = \" \".join([run_args, \"-eC\"]).strip()\ncmd += f' 
-Psingularity.run_args=\"{run_args}\"'\n# set image name in case of running docker image with singularity\n# Assuming we only accept mlcube.yamls with either singularity or docker sections\n# TODO: make checks on submitted mlcubes\nsingularity_config = self.get_config(\"singularity\")\nif singularity_config is None:\ncmd += (\nf' -Psingularity.image=\"{self._converted_singularity_image_name}\"'\n)\n# TODO: pass logging env for singularity also there\nelse:\nraise InvalidArgumentError(\"Unsupported platform\")\n# set accelerator count to zero to avoid unexpected behaviours and\n# force mlcube to only use --gpus to figure out GPU config\ncmd += \" -Pplatform.accelerator_count=0\"\nlogging.info(f\"Running MLCube command: {cmd}\")\nwith spawn_and_kill(cmd, timeout=timeout) as proc_wrapper:\nproc = proc_wrapper.proc\nproc_out = combine_proc_sp_text(proc)\nif output_logs is not None:\nwith open(output_logs, \"w\") as f:\nf.write(proc_out)\nif proc.exitstatus != 0:\nraise ExecutionError(\"There was an error while executing the cube\")\nlog_storage()\nreturn proc\n
"},{"location":"reference/entities/dataset/","title":"Dataset","text":""},{"location":"reference/entities/dataset/#entities.dataset.Dataset","title":"Dataset
","text":" Bases: Entity
, DeployableSchema
Class representing a Dataset
Datasets are stored locally on the Data Owner's machine. They contain information about the prepared dataset, such as its name and description, general statistics, and a UID generated by hashing the contents of the data preparation output.
Source code in cli/medperf/entities/dataset.py
class Dataset(Entity, DeployableSchema):\n\"\"\"\n Class representing a Dataset\n Datasets are stored locally in the Data Owner's machine. They contain\n information regarding the prepared dataset, such as name and description,\n general statistics and an UID generated by hashing the contents of the\n data preparation output.\n \"\"\"\ndescription: Optional[str] = Field(None, max_length=20)\nlocation: Optional[str] = Field(None, max_length=20)\ninput_data_hash: str\ngenerated_uid: str\ndata_preparation_mlcube: Union[int, str]\nsplit_seed: Optional[int]\ngenerated_metadata: dict = Field(..., alias=\"metadata\")\nuser_metadata: dict = {}\nreport: dict = {}\nsubmitted_as_prepared: bool\n@staticmethod\ndef get_type():\nreturn \"dataset\"\n@staticmethod\ndef get_storage_path():\nreturn config.datasets_folder\n@staticmethod\ndef get_comms_retriever():\nreturn config.comms.get_dataset\n@staticmethod\ndef get_metadata_filename():\nreturn config.reg_file\n@staticmethod\ndef get_comms_uploader():\nreturn config.comms.upload_dataset\n@validator(\"data_preparation_mlcube\", pre=True, always=True)\ndef check_data_preparation_mlcube(cls, v, *, values, **kwargs):\nif not isinstance(v, int) and not values[\"for_test\"]:\nraise ValueError(\n\"data_preparation_mlcube must be an integer if not running a compatibility test\"\n)\nreturn v\ndef __init__(self, *args, **kwargs):\nsuper().__init__(*args, **kwargs)\nself.data_path = os.path.join(self.path, \"data\")\nself.labels_path = os.path.join(self.path, \"labels\")\nself.report_path = os.path.join(self.path, config.report_file)\nself.metadata_path = os.path.join(self.path, config.metadata_folder)\nself.statistics_path = os.path.join(self.path, config.statistics_filename)\n@property\ndef local_id(self):\nreturn self.generated_uid\ndef set_raw_paths(self, raw_data_path: str, raw_labels_path: str):\nraw_paths_file = os.path.join(self.path, config.dataset_raw_paths_file)\ndata = {\"data_path\": raw_data_path, \"labels_path\": 
raw_labels_path}\nwith open(raw_paths_file, \"w\") as f:\nyaml.dump(data, f)\ndef get_raw_paths(self):\nraw_paths_file = os.path.join(self.path, config.dataset_raw_paths_file)\nwith open(raw_paths_file) as f:\ndata = yaml.safe_load(f)\nreturn data[\"data_path\"], data[\"labels_path\"]\ndef mark_as_ready(self):\nflag_file = os.path.join(self.path, config.ready_flag_file)\nwith open(flag_file, \"w\"):\npass\ndef unmark_as_ready(self):\nflag_file = os.path.join(self.path, config.ready_flag_file)\nremove_path(flag_file)\ndef is_ready(self):\nflag_file = os.path.join(self.path, config.ready_flag_file)\nreturn os.path.exists(flag_file)\n@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_datasets\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_datasets\nif \"mlcube\" in filters and filters[\"mlcube\"] is not None:\ndef func():\nreturn config.comms.get_mlcube_datasets(filters[\"mlcube\"])\ncomms_fn = func\nreturn comms_fn\ndef display_dict(self):\nreturn {\n\"UID\": self.identifier,\n\"Name\": self.name,\n\"Description\": self.description,\n\"Location\": self.location,\n\"Data Preparation Cube UID\": self.data_preparation_mlcube,\n\"Generated Hash\": self.generated_uid,\n\"State\": self.state,\n\"Created At\": self.created_at,\n\"Registered\": self.is_registered,\n\"Submitted as Prepared\": self.submitted_as_prepared,\n\"Status\": \"\\n\".join([f\"{k}: {v}\" for k, v in self.report.items()]),\n\"Owner\": self.owner,\n}\n
"},{"location":"reference/entities/dataset/#entities.dataset.Dataset.remote_prefilter","title":"remote_prefilter(filters)
staticmethod
","text":"Applies filtering logic that must be done before retrieving remote entities
Parameters:
Name | Type | Description | Default
filters | dict | filters to apply | required
Returns:
Name | Type | Description
callable | callable | A function for retrieving remote entities with the applied prefilters
Source code in cli/medperf/entities/dataset.py
@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_datasets\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_datasets\nif \"mlcube\" in filters and filters[\"mlcube\"] is not None:\ndef func():\nreturn config.comms.get_mlcube_datasets(filters[\"mlcube\"])\ncomms_fn = func\nreturn comms_fn\n
"},{"location":"reference/entities/interface/","title":"Interface","text":""},{"location":"reference/entities/interface/#entities.interface.Entity","title":"Entity
","text":" Bases: MedperfSchema
, ABC
Source code in cli/medperf/entities/interface.py
class Entity(MedperfSchema, ABC):\n@staticmethod\ndef get_type() -> str:\nraise NotImplementedError()\n@staticmethod\ndef get_storage_path() -> str:\nraise NotImplementedError()\n@staticmethod\ndef get_comms_retriever() -> Callable[[int], dict]:\nraise NotImplementedError()\n@staticmethod\ndef get_metadata_filename() -> str:\nraise NotImplementedError()\n@staticmethod\ndef get_comms_uploader() -> Callable[[dict], dict]:\nraise NotImplementedError()\n@property\ndef local_id(self) -> str:\nraise NotImplementedError()\n@property\ndef identifier(self) -> Union[int, str]:\nreturn self.id or self.local_id\n@property\ndef is_registered(self) -> bool:\nreturn self.id is not None\n@property\ndef path(self) -> str:\nstorage_path = self.get_storage_path()\nreturn os.path.join(storage_path, str(self.identifier))\n@classmethod\ndef all(\ncls: Type[EntityType], unregistered: bool = False, filters: dict = {}\n) -> List[EntityType]:\n\"\"\"Gets a list of all instances of the respective entity.\n Whether the list is local or remote depends on the implementation.\n Args:\n unregistered (bool, optional): Wether to retrieve only unregistered local entities. 
Defaults to False.\n filters (dict, optional): key-value pairs specifying filters to apply to the list of entities.\n Returns:\n List[Entity]: a list of entities.\n \"\"\"\nlogging.info(f\"Retrieving all {cls.get_type()} entities\")\nif unregistered:\nif filters:\nraise InvalidArgumentError(\n\"Filtering is not supported for unregistered entities\"\n)\nreturn cls.__unregistered_all()\nreturn cls.__remote_all(filters=filters)\n@classmethod\ndef __remote_all(cls: Type[EntityType], filters: dict) -> List[EntityType]:\ncomms_fn = cls.remote_prefilter(filters)\nentity_meta = comms_fn()\nentities = [cls(**meta) for meta in entity_meta]\nreturn entities\n@classmethod\ndef __unregistered_all(cls: Type[EntityType]) -> List[EntityType]:\nentities = []\nstorage_path = cls.get_storage_path()\ntry:\nuids = next(os.walk(storage_path))[1]\nexcept StopIteration:\nmsg = f\"Couldn't iterate over the {cls.get_type()} storage\"\nlogging.warning(msg)\nraise MedperfException(msg)\nfor uid in uids:\nif uid.isdigit():\ncontinue\nentity = cls.__local_get(uid)\nentities.append(entity)\nreturn entities\n@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\nraise NotImplementedError\n@classmethod\ndef get(\ncls: Type[EntityType], uid: Union[str, int], local_only: bool = False\n) -> EntityType:\n\"\"\"Gets an instance of the respective entity.\n Wether this requires only local read or remote calls depends\n on the implementation.\n Args:\n uid (str): Unique Identifier to retrieve the entity\n local_only (bool): If True, the entity will be retrieved locally\n Returns:\n Entity: Entity Instance associated to the UID\n \"\"\"\nif not str(uid).isdigit() or local_only:\nreturn cls.__local_get(uid)\nreturn cls.__remote_get(uid)\n@classmethod\ndef __remote_get(cls: 
Type[EntityType], uid: int) -> EntityType:\n\"\"\"Retrieves and creates an entity instance from the comms instance.\n Args:\n uid (int): server UID of the entity\n Returns:\n Entity: Specified Entity Instance\n \"\"\"\nlogging.debug(f\"Retrieving {cls.get_type()} {uid} remotely\")\ncomms_func = cls.get_comms_retriever()\nentity_dict = comms_func(uid)\nentity = cls(**entity_dict)\nentity.write()\nreturn entity\n@classmethod\ndef __local_get(cls: Type[EntityType], uid: Union[str, int]) -> EntityType:\n\"\"\"Retrieves and creates an entity instance from the local storage.\n Args:\n uid (str|int): UID of the entity\n Returns:\n Entity: Specified Entity Instance\n \"\"\"\nlogging.debug(f\"Retrieving {cls.get_type()} {uid} locally\")\nentity_dict = cls.__get_local_dict(uid)\nentity = cls(**entity_dict)\nreturn entity\n@classmethod\ndef __get_local_dict(cls: Type[EntityType], uid: Union[str, int]) -> dict:\n\"\"\"Retrieves a local entity information\n Args:\n uid (str): uid of the local entity\n Returns:\n dict: information of the entity\n \"\"\"\nlogging.info(f\"Retrieving {cls.get_type()} {uid} from local storage\")\nstorage_path = cls.get_storage_path()\nmetadata_filename = cls.get_metadata_filename()\nentity_file = os.path.join(storage_path, str(uid), metadata_filename)\nif not os.path.exists(entity_file):\nraise InvalidArgumentError(\nf\"No {cls.get_type()} with the given uid could be found\"\n)\nwith open(entity_file, \"r\") as f:\ndata = yaml.safe_load(f)\nreturn data\ndef write(self) -> str:\n\"\"\"Writes the entity to the local storage\n Returns:\n str: Path to the stored entity\n \"\"\"\ndata = self.todict()\nmetadata_filename = self.get_metadata_filename()\nentity_file = os.path.join(self.path, metadata_filename)\nos.makedirs(self.path, exist_ok=True)\nwith open(entity_file, \"w\") as f:\nyaml.dump(data, f)\nreturn entity_file\ndef upload(self) -> Dict:\n\"\"\"Upload the entity-related information to the communication's interface\n Returns:\n Dict: Dictionary 
with the updated entity information\n \"\"\"\nif self.for_test:\nraise InvalidArgumentError(\nf\"This test {self.get_type()} cannot be uploaded.\"\n)\nbody = self.todict()\ncomms_func = self.get_comms_uploader()\nupdated_body = comms_func(body)\nreturn updated_body\ndef display_dict(self) -> dict:\n\"\"\"Returns a dictionary of entity properties that can be displayed\n to a user interface using a verbose name of the property rather than\n the internal names\n Returns:\n dict: the display dictionary\n \"\"\"\nraise NotImplementedError\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.__get_local_dict","title":"__get_local_dict(uid)
classmethod
","text":"Retrieves a local entity information
Parameters:
Name | Type | Description | Default
uid | str | uid of the local entity | required
Returns:
Name | Type | Description
dict | dict | information of the entity
Source code in cli/medperf/entities/interface.py
@classmethod\ndef __get_local_dict(cls: Type[EntityType], uid: Union[str, int]) -> dict:\n\"\"\"Retrieves a local entity information\n Args:\n uid (str): uid of the local entity\n Returns:\n dict: information of the entity\n \"\"\"\nlogging.info(f\"Retrieving {cls.get_type()} {uid} from local storage\")\nstorage_path = cls.get_storage_path()\nmetadata_filename = cls.get_metadata_filename()\nentity_file = os.path.join(storage_path, str(uid), metadata_filename)\nif not os.path.exists(entity_file):\nraise InvalidArgumentError(\nf\"No {cls.get_type()} with the given uid could be found\"\n)\nwith open(entity_file, \"r\") as f:\ndata = yaml.safe_load(f)\nreturn data\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.__local_get","title":"__local_get(uid)
classmethod
","text":"Retrieves and creates an entity instance from the local storage.
Parameters:
Name | Type | Description | Default
uid | str | int | UID of the entity | required
Returns:
Name | Type | Description
Entity | EntityType | Specified Entity Instance
Source code in cli/medperf/entities/interface.py
@classmethod\ndef __local_get(cls: Type[EntityType], uid: Union[str, int]) -> EntityType:\n\"\"\"Retrieves and creates an entity instance from the local storage.\n Args:\n uid (str|int): UID of the entity\n Returns:\n Entity: Specified Entity Instance\n \"\"\"\nlogging.debug(f\"Retrieving {cls.get_type()} {uid} locally\")\nentity_dict = cls.__get_local_dict(uid)\nentity = cls(**entity_dict)\nreturn entity\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.__remote_get","title":"__remote_get(uid)
classmethod
","text":"Retrieves and creates an entity instance from the comms instance.
Parameters:
Name | Type | Description | Default
uid | int | server UID of the entity | required
Returns:
Name | Type | Description
Entity | EntityType | Specified Entity Instance
Source code in cli/medperf/entities/interface.py
@classmethod\ndef __remote_get(cls: Type[EntityType], uid: int) -> EntityType:\n\"\"\"Retrieves and creates an entity instance from the comms instance.\n Args:\n uid (int): server UID of the entity\n Returns:\n Entity: Specified Entity Instance\n \"\"\"\nlogging.debug(f\"Retrieving {cls.get_type()} {uid} remotely\")\ncomms_func = cls.get_comms_retriever()\nentity_dict = comms_func(uid)\nentity = cls(**entity_dict)\nentity.write()\nreturn entity\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.all","title":"all(unregistered=False, filters={})
classmethod
","text":"Gets a list of all instances of the respective entity. Whether the list is local or remote depends on the implementation.
Parameters:
Name | Type | Description | Default
unregistered | bool | Whether to retrieve only unregistered local entities. Defaults to False. | False
filters | dict | key-value pairs specifying filters to apply to the list of entities. | {}
Returns:
Type | Description
List[EntityType] | List[Entity]: a list of entities.
Source code in cli/medperf/entities/interface.py
@classmethod\ndef all(\ncls: Type[EntityType], unregistered: bool = False, filters: dict = {}\n) -> List[EntityType]:\n\"\"\"Gets a list of all instances of the respective entity.\n Whether the list is local or remote depends on the implementation.\n Args:\n unregistered (bool, optional): Whether to retrieve only unregistered local entities. Defaults to False.\n filters (dict, optional): key-value pairs specifying filters to apply to the list of entities.\n Returns:\n List[Entity]: a list of entities.\n \"\"\"\nlogging.info(f\"Retrieving all {cls.get_type()} entities\")\nif unregistered:\nif filters:\nraise InvalidArgumentError(\n\"Filtering is not supported for unregistered entities\"\n)\nreturn cls.__unregistered_all()\nreturn cls.__remote_all(filters=filters)\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.display_dict","title":"display_dict()
","text":"Returns a dictionary of entity properties that can be displayed to a user interface using a verbose name of the property rather than the internal names
Returns:
Name | Type | Description
dict | dict | the display dictionary
Source code in cli/medperf/entities/interface.py
def display_dict(self) -> dict:\n\"\"\"Returns a dictionary of entity properties that can be displayed\n to a user interface using a verbose name of the property rather than\n the internal names\n Returns:\n dict: the display dictionary\n \"\"\"\nraise NotImplementedError\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.get","title":"get(uid, local_only=False)
classmethod
","text":"Gets an instance of the respective entity. Whether this requires only local read or remote calls depends on the implementation.
Parameters:
Name | Type | Description | Default
uid | str | Unique Identifier to retrieve the entity | required
local_only | bool | If True, the entity will be retrieved locally | False
Returns:
Name | Type | Description
Entity | EntityType | Entity Instance associated to the UID
Source code in cli/medperf/entities/interface.py
@classmethod\ndef get(\ncls: Type[EntityType], uid: Union[str, int], local_only: bool = False\n) -> EntityType:\n\"\"\"Gets an instance of the respective entity.\n Whether this requires only local read or remote calls depends\n on the implementation.\n Args:\n uid (str): Unique Identifier to retrieve the entity\n local_only (bool): If True, the entity will be retrieved locally\n Returns:\n Entity: Entity Instance associated to the UID\n \"\"\"\nif not str(uid).isdigit() or local_only:\nreturn cls.__local_get(uid)\nreturn cls.__remote_get(uid)\n
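The branch in get shown above decides between a local read and a remote call purely from the shape of the UID and the local_only flag. The following is a minimal standalone sketch of that rule only; resolve_source is a hypothetical helper name used for illustration and is not part of MedPerf.

```python
def resolve_source(uid, local_only=False):
    """Return which lookup path the dispatch rule would pick.

    Hypothetical helper: it mirrors only the condition used in the
    listing above and performs no storage or network access.
    """
    # Server-assigned UIDs are numeric; anything else (for example a
    # generated hash) can only exist in local storage.
    if not str(uid).isdigit() or local_only:
        return 'local'
    return 'remote'


print(resolve_source(12))                    # numeric UID
print(resolve_source('abc123'))              # non-numeric UID
print(resolve_source(12, local_only=True))   # forced local lookup
```

Note that a numeric string such as '12' is treated the same way as the integer 12, because the check runs on str(uid).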
"},{"location":"reference/entities/interface/#entities.interface.Entity.remote_prefilter","title":"remote_prefilter(filters)
staticmethod
","text":"Applies filtering logic that must be done before retrieving remote entities
Parameters:
Name | Type | Description | Default
filters | dict | filters to apply | required
Returns:
Name | Type | Description
callable | callable | A function for retrieving remote entities with the applied prefilters
Source code in cli/medperf/entities/interface.py
@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\nraise NotImplementedError\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.upload","title":"upload()
","text":"Upload the entity-related information to the communication's interface
Returns:
Name | Type | Description
Dict | Dict | Dictionary with the updated entity information
Source code in cli/medperf/entities/interface.py
def upload(self) -> Dict:\n\"\"\"Upload the entity-related information to the communication's interface\n Returns:\n Dict: Dictionary with the updated entity information\n \"\"\"\nif self.for_test:\nraise InvalidArgumentError(\nf\"This test {self.get_type()} cannot be uploaded.\"\n)\nbody = self.todict()\ncomms_func = self.get_comms_uploader()\nupdated_body = comms_func(body)\nreturn updated_body\n
"},{"location":"reference/entities/interface/#entities.interface.Entity.write","title":"write()
","text":"Writes the entity to the local storage
Returns:
Name | Type | Description
str | str | Path to the stored entity
Source code in cli/medperf/entities/interface.py
def write(self) -> str:\n\"\"\"Writes the entity to the local storage\n Returns:\n str: Path to the stored entity\n \"\"\"\ndata = self.todict()\nmetadata_filename = self.get_metadata_filename()\nentity_file = os.path.join(self.path, metadata_filename)\nos.makedirs(self.path, exist_ok=True)\nwith open(entity_file, \"w\") as f:\nyaml.dump(data, f)\nreturn entity_file\n
"},{"location":"reference/entities/report/","title":"Report","text":""},{"location":"reference/entities/report/#entities.report.TestReport","title":"TestReport
","text":" Bases: Entity
Class representing a compatibility test report entry
A test report consists of the components of a test execution:
- data used, which can be:
  - a demo dataset url and its hash, or
  - a raw data path and its labels path, or
  - a prepared dataset uid
- Data preparation cube if the data used was not already prepared
- model cube
- evaluator cube
- results
Note: This entity is local only; there are no TestReports on the server. However, we still use the same Entity interface used by other entities in order to reduce repeated code. Consequently, we mocked a few methods and attributes inherited from the Entity interface that are not relevant to this entity, such as the name and id attributes and the get and all methods.
Source code in cli/medperf/entities/report.py
class TestReport(Entity):\n\"\"\"\n Class representing a compatibility test report entry\n A test report consists of the components of a test execution:\n - data used, which can be:\n - a demo dataset url and its hash, or\n - a raw data path and its labels path, or\n - a prepared dataset uid\n - Data preparation cube if the data used was not already prepared\n - model cube\n - evaluator cube\n - results\n Note: This entity is only a local one, there is no TestReports on the server\n However, we still use the same Entity interface used by other entities\n in order to reduce repeated code. Consequently, we mocked a few methods\n and attributes inherited from the Entity interface that are not relevant to\n this entity, such as the `name` and `id` attributes, and such as\n the `get` and `all` methods.\n \"\"\"\nname: Optional[str] = \"name\"\ndemo_dataset_url: Optional[str]\ndemo_dataset_hash: Optional[str]\ndata_path: Optional[str]\nlabels_path: Optional[str]\nprepared_data_hash: Optional[str]\ndata_preparation_mlcube: Optional[Union[int, str]]\nmodel: Union[int, str]\ndata_evaluator_mlcube: Union[int, str]\nresults: Optional[dict]\n@staticmethod\ndef get_type():\nreturn \"report\"\n@staticmethod\ndef get_storage_path():\nreturn config.tests_folder\n@staticmethod\ndef get_metadata_filename():\nreturn config.test_report_file\ndef __init__(self, *args, **kwargs):\nsuper().__init__(*args, **kwargs)\nself.id = None\nself.for_test = True\n@property\ndef local_id(self):\n\"\"\"A helper that generates a unique hash for a test report.\"\"\"\nparams = self.todict()\ndel params[\"results\"]\nparams = str(params)\nreturn hashlib.sha256(params.encode()).hexdigest()\ndef set_results(self, results):\nself.results = results\n@classmethod\ndef all(cls, unregistered: bool = False, filters: dict = {}) -> List[\"TestReport\"]:\nassert unregistered, \"Reports are only unregistered\"\nassert filters == {}, \"Reports cannot be filtered\"\nreturn super().all(unregistered=True, 
filters={})\n@classmethod\ndef get(cls, uid: str, local_only: bool = False) -> \"TestReport\":\n\"\"\"Gets an instance of the TestReport. ignores local_only inherited flag as TestReport is always a local entity.\n Args:\n uid (str): Report Unique Identifier\n local_only (bool): ignored. Left for aligning with parent Entity class\n Returns:\n TestReport: Report Instance associated to the UID\n \"\"\"\nreturn super().get(uid, local_only=True)\ndef display_dict(self):\nif self.data_path:\ndata_source = f\"{self.data_path}\"[:27] + \"...\"\nelif self.demo_dataset_url:\ndata_source = f\"{self.demo_dataset_url}\"[:27] + \"...\"\nelse:\ndata_source = f\"{self.prepared_data_hash}\"\nreturn {\n\"UID\": self.local_id,\n\"Data Source\": data_source,\n\"Model\": (\nself.model if isinstance(self.model, int) else self.model[:27] + \"...\"\n),\n\"Evaluator\": (\nself.data_evaluator_mlcube\nif isinstance(self.data_evaluator_mlcube, int)\nelse self.data_evaluator_mlcube[:27] + \"...\"\n),\n}\n
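The local_id property in the listing above derives a report identifier by hashing the report's parameters with the results removed. Below is a minimal standalone sketch of that idea, using a plain dictionary in place of the pydantic model. The function name report_local_id and the sample field values are assumptions made for illustration.

```python
import hashlib


def report_local_id(params):
    """Hash every parameter except 'results' into a hex digest.

    Illustrative stand-in for the local_id property: a plain dict
    replaces the model's todict() output.
    """
    # Drop the results so that the identifier depends only on the
    # test configuration, not on its outcome.
    hashed = {k: v for k, v in params.items() if k != 'results'}
    return hashlib.sha256(str(hashed).encode()).hexdigest()


before = report_local_id({'model': 1, 'data_evaluator_mlcube': 2, 'results': None})
after = report_local_id({'model': 1, 'data_evaluator_mlcube': 2, 'results': {'acc': 0.9}})
print(before == after)
```

Because the digest is computed over str() of a dictionary, it depends on key insertion order; two dictionaries with the same items in a different order would hash differently in this sketch.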
"},{"location":"reference/entities/report/#entities.report.TestReport.local_id","title":"local_id
property
","text":"A helper that generates a unique hash for a test report.
"},{"location":"reference/entities/report/#entities.report.TestReport.get","title":"get(uid, local_only=False)
classmethod
","text":"Gets an instance of the TestReport. Ignores the inherited local_only flag, as TestReport is always a local entity.
Parameters:
Name | Type | Description | Default
uid | str | Report Unique Identifier | required
local_only | bool | Ignored. Left for aligning with parent Entity class | False
Returns:
Name | Type | Description
TestReport | TestReport | Report Instance associated to the UID
Source code in cli/medperf/entities/report.py
@classmethod\ndef get(cls, uid: str, local_only: bool = False) -> \"TestReport\":\n\"\"\"Gets an instance of the TestReport. ignores local_only inherited flag as TestReport is always a local entity.\n Args:\n uid (str): Report Unique Identifier\n local_only (bool): ignored. Left for aligning with parent Entity class\n Returns:\n TestReport: Report Instance associated to the UID\n \"\"\"\nreturn super().get(uid, local_only=True)\n
"},{"location":"reference/entities/result/","title":"Result","text":""},{"location":"reference/entities/result/#entities.result.Result","title":"Result
","text":" Bases: Entity
, ApprovableSchema
Class representing a Result entry
Results are obtained after successfully running a benchmark execution flow. They contain information regarding the components involved in obtaining metrics results, as well as the results themselves. This class provides methods for working with benchmark results and how to upload them to the backend.
Source code in cli/medperf/entities/result.py
class Result(Entity, ApprovableSchema):\n\"\"\"\n Class representing a Result entry\n Results are obtained after successfully running a benchmark\n execution flow. They contain information regarding the\n components involved in obtaining metrics results, as well as the\n results themselves. This class provides methods for working with\n benchmark results and how to upload them to the backend.\n \"\"\"\nbenchmark: int\nmodel: int\ndataset: int\nresults: dict\nmetadata: dict = {}\nuser_metadata: dict = {}\n@staticmethod\ndef get_type():\nreturn \"result\"\n@staticmethod\ndef get_storage_path():\nreturn config.results_folder\n@staticmethod\ndef get_comms_retriever():\nreturn config.comms.get_result\n@staticmethod\ndef get_metadata_filename():\nreturn config.results_info_file\n@staticmethod\ndef get_comms_uploader():\nreturn config.comms.upload_result\ndef __init__(self, *args, **kwargs):\n\"\"\"Creates a new result instance\"\"\"\nsuper().__init__(*args, **kwargs)\n@property\ndef local_id(self):\nreturn self.name\n@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_results\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_results\nif \"benchmark\" in filters and filters[\"benchmark\"] is not None:\nbmk = filters[\"benchmark\"]\ndef get_benchmark_results():\n# Decorate the benchmark results remote function so it has the same signature\n# as all the comms_fns\nreturn config.comms.get_benchmark_results(bmk)\ncomms_fn = get_benchmark_results\nreturn comms_fn\ndef display_dict(self):\nreturn {\n\"UID\": self.identifier,\n\"Name\": self.name,\n\"Benchmark\": self.benchmark,\n\"Model\": self.model,\n\"Dataset\": self.dataset,\n\"Partial\": 
self.metadata[\"partial\"],\n\"Approval Status\": self.approval_status,\n\"Created At\": self.created_at,\n\"Registered\": self.is_registered,\n}\n
"},{"location":"reference/entities/result/#entities.result.Result.__init__","title":"__init__(*args, **kwargs)
","text":"Creates a new result instance
Source code in cli/medperf/entities/result.py
def __init__(self, *args, **kwargs):\n\"\"\"Creates a new result instance\"\"\"\nsuper().__init__(*args, **kwargs)\n
"},{"location":"reference/entities/result/#entities.result.Result.remote_prefilter","title":"remote_prefilter(filters)
staticmethod
","text":"Applies filtering logic that must be done before retrieving remote entities
Parameters:
Name | Type | Description | Default
filters | dict | filters to apply | required
Returns:
Name | Type | Description
callable | callable | A function for retrieving remote entities with the applied prefilters
Source code in cli/medperf/entities/result.py
@staticmethod\ndef remote_prefilter(filters: dict) -> callable:\n\"\"\"Applies filtering logic that must be done before retrieving remote entities\n Args:\n filters (dict): filters to apply\n Returns:\n callable: A function for retrieving remote entities with the applied prefilters\n \"\"\"\ncomms_fn = config.comms.get_results\nif \"owner\" in filters and filters[\"owner\"] == get_medperf_user_data()[\"id\"]:\ncomms_fn = config.comms.get_user_results\nif \"benchmark\" in filters and filters[\"benchmark\"] is not None:\nbmk = filters[\"benchmark\"]\ndef get_benchmark_results():\n# Decorate the benchmark results remote function so it has the same signature\n# as all the comms_fns\nreturn config.comms.get_benchmark_results(bmk)\ncomms_fn = get_benchmark_results\nreturn comms_fn\n
"},{"location":"reference/entities/schemas/","title":"Schemas","text":""},{"location":"reference/entities/schemas/#entities.schemas.MedperfSchema","title":"MedperfSchema
","text":" Bases: BaseModel
Source code in cli/medperf/entities/schemas.py
class MedperfSchema(BaseModel):\nfor_test: bool = False\nid: Optional[int]\nname: str = Field(..., max_length=64)\nowner: Optional[int]\nis_valid: bool = True\ncreated_at: Optional[datetime]\nmodified_at: Optional[datetime]\ndef __init__(self, *args, **kwargs):\n\"\"\"Override the ValidationError procedure so we can\n format the error message in our desired way\n \"\"\"\ntry:\nsuper().__init__(*args, **kwargs)\nexcept ValidationError as e:\nerrors_dict = defaultdict(list)\nfor error in e.errors():\nfield = error[\"loc\"]\nmsg = error[\"msg\"]\nerrors_dict[field].append(msg)\nerror_msg = \"Field Validation Error:\"\nerror_msg += format_errors_dict(errors_dict)\nraise MedperfException(error_msg)\ndef dict(self, *args, **kwargs) -> dict:\n\"\"\"Overrides dictionary implementation so it filters out\n fields not defined in the pydantic model\n Returns:\n dict: filtered dictionary\n \"\"\"\nfields = self.__fields__\nvalid_fields = []\n# Gather all the field names, both original an alias names\nfor field_name, field_item in fields.items():\nvalid_fields.append(field_name)\nvalid_fields.append(field_item.alias)\n# Remove duplicates\nvalid_fields = set(valid_fields)\nmodel_dict = super().dict(*args, **kwargs)\nout_dict = {k: v for k, v in model_dict.items() if k in valid_fields}\nreturn out_dict\ndef todict(self) -> dict:\n\"\"\"Dictionary containing both original and alias fields\n Returns:\n dict: Extended dictionary representation\n \"\"\"\nog_dict = self.dict()\nalias_dict = self.dict(by_alias=True)\nog_dict.update(alias_dict)\nfor k, v in og_dict.items():\nif v is None:\nog_dict[k] = \"\"\nif isinstance(v, HttpUrl):\nog_dict[k] = str(v)\nreturn og_dict\n@validator(\"*\", pre=True)\ndef empty_str_to_none(cls, v):\nif v == \"\":\nreturn None\nreturn v\n@validator(\"name\", pre=True, always=True)\ndef name_max_length(cls, v, *, values, **kwargs):\nif not values[\"for_test\"] and len(v) > 20:\nraise ValueError(\"The name must have no more than 20 characters\")\nreturn 
v\nclass Config:\nallow_population_by_field_name = True\nextra = \"allow\"\nuse_enum_values = True\n
"},{"location":"reference/entities/schemas/#entities.schemas.MedperfSchema.__init__","title":"__init__(*args, **kwargs)
","text":"Override the ValidationError procedure so we can format the error message in our desired way
Source code in cli/medperf/entities/schemas.py
def __init__(self, *args, **kwargs):\n\"\"\"Override the ValidationError procedure so we can\n format the error message in our desired way\n \"\"\"\ntry:\nsuper().__init__(*args, **kwargs)\nexcept ValidationError as e:\nerrors_dict = defaultdict(list)\nfor error in e.errors():\nfield = error[\"loc\"]\nmsg = error[\"msg\"]\nerrors_dict[field].append(msg)\nerror_msg = \"Field Validation Error:\"\nerror_msg += format_errors_dict(errors_dict)\nraise MedperfException(error_msg)\n
"},{"location":"reference/entities/schemas/#entities.schemas.MedperfSchema.dict","title":"dict(*args, **kwargs)
","text":"Overrides dictionary implementation so it filters out fields not defined in the pydantic model
Returns:
Name | Type | Description
dict | dict | filtered dictionary
Source code in cli/medperf/entities/schemas.py
def dict(self, *args, **kwargs) -> dict:\n\"\"\"Overrides dictionary implementation so it filters out\n fields not defined in the pydantic model\n Returns:\n dict: filtered dictionary\n \"\"\"\nfields = self.__fields__\nvalid_fields = []\n# Gather all the field names, both original an alias names\nfor field_name, field_item in fields.items():\nvalid_fields.append(field_name)\nvalid_fields.append(field_item.alias)\n# Remove duplicates\nvalid_fields = set(valid_fields)\nmodel_dict = super().dict(*args, **kwargs)\nout_dict = {k: v for k, v in model_dict.items() if k in valid_fields}\nreturn out_dict\n
"},{"location":"reference/entities/schemas/#entities.schemas.MedperfSchema.todict","title":"todict()
","text":"Dictionary containing both original and alias fields
Returns:
Name | Type | Description
dict | dict | Extended dictionary representation
Source code in cli/medperf/entities/schemas.py
def todict(self) -> dict:\n\"\"\"Dictionary containing both original and alias fields\n Returns:\n dict: Extended dictionary representation\n \"\"\"\nog_dict = self.dict()\nalias_dict = self.dict(by_alias=True)\nog_dict.update(alias_dict)\nfor k, v in og_dict.items():\nif v is None:\nog_dict[k] = \"\"\nif isinstance(v, HttpUrl):\nog_dict[k] = str(v)\nreturn og_dict\n
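The todict listing above merges the name-keyed and alias-keyed views of a model and replaces None values with empty strings. The sketch below reproduces only that merging step with plain dictionaries, assuming a simple name-to-alias mapping. The function merge_with_aliases and the sample field names are hypothetical and stand in for the pydantic machinery.

```python
def merge_with_aliases(values, aliases):
    """Combine name-keyed and alias-keyed views of the same data.

    values:  field name -> value
    aliases: field name -> alias (fields without an alias are omitted)
    Illustrative only; the real method obtains both views from the model.
    """
    by_name = dict(values)
    # Re-key each value by its alias, falling back to the field name.
    by_alias = {aliases.get(k, k): v for k, v in values.items()}
    merged = {**by_name, **by_alias}
    # Mirror the listing: None becomes an empty string.
    return {k: ('' if v is None else v) for k, v in merged.items()}


merged = merge_with_aliases(
    {'generated_metadata': {'a': 1}, 'split_seed': None},
    {'generated_metadata': 'metadata'},
)
print(sorted(merged))
```

The result carries the aliased value under both keys, so a consumer can read it by either name.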
"},{"location":"reference/storage/utils/","title":"Utils","text":""},{"location":"reference/ui/cli/","title":"Cli","text":""},{"location":"reference/ui/cli/#ui.cli.CLI","title":"CLI
","text":" Bases: UI
Source code in cli/medperf/ui/cli.py
class CLI(UI):\ndef __init__(self):\nself.spinner = yaspin(color=\"green\")\nself.is_interactive = False\ndef print(self, msg: str = \"\"):\n\"\"\"Display a message on the command line\n Args:\n msg (str): message to print\n \"\"\"\nself.__print(msg)\ndef print_error(self, msg: str):\n\"\"\"Display an error message on the command line\n Args:\n msg (str): error message to display\n \"\"\"\nmsg = f\"\u274c {msg}\"\nmsg = typer.style(msg, fg=typer.colors.RED, bold=True)\nself.__print(msg)\ndef print_warning(self, msg: str):\n\"\"\"Display a warning message on the command line\n Args:\n msg (str): warning message to display\n \"\"\"\nmsg = typer.style(msg, fg=typer.colors.YELLOW, bold=True)\nself.__print(msg)\ndef __print(self, msg: str = \"\"):\nif self.is_interactive:\nself.spinner.write(msg)\nelse:\ntyper.echo(msg)\ndef start_interactive(self):\n\"\"\"Start an interactive session where messages can be overwritten\n and animations can be displayed\"\"\"\nself.is_interactive = True\nself.spinner.start()\ndef stop_interactive(self):\n\"\"\"Stop an interactive session\"\"\"\nself.is_interactive = False\nself.spinner.stop()\n@contextmanager\ndef interactive(self):\n\"\"\"Context managed interactive session.\n Yields:\n CLI: Yields the current CLI instance with an interactive session initialized\n \"\"\"\nself.start_interactive()\ntry:\nyield self\nfinally:\nself.stop_interactive()\n@property\ndef text(self):\nreturn self.spinner.text\n@text.setter\ndef text(self, msg: str = \"\"):\n\"\"\"Displays a message that overwrites previous messages if they\n were created during an interactive ui session.\n If not on interactive session already, then it calls the ui print function\n Args:\n msg (str): message to display\n \"\"\"\nif not self.is_interactive:\nself.print(msg)\nself.spinner.text = msg\ndef prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer\n Args:\n msg (str): message to use for the prompt\n Returns:\n str: user input\n 
\"\"\"\nreturn input(msg)\ndef hidden_prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an aswer. User input is not displayed\n Args:\n msg (str): message to use for the prompt\n Returns:\n str: user input\n \"\"\"\nreturn getpass(msg)\ndef print_highlight(self, msg: str = \"\"):\n\"\"\"Display a highlighted message\n Args:\n msg (str): message to print\n \"\"\"\nmsg = typer.style(msg, fg=typer.colors.GREEN)\nself.__print(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.hidden_prompt","title":"hidden_prompt(msg)
","text":"Displays a prompt to the user and waits for an aswer. User input is not displayed
Parameters:
Name Type Description Defaultmsg
str
message to use for the prompt
requiredReturns:
Name Type Descriptionstr
str
user input
Source code incli/medperf/ui/cli.py
def hidden_prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer. User input is not displayed\n Args:\n msg (str): message to use for the prompt\n Returns:\n str: user input\n \"\"\"\nreturn getpass(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.interactive","title":"interactive()
","text":"Context managed interactive session.
Yields:
Name Type DescriptionCLI
Yields the current CLI instance with an interactive session initialized
Source code incli/medperf/ui/cli.py
@contextmanager\ndef interactive(self):\n\"\"\"Context managed interactive session.\n Yields:\n CLI: Yields the current CLI instance with an interactive session initialized\n \"\"\"\nself.start_interactive()\ntry:\nyield self\nfinally:\nself.stop_interactive()\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.print","title":"print(msg='')
","text":"Display a message on the command line
Parameters:
Name Type Description Defaultmsg
str
message to print
''
Source code in cli/medperf/ui/cli.py
def print(self, msg: str = \"\"):\n\"\"\"Display a message on the command line\n Args:\n msg (str): message to print\n \"\"\"\nself.__print(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.print_error","title":"print_error(msg)
","text":"Display an error message on the command line
Parameters:
Name Type Description Defaultmsg
str
error message to display
required Source code incli/medperf/ui/cli.py
def print_error(self, msg: str):\n\"\"\"Display an error message on the command line\n Args:\n msg (str): error message to display\n \"\"\"\nmsg = f\"\u274c {msg}\"\nmsg = typer.style(msg, fg=typer.colors.RED, bold=True)\nself.__print(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.print_highlight","title":"print_highlight(msg='')
","text":"Display a highlighted message
Parameters:
Name Type Description Defaultmsg
str
message to print
''
Source code in cli/medperf/ui/cli.py
def print_highlight(self, msg: str = \"\"):\n\"\"\"Display a highlighted message\n Args:\n msg (str): message to print\n \"\"\"\nmsg = typer.style(msg, fg=typer.colors.GREEN)\nself.__print(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.print_warning","title":"print_warning(msg)
","text":"Display a warning message on the command line
Parameters:
Name Type Description Defaultmsg
str
warning message to display
required Source code incli/medperf/ui/cli.py
def print_warning(self, msg: str):\n\"\"\"Display a warning message on the command line\n Args:\n msg (str): warning message to display\n \"\"\"\nmsg = typer.style(msg, fg=typer.colors.YELLOW, bold=True)\nself.__print(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.prompt","title":"prompt(msg)
","text":"Displays a prompt to the user and waits for an answer
Parameters:
Name Type Description Defaultmsg
str
message to use for the prompt
requiredReturns:
Name Type Descriptionstr
str
user input
Source code incli/medperf/ui/cli.py
def prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer\n Args:\n msg (str): message to use for the prompt\n Returns:\n str: user input\n \"\"\"\nreturn input(msg)\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.start_interactive","title":"start_interactive()
","text":"Start an interactive session where messages can be overwritten and animations can be displayed
Source code incli/medperf/ui/cli.py
def start_interactive(self):\n\"\"\"Start an interactive session where messages can be overwritten\n and animations can be displayed\"\"\"\nself.is_interactive = True\nself.spinner.start()\n
"},{"location":"reference/ui/cli/#ui.cli.CLI.stop_interactive","title":"stop_interactive()
","text":"Stop an interactive session
Source code incli/medperf/ui/cli.py
def stop_interactive(self):\n\"\"\"Stop an interactive session\"\"\"\nself.is_interactive = False\nself.spinner.stop()\n
"},{"location":"reference/ui/factory/","title":"Factory","text":""},{"location":"reference/ui/interface/","title":"Interface","text":""},{"location":"reference/ui/interface/#ui.interface.UI","title":"UI
","text":" Bases: ABC
cli/medperf/ui/interface.py
class UI(ABC):\n@abstractmethod\ndef print(self, msg: str = \"\"):\n\"\"\"Display a message to the interface. If on interactive session overrides\n previous message\n \"\"\"\n@abstractmethod\ndef print_error(self, msg: str):\n\"\"\"Display an error message to the interface\"\"\"\ndef print_warning(self, msg: str):\n\"\"\"Display a warning message on the command line\"\"\"\n@abstractmethod\ndef start_interactive(self):\n\"\"\"Initialize an interactive session for animations or overriding messages.\n If the UI doesn't support this, the function can be left empty.\n \"\"\"\n@abstractmethod\ndef stop_interactive(self):\n\"\"\"Terminate an interactive session.\n If the UI doesn't support this, the function can be left empty.\n \"\"\"\n@abstractmethod\n@contextmanager\ndef interactive(self):\n\"\"\"Context managed interactive session. Expected to yield the same instance\"\"\"\n@abstractmethod\ndef text(self, msg: str):\n\"\"\"Displays a message that overwrites previous messages if they were created\n during an interactive session.\n If not supported or not on an interactive session, it is expected to fall back\n to the UI print function.\n Args:\n msg (str): message to display\n \"\"\"\n@abstractmethod\ndef prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer\"\"\"\n@abstractmethod\ndef hidden_prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer. User input is not displayed\n Args:\n msg (str): message to use for the prompt\n Returns:\n str: user input\n \"\"\"\n@abstractmethod\ndef print_highlight(self, msg: str = \"\"):\n\"\"\"Display a message on the command line with green color\n Args:\n msg (str): message to print\n \"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.hidden_prompt","title":"hidden_prompt(msg)
abstractmethod
","text":"Displays a prompt to the user and waits for an answer. User input is not displayed
Parameters:
Name Type Description Defaultmsg
str
message to use for the prompt
requiredReturns:
Name Type Descriptionstr
str
user input
Source code incli/medperf/ui/interface.py
@abstractmethod\ndef hidden_prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer. User input is not displayed\n Args:\n msg (str): message to use for the prompt\n Returns:\n str: user input\n \"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.interactive","title":"interactive()
abstractmethod
","text":"Context managed interactive session. Expected to yield the same instance
Source code incli/medperf/ui/interface.py
@abstractmethod\n@contextmanager\ndef interactive(self):\n\"\"\"Context managed interactive session. Expected to yield the same instance\"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.print","title":"print(msg='')
abstractmethod
","text":"Display a message to the interface. If on interactive session overrides previous message
Source code incli/medperf/ui/interface.py
@abstractmethod\ndef print(self, msg: str = \"\"):\n\"\"\"Display a message to the interface. If on interactive session overrides\n previous message\n \"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.print_error","title":"print_error(msg)
abstractmethod
","text":"Display an error message to the interface
Source code incli/medperf/ui/interface.py
@abstractmethod\ndef print_error(self, msg: str):\n\"\"\"Display an error message to the interface\"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.print_highlight","title":"print_highlight(msg='')
abstractmethod
","text":"Display a message on the command line with green color
Parameters:
Name Type Description Defaultmsg
str
message to print
''
Source code in cli/medperf/ui/interface.py
@abstractmethod\ndef print_highlight(self, msg: str = \"\"):\n\"\"\"Display a message on the command line with green color\n Args:\n msg (str): message to print\n \"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.print_warning","title":"print_warning(msg)
","text":"Display a warning message on the command line
Source code incli/medperf/ui/interface.py
def print_warning(self, msg: str):\n\"\"\"Display a warning message on the command line\"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.prompt","title":"prompt(msg)
abstractmethod
","text":"Displays a prompt to the user and waits for an answer
Source code incli/medperf/ui/interface.py
@abstractmethod\ndef prompt(self, msg: str) -> str:\n\"\"\"Displays a prompt to the user and waits for an answer\"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.start_interactive","title":"start_interactive()
abstractmethod
","text":"Initialize an interactive session for animations or overriding messages. If the UI doesn't support this, the function can be left empty.
Source code incli/medperf/ui/interface.py
@abstractmethod\ndef start_interactive(self):\n\"\"\"Initialize an interactive session for animations or overriding messages.\n If the UI doesn't support this, the function can be left empty.\n \"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.stop_interactive","title":"stop_interactive()
abstractmethod
","text":"Terminate an interactive session. If the UI doesn't support this, the function can be left empty.
Source code incli/medperf/ui/interface.py
@abstractmethod\ndef stop_interactive(self):\n\"\"\"Terminate an interactive session.\n If the UI doesn't support this, the function can be left empty.\n \"\"\"\n
"},{"location":"reference/ui/interface/#ui.interface.UI.text","title":"text(msg)
abstractmethod
","text":"Displays a message that overwrites previous messages if they were created during an interactive session. If not supported or not on an interactive session, it is expected to fall back to the UI print function.
Parameters:
Name Type Description Defaultmsg
str
message to display
required Source code incli/medperf/ui/interface.py
@abstractmethod\ndef text(self, msg: str):\n\"\"\"Displays a message that overwrites previous messages if they were created\n during an interactive session.\n If not supported or not on an interactive session, it is expected to fall back\n to the UI print function.\n Args:\n msg (str): message to display\n \"\"\"\n
"},{"location":"reference/ui/stdin/","title":"Stdin","text":""},{"location":"reference/ui/stdin/#ui.stdin.StdIn","title":"StdIn
","text":" Bases: UI
Class for using sys.stdin/sys.stdout exclusively. Used mainly for automating execution with class-like objects. Using only basic IO methods ensures that piping from the command line works. Should not be used in normal execution, as hidden prompts and interactive prints will not work as expected.
Source code incli/medperf/ui/stdin.py
class StdIn(UI):\n\"\"\"\n Class for using sys.stdin/sys.stdout exclusively. Used mainly for automating\n execution with class-like objects. Using only basic IO methods ensures that\n piping from the command line works. Should not be used in normal execution, as\n hidden prompts and interactive prints will not work as expected.\n \"\"\"\ndef print(self, msg: str = \"\"):\nreturn print(msg)\ndef print_error(self, msg: str):\nreturn self.print(msg)\ndef start_interactive(self):\npass\ndef stop_interactive(self):\npass\n@contextmanager\ndef interactive(self):\nyield self\n@property\ndef text(self):\nreturn \"\"\n@text.setter\ndef text(self, msg: str = \"\"):\nreturn\ndef prompt(self, msg: str) -> str:\nreturn input(msg)\ndef hidden_prompt(self, msg: str) -> str:\nreturn self.prompt(msg)\n
"}]}
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
new file mode 100644
index 000000000..349210f0c
--- /dev/null
+++ b/sitemap.xml
@@ -0,0 +1,573 @@
+
+MedPerf is an open-source framework for benchmarking medical ML models. It uses Federated Evaluation, a method in which medical ML models are securely distributed to multiple global facilities for evaluation, prioritizing patient privacy to mitigate legal and regulatory risks. The goal of Federated Evaluation is to make it simple and reliable to share ML models with many data providers, evaluate those ML models against their data in controlled settings, then aggregate and analyze the findings.
+The MedPerf approach empowers healthcare stakeholders through neutral governance to assess and verify the performance of ML models in an efficient and human-supervised process without sharing any patient data across facilities during the process.
++ |
---|
Federated evaluation of medical AI model using MedPerf on a hypothetical example | +
MedPerf aims to identify bias and generalizability issues of medical ML models by evaluating them on diverse medical data across the world. This process allows developers of medical ML to efficiently identify performance and reliability issues in their models, while healthcare stakeholders (e.g., hospitals and practices) can validate such models against clinical efficacy.
+Importantly, MedPerf supports technology for neutral governance in order to enable full trust and transparency among participating parties (e.g., AI vendors, data providers, and regulatory bodies). This is all encapsulated in the benchmark committee, which is the body overseeing a benchmark.
++ |
---|
Benchmark committee in MedPerf | +
Anyone who joins our platform can get several benefits, regardless of the role they will assume.
++ |
---|
Benefits to healthcare stakeholders using MedPerf | +
Our paper describes the design philosophy in detail.
+ + + + + + + + + + + +A benchmark in MedPerf is a collection of assets that are developed by the benchmark committee that aims to evaluate medical ML on decentralized data providers.
+The process is simple yet effective, enabling scalability.
+The benchmarking process starts with establishing a benchmark committee of healthcare stakeholders (experts, committee), which will identify a clinical problem where an effective ML-based solution can have a significant clinical impact.
 + +MLCubes are the building blocks of an experiment and are required in order to create a benchmark. Three MLCubes (Data Preparator MLCube, Reference Model MLCube, and Metrics MLCube) need to be submitted. After submitting the three MLCubes, along with a sample reference dataset, the Benchmark Committee can create a benchmark. Once the benchmark is submitted, the MedPerf admin must approve it before it can be seen by other users. Follow our Hands-on Tutorial for detailed step-by-step guidelines.
+Data Providers that want to be part of the benchmark can register their own datasets, prepare them, and associate them with the benchmark. A dataset will be prepared using the benchmark's Data Preparator MLCube and the dataset's metadata is registered within the MedPerf server.
++ |
---|
Data Preparation | +
The data provider can then request to participate in the benchmark with their dataset. Requesting the association will run the benchmark's reference workflow to ensure that the prepared dataset's structure is compatible with the workflow. Once the association request is approved by the Benchmark Committee, the dataset becomes a part of the benchmark.
+ +Once a benchmark is submitted by the Benchmark Committee, any user can submit their own Model MLCubes and request an association with the benchmark. This association request executes the benchmark locally with the given model on the benchmark's reference dataset to ensure workflow validity and compatibility. If the model successfully passes the compatibility test, and its association is approved by the Benchmark Committee, it becomes a part of the benchmark.
+ +The Benchmark Committee may notify Data Providers that models are available for benchmarking. Data Providers can then run the benchmark models locally on their data.
+This procedure retrieves the model MLCubes associated with the benchmark and runs them on the indicated prepared dataset to generate predictions. The Metrics MLCube of the benchmark is then retrieved to evaluate the predictions. Once the evaluation results are generated, the data provider can submit them to the platform.
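The execution procedure described above can be sketched in a few lines of Python. This is only an illustration of the flow (run each model on the prepared data, then score the predictions with the metrics step); it is not MedPerf's implementation. The names `run_benchmark`, `threshold_model`, `always_positive`, and `accuracy` are hypothetical stand-ins for the Model MLCubes and the Metrics MLCube, and the data values are made up.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a "model" maps prepared inputs to predictions,
# and a "metrics" step scores predictions against labels.
Model = Callable[[List[float]], List[int]]
Metrics = Callable[[List[int], List[int]], Dict[str, float]]


def accuracy(predictions: List[int], labels: List[int]) -> Dict[str, float]:
    """Toy metrics step: fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


def threshold_model(inputs: List[float]) -> List[int]:
    """Toy model: predicts 1 when the input exceeds 0.5."""
    return [1 if x > 0.5 else 0 for x in inputs]


def always_positive(inputs: List[float]) -> List[int]:
    """Toy model: always predicts 1."""
    return [1 for _ in inputs]


def run_benchmark(
    models: Dict[str, Model],
    prepared_data: List[float],
    labels: List[int],
    metrics: Metrics,
) -> Dict[str, Dict[str, float]]:
    """Run every model on the local prepared data and evaluate its predictions."""
    results = {}
    for name, model in models.items():
        predictions = model(prepared_data)  # generate predictions locally
        results[name] = metrics(predictions, labels)  # evaluate them
    return results


# Made-up local dataset; in this sketch it never leaves the data provider.
prepared_data = [0.2, 0.7, 0.9, 0.4]
labels = [0, 1, 1, 0]

results = run_benchmark(
    {"threshold_model": threshold_model, "always_positive": always_positive},
    prepared_data,
    labels,
    accuracy,
)
print(results)
```

In this sketch only the `results` dictionary would be submitted; the prepared data and labels stay local, which mirrors the privacy property the text describes.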
+ +The benchmarking platform aggregates the results of running the models against the datasets and shares them according to the Benchmark Committee's policy.
+The sharing policy controls how much of the data is shared, ranging from a single aggregated metric to a more detailed model-data cross product. A public leaderboard is available to Model Owners who produce the best performances.
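To make the two ends of the sharing range concrete, here is a minimal sketch, assuming results are stored as one score per (model, dataset) pair. The model names, site names, scores, and the functions `single_metric`, `per_model_mean`, and `cross_product` are all hypothetical; the actual aggregation used by the platform is not specified in this text.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Tuple

# Hypothetical results: one score per (model, dataset) pair.
Results = Dict[Tuple[str, str], float]

results: Results = {
    ("model_a", "site_1"): 0.9,
    ("model_a", "site_2"): 0.8,
    ("model_b", "site_1"): 0.7,
    ("model_b", "site_2"): 0.6,
}


def single_metric(results: Results) -> float:
    """Least detailed view: one number aggregated over all pairs."""
    return mean(results.values())


def per_model_mean(results: Results) -> Dict[str, float]:
    """Intermediate view: one aggregated score per model."""
    scores = defaultdict(list)
    for (model, _dataset), score in results.items():
        scores[model].append(score)
    return {model: mean(values) for model, values in scores.items()}


def cross_product(results: Results) -> Dict[str, Dict[str, float]]:
    """Most detailed view: the full model-by-dataset table."""
    table: Dict[str, Dict[str, float]] = defaultdict(dict)
    for (model, dataset), score in results.items():
        table[model][dataset] = score
    return dict(table)


print(single_metric(results))
print(per_model_mean(results))
print(cross_product(results))
```

A sharing policy could then be expressed as a choice among these views, with the single aggregated metric revealing the least about any individual dataset.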
+ + + + + + + + + + +