The idea is to rebuild the AWS SageMaker Python SDK in R, using R6 classes with paws behind the scenes.
You can install the development version of sagemaker from GitHub with:
# install.packages("remotes")
remotes::install_github("DyfanJones/sagemaker-r-sdk")
This repo is constantly under development and is not currently stable. sagemaker is currently aligning its API with sagemaker v2; apologies for any breaking changes this causes.
This package aims to mimic the API of Python's AWS SageMaker SDK, but using R6 and paws.
sagemaker is a metadata package that contains all methods to interact with Amazon SageMaker.
- sagemaker.core: Contains core components of the SDK, for example the Session R6 class
- sagemaker.common: Contains common components used throughout the sagemaker SDK
- sagemaker.mlcore: Contains core components for machine learning (ML) and Amazon-developed ML.
- sagemaker.mlframework: Contains ML frameworks developed for Amazon SageMaker, e.g. SKLearn
- sagemaker.workflow: Contains sagemaker pipeline and workflows
- sagemaker.debugger: Contains debugging methods (https://github.com/awslabs/sagemaker-debugger-rulesconfig)
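As a rough illustration of the "R6 classes with paws behind the scenes" approach, the sketch below wraps a client object in an R6 class and delegates a call to it. DemoSession and fake_client are hypothetical names made up for this example and are not part of sagemaker; the stand-in client only imitates the shape of a paws list_training_jobs response so that the code runs without AWS credentials.

```r
library(R6)

# Hypothetical class for illustration only; not the package's Session class.
DemoSession <- R6Class(
  "DemoSession",
  public = list(
    # The wrapped client; a real one could be created with paws::sagemaker()
    client = NULL,

    initialize = function(client) {
      self$client <- client
    },

    # Delegate to the client and return the job names as a character vector
    list_job_names = function() {
      resp <- self$client$list_training_jobs()
      vapply(
        resp$TrainingJobSummaries,
        function(x) x$TrainingJobName,
        character(1)
      )
    }
  )
)

# Stand-in client (an assumption for this sketch) so nothing touches AWS
fake_client <- list(
  list_training_jobs = function() {
    list(TrainingJobSummaries = list(
      list(TrainingJobName = "job-a"),
      list(TrainingJobName = "job-b")
    ))
  }
)

session <- DemoSession$new(client = fake_client)
job_names <- session$list_job_names()
print(job_names)
```

Passing the client in through the constructor keeps the class testable offline; swapping fake_client for a real paws client would require valid AWS credentials and a configured region.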
sagemaker is designed to mimic Python's sagemaker SDK. Therefore, all examples written for Python's sagemaker should be accessible from this package.
- Targeted Direct Marketing predicts potential customers that are most likely to convert based on customer and aggregate level metrics, using Amazon SageMaker’s implementation of XGBoost.
- XGBoost Tuning shows how to use SageMaker hyperparameter tuning to improve your model fits for the Targeted Direct Marketing task.
- BlazingText Word2Vec generates Word2Vec embeddings from a cleaned text dump of Wikipedia articles using SageMaker’s fast and scalable BlazingText implementation.
- R Multivariate Adaptive Regression Splines example over iris data.frame
Note: If a feature hasn’t been implemented yet, please feel free to raise a pull request or a ticket.
To keep the package within the CRAN size limit of 5MB, sagemaker currently uses a separate repository (sagemaker-r-test-data) to store R variants of the test data held in sagemaker-python-sdk. sagemaker-r-test-data will only contain data that can’t be read into R natively, such as Python pickle files. All other test data is read directly from sagemaker-python-sdk.