Table of Contents

Purpose
Background
- Objective
Exercises
Lessons Learned

Purpose

This scenario demonstrates how to operationalize Jupyter notebooks using the Versatile Data Kit (VDK) Jupyter integration. By the end of this guide, you'll understand how to:

Create a data job with VDK within a Jupyter notebook.
Write a data workflow in a notebook and prepare it for a production environment.

Background

Objective:

All of the following steps are carried out within a Jupyter notebook:

Retrieve Data: Extract data from the specified URL using pandas.
Data Cleansing: Eliminate records associated with 'testuser'.
Score Classification: Assign scores to predefined categories for clarity.
Data Ingestion: Use VDK job_input to ingest the organized data.
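
The cleansing, classification, and ingestion steps can be sketched as follows. This is a minimal illustration, not the code from the tutorial-job directory. The column names (`user`, `score`), the score bands, the destination table name `scores`, and the `RecordingJobInput` stand-in are all assumptions made for this example. The stand-in mimics only the `send_tabular_data_for_ingestion` method of VDK's `job_input`, so the sketch runs without VDK installed. In the notebook, the data frame would be loaded from the tutorial's URL with pandas rather than built in memory.

```python
import pandas as pd

# Assumed column names and score bands; the tutorial's dataset may differ.
USER_COL = "user"
SCORE_COL = "score"
SCORE_BINS = [0, 50, 80, 100]
SCORE_LABELS = ["low", "medium", "high"]


def prepare_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Drop 'testuser' records and add a score category column."""
    cleaned = df[df[USER_COL] != "testuser"].copy()
    # pd.cut maps each score to the label of the band it falls into.
    cleaned["score_category"] = pd.cut(
        cleaned[SCORE_COL],
        bins=SCORE_BINS,
        labels=SCORE_LABELS,
        include_lowest=True,
    ).astype(str)
    return cleaned.reset_index(drop=True)


def ingest(job_input, df: pd.DataFrame, destination_table: str) -> None:
    """Send the prepared rows through a job_input-like object."""
    job_input.send_tabular_data_for_ingestion(
        rows=df.values.tolist(),
        column_names=df.columns.tolist(),
        destination_table=destination_table,
    )


class RecordingJobInput:
    """Hypothetical stand-in for VDK's job_input; it only records calls."""

    def __init__(self):
        self.calls = []

    def send_tabular_data_for_ingestion(
        self, rows, column_names, destination_table=None
    ):
        self.calls.append((rows, column_names, destination_table))


# Sample data standing in for the frame retrieved from the tutorial's URL.
raw = pd.DataFrame(
    {"user": ["alice", "testuser", "bob", "carol"], "score": [30, 99, 65, 95]}
)
prepared = prepare_scores(raw)

fake_job_input = RecordingJobInput()
ingest(fake_job_input, prepared, "scores")
```

Keeping the transformation in a plain function that takes and returns a data frame makes it easy to check in a notebook cell before any data is ingested. Inside a VDK notebook, the real `job_input` object would be passed to `ingest` in place of the stand-in.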

Versatile Data Kit (VDK)

For detailed instructions on working with VDK, please refer to the guide from the provided link.

Exercises

The tutorial-job directory contains the ready-to-use code from this demo. Explore it for hands-on experience with the objectives and the VDK Jupyter integration discussed in this guide. Open MyBinder to get started on the exercises!

If the link does not work, try this one:

Lessons Learned

Throughout this scenario, you've:

Explored the capabilities of the VDK Jupyter integration.
Retrieved, cleaned, and processed data using Jupyter and VDK tools.
Classified scores into meaningful categories.
Understood the process of ingesting data through VDK within a Jupyter environment.

Congratulations!

> Go back to the main page of the Tutorial.

Apache-2.0 license