stanford-futuredata/willump-dfs

willump-dfs

This repository enables replication of the Purchase benchmark results in Figures 6 and 8 of the Willump paper. Willump's Purchase benchmark is adapted from the predict-next-purchase Featuretools benchmark, located here.

This benchmark requires Python 3 and was tested with Python 3.6.8.

First, install the Featuretools library and add the willump-dfs root folder to your PYTHONPATH.

Then download the Purchase dataset from here and unzip it into the tests/test_resources/predict_next_purchase_resources/data_huge folder. This dataset was derived from the Instacart dataset using scripts in the original predict-next-purchase benchmark.
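
Because training takes hours, it can be worth confirming the setup before starting. The following is a minimal sketch of such a pre-flight check, not part of this repository: the function name `check_setup` and its messages are hypothetical, and it only assumes the data folder path given above.

```python
import importlib.util
import sys
from pathlib import Path

# Data folder named in the setup instructions, relative to the willump-dfs root.
DATA_DIR = Path("tests/test_resources/predict_next_purchase_resources/data_huge")


def check_setup(repo_root):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper for illustration only.
    """
    root = Path(repo_root).resolve()
    problems = []

    # Featuretools must be importable by the benchmark scripts.
    if importlib.util.find_spec("featuretools") is None:
        problems.append("featuretools is not importable")

    # The repository root must be on the import path (for example via PYTHONPATH).
    if not any(Path(p).resolve() == root for p in sys.path if p):
        problems.append("repository root is not on PYTHONPATH")

    # The unzipped dataset must be present in the expected folder.
    data_dir = root / DATA_DIR
    if not data_dir.is_dir() or not any(data_dir.iterdir()):
        problems.append("dataset folder is missing or empty: " + str(data_dir))

    return problems


if __name__ == "__main__":
    for problem in check_setup("."):
        print("WARNING:", problem)
```

Run it from the willump-dfs root folder; no output means the three checks passed.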

To replicate results in Figure 6, run:

python3 tests/benchmark_scripts/purchase_train.py -d huge
python3 tests/benchmark_scripts/purchase_batch.py -d huge
python3 tests/benchmark_scripts/purchase_batch.py -d huge -c

The first command trains a model (note that this takes around three hours), the second executes it natively, and the third executes it with Willump's cascades optimization enabled. The throughput reported by the third command should be significantly higher than that reported by the second.
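
To compare the two runs, divide the optimized throughput by the native throughput. The sketch below assumes you copy the two throughput figures from the script output by hand; the `speedup` helper is hypothetical and does not parse the benchmark output, whose exact format is not described here.

```python
import sys


def speedup(native_throughput, optimized_throughput):
    """Return how many times faster the optimized run is than the native run."""
    if native_throughput <= 0 or optimized_throughput <= 0:
        raise ValueError("throughputs must be positive")
    return optimized_throughput / native_throughput


if __name__ == "__main__":
    # Usage: python3 speedup.py <native_throughput> <optimized_throughput>
    native, optimized = float(sys.argv[1]), float(sys.argv[2])
    print("speedup: {:.2f}x".format(speedup(native, optimized)))
```

A value above 1 means the optimized run was faster.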

To replicate results in Figure 8, run:

python3 tests/benchmark_scripts/purchase_train.py -d huge -k 100
python3 tests/benchmark_scripts/purchase_batch.py -d huge -k 100
python3 tests/benchmark_scripts/purchase_batch.py -d huge -k 100 -c

The first command trains a model (note that this takes around three hours), the second executes it natively, and the third executes it with Willump's top-k approximation optimization enabled. The throughput reported by the third command should be significantly higher than that reported by the second.
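
The Figure 6 and Figure 8 command sequences differ only in the `-k 100` flag, so both can be driven from one small script. This is a sketch under that assumption: `build_commands` and `run_all` are hypothetical names, and the script uses only the paths and flags shown above.

```python
import subprocess
import sys

SCRIPTS = "tests/benchmark_scripts"


def build_commands(k=None):
    """Return the train, native, and optimized command lines.

    With k=None this reproduces the Figure 6 commands; with k=100,
    the Figure 8 commands.
    """
    extra = [] if k is None else ["-k", str(k)]
    train = ["python3", SCRIPTS + "/purchase_train.py", "-d", "huge"] + extra
    batch = ["python3", SCRIPTS + "/purchase_batch.py", "-d", "huge"] + extra
    return [train, batch, batch + ["-c"]]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Usage: python3 run_purchase.py        (Figure 6)
    #        python3 run_purchase.py 100    (Figure 8)
    k = int(sys.argv[1]) if len(sys.argv) > 1 else None
    run_all(build_commands(k))
```

Stopping on the first failure avoids benchmarking against a model that did not finish training.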

About

Applying Willump design to deep feature synthesis
