Dota-350k

Dota-350k is a dataset comprising all ranked Dota 2 matches played on 16 April 2020 (patch version 7.25c).

You can find out more about the dataset in our paper:

A. Dallmann, J. Kohlmann, D. Zoller and A. Hotho, "Sequential Item Recommendation in the MOBA Game Dota 2," 2021 International Conference on Data Mining Workshops (ICDMW), 2021, pp. 10-17, doi: 10.1109/ICDMW53433.2021.00009.

If you use the dataset, please cite the paper above.

Download

Raw Datasets

These packages contain the unprocessed raw datasets:

Processed Datasets

These packages contain the processed datasets and splits:

Dataset Processing

The repository contains code for processing the raw Dota-350k dataset. It also contains a script that converts the OpenDota dataset from Kaggle so that it can be processed with the same scripts.

Setup

This project uses Poetry to manage the build process.

Executing:

poetry shell

prompts Poetry to create and activate a virtual environment with the necessary dependencies, which you can then use to execute the scripts.

Convert OpenDota

Download and unpack the OpenDota dataset from Kaggle. Then execute:

python -m dataset_processing.opendota.convert <unpacked-opendota-directory> <dataset-file> --file-type csv

Process OpenDota / Dota-350k

Processing is done in multiple steps. To speed up the computation, we use Spark to process the data and Parquet as the input file format. If you only have the CSV file, convert it to Parquet first:

python -m dataset_processing.dota.prepare_dataset to-parquet <input-file-csv> <output-directory-parquet>
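For the OpenDota route, the conversion to CSV and the CSV-to-Parquet step can be chained. The following is a minimal sketch that only assembles the two command lines shown above; the helper name `conversion_commands` and all paths are hypothetical, and it assumes the commands are run from the same working directory and Poetry environment as in the manual steps.

```python
import sys
from pathlib import Path


def conversion_commands(opendota_dir, csv_file, parquet_dir):
    """Return the argv lists for OpenDota -> CSV and CSV -> Parquet.

    Hypothetical helper: it mirrors the two commands from this README
    and does not call into the repository's Python code directly.
    """
    python = sys.executable  # interpreter of the currently active environment
    convert = [
        python, "-m", "dataset_processing.opendota.convert",
        str(opendota_dir), str(csv_file), "--file-type", "csv",
    ]
    to_parquet = [
        python, "-m", "dataset_processing.dota.prepare_dataset", "to-parquet",
        str(csv_file), str(parquet_dir),
    ]
    return [convert, to_parquet]


if __name__ == "__main__":
    # Placeholder paths, chosen only for illustration.
    commands = conversion_commands(
        Path("opendota"), Path("opendota.csv"), Path("dota-350k-raw")
    )
    for argv in commands:
        # Print instead of executing; pass argv to subprocess.run(argv, check=True)
        # to actually run a step.
        print(" ".join(argv))
```

Building the argument lists separately from executing them makes it easy to inspect the commands before starting a long-running conversion.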

Then you can start processing the datasets:

RAW_PATH=dota-350k-raw   # /path/that/contains/raw/dataset
DATASET_PATH=dota-350k   # /path/where/data/is/written

# Dota-350k
python -m dataset_processing.dota.add_metadata $RAW_PATH/dota.parquet $DATASET_PATH ../resources/roles.json ../resources/heros.json
# OpenDota
#python -m dataset_processing.dota.add_metadata --opendota $RAW_PATH/dota.parquet $DATASET_PATH ../resources/roles.json ../resources/heros.json

python -m dataset_processing.dota.prepare_dataset process --items-file ../resources/items-7.25c.json $DATASET_PATH/dota.parquet $DATASET_PATH
python -m dataset_processing.dota.prepare_dataset split --training-ratio 0.94 --validation-ratio 0.01 --test-ratio 0.05 $DATASET_PATH/dota-processed.parquet $DATASET_PATH
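To get a feel for what the split ratios in the last command mean in absolute numbers, here is a small arithmetic sketch. The function `split_sizes` is hypothetical and is not part of the repository; the match count of 350,000 is a round placeholder rather than the exact dataset size, and the actual `split` step may assign matches to the splits differently (for example with different rounding).

```python
def split_sizes(n_matches, training=0.94, validation=0.01, test=0.05):
    """Approximate number of matches per split for the given ratios.

    Illustration only: the ratios default to the values used in the
    README command, and the rounding scheme is an assumption.
    """
    total = training + validation + test
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"ratios must sum to 1, got {total}")
    n_validation = round(n_matches * validation)
    n_test = round(n_matches * test)
    # Give the remainder to training so the three parts always add up.
    n_training = n_matches - n_validation - n_test
    return n_training, n_validation, n_test


if __name__ == "__main__":
    # 350_000 is a placeholder count, not the exact size of Dota-350k.
    print(split_sizes(350_000))  # (329000, 3500, 17500)
```

With these ratios, almost all matches go to training, while the validation split stays small.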
