User_Building a Competition Bundle
WARNING: This page is based on an earlier version of the Compute Pi example, but it is still useful to understand the basics of building a competition bundle. Other examples can be studied.
You build a competition in CodaLab by uploading a Competition Bundle. A Competition Bundle is a zip archive which contains a YAML file, competition.yaml, that describes the competition. Other assets can be included in the zip archive, but they won't be used unless they are referenced in the competition.yaml file.
Here are the contents of the example competition.zip bundle:
competition.zip
|- competition.yaml
|- data.html
|- evaluation.html
|- logo.jpg
|- overview.html
|- program.zip
|- reference.zip
|- terms_and_conditions.html
The rest of this page walks through an example competition.yaml. If you are not familiar with how competitions look in CodaLab, you might browse a few existing competitions before reading the configuration, to get a basic understanding of the components of a competition.
This competition is a contrived example for illustrative purposes. Assume the competition goal is to compute the value of pi (3.14...). Each participant submits a single float value, and the submission that contains the closest value of pi is the winner.
Here's an annotated competition.yaml, to explain the various configuration elements.
# Build an example competition
---
# The title of the competition
title: Example Competition
# A description of the competition
description: This is a competition to test the competition bundle system. It should be able to create a competition from this bundle. The goal is to compute the closest value of pi possible.
# A logo/image for the competition
image: logo.jpg
# Does this competition require participant approval by the organizer
has_registration: True
# When is this competition finished. It is valid to not include an end_date, which means the competition remains open perpetually.
end_date: 2013-12-31
# You can specify admins here, use their codalab username (CaSe sensitive!) they will automatically be added as participants
admin_names: tony,eric
# Each competition has a set of html pages for potential participants to read and review and for participants to use to interact with the competition. These are the specifications for those pages.
html:
    # Basic overview (first impression) of the challenge
    overview: overview.html
    # What are the metrics being used for this challenge, how is it being scored.
    evaluation: evaluation.html
    # Terms of participation, including data licensing, results submission, et al
    terms: terms_and_conditions.html
    # Where to find the data, how to download it.
    data: data.html
    # An extra page
    page_name: extra.html
# Competitions are broken up into phases. Every competition has at least one phase, some have multiple phases.
phases:
    # Phase 1
    1:
        # Phase number for ordering
        phasenumber: 1
        # Label or name of this phase
        label: "Training"
        # When this phase starts - this is the first date participants can download the data and submit results
        start_date: 2013-06-30
        # Maximum number of submissions per participant
        max_submissions: 100
        # A bundle containing the program used to evaluate results.
        scoring_program: program.zip
        # A bundle containing reference data to compare submitted data with for scoring.
        reference_data: reference.zip
        # You can select from these colors:
        # white, orange, yellow, green, blue, purple
        color: orange
        # Maximum execution time of the submission (in seconds), default = 300
        execution_time_limit: 300
        # Maximum number of submissions a user can make in a day. default = unlimited
        max_submissions_per_day: 15
        # The datasets used for this phase, all references are URLs to externally stored data
        datasets:
            # The first data set
            1:
                # Uniquely :) named
                name: Data 1
                # A url to the data
                url: http://spreadsheets.google.com/pub?key=pyj6tScZqmEfbZyl0qjbiRQ&output=xls
                # A brief description to indicate the contents of the data for users
                description: Example Dataset
            # A second data set, there can be any number
            2:
                # Again uniquely named so users can tell what it is
                name: Data 2
                # URL to the actual data
                url: http://spreadsheets.google.com/pub?key=0AgogXXPMARyldGJqTDRfNHBWODJMRWlZaVhNclhNZXc&output=xls
                # Brief description
                description: Example Dataset
    # Phase 2, the actual competition (in this case)
    2:
        # The second phase.
        phasenumber: 2
        # Phase name/label
        label: "Challenge"
        # When does this phase begin
        start_date: 2013-09-30
        # Maximum submissions this phase
        max_submissions: 3
        # Scoring program for this phase (the same as the previous phase)
        scoring_program: program.zip
        # The reference data for scoring, this could/should/would be different this phase
        reference_data: reference.zip
        # Data sets
        datasets:
            # Dataset #1
            1:
                # Data set name
                name: Challenge Data
                # URL for the dataset
                url: http://spreadsheets.google.com/pub?key=t9GL1nIZdtxszJbjKErN2Hg&output=xls
                # Data set description
                description: Example challenge data
# Leaderboard / Scoreboard configuration
leaderboard:
    # Collections of scores, ways to slice multi-dimensional scores into "groups"
    # This leaderboard has one result, the difference (difference of the submitted number from Pi)
    leaderboards:
        # The internal key name for the overall results group
        RESULTS: &RESULTS
            # Label for this group
            label: Results
            # Ordering of the groups, starts at 1
            rank: 1
    # Actual scores in the leaderboard
    columns:
        # The internal key for this score
        DIFFERENCE:
            # This is a member of the results group
            leaderboard: *RESULTS
            # The column label for this score
            label: Difference
            # Order of the scores
            rank: 1
            # Sorting ascending instead of descending
            sort: asc
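Since assets are only used when competition.yaml references them, it can help to check before uploading that every referenced file is actually in the bundle. The following is a minimal sketch and not part of CodaLab: the function name find_missing_assets is hypothetical, and the regular expression is a heuristic that treats any "key: value" line whose value ends in .html, .zip, .jpg, or .png as a file reference, rather than parsing the YAML properly.

```python
import re
import zipfile

# Heuristic (an assumption, not CodaLab's own parser): a "key: value" line
# whose value ends in a known asset extension is treated as a file reference.
ASSET_LINE = re.compile(
    r"^[ \t]*[\w-]+:[ \t]*(\S+\.(?:html|zip|jpg|png))[ \t]*$",
    re.MULTILINE,
)


def find_missing_assets(bundle_path):
    """Return referenced asset names that are absent from the bundle zip."""
    with zipfile.ZipFile(bundle_path) as bundle:
        names = set(bundle.namelist())
        yaml_text = bundle.read("competition.yaml").decode("utf-8")
    referenced = set(ASSET_LINE.findall(yaml_text))
    return sorted(referenced - names)
```

Calling find_missing_assets("competition.zip") returns a sorted list of file names that the configuration mentions but the archive does not contain; an empty list means every reference matched by the heuristic was found.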
To make this example complete it's important to understand how to build and package the program.zip and reference.zip referred to in the competition.yaml.
The program.zip bundle contains the program that compares the user's submission with the reference data (in the reference.zip bundle) to score the submission. In this case the reference data contains the value of pi, and the program in program.zip computes the absolute difference between the submitted value and the reference value.
Here are the contents of the reference.zip file:
reference.zip
|- answer.txt (Contains: 3.14159265359)
|- metadata (Contains: This is the authoritative result.)
Here are the contents of the program.zip file:
program.zip
|- evaluate.py (The actual evaluation code to run)
|- metadata (Syntax and information needed to run)
|- readme.txt (Contains notes about the evaluation program)
|- setup.py (Enables py2exe to build a Windows executable of the evaluate.py script)
|- Supporting modules and libraries (if required).
The program.zip metadata file contains command syntax to use, along with a short description:
command: python $program/evaluate.py $input $output
description: Example competition evaluation program.
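A minimal sketch of what evaluate.py could look like for this example follows. It is not the original script. It assumes that the reference data is available under $input/ref and the submission under $input/res, and that scores are written as "key: value" lines to a scores.txt file in $output. None of these details are stated on this page, so verify them against the CodaLab documentation. The submission file name answer.txt and the score key Difference are likewise assumptions.

```python
import os
import sys


def read_float(path):
    """Read a single float value from a text file."""
    with open(path) as handle:
        return float(handle.read().strip())


def evaluate(input_dir, output_dir):
    """Write the absolute difference between submission and reference."""
    # Assumed layout: reference data in "ref", participant submission in "res".
    reference = read_float(os.path.join(input_dir, "ref", "answer.txt"))
    # "answer.txt" as the submission file name is a hypothetical choice.
    submitted = read_float(os.path.join(input_dir, "res", "answer.txt"))
    difference = abs(reference - submitted)

    os.makedirs(output_dir, exist_ok=True)
    # Assumed output format: one "key: value" line per leaderboard column.
    with open(os.path.join(output_dir, "scores.txt"), "w") as handle:
        handle.write("Difference: %f\n" % difference)
    return difference


if __name__ == "__main__" and len(sys.argv) == 3:
    # Matches the metadata command: python $program/evaluate.py $input $output
    evaluate(sys.argv[1], sys.argv[2])
```

Because the leaderboard column is configured with sort: asc, a smaller difference ranks higher.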
When you upload a competition, the reference_data, scoring_program, and input_data are all turned into datasets automatically. You can re-use or share these via the secret key!
If you want to share your competition before publishing, you can give users the "Secret Key" url underneath the title of your competition on its main page.
When zipping the competition bundle, make sure no extra directories are created within the zip. For instance, try using something like: zip -j name_of_zip_file.zip file_to_be_zipped. The -j flag stores only the file names and discards their directory paths, so no extra directories are created within the zip.
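The same flat layout can be produced from Python with the standard zipfile module. This is only a sketch; the function name build_flat_bundle is hypothetical.

```python
import os
import zipfile


def build_flat_bundle(zip_path, file_paths):
    """Zip the given files with their directory paths stripped, like zip -j."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in file_paths:
            # arcname sets the name stored in the archive;
            # basename drops any leading directories.
            bundle.write(path, arcname=os.path.basename(path))
```

For example, build_flat_bundle("competition.zip", ["build/competition.yaml", "build/logo.jpg"]) stores both files at the top level of the archive. Two input files with the same base name would collide, so keep asset names unique.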