The dataset consists of a set of detected targets of people walking through the Informatics Forum, the main building of the School of Informatics at the University of Edinburgh. The data covers several months of observation which has resulted in about 1000 observed trajectories each working day. By July 4, 2010, there were 27+ million target detections, of which an estimated 7.9 million were real targets, resulting in 92,000+ observed trajectories.
A view of the scene and image data from which the detected targets are found is:
The main entry/exit points (marked) are at the bottom left (front door), top left (cafe), top center (stairs), top right (elevator and night exit) and bottom right (labs). There may be some false detections (noise, shadows, reflections). Normally, only about 30% of the captured frames contain a target, and there are usually only a few targets in each frame (1 target in 46% of active frames, 2 in 25%, 3 in 14%, 4 in 8%, 5 in 4%, and 6-14 in 3%). Occasionally there are events in the Forum that result in tens or hundreds of detected targets, which makes tracking rather difficult. Also, fixed furniture was sometimes moved into the field of view, which resulted in a constant detection of the furniture in every frame. This accounts for several days (Jul 30, Aug 13) where the file sizes are much larger than usual.
The camera is fixed overhead (although it might drift and vibrate a little over time) approximately 23m above the floor. The distance between the 9 white dots on the floor is 297 cm vertically and 485 cm horizontally. The images are 640x480, where each pixel (horizontally and vertically) corresponds to 24.7 mm on the ground. The capture rate is about 9 frames per second depending on the local ethernet and capture host machine loads. Unfortunately, the sample rate can vary over short periods. Sometimes the capture program crashed, so some capture files may not cover all of a day. Since each captured frame is relatively independent of captured frames more than 10-20 seconds later, this should not make a difference.
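The ground-plane scale quoted above makes it straightforward to convert between pixels and metres. The following is a minimal sketch that assumes the 24.7 mm per pixel figure holds uniformly across the image and ignores lens distortion and camera drift; the function names are illustrative and not part of the dataset tools.

```python
# Scale taken from the description above: one pixel covers 24.7 mm of floor.
MM_PER_PIXEL = 24.7


def pixels_to_metres(x_px, y_px):
    """Convert an image position (in pixels) to ground-plane metres."""
    return x_px * MM_PER_PIXEL / 1000.0, y_px * MM_PER_PIXEL / 1000.0


def metres_to_pixels(distance_m):
    """Convert a ground-plane distance in metres to a length in pixels."""
    return distance_m * 1000.0 / MM_PER_PIXEL


# Floor area covered by a full 640x480 image, in metres.
scene_width_m, scene_height_m = pixels_to_metres(640, 480)

# Expected spacing of the white floor dots in pixels (297 cm and 485 cm).
dot_spacing_vertical_px = metres_to_pixels(2.97)
dot_spacing_horizontal_px = metres_to_pixels(4.85)
```

Under these assumptions the image covers roughly 15.8 m by 11.9 m of floor, and the white dots should appear about 120 pixels apart vertically and 196 pixels apart horizontally, which offers a quick sanity check on the scale.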
The dataset does not consist of the raw images (although a short set of frames of 1 person is here). It contains a summary of each detected target in each image, namely:
- the description of a bounding box for the target, and
- an RGB histogram that summarises the target pixels.
Tracked target files: These files contain sets of detections that have been tracked together into a single target's trajectory. Tracker files start with "% Total number of trajectories in file are [Number]", where Number is the number of trajectories. The files contain the information in the form of a Matlab structure. The trajectory points and the properties are stored in two different variables with the same identifier. Each trajectory has a different identifier, such as "R1" for trajectory number 1, "R2" for trajectory number 2, and so on.
The first variable is
Properties.{Identifier}= [ Number_of_Points_in_trajectory, Start_time, End_Time,
                           Average_Size_of_Target, Average_Width, Average_height,
                           Average_Histogram ];
The second variable is
TRACK.{Identifier}= [[ centre_X(1) Centre_Y(1) Time(1)];
                     [ centre_X(2) Centre_Y(2) Time(2)]
                     .......... and so on
                     .......... until ........
                     [ centre_X(end) Centre_Y(end) Time(end) ]];
- The size of tracked files is about 1MB each.
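The tracker file layout described above can be read without Matlab. The following is a minimal sketch of a parser for the TRACK variables only. It assumes the files follow the layout exactly as described; the sample text is a hypothetical miniature file (the histogram layout in the Properties lines is a guess and is ignored by the parser), and `parse_tracks` and `declared_count` are illustrative names, not part of the dataset tools or of OpenTraj.

```python
import re

# Hypothetical miniature tracker file written to match the described layout.
SAMPLE = """% Total number of trajectories in file are 2
Properties.R1= [ 3 100 102 620.0 22.0 29.0 [ 1 2 3 ] ];
TRACK.R1= [[ 305 54 100];
[ 306 55 101]
[ 307 57 102 ]];
Properties.R2= [ 2 200 201 500.0 20.0 25.0 [ 4 5 6 ] ];
TRACK.R2= [[ 10 20 200];[ 11 21 201]];
"""

# One TRACK entry runs from "TRACK.<id>= [" to the closing "]];".
TRACK_RE = re.compile(r"TRACK\.(\w+)\s*=\s*\[(.*?)\]\s*\]\s*;", re.DOTALL)
COUNT_RE = re.compile(r"Total number of trajectories in file are\s+(\d+)")


def declared_count(text):
    """Return the trajectory count stated in the header, or None if absent."""
    match = COUNT_RE.search(text)
    return int(match.group(1)) if match else None


def parse_tracks(text):
    """Map each identifier (e.g. 'R1') to a list of (x, y, time) tuples."""
    tracks = {}
    for identifier, body in TRACK_RE.findall(text):
        # Drop brackets and separators, leaving whitespace-separated numbers.
        values = [float(tok) for tok in re.sub(r"[\[\];,]", " ", body).split()]
        if len(values) % 3 != 0:
            raise ValueError("TRACK.%s does not contain x, y, time triples" % identifier)
        tracks[identifier] = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return tracks


tracks = parse_tracks(SAMPLE)
```

Comparing `len(tracks)` with `declared_count(text)` is a cheap way to notice files that were truncated when the capture program crashed.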
Tracked spline files: These files contain a 6-point spline description of each tracked trajectory. For every trajectory produced by the tracker, the spline file stores the average error of the spline fit and the control points, under the same identifier as in the tracker file. The first line of a spline file is "% Total number of trajectories in file are [Number]", where Number is the number of trajectories. The file also states "X and Y are normalized by dividing 640 and 460 respectively" and "Image size is 640*460". Normalization is done because the spline fit works for variables in the range [0,1], so the values of the trajectory points were transformed to fall into [0,1]. The file contains the information in the form of a Matlab structure.
The deviation is stored as
Deviation.{Identifier}= [ Standard deviation ];
This is the average distance between the tracked point and the closest point on the spline. The control points are stored as
Controlpoints.{Identifier}= [[Controlpoint_x1 Controlpoint_y1]; [Controlpoint_x2 Controlpoint_y2] ........ and so on until six points ]];
- The size of spline files is about 80KB each.
The splines were fit based on a temporal parameterisation, so regions with more detections get more control points. This has the side effect that trajectories where people stand still for long periods of time are not represented accurately. People using the data might also consider investigating a spatial parameterisation, whereby control points are spaced uniformly along the spatial trajectory.
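The normalisation and the suggested spatial parameterisation can be illustrated with a short sketch. It is not the spline fitting code used to produce the files: it only scales points by the 640 and 460 divisors quoted above and resamples a trajectory at equal distances along its path, which is one possible preprocessing step before fitting. All function names are illustrative.

```python
import math

# Divisors quoted in the spline file description.
IMAGE_WIDTH, IMAGE_HEIGHT = 640, 460


def normalise(points, width=IMAGE_WIDTH, height=IMAGE_HEIGHT):
    """Scale pixel coordinates into the [0, 1] range."""
    return [(x / width, y / height) for x, y in points]


def denormalise(points, width=IMAGE_WIDTH, height=IMAGE_HEIGHT):
    """Scale [0, 1] coordinates (e.g. control points) back to pixels."""
    return [(x * width, y * height) for x, y in points]


def resample_by_arc_length(points, n):
    """Return n points spaced uniformly along the polyline through `points`."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative path length at each input point.
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cumulative[-1]
    if total == 0.0:
        return [points[0]] * n  # the target never moved
    resampled = []
    segment = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target distance.
        while segment < len(cumulative) - 2 and cumulative[segment + 1] < target:
            segment += 1
        length = cumulative[segment + 1] - cumulative[segment]
        t = 0.0 if length == 0.0 else (target - cumulative[segment]) / length
        (x0, y0), (x1, y1) = points[segment], points[segment + 1]
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled


# A person who stands still for three detections and then walks 10 pixels.
standing_then_walking = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (10.0, 0.0)]
uniform = resample_by_arc_length(standing_then_walking, 6)
```

In this toy trajectory the repeated detections at the origin no longer dominate: the six resampled points are spread evenly over the 10 pixels actually walked, whereas a temporal parameterisation would concentrate on the standing period.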
The data files can be downloaded by clicking on a file and then unzipping them.
N/A - tracking is not available on that particular day, usually because there was some event happening so there were a lot of people standing around rather than walking on a focused trajectory.
Programs to do the detection, tracking and spline fitting and abnormal behaviour detection can be downloaded from here:
This data collection was initiated by Barbara Majecka as part of her MSc project. Please cite this dissertation if you use the data in a publication: B. Majecka, "Statistical models of pedestrian behaviour in the Forum", MSc Dissertation, School of Informatics, University of Edinburgh, 2009. The spline fitting code was developed by Rowland Sillito. Improvements to the tracking were made by Gurkirt Singh as part of a summer internship. This resulted in the Tracks and Splines datasets. You can read a report of his work here.
To load the dataset, we provide loader_edinburgh.py:
import os
from toolkit.loaders.loader_edinburgh import load_edinburgh

# fixme: replace OPENTRAJ_ROOT with the address to root folder of OpenTraj
opentraj_root = 'OPENTRAJ_ROOT'
selected_day = '01Sep'
edinburgh_path = os.path.join(opentraj_root, 'datasets/Edinburgh/annotations', 'tracks.%s.txt' % selected_day)
traj_dataset = load_edinburgh(edinburgh_path, title="Edinburgh",
                              use_kalman=False, scene_id=selected_day, sampling_rate=4)  # original framerate=9