diff --git a/README.md b/README.md
index e9a9f4a22cad0..aa44ffa53e359 100644
--- a/README.md
+++ b/README.md
@@ -8,6 +8,7 @@
 + Quick Start
   - [TiDB Quick Start Guide](QUICKSTART.md)
   - [Basic SQL Statements](try-tidb.md)
+  - [Bikeshare Example Database](bikeshare-example-database.md)
 + TiDB User Guide
   + TiDB Server Administration
     - [The TiDB Server](sql/tidb-server.md)
diff --git a/bikeshare-example-database.md b/bikeshare-example-database.md
new file mode 100644
index 0000000000000..d690de9dfb854
--- /dev/null
+++ b/bikeshare-example-database.md
@@ -0,0 +1,67 @@
+---
+title: Bikeshare Example Database
+summary: Install the Bikeshare example database.
+category: user guide
+---
+
+# Bikeshare Example Database
+
+Examples used in the TiDB manual use [System Data](https://www.capitalbikeshare.com/system-data) from
+Capital Bikeshare, released under the [Capital Bikeshare Data License Agreement](https://www.capitalbikeshare.com/data-license-agreement).
+
+## Download all data files
+
+The system data is available [for download in .zip files](https://s3.amazonaws.com/capitalbikeshare-data/index.html) organized per year. Downloading and extracting all files requires approximately 3GB of disk space. To download all files for years 2010-2017 using a bash script:
+
+```bash
+mkdir -p bikeshare-data && cd bikeshare-data
+
+for YEAR in 2010 2011 2012 2013 2014 2015 2016 2017; do
+  wget https://s3.amazonaws.com/capitalbikeshare-data/${YEAR}-capitalbikeshare-tripdata.zip
+  unzip ${YEAR}-capitalbikeshare-tripdata.zip
+done;
+```
+
+## Load data into TiDB
+
+The system data can be imported into TiDB using the following schema:
+
+```sql
+CREATE DATABASE bikeshare;
+USE bikeshare;
+
+CREATE TABLE trips (
+  trip_id bigint NOT NULL PRIMARY KEY auto_increment,
+  duration integer not null,
+  start_date datetime,
+  end_date datetime,
+  start_station_number integer,
+  start_station varchar(255),
+  end_station_number integer,
+  end_station varchar(255),
+  bike_number varchar(255),
+  member_type varchar(255)
+);
+```
+
+You can import files individually using the example `LOAD DATA` command here, or import all files using the bash loop below:
+
+```sql
+LOAD DATA LOCAL INFILE '2017Q1-capitalbikeshare-tripdata.csv' INTO TABLE trips
+  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
+  LINES TERMINATED BY '\r\n'
+  IGNORE 1 LINES
+(duration, start_date, end_date, start_station_number, start_station,
+end_station_number, end_station, bike_number, member_type);
+```
+
+### Import all files
+
+To import all `*.csv` files into TiDB in a bash loop:
+
+```bash
+for FILE in `ls *.csv`; do
+  echo "== $FILE =="
+  mysql bikeshare -e "LOAD DATA LOCAL INFILE '${FILE}' INTO TABLE trips FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);"
+done;
+```
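
Note on the import loop in the patch above: it parses `ls` output and repeats the full `LOAD DATA` statement inline. One possible variant, shown here only as a sketch, builds the statement in a helper and iterates over a glob instead. The names `build_load_sql` and `import_all` and the `DRY_RUN` variable are hypothetical and not part of the patch; the SQL clauses and the `mysql bikeshare -e` invocation are copied from it. The sketch assumes file names contain no single quotes.

```shell
# Sketch only: build_load_sql, import_all and DRY_RUN are hypothetical names,
# not part of the patch. The SQL clauses mirror the LOAD DATA example above.

# Print the LOAD DATA statement for one CSV file.
# The heredoc keeps '\r\n' as literal backslash sequences, so the SQL parser
# (not the shell) interprets them.
build_load_sql() {
  local file
  file="$1"
  cat <<SQL
LOAD DATA LOCAL INFILE '${file}' INTO TABLE trips
  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES
(duration, start_date, end_date, start_station_number, start_station,
end_station_number, end_station, bike_number, member_type);
SQL
}

# Import every *.csv file in a directory (default: current directory).
# With DRY_RUN=1 the statements are printed instead of being sent to the server.
import_all() {
  local dir
  local file
  dir="${1:-.}"
  for file in "$dir"/*.csv; do
    [ -e "$file" ] || continue   # the glob matched nothing
    echo "== $file =="
    if [ "${DRY_RUN:-0}" = "1" ]; then
      build_load_sql "$file"
    else
      mysql bikeshare -e "$(build_load_sql "$file")"
    fi
  done
}
```

Running `DRY_RUN=1 import_all bikeshare-data` prints one statement per file without contacting the server, which makes it possible to review the generated SQL before running `import_all bikeshare-data` for real.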