CodeBook for run_Analysis.R

The assignment requires us to take the data from the UCI source and perform 5 steps towards getting clean data:

1. Merges the training and the test sets to create one data set.
2. Extracts only the measurements on the mean and standard deviation for each measurement.
3. Uses descriptive activity names to name the activities in the data set.
4. Appropriately labels the data set with descriptive activity names.
5. Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

The script walks through all of the above steps:
Downloads the data from the specified site
Reads the data into local variables using read.table()
Combines the test and train data into a single data set using rbind
For step 4, changes the names of the X columns using gsub to make them more meaningful
Adds README.txt from the original data set for reference
Extracts only the specified columns, which reduces the column count from 561 to 66
Uses the aggregate function to find the mean of every column in the reduced set per person, per activity
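
The combine and average steps can be sketched on toy data. This is a minimal illustration, not the script itself: the data frames train and test, their values, and the column names subject and activity are assumptions made up for the example.

```r
# Toy stand-ins for the train and test tables (hypothetical values)
train <- data.frame(subject = c(1, 1, 2),
                    activity = c("WALKING", "WALKING", "SITTING"),
                    tBodyAcc.Mean.X = c(0.2, 0.4, 0.1))
test <- data.frame(subject = c(2, 3),
                   activity = c("SITTING", "WALKING"),
                   tBodyAcc.Mean.X = c(0.3, 0.5))

# Stack the rows of both sets into a single data set
combined <- rbind(train, test)

# Average every measurement column per subject and per activity
tidy <- aggregate(. ~ subject + activity, data = combined, FUN = mean)
print(tidy)
```

The formula `. ~ subject + activity` tells aggregate to treat every remaining column as a measurement, so the same call should work unchanged when there are 66 measurement columns instead of one.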

For Step 2, 33 columns were chosen for the mean measurements and 33 columns for the standard deviation measurements
MeanFrequency and Angle measurements were not included in the above set. Only mean() and std() for all measurements were considered
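
One way to make this selection is a regular expression that requires the literal "()" right after "mean" or "std". The sketch below uses a handful of feature names in the style of the original features.txt. The vector features and the exact pattern are assumptions for illustration and may differ from what the script does.

```r
# A few feature names in the style of the original data set
features <- c("tBodyAcc-mean()-X",
              "tBodyAcc-std()-X",
              "tBodyAcc-meanFreq()-X",
              "angle(tBodyAccMean,gravity)",
              "tBodyAcc-max()-X")

# Keep only names containing "-mean()" or "-std()";
# "-meanFreq()" and "angle(...)" do not match this pattern
keep <- grep("-(mean|std)\\(\\)", features)
selected <- features[keep]
print(selected)
```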
For Step 4, I assumed that this meant cleaning up the column names of the features. To make these column names more readable, "()" and "-" were replaced with "." and camel case notation was used (e.g. tBodyAcc.Mean.X)
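
A renaming of this kind could look roughly like the following. The three gsub patterns are an assumption chosen to reproduce the tBodyAcc.Mean.X style shown above. The script's actual patterns may differ.

```r
# Raw feature names as they appear in the original data set
raw <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-Y", "tBodyAccMag-mean()")

# Replace "-mean()" and "-std()" with capitalised, dot-separated forms
clean <- gsub("-mean\\(\\)", ".Mean", raw)
clean <- gsub("-std\\(\\)", ".Std", clean)

# Turn any remaining "-" (the axis separator) into "."
clean <- gsub("-", ".", clean)
print(clean)
```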
Since the format of the tidy data set was not specified, I've used a .txt format similar to the original data
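
Writing and re-reading a whitespace-delimited .txt file might be done as sketched here. The data frame tidy, its values, and the temporary path out_file are hypothetical stand-ins, not the script's real output.

```r
# Hypothetical tidy result with one measurement column
tidy <- data.frame(subject = c(1, 2),
                   activity = c("WALKING", "SITTING"),
                   tBodyAcc.Mean.X = c(0.3, 0.2))

# Write to a temporary .txt file without row names
out_file <- tempfile(fileext = ".txt")
write.table(tidy, out_file, row.names = FALSE)

# Read it back to confirm the header and rows survive the round trip
check <- read.table(out_file, header = TRUE)
print(check)
```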
