Alternatively, for your convenience, you can find the same dataset as a dataset.zip file in the repo. You only need to download and uncompress the data in your current working directory before running the script. The code assumes that a "UCI HAR Dataset" folder has been successfully created under your current working directory.
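As a minimal sketch of that preparation step, the helper below unpacks the archive if the folder is not there yet and then checks that the folder exists. The function name `prepare_raw_data` is hypothetical, and it assumes that dataset.zip unpacks straight into a "UCI HAR Dataset" folder, which is not confirmed here.

```r
# Hypothetical helper: unpack the archive (if needed) and confirm that the
# folder the script expects is present in the current working directory.
# Assumption: dataset.zip contains "UCI HAR Dataset" at its top level.
prepare_raw_data <- function(zip_path = "dataset.zip",
                             data_dir = "UCI HAR Dataset") {
  if (!file.exists(data_dir) && file.exists(zip_path)) {
    utils::unzip(zip_path, exdir = ".")
  }
  # file.info()$isdir is TRUE only for an existing directory
  isTRUE(file.info(data_dir)$isdir)
}
```

Calling `prepare_raw_data()` from the working directory returns TRUE once the folder is in place and FALSE otherwise, so it can serve as a quick check before running the script.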

The tidy data set

The code runs on the following raw data:

'activity_labels.txt': Links the class labels with their activity name.
'train/X_train.txt': Training set.
'train/y_train.txt': Training labels.
'test/X_test.txt': Test set.
'test/y_test.txt': Test labels.
'train/subject_train.txt': Each row identifies the subject who performed the activity for each window sample. Its range is from 1 to 30.

and produces the tidy dataset (for your convenience, the tidy dataset is also stored in the repo). The tidy data is in a single text file. It contains 68 columns (activity, subject and 66 variables) and 180 rows, one for each combination of the 30 subjects and 6 activities. Each of the 66 variables is the average of one variable from the original raw data. Only the variables that hold the mean or the standard deviation of the raw signals were selected for averaging. See the CodeBook.md file for more details.
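The selection and averaging described above can be illustrated on a toy data frame. This is only a sketch: the column names and values are made up, and it uses base R's `aggregate` as a dependency-free stand-in, not the reshape2 calls that run_analysis.R itself relies on.

```r
# Toy stand-in for the merged raw data: 2 subjects x 2 activities,
# two windows each. Column names and values are invented for illustration.
raw <- data.frame(
  subject  = c(1, 1, 1, 1, 2, 2, 2, 2),
  activity = c("WALKING", "WALKING", "SITTING", "SITTING",
               "WALKING", "WALKING", "SITTING", "SITTING"),
  tBodyAcc.mean.X = c(0.2, 0.4, 0.1, 0.3, 0.6, 0.8, 0.5, 0.7),
  tBodyAcc.std.X  = c(-0.9, -0.7, -1.0, -0.8, -0.5, -0.3, -0.6, -0.4),
  tBodyAcc.max.X  = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Keep only the columns whose name marks a mean or a standard deviation
keep <- grep("\\.(mean|std)\\.", names(raw), value = TRUE)

# One row per (activity, subject) pair, with each kept variable averaged
tidy <- aggregate(raw[keep],
                  by = list(activity = raw$activity, subject = raw$subject),
                  FUN = mean)
```

Here `tidy` has 4 rows (2 subjects x 2 activities) and 4 columns (activity, subject and 2 variables). The same grouping over 30 subjects, 6 activities and 66 selected variables gives the 180 rows and 68 columns of the real tidy dataset.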

The codebook

The codebook describes the measurements in more detail:

Information about the variables contained in the tidy data
Information about the summary choices and the transformations the software applies
Information about the experimental study design

The instruction R script

We also provide the run_analysis.R script, which takes the raw data as input and produces the tidy data as output. The script has been tested on Windows 7 Professional 64-bit with Service Pack 1, RStudio (version 0.98.501) and R version 3.1.0 (64-bit). As a prerequisite, the reshape2 package must be installed in your R system. Finally, the code assumes that the raw data set is kept in a "UCI HAR Dataset" folder under your current working directory.

Step 1 - open RStudio and set up the working directory appropriately (see the "raw data" section above)
Step 2 - open and run/source the run_analysis.R script
Step 3 - find the output tidy dataset inside the "UCI HAR Dataset" folder
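For those who prefer the R console to the RStudio menus, the steps above can be sketched as a small wrapper. The function `run_tidy_pipeline` and its arguments are hypothetical and not part of the repo. It only checks the stated prerequisites and then sources the script.

```r
# Hypothetical wrapper around the steps above (not part of the repo).
run_tidy_pipeline <- function(work_dir = getwd(),
                              script = "run_analysis.R",
                              data_dir = "UCI HAR Dataset",
                              required_pkgs = "reshape2") {
  # Step 1: switch to the working directory, restoring the old one on exit
  old <- setwd(work_dir)
  on.exit(setwd(old))

  # Prerequisites: required packages installed, raw data folder present
  installed <- vapply(required_pkgs, requireNamespace, logical(1),
                      quietly = TRUE)
  if (!all(installed)) {
    stop("Missing package(s): ",
         paste(required_pkgs[!installed], collapse = ", "))
  }
  if (!isTRUE(file.info(data_dir)$isdir)) {
    stop("Folder not found in working directory: ", data_dir)
  }

  # Step 2: run the analysis script
  source(script)
  invisible(TRUE)
}
```

Failing early with a clear message is a design choice here: a missing folder or package is reported before the script starts reading files.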

Getting and Cleaning Data Project

Human Activity Recognition Using Smartphones Dataset -- Version 1.0

Contents

The raw data

The tidy data set

The codebook

The instruction R script
