Data science is the study of data and how to extract meaning from it, whereas machine learning is devoted to understanding and building methods that use data to improve performance or inform predictions.
In this walkthrough, I use the Titanic dataset to demonstrate data cleansing and to predict passenger survival, using Python in a Jupyter notebook.
The train and test data frames describe the survival status of individual passengers
on the Titanic. The titanic data frame does not contain information for the crew, but it does contain
actual and estimated ages for almost 80% of the passengers. The principal source for data about
Titanic passengers is the Encyclopedia Titanica.
The training set is used to build your machine learning models.
The test set is used to see how well your model performs on unseen data.
Pclass      Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
survival    Survival (0 = No; 1 = Yes)
name        Name
sex         Sex
age         Age
sibsp       Number of Siblings/Spouses Aboard
parch       Number of Parents/Children Aboard
ticket      Ticket Number
fare        Passenger Fare (British pound)
cabin       Cabin
embarked    Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)
boat        Lifeboat
body        Body Identification Number
home.dest   Home/Destination
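As a rough illustration of how the variables above might be read and typed before any cleaning, here is a minimal sketch using only Python's standard library. The two sample rows are invented for illustration (they are not records from the dataset), the column subset is chosen for brevity, and parse_row is a hypothetical helper name. In a notebook the same step would more likely be done on a data frame.

```python
import csv
import io

# Two made-up rows for illustration only; not actual passenger records.
SAMPLE_CSV = """pclass,survival,sex,age,sibsp,parch,fare,embarked
1,1,female,29,0,0,211.5,S
3,0,male,,1,0,7.25,Q
"""

# Port codes as listed in the variable descriptions.
PORTS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def parse_row(row):
    """Convert one CSV row (all strings) into typed Python values."""
    return {
        "pclass": int(row["pclass"]),
        "survived": row["survival"] == "1",
        "sex": row["sex"],
        # Empty strings mark missing values; keep them as None.
        "age": float(row["age"]) if row["age"] else None,
        "sibsp": int(row["sibsp"]),
        "parch": int(row["parch"]),
        "fare": float(row["fare"]) if row["fare"] else None,
        "embarked": PORTS.get(row["embarked"]),
    }


passengers = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
missing_age = sum(1 for p in passengers if p["age"] is None)
print(f"{len(passengers)} rows, {missing_age} with missing age")
```

Keeping missing values as None (rather than 0 or an empty string) makes it easy to count them first and decide later how to fill or drop them.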
Pclass is a proxy for socio-economic status (SES)  
1st ~ Upper; 2nd ~ Middle; 3rd ~ Lower 
Age is in years; fractional if age is less than one (1)
If the Age is estimated, it is in the form xx.5 
Fare is in pre-1970 British pounds (£)
Conversion factors: £1 = 20s = 240d and 1s = 12d
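The age and fare conventions in these notes can be sketched as two small helpers. This is only an illustration: the function names is_estimated_age and to_decimal_pounds are hypothetical, the conversion uses the standard pre-decimal relationships (20 shillings to the pound, 12 pence to the shilling), and treating ages below one year as "not estimated" is an assumption about how the two age rules interact.

```python
def is_estimated_age(age):
    """Return True if an age follows the 'xx.5' convention for estimates.

    Ages below one year are fractional by design, so they are not
    counted as estimates here (an assumption, see the lead-in).
    """
    return age >= 1 and age % 1 == 0.5


def to_decimal_pounds(pounds, shillings=0, pence=0):
    """Convert pre-decimal pounds/shillings/pence into decimal pounds."""
    # Pre-decimal sterling: 20 shillings per pound, 240 pence per pound.
    return pounds + shillings / 20 + pence / 240


print(is_estimated_age(28.5))   # True
print(is_estimated_age(0.5))    # False
print(to_decimal_pounds(7, 5))  # 7.25
```

A fare of 7 pounds 5 shillings therefore corresponds to the decimal value 7.25.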
With respect to the family relation variables (i.e. sibsp and parch) some relations were
ignored. The following are the definitions used for sibsp and parch.
Sibling: Brother, Sister, Stepbrother, or Stepsister of Passenger Aboard Titanic 
Spouse: Husband or Wife of Passenger Aboard Titanic (Mistresses and Fiancés Ignored)
Parent: Mother or Father of Passenger Aboard Titanic 
Child: Son, Daughter, Stepson, or Stepdaughter of Passenger Aboard Titanic 
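Given the definitions above, one common way to use sibsp and parch together is to derive a family-size value per passenger. This is an assumption for illustration, not a step described in this document, and the names family_size and is_alone are hypothetical.

```python
def family_size(sibsp, parch):
    """Family members aboard, counting the passenger themselves."""
    return sibsp + parch + 1


def is_alone(sibsp, parch):
    """True when no siblings, spouses, parents, or children were aboard."""
    return sibsp == 0 and parch == 0


# A passenger travelling with one spouse and two children.
print(family_size(sibsp=1, parch=2))  # 4
print(is_alone(sibsp=0, parch=0))     # True
```

Because some relations (for example fiancés) are ignored in the definitions, a passenger flagged as alone by this sketch may still have travelled with companions.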
See the training model file for a description of the code.