Aafreen29/RandomForestExample

RandomForestExample

Random Forest on 30 M records

Task: Predict the fare amount (inclusive of tolls) for a taxi ride in New York City, given the pickup and dropoff locations.

The following steps build a good model using Random Forest Regressor in Python:

1> Loading the train and test data sets: Because there are 55M records, we will work with 30M of them to avoid overloading the Python kernel.

2> Exploratory Data Analysis (EDA): This helps us understand the correlations among the independent variables and between each of them and the target variable. We will also use EDA to check for outliers.

3> Feature Engineering: This step converts variables to suitable data types (here, Pickup_datetime) and splits the timestamp into month, day, hour, weekday name, weekday, and year. We will also use the Haversine formula to calculate the trip distance from the given pickup and dropoff latitudes and longitudes.

4> Model Training: We will train the model using Random Forest Regressor.

5> Prediction: We will use the trained model to predict fare amounts for the test set.
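
The feature engineering in step 3 can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the repository's actual code: the function names `haversine_km` and `split_pickup_datetime`, the Earth radius constant, and the timestamp format `YYYY-MM-DD HH:MM:SS UTC` are all assumptions made for the example.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius in kilometres
WEEKDAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday")


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def split_pickup_datetime(raw):
    """Split a pickup timestamp string into calendar features.

    Assumes a format like '2015-01-27 13:08:24 UTC' (hypothetical sample).
    """
    ts = datetime.strptime(raw.replace(" UTC", ""), "%Y-%m-%d %H:%M:%S")
    weekday = ts.weekday()  # Monday = 0, Sunday = 6
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "weekday": weekday,
        "weekday_name": WEEKDAY_NAMES[weekday],
    }


if __name__ == "__main__":
    # Hypothetical pickup and dropoff coordinates in Manhattan.
    print(haversine_km(40.7614, -73.9798, 40.6513, -73.8803))
    print(split_pickup_datetime("2015-01-27 13:08:24 UTC"))
```

On 30M rows, a per-row Python function like this would likely be slow; the same formula can be applied to whole columns at once with NumPy array operations, and the resulting distance and calendar columns can then be passed to the regressor as features.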
