RandomForestExample

Random Forest on 30 M records

Task: Predict the fare amount (inclusive of tolls) for a taxi ride in New York City, given the pickup and dropoff locations.

The following steps build a model with Random Forest Regressor in Python:

1> Loading the train and test data sets. There are 55M records; we work on 30M of them to avoid overloading the Python kernel.

2> Exploratory Data Analysis (EDA): This shows the correlation among the independent variables and between each of them and the target variable. We also use EDA to check for outliers.

3> Feature Engineering: In this step we convert the data types of variables (here, Pickup_datetime) and split the timestamp into month, day, hour, weekdayname, weekday and year. We also apply the Haversine formula to calculate the distance between the given pickup and dropoff latitudes and longitudes.

4> Model Training: We train a Random Forest Regressor on the engineered features.

5> Prediction
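
The feature engineering in step 3 could be sketched roughly as below. This is a minimal, standard-library illustration and not the code in new_analysis1.py: the function names `haversine_km` and `datetime_features`, the Earth radius constant, and the timestamp layout `"2009-06-15 17:26:21 UTC"` are all assumptions made for the example. It works one row at a time to keep the formula readable; on 30M records a vectorized version of the same arithmetic would be far faster.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius in kilometres


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine term: sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def datetime_features(pickup_datetime):
    """Split a pickup timestamp string into the parts listed in step 3."""
    # Assumed layout of the raw column, e.g. "2009-06-15 17:26:21 UTC".
    ts = datetime.strptime(pickup_datetime, "%Y-%m-%d %H:%M:%S UTC")
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "weekday": ts.weekday(),           # Monday = 0 ... Sunday = 6
        "weekdayname": ts.strftime("%A"),  # locale-dependent day name
    }
```

Calling `haversine_km` with the pickup and dropoff coordinates of a ride gives one numeric distance feature, and `datetime_features` gives the calendar features; both can then be passed to the regressor alongside the raw coordinates.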
