Fuel Efficiency Prediction – Regression using the Auto MPG dataset (you can use any other dataset as well)

In a regression problem, we aim to predict a continuous value, like a price or a probability. Contrast this with a classification problem, where we aim to select a class from a list of classes (for example, deciding whether a picture contains an apple or an orange).
This notebook uses the classic Auto MPG Dataset and builds a model to predict the fuel efficiency of late-1970s and early-1980s automobiles. To do this, we'll provide the model with a description of many automobiles from that time period. This description includes attributes like cylinders, displacement, horsepower, and weight.
Dataset: https://archive.ics.uci.edu/dataset/9/auto+mpg
Stepwise Instructions:
• Install Dependencies – Installs seaborn for visualization.
• Import Libraries – Loads pandas, matplotlib, seaborn, and tensorflow.
• Load Dataset – Downloads the Auto MPG dataset from UCI.
• Read & Prepare Data – Assigns column names and loads the dataset.
• Handle Missing Values – Removes rows with missing data.
• Process Categorical Data – One-hot encodes the Origin column (USA, Europe, Japan).
• Split Dataset – 80% training, 20% test data.
• Visualize Data – Uses seaborn.pairplot() to analyze feature relationships.
• Compute Statistics – Generates descriptive stats for numerical columns.
• Separate Features & Labels – Extracts MPG as the target variable (a consolidated code sketch of these data-preparation steps follows this list).
• Normalize Data – Standardizes features using Z-score normalization.
• Build Model – Defines a neural network with:
  2 hidden layers (64 neurons each, ReLU activation)
  1 output layer (single continuous regression output)
  RMSprop optimizer & MSE loss
• Initialize Model – Creates the model instance.
• View Model Summary – Displays the architecture.
• Test Initial Predictions – Runs the model on sample input.
• Train Model – Trains for 1000 epochs with a validation split.
• Monitor Training – Stores training history and prints progress.
• Plot Training Performance – Graphs MAE and MSE over epochs.
• Use Early Stopping – Stops training when validation loss stagnates.
• Evaluate Model – Computes Mean Absolute Error (MAE) on test data.
• Make Predictions – Predicts MPG values and plots true vs. predicted.
• Analyze Errors – Visualizes prediction errors using a histogram.
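The data-preparation steps above (Load Dataset through Separate Features & Labels) might look like the following minimal sketch. It assumes pandas and seaborn, and the raw UCI file layout used by the classic TensorFlow tutorial; the URL, the '?' missing-value marker, and the column names are assumptions based on that file.

```python
import pandas as pd
import seaborn as sns

# Load Dataset: raw UCI file (assumed URL from the classic TF tutorial).
url = "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data"
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
                'Acceleration', 'Model Year', 'Origin']

# Read & Prepare Data: '?' marks missing horsepower values in this file.
raw_dataset = pd.read_csv(url, names=column_names, na_values='?',
                          comment='\t', sep=' ', skipinitialspace=True)

# Handle Missing Values: drop the few rows with missing data.
dataset = raw_dataset.dropna()

# Process Categorical Data: one-hot encode Origin (USA, Europe, Japan).
dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})
dataset = pd.get_dummies(dataset, columns=['Origin'],
                         prefix='', prefix_sep='', dtype=float)

# Split Dataset: 80% training, 20% test.
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)

# Visualize Data: pairwise feature relationships on the training set.
sns.pairplot(train_dataset[['MPG', 'Cylinders', 'Displacement', 'Weight']],
             diag_kind='kde')

# Separate Features & Labels: MPG is the regression target.
train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')

# Compute Statistics: per-feature mean/std, reused later for normalization.
train_stats = train_dataset.describe().transpose()
```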
Notes on what selected notebook cells do and what we can learn from them:
In[12] – Normalize the data. Look again at the train_stats block above and note how different the ranges of each feature are. It is good practice to normalize features that use different scales and ranges. Although the model might converge without feature normalization, training becomes more difficult, and the resulting model ends up dependent on the choice of units used in the input. Note: although we intentionally generate these statistics from only the training dataset, they are also used to normalize the test dataset, so that the test data is projected into the same distribution the model was trained on.
In[13] – This normalized data is what we will use to train the model. Caution: the statistics used to normalize the inputs here (mean and standard deviation) must be applied to any other data fed to the model, along with the one-hot encoding we did earlier. That includes the test set as well as live data when the model is used in production.
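A minimal sketch of this Z-score normalization, assuming the train_dataset, test_dataset, and train_stats objects from the data-preparation sketch above:

```python
# Z-score normalization: subtract the training mean and divide by the
# training standard deviation. The same training-set statistics are
# applied to the test set (and would be to production data).
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']

normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)
```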
In[14] – Build the model. We use a Sequential model with two densely connected hidden layers and an output layer that returns a single, continuous value. The model-building steps are wrapped in a function, build_model, since we'll create a second model later on.
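A sketch of build_model, assuming TensorFlow/Keras and the train_dataset features from the preparation sketch; the layer sizes, optimizer, and loss follow the description above:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    model = keras.Sequential([
        keras.Input(shape=(len(train_dataset.keys()),)),  # one input per feature
        layers.Dense(64, activation='relu'),   # hidden layer 1
        layers.Dense(64, activation='relu'),   # hidden layer 2
        layers.Dense(1)                        # single continuous output
    ])
    model.compile(loss='mse',
                  optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
                  metrics=['mae', 'mse'])
    return model

model = build_model()
model.summary()  # View Model Summary: displays the architecture
```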
In[20] – This graph shows little improvement, or even degradation, in the validation error after about 100 epochs. Let's update the model.fit call to stop training automatically when the validation score stops improving, using an EarlyStopping callback that tests a training condition every epoch. If a set number of epochs elapses without improvement, training stops automatically.
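A sketch of training with early stopping, plus the evaluation and prediction steps from the list above. It assumes the model and normalized data from the earlier sketches; patience=10 is an assumed setting (stop after 10 epochs with no validation-loss improvement):

```python
import matplotlib.pyplot as plt
from tensorflow import keras

# Use Early Stopping: halt when val_loss stops improving for 10 epochs.
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)

# Train Model: up to 1000 epochs with a 20% validation split.
history = model.fit(normed_train_data, train_labels,
                    epochs=1000, validation_split=0.2,
                    verbose=0, callbacks=[early_stop])

# Evaluate Model: MAE on the held-out test set.
loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=2)
print(f"Testing set Mean Abs Error: {mae:5.2f} MPG")

# Make Predictions: true vs. predicted MPG.
test_predictions = model.predict(normed_test_data).flatten()
plt.figure()
plt.scatter(test_labels, test_predictions)
plt.xlabel('True Values [MPG]')
plt.ylabel('Predictions [MPG]')

# Analyze Errors: histogram of prediction errors.
plt.figure()
plt.hist(test_predictions - test_labels, bins=25)
plt.xlabel('Prediction Error [MPG]')
plt.ylabel('Count')
plt.show()
```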
Conclusion
This notebook introduced a few techniques for handling a regression problem: Mean Squared Error (MSE) as the loss function, Mean Absolute Error (MAE) as an evaluation metric, Z-score normalization of features that use different scales, and early stopping to avoid overfitting.