This project predicts house prices using a machine learning model. It consists of two main components: a Jupyter Notebook (`house_prediction.ipynb`) that runs on Vertex AI Workbench, and a Streamlit application (`app.py`) that provides a user-friendly interface for making predictions.
The goal of this project is to demonstrate a complete machine learning workflow, from data extraction and model training to serving the trained model through a Streamlit application. This approach simplifies making predictions by providing an easy-to-use web interface.
- Python 3.7 or higher
- Google Cloud SDK
- Vertex AI Workbench
- Streamlit
- Pandas
- NumPy
- Joblib
- Google Cloud Storage
- Google OAuth2
- Google BigQuery
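Most of the items above are Python packages. If you are assembling an environment by hand, they map to PyPI names roughly as follows; the repository's own `requirements.txt` is authoritative, and the `scikit-learn` and `db-dtypes` entries are assumptions rather than items from the list above:

```text
streamlit
pandas
numpy
joblib
scikit-learn          # assumed: used to build/load the model pipeline
google-cloud-storage
google-cloud-bigquery
google-auth           # provides google.oauth2.service_account
db-dtypes             # assumed: lets BigQuery results load as DataFrames
```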
- Create a Google Cloud Project: If you don't have a Google Cloud project, create one in the Google Cloud Console.
- Enable APIs: Enable the following APIs for your project:
  - Vertex AI API
  - Cloud Storage API
  - BigQuery API
- Create a Vertex AI Workbench Instance: Set up a Vertex AI Workbench instance to run the Jupyter Notebook.
- Create a Cloud Storage Bucket: Create a bucket to store your model files.
- BigQuery Dataset: Ensure you have a BigQuery dataset and table containing the house data, with the relevant features for house price prediction (a quick sanity check is sketched below this list).
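Before moving on, it can help to confirm that the table is reachable and has the columns you expect. The snippet below is a minimal sketch; the project, dataset, and table names are placeholders rather than names used by this repository:

```python
from google.cloud import bigquery

PROJECT_ID = "your-gcp-project"               # placeholder project ID
TABLE_ID = f"{PROJECT_ID}.house_data.houses"  # placeholder dataset.table

client = bigquery.Client(project=PROJECT_ID)

# Pull a few rows to verify connectivity and inspect the available columns.
preview = client.query(f"SELECT * FROM `{TABLE_ID}` LIMIT 5").to_dataframe()
print(list(preview.columns))
print(preview)
```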
- Clone the Repository:

  ```bash
  git clone https://github.com/yourusername/HousePredictorGCP.git
  cd HousePredictorGCP
  ```

- Install Dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Upload the Notebook: Upload `house_prediction.ipynb` to your Vertex AI Workbench instance.

- Run the Notebook: Open the notebook in Vertex AI Workbench and run all cells. The notebook pulls data from a BigQuery table, trains the model, and saves the pipeline to your Cloud Storage bucket (the sketch below outlines this flow).
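The exact feature engineering and model choice live in the notebook itself; the sketch below only illustrates the overall flow it follows: query BigQuery, fit a pipeline, serialize it with Joblib, and upload it to Cloud Storage. The table and bucket names, the `price` target column, and the scikit-learn regressor are assumptions for illustration:

```python
import joblib
from google.cloud import bigquery, storage
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

PROJECT_ID = "your-gcp-project"               # placeholder
TABLE_ID = f"{PROJECT_ID}.house_data.houses"  # placeholder BigQuery table
BUCKET_NAME = "your-model-bucket"             # placeholder Cloud Storage bucket

# 1. Pull the training data from BigQuery into a DataFrame.
df = bigquery.Client(project=PROJECT_ID).query(
    f"SELECT * FROM `{TABLE_ID}`"
).to_dataframe()

# 2. Split features and target ("price" is an assumed column name).
X = df.drop(columns=["price"])
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Fit a scaling + regression pipeline.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", RandomForestRegressor(n_estimators=200, random_state=42)),
])
pipeline.fit(X_train, y_train)
print("Held-out R^2:", pipeline.score(X_test, y_test))

# 4. Serialize the pipeline and upload it to the model bucket.
joblib.dump(pipeline, "model_pipeline.joblib")
storage.Client(project=PROJECT_ID).bucket(BUCKET_NAME).blob(
    "models/model_pipeline.joblib"
).upload_from_filename("model_pipeline.joblib")
```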
- Service Account Key: Download the service account key JSON file from the Google Cloud Console and place it in the project directory. Ensure the service account has access to the Cloud Storage bucket and the BigQuery dataset.

- Update the Path to the Service Account Key: In `app.py`, update the path to the service account key JSON file:

  ```python
  credentials = service_account.Credentials.from_service_account_file(
      "/path/to/your/service-account-file.json"
  )
  ```

- Run the Streamlit App:

  ```bash
  streamlit run app.py
  ```

- Access the App: Open your web browser and go to `http://localhost:8501` to access the Streamlit application (a sketch of the app's load-and-predict flow follows below).
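For orientation, here is a minimal sketch of the load-and-predict flow inside `app.py`. The bucket name, blob path, credential path, and the three example form fields are placeholders; the real app collects the full set of property features, and the DataFrame column names must match the features the pipeline was trained on:

```python
import joblib
import pandas as pd
import streamlit as st
from google.cloud import storage
from google.oauth2 import service_account

BUCKET_NAME = "your-model-bucket"            # placeholder bucket
MODEL_BLOB = "models/model_pipeline.joblib"  # placeholder object path

credentials = service_account.Credentials.from_service_account_file(
    "/path/to/your/service-account-file.json"
)

@st.cache_resource
def load_pipeline():
    # Download the trained pipeline from Cloud Storage once and cache it.
    client = storage.Client(credentials=credentials, project=credentials.project_id)
    client.bucket(BUCKET_NAME).blob(MODEL_BLOB).download_to_filename("model_pipeline.joblib")
    return joblib.load("model_pipeline.joblib")

pipeline = load_pipeline()

st.title("House Price Predictor")
bedrooms = st.number_input("Number of bedrooms", min_value=0, value=3)
bathrooms = st.number_input("Number of bathrooms", min_value=0.0, value=2.0)
sqft_living = st.number_input("Living area size (sq ft)", min_value=0, value=1800)

if st.button("Predict Price"):
    # Column names must match what the pipeline was trained on.
    features = pd.DataFrame([{
        "bedrooms": bedrooms,
        "bathrooms": bathrooms,
        "sqft_living": sqft_living,
    }])
    st.success(f"Estimated price: ${pipeline.predict(features)[0]:,.0f}")
```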
- `app.py`: Streamlit application for house price prediction.
- `house_prediction.ipynb`: Jupyter Notebook for training the model on Vertex AI Workbench.
- `requirements.txt`: List of dependencies required for the project.
- `README.md`: Project documentation.
- Open the Streamlit application in your web browser.
- Enter the property data in the form provided.
- Click the "Predict Price" button to get the estimated price of the property.
- Open the `house_prediction.ipynb` notebook in Vertex AI Workbench.
- Run all cells to train the model and save the pipeline to your Cloud Storage bucket. The notebook pulls data from a BigQuery table for training.
The data used to train the model is pulled from a BigQuery table. Ensure the table includes the relevant features for house price prediction, such as the following (an illustrative schema sketch follows the list):
- Number of bedrooms
- Number of bathrooms
- Living area size
- Lot area size
- Number of floors
- Waterfront indicator
- View rating
- Property grade
- Renovation status
- Basement size
- Property condition
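For illustration, the feature list above could translate into a BigQuery schema like the one below. The column names and types are assumptions that mirror common house-sale datasets, not the project's actual schema; adjust them to match your own table:

```python
from google.cloud import bigquery

PROJECT_ID = "your-gcp-project"  # placeholder

# Illustrative schema only -- rename columns/types to match your data.
schema = [
    bigquery.SchemaField("bedrooms", "INTEGER"),
    bigquery.SchemaField("bathrooms", "FLOAT"),
    bigquery.SchemaField("sqft_living", "INTEGER"),    # living area size
    bigquery.SchemaField("sqft_lot", "INTEGER"),       # lot area size
    bigquery.SchemaField("floors", "FLOAT"),
    bigquery.SchemaField("waterfront", "INTEGER"),     # waterfront indicator (0/1)
    bigquery.SchemaField("view", "INTEGER"),           # view rating
    bigquery.SchemaField("grade", "INTEGER"),          # property grade
    bigquery.SchemaField("yr_renovated", "INTEGER"),   # renovation status/year
    bigquery.SchemaField("sqft_basement", "INTEGER"),  # basement size
    bigquery.SchemaField("condition", "INTEGER"),      # property condition
    bigquery.SchemaField("price", "FLOAT"),            # target: sale price
]

table = bigquery.Table(f"{PROJECT_ID}.house_data.houses", schema=schema)
bigquery.Client(project=PROJECT_ID).create_table(table, exists_ok=True)
```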