Hi there! I'm a data analyst with a passion for uncovering insights from complex datasets and building end-to-end data projects. This portfolio showcases my skills in data curation, geospatial analysis, business intelligence, comparative machine learning, time-series forecasting, and model deployment.
An exploratory analysis and interactive dashboard visualizing the baseline and projected water-stress exposure of ~35,000 power plants worldwide. The project identifies geographic and technological hotspots in the water-energy nexus.
- Skills Demonstrated: Data Visualization, Looker Studio, Dashboard Design, Geospatial Analysis (GeoPandas), Data Curation.
- View the Interactive Dashboard
- View the Project on GitHub
An end-to-end regression and time-series project that predicts solar power output from weather data. The analysis compares a baseline XGBoost model against deep learning architectures and culminates in an LSTM forecasting model. The best-performing regression model was then deployed as a live, interactive web application using Streamlit.
- Skills Demonstrated: Time-Series Forecasting, LSTM, Regression (XGBoost), Model Deployment, Streamlit, Feature Engineering.
- View the Deployed App Repo
- View the Full Modeling Project on GitHub
A comparative modeling project that classifies celestial objects (stars, galaxies, and quasars) from the Sloan Digital Sky Survey (SDSS). The project establishes a strong Random Forest baseline and compares it against several neural network approaches, including class-weighted training and ensemble methods.
- Skills Demonstrated: Machine Learning, Classification, Neural Networks (TensorFlow/Keras), Ensemble Modeling, Scikit-learn, Feature Importance.
- View the Project on GitHub
- Languages: Python, SQL
- Libraries: Pandas, NumPy, Scikit-learn, GeoPandas, TensorFlow/Keras, XGBoost, Matplotlib, Seaborn
- Tools & Platforms: Looker Studio, Streamlit, Git, GitHub, Google Colab