marcoshsq/IBMDataScience

IBM Data Science Certificate Projects

Instructors: Rav Ahuja, Alex Aklson, Aije Egwaikhide, Svetlana Levitan, Romeo Kienzler, Polong Lin, Joseph Santarcangelo, Azim Hirjani, Hima Vasudevan, Saishruthi Swaminathan, Saeed Aghabozorgi, Yan Luo


📚 About the Certificate

This repository contains the hands-on projects I developed as part of the IBM Data Science Professional Certificate on Coursera. The specialization provides a comprehensive foundation in data science, from data analysis to machine learning.

🧠 Courses Overview

#   Course                                        Link
01  What is Data Science?                         View
02  Tools for Data Science                        View
03  Data Science Methodology                      View
04  Python for Data Science, AI & Development     View
05  Python Project for Data Science               View
06  Databases and SQL for Data Science            View
07  Data Analysis with Python                     View
08  Data Visualization with Python                View
09  Machine Learning with Python                  View
10  Applied Data Science Capstone                 View

💼 Projects

From Course 5. In this project, I used Python to scrape stock data and visualize it, with the goal of building an interactive dashboard.
Tools: pandas, requests, bs4, html5lib, lxml, plotly, yfinance
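A minimal sketch of the scrape-then-clean step, not the notebook's actual code. It assumes the page contains a simple two-column HTML table; the `HTML` snippet, its made-up figures, and the `parse_revenue_table` helper are hypothetical stand-ins for a page that would normally be downloaded with `requests`.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical page content with made-up numbers. In practice this string
# would come from something like requests.get(url).text.
HTML = """
<table>
  <thead><tr><th>Date</th><th>Revenue</th></tr></thead>
  <tbody>
    <tr><td>2021</td><td>$1,200</td></tr>
    <tr><td>2020</td><td>$950</td></tr>
    <tr><td>2019</td><td></td></tr>
  </tbody>
</table>
"""


def parse_revenue_table(html: str) -> pd.DataFrame:
    """Turn the rows of the first <tbody> into a cleaned DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find("tbody").find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append({"Date": cells[0], "Revenue": cells[1]})

    df = pd.DataFrame(rows)
    # Strip "$" and thousands separators, then drop rows with no value.
    df["Revenue"] = df["Revenue"].str.replace(r"[$,]", "", regex=True)
    df = df[df["Revenue"] != ""].copy()
    df["Revenue"] = df["Revenue"].astype(float)
    return df.reset_index(drop=True)


revenue = parse_revenue_table(HTML)
print(revenue)
```

The resulting DataFrame can then be handed to a plotting call such as `plotly.express.line(revenue, x="Date", y="Revenue")`.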


From Course 6. Created and populated a relational database using IBM Db2 SQL, then analyzed Chicago city data using Python.
Tools: IBM Db2, SQL, Jupyter Notebooks, CSVs
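The create, populate, and query workflow can be sketched as follows. This is an illustration only: it swaps in Python's built-in `sqlite3` for IBM Db2 so it runs without a database server, and the `schools` table, its columns, and its rows are made up rather than taken from the Chicago datasets.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (an assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for illustration.
conn.execute(
    "CREATE TABLE schools (name TEXT, community_area TEXT, safety_score INTEGER)"
)
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [
        ("School A", "North", 80),
        ("School B", "North", 60),
        ("School C", "South", 50),
    ],
)

# Aggregate per community area, highest average first.
query = """
    SELECT community_area, AVG(safety_score) AS avg_safety
    FROM schools
    GROUP BY community_area
    ORDER BY avg_safety DESC
"""
result = conn.execute(query).fetchall()
for area, avg_safety in result:
    print(area, avg_safety)

conn.close()
```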


From Course 7. Built regression models to predict housing prices based on property features.
Tools: pandas, numpy, matplotlib, seaborn, scikit-learn
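As a rough illustration of the regression step, the sketch below fits a linear model on synthetic data. The features (`sqft`, `bedrooms`) and the price formula are assumptions invented here; the project's real dataset and feature set are not shown in this README.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic housing data (made up): price depends linearly on two features.
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3500, size=200)
bedrooms = rng.integers(1, 6, size=200)
noise = rng.normal(0, 10_000, size=200)
price = 150 * sqft + 10_000 * bedrooms + 50_000 + noise

X = np.column_stack([sqft, bedrooms])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# Fit on the training split and score on held-out data.
model = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))

print("coefficients:", model.coef_)
print("held-out R^2:", round(r2, 3))
```

Scoring on a held-out split rather than the training data gives a fairer picture of how the model would do on unseen properties.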


From Course 8. Built a flight performance dashboard using alternative tools due to technical limitations with IBM’s internal platform.
Tools: jupyter_dash, plotly, Google Colab


From Course 9. Compared several classification algorithms on a loan dataset to identify the best-performing model.
Models: Logistic Regression, SVM, Decision Tree, KNN
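One way such a comparison could be set up is with cross-validation over the four model families. This is a sketch under assumptions: the data comes from `make_classification` rather than the loan dataset, and the hyperparameters are scikit-learn defaults apart from those shown.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the loan dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Scale features for the distance- and margin-based models.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best)
```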


The final capstone: predicting the success of SpaceX launches.
Steps involved:

  • Data collection via SpaceX API and Wikipedia
  • Data wrangling and visualization (SQL, Plotly, Folium)
  • Feature engineering + One-hot encoding
  • ML with GridSearchCV to optimize model parameters
  • Models: Logistic Regression, SVM, Decision Tree, KNN
  • Accuracy: ~83.33% across all models
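The one-hot encoding and GridSearchCV steps above can be sketched as follows. Everything in the data is invented for illustration: the `payload_mass`, `orbit`, and `landed` columns, the rule that generates the labels, and the small grid over `C` are assumptions, not the capstone's actual features or search space.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up launch records: one numeric and one categorical feature.
rng = np.random.default_rng(0)
n = 80
payload = rng.uniform(500, 10_000, n)
df = pd.DataFrame(
    {
        "payload_mass": payload,
        "orbit": rng.choice(["LEO", "GTO", "ISS"], n),
        "landed": (payload + rng.normal(0, 2_000, n) < 6_000).astype(int),
    }
)

# One-hot encode the categorical column into 0/1 indicator columns.
X = pd.get_dummies(df[["payload_mass", "orbit"]], columns=["orbit"], dtype=float)
y = df["landed"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Search over the regularization strength with 5-fold cross-validation.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1]}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

test_acc = search.score(X_test, y_test)
print("best params:", search.best_params_)
print("test accuracy:", round(test_acc, 3))
```

The same pattern extends to SVM, Decision Tree, and KNN by swapping the final pipeline step and its parameter grid.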

This was my favorite and most challenging project – the real kickstart to my data science journey.
Greatness in small beginnings.


Feel free to check out each project individually. Feedback is welcome! 🚀