Repository containing source figures and reports for predicting minute-by-minute flight delays using time series data, machine learning, data transformations, and PySpark on Databricks. Work completed as part of Machine Learning at Scale, DATASCI 261.
We use Spark MLlib and PySpark on Databricks to predict continuous outliers from time-series flight delay, NOAA weather, and geospatial data. Before model training, we apply Yeo-Johnson transformations, Euclidean distance calculations, and custom joins to build a robust dataset. Work completed as part of ML at Scale, DATASCI 261.
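As a rough illustration of two of the preprocessing steps named above, the sketch below implements the standard Yeo-Johnson power transform for a single value and a Euclidean nearest-neighbor lookup in plain Python. It is not taken from the project notebooks. The function names `yeo_johnson` and `nearest_station`, the fixed `lmbda` value, and the idea of matching an airport to its closest weather station are assumptions made for illustration. The project itself runs on PySpark, and this page does not describe how it applies either step.

```python
import math


def yeo_johnson(y: float, lmbda: float) -> float:
    """Standard Yeo-Johnson transform of one value for a given lambda."""
    if y >= 0:
        if abs(lmbda) < 1e-12:
            return math.log1p(y)  # limit case lambda == 0
        return ((y + 1.0) ** lmbda - 1.0) / lmbda
    if abs(lmbda - 2.0) < 1e-12:
        return -math.log1p(-y)  # limit case lambda == 2
    return -(((1.0 - y) ** (2.0 - lmbda) - 1.0) / (2.0 - lmbda))


def nearest_station(airport, stations):
    """Return the key of the station closest to `airport`.

    Coordinates are (x, y) pairs. Plain Euclidean distance on raw
    latitude/longitude is only an approximation of true ground distance.
    """
    return min(
        stations,
        key=lambda name: math.hypot(
            airport[0] - stations[name][0],
            airport[1] - stations[name][1],
        ),
    )


if __name__ == "__main__":
    # Hypothetical delay values in minutes; negative means early departure.
    delays = [-5.0, 0.0, 30.0]
    # lmbda is fixed here for illustration; in practice it is usually fitted.
    print([yeo_johnson(d, 0.5) for d in delays])

    # Hypothetical station coordinates, not real NOAA station data.
    stations = {"station_a": (37.6, -122.4), "station_b": (40.6, -73.8)}
    print(nearest_station((37.7, -122.2), stations))
```

Unlike Box-Cox, Yeo-Johnson accepts zero and negative inputs, which matters for delay data where early departures produce negative values.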
JH-UCB/261-fp