
We use Spark MLlib and PySpark on Databricks to predict continuous outliers from time-series flight-delay, NOAA, and geospatial data. Before model training, we build a robust dataset using the Yeo-Johnson transformation, Euclidean distance, and custom joins. Work completed as part of ML at Scale, DATASCI 261.
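The description mentions Euclidean distance and custom joins for combining flight, NOAA, and geospatial data. One plausible reading, sketched below in plain Python rather than PySpark, is matching each airport to its nearest station before joining records. The function name `nearest_station`, the station IDs, and all coordinates are hypothetical and are not taken from the repository. Euclidean distance on latitude/longitude degrees is only an approximation of true ground distance.

```python
import math

def nearest_station(airport, stations):
    """Return (station_id, distance) for the station closest to the airport.

    Distance is plain Euclidean distance on (lat, lon) in degrees,
    which is an approximation and not a great-circle distance.
    """
    lat, lon = airport
    best_id, best_dist = None, math.inf
    for station_id, (s_lat, s_lon) in stations.items():
        dist = math.hypot(lat - s_lat, lon - s_lon)
        if dist < best_dist:
            best_id, best_dist = station_id, dist
    return best_id, best_dist

# Hypothetical coordinates, for illustration only.
airports = {"SFO": (37.62, -122.38), "JFK": (40.64, -73.78)}
stations = {
    "ST_A": (37.70, -122.20),
    "ST_B": (40.70, -73.90),
    "ST_C": (34.00, -118.40),
}

# Lookup table that a later join could use as its key mapping.
airport_to_station = {
    code: nearest_station(coords, stations)[0]
    for code, coords in airports.items()
}
```

At scale, the same idea would more likely be expressed as a cross join between a small airport table and a station table followed by a minimum-distance selection; that variant is not shown here.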


JH-UCB/261-fp


261-fp

This repository contains the source figures and reports for predicting minute-by-minute flight delays using time-series data, machine learning, data transformations, and PySpark on Databricks. Work completed as part of Machine Learning at Scale, DATASCI 261.
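As an illustration of the kind of data transformation involved, here is a minimal sketch of the Yeo-Johnson power transform in plain Python. Unlike Box-Cox, it accepts zero and negative inputs, which matters for delays because early departures produce negative values. The sample delays and the choice of `lmbda = 0.5` are hypothetical; in practice the parameter is usually estimated from the data, and this is not the repository's own implementation.

```python
import math

def yeo_johnson(x, lmbda):
    """Apply the Yeo-Johnson transform to a single value x."""
    if x >= 0:
        if abs(lmbda) < 1e-12:
            # Limit of the power form as lmbda approaches 0.
            return math.log1p(x)
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    if abs(lmbda - 2.0) < 1e-12:
        # Limit of the power form as lmbda approaches 2.
        return -math.log1p(-x)
    return -(((1.0 - x) ** (2.0 - lmbda)) - 1.0) / (2.0 - lmbda)

# Hypothetical delays in minutes; negative means an early departure.
delays = [-5.0, 0.0, 12.0, 180.0]
transformed = [yeo_johnson(d, 0.5) for d in delays]
```

With `lmbda = 1` the transform is the identity, and values of `lmbda` below 1 compress the long right tail that is typical of delay data.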
