Pooja Ginjupalli's Cornell BTTAI Portfolio

During the summer of 2024, I had the amazing opportunity to be part of Cornell's Break Through Tech AI program, where I learned machine learning fundamentals and the tools to tackle real-world problems. I used technologies like Python and Jupyter Notebook to explore datasets and build AI models.

The AI models and datasets I worked with across the 10 topics I studied can be found in this repository.

Topic 1: Business Understanding and ML Basics

In this topic, I learned about the business behind ML and what kinds of problems it can solve with the right datasets.

I also learned the basics of important machine learning libraries in Python like NumPy and Pandas. I used these libraries to manipulate dataframes and read large datasets.
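
As a rough sketch of that kind of workflow (the toy DataFrame and column names below are illustrative stand-ins, not the actual course data; the notebooks read larger CSV files with pd.read_csv):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a larger CSV dataset.
df = pd.DataFrame({
    "age": [22, 35, 47, 58],
    "income": [28_000, 52_000, 61_000, 75_000],
})

# Inspect the DataFrame: shape and summary statistics.
print(df.shape)
print(df.describe())

# Basic manipulation: filter rows and add a NumPy-derived column.
adults_over_40 = df[df["age"] > 40]
df["log_income"] = np.log1p(df["income"])
print(df.head())
```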

Topic 2: Data Preparation

In this topic, I learned how to prepare and clean large datasets, so I could use them to train AI models.

I used NumPy and Pandas to remove outliers, handle missing data, perform one-hot encoding on categorical features, and analyze correlations between the features and the label.
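
A minimal sketch of those preparation steps on a made-up toy dataset (the column names and the simple outlier rule are illustrative, not taken from the course notebooks):

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset with a missing value, an implausible outlier,
# and a categorical feature.
df = pd.DataFrame({
    "age": [22, 35, np.nan, 58, 41, 120],
    "city": ["NY", "LA", "NY", "SF", "LA", "SF"],
    "label": [0, 1, 0, 1, 1, 0],
})

# Handle missing data by imputing the median.
df["age"] = df["age"].fillna(df["age"].median())

# Remove outliers with a simple domain-based rule (ages above 100 are implausible).
df = df[df["age"] <= 100]

# One-hot encode the categorical feature.
df = pd.get_dummies(df, columns=["city"], dtype=int)

# Analyze correlations between each feature and the label.
print(df.corr()["label"].sort_values(ascending=False))
```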

Topic 3: Decision Trees and KNNs

In this topic, I learned what Decision Trees (DTs) and K-Nearest Neighbors (KNNs) are as well as how they work and how to tune their hyperparameters.

I used the Scikit-Learn library in Python to train multiple DTs and KNNs with different hyperparameters and compared their accuracy on the same dataset. I found that the DTs tended to be more accurate.
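
A small illustration of that comparison, using scikit-learn's built-in breast cancer dataset as a stand-in for the course data (the hyperparameter values tried here are just examples):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Stand-in dataset split into train and test sets.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Train decision trees with different maximum depths.
for depth in [2, 4, 8]:
    dt = DecisionTreeClassifier(max_depth=depth, random_state=42).fit(X_train, y_train)
    print(f"DT (max_depth={depth}): {accuracy_score(y_test, dt.predict(X_test)):.3f}")

# Train KNN classifiers with different numbers of neighbors.
for k in [3, 5, 11]:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"KNN (k={k}): {accuracy_score(y_test, knn.predict(X_test)):.3f}")
```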

Topic 4: Linear Models

In this topic, I learned about Logistic Regression and how to optimize it to maximize the accuracy score.

I used the Scikit-Learn library to train a Logistic Regression model while tuning its hyperparameters to achieve the highest accuracy score possible.
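
A sketch of that tuning loop on a stand-in dataset; the regularization values (C) tried below are illustrative, not the ones from the actual notebook:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Try several values of the regularization strength C and keep the best one.
best_c, best_acc = None, 0.0
for c in [0.01, 0.1, 1, 10, 100]:
    model = make_pipeline(StandardScaler(), LogisticRegression(C=c, max_iter=1000))
    model.fit(X_train, y_train)
    acc = model.score(X_test, y_test)
    if acc > best_acc:
        best_c, best_acc = c, acc

print(f"Best C={best_c}, accuracy={best_acc:.3f}")
```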

Topic 5: Evaluation and Deployment

In this topic, I learned how to efficiently find the best hyperparameter values for a model to ensure accurate predictions. Then, I learned how to evaluate a model and refine it before re-evaluation and deployment.

I trained a KNN model and used a Grid Search technique to find the best hyperparameter values. Then, I used this technique on a Logistic Regression model before evaluating it.
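
A sketch of that grid search workflow with scikit-learn's GridSearchCV; the parameter grids and the stand-in dataset are placeholders rather than the course setup:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Grid search over KNN hyperparameters with 5-fold cross-validation.
knn_grid = GridSearchCV(
    KNeighborsClassifier(),
    {"n_neighbors": [3, 5, 7, 11], "weights": ["uniform", "distance"]},
    cv=5,
)
knn_grid.fit(X_train, y_train)
print("Best KNN params:", knn_grid.best_params_)

# The same technique applied to a Logistic Regression model, then evaluated on held-out data.
lr_grid = GridSearchCV(
    LogisticRegression(max_iter=5000),
    {"C": [0.01, 0.1, 1, 10]},
    cv=5,
)
lr_grid.fit(X_train, y_train)
print("Best LR params:", lr_grid.best_params_)
print("Held-out accuracy:", lr_grid.score(X_test, y_test))
```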

Topic 6: Unsupervised Learning and Ensemble Models

In this topic, I learned what unsupervised ML is and how to implement it using clustering. I also learned about Ensemble Models, which can provide higher accuracy in predictions.

I used a KMeans Clustering model to group together examples in a dataset. I also used different Ensemble Models like Stacking, Random Forests, and Gradient Boosted Decision Trees to make accurate predictions on data.
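
A condensed sketch of both ideas, using the Iris dataset as a stand-in for the course data (the specific estimators and settings are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Unsupervised: group examples into clusters without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("Cluster assignments (first 10):", kmeans.labels_[:10])

# Supervised ensembles: random forest, gradient boosted trees, and stacking.
rf = RandomForestClassifier(random_state=42)
gbdt = GradientBoostingClassifier(random_state=42)
stack = StackingClassifier(
    estimators=[("rf", rf), ("gbdt", gbdt)],
    final_estimator=LogisticRegression(max_iter=1000),
)

for name, model in [("Random Forest", rf), ("GBDT", gbdt), ("Stacking", stack)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```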

Topic 7: Neural Networks

In this topic, I learned what Neural Networks are and how they work. I also learned how Convolutional Neural Networks work to process images.

I used the Keras library in Python to build both a standard neural network and a convolutional neural network to make predictions.
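
A minimal Keras sketch of both architectures, assuming flat 784-dimensional feature vectors for the standard network and 28x28 grayscale images with 10 classes for the CNN (these shapes are illustrative, not the course dataset):

```python
from tensorflow import keras
from tensorflow.keras import layers

# A simple feedforward network for 10-class classification of flat feature vectors.
mlp = keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
mlp.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# A small convolutional network for 28x28 grayscale images.
cnn = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Training would then look like: model.fit(X_train, y_train, epochs=5, validation_split=0.1)
```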

Topic 8: NLP

In this topic, I learned about Natural Language Processing (NLP), which can read text and derive the sentiment behind it.

I used the Keras library to build a neural network to read the sentiment in pieces of text.
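
A sketch of a small Keras sentiment model, assuming the text has already been tokenized into fixed-length integer sequences with a binary positive/negative label (the vocabulary size and sequence length below are made up):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical setup: sequences of 200 word indices from a 10,000-word vocabulary,
# with a binary sentiment label (0 = negative, 1 = positive).
vocab_size, seq_len = 10_000, 200

model = keras.Sequential([
    layers.Input(shape=(seq_len,)),
    layers.Embedding(input_dim=vocab_size, output_dim=64),
    layers.GlobalAveragePooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Training would then look like: model.fit(X_train, y_train, epochs=5, validation_split=0.1)
```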

Final Project

For my final project, I chose to create a KMeans model, which is an unsupervised model, to group together examples within a dataset and see whether the groups closely aligned with their correct labels.
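
A simplified sketch of that idea, using the Iris dataset as a stand-in for the project data: cluster the examples without their labels, then compare the resulting groups to the true labels.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

X, y = load_iris(return_X_y=True)

# Cluster the examples without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
clusters = kmeans.fit_predict(X)

# Compare the clusters to the true labels: a contingency table plus a single agreement score.
print(pd.crosstab(clusters, y, rownames=["cluster"], colnames=["label"]))
print("Adjusted Rand Index:", adjusted_rand_score(y, clusters))
```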
