#######################
1-Capturing Outliers
2-Detecting Outliers with Graphical Techniques
3-How to Catch Outliers?
4-Are There Any Outliers?
5-Turning the Steps into Functions
6-Grab_col_names
7-Accessing the Outliers Themselves
8-Solving the Outlier Problem
9-Deleting Outliers
10-Suppression Method (re-assignment with thresholds)
11-Recap
12-Multivariate Outlier Analysis: Local Outlier Factor
13-Missing Values
14-Capturing Missing Values
15-Solving the Missing Value Problem
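
The outlier steps listed above (finding thresholds, checking, accessing, deleting, suppressing) can be sketched roughly as follows. This is a minimal sketch that assumes the usual IQR rule with a 1.5 multiplier; the function names and the toy "age" column are illustrative and not taken from these notes.

```python
import pandas as pd

def outlier_thresholds(df, col, q1=0.25, q3=0.75):
    """Return (low, up) limits: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    quartile1 = df[col].quantile(q1)
    quartile3 = df[col].quantile(q3)
    iqr = quartile3 - quartile1
    return quartile1 - 1.5 * iqr, quartile3 + 1.5 * iqr

def check_outlier(df, col):
    """Are there any outliers in this column?"""
    low, up = outlier_thresholds(df, col)
    return bool(((df[col] < low) | (df[col] > up)).any())

def grab_outliers(df, col):
    """Access the outlier rows themselves."""
    low, up = outlier_thresholds(df, col)
    return df[(df[col] < low) | (df[col] > up)]

def remove_outlier(df, col):
    """Deletion: keep only the rows inside the limits."""
    low, up = outlier_thresholds(df, col)
    return df[~((df[col] < low) | (df[col] > up))]

def replace_with_thresholds(df, col):
    """Suppression: re-assign values beyond a limit to that limit."""
    low, up = outlier_thresholds(df, col)
    out = df.copy()
    out[col] = out[col].clip(lower=low, upper=up)
    return out

# Toy data (an assumption): one clearly extreme age.
toy = pd.DataFrame({"age": [22.0, 25.0, 27.0, 30.0, 31.0, 33.0, 35.0, 120.0]})
has_outlier = check_outlier(toy, "age")
outlier_rows = grab_outliers(toy, "age")
trimmed = remove_outlier(toy, "age")
suppressed = replace_with_thresholds(toy, "age")
```

Deletion shrinks the data set, while suppression keeps every row and only caps the extreme values, which is usually the gentler choice on small data.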
####################
Solution 1: Quick deletion
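
Capturing the missing values and then deleting them quickly might look like the sketch below. The toy frame and its column names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption): missing values in both columns.
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0],
    "cabin": ["C85", None, None],
})

# Capture: number of missing values per column.
missing_counts = toy.isnull().sum()

# Quick deletion: drop every row that has at least one missing value.
dropped = toy.dropna()
```

Note how much data is lost: a single missing cell is enough to remove the whole row.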
####################
Solution 2: Filling with Simple Assignment Methods
1-Assigning Values by Categorical Variable Breakdown
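
A minimal sketch of simple assignment, assuming a toy frame with illustrative column names: a numeric variable is filled with its median, a categorical one with its mode, and, for the breakdown case, age is filled with the mean age of each sex group.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption).
toy = pd.DataFrame({
    "sex": ["male", "male", "female", "female", "male"],
    "age": [30.0, np.nan, 20.0, np.nan, 40.0],
    "embarked": ["S", None, "S", "C", "S"],
})

filled = toy.copy()

# Numeric variable: fill with the overall median.
filled["age_median"] = toy["age"].fillna(toy["age"].median())

# Categorical variable: fill with the most frequent class (mode).
filled["embarked"] = toy["embarked"].fillna(toy["embarked"].mode()[0])

# Categorical breakdown: fill age with the mean age of the same sex.
group_mean = toy.groupby("sex")["age"].transform("mean")
filled["age_by_sex"] = toy["age"].fillna(group_mean)
```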
#######################
Solution 3: Filling with Predictive Assignment
1-Recap
2-Advanced Analytics
3-Examining Missing Data Structure
4-Examining the Relationship of Missing Values with the Dependent Variable
5-Recap
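
One way to sketch predictive assignment is scikit-learn's KNNImputer: scale the variables, fill each missing value from its nearest neighbours, then undo the scaling. KNN is an assumption here, since the notes do not name the method, and the toy data is illustrative. The last lines sketch how missingness can be compared against the dependent variable.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import MinMaxScaler

# Toy data (an assumption): one missing age among two clear groups.
toy = pd.DataFrame({
    "age": [22.0, 24.0, np.nan, 60.0, 62.0],
    "fare": [7.0, 8.0, 7.5, 80.0, 82.0],
    "survived": [0, 0, 1, 1, 1],
})
features = ["age", "fare"]

# Scale first so that no variable dominates the distance calculation.
# MinMaxScaler ignores NaN while fitting and keeps it while transforming.
scaler = MinMaxScaler()
scaled = scaler.fit_transform(toy[features])

# Fill the missing value from the 2 nearest rows, then undo the scaling.
imputed = KNNImputer(n_neighbors=2).fit_transform(scaled)
restored = pd.DataFrame(scaler.inverse_transform(imputed), columns=features)

# Relationship of missingness with the dependent variable:
# mean of the target for rows with (1) and without (0) a missing age.
age_na_flag = toy["age"].isnull().astype(int).rename("age_na_flag")
target_by_flag = toy.groupby(age_na_flag)["survived"].mean()
```

If the target mean differs clearly between the two flag groups, the missingness itself may carry information and is worth keeping as a feature.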
#######################
1-Label Encoding & Binary Encoding
2-One-Hot Encoding
3-Rare Encoding
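
Label (binary) encoding and one-hot encoding can be sketched as below, assuming scikit-learn's LabelEncoder and pandas' get_dummies; the toy columns are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data (an assumption).
toy = pd.DataFrame({
    "sex": ["male", "female", "female"],
    "embarked": ["S", "C", "Q"],
})

# Label / binary encoding: classes are numbered in alphabetical order,
# so "female" becomes 0 and "male" becomes 1.
encoder = LabelEncoder()
toy["sex_encoded"] = encoder.fit_transform(toy["sex"])

# One-hot encoding: one column per class. drop_first=True drops the
# first class ("C") to avoid perfectly correlated dummy columns.
one_hot = pd.get_dummies(toy, columns=["embarked"], drop_first=True)
```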
#######################
1-Analyzing how frequent or rare the classes of the categorical variables are.
2-Analyzing the relationship between rare categories and the dependent variable.
3-Writing the rare encoder.
4-When a data set has plenty of categorical variables, we should run the rare analyzer on it.
5-At a minimum, we need to know each categorical variable's classes, their frequencies and ratios, and how each class relates to the target.
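
The rare analysis and the rare encoder described above could be sketched like this. The function names, the 20% cutoff and the toy data are assumptions made for illustration.

```python
import pandas as pd

def rare_analyser(df, target, cat_cols):
    """For each categorical column: class count, class ratio, target mean."""
    summaries = {}
    for col in cat_cols:
        counts = df[col].value_counts()
        summaries[col] = pd.DataFrame({
            "count": counts,
            "ratio": counts / len(df),
            "target_mean": df.groupby(col)[target].mean(),
        })
    return summaries

def rare_encoder(df, cat_cols, rare_perc):
    """Merge classes whose ratio is below rare_perc into a single 'Rare' class."""
    out = df.copy()
    for col in cat_cols:
        ratios = out[col].value_counts() / len(out)
        rare_labels = ratios[ratios < rare_perc].index
        out[col] = out[col].where(~out[col].isin(rare_labels), "Rare")
    return out

# Toy data (an assumption): "Dr" appears in only 10% of the rows.
toy = pd.DataFrame({
    "title": ["Mr"] * 6 + ["Mrs"] * 3 + ["Dr"],
    "survived": [0, 0, 0, 0, 1, 1, 1, 1, 0, 1],
})
summary = rare_analyser(toy, "survived", ["title"])["title"]
encoded = rare_encoder(toy, ["title"], rare_perc=0.2)
```

Looking at the summary before encoding matters: a rare class with a very different target mean may deserve to stay separate.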
#######################
1-Feature Scaling
2-StandardScaler: Classic standardization. Subtract the mean, divide by the standard deviation. z = (x - u) / s
3-RobustScaler: Subtract the median, divide by the interquartile range (IQR).
4-MinMaxScaler: Rescales a variable into the range between two given values (0 and 1 by default).
5-Numeric to Categorical: Converting Numeric Variables to Categorical Variables
6-Binning
7-Feature Extraction
8-Binary Features: Flag, Bool, True-False
9-Deriving Features from Texts
10-Letter Count
11-Word Count
12-Capturing Special Structures
13-Deriving Variables with Regex
14-Generating Date Variables
15-Feature Interactions
16-Titanic End-to-End Feature Engineering & Data Preprocessing
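
Several of the steps in this list can be sketched on one toy frame: the three scalers, binning, a binary flag, text-derived features, a regex extraction, date variables and a feature interaction. The column names and values are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Toy data (an assumption).
toy = pd.DataFrame({
    "age": [10.0, 20.0, 30.0, 40.0, 50.0],
    "pclass": [3, 1, 3, 1, 3],
    "name": ["Braund, Mr. Owen", "Cumings, Mrs. John", "Heikkinen, Miss. Laina",
             "Allen, Dr. William", "Moran, Mr. James"],
    "cabin": ["C85", None, None, "E46", None],
    "date": ["2021-01-15", "2021-03-01", "2022-07-20", "2022-12-31", "2023-05-05"],
})

# Feature scaling: the scalers expect a 2D input, hence toy[["age"]].
toy["age_standard"] = StandardScaler().fit_transform(toy[["age"]]).ravel()
toy["age_robust"] = RobustScaler().fit_transform(toy[["age"]]).ravel()
toy["age_min_max"] = MinMaxScaler().fit_transform(toy[["age"]]).ravel()

# Binning: numeric to categorical, 5 equal-frequency bins.
toy["age_bin"] = pd.qcut(toy["age"], q=5, labels=False)

# Binary feature: is a cabin recorded or not?
toy["has_cabin"] = toy["cabin"].notnull().astype(int)

# Features from text: letter count, word count, a special structure.
toy["name_letter_count"] = toy["name"].str.len()
toy["name_word_count"] = toy["name"].str.split().str.len()
toy["name_dr"] = toy["name"].str.contains("Dr.", regex=False).astype(int)

# Regex: the title is the word that ends with a dot.
toy["title"] = toy["name"].str.extract(r" ([A-Za-z]+)\.", expand=False)

# Date variables.
toy["date"] = pd.to_datetime(toy["date"])
toy["year"] = toy["date"].dt.year
toy["month"] = toy["date"].dt.month

# Feature interaction.
toy["age_pclass"] = toy["age"] * toy["pclass"]
```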
#######################
1-Outliers
2-Missing Values
3-Label Encoding
4-Rare Encoding
5-One-Hot Encoding
6-Standard Scaler
7-Model
#######################
--Dependent variable: Survived. Independent variables: everything except PassengerId and Survived.
--We divide the data set into two: train and test.
--We will build a model on the train set and test that model on the test set.
--Let's bring in the model object, a tree-based method.
--Fitting the model: X_train holds the independent variables, y_train the dependent (target) variable.
--Predicting the dependent variable for the test set.
--Scoring the predictions against the test set's dependent variable y.
--What score would we get without any preprocessing?
--How do the newly derived variables perform?
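
The modelling steps above can be sketched as follows. The notes only say "a tree-based method", so RandomForestClassifier is an assumption, and the small synthetic frame stands in for the preprocessed Titanic data; its columns and the rule that generates Survived are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed data (an assumption).
rng = np.random.default_rng(17)
n = 200
dff = pd.DataFrame({
    "PassengerId": np.arange(1, n + 1),
    "AGE": rng.uniform(1, 80, n),
    "SEX_male": rng.integers(0, 2, n),
})
dff["Survived"] = ((dff["SEX_male"] == 0) & (dff["AGE"] < 60)).astype(int)

# Dependent variable and independent variables.
y = dff["Survived"]
X = dff.drop(["PassengerId", "Survived"], axis=1)

# Divide the data set into train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=17
)

# Fit a tree-based model on the train set.
rf_model = RandomForestClassifier(random_state=46).fit(X_train, y_train)

# Predict the test set and score the predictions against y_test.
y_pred = rf_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# How are the variables doing? Importance of each feature in the model.
importances = pd.DataFrame(
    {"Feature": X.columns, "Value": rf_model.feature_importances_}
).sort_values("Value", ascending=False)
```

Running the same steps on the raw data, before any feature engineering, gives the baseline score to compare against.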