GitHub - Barbar41/Python-Projects-Feature-Engineering: Feature Engineering & Data Pre-Processing - Outliers-Encoding(Label Encoding,One-Hot Encoding,Rare Encoding)-Writing The Rare Encoder-Feature Engineering (Variable Engineering)

FEATURE ENGINEERING & DATA PRE-PROCESSING

#######################

Outliers

1-Capturing Outliers

2-Outliers with Chart Technique

3-How to Catch Outliers?

4-Are There Any Outliers?

5-Wrapping the Operations in Functions

6-Grab_col_names

7-Accessing the Outliers Themselves

8-Solving the Outlier Problem

9-Delete

10-Suppression Method (re-assignment with thresholds)

11-Recap

12-Multivariate Outlier Analysis: Local Outlier Factor

13-Missing Values

14-Capturing Missing Values

15-Solving the Missing Value Problem
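
The outlier steps listed above (capturing outliers, checking whether any exist, and the suppression method) can be sketched as follows. This is a minimal illustration, not the repository's own code: the function names `outlier_thresholds`, `check_outlier`, and `replace_with_thresholds`, the 1.5 * IQR rule, and the toy `Age` column are all assumptions made for the example.

```python
import pandas as pd


def outlier_thresholds(dataframe, col_name, q1=0.25, q3=0.75):
    """Return (low, up) limits computed as Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    quartile1 = dataframe[col_name].quantile(q1)
    quartile3 = dataframe[col_name].quantile(q3)
    iqr = quartile3 - quartile1
    return quartile1 - 1.5 * iqr, quartile3 + 1.5 * iqr


def check_outlier(dataframe, col_name):
    """Answer 'are there any outliers?' for a single numeric column."""
    low, up = outlier_thresholds(dataframe, col_name)
    mask = (dataframe[col_name] < low) | (dataframe[col_name] > up)
    return bool(mask.any())


def replace_with_thresholds(dataframe, col_name):
    """Suppression: re-assign values outside the limits to the limits."""
    low, up = outlier_thresholds(dataframe, col_name)
    dataframe[col_name] = dataframe[col_name].clip(lower=low, upper=up)


# Hypothetical toy data: 120 is an obvious outlier.
df = pd.DataFrame({"Age": [22.0, 25.0, 27.0, 30.0, 31.0, 35.0, 38.0, 120.0]})

has_outlier_before = check_outlier(df, "Age")
replace_with_thresholds(df, "Age")
has_outlier_after = check_outlier(df, "Age")
```

Suppression keeps every row, which matters on small data sets; deleting the rows instead would be the alternative named in step 9.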

####################

Solution 1: Quick deletion

####################

Solution 2: Filling with Simple Assignment Methods

1-Assigning Values in Categorical Variable Breakdown
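
Assigning values in a categorical breakdown means filling a missing numeric value with a statistic of its own group rather than of the whole column. A minimal sketch, assuming hypothetical `Sex` and `Age` columns and the group mean as the fill value:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with two missing ages.
df = pd.DataFrame({
    "Sex": ["male", "female", "male", "female", "male", "female"],
    "Age": [20.0, 30.0, np.nan, 40.0, 40.0, np.nan],
})

# transform("mean") returns the group mean aligned to every row,
# so fillna picks the mean of the row's own Sex group.
group_mean = df.groupby("Sex")["Age"].transform("mean")
df["Age"] = df["Age"].fillna(group_mean)
```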

#######################

Solution 3: Filling with Predictive Assignment

1-Recap

2-Advanced Analytics

3-Examining Missing Data Structure

4-Examining the Relationship of Missing Values with the Dependent Variable

5-Recap
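
Predictive assignment fills a missing value from a model fitted on the other variables. One possible sketch uses scikit-learn's `KNNImputer`; the choice of KNN, the `n_neighbors=2` setting, the column names, and the toy values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: one missing Age.
df = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, np.nan, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86],
    "Pclass": [3, 1, 3, 1, 3, 1],
})

# Distance-based methods need comparable scales, so scale to [0, 1] first.
scaler = MinMaxScaler()
scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)

# Fill each missing value with the mean of its nearest neighbours.
imputer = KNNImputer(n_neighbors=2)
imputed = pd.DataFrame(imputer.fit_transform(scaled), columns=df.columns)

# Undo the scaling so the filled values are in the original units.
restored = pd.DataFrame(scaler.inverse_transform(imputed), columns=df.columns)
df["Age_imputed_knn"] = restored["Age"]
```

Keeping the result in a separate column makes it easy to compare the filled values with the original ones.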

#######################

2-Encoding (Label Encoding, One-Hot Encoding, Rare Encoding)

1-Label Encoding & Binary Encoding

2-One-Hot Encoding

3-Rare Encoding
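
As a brief illustration of the first two encodings in this list, the sketch below label-encodes a two-class column and one-hot encodes a multi-class one. The column names and values are hypothetical; `drop_first=True` is one common choice to avoid a redundant dummy column, not necessarily the one used in the repository.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "Q", "S"],
})

# Label (binary) encoding: classes are numbered in alphabetical order,
# so "female" becomes 0 and "male" becomes 1.
label_encoder = LabelEncoder()
df["Sex"] = label_encoder.fit_transform(df["Sex"])

# One-hot encoding: one 0/1 column per class, dropping the first class.
df = pd.get_dummies(df, columns=["Embarked"], drop_first=True, dtype=int)
```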

#######################

Analysis comment

1-Analyzing the abundance of categorical variables.

2-Analyzing the relationship between rare categories and the dependent variable.

3-We will write a rare encoder.

4-When a data set has many categorical variables, we should run a rare analyzer on it.

5-At a minimum, we need to know each categorical variable's classes, their frequencies and ratios, and their effect on the target (dependent) variable.

####################

Analyzing the abundance of categorical variables.

####################

Analyzing the relationship between rare categories and the dependent variable.
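
A rare analyzer can be sketched as a function that reports, for each class of a categorical variable, its count, its ratio, and the mean of the target. The name `rare_analyser`, the output column names, and the toy data are assumptions for this example.

```python
import pandas as pd


def rare_analyser(dataframe, target, cat_cols):
    """Return one summary table per categorical column."""
    summaries = {}
    for col in cat_cols:
        counts = dataframe[col].value_counts()
        summaries[col] = pd.DataFrame({
            "COUNT": counts,
            "RATIO": counts / len(dataframe),
            "TARGET_MEAN": dataframe.groupby(col)[target].mean(),
        })
    return summaries


# Hypothetical toy data: "Q" appears only once, so it is a rare class.
df = pd.DataFrame({
    "Embarked": ["S"] * 6 + ["C"] * 3 + ["Q"],
    "Survived": [0, 0, 1, 0, 1, 0, 1, 1, 0, 1],
})

summary = rare_analyser(df, "Survived", ["Embarked"])["Embarked"]
```

A class with a very small ratio has a target mean estimated from only a few rows, which is the motivation for grouping such classes together.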

#######################

3-Writing the rare encoder.
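
One way a rare encoder might look is shown below: every class whose ratio falls under a threshold is merged into a single "Rare" class. The function name `rare_encoder`, the `rare_perc` parameter, the "Rare" label, and the toy data are assumptions, not the repository's confirmed implementation.

```python
import pandas as pd


def rare_encoder(dataframe, rare_perc, cat_cols):
    """Return a copy in which classes rarer than rare_perc become 'Rare'."""
    temp_df = dataframe.copy()
    for col in cat_cols:
        ratios = temp_df[col].value_counts() / len(temp_df)
        rare_labels = ratios[ratios < rare_perc].index
        # Keep values that are not rare; replace the rest with "Rare".
        temp_df[col] = temp_df[col].where(~temp_df[col].isin(rare_labels), "Rare")
    return temp_df


# Hypothetical toy data: "Dr" and "Rev" each make up 10% of the rows.
df = pd.DataFrame({"Title": ["Mr"] * 6 + ["Miss"] * 2 + ["Dr", "Rev"]})

encoded = rare_encoder(df, rare_perc=0.15, cat_cols=["Title"])
```

Working on a copy leaves the original frame untouched, so the effect of different thresholds can be compared.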

1-Feature Scaling

2-StandardScaler: Classic standardization. Subtract the mean, divide by the standard deviation. z = (x - u) / s

3-RobustScaler: Subtract the median and divide by the interquartile range (IQR).

4-MinMaxScaler: Rescale a variable to a range between two given values.

5-Numeric to Categorical: Converting Numeric Variables to Categorical Variables

6-Binning

7-Feature Extraction

8-Binary Features: Flag, Bool, True-False

9-Deriving Features from Texts

10-Letter count

11-Word count

12-Capturing Special Structures

13-Deriving Variables with Regex

14-Generating Date Variables

15-Feature Interactions

16-Titanic End-to-End Feature Engineering & Data Preprocessing
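
The three scalers named in this list can be compared on a single column. The sketch below uses scikit-learn's `StandardScaler`, `RobustScaler`, and `MinMaxScaler`; the toy `Age` values and the new column names are assumptions for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Hypothetical toy data.
df = pd.DataFrame({"Age": [20.0, 30.0, 40.0, 50.0, 60.0]})

# z = (x - u) / s, with u the mean and s the standard deviation.
df["Age_standard"] = StandardScaler().fit_transform(df[["Age"]]).ravel()

# (x - median) / IQR, which is less sensitive to outliers.
df["Age_robust"] = RobustScaler().fit_transform(df[["Age"]]).ravel()

# Rescale into the range given by feature_range, here 0 to 1.
min_max = MinMaxScaler(feature_range=(0, 1))
df["Age_min_max"] = min_max.fit_transform(df[["Age"]]).ravel()
```

The scalers expect a two-dimensional input, which is why the column is selected with double brackets and the result is flattened with `ravel()`.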

#######################

4-Feature Engineering (Variable Engineering)

1-Outliers

2-Missing Values

3-Label Encoding

4-Rare Encoding

5-One-Hot Encoding

6-Standard Scaler

7-Model

#######################

--The dependent variable is Survived; the independent variables are all variables other than PassengerId and Survived.

--We divide the data set into two: train and test.

--We will fit a model on the train set and then test that model on the test set.

--Let's create the model object, using a tree-based method.

--Fit the model: X_train holds the independent variables, y_train the dependent (target) variable.

--Prediction of dependent variables of the test set

--Compare the predictions with the test set's dependent variable y.

--What score would be obtained without any preprocessing?

--How are the newly produced variables doing?
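
The modeling steps above can be sketched as follows. The data here is a small synthetic stand-in, not the Titanic data set, and the toy rule that generates `Survived`, the choice of `RandomForestClassifier` as the tree-based method, the 30% test size, and the random seeds are all assumptions for illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 40 rows with already-encoded features.
n = 40
df = pd.DataFrame({
    "PassengerId": range(1, n + 1),
    "Sex": [i % 2 for i in range(n)],
    "Pclass": [(i % 3) + 1 for i in range(n)],
    "Fare": [10.0 + 2.5 * (i % 7) for i in range(n)],
})
df["Survived"] = df["Sex"]  # toy rule so the example has a learnable signal

# Dependent variable and independent variables.
y = df["Survived"]
X = df.drop(["PassengerId", "Survived"], axis=1)

# Split into train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=17
)

# Fit a tree-based model on the train set.
rf_model = RandomForestClassifier(random_state=46).fit(X_train, y_train)

# Predict the dependent variable of the test set and score the predictions.
y_pred = rf_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# One way to see how each variable is doing: the model's feature importances.
importances = pd.Series(rf_model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
```

Running the same split and model on the raw and on the engineered data gives a like-for-like comparison of the two scores.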
