This project enhances WLAN security through machine learning models capable of detecting wireless network attacks. The modeling pipeline includes both classification and anomaly detection models trained on realistic network traffic data.
We used the IoT-23 dataset from Kaggle, which contains labeled benign and malicious network traffic. Preprocessing steps included:
- Removing redundant and highly sparse features (>50% missing).
- Encoding categorical features numerically.
- Filling remaining missing values with zeros.
- Addressing class imbalance using under- and oversampling.
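The first three preprocessing steps could be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the `preprocess` function, the `label` column name, and the choice to reserve code 0 for missing categories are all assumptions.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame, label_col: str = "label",
               max_missing: float = 0.5) -> pd.DataFrame:
    """Drop sparse columns, encode categoricals, zero-fill remaining gaps."""
    features = df.drop(columns=[label_col])
    # Keep only columns whose fraction of missing values is at most max_missing.
    keep = features.columns[features.isna().mean() <= max_missing]
    features = features[keep].copy()
    # Encode non-numeric columns as integers. Category codes are shifted by 1
    # so that 0 stays reserved for missing values (assumed convention).
    for col in features.select_dtypes(exclude=["number"]).columns:
        features[col] = features[col].astype("category").cat.codes + 1
    features = features.fillna(0)
    features[label_col] = df[label_col].values
    return features


# Tiny hypothetical frame: 'sparse' is 75% missing and should be dropped.
raw = pd.DataFrame({
    "proto": ["tcp", "udp", None, "tcp"],
    "bytes": [1.0, np.nan, 3.0, 4.0],
    "sparse": [np.nan, np.nan, np.nan, 1.0],
    "label": [0, 1, 0, 1],
})
clean = preprocess(raw)
print(clean)
```

Filling with zeros is simple but makes a missing value indistinguishable from a real zero in numeric columns, which is worth keeping in mind when reading feature importances.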
Four attack datasets were processed individually:
- Deauthentication
- Evil Twin
- KRACK
- Rogue AP
Steps:
- Dropped sparse features.
- Balanced training data (undersample `Normal`, oversample `Attack`).
- Retained top 20 features per attack using Random Forest.
- Identified 63 common features across all attacks.
- Merged and deduplicated `Normal` samples for unbalanced dataset.
- Built a balanced merged dataset for comprehensive modeling.
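The balancing and feature-selection steps above might look something like the sketch below. The helper names (`balance`, `top_features`, `common_features`), the `label` column, and the synthetic demo data are hypothetical. Since 63 is more than 20, the "common" feature list is presumably the union of the per-attack top-20 lists, and the sketch assumes that reading.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def balance(df, label_col="label", n_per_class=1000, seed=0):
    """Resample every class to n_per_class rows (under- or oversampling)."""
    parts = []
    for _, group in df.groupby(label_col):
        # Sample with replacement only when the class is too small.
        replace = len(group) < n_per_class
        parts.append(group.sample(n=n_per_class, replace=replace, random_state=seed))
    return pd.concat(parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)


def top_features(X, y, k=20, seed=0):
    """Return the k most important feature names according to a Random Forest."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    importances = pd.Series(forest.feature_importances_, index=X.columns)
    return importances.nlargest(k).index.tolist()


def common_features(per_attack_features):
    """Union of the per-attack feature lists, in first-seen order."""
    seen = {}
    for names in per_attack_features.values():
        for name in names:
            seen.setdefault(name, None)
    return list(seen)


# Synthetic demo data (not the project's): 'Attack' whenever f0 is large.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(600, 5)), columns=[f"f{i}" for i in range(5)])
demo["label"] = np.where(demo["f0"] > 1.0, "Attack", "Normal")

balanced = balance(demo, n_per_class=200)
selected = top_features(balanced.drop(columns="label"), balanced["label"], k=2)
merged = common_features({"deauth": selected, "evil_twin": ["f0", "f3"]})
print(balanced["label"].value_counts().to_dict(), selected, merged)
```

Resampling should be applied to the training split only, so that the test set keeps its natural class distribution.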
- Model: Random Forest with 63 features
- Results: Excellent performance on `Normal`, poor on attacks
- Macro F1-score: `0.42`
- Selected common top features across attacks
- Improved performance on `Krack`
- Macro F1-score: `0.52`
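Macro F1 averages the per-class F1-scores with equal weight, so strong results on `Normal` cannot compensate for weak attack classes. The small pure-Python sketch below illustrates the effect. The labels are made up for illustration and the helper names are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {class: F1} computed from true and predicted label sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for cls in set(y_true) | set(y_pred):
        denom = 2 * tp[cls] + fp[cls] + fn[cls]
        scores[cls] = 2 * tp[cls] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Made-up example: a model that always predicts 'Normal'.
y_true = ["Normal"] * 8 + ["Deauth"] * 2
y_pred = ["Normal"] * 10
print(per_class_f1(y_true, y_pred), macro_f1(y_true, y_pred))
```

In this toy case accuracy is 80% while macro F1 is only 4/9 (about 0.44), because the `Deauth` class scores zero.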
- Trained with 21 engineered features → filtered to 15 based on importance
- Final model: `n_estimators=25`, `max_depth=12`
- F1-scores: `Attack: 1.00`, `Normal: 1.00`
- Accuracy: `100%`
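A rough sketch of this two-pass setup with scikit-learn is shown below: rank 21 features, keep the 15 most important, then train the final forest with the hyperparameters listed above. The data is synthetic (`make_classification`) and only stands in for the engineered features, so the printed scores will not match the reported ones. Reusing the same hyperparameters for the ranking pass is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 21 engineered features (not the project's data).
X, y = make_classification(n_samples=2000, n_features=21, n_informative=8,
                           class_sep=2.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# First pass: rank all 21 features by impurity-based importance.
ranker = RandomForestClassifier(n_estimators=25, max_depth=12, random_state=0)
ranker.fit(X_train, y_train)
top15 = np.argsort(ranker.feature_importances_)[::-1][:15]

# Final model: trained on the 15 retained features only.
final = RandomForestClassifier(n_estimators=25, max_depth=12, random_state=0)
final.fit(X_train[:, top15], y_train)

# Per-class F1 on the held-out split (class 0 = Normal, class 1 = Attack here).
scores = f1_score(y_test, final.predict(X_test[:, top15]), average=None)
print(dict(zip(["Normal", "Attack"], scores.round(3))))
```

Computing the importance ranking on the training split only, as done here, keeps the test split untouched by feature selection.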
- Filtered dataset to only attack instances
- Used selected features
- F1-scores: All attack types = `1.00`
- Overall Accuracy: ~100%
- Perfect recall for all attacks
- Lower precision for `Rogue AP` due to rare samples
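One way a rare class can end up with lower precision while every recall still rounds to 1.00 is shown in the worked example below. The confusion matrix is entirely hypothetical and is not the project's result. It only illustrates the arithmetic: a handful of misrouted samples from large classes barely changes their recall but heavily dilutes the precision of a class with very few true samples.

```python
import numpy as np

classes = ["Deauth", "Evil Twin", "Krack", "Rogue AP"]

# Hypothetical confusion matrix (rows = true class, columns = predicted class).
# These counts are made up for illustration only.
cm = np.array([
    [50000,     0,     0,  0],
    [    0, 49997,     0,  3],
    [    0,     0, 49998,  2],
    [    0,     0,     0, 10],
])

recall = np.diag(cm) / cm.sum(axis=1)     # correct / all true samples of the class
precision = np.diag(cm) / cm.sum(axis=0)  # correct / all predictions of the class

for name, p, r in zip(classes, precision, recall):
    print(f"{name:10s} precision={p:.2f} recall={r:.2f}")
```

Here five stray predictions out of roughly 100,000 samples leave every recall at 1.00 after rounding, while `Rogue AP` precision falls to 10/15 (about 0.67).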
| Model | ROC-AUC | F1-Score |
|---|---|---|
| Isolation Forest | 0.655 | 0.29 |
| One-Class SVM | 0.864 | 0.29 |
| Autoencoder | 0.900 | 0.95 |
- Autoencoder outperformed other models, showing strong anomaly detection capabilities using reconstruction error.
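Reconstruction-error scoring can be illustrated with the sketch below. It is not the project's autoencoder: scikit-learn's `MLPRegressor` is swapped in as a small stand-in autoencoder, the data is synthetic, and the 95th-percentile threshold is an assumed choice. The idea is to train the model to reproduce normal traffic only, then treat samples it reconstructs poorly as anomalies.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in data: normal samples lie near a 2-D subspace of an
# 8-D feature space, anomalies are drawn from the full space.
basis = rng.normal(size=(2, 8))
normal_train = rng.normal(size=(2000, 2)) @ basis + 0.05 * rng.normal(size=(2000, 8))
normal_test = rng.normal(size=(300, 2)) @ basis + 0.05 * rng.normal(size=(300, 8))
anomalies = rng.normal(scale=2.0, size=(300, 8))

scaler = StandardScaler().fit(normal_train)
Z_train = scaler.transform(normal_train)

# An MLP trained to reproduce its own input through a 2-unit bottleneck.
autoencoder = MLPRegressor(hidden_layer_sizes=(16, 2, 16), activation="tanh",
                           max_iter=500, random_state=0)
autoencoder.fit(Z_train, Z_train)


def reconstruction_error(X):
    """Mean squared reconstruction error per sample."""
    Z = scaler.transform(X)
    return np.mean((Z - autoencoder.predict(Z)) ** 2, axis=1)


X_eval = np.vstack([normal_test, anomalies])
y_eval = np.r_[np.zeros(len(normal_test)), np.ones(len(anomalies))]
scores = reconstruction_error(X_eval)
auc = roc_auc_score(y_eval, scores)

# Assumed rule: flag anything above the 95th percentile of training error.
threshold = np.percentile(reconstruction_error(normal_train), 95)
flagged = scores > threshold
print(f"ROC-AUC={auc:.3f}, flagged {flagged.sum()} of {len(flagged)} samples")
```

ROC-AUC is computed from the raw error scores and does not depend on the threshold, whereas F1 does, which is one reason the two columns in the table can diverge for the same model.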
| File | Description |
|---|---|
| `X_*_train`, `y_*_train` | Per-attack raw training data |
| `X_*_train_balanced`, `y_*_train_balanced` | Balanced attack training sets |
| `X_train_merged`, `y_train_merged` | Merged unbalanced training data |
| `X_train_balanced_merged`, `y_train_balanced_merged` | Final merged balanced dataset |
| `X_test_merged`, `y_test_merged` | Unified test set for all attacks |
- Integrate deep learning models (e.g., LSTM, CNNs)
- Expand dataset with more attack types
- Build a fully automated alert & response system
- Optimize for real-time edge deployment
- Maroua BOUDERKA
- Larbi SAID CHIKH
- Mira Thiziri SINIANE
- Marouane Abdeldjalil OULAD ALI
- Hana AFRA