ISAS 2025: Abnormal Behavior Recognition using Pose-Based Feature Engineering and Deep Ensemble Learning
We propose an abnormal activity recognition system for individuals with developmental disabilities using 2D pose keypoint data and deep learning. The solution consists of:
- A feature engineering pipeline crafted from raw keypoints
- A dual deep learning ensemble combining Bi-LSTM and Transformer, optimized for detecting abrupt behaviors like "Attacking" or "Throwing things"
Authors: The Hoang Nguyen, Gia Huy Ly, and Duy Khanh Dinh Hoang, all students at VNU-HCM, Ho Chi Minh City University of Technology (HCMUT).
The dataset is provided by the ISAS 2025 challenge:
- 4 subjects for training and 1 for testing in each LOSO (Leave-One-Subject-Out) fold; see the split sketch after this list
- 8 labeled activities: 4 normal (e.g., Sitting, Walking) and 4 abnormal (e.g., Biting nails, Attacking)
- Pose keypoints extracted via YOLOv7 at 30 FPS
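The LOSO protocol itself is standard; a minimal sketch of how the splits can be generated with scikit-learn's `LeaveOneGroupOut` (the placeholder data and variable names are illustrative):

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Placeholder data: X = per-frame features, y = activity labels,
# groups = the subject ID (1..5) that produced each frame.
rng = np.random.default_rng(0)
X = rng.random((1000, 70))
y = rng.integers(0, 8, size=1000)
groups = rng.integers(1, 6, size=1000)

logo = LeaveOneGroupOut()
for fold, (train_idx, test_idx) in enumerate(logo.split(X, y, groups), 1):
    print(f"Fold {fold}: train on subjects {np.unique(groups[train_idx])}, "
          f"test on subject {np.unique(groups[test_idx])[0]}")
```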
Main challenges:
- Data imbalance: more normal than abnormal frames
- Temporal variability between activity types
- Subject-specific differences in motion styles
- Short and unpredictable unusual behaviors (e.g., Attacking)
We designed over 70 continuous features per frame from keypoints to capture motion, geometry, asymmetry, and temporal-frequency characteristics:
| Feature Group | Description |
|---|---|
| Motion | Velocity, acceleration, jerk of hands and nose |
| Geometric | Euclidean distances (e.g., hand–nose), joint angles (elbow, knee), torso angle, hand-above-shoulder flag |
| Asymmetry | Speed and position differences between left/right hands |
| Temporal statistics | Rolling mean, std, and max (1.5 s window ≈ 45 frames) |
| Frequency & regularity | Dominant frequency (FFT), zero-crossing rate (ZCR), movement regularity |
All features are computed on interpolated and smoothed keypoints to reduce noise.
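As a hedged illustration of this pipeline, the sketch below computes a small subset of these features with pandas; the keypoint column names (`rwrist_x`, `nose_x`, ...) are assumptions, not the project's actual schema:

```python
import numpy as np
import pandas as pd

FPS, WIN = 30, 45  # 30 FPS; 1.5 s rolling window = 45 frames

def frame_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative subset of the per-frame features. `df` holds the
    interpolated, smoothed keypoints, e.g. rwrist_x/rwrist_y, nose_x/nose_y."""
    out = pd.DataFrame(index=df.index)

    # Motion: velocity, acceleration, jerk of the right wrist.
    vx = df["rwrist_x"].diff() * FPS
    vy = df["rwrist_y"].diff() * FPS
    speed = pd.Series(np.hypot(vx, vy), index=df.index)
    out["rwrist_speed"] = speed
    out["rwrist_accel"] = speed.diff() * FPS
    out["rwrist_jerk"] = out["rwrist_accel"].diff() * FPS

    # Geometry: Euclidean hand-nose distance.
    out["rhand_nose_dist"] = np.hypot(df["rwrist_x"] - df["nose_x"],
                                      df["rwrist_y"] - df["nose_y"])

    # Temporal statistics over the 1.5 s window.
    out["rwrist_speed_mean"] = speed.rolling(WIN).mean()
    out["rwrist_speed_std"] = speed.rolling(WIN).std()
    out["rwrist_speed_max"] = speed.rolling(WIN).max()

    # Regularity: zero-crossing rate of the mean-removed x-velocity.
    centered = vx - vx.rolling(WIN).mean()
    out["rwrist_vx_zcr"] = np.sign(centered).diff().abs().rolling(WIN).mean() / 2
    return out
```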
Bi-LSTM model:
- Two Bi-LSTM layers (128 and 64 units)
- BatchNorm and Dropout for generalization
- Effective for repetitive behaviors (Walking, Sitting)
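A minimal Keras sketch of this architecture, assuming the 60-frame input windows listed in the configuration table below; the dropout rate and optimizer are illustrative choices, not confirmed hyperparameters:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_bilstm(seq_len=60, n_features=70, n_classes=8):
    # Two stacked Bi-LSTM layers (128 and 64 units) with
    # BatchNorm + Dropout between them, as described above.
    inputs = keras.Input(shape=(seq_len, n_features))
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    x = layers.Bidirectional(layers.LSTM(64))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```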
Hybrid model:
- Bi-LSTM for short-term motion encoding
- Transformer encoder for long-range, non-linear dependencies
- Effective for bursty behaviors (Attacking, Throwing things)
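A corresponding sketch of the hybrid model; the attention head count, feed-forward width, and pooling choice are assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_hybrid(seq_len=60, n_features=70, n_classes=8,
                 heads=4, key_dim=32):
    # Bi-LSTM front end for short-term motion, followed by a
    # Transformer encoder block for long-range dependencies.
    inputs = keras.Input(shape=(seq_len, n_features))
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)

    # Transformer encoder block: self-attention + feed-forward,
    # each with a residual connection and layer normalization.
    attn = layers.MultiHeadAttention(num_heads=heads, key_dim=key_dim)(x, x)
    x = layers.LayerNormalization()(x + attn)
    ff = layers.Dense(128, activation="relu")(x)
    ff = layers.Dense(x.shape[-1])(ff)
    x = layers.LayerNormalization()(x + ff)

    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```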
The ensemble takes a weighted average of the two models' softmax probabilities:
- Bi-LSTM: 52%
- Hybrid: 48%
Weights tuned via LOSO cross-validation.
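In code the fusion is a one-liner; a sketch assuming both models output per-window softmax probabilities:

```python
import numpy as np

def ensemble_predict(p_lstm: np.ndarray, p_hybrid: np.ndarray,
                     w_lstm: float = 0.52) -> np.ndarray:
    """Weighted average of softmax probabilities, then argmax.
    p_lstm and p_hybrid have shape (n_windows, n_classes)."""
    probs = w_lstm * p_lstm + (1.0 - w_lstm) * p_hybrid
    return probs.argmax(axis=1)
```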
| Component | Value |
|---|---|
| Frame rate | 30 FPS |
| Input sequence length | 60 frames (≈ 2 seconds) |
| Feature rolling window | 45 frames (≈ 1.5 seconds) |
| Overlap rate | ~90% |
| Subjects (rotated through LOSO folds) | 1, 2, 3, 4, 5 |
Sliding window segmentation ensures dense sampling for short-duration activities.
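A minimal sketch of this segmentation; a stride of 6 frames over 60-frame windows yields the ~90% overlap quoted above, and the majority-label rule is an assumption:

```python
import numpy as np

def sliding_windows(features: np.ndarray, labels: np.ndarray,
                    seq_len: int = 60, stride: int = 6):
    """Segment a per-frame feature matrix (n_frames, n_features) into
    overlapping windows; stride=6 on 60-frame windows gives ~90% overlap.
    Each window is labeled by the majority label of its frames."""
    X, y = [], []
    for start in range(0, len(features) - seq_len + 1, stride):
        end = start + seq_len
        X.append(features[start:end])
        vals, counts = np.unique(labels[start:end], return_counts=True)
        y.append(vals[counts.argmax()])
    return np.stack(X), np.array(y)
```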
- Input: unlabeled pose sequences
- Output: `participant_id, timestamp, predicted_label`
- Metrics: Accuracy, Abnormal F1-score, Precision, Recall
- Evaluate model generalization on the unseen subject
- Submit a LOSO-specific evaluation report
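These metrics map directly onto scikit-learn; a sketch in which the abnormal F1 is computed as a macro average over the four abnormal classes (an assumption about the challenge's exact definition):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

ABNORMAL = ["Attacking", "Biting nails", "Head banging", "Throwing things"]

def report(y_true, y_pred):
    acc = accuracy_score(y_true, y_pred)
    # Macro precision/recall/F1 restricted to the four abnormal classes.
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=ABNORMAL, average="macro", zero_division=0)
    print(f"Accuracy {acc:.3f} | abnormal F1 {f1:.3f} "
          f"| precision {prec:.3f} | recall {rec:.3f}")
```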
Below is the average per-class F1-score across all LOSO folds using the Ensemble model (Bi-LSTM + Hybrid):
| Abnormal Behavior | Average F1-score |
|---|---|
| Attacking | 76.58% |
| Biting nails | 71.86% |
| Head banging | 78.42% |
| Throwing things | 77.20% |

| Normal Behavior | Average F1-score |
|---|---|
| Eating snacks | 61.32% |
| Sitting quietly | 45.40% |
| Using phone | 37.40% |
| Walking | 94.54% |
Submission file: `[team_name]_test.csv` with format `participant_id, timestamp, predicted_label`
- Engineered 70+ temporal and geometric features from 2D keypoints
- Designed a hybrid deep model suitable for both smooth and irregular behaviors
- Applied ensemble fusion for improved recognition accuracy
- Tuned temporal parameters using rolling-window analysis and LLM-guided prompting
- Achieved high performance on short, bursty, and challenging behaviors
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Step 1:
- Script: `process_data.py`
- Input: `./data/keypointlabel/keypoints_with_labels_<id>.csv` for IDs 1, 2, 3, 5
- Output: `features_continuous_unfiltered.csv`
- Extracts 70+ handcrafted features (motion, geometry, asymmetry, temporal-frequency)
Step 2:
- Script: `train_hybrid_tuned.py`
- Output: `saved_models/hybrid_tuned/` containing `best_model_fold_<id>.keras`, `scaler_fold_<id>.joblib`, `label_encoder.joblib`, and `feature_cols.json`
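The saved artifacts can be reloaded for evaluation or inference; a minimal sketch using the file layout listed above (the fold ID is illustrative):

```python
import json
import joblib
from tensorflow import keras

fold = 1  # illustrative fold ID
base = "saved_models/hybrid_tuned"

model = keras.models.load_model(f"{base}/best_model_fold_{fold}.keras")
scaler = joblib.load(f"{base}/scaler_fold_{fold}.joblib")
encoder = joblib.load(f"{base}/label_encoder.joblib")
with open(f"{base}/feature_cols.json") as f:
    feature_cols = json.load(f)

# Typical use: select feature_cols from the feature DataFrame, apply the
# scaler, window the result, then decode predictions with
# encoder.inverse_transform(model.predict(X).argmax(axis=1)).
```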
Step 3:
- Script: `train_lstm_tuned.py`
- Output: `saved_models/lstm_tuned/` containing `best_model_fold_<id>.keras`, `scaler_fold_<id>.joblib`, `label_encoder.joblib`, and `feature_cols.json`
Step 4:
- Script: `ensemble_loso.py`
- Input: trained models from Steps 2 and 3
- Output: prints accuracy, F1-score, and per-fold classification reports
Step 5:
- Script: `train_final_lstm_tuned_model.py`
- Output: `final_lstm_tuned_model_artifacts/`
- Includes the full model, scaler, label encoder, and selected features
Step 6:
- Script: `train_final_hybrid_model.py`
- Output: `final_hybrid_model_artifacts/`
- Similar structure; includes the Transformer block for long-range dependencies
Step 7:
- Script: `process_data_test.py`
- Input: `test data_keypoint.csv`
- Output: `features_test.csv`
- Same features as training, built from interpolated keypoints
Step 8:
- Script: `create_submission.py`
- Combines: `final_lstm_tuned_model` + `final_hybrid_model` (soft voting: 52/48)
- Output: `Binary_Phoenix_test.csv` with `participant_id`, `timestamp`, `predicted_label`
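A sketch of this fusion-and-export step; `probs_lstm`, `probs_hybrid`, `meta`, and `encoder` are assumed inputs (per-window softmax outputs, window metadata, and the saved label encoder):

```python
import pandas as pd

def write_submission(probs_lstm, probs_hybrid, meta, encoder,
                     path="Binary_Phoenix_test.csv", w_lstm=0.52):
    """meta is a DataFrame with one row per window carrying
    participant_id and timestamp; encoder is the saved LabelEncoder."""
    probs = w_lstm * probs_lstm + (1.0 - w_lstm) * probs_hybrid
    pd.DataFrame({
        "participant_id": meta["participant_id"].values,
        "timestamp": meta["timestamp"].values,
        "predicted_label": encoder.inverse_transform(probs.argmax(axis=1)),
    }).to_csv(path, index=False)
```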
Task 2, Step 1:
- Script: `task_2_processdata.py`
- Input: `./data/keypointlabel/keypoints_with_labels_<id>.csv` for IDs 1, 2, 3, 4, 5
- Output: `final.csv`
- Description: Extracts 70+ handcrafted features (motion, geometry, asymmetry, temporal-frequency) for all 5 participants. Data is interpolated and cleaned to ensure a consistent feature space.
Task 2, Step 2:
- Script: `task_2_hybrid_tuned.py`
- Input: `final.csv` from Step 1
- Output: `final_report_saved_models/hybrid_tuned/` containing `best_model_fold_<id>.keras` and `scaler_fold_<id>.joblib` for each fold, `encoder.joblib` (label encoder used for all folds), and `feature_columns.joblib` (features used by the model)
- Description: Trains the Hybrid model with LOSO across 5 participants. Each fold trains on 4 participants and tests on the remaining participant.
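A hedged sketch of such a per-fold loop, reusing `build_hybrid` and `sliding_windows` from the sketches above; `df` (loaded from `final.csv`), `feature_cols`, and the training hyperparameters are assumptions:

```python
import joblib
from sklearn.preprocessing import LabelEncoder, StandardScaler

SUBJECTS = [1, 2, 3, 4, 5]
OUT = "final_report_saved_models/hybrid_tuned"

encoder = LabelEncoder().fit(df["label"])  # shared across folds
joblib.dump(encoder, f"{OUT}/encoder.joblib")
joblib.dump(feature_cols, f"{OUT}/feature_columns.joblib")

for held_out in SUBJECTS:
    train_df = df[df["participant_id"] != held_out]

    # Fit scaler on the 4 training participants only, then window.
    # (Real code would window per recording to avoid crossing boundaries.)
    scaler = StandardScaler().fit(train_df[feature_cols])
    X_train, y_train = sliding_windows(
        scaler.transform(train_df[feature_cols]),
        encoder.transform(train_df["label"]))

    model = build_hybrid(n_features=len(feature_cols),
                         n_classes=len(encoder.classes_))
    model.fit(X_train, y_train, epochs=30, batch_size=64, verbose=0)

    model.save(f"{OUT}/best_model_fold_{held_out}.keras")
    joblib.dump(scaler, f"{OUT}/scaler_fold_{held_out}.joblib")
```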
Task 2, Step 3:
- Script: `task_2_lstm_tuned.py`
- Input: `final.csv` from Step 1
- Output: `final_report_saved_models/lstm_tuned/` containing `best_model_fold_<id>.keras` and `scaler_fold_<id>.joblib` for each fold, `encoder.joblib`, and `feature_columns.joblib`
- Description: Trains the Deep Bi-LSTM model with LOSO across 5 participants. Results are saved per fold for the later ensemble.
Task 2, Step 4:
- Script: `task_2_ensembleloso.py`
- Input: models from Steps 2 and 3
- Output:
  - LOSO per-fold Accuracy and Macro F1-score
  - Mean Accuracy and Macro F1-score across 5 folds
  - `ensemble_confusion_matrix.png` (aggregated confusion matrix over 5 folds)
- Description: Combines predictions from the Hybrid and LSTM models using soft voting (Hybrid 48%, LSTM 52%). Evaluates model generalization to unseen participants.
- Script: `weighted.py`
- Purpose: Finds the optimal weighting between the Hybrid and LSTM models under soft voting, maximizing the weighted F1-score (abnormal classes are prioritized via class weights).
- How it works:
  - Performs a grid search (e.g., Hybrid weights from 0.0 to 1.0)
  - Uses preloaded fold predictions from both models
  - Applies a ×3 weight to abnormal activity classes during `f1_score` computation
  - Prints the scores and the best weight combination
- Note: Run this script before `ensemble_loso.py` to determine the best weight ratio.
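A minimal sketch of this search; `classes` is assumed to map probability columns to label names (e.g., `encoder.classes_`), and the inputs are the stacked per-fold predictions:

```python
import numpy as np
from sklearn.metrics import f1_score

ABNORMAL = {"Attacking", "Biting nails", "Head banging", "Throwing things"}

def best_weight(p_hybrid, p_lstm, y_true, classes):
    """Grid search over the Hybrid weight w; the LSTM gets (1 - w).
    Windows whose true class is abnormal count x3 in the weighted F1."""
    sample_w = np.where(np.isin(y_true, list(ABNORMAL)), 3.0, 1.0)
    best = (None, -1.0)
    for w in np.arange(0.0, 1.01, 0.05):
        pred = classes[(w * p_hybrid + (1 - w) * p_lstm).argmax(axis=1)]
        score = f1_score(y_true, pred, average="weighted",
                         sample_weight=sample_w)
        if score > best[1]:
            best = (w, score)
    return best  # (hybrid weight, weighted F1)
```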