This project aims to predict overall writing quality by analyzing keystroke logs that capture detailed writing process features. The goal is to explore how typing behavior affects essay outcomes and to provide insights for writing instruction, automated evaluation systems, and intelligent tutoring systems.
The project was developed as part of a Kaggle competition, where we achieved a top 100 position with a public score of 0.5718 and a private score of 0.5761.
- Extracted temporal features from keystroke logs to effectively represent typing behavior.
- Scaled and normalized features for compatibility with machine learning and deep learning models.
- Built a hybrid pipeline combining boosting techniques and custom neural networks:
  - Boosting Models:
    - GradientBoostingRegressor
    - AdaBoostRegressor with ElasticNet, SVR, RandomForest, and KNN base estimators
  - Neural Networks:
    - Custom architectures (Bottleneck DenseLayerNet, DenseNet) trained with weighted samples.
- Performed hyperparameter tuning to enhance model performance, achieving a 15% boost in ensemble accuracy and a 0.92 R² score on validation data.
- Applied dynamic sample weighting using residual errors for iterative performance improvement in AdaBoost.
- Combined predictions from boosting models and neural networks using a fusion strategy for robust final predictions.
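The feature-extraction step can be sketched as below. The column names (`id`, `down_time`, `up_time`) and the 2-second pause threshold are assumptions for illustration, not the project's exact log schema.

```python
import pandas as pd

def extract_temporal_features(logs: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-essay temporal statistics from raw keystroke logs.

    Assumes columns 'id' (essay id), 'down_time', and 'up_time' in ms
    (hypothetical names; adapt to the actual log schema).
    """
    logs = logs.sort_values(["id", "down_time"]).copy()
    # Inter-keystroke interval: gap between consecutive key presses.
    logs["iki"] = logs.groupby("id")["down_time"].diff()
    # Key hold duration.
    logs["hold"] = logs["up_time"] - logs["down_time"]
    # Long pauses (> 2 s) often mark planning or revision bursts.
    logs["long_pause"] = logs["iki"] > 2000
    feats = logs.groupby("id").agg(
        n_events=("down_time", "size"),
        iki_mean=("iki", "mean"),
        iki_std=("iki", "std"),
        hold_mean=("hold", "mean"),
        pause_count=("long_pause", "sum"),
        start=("down_time", "min"),
        end=("up_time", "max"),
    )
    feats["total_time"] = feats.pop("end") - feats.pop("start")
    return feats.reset_index()
```

The resulting per-essay feature table is then scaled (e.g. with scikit-learn's `StandardScaler`) before being fed to the models.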
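The boosting side of the pipeline might look like the sketch below; hyperparameters are illustrative placeholders (not the tuned values) and the data is synthetic. Note that AdaBoost.R2, as implemented in scikit-learn's `AdaBoostRegressor`, reweights samples by residual error at each iteration, which is the dynamic sample weighting described above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.linear_model import ElasticNet
from sklearn.svm import SVR

# Synthetic stand-in for the scaled keystroke features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

models = {
    "gbr": GradientBoostingRegressor(n_estimators=100, learning_rate=0.05),
    # AdaBoost.R2: samples with larger residuals receive more weight each round.
    "ada_enet": AdaBoostRegressor(estimator=ElasticNet(alpha=0.01),
                                  n_estimators=50, loss="square"),
    "ada_svr": AdaBoostRegressor(estimator=SVR(C=1.0), n_estimators=20),
}
for name, model in models.items():
    model.fit(X, y)
preds = {name: m.predict(X) for name, m in models.items()}
```

The same pattern extends to RandomForest and KNN base estimators by swapping the `estimator` argument.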
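The exact architecture of Bottleneck DenseLayerNet is not given here; the sketch below shows one plausible Keras reading: bottleneck Dense blocks with DenseNet-style concatenation skips, trained with per-sample weights via `model.fit(..., sample_weight=w)`. Layer sizes and block count are assumptions.

```python
import tensorflow as tf

def bottleneck_dense_net(n_features: int, n_blocks: int = 3,
                         bottleneck: int = 16, width: int = 64) -> tf.keras.Model:
    """DenseNet-style MLP for tabular regression (hypothetical architecture)."""
    inp = tf.keras.Input(shape=(n_features,))
    x = inp
    for _ in range(n_blocks):
        # Bottleneck: compress, re-expand, then concatenate with the running
        # feature map so each block sees all earlier representations.
        h = tf.keras.layers.Dense(bottleneck, activation="relu")(x)
        h = tf.keras.layers.Dense(width, activation="relu")(h)
        x = tf.keras.layers.Concatenate()([x, h])
    out = tf.keras.layers.Dense(1)(x)  # single writing-quality score
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

# Weighted-sample training: larger weights on examples the ensemble
# currently mispredicts (weights derived from residuals), e.g.:
# model.fit(X, y, sample_weight=w, epochs=50)
```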
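The final fusion step is described only as a "fusion strategy"; a minimal reading is a weighted average of the boosting and network predictions, with weights one would tune on validation data (uniform weights below are a placeholder):

```python
import numpy as np

def fuse_predictions(preds, weights=None):
    """Weighted average of per-model prediction arrays (uniform by default)."""
    names = list(preds)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    fused = sum(weights[name] * np.asarray(preds[name]) for name in names)
    return fused / total

# Example: blend a boosting model with a neural network, favoring the booster.
p = {"boost": np.array([1.0, 2.0]), "nn": np.array([3.0, 4.0])}
blended = fuse_predictions(p, weights={"boost": 3.0, "nn": 1.0})
```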
- Competition Rank: Top 100 on Kaggle leaderboard.
- Metrics:
  - Public Score: 0.5718
  - Private Score: 0.5761
  - Validation R²: 0.92
- Programming Language: Python
- Libraries:
  - TensorFlow/Keras
  - Scikit-learn
  - NumPy, Pandas
- Machine Learning Models:
  - GradientBoostingRegressor, AdaBoostRegressor
- Deep Learning Models:
  - Bottleneck DenseLayerNet, DenseNet
- Techniques:
  - Ensemble Learning
  - Model Fusion