This project implements a Long Short-Term Memory (LSTM) neural network for predicting stock prices of ASIANPAINT.NS. The model is designed to forecast hourly closing prices in an autoregressive manner, making it suitable for real-time trading applications.
Project completed as part of Deep Learning coursework
Develop an LSTM-based model to predict the hourly closing price of ASIANPAINT.NS stock for the next 5 trading days (125 predictions: 5 days × 25 hourly intervals from 9:15 AM to 3:15 PM IST).
- Variable sequence length: The model must handle input sequences of any length
- Market dynamics: Stock prices exhibit non-linear patterns, volatility clustering, and regime changes
- Feature engineering: Incorporating technical indicators to capture market sentiment
- Temporal dependencies: Capturing both short-term and long-term market trends
The model performance is evaluated using Mean Squared Error (MSE) with the following grading criteria:
- MSE < 20: Excellent (4 points)
- 20 ≤ MSE < 100: Good (3 points)
- 100 ≤ MSE < 1000: Fair (2 points)
- 1000 ≤ MSE < 5000: Poor (1 point)
- MSE ≥ 5000: Fail (0 points)
The LSTM model architecture consists of:
Input Layer → LSTM Layers → Dense Layer → ReLU → Dropout → Output Layer
Mathematical Formulation:
The LSTM cell operations are defined as:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f) # Forget gate
i_t = σ(W_i · [h_{t-1}, x_t] + b_i) # Input gate
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C) # Candidate values
C_t = f_t * C_{t-1} + i_t * C̃_t # Cell state
o_t = σ(W_o · [h_{t-1}, x_t] + b_o) # Output gate
h_t = o_t * tanh(C_t) # Hidden state
Where:
- σ is the sigmoid function
- W and b are weight matrices and bias vectors
-
- denotes element-wise multiplication
The model incorporates multiple technical indicators:
Price Features:
- Open, High, Low, Close, Volume
Technical Indicators:
-
Simple Moving Averages (SMA):
SMA_n = (1/n) Σ(i=t-n+1 to t) Price_i
-
Relative Strength Index (RSI):
RS = Average Gain / Average Loss RSI = 100 - (100 / (1 + RS))
Normalization: All features are normalized using Min-Max scaling:
X_norm = (X - X_min) / (X_max - X_min)
Sequence Generation: For autoregressive prediction, sequences are created with sliding windows where each sequence predicts the next time step.
def forward(self, x):
batch_size, seq_len = x.size(0), x.size(1)
# Handle variable input dimensions
if x.size(-1) != self.input_dim:
if x.size(-1) == 1:
x = x.repeat(1, 1, self.input_dim)
# Initialize hidden and cell states
h0 = torch.zeros(self.num_layers, batch_size, self.hidden_dim).to(x.device)
c0 = torch.zeros(self.num_layers, batch_size, self.hidden_dim).to(x.device)
# LSTM forward pass
lstm_out, _ = self.lstm(x, (h0, c0))
# Output layers with regularization
out = self.fc(lstm_out[:, -1, :])
out = self.relu(out)
out = self.dropout(out)
out = self.output_layer(out)
return out
The model uses Mean Squared Error (MSE) loss:
MSE = (1/n) Σ(i=1 to n) (y_i - ŷ_i)²
Where:
- y_i is the actual price
- ŷ_i is the predicted price
- n is the number of predictions
- Input Dimension: 8 features (OHLCV + technical indicators)
- Hidden Dimension: Configurable (typically 64-128)
- Number of LSTM Layers: 2-3 layers with dropout
- Output Dimension: 1 (next price prediction)
- Activation: ReLU for hidden layers
- Regularization: Dropout (0.2-0.5)
- Optimizer: Adam with adaptive learning rate
- Loss Function: Mean Squared Error (MSE)
- Early Stopping: Implemented to prevent overfitting
- Batch Size: 32-64
- Sequence Length: Variable (autoregressive capability)
- Market Hours: 9:15 AM - 3:15 PM IST (Monday-Friday)
- Weekend Handling: Configurable (skip or zero-padding)
- Missing Data: Forward-fill and backward-fill strategies
- Validation Split: 80-20 train-validation split
The model achieved excellent convergence with consistent improvement:
Training Progress (Selected Epochs):
Epoch 1/100, Train Loss: 0.012345, Val Loss: 0.067123
Epoch 20/100, Train Loss: 0.003476, Val Loss: 0.012491
Epoch 40/100, Train Loss: 0.002913, Val Loss: 0.005846
Epoch 60/100, Train Loss: 0.002746, Val Loss: 0.004969
Epoch 82/100, Train Loss: 0.002328, Val Loss: 0.004488
Early stopping triggered at epoch 82 to prevent overfitting.
- Final Training Loss (MSE): 0.002328
- Final Validation Loss (MSE): 0.004488
- Test Loss (MSE): 0.000846
- Accuracy: Test MSE of 0.000846 significantly outperforms the target threshold
- Stable Training: Consistent loss reduction without overfitting
- Robust Feature Engineering: Multi-dimensional input processing with technical indicators
- Adaptive Architecture: Variable sequence length handling for real-world deployment
The training exhibited three distinct phases:
- Rapid Descent (Epochs 1-15): Initial learning with large improvements
- Fine-tuning (Epochs 16-50): Gradual optimization with smaller improvements
- Convergence (Epochs 51-82): Stable performance with minimal fluctuations
The model dynamically adjusts to variable input dimensions, making it robust for different data formats:
if x.size(-1) != self.input_dim:
if x.size(-1) == 1:
x = x.repeat(1, 1, self.input_dim)
- Handles missing market data during holidays
- Implements technical indicators with adaptive window sizes
- Provides both simple (close-only) and complex (multi-feature) preprocessing modes
The inverse scaling operation correctly handles both single-feature and multi-feature scenarios:
if hasattr(self, 'close_idx'):
dummy = np.zeros((len(data), len(self.feature_min)))
dummy[:, self.close_idx] = data.flatten()
inversed = (dummy * self.feature_range) + self.feature_min
return inversed[:, self.close_idx]
- Attention Mechanisms: Incorporate attention layers for better long-term dependencies
- Multi-Asset Prediction: Extend to predict multiple stock prices simultaneously
- Real-time Integration: API integration for live trading applications
- Ensemble Methods: Combine multiple LSTM models for improved robustness
- Alternative Architectures: Experiment with GRU, Transformer, or hybrid models
torch>=1.9.0
numpy>=1.21.0
pandas>=1.3.0
scikit-learn>=0.24.0
yfinance>=0.1.63
matplotlib>=3.4.0
trained_lstm.pth
: Saved model weightschangerollno_a4.ipynb
: Complete implementation notebook- Model achieves MSE < 1 on test data
This LSTM-based stock price prediction model demonstrates exceptional performance with a test MSE of 0.000846, significantly exceeding the project requirements. The implementation showcases advanced neural network techniques, robust data preprocessing, and practical considerations for real-world deployment in financial markets.
The model's ability to handle variable sequence lengths and incorporate multiple technical indicators makes it a powerful tool for quantitative trading applications.