Skip to content

RL-based stock trading with sentiment analysis using FinBERT and FinRL. Tested A2C, PPO, and TD3 on 10 US stocks against DJI benchmark.

Notifications You must be signed in to change notification settings

hab5510/finrl_project

Repository files navigation

Enhancing Deep Reinforcement Learning for Stock Trading Using Financial News Sentiment and Volatility

Author: Hana Ben Slama
Institution: Technische Hochschule Ingolstadt
Contact: hab5510@thi.de


Overview

This project investigates whether integrating financial news sentiment — including both raw scores and volatility — into deep reinforcement learning (DRL) agents can improve trading performance.

We evaluate the impact of structured sentiment signals on three DRL agents:

  • A2C (Advantage Actor-Critic)
  • PPO (Proximal Policy Optimization)
  • TD3 (Twin Delayed DDPG)

Using four different experimental setups, we compare agent performance across:

  • Raw price and technical data (baseline)
  • Added raw sentiment
  • Optimized sentiment (EMA, normalization)
  • Optimized sentiment + volatility

Research Question

Can the integration of news sentiment and sentiment volatility into RL agent state representations lead to improved profitability and risk-adjusted performance?


Dataset & Tools

  • Market Data: Yahoo Finance (2020–2023)
  • News Data: fnspid dataset
  • Sentiment Scoring: FinBERT
  • Processing: Filtered by headline count, 3-day EMA smoothing, standard deviation (volatility)
  • Framework: FinRL, Google Colab
  • Initial Portfolio: $1,000,000
  • Training Time: TD3 (~20 mins), A2C/PPO (~2 mins)

Experimental Setups

Experiment Sentiment Input Preprocessing
1 No sentiment
2 Raw sentiment None
3 Sentiment Filtered, EMA, normalized
4 Sentiment + Volatility Filtered, EMA, normalized, Std Dev

Results Summary

Table 1 – Final Portfolio Value (USD)

Model No SA Raw SA Optimized SA Opt SA + Volatility
A2C $1,528,590 $2,040,000 $1,640,656 $1,554,927
PPO $1,196,072 $1,344,424 $1,172,948 $996,754
TD3 $1,457,936 $1,730,000 $1,791,986 $1,384,668
DJI $1,138,028 $1,138,028 $1,138,028 $1,138,028

Table 2 – Cumulative Return / Sharpe Ratio

Agent No SA Raw SA Opt SA Opt SA + Vol
A2C 0.53 / 1.52 0.67 / 2.38 0.64 / 2.36 0.55 / 1.71
PPO 0.20 / 1.42 0.34 / 1.56 0.17 / 1.69 0.00 / –0.11
TD3 0.46 / 1.69 0.72 / 2.42 0.79 / 2.82 0.38 / 1.88
DJI 0.14 / 1.20 0.14 / 1.20 0.14 / 1.20 0.14 / 1.20

Key Takeaways

  • TD3 showed the most robust performance, especially with optimized sentiment.
  • A2C also benefited from sentiment, though less from volatility.
  • PPO was unstable with complex or noisy features.
  • Sentiment volatility introduced noise and degraded performance across all agents.

Repository Contents

README.md

/without_sentiment/
    FinRL_without_SentimentAnalysis.ipynb  
    profile_noSA.png  
    evaluation_metrics_noSA.png  
    result_noSA.csv  
    evaluation_metrics_table_noSA.csv  
    processed_full_noSA.csv  

/with_sentiment/
    FinRL_with_SentimentAnalysis.ipynb  
    profile.png  
    metrics.png  
    result.csv  
    evaluation_metrics_table.csv  
    processed_full.csv  

/with_sentiment_optimized/
    FinRL_with_SentimentAnalysis_optimized.ipynb  
    profile.png  
    evaluation_metrics_comaprison.png  
    result.csv  
    evaluation_metrics_table.csv  
    processed_full_smoothing_scaling.csv  

/with_sentiment_optimized_volatility/
    FinRL_with_SentimentAnalysis_optimized_volatility.ipynb  
    profile.png  
    evaluation_metrics.png  
    result.csv  
    evaluation_metrics_table.csv  
    processed_full_smoothing_scaling_volatility.csv  

/report/
    FinRL_SentimentAnalysis.pdf 
    cumulative_return_sharpe_ratio_all.png  
    cumulative_return_sharpe_ratio_all.csv  
    profile_value_all.csv  

Full Report

FinRL_SentimentAnalysis.pdf


References


About

RL-based stock trading with sentiment analysis using FinBERT and FinRL. Tested A2C, PPO, and TD3 on 10 US stocks against DJI benchmark.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published