This project is a web-based application that fetches news articles for a given company, summarizes the content, performs sentiment analysis, and generates a Hindi text-to-speech (TTS) summary. It uses a combination of a Streamlit frontend for user interaction and a FastAPI backend for API-based access to the core functionality. The app is also deployed on Hugging Face Spaces for easy access.
- News Fetching: Retrieves news articles for a specified company using the NewsAPI.
- Text Summarization: Summarizes article content using a pre-trained transformer model.
- Sentiment Analysis: Analyzes the sentiment (Positive, Negative, Neutral) of each article.
- Hindi TTS: Generates an audio summary in Hindi using a pre-trained TTS model.
- Web Interface: Provides an interactive UI via Streamlit to input company names and view results.
- API Access: Exposes endpoints via FastAPI for programmatic access to news and analysis.
The project relies on the following Python libraries:
- `streamlit`: For building the interactive web interface.
- `requests`: For making HTTP requests to fetch news data.
- `beautifulsoup4`: For scraping article content from web pages.
- `transformers`: For text summarization, sentiment analysis, and TTS generation.
- `torch`: For running transformer models.
- `scipy`: For handling audio file generation.
- `fastapi`: For creating a RESTful API.
- `uvicorn`: For serving the FastAPI application.
- `main.py` (assumed name for the Streamlit app):
  - Defines the Streamlit frontend.
  - Handles user input, displays results, and plays Hindi TTS audio.
- `utils.py`:
  - Contains core functions for fetching news, sentiment analysis, summarization, and TTS generation.
- `api.py` (assumed name for the FastAPI app):
  - Defines API endpoints for news fetching and analysis.
- `requirements.txt`:
  - Lists all Python dependencies required for the project.
- Uses the NewsAPI to fetch articles based on a company name.
- Scrapes full article content using BeautifulSoup if available; otherwise, uses the article description.
- Returns a list of dictionaries with `title` and `content`.
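The fetch-then-fallback flow above can be sketched as follows. This is a minimal illustration, not the project's actual `utils.py`: the function names are hypothetical, and it assumes NewsAPI's `/v2/everything` endpoint and that article text lives in `<p>` tags. The scraper is passed in as a callable so the fallback logic can be exercised without network access.

```python
from typing import Callable, Dict, List, Optional

NEWSAPI_URL = "https://newsapi.org/v2/everything"


def build_newsapi_params(company: str, api_key: str, page_size: int = 10) -> Dict[str, object]:
    """Query parameters for NewsAPI's /v2/everything endpoint."""
    return {"q": company, "pageSize": page_size, "language": "en", "apiKey": api_key}


def scrape_article_text(url: str) -> Optional[str]:
    """Best-effort scrape: join the page's <p> tags, or return None on failure."""
    # Imported lazily so the pure helpers in this sketch work without these packages.
    import requests
    from bs4 import BeautifulSoup

    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        return None
    soup = BeautifulSoup(response.text, "html.parser")
    text = " ".join(p.get_text(strip=True) for p in soup.find_all("p"))
    return text or None


def articles_from_payload(
    payload: dict, scrape: Callable[[str], Optional[str]]
) -> List[Dict[str, str]]:
    """Turn a NewsAPI response into [{"title": ..., "content": ...}, ...]."""
    results = []
    for item in payload.get("articles", []):
        body = scrape(item["url"]) if item.get("url") else None
        # Fall back to the NewsAPI description when scraping yields nothing.
        content = body or item.get("description") or ""
        results.append({"title": item.get("title") or "", "content": content})
    return results
```

In real use the payload would come from something like `requests.get(NEWSAPI_URL, params=build_newsapi_params(company, key)).json()`, with `scrape_article_text` passed as the `scrape` argument.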
- Uses a pre-trained sentiment analysis model from Hugging Face's `transformers`.
- Classifies text as "Positive", "Negative", or "Neutral".
- Limits input to 512 characters to avoid model constraints.
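As a rough sketch of the classification step: the default `sentiment-analysis` pipeline in `transformers` returns only `POSITIVE` or `NEGATIVE` with a confidence score, so this example assumes "Neutral" is assigned to low-confidence predictions. That mapping, the `neutral_below` threshold, and the function names are assumptions for illustration; the project may use a three-class model instead.

```python
MAX_CHARS = 512  # character limit described above


def to_three_way(label: str, score: float, neutral_below: float = 0.6) -> str:
    """Map a binary prediction to Positive/Negative/Neutral (assumed rule)."""
    if score < neutral_below:
        return "Neutral"
    return "Positive" if label.upper().startswith("POS") else "Negative"


def analyze_sentiment(text: str, classifier=None) -> str:
    """Classify text, truncating the input to MAX_CHARS characters first."""
    if classifier is None:
        # Loading a pipeline is slow; a real app would create it once and reuse it.
        from transformers import pipeline

        classifier = pipeline("sentiment-analysis")
    result = classifier(text[:MAX_CHARS])[0]
    return to_three_way(result["label"], result["score"])
```

Passing the classifier in as an argument lets the caller load the model once and share it across all articles.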
- Uses a pre-trained summarization model from `transformers`.
- Summarizes text to 25-50 words, truncating input to 1024 characters.
- Falls back to a truncated version of the original text if summarization fails.
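The summarize-with-fallback behaviour might look like the sketch below. It assumes the summarizer is a `transformers` summarization pipeline, or any callable with the same call shape. Note that `max_length` and `min_length` count tokens rather than words, so the 25-50 word range is approximate. The `fallback_chars` parameter is a hypothetical choice, since the source does not state the fallback length.

```python
MAX_INPUT_CHARS = 1024  # input truncation described above


def summarize(text: str, summarizer, fallback_chars: int = 200) -> str:
    """Summarize text; on any failure, return a truncated copy of the input."""
    clipped = text[:MAX_INPUT_CHARS]
    try:
        result = summarizer(clipped, max_length=50, min_length=25, do_sample=False)
        return result[0]["summary_text"]
    except Exception:
        # Fallback: the leading part of the original text, marked if shortened.
        fallback = clipped[:fallback_chars]
        return fallback + ("..." if len(text) > fallback_chars else "")
```

With the real library, the summarizer would be created once via `pipeline("summarization")` from `transformers` and passed in.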
- Aggregates sentiment scores across articles to provide a distribution (e.g., Positive: 6, Negative: 2, Neutral: 2).
- Generates a textual summary in English, such as "Positive articles focus on [company]'s growth, while negative ones highlight challenges."
- Included in the JSON report and used as a basis for the Hindi TTS summary.
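A minimal sketch of the aggregation step follows. The two-sided sentence comes from the example above; the wording of the other branches and both function names are hypothetical, added only so the function returns something sensible for every distribution.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("Positive", "Negative", "Neutral")


def sentiment_distribution(sentiments: Iterable[str]) -> Dict[str, int]:
    """Count how many articles fall under each sentiment label."""
    counts = Counter(sentiments)
    return {label: counts.get(label, 0) for label in LABELS}


def comparative_analysis(company: str, distribution: Dict[str, int]) -> str:
    """Build a one-sentence English summary from the distribution."""
    positive, negative = distribution["Positive"], distribution["Negative"]
    if positive and negative:
        return (
            f"Positive articles focus on {company}'s growth, "
            "while negative ones highlight challenges."
        )
    if positive:
        return f"Coverage of {company} is mostly positive."  # hypothetical wording
    if negative:
        return f"Coverage of {company} is mostly negative."  # hypothetical wording
    return f"Coverage of {company} is largely neutral."  # hypothetical wording
```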
- Uses the `facebook/mms-tts-hin` model from Hugging Face for Hindi TTS.
- Converts a text summary (limited to 200 characters) into a WAV audio file.
- Incorporates the comparative analysis into the audio output (e.g., "सकारात्मक लेखों में वृद्धि पर ध्यान है").
- Handles exceptions and ensures audio data is correctly formatted.
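One way the TTS step could be written is sketched below, using the `VitsModel` and `AutoTokenizer` classes that `transformers` provides for MMS-TTS checkpoints. The helper names and the exact Hindi sentence template are assumptions modelled on the sample output in this README; the heavy imports are kept inside the synthesis function so the text helper can be used on its own.

```python
from typing import Dict

MAX_TTS_CHARS = 200  # character limit described above
MODEL_ID = "facebook/mms-tts-hin"


def hindi_summary(company: str, distribution: Dict[str, int]) -> str:
    """Build the Hindi summary sentence and cap it at MAX_TTS_CHARS characters."""
    total = sum(distribution.values())
    text = (
        f"{company} की खबरों का सारांश: कुल {total} लेख मिले। "
        f"सकारात्मक: {distribution['Positive']}, "
        f"नकारात्मक: {distribution['Negative']}, "
        f"तटस्थ: {distribution['Neutral']}।"
    )
    return text[:MAX_TTS_CHARS]


def synthesize_hindi(text: str, out_path: str = "output.wav") -> str:
    """Run the TTS model on the text and write the waveform to a WAV file."""
    import scipy.io.wavfile
    import torch
    from transformers import AutoTokenizer, VitsModel

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = VitsModel.from_pretrained(MODEL_ID)
    inputs = tokenizer(text[:MAX_TTS_CHARS], return_tensors="pt")
    with torch.no_grad():
        waveform = model(**inputs).waveform  # shape: (batch, samples)
    # Drop the batch dimension so scipy receives a 1-D mono signal.
    audio = waveform.squeeze().cpu().numpy()
    scipy.io.wavfile.write(out_path, rate=model.config.sampling_rate, data=audio)
    return out_path
```

Squeezing the batch dimension matters here: writing the raw `(1, samples)` array would be interpreted as a single sample across many channels.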
- Provides a simple UI to:
  - Input a company name (e.g., Tesla, Amazon, Apple).
  - Fetch and analyze up to 10 articles.
  - Display a JSON report with titles, summaries, sentiments, and a comparative analysis.
  - Play a Hindi TTS summary as audio.
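The UI flow above could be wired up roughly as follows. This is a sketch rather than the project's `main.py`: `analyze` and `tts` stand in for whatever functions `utils.py` exposes, and the `st` parameter exists only so the function can be driven by a stand-in object in tests. Only the documented Streamlit calls `title`, `text_input`, `button`, `json`, and `audio` are used.

```python
def render(analyze, tts, st=None) -> None:
    """Draw the page: company input, Analyze button, JSON report, audio player."""
    if st is None:
        import streamlit as st  # real Streamlit module when running the app

    st.title("News Summarization and Sentiment Analysis")
    company = st.text_input("Company name", "Tesla")
    if st.button("Analyze"):
        report = analyze(company)  # hypothetical: returns the JSON-style report
        st.json(report)
        audio_path = tts(report)  # hypothetical: returns a path to a WAV file
        st.audio(audio_path, format="audio/wav")
```

A `main.py` built this way would end with a single call such as `render(analyze_company, make_hindi_audio)`, where both arguments are placeholders for the project's own functions.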
- Exposes two endpoints:
  - `/news/{company_name}`: Returns raw news articles.
  - `/analyze/{company_name}`: Returns a report with summaries and sentiments for up to 10 articles.
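The two endpoints might be defined along these lines. This is an illustrative sketch, not the project's `api.py`: `create_app`, `build_report`, and the injected `fetch_news`, `summarize`, and `classify` callables are hypothetical names, and the report omits the `Topics` and `Comparative Analysis` fields for brevity.

```python
from collections import Counter
from typing import Callable, Dict, List

MAX_ARTICLES = 10  # article cap described above


def build_report(
    company: str,
    articles: List[Dict[str, str]],
    summarize: Callable[[str], str],
    classify: Callable[[str], str],
) -> dict:
    """Assemble the JSON-style report from fetched articles."""
    rows = [
        {
            "Title": article["title"],
            "Summary": summarize(article["content"]),
            "Sentiment": classify(article["content"]),
        }
        for article in articles[:MAX_ARTICLES]
    ]
    counts = Counter(row["Sentiment"] for row in rows)
    distribution = {k: counts.get(k, 0) for k in ("Positive", "Negative", "Neutral")}
    return {
        "Company": company,
        "Articles": rows,
        "Comparative Sentiment Score": {"Sentiment Distribution": distribution},
    }


def create_app(fetch_news, summarize, classify):
    """Create a FastAPI app exposing /news and /analyze."""
    from fastapi import FastAPI  # imported lazily; build_report needs no FastAPI

    app = FastAPI()

    @app.get("/news/{company_name}")
    def news(company_name: str):
        return {"Company": company_name, "Articles": fetch_news(company_name)}

    @app.get("/analyze/{company_name}")
    def analyze(company_name: str):
        return build_report(company_name, fetch_news(company_name), summarize, classify)

    return app
```

If `api.py` assigned `app = create_app(...)` at module level, the service could then be started with `uvicorn api:app`.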
This project has been deployed on Hugging Face Spaces, making it accessible online without local setup. Here’s how it was deployed and how you can use or replicate it:
- Create a Space:
  - Go to Hugging Face Spaces and create a new Space.
  - Choose "Streamlit" as the framework since the frontend uses Streamlit.
- Upload Files:
  - Upload `api.py`, `main.py`, `utils.py`, and `requirements.txt`.
  - Ensure the NewsAPI key is added as a Secret in the Space settings (Settings > Secrets > Add `NEWSAPI_KEY`).
- Configure `requirements.txt`:
  - List `streamlit`, `requests`, `beautifulsoup4`, `transformers`, `torch`, `scipy`, `fastapi`, and `uvicorn`, one per line.
  - Hugging Face Spaces will automatically install these dependencies.
- Set Up the App:
  - The Space runs `streamlit run main.py` by default, providing the interactive UI.
- Deploy:
  - Commit the files and let Hugging Face build the Space.
  - Once built, the app is live at https://huggingface.co/Shubham0786
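Secrets added in the Space settings are exposed to the running app as environment variables, so the key can be read at startup. A small sketch follows; the helper name is hypothetical, and the `env` parameter is there only to make the function easy to test.

```python
import os
from typing import Mapping


def get_newsapi_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the NewsAPI key from the environment, failing loudly if absent."""
    key = env.get("NEWSAPI_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NEWSAPI_KEY is not set; add it under Settings > Secrets in the Space."
        )
    return key
```

Failing early with a clear message is usually easier to debug than a later authentication error from the news request itself.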
- Visit the Hugging Face Space https://huggingface.co/spaces/Shubham0786/News_Summarization_and_Sentiment_Analysis
- Enter a company name in the text input and click "Analyze" to see the results and hear the Hindi TTS summary.
- Open the app.
- Enter a company name (e.g., "Tesla", "Amazon", "Apple").
- Click "Analyze".
- View the JSON report and the playable audio file summarizing the sentiment report.
{
  "Company": "Tesla",
  "Articles": [
    {
      "Title": "Tesla's New Factory Opens",
      "Summary": "Tesla opened a new factory in Shanghai, boosting production.",
      "Sentiment": "Positive",
      "Topics": ["Business"]
    },
    ...
  ],
  "Comparative Sentiment Score": {
    "Sentiment Distribution": {"Positive": 6, "Negative": 2, "Neutral": 2}
  },
  "Comparative Analysis": "Positive articles focus on Tesla's growth, while negative ones highlight challenges."
}
- Generated audio file (`output.wav`) with a summary like: "टेस्ला की खबरों का सारांश: कुल 10 लेख मिले। सकारात्मक: 6, नकारात्मक: 2, तटस्थ: 2।"
We welcome contributions! If you have improvements or suggestions, please open an issue or submit a pull request.
- 🌐 GitHub Profile
- 📧 Email: shubhamkashyap9501@gmail.com
- LinkedIn: Linkedin_link
This project is open-source and available under the MIT License.