🔍 Get the latest news on any company—instantly, from multiple sources—powered by smart AI insights. Stay ahead, stay informed. 🚀

shubham2924/IntelliNews

News Aggregator Application

Overview

This is a Streamlit-based news aggregation application. It fetches news articles about a company from six news providers in parallel, then uses an AI service to deduplicate them so that each news event appears only once.

User Preferences

Preferred communication style: Simple, everyday language.

System Architecture

Frontend Architecture

  • Framework: Streamlit web application framework
  • Layout: Wide layout configuration with sidebar for API key management
  • Components: Interactive forms, data tables, and error handling displays
  • User Interface: Clean, responsive design with expandable article containers

Backend Architecture

  • Pattern: Provider-based architecture with async/await for parallel API calls
  • Core Components:
    • Provider layer for API integrations
    • Service layer for AI processing
    • Utility layer for common functions
  • Concurrency: Asynchronous HTTP requests for parallel news fetching

Data Processing Pipeline

  1. The user enters a company name and API keys
  2. All configured providers fetch news concurrently
  3. Articles are normalized to a common format
  4. The AI service removes duplicate articles
  5. Results are displayed in a formatted table

Key Components

News Providers (providers/)

  • Base Provider: Abstract class defining common interface and utilities
  • NewsAPI Provider: Integration with NewsAPI.org
  • NewsData Provider: Integration with NewsData.io
  • Finlight Provider: Integration with Finlight.me financial news
  • Google RSS Provider: RSS feed parsing from Google News
  • Finnhub Provider: Financial news from Finnhub.io
  • AlphaVantage Provider: Market news and sentiment from Alpha Vantage
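The provider layer could be sketched roughly as follows. This is an illustration only: the class, method, and field names (`BaseProvider`, `fetch`, `normalize`, `max_articles`, `DemoProvider`) are assumptions, and the repository's actual base class may look different.

```python
import asyncio
from abc import ABC, abstractmethod

class BaseProvider(ABC):
    """Hypothetical common interface; the repository's real class may differ."""
    name = "base"
    max_articles = 10  # mirrors the per-provider limit described in this README

    def __init__(self, api_key: str = ""):
        self.api_key = api_key

    @abstractmethod
    async def fetch(self, company: str) -> list:
        """Return normalized articles for a company from the upstream source."""

    def normalize(self, title: str, url: str, published_at: str) -> dict:
        """Shared helper: shape one article into the common schema."""
        return {
            "title": title.strip(),
            "url": url,
            "published_at": published_at,
            "provider": self.name,
        }

class DemoProvider(BaseProvider):
    """Illustrative concrete provider that fabricates articles locally."""
    name = "demo"

    async def fetch(self, company: str) -> list:
        raw = [(f" {company} headline {i} ", f"https://example.com/{i}") for i in range(15)]
        articles = [self.normalize(title, url, "2025-01-06") for title, url in raw]
        return articles[: self.max_articles]  # enforce the per-provider cap

articles = asyncio.run(DemoProvider().fetch("Acme"))
```

Because `fetch` is abstract, instantiating `BaseProvider` directly raises `TypeError`, so every new source has to supply its own fetch logic while reusing the shared helpers.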

AI Service (services/)

  • AI Service: DeepSeek AI integration for intelligent article deduplication
  • Functionality: Identifies unique news events from potentially duplicate articles
  • API: OpenAI-compatible client with custom base URL
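One possible shape for the deduplication call is sketched below. The prompt wording, the JSON reply format, and the names `build_prompt`, `deduplicate`, and `fake_model` are all assumptions made for illustration. The model call is injected as a plain callable so the sketch runs offline; in the application that callable would wrap the OpenAI-compatible client pointed at DeepSeek.

```python
import json
from typing import Callable

def build_prompt(articles: list) -> str:
    """Number the headlines so the model can answer with indexes only."""
    lines = [f"{i}: {a['title']}" for i, a in enumerate(articles)]
    return (
        "Group the headlines below by news event. Reply with a JSON list "
        "containing the index of one representative headline per event.\n"
        + "\n".join(lines)
    )

def deduplicate(articles: list, complete: Callable[[str], str]) -> list:
    """Keep the articles whose indexes the model returns.

    `complete` is a hypothetical hook: any function mapping a prompt to the
    model's text reply.
    """
    reply = complete(build_prompt(articles))
    try:
        keep = json.loads(reply)
    except json.JSONDecodeError:
        return articles  # unparsable reply: fall back to the unfiltered list
    if not isinstance(keep, list):
        return articles
    seen, unique = set(), []
    for i in keep:
        valid = isinstance(i, int) and not isinstance(i, bool)
        if valid and 0 <= i < len(articles) and i not in seen:
            seen.add(i)
            unique.append(articles[i])
    return unique

def fake_model(prompt: str) -> str:
    """Offline stand-in for the AI service: treats headlines 0 and 1 as one event."""
    return "[0, 2]"

sample = [
    {"title": "Acme beats earnings estimates"},
    {"title": "Acme earnings top forecasts"},
    {"title": "Acme opens new factory"},
]
unique_articles = deduplicate(sample, fake_model)
```

Falling back to the full list when the reply cannot be parsed is a design choice in this sketch, not a documented behavior of the application.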

Utilities (utils/)

  • Display Utils: Streamlit UI formatting and article presentation
  • Date Utils: Date parsing, formatting, and time calculations
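As an illustration of what the date utilities might do, the sketch below parses the two timestamp styles news feeds commonly use (ISO 8601 and the RFC 2822 style found in RSS `pubDate` fields) and formats a relative age. The function names `parse_date` and `time_ago` are hypothetical, and treating zone-less timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def parse_date(value: str) -> Optional[datetime]:
    """Parse an ISO 8601 or RFC 2822 timestamp into an aware UTC datetime."""
    text = value.strip()
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    iso_text = text[:-1] + "+00:00" if text.endswith("Z") else text
    try:
        parsed = datetime.fromisoformat(iso_text)
    except ValueError:
        try:
            parsed = parsedate_to_datetime(text)
        except (TypeError, ValueError):  # exception type varies by Python version
            return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: no zone means UTC
    return parsed.astimezone(timezone.utc)

def time_ago(when: datetime, now: datetime) -> str:
    """Format the age of an article as a short human-readable string."""
    seconds = int((now - when).total_seconds())
    if seconds < 60:
        return "just now"
    if seconds < 3600:
        return f"{seconds // 60}m ago"
    if seconds < 86400:
        return f"{seconds // 3600}h ago"
    return f"{seconds // 86400}d ago"

published = parse_date("Mon, 06 Jan 2025 10:00:00 GMT")
label = time_ago(published, datetime(2025, 1, 6, 12, 30, tzinfo=timezone.utc))
```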

Data Flow

  1. Input Phase: The user provides a company name and API keys through the Streamlit interface
  2. Fetching Phase: All providers fetch news articles concurrently using asyncio
  3. Normalization Phase: Articles from the different APIs are normalized to a common schema
  4. Deduplication Phase: The AI service analyzes the articles and removes duplicates
  5. Display Phase: Unique articles are presented in a formatted, interactive table
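The normalization phase can be illustrated with two adapters that map differently shaped payloads onto one schema. The common schema keys (`title`, `url`, `published_at`, `source`, `provider`) are assumptions, and the input field names reflect the NewsAPI and Finnhub response formats as generally documented; they should be checked against each provider's current API reference.

```python
from datetime import datetime, timezone

def from_newsapi(item: dict) -> dict:
    """Map a NewsAPI-style article (ISO timestamp, nested source) to the common schema."""
    return {
        "title": item.get("title") or "",
        "url": item.get("url") or "",
        "published_at": item.get("publishedAt") or "",
        "source": (item.get("source") or {}).get("name") or "NewsAPI",
        "provider": "newsapi",
    }

def from_finnhub(item: dict) -> dict:
    """Map a Finnhub-style article (Unix timestamp, 'headline' key) to the common schema."""
    ts = item.get("datetime")
    published = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat() if ts else ""
    return {
        "title": item.get("headline") or "",
        "url": item.get("url") or "",
        "published_at": published,
        "source": item.get("source") or "Finnhub",
        "provider": "finnhub",
    }

normalized = [
    from_newsapi({
        "title": "Acme beats estimates",
        "url": "https://example.com/a",
        "publishedAt": "2025-01-06T10:00:00Z",
        "source": {"name": "Example Wire"},
    }),
    from_finnhub({
        "headline": "Acme opens factory",
        "url": "https://example.com/b",
        "datetime": 1736157600,
        "source": "Example Daily",
    }),
]
```

After this step every article carries the same keys, so deduplication and display code never need to know which API an article came from.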

External Dependencies

News APIs

  • NewsAPI.org: General news articles with search capabilities
  • NewsData.io: International news data service
  • Finlight.me: Financial news specializing in market data
  • Google RSS: Free RSS feeds from Google News
  • Finnhub.io: Financial market news and data
  • Alpha Vantage: Stock market news and sentiment analysis

AI Service

  • DeepSeek AI: Used for intelligent article deduplication
  • OpenAI Client: Compatible client library for API communication

Python Libraries

  • Streamlit: Web application framework
  • aiohttp: Asynchronous HTTP client for API calls
  • pandas: Data manipulation and analysis
  • PIL (Pillow): Image processing for article thumbnails
  • xml.etree.ElementTree: XML parsing for RSS feeds
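For the RSS path, a rough sketch of parsing a feed with `xml.etree.ElementTree` is shown below. The feed text is a made-up sample rather than real Google News output, and `parse_rss` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample; a real feed would be downloaded over HTTP.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Demo feed</title>
    <item>
      <title>Acme ships new product</title>
      <link>https://example.com/a</link>
      <pubDate>Mon, 06 Jan 2025 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Acme names new CEO</title>
      <link>https://example.com/b</link>
      <pubDate>Tue, 07 Jan 2025 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def parse_rss(xml_text: str, limit: int = 10) -> list:
    """Extract title, link, and publication date from each <item> element."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        articles.append({
            "title": item.findtext("title", default="").strip(),
            "url": item.findtext("link", default="").strip(),
            "published_at": item.findtext("pubDate", default="").strip(),
        })
        if len(articles) >= limit:
            break
    return articles

rss_articles = parse_rss(SAMPLE_RSS)
```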

Deployment Strategy

Environment Configuration

  • API keys managed through environment variables and Streamlit sidebar inputs
  • Flexible configuration allowing users to provide keys at runtime
  • Graceful degradation when API keys are missing
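A sketch of how key resolution with graceful degradation might work. The environment variable names, the `resolve_api_keys` helper, and the rule that a sidebar value takes precedence over the environment are all assumptions; the repository may resolve keys differently.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable names, one per keyed provider.
ENV_NAMES = {
    "newsapi": "NEWSAPI_KEY",
    "finnhub": "FINNHUB_KEY",
    "alphavantage": "ALPHAVANTAGE_KEY",
}

def resolve_api_keys(
    sidebar_values: Mapping[str, str],
    env_names: Mapping[str, str] = ENV_NAMES,
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return usable keys per provider, preferring sidebar input over the environment.

    Providers without a key are left out, so the caller can skip them
    instead of failing the whole request.
    """
    environ = os.environ if environ is None else environ
    keys = {}
    for provider, env_name in env_names.items():
        value = (sidebar_values.get(provider) or environ.get(env_name) or "").strip()
        if value:
            keys[provider] = value
    return keys

keys = resolve_api_keys(
    sidebar_values={"newsapi": "from-sidebar", "finnhub": "  "},
    environ={"NEWSAPI_KEY": "from-env", "ALPHAVANTAGE_KEY": "av-env"},
)
```

In this example `finnhub` has no usable key in either place, so it is simply absent from the result and would be skipped during fetching.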

Error Handling

  • Provider-level error handling with specific error messages
  • Rate limiting detection and user feedback
  • Invalid API key detection and guidance
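The error handling described above could map HTTP status codes to user-facing messages along these lines. This is a simplified sketch: `describe_http_error` is a hypothetical helper, the message wording is invented, and individual providers may signal rate limits or bad keys in the response body rather than through the status code.

```python
from typing import Optional

def describe_http_error(provider: str, status: int) -> Optional[str]:
    """Translate an HTTP status code into a message for the user; None means success."""
    if 200 <= status < 300:
        return None
    if status in (401, 403):
        return f"{provider}: the API key was rejected. Check the key in the sidebar."
    if status == 429:
        return f"{provider}: rate limit reached. Wait before retrying."
    if status >= 500:
        return f"{provider}: the service is unavailable (HTTP {status})."
    return f"{provider}: unexpected response (HTTP {status})."

messages = [describe_http_error("NewsAPI", code) for code in (200, 401, 429, 503)]
```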

Scalability Considerations

  • Async/await pattern enables efficient concurrent API calls
  • Modular provider architecture allows easy addition of new news sources
  • AI-powered deduplication reduces information overload

Performance Optimizations

  • Parallel API calls reduce total fetch time
  • Article limiting (10 per provider) manages response size
  • Efficient data structures for article processing

Architecture Rationale

Provider Pattern Choice

Problem: Need to integrate multiple news APIs with different interfaces and authentication methods.

Solution: Abstract base provider class with concrete implementations for each API.

Benefits:

  • Consistent interface across all news sources
  • Easy to add new providers
  • Shared utilities for date formatting and article normalization
  • Independent error handling per provider

Asynchronous Architecture

Problem: Sequential API calls would be slow and inefficient.

Solution: Async/await pattern with concurrent execution.

Benefits:

  • Parallel API calls significantly reduce total fetch time
  • Better user experience with faster results
  • Efficient resource utilization

AI-Powered Deduplication

Problem: Multiple news sources often report the same story, creating duplicate content.

Solution: DeepSeek AI service analyzes articles for semantic similarity.

Benefits:

  • Intelligent deduplication beyond simple text matching
  • Focuses on unique news events rather than duplicate reports
  • Improves content quality and relevance

Streamlit Frontend Choice

Problem: Need rapid development of interactive web interface.

Solution: Streamlit framework with built-in components.

Benefits:

  • Rapid prototyping and development
  • Built-in data visualization capabilities
  • Easy deployment and sharing
  • Minimal frontend development required
