Adwaith0/recommendation-system


Graph-Based Social Recommendation Engine

This project provides a hybrid, graph-based recommendation system designed to suggest new connections (friends or accounts to follow) on social media platforms like Facebook and Twitter. It uses a combination of structural analysis, content-based filtering, and community detection to generate personalized and explainable recommendations.

The entire system can be explored through an interactive web dashboard built with Streamlit.


✨ Features

  • Hybrid Recommendation Model: Combines three distinct signals for robust recommendations:
    1. Common Neighbors: Recommends users based on mutual connections.
    2. Feature Similarity: Uses cosine similarity on user profile features to find users with similar interests.
    3. Community Overlap: Identifies candidates from shared social circles or lists.
  • Multi-Platform Support: Includes separate, tailored recommender classes for both Facebook (undirected graphs) and Twitter (directed graphs).
  • Explainable Recommendations: Each suggestion comes with a clear, human-readable reason (e.g., "Followed by @userX, @userY" or "Common interests: #tech, #AI").
  • Interactive Dashboard: A user-friendly web interface built with Streamlit (app.py) to visualize networks and explore recommendations without needing to run command-line scripts.
  • Batch Processing: Includes functionality to process all user ego-networks in the dataset at once and save the aggregated results to a single JSON file for offline analysis.
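The way the three signals might be blended can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function names (`cosine`, `recommend`), the weights (0.5, 0.3, 0.2), the normalization of the mutual-connection count, and the toy data are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(user, adj, feats, circles, weights=(0.5, 0.3, 0.2), k=3):
    """Rank non-neighbors of `user` by a weighted sum of three signals.

    The weights are hypothetical; a real system would tune them.
    """
    w_cn, w_sim, w_com = weights
    friends = adj[user]
    scored = []
    for cand in adj:
        if cand == user or cand in friends:
            continue  # only suggest users who are not already connected
        mutual = friends & adj[cand]
        # Signal 1: common neighbors, scaled into [0, 1] by the user's degree.
        cn = len(mutual) / len(friends) if friends else 0.0
        # Signal 2: cosine similarity of the binary profile feature vectors.
        sim = cosine(feats[user], feats[cand])
        # Signal 3: 1.0 if the two users share at least one circle.
        shared = sorted(name for name, members in circles.items()
                        if user in members and cand in members)
        com = 1.0 if shared else 0.0
        # Collect human-readable reasons alongside the score.
        reasons = []
        if mutual:
            reasons.append("Mutual connections: " + ", ".join(sorted(mutual)))
        if sim > 0:
            reasons.append(f"Feature similarity: {sim:.2f}")
        if shared:
            reasons.append("Shared circles: " + ", ".join(shared))
        scored.append((w_cn * cn + w_sim * sim + w_com * com, cand, reasons))
    # Highest score first; ties broken by user ID for a stable order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:k]

# Toy undirected network (adjacency sets), features, and one circle.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
feats = {"a": [1, 0, 1], "b": [1, 1, 0], "c": [0, 1, 1],
         "d": [1, 0, 1], "e": [0, 1, 0]}
circles = {"college": {"a", "d"}}

results = recommend("a", adj, feats, circles)
for score, cand, reasons in results:
    print(cand, round(score, 3), reasons)
```

In this toy network, user `d` ranks first for `a` because all three signals fire, while `e` shares no neighbors, features, or circles and scores zero.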

📂 Project Structure

project-root/
├── data/                       # Input datasets
│   ├── facebook/
│   │   ├── 0.edges
│   │   ├── 0.feat
│   │   └── ... (other user files)
│   └── twitter/
│       ├── 12831.edges
│       ├── 12831.feat
│       └── ... (other user files)
│
├── output/                     # Generated recommendations
│   ├── facebook_recs/
│   └── twitter_recs/
│
├── app.py                      # Streamlit web dashboard
├── facebook.py                 # Facebook recommender logic
├── twitter.py                  # Twitter recommender logic
├── requirements.txt            # Python dependencies
└── README.md                   # Documentation


📊 Dataset Format

The system expects the data for each user (or "ego") to be provided in a specific format within the data/facebook or data/twitter directories. Each file is named after the ego user's ID, followed by an extension that identifies its contents (for example, 0.edges and 0.feat).

  • <id>.edges: A list of connections. For Facebook, these are mutual friendships. For Twitter, they are directed "follows."
    • Format: user_id_1 user_id_2 on each line.
  • <id>.feat: A matrix of binary features for all the other users in the ego's network.
  • <id>.egofeat: The feature vector for the central "ego" user.
  • <id>.circles (Optional): Defines social groups or lists created by the user.
  • <id>.featnames (Optional): Provides human-readable names for the features in the .feat files. This is crucial for generating detailed explanations.
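A minimal sketch of reading the two core file types is shown below. It assumes that each .feat row begins with a node ID followed by 0/1 feature values, which this README does not state explicitly, and the helper names `parse_edges` and `parse_feat` are hypothetical. The parsers accept any iterable of text lines, so they work on open file handles as well as on the in-memory lists used here.

```python
def parse_edges(lines, directed=False):
    """Build an adjacency dict {node: set(neighbors)} from 'u v' lines.

    With directed=True an edge 'u v' is stored only as u -> v (Twitter-style
    follows); otherwise it is stored in both directions (Facebook-style).
    """
    adj = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        u, v = parts
        adj.setdefault(u, set()).add(v)
        targets = adj.setdefault(v, set())  # make sure v exists as a node
        if not directed:
            targets.add(u)
    return adj

def parse_feat(lines):
    """Map node ID -> list of binary features.

    Assumes the first token on each row is the node ID (an assumption).
    """
    feats = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        feats[parts[0]] = [int(x) for x in parts[1:]]
    return feats

# In-memory stand-ins for the contents of <id>.edges and <id>.feat.
edges_text = ["1 2", "2 3", ""]
feat_text = ["1 0 1 0", "2 1 1 0"]

undirected = parse_edges(edges_text)
directed = parse_edges(edges_text, directed=True)
feats = parse_feat(feat_text)
print(undirected, directed, feats)
```

With real data, the same functions could be called on a file handle, for example `with open("data/facebook/0.edges") as f: adj = parse_edges(f)`.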

🚀 Getting Started

Follow these steps to set up and run the project locally.

1. Prerequisites

  • Python 3.8 or higher

2. Installation

  1. Clone the repository:

    git clone https://github.com/your-username/your-repo-name.git
    cd your-repo-name
  2. Create and activate a virtual environment (recommended):

    # For macOS/Linux
    python3 -m venv venv
    source venv/bin/activate
    
    # For Windows
    python -m venv venv
    .\venv\Scripts\activate
  3. Install the required dependencies:

    pip install -r requirements.txt

    (You will need to create a requirements.txt file. See the section below.)

3. Create requirements.txt

Create a file named requirements.txt in the root of your project and add the following libraries:

numpy
networkx
scikit-learn
chardet
streamlit
pandas
plotly
tqdm

⚙️ How to Run

You can interact with the project in two ways: through the interactive dashboard or via the command line.

1. Using the Streamlit Dashboard (Recommended)

This is the easiest way to use the project.

  1. Ensure your data is correctly placed in the data/twitter and/or data/facebook folders.
  2. Run the following command in your terminal:
    streamlit run app.py
  3. Open your web browser to the local URL provided by Streamlit (usually http://localhost:8501).

2. Using the Command-Line Scripts

You can also run the recommendation engines directly from the terminal.

For Twitter Recommendations

  • To process a single user:

    python twitter.py

    The script will prompt you to enter a user ID and the number of recommendations you want.

  • To process all users in batch mode: Run python twitter.py and select option 2 when prompted. The results are saved to a summary JSON file in the output/ directory.
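The batch mode described above could be structured roughly as follows. This is a sketch under stated assumptions rather than the code in twitter.py: `run_batch`, `count_edges`, and the `summary.json` file name are hypothetical, and `count_edges` is only a stand-in for a real recommender call. The demo writes to a temporary directory so it can run without the dataset.

```python
import json
import tempfile
from pathlib import Path

def run_batch(data_dir, out_path, recommend_fn):
    """Run `recommend_fn` on every <id>.edges file and save one JSON summary."""
    results = {}
    for edges_file in sorted(Path(data_dir).glob("*.edges")):
        ego_id = edges_file.stem  # "12831.edges" -> "12831"
        try:
            results[ego_id] = recommend_fn(edges_file)
        except (OSError, ValueError) as exc:
            # Record the failure and keep going so one bad file
            # does not abort the whole batch.
            results[ego_id] = {"error": str(exc)}
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(results, indent=2))
    return results

def count_edges(edges_file):
    """Stand-in for a real recommender: counts non-empty edge lines."""
    with open(edges_file) as f:
        return {"edges": sum(1 for line in f if line.strip())}

# Demo on a throwaway directory that mimics data/twitter and output/.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data" / "twitter"
    data_dir.mkdir(parents=True)
    (data_dir / "12831.edges").write_text("1 2\n2 3\n")
    (data_dir / "99.edges").write_text("4 5\n")
    out_file = Path(tmp) / "output" / "summary.json"
    demo_results = run_batch(data_dir, out_file, count_edges)
    demo_saved = json.loads(out_file.read_text())

print(demo_results)
```

Keeping all users in one JSON object keyed by ego ID makes the output easy to load in a single step for offline analysis.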

For Facebook Recommendations

  • To process a single user:
    python facebook.py
    The script will prompt you to enter a user ID.

📜 Key Scripts Overview

  • app.py: The main entry point for the interactive Streamlit web application. It provides a UI to select a platform and user, and it visualizes the recommendations and network graph.
  • twitter.py: Contains the TwitterRecommender class. It handles the specific logic for directed graphs (follower/followee relationships) and includes the batch processing mode.
  • facebook.py: Contains the FacebookRecommender class. It handles the logic for undirected graphs (mutual friendships).
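To illustrate why directed graphs need separate handling, the sketch below generates candidates on a follower graph: it suggests accounts that are followed by the people a user already follows, and builds a "Followed by ..." explanation from them. The function name `directed_candidates` and the toy data are assumptions for illustration; the actual TwitterRecommender class may work differently.

```python
from collections import defaultdict

def directed_candidates(user, follows):
    """Suggest accounts followed by the accounts `user` follows.

    `follows[u]` is the set of accounts that u follows. Direction matters:
    an account that follows `user` is not treated as already connected.
    """
    already = follows.get(user, set())
    vouchers = defaultdict(set)  # candidate -> followees who follow them
    for followee in already:
        for cand in follows.get(followee, set()):
            if cand != user and cand not in already:
                vouchers[cand].add(followee)
    # More vouching followees first; ties broken alphabetically.
    ranked = sorted(vouchers.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        (cand, "Followed by " + ", ".join("@" + v for v in sorted(via)))
        for cand, via in ranked
    ]

follows = {
    "ann": {"bob", "cat"},
    "bob": {"dan", "ann"},
    "cat": {"dan", "eve"},
    "dan": set(),
    "eve": {"ann"},  # eve follows ann, but ann does not follow eve
}

suggestions = directed_candidates("ann", follows)
for cand, reason in suggestions:
    print(cand, "-", reason)
```

Here `eve` is still suggested to `ann` even though `eve` already follows `ann`. In an undirected friendship graph, that same edge would mean the two users are already connected.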

🔮 Future Work

  • Advanced Algorithms: Incorporate more sophisticated graph algorithms like Personalized PageRank or link prediction models (e.g., Adamic-Adar).
  • Graph Embeddings: Use techniques like Node2Vec or GraphSAGE to learn richer feature representations of users.
  • Quantitative Evaluation: Implement a formal evaluation framework using metrics like Precision@k and Recall@k to benchmark the model's accuracy.
  • Real-Time Adaptation: Integrate with graph databases (e.g., Neo4j) to handle dynamic updates for real-world, real-time applications.
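As a starting point for the quantitative evaluation item, Precision@k and Recall@k could be computed as in the sketch below. It assumes an evaluation setup in which some known connections are held out and treated as the relevant set; that protocol, the function name, and the sample data are illustrative assumptions, not part of the current project.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (Precision@k, Recall@k) for one user.

    recommended: ranked list of suggested user IDs (best first)
    relevant:    set of held-out true connections for that user
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

recommended = ["d", "e", "f", "g"]
relevant = {"d", "g", "h"}

p3, r3 = precision_recall_at_k(recommended, relevant, 3)
p4, r4 = precision_recall_at_k(recommended, relevant, 4)
print(p3, r3, p4, r4)
```

Averaging these values over all ego users would give a single benchmark number per model variant.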
