This project provides a hybrid, graph-based recommendation system designed to suggest new connections (friends or accounts to follow) on social media platforms like Facebook and Twitter. It uses a combination of structural analysis, content-based filtering, and community detection to generate personalized and explainable recommendations.
The entire system can be explored through an interactive web dashboard built with Streamlit.
- Hybrid Recommendation Model: Combines three distinct signals for robust recommendations:
  - Common Neighbors: Recommends users based on mutual connections.
  - Feature Similarity: Uses cosine similarity on user profile features to find users with similar interests.
  - Community Overlap: Identifies candidates from shared social circles or lists.
- Multi-Platform Support: Includes separate, tailored recommender classes for both Facebook (undirected graphs) and Twitter (directed graphs).
- Explainable Recommendations: Each suggestion comes with a clear, human-readable reason (e.g., "Followed by @userX, @userY" or "Common interests: #tech, #AI").
- Interactive Dashboard: A user-friendly web interface built with Streamlit (app.py) to visualize networks and explore recommendations without needing to run command-line scripts.
- Batch Processing: Includes functionality to process all user ego-networks in the dataset at once and save the aggregated results to a single JSON file for offline analysis.
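To make the hybrid model concrete, here is a minimal sketch of how the three signals could be blended into one score for an undirected graph. The function names, the normalisation, and the weights (0.5, 0.3, 0.2) are assumptions chosen for illustration; they are not taken from facebook.py or twitter.py.

```python
import math

def common_neighbors(adj, u, v):
    """Number of neighbours shared by u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def community_overlap(circles, u, v):
    """Number of circles that contain both u and v."""
    return sum(1 for members in circles.values() if u in members and v in members)

def hybrid_score(adj, feats, circles, u, v, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three signals. The weights are illustrative assumptions."""
    w_cn, w_sim, w_com = weights
    # Scale the common-neighbour count by the size of u's neighbourhood.
    cn = common_neighbors(adj, u, v) / len(adj[u]) if adj.get(u) else 0.0
    sim = cosine_similarity(feats[u], feats[v])
    com = 1.0 if community_overlap(circles, u, v) > 0 else 0.0
    return w_cn * cn + w_sim * sim + w_com * com

def recommend(adj, feats, circles, u, k=5):
    """Rank every user who is not u and not already connected to u."""
    candidates = [v for v in adj if v != u and v not in adj[u]]
    scored = [(v, hybrid_score(adj, feats, circles, u, v)) for v in candidates]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]

# Toy data (hypothetical): an undirected friendship graph, binary features, one circle.
adj = {1: {2, 3}, 2: {1, 4, 5}, 3: {1, 4}, 4: {2, 3}, 5: {2}}
feats = {1: [1, 0, 1], 2: [0, 1, 1], 3: [1, 1, 0], 4: [1, 0, 1], 5: [0, 1, 0]}
circles = {"circle0": {1, 4}}

recs = recommend(adj, feats, circles, 1)
print(recs)
```

In this toy graph, user 4 ranks above user 5 for user 1 because it shares both of user 1's friends, has an identical feature vector, and sits in the same circle.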
project-root/
├── data/                  # Input datasets
│   ├── facebook/
│   │   ├── 0.edges
│   │   ├── 0.feat
│   │   └── ... (other user files)
│   └── twitter/
│       ├── 12831.edges
│       ├── 12831.feat
│       └── ... (other user files)
│
├── output/                # Generated recommendations
│   ├── facebook_recs/
│   └── twitter_recs/
│
├── app.py                 # Streamlit web dashboard
├── facebook.py            # Facebook recommender logic
├── twitter.py             # Twitter recommender logic
├── requirements.txt       # Python dependencies
└── README.md              # Documentation
The system expects the data for each user (or "ego") to be provided in a specific format within the data/facebook or data/twitter directories. Each file is named with the user's ID.
- <id>.edges: A list of connections. For Facebook, these are mutual friendships. For Twitter, they are directed "follows."
  - Format: user_id_1 user_id_2 on each line.
- <id>.feat: A matrix of binary features for all the other users in the ego's network.
- <id>.egofeat: The feature vector for the central "ego" user.
- <id>.circles (Optional): Defines social groups or lists created by the user.
- <id>.featnames (Optional): Provides human-readable names for the features in the .feat files. This is crucial for generating detailed explanations.
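As an illustration of the file formats above, the following sketch reads an .edges file and a .feat file using only the standard library. The helper names are hypothetical. The sketch assumes that each .feat row starts with the user ID followed by 0/1 flags; the row layout is not spelled out above, so check it against your data.

```python
import tempfile
from pathlib import Path

def load_edges(path):
    """Read '<id>.edges': one whitespace-separated pair of user IDs per line."""
    edges = []
    for line in Path(path).read_text().splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            edges.append((parts[0], parts[1]))
    return edges

def load_features(path):
    """Read '<id>.feat'. ASSUMPTION: first column is the user ID, the rest are 0/1 flags."""
    feats = {}
    for line in Path(path).read_text().splitlines():
        parts = line.split()
        if parts:
            feats[parts[0]] = [int(x) for x in parts[1:]]
    return feats

def build_adjacency(edges, directed=False):
    """Adjacency sets: symmetric for Facebook, one-way for Twitter 'follows'."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
        if not directed:
            adj[v].add(u)
    return adj

# Demo with throwaway files so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    edges_file = Path(tmp) / "0.edges"
    feat_file = Path(tmp) / "0.feat"
    edges_file.write_text("1 2\n2 3\n")
    feat_file.write_text("1 0 1 1\n2 1 0 0\n3 0 0 1\n")

    edges = load_edges(edges_file)
    undirected = build_adjacency(edges)
    directed = build_adjacency(edges, directed=True)
    features = load_features(feat_file)

print(undirected)
print(directed)
print(features)
```

Keeping user IDs as strings avoids assumptions about ID ranges; convert to int if your pipeline needs numeric IDs.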
Follow these steps to set up and run the project locally.
- Python 3.8 or higher
- Clone the repository:
    git clone https://github.com/your-username/your-repo-name.git
    cd your-repo-name
- Create and activate a virtual environment (recommended):
    # For macOS/Linux
    python3 -m venv venv
    source venv/bin/activate
    # For Windows
    python -m venv venv
    .\venv\Scripts\activate
- Install the required dependencies:
    pip install -r requirements.txt
  (You will need to create a requirements.txt file. See the section below.)
Create a file named requirements.txt in the root of your project and add the following libraries:
You can interact with the project in two ways: through the interactive dashboard or via the command line.
This is the easiest way to use the project.
- Ensure your data is correctly placed in the data/twitter and/or data/facebook folders.
- Run the following command in your terminal:
    streamlit run app.py
- Open your web browser to the local URL provided by Streamlit (usually http://localhost:8501).
You can also run the recommendation engines directly from the terminal.
- To process a single Twitter user:
    python twitter.py
  The script will prompt you to enter a user ID and the number of recommendations you want.
- To process all Twitter users in batch mode: Run twitter.py and select option 2. The results will be saved in a summary JSON file in the output/ directory.
- To process a single Facebook user:
    python facebook.py
  The script will prompt you to enter a user ID.
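The batch mode can be pictured as a loop over every ego-network file that collects results into one JSON summary. The sketch below is an assumption about how such a loop might look, not the code in twitter.py. The function batch_recommend, the placeholder ranker top_degree_nodes, and the file name summary.json are all hypothetical.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

def batch_recommend(data_dir, output_file, recommend_fn, top_k=5):
    """Run recommend_fn on every '<id>.edges' file and write one JSON summary."""
    results = {}
    for edges_path in sorted(Path(data_dir).glob("*.edges")):
        ego_id = edges_path.stem  # '12831.edges' -> '12831'
        results[ego_id] = recommend_fn(edges_path, top_k)
    output_file = Path(output_file)
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text(json.dumps(results, indent=2))
    return results

def top_degree_nodes(edges_path, top_k):
    """Placeholder ranker (hypothetical): the users that appear in the most edges."""
    counts = Counter()
    for line in Path(edges_path).read_text().splitlines():
        counts.update(line.split())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [node for node, _ in ranked[:top_k]]

# Demo on throwaway files so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    (data_dir / "0.edges").write_text("1 2\n1 3\n")
    (data_dir / "107.edges").write_text("5 6\n")

    out_path = Path(tmp) / "output" / "summary.json"
    summary = batch_recommend(data_dir, out_path, top_degree_nodes, top_k=2)
    saved = json.loads(out_path.read_text())

print(summary)
```

Passing the ranker in as recommend_fn keeps the loop independent of the platform, so the same function could drive either recommender.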
- app.py: The main entry point for the interactive Streamlit web application. It provides a UI to select a platform and user, and it visualizes the recommendations and network graph.
- twitter.py: Contains the TwitterRecommender class. It handles the specific logic for directed graphs (follower/followee relationships) and includes the batch processing mode.
- facebook.py: Contains the FacebookRecommender class. It handles the logic for undirected graphs (mutual friendships).
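The directed case differs from the undirected one mainly in which edges count as evidence. As a rough sketch, assuming a plain dictionary that maps each user to the set of accounts they follow, candidates can be ranked by how many of the user's followees also follow them, with an explanation string built from those followees. The function recommend_follows is hypothetical and is not the TwitterRecommender API.

```python
from collections import defaultdict

def recommend_follows(following, user, k=3):
    """Suggest accounts followed by the people that `user` already follows.

    `following` maps each user to the set of accounts they follow (directed edges).
    Returns (candidate, explanation) pairs, strongest support first.
    """
    already = following.get(user, set())
    supporters = defaultdict(list)
    for friend in sorted(already):
        for candidate in sorted(following.get(friend, set())):
            if candidate != user and candidate not in already:
                supporters[candidate].append(friend)
    # More supporting followees first; ties broken alphabetically for stable output.
    ranked = sorted(supporters.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        (cand, "Followed by " + ", ".join("@" + s for s in sup))
        for cand, sup in ranked[:k]
    ]

# Toy follow graph (hypothetical handles).
following = {
    "alice": {"bob", "carol"},
    "bob": {"dave", "erin"},
    "carol": {"dave"},
    "dave": {"alice"},
}

suggestions = recommend_follows(following, "alice")
for candidate, reason in suggestions:
    print(candidate, "-", reason)
```

Note that dave following alice does not make dave a stronger candidate here; only edges leaving alice's followees are counted, which is the asymmetry an undirected model does not have.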
- Advanced Algorithms: Incorporate more sophisticated graph algorithms like Personalized PageRank or link prediction models (e.g., Adamic-Adar).
- Graph Embeddings: Use techniques like Node2Vec or GraphSAGE to learn richer feature representations of users.
- Quantitative Evaluation: Implement a formal evaluation framework using metrics like Precision@k and Recall@k to benchmark the model's accuracy.
- Real-Time Adaptation: Integrate with graph databases (e.g., Neo4j) to handle dynamic updates for real-world, real-time applications.
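For the quantitative evaluation item, Precision@k and Recall@k can be computed in a few lines once a set of held-out "true" connections exists for a user. The sketch below uses the standard definitions. The variable names and toy data are hypothetical, and how edges are held out is left open.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

# Toy example: a ranked recommendation list and the held-out true connections.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}

p_at_2 = precision_at_k(recommended, relevant, 2)
r_at_2 = recall_at_k(recommended, relevant, 2)
print(p_at_2, r_at_2)
```

With k = 2, only "a" is a hit, so precision is 1/2 and recall is 1/3.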