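📊 Social Media Platform User Analysis System

📝 Project Overview

This project is based on Problem C of a mathematical modeling contest, "Social Media Platform User Analysis". It applies data mining and machine learning to analyze and predict user behavior on a social media platform, providing data to support improvements to the platform's recommendation mechanism.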

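🔍 Problem Description

In recent years, social media platforms have become an important place for people to get information and interact socially. This project studies four core problems:

Blogger follower prediction: predict the number of new followers each blogger gains on 2024.7.21 and identify the bloggers with the fastest follower growth
User follow behavior prediction: predict which new bloggers specific users will follow on 2024.7.22
User-blogger interaction analysis: predict whether users are online and which interactions they are likely to have
Time-slot recommendation analysis: analyze how user activity and blogger interactions vary across time slots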

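The project uses the user behavior data in Attachment 1 (2024.7.11-2024.7.20) and Attachment 2 (2024.7.22). User behaviors are encoded as follows:

1: watch
2: like
3: comment
4: follow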
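
The four codes above can be given readable names before any analysis. The following is a minimal sketch; the `Behavior` enum and the `decode` helper are illustrative names, not part of the project's code.

```python
from enum import IntEnum


class Behavior(IntEnum):
    """Behavior codes as listed above (member names are illustrative)."""
    WATCH = 1
    LIKE = 2
    COMMENT = 3
    FOLLOW = 4


def decode(code: int) -> str:
    """Return a readable label for a raw behavior code."""
    return Behavior(code).name.lower()


# A few raw codes as they might appear in a behavior column.
codes = [1, 4, 2, 4]
labels = [decode(c) for c in codes]
follow_count = sum(1 for c in codes if c == Behavior.FOLLOW)
```

Using an `IntEnum` keeps comparisons against the raw integers working, so existing numeric filters do not need to change.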

🏗️ Project Structure

graph TD
    A[src 📂] -->|main directory| B[models 💾]
    A -->|question module| C[question1 📊]
    A -->|question module| D[question2 🔮]
    A -->|question module| E[question3 🔄]
    A -->|question module| F[question4 ⏰]
    A -->|utility module| G[utils 🔧]
    A -->|resource files| H[resources 📁]
    A -->|outputs| I[outputs 📈]
    
    B -->|stores| B1[question1_models]
    B -->|stores| B2[question2_models]
    B -->|stores| B3[question3_models]
    B -->|stores| B4[question4_models]
    
    C -->|component| C1[data_processor.py]
    C -->|component| C2[feature_engine.py]
    C -->|component| C3[model_trainer.py]
    C -->|component| C4[predictor.py]
    C -->|component| C5[visualizer.py]
    C -->|entry point| C6[__init__.py]
    
    G -->|utility| G1[data_utils.py]
    G -->|utility| G2[feature_utils.py]
    G -->|utility| G3[model_utils.py]
    
    I -->|output| I1[figures]
    I -->|output| I2[results]
    
    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#d5e5f9,stroke:#333,stroke-width:2px
    style C fill:#d5f9e5,stroke:#333,stroke-width:2px
    style D fill:#f9e5d5,stroke:#333,stroke-width:2px
    style E fill:#e5d5f9,stroke:#333,stroke-width:2px
    style F fill:#f9f9d5,stroke:#333,stroke-width:2px
    style G fill:#f9d5e5,stroke:#333,stroke-width:2px
    style H fill:#d5e5f9,stroke:#333,stroke-width:2px
    style I fill:#d5f9e5,stroke:#333,stroke-width:2px

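Directory Structure Details

src/
├── models/                    # Trained models
│   ├── question1_models/      # Models for question 1
│   ├── question2_models/      # Models for question 2
│   ├── question3_models/      # Models for question 3
│   └── question4_models/      # Models for question 4
│
├── question1/                 # Question 1 module: blogger follower prediction
│   ├── __init__.py            # Main entry point
│   ├── data_processor.py      # Data processing
│   ├── feature_engine.py      # Feature engineering
│   ├── model_trainer.py       # Model training
│   ├── predictor.py           # Prediction logic
│   └── visualizer.py          # Visualization
│
├── question2/                 # Question 2 module: user follow behavior prediction
│   ├── __init__.py
│   └── ...
│
├── question3/                 # Question 3 module: user-blogger interaction analysis
│   ├── __init__.py
│   └── ...
│
├── question4/                 # Question 4 module: time-slot recommendation analysis
│   ├── __init__.py
│   └── ...
│
├── utils/                     # Shared utility functions
│   ├── __init__.py
│   ├── data_utils.py          # Shared data processing functions
│   ├── feature_utils.py       # Shared feature engineering functions
│   └── model_utils.py         # Shared model functions
│
├── resources/                 # Resource files
│   ├── Attachment_1.csv       # Historical data, 2024.7.11-2024.7.20
│   └── Attachment_2.csv       # User behavior data, 2024.7.22
│
├── outputs/                   # Outputs
│   ├── figures/               # Charts
│   └── results/               # Tabular results
│
├── config.py                  # Global configuration
├── run_question1.py           # Run script for question 1
├── run_question2.py           # Run script for question 2
├── run_question3.py           # Run script for question 3
├── run_question4.py           # Run script for question 4
└── run_all.py                 # Runs all questions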
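
The tree lists a `config.py` for global configuration but does not show its contents. As a rough sketch of what such a file could hold, the paths in the tree can be expressed with `pathlib`. Every constant name and the `model_dir` helper below are assumptions made for illustration; the real `config.py` may look quite different.

```python
from pathlib import Path

# Hypothetical layout constants derived from the directory tree above.
SRC_DIR = Path("src")
RESOURCES_DIR = SRC_DIR / "resources"
MODELS_DIR = SRC_DIR / "models"
OUTPUTS_DIR = SRC_DIR / "outputs"
FIGURES_DIR = OUTPUTS_DIR / "figures"
RESULTS_DIR = OUTPUTS_DIR / "results"

ATTACHMENT_1 = RESOURCES_DIR / "Attachment_1.csv"  # 2024.7.11-2024.7.20
ATTACHMENT_2 = RESOURCES_DIR / "Attachment_2.csv"  # 2024.7.22


def model_dir(question: int) -> Path:
    """Directory holding the trained models for one question (1 to 4)."""
    if question not in (1, 2, 3, 4):
        raise ValueError(f"unknown question: {question}")
    return MODELS_DIR / f"question{question}_models"
```

Keeping paths in one place means the four question modules and the run scripts can share them instead of hard-coding strings.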

🚀 Installation and Usage

Dependencies

# Core data processing libraries
pandas>=1.3.5
numpy>=1.20.0
dask>=2022.1.0
vaex>=4.9.0
pyarrow>=7.0.0

# Machine learning libraries
scikit-learn>=1.0.2
xgboost>=1.5.2
lightgbm>=3.3.2
catboost>=1.0.6

# Time series analysis
statsmodels>=0.13.2
prophet>=1.1.1
pmdarima>=2.0.0
sktime>=0.13.0
tsfresh>=0.19.0

# Visualization tools
matplotlib>=3.5.1
seaborn>=0.11.2
plotly>=5.6.0
bokeh>=2.4.2

# Deep learning frameworks (optional)
# tensorflow>=2.8.0
# torch>=1.11.0
# keras>=2.8.0

# Social network analysis
networkx>=2.7.1
community>=1.0.0b1

# Model serialization and storage
joblib>=1.1.0

# Utilities
tqdm>=4.63.0
pyyaml>=6.0
# pathlib and logging are part of the Python 3 standard library and need no pip entry

# Development tools
ipykernel>=6.9.1
notebook>=6.4.8
pytest>=7.1.1
black>=22.1.0
flake8>=4.0.1

Installation Steps

Clone the repository

git clone [repository URL]
cd [project folder]

Install dependencies
```
pip install -r requirements.txt
```

Run the project

# Run question 1
python src/run_question1.py

# Run all questions
python src/run_all.py

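🔄 Data Flow

flowchart LR
    A["Data loading 📂"] -->|"raw data"| B["Data preprocessing 🧹"]
    B -->|"cleaned data"| C["Feature engineering ⚙️"]
    C -->|"feature matrix"| D["Model training 🧠"]
    D -->|"save model"| E["Model storage 💾"]
    D -->|"evaluation results"| F["Model evaluation 📏"]
    E -->|"load model"| G["Prediction 🔮"]
    G -->|"predictions"| H["Result visualization 📊"]
    H -->|"charts and tables"| I["Output display 📋"]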
    
    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#d5e5f9,stroke:#333,stroke-width:2px
    style C fill:#d5f9e5,stroke:#333,stroke-width:2px
    style D fill:#f9e5d5,stroke:#333,stroke-width:2px
    style E fill:#e5d5f9,stroke:#333,stroke-width:2px
    style F fill:#f9f9d5,stroke:#333,stroke-width:2px
    style G fill:#f9d5e5,stroke:#333,stroke-width:2px
    style H fill:#d5e5f9,stroke:#333,stroke-width:2px
    style I fill:#d5f9e5,stroke:#333,stroke-width:2px
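
The stages in the flowchart can be traced end to end with a toy example. This is only a sketch using the standard library: the record layout `(user_id, behavior_code, blogger_id, date)`, all function names, and the "mean daily follows" stand-in model are assumptions, and the project's real modules presumably use the libraries listed in the dependencies instead.

```python
import pickle
import tempfile
from collections import Counter, defaultdict
from pathlib import Path

# Toy records: (user_id, behavior_code, blogger_id, date). Layout is assumed.
RAW = [
    ("U1", 4, "B1", "2024-07-19"),
    ("U2", 4, "B1", "2024-07-20"),
    ("U3", 4, "B2", "2024-07-20"),
    ("U3", 1, "B2", "2024-07-20"),
    ("U4", None, "B1", "2024-07-20"),  # malformed row, dropped below
]


def preprocess(rows):
    """Data preprocessing: drop rows with a missing field."""
    return [r for r in rows if all(v is not None for v in r)]


def build_features(rows):
    """Feature engineering: count follow events (code 4) per blogger per day."""
    daily = defaultdict(Counter)
    for _user, code, blogger, day in rows:
        if code == 4:
            daily[blogger][day] += 1
    return daily


def train(features):
    """Model training: mean follows over days with at least one follow."""
    return {b: sum(c.values()) / len(c) for b, c in features.items()}


def save_model(model, path):
    """Model storage: serialize with pickle."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict(model, blogger):
    """Prediction: unknown bloggers fall back to zero."""
    return model.get(blogger, 0.0)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "question1_model.pkl"
    save_model(train(build_features(preprocess(RAW))), model_path)
    restored = load_model(model_path)

prediction = predict(restored, "B1")
```

Saving and reloading the model between training and prediction mirrors the "model storage" step, so prediction can run without retraining.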

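📚 Module Descriptions

1. question1 - Blogger follower prediction

data_processor.py: data loading and preprocessing
feature_engine.py: feature extraction and engineering
model_trainer.py: trains the prediction model
predictor.py: runs the blogger follower prediction
visualizer.py: result visualization and display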
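
As a point of comparison for question 1, a simple baseline is to forecast each blogger's new follows on the next day as the mean of their last few days, then rank bloggers by that forecast. The sketch below is not the project's model; the event layout `(user_id, behavior_code, blogger_id, date)`, the function names, and the window size are assumptions.

```python
import heapq
from collections import defaultdict

FOLLOW = 4  # behavior code for "follow"


def daily_follow_counts(events, days):
    """Return {blogger: [follow count for each day in `days`]}."""
    index = {d: i for i, d in enumerate(days)}
    counts = defaultdict(lambda: [0] * len(days))
    for _user, code, blogger, day in events:
        if code == FOLLOW and day in index:
            counts[blogger][index[day]] += 1
    return dict(counts)


def forecast_next_day(series, window=3):
    """Moving-average forecast: mean of the last `window` daily counts."""
    tail = series[-window:]
    return sum(tail) / len(tail)


def top_growing(events, days, k=5, window=3):
    """Rank bloggers by forecast new follows, highest first."""
    counts = daily_follow_counts(events, days)
    preds = {b: forecast_next_day(s, window) for b, s in counts.items()}
    return heapq.nlargest(k, preds.items(), key=lambda kv: kv[1])


DAYS = ["2024-07-18", "2024-07-19", "2024-07-20"]
EVENTS = [
    ("U1", 4, "B1", "2024-07-18"),
    ("U2", 4, "B1", "2024-07-19"),
    ("U3", 4, "B1", "2024-07-19"),
    ("U4", 4, "B1", "2024-07-20"),
    ("U5", 4, "B1", "2024-07-20"),
    ("U6", 4, "B1", "2024-07-20"),
    ("U7", 4, "B2", "2024-07-20"),
    ("U8", 4, "B2", "2024-07-20"),
    ("U9", 4, "B2", "2024-07-20"),
    ("U1", 1, "B3", "2024-07-20"),  # a watch, not a follow
]

ranking = top_growing(EVENTS, DAYS, k=2)
```

A trained model from `model_trainer.py` would replace `forecast_next_day`; the baseline is mainly useful for checking that a more complex model actually does better.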

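2. question2 - User follow behavior prediction

Predicts which bloggers specific users are likely to follow
Builds a recommendation model from users' historical behavior
Computes user-blogger similarity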
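
One way to read "user-blogger similarity" is item-based collaborative filtering: two bloggers are similar when the same users interact with both, and a candidate blogger scores well for a user when it resembles bloggers that user already follows. The sketch below takes that reading as an assumption; the event layout `(user_id, behavior_code, blogger_id)` and all function names are illustrative.

```python
from collections import defaultdict
from math import sqrt

FOLLOW = 4  # behavior code for "follow"


def audiences(events):
    """Map each blogger to the set of users who interacted with them."""
    aud = defaultdict(set)
    for user, _code, blogger in events:
        aud[blogger].add(user)
    return aud


def cosine(a, b):
    """Cosine similarity of two user sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, events, k=3):
    """Rank bloggers the user does not follow by similarity to followed ones."""
    aud = audiences(events)
    followed = {b for u, code, b in events if u == user and code == FOLLOW}
    scores = {}
    for candidate, users in aud.items():
        if candidate in followed:
            continue
        score = sum(cosine(users, aud[f]) for f in followed)
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=scores.get, reverse=True)[:k]


EVENTS = [
    ("U1", 4, "B1"),
    ("U2", 4, "B1"),
    ("U2", 4, "B2"),
    ("U3", 2, "B2"),
    ("U3", 4, "B3"),
]

suggestions = recommend("U1", EVENTS)
```

Here U1 follows B1, and B2 shares the user U2 with B1, so B2 is the only candidate with a positive score; B3 shares no audience with B1 and is left out.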

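3. question3 - User-blogger interaction analysis

Predicts user online status
Analyzes user interaction behavior
Predicts interaction strength and recommends bloggers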
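
The three steps above could be chained as follows: estimate how often a user is online from their history, and only if that rate passes a threshold, rank bloggers by a weighted interaction score. This is a sketch under stated assumptions, not the project's method: the weights, the 0.5 threshold, the event layout `(user_id, behavior_code, blogger_id, date)`, and the function names are all invented for illustration.

```python
from collections import defaultdict

# Assumed weights: heavier behaviors count for more interaction strength.
WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}


def online_rate(events, user, days):
    """Fraction of the given days on which the user has any event."""
    active = {day for u, _code, _blogger, day in events if u == user}
    return len(active & set(days)) / len(days)


def interaction_strength(events, user):
    """Weighted sum of the user's behaviors toward each blogger."""
    score = defaultdict(float)
    for u, code, blogger, _day in events:
        if u == user:
            score[blogger] += WEIGHTS[code]
    return dict(score)


def predict_interactions(events, user, days, threshold=0.5, k=3):
    """Return likely bloggers, or nothing if the user is rarely online."""
    if online_rate(events, user, days) < threshold:
        return []
    score = interaction_strength(events, user)
    # Sort by descending score, breaking ties by blogger id.
    return sorted(score, key=lambda b: (-score[b], b))[:k]


DAYS = ["2024-07-17", "2024-07-18", "2024-07-19", "2024-07-20"]
EVENTS = [
    ("U1", 1, "B1", "2024-07-17"),
    ("U1", 2, "B1", "2024-07-18"),
    ("U1", 3, "B2", "2024-07-18"),
    ("U1", 1, "B3", "2024-07-20"),
    ("U2", 1, "B1", "2024-07-20"),
]

u1_targets = predict_interactions(EVENTS, "U1", DAYS)
u2_targets = predict_interactions(EVENTS, "U2", DAYS)
```

U1 is active on three of four days and gets a ranked list, while U2 is active on one day only and is treated as offline.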

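4. question4 - Time-slot recommendation analysis

Analyzes user behavior by time slot
Compares recommendation strategies across time slots
Predicts interactions for specific time slots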
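
Time-slot analysis starts with bucketing events by hour. The sketch below counts activity per hour and picks the most-interacted blogger in each hour. The timestamp format `YYYY-MM-DD HH:MM:SS`, the event layout `(user_id, behavior_code, blogger_id, timestamp)`, and the function names are assumptions, not details taken from the attachments.

```python
from collections import Counter, defaultdict
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def hour_of(ts):
    """Hour of day (0 to 23) for a timestamp string."""
    return datetime.strptime(ts, TS_FORMAT).hour


def hourly_activity(events):
    """Count events per hour of day."""
    return Counter(hour_of(ts) for _user, _code, _blogger, ts in events)


def top_blogger_per_hour(events):
    """For each hour, the blogger with the most events in that hour."""
    per_hour = defaultdict(Counter)
    for _user, _code, blogger, ts in events:
        per_hour[hour_of(ts)][blogger] += 1
    return {h: c.most_common(1)[0][0] for h, c in per_hour.items()}


EVENTS = [
    ("U1", 1, "B1", "2024-07-20 08:05:00"),
    ("U2", 2, "B1", "2024-07-20 08:40:00"),
    ("U3", 1, "B2", "2024-07-20 08:59:59"),
    ("U1", 1, "B2", "2024-07-20 21:10:00"),
]

activity = hourly_activity(EVENTS)
peak_hour = activity.most_common(1)[0][0]
best_by_hour = top_blogger_per_hour(EVENTS)
```

The per-hour counts are the raw material for comparing strategies across time slots, for example recommending each hour's top blogger versus a single overall top blogger.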

🔧 Tech Stack

Data processing: pandas, numpy, dask
Machine learning: scikit-learn, xgboost, lightgbm
Time series analysis: statsmodels, prophet
Deep learning: pytorch, tensorflow (optional)
Visualization: matplotlib, seaborn, plotly
Model storage: pickle, joblib
Utilities: tqdm, pathlib, logging

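📊 Result Output

After a run, results are presented in the following forms:

Tabular results printed to the console
CSV result files saved under outputs/results
Visualization charts saved under outputs/figures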
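
The first two output forms can be sketched with the standard library alone. The helper names, the column headers, and the file name `question1_top.csv` are hypothetical, and the example writes into a temporary directory rather than the real outputs folder.

```python
import csv
import tempfile
from pathlib import Path


def save_results(rows, header, out_dir, name):
    """Write rows to a CSV file under out_dir and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / name
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def format_table(rows, header):
    """Render rows as a left-aligned plain-text table for the console."""
    widths = [max(len(str(v)) for v in col) for col in zip(header, *rows)]
    lines = [
        "  ".join(str(v).ljust(w) for v, w in zip(row, widths))
        for row in [header, *rows]
    ]
    return "\n".join(lines)


rows = [("B1", 2.0), ("B2", 1.0)]
header = ("blogger_id", "predicted_new_follows")
table = format_table(rows, header)
print(table)

with tempfile.TemporaryDirectory() as tmp:
    out_path = save_results(
        rows, header, Path(tmp) / "outputs" / "results", "question1_top.csv"
    )
    with open(out_path, newline="", encoding="utf-8") as fh:
        written = list(csv.reader(fh))
```

Charts under outputs/figures would be saved by the visualization code, which is outside the scope of this sketch.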

Repository: oodenX/51MCM
