A Computational Approach to Modeling Conversational Systems

Analyzing Large-Scale Quasi-Patterned Dialogue Flows

Official Implementation of the IEEE EUROCON 2025 Paper
A Computational Approach to Modeling Conversational Systems: Analyzing Large-Scale Quasi-Patterned Dialogue Flows
Mohamed Achref Ben Ammar – National Institute of Applied Science and Technology (INSAT), University of Carthage, Tunisia
Mohamed Taha Bennani – University of Tunis El Manar (FST)

Abstract

The rise of large language models (LLMs) has led to increasingly complex and loosely structured dialogues. In this work, we introduce a computational graph-based framework that models these quasi-patterned conversations. Central to our approach is the Filter & Reconnect method, a graph simplification technique that reduces conversational noise while preserving semantic structure.

Key outcomes:

A 2.06× improvement in the semantic metric S over prior methods
A δ-hyperbolicity of 0, reflecting a tree-like, interpretable structure

This framework offers practical tools for monitoring and analyzing chatbot behavior, dialogue management systems, and user interaction patterns at scale.
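
To illustrate what a δ-hyperbolicity of 0 means, here is a minimal sketch of the four-point condition on a small graph. It assumes an undirected, connected, unweighted graph and the common four-point definition of δ; the exact variant and implementation used in the paper may differ. The function names `bfs_distances` and `four_point_delta` are illustrative and not part of this repository.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Shortest-path hop counts from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def four_point_delta(adj):
    """Largest four-point delta over all quadruples (assumes a connected graph)."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    delta = 0.0
    for x, y, z, w in combinations(adj, 4):
        # The three ways of pairing four points into two distances.
        sums = sorted([
            dist[x][y] + dist[z][w],
            dist[x][z] + dist[y][w],
            dist[x][w] + dist[y][z],
        ])
        # Half the gap between the two largest pairings.
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta

# A small tree (path c - a - b - d) and a 4-cycle, as adjacency lists.
tree = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

print(four_point_delta(tree))   # 0.0
print(four_point_delta(cycle))  # 1.0
```

Trees always yield δ = 0 under this definition, while cycles push δ above 0, which is why a value of 0 is read as tree-like. The brute-force loop over quadruples is only practical for small graphs.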

Methodology Overview

The methodology consists of the following core steps:

1. Utterance Extraction: Conversational utterances are extracted from a structured dataset of multi-turn dialogues.
2. Semantic Embedding: Each utterance is transformed into a dense vector using a pre-trained text embedding model, capturing the semantic meaning of the message.
3. Clustering of Intents: Similar utterances are grouped with hierarchical clustering, and a large language model (LLM) labels each cluster, identifying the key communicative intents.
4. Markov Chain Construction: A Markov chain is built in which nodes represent clustered intents and edges represent transitions between them in the dialogue flow.
5. Graph Simplification (Filter & Reconnect): The conversational graph undergoes noise reduction that removes irrelevant transitions while preserving semantic and structural coherence.
6. Flow Pattern Analysis: The resulting graph is analyzed to identify quasi-patterned conversational flows, improving interpretability and dialogue system evaluation.
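
The Markov chain construction step can be sketched as follows. This is a minimal illustration, assuming each conversation has already been reduced to a sequence of intent labels by the clustering step; the function name `build_transition_probabilities` and the example labels are hypothetical and do not come from this repository.

```python
from collections import Counter, defaultdict

def build_transition_probabilities(intent_sequences):
    """Estimate first-order transition probabilities between intent labels."""
    counts = defaultdict(Counter)
    for sequence in intent_sequences:
        # Count each consecutive (source intent, next intent) pair.
        for src, dst in zip(sequence, sequence[1:]):
            counts[src][dst] += 1
    chain = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        # Normalize counts so each node's outgoing probabilities sum to 1.
        chain[src] = {dst: n / total for dst, n in outgoing.items()}
    return chain

# Hypothetical intent sequences, one per conversation.
sequences = [
    ["greeting", "account_issue", "verify_identity", "resolution"],
    ["greeting", "account_issue", "resolution"],
    ["greeting", "billing_question", "resolution"],
]

chain = build_transition_probabilities(sequences)
for src, outgoing in chain.items():
    print(src, outgoing)
```

Here `greeting` transitions to `account_issue` in two of three conversations, so that edge gets probability 2/3. Intents that only appear at the end of a conversation have no outgoing edges and therefore no entry in `chain`.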

Setup

1. Install Dependencies

# Create and activate a virtual environment (run only the activate line for your OS)
python -m venv venv
source venv/bin/activate        # Linux/macOS
venv\Scripts\activate           # Windows

# Install required packages
pip install -r requirements.txt

# Download required NLP model
python -m spacy download en_core_web_md

2. Create a `.env` File

At the project root, create a .env file and configure the following environment variables:

# Python setup
PYTHONPATH=${PYTHONPATH}:.

# Environment mode
ENVIRONMENT="local"

# API Keys
GOOGLE_API_KEY=
MISTRAL_API_KEY=

Ensure your API keys are valid and have the appropriate access privileges.
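
A quick way to confirm the keys are visible to Python before running the pipeline is sketched below. It assumes both keys are required and that the `.env` values have already been loaded into the process environment; how this project loads `.env` is not shown here, and `missing_keys` is a hypothetical helper, not part of the repository.

```python
import os

# Assumption: both keys listed in the .env template are needed.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "MISTRAL_API_KEY")

def missing_keys(env=None):
    """Return the required keys that are unset or blank in the given mapping."""
    if env is None:
        env = os.environ
    return [key for key in REQUIRED_KEYS if not str(env.get(key, "")).strip()]

if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All required keys are set.")
```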

Input Data Format

This framework supports ABCD v1.1, MultiWOZ 2.0, or any custom dataset formatted as follows:

{
  "conversation_1": [
    {"role": "agent", "content": "Hello, how can I help you today?"},
    {"role": "customer", "content": "I need assistance with my account."},
    {"role": "action", "content": "Agent opened account details."}
  ]
}

Save your data file as: data/processed_formatted_conversations.json
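
Before running the pipeline on a custom dataset, it can help to check that the file matches the structure shown above. The sketch below is one possible check, not part of the repository: `validate_conversations` is a hypothetical helper, and it assumes the three roles in the example (`agent`, `customer`, `action`) are the only ones allowed.

```python
import json

# Assumption: the roles in the README example are the full set.
ALLOWED_ROLES = ("agent", "customer", "action")

def validate_conversations(data):
    """Return a list of problems; an empty list means the data looks well-formed."""
    if not isinstance(data, dict):
        return ["top level must be an object mapping conversation ids to turn lists"]
    problems = []
    for conv_id, turns in data.items():
        if not isinstance(turns, list):
            problems.append(f"{conv_id}: expected a list of turns")
            continue
        for i, turn in enumerate(turns):
            if not isinstance(turn, dict):
                problems.append(f"{conv_id}[{i}]: expected an object")
                continue
            if turn.get("role") not in ALLOWED_ROLES:
                problems.append(f"{conv_id}[{i}]: unexpected role {turn.get('role')!r}")
            content = turn.get("content")
            if not isinstance(content, str) or not content.strip():
                problems.append(f"{conv_id}[{i}]: content must be a non-empty string")
    return problems

sample = json.loads("""
{
  "conversation_1": [
    {"role": "agent", "content": "Hello, how can I help you today?"},
    {"role": "customer", "content": "I need assistance with my account."},
    {"role": "action", "content": "Agent opened account details."}
  ]
}
""")

print(validate_conversations(sample))  # []
```

To check a real file, load it with `json.load` from `data/processed_formatted_conversations.json` and pass the result to the same function.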

Run the Pipeline

python main.py \
    --file_path data/processed_formatted_conversations.json \
    --num_sampled_data 500 \
    --min_clusters 10 \
    --max_clusters 30 \
    --model_name 'sentence-transformers/all-mpnet-base-v2' \
    --label_model 'open-mixtral-8x22b' \
    --tau 0.15 \
    --top_k 2 \
    --alpha 0.8

Advanced Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| `--num_sampled_data` | Number of conversations to sample | 100 |
| `--min_clusters` | Minimum cluster count for elbow method | 5 |
| `--max_clusters` | Maximum cluster count for elbow method | 15 |
| `--model_name` | Sentence embedding model | 'all-MiniLM-L12-v2' |
| `--label_model` | LLM for labeling dialogue state clusters | 'open-mixtral-8x22b' |
| `--tau` | Minimum transition probability threshold | 0.1 |
| `--top_k` | Number of outgoing edges to retain per node | 1 |
| `--alpha` | Balance between semantic similarity and topology | 1.0 |
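
To make the roles of `--tau` and `--top_k` concrete, the sketch below prunes a small transition graph using the descriptions in the table: edges below the probability threshold are dropped, then at most `top_k` outgoing edges are kept per node. This is an illustration only. The order of the two operations, the tie handling, and the function name `filter_edges` are assumptions; the reconnect step and the role of `--alpha` are not shown.

```python
def filter_edges(chain, tau=0.1, top_k=1):
    """Keep, per node, the top_k outgoing edges whose probability is at least tau."""
    filtered = {}
    for src, outgoing in chain.items():
        # Drop transitions below the probability threshold.
        kept = [(dst, p) for dst, p in outgoing.items() if p >= tau]
        # Keep the most probable remaining transitions.
        kept.sort(key=lambda item: item[1], reverse=True)
        filtered[src] = dict(kept[:top_k])
    return filtered

# Hypothetical transition probabilities between intent clusters.
chain = {
    "greeting": {"account_issue": 0.6, "billing_question": 0.3, "small_talk": 0.1},
    "account_issue": {"verify_identity": 0.55, "resolution": 0.4, "greeting": 0.05},
}

pruned = filter_edges(chain, tau=0.15, top_k=2)
for src, outgoing in pruned.items():
    print(src, outgoing)
```

With `tau=0.15` and `top_k=2`, the low-probability `small_talk` and `greeting` edges are removed. With the defaults (`tau=0.1`, `top_k=1`), each node would keep only its single most likely transition.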

Citation

If you use this codebase for your research, please cite:

@inproceedings{achref2025conversationalgraph,
  title={A Computational Approach to Modeling Conversational Systems: Analyzing Large-Scale Quasi-Patterned Dialogue Flows},
  author={Mohamed Achref Ben Ammar and Mohamed Taha Bennani},
  booktitle={IEEE EUROCON 2025 - The 21st International Conference on Smart Technologies},
  year={2025},
  publisher={IEEE}
}

Contact

For questions, collaborations, or feedback, feel free to reach out:

Mohamed Achref Ben Ammar – mohamedachref.benammar@insat.ucar.tn
Mohamed Taha Bennani – taha.bennani@fst.utm.tn
