An interactive Streamlit-based application for performing traditional and deep learning-based clustering, enhanced with explainable AI (XAI) and Generative AI (GenAI)-powered insights.
- Centroid-based: KMeans
- Density-based: DBSCAN, HDBSCAN
- Connectivity-based: Agglomerative Hierarchical Clustering (Ward, Complete, Single, Average)
- Distribution-based: Gaussian Mixture Models (GMM)
- Grid-based: (Optional extension)
- Self-Organizing Maps (SOM)
- Deep Learning-based: Autoencoder + KMeans
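
As a rough illustration of how several of the listed method families can be run side by side, the sketch below fits four of them with scikit-learn on synthetic data. The dataset, the parameter values (`n_clusters=3`, `eps=0.3`, and so on), and the `labels_by_method` dictionary are assumptions made for this example and are not taken from the app's source.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an uploaded dataset (assumption for this sketch).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

# One representative estimator per family; hyperparameters are illustrative.
models = {
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=42),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
    "Agglomerative (Ward)": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "GMM": GaussianMixture(n_components=3, random_state=42),
}

# Every estimator above exposes fit_predict, so the loop stays uniform.
labels_by_method = {name: model.fit_predict(X) for name, model in models.items()}

for name, labels in labels_by_method.items():
    # DBSCAN marks noise points with the label -1; exclude it from the count.
    n_clusters = len(set(labels) - {-1})
    print(f"{name}: {n_clusters} clusters")
```

HDBSCAN, SOM, and the autoencoder pipeline are left out here because they need additional packages or a training loop.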
- Silhouette Score
- Davies-Bouldin Index
- Calinski-Harabasz Index
- Adjusted Rand Index (ARI)
- Normalized Mutual Information (NMI)
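
The metrics above are all available in `sklearn.metrics`. The following is a minimal sketch of computing them together; the helper name `evaluate_clustering` and the synthetic data are hypothetical and only serve the example. Note that ARI and NMI compare against ground-truth labels, so they can only be computed when such labels exist.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    calinski_harabasz_score,
    davies_bouldin_score,
    normalized_mutual_info_score,
    silhouette_score,
)


def evaluate_clustering(X, labels, y_true=None):
    """Return internal metrics, plus external ones if y_true is given.

    Hypothetical helper for illustration. The internal metrics require
    at least two distinct cluster labels.
    """
    scores = {
        "silhouette": silhouette_score(X, labels),                # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
    }
    if y_true is not None:
        scores["ari"] = adjusted_rand_score(y_true, labels)
        scores["nmi"] = normalized_mutual_info_score(y_true, labels)
    return scores


# Synthetic data with known labels (assumption for this sketch).
X, y_true = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

scores = evaluate_clustering(X, labels, y_true)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```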
- PCA, t-SNE, UMAP
- Visualize cluster separation, outliers, and centroids
- Supports both Matplotlib and Plotly for interactive visuals
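
One possible way to produce such a view is to project the features to two dimensions with PCA and draw the points and centroids with Matplotlib. This is a sketch under assumptions: the synthetic five-feature dataset, the variable names, and the choice of PCA over t-SNE or UMAP are illustrative, not the app's actual plotting code.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic 5-dimensional data (assumption for this sketch).
X, _ = make_blobs(n_samples=200, centers=3, n_features=5, random_state=1)
labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X)

# Reduce to two principal components for plotting.
coords = PCA(n_components=2, random_state=1).fit_transform(X)

# Centroid of each cluster in the projected 2D space.
centroids_2d = np.vstack(
    [coords[labels == k].mean(axis=0) for k in np.unique(labels)]
)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="viridis", s=15)
ax.scatter(centroids_2d[:, 0], centroids_2d[:, 1],
           c="red", marker="x", s=80, label="centroids")
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_title("Clusters in PCA space")
ax.legend()
```

In a Streamlit app, a figure like this would typically be rendered with `st.pyplot(fig)`.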
- Cluster Summary Generator: Uses the Hugging Face API (google/flan-t5-base) to summarize cluster characteristics
- Accepts a Hugging Face API key via input prompt
- Automatically generates human-readable insights based on cluster stats
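
To make "insights based on cluster stats" concrete, the sketch below turns per-cluster means and sizes into a text prompt that a model such as google/flan-t5-base could summarize. The function name `build_cluster_prompt`, the `cluster` column name, and the prompt wording are all hypothetical; the app's real prompt and the API request itself are not shown in this document, so the request is deliberately omitted.

```python
import pandas as pd


def build_cluster_prompt(df, label_col="cluster"):
    """Build a summarization prompt from per-cluster means and sizes.

    Hypothetical helper for illustration only.
    """
    # Mean of every numeric feature per cluster, rounded for readability.
    stats = df.groupby(label_col).mean(numeric_only=True).round(2)
    sizes = df[label_col].value_counts().sort_index()

    lines = []
    for cluster_id, row in stats.iterrows():
        features = ", ".join(f"{col} mean {val}" for col, val in row.items())
        lines.append(f"Cluster {cluster_id} ({sizes[cluster_id]} rows): {features}")

    return "Summarize the following clusters in plain language:\n" + "\n".join(lines)


# Tiny made-up table standing in for clustered output.
df = pd.DataFrame(
    {
        "age": [20, 30, 50],
        "income": [30.0, 40.0, 90.0],
        "cluster": [0, 0, 1],
    }
)

prompt = build_cluster_prompt(df)
print(prompt)
```

The resulting string would then be sent to the model together with the user's API key.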
git clone https://github.com/sivkri/GenAI-cluster-analyzer
cd GenAI-cluster-analyzer
pip install -r requirements.txt
streamlit run streamlit_bot.py
- Upload your dataset (CSV format)
- Preview the first 5 rows
- Select columns to drop
- Select features for clustering
- Choose clustering method(s)
- View evaluation metrics and interactive visualizations
- Optionally generate a GenAI-powered cluster summary
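
Outside the UI, the data-preparation part of this workflow corresponds roughly to the pandas steps sketched below. The inline CSV, the column names, and the standardization step are assumptions for the example; the document does not say whether the app scales features before clustering.

```python
import io

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Inline CSV standing in for an uploaded file (made-up data).
csv_text = """id,age,income,city
1,25,40000,Oslo
2,32,52000,Rome
3,47,81000,Oslo
4,51,90000,Lima
5,23,38000,Rome
6,45,79000,Lima
"""

# Upload: read the CSV into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Preview the first 5 rows.
preview = df.head(5)
print(preview)

# Drop columns that should not influence clustering (here, an identifier).
df = df.drop(columns=["id"])

# Select the numeric features to cluster on.
feature_cols = ["age", "income"]

# Standardize so that no feature dominates distance computations
# (an assumption of this sketch, not a documented app behavior).
X = StandardScaler().fit_transform(df[feature_cols])
print(X.shape)
```

The array `X` is what a clustering estimator would then receive.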
To enable GenAI cluster summarization:
- Go to Hugging Face Tokens
- Generate a read access token
- Paste your API key into the app when prompted