A CONCEPTUAL TEXT ANALYZER-CLUSTERER

Aydin Manzouri, 2025

SUMMARY

This notebook provides a demo toolbox for conceptual analysis and clustering of text data.

Objective

To analyze and cluster texts by their conceptual load, using a hybrid concept-aggregate approach.

(B) Working with single documents

b.1. The processing pipeline is: filepath → read_txt → nlp → token_ext → concept_matcher → concept_aggregator

b.2. Function concept_aggregator returns a tuple (detailed, aggregated) of data
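
The pipeline and its output tuple can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the lexicon CONCEPTS, the grouping AGGREGATES, the regex tokenizer standing in for the nlp and token_ext stages, the nested shape of detailed, and all function signatures are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical lexicon: concept -> trigger words (not taken from the notebook).
CONCEPTS = {
    "justice": {"fair", "law", "court"},
    "freedom": {"free", "liberty"},
    "fear": {"afraid", "threat"},
}
# Hypothetical grouping of concepts into broader aggregates.
AGGREGATES = {"justice": "ethics", "freedom": "ethics", "fear": "emotion"}


def token_ext(text):
    """Stand-in for the nlp and token_ext stages: lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def concept_matcher(tokens):
    """Count how many tokens trigger each concept."""
    counts = Counter()
    for tok in tokens:
        for concept, words in CONCEPTS.items():
            if tok in words:
                counts[concept] += 1
    return dict(counts)


def concept_aggregator(concept_counts):
    """Return (detailed, aggregated).

    detailed:   aggregate -> concept -> count (assumed shape)
    aggregated: aggregate -> total count
    """
    detailed = {}
    for concept, n in concept_counts.items():
        detailed.setdefault(AGGREGATES[concept], {})[concept] = n
    aggregated = {agg: sum(c.values()) for agg, c in detailed.items()}
    return detailed, aggregated


text = "The court was fair, but people were afraid to speak about liberty."
detailed, aggregated = concept_aggregator(concept_matcher(token_ext(text)))
print(detailed)
print(aggregated)
```

Keeping detailed nested by aggregate means aggregated can always be recomputed from it, so the two halves of the tuple cannot drift apart.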

b.3. Functions json_saver and json_loader save and load the above data tuple in JSON format, respectively
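
A save/load round trip of this kind might look like the sketch below. The signatures, the file layout with the two keys "detailed" and "aggregated", and the sample data are assumptions; the notebook's own functions may differ.

```python
import json
import tempfile
from pathlib import Path


def json_saver(data, path):
    """Write a (detailed, aggregated) tuple to a JSON file."""
    detailed, aggregated = data
    payload = {"detailed": detailed, "aggregated": aggregated}
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def json_loader(path):
    """Read the file back and rebuild the (detailed, aggregated) tuple."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    return payload["detailed"], payload["aggregated"]


# Illustrative data only.
data = (
    {"ethics": {"justice": 2, "freedom": 1}, "emotion": {"fear": 1}},
    {"ethics": 3, "emotion": 1},
)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "doc.json"
    json_saver(data, path)
    restored = json_loader(path)
print(restored == data)
```

JSON has no tuple type, so the sketch stores the two parts under named keys and reassembles the tuple on load, rather than relying on list order.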

b.4. Function aggreg_visu generates and saves a bar chart from aggregated

b.5. Function concept_heatmap generates and saves a heatmap from detailed
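
The two plot functions could be sketched with matplotlib as below. The signatures, the output file names, and the assumption that detailed maps aggregate to concept to count are illustrative choices, not the notebook's actual code.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt
import numpy as np


def aggreg_visu(aggregated, out_path):
    """Save a bar chart with one bar per aggregate."""
    fig, ax = plt.subplots()
    ax.bar(list(aggregated), list(aggregated.values()))
    ax.set_ylabel("count")
    fig.savefig(out_path)
    plt.close(fig)


def concept_heatmap(detailed, out_path):
    """Save a heatmap: aggregates as rows, concepts as columns."""
    rows = list(detailed)
    cols = sorted({c for concepts in detailed.values() for c in concepts})
    # Concepts that do not belong to an aggregate get a zero cell.
    grid = np.array([[detailed[r].get(c, 0) for c in cols] for r in rows])
    fig, ax = plt.subplots()
    im = ax.imshow(grid, cmap="viridis")
    ax.set_xticks(range(len(cols)))
    ax.set_xticklabels(cols, rotation=45, ha="right")
    ax.set_yticks(range(len(rows)))
    ax.set_yticklabels(rows)
    fig.colorbar(im, ax=ax)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return grid


# Illustrative data only.
detailed = {"ethics": {"justice": 2, "freedom": 1}, "emotion": {"fear": 1}}
aggregated = {"ethics": 3, "emotion": 1}
out_dir = Path(tempfile.mkdtemp())
aggreg_visu(aggregated, out_dir / "bar.png")
grid = concept_heatmap(detailed, out_dir / "heatmap.png")
```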

(C) Working with multiple documents

c.1. Function batch_preprocess loads multiple text files and prepares the data for the next steps
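
A batch loader along these lines is sketched below. The signature, the restriction to .txt files, and the analyze callback (a word counter standing in for the single-document pipeline) are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def read_txt(filepath):
    """Read one UTF-8 text file."""
    return Path(filepath).read_text(encoding="utf-8")


def batch_preprocess(folder, analyze):
    """Apply analyze(text) to every .txt file in folder.

    Returns a dict keyed by file stem, in sorted file order.
    """
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        results[path.stem] = analyze(read_txt(path))
    return results


# Demo on a throwaway folder; word counting stands in for the real analysis.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.txt").write_text("one two three", encoding="utf-8")
    (Path(tmp) / "b.txt").write_text("four five", encoding="utf-8")
    batch = batch_preprocess(tmp, analyze=lambda text: len(text.split()))
print(batch)
```

Passing the per-document analysis in as a callable keeps the batch loop independent of how a single document is processed.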

c.2. Function batch_plot generates both plot types (bar chart and heatmap) for a batch of documents

c.3. Functions batch_json_saver and batch_json_loader are batch-process analogs of their respective single-process functions

c.4. Function vectorizer converts batch-preprocessed data into vectorized format to be used in ML operations. It combines detailed and aggregated data into a single DataFrame
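
One way to combine the two kinds of data into a single DataFrame is sketched below, using pandas. The input shape (document name mapped to a (detailed, aggregated) pair), the column prefixes, and the zero fill for concepts absent from a document are assumptions, not the notebook's actual scheme.

```python
import pandas as pd


def vectorizer(batch):
    """Build one row per document from its (detailed, aggregated) pair."""
    rows = {}
    for doc, (detailed, aggregated) in batch.items():
        row = {}
        for concepts in detailed.values():
            for concept, n in concepts.items():
                row[f"concept:{concept}"] = n
        for agg, n in aggregated.items():
            row[f"aggregate:{agg}"] = n
        rows[doc] = row
    # Features missing from a document become 0 rather than NaN.
    return pd.DataFrame.from_dict(rows, orient="index").fillna(0).astype(int)


# Illustrative data only.
batch = {
    "doc_a": (
        {"ethics": {"justice": 2, "freedom": 1}, "emotion": {"fear": 1}},
        {"ethics": 3, "emotion": 1},
    ),
    "doc_b": ({"emotion": {"fear": 4}}, {"emotion": 4}),
}
df_combo = vectorizer(batch)
print(df_combo)
```

The prefixes keep concept-level and aggregate-level columns distinguishable after they are merged into one table.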

c.5. Finally, function cluster performs unsupervised learning, in the form of KMeans clustering. It:

receives data in vectorized format,
performs clustering,
applies PCA to project the high-dimensional data onto two dimensions,
generates and saves the resulting 2D plot,
and returns a tuple (df_combo, cluster_labels)
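
The steps above can be sketched with scikit-learn and matplotlib as follows. The signature, the default number of clusters, the output path, and the sample data are assumptions; only the order of operations (cluster, project with PCA, plot, return) follows the description.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def cluster(df_combo, n_clusters=2, out_path="clusters.png"):
    """Cluster the rows with KMeans, plot them in 2D, return the labels."""
    features = df_combo.to_numpy()
    # Clustering runs in the full feature space.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    cluster_labels = model.fit_predict(features)
    # PCA is used only to obtain 2D coordinates for the plot.
    coords = PCA(n_components=2).fit_transform(features)
    fig, ax = plt.subplots()
    ax.scatter(coords[:, 0], coords[:, 1], c=cluster_labels)
    for name, (x, y) in zip(df_combo.index, coords):
        ax.annotate(str(name), (x, y))
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    fig.savefig(out_path)
    plt.close(fig)
    return df_combo, cluster_labels


# Illustrative data: two documents heavy on "justice", two on "fear".
df = pd.DataFrame(
    {
        "concept:justice": [5, 4, 0, 1],
        "concept:fear": [0, 1, 6, 5],
        "aggregate:ethics": [5, 4, 0, 1],
    },
    index=["doc_a", "doc_b", "doc_c", "doc_d"],
)
plot_path = Path(tempfile.mkdtemp()) / "clusters.png"
df_combo, cluster_labels = cluster(df, n_clusters=2, out_path=plot_path)
print(list(cluster_labels))
```

Because the labels come from the full feature space, points that look close in the 2D plot can still belong to different clusters.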
