This project provides a semantic text analysis service with a FastAPI-based REST API and a command-line interface (CLI). It extracts topical meaning from text and generates word clouds in which each word is sized by its semantic importance.
- Semantic Analysis: Extracts topics from text using sentence embeddings and clustering.
- Word Cloud Generation: Creates customizable word clouds from extracted topics.
- FastAPI: Exposes endpoints for analysis and health checks.
- CLI Interface: Allows for easy interaction with the service from the command line.
- Multilingual Support: Handles English, Hebrew, Greek, and Latin text.
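The "sentence embeddings and clustering" step can be illustrated with a small sketch. This is not the project's implementation: the `kmeans` helper, the choice of k-means as the clustering method, and the toy 2-D vectors are all assumptions made for illustration. In the real service the vectors would come from the sentence transformer set up in models/vectorizer.py.

```python
from math import dist


def kmeans(vectors, k, iterations=10):
    """Plain k-means; returns a cluster index for each vector."""
    # Seed centroids with the first k vectors to keep the sketch deterministic.
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(v, centroids[c])) for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy 2-D "embeddings" standing in for real sentence vectors (hypothetical data).
sentences = ["cats purr", "stocks fell", "dogs bark", "markets rallied"]
embeddings = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]

labels = kmeans(embeddings, k=2)

# Group sentences by cluster; each group is a candidate topic.
topics = {}
for sentence, label in zip(sentences, labels):
    topics.setdefault(label, []).append(sentence)
```

Sentences whose vectors lie close together end up in the same group, which is the basic idea behind treating each cluster as a topic.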
text-semantic-analyzer/
├── app.py # FastAPI server with MCP wrapper
├── cli.py # CLI interface
├── models/
│ ├── vectorizer.py # Sentence transformer setup
│ ├── semantic_analyzer.py # Topic/meaning extraction
│ └── wordcloud_gen.py # Word cloud generation
├── requirements.txt
├── README.md
└── .env.example
- Clone the repository:
    git clone <repository-url>
    cd text-semantic-analyzer
- Install dependencies:
    pip install -r requirements.txt
- Run the FastAPI server:
    uvicorn app:app --reload
- POST /analyze: Analyzes text and returns topics, a word cloud, and embeddings.
- GET /health: Health check endpoint.
- POST /mcp/analyze: MCP wrapper for the analyze endpoint.
curl -X POST "http:
{
"text": "This is a sample text for semantic analysis.",
"topics": 3,
"style": "dark"
}
'
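The same request can also be built from Python using only the standard library. The sketch below makes some assumptions: the base URL is uvicorn's default bind address (127.0.0.1:8000), which may not match your deployment, and `build_analyze_request` is a hypothetical helper rather than part of this project. The JSON fields are the ones shown in the request body above.

```python
import json
from urllib import request

# Assumed base URL: uvicorn binds to 127.0.0.1:8000 unless told otherwise.
BASE_URL = "http://127.0.0.1:8000"


def build_analyze_request(text, topics=3, style="dark", base_url=BASE_URL):
    """Build a POST /analyze request carrying the JSON body shown above."""
    payload = {"text": text, "topics": topics, "style": style}
    return request.Request(
        f"{base_url}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_analyze_request("This is a sample text for semantic analysis.")

# Sending requires the server to be running:
# with request.urlopen(req) as response:
#     result = json.loads(response.read())
```

Building the request separately from sending it makes it easy to inspect the URL and payload before the server is involved.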
# Analyze text from a string
python cli.py analyze --text "Your text here" --output wordcloud.png
# Analyze text from a file
python cli.py analyze --file document.txt --topics 5 --style dark
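
For readers curious how such a command line might be parsed, here is a minimal argparse sketch that accepts the flags used in the commands above. It is an illustration only and the actual cli.py may be structured differently. The default values and the rule that `--text` and `--file` are mutually exclusive are assumptions.

```python
import argparse


def build_parser():
    """Parser accepting the flags used in the commands above."""
    parser = argparse.ArgumentParser(prog="cli.py")
    subcommands = parser.add_subparsers(dest="command", required=True)

    analyze = subcommands.add_parser("analyze", help="Analyze text")
    # Exactly one input source: an inline string or a file path (assumed rule).
    source = analyze.add_mutually_exclusive_group(required=True)
    source.add_argument("--text", help="Text to analyze")
    source.add_argument("--file", help="Path to a text file to analyze")
    # Defaults below are assumptions, not documented values.
    analyze.add_argument("--topics", type=int, default=3)
    analyze.add_argument("--style", default="default")
    analyze.add_argument("--output", default="wordcloud.png")
    return parser


# Parse the second example command from above.
args = build_parser().parse_args(
    ["analyze", "--file", "document.txt", "--topics", "5", "--style", "dark"]
)
```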