By Andrew Horton
LyricMind is a song lyric analysis system. It integrates multiple lyrics providers, utilizes language models for in-depth lyrical analysis, and offers both a Command Line Interface (CLI) and a RESTful API that is designed to be used by music players.
LyricMind performs custom lyrical analysis using portable, text-based 'frameworks'. This approach lets you integrate insights from diverse disciplines. For instance, you can draw from cognitive psychology (including principles from Cognitive Behavioral Therapy, CBT, such as identifying cognitive distortions) to understand lyrical impact. Frameworks can also draw on literary theory (e.g., analyzing narrative structure, symbolism, or poetic devices) and communication/media studies (e.g., examining how lyrical messages are framed or their potential societal influence). The current example frameworks primarily emphasize psychological and wellness aspects, but you can create tailored analytical lenses for a wide range of inquiries. Frameworks are easy to combine, modify, and share, so you can experiment freely and define your own metrics for understanding lyrics.
The core of LyricMind lies in its sophisticated analysis of song lyrics, achieved by applying user-defined metrics via powerful Large Language Models (LLMs). This is an ideal application of AI: LLMs understand language by mapping words, phrases, and their contexts into a complex 'vector space.' In this space, semantic similarity and relationships between concepts are represented by proximity, allowing the models to grasp nuanced meanings, identify thematic content, and assess the potential impacts embedded within lyrical language. Users can craft their own metrics or utilize existing frameworks to guide this AI-driven measurement of specific lyrical attributes. The detailed analysis generated by LyricMind can then be leveraged by external systems, like a music player, to trigger actions when lyrics meet user-specified thresholds.
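To make the "proximity in vector space" idea concrete, here is a minimal sketch using cosine similarity. The three-dimensional vectors are invented for illustration only; real model embeddings have hundreds or thousands of dimensions, and nothing here reflects how LyricMind itself calls an LLM.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings", made up for this example (not produced by any model).
toy_vectors = {
    "heartbreak": [0.9, 0.1, 0.2],
    "sorrow": [0.8, 0.2, 0.3],
    "celebration": [0.1, 0.9, 0.4],
}

sim_close = cosine_similarity(toy_vectors["heartbreak"], toy_vectors["sorrow"])
sim_far = cosine_similarity(toy_vectors["heartbreak"], toy_vectors["celebration"])

print(f"heartbreak vs sorrow:      {sim_close:.2f}")
print(f"heartbreak vs celebration: {sim_far:.2f}")
```

Related concepts point in similar directions and score close to 1, while unrelated ones score lower.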
A key application of LyricMind is integration with a Spotify player (a separate project) to act as a subconscious firewall for the mind, limiting negative effects from repeated exposure to lyrics. When a song's lyrical analysis, performed using specialized frameworks such as 'vanilla', 'sage', or 'cinnamon', exceeds user-defined thresholds, the player can trigger a user-defined action. Actions can include proactively skipping to the next track or just logging the played song; removing or replacing the lyrics is planned.
The system includes several pre-defined frameworks for analysis:
- Vanilla Framework: Provides a foundational analysis, objectively measuring common sensitive content like explicit language, sexual themes, violence, and substance promotion. It also includes a basic assessment of negative psychological impact based on cognitive distortions, serving as a general-purpose content filter.
- Sage Framework: Specifically designed for young people (ages 12-25), this framework evaluates lyrics for their developmental value. It focuses on themes supporting identity formation, emotional intelligence, and healthy life skills, prioritizing positive youth development while also identifying potentially harmful messaging.
- Cinnamon Framework: Adopts a holistic wellness perspective, analyzing lyrics for their broader emotional, psychological, and developmental impact. It assesses how music might influence overall listener wellbeing, considering both potential risks (e.g., rumination, helplessness) and positive attributes (e.g., empowerment, resilience).
Examples of each framework can be found in the analysis-samples directory.
These frameworks utilize various metrics, including:
- Negative sentiment intensity: Assesses the average negative sentiment in the lyrics.
- Trigger words frequency: Counts the frequency of specific words or phrases related to anxiety, rumination, or other negative themes.
- Repetitive negative lyrical themes: Identifies recurring negative or harmful lyrical patterns, such as toxic relationships or substance abuse.
These metrics are used to calculate a risk score, which can then be compared to user-defined thresholds to determine if the song should be skipped or logged. This functionality is inspired by Cognitive Behavioral Therapy (CBT) principles, offering users a practical tool to mindfully curate their listening environment and manage exposure to lyrical content that might otherwise impact their mood or thought patterns.
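The scoring step described above could be sketched as a weighted average followed by a threshold check. The metric keys, weights, and threshold values below are hypothetical; the actual formula depends on the framework in use and is not specified in this document.

```python
# Hypothetical metric names and weights; a real framework defines its own.
WEIGHTS = {
    "negative_sentiment_intensity": 0.5,
    "trigger_word_frequency": 0.3,
    "repetitive_negative_themes": 0.2,
}


def risk_score(metrics, weights=WEIGHTS):
    """Weighted average of metric values, each expected in the range 0..1."""
    total_weight = sum(weights.values())
    weighted = sum(w * metrics.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


def choose_action(score, skip_threshold=0.7, log_threshold=0.4):
    """Map a risk score to a player action using user-defined thresholds."""
    if score >= skip_threshold:
        return "skip"
    if score >= log_threshold:
        return "log"
    return "play"


example_metrics = {
    "negative_sentiment_intensity": 0.8,
    "trigger_word_frequency": 0.6,
    "repetitive_negative_themes": 0.5,
}
score = risk_score(example_metrics)
print(f"risk score {score:.2f} -> {choose_action(score)}")
```

Keeping the thresholds as plain parameters means a music player can apply different limits per listener without changing the analysis itself.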
Future plans include the ability to remove or replace lyrics from songs based on the analysis. This would allow users to benefit from the analysis without having to remove the song from their library.
It's easy to make your own framework. No need to code. Just write a text file with your metrics and you're good to go.
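As a rough illustration of what "just a text file" might mean in practice, the sketch below writes a framework file and joins it with lyrics into a single prompt. The framework wording, the `calm-evening` name, and the `load_framework` and `build_prompt` helpers are all hypothetical; the bundled files such as vanilla.txt are the authoritative reference for the real layout.

```python
import tempfile
from pathlib import Path

# Hypothetical framework text; the real file layout is not specified here.
FRAMEWORK_TEXT = """\
Framework: calm-evening (hypothetical example)
Rate each metric from 0.0 to 1.0 and justify the score in one sentence.
- hopelessness: how strongly the lyrics express that nothing can improve
- aggression: how strongly the lyrics endorse hostility towards others
"""


def load_framework(directory: Path, name: str) -> str:
    """Read <directory>/<name>.txt, mirroring file names such as vanilla.txt."""
    return (directory / f"{name}.txt").read_text(encoding="utf-8")


def build_prompt(framework_text: str, lyrics: str) -> str:
    """Join framework instructions and lyrics into one prompt string."""
    return f"{framework_text.strip()}\n\nLyrics:\n{lyrics.strip()}\n"


# A temporary directory stands in for the project's frameworks/ folder.
with tempfile.TemporaryDirectory() as tmp:
    frameworks_dir = Path(tmp)
    (frameworks_dir / "calm-evening.txt").write_text(FRAMEWORK_TEXT, encoding="utf-8")
    prompt = build_prompt(load_framework(frameworks_dir, "calm-evening"), "La la la")

print(prompt)
```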
- Multiple Lyrics Providers: Integrates with Genius, Musixmatch, and Lyrics.ovh to find lyrics.
- Caching: Caches lyrics and analysis results in an SQLite database to improve performance and reduce external API calls.
- Configurable LLM Analysis: Leverages language models (OpenAI, Anthropic, etc.) via LangChain for lyrical analysis using customizable frameworks.
- Flexible Frameworks: Supports different analysis perspectives (e.g., thematic, narrative, poetic devices) through simple text-based framework files.
- Dual Interface: Accessible via a powerful CLI for direct interaction and a Flask-based REST API for programmatic access.
- Configuration Management: Centralized configuration via config.yaml for API keys, database settings, logging, and provider preferences.
- Docker Support: Includes considerations for efficient Docker deployment, such as model caching and command timeouts.
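The multi-provider lookup listed above might work along these lines: try each provider in order of preference and accept the first result that is long enough to be real lyrics. This is a sketch under assumptions, not the project's code. The `find_lyrics` function and the stub fetchers are hypothetical, and the `provider_order` and `min_lyric_length` parameters simply mirror the setting names in the example config.yaml.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A fetcher takes (artist, title) and returns lyrics text or None.
LyricsFetcher = Callable[[str, str], Optional[str]]


def find_lyrics(
    artist: str,
    title: str,
    providers: Dict[str, LyricsFetcher],
    provider_order: List[str],
    min_lyric_length: int = 50,
) -> Optional[Tuple[str, str]]:
    """Return (provider_name, lyrics) from the first usable provider, else None."""
    for name in provider_order:
        fetch = providers.get(name)
        if fetch is None:
            continue  # provider not configured
        try:
            lyrics = fetch(artist, title)
        except Exception:
            continue  # a failing provider should not stop the search
        if lyrics and len(lyrics.strip()) >= min_lyric_length:
            return name, lyrics
    return None


# Stub fetchers standing in for real provider clients.
def musixmatch_stub(artist: str, title: str) -> Optional[str]:
    return "Too short"


def genius_stub(artist: str, title: str) -> Optional[str]:
    return "la " * 40


stub_providers = {"musixmatch": musixmatch_stub, "genius": genius_stub}
result = find_lyrics(
    "Artist Name", "Song Title", stub_providers,
    provider_order=["musixmatch", "genius", "lyrics_ovh"],
)
print(result[0] if result else "no lyrics found")
```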
LyricMind's innovative approach to lyrical analysis is built upon a robust interdisciplinary foundation, integrating insights from psychology, media studies, and literary analysis. This allows for a nuanced and customizable understanding of lyrical content, moving beyond simplistic keyword spotting or static sentiment analysis.
Extensive research demonstrates the strong influence of lyrical content on emotion, mood, and cognition. Lyrical sentiment is a reliable predictor of how listeners emotionally engage with a song, sometimes even more so than the music itself (Bhattacharya & Kadambari, 2018). Cognitive processing of lyrics has been linked to emotional outcomes, with studies showing that listeners at risk of depression gravitate toward songs with more complex or negative lyrical themes (Shriram et al., 2021).
LyricMind incorporates these findings by allowing users to build frameworks inspired by psychological models such as Cognitive Behavioral Therapy (CBT), including metrics that identify cognitive distortions, negative affect, or ruminative language—tools which mirror clinical intervention strategies (Ko, 2014).
Media effects research has long explored how repeated exposure to content—such as themes of violence, despair, or substance abuse—can shape audience beliefs and attitudes over time. Music lyrics are no exception. A 2024 study confirmed that the emotional tone of lyrics has become increasingly negative and repetitive over the past 50 years, especially in pop and hip-hop genres (Parada-Cabaleiro et al., 2024).
This trend has critical implications for younger listeners. Adolescents are especially susceptible to the influence of music due to their developmental stage, emotional sensitivity, and identity formation processes. LyricMind’s "Sage" framework was designed with this in mind, helping users assess whether songs promote emotional intelligence, reinforce stereotypes, or support healthy psychosocial development.
LyricMind expands traditional literary analysis by embedding Natural Language Processing (NLP) and Large Language Model (LLM) capabilities into its framework system. Research supports that NLP tools such as SentiWordNet and LIWC can reliably extract emotional and psychological cues from lyrical texts (Sharma et al., 2016; Xu et al., 2021).
This allows LyricMind to move beyond simple topic detection and sentiment labeling, enabling users to detect recurring themes—such as toxic relationships or glorified self-harm—and assess their frequency, intensity, and rhetorical framing. Such an approach bridges qualitative interpretation with quantifiable data.
What sets LyricMind apart is its user-defined, framework-driven model. While most tools rely on static filters or fixed sentiment rules, LyricMind empowers users to define their own metrics using plain-text frameworks. This facilitates context-aware, personalized analysis suitable for use cases ranging from wellness curation and education to therapeutic support and social impact design.
By leveraging large language models to detect nuanced patterns in lyrical language—and aligning this analysis with psychological, literary, and cultural frameworks—LyricMind delivers an advanced toolkit for understanding and managing lyrical influence in a deeply human-centric way.
lyrics-analysis-system/
├── api/ # Flask API (routes.py, __init__.py)
├── cli/ # Click CLI (main.py, __init__.py)
├── data/ # SQLite database storage (e.g., lyrics_cache.db)
├── frameworks/ # Analysis framework prompt files (e.g., vanilla.txt)
├── src/
│ ├── core/ # Core logic (config.py, database.py, discovery.py, analyzer.py)
│ └── providers/ # Lyrics provider implementations (genius.py, musixmatch.py, etc.)
├── .gitignore
├── config.yaml # Main configuration file
├── README.md # This file
├── requirements.txt # Python dependencies (to be created)
├── run_api.py # Script to run the Flask API server
├── Dockerfile # (Optional, for containerization - to be created)
└── docker-compose.yml # (Optional, for containerization - to be created)
- Python 3.8+ recommended.
- pip for package installation.
git clone <repository_url> # Replace with your repo URL
cd lyrics-analysis-system
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Create a requirements.txt file with the following content:
# Core
PyYAML
SQLAlchemy
requests
# Langchain & LLM Support (choose based on your LLM provider)
langchain
langchain-openai # If using OpenAI
langchain-anthropic # If using Anthropic
# API
Flask
# CLI
click
# Optional: For development/linting
# pylint
# autopep8
Then install them:
pip install -r requirements.txt
Copy config.yaml-default to config.yaml, or create config.yaml in the project root. Populate it with your API keys and desired settings:
# config.yaml (Example - fill with your actual values)
llm:
  provider: "openai" # or "anthropic"
  model_name: "gpt-3.5-turbo"
  temperature: 0.7
  # API keys are preferably loaded from environment variables (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY).
  # If not found in environment, the system will look for them here.
  openai_api_key: "YOUR_OPENAI_API_KEY_IF_NOT_IN_ENV" # Optional here if OPENAI_API_KEY env var is set
  anthropic_api_key: "YOUR_ANTHROPIC_API_KEY_IF_NOT_IN_ENV" # Optional here if ANTHROPIC_API_KEY env var is set
logging:
  level: "INFO" # DEBUG, INFO, WARNING, ERROR, CRITICAL
  log_file: "data/app.log" # Optional: path to log file, relative to project root
lyrics_providers:
  genius_token: "YOUR_GENIUS_ACCESS_TOKEN"
  musixmatch_api_key: "YOUR_MUSIXMATCH_API_KEY"
  # lyrics_ovh is free and requires no key
  provider_order: ["musixmatch", "genius", "lyrics_ovh"] # Order of preference
  cache_expiry_days: 30
  min_lyric_length: 50 # Minimum characters for lyrics to be considered valid
database:
  url: "sqlite:///data/lyrics_cache.db" # Path to SQLite database file
api:
  host: "0.0.0.0"
  port: 5001
  debug: true
framework_settings:
  directory: "frameworks" # Relative to project root
  default_framework: "vanilla"
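The comments in the example configuration say that API keys are read from environment variables first, with config.yaml as the fallback. A minimal sketch of that lookup order follows. The `resolve_api_key` helper is hypothetical, and a plain dict stands in for the parsed `llm` section so the example has no dependencies.

```python
import os

# Environment variable and config key names taken from the example config above.
ENV_VARS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}
CONFIG_KEYS = {"openai": "openai_api_key", "anthropic": "anthropic_api_key"}


def resolve_api_key(llm_config, environ=None):
    """Return the API key for the configured provider, preferring the environment."""
    environ = os.environ if environ is None else environ
    provider = llm_config["provider"]
    if provider not in ENV_VARS:
        raise ValueError(f"Unsupported LLM provider: {provider!r}")
    key = environ.get(ENV_VARS[provider]) or llm_config.get(CONFIG_KEYS[provider])
    if not key:
        raise ValueError(
            f"No API key found: set {ENV_VARS[provider]} or "
            f"llm.{CONFIG_KEYS[provider]} in config.yaml"
        )
    return key


# Stand-in for the parsed `llm` section of config.yaml.
llm_section = {"provider": "openai", "openai_api_key": "key-from-config"}

from_env = resolve_api_key(llm_section, environ={"OPENAI_API_KEY": "key-from-env"})
from_file = resolve_api_key(llm_section, environ={})
print(from_env, from_file)
```

Passing `environ` explicitly keeps the function easy to test; in normal use it falls back to `os.environ`.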
Important: The application will attempt to create the data/ directory (e.g., for data/app.log or data/lyrics_cache.db) if it doesn't exist, provided it has the necessary write permissions. If you encounter issues, ensure the parent directory is writable or create data/ manually.
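If you prefer to create the directory up front, something like the following would do it. This is a sketch only: `sqlite_file_path` and `ensure_parent_dir` are hypothetical helpers, and the URL handling covers just the `sqlite:///relative/path` form used in the example configuration (plus the four-slash absolute form).

```python
import tempfile
from pathlib import Path

SQLITE_PREFIX = "sqlite:///"


def sqlite_file_path(url: str, project_root: Path) -> Path:
    """Turn a sqlite:/// URL into a file path, relative to the project root."""
    if not url.startswith(SQLITE_PREFIX):
        raise ValueError(f"Not a SQLite file URL: {url!r}")
    # A fourth slash yields an absolute path, which pathlib keeps as-is.
    return project_root / url[len(SQLITE_PREFIX):]


def ensure_parent_dir(file_path: Path) -> Path:
    """Create the parent directory of file_path if it is missing."""
    file_path.parent.mkdir(parents=True, exist_ok=True)
    return file_path.parent


# A temporary directory stands in for the project root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    db_path = sqlite_file_path("sqlite:///data/lyrics_cache.db", root)
    data_dir = ensure_parent_dir(db_path)
    data_dir_created = data_dir.is_dir()
    data_dir_name = data_dir.name

print(data_dir_name, data_dir_created)
```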
The CLI provides direct access to the system's functionalities. Ensure your virtual environment is active and you are in the project root directory.
General Syntax:
python cli/main.py [OPTIONS] COMMAND [ARGS]...
Common Options:
- --config PATH: Path to config.yaml (defaults to config.yaml in the current or project root directory).
- --log-level [DEBUG|INFO|WARNING|ERROR|CRITICAL]: Overrides log level from config.
Commands:
- Search for Lyrics:
  python cli/main.py search "Artist Name" "Song Title"
  python cli/main.py search "Eminem" "Kim" --format json --force-refresh
  Search for Lyrics (Docker):
  docker compose run --rm cli search "Eminem" "Kim"
- Analyze Lyrics from a File: Create a file my_lyrics.txt with song lyrics.
  python cli/main.py analyze my_lyrics.txt
  python cli/main.py analyze my_lyrics.txt --framework cinnamon --format text
- Search and Analyze a Song:
  python cli/main.py analyze-song "Queen" "Bohemian Rhapsody"
  python cli/main.py analyze-song "Nirvana" "Smells Like Teen Spirit" --framework sage --format json
  Search and Analyze a Song (Docker):
  docker compose run --rm cli analyze-song "Queen" "Bohemian Rhapsody" --framework vanilla
- List Available Lyrics Providers:
  python cli/main.py providers
- List Available Analysis Frameworks:
  python cli/main.py frameworks
- Clear Cache:
  python cli/main.py clear-cache --days 60 # Clear entries older than 60 days
- Test LLM Connection:
  python cli/main.py test-llm
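To illustrate what an age-based cache clear such as `clear-cache --days 60` might do underneath, here is a sketch using the standard-library sqlite3 module. The project itself lists SQLAlchemy as a dependency, so this is a swapped-in simplification. The `lyrics_cache` table, its columns, and `clear_old_entries` are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema; the project's real tables are defined in its own code.
SCHEMA = """
CREATE TABLE IF NOT EXISTS lyrics_cache (
    artist TEXT NOT NULL,
    title TEXT NOT NULL,
    lyrics TEXT NOT NULL,
    fetched_at TEXT NOT NULL,  -- ISO-8601 UTC timestamp, second precision
    PRIMARY KEY (artist, title)
)
"""


def to_iso(moment: datetime) -> str:
    """Format timestamps uniformly so that string comparison matches time order."""
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


def clear_old_entries(conn, days, now=None):
    """Delete cache rows older than `days` days and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = to_iso(now - timedelta(days=days))
    cursor = conn.execute("DELETE FROM lyrics_cache WHERE fetched_at < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO lyrics_cache VALUES (?, ?, ?, ?)",
    [
        ("Queen", "Bohemian Rhapsody", "...", to_iso(now - timedelta(days=90))),
        ("Coldplay", "Yellow", "...", to_iso(now - timedelta(days=10))),
    ],
)
removed = clear_old_entries(conn, days=60, now=now)
remaining = conn.execute("SELECT COUNT(*) FROM lyrics_cache").fetchone()[0]
print(f"removed {removed}, remaining {remaining}")
```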
The API provides programmatic access to the system's features.
1. Run the API Server:
From the project root directory:
python run_api.py
The server will typically start on http://0.0.0.0:5001 (or as configured in config.yaml).
API Endpoints:
- GET /api/health: Health check for the API.
  curl http://localhost:5001/api/health
- POST /api/discovery/search: Search for lyrics.
  - Request Body (JSON):
    { "artist": "Artist Name", "title": "Song Title", "force_refresh": false }
  - Example:
    curl -X POST -H "Content-Type: application/json" \
      -d '{"artist": "Coldplay", "title": "Yellow"}' \
      http://localhost:5001/api/discovery/search
- GET /api/discovery/providers: List available lyrics providers.
  curl http://localhost:5001/api/discovery/providers
- POST /api/analyzer/analyze: Analyze provided lyrics text.
  - Request Body (JSON):
    { "lyrics": "Your song lyrics text here...", "framework": "vanilla" }
  - Example:
    curl -X POST -H "Content-Type: application/json" \
      -d '{"lyrics": "Imagine all the people...", "framework": "vanilla"}' \
      http://localhost:5001/api/analyzer/analyze
- POST /api/analyzer/analyze-song: Discover and analyze lyrics for a song.
  - Request Body (JSON):
    { "artist": "Artist Name", "title": "Song Title", "framework": "sage", "force_refresh": false }
  - Example:
    curl -X POST -H "Content-Type: application/json" \
      -d '{"artist": "Led Zeppelin", "title": "Stairway to Heaven", "framework": "sage"}' \
      http://localhost:5001/api/analyzer/analyze-song
- GET /api/analyzer/frameworks: List available analysis frameworks.
  curl http://localhost:5001/api/analyzer/frameworks
- GET /api/analyzer/frameworks/{framework_name}: Get details of a specific analysis framework.
  - Example:
    curl http://localhost:5001/api/analyzer/frameworks/vanilla
- POST /api/cache/clear: Clear system cache.
  - Request Body (JSON):
    { "days": 30 }
  - Example:
    curl -X POST -H "Content-Type: application/json" \
      -d '{"days": 60}' \
      http://localhost:5001/api/cache/clear
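For callers such as a music player, the same analyze-song request can be issued from Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the base URL assumes the default port from the example configuration, and the response is assumed to be JSON, since its exact shape is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # assumes the default host and port


def build_analyze_song_request(artist, title, framework="vanilla",
                               force_refresh=False, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/analyzer/analyze-song."""
    payload = {
        "artist": artist,
        "title": title,
        "framework": framework,
        "force_refresh": force_refresh,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/analyzer/analyze-song",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_song(artist, title, timeout=120, **kwargs):
    """Send the request and decode the body, assuming the API returns JSON."""
    request = build_analyze_song_request(artist, title, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_analyze_song_request("Led Zeppelin", "Stairway to Heaven", framework="sage")
print(request.get_method(), request.full_url)
```

With the API server running, `analyze_song("Led Zeppelin", "Stairway to Heaven", framework="sage")` would send the request. The generous timeout is a guess to allow for LLM latency.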
Unit and integration tests will be added to ensure the reliability and correctness of the system. (Details to be added)
Contributions are welcome! Please follow standard practices for pull requests and issue reporting. (Details to be added)
Spotify lyrics integration. Especially important for Spotify player integration.
Removal and replacement of lyrics based on analysis (would anyone like to contribute with this?).
MIT License. You can use it in commercial projects. Please give attribution if you use it.