Zathura TTS Extension

A powerful text-to-speech (TTS) extension for the Zathura PDF viewer, designed to provide a seamless and feature-rich audio reading experience.

This plugin lets you listen to PDF documents directly within Zathura, with controls for playback, voice customization, and navigation. It integrates with several TTS engines, including the neural voices of Piper-TTS, to deliver clear and natural-sounding speech.

🏗️ Project Structure

This repository contains both the TTS plugin and a modified version of Zathura with utility plugin support:

zathura-liberated/
├── zathura-tts/           # TTS plugin implementation
├── zathura/               # Modified Zathura with utility plugin support (submodule)
├── zathura-pdf-poppler/   # PDF plugin (submodule)
└── docs/                  # Comprehensive documentation

Why this structure? We needed to extend Zathura's core to support utility plugins (rather than only document plugins). Keeping everything in one repository makes development and installation much simpler while we prove the concept. Once utility plugins are accepted upstream, this could be split into separate repositories.

Features

Multiple TTS Engines: Supports Piper-TTS (high-quality neural voices), Speech Dispatcher (system integration), and espeak-ng (reliable fallback).
Playback Control: Play, pause, and stop audio narration with simple keyboard shortcuts.
Navigation: Skip forward and backward by sentence or paragraph.
Voice Customization: Adjust reading speed and select from available voices.
Visual Feedback: Highlights the text currently being read.
Continuous Reading: Automatically proceeds to the next page.
Special Content Handling: Announces tables, lists, and other non-standard content.

(demo.gif)

Requirements

⚠️ Important: Modified Zathura Required

This TTS plugin requires a modified version of Zathura with utility plugin support. The standard Zathura from package managers will not work.

Three options to get the required Zathura:

Use the bundled fork (included as a submodule, built from source):

git clone --recursive https://github.com/ubuntupunk/zathura-liberated.git
cd zathura-liberated
# The submodule automatically points to our fork with utility plugin support
# Build and install modified Zathura (see Installation section)

Apply our patch to upstream Zathura:

git clone https://github.com/pwmt/zathura.git
cd zathura
git apply ../0001-Add-utility-plugin-support-and-TTS-API-functions.patch
# Build and install

Use our fork directly:

git clone https://github.com/ubuntupunk/zathura.git
cd zathura
git checkout feature/utility-plugin-support
# Build and install

System Dependencies

Modified Zathura: With utility plugin support (see above)
girara-gtk3: Version 0.4.0 or higher
GLib: Version 2.50 or higher
GTK+ 3: Version 3.22 or higher
Speech Dispatcher (optional, for system TTS)
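
The minimum versions listed above can be checked before building. The following is a minimal sketch, assuming pkg-config is installed and that the libraries are registered under the module names girara-gtk3, glib-2.0, and gtk+-3.0. Those module names are an assumption about your distribution's packaging, not something this project defines.

```shell
#!/bin/sh
# Sketch: report whether each library meets the minimum version listed above.
# The pkg-config module names are assumptions; adjust them if your distro differs.

check_dep() {
  module="$1"
  min_version="$2"
  # --atleast-version prints nothing and signals the result via its exit status.
  if pkg-config --atleast-version="$min_version" "$module" 2>/dev/null; then
    echo "ok:      $module >= $min_version"
  else
    echo "missing: $module >= $min_version (not installed or too old)"
    return 1
  fi
}

# `|| true` keeps the script going so every dependency gets reported.
check_dep girara-gtk3 0.4.0 || true
check_dep glib-2.0 2.50 || true
check_dep gtk+-3.0 3.22 || true
```

A "missing" line means pkg-config could not find the module or found an older version. It does not verify that the installed Zathura is the modified build.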

📋 Compatibility Notes

GLib Version: Tested with GLib 2.74.6 (Debian 12). Newer GLib versions may need adjustments.
Upstream Sync: Our Zathura fork may not be fully synchronized with the latest upstream development.
Build Dependencies: Ensure you have compatible versions of girara-gtk3, GTK+ 3, and related libraries.
Feature Branch: Our modifications are in the feature/utility-plugin-support branch of the fork.

Python Dependencies

Python: Version 3.8 or higher
piper-tts: Version 1.2.0 or higher
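
To confirm the Python side meets these minimums, a small check like the one below may help. It is a sketch only: it assumes `python3` and `pip` are on your PATH, that the package was installed under the name piper-tts, and that your `sort` supports `-V` (GNU coreutils does). The helper name `version_ge` is made up for this example.

```shell
#!/bin/sh
# Sketch: compare installed versions against the minimums listed above.
# Relies on `sort -V` for version ordering (an assumption about your system).

version_ge() {
  # Succeeds when version $1 is greater than or equal to version $2.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Both lookups fall back to an empty string when the tool or package is absent.
py_version="$(python3 -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])' 2>/dev/null)"
piper_version="$(python3 -m pip show piper-tts 2>/dev/null | awk '/^Version:/ { print $2 }')"

if [ -n "$py_version" ] && version_ge "$py_version" 3.8; then
  echo "ok:      Python $py_version"
else
  echo "missing: Python >= 3.8 (found: ${py_version:-none})"
fi

if [ -n "$piper_version" ] && version_ge "$piper_version" 1.2.0; then
  echo "ok:      piper-tts $piper_version"
else
  echo "missing: piper-tts >= 1.2.0 (found: ${piper_version:-none})"
fi
```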

Installation

Step 1: Build Modified Zathura

Since this plugin requires a modified version of Zathura, you must build and install it first:

# Clone with submodules
git clone --recursive https://github.com/ubuntupunk/zathura-liberated.git
cd zathura-liberated

# Build and install modified Zathura
cd zathura
meson setup builddir
meson compile -C builddir
sudo meson install -C builddir

# Build and install PDF plugin
cd ../zathura-pdf-poppler
meson setup builddir
meson compile -C builddir
sudo meson install -C builddir

cd ..

Step 2: Install System Dependencies

# Example for Debian/Ubuntu
sudo apt install libgirara-dev libgtk-3-dev libglib2.0-dev speech-dispatcher

# Example for Arch Linux
sudo pacman -S girara gtk3 glib2 speech-dispatcher

Step 3: Install Piper-TTS

For the best audio quality, install the Piper-TTS Python package.

pip install piper-tts

Step 4: Build the TTS Plugin

Build and install the TTS plugin:

# From the zathura-liberated directory
cd zathura-tts
meson setup builddir
meson compile -C builddir
sudo meson install -C builddir

Download Piper Voices: Piper requires voice models to function. You can download high-quality voices from the Piper Voices repository on Hugging Face.

Each voice consists of a .onnx file and a .onnx.json file.

Example: To download the en_US-lessac-medium voice:
```
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json
```
You should place these voice files in a directory where the plugin can find them, such as ~/.local/share/zathura-tts/voices/. You can configure the voice path in your zathurarc.
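
Fetching both files for a voice and placing them in the suggested directory can be wrapped in a small helper. This is a sketch, not part of the plugin: the function names `voice_url` and `download_voice` are made up here, and the URL layout (language family, locale, speaker, quality) is inferred from the single en_US-lessac-medium example above, so it may not hold for every voice in the repository.

```shell
#!/bin/sh
# Sketch: fetch both files for one Piper voice into the suggested directory.
# The URL layout is inferred from the example above and is an assumption.

VOICE_BASE_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main"
VOICE_DIR="${VOICE_DIR:-$HOME/.local/share/zathura-tts/voices}"

voice_url() {
  # Build the model URL for a voice id such as en_US-lessac-medium.
  voice="$1"
  voice_locale="${voice%%-*}"      # en_US
  rest="${voice#*-}"               # lessac-medium
  speaker="${rest%-*}"             # lessac
  quality="${rest##*-}"            # medium
  family="${voice_locale%%_*}"     # en
  printf '%s/%s/%s/%s/%s/%s.onnx\n' \
    "$VOICE_BASE_URL" "$family" "$voice_locale" "$speaker" "$quality" "$voice"
}

download_voice() {
  # Download the .onnx model and its .onnx.json config; both are required.
  url="$(voice_url "$1")"
  mkdir -p "$VOICE_DIR" || return 1
  wget -q -P "$VOICE_DIR" "$url" "$url.json"
}
```

After sourcing the snippet, `download_voice en_US-lessac-medium` would download the two files shown in the example into `$VOICE_DIR`.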

Usage

Once installed, the TTS functionality can be controlled with the following keyboard shortcuts in Zathura:

| Shortcut | Action |
| --- | --- |
| `Ctrl+T` | Toggle TTS on/off |
| `Ctrl+Space` | Pause/Resume reading |
| `Ctrl+Right` | Skip to the next sentence |
| `Ctrl+Left` | Go to the previous sentence |
| `Ctrl+Shift+T` | Open TTS settings |

Configuration

The plugin can be configured by editing the zathurarc file or through the TTS settings dialog (Ctrl+Shift+T).

Available options include:

tts_engine: piper, speech-dispatcher, or espeak (default: piper).
tts_voice: Specify a voice for the selected engine.
tts_speed: Reading speed multiplier (0.5x to 3.0x).
tts_highlight_color: Color for highlighting spoken text.
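
As an illustration of how these options might look in practice, the sketch below appends example settings to a zathurarc file. It assumes the plugin reads its options through Zathura's standard `set <option> <value>` syntax. The helper name `write_tts_config` and the values shown (including the voice name) are illustrative assumptions. `tts_highlight_color` is left out because the expected color format is not stated here.

```shell
#!/bin/sh
# Sketch: append example TTS settings to a zathurarc file.
# Assumes the plugin honours zathurarc's standard `set <option> <value>` syntax;
# all values below are illustrative, not defaults confirmed by the project.

write_tts_config() {
  config_file="$1"
  mkdir -p "$(dirname "$config_file")" || return 1
  # The quoted 'EOF' delimiter keeps the shell from expanding anything in the body.
  cat >> "$config_file" <<'EOF'
# zathura-tts settings (example values)
set tts_engine "piper"
set tts_voice "en_US-lessac-medium"
set tts_speed 1.0
EOF
}
```

Zathura normally reads its configuration from `$XDG_CONFIG_HOME/zathura/zathurarc`, so a typical call would be `write_tts_config "${XDG_CONFIG_HOME:-$HOME/.config}/zathura/zathurarc"`. Because the function appends, running it twice leaves duplicate lines.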

Supported TTS Engines

The plugin selects the best available TTS engine in the following order:

Piper-TTS: A high-quality, fast, and local neural text-to-speech system. It offers the most natural-sounding voices and is the recommended engine.
Speech Dispatcher: A common interface for system-level TTS services on Linux. It provides access to a wide range of voices that may already be installed on your system.
espeak-ng: A compact and reliable software speech synthesizer. It serves as a fallback and works on most systems without extra configuration.
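
The fallback order above can be approximated from the command line to see which engine is likely to be picked on a given machine. This is only a sketch of the idea: the plugin's real detection logic may differ, the helper name `first_available` is made up, and mapping the engines to the commands `piper`, `spd-say`, and `espeak-ng` is an assumption about how they are installed.

```shell
#!/bin/sh
# Sketch: print the first TTS command found on PATH, in the documented order.
# The engine-to-command mapping is an assumption, not the plugin's own logic.

first_available() {
  # Print the first argument that resolves to a command; fail if none do.
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# piper -> Piper-TTS, spd-say -> Speech Dispatcher, espeak-ng -> espeak-ng
engine_cmd="$(first_available piper spd-say espeak-ng)" || engine_cmd="none"
echo "First TTS command found on PATH: $engine_cmd"
```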

Development

For development, you can build the plugin without installing it system-wide.

meson setup build
meson compile -C build

To run tests:

meson test -C build

Roadmap

Enhanced handling of scientific and mathematical notation.
Support for more TTS engines and platforms.
Improved UI for settings and voice selection.
End-to-end testing with various document types.

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue for bugs, feature requests, or suggestions.

License

This project is licensed under the Zlib license. See the LICENSE file for details.
