Welcome to Turjuman (ترجمان - Interpreter/Translator in Arabic)! 👋
Ever felt daunted by translating a massive book (like 500 pages and over 150,000 words)? Turjuman is here to help! It uses LLMs to magically translate large documents (currently Markdown `.md` and plain text `.txt` files) while smartly trying to keep the original meaning and style intact.
Turjuman uses a smart pipeline powered by LangGraph 🦜🔗:
- 🚀 init_translation: Start the translation job
- 🧐 terminology_unification: Find and unify key terms
- ✂️ chunk_document: Split the book into chunks
- 🌐 initial_translation: Translate chunks in parallel
- 🤔 critique_stage: Review translations, catch errors
- ✨ final_translation: Refine translations
- 📜 assemble_document: Stitch everything back together
```mermaid
flowchart TD
    A([🚀 init_translation<br><sub>Initialize translation state and configs</sub>]) --> B([🧐 terminology_unification<br><sub>Extract key terms, unify glossary, prepare context</sub>])
    B --> C([✂️ chunk_document<br><sub>Split the book into manageable chunks</sub>])

    %% Chunking produces multiple chunks
    C --> D1([📦 Chunk 1])
    C --> D2([📦 Chunk 2])
    C --> D3([📦 Chunk N])

    %% Parallel translation workers
    D1 --> E1([🌐 initial_translation<br><sub>Translate chunk 1 in parallel</sub>])
    D2 --> E2([🌐 initial_translation<br><sub>Translate chunk 2 in parallel</sub>])
    D3 --> E3([🌐 initial_translation<br><sub>Translate chunk N in parallel</sub>])

    %% Merge all translations
    E1 --> F([🤔 critique_stage<br><sub>Review translations, check quality and consistency</sub>])
    E2 --> F
    E3 --> F

    %% Decision after critique
    F --> |No critical errors| G([✨ final_translation<br><sub>Refine translations based on feedback</sub>])
    F --> |Critical error| H([🛑 End<br><sub>Stop translation due to errors</sub>])

    G --> I([📜 assemble_document<br><sub>Merge all refined chunks into final output</sub>])
    I --> J([🏁 Done<br><sub>Translation complete!</sub>])
    H --> J
```
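If you're curious how such a pipeline maps onto a LangGraph `StateGraph`, here is a minimal sketch of the wiring. The state fields and node bodies below are simplified placeholders, not Turjuman's actual implementation (see `src/` for the real graph):

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

# Simplified state -- the real graph tracks much more (glossary, configs, critiques).
class TranslationState(TypedDict):
    source_text: str
    chunks: list[str]
    translations: list[str]
    has_critical_error: bool
    final_document: str

def init_translation(state: TranslationState) -> dict:
    return {"has_critical_error": False}

def terminology_unification(state: TranslationState) -> dict:
    return {}  # extract key terms and build a glossary for consistency

def chunk_document(state: TranslationState) -> dict:
    # naive paragraph-based chunking, for illustration only
    return {"chunks": state["source_text"].split("\n\n")}

def initial_translation(state: TranslationState) -> dict:
    # in Turjuman, chunks are translated in parallel by LLM workers
    return {"translations": [f"translated: {c}" for c in state["chunks"]]}

def critique_stage(state: TranslationState) -> dict:
    return {"has_critical_error": False}  # review quality and consistency

def final_translation(state: TranslationState) -> dict:
    return {}  # refine translations based on critique feedback

def assemble_document(state: TranslationState) -> dict:
    return {"final_document": "\n\n".join(state["translations"])}

graph = StateGraph(TranslationState)
for name, fn in [
    ("init_translation", init_translation),
    ("terminology_unification", terminology_unification),
    ("chunk_document", chunk_document),
    ("initial_translation", initial_translation),
    ("critique_stage", critique_stage),
    ("final_translation", final_translation),
    ("assemble_document", assemble_document),
]:
    graph.add_node(name, fn)

graph.set_entry_point("init_translation")
graph.add_edge("init_translation", "terminology_unification")
graph.add_edge("terminology_unification", "chunk_document")
graph.add_edge("chunk_document", "initial_translation")
graph.add_edge("initial_translation", "critique_stage")
graph.add_conditional_edges(
    "critique_stage",
    lambda s: "stop" if s["has_critical_error"] else "refine",
    {"refine": "final_translation", "stop": END},
)
graph.add_edge("final_translation", "assemble_document")
graph.add_edge("assemble_document", END)

app = graph.compile()
```

The conditional edge after `critique_stage` mirrors the diagram above: refine on success, stop on a critical error.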
- Prerequisites
- Conda: Install Miniconda or Anaconda
- API Keys: Get your API keys for OpenAI, Anthropic, etc.
- Ollama: You can use Turjuman locally without paying for LLM APIs by installing Ollama or any local inference server such as LM Studio, vLLM, llama.cpp, etc. Take a look at sample.env for details
Recommended Models
- Online: Gemini Flash/Pro
- Local: Gemma3 / Aya / Mistral
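For a sense of how these choices map to code, here is an illustrative LangChain snippet (the model names are just examples, not requirements; use whatever your provider or Ollama install serves):

```python
# Illustrative only: pick an online or a local chat model via LangChain.
from langchain_google_genai import ChatGoogleGenerativeAI  # online; reads GOOGLE_API_KEY
from langchain_community.chat_models import ChatOllama     # local; needs Ollama running

online_llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")  # example model name
local_llm = ChatOllama(model="gemma3")                         # example model name

print(local_llm.invoke("Translate to Arabic: Good morning!").content)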
- Clone the Repository
```bash
git clone <your-repo-url>
cd turjuman-book-translator
```
- Create Conda Environment
```bash
conda create -n turjuman_env python=3.12 -y
conda activate turjuman_env
```
- Install Dependencies
```bash
pip install langchain langgraph langchain-openai langchain-anthropic langchain-google-genai langchain-community tiktoken python-dotenv markdown-it-py pydantic "langserve[server]" sse-starlette aiosqlite uv streamlit
```
- Configure Environment Variables
```bash
cp sample.env .env
# Edit .env and add your API keys
```
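The variable names in `sample.env` are authoritative; as a rough illustration, a filled-in `.env` typically looks something like this (the OpenAI/Anthropic/Google key names are the standard ones those SDKs read):

```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
```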
- Run Backend Server
```bash
uvicorn src.server:app --host 0.0.0.0 --port 8051 --reload
```
- Run Streamlit Frontend
```bash
streamlit run translate_over_api_frontend_streamlit.py
```
- Configure: Set API URL, source & target languages, provider, and model
- Upload: Your `.md` or `.markdown` file
- Start Translation: Click the button and watch the magic happen! ✨
- Review: See original and translated side-by-side, or chunk-by-chunk
- Download: Get your translated book or the full JSON response
A convenient command-line client script (`translate_over_api_terminal.sh`) is provided for interacting with the backend API.
Prerequisites: `curl`, `jq`
Getting Help:
The script includes detailed usage instructions. To view them, run:
```bash
./translate_over_api_terminal.sh --help
# or
./translate_over_api_terminal.sh -h
```
Basic Usage:
The only required argument is the input file (`-i` or `--input`). Other options allow you to specify languages, provider, model, API URL, and output file path.
```bash
# Translate a file using default settings (English->Arabic, OpenAI provider, default model)
# Ensure OPENAI_API_KEY is set in .env if using openai
./translate_over_api_terminal.sh -i path/to/your/document.md

# Specify languages, provider, model, and save response to a specific file
./translate_over_api_terminal.sh \
  --input my_book.md \
  --output results/my_book_translated.json \
  --source english \
  --target french \
  --provider ollama \
  --model llama3

# Use a different API endpoint
./translate_over_api_terminal.sh -i chapter1.md -u http://192.168.1.100:8051

# List available models fetched from the backend API
./translate_over_api_terminal.sh --list-models
```
The script submits the job via the API. Since the API call is synchronous, the script waits for completion and saves the full JSON response (containing the final state and the translated document in `output.final_document`) to a file (default: `<input_name>_<job_id>.json`, or the path specified with `--output`). It also provides the `curl` command to retrieve the final state again using the job ID.
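If you prefer calling the API from your own code instead of the shell script, here is a minimal Python sketch. The endpoint path and payload field names are assumptions inferred from the script's behavior, not a documented contract; check `src/server.py` for the actual routes and schema:

```python
# Hypothetical client sketch: the /translate path and payload fields are
# assumptions -- consult src/server.py for the real API contract.
import requests

API_URL = "http://localhost:8051"

with open("my_book.md", encoding="utf-8") as f:
    payload = {
        "content": f.read(),
        "source_language": "english",
        "target_language": "arabic",
        "provider": "openai",
    }

# The API call is synchronous, so this blocks until the job finishes.
resp = requests.post(f"{API_URL}/translate", json=payload, timeout=3600)
resp.raise_for_status()

state = resp.json()
print(state["output"]["final_document"])  # the assembled translation
```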
- Support for PDF, DOCX, and other formats
- More advanced glossary and terminology management
- Interactive editing and feedback loop
- Better error handling and progress tracking
Pull requests welcome! For major changes, open an issue first.
MIT
Enjoy translating your books with Turjuman! 🚀📚🌍