A production-ready Python tool for standardizing Bible references in Word documents and text. This tool automatically converts various Bible reference formats to a consistent "Book Chapter:Verse" format (e.g., "John 3:16").
- Comprehensive Format Support: Handles multiple reference formats:
  - Standard: "John 3:16", "John 3:16-18", "John 3:16,18"
  - Period format: "John 3.16" → "John 3:16"
  - Space format: "John 3 16" → "John 3:16"
  - Chapter-verse format: "John chapter 3 verse 16" → "John 3:16"
  - Standalone chapters: "Psalm 23" → "Psalms 23:1"
  - Cross-chapter ranges: "John 3:16-4:2"
- Extensive Abbreviation Support: Uses a CSV file with 77+ Bible books and 260+ abbreviations
- Document Processing: Works with Microsoft Word documents (.docx)
- Backup Creation: Automatically creates backups before making changes
- Table Support: Processes Bible references in tables as well as paragraphs
- Production Ready: Comprehensive error handling, logging, and testing
- Python 3.6+
- python-docx library
# Install required dependency
pip install python-docx
# Clone or download the repository
# Ensure you have these files:
# - standardize_bible_scripture_format.py
# - Bible-Books-Abbr.csv
# - cli.py (command-line interface)
The easiest way to use the tool is through the command-line interface:
# Basic usage - processes document and creates backup
python cli.py document.docx
# Specify output file
python cli.py document.docx -o processed_document.docx
# Skip backup creation
python cli.py document.docx --no-backup
# Verbose output with detailed information
python cli.py document.docx -v
# Dry run - see what would be processed without making changes
python cli.py document.docx --dry-run
# Process a text string
python cli.py --text "In jn 3:16 and gen 1.1, we see God's love."
You can also use the tool programmatically:
from standardize_bible_scripture_format import BibleReferenceStandardizer
# Initialize the standardizer
standardizer = BibleReferenceStandardizer()
# Process text
text = "Jesus said in jn 3:16 that God so loved the world."
result, changed = standardizer.process_text(text)
print(result) # "Jesus said in John 3:16 that God so loved the world."
# Process a Word document
result = standardizer.process_document(
    'input.docx',
    'output.docx',
    create_backup=True
)
if result['success']:
    print(f"Processed {result['paragraphs_changed']} paragraphs")
else:
    print(f"Error: {result['error']}")
from standardize_bible_scripture_format import process_text, process_document
# Simple text processing
result, changed = process_text("In gen 1:1, God created...")
# Simple document processing
result = process_document('document.docx')
- "John 3:16" (already standard)
- "jn 3:16" (abbreviation)
- "John 3.16" (period separator)
- "John 3 16" (space separator)
- "John chapter 3 verse 16" (verbose format)
- "Psalm 23" (standalone chapter → "Psalms 23:1")
- "John 3:16-18" (verse range)
- "John 3:16,18" (multiple verses)
- "John 3:16-4:2" (cross-chapter range)
All references are standardized to: Book Chapter:Verse
- Examples: "John 3:16", "Genesis 1:1", "1 Corinthians 13:4"
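The separator conversions listed above can be illustrated with a small regex sketch. This is not the tool's actual implementation: the book list, pattern names, and the normalize_separators function are hypothetical, and only the period, space, and verbose forms are covered.

```python
import re

# Small illustrative book list (an assumption); the real tool loads names from its CSV file.
BOOKS = ["Genesis", "Psalms", "John", "1 Corinthians"]
BOOK_ALTERNATION = "|".join(re.escape(name) for name in BOOKS)

# "<Book> chapter N verse M"
VERBOSE = re.compile(
    rf"\b({BOOK_ALTERNATION})\s+[Cc]hapter\s+(\d+)\s+[Vv]erse\s+(\d+)\b"
)
# "<Book> N.M" or "<Book> N M" (a single period or whitespace character as separator)
SEPARATED = re.compile(rf"\b({BOOK_ALTERNATION})\s+(\d+)[.\s](\d+)\b")


def normalize_separators(text: str) -> str:
    """Rewrite period, space, and verbose forms as Book Chapter:Verse."""
    text = VERBOSE.sub(r"\1 \2:\3", text)
    return SEPARATED.sub(r"\1 \2:\3", text)


print(normalize_separators("Compare John 3.16 with John chapter 3 verse 16."))
```

Restricting the match to known book names keeps the pattern from rewriting unrelated number pairs such as "page 3 16".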
The tool supports all 66 books of the Protestant Bible plus additional books from Catholic and Orthodox traditions. Abbreviations are loaded from Bible-Books-Abbr.csv, which includes:
- Genesis: Gen, Ge, Gn
- Exodus: Exod, Ex
- Psalms: Ps, Pss (plural)
- Isaiah: Isa, Is
- Matthew: Matt, Mat, Mt
- John: John, Jn, Jo
- Romans: Rom, Rm, Ro
- 1 Corinthians: 1 Cor, 1 Co, 1C
- Revelation: Rev, Re, Rv
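One plausible way to resolve abbreviations like these is a flat, case-insensitive lookup table. The sketch below is an assumption about the approach, not the tool's code: ABBREVIATIONS is a hand-copied subset of the list above, and build_lookup and resolve_book are hypothetical names.

```python
# Illustrative subset of the mapping; the real data lives in Bible-Books-Abbr.csv.
ABBREVIATIONS = {
    "Genesis": ["Gen", "Ge", "Gn"],
    "Psalms": ["Ps", "Pss"],
    "John": ["John", "Jn", "Jo"],
    "1 Corinthians": ["1 Cor", "1 Co", "1C"],
}


def build_lookup(abbreviations):
    """Map every lowercased abbreviation and full name to its full book name."""
    lookup = {}
    for book, abbrs in abbreviations.items():
        lookup[book.lower()] = book
        for abbr in abbrs:
            lookup[abbr.lower().rstrip(".")] = book
    return lookup


LOOKUP = build_lookup(ABBREVIATIONS)


def resolve_book(name):
    """Return the full book name, or None if the name is not recognized."""
    return LOOKUP.get(name.strip().lower().rstrip("."))


print(resolve_book("jn"), resolve_book("Gen."), resolve_book("1 cor"))
```

Lowercasing both the keys and the query is what lets an input such as "jn 3:16" resolve to "John"; stripping a trailing period is an extra convenience assumed here.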
Bible_Standardizer/
├── standardize_bible_scripture_format.py # Main module
├── Bible-Books-Abbr.csv # Bible book abbreviations
├── cli.py # Command-line interface
├── test_standardizer.py # Test suite
├── create_sample_doc.py # Sample document creator
├── README.md # This file
└── sample_bible_references.docx # Sample test document
Run the comprehensive test suite:
python test_standardizer.py
Create a sample document for testing:
python create_sample_doc.py
You can provide a custom CSV file with Bible book abbreviations:
python cli.py document.docx --csv my_custom_books.csv
The CSV format should be:
Full Book Name,"Abbreviation1, Abbreviation2, Abbreviation3"
Genesis,"Gen, Ge, Gn"
Exodus,"Exod, Ex"
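To check a custom file before passing it to --csv, a short script along these lines can parse the format shown above. It is a sketch that assumes two columns per row and no header row; load_abbreviations is a hypothetical helper, not part of the tool.

```python
import csv
import io


def load_abbreviations(csv_file):
    """Read rows of (full name, "abbr1, abbr2, ...") into a dict of lists."""
    books = {}
    for row in csv.reader(csv_file):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or malformed rows
        name = row[0].strip()
        books[name] = [a.strip() for a in row[1].split(",") if a.strip()]
    return books


# In-memory sample; for a real file use open(path, newline="", encoding="utf-8").
sample = io.StringIO('Genesis,"Gen, Ge, Gn"\nExodus,"Exod, Ex"\n')
print(load_abbreviations(sample))
```

The quotes around the second column matter: without them, the csv module would split the abbreviations into separate columns.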
The tool uses Python's logging module. You can adjust the logging level by modifying the logging configuration in the script.
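When using the tool as a library, the level can usually also be set from calling code with the standard logging API. The sketch below assumes the module logs through a logger named after the module, which is a common convention but not confirmed here; if the script calls logging.basicConfig itself, the first call wins.

```python
import logging

# Configure the root logger; has no effect if handlers are already installed.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Quiet a single logger instead. The logger name is an assumption.
tool_logger = logging.getLogger("standardize_bible_scripture_format")
tool_logger.setLevel(logging.WARNING)
```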
The tool includes comprehensive error handling:
- File not found errors
- Invalid document formats
- CSV parsing errors
- Backup creation failures
- Document processing errors
All errors are logged and returned in the result dictionary when using the API.
- Processes documents efficiently using regex patterns
- Handles large documents with tables
- Minimal memory footprint
- Always creates backups by default (can be disabled)
- Preserves original document formatting
- Non-destructive processing (original file remains unchanged when using output parameter)
- Comprehensive test suite
- Production-ready error handling
- Detailed logging for debugging
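For scripts that need the same safety net outside the tool, a timestamped backup can be made with the standard library. This sketch mirrors the naming pattern used in this README's examples (sermon_notes_backup_20231226_143022.docx); create_backup and its now parameter are hypothetical, and the tool's own backup code may differ.

```python
import shutil
from datetime import datetime
from pathlib import Path


def create_backup(path, now=None):
    """Copy path to <stem>_backup_<YYYYmmdd_HHMMSS><suffix> beside the original."""
    source = Path(path)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup = source.with_name(f"{source.stem}_backup_{stamp}{source.suffix}")
    shutil.copy2(source, backup)  # copy2 also preserves file timestamps
    return backup
```

Calling create_backup("sermon_notes.docx") before editing leaves a dated copy next to the original, so a bad run can be undone by copying the backup back.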
python cli.py sermon_notes.docx
# Creates: sermon_notes_backup_20231226_143022.docx
# Updates: sermon_notes.docx with standardized references
for file in *.docx; do
python cli.py "$file" -o "standardized_$file"
done
python cli.py --text "Study gen 1:1, ps 23, and jn 3.16 today."
# Output: "Study Genesis 1:1, Psalms 23:1, and John 3:16 today."
This tool is provided as-is for standardizing Bible references. Please ensure you have appropriate permissions before modifying documents.
For issues or questions:
- Check the test suite for expected behavior
- Review the CSV file for supported abbreviations
- Use verbose mode (-v) for detailed processing information
- Use dry-run mode to preview changes without modification