Commit badcf98

Add support for modern image formats
- Add WebP, HEIC, HEIF support for modern mobile/web images
- Add TIFF variant (.tiff) support
- Add professional formats: PSD, PCX support
- Add document formats: PDF, JPEG 2000 variants (JP2, J2K, JPF, JPX, JPM, MJ2)
- Add raw/specialized formats: PBM, PGM, PPM, PNM, PFM, PAM
- Add additional formats: DIB, RLE, ICO, CUR
- Comprehensive format support for all Tesseract-compatible image types
- Maintains backward compatibility with existing formats
1 parent ef3f805 commit badcf98

1 file changed: +17 -2 lines changed

constants.py

Lines changed: 17 additions & 2 deletions
@@ -1,5 +1,20 @@
 DEFAULT_CHECK_COMMAND = "which"
 WINDOWS_CHECK_COMMAND = "where"
-TESSERACT_DATA_PATH_VAR = 'TESSDATA_PREFIX'
+TESSERACT_DATA_PATH_VAR = "TESSDATA_PREFIX"
 
-VALID_IMAGE_EXTENSIONS = [".jpg", ".jpeg", ".gif", ".png", ".tga", ".tif", ".bmp"]
+VALID_IMAGE_EXTENSIONS = [
+    # Common formats
+    ".jpg", ".jpeg", ".png", ".gif", ".bmp",
+    # TIFF variants
+    ".tif", ".tiff",
+    # Modern formats
+    ".webp", ".heic", ".heif",
+    # Professional formats
+    ".tga", ".psd", ".pcx",
+    # Document formats
+    ".pdf", ".jp2", ".j2k", ".jpf", ".jpx", ".jpm", ".mj2",
+    # Raw and specialized formats
+    ".pbm", ".pgm", ".ppm", ".pnm", ".pfm", ".pam",
+    # Additional supported formats
+    ".dib", ".rle", ".ico", ".cur"
+]
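
The diff does not show how VALID_IMAGE_EXTENSIONS is consumed elsewhere in the project. As a minimal sketch, assuming the list is used to filter file names before they are handed to Tesseract, a check could look like the following. The helper name has_valid_image_extension is hypothetical and not part of this commit, and the list here is an abbreviated copy for illustration only. Because every entry is lowercase, the sketch lowercases the suffix so that names such as scan.TIFF still match.

```python
from pathlib import Path

# Abbreviated copy of the list from constants.py, for illustration only.
VALID_IMAGE_EXTENSIONS = [".jpg", ".jpeg", ".png", ".tif", ".tiff", ".webp"]


# Hypothetical helper: not part of this commit.
def has_valid_image_extension(path: str) -> bool:
    """Return True if the file's final suffix is in VALID_IMAGE_EXTENSIONS."""
    # Path.suffix keeps the leading dot (".tiff"), matching the list entries.
    # Lowercasing makes the comparison case-insensitive.
    return Path(path).suffix.lower() in VALID_IMAGE_EXTENSIONS


if __name__ == "__main__":
    for name in ["scan.TIFF", "photo.webp", "notes.txt", "README"]:
        print(name, has_valid_image_extension(name))
```

A name check like this only says the extension is on the list. Whether the installed Tesseract build can actually decode a given file depends on the image libraries it was compiled against, so callers may still need to handle read failures.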
