Commit 8c11197

XML conversion updates.
1 parent 8523325 commit 8c11197

2 files changed: +3 -1 lines changed

ocr_service/processor/processor.py

Lines changed: 2 additions & 0 deletions
@@ -305,6 +305,8 @@ def _process(self, stream: bytes, file_name: str) -> str:
         try:
             pdf_stream = None
 
+            self.log.info("Assumed file type for doc id: " + file_name)
+
             if type(file_type) is archive.Pdf:
                 pdf_stream = stream
             elif file_type in DOCUMENT or type(file_type) is archive.Rtf:
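
Review note: the new log line says "Assumed file type" but only appends `file_name`. If the intent is to record the detected type next to the document id, something like the sketch below might work. It is a standalone illustration, not the service's code: `log_assumed_type` and the logger name are hypothetical, and it assumes `file_type` is an instance whose class name identifies the type (as `type(file_type) is archive.Pdf` suggests).

```python
import logging

# Hypothetical logger name; the service uses self.log inside the processor.
log = logging.getLogger("ocr_service.sketch")


def log_assumed_type(file_name: str, file_type: object) -> str:
    """Log the detected type next to the document id and return the message."""
    # Assumption: the class name of file_type (e.g. "Pdf") is a useful label.
    type_name = type(file_type).__name__ if file_type is not None else "unknown"
    template = "Assumed file type for doc id %s: %s"
    # Passing arguments separately defers string formatting to the logger.
    log.info(template, file_name, type_name)
    return template % (file_name, type_name)
```

Passing the values as logger arguments instead of concatenating also avoids a `TypeError` if `file_name` is ever not a string.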

ocr_service/utils/utils.py

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ def is_file_type_xml(stream: bytes) -> bool:
     try:
         xml.sax.parseString(stream, xml.sax.ContentHandler())
         return True
-    except xml.sax.SAXParseException:
+    except Exception:
         logging.warning("Could not determine if file is XML.")
         return False
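
For reference, a self-contained sketch of how `is_file_type_xml` reads after this change. The hunk does not show the module's imports, so the `logging` and `xml.sax` imports here are assumptions, and any lines of the function above the `try` are not reproduced.

```python
import logging
import xml.sax


def is_file_type_xml(stream: bytes) -> bool:
    """Return True if the byte stream parses as well-formed XML."""
    try:
        # A bare ContentHandler ignores all events; only parse success matters.
        xml.sax.parseString(stream, xml.sax.ContentHandler())
        return True
    except Exception:
        # Broadened from SAXParseException: any failure while parsing
        # is now treated as "not XML" instead of propagating.
        logging.warning("Could not determine if file is XML.")
        return False
```

With the broader clause, a call such as `is_file_type_xml(b"plain text, not XML")` still returns `False`, and so does any other error raised during parsing. The trade-off is that unrelated bugs inside the `try` block are also reported as "not XML" and only show up as a warning in the log.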
