Open
Labels
Client, Document Intelligence, Service Attention, customer-reported, needs-team-attention, question
Description
When processing a multi-page file (see attachment) with Azure Document Intelligence, the resulting layout shows the headings on the second page nested one level deeper than the same headings on the first page.
Is this simply a detection failure, or is it expected behavior for multi-page documents, meaning that post-processing is required to keep heading levels consistent across pages?
I'd appreciate any clarification on whether this is a bug or something that needs to be handled on the client side.
$ for line in result.content.splitlines():
$     if '#' in line:
$         print(line)
# This is title
## 1. Text
## 2. Page Objects
### 2.1 Table
### 2.2. Figure
## 3. Others
## This is title
### 1. Text
### 2. Page Objects
#### 2.1 Table
#### 2.2. Figure
### 3. Others
$ for paragraph in result.paragraphs:
$     if paragraph.get('role') in ('title', 'sectionHeading'):
$         print(paragraph['role'], paragraph['content'])
title This is title
sectionHeading 1. Text
sectionHeading 2. Page Objects
sectionHeading 2.1 Table
sectionHeading 2.2. Figure
sectionHeading 3. Others
title This is title
sectionHeading 1. Text
sectionHeading 2. Page Objects
sectionHeading 2.1 Table
sectionHeading 2.2. Figure
sectionHeading 3. Others
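If this does turn out to need client-side handling, one possible workaround is sketched below. It is only an illustration, not a confirmed fix or anything provided by the SDK: `rebase_headings` is a hypothetical helper, and it assumes that each page repeats the document title as its top heading (true for the attached sample, not in general). Whenever the title text reappears at a deeper level, the following headings are shifted back up by the same amount.

```python
import re

# Matches an ATX-style Markdown heading: 1-6 '#' characters, whitespace, text.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def rebase_headings(markdown: str) -> str:
    """Hypothetical helper: re-align heading levels after a repeated title.

    Assumption: the first heading in the text is the document title and the
    same text reappears at the top of later pages.
    """
    out = []
    title = None  # text of the first heading seen
    base = 1      # level of that first heading
    shift = 0     # how many levels the current page is nested too deep
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if not match:
            out.append(line)  # non-heading lines pass through unchanged
            continue
        level, text = len(match.group(1)), match.group(2).strip()
        if title is None:
            title, base = text, level
        if text == title:
            # The title reappeared: measure how far it drifted from its
            # original level and apply that correction from here on.
            shift = level - base
        out.append("#" * max(1, level - shift) + " " + text)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        "# This is title",
        "## 1. Text",
        "### 2.1 Table",
        "## This is title",
        "### 1. Text",
        "#### 2.1 Table",
    ])
    print(rebase_headings(sample))
```

With the headings from the output above, the second block would come back as `# This is title`, `## 1. Text`, `### 2.1 Table`, matching the first page. Documents without a repeated title would need a different signal, such as the section numbering.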