- Parse documents with Document Intelligence and output the results in Markdown format. > output
- Extract figures from documents and save them as PNG images. > output
- Generate figure descriptions using Azure OpenAI Multimodal.
- Update the Markdown outputs with the generated descriptions. > output
- Extract tables and convert them into Excel files. > output
- Chunk the Markdown outputs using `MarkdownHeaderTextSplitter`, `RecursiveContentChunker`, and `SemanticContentChunker` (TBD). > markdown chunk output | recursive chunk output
`python doc_intelli.py`
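The chunking step in the list above can be illustrated with a minimal, standard-library-only sketch. It is not the code in `doc_intelli.py`, and it does not call LangChain's `MarkdownHeaderTextSplitter` or this repo's `RecursiveContentChunker`; the names `Chunk`, `split_by_headers`, and `split_by_size` are hypothetical and only show the general idea: first split Markdown at headers while keeping the header path as metadata, then recursively split oversized text on progressively finer separators.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches ATX headers such as "## Section title".
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


@dataclass
class Chunk:
    """Hypothetical container: header path (e.g. {"h1": ..., "h2": ...}) plus body text."""
    headers: dict
    text: str


def split_by_headers(markdown: str, max_level: int = 3) -> list[Chunk]:
    """Split Markdown into chunks at headers up to max_level (illustrative only)."""
    chunks: list[Chunk] = []
    current: dict = {}
    buf: list[str] = []
    in_fence = False

    def flush() -> None:
        body = "\n".join(buf).strip()
        if body:
            chunks.append(Chunk(dict(current), body))
        buf.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore '#' lines inside fenced code
        match = None if in_fence else HEADER_RE.match(line)
        if match and len(match.group(1)) <= max_level:
            flush()
            level = len(match.group(1))
            # Keep only ancestor headers, then record the new one.
            current = {k: v for k, v in current.items() if int(k[1:]) < level}
            current[f"h{level}"] = match.group(2)
        else:
            buf.append(line)
    flush()
    return chunks


def split_by_size(text: str, max_chars: int = 200,
                  separators: tuple = ("\n\n", "\n", " ")) -> list[str]:
    """Recursively split text so that no piece exceeds max_chars (illustrative only)."""
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    sep, rest = separators[0], separators[1:]
    pieces: list[str] = []
    current = ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_chars:
            current = candidate  # keep merging small parts
            continue
        if current:
            pieces.append(current)
            current = ""
        if len(part) <= max_chars:
            current = part
        else:
            # The part alone is too large: retry with finer separators.
            pieces.extend(split_by_size(part, max_chars, rest))
    if current:
        pieces.append(current)
    return pieces
```

A typical use would be to call `split_by_headers` on a Markdown output file and then pass each `chunk.text` through `split_by_size`, carrying `chunk.headers` along as metadata for retrieval. The chunk size and separator order here are arbitrary choices for illustration.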
- Document Intelligence Official Samples: Python (v4.0) / RAG samples / Figure understanding.
- Build Intelligent RAG for Multimodality and Complex Document Structures
- nohanaga: Document Intelligence Samples | Output: Article in Japanese
- 7 Chunking Strategies for Langchain
- MarkdownHeaderTextSplitter
- Azure Functions vs. Indexers: AI Data Ingestion
- RAG Time: Mastering RAG