Microsoft.Extensions.AI: How can PDF files be processed? #5892

skurth · 2025-02-13T08:04:07Z

As demonstrated in https://youtu.be/qcp6ufe_XYo?si=7JeVbDOj6e8LF1jl&t=1852, Microsoft.Extensions.AI can extract information from an image and return the response as a custom class.

What is the best approach to achieving the same with a PDF file?

For example, if my PDFs represent different invoices and I want to extract the title and invoice number, will there be a PdfContent type in the future? Alternatively, can I supply the PDF in another way, or is the only option to extract the text (and images) from the PDF myself before providing it to the AI?

(I know there are services like https://azure.microsoft.com/en-us/products/ai-services/ai-document-intelligence)

Dayjay · 2025-04-24T06:35:27Z

I am wondering the same. This is what I do now (basically the same as the example):

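// Requires: using Microsoft.Extensions.AI;
// Resolve the chat client registered in the DI container.
var chatClient = app.Services.GetRequiredService<IChatClient>();

// Build a user message with the prompt and attach the raw PDF bytes.
var message = new ChatMessage(ChatRole.User, "Extract the information from this PDF file. If not specified, leave the field empty.");
message.Contents.Add(new DataContent(File.ReadAllBytes(filePath), "application/pdf"));

// Request structured output deserialized into DocumentResult.
var documentResult = await chatClient.GetResponseAsync<DocumentResult>(message);

Console.WriteLine(documentResult.Result);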

But the resulting JSON is always empty. I can also see that the DataContent holds the correct data: data:application/pdf;base64,xyzabc.

If I extract the text myself or convert the PDF to a PNG before sending it, it works fine. I am using gpt-4.1-mini, so PDF files should work, right?

2 replies

stephentoub May 1, 2025
Collaborator

Can you try with the latest build of https://www.nuget.org/packages/Microsoft.Extensions.AI.OpenAI/9.4.3-preview.1.25230.7?

Dayjay May 2, 2025

It works now. Thanks!
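
The thread never shows the DocumentResult type used in the snippet above. Here is a minimal sketch of what it might look like for the invoice scenario from the original question. The Title and InvoiceNumber properties and the sample JSON are assumptions made for illustration; they are not taken from the thread. The sketch only shows how a JSON reply maps onto the type with System.Text.Json and does not call a model:

```csharp
using System;
using System.Text.Json;

// Made-up JSON of the kind a model could return for an invoice (illustrative values only).
const string sampleJson = "{\"Title\":\"Invoice for consulting services\",\"InvoiceNumber\":\"INV-0001\"}";

// Map the JSON onto the hypothetical result type.
DocumentResult? parsed = JsonSerializer.Deserialize<DocumentResult>(sampleJson);

Console.WriteLine($"{parsed?.Title} / {parsed?.InvoiceNumber}");

// Hypothetical result type: nullable fields so that missing values can stay empty,
// matching the "leave the field empty" instruction in the prompt.
public sealed record DocumentResult(string? Title, string? InvoiceNumber);
```

Nullable properties are a design choice here: if the PDF lacks a title or invoice number, the field can be returned as null instead of forcing the model to invent a value.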