AEC Hackathon Munich: MOD Smart Prefab challenge, Team MOD-2.
Our goal is to extract structured data from unstructured PDFs containing information about prefabricated elements. The main challenge is to produce reliable results in JSON format that downstream applications can consume.
Our strategy uses two models acting as agents, one based on OpenAI and one on Claude, that improve each other's output. One model generates the initial JSON from the prompt, while the other checks it and corrects any mistakes in the previous output.
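The generate-and-check loop can be sketched as below. This is a minimal illustration, not the actual project code: the two stub functions stand in for the OpenAI and Claude API calls, and the sample data is invented.

```python
import json

# Hypothetical stand-in for the generating model (OpenAI in the project).
def generator_model(prompt: str) -> str:
    # Returns a first-draft JSON string; note "width" comes back as a
    # string rather than a number, a typical LLM formatting slip.
    return '{"element": "wall panel", "width": "2500"}'

# Hypothetical stand-in for the checking model (Claude in the project).
def checker_model(draft: str) -> str:
    # Reviews the draft and returns a corrected JSON string.
    data = json.loads(draft)
    data["width"] = int(data["width"])  # coerce numeric fields to numbers
    return json.dumps(data)

def extract_structured_data(prompt: str, rounds: int = 2) -> dict:
    """Generate JSON with one model, then let the other verify and correct it."""
    draft = generator_model(prompt)
    for _ in range(rounds):
        draft = checker_model(draft)
    return json.loads(draft)

print(extract_structured_data("Extract the prefab element as JSON."))
# {'element': 'wall panel', 'width': 2500}
```

In the real system each stub would be an API call, and the checker's feedback would be fed back into the generator's prompt for the next round.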
There are some typical approaches for this challenge. Prompt engineering refers to designing effective prompts that instruct the LLM to complete the task. RAG applies semantic embedding to retrieve the relevant chunks of documents, and the LLM then uses the retrieved content to generate answers that are more informed and accurate. Fine-tuning customizes an LLM on a specific dataset to adjust its behavior or optimize it for specific tasks. In this project we did not fine-tune; instead, we tried a strategy called verbal reinforcement learning, in which feedback from evaluators is used to iteratively improve how the LLM responds.
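For context, the retrieval step of RAG can be illustrated with a toy bag-of-words similarity instead of real semantic embeddings. This is a simplified sketch (the example chunks are invented), not what the project ships:

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Toy tokenizer: lowercase word counts stand in for an embedding."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k document chunks most similar to the query."""
    qv = tokens(query)
    return sorted(chunks, key=lambda c: cosine(qv, tokens(c)), reverse=True)[:k]

chunks = [
    "Panel P-101: precast concrete wall, width 2500 mm, height 3200 mm.",
    "Delivery schedule and site logistics for phase two.",
]
print(retrieve("wall panel width mm", chunks)[0])
# Panel P-101: precast concrete wall, width 2500 mm, height 3200 mm.
```

A production RAG pipeline would replace the word counts with vector embeddings and the list scan with a vector index, but the retrieve-then-generate flow is the same.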
PDFtoDLM is composed of two LLM backends, a frontend user interface, and some additional tools. It offers the following features:
- Upload multiple PDF files via the web interface.
- Immediate visualization of each uploaded PDF.
- Asynchronous generation of structured JSON data from each PDF.
- Interactive JSON schema editor with live syntax highlighting.
- Options to save edited JSON and download it locally.
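Getting reliable JSON out of an LLM reply usually requires a defensive parsing step, since models often wrap the JSON in prose or markdown fences. A minimal sketch of such a helper (the function name and sample reply are hypothetical, not the project's actual code):

```python
import json
import re

def parse_model_json(raw: str) -> dict:
    """Extract and parse the first JSON object in an LLM reply.

    Locates the outermost curly braces before parsing, so fences and
    surrounding prose are tolerated.
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

reply = 'Here is the data:\n```json\n{"element": "slab", "thickness_mm": 200}\n```'
print(parse_model_json(reply))
# {'element': 'slab', 'thickness_mm': 200}
```

If parsing fails, the error can be sent back to the checking model as feedback, which fits the iterative-correction strategy described above.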
OpenAI
- Frontend: React.js
- Backend: Node.js with Express
- PDF Parsing: pdf-parse, pdf-lib
- AI Integration: OpenAI API
Claude
- Bash, llm-claude-3
OpenAI backend
- Node.js (v14 or higher)
- npm or yarn
- OpenAI API key (a valid key from your OpenAI account)
Claude backend
- Python
- Claude API Key
- Installation