Releases: souvikmajumder26/Multi-Agent-Medical-Assistant
v2.1.2 - Upgraded RAG Agent
- Document Processing Upgrade:
  - Unstructured.io has been replaced with Docling for parsing documents and extracting the text, tables, and images to be embedded.
  - The parsed document is converted into Markdown with text, tables, and image placeholders.
  - Image placeholders are replaced by image summaries generated by a multimodal LLM from the extracted images.
- Enhanced RAG References:
  - Links to the source documents and reference images (stored in local storage) that appear in the reranked retrieved chunks are added at the bottom of RAG responses.
- Integrated backend and frontend into one app:
  - Run a single FastAPI app.py for both backend and frontend.
- Updated Dockerfile and configured GitHub Actions:
  - Simplified the Dockerfile, replacing the Unstructured.io system dependency installations with those for Docling.
- Multi-platform bug fixes:
  - Modified ingested document paths to work on any OS and in Docker.
  - Changed the FastAPI app host from 127.0.0.1 to 0.0.0.0 so the app is accessible from outside the Docker container.
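The document path fix can be illustrated with `pathlib`. This is a minimal sketch, not the repository's code: the `data/docs` layout and the helper names `document_path` and `to_portable` are hypothetical.

```python
from pathlib import Path, PureWindowsPath


def document_path(filename: str, root: Path) -> Path:
    """Join path parts with the / operator so the separator matches the host OS."""
    # "data/docs" is an assumed folder layout used only for illustration.
    return root / "data" / "docs" / filename


def to_portable(stored: str) -> str:
    """Normalise a stored path string to forward slashes.

    A path written on Windows contains backslashes, which Linux (and therefore
    a typical Docker container) would treat as part of the file name.
    """
    return PureWindowsPath(stored).as_posix()
```

Storing the forward-slash form means the same ingested path can be resolved on Windows, macOS, Linux, and inside a container.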
What's Changed
- Modified document path (for multi OS support) and app host (to access from outside docker container without overriding host using command line argument) by @souvikmajumder26 in #61
Full Changelog: v2.1.1...v2.1.2
v2.1.1 - Upgraded RAG Agent
- Document Processing Upgrade:
  - Unstructured.io has been replaced with Docling for parsing documents and extracting the text, tables, and images to be embedded.
  - The parsed document is converted into Markdown with text, tables, and image placeholders.
  - Image placeholders are replaced by image summaries generated by a multimodal LLM from the extracted images.
- Enhanced RAG References:
  - Links to the source documents and reference images (stored in local storage) that appear in the reranked retrieved chunks are added at the bottom of RAG responses.
- Integrated backend and frontend into one app:
  - Run a single FastAPI app.py for both backend and frontend.
- Updated Dockerfile and configured GitHub Actions:
  - Simplified the Dockerfile, replacing the Unstructured.io system dependency installations with those for Docling.
- Refer to the v2.0 release notes for previous updates.
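The placeholder replacement step can be sketched in plain Python. This is an illustration under stated assumptions: the placeholder text `<!-- image -->` is assumed and should be adjusted to whatever the parser actually emits, the summaries are assumed to have been generated already, in document order, and `replace_image_placeholders` is a hypothetical helper.

```python
from __future__ import annotations

# Assumed placeholder text; adjust to match the parser's Markdown output.
IMAGE_PLACEHOLDER = "<!-- image -->"


def replace_image_placeholders(
    markdown: str,
    summaries: list[str],
    placeholder: str = IMAGE_PLACEHOLDER,
) -> str:
    """Replace the i-th placeholder with the i-th image summary.

    Placeholders that have no matching summary are left untouched.
    """
    parts = markdown.split(placeholder)
    out = [parts[0]]
    for i, rest in enumerate(parts[1:]):
        if i < len(summaries):
            out.append(f"[Image summary: {summaries[i]}]")
        else:
            out.append(placeholder)
        out.append(rest)
    return "".join(out)
```

Replacing placeholders in order keeps each summary at the position of the image it describes, so the summary is chunked and embedded together with the surrounding text.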
What's Changed
- Upgraded RAG with Docling by @souvikmajumder26 in #53
- Updated README by @souvikmajumder26 in #54
- Updated requirements by @souvikmajumder26 in #55
- Create docker-image.yml by @souvikmajumder26 in #56
- Updated Dockerfile and requirements by @souvikmajumder26 in #57
- Updated README with Docling by @souvikmajumder26 in #60
Full Changelog: v2.0...v2.1, v2.1...v2.1.1
v2.0 - First Final Version
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
- Conversation history is now maintained in the Graph State rather than managed separately in the FastAPI backend as in previous releases.
- Updated the chunking strategy to include semantic chunking (chunking that respects section, paragraph, and sentence boundaries), using section headers specific to the detected document type, such as research papers, clinical notes, patient records, medical condition reports, guidelines and protocols, and drug information. Also added medical entity recognition to enrich the document metadata, which aids hybrid search by comparing against the medical entities detected in the user query.
- Provided chunking strategy options to the developer: 'semantic', 'sliding_window', 'recursive', 'hybrid'.
- Because the Git LFS quota was exhausted, the large model file is now shared via Google Drive and is downloaded automatically to the correct path (added an automatic model downloader script).
- Successfully integrated Human-in-the-loop Validation for Computer Vision Agent results.
- Unified LLM and Embedding model definitions from config.
- Integrated Tables Extraction with Unstructured.IO
- Upgraded Vector Search to Hybrid Search Retrieval (BM25 sparse keyword matching + dense embedding vector similarity search) with Qdrant DB
- Ingested new data into the vector database for the final version; corresponding demo video and README have been updated
- Updated the README installation guide to cover Docker alongside the existing manual option
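The fallback from RAG to Web Search on a low retrieval confidence score amounts to a threshold check. The sketch below is an assumption-laden illustration: the state key `retrieval_confidence`, the threshold value, and the agent names are all hypothetical and not taken from the repository.

```python
def route_after_retrieval(state: dict, threshold: float = 0.5) -> str:
    """Pick the next agent from the retrieval confidence stored in the state.

    A missing score is treated as zero confidence, so the router falls back
    to web search rather than answering from weak retrieval results.
    """
    confidence = state.get("retrieval_confidence", 0.0)  # hypothetical key
    if confidence >= threshold:
        return "RAG_AGENT"  # hypothetical node name
    return "WEB_SEARCH_AGENT"  # hypothetical node name
```

A function of this shape, which takes the graph state and returns the name of the next node, is the kind of callable that LangGraph's conditional edges accept.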
What's Changed
- Updated installation using docker by @souvikmajumder26 in #50
Full Changelog: v1.9...v2.0
v1.9 - Ingested new data
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
- Conversation history is now maintained in the Graph State rather than managed separately in the FastAPI backend as in previous releases.
- Updated the chunking strategy to include semantic chunking (chunking that respects section, paragraph, and sentence boundaries), using section headers specific to the detected document type, such as research papers, clinical notes, patient records, medical condition reports, guidelines and protocols, and drug information. Also added medical entity recognition to enrich the document metadata, which aids hybrid search by comparing against the medical entities detected in the user query.
- Provided chunking strategy options to the developer: 'semantic', 'sliding_window', 'recursive', 'hybrid'.
- Because the Git LFS quota was exhausted, the large model file is now shared via Google Drive and is downloaded automatically to the correct path (added an automatic model downloader script).
- Successfully integrated Human-in-the-loop Validation for Computer Vision Agent results.
- Unified LLM and Embedding model definitions from config.
- Integrated Tables Extraction with Unstructured.IO
- Upgraded Vector Search to Hybrid Search Retrieval (BM25 sparse keyword matching + dense embedding vector similarity search) with Qdrant DB
- Ingested new data into the vector database for the final version
- Corresponding demo video and README have been updated
What's Changed
- Ingested new data for final version by @souvikmajumder26 in #46
- Updated README by @souvikmajumder26 in #49
Full Changelog: v1.8...v1.9
v1.8 - Integrated Unstructured.IO
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
- Conversation history is now maintained in the Graph State rather than managed separately in the FastAPI backend as in previous releases.
- Updated the chunking strategy to include semantic chunking (chunking that respects section, paragraph, and sentence boundaries), using section headers specific to the detected document type, such as research papers, clinical notes, patient records, medical condition reports, guidelines and protocols, and drug information. Also added medical entity recognition to enrich the document metadata, which aids hybrid search by comparing against the medical entities detected in the user query.
- Provided chunking strategy options to the developer: 'semantic', 'sliding_window', 'recursive', 'hybrid'.
- Because the Git LFS quota was exhausted, the large model file is now shared via Google Drive and is downloaded automatically to the correct path (added an automatic model downloader script).
- Successfully integrated Human-in-the-loop Validation for Computer Vision Agent results.
- Unified LLM and Embedding model definitions from config.
- Integrated Tables Extraction with Unstructured.IO
- Upgraded Vector Search to Hybrid Search Retrieval (BM25 sparse keyword matching + dense embedding vector similarity search) with Qdrant DB
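Hybrid retrieval needs some way to merge the sparse (BM25) ranking with the dense (embedding) ranking. These notes do not say which fusion method the project uses, so the sketch below shows one common option, reciprocal rank fusion, purely as an illustration. The function name and the constant `k=60` are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Merge several ranked lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents ranked highly by both retrievers accumulate the largest score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a deterministic result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_ranking = ["a", "b", "c"]   # ids ordered by keyword score
dense_ranking = ["b", "c", "a"]  # ids ordered by vector similarity
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Rank-based fusion avoids having to normalise BM25 scores and cosine similarities onto a common scale.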
What's Changed
- Unified llm and embedding model definitions from config by @souvikmajumder26 in #43
- Integrated Tables Extraction with Unstructured.IO and Hybrid Search Retrieval with Qdrant DB by @souvikmajumder26 in #44
- Updated README by @souvikmajumder26 in #45
Full Changelog: v1.7...v1.8
v1.7 - Integrated Human-in-the-loop Validation
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
- Conversation history is now maintained in the Graph State rather than managed separately in the FastAPI backend as in previous releases.
- Updated the chunking strategy to include semantic chunking (chunking that respects section, paragraph, and sentence boundaries), using section headers specific to the detected document type, such as research papers, clinical notes, patient records, medical condition reports, guidelines and protocols, and drug information. Also added medical entity recognition to enrich the document metadata, which aids hybrid search by comparing against the medical entities detected in the user query.
- Provided chunking strategy options to the developer: 'semantic', 'sliding_window', 'recursive', 'hybrid'.
- Because the Git LFS quota was exhausted, the large model file is now shared via Google Drive and is downloaded automatically to the correct path (added an automatic model downloader script).
- Successfully integrated Human-in-the-loop Validation for Computer Vision Agent results.
What's Changed
- Integrated Human-in-the-loop Validation by @souvikmajumder26 in #42
Full Changelog: v1.6...v1.7
v1.6 - Advanced Chunking Strategy
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
- Conversation history is now maintained in the Graph State rather than managed separately in the FastAPI backend as in previous releases.
- Updated the chunking strategy to include semantic chunking (chunking that respects section, paragraph, and sentence boundaries), using section headers specific to the detected document type, such as research papers, clinical notes, patient records, medical condition reports, guidelines and protocols, and drug information. Also added medical entity recognition to enrich the document metadata, which aids hybrid search by comparing against the medical entities detected in the user query.
- Provided chunking strategy options to the developer: 'semantic', 'sliding_window', 'recursive', 'hybrid'.
- Because the Git LFS quota was exhausted, the large model file is now shared via Google Drive and is downloaded automatically to the correct path (added an automatic model downloader script).
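Of the chunking options listed, 'sliding_window' is the simplest to show. The following is a minimal word-based sketch; the project's own implementation may count tokens or characters instead, and the function name and default sizes are assumptions.

```python
from __future__ import annotations


def sliding_window_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of chunk_size words, each sharing overlap words
    with the previous chunk."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.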
What's Changed
- Updated ingested data with better chunking logic by @souvikmajumder26 in #35
- Added chunking strategy choice by @souvikmajumder26 in #36
- Updated main README and agentic workflow README by @souvikmajumder26 in #39
- Large model file removed by @souvikmajumder26 in #41
Full Changelog: v1.5...v1.6
v1.5 - Conversation history now maintained in graph state
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
- Conversation history is now maintained in the Graph State rather than managed separately in the FastAPI backend as in previous releases.
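Keeping a length-limited conversation history inside the graph state can be sketched as follows. The message format, the `max_messages` default, and the helper name `append_message` are hypothetical; the repository's actual state schema is not shown in these notes.

```python
from __future__ import annotations

from collections import deque


def append_message(history: list[dict], role: str, content: str,
                   max_messages: int = 20) -> list[dict]:
    """Return a new history with the message appended, keeping only the
    most recent max_messages entries."""
    trimmed = deque(history, maxlen=max_messages)  # oldest entries fall off the left
    trimmed.append({"role": role, "content": content})
    return list(trimmed)
```

Returning a new list instead of mutating the input suits a graph node that hands back a state update, for example `{"messages": append_message(state["messages"], "user", text)}` with an assumed `messages` key.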
What's Changed
- Updated README by @souvikmajumder26 in #33
- Conversation history now maintained in graph state by @souvikmajumder26 in #34
Full Changelog: v1.4...v1.5
v1.4 - Integrated Guardrails
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
- Successfully integrated Input and Output Guardrails.
What's Changed
- Audio button spaced, LLM chat history summarizer for web search query by @souvikmajumder26 in #29
- Updated README & requirements by @souvikmajumder26 in #31
- Integrated Input and Output Guardrails by @souvikmajumder26 in #32
Full Changelog: v1.3...v1.4
v1.3 - Integrated speech-to-text and text-to-speech
- Successfully working overall architecture of Automated Agent Routing with LangGraph.
- Successfully working Conversation Agent fine-tuned for medical domain.
- Successfully working RAG agent.
- Successfully working Web Search agent.
- Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).
- Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).
- Successfully storing conversation history till specified length.
- Successfully working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option will be added later (it requires system-level installation of Tesseract and Poppler).
- Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.
- Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.
What's Changed
- Added skin lesion model file with LFS by @souvikmajumder26 in #24
- Delete temporary files, added max upload file size limit, can upload any shape of image to diagnose by @souvikmajumder26 in #25
- Integrated speech-to-text and text-to-speech by @souvikmajumder26 in #26
Full Changelog: v1.2...v1.3