v1.3 - Integrated speech-to-text and text-to-speech
- Working overall architecture for automated agent routing with LangGraph.
- Working Conversation agent fine-tuned for the medical domain.
- Working RAG agent.
- Working Web Search agent.
- Working routing from RAG to Web Search when the retrieval confidence score is low.
- Working routing to the appropriate Medical Computer Vision agent based on classification of the uploaded image (brain MRI / chest X-ray / skin lesion).
- Conversation history is stored up to a specified length.
- Working backend and frontend.
- Added ingest_rag_data.py to manually ingest new data for information retrieval.
- Document parsing is currently implemented with PyPDF2; an unstructured.io option is planned (it requires system-level installation of Tesseract and Poppler).
- Working Medical Computer Vision model agents: chest X-ray COVID-19 classification and skin lesion segmentation.
- Integrated the ElevenLabs API to enable speech-to-text and text-to-speech in conversations.
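
Two of the behaviours listed above, falling back from RAG to Web Search on low retrieval confidence and capping the stored conversation history, can be illustrated with a minimal sketch. This is not the project's actual code: the threshold value, the function names `route_after_retrieval` and `make_history`, and the agent labels `"rag"` and `"web_search"` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical threshold; these release notes do not state the real value.
CONFIDENCE_THRESHOLD = 0.5


def route_after_retrieval(confidence: float,
                          threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Pick the next agent from a retrieval confidence score (assumed 0..1)."""
    # Low confidence: the retrieved context is probably not good enough,
    # so hand the query to the web search agent instead.
    if confidence < threshold:
        return "web_search"
    return "rag"


def make_history(max_messages: int) -> deque:
    """Create a history buffer that drops the oldest entry once it is full."""
    return deque(maxlen=max_messages)


# Usage: keep only the three most recent messages.
history = make_history(max_messages=3)
for message in ["hi", "hello", "what is a brain MRI?", "an imaging scan"]:
    history.append(message)
```

In a LangGraph setup, a function like `route_after_retrieval` would typically serve as the routing function of a conditional edge, but how the project wires it up is not described here.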
What's Changed
- Added skin lesion model file with LFS by @souvikmajumder26 in #24
- Delete temporary files, added max upload file size limit, can upload any shape of image to diagnose by @souvikmajumder26 in #25
- Integrated speech-to-text and text-to-speech by @souvikmajumder26 in #26
Full Changelog: v1.2...v1.3