Skip to content

v1.3 - Integrated speech-to-text and text-to-speech

Compare
Choose a tag to compare
@souvikmajumder26 souvikmajumder26 released this 19 Mar 18:27
· 136 commits to main since this release
bea1f50
  • Successfully working overall architecture of Automated Agent Routing with LangGraph.

  • Successfully working Conversation Agent fine-tuned for medical domain.

  • Successfully working RAG agent.

  • Successfully working Web Search agent.

  • Successfully working routing from RAG to Web Search based on Retrieval Confidence score (if low).

  • Successfully working routing to appropriate Medical Computer Vision agent based on Classification of uploaded image (brain MRI / chest X-ray / skin lesion).

  • Successfully storing conversation history till specified length.

  • Successfully working backend and frontend.

  • Added ingest_rag_data.py to manually ingest new data for information retrieval.

  • Currently document parsing implemented with PyPDF2, later will provide option of unstructured.io as well (needs tesseract and poppler installation at system level).

  • Successfully working Medical Computer Vision model agents - Chest X-ray Covid-19 classification, and Skin Lesion Segmentation.

  • Successfully integrated ElevenLabs API to enable speech-to-text and text-to-speech services in conversation.

What's Changed

Full Changelog: v1.2...v1.3