Description
I need to reuse my existing Qdrant vector database. I cannot recreate it, because it was built from PDFs totalling tens of thousands of pages.
Steps to Reproduce
@app.post("/v1/chat", response_model=ChatResponse)
async def chat(
    user_id: str = Form(...),
    session_id: str = Form(...),
    message: str = Form(...),
    files: List[UploadFile] = File(None),
    book_names: Optional[str] = Form(None)  # Comma-separated list of book names for knowledge filtering
):
    """
    Unified chat endpoint supporting text messages, single file uploads, and multiple file uploads

    Args:
        user_id: User identifier
        session_id: Session identifier
        message: User's message/query
        files: Optional list of files to upload
        book_names: Optional comma-separated list of book names to filter knowledge search
            Example: "CALLEN_S ULTRASOUND OBG 6th edition_Part14.pdf,Harrison's Internal Medicine 20th edition_Part5.pdf"
    """
    if not PIPELINE_AVAILABLE:
        raise HTTPException(status_code=503, detail="Pipeline not available")
    try:
        # Process book_names parameter for knowledge filtering
        knowledge_filters = None
        if book_names and book_names.strip():
            book_list = [book.strip() for book in book_names.split(',') if book.strip()]
            if book_list:
                knowledge_filters = {"book_name": book_list}
                logger.info(f"🔍 Knowledge filters applied: {knowledge_filters}")
        # Set allowed file types
        allowed_types = [
            'image/jpeg', 'image/jpg', 'image/png', 'image/gif', 'image/webp',
            'application/pdf', 'text/plain', 'application/msword',
            'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
        ]
        # Handle case when no files are provided
        if not files or len(files) == 0:
            # Process text-only message
            agent = get_medical_agent()
            result = agent.process_message(
                message=message,
                user_id=user_id,
                session_id=session_id,
                knowledge_filters=knowledge_filters
            )
        elif len(files) == 1:
            # Single file case - use the original process_message_concurrent for backward compatibility
            file = files[0]
            # Validate file type
            if file.content_type not in allowed_types:
                raise HTTPException(
                    status_code=400,
                    detail=f"Unsupported file type: {file.content_type}. Supported types: {', '.join(allowed_types)}"
                )
            # Read file content
            file_content = await file.read()
            # Check file size (10MB limit)
            max_size = 10 * 1024 * 1024  # 10MB
            if len(file_content) > max_size:
                raise HTTPException(
                    status_code=400,
                    detail=f"File too large. Maximum size is {max_size / (1024*1024):.1f}MB"
                )
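The filter-building step in the endpoint above can be checked in isolation. The following is a minimal sketch, assuming the same comma-separated input format; `build_knowledge_filters` is a hypothetical helper name and is not part of the original code.

```python
from typing import Dict, List, Optional


def build_knowledge_filters(book_names: Optional[str]) -> Optional[Dict[str, List[str]]]:
    """Turn a comma-separated string of book names into a filter dict.

    Returns None when no usable book name is given, mirroring the
    endpoint's behaviour of leaving knowledge_filters unset.
    """
    if not book_names or not book_names.strip():
        return None
    # Split on commas, trim whitespace, and drop empty entries.
    book_list = [book.strip() for book in book_names.split(",") if book.strip()]
    return {"book_name": book_list} if book_list else None


filters = build_knowledge_filters(
    "CALLEN_S ULTRASOUND OBG 6th edition_Part14.pdf, "
    "Harrison's Internal Medicine 20th edition_Part5.pdf"
)
print(filters)
```

This confirms that the dict passed as `knowledge_filters` has the shape shown in the log line, so the problem is more likely in how the filter key is validated than in how it is built.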
Agent Configuration (if applicable)
No response
Expected Behavior
The model should search the knowledge base using the given filters, so that only documents from the listed books are retrieved.
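To make the expectation concrete, here is a rough sketch in plain Python of the behaviour I expect. It assumes each stored point carries a `book_name` field in its payload; the payloads below and the `matches_filters` helper are made up for illustration and do not reflect how the library or Qdrant implement filtering.

```python
from typing import Any, Dict, List


def matches_filters(payload: Dict[str, Any], filters: Dict[str, List[str]]) -> bool:
    """Return True if every filter key is in the payload with an allowed value."""
    return all(payload.get(key) in allowed for key, allowed in filters.items())


# Hypothetical payloads standing in for points in the existing collection.
points = [
    {"book_name": "CALLEN_S ULTRASOUND OBG 6th edition_Part14.pdf", "page": 12},
    {"book_name": "Some Other Book.pdf", "page": 3},
    {"page": 7},  # a point without the book_name field
]

filters = {"book_name": ["CALLEN_S ULTRASOUND OBG 6th edition_Part14.pdf"]}

# Only points whose book_name is in the allowed list should be searched.
selected = [p for p in points if matches_filters(p, filters)]
print(selected)
```

In other words, a search with `book_names` set should only ever return chunks from the listed books, instead of falling back to an unfiltered search.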
Actual Behavior
2025-09-17 13:33:07,081 - INFO - AFC remote call 1 is done.
WARNING Invalid filter keys provided: ['book_name']. These filters will be ignored.
INFO Valid filter keys are: set()
WARNING No valid filters remain after validation. Search will proceed without filters.
2025-09-17 13:33:11,227 - INFO - HTTP Request: POST https://3a7e5639-7584-4d45-9711-25e3deddc0b3.us-east4-0.gcp.cloud.qdrant.io:6333/collections/medical_knowledge_base/points/query "HTTP/1.1 200 OK"
INFO Found 10 documents
2025-09-17 13:33:11,518 - INFO - AFC is enabled with max remote calls: 10.
2025-09-17 13:33:26,230 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent "HTTP/1.1 200 OK"
2025-09-17 13:33:26,240 - WARNING - Warning: there are non-text parts in the response: ['thought_signature'], returning concatenated parsed result from text parts. Check the full candidates.content.parts accessor to get the full model response.
2025-09-17 13:33:26,241 - INFO - AFC remote call 1 is done.
2025-09-17 13:33:27,190 - INFO - ✅ Successfully parsed JSON response with message: Ultrasonography, commonly known
as ultrasound, is a non-invasive imaging techniq...
2025-09-17 13:33:27,190 - INFO - ✅ Successfully parsed JSON response with message: Ultrasonography, commonly known
as ultrasound, is a non-invasive imaging technique that uses high-fr...
INFO: 127.0.0.1:52354 - "POST /v1/chat HTTP/1.1" 200 OK
2025-09-17 13:33:27,192 - INFO - 🔍 Knowledge filters applied: {'book_name': ["Harrison's Internal Medicine 20th edition_Part5.pdf", 'CALLEN_S ULTRASOUND OBG 6th edition_Part14.pdf']}
2025-09-17 13:33:29,424 - INFO - AFC is enabled with max remote calls: 10.
Screenshots or Logs (if applicable)
No response
Environment
Windows
Possible Solutions (optional)
No response
Additional Context
No response