
Performance Bottleneck: Stats queries taking 8-10 seconds #10

@doobidoo

Issue Description

The dashboard is experiencing 8-10 second stats query times, significantly slower than the expected 2-5 second performance.

Root Cause Analysis

The bottleneck is in the get_stats() and dashboard_get_stats() methods in server.py:

# WARNING: This can be very slow and memory-intensive on large collections.
all_metadatas_results = collection.get(include=['metadatas'])

Problems:

  1. Inefficient tag counting: Loads ALL metadata from ALL documents to count unique tags
  2. Frequent calls: Stats refresh after every operation (search, store, recall, init, refresh button)
  3. Memory intensive: Entire collection metadata loaded into memory
  4. No caching: Same expensive operation repeated frequently
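
The full-scan pattern behind these problems can be sketched as follows. This is a minimal, self-contained illustration, not the actual `server.py` code: `FakeCollection` is a hypothetical stand-in for the real ChromaDB collection, and the comma-separated `tags` metadata format is an assumption.

```python
class FakeCollection:
    """Hypothetical stand-in for a ChromaDB collection (illustration only)."""
    def __init__(self, metadatas):
        self._metadatas = metadatas

    def get(self, include=None):
        # Returns every record's metadata at once -- O(N) memory,
        # which is exactly the bottleneck described above.
        return {"metadatas": list(self._metadatas)}

def count_unique_tags(collection):
    # Full scan: loads ALL metadata from ALL documents just to count tags.
    results = collection.get(include=["metadatas"])
    tags = set()
    for meta in results["metadatas"]:
        # Assumes tags are stored as a comma-separated string.
        tags.update(t for t in meta.get("tags", "").split(",") if t)
    return len(tags)
```

On a large collection this scales linearly in both time and memory with the number of stored documents, which matches the observed 8-10 second queries.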

Performance Impact

  • Query times: 8-10 seconds (should be 2-5s)
  • User experience: Noticeable lag after every interaction
  • Memory usage: Unnecessary memory consumption loading full metadata

Proposed Solutions

Short-term fix:

  1. Add caching with TTL (Time To Live) for stats
  2. Reduce call frequency - don't refresh stats after every operation
  3. Lazy loading - only refresh when user explicitly requests
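
A minimal TTL cache along the lines of fix 1 might look like this. This is a sketch under assumptions, not the actual implementation; the `StatsCache` name and `get()` interface are invented for illustration.

```python
import time

class StatsCache:
    """Cache an expensive stats computation with a time-to-live (TTL)."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._value = None
        self._computed_at = 0.0  # monotonic timestamp of last computation

    def get(self, compute):
        """Return cached stats, recomputing only when the TTL has expired."""
        now = time.monotonic()
        if self._value is None or now - self._computed_at > self.ttl:
            self._value = compute()  # the expensive collection.get() scan
            self._computed_at = now
        return self._value

    def invalidate(self):
        """Force a recompute on the next get(), e.g. after a bulk import."""
        self._value = None
```

With a 30-second TTL, repeated dashboard refreshes within that window return instantly instead of re-scanning the collection each time.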

Long-term optimization:

  1. Maintain tag index separately in metadata or cache
  2. Batch operations - update stats incrementally
  3. Background updates - refresh stats periodically vs. on-demand
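
An incrementally maintained tag index (point 1) could be sketched like this. The `TagIndex` class and the `on_store`/`on_delete` hook names are hypothetical; they assume the store and delete code paths would call into the index instead of rescanning all metadata.

```python
from collections import Counter

class TagIndex:
    """Maintain tag counts incrementally instead of rescanning all metadata."""

    def __init__(self):
        self._counts = Counter()

    def on_store(self, tags):
        # Called when a document is stored: increment each tag's count.
        self._counts.update(tags)

    def on_delete(self, tags):
        # Called when a document is deleted: decrement counts and
        # drop any tags that reach zero (+= Counter() removes non-positives).
        self._counts.subtract(tags)
        self._counts += Counter()

    def unique_tag_count(self):
        # O(1)-ish stats query: no full collection scan needed.
        return len(self._counts)
```

Each store/delete becomes an O(tags-per-document) update, and the stats query no longer touches the collection at all; the index could live in memory and be rebuilt once at startup.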

Files Affected

  • src/memory_dashboard/server.py (primary)
  • src/MemoryDashboard.tsx (call frequency)

Priority

High - Significantly impacts user experience

Memory-Driven Development Notes

  • Archive current implementation before changes
  • Test systematically: minimal → simplified → complex
  • Create backup of current performance measurements
