Feature main #54
Conversation
- **Fixed MCP Docker Build Failure:** Resolved the build error for the `mcp` service by removing the invalid `readme` reference in `fast-markdown-mcp/pyproject.toml`.
- **Refactored File Handling (Removed In-Memory Storage):**
  - Investigated the complex in-memory file handling mechanism and its inconsistencies.
  - Removed the in-memory storage logic from `backend/app/crawler.py`.
  - Removed the associated API endpoints (`/api/memory-files`, `/api/memory-files/{file_id}`) from `backend/app/main.py`.
  - Added a new backend API endpoint (`/api/storage/file-content`) that reads files directly from the `storage/markdown` directory.
  - Deleted the old frontend API proxy route (`app/api/memory-file/route.ts`).
  - Created a new frontend API proxy route (`app/api/storage/file-content/route.ts`); see the sketch after this list.
  - Updated the frontend components (`StoredFiles.tsx`, `DiscoveredFiles.tsx`) to use the new API route for downloading file content.
- **Documentation:** Created markdown plans for the MCP build fix and the in-memory feature removal.

This simplifies the architecture by relying solely on disk-based consolidated files in `storage/markdown`. Please remember to test the file download functionality after restarting the services.
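For context, here is a minimal sketch of what the new proxy route could look like, assuming a Next.js App Router handler that forwards a `path` query parameter to the backend. The environment variable name, default host/port, and parameter name are assumptions, not taken from the PR:

```typescript
// app/api/storage/file-content/route.ts -- illustrative sketch only.
// Forwards the requested file path to the backend /api/storage/file-content
// endpoint. BACKEND_URL and the "path" parameter name are assumptions.
import { NextRequest, NextResponse } from "next/server";

const BACKEND_URL = process.env.BACKEND_URL ?? "http://backend:8000";

export async function GET(request: NextRequest) {
  const path = request.nextUrl.searchParams.get("path");
  if (!path) {
    return NextResponse.json({ error: "Missing 'path' parameter" }, { status: 400 });
  }

  // Proxy the request to the backend and relay status/content type back.
  const upstream = await fetch(
    `${BACKEND_URL}/api/storage/file-content?path=${encodeURIComponent(path)}`
  );
  if (!upstream.ok) {
    return NextResponse.json({ error: "Backend request failed" }, { status: upstream.status });
  }

  return new NextResponse(await upstream.text(), {
    headers: { "Content-Type": upstream.headers.get("Content-Type") ?? "text/plain" },
  });
}
```

Note that the sketch addresses the backend by a Compose service name rather than `localhost`; inside a Docker container, `localhost` points at the container itself, which is exactly the 500-error pitfall fixed in the next commit.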
This commit addresses several issues and implements enhancements across the crawling workflow:

Fixes:
- Resolved 400 Bad Request error caused by an incorrect query parameter (`file_path`) in the file content API route.
- Fixed backend `NameError` (`set_task_context`) in `crawler.py` that prevented result file saving.
- Corrected 500 Internal Server Error caused by a Docker networking issue (localhost vs. service name) in the file content API route proxy.
- Ensured the 'Data Extracted' statistic is correctly saved in the backend status and displayed in the UI.

UI Enhancements:
- Made the "Consolidated Files" section persistent, rendering as soon as a job ID is available.
- Relocated the "Crawl Selected" button inline with status details.
- Updated the "Crawl Selected" button to show a dynamic count and disable appropriately (a sketch follows below).
- Renamed the "Job Status" section title to "Discovered Pages".
- Renamed the "Processing Summary" section title to "Statistics".
- Removed the unused "Extracted Content" display section.

Backend Enhancements:
- Implemented file-appending logic in `crawler.py` for consolidated `.md` and `.json` files. Subsequent crawls for the same job now append data and update timestamps instead of overwriting.

Changelog:

### Added
- Backend logic to append new crawl results to existing consolidated `.md` and `.json` files for the same job ID.
- Dynamic count display on the "Crawl Selected" button.

### Changed
- "Consolidated Files" section now appears persistently once a job is initiated.
- "Crawl Selected" button relocated inline with status details; it disables after initiating a crawl.
- Renamed the "Job Status" section title to "Discovered Pages".
- Renamed the "Processing Summary" section title to "Statistics".
- Updated backend status management to correctly store and transmit the 'Data Extracted' statistic.

### Fixed
- Resolved 400 Bad Request error when fetching file content due to an incorrect query parameter name.
- Fixed backend `NameError` in the crawler that prevented saving crawl results.
- Resolved 500 Internal Server Error when fetching `.json` file content due to a Docker networking issue in the API proxy route.
- Corrected a display issue where the 'Data Extracted' statistic showed "N/A" instead of the actual value.

### Removed
- Removed the unused "Extracted Content" display section from the UI.
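To make the button behaviour concrete, here is a minimal sketch of the dynamic count and disable logic, assuming Shadcn UI's `Button` at its conventional import path; the component and prop names are illustrative, not the PR's actual code:

```typescript
// Sketch of the "Crawl Selected" button: shows a live count of selected
// URLs and disables itself when nothing is selected or a crawl has
// already been initiated. All names here are assumptions.
import { Button } from "@/components/ui/button";

interface CrawlSelectedButtonProps {
  selectedUrls: string[];
  crawlStarted: boolean;
  onCrawl: (urls: string[]) => void;
}

export function CrawlSelectedButton({
  selectedUrls,
  crawlStarted,
  onCrawl,
}: CrawlSelectedButtonProps) {
  return (
    <Button
      disabled={selectedUrls.length === 0 || crawlStarted}
      onClick={() => onCrawl(selectedUrls)}
    >
      Crawl Selected ({selectedUrls.length})
    </Button>
  );
}
```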
feat(frontend): Update Consolidated Files component for polling and downloads

- Implements polling every 10 seconds in `ConsolidatedFiles.tsx` to automatically refresh the list of files from the `/api/storage` endpoint, ensuring newly added files appear in the UI (see the sketch below).
- Modifies the MD and JSON icon links to point to the `/api/storage/download` endpoint and adds the `download` attribute, triggering file downloads instead of opening content in the browser.
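A minimal sketch of how that polling and download wiring could look in `ConsolidatedFiles.tsx`, assuming `/api/storage` returns a JSON array of file entries (the response shape and field names are assumptions):

```typescript
"use client";
// ConsolidatedFiles.tsx (excerpt) -- illustrative sketch of the 10-second
// polling loop and the download link behaviour described above.
import { useEffect, useState } from "react";

interface StoredFile {
  name: string;
  path: string;
}

export function ConsolidatedFiles() {
  const [files, setFiles] = useState<StoredFile[]>([]);

  useEffect(() => {
    const fetchFiles = async () => {
      const res = await fetch("/api/storage");
      if (res.ok) setFiles(await res.json());
    };
    fetchFiles();                               // initial load
    const id = setInterval(fetchFiles, 10_000); // refresh every 10 seconds
    return () => clearInterval(id);             // stop polling on unmount
  }, []);

  return (
    <ul>
      {files.map((f) => (
        <li key={f.path}>
          {f.name}{" "}
          {/* 'download' forces a file download instead of rendering in the browser */}
          <a href={`/api/storage/download?path=${encodeURIComponent(f.path)}`} download>
            MD
          </a>
        </li>
      ))}
    </ul>
  );
}
```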
Introduces a new `CrawlUrls` component to display and manage discovered URLs during a crawl job. The component uses Shadcn UI elements (Table, Checkbox, Badge, Tooltip) to provide a detailed view of individual URL statuses, handle URL selection for targeted actions, and display status updates driven by polling managed in `app/page.tsx`.

Key changes include:
- Creation of the `CrawlUrls` component for URL list display and interaction (a hypothetical prop sketch follows below).
- Refactoring of `CrawlStatusMonitor` to focus solely on displaying the overall job status within a Dialog component.
- Updates to `app/page.tsx` to manage essential state (job ID, job status, selected URLs) and orchestrate the polling mechanism for fetching URL-specific status updates.
- Fixed UI bugs where status icons were not updating correctly and checkbox selection state was inconsistent.
- Adjusted the styling of the info icon button for better contrast, per user feedback.

These frontend enhancements align with the ongoing backend redesign, supporting the new job-based status management and polling architecture for more granular progress tracking. Updated documentation in `docs/features/` (`adjust_info_button_style_plan.md`, `fix_discovered_pages_ui_bugs.md`, `create_crawl_urls_component_plan.md`, `crawl_status_monitoring_plan.md`) to reflect the completion of related tasks.
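To make the division of responsibilities concrete, here is a hypothetical shape for the state lifted into `app/page.tsx` and the props passed down to `CrawlUrls`; every name and the status enum below are inferred from the description, not taken from the PR:

```typescript
// Hypothetical prop/state shapes for CrawlUrls; the actual names and
// status values in the PR may differ.
type UrlStatus = "pending" | "crawling" | "completed" | "failed";

interface DiscoveredUrl {
  url: string;
  status: UrlStatus; // refreshed by the polling loop owned by app/page.tsx
}

interface CrawlUrlsProps {
  urls: DiscoveredUrl[];                        // rendered in a Shadcn Table with Badge/Tooltip per row
  selectedUrls: string[];                       // selection state lives in app/page.tsx
  onSelectionChange: (urls: string[]) => void;  // checkbox toggles bubble up to the parent
}
```

Keeping the job ID, job status, and selection in `app/page.tsx` means a single polling loop can feed both `CrawlUrls` and the Dialog-based `CrawlStatusMonitor`, rather than each component fetching independently.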