Skip to content

Conversation

20001020ycx
Copy link
Contributor

@20001020ycx 20001020ycx commented Oct 9, 2025

  • Add get_instructions MCP tool call to retrieve the system prompt.
  • Add get_nth_page MCP tool call to retrieve paginated query results cached from the previous search.
  • Introduce SessionManager for managing asynchronous session-level states.

Description

In our open-source release of the clp-mcp-server, we support 4 APIs: search_kql_query, search_kql_query_with_timestamp, get_nth_page, and get_instructions. In this PR, we implement the utility API: get_nth_page and get_instructions. For an overview of all four APIs, please refer to the internal documentation of the first MCP server release.

The documentation of get_nth_page and get_instructions are listed below:

get_nth_page(unsigned: page-idx)

Description:
Retrieves the n-th page of a paginated response with the paging metadata from the previous query. This tool is used when a query’s response is too long and LLM only wishes to examine a specific subset of the results.

Inputs:
page-idx: the index of the page LLM wants to look into

Outputs:
The set of log messages corresponding to the requested page index.

Errors:

  • “Error: No previous paginated response in this session”: If no query with a paginated response was run before running “get_nth_page” in this session then this error message is returned to the Agent.
  • “Error: page-idx is out of bounds” If page-idx is greater than num pages returned by a search query

get_instructions()

Description:
Returns a pre-defined “system prompt” that guides the LLM’s behavior.

The instruction covers:

  1. What outputs should be produced, and the desired output style.
  2. How to use the provided tools.
  3. Specific fine-tuning for the LLM (e.g., avoid premature conclusions, ask LLM for more input when necessary).
    This function must be invoked before any other tool.

Inputs:
empty

Outputs:
“system prompt” mentioned in Description.

Implementation Detail

get_nth_page:

  • Implementation: The MCP server caches only the results of the most recent query, with a maximum of 1,000 log entries stored in memory.
  • Scalability: If multiple users issue get_nth_page requests, the MCP server maintains an array of paginated responses. These are indexed by a session argument, which uniquely identifies each conversation between a user and the MCP server. The MCP server shall support multiple concurrent clients, where each client only makes synchronous API calls.

get_instructions:

  • The MCP server stores a constant string containing the “system prompt”, it returns the string as the response when this tool is invoked by LLM.
  • In the tool description, we will mention that this should be the first tool to call in a session.
  • In addition, we will implement an explicit static check to ensure this tool is called before any other tool. This can be done by using a session specific global flag which is set to true by the first tool call. If the first tool call is not get_instructions then we return an error to the AI saying that it should call get_instructions first.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. Unit Tests: uv run --group dev pytest tests/test_session_manager.py -v.
  2. Lint Tests: uv run ruff check clp/components/clp-mcp-server && task lint:check-py.
  3. Manual Integration Tests: task package && python3 -m clp_mcp_server.clp_mcp_server --host 127.0.0.1 --port 8000. Connected a Claude Desktop agent to the running MCP server and issued a get_instruction query, successfully receiving the predefined system prompt. You can see the full end-to-end result using this link
    Note: manual integration testing for get_nth_page was skipped due to the missing substring_search implementation; its correctness is verified through unit tests instead.
Screenshot 2025-10-09 at 10 41 42 PM

Future work

To keep this PR focused, we are intentionally deferring the following non-trivial tasks:

  1. Prompt Pruning: Author high-quality docstrings for each MCP API and refine the server-level system prompt. The goal is not only to make human maintainers understandable, but more importantly, to maximize LLM effectiveness when invoking the APIs.
  2. Integration Tests: The validation performed is only on unit tests, plus the limited manual end-to-end testing performed by the PR author. Full integration testing is incomplete due to the missing substring search implementation. The AI team is actively developing a comprehensive integration test suite, which will be the immediate next step following the first release.
  3. GitHub Workflow on API Documentation: Because API docstrings directly impact LLM performance, clarity is the top priority. We will add a CI workflow that automatically generates and updates API documentation whenever docstrings change or new APIs are added.

Summary by CodeRabbit

  • New Features

    • In-app system instructions (KQL guidance) and a tool to fetch specific paginated result pages.
  • Improvements

    • Per-session paginated caching with configurable TTL, page size and cache limits; session lifecycle management with periodic cleanup and async tooling; consolidated server settings/constants.
  • Tests

    • Extensive unit tests covering pagination, session lifecycle, expiry, concurrency and edge cases.
  • Chores

    • Dependency and lint config updates; tests package init and README testing instructions.

Copy link
Contributor

coderabbitai bot commented Oct 9, 2025

Walkthrough

Adds a constants module, a session-management subsystem with paginated caching and TTL-based async cleanup, integrates SessionManager into server tools (get_instructions, get_nth_page), updates async tool signatures/decorators, adjusts project dependencies and ruff ignores, and adds unit tests for session and pagination behavior.

Changes

Cohort / File(s) Summary of Changes
Server constants
components/clp-mcp-server/clp_mcp_server/server/constants.py
New module introducing EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS = 600, NUM_ITEMS_PER_PAGE = 10, MAX_CACHED_RESULTS = 1000, SESSION_TTL_SECONDS = 600, SERVER_NAME = "clp-mcp-server", and a multiline SYSTEM_PROMPT.
Server tools & wiring
components/clp-mcp-server/clp_mcp_server/server/server.py
Replace prior protocol-constant usage with constants; instantiate SessionManager(constants.SESSION_TTL_SECONDS); add async tools get_instructions(ctx) and get_nth_page(page_index, ctx) that use session IDs; update hello_world to reference constants.SERVER_NAME; adjust decorator usage and import Context.
Session management
components/clp-mcp-server/clp_mcp_server/server/session_manager.py
New module adding PaginatedQueryResult, SessionState (dataclass) and SessionManager. Implements validation, pagination (uses paginate.Page), per-session caching, instruction gating, TTL checks, access-time updates, and an async cleanup loop. Public facade methods for caching results and retrieving pages.
Project configuration
components/clp-mcp-server/pyproject.toml
Add/update runtime deps (aiomysql, paginate, pymongo), add pytest-repeat to dev deps and reorder dev deps; introduce ruff per-file ignores for tests/** (S101, TC003).
Tests
components/clp-mcp-server/tests/__init__.py, components/clp-mcp-server/tests/test_session_manager.py
Add tests package __init__.py and extensive unit tests covering PaginatedQueryResult, SessionState, and SessionManager (pagination, bounds/errors, TTL/expiration, cleanup, instruction gating, concurrent access).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Server
  participant SessionMgr as SessionManager
  note right of SessionMgr #E6F4EA: Async session manager (TTL + cleanup)
  Client->>Server: get_instructions(ctx)
  Server->>SessionMgr: get_or_create_session(ctx.session_id)
  SessionMgr-->>Server: SessionState (access time updated)
  Server-->>Client: constants.SYSTEM_PROMPT
Loading
sequenceDiagram
  autonumber
  participant Client
  participant Server
  participant SessionMgr as SessionManager
  Client->>Server: get_nth_page(page_index, ctx)
  Server->>SessionMgr: get_nth_page(ctx.session_id, page_index)
  alt page available
    SessionMgr-->>Server: { items, total_pages, total_items, items_per_page, has_next, has_previous }
    Server-->>Client: Page data
  else no cache / out-of-range / instructions missing
    SessionMgr-->>Server: { error, message }
    Server-->>Client: Error payload
  end
Loading
sequenceDiagram
  autonumber
  participant Producer
  participant SessionMgr as SessionManager
  participant Session as SessionState
  Producer->>SessionMgr: cache_query_result_and_get_first_page(session_id, results)
  SessionMgr->>Session: cache_query_result_and_get_first_page(results)
  Session-->>SessionMgr: first page payload
  par Background cleanup (async)
    SessionMgr->>SessionMgr: periodic cleanup_expired_sessions() every EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat(clp-mcp-server): Introduce session-based MCP capabilities:" directly captures the main architectural change introduced in this changeset. The raw summary and PR objectives confirm that the core change is implementing session management infrastructure to support per-session paging, lifecycle handling, and multiple concurrent users—which aligns precisely with the title's focus on "session-based MCP capabilities." The title is specific enough to convey that this is about introducing session-based architecture rather than being vague (e.g., "misc updates"), and it accurately reflects the most significant structural change across the implementation (the new SessionManager, SessionState, and PaginatedQueryResult classes that enable the two new MCP tools). While the title does not enumerate the specific APIs (get_nth_page and get_instructions), those tools are themselves dependent on and enabled by the session-based infrastructure, making the title appropriately scoped to the foundational change.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@20001020ycx 20001020ycx changed the title 10 4 mcp server implementation WIP mcp server implementation Oct 9, 2025
@20001020ycx 20001020ycx changed the title WIP mcp server implementation feat(clp-mcp-server): Implement MCP API Tool Calls: get_instruction() and get_nth_page() Oct 9, 2025
@20001020ycx 20001020ycx changed the title feat(clp-mcp-server): Implement MCP API Tool Calls: get_instruction() and get_nth_page() feat(clp-mcp-server): Implement MCP API Tool Calls: get_instruction and get_nth_page Oct 10, 2025
@20001020ycx 20001020ycx changed the title feat(clp-mcp-server): Implement MCP API Tool Calls: get_instruction and get_nth_page feat(clp-mcp-server): Implement MCP API Tool Calls: get_instructions and get_nth_page Oct 10, 2025
@20001020ycx 20001020ycx changed the title feat(clp-mcp-server): Implement MCP API Tool Calls: get_instructions and get_nth_page feat(clp-mcp-server): Implement MCP API Tool Calls: get_nth_page and get_instructions Oct 10, 2025
@20001020ycx 20001020ycx marked this pull request as ready for review October 10, 2025 12:44
@20001020ycx 20001020ycx requested a review from a team as a code owner October 10, 2025 12:44
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 321aeee and d3ccf3e.

⛔ Files ignored due to path filters (1)
  • components/clp-mcp-server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • components/clp-mcp-server/clp_mcp_server/server/constants.py (1 hunks)
  • components/clp-mcp-server/clp_mcp_server/server/server.py (3 hunks)
  • components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1 hunks)
  • components/clp-mcp-server/pyproject.toml (3 hunks)
  • components/clp-mcp-server/tests/__init__.py (1 hunks)
  • components/clp-mcp-server/tests/test_session_manager.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (2)
components/clp-mcp-server/clp_mcp_server/server/constants.py (1)
  • CLPMcpConstants (4-27)
components/clp-mcp-server/clp_mcp_server/server/server.py (1)
  • get_nth_page (37-46)
components/clp-mcp-server/tests/test_session_manager.py (2)
components/clp-mcp-server/clp_mcp_server/server/constants.py (1)
  • CLPMcpConstants (4-27)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (11)
  • QueryResult (15-52)
  • SessionManager (113-206)
  • SessionState (56-110)
  • get_page (38-52)
  • get_page_data (79-101)
  • cache_query_result (66-77)
  • cache_query_result (168-187)
  • is_expired (103-106)
  • get_or_create_session (147-166)
  • get_nth_page (189-206)
  • cleanup_expired_sessions (137-145)
components/clp-mcp-server/clp_mcp_server/server/server.py (2)
components/clp-mcp-server/clp_mcp_server/server/constants.py (1)
  • CLPMcpConstants (4-27)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (3)
  • SessionManager (113-206)
  • get_or_create_session (147-166)
  • get_nth_page (189-206)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (macos-15)
  • GitHub Check: rust-checks (ubuntu-22.04)
🔇 Additional comments (16)
components/clp-mcp-server/tests/__init__.py (1)

1-1: OK to add package marker.

Docstring-only module is fine.

components/clp-mcp-server/pyproject.toml (1)

74-78: Per-file ignores for tests are appropriate.

Allowing S101 and TC003 in tests is reasonable.

components/clp-mcp-server/clp_mcp_server/server/constants.py (1)

7-11: Constants look sane.

Values for cleanup cadence, TTL, paging, and cache limits are reasonable defaults.

components/clp-mcp-server/clp_mcp_server/server/server.py (4)

19-22: Server name + SessionManager init LGTM.

Using CLPMcpConstants and instantiating SessionManager with TTL is correct.


23-35: get_instructions sets session flag and returns prompt — good.

Meets the “must call first” design.


48-60: hello_world update to constants — good.

Simple, safe change.


36-47: Verify @mcp.tool usage and ctx argument position

  • Confirm whether the FastMCP API requires @mcp.tool() instead of @mcp.tool.
  • Ensure get_nth_page declares ctx: Context as its first parameter if context injection assumes a leading ctx argument.
components/clp-mcp-server/tests/test_session_manager.py (3)

31-40: Validation of MAX_CACHED_RESULTS bound is good.

Covers over-limit case.


77-127: Strong coverage for get_page_data paths.

Covers empty-cache, first/middle/last pages, and OOB.


185-205: Good: get_nth_page happy path and error cases.

Covers OOB index and missing cached results.

components/clp-mcp-server/clp_mcp_server/server/session_manager.py (6)

1-11: LGTM!

The imports are well-organized and appropriate for the session management functionality.


14-52: LGTM!

The QueryResult implementation is solid:

  • Proper validation of cached results against MAX_CACHED_RESULTS
  • Correct use of object.__setattr__ for setting computed field in frozen dataclass
  • Accurate ceiling division for total pages calculation
  • Appropriate bounds checking for 1-based page indexing

55-110: LGTM!

The SessionState implementation handles all state management correctly:

  • Proper use of timezone-aware datetime for tracking access time
  • Clear error messages for missing cached results and out-of-bounds pages
  • Correct TTL expiration logic using timedelta comparison

113-145: LGTM!

The thread-safe session management implementation is well-designed:

  • Appropriate use of threading.Lock to protect the shared sessions dictionary
  • Background daemon thread for periodic cleanup is a good approach
  • The cleanup logic correctly identifies and removes expired sessions under lock protection
  • Clear comments explain the threading model

147-166: LGTM!

The session lifecycle management is correct:

  • Properly handles expired session cleanup before reuse
  • Creates new sessions with appropriate defaults from CLPMcpConstants
  • Updates access time on every retrieval to maintain accurate TTL
  • Lock protection ensures thread-safe session dictionary operations

189-206: LGTM!

The pagination interface correctly:

  • Enforces the get_instructions prerequisite check
  • Converts zero-based page_index (external API) to one-based page_number (internal representation)
  • Delegates to session.get_page_data for the actual page retrieval

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an instruction for test case.
@coerabbitai can u create an issue as a feature request to add tasks and workflows to run unit tests for all uv components?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai can u create an issue for us?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for adding tasks and workflows to run unit tests for all uv components

gh issue create \
  --repo y-scope/clp \
  --title "Feature: Add tasks and workflows to run unit tests for all uv components" \
  --body "## Feature Request

Add tasks and workflows to run unit tests for all uv components in the repository.

## Context

This feature request was raised during the review of PR #1401, which added unit testing capabilities to the clp-mcp-server component using \`uv test pytest\`. The need for standardized test execution across all uv-based components was identified.

## Requested By

@LinZhihao-723

## References

- PR: https://github.com/y-scope/clp/pull/1401
- Comment: https://github.com/y-scope/clp/pull/1401#discussion_r2437510679" \
  --assignee LinZhihao-723

Length of output: 42


@LinZhihao-723, I've created issue #1431 for this feature request. The issue covers adding tasks and workflows to run unit tests for all uv components in the repository, with references back to this PR and comment thread.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08c80b3 and 86184c9.

📒 Files selected for processing (2)
  • components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1 hunks)
  • components/clp-mcp-server/tests/test_session_manager.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
components/clp-mcp-server/tests/test_session_manager.py (1)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (12)
  • PaginatedQueryResult (13-55)
  • SessionManager (161-242)
  • SessionState (59-158)
  • get_page (40-55)
  • get_instructions (103-110)
  • get_page_data (112-149)
  • cache_query_result_and_get_first_page (84-101)
  • cache_query_result_and_get_first_page (199-209)
  • is_expired (151-154)
  • get_or_create_session (224-242)
  • get_nth_page (211-222)
  • start (181-184)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1)
components/clp-mcp-server/clp_mcp_server/server/server.py (2)
  • get_instructions (24-33)
  • get_nth_page (36-54)
🪛 Ruff (0.14.0)
components/clp-mcp-server/tests/test_session_manager.py

75-77: One-line docstring should fit on one line

Reformat to one line

(D200)


192-192: Local variable session1 is assigned to but never used

Remove assignment to unused variable session1

(F841)


247-247: Local variable session is assigned to but never used

Remove assignment to unused variable session

(F841)


250-250: Async functions should not call time.sleep

(ASYNC251)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-24.04)

Comment on lines 97 to 101
self._cached_query_result = PaginatedQueryResult(
result_log_entries=results, num_items_per_page=self._num_items_per_page
)

return self.get_page_data(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle oversized caches instead of propagating ValueError.

Line 97 constructs PaginatedQueryResult without guarding ValueError. That exception bubbles up to the MCP tool and crashes the call whenever a query returns more than MAX_CACHED_RESULTS, contradicting the “cache up to 1 000 log entries” contract. Catch the ValueError and return a structured "Error" payload.

-        self._cached_query_result = PaginatedQueryResult(
-            result_log_entries=results, num_items_per_page=self._num_items_per_page
-        )
-
-        return self.get_page_data(0)
+        try:
+            self._cached_query_result = PaginatedQueryResult(
+                result_log_entries=results,
+                num_items_per_page=self._num_items_per_page,
+            )
+        except ValueError as exc:
+            return {"Error": str(exc)}
+
+        return self.get_page_data(0)
🤖 Prompt for AI Agents
In components/clp-mcp-server/clp_mcp_server/server/session_manager.py around
lines 97 to 101, constructing PaginatedQueryResult can raise ValueError for
oversized result sets; wrap that construction in a try/except ValueError, and in
the except branch avoid propagating the exception—optionally clear or leave
_cached_query_result unset and return a structured error payload (e.g. a dict
with type "Error" and a clear message about exceeding MAX_CACHED_RESULTS).
Ensure you still return the same shape the caller expects (not an exception) and
log the condition for observability.

Comment on lines 135 to 149
if self._cached_query_result is None:
return {"Error": "No previous paginated response in this session."}

page = self._cached_query_result.get_page(page_index)
if page is None:
return {"Error": "Page index is out of bounds."}

return {
"items": list(page),
"total_pages": page.page_count,
"total_items": page.item_count,
"num_items_per_page": page.items_per_page,
"has_next": page.next_page is not None,
"has_previous": page.previous_page is not None,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Return an empty page when the cached result set is empty.

Lines 135-149 treat every zero-length cache as “Page index is out of bounds.” With _num_pages == 0, get_page(0) returns None, so even the first page of a successful empty query yields an error. Return a proper empty page payload for page_index == 0.

-        page = self._cached_query_result.get_page(page_index)
-        if page is None:
-            return {"Error": "Page index is out of bounds."}
-
-        return {
+        cached_result = self._cached_query_result
+
+        if cached_result._num_pages == 0:
+            if page_index != 0:
+                return {"Error": "Page index is out of bounds."}
+            return {
+                "items": [],
+                "total_pages": 0,
+                "total_items": 0,
+                "num_items_per_page": self._num_items_per_page,
+                "has_next": False,
+                "has_previous": False,
+            }
+
+        page = cached_result.get_page(page_index)
+        if page is None:
+            return {"Error": "Page index is out of bounds."}
+
+        return {
             "items": list(page),
             "total_pages": page.page_count,
             "total_items": page.item_count,
             "num_items_per_page": page.items_per_page,
             "has_next": page.next_page is not None,
             "has_previous": page.previous_page is not None,
         }
🤖 Prompt for AI Agents
In components/clp-mcp-server/clp_mcp_server/server/session_manager.py around
lines 135-149, the code treats a zero-length cached_query_result as
out-of-bounds because get_page(0) returns None; update the logic to special-case
an empty cached result when page_index == 0 and return a proper empty page
payload instead of an error. Specifically: if _cached_query_result is present
but _cached_query_result._num_pages == 0 and page_index == 0, return a dict with
"items": [], "total_pages": 0, "total_items": 0, "num_items_per_page": 0 (or the
cached default if available), "has_next": False, and "has_previous": False;
otherwise keep the existing out-of-bounds handling for other page_index values.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (6)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (6)

216-217: Complete the parameter descriptions.

The session_id parameter is missing its description.

Apply this diff:

         """
         Retrieves the n-th page of a paginated response with the paging metadata from the previous
         query.
 
-        :param session_id:
+        :param session_id: Unique identifier for the session.
         :param page_index: Index of the page to retrieve (zero-based, e.g., 0 is the first page).

19-19: Complete the parameter description.

The num_items_per_page parameter is missing its description.

Apply this diff:

         :param result_log_entries: List of cached log entries to paginate.
-        :param num_items_per_page:
+        :param num_items_per_page: Number of items per page (must be > 0).
         :raise: ValueError if the number of cached results or `num_items_per_page` is invalid.

40-55: Allow empty result sets to return page 0 as an empty page.

When result_log_entries is empty, _num_pages becomes 0. Requesting page_index=0 (which becomes page_number=1) fails the check self._num_pages < page_number and returns None. This causes downstream methods to report "Page index is out of bounds" even though the query succeeded with zero results. Consider special-casing empty results to return an empty Page with correct metadata for page 0.

Apply this diff:

     def get_page(self, page_index: int) -> Page | None:
         """
         :param page_index: Zero-based index, e.g., 0 for the first page.
         :return: A `Page` object for the specified page.
         :return: None if `page_index` is out of bounds.
         """
         # Convert zero-based to one-based
         page_number = page_index + 1
+        
+        # Special-case empty results: allow page 0 to return an empty Page
+        if self._num_pages == 0:
+            if page_index == 0:
+                return Page([], page=1, items_per_page=self._num_items_per_page)
+            return None
+        
         if page_number <= 0 or self._num_pages < page_number:
             return None
 
         return Page(
             self._result_log_entries,
             page=page_number,
             items_per_page=self._num_items_per_page,
         )

228-228: Complete the parameter description.

The session_id parameter is missing its description.

Apply this diff:

         """
         Gets an existing session or creates a new one.
 
-        :param session_id:
+        :param session_id: Unique identifier for the session.
         :return: The `SessionState` object for the given `session_id`.

112-149: Handle empty result sets gracefully.

Lines 138-140 treat every zero-length cache as "Page index is out of bounds." When _cached_query_result has 0 items, get_page(0) returns None, so even the first page of a successful empty query yields an error. Consider returning a proper empty page payload for page_index == 0 when the cached result is empty.

Apply this diff:

         if self._cached_query_result is None:
             return {"Error": "No previous paginated response in this session."}
 
+        # Special-case empty cached results: allow page 0 to return an empty page
+        if self._cached_query_result._num_pages == 0:
+            if page_index == 0:
+                return {
+                    "items": [],
+                    "total_pages": 0,
+                    "total_items": 0,
+                    "num_items_per_page": self._num_items_per_page,
+                    "has_next": False,
+                    "has_previous": False,
+                }
+            return {"Error": "Page index is out of bounds."}
+
         page = self._cached_query_result.get_page(page_index)
         if page is None:
             return {"Error": "Page index is out of bounds."}

84-101: Add error handling for oversized query results.

Line 97 constructs PaginatedQueryResult without catching ValueError. If results exceed MAX_CACHED_RESULTS or num_items_per_page is invalid, the exception propagates up and crashes the MCP tool call. The PR objectives state the server "caches up to 1,000 log entries," implying graceful handling rather than tool failure.

Apply this diff to handle the error:

     def cache_query_result_and_get_first_page(
         self,
         results: list[str],
     ) -> dict[str, Any]:
         """
         :param results: Log entries from the query to cache.
         :return: Forwards `SessionState.get_page_data`'s return values.
         :return: _GET_INSTRUCTIONS_NOT_RUN_ERROR if `get_instructions` has not been called in this
             session.
         """
         if self._is_instructions_retrieved is False:
             return self._GET_INSTRUCTIONS_NOT_RUN_ERROR.copy()
 
-        self._cached_query_result = PaginatedQueryResult(
-            result_log_entries=results, num_items_per_page=self._num_items_per_page
-        )
+        try:
+            self._cached_query_result = PaginatedQueryResult(
+                result_log_entries=results, num_items_per_page=self._num_items_per_page
+            )
+        except ValueError as exc:
+            return {"Error": str(exc)}
 
         return self.get_page_data(0)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86184c9 and a2a13b8.

📒 Files selected for processing (1)
  • components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1)
components/clp-mcp-server/clp_mcp_server/server/server.py (2)
  • get_instructions (24-33)
  • get_nth_page (36-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: package-image

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (2)

93-97: Handle oversized caches without crashing the tool

PaginatedQueryResult raises ValueError when a query returns more than MAX_CACHED_RESULTS. Because this call site does not guard the exception, a large-but-valid search bubbles out of cache_query_result_and_get_first_page, causing the MCP tool to fail instead of returning the documented error payload. Please catch the ValueError (or pre-cap the results) and surface a structured error so callers can recover gracefully.

-        self._cached_query_result = PaginatedQueryResult(
-            log_entries=results, num_items_per_page=self._num_items_per_page
-        )
-
-        return self.get_page_data(0)
+        try:
+            self._cached_query_result = PaginatedQueryResult(
+                log_entries=results,
+                num_items_per_page=self._num_items_per_page,
+            )
+        except ValueError as exc:
+            return {"Error": str(exc)}
+
+        return self.get_page_data(0)

134-145: Return an empty page when the cached result set is empty

When the previous query produced zero results, _num_pages is 0, so requesting page_index=0 comes back as “Page index is out of bounds.” even though the query succeeded. That violates the PR contract for get_nth_page and prevents clients from seeing the empty result set. Please special-case the zero-results path and return an empty page (with correct pagination metadata) for index 0, only treating other indexes as out of bounds.

-        page = self._cached_query_result.get_page(page_index)
-        if page is None:
-            return {"Error": "Page index is out of bounds."}
+        cached_result = self._cached_query_result
+
+        if cached_result._num_pages == 0:
+            if page_index != 0:
+                return {"Error": "Page index is out of bounds."}
+            return {
+                "items": [],
+                "total_pages": 0,
+                "total_items": 0,
+                "num_items_per_page": cached_result._num_items_per_page,
+                "has_next": False,
+                "has_previous": False,
+            }
+
+        page = cached_result.get_page(page_index)
+        if page is None:
+            return {"Error": "Page index is out of bounds."}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a54b019 and 046b6f6.

📒 Files selected for processing (4)
  • components/clp-mcp-server/clp_mcp_server/server/constants.py (1 hunks)
  • components/clp-mcp-server/clp_mcp_server/server/server.py (3 hunks)
  • components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1 hunks)
  • components/clp-mcp-server/tests/test_session_manager.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (1)
components/clp-mcp-server/clp_mcp_server/server/server.py (2)
  • get_instructions (24-33)
  • get_nth_page (36-54)
components/clp-mcp-server/clp_mcp_server/server/server.py (1)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (5)
  • SessionManager (157-237)
  • get_instructions (99-106)
  • start (176-179)
  • get_or_create_session (219-237)
  • get_nth_page (206-217)
components/clp-mcp-server/tests/test_session_manager.py (2)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (12)
  • PaginatedQueryResult (13-55)
  • SessionManager (157-237)
  • SessionState (59-154)
  • start (176-179)
  • get_page (40-55)
  • get_instructions (99-106)
  • get_page_data (108-145)
  • cache_query_result_and_get_first_page (80-97)
  • cache_query_result_and_get_first_page (194-204)
  • is_expired (147-150)
  • get_or_create_session (219-237)
  • get_nth_page (206-217)
components/clp-mcp-server/clp_mcp_server/server/server.py (2)
  • get_instructions (24-33)
  • get_nth_page (36-54)
🪛 Ruff (0.14.0)
components/clp-mcp-server/tests/test_session_manager.py

103-105: One-line docstring should fit on one line

Reformat to one line

(D200)


211-211: Local variable session1 is assigned to but never used

Remove assignment to unused variable session1

(F841)


262-262: Local variable session is assigned to but never used

Remove assignment to unused variable session

(F841)


265-265: Async functions should not call time.sleep

(ASYNC251)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-22.04)

Comment on lines 255 to 282
with patch.object(constants, 'EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS', 0.05):
manager = SessionManager(session_ttl_seconds=TestConstants.FAST_SESSION_TTL_SECONDS)
await manager.start()

# Create sessions that we will expire and NOT access again
expired_session_ids = [f"{TestConstants.EXPIRED_SESSION_PREFIX}{i}" for i in range(20)]
for sid in expired_session_ids:
session = manager.get_or_create_session(sid)

# Ensure created sessions expire
time.sleep(1)

# Create active sessions that we WILL keep accessing
active_session_ids = [f"{TestConstants.ACTIVE_SESSION_PREFIX}{i}" for i in range(20)]
for _ in range(100):
for sid in active_session_ids:
manager.get_or_create_session(sid)

# Wait for at least one cleanup cycle
await asyncio.sleep(0.1)

# Verify: expired sessions deleted by cleanup asynchronously
for sid in expired_session_ids:
assert sid not in manager.sessions

# Verify: active sessions still exist
for sid in active_session_ids:
assert sid in manager.sessions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix async test: avoid blocking sleep and cancel the cleanup task

time.sleep(1) inside an async test blocks the event loop (Ruff ASYNC251) and the background _cleanup_task is never cancelled, leaving a pending task at teardown. Both flaws make the suite flaky. Please switch to await asyncio.sleep(...), drop the unused session assignment, and ensure the cleanup task is cancelled and awaited in a finally block. You’ll also need from contextlib import suppress.

-from concurrent.futures import ThreadPoolExecutor
-from datetime import datetime, timedelta, timezone
-from unittest.mock import patch
-
-import asyncio
-import time
+from concurrent.futures import ThreadPoolExecutor
+from contextlib import suppress
+from datetime import datetime, timedelta, timezone
+from unittest.mock import patch
+
+import asyncio
+import time
@@
-            for sid in expired_session_ids:
-                session = manager.get_or_create_session(sid)
-            
-            # Ensure created sessions expire
-            time.sleep(1)
-            
-            # Create active sessions that we WILL keep accessing
-            active_session_ids = [f"{TestConstants.ACTIVE_SESSION_PREFIX}{i}" for i in range(20)]
-            for _ in range(100):
-                for sid in active_session_ids:
-                    manager.get_or_create_session(sid)
-            
-            # Wait for at least one cleanup cycle
-            await asyncio.sleep(0.1)
-            
-            # Verify: expired sessions deleted by cleanup asynchronously
-            for sid in expired_session_ids:
-                assert sid not in manager.sessions
-            
-            # Verify: active sessions still exist
-            for sid in active_session_ids:
-                assert sid in manager.sessions
+            for sid in expired_session_ids:
+                manager.get_or_create_session(sid)
+
+            active_session_ids = [f"{TestConstants.ACTIVE_SESSION_PREFIX}{i}" for i in range(20)]
+            try:
+                # Ensure created sessions expire
+                await asyncio.sleep(1)
+
+                # Keep active sessions warm
+                for _ in range(100):
+                    for sid in active_session_ids:
+                        manager.get_or_create_session(sid)
+
+                # Wait for at least one cleanup cycle
+                await asyncio.sleep(0.1)
+
+                # Verify: expired sessions deleted by cleanup asynchronously
+                for sid in expired_session_ids:
+                    assert sid not in manager.sessions
+
+                # Verify: active sessions still exist
+                for sid in active_session_ids:
+                    assert sid in manager.sessions
+            finally:
+                if manager._cleanup_task is not None:
+                    manager._cleanup_task.cancel()
+                    with suppress(asyncio.CancelledError):
+                        await manager._cleanup_task
🧰 Tools
🪛 Ruff (0.14.0)

262-262: Local variable session is assigned to but never used

Remove assignment to unused variable session

(F841)


265-265: Async functions should not call time.sleep

(ASYNC251)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 046b6f6 and fa8dcc5.

📒 Files selected for processing (1)
  • components/clp-mcp-server/tests/test_session_manager.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
components/clp-mcp-server/tests/test_session_manager.py (1)
components/clp-mcp-server/clp_mcp_server/server/session_manager.py (12)
  • PaginatedQueryResult (13-55)
  • SessionManager (157-237)
  • SessionState (59-154)
  • start (176-179)
  • get_page (40-55)
  • get_instructions (99-106)
  • get_page_data (108-145)
  • cache_query_result_and_get_first_page (80-97)
  • cache_query_result_and_get_first_page (194-204)
  • is_expired (147-150)
  • get_or_create_session (219-237)
  • get_nth_page (206-217)
🪛 Ruff (0.14.0)
components/clp-mcp-server/tests/test_session_manager.py

201-203: One-line docstring should fit on one line

Reformat to one line

(D200)


267-267: Async functions should not call time.sleep

(ASYNC251)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: package-image
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: lint-check (ubuntu-24.04)

Comment on lines +201 to +203
"""
Create a `SessionManager` with a test session where `get_instructions()` has been called.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Reformat docstring to one line per Ruff D200.

The fixture docstring spans multiple lines but should fit on a single line to comply with PEP 257 conventions.

Based on static analysis hints

Apply this diff:

-        """
-        Create a `SessionManager` with a test session where `get_instructions()` has been called.
-        """
+        """Create a `SessionManager` with a test session where `get_instructions()` has been called."""
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"""
Create a `SessionManager` with a test session where `get_instructions()` has been called.
"""
"""Create a `SessionManager` with a test session where `get_instructions()` has been called."""
🧰 Tools
🪛 Ruff (0.14.0)

201-203: One-line docstring should fit on one line

Reformat to one line

(D200)

🤖 Prompt for AI Agents
components/clp-mcp-server/tests/test_session_manager.py lines 201-203: the
fixture docstring is written as a multi-line string but should be a one-line
docstring per Ruff D200/PEP 257; change the triple-quoted multi-line docstring
to a single-line docstring (e.g. """Create a SessionManager with a test session
where get_instructions() has been called."""), keeping the same wording and
surrounding quotes, and ensure indentation and surrounding blank lines remain
correct.

Comment on lines +254 to +284
@pytest.mark.asyncio
async def test_async_expiration_for_cleanup_loop(self) -> None:
"""Verifies that _cleanup_loop runs as an asynchronous task and deletes expired sessions."""
with patch.object(constants, "EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS", 0.05):
manager = SessionManager(session_ttl_seconds=TestConstants.FAST_SESSION_TTL_SECONDS)
await manager.start()

# Create sessions that we will expire and NOT access again
expired_session_ids = [f"{TestConstants.EXPIRED_SESSION_PREFIX}{i}" for i in range(20)]
for sid in expired_session_ids:
_ = manager.get_or_create_session(sid)

# Ensure created sessions expire
time.sleep(1)

# Create active sessions that we WILL keep accessing
active_session_ids = [f"{TestConstants.ACTIVE_SESSION_PREFIX}{i}" for i in range(20)]
for _ in range(100):
for sid in active_session_ids:
manager.get_or_create_session(sid)

# Wait for at least one cleanup cycle
await asyncio.sleep(0.1)

# Verify: expired sessions deleted by cleanup asynchronously
for sid in expired_session_ids:
assert sid not in manager.sessions

# Verify: active sessions still exist
for sid in active_session_ids:
assert sid in manager.sessions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix blocking sleep and add cleanup for background task.

Line 267 uses time.sleep(1) inside an async test, which blocks the event loop (Ruff ASYNC251). Additionally, the test never cancels manager._cleanup_task, leaving a pending task at teardown that can cause flakiness.

Based on static analysis hints

Apply this diff to replace the blocking sleep with async sleep and add proper cleanup:

+from contextlib import suppress
+
@@
             # Create sessions that we will expire and NOT access again
             expired_session_ids = [f"{TestConstants.EXPIRED_SESSION_PREFIX}{i}" for i in range(20)]
             for sid in expired_session_ids:
                 _ = manager.get_or_create_session(sid)
             
-            # Ensure created sessions expire
-            time.sleep(1)
-            
-            # Create active sessions that we WILL keep accessing
-            active_session_ids = [f"{TestConstants.ACTIVE_SESSION_PREFIX}{i}" for i in range(20)]
-            for _ in range(100):
-                for sid in active_session_ids:
-                    manager.get_or_create_session(sid)
-            
-            # Wait for at least one cleanup cycle
-            await asyncio.sleep(0.1)
-            
-            # Verify: expired sessions deleted by cleanup asynchronously
-            for sid in expired_session_ids:
-                assert sid not in manager.sessions
-            
-            # Verify: active sessions still exist
-            for sid in active_session_ids:
-                assert sid in manager.sessions
+            try:
+                # Ensure created sessions expire
+                await asyncio.sleep(1)
+                
+                # Create active sessions that we WILL keep accessing
+                active_session_ids = [f"{TestConstants.ACTIVE_SESSION_PREFIX}{i}" for i in range(20)]
+                for _ in range(100):
+                    for sid in active_session_ids:
+                        manager.get_or_create_session(sid)
+                
+                # Wait for at least one cleanup cycle
+                await asyncio.sleep(0.1)
+                
+                # Verify: expired sessions deleted by cleanup asynchronously
+                for sid in expired_session_ids:
+                    assert sid not in manager.sessions
+                
+                # Verify: active sessions still exist
+                for sid in active_session_ids:
+                    assert sid in manager.sessions
+            finally:
+                if manager._cleanup_task is not None:
+                    manager._cleanup_task.cancel()
+                    with suppress(asyncio.CancelledError):
+                        await manager._cleanup_task

Also add the missing import at the top of the file (after line 3):

 from concurrent.futures import ThreadPoolExecutor
+from contextlib import suppress
 from datetime import datetime, timedelta, timezone
🧰 Tools
🪛 Ruff (0.14.0)

267-267: Async functions should not call time.sleep

(ASYNC251)

🤖 Prompt for AI Agents
In components/clp-mcp-server/tests/test_session_manager.py around lines 254-284,
replace the blocking time.sleep(1) call with non-blocking await asyncio.sleep(1)
and ensure the background cleanup task is cleaned up at test end by cancelling
and awaiting manager._cleanup_task (or calling/awaiting manager.stop() if
provided) to avoid dangling tasks; also add the missing import asyncio near the
top of the file (after line 3).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa8dcc5 and acf99a9.

📒 Files selected for processing (1)
  • components/clp-mcp-server/clp_mcp_server/server/constants.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: clp-lint
components/clp-mcp-server/clp_mcp_server/server/constants.py

[error] 1-1: Python linting failed: Black would reformat 1 file and Ruff detected formatting changes in constants.py. Please run 'ruff --fix' and 'black .' to fix code style issues.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: package-image
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-22.04)
🔇 Additional comments (2)
components/clp-mcp-server/clp_mcp_server/server/constants.py (2)

1-5: LGTM!

The module docstring and constants are well-defined. The naming conventions (EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS, NUM_ITEMS_PER_PAGE) properly address previous review feedback.


9-25: LGTM with future refinements noted.

The SERVER_NAME and SYSTEM_PROMPT definitions are correctly formatted and provide a solid baseline for guiding LLM behaviour. The past review discussions identified several potential improvements (KQL syntax specifics, auto-generated keys, timestamp handling, timezone considerations) that are appropriately deferred to future work as noted in the PR description.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about using a multi-line commit message:

feat(clp-mcp-server): Introduce session-based MCP capabilities:
- Add `get_instructions` MCP tool call to retrieve the system prompt.
- Add `get_nth_page` MCP tool call to retrieve paginated query results cached from the previous search.
- Introduce `SessionManager` for managing asynchronous session-level states.

@20001020ycx 20001020ycx changed the title feat(clp-mcp-server): Implement MCP API Tool Calls: get_nth_page and get_instructions feat(clp-mcp-server): Introduce session-based MCP capabilities: Oct 17, 2025
@LinZhihao-723 LinZhihao-723 merged commit 221b064 into y-scope:main Oct 17, 2025
28 of 40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants