feat(clp-mcp-server): Introduce session-based MCP capabilities: #1401
Changes from 75 commits
@@ -49,6 +49,14 @@ For more details on Claude Desktop MCP setup, see the

> **Warning:** 🚧 This section is still under construction.

## Testing

Use the following command to run all unit tests:

```shell
uv test pytest
```
Comment on lines +52 to +58

🧩 Analysis chain

**Verify the test command syntax and update for consistency with the PR objectives.** The command `uv test pytest` does not appear to be valid `uv` usage. Please verify the correct command by running the following script. Once verified, the README command should be updated to match the project's testing setup.

🏁 Script executed:

```shell
#!/bin/bash
# Verify uv test command syntax and check available test runners

# Check if uv supports a 'test' subcommand
uv test --help 2>&1 | head -20

# List available uv commands
uv --version
uv --help | grep -i test
```

Length of output: 122

Understood. I'll gather information through alternative methods since the sandbox lacks the
🏁 Script executed:

```shell
#!/bin/bash
# Search codebase for configuration files and existing uv command usage

# Look for configuration files
echo "=== Configuration files ==="
fd --type f -e toml -e lock | head -20

# Search for existing uv command patterns in markdown, scripts, and config files
echo -e "\n=== Existing uv commands in codebase ==="
rg -n "uv\s+(run|test|pytest)" --type md --type py --type toml -C 2

# Check if pyproject.toml defines test commands
echo -e "\n=== Content of pyproject.toml (if exists) ==="
fd --type f pyproject.toml --exec cat

# Search for test-related configuration
echo -e "\n=== Test command patterns ==="
rg -n "pytest|test" --type toml -C 1 | head -30
```

Length of output: 3065

🏁 Script executed:

```shell
#!/bin/bash
# Check the PR objectives or description for test setup guidance
fd --type f -e md | head -10
grep -A 10 -B 5 -i "test\|pytest" components/clp-mcp-server/README.md
```

Length of output: 695

🌐 Web query:

💡 Result: Short answer: there is no built-in top-level `uv test` subcommand. Use `uv run pytest` (or `uv run <script/name>`), or define a script and run it via `uv run test`. [1][2][3]

Length of output: 2457

**Fix the test command syntax.** The command `uv test pytest` is not valid. Since the pytest tooling lives in the `dev` dependency group, use:

```shell
uv run --group dev pytest
```

This also aligns with the approach documented in the PR objectives.
[claude-desktop]: https://claude.ai/download
[claude-desktop-mcp-doc]: https://modelcontextprotocol.io/docs/develop/connect-local-servers
[mcp]: https://modelcontextprotocol.io/docs/getting-started/intro
@@ -0,0 +1,26 @@
```python
"""Constants for CLP MCP server."""

EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS = 600  # 10 minutes
NUM_ITEMS_PER_PAGE = 10
MAX_CACHED_RESULTS = 1000
SESSION_TTL_MINUTES = 60

SERVER_NAME = "clp-mcp-server"
SYSTEM_PROMPT = """
You are an AI assistant that helps users query a log database using KQL (Kibana Query Language).
When given a user query, you should generate a KQL query that accurately captures the user's
intent. The KQL query should be as specific as possible to minimize the number of log messages
returned.

You should also consider the following guidelines when generating KQL queries:
- Use specific field names and values to narrow down the search.
- Avoid using wildcards (*) unless absolutely necessary, as they can lead to large result
  sets.
- Use logical operators (AND, OR, NOT) to combine multiple conditions.
- Consider the time range of the logs you are searching. If the user specifies a time range,
  include it in the KQL query.
- If the user query is ambiguous or lacks detail, ask clarifying questions to better
  understand their intent before generating the KQL query.
- Always ensure that the generated KQL query is syntactically correct and can be executed
  without errors.
"""
```
Suggested change:

```python
SYSTEM_PROMPT = (
    "You are an AI assistant that helps users query a log database using KQL (Kibana Query Language)."
    " You should generate a KQL query that accurately expresses the user's intent. The generated KQL"
    " query should be as specific as possible to minimize the number of log messages returned.\n\n"
    "You should consider the following guidelines to generate KQL queries efficiently:\n"
    "- Use specific field names and values to narrow down the search.\n"
    "- Avoid using wildcards (`*`) unless absolutely necessary, as they can lead to large result"
    " sets.\n"
    "- Use logical operators (`AND`, `OR`, `NOT`) to combine one or more key-value searches.\n"
    "- Consider the time range of the logs you are searching. If the user specifies a time range,"
    " include it in the KQL query.\n"
    "- If the user query is ambiguous or lacks detail, ask clarifying questions to better understand"
    " their intent before generating the KQL query.\n"
    "- Always ensure that the generated KQL query is syntactically correct and can be executed without"
    " errors."
)
```
A few questions about this prompt, may also need inputs from @kirkrodrigues and @junhaoliao.
- Since our KQL is slightly different from the standard version, shall we mention the syntax first? For example:
- Use double quotes for string values.
- Escape special characters.
- We didn't mention auto-generated keys. Would this make the agent only query user-generated keys, which may miss some important log-event-level metadata?
- We didn't mention file-level metadata query.
- When we say narrow down the timestamp, we didn't give rules to:
In what granularity should the timestamp be queried? We should maybe explain that the timestamp is Unix epoch in milliseconds (or nanoseconds with Devin's latest change).
- How to deal with the timezone? We don't have timezone information tracked in the archive now. Without some grace offsets applied, the agent could generate a query that doesn't fall into the actual expected timestamp range.
Agree with everything you mentioned except:
- Afaik, we aren't currently storing file-level metadata in clp-json (although clp-s supports storing it), so perhaps we should defer this.
- We can tell the LLM that we only want queries in UTC time, right?
Some additional points:
- I think it might also be useful to tell the LLM that we don't support fuzzy matches (otherwise it might expect that, for example, "system error" can match "system shutdown error").
- How does the LLM know what key to use for the timestamp?
- Do we really need the point about ensuring the KQL query is syntactically correct? Does it actually improve the accuracy of the generated query?
iirc, we don't support parsing timestamps as UTC yet. When users give a timestamp range, the LLM probably also doesn't know which timezone it should use to apply the offset.

I think a safe range should be: suppose `lower_bound` and `upper_bound` are given, we add a grace padding to search:

- `real_lower_bound` = `lower_bound` - 12 hours
- `real_upper_bound` = `upper_bound` + 14 hours
But this might make the range too wild. On the other hand, I'm not sure if AI can understand how to properly apply the offset if we give it a concrete timezone.
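The grace-padding idea above can be sketched as follows. This is an illustrative helper, not project code; the -12/+14 hour offsets are the extreme real-world UTC offsets (UTC-12 to UTC+14), so a timezoneless query padded this way still covers the intended wall-clock range:

```python
from datetime import datetime, timedelta, timezone


def pad_time_range(
    lower_bound: datetime, upper_bound: datetime
) -> tuple[datetime, datetime]:
    # Widen the range by the extreme UTC offsets so a timezoneless archive
    # still matches events whose local time falls in the requested range.
    return lower_bound - timedelta(hours=12), upper_bound + timedelta(hours=14)


lo = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
hi = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
real_lo, real_hi = pad_time_range(lo, hi)
print(real_lo.isoformat())  # 2024-01-01T12:00:00+00:00
print(real_hi.isoformat())  # 2024-01-03T02:00:00+00:00
```

The trade-off noted above applies: the padded window is up to 26 hours wider than what the user asked for.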
Here's my opinion on each point you mentioned:

- Agreed. I wasn't aware of this; is there any documentation defining what specific KQL we are supporting?
- I don't have any background on auto-generated keys. Are they just some other fields in the returned response? In that case, I think the LLM may learn from the returned results, right?
- In addition to what Kirk said about file-level metadata, the current suite of APIs does not take in any file-level metadata (e.g., file name), so information about the file is not in the interest of the current release.
- I am having trouble understanding why the timezone is important here. Say the agent issues `substring_search("error")`, and in the response, each log message is accompanied by a timestamp, say Toronto time or Unix epoch. The LLM would learn what the format is, and issue the next substring search with a timestamp that is around the point of interest. For example, if the timestamp is Unix epoch, and the error message that the LLM is interested in has a timestamp of Unix epoch 1234, then the next request the LLM would issue is `substring_search("some other related keyword", 1000, 1500)`, because it realizes the timestamp is Unix epoch rather than a UTC timestamp.
- https://docs.yscope.com/clp/main/user-docs/reference-json-search-syntax.html
- Background. Auto-generated keys won't be returned in the response but they are query-able. Thinking about it, since we don't know what autogenerated keys the compressed logs will have, at this point, we probably can't say anything in the API. In the future, perhaps we can have better support.
- Regarding timezones, @LinZhihao-723 can you provide an example log input and output that demonstrates the issue? I'm a little unclear as well.
As discussed offline, we will leave the system prompt improvements for next PRs. @coderabbitai can u create two issues:
- The first one to keep track of updating this system prompt to cover the issues above.
- The second one to ask us to update CLP doc to state that the current CLP is timezoneless. Users should use timezoneless timestamp in queries.
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Create first issue for system prompt improvements
gh issue create \
  --title "Improve CLP MCP Server system prompt to cover KQL-specific guidance" \
  --body "## Context
This issue tracks improvements to the \`SYSTEM_PROMPT\` constant in \`components/clp-mcp-server/clp_mcp_server/server/constants.py\` to provide more comprehensive guidance for LLM query generation.

## Background
Discussion in PR #1401: https://github.com/y-scope/clp/pull/1401#discussion_r2434136722

## Improvements to address:
1. **KQL syntax specifics**: Mention CLP's specific KQL syntax
   - Use double quotes for string values
   - Escape special characters
   - Reference: https://docs.yscope.com/clp/main/user-docs/reference-json-search-syntax.html
2. **Auto-generated keys**: Explain handling of auto-generated keys that are queryable but not returned in responses
3. **File-level metadata**: Consider if/how to mention file-level metadata queries (pending clp-json support)
4. **Timestamp handling**:
   - Specify timestamp granularity (Unix epoch in milliseconds/nanoseconds)
   - Clarify which key to use for timestamps
   - Address timezone considerations (see related issue about timezoneless timestamps)
5. **Fuzzy match support**: Clarify that CLP doesn't support fuzzy matches (e.g., \"system error\" won't match \"system shutdown error\")
6. **Query validation**: Evaluate if the current point about syntactic correctness actually improves query accuracy

## Related
- PR #1401
- User: @20001020ycx" \
  --assignee LinZhihao-723

# Create second issue for CLP documentation update
gh issue create \
  --title "Document that CLP is timezoneless and requires timezoneless timestamps in queries" \
  --body "## Context
This issue tracks the need to update CLP documentation to clarify that CLP is timezoneless and users should use timezoneless timestamps in their queries.

## Background
Discussion in PR #1401: https://github.com/y-scope/clp/pull/1401#discussion_r2434136722

## Required changes:
Update CLP documentation to explicitly state:
- CLP is timezoneless (does not track timezone information in archives)
- Users should use timezoneless timestamps in their queries
- Clarify the timestamp format expected (Unix epoch in milliseconds/nanoseconds)
- Consider documenting best practices for handling timezone conversions client-side

## Rationale
Without clear documentation, users and LLMs may apply incorrect timezone offsets, resulting in queries that don't match the expected timestamp range. Making this explicit will help users understand how to properly query time-based data.

## Related
- PR #1401
- User: @20001020ycx" \
  --assignee LinZhihao-723
```
Length of output: 85
@LinZhihao-723, I've created the two issues as requested:

- Issue #1432 (Improve CLP MCP Server system prompt to cover KQL-specific guidance): tracks improvements to the `SYSTEM_PROMPT` constant covering KQL syntax specifics, auto-generated keys, file-level metadata, timestamp handling, fuzzy match support, and query validation.
- Issue #1433 (Document that CLP is timezoneless and requires timezoneless timestamps in queries): tracks the documentation update to clarify that CLP is timezoneless and users should use timezoneless timestamps in their queries.
Both issues have been assigned to you and include backlinks to this PR and the discussion for context.
@@ -0,0 +1,242 @@
```python
"""Session management for CLP MCP Server."""

import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, ClassVar

from paginate import Page

from . import constants


class PaginatedQueryResult:
    """Paginates the cached log entries returned from a query's response."""

    def __init__(self, result_log_entries: list[str], num_items_per_page: int) -> None:
        """
        :param result_log_entries: List of cached log entries to paginate.
        :param num_items_per_page: Number of log entries per page.
        :raise: ValueError if the number of cached results or `num_items_per_page` is invalid.
        """
        if len(result_log_entries) > constants.MAX_CACHED_RESULTS:
            err_msg = (
                "PaginatedQueryResult exceeds maximum allowed cached results:"
                f" {len(result_log_entries)} > {constants.MAX_CACHED_RESULTS}."
            )
            raise ValueError(err_msg)

        if num_items_per_page <= 0:
            err_msg = (
                f"Invalid num_items_per_page: {num_items_per_page}, it must be a positive integer."
            )
            raise ValueError(err_msg)

        self._num_items_per_page = num_items_per_page

        self._num_pages = (len(result_log_entries) + num_items_per_page - 1) // num_items_per_page
        self._result_log_entries = result_log_entries

    def get_page(self, page_index: int) -> Page | None:
        """
        :param page_index: Zero-based index, e.g., 0 for the first page.
        :return: A `Page` object for the specified page, or None if `page_index` is out of
            bounds.
        """
        # Convert zero-based to one-based
        page_number = page_index + 1
        if page_number <= 0 or self._num_pages < page_number:
            return None

        return Page(
            self._result_log_entries,
            page=page_number,
            items_per_page=self._num_items_per_page,
        )


@dataclass
class SessionState:
    """
    Represents the state of a user session.

    `SessionState` respects `FastMCP`'s asynchronous concurrency model, which is built on
    Python's asyncio runtime: asyncio achieves concurrency on a single thread by allowing
    tasks to yield control at `await` points. This means that when an `mcp.tool` API call
    executes an await expression, it gets suspended, and the event loop executes the next
    `mcp.tool` API call from received requests issued by clients.

    Therefore, `SessionState` does not need thread-safe operations because a session cannot
    be accessed from two threads at the same time.
    """

    _num_items_per_page: int
    _session_id: str
    _session_ttl_minutes: int

    _cached_query_result: PaginatedQueryResult | None = None
    _last_accessed: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    _is_instructions_retrieved: bool = False

    _GET_INSTRUCTIONS_NOT_RUN_ERROR: ClassVar[dict[str, str]] = {
        "Error": "Please call `get_instructions()` first to understand how to use this MCP server."
    }

    def cache_query_result_and_get_first_page(
        self,
        results: list[str],
    ) -> dict[str, Any]:
        """
        :param results: Log entries from the query to cache.
        :return: Forwards `SessionState.get_page_data`'s return values, or
            `_GET_INSTRUCTIONS_NOT_RUN_ERROR` if `get_instructions` has not been called in
            this session.
        """
        if self._is_instructions_retrieved is False:
            return self._GET_INSTRUCTIONS_NOT_RUN_ERROR.copy()

        self._cached_query_result = PaginatedQueryResult(
            result_log_entries=results, num_items_per_page=self._num_items_per_page
        )

        return self.get_page_data(0)
```
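The `_num_pages` computation above relies on integer ceiling division. A standalone sketch of that arithmetic (the helper name is illustrative, not part of the project):

```python
def num_pages(num_items: int, items_per_page: int) -> int:
    # Ceiling division without floats: ceil(a / b) == (a + b - 1) // b for b > 0.
    return (num_items + items_per_page - 1) // items_per_page


print(num_pages(25, 10))  # 3: two full pages of 10 and one page of 5
print(num_pages(20, 10))  # 2: item counts that divide evenly are not over-counted
print(num_pages(0, 10))   # 0: an empty result set has no pages
```

Note the last case: an empty result set yields zero pages, which matters for the `get_page` bounds check discussed in the review comments below the diff.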
Comment on lines 92 to 96

**Handle oversized caches instead of propagating `ValueError`.** Line 97 constructs a `PaginatedQueryResult`, which raises `ValueError` when the result set exceeds `MAX_CACHED_RESULTS`; the error should be caught and returned to the caller instead of propagating out of the tool call:

```diff
-        self._cached_query_result = PaginatedQueryResult(
-            result_log_entries=results, num_items_per_page=self._num_items_per_page
-        )
-
-        return self.get_page_data(0)
+        try:
+            self._cached_query_result = PaginatedQueryResult(
+                result_log_entries=results,
+                num_items_per_page=self._num_items_per_page,
+            )
+        except ValueError as exc:
+            return {"Error": str(exc)}
+
+        return self.get_page_data(0)
```
```python
    def get_instructions(self) -> str:
        """
        Gets a pre-defined "system prompt" that guides the LLM's behavior.

        :return: A string of "system prompt".
        """
        self._is_instructions_retrieved = True
        return constants.SYSTEM_PROMPT

    def get_page_data(self, page_index: int) -> dict[str, Any]:
        """
        Retrieves the n-th page of a paginated response with the paging metadata from the
        previous query.

        NOTE: This docstring must be synchronized with `get_nth_page`'s MCP tool call.

        :param page_index: Zero-based index, e.g., 0 for the first page.
        :return: A dictionary containing the following key-value pairs on success:
            - "items": A list of log entries in the requested page.
            - "total_pages": Total number of pages available from the query as an integer.
            - "total_items": Total number of log entries available from the query as an integer.
            - "num_items_per_page": Number of log entries per page.
            - "has_next": Whether a page exists after the returned one.
            - "has_previous": Whether a page exists before the returned one.
        :return: A dictionary with the following key-value pair on failures:
            - "Error": An error message describing the failure.
        :return: `_GET_INSTRUCTIONS_NOT_RUN_ERROR` if `get_instructions` has not been called
            in this session.
        """
        if self._is_instructions_retrieved is False:
            return self._GET_INSTRUCTIONS_NOT_RUN_ERROR.copy()

        if self._cached_query_result is None:
            return {"Error": "No previous paginated response in this session."}

        page = self._cached_query_result.get_page(page_index)
        if page is None:
            return {"Error": "Page index is out of bounds."}

        return {
            "items": list(page),
            "total_pages": page.page_count,
            "total_items": page.item_count,
            "num_items_per_page": page.items_per_page,
            "has_next": page.next_page is not None,
            "has_previous": page.previous_page is not None,
        }
```

Comment on lines 133 to 143

**Return an empty first page when there are zero cached results.** When a search yields no hits, the zero-length cache makes every `page_index` (including 0) fall out of bounds, so the client receives "Page index is out of bounds." instead of an empty page:

```diff
-        page = self._cached_query_result.get_page(page_index)
+        cached_result = self._cached_query_result
+
+        if cached_result._num_pages == 0:
+            if page_index != 0:
+                return {"Error": "Page index is out of bounds."}
+            return {
+                "items": [],
+                "total_pages": 0,
+                "total_items": 0,
+                "num_items_per_page": self._num_items_per_page,
+                "has_next": False,
+                "has_previous": False,
+            }
+
+        page = cached_result.get_page(page_index)
```

Comment on lines 130 to 144

**Return an empty page when the cached result set is empty.** Lines 135-149 treat every zero-length cache as "Page index is out of bounds." With zero results, requesting page 0 should instead return an empty page:

```diff
-        page = self._cached_query_result.get_page(page_index)
-        if page is None:
-            return {"Error": "Page index is out of bounds."}
-
-        return {
+        cached_result = self._cached_query_result
+
+        if cached_result._num_pages == 0:
+            if page_index != 0:
+                return {"Error": "Page index is out of bounds."}
+            return {
+                "items": [],
+                "total_pages": 0,
+                "total_items": 0,
+                "num_items_per_page": self._num_items_per_page,
+                "has_next": False,
+                "has_previous": False,
+            }
+
+        page = cached_result.get_page(page_index)
+        if page is None:
+            return {"Error": "Page index is out of bounds."}
+
+        return {
             "items": list(page),
             "total_pages": page.page_count,
             "total_items": page.item_count,
             "num_items_per_page": page.items_per_page,
             "has_next": page.next_page is not None,
             "has_previous": page.previous_page is not None,
         }
```
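The edge case flagged in the review comments above comes down to index arithmetic: with zero cached entries, the page count is 0, so even page 0 is reported out of bounds. A standalone reproduction of that bounds check (the helper name is illustrative, not project code):

```python
def page_exists(num_items: int, items_per_page: int, page_index: int) -> bool:
    # Mirrors PaginatedQueryResult.get_page's bounds check.
    num_pages = (num_items + items_per_page - 1) // items_per_page
    page_number = page_index + 1  # convert zero-based to one-based
    return 0 < page_number <= num_pages


print(page_exists(0, 10, 0))   # False: an empty cache rejects even the first page
print(page_exists(5, 10, 0))   # True: a partially filled first page is in bounds
print(page_exists(25, 10, 3))  # False: 25 items form 3 pages, so index 3 (page 4) is out
```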
```python
    def is_expired(self) -> bool:
        """:return: Whether the session has expired."""
        time_diff = datetime.now(timezone.utc) - self._last_accessed
        return time_diff > timedelta(minutes=self._session_ttl_minutes)

    def update_access_time(self) -> None:
        """Updates the last accessed timestamp."""
        self._last_accessed = datetime.now(timezone.utc)


class SessionManager:
    """
    Session manager for concurrent user sessions.

    `SessionManager` respects `FastMCP`'s asynchronous concurrency model, which is built on
    Python's asyncio runtime: asyncio achieves concurrency on a single thread by allowing
    tasks to yield control at `await` points. This means that when an `mcp.tool` API call
    executes an await expression, it gets suspended, and the event loop executes the next
    `mcp.tool` API call from received requests issued by clients.

    The operations on `sessions`, which maintains the concurrent user sessions, are performed
    by asynchronous tasks. The above concurrency model guarantees that operations on
    `sessions` are atomic because it cannot be accessed from two threads at the same time.
    """

    def __init__(self, session_ttl_minutes: int) -> None:
        """:param session_ttl_minutes: Session time-to-live in minutes."""
        self.sessions: dict[str, SessionState] = {}
        self._session_ttl_minutes = session_ttl_minutes
        self._cleanup_task: asyncio.Task | None = None

    async def start(self) -> None:
        """Starts the asynchronous cleanup task."""
        if self._cleanup_task is None:
            self._cleanup_task = asyncio.create_task(self._cleanup_loop())

    async def _cleanup_loop(self) -> None:
        """Cleans up all expired sessions periodically as an async task."""
        while True:
            await asyncio.sleep(constants.EXPIRED_SESSION_SWEEP_INTERVAL_SECONDS)
            self.cleanup_expired_sessions()

    def cleanup_expired_sessions(self) -> None:
        """Cleans up all expired sessions."""
        expired_sessions = [sid for sid, session in self.sessions.items() if session.is_expired()]

        for sid in expired_sessions:
            del self.sessions[sid]

    def cache_query_result_and_get_first_page(
        self, session_id: str, query_results: list[str]
    ) -> dict[str, Any]:
        """
        :param session_id:
        :param query_results: Log entries from the query to cache.
        :return: Forwards `SessionState.cache_query_result_and_get_first_page`'s return values.
        """
        session = self.get_or_create_session(session_id)

        return session.cache_query_result_and_get_first_page(results=query_results)

    def get_nth_page(self, session_id: str, page_index: int) -> dict[str, Any]:
        """
        Retrieves the n-th page of a paginated response with the paging metadata from the
        previous query.

        :param session_id:
        :param page_index: Index of the page to retrieve (zero-based, e.g., 0 is the first page).
        :return: Forwards `SessionState.get_page_data`'s return values.
        """
        session = self.get_or_create_session(session_id)

        return session.get_page_data(page_index)

    def get_or_create_session(self, session_id: str) -> SessionState:
        """
        Gets an existing session or creates a new one.

        :param session_id:
        :return: The `SessionState` object for the given `session_id`.
        """
        if session_id in self.sessions and self.sessions[session_id].is_expired():
            del self.sessions[session_id]

        if session_id not in self.sessions:
            self.sessions[session_id] = SessionState(
                constants.NUM_ITEMS_PER_PAGE, session_id, self._session_ttl_minutes
            )

        session = self.sessions[session_id]

        session.update_access_time()
        return session
```
Added an instruction for the test case.

@coderabbitai can u create an issue as a feature request to add tasks and workflows to run unit tests for all uv components?
@coderabbitai can u create an issue for us?
@LinZhihao-723, I've created issue #1431 for this feature request. The issue covers adding tasks and workflows to run unit tests for all uv components in the repository, with references back to this PR and comment thread.