Description
Describe the bug
I'm experiencing the same infinite loop issue with nextForwardToken described in #4472, but I've identified a specific case that triggers this behavior: querying log groups with expired logs (logs older than the retention period).
Root Cause Identified
This infinite loop occurs when:
- Log group has a retention period set (e.g., 1 year)
- Querying logs that are older than the retention period (expired logs)
- get_log_events continues returning new nextForwardToken values indefinitely
- No events are returned in the response, but the API keeps providing new tokens
Comparison with AWS CLI
The AWS CLI handles this scenario correctly:
- aws logs get-log-events for expired logs returns no events and terminates properly
- Does not get stuck in infinite pagination loops
This suggests the issue may be in how boto3/botocore implements the pagination logic versus the AWS CLI implementation.
Impact
- Applications get stuck in infinite loops when processing old log streams
- Unnecessary API calls and costs
- Resource consumption (CPU, memory) from endless pagination
- Difficult to implement robust log processing without workarounds
Workaround
Current workaround is to implement additional checks:
consecutive_empty_responses = 0
max_empty_responses = 3  # Adjust based on your needs
while next_token != response.get("nextForwardToken", ""):
    # ... existing code ...
    if not response["events"]:
        consecutive_empty_responses += 1
        if consecutive_empty_responses >= max_empty_responses:
            print("Breaking due to consecutive empty responses (likely expired logs)")
            break
    else:
        consecutive_empty_responses = 0  # Reset counter
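For anyone who wants the same guard as a reusable function, here is a minimal sketch. The name fetch_log_events and its parameters are hypothetical and not part of boto3; the client argument is assumed to be a boto3 CloudWatch Logs client (or anything exposing a compatible get_log_events method). It stops on a missing or repeated token, and otherwise gives up after a configurable number of consecutive empty pages.

```python
def fetch_log_events(client, log_group, log_stream, max_empty_pages=3, limit=1000):
    """Collect events from one stream with a guard against endless empty pages.

    Hypothetical helper: `client` is assumed to behave like
    boto3.client("logs"); nothing here is provided by boto3 itself.
    """
    params = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startFromHead": True,
        "limit": limit,
    }
    events = []
    empty_pages = 0
    prev_token = None  # token sent with the current request (None on the first call)

    while True:
        response = client.get_log_events(**params)
        page = response.get("events", [])
        events.extend(page)
        # Count consecutive empty pages; any page with events resets the counter
        empty_pages = 0 if page else empty_pages + 1

        token = response.get("nextForwardToken")
        # End of stream: no token, or the same token that was just sent
        if not token or token == prev_token:
            break
        # Guard: the service keeps issuing fresh tokens but no events
        if empty_pages >= max_empty_pages:
            break

        params["nextToken"] = token
        prev_token = token

    return events
```

The threshold is a trade-off: empty pages can also occur before the real end of a stream, so a value that is too low may cut off streams that still have events further along.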
This issue affects production systems that process historical logs and need reliable pagination behavior. A fix or clear documentation would be greatly appreciated.
Regression Issue
- Select this option if this issue appears to be a regression.
Expected Behavior
According to AWS CloudWatch Logs documentation, when no more events are available, the API should either:
- Stop returning nextForwardToken, OR
- Return the same nextForwardToken value to indicate pagination is complete
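Expressed as code, the two termination conditions listed above amount to the following check. This is only an illustration of the expectation stated in this issue; is_end_of_stream is a hypothetical name, not a boto3 function.

```python
def is_end_of_stream(request_token, response):
    """Return True if the response signals that forward pagination is done.

    request_token is the nextToken sent with the request (None on the
    first call); response is the dict returned by get_log_events.
    """
    next_token = response.get("nextForwardToken")
    # Done when no token comes back, or when the token did not advance
    return next_token is None or next_token == request_token
```

With the responses shown under Current Behavior, where every call returns a different token, this check never returns True, which is why a loop built on it alone does not terminate.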
Expected (similar to AWS CLI):
Iteration: 1 - Next token: None
No events found in iteration 1
Total iterations: 1
Total events: 0
Duration: 0.15 seconds
Current Behavior
When querying expired logs (older than retention period):
- get_log_events returns {'events': [], 'nextForwardToken': 'new_token_value'}
- Each subsequent call returns a different nextForwardToken
- This creates an infinite loop because the condition next_token != response.get("nextForwardToken", "") never becomes false
- No events are ever returned, but pagination continues indefinitely
Iteration: 1 - Next token: None
No events found in iteration 1
Iteration: 2 - Next token: f/38843677690954359307332779238523611073553355858145312768/s
No events found in iteration 2
Iteration: 3 - Next token: f/38843871704337782941158364500062913595036750815702024192/s
No events found in iteration 3
... (continues indefinitely)
Reproduction Steps
- Create a CloudWatch log group with 1-year retention
- Wait for logs to expire (or use existing expired logs)
- Attempt to query expired logs using get_log_events with pagination:
import boto3
import time

client = boto3.client('logs', region_name='us-east-1')
params = {
    'logGroupName': '/your/log/group',
    'logStreamName': 'expired-log-stream',
    'startFromHead': True,
    'limit': 1000
}
events = []
response = {}
next_token = None
count = 0
start_time = time.time()

# This loop will run indefinitely for expired logs
while next_token != response.get("nextForwardToken", ""):
    count += 1
    next_token = response.get("nextForwardToken")
    print(f"Iteration: {count} - Next token: {next_token}")
    if next_token:
        params["nextToken"] = next_token
    response = client.get_log_events(**params)
    if response["events"]:
        print(f"Events found: {len(response['events'])}")
        events.extend(response["events"])
    else:
        print(f"No events found in iteration {count}")
    # Safety break for demonstration (remove to see infinite loop)
    if count > 20:
        print("Breaking to prevent infinite loop...")
        break

print(f"Total iterations: {count}")
print(f"Total events: {len(events)}")
print(f"Duration: {time.time() - start_time:.2f} seconds")
Possible Solution
- Fix in boto3/botocore: Modify pagination logic to detect when expired logs are being queried and handle accordingly
- AWS API fix: Address this at the CloudWatch Logs API level to match AWS CLI behavior
- Documentation update: Clearly document this edge case and provide recommended workarounds
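To illustrate what "detect when expired logs are being queried" could look like on the caller side, here is a rough sketch that compares a stream's last event timestamp with the log group's retention period before paginating. stream_is_expired is a hypothetical helper; it assumes that describe_log_groups reports retentionInDays and describe_log_streams reports lastEventTimestamp for the stream, and that the exact-name match appears in the first page of each describe call. I have not verified that skipping such streams covers every case that triggers the loop.

```python
import time

MS_PER_DAY = 24 * 60 * 60 * 1000


def stream_is_expired(client, log_group, log_stream, now_ms=None):
    """Best-effort check: is the stream's newest event older than retention?

    Hypothetical helper; `client` is assumed to behave like
    boto3.client("logs"). Returns False whenever the answer is unknown.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)

    groups = client.describe_log_groups(logGroupNamePrefix=log_group)["logGroups"]
    group = next((g for g in groups if g["logGroupName"] == log_group), None)
    retention_days = group.get("retentionInDays") if group else None
    if retention_days is None:
        # No retention configured (or group not found): nothing to infer
        return False

    streams = client.describe_log_streams(
        logGroupName=log_group, logStreamNamePrefix=log_stream
    )["logStreams"]
    stream = next((s for s in streams if s["logStreamName"] == log_stream), None)
    last_event_ms = stream.get("lastEventTimestamp") if stream else None
    if last_event_ms is None:
        return False

    # Expired when the newest event falls before the retention cutoff
    return last_event_ms < now_ms - retention_days * MS_PER_DAY
```

A caller could skip get_log_events entirely when this returns True. Since lastEventTimestamp may lag behind recent ingestion, the result is best treated as a hint rather than a guarantee.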
Additional Information/Context
Related Issues
- Original issue: CloudWatch logs get_log_events returns nextForwardToken even though no more events appear AND limit is not respected #4472
- Similar pagination issues: list_command_invocations always returns a NextToken and repeats data #3791, GetLogEvents forward/backward paginators #2411
Questions
- Is this considered a bug in the AWS CloudWatch Logs API itself?
- Should boto3 implement special handling for this scenario?
- Are there plans to align boto3 behavior with AWS CLI for this case?
SDK version used
boto3==1.40.4
Environment details (OS name and version, etc.)
macOS 15.6