
[Bug report] The plugin process hangs with high CPU usage when using advanced Gemini models (e.g., gemini-2.5-pro) #495

@kelryry

Description

Environment

  • Plugin Version: v2.4.1
  • OS: Arch Linux
  • Calibre Version: 8.7
  • Python Version: 3.13.7

Bug Description

When using the Ebook-Translator plugin and selecting newer Gemini series models (e.g., gemini-2.5-pro or gemini-2.5-flash) as the translation engine, the translation process hangs.

Symptoms:

  • The plugin UI shows that it is translating, but no translated text is ever output.
  • The system's task manager shows a CPU core spiking to 100% utilization by the Calibre process, and it remains there indefinitely.
  • The application does not crash, but the process must be terminated by sending a kill signal.
  • The translation works correctly when using certain other Gemini models like gemini-2.5-flash-lite, which does not seem to have a "thinking" phase.

It appears this issue is specific to the newer Gemini models that feature a "thinking" process and utilize a streaming response.

Root Cause Analysis

After investigating, I found that the root cause of this bug lies in the _parse_stream method in engines/google.py. This method is responsible for parsing the streaming response from the Gemini API, but its current implementation is not robust enough to handle the newer API's response patterns, which leads to logical failures in two scenarios:

  1. Infinite Loop Causing High CPU Usage: The original while True loop relies exclusively on parsing a JSON field like {"finishReason": "STOP"} from the stream to exit. However, the newer Gemini API sometimes closes the connection directly after the stream is complete without sending a final data chunk containing the finishReason. In this case, response.readline() continuously returns an empty bytes object (b''). Since the code does not check for this condition, the loop never terminates, resulting in an infinite loop and 100% CPU usage.

  2. Interrupted or No Translation Output: A naive attempt to fix the above issue might be to exit the loop upon reading an empty line. This approach is flawed because the Gemini API sends empty "keep-alive" lines (e.g., b'\r\n') between valid data chunks (data: {...}). When these lines are processed with .strip(), they become empty strings (''). If the code interprets this as the end of the stream, it breaks the loop prematurely and the translation is cut off after the first few chunks, which manifests as incomplete translations or, more often, no output at all. A minimal simulation of both failure modes is shown below.
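
To make both failure modes concrete, here is a minimal, self-contained simulation (the fake_response data and the naive_parse function are illustrative only, not the plugin's code):

# Simulated Gemini stream: data chunks separated by a keep-alive blank
# line, with no final chunk carrying a finishReason.
import io

fake_response = io.BytesIO(
    b'data: {"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}\r\n'
    b'\r\n'
    b'data: {"candidates": [{"content": {"parts": [{"text": " world"}]}}]}\r\n'
)

def naive_parse(response):
    # Naive fix from scenario 2: treat any empty stripped line as the end.
    while True:
        line = response.readline().decode('utf-8').strip()
        if not line:
            # Triggers on the keep-alive b'\r\n', not only on the real end
            # of the stream, so the second data chunk is never read.
            break
        print('chunk:', line)

naive_parse(fake_response)  # prints only the "Hello" chunk

Conversely, a loop that keeps reading until it sees a finishReason field would never break here: once the simulated stream is exhausted, readline() returns b'' forever, which is exactly the 100% CPU spin described in scenario 1.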

Proposed Solution

To definitively fix this issue, the _parse_stream method needs to be rewritten to correctly handle three distinct scenarios: the physical end of the stream (the connection is closed), keep-alive empty lines, and the explicit end-of-stream signal (finishReason).

It is recommended to replace the entire _parse_stream method in engines/google.py with the following robust version:

def _parse_stream(self, response):
    while True:
        try:
            # 1. First, read the raw bytes of a line.
            raw_line_bytes = response.readline()

            # 2. [Fix for hang] If an empty bytes object is returned, the
            #    connection has been physically closed. This is the most
            #    reliable exit condition.
            if not raw_line_bytes:
                break

            # 3. After confirming the stream is not closed, decode and strip the line.
            line = raw_line_bytes.decode('utf-8').strip()

            # 4. [Fix for interruption] If the line is empty after stripping
            #    (i.e., it was an empty line like b'\r\n'), ignore it and
            #    continue to the next iteration.
            if not line:
                continue

        except IncompleteRead:
            continue
        except Exception as e:
            raise Exception(
                _('Can not parse returned response. Raw data: {}')
                .format(str(e)))

        if line.startswith('data:'):
            item = json.loads(line[len('data:'):].strip())
            candidate = item['candidates'][0]
            content = candidate['content']
            if 'parts' in content:
                for part in content['parts']:
                    yield part['text']
            
            # 5. [Improvement] If the API explicitly sends any finish reason, also exit the loop.
            if candidate.get('finishReason'):
                break
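
Note that this replacement relies only on names the existing method in engines/google.py presumably already uses (json, IncompleteRead from http.client, and the plugin's _ translation helper), so no new module-level imports should be required.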

This updated version handles the streaming responses of the Gemini 2.5 series models correctly, resolving both the process hang and the interrupted or missing translation output.

I hope this analysis and the provided code help you in fixing this bug.
Thank you for developing this excellent plugin.

Edited: After further analysis, although the code above no longer causes the high CPU usage, my analysis of the HTTP request handling is still not correct. I strongly recommend that the developers use the google-genai library provided by Google, which already handles all of these problems.
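
For reference, here is a minimal sketch of what streaming could look like with that SDK, based on its documented generate_content_stream interface (the API key, model name, and prompt are placeholders; the actual integration into the plugin's engine class is not shown):

# Minimal sketch using the google-genai SDK (pip install google-genai).
# Placeholder values only; this is not the plugin's actual engine code.
from google import genai

client = genai.Client(api_key='YOUR_API_KEY')

# The SDK handles chunked transfer, keep-alive lines and end-of-stream
# detection internally, so no manual readline() parsing is needed.
for chunk in client.models.generate_content_stream(
    model='gemini-2.5-pro',
    contents='Translate the following text into French: Hello, world.',
):
    if chunk.text:
        print(chunk.text, end='')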
