Error checks on network calls, try/catch, etc #440
Replies: 2 comments
-
Hi @Joshfindit, thank you for the suggestion! Could you provide some examples, were you feel like additional errors would provide value? There's already quite a few specific exception types in
We already do
I already added a check for this in
Have you encountered an example of this happening? From my testing the Innertube API will return a JSON error response if something went wrong. We are already checking that the returned JSON contains the fields we are expecting and throwing exceptions accordingly. So what cases do you see unchecked here?
As mentioned above, I think this is already happening sufficiently, but let me know if you feel like there are unchecked cases! Generally speaking it's up to you whether you do a long running, or individual branches on your fork, but it's definitely easier for me if you create PRs for individual changes when adding multiple new error types (unless they depend on each other of course). I converted this into a discussion, as it's not really about a specific issue! |
Beta Was this translation helpful? Give feedback.
-
Thanks for the detailed response and for converting this to a discussion. You're absolutely right that there's already solid error handling in many places. Let me be more specific about the gaps I've identified: XML Parsing the response body: The primary uncovered case is in for xml_element in ElementTree.fromstring(raw_data) Specific scenarios that would crash:
Current flow that can crash: # In Transcript.fetch()
response = self._http_client.get(self._url)
snippets = _TranscriptParser(preserve_formatting=preserve_formatting).parse(
_raise_http_errors(response, self.video_id).text, # ← HTTP-level check only
) In all those cases, wrapping it in something that gives an error message like "YouTube returned an unexpected response body. This may be due to your IP being blocked or to some other measure" would help users understand what's going on so they can review the documentation.
Proposed Solution: Following your existing def _raise_content_errors(raw_data: str, video_id: str, content_type: str = "xml") -> str:
"""
Validates content format and raises appropriate exceptions for invalid data.
:param raw_data: The raw content to validate
:param video_id: Video ID for error context
:param content_type: Type of content expected ("xml", "json")
:return: The validated raw_data
"""
if not raw_data or not raw_data.strip():
raise YouTubeDataUnparsable(video_id, "Empty response content")
if content_type == "xml":
try:
# Validate XML can be parsed
ElementTree.fromstring(raw_data)
except Exception as xml_error:
raise YouTubeDataUnparsable(video_id, f"Invalid XML content") from xml_error
elif content_type == "json":
try:
import json
json.loads(raw_data)
except json.JSONDecodeError as json_error:
raise YouTubeDataUnparsable(video_id, f"Invalid JSON content") from json_error
except Exception as json_error:
raise YouTubeDataUnparsable(video_id, f"JSON parsing failed") from json_error
return raw_data With this function, it can be used anywhere def _fetch_innertube_data(self, video_id: str, api_key: str) -> Dict:
response = self._http_client.post(
INNERTUBE_API_URL.format(api_key=api_key),
json={
"context": INNERTUBE_CONTEXT,
"videoId": video_id,
},
)
validated_content = _raise_content_errors(
_raise_http_errors(response, video_id).text,
video_id,
"json"
)
return json.loads(validated_content) This maintains your existing error handling philosophy while catching the XML parsing edge cases. The InnerTube refactor (great work btw!) made caption metadata discovery more robust, but the final transcript fetching still uses the old XML endpoints and parsing. Any malformed XML crashes the entire application instead of providing users with a helpful YouTubeDataUnparsable exception. Happy to implement this as a focused PR if you think it adds value! |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Related to #429 but happened to me in a few other situations (accessing from a VPS without a proxy)
I think it would be helpful to add checks and catches to any network calls:
<!DOCTYPE html>
when the expected reply is JSON (this is likely in the case of a text/HTML warning from YT when they make assumptions that it's someone writing a script)If it's welcome, can I create a long-running feature branch for network error handling? I'll wrap the calls one at a time, keep it synced with main, and you can merge my progress incrementally.
Beta Was this translation helpful? Give feedback.
All reactions