# Fix: Occasional 502 Internal Server Error Returning Raw HTML via Python SDK #1923 #6
Walkthrough
Enhances error handling in LingoDotDevEngine.
Estimated code review effort: 3 (Moderate) | ~20 minutes
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/test_502_handling.py`:
- Around line 7-60: Update both tests to use the engine as an async context
manager so the underlying httpx.AsyncClient is closed: replace direct
instantiation of LingoDotDevEngine in test_502_html_handling and
test_500_json_handling with "async with LingoDotDevEngine(config) as engine"
(leveraging the class's __aenter__/__aexit__ support) before calling
engine.localize_text, ensuring the client is cleaned up and preventing
ResourceWarnings.
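The cleanup pattern requested above can be sketched without the real SDK. `FakeEngine` and `FakeAsyncClient` below are hypothetical stand-ins (not the actual `LingoDotDevEngine` or `httpx.AsyncClient`); the sketch only assumes, as the review states, that the engine supports `__aenter__`/`__aexit__` and closes its client on exit.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async HTTP client; only tracks closure."""

    def __init__(self) -> None:
        self.is_closed = False

    async def aclose(self) -> None:
        self.is_closed = True


class FakeEngine:
    """Hypothetical stand-in for an engine with async context manager support."""

    def __init__(self) -> None:
        self.client = FakeAsyncClient()

    async def __aenter__(self) -> "FakeEngine":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs even when the body raises, so the client is always closed.
        await self.client.aclose()

    async def localize_text(self, text: str) -> str:
        # Simulates the sanitized server error discussed in this PR.
        raise RuntimeError("Server error (502): Bad Gateway.")


async def run_test() -> bool:
    async with FakeEngine() as engine:
        try:
            await engine.localize_text("hello")
        except RuntimeError:
            pass  # The error is expected; cleanup is what is being checked.
    return engine.client.is_closed


closed_after_exit = asyncio.run(run_test())
```

In a real test, the `async with` block would wrap the actual engine instance in the same way, so the client is released whether or not the call raises.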
🧹 Nitpick comments (1)
src/lingodotdev/engine.py (1)
187-198: Avoid broad exception swallowing in error-detail parsing.
Catching `Exception` with `pass` can mask unexpected failures, and the code may append non-string or empty error values. Narrow the exception to JSON decode errors and only append a non-empty string; apply the same refinement to the other 5xx handlers.
Suggested refactor
```diff
 if 500 <= response.status_code < 600:
     error_details = ""
     try:
-        error_json = response.json()
-        if isinstance(error_json, dict) and "error" in error_json:
-            error_details = f" {error_json['error']}"
-    except Exception:
-        pass
+        error_json = response.json()
+    except ValueError:
+        error_json = None
+
+    if isinstance(error_json, dict):
+        error_text = error_json.get("error")
+        if isinstance(error_text, str) and error_text.strip():
+            error_details = f" {error_text}"
```
f903312 to 3feaef3
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/test_502_handling.py`:
- Around line 32-38: The test currently expects HTML preview leakage; update the
assertions in tests/test_502_handling.py to enforce sanitization: use assert
"Response:" not in error_msg and assert "<html>" not in error_msg (invert the
two checks), and remove or relax the truncation-specific assertion (the assert
comparing len(error_msg) to len(html_body) should be removed or replaced with a
non-dependent length check) so the test no longer requires an HTML preview;
refer to the variables error_msg and html_body to locate the assertions to
change.
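A minimal sketch of the inverted assertions follows. `build_server_error` is a hypothetical helper that mimics the sanitized message format quoted elsewhere in this review; it is an assumption for illustration, not the SDK's actual code path.

```python
# Simulated upstream proxy response body (e.g. from a gateway error page).
html_body = "<html><body><h1>502 Bad Gateway</h1></body></html>" * 20


def build_server_error(status_code: int, reason_phrase: str, body: str) -> str:
    """Hypothetical formatter: the body is deliberately left out of the message."""
    return (
        f"Server error ({status_code}): {reason_phrase}. "
        "This may be due to temporary service issues."
    )


error_msg = build_server_error(502, "Bad Gateway", html_body)

# Sanitization checks: no response preview and no HTML in the message.
assert "Response:" not in error_msg
assert "<html>" not in error_msg
# Status information should still be present for the caller.
assert "502" in error_msg
```

Note that there is no assertion comparing `len(error_msg)` to `len(html_body)`, so the test no longer depends on a truncated HTML preview being present.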
1ac3969 to 94c68c4
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/lingodotdev/engine.py`:
- Around line 84-113: Update LingoDotDevEngine._safe_parse_json to also catch
UnicodeDecodeError in addition to json.JSONDecodeError, use
LingoDotDevEngine._truncate_response to build the preview, and raise the same
RuntimeError message while chaining the original exception (using "from e") so
the UnicodeDecodeError context is preserved; keep the error text/format
identical but ensure the except clause handles both exception types and
references the original exception variable when raising.
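One way the requested change could look is sketched below as standalone functions. The names `safe_parse_json` and `truncate_response`, the 200-character limit, and the error text are assumptions for illustration; the real `_safe_parse_json` and `_truncate_response` methods may differ.

```python
import json


def truncate_response(text: str, max_length: int = 200) -> str:
    """Hypothetical preview helper; the 200-character limit is an assumption."""
    if len(text) <= max_length:
        return text
    return text[:max_length] + "..."


def safe_parse_json(content: bytes):
    """Parse JSON bytes, converting decode and parse failures to RuntimeError."""
    try:
        return json.loads(content.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError) as e:
        # Decode leniently so building the preview cannot itself fail.
        preview = truncate_response(content.decode("utf-8", errors="replace"))
        # "from e" keeps the original exception available as __cause__.
        raise RuntimeError(
            f"Failed to parse API response as JSON. Response: {preview}"
        ) from e
```

Chaining with `from e` means a caller inspecting the traceback can still tell whether the failure was a decoding problem or malformed JSON.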
94c68c4 to 99957f3
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/lingodotdev/engine.py`:
- Around line 228-231: There's a duplicated "raise RuntimeError(" token causing
a SyntaxError in the response error-handling block; remove the extra "raise
RuntimeError(" so only one raise statement remains and ensure the formatted
message uses the existing response, response.status_code, response.reason_phrase
and error_details variables (the string should be a single f-string passed to
raise RuntimeError). Locate the error block that references
response.status_code, response.reason_phrase and error_details and delete the
stray/duplicate raise so the exception is raised exactly once with the combined
message.
- Around line 483-486: Remove the duplicated "raise RuntimeError(" line in the
error handling block so the code contains a single raise statement that
constructs the full error message; update the remaining raise to be a single
call like raise RuntimeError(f"Server error ({response.status_code}):
{response.reason_phrase}.{error_details} This may be due to temporary service
issues.") ensuring the parentheses and f-string concatenation are correct
(locate the block around the existing duplicate raise in the response/error
handling code).
🧹 Nitpick comments (1)
src/lingodotdev/engine.py (1)
518-529: Consider extracting 5xx error formatting into a helper.
The error-details parsing and message formatting logic is now duplicated across `_localize_chunk`, `recognize_locale`, and `whoami`. A small helper (e.g., `_format_server_error(response)`) would reduce repetition and keep behavior consistent.
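A rough sketch of such a helper, written as a module-level function for brevity. `FakeResponse` is a hypothetical stand-in exposing only the attributes used here (`status_code`, `reason_phrase`, `json()`); the message format follows the string quoted in this review, and the narrowed `ValueError` handling follows the earlier nitpick.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""

    status_code: int
    reason_phrase: str
    body: str

    def json(self):
        # json.JSONDecodeError is a subclass of ValueError.
        return json.loads(self.body)


def format_server_error(response) -> str:
    """Build a sanitized 5xx message; never includes the raw response body."""
    error_details = ""
    try:
        error_json = response.json()
    except ValueError:
        error_json = None

    if isinstance(error_json, dict):
        error_text = error_json.get("error")
        # Only append a non-empty string error value.
        if isinstance(error_text, str) and error_text.strip():
            error_details = f" {error_text}"

    return (
        f"Server error ({response.status_code}): "
        f"{response.reason_phrase}.{error_details} "
        "This may be due to temporary service issues."
    )
```

Each of the three call sites could then reduce to a single `raise RuntimeError(...)` around the helper's return value, which also removes the chance of the duplicated `raise` lines flagged above.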
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lingodotdev/engine.py (1)
222-246: Potential issue | Major
Guard error-path response preview decoding to avoid unhandled UnicodeDecodeError.
`response.text` at line 223 (and 478) is accessed immediately after checking `response.is_success`, before the status code branches. If the response bytes cannot be decoded (due to invalid or mislabeled encoding), `UnicodeDecodeError` will escape and bypass the error handling logic, preventing proper error reporting to the user. The codebase already handles this pattern in `_safe_parse_json` (lines 105–114) by catching and decoding with `errors="replace"`.
Defer computing `response_preview` until needed within the 400/other-status branches, and use the same fallback decoding pattern applied in `_safe_parse_json`.
Proposed fix
```diff
@@
-        if not response.is_success:
-            response_preview = self._truncate_response(response.text)
-            if 500 <= response.status_code < 600:
+        if not response.is_success:
+            if 500 <= response.status_code < 600:
                 error_details = ""
                 try:
                     error_json = response.json()
                     if isinstance(error_json, dict) and "error" in error_json:
                         error_details = f" {error_json['error']}"
                 except Exception:
                     pass
@@
                 raise RuntimeError(
                     f"Server error ({response.status_code}): {response.reason_phrase}.{error_details} "
                     "This may be due to temporary service issues."
                 )
-            elif response.status_code == 400:
+            try:
+                response_text = response.text
+            except UnicodeDecodeError:
+                response_text = response.content.decode("utf-8", errors="replace")
+            response_preview = self._truncate_response(response_text)
+            elif response.status_code == 400:
                 raise ValueError(
                     f"Invalid request ({response.status_code}): {response.reason_phrase}. "
                     f"Response: {response_preview}"
                 )
@@
-        if not response.is_success:
-            response_preview = self._truncate_response(response.text)
-            if 500 <= response.status_code < 600:
+        if not response.is_success:
+            if 500 <= response.status_code < 600:
                 error_details = ""
                 try:
                     error_json = response.json()
                     if isinstance(error_json, dict) and "error" in error_json:
                         error_details = f" {error_json['error']}"
                 except Exception:
                     pass
@@
                 raise RuntimeError(
                     f"Server error ({response.status_code}): {response.reason_phrase}.{error_details} "
                     "This may be due to temporary service issues."
                 )
+            try:
+                response_text = response.text
+            except UnicodeDecodeError:
+                response_text = response.content.decode("utf-8", errors="replace")
+            response_preview = self._truncate_response(response_text)
             raise RuntimeError(
                 f"Error recognizing locale ({response.status_code}): {response.reason_phrase}. "
                 f"Response: {response_preview}"
             )
```
Fix: Occasional 502 Internal Server Error Returning Raw HTML via Python SDK #1923
Description
This PR fixes the issue where the Python SDK raises a `RuntimeError` containing raw HTML when a 502 Bad Gateway error is received from the API (typically from an upstream proxy like Nginx or Cloudflare). This behavior flooded logs with HTML content and made exception handling difficult.
Implementation Details
`src/lingodotdev/engine.py` now sanitizes error messages for 5xx responses (specifically in `_localize_chunk`, `recognize_locale`, and `whoami`).
Type of Change
Testing
Verification Steps
I created a reproduction script, `reproduce_502_error.py`, that mocks a 502 Bad Gateway response with an HTML body to verify the fix.
Before Fix:
After Fix:
I also added a permanent unit regression test in `tests/test_502_handling.py`.
Checklist
Commit Message Format
fix: sanitize 502 HTML responses from error messages
Summary by CodeRabbit
Bug Fixes
Tests