fix: fix Chinese character escaping issue #1330
RiviaAzusa wants to merge 2 commits into langfuse:main
Conversation
|
Hi @RiviaAzusa , thank you for the contribution! If possible, could you add at least one test for this behavior in order to ensure not running into this problem again? Thanks! |
|
I really want this PR to be merged. |
|
I ran into the same problem, and Kimi K2 discovered the same patch. Any update on this? CJK full-text search in the traces list also breaks because of this issue. |
|
I really want this PR to be merged. |
|
@nimarb Hi, sorry for the late reply. I've added a test case now. 39f4d8d
|
Add the ensure_ascii=False parameter to all json.dumps calls so that Chinese and other non-ASCII characters are not escaped into \uXXXX format.
Modified files:
- langfuse/_client/attributes.py: core serialization functions
- langfuse/_utils/request.py: API request data serialization
- langfuse/_client/utils.py: debug output formatting
- langfuse/_task_manager/score_ingestion_consumer.py: score processing serialization
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
@MoonSangJin ping... |
|
Hi @chenyong , I'm not a maintainer of this repo so I can't merge PRs. |

Add the ensure_ascii=False parameter to all json.dumps calls to ensure non-ASCII characters (such as Chinese) are not escaped into \uXXXX format.
Files to modify:
langfuse/_client/attributes.py: Core serialization functions
langfuse/_utils/request.py: API request data serialization
langfuse/_client/utils.py: Debug output formatting
langfuse/_task_manager/score_ingestion_consumer.py: Score processing serialization
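A minimal sketch of the behavior this change targets, using only the standard library. The payload and variable names are illustrative assumptions and are not taken from the SDK. It also hints at why a plain substring search over the stored string can miss CJK text when the escaped form is kept.

```python
import json

# Illustrative payload (an assumption, not SDK data).
payload = {"input": "你好"}

# Default behavior: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)   # {"input": "\u4f60\u597d"}
print(readable)  # {"input": "你好"}

# A naive substring search only matches the unescaped form.
print("你好" in escaped)   # False
print("你好" in readable)  # True
```

Both strings decode to the same object with json.loads, so the difference is only in how the text is stored and displayed.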
Before Fix & After Fix:

Important
Add ensure_ascii=False to json.dumps calls to prevent non-ASCII character escaping in multiple files.
- Add ensure_ascii=False to json.dumps in attributes.py for core serialization functions.
- Add ensure_ascii=False to json.dumps in request.py for API request data serialization.
- Add ensure_ascii=False to json.dumps in utils.py for debug output formatting.
- Add ensure_ascii=False to json.dumps in score_ingestion_consumer.py for score processing serialization.
This description was created automatically for 843ed96. It will automatically update as commits are pushed.
Disclaimer: Experimental PR review
Greptile Summary
Updated On: 2025-09-09 06:09:40 UTC
This PR addresses a localization issue where non-ASCII characters (specifically Chinese characters) were being escaped to \uXXXX format in JSON serialization throughout the Langfuse Python SDK. The fix adds the ensure_ascii=False parameter to all json.dumps() calls across four key files to preserve Unicode characters in their original readable form.
The changes affect multiple serialization touchpoints:
All modified files use the existing EventSerializer class but now explicitly disable ASCII-only encoding. This ensures that when users trace LLM applications containing Chinese text (or other Unicode content), the data remains human-readable in the Langfuse UI, logs, and debug output. The change maintains JSON validity while improving the user experience for international developers.
Confidence score: 5/5
ensure_ascii=False parameter
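The claim that the change keeps the output valid JSON can be checked with a short round-trip sketch. SketchSerializer below is a hypothetical stand-in for a custom json.JSONEncoder subclass; it is not the SDK's EventSerializer, and the event contents are made up for illustration.

```python
import json
from datetime import datetime, timezone


class SketchSerializer(json.JSONEncoder):
    """Hypothetical encoder for illustration; not the SDK's EventSerializer."""

    def default(self, obj):
        # Serialize datetimes as ISO 8601 strings; defer everything else.
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)


# Illustrative event (an assumption, not real trace data).
event = {
    "output": "日本語のテキスト",
    "timestamp": datetime(2025, 9, 9, tzinfo=timezone.utc),
}

escaped = json.dumps(event, cls=SketchSerializer)
readable = json.dumps(event, cls=SketchSerializer, ensure_ascii=False)

# Both forms are valid JSON and decode to the same object.
assert json.loads(escaped) == json.loads(readable)

# The unescaped form contains non-ASCII characters, so it has to be
# encoded (UTF-8 here) before it is written to a socket or file.
body = readable.encode("utf-8")
assert json.loads(body) == json.loads(escaped)
```

One design consequence worth noting: with ensure_ascii=False the serialized string is no longer pure ASCII, so any code path that sends or stores it needs to use a Unicode-capable encoding such as UTF-8.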