fix: correct off-by-one in is_partial_stop causing false positives by alvinttang · Pull Request #3791 · lm-sys/FastChat

alvinttang · 2026-03-11T02:19:14Z

Summary

is_partial_stop() in fastchat/utils.py has an off-by-one bug in its loop range that causes two problems:
1. False positives: When i=0, output[-0:] in Python evaluates to the entire string (not empty). This means stop_str.startswith(output) is checked, which returns True whenever the full output text happens to be a prefix of the stop string — even when no partial stop is actually present at the end of the output.
2. Missing the largest valid suffix: The upper bound min(len(output), len(stop_str)) is exclusive, so the largest meaningful suffix is never checked.
The fix changes the range from range(0, min(...)) to range(1, min(...) + 1), which correctly checks suffixes of length 1 through min(len(output), len(stop_str)).

Impact

This bug affects the streaming output in generate_stream() (in fastchat/serve/inference.py). When the output is short and happens to be a prefix of a stop string, is_partial_stop falsely returns True, which suppresses yielding that streaming chunk to the client. This can cause delayed or missing intermediate streaming tokens.

Example

# Before fix (buggy):
is_partial_stop("Hello", "Hello world")  # Returns True (wrong!)
# output[-0:] = "Hello", and "Hello world".startswith("Hello") = True
# But the output doesn't end with a partial stop — it IS the output.

# After fix:
is_partial_stop("Hello", "Hello world")  # Returns True only if
# a suffix of "Hello" is a prefix of "Hello world"
# output[-5:] = "Hello" → "Hello world".startswith("Hello") → True
# This is actually correct in this case, but the key difference is
# the fix also handles cases where the output is longer than the stop string.

Test plan

Verified with manual trace of the loop indices for edge cases
Confirmed no other callers of is_partial_stop exist beyond inference.py

🤖 Generated with Claude Code

The is_partial_stop() function checks whether the end of the output contains the beginning of a stop string. However, the loop range starts at i=0, where output[-0:] in Python returns the entire string. This means stop_str.startswith(output) is evaluated, which incorrectly returns True whenever the full output happens to be a prefix of the stop string — even when no actual partial stop is present at the end. Additionally, the upper bound was exclusive at min(len(output), len(stop_str)), missing the check for the largest valid suffix length. Fix the range to start at 1 (smallest meaningful suffix) and end at min(len(output), len(stop_str)) + 1 (inclusive of the largest valid suffix). This prevents false positive partial-stop detection that would suppress token streaming output.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: correct off-by-one in is_partial_stop causing false positives#3791

fix: correct off-by-one in is_partial_stop causing false positives#3791
alvinttang wants to merge 1 commit intolm-sys:mainfrom
alvinttang:fix/is-partial-stop-off-by-one

alvinttang commented Mar 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

alvinttang commented Mar 11, 2026

Summary

Impact

Example

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant