fix AssertError for newer but lower message idx by snubba · Pull Request #194 · bakwc/PySyncObj

snubba · 2026-02-23T13:13:23Z

Lines 975-978:

idx = message['log_idx']
term = message['log_term']
assert idx > self.__raftLastApplied          # ← crashes here
self.__commandsWaitingCommit[idx].append((term, callback))

The Race Condition

Key triggering scenario — leader reconnects after restart with stale socket data:

1. Follower A sends apply_command to Leader B, so commandsWaitingReply[N] = callback.
2. Leader B logs the command at log_idx=X and sends apply_command_response. The message sits in the OS socket buffer and has not yet been read by A.
3. Leader B restarts, and a new TCP connection is established from B to A.
4. In _onIncomingMessageReceived (transport.py:368), the new connection replaces the old one (self._connections[node] = new_conn), but the old socket stays registered with the poller.
5. The new connection delivers fresh append_entries with a snapshot, which triggers __loadDumpFile(clearJournal=True), and __raftLastApplied jumps past X.
6. Crucially, because the reconnecting node has the same address (same Node.id), the check self.__raftLeader != node on line 886 evaluates to False. __onLeaderChanged() is never called, so commandsWaitingReply[N] is not cleared.
7. The old socket then delivers its buffered apply_command_response with log_idx=X.
8. The callback is found in commandsWaitingReply, X > __raftLastApplied evaluates to False, and the assert raises AssertionError.
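
The sequence above can be replayed with a toy model. This is only a sketch: ToyFollower and its method names are hypothetical stand-ins for the follower-side bookkeeping, not PySyncObj code, and the values 10 and 50 are arbitrary choices for X and the post-snapshot applied index.

```python
from collections import defaultdict


class ToyFollower:
    """Hypothetical model of the follower bookkeeping; not PySyncObj code."""

    def __init__(self):
        self.raft_last_applied = 0
        self.commands_waiting_reply = {}                   # request id -> callback
        self.commands_waiting_commit = defaultdict(list)   # log idx -> [(term, callback)]

    def send_apply_command(self, request_id, callback):
        # Remember the callback until the leader answers.
        self.commands_waiting_reply[request_id] = callback

    def load_snapshot(self, last_applied):
        # Stands in for __loadDumpFile(clearJournal=True): the applied index jumps.
        self.raft_last_applied = last_applied

    def on_apply_command_response(self, request_id, log_idx, log_term):
        callback = self.commands_waiting_reply.pop(request_id, None)
        if callback is None:
            return
        # Same comparison as the assert in the handler quoted at the top.
        assert log_idx > self.raft_last_applied
        self.commands_waiting_commit[log_idx].append((log_term, callback))


follower = ToyFollower()
results = []
follower.send_apply_command(request_id=1, callback=results.append)

# The new connection delivers a snapshot that moves the applied index past X=10.
follower.load_snapshot(last_applied=50)

# The stale socket now delivers the buffered response for log_idx=10.
try:
    follower.on_apply_command_response(request_id=1, log_idx=10, log_term=3)
    outcome = "accepted"
except AssertionError:
    outcome = "assertion failed"

print(outcome)
```

In this model the callback has already been popped from commands_waiting_reply when the assert fires, so it ends up in neither dictionary.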

Secondary scenario — tick ordering:

The tick order is:
__applyLogEntries() → __sendAppendEntries() → _checkCommandsToApply() → _poller.poll()
If __applyLogEntries() applies entry at idx=X (advancing __raftLastApplied to X) in the same tick that _poller.poll() delivers an apply_command_response with log_idx=X, you get X > X → False.
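
The off-by-one nature of this second scenario can be shown with a small sketch. run_tick is a hypothetical helper that mirrors only the ordering described above (apply first, poll last); it is not the library's tick function.

```python
def run_tick(last_applied, commit_idx, polled_response_idx):
    """Hypothetical tick model: apply committed entries, then handle polled messages."""
    # Stand-in for __applyLogEntries(): apply everything up to the commit index.
    last_applied = max(last_applied, commit_idx)
    # __sendAppendEntries() and _checkCommandsToApply() are omitted in this model.
    # Stand-in for _poller.poll(): the response is handled after the apply step.
    check_passes = polled_response_idx > last_applied
    return last_applied, check_passes


# Entry 10 is applied in the same tick in which the response for idx=10 is polled.
print(run_tick(last_applied=9, commit_idx=10, polled_response_idx=10))  # (10, False)
```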

Why it "doesn't go away"

AssertionError is a subclass of Exception. The _autoTickThread has a blanket except Exception at line 539 that catches it, logs it, and continues running. The popped callback is silently dropped (never called, never put in commandsWaitingCommit). If the application retries the command under the same cluster instability, it can reproduce the same race.
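
A minimal sketch of that behaviour, assuming a loop shaped like the one described. handle_response and auto_tick_loop are hypothetical names, not the library's functions; the point is only that the assertion is swallowed, the loop keeps running, and the popped callback is lost.

```python
import logging


def handle_response(waiting_reply, waiting_commit, request_id, log_idx, last_applied):
    # The callback is popped before the check, so a failed assert loses it.
    callback = waiting_reply.pop(request_id)
    assert log_idx > last_applied
    waiting_commit.setdefault(log_idx, []).append(callback)


def auto_tick_loop(ticks):
    """Hypothetical stand-in for a tick loop with a blanket `except Exception`."""
    completed = 0
    for tick in ticks:
        try:
            tick()
        except Exception:
            # AssertionError lands here because it subclasses Exception.
            logging.exception("tick failed, continuing")
        completed += 1
    return completed


calls = []
waiting_reply = {1: calls.append}
waiting_commit = {}

stale = lambda: handle_response(waiting_reply, waiting_commit, 1, log_idx=10, last_applied=50)
noop = lambda: None

completed = auto_tick_loop([stale, noop])
print(issubclass(AssertionError, Exception), completed, waiting_reply, waiting_commit, calls)
```

Both ticks complete, yet the callback is never invoked and is no longer reachable from either dictionary.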

@bakwc Can you please confirm this fix? It is hard to reproduce reliably.

fix AssertError for newer but lower message idx

c7c2e1b

snubba wants to merge 1 commit into bakwc:master from snubba:fix_assert_error_in_apply_command
