Fix poor websocket throughput on Windows #387
Conversation
mathieucarbou left a comment:

Hi @broddo,
Here is my review. Thank you for the PR!
@broddo @willmmiles @me-no-dev @vortigont FYI I was able to test on macOS and this improvement also increases the throughput a lot. I am testing by changing inside:

In main we have an issue when using Safari: when we hit page refresh several times, or when we send messages from the text field too frequently, the message queue starts to overflow. With the new implementation Safari receives all the messages fine, we can refresh the page without issue (which also causes a fragmented pong control frame), and we can also send a lot of messages from the UI without impacting the queue.

Just for fun, I tested also with

In this test, since messages are short, we can go as low as a 50 ms delay without any issue or queue overflow! That's very good! So a big improvement of throughput for short messages, which makes sense since more messages fit in the same TCP buffer.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
Force-pushed from b36fd82 to 3457302.
- Enable processing of cumulative ACKs across multiple queued messages
- Larger window size for reduced fragmentation
- Update src/AsyncWebSocket.cpp (Co-authored-by: Mathieu Carbou <mathieu.carbou@gmail.com>)
- Revert threshold to 9; improve readability
- Fix loop exit condition (Co-authored-by: Mathieu Carbou <mathieu.carbou@gmail.com>)
- Replace public sent() with private _remainingBytesToSend()
- Leverage _remainingBytesToSend() and tidy messageQueue sending
- Increase throughput of WS example
- Remove \n at the end of log line
- Fix status that was incorrectly kept to 0 even when all bytes acked
- Added traces in sent method
Force-pushed from 3457302 to a3a1466.
@broddo : This morning I added some more traces and did some further testing after the inclusion of my review in your code, and I found an issue in a fix I had suggested. So:
I did some extensive testing again with several clients spawned with:
in order to cause some ping-pong control frames mixed with websocket messages. No memory leak: when disconnecting clients, the heap recovers after a while to its level from before the test. And it is super fast now... No queue exhaustion! This is the sort of log we see when activating verbose mode:
@broddo : PR merged! Could you please now test your code by pointing to the main branch? It also contains several recent changes. Thanks! Note: I also edited the WebSocket example to send a big message of 8k characters every 5 seconds, in order to test fragmentation from server to client.
We can see below the message being sent in fragments, followed by short messages every 50 ms, then the ACK being received. In the middle of all that, there are some ping-pong control frames.
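As a rough illustration of that server-to-client behaviour, here is a minimal sketch of splitting one large payload into window-sized chunks, which is why an 8k message shows up on the wire as several fragments. The `fragment` helper and the 1460-byte segment size are assumptions for illustration, not the library's actual API or configuration:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a large payload into chunks no bigger than
// the available send window, the way an 8k message ends up as several
// sends interleaved with other traffic.
std::vector<std::string> fragment(const std::string &payload, size_t windowSize) {
  std::vector<std::string> fragments;
  for (size_t offset = 0; offset < payload.size(); offset += windowSize) {
    size_t len = std::min(windowSize, payload.size() - offset);
    fragments.push_back(payload.substr(offset, len));
  }
  return fragments;
}
```

With an 8192-byte message and a typical ~1460-byte TCP segment, this yields five full chunks plus an 892-byte tail, matching the multi-fragment pattern described above.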
I'm working on a project that has moderate websocket data requirements. Everything works great on macOS/Linux even at high data rates (60 msgs/sec, 120 bytes/msg). But I found that Windows clients struggled with even the lightest data rates: just a couple of messages per second would start causing "Too many messages queued" errors (if I removed the availableForWriteAll() calls), or dropped messages otherwise.
It looks like the issue is that the previous _runQueue() implementation was only sending one message per invocation, then waiting for the next event loop tick or TCP ACK callback to continue. This created a bottleneck where the message queue would fill faster than messages could be drained, especially on Windows clients, which appear to have different TCP ACK timing characteristics. Coupled with that is Windows' tendency to send ACKs in bursts.
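To make the bottleneck concrete, here is a minimal sketch of the one-message-per-pass behaviour described above. The names (`QueuedMessage`, `sendOnePerPass`) are illustrative, not the actual AsyncWebSocket code: the point is that each pass hands at most one message to TCP, so most of the send window sits idle until the next tick or ACK:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical simplified model of a queued websocket message.
struct QueuedMessage {
  size_t length;  // total bytes in this message
  size_t sent;    // bytes handed to TCP so far
  size_t remaining() const { return length - sent; }
};

// Old-style pass: send (at most) the front message, then return and wait
// for the next event-loop tick or TCP ACK before touching the queue again.
size_t sendOnePerPass(std::deque<QueuedMessage> &queue, size_t window) {
  if (queue.empty()) return 0;
  QueuedMessage &front = queue.front();
  size_t chunk = std::min(front.remaining(), window);
  front.sent += chunk;
  if (front.remaining() == 0) queue.pop_front();
  return chunk;  // any window left over is wasted until the next pass
}
```

With 120-byte messages and a typical ~1460-byte segment, each pass uses less than a tenth of the window, so the queue can drain no faster than roughly one message per ACK round-trip, and bursty ACKs make that worse.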
I've modified _runQueue(). Instead of sending one message and returning, it now iterates through all queued messages and sends as much data as the TCP window allows in a single pass. _onAck() now distributes ACKs across multiple messages.
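A minimal sketch of that idea, under the assumption of a simplified message model (`Message`, `sendQueued`, and `ackQueued` are illustrative names, not the PR's actual code): one pass fills the whole send window across as many queued messages as fit, and a single cumulative ACK is then distributed front-to-back across the messages it covers:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical simplified model of a queued websocket message.
struct Message {
  size_t length;  // total bytes in this message
  size_t sent;    // bytes handed to TCP so far
  size_t acked;   // bytes acknowledged by the peer
  size_t remaining() const { return length - sent; }
  bool finished() const { return acked >= length; }
};

// New-style pass: walk the whole queue and send as much data as the TCP
// window allows in a single pass, instead of one message per invocation.
size_t sendQueued(std::deque<Message> &queue, size_t window) {
  size_t total = 0;
  for (auto &msg : queue) {
    if (window == 0) break;
    size_t chunk = std::min(msg.remaining(), window);
    msg.sent += chunk;
    window -= chunk;
    total += chunk;
  }
  return total;
}

// A cumulative ACK may cover bytes from several queued messages, so
// distribute it front-to-back and drop fully acknowledged messages.
void ackQueued(std::deque<Message> &queue, size_t ackedBytes) {
  while (ackedBytes > 0 && !queue.empty()) {
    Message &front = queue.front();
    size_t chunk = std::min(front.sent - front.acked, ackedBytes);
    front.acked += chunk;
    ackedBytes -= chunk;
    if (front.finished())
      queue.pop_front();
    else
      break;  // partial ACK: the front message is still in flight
  }
}
```

This is why small messages benefit most: several of them fit into one TCP segment, and a single bursty Windows ACK can retire several messages at once instead of unblocking only one.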
This seems to have fixed Windows clients for me at least. They're now able to keep up with MacOS/Linux clients.