Have you checked that your OS/infrastructure/network can handle the traffic?
What's the highest throughput you can get if you cut the number of
WebSocket clients to a half or a quarter? What about only 1 client?
What's the threading model on the server side? How many threads do you use,
say, to handle 10000 clients?
Other areas to look into:
- As others have pointed out, Nagle's algorithm can greatly reduce
throughput, especially with many small messages. Have you tried setting
TCP_NODELAY?
- Do you use inter-thread synchronization to any extent?
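
On the TCP_NODELAY point, here is a minimal sketch of how the option is set on a plain TCP socket. It is in Python only for illustration, since I don't know what your server is written in; the helper name disable_nagle is made up for this example, and your WebSocket library may expose its own setting for this instead.

```python
import socket


def disable_nagle(sock: socket.socket) -> bool:
    """Turn off Nagle's algorithm on a TCP socket (hypothetical helper).

    Returns True if the option reads back as enabled afterwards.
    """
    # TCP_NODELAY = 1 tells the stack to send small segments immediately
    # instead of coalescing them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Read the option back to confirm the stack accepted it.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("TCP_NODELAY enabled:", disable_nagle(s))
```

On a server you would apply this to every accepted connection (the socket returned by accept()), not just the listening socket, and then rerun the throughput test to see whether it makes a difference.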