I’m currently working on pulling real-time market data using a WebSocket connection in Python, based on the sample code below:
https://github.com/Abdullah-2906/LSEG-python-websocket/blob/main/streaming.py
I’d appreciate some guidance on best practices and optimization.
Current setup
- Using Python with a WebSocket connection to stream data
- Subscribed to ~1200 RICs
- Incoming data is pushed to Redis
- A Celery worker processes the data and updates the database
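For reference, here is a stripped-down sketch of the ingest side of that pipeline (assuming websocket-client and redis-py, roughly in the style of the linked streaming.py; the endpoint URL and the Redis key name are placeholders, not my real values):

```python
import json

def route_message(raw, sink, key="market:ticks"):
    """Parse one WebSocket frame and push each update onto `sink`
    (anything with lpush(key, value), e.g. a redis.Redis client).
    In the tr_json2 protocol a frame is a JSON array of messages."""
    updates = json.loads(raw)
    for update in updates:
        sink.lpush(key, json.dumps(update))
    return len(updates)

def main():
    # Hypothetical wiring, shown for illustration only;
    # needs `pip install websocket-client redis`.
    import redis
    import websocket
    r = redis.Redis()
    ws = websocket.WebSocketApp(
        "wss://example-host:443/WebSocket",   # placeholder endpoint
        on_message=lambda ws, msg: route_message(msg, r),
        subprotocols=["tr_json2"],
    )
    ws.run_forever()
```

The idea is that the on_message callback does nothing except push the raw payload into Redis, so the socket reader never blocks on parsing or database work.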
Issues I’m facing
The WebSocket connection drops after some time
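One pattern I'm considering for this is a reconnect loop with jittered exponential backoff plus protocol-level keepalive pings (sketch below, built on websocket-client's run_forever; after each reconnect the on_open handler would of course also have to resend the login request and re-subscribe all the RICs). Is this the right direction?

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Full-jitter exponential backoff: a uniform random delay
    in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

def stream_forever(make_app, healthy_secs=60.0):
    """Reconnect loop around websocket-client's run_forever(), which
    returns whenever the socket closes. ping_interval/ping_timeout make
    the client send pings so a half-dead connection is detected and
    torn down instead of hanging silently."""
    attempt = 0
    while True:
        started = time.monotonic()
        # make_app() builds a fresh websocket.WebSocketApp each time,
        # with on_open re-sending login + subscriptions.
        make_app().run_forever(ping_interval=30, ping_timeout=10)
        if time.monotonic() - started > healthy_secs:
            attempt = 0  # the session lasted a while; reset the backoff
        time.sleep(backoff_delay(attempt))
        attempt += 1
```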
Questions
- What is the most reliable way to maintain a stable WebSocket connection for real-time data?
- Should the connection remain open continuously, or is it better to reconnect periodically (e.g., at market open/close)?
- Is Python suitable for this scale, or would another language (e.g., Go, Node.js, Java) perform better for handling high-frequency streaming data?
- What are the best practices for handling large subscriptions (~1200 instruments)?
- Are there recommended patterns for buffering/queueing (e.g., Redis) before writing to a database?
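On the last point, this is the kind of consumer-side batching I have in mind: one blocking pop so an idle worker sleeps cheaply, then drain up to a batch size, then a single bulk database write. The key name and batch size are just placeholders. Does this look reasonable?

```python
def drain_batch(conn, key="market:ticks", max_items=500, block_secs=1):
    """Pop up to max_items raw messages from a Redis list: one blocking
    BRPOP, then non-blocking RPOPs until the list is empty or the batch
    is full. `conn` is a redis.Redis client (or anything exposing
    brpop/rpop with the same return shapes)."""
    first = conn.brpop(key, timeout=block_secs)
    if first is None:
        return []                  # nothing arrived within the timeout
    batch = [first[1]]             # brpop returns a (key, value) pair
    while len(batch) < max_items:
        item = conn.rpop(key)
        if item is None:
            break
        batch.append(item)
    return batch
```

The worker would then decode the whole batch and write it in one transaction (executemany or COPY) rather than one INSERT per tick.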
Goal
I’m trying to build a robust and scalable pipeline where:
- Data is streamed reliably without frequent disconnects
- Processing is efficient and non-blocking
- The system can handle high throughput without losing messages
Any advice, sample architectures, or code examples would be really helpful.
If another language (e.g., C) works better for this, I'd be happy to switch, but I'd need a sample example covering both the streaming side and the database updates.