Latency Comparison of Live Streaming Protocols: RTSP, RTMP, SRT, HTTP-FLV, and WebRTC – A Practical Measurement Study

In real-time live streaming scenarios, end-to-end latency is often a critical metric for evaluating technical solutions. Due to differences in underlying transmission mechanisms, buffering strategies, and error recovery methods, different streaming media protocols exhibit significantly different latency characteristics in actual deployments. This article presents a controlled experimental comparison of the latency performance of five mainstream live streaming protocols under identical network conditions, and provides an in-depth analysis of the causes of latency in each protocol.

Test Environment and Methodology

To ensure the fairness and reproducibility of the comparison results, the test was conducted under the following uniform conditions:

  • MediaServer Configuration: A single MediaServer instance acted as the streaming media server and served the same screen-capture stream over every protocol. Screen resolution: 2560×1600, video encoding: H.264.
  • Client Playback Method: MediaClient was used to play RTSP, RTMP, SRT, and HTTP-FLV streams, while a browser was used to play the WebRTC stream.
  • Latency Measurement Method: A browser-based online real-time clock displaying milliseconds was shown on the captured screen. Every 30 seconds, a screenshot captured both the live clock and the client playback window, for a total of 10 screenshots per protocol. The latency of each screenshot is the difference between the live clock and the clock time visible in the playback window, and the average latency for each protocol is the mean of its 10 measurements.
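
The averaging step can be sketched as follows. This is a minimal illustration, assuming each screenshot yields two clock readings in HH:MM:SS.mmm form (the live clock and the clock visible in the player) and that no reading pair spans midnight. The function names and the sample readings are hypothetical and are not the measurements from this test.

```python
from datetime import datetime
from statistics import mean

CLOCK_FORMAT = "%H:%M:%S.%f"  # assumed display format of the on-screen clock


def latency_ms(live_reading: str, playback_reading: str) -> float:
    """Latency of one screenshot: live clock minus the clock shown in the player."""
    live = datetime.strptime(live_reading, CLOCK_FORMAT)
    playback = datetime.strptime(playback_reading, CLOCK_FORMAT)
    return (live - playback).total_seconds() * 1000.0


def average_latency_ms(screenshots: list[tuple[str, str]]) -> float:
    """Mean latency over all (live, playback) reading pairs."""
    return mean(latency_ms(live, shown) for live, shown in screenshots)


# Hypothetical readings, for illustration only.
samples = [
    ("10:00:00.250", "10:00:00.100"),
    ("10:00:30.400", "10:00:30.260"),
    ("10:01:00.130", "10:00:59.970"),
]
print(f"average latency: {average_latency_ms(samples):.0f} ms")
```

Because the clock is read from rendered frames, each reading is only as precise as the capture and display frame intervals allow, which is one reason to average several screenshots.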

Measured Latency Data Summary

Protocol | Transport Layer | Average Latency | Recommended Use Cases
---------|-----------------|-----------------|----------------------
WebRTC | UDP (primary) + RTP/RTCP | ~100ms | Video conferencing, interactive live streaming, remote control, telehealth
RTSP | UDP (media) / TCP (control) | ~150ms | Surveillance, real-time video transmission, industrial vision systems
RTMP | TCP | ~150ms | Live streaming ingest, CDN distribution, content delivery networks
HTTP-FLV | TCP | ~150ms | Low-latency web live streaming, mobile playback, HTTP-based delivery
SRT | UDP + ARQ + jitter buffer | ~300ms | Unstable networks (high packet loss), long-distance signal backhaul, cross-border video transport

Analysis of Latency Causes for Each Protocol

1. WebRTC: Gold Standard for Sub-Second Latency

In the measurements, WebRTC achieved the lowest average latency at ~100ms. The underlying reason is that WebRTC transmits media over UDP using RTP (Real-time Transport Protocol). Unlike TCP, UDP does not acknowledge or reorder data, so a lost packet does not stall the packets behind it (no head-of-line blocking), and frames can be decoded and rendered as soon as they arrive.

Additionally, WebRTC incorporates the GCC (Google Congestion Control) algorithm, which combines delay-based and loss-based congestion control to minimize latency and avoid bufferbloat as far as possible[reference:0]. GCC uses the trend of the delay gradient to infer network congestion and dynamically adjusts the sending rate[reference:1]. GCC is the only real-time congestion control algorithm implemented in commercial browsers such as Google Chrome[reference:2]. Combining GCC with NACK-based retransmission, WebRTC can adapt its bitrate to changing network conditions and maintain ultra-low latency even under limited packet loss. With fine-tuned implementations, WebRTC can achieve end-to-end latencies as low as 10–30 ms.
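
The delay-gradient idea can be illustrated with a heavily simplified sketch. It is not the GCC implementation used in browsers: real GCC filters the gradient and adapts its threshold over time, whereas this sketch compares each raw gradient against a fixed threshold. The function names, the 12.5 ms threshold, and the 0.85 and 1.05 rate factors are assumptions chosen for illustration.

```python
def delay_gradients(send_ms: list[float], recv_ms: list[float]) -> list[float]:
    """Inter-arrival delay variation: d(i) = (t_i - t_{i-1}) - (T_i - T_{i-1}).

    T are send times and t are arrival times. A positive value means the
    packets spread out in transit, which hints at a growing queue.
    """
    return [
        (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        for i in range(1, len(send_ms))
    ]


def adjust_rate(rate_kbps: float, gradient_ms: float, threshold_ms: float = 12.5) -> float:
    """Back off when delay is rising, hold when it is falling, probe otherwise."""
    if gradient_ms > threshold_ms:
        return rate_kbps * 0.85   # overuse: a queue is building, reduce the rate
    if gradient_ms < -threshold_ms:
        return rate_kbps          # underuse: the queue is draining, hold the rate
    return rate_kbps * 1.05       # normal: probe for more bandwidth


send = [0.0, 20.0, 40.0, 60.0]      # hypothetical send timestamps (ms)
recv = [50.0, 70.0, 105.0, 125.0]   # the third packet arrives 15 ms late
rate = 2000.0
for gradient in delay_gradients(send, recv):
    rate = adjust_rate(rate, gradient)
print(f"final rate: {rate:.2f} kbps")
```

Reacting to rising delay rather than waiting for loss is what lets a sender reduce its rate before queues fill, which keeps latency low.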

2. RTSP: Advantages of Separate Control-and-Transport Design

RTSP achieved a low latency of ~150ms in this test. RTSP is fundamentally a media control protocol responsible for stream setup and tear-down, while the actual media data transmission is handled by RTP/RTCP. Because RTP is based on UDP, it inherits low-latency characteristics by nature. The latency of an RTSP pipeline depends on the coordinated tuning of multiple components: RTP transport, jitter buffer (typically 100–300 ms to smooth out inter-arrival timing inconsistencies[reference:3]), decoding, and rendering. With careful engineering, the buffering time can be reduced from a default of 500 ms to approximately 150 ms. In optimized local area networks, RTSP can achieve end-to-end latencies in the 100–250 ms range. When timestamps are synchronized, the jitter buffer can merge RTP packets belonging to the same video frame for distribution, reducing thread switching overhead and improving performance without increasing delay[reference:4].
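
A fixed-delay jitter buffer of the kind described above can be sketched as follows. This is a minimal illustration, not the MediaClient implementation: the class name, the heap-based ordering, and the 150 ms default are assumptions, and sequence-number wraparound and loss handling are omitted.

```python
import heapq


class JitterBuffer:
    """Holds RTP packets for a fixed delay and releases them in sequence order."""

    def __init__(self, delay_ms: float = 150.0):
        self.delay_ms = delay_ms
        self._heap: list[tuple[int, float, bytes]] = []  # (seq, arrival_ms, payload)

    def push(self, seq: int, arrival_ms: float, payload: bytes) -> None:
        """Store a packet; the heap keeps the lowest sequence number on top."""
        heapq.heappush(self._heap, (seq, arrival_ms, payload))

    def pop_ready(self, now_ms: float) -> list[tuple[int, bytes]]:
        """Release packets in sequence order while the lowest-numbered one has waited delay_ms."""
        ready = []
        while self._heap and now_ms - self._heap[0][1] >= self.delay_ms:
            seq, _, payload = heapq.heappop(self._heap)
            ready.append((seq, payload))
        return ready


buf = JitterBuffer(delay_ms=150.0)
buf.push(2, 0.0, b"frame-part-2")    # arrives out of order
buf.push(1, 10.0, b"frame-part-1")
buf.push(3, 20.0, b"frame-part-3")
released = buf.pop_ready(now_ms=160.0)
print([seq for seq, _ in released])
```

The delay_ms value is the tuning knob: a smaller value lowers latency but leaves less time for late or reordered packets to arrive before playout.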

3. RTMP and HTTP-FLV: The Latency Cost of TCP

Both RTMP and HTTP-FLV achieved ~150ms latency in this test. These results differ from the conventional perception that RTMP latency is typically 2–5 seconds, because this test was conducted in a high-quality local area network environment, which eliminates factors such as public CDN distribution and multi-hop forwarding. Under such ideal conditions, both protocols can deliver latency far below what is typically measured over the public Internet, and HTTP-FLV performs on par with RTMP.

The primary sources of latency for these protocols include:

  • Before transmission begins, the TCP three-way handshake and the protocol-level handshake add several round trips (RTTs) of startup delay.
  • At the player side, anti-jitter buffers (typically 5–10 seconds in public network scenarios) are the primary contributors to latency.
  • GOP (Group of Pictures) settings and TCP queue buildup during poor network conditions can further increase latency[reference:5].
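
The contributions listed above can be combined into a rough budget. This is a back-of-the-envelope sketch: the function names, the round-trip count, and the sample values are assumptions for illustration and are not figures from this test. It also assumes the server starts a new viewer at the most recent cached keyframe, so playback can begin up to one GOP behind live.

```python
def startup_delay_ms(rtt_ms: float, handshake_rtts: int) -> float:
    """One-time delay before the first media byte: round trips spent on handshakes."""
    return rtt_ms * handshake_rtts


def gop_duration_ms(gop_frames: int, fps: float) -> float:
    """Length of one GOP in milliseconds."""
    return gop_frames / fps * 1000.0


def rough_latency_ms(network_ms: float, player_buffer_ms: float, gop_ms: float) -> float:
    """Rough steady-state estimate: one-way network delay + player buffer + up to one GOP."""
    return network_ms + player_buffer_ms + gop_ms


# Hypothetical values, for illustration only.
rtt = 40.0
startup = startup_delay_ms(rtt, handshake_rtts=3)
gop = gop_duration_ms(gop_frames=60, fps=30.0)
estimate = rough_latency_ms(network_ms=rtt / 2, player_buffer_ms=3000.0, gop_ms=gop)
print(f"startup: {startup:.0f} ms, steady-state estimate: {estimate:.0f} ms")
```

In this kind of budget the player buffer and the GOP length dominate, which is why shortening both is the usual first step when tuning RTMP or HTTP-FLV for lower latency.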

4. SRT: Engineering Trade-Off Between Reliability and Latency

SRT achieved an average latency of ~300ms, the highest of the five protocols in this test but still lower than traditional RTMP deployments over the public Internet. SRT is based on the UDT (UDP-based Data Transfer) protocol and adds ARQ (Automatic Repeat reQuest) retransmission and FEC (Forward Error Correction) mechanisms on top of UDP[reference:6]. Its buffer mechanism stores packets temporarily at both the sender and the receiver so that they can be retransmitted when needed, and the receiver-side buffer reassembles packets into the correct order before delivering them to the video decoder[reference:7]. The TSBPD (Timestamp-Based Packet Delivery) mechanism within SRT controls latency precisely, but the latency cannot be adjusted dynamically[reference:8].

The core design philosophy of SRT is that sender and receiver negotiate a fixed latency buffer window (e.g., 60–200 ms to absorb jitter and allow for packet recovery[reference:9]). Once streaming begins, the latency is locked and extra delay does not accumulate as network conditions change. This design accepts a fixed, controlled latency in exchange for transmission reliability in high-packet-loss, unstable public network environments. SRT runs over UDP with its own handshake and supports end-to-end encryption, while its buffering and retransmission compensate for network jitter and bandwidth fluctuations.
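
The fixed-latency behavior can be sketched as follows. This is a simplified model of timestamp-based delivery, not code from the SRT library: the function name and sample values are hypothetical, and it assumes the sender's timestamps are already mapped onto the receiver's clock.

```python
from typing import Optional


def tsbpd_delivery_time_ms(sent_ms: float, arrived_ms: float, latency_ms: float) -> Optional[float]:
    """Return when the packet is handed to the decoder, or None if it arrived too late.

    Every packet is scheduled for sent_ms + latency_ms, so packets that arrive
    early wait in the receiver buffer and the end-to-end delay stays constant.
    """
    scheduled_ms = sent_ms + latency_ms
    if arrived_ms > scheduled_ms:
        return None  # missed its slot (for example, recovery took too long): dropped
    return scheduled_ms


latency = 120.0  # hypothetical negotiated latency window (ms)
on_time = tsbpd_delivery_time_ms(0.0, 30.0, latency)       # first transmission arrives quickly
recovered = tsbpd_delivery_time_ms(20.0, 110.0, latency)   # arrives after one retransmission
too_late = tsbpd_delivery_time_ms(40.0, 200.0, latency)    # arrives after its slot
print(on_time, recovered, too_late)
```

In this model the first two packets are both delivered exactly 120 ms after they were sent, even though their network delays differ, which is the sense in which the latency is locked. The window must be long enough to cover at least one retransmission round trip for ARQ to be useful.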

Protocol Selection Recommendations

Based on the measured latency characteristics and the underlying technical principles of each protocol, the following selection framework is recommended:

  • Ultra-low-latency interactive scenarios (<150ms): WebRTC was the only protocol in this test whose average latency stayed below 150 ms (about 100 ms). It is suitable for video conferencing, cloud gaming, telehealth, remote collaboration, and interactive education.
  • Low-latency surveillance and real-time monitoring scenarios (150–250ms): RTSP has deep legacy support and ecosystem advantages in these domains. It is suitable for IPC/NVR device ingest, industrial video transmission, drone video backhaul, and robotic teleoperation.
  • Public network distribution and compatibility‑first scenarios (150–500ms): RTMP and HTTP-FLV are widely supported by CDNs and are mainstream choices for both ingest and distribution. If network conditions are poor and transmission must cross complex public‑network links, SRT's packet-loss resilience provides more reliable delivery, at a moderate latency cost.

Key Takeaway: Low latency is not determined solely by the protocol specification but is a function of the entire transmission chain, including network conditions, encoder configuration (GOP size, bitrate control), buffer strategies, and decoding/rendering optimization. Under ideal local network conditions, RTSP, RTMP, and HTTP-FLV all achieved ~150ms latency, within about 50 ms of WebRTC's ~100ms. Over the public Internet, WebRTC and SRT provide greater resilience to packet loss while preserving low latency. Therefore, technology selection should always consider the deployment environment, reliability requirements, and compatibility with existing infrastructure.

The true technical value lies not in rigid adherence to any single protocol, but in selecting the right tool for the specific use case—balancing latency, reliability, compatibility, and deployment complexity in complex, ever-changing networking environments.