Enable Two-Way Audio with ONVIF Audio Back Channel

Configure bidirectional audio (intercom) in your RTSP server using the ONVIF Audio Back Channel specification over RTSP. This is useful for remote monitoring and real-time communication.

What is Audio Back Channel?

The ONVIF Audio Back Channel enables bidirectional audio communication over a single RTSP session. It allows a client (e.g., a monitoring station) to send audio data (like voice from a microphone) back to the NVT (Network Video Transmitter, e.g., an IP camera or media server).

This feature is essential for applications such as:

  • Remote Intercom: Security personnel can speak to individuals near the camera.
  • Remote Assistance: Experts can guide on-site technicians via voice.
  • Two-Way Audio Monitoring: Full-duplex communication between endpoints.

The mechanism uses the RTSP Require header and SDP (Session Description Protocol) to negotiate the bidirectional stream.

How It Works

The audio back channel is established through a series of RTSP commands:

  1. DESCRIBE: The client includes the Require: www.onvif.org/ver20/backchannel header to signal that it wants a backchannel connection.
  2. SDP Response: The server responds with an SDP that includes:
    • A media section with a=recvonly for audio from server to client (playback). The direction attribute is written from the client's viewpoint.
    • A media section with a=sendonly for audio from client to server (backchannel), again from the client's viewpoint.
  3. SETUP: The client sets up both the forward and backchannel streams.
  4. PLAY: The client sends PLAY, and both audio streams are activated. The client can now send audio to the server.
  5. TEARDOWN: Terminates the session.
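
The sequence above can be sketched as a small request builder. This is a minimal illustration, not the Happytime client's implementation: the function names, the client port numbers, and the way each track's control URL is formed (base URL plus the a=control value) are assumptions. The sketch repeats the Require tag on every request; check the ONVIF Streaming Specification and your server's behavior before relying on that.

```python
REQUIRE_TAG = "www.onvif.org/ver20/backchannel"


def build_request(method, url, cseq, extra_headers=None):
    """Format one RTSP/1.0 request carrying the backchannel Require tag."""
    headers = {"CSeq": str(cseq), "Require": REQUIRE_TAG}
    headers.update(extra_headers or {})
    lines = [f"{method} {url} RTSP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


def handshake_requests(base_url, track_controls, session_id):
    """Return the requests for steps 1, 3, 4 and 5 in order.

    session_id is a placeholder: a real client reads it from the Session
    header of the first SETUP response before sending the next request.
    """
    requests = [build_request("DESCRIBE", base_url, 1, {"Accept": "application/sdp"})]
    cseq = 2
    for index, control in enumerate(track_controls):
        port = 50000 + 2 * index  # hypothetical client RTP/RTCP port pair
        headers = {"Transport": f"RTP/AVP;unicast;client_port={port}-{port + 1}"}
        if index > 0:
            # Later SETUPs join the session created by the first one.
            headers["Session"] = session_id
        requests.append(build_request("SETUP", f"{base_url}/{control}", cseq, headers))
        cseq += 1
    for method in ("PLAY", "TEARDOWN"):
        requests.append(build_request(method, base_url, cseq, {"Session": session_id}))
        cseq += 1
    return requests
```

Calling handshake_requests("rtsp://192.168.1.100:554/live/stream1", ["trackID=1", "trackID=2"], "12345678") yields one DESCRIBE, two SETUP, one PLAY, and one TEARDOWN request, in that order.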

Configuration & Implementation

Happytime RTSP Server supports the ONVIF Audio Back Channel specification. Ensure your server is configured to advertise audio output capabilities.

Example SDP Response (Server Side)

When a client sends a DESCRIBE request with the backchannel Require tag, the server should return an SDP like this:

SDP with Audio Back Channel Support
v=0
o=- 1234567890 1234567890 IN IP4 192.168.1.100
s=Media Server
c=IN IP4 192.168.1.100
t=0 0
m=audio 5004 RTP/AVP 8
a=recvonly
a=rtpmap:8 PCMA/8000
a=control:trackID=1
m=audio 5006 RTP/AVP 8
a=sendonly
a=rtpmap:8 PCMA/8000
a=control:trackID=2
m=video 5002 RTP/AVP 96
a=recvonly
a=rtpmap:96 H264/90000
a=control:trackID=3

In this example:

  • m=audio 5004 ... a=recvonly: Audio from server to client (e.g., camera mic).
  • m=audio 5006 ... a=sendonly: Audio from client to server (backchannel, e.g., client mic).
  • a=rtpmap:8 PCMA/8000: Specifies G.711 A-law audio sampled at 8 kHz, using RTP payload type 8.
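
A client has to pick the backchannel section out of this SDP before it can send SETUP for it. The sketch below shows one way to do that by looking for an audio section marked a=sendonly. The function names and the dictionary layout are illustrative assumptions, and the parser only handles the attributes used in the example above.

```python
EXAMPLE_SDP = """\
v=0
o=- 1234567890 1234567890 IN IP4 192.168.1.100
s=Media Server
c=IN IP4 192.168.1.100
t=0 0
m=audio 5004 RTP/AVP 8
a=recvonly
a=rtpmap:8 PCMA/8000
a=control:trackID=1
m=audio 5006 RTP/AVP 8
a=sendonly
a=rtpmap:8 PCMA/8000
a=control:trackID=2
m=video 5002 RTP/AVP 96
a=recvonly
a=rtpmap:96 H264/90000
a=control:trackID=3
"""

DIRECTIONS = ("sendonly", "recvonly", "sendrecv", "inactive")


def parse_media_sections(sdp_text):
    """Split an SDP into one dict per m= section."""
    sections = []
    current = None
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media, port, _proto, *formats = line[2:].split()
            current = {"media": media, "port": int(port), "formats": formats,
                       "direction": None, "control": None, "rtpmap": {}}
            sections.append(current)
        elif current is not None and line.startswith("a="):
            attr = line[2:]
            if attr in DIRECTIONS:
                current["direction"] = attr
            elif attr.startswith("control:"):
                current["control"] = attr[len("control:"):]
            elif attr.startswith("rtpmap:"):
                payload_type, encoding = attr[len("rtpmap:"):].split(None, 1)
                current["rtpmap"][int(payload_type)] = encoding
    return sections


def find_backchannel(sections):
    """Return the audio sections the client is expected to send on."""
    return [s for s in sections
            if s["media"] == "audio" and s["direction"] == "sendonly"]
```

For the example SDP, find_backchannel(parse_media_sections(EXAMPLE_SDP)) returns a single section whose control value is trackID=2.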

Client Implementation (Key Headers)

The client must include the following in the DESCRIBE request:

RTSP DESCRIBE Request with Backchannel
DESCRIBE rtsp://192.168.1.100:554/live/stream1 RTSP/1.0
CSeq: 2
User-Agent: Happytime ONVIF Client
Accept: application/sdp
Require: www.onvif.org/ver20/backchannel

If the server does not support backchannel, it should respond with 551 Option not supported.
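
A client can detect that case from the status line of the DESCRIBE response. The following is a minimal sketch with hypothetical helper names. Falling back to a one-way session (retrying DESCRIBE without the Require header) is a design choice assumed here, not something the server requires.

```python
def parse_status(response_text):
    """Return (status_code, reason) from the first line of an RTSP response."""
    status_line = response_text.split("\r\n", 1)[0]
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason


def backchannel_supported(describe_response):
    """True on 200 OK, False on 551; any other status is treated as an error."""
    code, reason = parse_status(describe_response)
    if code == 551:
        # Server rejected the Require tag: no backchannel available.
        return False
    if code != 200:
        raise RuntimeError(f"DESCRIBE failed: {code} {reason}")
    return True
```

When backchannel_supported returns False, a client that only needs playback could repeat the DESCRIBE request without the Require header.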

Note: The server must list all supported audio codecs in the SDP. The client selects the codec by matching the payload type in its RTP stream to one of the server's a=rtpmap entries.
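
To make the payload type matching concrete, here is a rough sketch of how a client might choose a payload type from the server's rtpmap entries and wrap already-encoded G.711 audio in RTP packets. The function names, the preference order, and the 20 ms packet size (160 samples at 8 kHz) are assumptions for illustration. A real client would also randomize the initial sequence number and timestamp and pace the packets in real time.

```python
import struct


def choose_payload_type(rtpmap, preferred=("PCMA/8000", "PCMU/8000")):
    """Pick the first preferred encoding the server listed; return its payload type."""
    for encoding in preferred:
        for payload_type, name in rtpmap.items():
            if name.upper() == encoding:
                return payload_type
    raise ValueError("server offers none of the preferred codecs")


def build_rtp_packet(payload_type, sequence, timestamp, ssrc, payload):
    """Prepend a 12-byte RTP header (version 2, no padding, extension or CSRC)."""
    header = struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                         sequence & 0xFFFF, timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload


def packetize(g711_bytes, payload_type, ssrc, samples_per_packet=160):
    """Yield RTP packets; G.711 uses one byte per sample, so bytes == samples."""
    for sequence, offset in enumerate(range(0, len(g711_bytes), samples_per_packet)):
        chunk = g711_bytes[offset:offset + samples_per_packet]
        # The timestamp advances by the number of samples already sent.
        yield build_rtp_packet(payload_type, sequence, offset, ssrc, chunk)
```

With the rtpmap from the example SDP, choose_payload_type({8: "PCMA/8000"}) selects payload type 8, which is then written into every packet header.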

Testing the Audio Back Channel

Use a compatible ONVIF client or media player to test:

  1. Use Happytime ONVIF Client or ONVIF Device Manager (ODM) to connect to the server.
  2. Navigate to the audio settings or intercom feature.
  3. Initiate a two-way audio session. You should hear audio from the server and be able to speak back.
  4. Monitor the RTSP traffic with Wireshark to verify the bidirectional RTP streams.

Ensure your client application implements the ONVIF Audio Back Channel.

Warning: The client MUST wait for the 200 OK response to the PLAY request before sending any audio data back to the server. Sending data prematurely may cause connection issues.
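
One simple way to enforce this ordering in client code is to gate the sender on the PLAY response. The class below is a hypothetical sketch: its name and methods are not part of any Happytime API, and transmit stands for whatever function actually writes the packet to the network.

```python
class BackchannelGate:
    """Blocks outgoing backchannel audio until PLAY has been answered with 200."""

    def __init__(self):
        self.play_confirmed = False

    def on_rtsp_response(self, method, status_code):
        """Record the outcome of an RTSP request the client has sent."""
        if method == "PLAY" and status_code == 200:
            self.play_confirmed = True
        elif method == "TEARDOWN":
            self.play_confirmed = False

    def send(self, packet, transmit):
        """Pass packet to transmit() only while the session is playing."""
        if not self.play_confirmed:
            return False
        transmit(packet)
        return True
```

Audio captured before the gate opens can be dropped or buffered; which is better depends on how much delay the application tolerates.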

Best Practices

  • Codec Selection: Use widely supported codecs like G.711 (PCMU/PCMA) for maximum compatibility.
  • Network QoS: Real-time audio needs steady bandwidth and low latency. G.711 carries 64 kbit/s of payload in each direction, plus packet header overhead. Consider using QoS to prioritize audio traffic.
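  • Security: Combine with RTSPS (RTSP over TLS) to protect the session. TLS covers the audio and video only when RTP is interleaved over the RTSP connection; media sent on separate UDP ports needs its own protection, such as SRTP.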
  • Testing: Always test with standard ONVIF clients to ensure interoperability.