Multimedia Multicast Applications
Developing IP Multicast Networks, Volume I
Author: Beau Williamson
Publisher: Cisco Press
For many people, the first thing the term IP multicast brings
to mind is video conferencing. Therefore, it is very likely that your first
exposure to a multicast application will be one of the many exciting multimedia
applications used for video and audio conferencing. Because these multimedia
conferencing applications are so popular, taking a closer look at some of
them makes sense.
This chapter starts by exploring some of the underlying protocols used
by multimedia conferencing applications. The first two, Real-Time Transport
Protocol (RTP) and its companion protocol, RTP Control Protocol (RTCP), are
used to encapsulate the multimedia conference audio and video data streams
and to monitor the delivery of the data to the end-stations in the conference.
Next, this chapter examines the Session Announcement Protocol (SAP) and the
Session Description Protocol (SDP). Conference-directory applications use
these protocols to announce and to learn about the existence of the multimedia
conference session in the network. Finally, this chapter looks at the popular
MBone multimedia conferencing applications that provide video and audio conferencing
as well as some limited data sharing.
RTP is a transport protocol, documented in RFC 1889, that permits applications to transmit
various types of real-time payloads such as audio, video, or other data that
has real-time characteristics. RTP typically rides on top of User Datagram
Protocol (UDP) and can be used over either unicast or multicast data streams.
The protocol also provides payload type identification, sequence numbering,
and timestamping, as well as a mechanism to monitor the delivery of the data.
RTP itself does not provide any guaranteed delivery mechanism and normally
relies on the lower-layer protocol to perform this function. Because RTP frequently
rides on top of IP and UDP (as is the case for most multicast multimedia applications),
neither of which guarantees delivery, it depends on the application to deal with
lost datagrams and out-of-order delivery. The application can detect these
conditions by using the Sequence Number field in the RTP header.
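As an illustration of how an application might read these fields, the following is a minimal Python sketch that decodes the 12-byte fixed RTP header and counts skipped sequence numbers. The function names `parse_rtp_header` and `packets_missing` are hypothetical, and the sketch assumes no CSRC list or header extension and in-order arrival.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (ignores CSRC list and extensions)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    # Network byte order: 2 flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,          # top two bits of the first byte
        "marker": bool(b1 >> 7),     # top bit of the second byte
        "payload_type": b1 & 0x7F,   # identifies the encoding scheme
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

def packets_missing(prev_seq: int, seq: int) -> int:
    """Packets skipped between two consecutively received sequence numbers.

    Handles 16-bit wraparound; assumes packets arrive in order (a reordered
    packet would be reported as a very large gap).
    """
    return (seq - prev_seq - 1) % 65536
```

A receiver could call `packets_missing` on each pair of consecutive arrivals and accumulate the result as a rough loss count.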
RTP consists of two components:
The RTP component, which carries the real-time data.
The RTP Control Protocol (RTCP) component,
which provides information about the participants of a session and monitors
the delivery of data by using some simple quality-of-service measurements,
such as packet loss and jitter.
The next section provides an audio conference example to describe the
properties of RTP further.
Multimedia multicast applications
typically allocate a multicast group address and two ports: one for the RTP
data stream (in this case audio) and the other for the RTCP control stream.
In most cases, the control port is numerically one higher than the data port.
The incoming audio signal is sampled in small, fixed time slots (for
example, 40 ms) by the audio application. The audio from these time slots
then is encoded using one of several audio-encoding schemes (pulse code
modulation [PCM], adaptive differential pulse code modulation [ADPCM],
linear predictive coding [LPC], and so on), and
the encoded data is placed inside of an RTP packet. The header of the RTP
packet contains a sequence number and a timestamp as well as an indication
of the encoding scheme used.
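The sender side of this example can be sketched as follows. This is a simplified Python illustration, not the code of any particular application: it assumes 8 kHz audio already encoded at one byte per sample, and the names `rtp_port_pair` and `packetize` are hypothetical.

```python
import struct

SAMPLE_RATE = 8000   # assumption: 8 kHz audio
FRAME_MS = 40        # time slot length from the example in the text
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000  # 320 samples per packet

def rtp_port_pair(data_port: int) -> tuple:
    """Return (RTP data port, RTCP control port); control is one higher."""
    return data_port, data_port + 1

def packetize(encoded: bytes, payload_type: int, ssrc: int,
              first_seq: int = 0, first_ts: int = 0):
    """Yield one RTP packet per 40 ms frame of already-encoded audio.

    Assumes one byte per sample, so byte offsets equal sample offsets.
    """
    for offset in range(0, len(encoded), SAMPLES_PER_FRAME):
        n = offset // SAMPLES_PER_FRAME
        header = struct.pack(
            "!BBHII",
            0x80,                                # version 2, no padding/extension/CSRC
            payload_type & 0x7F,                 # indicates the encoding scheme
            (first_seq + n) & 0xFFFF,            # sequence number: +1 per packet
            (first_ts + offset) & 0xFFFFFFFF,    # timestamp: +320 samples per packet
            ssrc,
        )
        yield header + encoded[offset:offset + SAMPLES_PER_FRAME]
```

Note that the sequence number advances by one per packet while the timestamp advances by the number of samples in the frame, which is what lets a receiver recover both ordering and timing.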
The Robust Audio Tool
(RAT) is an audio conferencing application that uses multiple encoding methods
to provide some redundancy to the audio data stream. The RTP packet contains
an audio sample encoded using a primary encoding scheme followed by one or
more audio samples encoded using some secondary encoding scheme. The audio
samples that follow the primary sample are delayed slightly so that they can
be used as an alternative if the primary data in a previous RTP packet is
lost or corrupted.
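The redundancy idea can be illustrated with a small Python model. This is only a conceptual sketch under the assumption of a single secondary copy delayed by one packet; it does not reproduce RAT's actual packet format, and the dictionary fields and function names are hypothetical.

```python
def build_packets(primary, secondary):
    """Packet n carries the primary encoding of frame n plus the
    secondary (typically lower-rate) encoding of frame n-1."""
    packets = []
    for n, frame in enumerate(primary):
        backup = secondary[n - 1] if n > 0 else None
        packets.append({"seq": n, "primary": frame, "redundant": backup})
    return packets

def recover(received, total):
    """Rebuild the frame sequence, substituting the redundant copy carried
    in the following packet whenever a packet was lost."""
    by_seq = {p["seq"]: p for p in received}
    frames = []
    for n in range(total):
        if n in by_seq:
            frames.append(by_seq[n]["primary"])
        elif n + 1 in by_seq and by_seq[n + 1]["redundant"] is not None:
            frames.append(by_seq[n + 1]["redundant"])
        else:
            frames.append(None)  # unrecoverable: neither copy arrived
    return frames
```

In this model a single lost packet is repaired from its successor, while the loss of two consecutive packets leaves a gap for the first of them.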
When the audio application receives an RTP packet, the sequence number
and timestamp in the RTP header are used to recover the sender's timing information
and determine how many packets have been lost. The encoded audio data sample
in the RTP packet then is placed in a play-out buffer with previously
received audio samples. The audio samples are placed in the play-out buffer
in contiguous order based on their sequence number and timestamp so that when
they are decoded and played out to the speaker, the original audio is recovered.
The play-out buffer also serves as a de-jitter buffer. Congestion
on the network can lead to variable interpacket arrival times that result
in choppy audio playback. By using a larger play-out buffer and delaying
play-out until the buffer is nearly full, the application can smooth out
this jitter and avoid choppy audio playback. The downside of using
a large play-out buffer is that it introduces delay in the audio stream. The
delay is not a problem for one-way audio broadcasts, but it can become a problem
if the application is an interactive audio conferencing tool.
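The trade-off between smoothing and delay can be sketched with a small play-out buffer in Python. The class below is a hypothetical illustration: it orders frames by sequence number and withholds output until a configurable number of frames has been buffered. It ignores sequence-number wraparound and packets that arrive after their slot has been played.

```python
import heapq

class PlayoutBuffer:
    """Minimal play-out / de-jitter buffer keyed on RTP sequence number."""

    def __init__(self, depth: int, frame_ms: int = 40):
        self.depth = depth          # frames to accumulate before play-out starts
        self.frame_ms = frame_ms    # duration of audio carried by each frame
        self._heap = []             # min-heap of (sequence, frame)
        self._started = False

    @property
    def added_delay_ms(self) -> int:
        """Delay introduced by pre-filling the buffer."""
        return self.depth * self.frame_ms

    def push(self, seq: int, frame: bytes) -> None:
        heapq.heappush(self._heap, (seq, frame))

    def pop_ready(self):
        """Return the lowest-numbered buffered (seq, frame), or None while
        the buffer is still pre-filling or is empty."""
        if not self._started:
            if len(self._heap) < self.depth:
                return None
            self._started = True
        if not self._heap:
            return None
        return heapq.heappop(self._heap)
```

With 40 ms frames, a depth of 3 adds 120 ms of delay, which is harmless for a one-way broadcast but noticeable in an interactive conversation.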
Because it is useful to know who is participating in the conference
and how well they are receiving the transmission, the audio application periodically
multicasts a receiver report (RR) in an RTCP packet on the control port. These
receiver reports contain the user's name and information on the number of
packets lost and the interarrival jitter for each source in the conference.
Senders can use this information to determine how well their transmissions
are being received by each receiver and, in some cases, change to some other
encoding method to try to improve the reception. RTCP is described in more
detail in the next section.
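The two statistics carried in a receiver report can be sketched as follows. The jitter update uses the running estimate defined in the RTP specification (each new sample moves the estimate by one sixteenth of the difference); the class and function names are hypothetical, and arrival times are assumed to be expressed in the same units as the RTP timestamp.

```python
class JitterEstimator:
    """Running interarrival jitter estimate, in RTP timestamp units."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, arrival_ts: int, rtp_ts: int) -> float:
        # Transit time is only meaningful as a difference between packets,
        # so the sender and receiver clocks need not be synchronized.
        transit = arrival_ts - rtp_ts
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter

def fraction_lost(expected: int, received: int) -> float:
    """Share of expected packets that did not arrive (0.0 if none expected)."""
    if expected <= 0:
        return 0.0
    return max(expected - received, 0) / expected
```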
Senders also periodically multicast sender reports (SRs) in RTCP packets to the
same control port. These sender reports contain the same information as
receiver reports but also include a 20-byte sender information section that
contains timestamps, bytes sent, and packets sent on the data port. Members
of the group can use this information to compute round-trip time
and other statistics on the traffic flow.
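A sketch of the 20-byte sender information section and of the round-trip arithmetic is shown below in Python. The layout (64-bit NTP timestamp, RTP timestamp, packet count, byte count) follows the RTP specification; the round-trip calculation assumes the receiver report echoes the middle 32 bits of the last sender report's NTP timestamp (LSR) and the delay since it was received (DLSR), both in units of 1/65536 second. The function names are hypothetical.

```python
import struct

def pack_sender_info(ntp_sec: int, ntp_frac: int, rtp_ts: int,
                     packets_sent: int, octets_sent: int) -> bytes:
    """Five 32-bit words: NTP seconds, NTP fraction, RTP timestamp,
    sender's packet count, sender's byte count (20 bytes total)."""
    return struct.pack("!IIIII", ntp_sec, ntp_frac, rtp_ts,
                       packets_sent, octets_sent)

def middle_32(ntp_sec: int, ntp_frac: int) -> int:
    """Middle 32 bits of the 64-bit NTP timestamp (16.16 fixed-point seconds)."""
    return ((ntp_sec & 0xFFFF) << 16) | (ntp_frac >> 16)

def round_trip_seconds(arrival: int, lsr: int, dlsr: int) -> float:
    """Round-trip time seen by the sender when a receiver report arrives.

    All three arguments are in 1/65536-second units; the mask handles
    wraparound of the 32-bit values.
    """
    return ((arrival - lsr - dlsr) & 0xFFFFFFFF) / 65536.0
```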
All RTP-based applications use RTCP to periodically transmit session
control information to all participants of the conference to accomplish the
following functions:
1. Provide feedback on the quality of data reception and, in
many cases, modify encoding schemes to improve overall reception quality.
Third-party applications can also use this information to diagnose delivery
problems and to determine areas of the network that are suffering poor reception.
2. Uniquely identify each transport layer source in the conference
by the use of a canonical name (CNAME). This CNAME
may be used to associate several data streams from a given participant as
part of a single multimedia session. This is important if you are trying to
synchronize audio and video data streams.
3. Transmit RTCP packets so the total number of participants
can be determined. This is required of all participants in order to accomplish
functions 1 and 2. The information is necessary so that the rate at which
RTCP control data is transmitted can be adjusted to some small percentage
of the total session bandwidth.
4. Distribute information (username, location, and so on) that
identifies the participants in the session in a user-friendly manner. This
information normally is displayed in the user interface of the application.
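The rate adjustment described above (holding RTCP traffic to a small percentage of session bandwidth as the participant count grows) can be approximated with the arithmetic below. This Python sketch is deliberately simplified: it assumes a 5 percent RTCP share and a 5-second minimum interval, and it omits the sender/receiver bandwidth split and the randomization that the RTP specification also applies. The function name and parameters are hypothetical.

```python
def rtcp_interval_seconds(members: int, avg_rtcp_packet_bytes: float,
                          session_bw_bps: float,
                          rtcp_fraction: float = 0.05,
                          min_interval: float = 5.0) -> float:
    """Approximate time each participant waits between RTCP packets.

    The more participants there are, the longer each one waits, so that
    the combined RTCP traffic stays near rtcp_fraction of the session
    bandwidth.
    """
    rtcp_bw_bps = session_bw_bps * rtcp_fraction
    interval = members * avg_rtcp_packet_bytes * 8 / rtcp_bw_bps
    return max(min_interval, interval)
```

For a 64 kbps session with 100-byte RTCP packets, 10 participants would each report at the 5-second minimum, whereas 1,000 participants would each report roughly every 250 seconds.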
If you are using RTP over IP multicast, functions 1, 2, and 3 are mandatory
to allow the application to scale to a large number of participants. The fact
that many of the popular multimedia multicast applications use the RTP model
has the following very important implication for multicast network design:
Even if the end-station is tuned in only to the video broadcast of the
company meeting and actually is not sending any audio or video data, it still
is multicasting periodic RTCP packets and, therefore, is a multicast source
and receiver. Because the end-station is sending multicast traffic also (albeit
at a low rate), this traffic is likely to cause multicast state to be instantiated
in some or all routers in the network, depending on the multicast routing
protocol in use. The additional state generated by these so-called receiving end-stations
should be (but often is not) considered when doing multicast network
design, because some multicast protocols do not scale well with large numbers
of sources.