The ABCs of SIP
A Brief Introduction to Session Initiation Protocol
Chances are you’ve already heard about SIP, but maybe you’re thirsting for more information, like “What is SIP?” and “Why is it important?” In
this paper, we’ll explain the ABCs of SIP in a simple way.
Let’s Start at the Beginning
SIP is an acronym for Session Initiation Protocol. That is, it’s a signaling protocol for initiating (and terminating) multimedia sessions such as
voice calls and video conferences.
What’s so great about SIP and why is everyone talking about it? Well, for one thing, SIP opens the door to multi-user multimedia sessions
from any SIP-enabled device to any other. Multi-user means that cell phones, smartphones, IP phones and computers can all join a single
session at the same time. Multimedia means a session no longer has to be voice or video or instant messaging; it can be voice AND video AND
instant messaging.
Secondly, SIP saves money. Because it’s based on IP (Internet Protocol) and travels over standard IP transit protocols, SIP can share the
same network connections as data communications instead of requiring costly, dedicated voice circuits. How much money can you save
with SIP? For large enterprises, the annual savings can be in the millions of dollars. Maybe they should have called it GULP instead!
The Story of SIP:
A Very Brief History
SIP was designed to take advantage of what many people saw as a
global shift toward IP communications and applications, so it is
based on existing Internet protocols like HTTP, TCP and SMTP and
technologies like the Domain Name System (DNS). And they were right;
today, there is a marked migration away from dedicated voice
networks like the traditional telephony system and toward
converged voice/data/video networks over IP. By basing SIP on
existing IP standards, it’s now easier than ever to build voice-enabled
SIP: The Super Signaling Protocol
A multimedia session such as a Voice over IP (VoIP) call or
videoconference consists of two main parts: the signaling
and the media. Think of when you meet someone for the first
time: you might begin by making eye contact, shaking hands
and exchanging names, then have a conversation and, finally,
say goodbye to one another. Multimedia sessions work much
the same way: the conversation itself is the media part and
everything else (from hello to goodbye) is the signaling part.
SIP accomplishes several things during a session:
> > It locates the other device(s) you want to talk with
on any network;
> > It initiates and terminates the session;
> > It handles changes to the session, like adding new users
or new kinds of media (e.g., adding video to a voice call);
> > It negotiates session capabilities between devices
(e.g., which voice or video codecs will be used).
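The capability negotiation in the last item can be sketched in a few lines. In practice SIP carries this negotiation in a session description (SDP) inside its messages; the sketch below skips that format entirely and simply picks the first codec in the caller's preference order that the called device also supports. The function name `negotiate_codec` and the two codec lists are illustrative assumptions, not part of SIP itself.

```python
from typing import Optional, Sequence


def negotiate_codec(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Return the first offered codec the other device supports, or None."""
    supported_set = {name.lower() for name in supported}
    for codec in offered:  # the offer is in the caller's preference order
        if codec.lower() in supported_set:
            return codec
    return None  # no codec in common, so no media can be set up


# Hypothetical devices: Alice prefers Opus, Bob only handles G.729 and PCMU.
alice_offer = ["opus", "PCMU", "G729"]
bob_supports = ["G729", "PCMU"]
print(negotiate_codec(alice_offer, bob_supports))
```

Honoring the caller's order is one simple policy; a real device may apply its own preferences when it answers.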
A Sample SIP Call Flow
A basic SIP call flow between two SIP devices is fairly
straightforward and involves only a handful of messages that
mimic the way a traditional call is made. These messages include
the invitation (called a SIP INVITE), a ringing response (180 Ringing)
that generates a ringing sound on the caller’s device, an OK response
(200 OK) when the called SIP device answers, an acknowledgment (ACK)
from the caller, and the session termination (BYE), which is also
confirmed with a 200 OK. In
between is the two-way passage of RTP media. A sample call flow
between Alice and Bob, for example, would look like this:
[Figure: Successful Session Establishment. Messages between Alice and Bob, in order: INVITE F1, 180 Ringing F2, 200 OK F3, ACK F4, Both Way RTP Media, BYE F5, 200 OK F6]
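The message sequence of this call flow can also be written out as data, which makes the order and direction of each step explicit. This is only a sketch: the function name `basic_call_flow` is made up for illustration, and which party hangs up is an assumption passed in as `ended_by`, since either side may end a call.

```python
from typing import List, Tuple

Step = Tuple[str, str, str]  # (sender, receiver, message)


def basic_call_flow(caller: str, callee: str, ended_by: str) -> List[Step]:
    """List the signaling and media steps of one successful call, in order."""
    if ended_by not in (caller, callee):
        raise ValueError("ended_by must be the caller or the callee")
    other = callee if ended_by == caller else caller
    return [
        (caller, callee, "INVITE"),       # invitation to start a session
        (callee, caller, "180 Ringing"),  # callee's device is alerting
        (callee, caller, "200 OK"),       # callee answered
        (caller, callee, "ACK"),          # caller confirms the answer
        # Media flows in both directions; only one direction is listed here.
        (caller, callee, "RTP media (both ways)"),
        (ended_by, other, "BYE"),         # one side hangs up
        (other, ended_by, "200 OK"),      # hang-up confirmed
    ]


for sender, receiver, message in basic_call_flow("Alice", "Bob", ended_by="Bob"):
    print(f"{sender:>5} -> {receiver:<5}  {message}")
```

Everything in the list except the RTP entry is signaling; the RTP entry stands for the conversation itself.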
SIP Sessions in an IP Network
In addition to a SIP-enabled phone or computer (known as a SIP User Agent), SIP sessions rely on several network elements to help them
initiate and terminate calls. These include a SIP Registrar Server to locate the IP address of the SIP User Agent being called, a SIP Proxy
Server to send the SIP message to the IP address and, in the case of sessions that involve more than one network, a SIP Redirect Server to
help forward the SIP message to a peer network’s SIP Proxy Server.
1. When a phone number is dialed, the SIP User Agent sends a SIP message to the Proxy Server in Network A.
2. The SIP Proxy Server queries the SIP Redirect Server in Network A for the location of the Proxy Server in Network B.
3. The SIP message is then forwarded from the Proxy Server in Network A to the Proxy Server in Network B.
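Steps 1 to 3 amount to a lookup followed by a forward, which the following sketch models with a plain dictionary. The server names, the `user@domain` address form, and the function `route_to_peer_network` are all hypothetical stand-ins chosen for illustration; a real deployment would exchange SIP messages with actual servers instead of consulting an in-memory table.

```python
from typing import Dict, List

# Hypothetical stand-in for the Redirect Server in Network A:
# maps a destination domain to the peer network's Proxy Server.
REDIRECT_SERVER_A: Dict[str, str] = {
    "network-b.example": "proxy.network-b.example",
}
PROXY_A = "proxy.network-a.example"  # assumed name for Network A's proxy


def route_to_peer_network(user_agent: str, dialed: str) -> List[str]:
    """Return the hops a SIP message visits in steps 1 to 3."""
    _user, sep, domain = dialed.partition("@")
    if not sep or not domain:
        raise ValueError("expected an address of the form user@domain")

    hops = [user_agent, PROXY_A]                # step 1: User Agent -> Proxy A
    peer_proxy = REDIRECT_SERVER_A.get(domain)  # step 2: ask the Redirect Server
    if peer_proxy is None:
        raise LookupError(f"no peer proxy known for {domain}")
    hops.append(peer_proxy)                     # step 3: Proxy A -> Proxy B
    return hops


print(" -> ".join(route_to_peer_network("alice-phone", "5550100@network-b.example")))
```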