Understanding Packet Voice Protocols


Over the past decade, the telecommunications industry has witnessed rapid changes in the way people and organizations communicate. Many of these changes spring from the explosive growth of the Internet and from applications based on the Internet Protocol (IP). The Internet has become a ubiquitous means of communication, and the total amount of packet-based network traffic has quickly surpassed traditional voice (circuit-switched) network traffic (DataQuest, 1998).

In the wake of these technology advancements, it has become clear to telecommunications carriers, companies, and vendors that voice traffic and services will be one of the next major applications to take full advantage of IP. This expectation is based on the impact of a new set of technologies generally referred to as voice over IP (VoIP) or IP telephony.

VoIP supplies many unique capabilities to the carriers and customers who depend on IP or other packet-based networks. The most important benefits include the following:

Cost savings—By moving voice traffic to IP networks, companies can reduce or eliminate the toll charges associated with transporting calls over the Public Switched Telephone Network (PSTN). Service providers and end users can also conserve bandwidth by investing in additional capacity only when it is needed. This is made possible by the distributed nature of VoIP and by reduced operations costs as companies combine voice and data traffic onto one network.
Open standards and multivendor interoperability—By adopting open standards, both businesses and service providers can purchase equipment from multiple vendors and eliminate their dependency on proprietary solutions.
Integrated voice and data networks—By making voice "just another IP application," companies can build truly integrated networks for voice and data. These integrated networks not only provide the quality and reliability of today's PSTN, they also enable companies to quickly and flexibly take advantage of new opportunities within the changing world of communications.

In 1995, the first commercial VoIP products began to hit the market. These products were targeted at companies looking to reduce telecommunications expenses by moving voice traffic to packet networks. Early adopters of VoIP networks built toll-bypass solutions to take advantage of favorable regulatory treatment of IP traffic. Without any established standards, most early implementations were based on proprietary technology.

As these packet telephony networks grew and interconnection dependencies emerged, it became clear that the industry needed standard VoIP protocols. Several groups took up the challenge, resulting in independent standards, each with its own unique characteristics. In particular, network equipment suppliers and their customers were left to sort out the similarities and differences between four different signaling and call-control protocols for VoIP:

H.323
Media Gateway Control Protocol (MGCP)
Session Initiation Protocol (SIP)
H.248/Megaco

In the process of implementing workable VoIP solutions, network engineers had to determine how each of these protocols worked and which ones were best for particular networks and applications.

This paper offers guidance on these VoIP protocols and tries to clear up some of the confusion in the marketplace.

The Great Voice Myth

"The Great Voice Myth" states that there is only one way to build voice networks, and that there should be only one voice protocol for each function in a packet voice network. Although this vision of network nirvana has been discussed in many academic circles, the reality is that multiple VoIP protocols and architectures have already been deployed, they will exist for the foreseeable future, and many networks will continue to be built using multiple VoIP protocols.

Just as today's data networks were built over time using multiple protocols and applications, the VoIP networks of today and tomorrow will be constructed using the protocols and applications that best fit the associated technology and business requirements.

The question that companies must ask is not "Which protocol is best?" but "Which services do we want to deploy and which VoIP protocols best support those services?" The answer to the first question may reflect the bias of a particular vendor or standards body. The answer to the second question depends entirely on the unique requirements of each network implementation. Because no two businesses or networks are exactly the same, each company will answer the second question in its own way.

Making Sense of the VoIP Standards

VoIP comprises many standards and protocols, and a grasp of the basic terminology is needed to understand how VoIP is applied and used. The following definitions serve as a useful starting point (the protocols are listed in alphabetical order):

H.248 is an ITU Recommendation that defines "Gateway Control Protocol." H.248 is the result of a collaboration between the ITU and the Internet Engineering Task Force (IETF). It is also referred to as IETF RFC 2885 (Megaco), which defines a centralized architecture for creating multimedia applications, including VoIP. In many ways, H.248 builds on and extends MGCP.
H.323 is an ITU Recommendation that defines "packet-based multimedia communications systems." In other words, H.323 defines a distributed architecture for creating multimedia applications, including VoIP.
IETF refers to the Internet Engineering Task Force (http://www.ietf.org/), a community of engineers that seeks to determine how the Internet and Internet protocols work, as well as to define the prominent standards.
ITU is the International Telecommunication Union (http://www.itu.int/home/index.html), an international organization within the United Nations System (http://www.unsystem.org/) where governments and the private sector coordinate global telecom networks and services.
Megaco, also known as IETF RFC 2885 and ITU Recommendation H.248, defines a centralized architecture for creating multimedia applications, including VoIP.
Media Gateway Control Protocol (MGCP), also known as IETF RFC 2705, defines a centralized architecture for creating multimedia applications, including VoIP.
Real-Time Transport Protocol (RTP), also known as IETF RFC 1889, defines a transport protocol for real-time applications. Specifically, RTP provides the transport to carry the audio/media portion of VoIP communication. RTP is used by all the VoIP signaling protocols.
Session Initiation Protocol (SIP), also known as IETF RFC 2543, defines a distributed architecture for creating multimedia applications, including VoIP.
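
To make the RTP definition above more concrete, the following minimal Python sketch unpacks the 12-byte fixed header that begins every RTP packet. The names RtpHeader and parse_rtp_header and the sample field values are illustrative assumptions, not part of this paper or of any library. The sketch ignores optional CSRC entries and header extensions.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """Fields of the 12-byte fixed RTP header (hypothetical helper type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Unpack the fixed RTP header; CSRC list and extensions are not parsed."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least a 12-byte fixed header")
    # Network byte order: two single bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Assumed sample: version 2, payload type 0 (G.711 mu-law in the standard
# audio/video profile), sequence number 1, timestamp 160.
sample_packet = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234ABCD) + b"\x00" * 160
header = parse_rtp_header(sample_packet)
```

Because every signaling protocol described here hands the media to RTP, a parser of this kind would look the same whether the call was set up with H.323, SIP, MGCP, or H.248/Megaco.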

Understanding Centralized and Distributed Architectures

In the past, all voice networks were built using a centralized architecture in which dumb endpoints (telephones) were controlled by centralized switches. Although this model worked well for basic telephony services, it mandated a trade-off between simplified management and endpoint and service innovation.

One of the benefits of VoIP technology is that it allows networks to be built using either a centralized or a distributed architecture. This flexibility lets companies build networks that emphasize simplified management, endpoint innovation, or a balance of the two, depending on the protocol used.

In general, centralized architectures are associated with MGCP and H.248/Megaco protocols. These protocols were designed for a centralized device—called a media gateway controller or call agent—that handles switching logic and call control. The centralized device talks to media gateways, which route and transmit the audio/media portion of the calls (the actual voice information).

In centralized architectures, the network intelligence is centralized and endpoints are relatively dumb (with limited or no native features). Although most centralized VoIP architectures use MGCP or H.248/Megaco protocols, it is also possible to build SIP or H.323 networks in a centralized fashion using back-to-back user agents (B2BUAs) or gatekeeper routed call signaling (GKRCS), respectively.

Advocates of centralized VoIP architectures favor this model because it centralizes management, provisioning, and call control; it simplifies call flows for replicating legacy voice features; and it is easy for legacy voice engineers to understand. Critics claim that centralized architectures stifle innovation of endpoint features and will become a hindrance when building VoIP services that move beyond legacy voice features.

Distributed architectures are associated with H.323 and SIP protocols. These protocols allow network intelligence to be distributed between endpoints and call-control devices. Intelligence in this instance refers to call state, calling features, call routing, provisioning, billing, or any other aspect of call handling. The endpoints can be VoIP gateways, IP phones, media servers, or any device that can initiate and terminate a VoIP call. The call-control devices are called gatekeepers in an H.323 network, and proxy or redirect servers in a SIP network.

Advocates of distributed architectures favor this model because of its flexibility. It allows VoIP applications to be treated like any other distributed IP application, and it allows the flexibility to add intelligence to either endpoints or call-control devices, depending on the business and technology requirements of the network. Distributed architectures are usually well understood by engineers who run IP data networks. Critics of distributed architectures commonly point to the existing PSTN infrastructure as the only reference model that should be used when trying to replicate legacy voice services. They also note that distributed networks tend to be more complex.
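
The association between protocols and architectures described in this section can be summarized as a small lookup structure. The Python sketch below is only one way to capture it; the names ProtocolProfile, PROFILES, protocols_for, and centralized_options are hypothetical, and the entries restate what the text says rather than add new facts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProtocolProfile:
    """How a protocol is usually deployed (illustrative, taken from the text)."""
    architecture: str               # "centralized" or "distributed"
    call_control: str               # device that performs call control
    centralized_via: Optional[str]  # mechanism for a centralized deployment


PROFILES: Dict[str, ProtocolProfile] = {
    "H.323": ProtocolProfile(
        "distributed", "gatekeeper",
        "gatekeeper routed call signaling (GKRCS)"),
    "SIP": ProtocolProfile(
        "distributed", "proxy or redirect server",
        "back-to-back user agent (B2BUA)"),
    "MGCP": ProtocolProfile(
        "centralized", "call agent / media gateway controller", None),
    "H.248/Megaco": ProtocolProfile(
        "centralized", "call agent / media gateway controller", None),
}


def protocols_for(architecture: str) -> List[str]:
    """Return the protocols normally associated with an architecture."""
    return sorted(name for name, profile in PROFILES.items()
                  if profile.architecture == architecture)


def centralized_options() -> Dict[str, str]:
    """Describe how each protocol can be run in a centralized fashion."""
    return {name: profile.centralized_via or "native design"
            for name, profile in PROFILES.items()}


distributed = protocols_for("distributed")
centralized = protocols_for("centralized")
```

Note that centralized_options returns an entry for all four protocols, which mirrors the point that SIP and H.323 can also be deployed centrally even though they are usually associated with distributed designs.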

H.323

H.323 was originally created to provide a mechanism for transporting multimedia applications over local-area networks. Although H.323 is still used by numerous vendors for videoconferencing applications, it has rapidly evolved to address the growing needs of VoIP networks. Because of its early availability and these advancements, H.323 is currently the most widely used VoIP signaling and call-control protocol, with international and domestic carriers relying on it to handle billions of minutes of use each year.

H.323 is considered an "umbrella protocol" because it defines all aspects of call transmission, from call establishment to capabilities exchange to network resource availability. H.323 defines the Registration, Admission, and Status (RAS) protocol for call routing, H.225 for call setup, and H.245 for capabilities exchange.

Figure 1
H.323 Networks

H.323 is based on the Integrated Services Digital Network (ISDN) Q.931 protocol, which allows it to easily interoperate with legacy voice networks such as the PSTN or Signaling System 7 (SS7).

As a protocol used in a distributed architecture, H.323 allows companies to build large-scale networks that are scalable, resilient, and redundant. It provides mechanisms for interconnecting with other VoIP networks, and supports network intelligence on either the endpoints or the gatekeepers.

MGCP/H.248/Megaco

MGCP and H.248/Megaco were designed to provide an architecture where call control and services could be centrally added to a VoIP network. In that sense, an architecture using these protocols closely resembles the existing PSTN architecture and services.

MGCP and H.248/Megaco define most aspects of signaling using a model called packages. These packages define commonly used functionality, such as PSTN signaling, line-side device connectivity, and features such as transfer and hold. In addition, Session Description Protocol (SDP) is used to convey capabilities exchange.
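
As an illustration of how SDP conveys capabilities, the Python sketch below reads the audio media line and the rtpmap attributes from a session description and lists the codecs being offered. The function name audio_codecs, the addresses, and the sample description are assumptions made for this example. A production parser would need to handle many more SDP fields and multiple media sections.

```python
from typing import Dict, List

# Hypothetical session description offering two G.711 variants plus
# telephone-event for DTMF; the address is from a documentation range.
SAMPLE_SDP = "\n".join([
    "v=0",
    "o=- 2890844526 2890842807 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 8 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:101 telephone-event/8000",
])


def audio_codecs(sdp: str) -> Dict[int, str]:
    """Map each offered audio payload type to its rtpmap encoding name."""
    payload_types: List[int] = []
    rtpmap: Dict[int, str] = {}
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("m=audio "):
            # m=<media> <port> <transport> <payload type> ...
            payload_types = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            rtpmap[int(pt)] = encoding
    # Payload types without an rtpmap line are reported as unnamed.
    return {pt: rtpmap.get(pt, "unnamed") for pt in payload_types}


offered = audio_codecs(SAMPLE_SDP)
```

Two endpoints can then find a common codec by intersecting the values returned for each side's description, which is the essence of the capabilities exchange that SDP supports.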

Figure 2
MGCP/H.248/Megaco Networks

In a centralized architecture, MGCP and H.248/Megaco allow companies to build large-scale networks that are scalable, resilient, and redundant. These protocols provide mechanisms for interconnecting with other VoIP networks and for adding intelligence and features to the call agent.

SIP

SIP was designed as a multimedia protocol that could take advantage of the architecture and messages found in popular Internet applications. By using a distributed architecture—with URLs for naming and text-based messaging—SIP attempts to take advantage of the Internet model for building VoIP networks and applications. In addition to VoIP, SIP is used for videoconferencing and instant messaging.

As a protocol, SIP only defines how sessions are to be set up and torn down. It utilizes other IETF protocols to define other aspects of VoIP and multimedia sessions, such as SDP for capabilities exchange, URLs for addressing, Domain Name System (DNS) for service location, and Telephony Routing over IP (TRIP) for call routing.
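
Because SIP messages are plain text, a session setup request can be inspected with ordinary string handling. The Python sketch below splits a request into its request line, headers, and body. The function name parse_sip_request and the sample message, including its addresses and identifiers, are hypothetical. The sketch deliberately ignores details such as repeated headers, folded header lines, and compact header names.

```python
from typing import Dict, Tuple

# Hypothetical INVITE with no body; SIP separates lines with CRLF and
# ends the header section with an empty line.
SAMPLE_INVITE = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP pc1.example.org:5060",
    "From: <sip:alice@example.org>",
    "To: <sip:bob@example.com>",
    "Call-ID: 12345@pc1.example.org",
    "CSeq: 1 INVITE",
    "Content-Length: 0",
]) + "\r\n\r\n"


def parse_sip_request(raw: str) -> Tuple[str, str, str, Dict[str, str], str]:
    """Return (method, request URI, version, headers, body) for a request."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        # Split at the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


method, uri, version, headers, body = parse_sip_request(SAMPLE_INVITE)
```

When an INVITE carries a body, it is typically an SDP description, which is how SIP delegates capabilities exchange to SDP instead of defining its own mechanism.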

Figure 3
SIP Networks

Although the IETF has made great progress defining extensions that allow SIP to work with legacy voice networks, the primary motivation behind the protocol is to create an environment that supports next-generation communication models that utilize the Internet and Internet applications.

As a protocol used in a distributed architecture, SIP allows companies to build large-scale networks that are scalable, resilient, and redundant. It provides mechanisms for interconnecting with other VoIP networks and for adding intelligence and new features on either the endpoints or the SIP proxy or redirect servers.

Interconnecting VoIP Protocols

VoIP networks continue to be deployed at a rapid pace, and VoIP vendors and service providers continue to add new functionality. Because vendor support for each protocol differs and companies have varying business requirements, it is very likely that VoIP networks will continue to be made up of multiple protocols.

Having various protocols gives customers the flexibility they need to connect services from multiple carriers. Using standards, even multiple standards, still simplifies deployment of multivendor endpoints and increases options for network management and provisioning.

As companies expand their networks, they are faced with choices about how to interconnect segments using differing VoIP protocols. These choices often fall into one of three categories:

Translation through time-division multiplexing (TDM)—In this model, a company uses either TDM equipment or VoIP gateways to translate from one protocol domain to another. The benefit of this model is that it can be used today. The downside is that it introduces latency into the VoIP network, and involves yet another protocol translation (VoIP no. 1 <-> TDM <-> VoIP no. 2). This model is usually considered a short-term solution until IP-based protocol translators are available.
Single protocol architecture—In this model, a company moves all its VoIP devices and services to a single protocol, simplifying the network as a whole. The downside to this approach is that it might not be possible to migrate existing equipment to support the new protocol, a situation that can limit the company's ability to take advantage of some existing services. In addition, it limits the potential connectivity to other networks that are using other VoIP signaling protocols.
Protocol translation—In this model, a company uses IP-based protocol translators to interconnect two or more VoIP protocol domains. IP translators allow a company to retain the flexibility of using multiple VoIP protocols, do not introduce the delay problems that additional TDM interconnections do, and do not require a wholesale replacement or swap of existing equipment.

The downside to protocol translation is that there is no standard for it, so not all VoIP protocol translators are exactly the same. Although the IETF has attempted to define a model for translating H.323 to SIP, it involves more than just building a protocol-translation box.
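
One way to size an interconnection problem is to list which protocol pairs actually meet in a network. The Python sketch below does this for a hypothetical set of network segments; the function name protocol_pairs_needing_translation and the segment names are assumptions made for illustration only.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def protocol_pairs_needing_translation(
        segments: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Return every distinct pair of protocols used across the segments.

    Each pair is a boundary that needs a TDM hop, an IP-based translator,
    or a migration to a single protocol.
    """
    protocols = sorted(set(segments.values()))
    return set(combinations(protocols, 2))


# Hypothetical network: segment name -> signaling protocol in use.
segments = {
    "headquarters": "H.323",
    "branch-offices": "SIP",
    "carrier-access": "MGCP",
    "lab": "SIP",
}
pairs = protocol_pairs_needing_translation(segments)

# A single-protocol architecture leaves no boundaries to translate.
single = protocol_pairs_needing_translation({"a": "SIP", "b": "SIP"})
```

The number of pairs grows quickly as protocols are added, which is one reason the lack of a standard for protocol translation matters in multiprotocol networks.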

As shown in Table 1, although the protocols are broadly similar, they differ in important details. Vendors of protocol translators need in-depth knowledge of all the protocols being used in the VoIP network, and they must be aware of how various VoIP components utilize different aspects of each protocol.

For example, H.323 and SIP can send dual tone multifrequency (DTMF) digits in either the signaling path or the media path (via RTP). But H.323 mandates only that the H.245 signaling path be used, and SIP does not specify how DTMF should be carried. This means that SIP devices could be sending DTMF in the media path (RFC 2833), and H.323 devices could be sending DTMF in the signaling path (H.245). If the VoIP protocol translator cannot properly recognize both the signaling path and the media path, then it might not function properly.
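
The DTMF mismatch described above can be expressed as a simple check on which path each side uses. The Python sketch below classifies the work a translator would have to do for a given pair of DTMF methods. The names DtmfPath, DTMF_METHODS, and dtmf_interworking, and the wording of the returned strings, are hypothetical and only restate the signaling and media options listed in Table 1.

```python
from enum import Enum


class DtmfPath(Enum):
    SIGNALING = "signaling"
    MEDIA = "media"


# DTMF relay methods named in this paper and the path each one uses.
DTMF_METHODS = {
    "H.245": DtmfPath.SIGNALING,     # H.323 signaling path
    "SIP INFO": DtmfPath.SIGNALING,  # SIP signaling path
    "RFC 2833": DtmfPath.MEDIA,      # carried in RTP, the media path
}


def dtmf_interworking(method_a: str, method_b: str) -> str:
    """Describe what a translator must do between two DTMF methods."""
    path_a = DTMF_METHODS[method_a]
    path_b = DTMF_METHODS[method_b]
    if method_a == method_b:
        return "pass through unchanged"
    if path_a == path_b:
        return "translate between signaling methods"
    return "convert between signaling path and media path"


# The case from the text: H.323 side uses H.245, SIP side uses RFC 2833.
case_from_text = dtmf_interworking("H.245", "RFC 2833")
```

The last branch is the difficult case, because the translator has to watch both the signaling stream and the RTP stream rather than signaling alone.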

	H.323	SIP	MGCP/H.248/Megaco
Standards body	ITU	IETF	MGCP/Megaco—IETF; H.248—ITU
Architecture	Distributed	Distributed	Centralized
Current version	H.323v4	RFC2543-bis07	MGCP 1.0, Megaco, H.248
Call control	Gatekeeper	Proxy/Redirect Server	Call agent/media gateway controller
Endpoints	Gateway, terminal	User agent	Media gateway
Signaling transport	Transmission Control Protocol (TCP) or User Datagram Protocol (UDP)	TCP or UDP	MGCP—UDP; Megaco/H.248—both
Multimedia capable	Yes	Yes	Yes
DTMF-relay transport	H.245 (signaling) or RFC 2833 (media)	RFC 2833 (media) or INFO (signaling)	Signaling or RFC 2833 (media)
Fax-relay transport	T.38	T.38	T.38
Supplemental services	Provided by endpoints or call control	Provided by endpoints or call control	Provided by call agent

Table 1: Details of VoIP Protocols

Conclusion—Packet Voice and VoIP Is About Services, Not Protocols

Just as companies choose various protocols for their data networks, they will choose various protocols for their VoIP requirements, depending on the business and technical requirements at hand. Although the variety in VoIP protocols has caused some confusion in the marketplace, it is precisely this protocol flexibility that makes VoIP-based voice systems so much more useful than legacy voice systems.

Companies should choose vendors based on three very important requirements:

Customers need vendors that are committed to supporting open standards within their products, and are actively developing voice strategies that consider interoperability with all VoIP protocols. Without this commitment, VoIP systems are in danger of becoming as proprietary as legacy voice systems.
Customers need products that support multiple protocols. This way, if a company finds that it needs to migrate its systems or add products that support a different protocol, it will not be required to perform upgrades to the network.
Customers need voice solutions with end-to-end support for all VoIP protocols, meaning vendors must provide solutions that work in both single-protocol and multiprotocol environments.

By working with vendors that can provide this VoIP flexibility, companies can focus on building scalable and reliable networks that support the requirements of next-generation networks.
