Introduction to Voice over IP

Definition: The terms "IP Telephony", "Internet Telephony", "Voice over IP (VoIP)" are all related and are often used as synonyms. 

VoIP is the collection of technologies that emulates and extends today's circuit-switched telecommunication services to operate on packet-switched data networks based on the Internet Protocol (IP). Internet telephony refers to communications services - voice, facsimile, and/or voice-messaging applications - that are transported via the Internet rather than the Public Switched Telephone Network (PSTN). The basic steps involved in originating an Internet telephone call are conversion of the analog voice signal to digital format and compression/translation of the signal into Internet Protocol (IP) packets for transmission over the Internet; the process is reversed at the receiving end. An IP telephone is a telephone device that transports voice over a network using data packets instead of circuit-switched connections over voice-only networks. IP Telephony refers to the transfer of voice over the Internet Protocol (IP) of the TCP/IP protocol suite. Other Voice Over Packet (VOP) standards exist for Frame Relay and ATM networks, but many people use the terms "Voice over IP (VoIP)" or "IP Telephony" to mean voice over any packet network.

 

From Public Switched Telephone Network to IP Telephony

Today's Public Switched Telephone Network (PSTN) provides its users with a dedicated, end-to-end circuit connection for the duration of each call. Based on the calling and called parties' numbers, circuits are reserved among the originating switch, any switches along the route between the two ends of the call, and the terminating switch. Signaling between these PSTN switches supports basic call setup, call management, and call teardown, as well as querying of databases to support advanced services such as local number portability, mobile subscriber authentication and roaming, virtual private networking, and toll-free service.

The PSTN has served voice traffic well over the last 100 years, but its success has been paralleled by the rise of separate networks to support data traffic. Clearly, use of distinct networks for voice and data represents an additional burden to service providers and an additional cost to consumers. As more and more PSTN traffic becomes data-oriented, however, the trend toward voice and data network convergence becomes stronger and stronger. Service providers, Internet service providers, and manufacturers of switching, transmission, and customer premises equipment are all participating in a significant shift of the telecommunications industry toward combined voice/data networking using IP.

The data carried by an IP network can be as simple as transactional queries and responses or as complex as broadband multimedia services. In particular, the technology of IP telephony supports all the functions of voice communications, fax communications, routing, authorization, authentication, accounting, billing, and network management that are now provided by the PSTN. Today's "voice-over-IP" technology can reconstruct the signals to compensate for echoes, jitter, and dropped packets, with exceptional performance possible if the IP network is a managed network with guaranteed Quality of Service. Vendors and service providers alike are striving to develop new standards that will enable connection-oriented services like IP telephony to retain high priority when carried over a connectionless IP network.

The shift to IP telephony promises better efficiencies in the transport of voice and data, and, as a result, lower telecommunications costs to end users. Moreover, as IP telephony evolves, it will be able to match all the features of voice communications currently supported by the PSTN. Interoperability among the IP telephony products of different vendors is the first major hurdle to overcome. The real promise of IP telephony, however, will be realized with the next wave of advanced services that will begin to surpass the capabilities of the PSTN.

 

Classifications

A VoIP connection can be classified by the type of devices performing the Internet call. Please note that the term PC is applied here to any device capable of transmitting voice over a data network. Such a device does not necessarily have all the features of a standard computer. It could look just like a traditional telephone, with only the basic elements of a computer needed to execute an Internet call. With this in mind, we have the following generic classifications.

PC to PC

Figure 1 PC to PC Scenario

This scenario suits users who already have Internet access and an audio-capable PC. It can take advantage of integration with other Internet services such as the World Wide Web, instant messaging, e-mail, etc.

PC to telephone or telephone to PC

Figure 2  PC to Phone or Phone to PC Scenario

In this scenario, PC callers can also reach PSTN users. A gateway that converts the Internet call into a PSTN call has to be used. Traditional telephone users can likewise call a PC by going through the gateway that connects the IP network with the PSTN.

Telephone to telephone

Figure 3  Phone to Phone Scenario

In this scenario, the IP network serves as a dedicated backbone that interconnects PSTN segments. Gateways connect the PSTN to the IP network at both ends of the call.

 

Benefits of using VoIP

Voice communications will certainly remain a basic form of interaction among people. A simple replacement of the PSTN is hard to implement in the short term. The immediate goal for many VoIP service providers is to reproduce existing telephone capabilities at a significantly lower cost and to offer a quality of service competitive with the PSTN. In general, the benefits of VoIP technology are the following.

Low-Cost PSTN Options

By avoiding traditional telephony access charges and settlement, a caller can significantly reduce the cost of long distance calls. Although the cost reduction is somewhat related to future regulations, VoIP certainly adds an alternate option to existing PSTN services. 

Network efficiency

Packetized voice offers much higher bandwidth efficiency than circuit-switched voice because it does not take up any bandwidth in listening mode or during pauses in a conversation. It is a big saving when we consider a significant part of a conversation is silence. The network efficiency can also be improved by removing the redundancy in certain speech patterns. If we were to use the same 64 Kbps Pulse Code Modulation (PCM) digital-voice encoding method in both technologies, we would see that bandwidth consumption of packetized voice is only a fraction of the consumption of circuit-switched voice. The packetized voice can take advantage of the latest voice-compression algorithms to improve efficiency.
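As a rough illustration of the arithmetic behind this claim, the following Python sketch compares a 64 kbps circuit with a packetized PCM stream that sends nothing during silence. The 20 ms packet interval, the 40-byte IPv4/UDP/RTP header, and the 40% voice-activity factor are illustrative assumptions rather than figures from this article.

```python
CIRCUIT_BPS = 64_000   # 64 kbps PCM channel, reserved for the whole call
INTERVAL_MS = 20       # assumed packet interval
HEADER_BYTES = 40      # assumed IPv4 (20) + UDP (8) + RTP (12) headers
VOICE_ACTIVITY = 0.40  # assumed fraction of the call in which a party talks


def packet_rate_bps(codec_bps, interval_ms, header_bytes, activity=1.0):
    """Average one-way bit rate of a packetized voice stream."""
    payload_bits = codec_bps * interval_ms / 1000
    packet_bits = payload_bits + header_bytes * 8
    packets_per_second = 1000 / interval_ms
    return packet_bits * packets_per_second * activity


# Rate while a party is talking (headers included, no silence suppression).
talking_bps = packet_rate_bps(CIRCUIT_BPS, INTERVAL_MS, HEADER_BYTES)
# Average rate when packets are sent only during speech.
average_bps = packet_rate_bps(CIRCUIT_BPS, INTERVAL_MS, HEADER_BYTES, VOICE_ACTIVITY)

print(f"circuit-switched:         {CIRCUIT_BPS / 1000:.0f} kbps")
print(f"packetized, talking:      {talking_bps / 1000:.0f} kbps")
print(f"packetized, call average: {average_bps / 1000:.0f} kbps")
```

Under these assumptions the packet headers raise the rate to 80 kbps while a party is talking, and it is silence suppression that brings the average down to 32 kbps, half of the circuit's constant 64 kbps. A compressing codec would lower the figure further.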

Simplification and consolidation

An integrated infrastructure that supports all forms of communication allows more standardization and reduces the total equipment and management cost. The combined infrastructure could support bandwidth optimization and a fault-tolerant design. Universal use of the IP protocols for all applications reduces complexity and provides more flexibility. Directory services and security services could be more easily shared.

Even though basic telephony and facsimile are the initial applications for VoIP, the longer-term benefits are expected to be derived from multimedia and multi-service applications. Combining voice and data features into new applications will provide a significant return over the longer term. In our project we will discuss the development challenges and the basic system components involved in implementing VoIP technology. We will also present the two most relevant protocols - H.323 and SIP - and compare them.

 

Basic system components of Voice over IP

In VoIP systems, analog voice signals are digitized and transmitted as a stream of packets over a digital data network. IP networks allow each packet to independently find the most efficient path to the intended destination, thereby best using the network resources at any given instant. The packets associated with a single source may thus take many different paths to the destination in traversing the network, arriving with different end-to-end delays, arriving out of sequence, or possibly not arriving at all. At the destination, however, the packets are reassembled and converted back into the original voice signal. VoIP technology ensures proper reconstruction of the voice signals, compensating for echoes made audible by the end-to-end delay, for jitter, and for dropped packets.
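The packetize-and-reassemble cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real VoIP stack: the 160-byte frame size (20 ms of 8 kHz, 8-bit audio), the function names, and the use of zero bytes as a stand-in for silence are all hypothetical choices.

```python
FRAME_BYTES = 160  # assumed: 20 ms of 8 kHz, 8-bit audio per packet


def packetize(audio, frame_bytes=FRAME_BYTES):
    """Split a byte stream into (sequence_number, payload) packets."""
    return [
        (seq, audio[start:start + frame_bytes])
        for seq, start in enumerate(range(0, len(audio), frame_bytes))
    ]


def reassemble(packets, total_packets, frame_bytes=FRAME_BYTES):
    """Put packets back in sequence order; fill missing ones with placeholder silence."""
    received = dict(packets)
    silence = bytes(frame_bytes)  # zero bytes stand in for a silent frame
    return b"".join(received.get(seq, silence) for seq in range(total_packets))


audio = bytes(range(256)) * 5                    # 1280 bytes of dummy "voice" = 8 packets
sent = packetize(audio)
arrived = [p for p in sent if p[0] != 3]         # packet 3 is lost in the network
arrived.reverse()                                # the rest arrive out of sequence
restored = reassemble(arrived, total_packets=len(sent))
print(len(sent), "packets sent,", len(arrived), "arrived,", len(restored), "bytes played")
```

The receiver relies only on the sequence numbers, so the order of arrival does not matter. The lost packet shows up as one silent frame, which is where the loss-concealment techniques discussed later would come in.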


Figure 4 General VoIP Network

 

Basic Elements

There are three major system components to VoIP technology: clients, servers, and gateways.

Clients

The client comes in two basic forms. The first is a suite of software running on a user’s PC that allows the user, through a GUI, to set up and clear voice calls; to encode, packetize, and transmit outbound voice information from the user’s microphone; and to receive, decode, and play inbound voice information through the user’s speakers or headset. The other type of client, known as a ‘virtual’ client, does not have a direct user interface, but resides in gateways and provides an interface for users of plain old telephone service (POTS).

Servers

In order for IP Telephony to work and to be viable as a commercial enterprise, a wide range of complex database operations, both real-time and non-real-time, must occur transparently to the user. Such applications include user validation, rating, accounting, billing, revenue collection, revenue distribution, routing (least cost, least latency or other algorithms), management of the overall service, downloading of clients, fulfillment of service, registration of users, directory services, and more.

Gateways

VoIP technology allows voice calls originated and terminated at standard telephones supported by the PSTN to be conveyed over IP networks. VoIP "gateways" provide the bridge between the local PSTN and the IP network for both the originating and terminating sides of a call. To originate a call, the calling party will access the nearest gateway either by a direct connection or by placing a call over the local PSTN and entering the desired destination phone number.

The VoIP technology translates the destination telephone number into the data network address (IP address) associated with a corresponding terminating gateway nearest to the destination number. Using the appropriate protocol and packet transmission over the IP network, the terminating gateway will then initiate a call to the destination phone number over the local PSTN to completely establish end-to-end two-way communications. Despite the additional connections required, the overall call set-up time is not significantly longer than with a call fully supported by the PSTN.
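One simple way to picture this translation step is a longest-prefix lookup from the dialed digits to a gateway address. The sketch below is only an illustration: the routing table, its prefixes, and the IP addresses (taken from the documentation address ranges) are hypothetical, and real deployments typically resolve numbers through gatekeepers or directory services rather than a static dictionary.

```python
# Hypothetical routing table: dialed-number prefix -> terminating gateway IP address.
GATEWAYS = {
    "1212": "192.0.2.10",    # assumed gateway for one area code
    "1": "192.0.2.1",        # assumed default gateway for the rest of country code 1
    "44": "198.51.100.5",    # assumed gateway for country code 44
}


def find_gateway(number, table=GATEWAYS):
    """Return the gateway whose prefix matches the most leading digits, or None."""
    digits = "".join(ch for ch in number if ch.isdigit())
    for length in range(len(digits), 0, -1):
        gateway = table.get(digits[:length])
        if gateway is not None:
            return gateway
    return None


print(find_gateway("+1 212 555 0100"))   # most specific match: the "1212" entry
print(find_gateway("1-415-555-0100"))    # falls back to the "1" entry
print(find_gateway("+33 1 23 45 67"))    # no matching prefix in this table
```

Trying the longest prefix first is what sends a call to the gateway "nearest to the destination number" when a more specific entry exists.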

The gateways must employ a common protocol - for example, H.323, SIP, or a proprietary protocol - to support standard telephony signaling. The gateways emulate the functions of the PSTN in responding to the telephone's on-hook or off-hook state, receiving or generating DTMF digits, and receiving or generating call progress tones. Recognized signals are interpreted and mapped to the appropriate message for relay to the communicating gateway in order to support call set-up, maintenance, billing, and call tear-down.
 

 

Protocol architecture and signaling for VoIP

Internet telephony requires a range of protocols, from those needed for transporting real-time data across the network, to QoS-aware routing protocols, to resource reservation, QoS-aware network management, and billing protocols. In addition, VoIP requires a means for prospective communications partners to find each other and to signal to the other party their desire to communicate. We refer to this functionality as Internet telephony signaling. The need for signaling functionality distinguishes Internet telephony from other Internet multimedia services such as broadcast and media-on-demand services.

Unlike circuit-switched telephony, where a single signaling system covers most functions, Internet telephony services are built on a range of packet-switched protocols. For example, the functionality of the SS7 (Signaling System No. 7) telephony signaling protocol encompasses routing, resource reservation, call admission, address translation, call establishment, call management, and billing. In the Internet environment, by contrast, routing is handled by protocols such as BGP, and resource reservation by RSVP or other resource reservation protocols.

Table 1  ISO Reference Model and VoIP Standards and Protocols

ISO protocol layer    Protocols and standards
Presentation          Codecs / Applications
Session               H.323 / SIP / MGCP
Transport             RTP / TCP / UDP
Network               IP
Link                  FR, ATM, Ethernet, PPP, HDLC, etc.

 

H.323 - ITU Standard for conducting voice over IP. It includes several related standards such as H.225 (call control), H.245 (media path and parameter negotiation), H.235 (security), H.225 Registration, Admission and Status (H.323 Gatekeeper protocol), and H.450 (supplementary services).

SIP - Session Initiation Protocol, originally defined in IETF RFC 2543 (since superseded by RFC 3261).

MGCP - Media Gateway Control Protocol, an IETF draft standard for controlling voice gateways through IP networks.

VoIP protocols typically use the Real-time Transport Protocol (RTP) for the media stream. RTP uses the User Datagram Protocol (UDP) as its transport protocol. Voice signaling traffic often uses the Transmission Control Protocol (TCP) as its transport. The IP layer provides routing and network-level addressing, while the link layer protocol controls and directs the transmission of the information over the physical medium.
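To make the media layering concrete, here is a minimal Python sketch that builds and parses the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp, and synchronization source identifier). The helper names are hypothetical, and the sketch deliberately ignores padding, header extensions, and CSRC entries. The resulting bytes, followed by the encoded voice frame, would form the payload of a UDP datagram.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_FORMAT = "!BBHII"  # network byte order: 2 flag bytes, seq, timestamp, SSRC


def build_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack a fixed RTP header with no padding, extension, or CSRC list."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(RTP_HEADER_FORMAT, byte0, byte1,
                       sequence & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(data):
    """Unpack the first 12 bytes of an RTP packet into a dictionary."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack(RTP_HEADER_FORMAT, data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


header = build_rtp_header(payload_type=0, sequence=1, timestamp=160, ssrc=0x1234ABCD)
print(header.hex())
print(parse_rtp_header(header))
```

Payload type 0 is the static assignment for 64 kbps mu-law PCM audio. The sequence number and timestamp are what allow the receiver to reorder packets and schedule playout, since UDP itself provides neither.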

Two standards compete for IP Telephony signaling. The older and currently more widely accepted standard is the ITU-T recommendation H.323, which defines a multimedia communications system over packet-switched networks, including IP networks. The other standard, the Session Initiation Protocol (SIP), comes from the IETF MMUSIC working group. In this paper, we compare these two standards for IP Telephony signaling in the following section.

 

Development challenges

The goal of VoIP developers is to add telephone calling capabilities to IP-based networks and to interconnect these with the traditional public telephone network and with private voice networks, while maintaining current voice quality standards and preserving the features everyone expects from the telephone. We can summarize the technical challenges as follows.

Voice quality should be comparable to what is available using the PSTN, even over networks having variable levels of QoS.

The underlying IP network must meet strict performance criteria including minimizing call refusals, network latency, packet loss, and disconnections. This is required even during congestion conditions or when multiple users must share network resources.

Call control (signaling) must make the telephone calling process transparent so that the callers need not know what technology is actually implementing the service.

PSTN/VoIP service compatibility (and equipment interoperability) involves gateways between the voice and data network environments.

System management, security, addressing and accounting must be provided, preferably consolidated with the PSTN operation support systems.

 

Standard and interoperability issues

While standards for VoIP technology are emerging, they are still in flux. Even VoIP implementations that are standards-compliant may not necessarily interoperate with the standards-compliant products of other vendors. The ITU-T H.323 standard, for example, does not encompass all aspects of VoIP communications, and each vendor of VoIP technology can have their own variations of the overall VoIP network architecture and algorithms. Variations among VoIP products include the algorithms and implementations used to support dynamic bandwidth allocation, packet loss recovery, adaptive echo cancellation, and speech processing to deliver voice quality as high as possible.

 

Voice Quality issues

The voice quality should be comparable to what is available using the PSTN, even over networks of varying levels of QoS. The following factors determine VoIP quality:

Packet loss

In order to operate a multi-service packet based network at a commercially viable load level, random packet loss is inevitable. This is particularly true with communications over the Internet where traffic profiles are highly unpredictable and the competitive nature of the business drives corporations to load their networks to the maximum.

Packetizing voice codecs are becoming better at reducing sensitivity to packet loss. The main approaches are smaller packet sizes, interpolation (algorithmic regeneration of lost sound), and a technique where a low-bit-rate sample of each voice packet is appended to the subsequent packet.  Through these techniques, and at some cost of bandwidth efficiency, good sound quality can be maintained even in relatively high packet loss scenarios.
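The third technique, piggybacking a low-bit-rate copy of each packet onto the next one, can be sketched as follows. Everything here is an illustrative assumption: the packet layout is hypothetical, and keeping every other byte is only a crude stand-in for a real low-bit-rate encoding.

```python
def low_rate_copy(frame):
    """Crude stand-in for a low-bit-rate encoding: keep every other byte."""
    return frame[::2]


def expand(copy):
    """Stretch a low-rate copy back to full length by repeating each byte."""
    return bytes(b for sample in copy for b in (sample, sample))


def send(frames):
    """Build packets that carry the current frame plus a reduced copy of the previous one."""
    packets, previous = [], b""
    for seq, frame in enumerate(frames):
        packets.append({"seq": seq, "primary": frame, "redundant": low_rate_copy(previous)})
        previous = frame
    return packets


def receive(packets, total):
    """Rebuild the frame sequence, recovering a lost frame from its successor if possible."""
    by_seq = {p["seq"]: p for p in packets}
    output = []
    for seq in range(total):
        if seq in by_seq:
            output.append(by_seq[seq]["primary"])
        elif seq + 1 in by_seq:
            output.append(expand(by_seq[seq + 1]["redundant"]))
        else:
            output.append(None)  # unrecoverable: left to interpolation
    return output


frames = [bytes([value] * 4) for value in (10, 20, 30, 40)]
packets = send(frames)
arrived = [p for p in packets if p["seq"] != 1]   # packet 1 is lost
recovered = receive(arrived, total=len(frames))
print(recovered)
```

The cost in bandwidth efficiency is visible in the packet layout: every packet grows by the size of the redundant copy. The scheme also covers only isolated losses; if two consecutive packets are lost, the first of them cannot be recovered this way.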

As techniques for reducing sensitivity to packet loss improve, they open an opportunity for even greater efficiency: the encoder can suppress the transmission of voice packets whose loss it determines would fall below a threshold of tolerability at the decoder. This is particularly attractive in packet-based networking, where statistical multiplexing favors the reuse of freed-up bandwidth.

 Delay

Two problems that result from high end-to-end delay in a voice network are echo and talker overlap. Echo becomes a problem when the round-trip delay is more than 50 milliseconds. Since echo is perceived as a significant quality problem, VoIP systems must address the need for echo control and implement some means of echo cancellation. Talker overlap (the problem of one caller stepping on the other talker’s speech) becomes significant if the one-way delay becomes greater than 250 milliseconds. The end-to-end delay budget is therefore the major constraint and driving requirement for reducing delay through a packet network.

Propagation delay (the time taken for the information wave-front to travel a given distance through a given medium), jitter buffering, packetization, analog-to-digital encoding, and digital-to-analog decoding delays are responsible for most of the overall delay. Service and wait time through the switching and transmission elements of the network may be considered trivial given the small packet sizes and relatively wide bandwidths prevalent on the Internet. It is generally true that when considering the achievable quality of a given service, the overall geographic distance traveled by a call is far more important than the complexity of its routing (i.e. the number of intermediary nodes or "hop-count").
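A delay budget of this kind can be checked with simple arithmetic. The Python sketch below sums the contributors named above and compares the result with the 50 ms round-trip and 250 ms one-way thresholds quoted earlier in this section. The component values and the signal speed of 200,000 km/s (roughly two thirds of the speed of light, a common approximation for fiber) are assumptions chosen for illustration.

```python
TALKER_OVERLAP_LIMIT_MS = 250  # one-way threshold quoted in the text
ECHO_LIMIT_MS = 50             # round-trip threshold quoted in the text


def one_way_delay_ms(distance_km, coding_ms, packetization_ms, jitter_buffer_ms,
                     speed_km_per_s=200_000):
    """Sum the main delay contributors for one direction of a call.

    speed_km_per_s is an assumed propagation speed, not a measured value.
    """
    propagation_ms = distance_km * 1000 / speed_km_per_s
    return propagation_ms + coding_ms + packetization_ms + jitter_buffer_ms


# Assumed example: a 10,000 km call with 10 ms of encoding/decoding,
# 20 ms packets, and a 60 ms jitter buffer.
delay = one_way_delay_ms(distance_km=10_000, coding_ms=10,
                         packetization_ms=20, jitter_buffer_ms=60)
needs_echo_control = 2 * delay > ECHO_LIMIT_MS
talker_overlap_risk = delay > TALKER_OVERLAP_LIMIT_MS

print(f"one-way delay: {delay:.0f} ms")
print("echo control needed:", needs_echo_control)
print("talker overlap likely:", talker_overlap_risk)
```

In this example the propagation term alone is 50 ms, which is why distance matters more than hop count. The call stays within the talker-overlap limit but clearly requires echo cancellation.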

 Jitter

Jitter is the variation in inter-packet arrival time introduced by the variable transmission delay over the network. Removing jitter requires collecting packets in a jitter buffer and holding them long enough for the slowest packets to arrive in time to be played in the correct sequence. The buffer thus trades additional end-to-end delay for the removal of the delay variation that each packet experiences as it transits the packet network.
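The trade-off between buffer depth and late packets can be sketched with a fixed playout delay. The sketch below is a simplification under stated assumptions: it uses a hypothetical tuple layout, treats sender and receiver clocks as synchronized, and uses a fixed buffer, whereas practical receivers estimate timing from packet timestamps and often adapt the buffer size.

```python
def schedule_playout(packets, buffer_ms):
    """Decide which packets can be played with a fixed playout delay.

    packets: list of (sequence, sent_ms, arrived_ms) tuples.
    Returns (played, late), where played holds (sequence, playout_ms) pairs.
    """
    played, late = [], []
    for seq, sent_ms, arrived_ms in sorted(packets):
        playout_ms = sent_ms + buffer_ms   # each packet is played a fixed time after sending
        if arrived_ms <= playout_ms:
            played.append((seq, playout_ms))
        else:
            late.append(seq)               # arrived after its playout slot: discarded
    return played, late


# Assumed example: packets sent every 20 ms with network delays of 30, 45, 90, and 35 ms.
packets = [(0, 0, 30), (1, 20, 65), (2, 40, 130), (3, 60, 95)]

for buffer_ms in (60, 100):
    played, late = schedule_playout(packets, buffer_ms)
    print(f"buffer {buffer_ms} ms: played {[seq for seq, _ in played]}, late {late}")
```

With a 60 ms buffer the slowest packet misses its slot and is treated as lost. A 100 ms buffer plays every packet, at the price of 40 ms more end-to-end delay.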

 Overhead

Each packet carries a header of variable size that contains identification and routing information. This information, necessary for the handling of each packet, constitutes ‘overhead’ not present with circuit switching techniques. Small packet size is important with real-time transmissions, since packet size contributes directly to delay, and the smaller the packet size, the less sensitive a given transmission is to packet loss. Various new techniques such as header compression are evolving to reduce the packet overhead in IP networks. It is likely that packet-based networks, of one form or another, will eventually approach the efficiency, with respect to overhead, of circuit-based networks.
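The tension between small packets and header overhead is easy to quantify. The following Python sketch computes the share of each packet taken up by headers for several packet intervals. The 8 kbps codec rate, the 40-byte uncompressed IPv4/UDP/RTP header, and the 4-byte compressed header are illustrative assumptions.

```python
def overhead_fraction(codec_bps, interval_ms, header_bytes):
    """Fraction of each packet's bytes that is header rather than voice payload."""
    payload_bytes = codec_bps * interval_ms / 8000  # bits per interval, converted to bytes
    return header_bytes / (header_bytes + payload_bytes)


CODEC_BPS = 8_000        # assumed low-bit-rate codec
FULL_HEADER = 40         # assumed IPv4 (20) + UDP (8) + RTP (12) bytes
COMPRESSED_HEADER = 4    # assumed size after header compression

for interval_ms in (10, 20, 40):
    full = overhead_fraction(CODEC_BPS, interval_ms, FULL_HEADER)
    compressed = overhead_fraction(CODEC_BPS, interval_ms, COMPRESSED_HEADER)
    print(f"{interval_ms:2d} ms packets: {full:.0%} overhead uncompressed, "
          f"{compressed:.0%} with header compression")
```

Shorter packets lower the packetization delay but spend a larger share of the bandwidth on headers, which is why header compression matters most for low-bit-rate codecs.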

 

User friendly design

The user need not know what technology is being used for the call, and should be able to use the telephone exactly as today.

Easy configuration

An easy-to-use management interface is needed to configure the equipment. It must handle a variety of parameters and options, such as telephony protocols, compression algorithm selection, dialing plans, access controls, PSTN fallback features, and port arrangement.

Addressing / Directories

Telephone numbers and IP addresses need to be managed in a way that is transparent to the user. PCs that are used for voice calls may need telephone numbers. IP-enabled telephones need IP addresses, or access to one via DHCP, and Internet directory services will need to be extended to include mappings between the two types of addresses.

 

Security issues

VoIP networks introduce some new risks to carriers and their customers, risks that are not yet fully appreciated. Responding to these threats requires some specific techniques, comprehensive, multi-layer security policies, and firewalls that can handle the special latency and performance requirements of VoIP.

It is important to remember that a VoIP network is an IP network. Any VoIP device is an IP device, and it's therefore vulnerable to the same types of attacks as any other IP device. In addition, a VoIP network will almost always have non-VoIP devices attached to it and be connected to other mission-critical networks.

Every IP network, regardless of how private it is, eventually winds up connected to the global Internet. Even if it is not possible to directly route a packet from the "private" network onto the Internet, it is extremely likely that some host on the "private" network will also be connected to a less private network. Compromising this host provides an attacker with a gateway into the presumed secure private network. It's important, therefore, to secure all IP networks, but VoIP networks have special security requirements. Specific techniques, comprehensive policies, and VoIP-capable firewalls are needed to do the job right.

 

Billing issues

VoIP gateways must keep track of successful and unsuccessful calls, and call detail records should be produced. The major issue, however, is selecting a suitable billing model. The following billing models can be applied.

Time-based: Metered by flow duration, time-of-day, time-of-week.

Destination, distance, carrier-based: Rated by called and calling station IDs associated with the sequence of stages used to support the call.

QoS-based: rated by established service parameters such as priority, selected QoS, and latency.
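The three models can be compared by rating the same call detail record under each of them. The sketch below is purely illustrative: the record fields, the rate tables, and all prices (in cents per started minute) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical call detail record produced by a gateway."""
    duration_s: int
    destination_prefix: str
    qos_class: str
    peak_hours: bool


# All rates are made-up values in cents per started minute.
DESTINATION_RATES = {"1": 3, "44": 8}
DEFAULT_DESTINATION_RATE = 15
QOS_RATES = {"best_effort": 1, "priority": 4}


def billed_minutes(call):
    """Round the call duration up to whole minutes."""
    return -(-call.duration_s // 60)


def rate_time_based(call):
    return billed_minutes(call) * (5 if call.peak_hours else 2)


def rate_destination_based(call):
    per_minute = DESTINATION_RATES.get(call.destination_prefix, DEFAULT_DESTINATION_RATE)
    return billed_minutes(call) * per_minute


def rate_qos_based(call):
    return billed_minutes(call) * QOS_RATES[call.qos_class]


call = CallRecord(duration_s=125, destination_prefix="44", qos_class="priority", peak_hours=True)
print("time-based:       ", rate_time_based(call), "cents")
print("destination-based:", rate_destination_based(call), "cents")
print("QoS-based:        ", rate_qos_based(call), "cents")
```

A real billing system could also combine the models, for example by applying a QoS surcharge on top of a destination-based rate.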

 
