Voice over Internet Protocol (VoIP)

What is VoIP? 


VoIP stands for 'V'oice 'o'ver 'I'nternet 'P'rotocol. As the term says, VoIP tries to carry voice (mainly human) in IP packets and, ultimately, across the Internet. VoIP can use accelerating hardware to achieve this purpose and can also be used in a PC environment.

How does it work?

Many years ago we discovered that sending a signal to a remote destination could also be done in a digital fashion: before sending it we have to digitize it with an ADC (analog to digital converter), transmit it, and at the end transform it back into analog format with a DAC (digital to analog converter) to use it.

VoIP works like that: it digitizes voice into data packets, sends them, and reconverts them into voice at the destination.

Digital format can be better controlled: we can compress it, route it, convert it to a new, better format, and so on; we also saw that a digital signal is more noise tolerant than an analog one (see GSM vs TACS).

TCP/IP networks are made of IP packets containing a header (to control communication) and a payload to transport data: VoIP uses them to go across the network and reach the destination.

Voice (source)  - - ADC - - - - Internet - - - DAC  - - Voice (dest)

What are the advantages of using VoIP rather than PSTN?

When you are using a PSTN line, you typically pay the PSTN line operator for the time used: the more time you stay on the phone, the more you'll pay. In addition, you cannot talk with more than one person at a time.

By contrast, with VoIP you can talk all the time with every person you want (the only requirement is that the other person is also connected to the Internet at the same time), for as long as you want (independent of cost) and, in addition, you can talk with many people at the same time.

If you're still not persuaded, consider that, at the same time, you can exchange data with the people you are talking with, sending images, graphs and videos.

Then why doesn't everybody use it yet?

Unfortunately, we have to report some problems with the integration between the VoIP architecture and the Internet. As you can easily imagine, voice communication must be a real-time stream (you couldn't speak, wait for many seconds, then hear the other side answering): this is in contrast with the heterogeneous architecture of the Internet, which can be made of many routers (machines that route packets), about 20-30 or more, and can have a very high round trip time (RTT), so we need to modify something to get it working properly.

In the next sections we'll try to understand how to solve this great problem. In general, we know that it is very difficult to guarantee bandwidth on the Internet for a VoIP application.

Technical info about VoIP

Here we see some important info about VoIP, needed to understand it.


Overview on a VoIP connection 

To set up a VoIP communication we need:

1.First, the ADC to convert analog voice to digital signals (bits)
2.Now the bits have to be compressed in a good format for transmission: there are a number of protocols we'll see later.
3.Here we have to insert our voice packets into data packets using a real-time protocol (typically RTP over UDP over IP)
4.We need a signaling protocol to call users: ITU-T H323 does that.
5.At RX we have to disassemble packets, extract the data, then convert it to analog voice signals and send them to the sound card (or phone)
6.All that must be done in a real-time fashion because we cannot wait too long for a vocal answer!

Base architecture

Voice )) ADC - Compression Algorithm -  Assembling RTP in TCP/IP -






Voice (( DAC - Decompress. Algorithm -  Disass. RTP from TCP/IP  -


Analog to Digital Conversion 

This is done by hardware, typically by an ADC integrated on the sound card.

Today every sound card allows you to convert, with 16 bits, a band of 22050 Hz (for sampling it you need a frequency of 44100 Hz, by the Nyquist principle), obtaining a throughput of 2 bytes * 44100 (samples per second) = 88200 bytes/s, or 176.4 kbytes/s for a stereo stream.

For VoIP we don't need a 22 kHz bandwidth (and we don't need 16 bits either!): next we'll see other codings used for it.
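The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the function names are made up for illustration.

```python
def nyquist_rate(bandwidth_hz):
    """Minimum sampling frequency needed to capture a given bandwidth."""
    return 2 * bandwidth_hz


def pcm_throughput(sample_rate_hz, bytes_per_sample, channels=1):
    """Raw (uncompressed) data rate in bytes per second."""
    return sample_rate_hz * bytes_per_sample * channels


rate = nyquist_rate(22050)            # sampling frequency for a 22050 Hz band
mono = pcm_throughput(rate, 2)        # 16 bit = 2 bytes per sample
stereo = pcm_throughput(rate, 2, 2)   # two channels

print(rate, mono, stereo)             # prints: 44100 88200 176400
```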

Compression Algorithms 

Now that we have digital data we may convert it to a standard format that can be quickly transmitted.

PCM, Pulse Code Modulation, Standard ITU-T G.711

·      Voice bandwidth is 4 kHz, so the sampling frequency has to be 8 kHz (by the Nyquist principle).
·      We represent each sample with 8 bits (having 256 possible values).
·      Throughput is 8000 Hz * 8 bits = 64 kbit/s, as on a typical digital phone line.
·      In real applications the mu-law (North America) and A-law (Europe) variants are used, which encode the analog signal on a logarithmic scale, so that each 8-bit code covers the range of a 14-bit (mu-law) or 13-bit (A-law) linear sample (see Standard ITU-T G.711).
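To get a feeling for the logarithmic scale, here is a minimal sketch of mu-law companding. It uses the continuous mu-law formula with mu = 255 followed by a plain 8-bit quantizer; this is an assumption made for illustration, not the exact segmented encoding tables of G.711, and the function names are made up.

```python
import math

MU = 255  # companding parameter of the mu-law variant


def mulaw_compress(x):
    """Map a linear sample in [-1.0, 1.0] onto a logarithmic scale in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode_8bit(x):
    """Compress a sample, then quantize it to one of 256 codes (0..255)."""
    return round((mulaw_compress(x) + 1) / 2 * 255)


def decode_8bit(code):
    """Turn an 8-bit code back into an approximate linear sample."""
    return mulaw_expand(code / 255 * 2 - 1)


quiet = 0.01                                  # a quiet sample, 1% of full scale
print(encode_8bit(quiet))                     # the 8-bit code sent on the wire
print(decode_8bit(encode_8bit(quiet)))        # close to 0.01 after decoding
```

The point of the logarithmic scale is that quiet samples, which are the most common in speech, get many more of the 256 codes than they would on a linear scale.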

ADPCM, Adaptive Differential PCM, Standard ITU-T G.726

It encodes only the difference between the current and the previous voice sample, requiring 32 kbps (see Standard ITU-T G.726).
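The idea of sending only differences can be sketched as follows. Note that this is plain differential coding without the adaptive step size and prediction that G.726 really uses, so take it as an illustration of the principle only; the function names are made up. At 32 kbps, G.726 spends 4 bits per sample (8000 samples/s * 4 bits) instead of the 8 bits of G.711.

```python
def dpcm_encode(samples):
    """Replace each sample with its difference from the previous one."""
    previous = 0
    diffs = []
    for sample in samples:
        diffs.append(sample - previous)
        previous = sample
    return diffs


def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for diff in diffs:
        previous += diff
        samples.append(previous)
    return samples


voice = [100, 102, 105, 104, 101]
encoded = dpcm_encode(voice)
print(encoded)                 # prints: [100, 2, 3, -1, -3]
print(dpcm_decode(encoded))    # prints: [100, 102, 105, 104, 101]
```

After the first value the differences are small numbers, which is why they fit in fewer bits than the samples themselves.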

LD-CELP, Standard ITU-T G.728

CS-ACELP, Standard ITU-T G.729 and G.729a

MP-MLQ, Standard ITU-T G.723.1, 6.3kbps, True speech

ACELP, Standard ITU-T G.723.1, 5.3kbps, True speech

LPC-10, able to reach 2.5 kbps!!



These last protocols are the most important because they can guarantee a very low minimum bandwidth using source coding; also, G.723.1 codecs have a very high MOS (Mean Opinion Score, used to measure voice fidelity), but pay attention to the processing performance they require, up to 26 MIPS!

RTP Real Time Transport Protocol

Now we have the raw data and we want to encapsulate it into the TCP/IP stack. We follow this structure:

      VoIP data packets
             RTP
             UDP
             IP
        I, II layers

VoIP data packets live in RTP (Real-Time Transport Protocol) packets, which are inside UDP-IP packets.

First, VoIP doesn't use TCP because it is too heavy for real-time applications, so UDP (datagram) is used instead.

With UDP we cannot put packets back in order (which is a must in VoIP) because there is no notion of a connection: each packet is independent from the others (datagram concept); so we have to introduce a new protocol, such as RTP, able to manage this.
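As a rough sketch of what "managing this" means, the receiver can use a per-packet sequence number (RTP carries one) to hand packets to the decoder in the right order. The class below is a hypothetical, simplified buffer: it ignores lost packets and the wrap-around of the 16-bit sequence number, both of which a real receiver has to handle.

```python
import heapq


class ReorderBuffer:
    """Release packets in sequence-number order, however they arrive."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq   # sequence number we are waiting for
        self.heap = []              # packets that arrived too early

    def push(self, seq, payload):
        """Store one packet and return the payloads that are now in order."""
        if seq >= self.next_seq:    # older packets are late duplicates: ignore
            heapq.heappush(self.heap, (seq, payload))
        ready = []
        while self.heap and self.heap[0][0] <= self.next_seq:
            s, p = heapq.heappop(self.heap)
            if s == self.next_seq:
                ready.append(p)
                self.next_seq += 1
            # otherwise it is a duplicate of a delivered packet: drop it
        return ready


buf = ReorderBuffer()
print(buf.push(1, b"second"))   # prints: []  (still waiting for packet 0)
print(buf.push(0, b"first"))    # prints: [b'first', b'second']
```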

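The following figure gives the structure of the RTP header used in VoIP.

Real Time Transport Protocol

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

·      V indicates the version of RTP used
·      P indicates padding: extra unused bytes at the end of the packet, added to reach the required packet size
·      X indicates the presence of a header extension
·      CC is the number of CSRC identifiers following the fixed header. CSRC fields are used, for example, in conferences.
·      M is a marker bit
·      PT is the payload type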
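The header layout above can be built and parsed with Python's standard struct module. This is a minimal sketch: the function names are made up, and header extensions and padding contents are not handled.

```python
import struct


def build_rtp_header(seq, timestamp, ssrc, payload_type, marker=0,
                     padding=0, extension=0, csrcs=()):
    """Pack the 12-byte fixed RTP header, followed by any CSRC identifiers."""
    version = 2
    byte0 = (version << 6) | (padding << 5) | (extension << 4) | len(csrcs)
    byte1 = (marker << 7) | payload_type
    header = struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)
    for csrc in csrcs:
        header += struct.pack("!I", csrc)
    return header


def parse_rtp_header(data):
    """Unpack the fields shown in the figure from the start of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    cc = byte0 & 0x0F
    csrcs = struct.unpack("!%dI" % cc, data[12:12 + 4 * cc])
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 1,
        "extension": (byte0 >> 4) & 1,
        "cc": cc,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": list(csrcs),
    }


# Example values are arbitrary; payload type 0 is G.711 mu-law in RFC 3551.
header = build_rtp_header(seq=1, timestamp=160, ssrc=0x1234, payload_type=0)
print(len(header))                          # prints: 12
print(parse_rtp_header(header)["version"])  # prints: 2
```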


There are also other protocols used in VoIP, like RSVP, that can manage Quality of Service (QoS).

RSVP is a signaling protocol that requests a certain amount of bandwidth and latency in every network hop that supports it.


Quality of Service (QoS) 


We said many times that VoIP applications require real-time data streaming because we expect an interactive voice exchange.

Unfortunately, TCP/IP cannot guarantee this: it just makes a 'best effort' to do it. So we need to introduce tricks and policies that can manage the packet flow in EVERY router we cross.

Here they are:

1.TOS field in the IP header to describe the type of service: its precedence bits tell routers how urgent a packet is (in the standard IP precedence encoding, higher values mean higher priority), so real-time traffic can be marked as more urgent
2.Packet queuing methods:
   1.FIFO (First In First Out), the simplest method, which passes packets in arrival order.
   2.WFQ (Weighted Fair Queuing), which passes packets fairly (for example, FTP cannot consume all available bandwidth), depending on the kind of data flow, typically one packet for UDP and one for TCP in a fair fashion.
   3.CQ (Custom Queuing), where users can decide the priority.
   4.PQ (Priority Queuing), where there is a number (typically 4) of queues, each with a priority level: first, packets in the first queue are sent, then (when the first queue is empty) sending starts from the second one, and so on.
   5.CB-WFQ (Class Based Weighted Fair Queuing), like WFQ but, in addition, we have the concept of classes (up to 64) and a bandwidth value associated with each one.
3.Shaping capability, which limits the source to a fixed bandwidth
4.Congestion Avoidance, like RED (Random Early Detection).
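To make the queuing methods more concrete, here is a small sketch of the PQ (Priority Queuing) discipline described above: a lower queue is served only when all the queues before it are empty. The class and the packet labels are made up for illustration; a real router does this in its forwarding path, not in Python.

```python
from collections import deque


class PriorityQueuing:
    """Strict priority scheduler: queue 0 is always served first."""

    def __init__(self, levels=4):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, level):
        """Put a packet in the queue for its priority level (0 = most urgent)."""
        self.queues[level].append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if every queue is empty."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


router = PriorityQueuing()
router.enqueue("ftp-1", level=3)
router.enqueue("voice-1", level=0)
router.enqueue("ftp-2", level=3)
router.enqueue("voice-2", level=0)

print([router.dequeue() for _ in range(4)])
# prints: ['voice-1', 'voice-2', 'ftp-1', 'ftp-2']
```

The voice packets leave first even though an FTP packet arrived earlier, which is exactly what a FIFO queue cannot do.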

H323 Signaling Protocol

H323 protocol is used, for example, by Microsoft Netmeeting to make VoIP calls.

This protocol allows a variety of elements to talk to each other:

1.Terminals, clients that initialize a VoIP connection. Although terminals could talk together without anyone else, we need some additional elements for a scalable vision.
2.Gatekeepers, that essentially provide:
   1.Address translation service, to use names instead of IP addresses
   2.Admission control, to allow or deny some hosts or some users
   3.Bandwidth management
3.Gateways, points of reference for conversion TCP/IP - PSTN.
4.Multipoint Control Units (MCUs) to provide conferences.
5.Proxy servers are also used.

H323 allows not only VoIP but also video and data communications.

Concerning VoIP, H323 can carry the audio codecs G.711, G.722, G.723, G.728 and G.729, while for video it supports H261 and H263.

You can find it implemented in various application software like Microsoft Netmeeting, Net2Phone, DialPad, and also in freeware products you can find at the OpenH323 web site.



Hardware requirement

To create a little VoIP system you need the following hardware:

1.PC 386 or more
2.Sound card, full duplex capable
3.A network card, a connection to the Internet, or another kind of interface to allow communication between 2 PCs

All that has to be present twice to simulate a standard communication.

The tools above are the minimal requirement for a VoIP connection: next we'll see that we should (and on the Internet we must) use more hardware to do the same in a real situation.

The sound card has to be full duplex, otherwise we couldn't hear anything while speaking.


Hardware accelerating cards 

We can use special cards with hardware accelerating capability. Two of them (and also the only ones directly managed by the Linux kernel at this moment) are the

1.Quicknet PhoneJack
2.Quicknet LineJack

Quicknet PhoneJack is a sound card that can use standard algorithms, like G723.1, to compress the audio stream.

It can be connected directly to a phone (POTS port) or to a mic-speaker pair.

It has an ISA or PCI bus connector.

Quicknet LineJack works like PhoneJack with some additional features (see below).

For more info see the Quicknet web site.

Hardware gateway cards 

Quicknet LineJack can be connected to a PSTN line, allowing it to act as a VoIP gateway.

Then you'll need software to manage it (see after).

Software requirement

We can choose which OS to use:

Under Win9x we have Microsoft Netmeeting, Internet Phone, DialPad or others, or Internet Switchboard (from the Quicknet web site) for Quicknet cards.

You can also use free software that you download from OpenH323.

Under Linux we only have free software from the OpenH323 web site: simph323 or ohphone, which can also work with Quicknet accelerating hardware.

Attention: all OpenH323 source code has to be compiled in a user directory (if not, it is necessary to change some environment variables). Be warned that compile time could be very long and you could need a lot of RAM to do it in a decent time.

Gateway software 

To manage the gateway feature (joining TCP/IP VoIP to PSTN lines) you need some kind of software like this:

·      Internet SwitchBoard for Windows systems, also acting as an H323 terminal;
·      PSTNGw for Linux and Windows systems, which you download from OpenH323.

Gatekeeper software 

You can choose as gatekeeper:

1.Opengatekeeper, which you can download from the opengatekeeper web site for Linux and Win9x.

Other software 

In addition I report some useful H323-compliant software:

·      Phonepatch, able to solve problems behind a NAT firewall. It simply allows users (external or internal) to call from a web page (which is reachable by both external and internal users): when the web application understands the remote host is ready, it calls (via H323) the source, telling it all is ok and communication can be established.


In this section we try to set up a VoIP system, simple at first, then more and more complex.

Simple communication: IP to IP

A (Win9x+Sound card)   -  -  -    B (Win9x+Sound card)       -  -  -


A and B should:

1.have Microsoft Netmeeting (or other software) installed and properly configured
2.have a network card or other kind of TCP/IP interface to talk to each other.

In this kind of view A can make an H323 call to B (if B has Netmeeting active) using B's IP address. Then B can answer it if it wants. After the call is accepted, VoIP data packets start to pass.

Using names 

If you use Microsoft Windows in a LAN you can call the other side using its NetBIOS name. NetBIOS is a protocol that can work over (stand over) the NetBEUI low-level protocol and also over TCP/IP. You only need to call the 'computer name' of the other side to make a connection.

          A            -  -  -             B       -  -  -


        John           -  -  -           Alice

                    John calls Alice.


This is possible because John's call request to Alice is converted to an IP call by the NetBIOS protocol.

The above 2 examples are very easy to implement but aren't scalable.

In a bigger view such as the Internet, it is impossible to use direct calling because, usually, the callers don't know the destination IP address. Furthermore, the NetBIOS naming feature cannot work because it uses broadcast messages, which typically don't pass ISP routers.

Internet calling using a WINS server

The NetBIOS name calling idea can also be implemented in an Internet environment, using a WINS server: NetBIOS clients can be configured to use a WINS server to resolve names.

PCs using the same WINS server will be able to make direct calls between them.

A (WINS Server is S) - - - - I  - - - -  B (WINS Server is S)
                             N
                             T
                             E  - - - - -   S (WINS Server)
C (WINS Server is S) - - - - R
                             N
                             E  - - - -  D (WINS Server is S)
                             T

                   Internet communication

A, B, C and D are in different subnets, but they can call each other in a NetBIOS name calling fashion. The only requirement is that all of them are using S as WINS server.

Note: a WINS server doesn't have very high performance because it uses NetBIOS features, and it should only be used for joining a few subnets.

A big problem: masquerading.

The problem of having few IP addresses is commonly solved using so-called masquerading (also NAT, network address translation): there is only 1 public IP address (that the Internet can directly 'see'), and the other machines are 'masqueraded', all using this IP.

           A  - - -

           B  - - -   Router with NAT  - - -  Internet

           C  - - -

                       This doesn't work

In the example A, B and C can browse, ping, and use mail and news services with Internet people, but they CANNOT make a VoIP call. This is because the H323 protocol sends the IP address at application level, so the answer will never arrive at the source (which is using a private IP address).
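The failure can be sketched in a few lines. The packet below is a toy dictionary, not a real H323 message (which is binary), and the field name 'reply_to' and all the addresses are made up for illustration; 203.0.113.5 and 198.51.100.7 are documentation addresses standing in for public ones.

```python
import ipaddress

# Private ranges that cannot be reached from the public Internet
PRIVATE_NETS = [ipaddress.ip_network(n)
                for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_private(address):
    """True if the address belongs to one of the private ranges above."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in PRIVATE_NETS)


def nat_rewrite(packet, public_ip):
    """A plain NAT router rewrites only the IP header, not the payload."""
    rewritten = dict(packet)
    rewritten["src"] = public_ip
    return rewritten


def callee_can_reply(packet):
    """The callee answers to the address found at application level."""
    return not is_private(packet["payload"]["reply_to"])


call = {"src": "192.168.1.10", "dst": "198.51.100.7",
        "payload": {"reply_to": "192.168.1.10"}}
seen_by_callee = nat_rewrite(call, public_ip="203.0.113.5")

print(seen_by_callee["src"])             # prints: 203.0.113.5
print(callee_can_reply(seen_by_callee))  # prints: False
```

The IP header looks fine after the router, but the address carried inside the payload is still the private one, so the answer has nowhere to go.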




·      there is a Linux module that modifies H323 packets, avoiding this problem. You can download the module here. To install it you have to copy it to the source directory specified, modify the Makefile, then compile and install the module with 'modprobe ip_masq_h323'. Unfortunately this module cannot work with ohphone software at this moment (I don't know why).

           A  - - -   Router with NAT 

           B  - - -         +           - - -  Internet

           C  - - -  ip_masq_h323 module

                         This works

           A  - - -   

           B  - - -    PhonePatch   - - -  Internet

           C  - - -  

                         This works

Using Linux

With Linux (as an H323 terminal) you can experiment with everything done before.


Ohphone Syntax

Syntax is:

'ohphone -l|--listen [options]'

'ohphone [options]... address'

·      '-l', listens on the standard port (1720)
·      'address', means that we don't wait for a call, but we connect to the 'address' host
·      '-n', '--no-gatekeeper', this is ok if we don't have a gatekeeper
·      '-q num', '--quicknet num', it uses the Quicknet card with device number 'num'
·      '-s device', '--sound device', it uses the /dev/device sound device.
·      '-j delay', '--jitter delay', it changes the delay buffer to 'delay'.

Also, when you start ohphone, you can give commands to the interpreter directly (like decrease AEC, Automatic Echo Cancellation).
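As a usage sketch, the helper below assembles ohphone command lines from the options listed above. It only builds the argument list and does not run anything; the helper name, the host address and the jitter value are hypothetical examples.

```python
def ohphone_command(address=None, listen=False, no_gatekeeper=False,
                    quicknet=None, sound_device=None, jitter=None):
    """Build an ohphone argument list using only the options described above."""
    args = ["ohphone"]
    if listen:
        args.append("-l")
    if no_gatekeeper:
        args.append("-n")
    if quicknet is not None:
        args += ["-q", str(quicknet)]
    if sound_device is not None:
        args += ["-s", sound_device]
    if jitter is not None:
        args += ["-j", str(jitter)]
    if address is not None:
        args.append(address)        # the address goes after the options
    return args


# Wait for incoming calls, without a gatekeeper
print(" ".join(ohphone_command(listen=True, no_gatekeeper=True)))
# prints: ohphone -l -n

# Call a host directly (hypothetical address and jitter value)
print(" ".join(ohphone_command("192.168.1.2", no_gatekeeper=True, jitter=100)))
# prints: ohphone -n -j 100 192.168.1.2
```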

Setting up a gatekeeper

You can also experiment with the gatekeeper feature



        (Terminal H323) A  - - -    


        (Terminal H323) B  - -  - D (Gatekeeper)


        (Terminal H323) C  - - -  

                   Gatekeeper configuration

1.Hosts A, B and C have their gatekeeper setting pointing to D.
2.At start time each host tells D its own address and its own name (also with aliases), which could be used by a caller to reach it.
3.When a terminal asks D for a host, D answers with the right IP address, so communication can be established.
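The three steps above amount to a name-to-address registry. Here is a minimal sketch of that idea; the class is made up for illustration, and a real gatekeeper speaks the H323 registration protocol and also handles admission control and bandwidth management, which are left out here.

```python
class Gatekeeper:
    """Toy registry that maps terminal names and aliases to IP addresses."""

    def __init__(self):
        self.registry = {}

    def register(self, address, name, aliases=()):
        """Step 2: a terminal announces its address, name and aliases."""
        for known_as in (name, *aliases):
            self.registry[known_as] = address

    def resolve(self, name):
        """Step 3: answer a caller with the address, or None if unknown."""
        return self.registry.get(name)


d = Gatekeeper()
d.register("192.168.1.1", "A", aliases=("john",))    # hypothetical addresses
d.register("192.168.1.2", "B", aliases=("alice",))

print(d.resolve("alice"))    # prints: 192.168.1.2
print(d.resolve("nobody"))   # prints: None
```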

We have to notice that the Gatekeeper is only able to resolve a name into an IP address; it cannot join hosts that aren't reachable from each other (at IP level). In other words, it cannot act as a NAT router.

The program only has to be launched with -d (as daemon) or -x (execute).

Setting up a gateway 

As we said, a gateway is an entity that can join VoIP to PSTN lines, allowing us to make calls from the Internet to a classic telephone. So, in addition, we need a card that can manage PSTN lines: Quicknet LineJack does it.

From the OpenH323 web site we download:

1.driver for LineJack
2.PSTNGW application to create our gateway.

If the executable doesn't work you need to download the source code and the openh323 library, then install it all in a home user directory.

After that you only need to launch PSTNGw to start your H323 gateway.

