SIST ES 202 737 V1.3.1:2009

Speech and multimedia Transmission Quality (STQ) - Transmission requirements for narrowband VoIP terminals (handset and headset) from a QoS perspective as perceived by the user

Name: SIST ES 202 737 V1.3.1:2009
Standard name: Speech and multimedia Transmission Quality (STQ) - Transmission requirements for narrowband VoIP terminals (handset and headset) from a QoS perspective as perceived by the user
Standard number: SIST ES 202 737 V1.3.1:2009
Language: English
Release date: 08-Oct-2009
Technical committee: SPN - Services and Protocols for Networks
Drafting committee:
ICS number: 33.050.01 - Telecommunication terminal equipment in general
ETSI ES 202 737 V1.3.1 (2009-09)
ETSI Standard


Speech and multimedia Transmission Quality (STQ);
Transmission requirements for narrowband
VoIP terminals (handset and headset) from a
QoS perspective as perceived by the user

Reference
RES/STQ-00154
Keywords
3,1 kHz, gateway, quality, telephony, terminal,
VoIP
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2009.
All rights reserved.

DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered
for the benefit of its Members.
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
LTE™ is a Trade Mark of ETSI currently being registered
for the benefit of its Members and of the 3GPP Organizational Partners.
GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 General considerations
4.1 Default Coding Algorithm
4.2 End-to-end considerations
4.3 Parameters to be investigated
4.3.1 Basic parameters
4.3.2 Further Parameters with respect to Speech Processing Devices
5 Test equipment
5.1 IP half channel measurement adaptor
5.2 Environmental conditions for tests
5.3 Accuracy of measurements and test signal generation
5.4 Network impairment simulation
6 Acoustic environment
7 Requirements and associated Measurement Methodologies
7.1 Test setup
7.1.1 Setup for handsets and headsets
7.1.2 Position and calibration of HATS
7.1.3 Test Signal Levels
7.1.4 Setup of background noise simulation
7.2 Coding independent parameters
7.2.1 Send Frequency response
7.2.2 Send Loudness Rating
7.2.3 D-Factor
7.2.4 Linearity Range for SLR
7.2.5 Send Distortion
7.2.6 Out-of-Band Signals in Send direction
7.2.7 Send Noise
7.2.8 Sidetone Masking Rating STMR (Mouth to ear)
7.2.9 Sidetone delay
7.2.10 Terminal Coupling Loss weighted (TCLw)
7.2.11 Stability Loss
7.2.12 Receive Frequency Response
7.2.13 Receive Loudness Rating
7.2.14 Receive Distortion
7.2.15 Out-of-band signals in receive direction
7.2.16 Minimum activation level and sensitivity in Receive direction
7.2.17 Receive Noise
7.2.18 Automatic Level Control in Receive
7.2.19 Double Talk Performance
7.2.19.1 Attenuation Range in Send Direction during Double Talk A_H,S,dt
7.2.19.2 Attenuation Range in Receive Direction during Double Talk A_H,R,dt
7.2.19.3 Detection of Echo Components during Double Talk
7.2.19.4 Minimum activation level and sensitivity of double talk detection
7.2.20 Switching characteristics
7.2.20.1 Activation in Send Direction
7.2.20.2 Silence Suppression and Comfort Noise Generation
7.2.21 Background Noise Performance
7.2.21.1 Performance in send direction in the presence of background noise
7.2.21.2 Speech Quality in the Presence of Background Noise
7.2.21.3 Quality of Background Noise Transmission (with Far End Speech)
7.2.21.4 Quality of Background Noise Transmission (with Near End Speech)
7.2.22 Quality of echo cancellation
7.2.22.1 Temporal echo effects
7.2.22.2 Spectral Echo Attenuation
7.2.22.3 Occurrence of Artefacts
7.2.23 Variant Impairments; Network dependant
7.2.23.1 Delay versus Time Send
7.2.23.2 Delay versus Time Receive
7.2.23.3 Quality of Jitter buffer adjustment
7.3 Codec Specific Requirements
7.3.1 Send Delay
7.3.2 Receive delay
7.3.3 Objective Listening Speech Quality MOS-LQO in Send direction
7.3.4 Objective Listening Quality MOS-LQO in Receive direction
7.3.4.1 Efficiency of Packet Loss Concealment (PLC)
7.3.4.2 Efficiency of Delay Variation Removal
Annex A (informative): Processing delays in VoIP terminals
Annex B (informative): Bibliography
History

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (http://webapp.etsi.org/IPR/home.asp).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This ETSI Standard (ES) has been produced by ETSI Technical Committee Speech and multimedia Transmission
Quality (STQ).
Introduction
Traditionally, analogue and digital telephones interfaced with circuit-switched 64 kbit/s PCM networks. With the
fast growth of IP networks, terminals directly interfacing packet-switched networks (VoIP) are being rapidly
introduced. Such IP network edge devices may include gateways, specifically designed IP phones, soft phones or other
devices connected to IP based networks and providing telephony service. Since IP networks will in many
cases interwork with the traditional PSTN and private networks, many of the basic transmission requirements have
to be harmonized with specifications for traditional digital terminals. However, due to the unique characteristics of
IP networks, such as packet loss and delay, new performance specifications, as well as appropriate measuring methods,
will have to be developed. Terminals are becoming increasingly complex, and advanced signal processing is used to address
IP specific issues. In addition, VoIP terminals may use speech coding algorithms other than 64 kbit/s PCM
(ITU-T Recommendation G.711 [8]).
The present document provides speech transmission performance requirements for narrowband VoIP handset and headset
terminals.
NOTE: Requirement limits are given in tables; the associated curve, when provided, is given for illustration.
1 Scope
The present document provides speech transmission performance requirements for 4 kHz narrowband VoIP handset and
headset terminals; it addresses all types of IP based terminals, including wireless and soft phones.
In contrast to other standards, which define minimum performance requirements, it is the intention of the present
document to specify terminal equipment requirements which enable manufacturers and service providers to achieve good
quality end-to-end speech performance as perceived by the user.
In addition to basic testing procedures, the present document describes advanced testing procedures taking into account
further quality parameters as perceived by the user.
It is the intention of the present document to describe terminal performance parameters in such a way that the remaining
variation of parameters can be assessed purely by the E-model.
2 References
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific.
• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following
cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the
purposes of the referring document;
- for informative references.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
2.1 Normative references
The following referenced documents are indispensable for the application of the present document. For dated
references, only the edition cited applies. For non-specific references, the latest edition of the referenced document
(including any amendments) applies.
[1] ETSI I-ETS 300 245-2: "Integrated Services Digital Network (ISDN); Technical characteristics of
telephony terminals; Part 2: PCM A-law handset telephony".
[2] ETSI EN 300 726: "Digital cellular telecommunications system (Phase 2+) (GSM); Enhanced Full
Rate (EFR) speech transcoding (GSM 06.60)".
[3] ETSI TS 126 171: "Digital cellular telecommunications system (Phase 2+); Universal Mobile
Telecommunications System (UMTS); AMR speech codec, wideband; General description
(3GPP TS 26.171 version 6.0.0 Release 6)".
[4] ITU-T Recommendation G.107: "The E-model, a computational model for use in transmission
planning".
[5] ITU-T Recommendation G.108: "Application of the E-model: A planning guide".
[6] ITU-T Recommendation G.109: "Definition of categories of speech transmission quality".
[7] ITU-T Recommendation G.122: "Influence of national systems on stability and talker echo in
international connections".
[8] ITU-T Recommendation G.711: "Pulse code modulation (PCM) of voice frequencies".
[9] ITU-T Recommendation G.723.1: "Dual rate speech coder for multimedia communications
transmitting at 5.3 and 6.3 kbit/s".
[10] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code
Modulation (ADPCM)".
[11] ITU-T Recommendation G.729: "Coding of speech at 8 kbit/s using conjugate-structure algebraic-
code-excited linear prediction (CS-ACELP)".
[12] ITU-T Recommendation G.729.1: "G.729 based Embedded Variable bit-rate coder: An 8-32 kbit/s
scalable wideband coder bitstream interoperable with G.729".
[13] ITU-T Recommendation G.1020: "Performance parameter definitions for quality of speech and
other voiceband applications utilizing IP networks".
[14] ITU-T Recommendation P.50: "Artificial voices".
[15] ITU-T Recommendation P.56: "Objective measurement of active speech level".
[16] ITU-T Recommendation P.57: "Artificial ears".
[17] ITU-T Recommendation P.58: "Head and torso simulator for telephonometry".
[18] ITU-T Recommendation P.64: "Determination of sensitivity/frequency characteristics of local
telephone systems".
[19] ITU-T Recommendation P.79: "Calculation of loudness ratings for telephone sets".
[20] ITU-T Recommendation P.340: "Transmission characteristics and speech quality parameters of
hands-free terminals".
[21] ITU-T Recommendation P.380: "Electro-acoustic measurements on headsets".
[22] ITU-T Recommendation P.501: "Test signals for use in telephonometry".
[23] ITU-T Recommendation P.502: "Objective test methods for speech communication systems using
complex test signals".
[24] ITU-T Recommendation P.581: "Use of head and torso simulator (HATS) for hands-free terminal
testing".
[25] ITU-T Recommendation P.862: "Perceptual evaluation of speech quality (PESQ): An objective
method for end-to-end speech quality assessment of narrow-band telephone networks and speech
codecs".
[26] IEC 61260: "Electroacoustics - Octave-band and fractional-octave-band filters".
[27] ISO 3 (1973): "Preferred numbers - Series of preferred numbers".
[28] ITU-T Recommendation P.800.1: "Mean Opinion Score (MOS) terminology".
[29] ETSI ES 202 739: "Speech and multimedia Transmission Quality (STQ); Transmission
requirements for VoIP terminals from a QoS perspective as perceived by the user".
2.2 Informative references
The following referenced documents are not essential to the use of the present document but they assist the user with
regard to a particular subject area. For non-specific references, the latest version of the referenced document (including
any amendments) applies.
[i.1] ETSI TR 102 648-1: "Speech Processing, Transmission and Quality Aspects (STQ); Test
Methodologies for ETSI Test Events and Results; Part 1: VoIP Speech Quality Testing".
[i.2] ETSI EG 201 377-1: "Speech Processing, Transmission and Quality Aspects (STQ); Specification
and measurement of speech transmission quality; Part 1: Introduction to objective comparison
measurement methods for one-way speech quality across networks".
[i.3] ETSI EG 202 396-1: "Speech Processing, Transmission and Quality Aspects (STQ); Speech
quality performance in the presence of background noise; Part 1: Background noise simulation
technique and background noise database".
[i.4] ETSI EG 202 425: "Speech Processing, Transmission and Quality Aspects (STQ); Definition and
implementation of VoIP reference point".
[i.5] ETSI EG 202 396-3: "Speech Processing, Transmission and Quality Aspects (STQ); Speech
Quality performance in the presence of background noise Part 3: Background noise transmission -
Objective test methods".
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
artificial ear: device for the calibration of earphones incorporating an acoustic coupler and a calibrated microphone for
the measurement of the sound pressure and having an overall acoustic impedance similar to that of the median adult
human ear over a given frequency band
codec: combination of an analogue-to-digital encoder and a digital-to-analogue decoder operating in opposite directions
of transmission in the same equipment
Composite Source Signal (CSS): signal composed in time by various signal elements
diffuse field equalization: equalization of the HATS sound pick-up, equalization of the difference, in dB, between the
spectrum level of the acoustic pressure at the ear Drum Reference Point (DRP) and the spectrum level of the acoustic
pressure at the HATS Reference Point (HRP) in a diffuse sound field with the HATS absent using the reverse nominal
curve given in table 3 of ITU-T Recommendation P.58 [17]
Ear Reference Point (ERP): virtual point for geometric reference located at the entrance to the listener's ear,
traditionally used for calculating telephonometric loudness ratings
ear-Drum Reference Point (DRP): point located at the end of the ear canal, corresponding to the ear-drum position
freefield reference point: point located in the free sound field, at a distance of at least 1,5 m from a sound source radiating
in free air (in the case of a head and torso simulator [HATS], in the centre of the artificial head with no artificial head
present)
Head And Torso Simulator (HATS) for telephonometry: manikin extending downward from the top of the head to
the waist, designed to simulate the sound pick-up characteristics and the acoustic diffraction produced by a median
human adult and to reproduce the acoustic field generated by the human mouth
Mouth Reference Point (MRP): point located on axis and 25 mm in front of the lip plane of a mouth simulator
nominal setting of the volume control: when a receive volume control is provided, the setting which is closest to the
nominal RLR of 2 dB
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
CSS Composite Source Signal
D D-value of terminal
DRP ear Drum Reference Point
EL Echo Loss
ERP Ear Reference Point
HATS Head And Torso Simulator
MOS-LQOy Mean Opinion Score - Listening Quality Objective
NOTE: See ITU-T Recommendation P.800.1 [28].
MRP Mouth Reference Point
NLP Non Linear Processor
PCM Pulse Code Modulation
PESQ™ Perceptual Evaluation of Speech Quality™
PLC Packet Loss Concealment
PN Pseudo-random Noise
POI Point Of Interconnect
PSTN Public Switched Telephone Network
QoS Quality of Service
RLR Receive Loudness Rating
SLR Send Loudness Rating
STMR SideTone Masking Rating
TCLw Terminal Coupling Loss (weighted)
TOSQA Telecommunication Objective Speech Quality Assessment
TCN Trace Control for Netem
4 General considerations
4.1 Default Coding Algorithm
VoIP terminals shall support the coding algorithm according to ITU-T Recommendation G.711 [8] (both µ-law and
A-law). VoIP terminals may support other coding algorithms.
NOTE: Associated Packet Loss Concealment (PLC) e.g. as defined in ITU-T Recommendation G.711 [8]
appendix I should be used.
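The companding idea behind the µ-law variant can be illustrated with a minimal sketch. It uses the continuous µ-law characteristic with µ = 255 on samples normalized to the range -1 to 1. This is an assumption-laden illustration only: ITU-T Recommendation G.711 [8] specifies a segmented 8-bit encoding that approximates this curve, and the function names below are hypothetical and not taken from any Recommendation.

```python
import math

MU = 255  # companding parameter of the continuous mu-law curve (illustrative)


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Compress a normalized sample x in [-1, 1] with the continuous mu-law curve."""
    x = max(-1.0, min(1.0, x))  # clip to the normalized range
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Invert mu_law_compress for a compressed value y in [-1, 1]."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


if __name__ == "__main__":
    for sample in (0.001, 0.01, 0.1, 0.5, 1.0):
        compressed = mu_law_compress(sample)
        restored = mu_law_expand(compressed)
        print(f"{sample:6.3f} -> {compressed:.4f} -> {restored:.4f}")
```

Low-level samples are expanded in the compressed domain, so a uniform quantizer applied after compression spends more of its steps on quiet speech. A-law follows the same principle with a different curve.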
4.2 End-to-end considerations
In order to achieve a desired end-to-end speech transmission performance (mouth-to-ear) it is recommended that the
general rules of transmission planning are carried out with the E-model of ITU-T Recommendation G.107 [4] taking
into account that the E-model does not yet address headsets; this includes the a-priori determination of the desired
category of speech transmission quality as defined in ITU-T Recommendation G.109 [6].
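As a rough illustration of how an E-model result relates to a quality category, the sketch below converts a transmission rating factor R into an estimated MOS using the conversion formula published with ITU-T Recommendation G.107 [4], and maps R onto the category bands as commonly tabulated from ITU-T Recommendation G.109 [6]. The function names are hypothetical, the band edges should be checked against the Recommendation itself, and computing R from terminal and network parameters is outside the scope of this sketch.

```python
def r_to_mos(r: float) -> float:
    """Estimate MOS from the E-model rating factor R (G.107 conversion formula)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def quality_category(r: float) -> str:
    """Map R onto a speech transmission quality category (assumed G.109 bands)."""
    bands = [(90.0, "Best"), (80.0, "High"), (70.0, "Medium"),
             (60.0, "Low"), (50.0, "Poor")]
    for lower_bound, name in bands:
        if r >= lower_bound:
            return name
    return "Not recommended"


if __name__ == "__main__":
    for r in (93.2, 80.0, 65.0, 45.0):
        print(f"R = {r:5.1f}  MOS = {r_to_mos(r):.2f}  category = {quality_category(r)}")
```

A planner would typically fix the target category first and then check whether the R value predicted for a given terminal and network configuration stays inside that band.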
While, in general, the transmission characteristics of single circuit-oriented network elements, such as switches or
terminals can be assumed to have a single input value for the planning tasks of ITU-T Recommendation G.108 [5], this
approach is not applicable in packet based systems and thus there is a need for the transmission planner's specific
attention.
In particular, the decision as to which delay measured according to the present document is acceptable or
representative for the specific configuration is the responsibility of the individual transmission planner.
ITU-T Recommendation G.108 with its amendments [5] provides further guidance on this important issue.
The following optimum terminal parameters from a user's perspective need to be considered:
• Minimized delay in send and receive direction.
• Optimum Loudness Rating (RLR, SLR).
• Compensation for network delay variation.
• Packet loss recovery performance.
• Maximized terminal coupling loss.
4.3 Parameters to be investigated
4.3.1 Basic parameters
The basic parameters are based on I-ETS 300 245-2 [1].
4.3.2 Further Parameters with respect to Speech Processing Devices
For VoIP terminals that contain non-linear speech processing devices, the following parameters require additional
attention in the context of the present document:
• Objective evaluation of speech quality for VoIP terminals.
• Doubletalk capability.
• Time-variant impairments.
- Switching behaviour.
- Partial echo effects.
- Occurrence of artefacts.
- Clock accuracy.
• Background noise performance of the terminal.
• Etc.
The measurements of these further parameters with respect to speech processing devices, which are new to terminal
requirement standards, have been used successfully in the ETSI VoIP speech quality test events (TR 102 648-1 [i.1]).
5 Test equipment
5.1 IP half channel measurement adaptor
The IP half channel measurement adaptor is described in EG 202 425 [i.4].
5.2 Environmental conditions for tests
The follow
...
