TECHNICAL SPECIFICATION
Universal Mobile Telecommunications System (UMTS);
LTE;
Codec for Enhanced Voice Services (EVS);
Detailed algorithmic description
(3GPP TS 26.445 version 12.12.0 Release 12)
3GPP TS 26.445 version 12.12.0 Release 12 1 ETSI TS 126 445 V12.12.0 (2019-03)
Reference
RTS/TSGS-0426445vcc0
Keywords
LTE,UMTS
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
The present document can be downloaded from:
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the prevailing version of an ETSI
deliverable is the one made publicly available in PDF format at www.etsi.org/deliver.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
If you find errors in the present document, please send your comment to one of the following services:
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying
and microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.
© ETSI 2019.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members.
3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and
of the 3GPP Organizational Partners.
oneM2M™ logo is a trademark of ETSI registered for the benefit of its Members and
of the oneM2M Partners.
GSM and the GSM logo are trademarks registered and owned by the GSM Association.
Intellectual Property Rights
Essential patents
IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (https://ipr.etsi.org/).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Trademarks
The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners.
ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no
right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does
not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.
Foreword
This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).
The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or
GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.
The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under
.
Modal verbs terminology
In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and
"cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of
provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.
Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions, abbreviations and mathematical expressions
3.1 Definitions
3.2 Abbreviations
3.3 Mathematical Expressions
4 General description of the coder
4.1 Introduction
4.2 Input/output sampling rate
4.3 Codec delay
4.4 Coder overview
4.4.1 Encoder overview
4.4.1.1 Linear Prediction Based Operation
4.4.1.2 Frequency Domain Operation
4.4.1.3 Inactive Signal coding
4.4.1.4 Source Controlled VBR Coding
4.4.2 Decoder overview
4.4.2.1 Parametric Signal Representation Decoding (Bandwidth Extension)
4.4.2.2 Frame loss concealment
4.4.3 DTX/CNG operation
4.4.3.1 Inactive Signal coding
4.4.4 AMR-WB-interoperable option
4.4.5 Channel-Aware Mode
4.5 Organization of the rest of the Technical Standard
5 Functional description of the encoder
5.1 Common processing
5.1.1 High-pass Filtering
5.1.2 Complex low-delay filter bank analysis
5.1.2.1 Sub-band analysis
5.1.2.2 Sub-band energy estimation
5.1.3 Sample rate conversion to 12.8 kHz
5.1.3.1 Conversion of 16, 32 and 48 kHz signals to 12.8 kHz
5.1.3.2 Conversion of 8 kHz signals to 12.8 kHz
5.1.3.3 Conversion of input signals to 16, 25.6 and 32 kHz
5.1.4 Pre-emphasis
5.1.5 Spectral analysis
5.1.5.1 Windowing and DFT
5.1.5.2 Energy calculations
5.1.6 Bandwidth detection
5.1.6.1 Mean and maximum energy values per band
5.1.7 Bandwidth decision
5.1.8 Time-domain transient detection
5.1.9 Linear prediction analysis
5.1.9.1 LP analysis window
5.1.9.2 Autocorrelation computation
5.1.9.3 Adaptive lag windowing
5.1.9.4 Levinson-Durbin algorithm
5.1.9.5 Conversion of LP coefficients to LSP parameters
5.1.9.6 LSP interpolation
5.1.9.7 Conversion of LSP parameters to LP coefficients
5.1.9.8 LP analysis at 16 kHz
5.1.10 Open-loop pitch analysis
5.1.10.1 Perceptual weighting
5.1.10.2 Correlation function computation
5.1.10.3 Correlation reinforcement with past pitch values
5.1.10.4 Normalized correlation computation
5.1.10.5 Correlation reinforcement with pitch lag multiples
5.1.10.6 Initial pitch lag determination and reinforcement based on pitch coherence with other half-frames
5.1.10.7 Pitch lag determination and parameter update
5.1.10.8 Correction of very short and stable open-loop pitch estimates
5.1.10.9 Fractional open-loop pitch estimate for each subframe
5.1.11 Background noise energy estimation
5.1.11.1 First stage of noise energy update
5.1.11.2 Second stage of noise energy update
5.1.11.2.1 Basic parameters for noise energy update
5.1.11.2.2 Spectral diversity
5.1.11.2.3 Complementary non-stationarity
5.1.11.2.4 HF energy content
5.1.11.2.5 Tonal stability
5.1.11.2.6 High frequency dynamic range
5.1.11.2.7 Combined decision for background noise energy update
5.1.11.3 Energy-based parameters for noise energy update
5.1.11.3.1 Closeness to current background estimate
5.1.11.3.2 Features related to last correlation or harmonic event
5.1.11.3.3 Energy-based pause detection
5.1.11.3.4 Long-term linear prediction efficiency
5.1.11.3.5 Additional long-term parameters used for noise estimation
5.1.11.4 Decision logic for noise energy update
5.1.12 Signal activity detection
5.1.12.1 SAD1 module
5.1.12.1.1 SNR outlier filtering
5.1.12.2 SAD2 module
5.1.12.3 Combined decision of SAD1 and SAD2 modules for WB and SWB signals
5.1.12.4 Final decision of the SAD1 module for NB signals
5.1.12.5 Post-decision parameter update
5.1.12.6 SAD3 module
5.1.12.6.1 Sub-band FFT
5.1.12.6.2 Computation of signal features
5.1.12.6.3 Computation of SNR parameters
5.1.12.6.4 Decision of background music
5.1.12.6.5 Decision of background update flag
5.1.12.6.6 SAD3 Pre-decision
5.1.12.6.7 SAD3 Hangover
5.1.12.7 Final SAD decision
5.1.12.8 DTX hangover addition
5.1.13 Coding mode determination
5.1.13.1 Unvoiced signal classification
5.1.13.1.1 Voicing measure
5.1.13.1.2 Spectral tilt
5.1.13.1.3 Sudden energy increase from a low energy level
5.1.13.1.4 Total frame energy difference
5.1.13.1.5 Energy decrease after spike
5.1.13.1.6 Decision about UC mode
5.1.13.2 Stable voiced signal classification
5.1.13.3 Signal classification for FEC
5.1.13.3.1 Signal classes for FEC
5.1.13.3.2 Signal classification parameters
5.1.13.3.3 Classification procedure
5.1.13.4 Transient signal classification
5.1.13.5 Modification of coding mode in special cases
5.1.13.6 Speech/music classification
5.1.13.6.1 First stage of the speech/music classifier
5.1.13.6.2 Scaling of features in the first stage of the speech/music classifier
5.1.13.6.3 Log-probability and decision smoothing
5.1.13.6.4 State machine and final speech/music decision
5.1.13.6.5 Improvement of the classification for mixed and music content
5.1.13.6.6 Second stage of the speech/music classifier
5.1.13.6.7 Context-based improvement of the classification for stable tonal signals
5.1.13.6.8 Detection of sparse spectral content
5.1.13.6.9 Decision about AC mode
5.1.13.6.10 Decision about IC mode
5.1.14 Coder technology selection
5.1.14.1 ACELP/MDCT-based technology selection at 9.6, 16.4 and 24.4 kbps
5.1.14.1.1 Segmental SNR estimation of the MDCT-based technology
5.1.14.1.2 Segmental SNR estimation of the ACELP technology
5.1.14.1.3 Hysteresis and final decision
5.1.14.2 TCX/HQ MDCT technology selection at 13.2 and 16.4 kbps
5.1.14.3 TCX/HQ MDCT technology selection at 24.4 and 32 kbps
5.1.14.4 TD/Multi-mode FD BWE technology selection at 13.2 kbps and 32 kbps
5.2 LP-based Coding
5.2.1 Perceptual weighting
5.2.2 LP filter coding and interpolation
5.2.2.1 LSF quantization
5.2.2.1.1 LSF weighting function
5.2.2.1.2 Bit allocation
5.2.2.1.3 Predictor allocation
5.2.2.1.4 LSF quantizer structure
5.2.2.1.5 LSFQ for voiced coding mode at 16 kHz internal sampling frequency: BC-TCVQ
5.2.2.1.6 Mid-frame LSF quantizer
5.2.3 Excitation coding
5.2.3.1 Excitation coding in the GC, VC and high rate IC/UC modes
5.2.3.1.1 Computation of the LP residual signal
5.2.3.1.2 Target signal computation
5.2.3.1.3 Impulse response computation
5.2.3.1.4 Adaptive codebook
5.2.3.1.5 Algebraic codebook
5.2.3.1.6 Combined algebraic codebook
5.2.3.1.7 Gain quantization
5.2.3.2 Excitation coding in TC mode
5.2.3.2.1 Glottal pulse codebook search
5.2.3.2.2 TC frame configurations
5.2.3.2.3 Pitch period and gain coding in the TC mode
5.2.3.2.4 Update of filter memories
5.2.3.3 Excitation coding in UC mode at low rates
5.2.3.3.1 Structure of the Gaussian codebook
5.2.3.3.2 Correction of the Gaussian codebook spectral tilt
5.2.3.3.3 Search of the Gaussian codebook
5.2.3.3.4 Quantization of the Gaussian codevector gain
5.2.3.3.5 Other parameters in UC mode
...