3GPP EVS Codec for mobile communications
The 3rd Generation Partnership Project (3GPP), the international standardization consortium for mobile communications, has defined Enhanced Voice Services (EVS), a new speech and audio coding standard for Voice over Long-Term Evolution (VoLTE). Conventional speech coding schemes for mobile phones have been based on Code Excited Linear Prediction (CELP). These schemes use a model of human voice production and achieve high-quality speech transmission at very low bit rates. In addition to CELP, EVS includes newly invented low-delay, low-bit-rate audio coding modules, and it achieves high-quality transmission of various types of input signals, including speech, audio, background noise, and background music.
In contrast to the narrowband signal (8-kHz sampling rate) used for conventional fixed telephones and mobile phones and the wideband signal (16-kHz sampling rate) used for VoLTE, EVS supports signals with higher sampling rates of up to 48 kHz with the aid of new bandwidth extension technologies. Note that a wideband signal is used for AM radio, a super-wideband signal (32-kHz sampling rate) for FM radio, and a full-band signal (48-kHz sampling rate) for digital broadcasting, as shown in Figure 1. EVS has been optimized for VoLTE with a frame length of 20 ms and an algorithmic delay of 32 ms. It has been designed to minimize perceptual distortion under packet loss, whereas coding schemes for conventional 3G mobile phones were optimized for robustness against bit errors. In addition, EVS covers a wide range of bit rates, from 5.9 to 128 kbit/s, and allows the bit rate to be selected frame by frame. Because EVS is interoperable with Adaptive Multi-Rate Wideband (AMR-WB), it also allows smooth migration from the conventional VoLTE system.
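The frame length, sampling rates, and bit-rate range quoted above can be turned into per-frame figures with simple arithmetic. The following is a minimal sketch, not part of any EVS implementation: the function names are hypothetical, and it assumes, as a simplification, that each bit rate is held constant over a 20-ms frame.

```python
# Illustrative arithmetic only. The function names below are hypothetical
# and do not belong to any EVS library or reference implementation.

FRAME_MS = 20  # frame length stated in the text


def samples_per_frame(sampling_rate_hz: int, frame_ms: int = FRAME_MS) -> int:
    """Number of input samples covered by one frame."""
    return sampling_rate_hz * frame_ms // 1000


def bits_per_frame(bit_rate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Coded bits per frame, assuming a constant bit rate (a simplification)."""
    return bit_rate_bps * frame_ms // 1000


if __name__ == "__main__":
    # Sampling rates named in the text.
    for label, rate_hz in [
        ("narrowband", 8_000),
        ("wideband", 16_000),
        ("super-wideband", 32_000),
        ("full-band", 48_000),
    ]:
        print(f"{label}: {samples_per_frame(rate_hz)} samples per frame")

    # Lower and upper ends of the bit-rate range named in the text.
    for rate_bps in (5_900, 128_000):
        print(f"{rate_bps / 1000} kbit/s: {bits_per_frame(rate_bps)} bits per frame")
```

Under these assumptions, a full-band frame spans 960 input samples, while the coded frame ranges from roughly a hundred bits to a few thousand bits depending on the selected rate.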
During the standardization process, a large number of subjective quality evaluations were conducted for various coding conditions, input items, and languages. According to the report on the evaluations, EVS outperformed conventional speech and audio coding schemes in terms of quality. Using a similar procedure, NTT also conducted listening tests on Japanese materials. The results of these tests, shown in Figure 2, confirmed the superiority of EVS over the coding schemes used in conventional mobile communications systems.
EVS was developed by 12 organizations based mainly in Europe, North America, and East Asia, including Japanese companies such as Panasonic, NTT docomo, and NTT. EVS has been deployed in commercial services since the summer of 2016, such as VoLTE HD+ by NTT docomo, and by some operators in the USA and Europe as well. We believe that EVS will allow billions of people around the world to enjoy high-quality communication in the near future.
Figure 1. Audio bandwidth of mobile phones
3GPP EVS for mobile phones (VoLTE):
- Low bit rate
- Low delay
- Interoperable with current VoLTE
- Full bandwidth (up to 48-kHz sampling)
- High quality for music and noise
Figure 2. Quality comparison of EVS and current services
- EVS for VoLTE (HD+): 13.2 kbit/s (SWB: 32-kHz sampling)
- AMR-WB for VoLTE: 12.65 kbit/s (WB: 16-kHz sampling)
- AMR for FOMA (3G): 12.2 kbit/s (NB: 8-kHz sampling)