Proceedings of the IEEE Thirteenth International Workshop on Quality of Multimedia Experience (QoMEX 2021), Montreal, June 14-17, 2021
Stephen D. Voran
We present a set of relatively small-scale proof-of-concept experiments in which we construct no-reference (NR) speech quality estimators that reliably estimate the speech quality at the input of a system under test (SUT), even though NR estimators can access only the SUT output speech. We then explain why this success is not as counterintuitive as it might initially seem. Next, we demonstrate that this advance can be used to adjust NR relative speech quality values to obtain the much more desirable and useful NR absolute speech quality values. The experiments start with over seven hours of studio-quality speech. A processor adds filtering, reverberation, and noise to simulate the somewhat lower-quality speech that often must be used to test systems. Four established full-reference (FR) speech quality estimators provide ground-truth values for these experiments.
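As a rough illustration of the kind of processing the abstract describes (filtering, reverberation, and noise applied to clean speech), the following is a minimal sketch and not the processor used in the paper. The function name `degrade`, the 5-tap moving-average filter, the synthetic exponentially decaying impulse response, the white Gaussian noise, and all parameter values are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def degrade(speech, fs=16000, snr_db=20.0, rt60_s=0.3, seed=0):
    """Hypothetical degradation chain: filter -> reverberate -> add noise.

    Returns the degraded signal and the noise that was added, so the
    achieved signal-to-noise ratio can be checked by the caller.
    """
    rng = np.random.default_rng(seed)
    speech = np.asarray(speech, dtype=float)

    # Filtering (assumed): 5-tap moving average acting as a mild low-pass.
    fir = np.ones(5) / 5.0
    filtered = np.convolve(speech, fir, mode="same")

    # Reverberation (assumed): synthetic impulse response made of noise
    # whose amplitude decays by 60 dB over rt60_s seconds.
    n_ir = int(round(rt60_s * fs))
    t = np.arange(n_ir) / fs
    ir = rng.standard_normal(n_ir) * 10.0 ** (-3.0 * t / rt60_s)
    ir[0] = 1.0                      # direct path
    ir /= np.sqrt(np.sum(ir ** 2))   # normalize to unit energy
    reverberant = np.convolve(filtered, ir)[: len(speech)]

    # Noise (assumed): white Gaussian noise scaled to the target SNR.
    signal_power = np.mean(reverberant ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    noise = rng.standard_normal(len(speech)) * np.sqrt(noise_power)

    return reverberant + noise, noise

# Usage with a one-second 200 Hz tone standing in for a speech recording.
tone = np.sin(2 * np.pi * 200 * np.arange(16000) / 16000)
degraded, added_noise = degrade(tone)
```

In an experiment like the one described, the degraded signal would play the role of the SUT input, and a full-reference estimator could score it against the original studio-quality recording.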
Watch the recording of Voran's presentation on the NTIA YouTube channel.
Keywords: speech quality; subjective testing; no reference (NR); full reference (FR); machine learning
For technical information concerning this report, contact:
Stephen D. Voran
Institute for Telecommunication Sciences