Joint Analyses of No-Reference Speech Quality Estimation Tools and Conference Speech Recorded in Diverse Real-World Conditions

Jaden Pieper

July 2024 | Technical Memorandum NTIA TM-24-571

doi: 10.70220/rhkmg3bd

Jaden Pieper and Stephen D. Voran

Abstract:

Recently, prerecorded audio and video presentations, as well as virtual meetings, have become a common component of professional life due to health and environmental considerations. This places new responsibility on participants to generate audio of sufficiently high quality to communicate effectively. This memorandum provides analyses of real-world audio from a virtual component of a 2023 conference, which encompasses a wide range of recording environments and conditions. We use both signal analyses and novel machine learning-based no-reference speech quality estimators, and we evaluate their performance relative to each other. We used NISQA, WAWEnets, and TorchAudio-Squim and found that, while their scores show only modest agreement, each can be used to successfully identify low-quality speech. Finally, we offer remediation steps for speech conferencing to avoid many of the impairments observed in this work.

Keywords: intelligibility; speech quality; speech intelligibility; no reference (NR) metric; conference speech; no reference (NR) speech quality assessment; speech impairment

Related Publications:

Special Publication:

“Improving Speech Audio for Prerecorded and Live Online Conference Sessions,” Special Publication NTIA SP 24-572

For technical information concerning this report, contact:

Jaden Pieper
Institute for Telecommunication Sciences
(202) 236-7516
jpieper@ntia.gov

Disclaimer:

Certain commercial equipment, components, and software may be identified in this report to specify adequately the technical aspects of the reported results. In no case does such identification imply recommendation or endorsement by the National Telecommunications and Information Administration, nor does it imply that the equipment or software identified is necessarily the best available for the particular application or uses.

For questions or information on this or any other NTIA scientific publication, contact the ITS Publications Office at ITSinfo@ntia.gov or 303-497-3572.
