WEnets: A Convolutional Framework for Evaluating Audio Waveforms

Andrew A. Catellier

September 2019 | White Paper

doi: 10.70220/d69zp0f4

Andrew A. Catellier and Stephen D. Voran

Abstract:

In this white paper, we describe a new convolutional framework for waveform evaluation, WEnets, and use it to build a Narrowband Audio Waveform Evaluation Network, or NAWEnet. NAWEnet is single-ended (no-reference) and was trained three separate times to emulate PESQ, POLQA, or STOI, achieving testing correlations of 0.95, 0.92, and 0.95, respectively, when trained on only 50% of the available data and tested on 40%. Stacks of 1-D convolutional layers and non-linear downsampling learn which features are important for quality or intelligibility estimation. This straightforward architecture simplifies the interpretation of the network's inner workings and paves the way for future investigations into higher sample rates and accurate no-reference subjective speech quality predictions.
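
The idea of stacking 1-D convolutions with non-linear downsampling can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the kernel values, the pooling widths, the use of ReLU and max pooling, and the final averaging step are hypothetical and are not taken from NAWEnet, whose actual layers and trained weights are described in the white paper itself.

```python
from typing import List, Sequence, Tuple


def conv1d(signal: Sequence[float], kernel: Sequence[float], bias: float = 0.0) -> List[float]:
    """Valid-mode 1-D cross-correlation, the operation usually called
    'convolution' in neural-network layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(x: Sequence[float]) -> List[float]:
    """Element-wise rectifier (assumed non-linearity, for illustration only)."""
    return [max(0.0, v) for v in x]


def max_pool(x: Sequence[float], width: int) -> List[float]:
    """Non-overlapping max pooling: one form of non-linear downsampling."""
    return [max(x[i:i + width]) for i in range(0, len(x) - width + 1, width)]


def toy_estimator(waveform: Sequence[float],
                  layers: Sequence[Tuple[Sequence[float], int]]) -> float:
    """Apply a stack of (kernel, pool_width) stages directly to raw samples,
    then average the remaining values into a single score."""
    x = list(waveform)
    for kernel, pool_width in layers:
        x = max_pool(relu(conv1d(x, kernel)), pool_width)
    return sum(x) / len(x)


# Hypothetical two-stage stack applied to a short synthetic waveform.
waveform = [0.0, 1.0, 0.0, -1.0] * 4
layers = [([0.5, 0.5], 2), ([1.0, -1.0], 2)]
score = toy_estimator(waveform, layers)
```

Each stage shortens the signal, so a deep enough stack reduces a waveform to a small number of values from which a single estimate can be formed. In a trained network the kernel values would be learned from target scores rather than fixed by hand as they are here.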

Keywords: speech quality; no reference (NR); speech intelligibility; CNN; neural nets

(WEnets_WhitePaper.pdf)

For technical information concerning this report, contact:

Stephen D. Voran
Institute for Telecommunication Sciences
(720) 446-6425
svoran@ntia.gov

Disclaimer:

Certain commercial equipment, components, and software may be identified in this report to specify adequately the technical aspects of the reported results. In no case does such identification imply recommendation or endorsement by the National Telecommunications and Information Administration, nor does it imply that the equipment or software identified is necessarily the best available for the particular application or uses.

For questions or information on this or any other NTIA scientific publication, contact the ITS Publications Office at ITSinfo@ntia.gov or 303-497-3572.
