Contribution to CCITT SG XII Experts Group on Speech Quality, Document Number SQ 13.92, February 1992.

Observations on the T-Reference Condition for Speech Coder Evaluation

S. Voran

Abstract: In a Study Group XII Contribution dated September 1991, John Rosenberger and Bill Cotton of Bellcore introduced an algorithm for generating temporally correlated distortion on speech data sampled at 8 kHz. This distortion is parameterized by a single integer value T, and is referred to as Temporally Correlated Noise or the T-Reference Condition (T-Ref). The T-Ref is a precisely defined, repeatable distortion process that can generate a wide range of distortion levels, from virtually no distortion (T = 256) to distortion that renders speech unintelligible (T less than about 4). This distortion tends to sound more like a low bit rate speech coder than the modulated noise reference unit (MNRU) does. In fact, subjective similarity tests at Bellcore revealed that when both the T-Ref and the MNRU were available for matching the sound of low bit rate coders, listeners overwhelmingly selected the T-Ref over the MNRU. This similarity of sound is a highly desirable property when using a reference condition to evaluate speech coders. The properties mentioned above make the T-Ref a candidate to replace the MNRU in some tests. This potential utility makes the T-Ref an interesting subject for further study to determine exactly how and why it works as it does. In this contribution, we offer some observations from our study of the T-Ref. First, we provide the definition of the process and note several of its properties. Next, we provide time and frequency domain demonstrations of the effects of the T-Ref on sinusoids. We then show its frequency domain response to speech data and compare that response with those of voice coders and the MNRU. Finally, we suggest a moving average digital filter representation for the T-Ref.

