Abstract
This paper addresses the problem of identifying impairment types that might be present in a speech signal. In particular, three acoustically induced degradation types that occur in teleconference systems are considered: acoustic echo, reverberation, and broadband noise, as well as combinations among them. The proposed system is double-ended (full reference) and is developed using a database of degraded full-band speech signals created according to a model for teleconference systems. A set of features obtained from both the degraded and non-degraded signals is proposed and shown to adequately capture information associated with each degradation type. A random forest classifier and a support vector machine are successfully employed, achieving a classification error below 2%. Such classifiers can be used to select an appropriate quality assessment tool for a given degraded signal.
Original language | English |
---|---|
Article number | 5753923 |
Pages (from-to) | 2516-2526 |
Number of pages | 11 |
Journal | IEEE Transactions on Audio, Speech and Language Processing |
Volume | 19 |
Issue number | 8 |
State | Published - 2011 |
Externally published | Yes |
Bibliographical note
Funding Information: Manuscript received December 21, 2010; revised March 28, 2011; accepted March 29, 2011. Date of publication April 21, 2011; date of current version September 23, 2011. The work of L. W. P. Biscainho was supported by CNPq and FAPERJ. The work of L. O. Nunes was supported in part by CNPq. This R&D project is a cooperation between Hewlett-Packard Brasil Ltd. and COPPE/UFRJ, supported with resources of the Informatics Law (no. 8.248, from 1991). The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Lauri Savioja.
Keywords
- Pattern classification
- speech communication
- speech quality assessment
- teleconference systems