TY - JOUR
T1 - Quantitative analysis of automatic voice disorder detection studies for hybrid feature and classifier selection
AU - Lee, Jong Bub
AU - Lee, Hyun Gyu
N1 - Publisher Copyright: © 2024
PY - 2024/5
Y1 - 2024/5
AB - Owing to the development of machine learning, particularly deep learning, researchers have focused on automatic voice-disorder detection. However, voice-disorder datasets vary significantly in terms of the number of patients per disorder, and different conditions are targeted in different studies. Therefore, conducting direct comparisons of performances across related studies is complicated. Hence, we compare conventional machine learning, deep learning, and multimodal methods by establishing a fixed dataset and an evaluation pipeline using the Saarbrücken voice database, which is the most commonly used database for automatic voice-disorder detection. In addition, we propose an automatic voice-disorder detection method that combines features and classifiers. Experimental results show mean unweighted average recall differences of 8% and 15% on the abovementioned two datasets, respectively, and that the proposed combination improves them by 1.5% and 0.5%, respectively.
KW - Healthcare
KW - Machine learning
KW - Speech analysis
KW - Voice disorder detection
UR - http://www.scopus.com/inward/record.url?scp=85184150525&partnerID=8YFLogxK
U2 - 10.1016/j.bspc.2024.106014
DO - 10.1016/j.bspc.2024.106014
M3 - Article
AN - SCOPUS:85184150525
SN - 1746-8094
VL - 91
JO - Biomedical Signal Processing and Control
JF - Biomedical Signal Processing and Control
M1 - 106014
ER -