TY - JOUR
T1 - Deep Neural Network Confidence Calibration from Stochastic Weight Averaging
AU - Cao, Zongjing
AU - Li, Yan
AU - Kim, Dong Ho
AU - Shin, Byeong Seok
N1 - Publisher Copyright:
© 2024 by the authors.
PY - 2024/2
Y1 - 2024/2
N2 - Overconfidence in deep neural networks (DNNs) reduces a model’s generalization performance and increases its risk. The deep ensemble method improves model robustness and generalization by combining the prediction results of multiple DNNs. However, training multiple DNNs for model averaging is a time-consuming and resource-intensive process. Moreover, combining multiple base learners (also called inducers) is hard to master, and a wrong choice may yield lower prediction accuracy than a single inducer. We propose an approximation method for deep ensembles that obtains an ensemble of multiple DNNs without any additional cost. Specifically, the multiple locally optimal parameter sets generated during the training phase are sampled and saved using an intelligent strategy. We use cyclic learning rates starting at 75% of the training process and save the weights associated with the minimum learning rate in every iteration. The saved sets of model parameters are then used as the weights of a new model for forward propagation during the testing phase. Experiments on benchmarks of two different modalities, static images and dynamic videos, show that our method not only reduces the model’s calibration error but also improves its accuracy.
AB - Overconfidence in deep neural networks (DNNs) reduces a model’s generalization performance and increases its risk. The deep ensemble method improves model robustness and generalization by combining the prediction results of multiple DNNs. However, training multiple DNNs for model averaging is a time-consuming and resource-intensive process. Moreover, combining multiple base learners (also called inducers) is hard to master, and a wrong choice may yield lower prediction accuracy than a single inducer. We propose an approximation method for deep ensembles that obtains an ensemble of multiple DNNs without any additional cost. Specifically, the multiple locally optimal parameter sets generated during the training phase are sampled and saved using an intelligent strategy. We use cyclic learning rates starting at 75% of the training process and save the weights associated with the minimum learning rate in every iteration. The saved sets of model parameters are then used as the weights of a new model for forward propagation during the testing phase. Experiments on benchmarks of two different modalities, static images and dynamic videos, show that our method not only reduces the model’s calibration error but also improves its accuracy.
KW - confidence calibration
KW - deep ensemble learning
KW - stochastic weight averaging
UR - http://www.scopus.com/inward/record.url?scp=85184689907&partnerID=8YFLogxK
U2 - 10.3390/electronics13030503
DO - 10.3390/electronics13030503
M3 - Article
AN - SCOPUS:85184689907
SN - 2079-9292
VL - 13
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 3
M1 - 503
ER -