Abstract
The collection fusion problem for image databases concerns retrieving relevant images through content-based retrieval from image databases distributed on the Web. While database selection and collection fusion have been studied extensively for text databases, little research has addressed image databases. Image databases on the Web are heterogeneous: they use different similarity measures and process queries according to their own policies. Our previous study [Inf. Process. Lett. 75 (1-2) (2000) 35] provided three algorithms for this problem. In this paper, the metaserver selects image databases whose similarity measures are correlated with a global similarity measure and then submits a query to them. We propose a new algorithm for this metaserver that uses Bayesian estimation for a linear regression model. It outperforms the previous approach across diverse result-set sizes, and the improvement in effectiveness is especially large for small result sets. We also provide a virtual optimal algorithm against which our algorithm is compared. Extensive experiments show the superiority of the Bayesian method over the others.
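As a rough illustration of the idea summarized in the abstract, and not the paper's actual algorithm, the sketch below fits a Bayesian linear regression that maps each database's local similarity score to the global similarity measure, then merges results by estimated global score. Everything here is an assumption made for illustration: the function names `posterior_mean` and `fuse`, the zero-mean Gaussian prior, the known noise variance, and the sample scores.

```python
from typing import Dict, List, Tuple


def posterior_mean(local: List[float], global_: List[float],
                   noise_var: float = 0.01,
                   prior_var: float = 10.0) -> Tuple[float, float]:
    """Posterior mean (intercept, slope) for global ~ a + b * local.

    Assumes a N(0, prior_var * I) prior on (a, b) and Gaussian noise with
    known variance noise_var (both are illustrative assumptions).
    """
    n = len(local)
    sx = sum(local)
    sxx = sum(x * x for x in local)
    sy = sum(global_)
    sxy = sum(x * y for x, y in zip(local, global_))
    # Posterior precision A = X^T X / noise_var + I / prior_var (2x2, symmetric).
    a11 = n / noise_var + 1.0 / prior_var
    a12 = sx / noise_var
    a22 = sxx / noise_var + 1.0 / prior_var
    # Right-hand side X^T y / noise_var.
    b1 = sy / noise_var
    b2 = sxy / noise_var
    # Solve A * m = b with the closed-form 2x2 inverse.
    det = a11 * a22 - a12 * a12
    intercept = (a22 * b1 - a12 * b2) / det
    slope = (a11 * b2 - a12 * b1) / det
    return intercept, slope


def fuse(results: Dict[str, List[Tuple[str, float]]],
         models: Dict[str, Tuple[float, float]], k: int) -> List[str]:
    """Merge per-database hits by estimated global similarity; return top-k ids."""
    merged = []
    for db, hits in results.items():
        a, b = models[db]
        for image_id, local_score in hits:
            merged.append((a + b * local_score, image_id))
    merged.sort(reverse=True)
    return [image_id for _, image_id in merged[:k]]


# Hypothetical training pairs: (local score, global score) for sampled images.
model_a = posterior_mean([0.2, 0.4, 0.6, 0.8], [0.1, 0.2, 0.3, 0.4])
top = fuse(
    {"A": [("a1", 0.9), ("a2", 0.8)], "B": [("b1", 0.6), ("b2", 0.3)]},
    {"A": (0.0, 0.5), "B": (0.0, 1.0)},
    k=3,
)
```

In this toy setup, database A's local scores overstate global similarity by a factor of two, so its hits are scaled down before merging. A fuller treatment would also use the posterior variance to decide how many images to request from each database.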
Original language | English |
---|---|
Pages (from-to) | 267-285 |
Number of pages | 19 |
Journal | Information Processing and Management |
Volume | 39 |
Issue number | 2 |
State | Published - Mar 2003 |
Externally published | Yes |
Bibliographical note
Funding Information: This work was supported by Korea Research Foundation Grant (KRF-2000-041-E00262). We would like to thank Dr. Gabriella Pasi, an editor, for helpful instructions and anonymous reviewers for valuable comments. We also wish to thank Dr. Ju-Hong Lee for useful discussions.
Keywords
- Bayesian model
- Collection fusion
- Image database
- Similarity search