TY - GEN
T1 - Text, image and vector graphics based appraisal of contemporary documents
AU - Lee, Sang Chul
AU - McFadden, William
AU - Bajcsy, Peter
PY - 2008
Y1 - 2008
N2 - We have designed a framework for content-based appraisal of documents. Our motivation is to provide computer-assisted support for answering several appraisal criteria according to the general appraisal guidelines in the National Archives and Records Administration (NARA) 1441 directive. The appraisal criteria led us to investigate (a) finding groups of PDF documents with similar content, (b) ranking documents according to their creation/modification time and digital volume, and (c) detecting inconsistency between ranking and content within a group of related documents. The novelty of our work is in designing a methodology and a mathematical framework for document appraisals, and in prototyping the framework to work with the text, image and vector graphics components of PDF documents. We present example results of grouping, ranking and integrity verification for groups of scientific documents about medical topics.
AB - We have designed a framework for content-based appraisal of documents. Our motivation is to provide computer-assisted support for answering several appraisal criteria according to the general appraisal guidelines in the National Archives and Records Administration (NARA) 1441 directive. The appraisal criteria led us to investigate (a) finding groups of PDF documents with similar content, (b) ranking documents according to their creation/modification time and digital volume, and (c) detecting inconsistency between ranking and content within a group of related documents. The novelty of our work is in designing a methodology and a mathematical framework for document appraisals, and in prototyping the framework to work with the text, image and vector graphics components of PDF documents. We present example results of grouping, ranking and integrity verification for groups of scientific documents about medical topics.
UR - http://www.scopus.com/inward/record.url?scp=60649084752&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2008.39
DO - 10.1109/ICMLA.2008.39
M3 - Conference contribution
AN - SCOPUS:60649084752
SN - 9780769534954
T3 - Proceedings - 7th International Conference on Machine Learning and Applications, ICMLA 2008
SP - 729
EP - 734
BT - Proceedings - 7th International Conference on Machine Learning and Applications, ICMLA 2008
T2 - 7th International Conference on Machine Learning and Applications, ICMLA 2008
Y2 - 11 December 2008 through 13 December 2008
ER -