Examining the full range of a document's lifetime, this volume reviews the issues involved in handling and processing digital documents. Topics include acquisition, representation, security, pre-processing, layout analysis and analysis of single components.
This text reviews the issues involved in handling and processing digital documents. Examining the full range of a document's lifetime, the book covers acquisition, representation, security, pre-processing, layout analysis, understanding, analysis of single components, information extraction, filing, indexing and retrieval. Features: provides a list of acronyms and a glossary of technical terms; contains appendices covering key concepts in machine learning, and providing a case study on building an intelligent system for digital document and library management; discusses issues of security, and legal aspects of digital documents; examines core issues of document image analysis, and image processing techniques of particular relevance to digitized documents; reviews the resources available for natural language processing, in addition to techniques of linguistic analysis for content handling; investigates methods for extracting and retrieving data/information from a document.
From the reviews:
"This book provides a background in the area of document image analysis. It has general information on image analysis, information on document image analysis, and then information specific to the application of word and phrase recognition within document images. ... I would definitely recommend this book to novice researchers in document analysis and recognition, especially to those new to computer vision as well." (Jeremy Svendsen, IAPR Newsletter, Vol. 34 (3), July-August, 2012)