Abstract: Maps are an important source of information in archaeology and other sciences. Users want to search for historical maps to determine the recorded history of the political geography of regions at different eras, to find out exactly where archaeological artifacts were discovered, etc. Currently, they have to use a generic search engine and add the term "map" along with other keywords to search for maps. This crude method generates a significant number of false positives that the user must cull to get the desired results. To reduce this manual effort, we propose an automatic map identification, indexing, and retrieval system that enables users to search for and retrieve maps appearing in a large corpus of digital documents using simple keyword queries. We identify features that help distinguish maps from other figures in digital documents and show how a Support-Vector-Machine-based classifier can be used to identify maps. We propose map-level metadata (e.g., captions and references to the maps in the text) and document-level metadata (e.g., title, abstract, citations, and how recent the publication is) and show how they can be automatically extracted and indexed. Our novel ranking algorithm weights different metadata fields differently and also uses the document-level metadata to help rank retrieved maps. Empirical evaluations show which features should be selected and which metadata fields should be weighted more. We also demonstrate improved retrieval results in comparison to adaptations of existing methods for map retrieval. Our map search engine has been deployed in an online map-search system that is part of the Blind-Review digital library system.

Authors: Qingzhao Tan, Prasenjit Mitra, C. Lee Giles

