Title: Discovering typical structures of documents: A road map approach
Authors: Wang, Ke
Liu, Huiqing
Issue Date: 1998
Citation: Wang, Ke, Liu, Huiqing (1998). Discovering typical structures of documents: A road map approach. SIGIR Forum (ACM Special Interest Group on Information Retrieval): 146-154. ScholarBank@NUS Repository.
Abstract: The structure of a document refers to the role and hierarchy of subdocument references. Many online documents are similarly structured, though not identically structured. We study the problem of discovering 'typical' structures of a collection of such documents, where the user specifies the minimum frequency of a typical structure. We will consider structural features of subdocument references such as labeling, nesting, ordering, cyclicity, and wild-card references, like those found on the Web and digital libraries. Typical structures can be used to serve the following purposes. (a) The 'table-of-content' for gaining the general information of a source. (b) A road map for browsing and querying a source. (c) A basis for clustering documents. (d) Partial schemas for building structured layers to provide standard database access methods. (e) User/customer's interests and browsing patterns. We present a solution to the discovery problem.
Source Title: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
ISSN: 0163-5840
Appears in Collections: Staff Publications
