Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/40991
DC Field | Value
dc.title | Using micro information units for internet search
dc.contributor.author | Li, X.
dc.contributor.author | Liu, B.
dc.contributor.author | Phang, T.-H.
dc.contributor.author | Hu, M.
dc.date.accessioned | 2013-07-04T08:17:04Z
dc.date.available | 2013-07-04T08:17:04Z
dc.date.issued | 2002
dc.identifier.citation | Li, X., Liu, B., Phang, T.-H., Hu, M. (2002). Using micro information units for internet search. International Conference on Information and Knowledge Management, Proceedings: 566-573. ScholarBank@NUS Repository.
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/40991
dc.description.abstract | Internet search is one of the most important applications of the Web. A search engine takes the user's keywords to retrieve and to rank those pages that contain the keywords. One shortcoming of existing search techniques is that they do not give due consideration to the micro-structures of a Web page. A Web page is often populated with a number of small information units, which we call micro information units (MIU). Each unit focuses on a specific topic and occupies a specific area of the page. During the search, if all the keywords in the user query occur in a single MIU of a page, the top ranking results returned by a search engine are generally relevant and useful. However, if the query words scatter at different MIUs in a page, the pages returned can be quite irrelevant (which causes low precision). The reason for this is that although a page has information on individual MIUs, it may not have information on their intersections. In this paper, we propose a technique to solve this problem. At the off-line pre-processing stage, we segment each page to identify the MIUs in the page, and index the keywords of the page according to the MIUs in which they occur. In searching, our retrieval and ranking algorithm utilizes this additional information to return those most relevant pages. Experimental results show that this method is able to significantly improve the search precision.
dc.source | Scopus
dc.subject | Micro information units
dc.subject | Web page segmentation
dc.subject | Web search
dc.type | Conference Paper
dc.contributor.department | COMPUTER SCIENCE
dc.description.sourcetitle | International Conference on Information and Knowledge Management, Proceedings
dc.description.page | 566-573
dc.identifier.isiut | NOT_IN_WOS
Appears in Collections: Staff Publications
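
The abstract describes indexing keywords by the MIU in which they occur and using that information at ranking time. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes each page has already been segmented into MIU text strings (the record does not describe the segmentation method), and the names `tokenize`, `build_index`, and `search`, as well as the scoring rule (the fraction of query terms found together in the page's best single MIU), are hypothetical choices made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(pages):
    """Build a term -> {page_id -> set of MIU indices} index.

    `pages` maps a page id to a list of MIU text strings. The segmentation
    into MIUs is assumed to have been done already (hypothetical input).
    """
    index = defaultdict(lambda: defaultdict(set))
    for page_id, mius in pages.items():
        for miu_idx, text in enumerate(mius):
            for term in tokenize(text):
                index[term][page_id].add(miu_idx)
    return index


def search(index, query):
    """Return (page_id, score) pairs, best first.

    Only pages containing every query term are candidates. The score is an
    assumed rule: the fraction of query terms that co-occur in the page's
    best single MIU, so 1.0 means all terms fall in one MIU.
    """
    terms = sorted(tokenize(query))
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    results = []
    for page_id in candidates:
        # Count how many query terms each MIU of this page contains.
        counts = defaultdict(int)
        for posting in postings:
            for miu_idx in posting[page_id]:
                counts[miu_idx] += 1
        results.append((page_id, max(counts.values()) / len(terms)))
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Toy data (invented for illustration): page "a" has both query terms in
# one MIU, page "b" has them scattered across two MIUs.
pages = {
    "a": ["jaguar car dealer prices", "weather forecast for singapore"],
    "b": ["jaguar habitat in the amazon", "used car listings"],
}
index = build_index(pages)
print(search(index, "jaguar car"))
```

In this toy example both pages contain both query words, but page "a" ranks first because the words occur in the same MIU, while on page "b" they are scattered across unrelated units.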


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.