Please use this identifier to cite or link to this item: https://doi.org/10.1109/WI.2004.10043
Title: Efficient wrapper reinduction from dynamic web sources
Authors: Mohapatra, R.; Rajaraman, K.; Yuan, S.S.
Issue Date: 2004
Source: Mohapatra, R., Rajaraman, K., Yuan, S.S. (2004). Efficient wrapper reinduction from dynamic web sources. Proceedings - IEEE/WIC/ACM International Conference on Web Intelligence, WI 2004: 391-397. ScholarBank@NUS Repository. https://doi.org/10.1109/WI.2004.10043
Abstract: This paper investigates wrapper induction from web sites whose layout may change over time. We formulate the reinduction as an incremental learning problem and identify that wrapper induction from an incomplete label is a key problem to be solved. We propose a novel algorithm for incrementally inducing LR wrappers and show that this algorithm asymptotically identifies the correct wrapper as the number of tuples is increased. This property is used to propose an LR wrapper reinduction algorithm. This algorithm requires examples to be provided exactly once; thereafter, it can detect layout changes and reinduce wrappers automatically. In experimental studies, we observe that the reinduction algorithm is able to achieve near-perfect performance. © 2004 IEEE.
Source Title: Proceedings - IEEE/WIC/ACM International Conference on Web Intelligence, WI 2004
URI: http://scholarbank.nus.edu.sg/handle/10635/41541
ISBN: 0769521002
DOI: 10.1109/WI.2004.10043
Appears in Collections: Staff Publications


