Please use this identifier to cite or link to this item: https://doi.org/10.1007/978-3-642-34179-3_8
Title: Probabilistically ranking web article quality based on evolution patterns
Authors: Han, J.; Chen, K.; Jiang, D.
Issue Date: 2012
Source: Han, J., Chen, K., Jiang, D. (2012). Probabilistically ranking web article quality based on evolution patterns. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7600 LNCS: 229-258. ScholarBank@NUS Repository. https://doi.org/10.1007/978-3-642-34179-3_8
Abstract: User-generated content (UGC) is created, updated, and maintained by various web users, and its data quality is a major concern to all users. We observe that each Wikipedia page usually goes through a series of revision stages, gradually approaching a relatively steady quality state and that articles of different quality classes exhibit specific evolution patterns. We propose to assess the quality of a number of web articles using Learning Evolution Patterns (LEP). First, each article's revision history is mapped into a state sequence using the Hidden Markov Model (HMM). Second, evolution patterns are mined for each quality class, and each quality class is characterized by a set of quality corpora. Finally, an article's quality is determined probabilistically by comparing the article with the quality corpora. Our experimental results demonstrate that the LEP approach can capture a web article's quality precisely. © 2012 Springer-Verlag.
Source Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
URI: http://scholarbank.nus.edu.sg/handle/10635/41251
ISBN: 9783642341786
ISSN: 03029743
DOI: 10.1007/978-3-642-34179-3_8
Appears in Collections: Staff Publications
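
The abstract's first step, mapping an article's revision history to a state sequence with a Hidden Markov Model, could be sketched roughly as below. This is only an illustration of standard Viterbi decoding, not the authors' implementation: the state names, the revision-event labels, and every probability are hypothetical values invented for the example, and the record itself does not say how the paper defines its states or observations.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations.

    Works in log space; all probabilities passed in must be non-zero.
    """
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical model: none of these names or numbers come from the paper.
states = ["building", "polishing", "stable"]
start_p = {"building": 0.8, "polishing": 0.15, "stable": 0.05}
trans_p = {
    "building": {"building": 0.7, "polishing": 0.25, "stable": 0.05},
    "polishing": {"building": 0.1, "polishing": 0.7, "stable": 0.2},
    "stable": {"building": 0.05, "polishing": 0.15, "stable": 0.8},
}
emit_p = {
    "building": {"major_edit": 0.7, "minor_edit": 0.2, "revert": 0.1},
    "polishing": {"major_edit": 0.2, "minor_edit": 0.7, "revert": 0.1},
    "stable": {"major_edit": 0.05, "minor_edit": 0.35, "revert": 0.6},
}

# A made-up revision history, one coarse event label per revision.
revisions = ["major_edit", "major_edit", "minor_edit", "minor_edit", "revert", "revert"]
state_sequence = viterbi(revisions, states, start_p, trans_p, emit_p)
print(state_sequence)
```

In a pipeline like the one the abstract outlines, state sequences of this kind would then be mined for per-class evolution patterns; that step and the final probabilistic comparison are not shown here.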
