Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/14686
DC FieldValue
dc.titleTopic detection using maximal frequent sequences
dc.contributor.authorYAP YANG LENG, IVAN
dc.date.accessioned2010-04-08T10:45:43Z
dc.date.available2010-04-08T10:45:43Z
dc.date.issued2005-03-18
dc.identifier.citationYAP YANG LENG, IVAN (2005-03-18). Topic detection using maximal frequent sequences. ScholarBank@NUS Repository.
dc.identifier.urihttp://scholarbank.nus.edu.sg/handle/10635/14686
dc.description.abstractWhen analyzing document collections, a key detail is the number of distinct topics contained within. Traditional clustering-based methods that perform topic detection do not take into account word sequence, and usually are not equipped to describe topic content. We present a new method to address this; we use Maximal Frequent word Sequences (MFSs) as building blocks in identifying distinct topics. Our method is a hybrid of an existing algorithm to discover equivalence classes containing MFSs, and a heuristic to group equivalence classes into topic clusters. The results of applying our method to a collection of newswire articles and Manufacturing technical paper abstracts suggest that our method favors datasets whose topics are specific and conceptually well separated. Our method is also useful in generating a list of distinct topics, from a dataset whose topics are not clearly defined, which acts as an intermediate result to understanding and further partitioning the dataset.
dc.language.isoen
dc.subjectTopic Detection, Document Clustering, Maximal Frequent Sequences, Equivalence Classes
dc.typeThesis
dc.contributor.departmentMECHANICAL ENGINEERING
dc.contributor.supervisorLOH HAN TONG
dc.description.degreeMaster's
dc.description.degreeconferredMASTER OF ENGINEERING
dc.identifier.isiutNOT_IN_WOS
Appears in Collections:Master's Theses (Open)

Show simple item record
Files in This Item:
File Description SizeFormatAccess SettingsVersion 
YapYLI.pdf359.3 kBAdobe PDF

OPEN

NoneView/Download

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.