Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/15159
DC Field | Value |
---|---|
dc.title | Chinese word segmentation with a maximum entropy approach |
dc.contributor.author | LOW JIN KIAT |
dc.date.accessioned | 2010-04-08T10:50:39Z |
dc.date.available | 2010-04-08T10:50:39Z |
dc.date.issued | 2006-03-08 |
dc.identifier.citation | LOW JIN KIAT (2006-03-08). Chinese word segmentation with a maximum entropy approach. ScholarBank@NUS Repository. |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/15159 |
dc.description.abstract | In this thesis, we present a maximum entropy approach to Chinese word segmentation. Besides using features derived from gold-standard word-segmented training data, we also used an external dictionary and additional training corpora of different segmentation standards to further improve segmentation accuracy. The selection of useful additional training data is modeled as example selection from noisy data. Using these techniques, our word segmenter achieved state-of-the-art accuracy. We participated in the Second International Chinese Word Segmentation Bakeoff organized by SIGHAN, and evaluated our word segmenter on all four test corpora in the open track. Among 52 entries in the open track, our word segmenter achieved the highest F-measure on 3 of the 4 test corpora, and the second highest F-measure on the fourth test corpus. |
dc.language.iso | en |
dc.subject | Multi lingual processing, corpus based modeling of language, machine learning, Chinese Word Segmentation, Maximum Entropy, Noise Elimination |
dc.type | Thesis |
dc.contributor.department | COMPUTER SCIENCE |
dc.contributor.supervisor | NG HWEE TOU |
dc.description.degree | Master's |
dc.description.degreeconferred | MASTER OF SCIENCE |
dc.identifier.isiut | NOT_IN_WOS |
Appears in Collections: | Master's Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
msc.pdf | | 302.26 kB | Adobe PDF | OPEN | None | View/Download |