Please use this identifier to cite or link to this item:
https://doi.org/10.4230/LIPIcs.ITCS.2023.31
DC Field | Value | |
---|---|---|
dc.title | Clustering Permutations: New Techniques with Streaming Applications | |
dc.contributor.author | Chakraborty, D | |
dc.contributor.author | Das, D | |
dc.contributor.author | Krauthgamer, R | |
dc.date.accessioned | 2023-06-14T07:23:19Z | |
dc.date.available | 2023-06-14T07:23:19Z | |
dc.date.issued | 2023-01-01 | |
dc.identifier.citation | Chakraborty, D, Das, D, Krauthgamer, R (2023-01-01). Clustering Permutations: New Techniques with Streaming Applications 251. ScholarBank@NUS Repository. https://doi.org/10.4230/LIPIcs.ITCS.2023.31 | |
dc.identifier.isbn | 9783959772631 | |
dc.identifier.issn | 1868-8969 | |
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/241977 | |
dc.description.abstract | We study the classical metric k-median clustering problem over a set of input rankings (i.e., permutations), which has myriad applications, from social-choice theory to web search and databases. A folklore algorithm provides a 2-approximate solution in polynomial time for all k = O(1), and works irrespective of the underlying distance measure, so long it is a metric; however, going below the 2-factor is a notorious challenge. We consider the Ulam distance, a variant of the well-known edit-distance metric, where strings are restricted to be permutations. For this metric, Chakraborty, Das, and Krauthgamer [SODA, 2021] provided a (2 − δ)-approximation algorithm for k = 1, where δ ≈ 2−40. Our primary contribution is a new algorithmic framework for clustering a set of permutations. Our first result is a 1.999-approximation algorithm for the metric k-median problem under the Ulam metric, that runs in time (k log(nd))O(k)nd3 for an input consisting of n permutations over [d]. In fact, our framework is powerful enough to extend this result to the streaming model (where the n input permutations arrive one by one) using only polylogarithmic (in n) space. Additionally, we show that similar results can be obtained even in the presence of outliers, which is presumably a more difficult problem. | |
dc.source | Elements | |
dc.type | Conference Paper | |
dc.date.updated | 2023-06-14T05:43:14Z | |
dc.contributor.department | DEPARTMENT OF COMPUTER SCIENCE | |
dc.description.doi | 10.4230/LIPIcs.ITCS.2023.31 | |
dc.description.volume | 251 | |
dc.published.state | Published | |
Appears in Collections: | Staff Publications Elements |
Show simple item record
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
2212.01821.pdf | Published version | 364.53 kB | Adobe PDF | OPEN | Pre-print | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.