Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICDM.2013.119
DC FieldValue
dc.titleMin-max hash for jaccard similarity
dc.contributor.authorJi, J.
dc.contributor.authorLi, J.
dc.contributor.authorYan, S.
dc.contributor.authorTian, Q.
dc.contributor.authorZhang, B.
dc.date.accessioned2014-10-07T04:47:11Z
dc.date.available2014-10-07T04:47:11Z
dc.date.issued2013
dc.identifier.citationJi, J., Li, J., Yan, S., Tian, Q., Zhang, B. (2013). Min-max hash for jaccard similarity. Proceedings - IEEE International Conference on Data Mining, ICDM : 301-309. ScholarBank@NUS Repository. https://doi.org/10.1109/ICDM.2013.119
dc.identifier.issn15504786
dc.identifier.urihttp://scholarbank.nus.edu.sg/handle/10635/83960
dc.description.abstractMin-wise hash is a widely-used hashing method for scalable similarity search in terms of Jaccard similarity, while in practice it is necessary to compute many such hash functions for certain precision, leading to expensive computational cost. In this paper, we introduce an effective method, i.e. the min-max hash method, which significantly reduces the hashing time by half, yet it has a provably slightly smaller variance in estimating pair wise Jaccard similarity. In addition, the estimator of min-max hash only contains pair wise equality checking, thus it is especially suitable for approximate nearest neighbor search. Since min-max hash is equally simple as min-wise hash, many extensions based on min-wise hash can be easily adapted to min-max hash, and we show how to combine it with b-bit minwise hash. Experiments show that with the same length of hash code, min-max hash reduces the hashing time to half as much as that of min-wise hash, while achieving smaller mean squared error (MSE) in estimating pair wise Jaccard similarity, and better best approximate ratio (BAR) in approximate nearest neighbor search. © 2013 IEEE.
dc.description.urihttp://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ICDM.2013.119
dc.sourceScopus
dc.subjectapproximate nearest neighbor search
dc.subjectJaccard similarity
dc.subjectmin-max hash
dc.subjectmin-wise hash
dc.typeConference Paper
dc.contributor.departmentELECTRICAL & COMPUTER ENGINEERING
dc.description.doi10.1109/ICDM.2013.119
dc.description.sourcetitleProceedings - IEEE International Conference on Data Mining, ICDM
dc.description.page301-309
dc.identifier.isiut000332874200031
Appears in Collections:Staff Publications

Show simple item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check

Altmetric


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.