BRB-KMeans: Enhancing Binary Data Clustering for Binary Product Quantization
- Authors
- Lee, Suwon; Choi, Sang-Min
- Issue Date
- Jul-2024
- Publisher
- Association for Computing Machinery, Inc
- Keywords
- binary clustering; binary data; binary vector; product quantization
- Citation
- SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 2306 - 2310
- Pages
- 5
- Indexed
- SCOPUS
- Journal Title
- SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
- Start Page
- 2306
- End Page
- 2310
- URI
- https://scholarworks.gnu.ac.kr/handle/sw.gnu/74200
- DOI
- 10.1145/3626772.3657898
- Abstract
- In Binary Product Quantization (BPQ), where product quantization is applied to binary data, the traditional k-majority method is used for clustering, with centroids determined based on Hamming distance and majority vote for each bit. However, this approach often leads to a degradation in clustering quality, negatively impacting BPQ's performance. To address these challenges, we introduce Binary-to-Real-and-Back K-Means (BRB-KMeans), a novel method that initially transforms binary data into real-valued vectors, performs k-means clustering on these vectors, and then converts the generated centroids back into binary data. This innovative approach significantly enhances clustering quality by leveraging the high clustering quality of k-means in the real-valued vector space, thereby facilitating future quantization for binary data. Through extensive experiments, we demonstrate that BRB-KMeans significantly enhances clustering quality and overall BPQ performance, notably outperforming traditional methods. © 2024 Owner/Author.
- Appears in Collections
- ETC > Journal Articles
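
The abstract describes a three-step pipeline: map binary vectors to real-valued vectors, run k-means there, then convert the centroids back to binary. The record does not say which transform or back-conversion the authors use, so the following is only a minimal sketch under stated assumptions: bits are cast directly to floats, centroids are binarized by thresholding at 0.5, and points are finally assigned by Hamming distance. The names `kmeans` and `brb_kmeans_sketch` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on real-valued rows of X; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid: (n, k).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def brb_kmeans_sketch(bits, k, n_iter=50, seed=0):
    """Binary -> real -> k-means -> binary, under the assumptions stated above."""
    # Assumption: the binary-to-real transform is a direct cast of bits to floats.
    real = bits.astype(np.float64)
    centroids = kmeans(real, k, n_iter=n_iter, seed=seed)
    # Assumption: real-valued centroids are binarized by thresholding at 0.5.
    binary_centroids = (centroids >= 0.5).astype(np.uint8)
    # Assign each binary vector to its nearest binary centroid by Hamming distance.
    hamming = (bits[:, None, :] != binary_centroids[None, :, :]).sum(axis=2)
    return binary_centroids, hamming.argmin(axis=1)


# Usage on random 16-bit vectors (synthetic data, for illustration only).
rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(200, 16), dtype=np.uint8)
codebook, codes = brb_kmeans_sketch(data, k=4)
```

In a product quantization setting, a routine like this would be run once per sub-vector block to build one binary codebook per block. That wiring is omitted here because the record gives no details about it.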
