CV CVPR

Clusformer: A Transformer based Clustering Approach to Unsupervised Large-scale Face and Visual Landmark Recognition

March 2, 2021
Research in automatic unsupervised visual clustering has received considerable attention over the last couple of years. It aims to explain the distributions of unlabeled visual images by clustering them via a parameterized model of appearance. Graph Convolutional Neural Networks (GCNs) have recently been among the most popular clustering methods. However, they have several limitations. First, they are quite sensitive to hard or noisy samples. Second, their computationally expensive training makes it hard to experiment with different deep network models. Finally, it is hard to design an end-to-end training model that joins deep feature extraction and GCN clustering. This work therefore presents Clusformer, a simple but new Transformer-based perspective on automatic visual clustering via its unsupervised attention mechanism. The proposed method deals robustly with noisy or hard samples. It is also flexible and works effectively with deep network models of various sizes in an end-to-end framework. The proposed method is evaluated on two popular large-scale visual databases, i.e., Google Landmark and the MS-Celeb-1M face database, and outperforms prior unsupervised clustering methods. Code will be available at https://github.com/VinAIResearch/Clusformer

Nguyen Xuan Bac, Duc Toan Bui, Chi Nhan Duong, Tien D. Bui and Khoa Luu

CVPR 2021

