Abstract
In the world of big data, large numbers of images are available in social media, corporate, and even personal collections. A collection may grow quickly as new images are generated at high rates. The new images may change the distribution of existing classes or cause new classes to emerge, making the collection dynamic and subject to concept drift. For efficient image retrieval from an image collection using a query, a hash table consisting of a set of hash functions is needed to transform images into binary hash codes, which serve as the basis for finding images similar to the query. If the image collection is dynamic, the hash table built at one time step may not work well at the next, because newly added images change the collection. Therefore, the hash table needs to be rebuilt or updated at successive time steps. Incremental hashing (ICH) is the first effective method to deal with the concept drift problem in image retrieval from dynamic collections. In ICH, a new hash table is learned from newly emerging images only, which represent the data distribution of the current data environment. The new hash table is then used to generate hash codes for all images, both old and new. Because the collection is dynamic, new images of a class may not be similar to old images of the same class. To learn a new hash table that preserves within-class similarity in both old and new images, this paper proposes incremental hashing with sample selection using dominant sets (ICHDS), which selects representative samples from each class for training the new hash table. Experimental results show that ICHDS yields better retrieval performance than existing dynamic and static hashing methods.
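The retrieval pipeline described in the abstract (hash functions map feature vectors to binary codes, codes are compared to find similar images, and all images are re-encoded when the table changes) can be illustrated with a minimal sketch. It is not ICH or ICHDS: random hyperplanes stand in for the learned hash functions, and every name here (`make_hash_table`, `hash_code`, `retrieve`, `rehash`) is hypothetical and chosen only for illustration.

```python
import random


def make_hash_table(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes; each acts as one hash function.

    Assumption: random projections replace the learned hash functions
    of the paper, purely to keep the sketch self-contained.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(table, x):
    """Map a feature vector to a binary code, one bit per hyperplane."""
    return tuple(
        1 if sum(w * v for w, v in zip(plane, x)) >= 0 else 0
        for plane in table
    )


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(p != q for p, q in zip(a, b))


def retrieve(table, codes, query, k=3):
    """Indices of the k stored codes closest to the query's code."""
    q = hash_code(table, query)
    return sorted(range(len(codes)), key=lambda i: hamming(codes[i], q))[:k]


def rehash(new_table, features):
    """Re-encode every image, old and new, with the current hash table."""
    return [hash_code(new_table, x) for x in features]
```

In a dynamic collection, a function like `rehash` would be called at each time step after the table is rebuilt, so that old and new images share one code space before `retrieve` is used.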
Original language | English |
---|---|
Pages (from-to) | 2689-2702 |
Number of pages | 14 |
Journal | International Journal of Machine Learning and Cybernetics |
Volume | 11 |
Issue number | 12 |
Early online date | 24 Jun 2020 |
DOIs | |
Publication status | Published (in print/issue) - 1 Dec 2020 |
Keywords
- Concept drift
- Dominant sets
- Image retrieval
- Incremental hashing
- Semi-supervised hashing