TR2016-103
Dimensionality Reduction of Visual Features for Efficient Retrieval and Classification
- "Dimensionality Reduction of Visual Features for Efficient Retrieval and Classification", APSIPA Transactions on Signal and Information Processing, DOI: 10.1017/ATSIP.2016.14, Vol. 5, July 2016.
@article{Boufounos2016jul,
  author = {Boufounos, Petros T. and Mansour, Hassan and Rane, Shantanu D. and Vetro, Anthony},
  title = {Dimensionality Reduction of Visual Features for Efficient Retrieval and Classification},
  journal = {APSIPA Transactions on Signal and Information Processing},
  year = 2016,
  volume = 5,
  month = jul,
  doi = {10.1017/ATSIP.2016.14},
  url = {https://www.merl.com/publications/TR2016-103}
}
Research Area:
Digital Video
Abstract:
Visual retrieval and classification are of growing importance in a number of applications, including surveillance, automotive systems, and web and mobile search. To facilitate these processes, features are often computed from images to extract discriminative aspects of the scene, such as structure, texture, or color information. Ideally, these features would be robust to changes in perspective, illumination, and other transformations. This paper examines two approaches that employ dimensionality reduction for fast and accurate matching of visual features while also being bandwidth-efficient, scalable, and parallelizable. We focus on two classes of techniques to illustrate the benefits of dimensionality reduction in the context of various industrial applications. The first method, referred to as quantized embeddings, generates a distance-preserving feature vector at low rate. The second method is a low-rank matrix factorization applied to a sequence of visual features, which exploits the temporal redundancy among the feature vectors associated with each frame in a video. Both methods are also universal in that they do not require prior assumptions about the statistical properties of the signals in the database or the query. Furthermore, they enable the system designer to navigate a rate vs. performance trade-off similar to the rate-distortion trade-off in conventional compression.
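To make the two ideas in the abstract concrete, here is a minimal sketch and not the paper's actual algorithms. It assumes two simple stand-ins: a 1-bit embedding built from the signs of Gaussian random projections (the angle between two feature vectors can be estimated from the fraction of differing bits), and a truncated SVD as one possible low-rank factorization of a matrix whose columns are per-frame feature vectors. The function names, dimensions, and synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def sign_embedding(x, A):
    """1-bit embedding: keep only the sign of each random projection of x."""
    return (x @ A.T) >= 0  # boolean array, one bit per row of A


def estimated_angle(b1, b2):
    """Estimate the angle (radians) between two vectors from their bit codes.

    For Gaussian A, the probability that a bit differs equals angle / pi.
    """
    return np.pi * np.mean(b1 != b2)


def low_rank_factors(F, rank):
    """Truncated SVD stand-in: F (dim x frames) is approximated by L @ R."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank, :]


rng = np.random.default_rng(0)
dim, bits = 128, 4096  # assumed feature dimension and code length

# Quantized embedding: more bits give a better angle estimate (rate vs. accuracy).
A = rng.standard_normal((bits, dim))
x = rng.standard_normal(dim)
y = x + 0.5 * rng.standard_normal(dim)  # a perturbed copy of x
true_angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
print("true angle:", true_angle)
print("estimated :", estimated_angle(sign_embedding(x, A), sign_embedding(y, A)))

# Low-rank factorization: 50 synthetic "frames" that share 3 underlying factors.
F = rng.standard_normal((dim, 3)) @ rng.standard_normal((3, 50))
F += 0.01 * rng.standard_normal(F.shape)  # small frame-to-frame variation
L, R = low_rank_factors(F, rank=3)
print("relative error:", np.linalg.norm(F - L @ R) / np.linalg.norm(F))
```

In this sketch the number of bits and the chosen rank play the role of the rate: increasing either improves the approximation at the cost of more data to store or transmit.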