CAREER: Similarity-Based Representation of Large-Scale Image Collections
(NSF Grant No. 1228082)

PI: Svetlana Lazebnik

Students: Joseph Tighe (Ph.D. 2013), Yunchao Gong (Ph.D. 2014), Megha Pandey (M.S. 2011), Hongtao Huang (M.S. 2013), Mariyam Khalid (M.S. 2014), Liwei Wang, Bryan Plummer, Cecilia Mauceri, Arun Mallya

Collaborators: Jan-Michael Frahm (UNC), Marc Niethammer (UNC), Maxim Raginsky (Duke and U of I), Julia Hockenmaier (U of I), Florent Perronnin (Xerox Research Centre Europe), Albert Gordo (Xerox Research Centre Europe), Sanjiv Kumar (Google), Qifa Ke (Microsoft Research)

This material is based upon work supported by the National Science Foundation under Grant No. 1228082. Additional funding comes from NSF Grants No. 0916829 and 1302438, Microsoft Research, ARO, Xerox, Sloan Foundation, and DARPA. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Last updated: July 6, 2014


PROJECT GOALS

Intellectual merit: The goal of this project is to develop a representational framework that uses similarity to capture relationships in large-scale image collections. The representation is not restricted to any specific distance function, feature, or learning model. It includes new methods to combine multiple kernels based on different cues, to learn fast kernel approximations, and to improve indexing efficiency, as well as new methods for nearest neighbor search and semi-supervised learning. The two major research problems addressed are: (1) defining and computing similarities between images in vast, expanding repositories, and representing those similarities efficiently so that the right pairs can be retrieved on demand; and (2) developing a system that can learn and predict similarities from sparse supervisory information and constantly evolving data.
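To make the efficiency goal concrete, one common strategy in large-scale similarity search is to map image descriptors to short binary codes so that candidate pairs can be retrieved by fast Hamming-distance comparisons. The sketch below is a minimal illustration of this general idea using random hyperplane projections; it is not the project's specific method, and the function names and synthetic data are assumptions made purely for illustration.

    import numpy as np

    def fit_random_hyperplanes(dim, n_bits, seed=0):
        """Sample random hyperplanes used to map descriptors to binary codes."""
        rng = np.random.default_rng(seed)
        return rng.standard_normal((dim, n_bits))

    def encode(X, W):
        """Project descriptors X (n x dim) and keep the sign bits as a compact code."""
        return (X @ W > 0).astype(np.uint8)

    def hamming_search(query_code, db_codes, k=5):
        """Return indices of the k database codes closest to the query in Hamming distance."""
        dists = np.count_nonzero(db_codes != query_code, axis=1)
        return np.argsort(dists)[:k]

    # Illustrative usage with synthetic 128-D descriptors standing in for image features.
    rng = np.random.default_rng(1)
    X_db = rng.standard_normal((10000, 128))   # "database" descriptors (hypothetical)
    x_q = rng.standard_normal((1, 128))        # query descriptor (hypothetical)

    W = fit_random_hyperplanes(dim=128, n_bits=64)
    db_codes = encode(X_db, W)
    q_code = encode(x_q, W)[0]
    print(hamming_search(q_code, db_codes, k=5))

Learned binary codes or other indexing structures can replace the random projections above; the point of the sketch is only to show how compact codes support retrieving the right pairs on demand from a large collection.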

Broader impacts: The creation of visual representations and learning algorithms capable of handling large-scale evolving multimodal data has the potential to revolutionize many scientific and consumer applications. Specific application domains include field biology, automatic localization and navigation in indoor and outdoor environments, personalized shopping and travel guides, automated assistants for the visually impaired, and security and surveillance.


PUBLICATIONS AND RESOURCES

Image Parsing (webpage with code and data)

Joint Embeddings for Images and Text (webpage with code and data)

Scene Representation

Large-Scale Similarity Search

Landmark Photo Collections