March 22, 2016, 11:22 p.m.
Dr. Simon Lucey was interviewed by Popular Science! Check the news for more details.
The CI2CV Lab, led by Dr. Simon Lucey, conducts cutting-edge research and develops technology in the fields of computer vision and machine learning. We are actively engaged in the following areas:
Mobile Computer Vision: Computer vision is a discipline that attempts to extract information from images and videos. Nearly every smart device on the planet has a camera, and people are increasingly interested in developing apps that use computer vision for an ever-expanding list of tasks, including 3D mapping, photo/image search, people/object tracking, and augmented reality. Notable examples of our work in this space include our recent work at ECCV 2014. Notable commercial applications of our work in this space include the Glasses.com "virtual try on" App.
Model-Based Vision: Modeling the 3D geometry of objects is an onerous task for computer vision, but one that holds many benefits: arbitrary viewpoints and occlusion patterns can be rendered and recognized, and reasoning about interactions between objects and scenes becomes easier. This higher-level reasoning about the 3D position and placement of objects has many applications in fields beyond computer vision: helping the blind understand and interact with the world around them, autonomous navigation, visual search and query, and progress on more general problems in artificial intelligence concerning geometric reasoning and inference. Notable papers in this space include our work in CVPR 2014 and PAMI 2015.
The Role of Alignment and Learning: The use of increasingly complex representations, either hand-tuned (e.g. SIFT) or learned (e.g. convolutional neural networks), for detection, tracking and classification has resulted in substantial improvements in vision systems over the last few years. We have been exploring the link between geometric alignment and these complex representations. See our recent work from ECCV 2012.
Facial and Physical Behavior: The last two decades have seen an escalating interest in methods for automating the coding of facial and body behavior. Applications for such systems extend from the legal and business fields to national security and mental health. More generally, the expansion of behavior research holds tremendous promise for advancing our understanding of basic social and emotional processes. This deepening understanding is key to a long-awaited new era in artificial intelligence (AI) in which machines (robots, computers, mobile devices, etc.) interact, anticipate and plan seamlessly with humans. Yet, despite this keen interest, computer vision systems that efficiently and accurately code behavior (such as facial action codes or body intent) in naturally occurring circumstances remain elusive. Notable works in this space include our seminal work in real-time facial landmark alignment.
Visit http://www.springer.com/gp/book/9783319230474 for the Springer book.
Visit http://www.springer.com/gp/book/9783319247007 for the Springer book.
In this paper we tackle the problem of efficient video event detection. We argue that linear detection functions should be preferred for this task because they are scalable and efficient during both estimation and evaluation. A popular approach is to represent a sequence using a bag of words (BOW) representation, due to (i) its fixed dimensionality irrespective of the sequence length, and (ii) its ability to compactly model the statistics in the sequence. A drawback of the BOW representation, however, is that it intrinsically destroys temporal ordering information. We propose a new representation that leverages the uncertainty in relative temporal alignments between pairs of sequences while not destroying temporal ordering. Our representation, like BOW, is of a fixed dimensionality, making it easy to integrate with a linear detection function. Extensive experiments on the CK+, 6DMG, and UvA-NEMO databases show significant performance improvements across both isolated and continuous event detection tasks.
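The two BOW properties named in the abstract, and its drawback, can be illustrated with a minimal sketch. It covers only the baseline BOW representation and a generic linear score, not the representation proposed in the paper. The codebook, the 3-D frame features, the sequence lengths, and the weight vector `w` are all hypothetical values chosen for illustration.

```python
import numpy as np

def bow_histogram(frames, codebook):
    """Quantize each frame to its nearest codeword and return normalized word counts."""
    # Squared Euclidean distance between every frame and every codeword: shape (T, K).
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)  # index of the nearest codeword for each frame
    counts = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return counts / counts.sum()

def linear_score(x, w, b=0.0):
    """Generic linear detection function: w^T x + b."""
    return float(w @ x + b)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 3))    # hypothetical: 8 codewords over 3-D frame features
short_seq = rng.normal(size=(20, 3))  # hypothetical sequence of 20 frames
long_seq = rng.normal(size=(75, 3))   # hypothetical sequence of 75 frames
w = rng.normal(size=8)                # hypothetical detector weights (untrained)

h_short = bow_histogram(short_seq, codebook)
h_long = bow_histogram(long_seq, codebook)
h_reversed = bow_histogram(short_seq[::-1], codebook)  # same frames, reversed in time

# Both sequences map to an 8-dimensional vector regardless of their length.
print(h_short.shape, h_long.shape)
# Reversing the frame order leaves the histogram unchanged: ordering is lost.
print(np.allclose(h_short, h_reversed))
print(linear_score(h_short, w), linear_score(h_reversed, w))
```

Because the histogram only counts codeword occurrences, any permutation of the frames yields the same vector and therefore the same linear score. This is the loss of temporal ordering that the abstract identifies as the motivation for a different fixed-dimensional representation.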