Video Description using Bidirectional Recurrent Neural Networks
Members: Marc Bolaños, Petia Radeva, Álvaro Peris (UPV), Francisco Casacuberta (UPV)
Although traditionally used in the machine translation field, the encoder-decoder framework has recently been applied to the generation of video and image descriptions. The combination of Convolutional and Recurrent Neural Networks in these models has proven to outperform the previous state of the art, producing more accurate video descriptions. In this work, we propose to push this model further by introducing two contributions into the encoding stage. First, we produce richer image representations by combining object and location information from Convolutional Neural Networks; second, we introduce Bidirectional Recurrent Neural Networks to capture both forward and backward temporal relationships among the input frames.
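The encoding stage described above can be sketched as follows. This is a toy numpy illustration, not the actual implementation: it uses plain tanh recurrent cells in place of the LSTM units of the real model, and all dimensions and parameter names are assumptions for illustration. Per-frame object and location CNN features are concatenated, then passed through a forward and a backward recurrent pass whose hidden states are joined.

```python
import numpy as np

def rnn_pass(frames, Wx, Wh, b):
    # Run a simple tanh recurrence over a sequence of frame features.
    # frames: (T, d_in) -> hidden states (T, d_h)
    h = np.zeros(Wh.shape[0])
    states = []
    for x in frames:
        h = np.tanh(Wx @ x + Wh @ h + b)
        states.append(h)
    return np.stack(states)

def bidirectional_encode(obj_feats, loc_feats, params_fwd, params_bwd):
    # Richer representation: concatenate object and location CNN features per frame.
    frames = np.concatenate([obj_feats, loc_feats], axis=1)  # (T, d_obj + d_loc)
    h_fwd = rnn_pass(frames, *params_fwd)                    # forward temporal pass
    h_bwd = rnn_pass(frames[::-1], *params_bwd)[::-1]        # backward pass, re-aligned
    # Each time step sees both past and future context.
    return np.concatenate([h_fwd, h_bwd], axis=1)            # (T, 2 * d_h)
```

The decoder (not shown) would attend over or be initialized from these bidirectional states to generate the description word by word.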
References:
- Álvaro Peris, Marc Bolaños, Petia Radeva, Francisco Casacuberta: Video Description Using Bidirectional Recurrent Neural Networks. ICANN (2) 2016: 3-11
- Álvaro Peris, Marc Bolaños, Petia Radeva, Francisco Casacuberta: Video Description using Bidirectional Recurrent Neural Networks. CoRR abs/1604.03390 (2016)
- M. Dimiccoli, M. Bolaños, E. Talavera, M. Aghaei, S. Nikolov, P. Radeva: Semantic Regularized Clustering for Egocentric Photo Streams Segmentation. To appear in Computer Vision and Image Understanding, 2016.
Egocentric Image Retrieval with Convolutional Neural Networks
Members: Gabriel de Oliveira, Mariella Dimiccoli, Petia Radeva
Recent advances in lifelogging technologies, and in particular in the field of wearable cameras, have made it possible to capture our daily life continuously, from a first-person point of view and in a free-hand fashion. However, given the large amount of images captured and the rate at which they accumulate (up to 2,000 images per day), there is a strong need for efficient and scalable indexing and retrieval systems over egocentric images. To cope with these requirements, we developed a full Content-Based Image Retrieval system based on Convolutional Neural Network (CNN) features. In our approach, we use egocentric images to create a Lucene index with off-the-shelf features extracted from a pre-trained CNN. The extracted features are integrated into Solr, an open-source, state-of-the-art inverted-index search platform. Finally, we provide a web-based prototype for egocentric image search and retrieval and tested its performance on the EDUB egocentric dataset.
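The retrieval step can be illustrated with a minimal local stand-in for the index. This sketch assumes the CNN features have already been extracted; it replaces the Lucene/Solr inverted index of the actual system with an in-memory matrix, ranking images by cosine similarity to the query. Function names are illustrative, not part of the system.

```python
import numpy as np

def build_index(features):
    # L2-normalize off-the-shelf CNN features so that a dot product
    # against a normalized query equals cosine similarity.
    # features: (n_images, d) matrix of extracted descriptors.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, 1e-12)

def retrieve(index, query_feat, top_k=5):
    # Rank all indexed images by cosine similarity to the query image.
    q = query_feat / max(np.linalg.norm(query_feat), 1e-12)
    scores = index @ q
    ranked = np.argsort(-scores)[:top_k]   # indices of the best matches
    return ranked, scores[ranked]
```

In the deployed system this nearest-neighbor step is delegated to Solr, which scales the same idea to large egocentric collections.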
References:
- Gabriel Oliveira-Barra, Mariella Dimiccoli, Petia Radeva: Egocentric Image Retrieval with Convolutional Neural Networks. CCIA 2016: 71-76
- G. de Oliveira, A. Cartas, M. Bolaños, M. Dimiccoli, M. Aghaei, M. Carné, X. Giró-i-Nieto, P. Radeva: LEMoRe: A Lifelog Engine for Moments Retrieval at the NTCIR-Lifelog LSAT Task. NTCIR-12 Conference & EVIA, NII, Tokyo, Japan, June 2016
- Aniol Lidon, Marc Bolaños, Markus Seidl, Xavier Giró-i-Nieto, Petia Radeva, Matthias Zeppelzauer: UPC-UB-STP @ MediaEval 2015 Diversity Task: Iterative Reranking of Relevant Images. MediaEval 2015