Extracting non-visual attributes from images using Deep Learning

Deep learning is a set of techniques for training neural networks with many layers. It is a subfield of machine learning, which also covers models of data other than neural networks.

The increasing performance of deep learning methods on visual recognition tasks has opened the way to new computer vision applications that rely on rich descriptions of image content. Today, we can tag an image with labels related to thousands of visual concepts such as objects and scenes, or even generate short captions describing the image content. Although some specific image classification problems remain unsolved, state-of-the-art end-to-end learning now allows us to go one step further: we can learn non-visual concepts from images that are highly relevant for data science applications.

Extracting non-visual attributes from images used to be considered very hard because of the lack of good image descriptors. For example, what is a good descriptor for labeling an image with an index of “happiness”, or a city image with an index of “livability”? Good image descriptors existed for visual attribute extraction (HoG, SIFT, Daisy, etc.) because one of the main tasks of researchers was the design of handcrafted features. This process was heavily dependent on researchers’ intuition about how to represent objects and scenes.

Deep learning automatically discovers the features that best represent the problem, rather than just a way to combine them. For example, in object recognition, shallow learning starts with handcrafted features of the image, but deep learning starts with the raw pixels. For this reason, deep learning frees the researcher from handcrafting features and is well suited to extracting useful attributes from images that are not related to the researcher’s intuition. It paves the way to answering questions such as ‘Can computer vision be used to automatically select pictures that will make your apartment listing successful on Airbnb?’.
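To make the contrast between the two starting points concrete, here is a minimal sketch in plain Python. The function `orientation_histogram` is a hypothetical, heavily simplified stand-in for a handcrafted descriptor (it is HoG-like in spirit, not the actual HoG algorithm), while `raw_pixels` shows the kind of input a deep network would start from. The toy image, the function names, and the bin count are all assumptions made for illustration.

```python
import math


def orientation_histogram(image, bins=4):
    """Simplified handcrafted descriptor (illustrative, not real HoG):
    a magnitude-weighted histogram of unsigned gradient orientations."""
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    # Only interior pixels have both horizontal and vertical neighbours.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal gradient
            gy = image[y + 1][x] - image[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the angle into [0, pi) so opposite directions share a bin.
            angle = math.atan2(gy, gx) % math.pi
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def raw_pixels(image):
    """Input a deep model would start from: every pixel, scaled to [0, 1]."""
    return [p / 255.0 for row in image for p in row]


# Toy 4x4 grayscale image containing a single vertical edge (assumed data).
toy_image = [[0, 0, 255, 255] for _ in range(4)]

handcrafted = orientation_histogram(toy_image)  # a few numbers chosen by the designer
learned_input = raw_pixels(toy_image)           # all 16 pixels, no design decisions
```

The handcrafted descriptor encodes a designer's belief that edge orientation matters, which is reasonable for objects but gives no obvious handle on attributes such as "happiness" or "livability". The raw-pixel input makes no such commitment and leaves the choice of representation to the training process.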

The main objectives of the project are:
1. To build (design, train and deploy) a deep learning system that is capable of predicting a set of non-visual attributes from images.
2. To use the system to measure the value of images in some online markets (e.g. peer-to-peer accommodation).
3. To propose and validate a set of tools based on the developed system to show its potential added value for real problems.
