London underground crowd

A new algorithm can determine crowd emotions

Researchers at the Higher School of Economics (HSE) in St. Petersburg have developed an algorithm that detects the emotions of a group of people in low-quality video. It reaches a final decision in just one hundredth of a second and could have many applications.

Analysing people’s social behaviour from images and videos is one of the most popular tasks for developers of smart man-machine interfaces. Researchers have achieved fairly high accuracy in group-level emotion recognition, but it has not yet been possible to deploy the technology on a mass scale. The problem is that most video systems require good-resolution close-ups of faces, whereas ordinary cameras installed on the street or in a supermarket have low resolution and are mounted rather high, so the faces in their footage are very small.

Alexander Tarasov and Andrey Savchenko, researchers from HSE, have developed an algorithm whose recognition accuracy (75.5%) is comparable with that of existing group-level emotion recognition techniques. At the same time, it requires only 5 MB of system memory, processes one image or video frame in just one hundredth of a second, and can be used with low-quality video data.

The algorithm works in several stages. First, the image is processed with the MTCNN (Multi-task Cascaded Convolutional Networks) neural network, which is commonly used to detect small faces. Next, features are extracted from each face with a fully convolutional network that was pre-trained to classify the emotions of very low-resolution faces, no bigger than a profile picture on social media. Finally, the emotion of the whole group (negative, positive or neutral) is decided by an ensemble of known classifiers applied to the weighted sum of the feature vectors of all detected faces.
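
The final stage can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the function names, the three-dimensional toy feature vectors, the choice of weights (normalised so that they sum to one) and the two stand-in classifiers are all assumptions made for illustration. Face detection and feature extraction are taken as already done.

```python
from typing import Callable, List, Sequence

LABELS = ("negative", "neutral", "positive")

# A classifier here is any function mapping a group descriptor to one
# score per label. Real systems would use trained models (assumption).
Classifier = Callable[[Sequence[float]], Sequence[float]]


def weighted_sum(features: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Combine per-face feature vectors into one group descriptor.

    Weights are normalised to sum to 1 (an assumption of this sketch).
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per face feature vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    group = [0.0] * len(features[0])
    for vec, w in zip(features, weights):
        for i, x in enumerate(vec):
            group[i] += (w / total) * x
    return group


def ensemble_predict(group: Sequence[float],
                     classifiers: Sequence[Classifier]) -> str:
    """Average the per-label scores of all classifiers, pick the top label."""
    avg = [0.0] * len(LABELS)
    for clf in classifiers:
        for i, score in enumerate(clf(group)):
            avg[i] += score / len(classifiers)
    best = max(range(len(LABELS)), key=avg.__getitem__)
    return LABELS[best]


# Hypothetical stand-in classifiers operating on a 3-dimensional descriptor.
def identity_scores(group: Sequence[float]) -> List[float]:
    return list(group)


def squared_scores(group: Sequence[float]) -> List[float]:
    return [x * x for x in group]


# Toy data: three detected faces, weighted (hypothetically) by face size.
face_features = [[0.1, 0.2, 0.7], [0.2, 0.2, 0.6], [0.8, 0.1, 0.1]]
face_weights = [2.0, 2.0, 1.0]

group_descriptor = weighted_sum(face_features, face_weights)
group_emotion = ensemble_predict(group_descriptor,
                                 [identity_scores, squared_scores])
print(group_emotion)
```

In this toy example the two larger faces lean positive, so they dominate the group descriptor and outweigh the single negative face.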

The new development could be used in a variety of video surveillance systems. It can detect changes in group emotions at a concert, football match or protest rally, which may help prevent conflicts in good time. Integrated into a supermarket surveillance system, it could detect consumers’ emotional reactions to various promotions. Combined with cameras recording a public speech, it could assess the audience’s response.