Troubled datasets

FER-2013 is a dataset, or collection, of facial images in which each image has been assigned one of seven "labels": anger, disgust, fear, happiness, sadness, surprise, or neutral. Its training portion contains nearly thirty thousand images (28,709 to be exact). The dataset is used to train a statistical model, which then allows a computer program to automatically classify new (as yet unseen) images using the same seven emotional labels. FER stands for "Facial Expression Recognition". The dataset was released for one of three "challenges" that took place at the 30th International Conference on Machine Learning, held in 2013 in Atlanta, Georgia (USA). FER-2013 was created by Pierre-Luc Carrier and Aaron Courville at the University of Montreal as part of a long-term research project into machine learning techniques as applied to images.

The dataset was created using the Google image search API to search for images of faces that match a set of 184 emotion-related keywords like “blissful”, “enraged,” etc. These keywords were combined with words related to gender, age or ethnicity, to obtain nearly 600 strings which were used as facial image search queries.

The first 1000 images returned for each query were kept for the next stage of processing. OpenCV face recognition was used to obtain bounding boxes around each face in the collected images. Human labelers then rejected incorrectly labeled images, corrected the cropping if necessary, and filtered out some duplicate images. Approved, cropped images were then resized to 48x48 pixels and converted to grayscale. Mehdi Mirza and Ian Goodfellow prepared a subset of the images for this contest, and mapped the fine-grained emotion keywords into the same seven broad categories used in the Toronto Face Database. The resulting dataset contains 35887 images, with 4953 “Anger” images, 547 “Disgust” images, 5121 “Fear” images, 8989 “Happiness” images, 6077 “Sadness” images, 4002 “Surprise” images, and 6198 “Neutral” images. (Goodfellow et al. 2013, "Challenges in Representation Learning: A report on three machine learning contests", pp. 3-4, https://arxiv.org/pdf/1307.0414.pdf)
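The per-class counts quoted above are easier to compare as proportions. The following is a minimal sketch that only does arithmetic on the numbers reported in the passage; the variable names are illustrative and not part of any official tooling.

```python
# Per-class image counts as reported in the quoted passage above.
counts = {
    "Anger": 4953,
    "Disgust": 547,
    "Fear": 5121,
    "Happiness": 8989,
    "Sadness": 6077,
    "Surprise": 4002,
    "Neutral": 6198,
}

# Total number of images and each class's share of that total.
total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Largest and smallest classes, and how lopsided the split is.
largest = max(counts, key=counts.get)
smallest = min(counts, key=counts.get)
imbalance_ratio = counts[largest] / counts[smallest]

for label, n in counts.items():
    print(f"{label:<10} {n:>5} {shares[label]:6.1%}")
print(f"{'Total':<10} {total:>5}")
print(f"{largest} outnumbers {smallest} by about {imbalance_ratio:.1f}x")
```

The counts sum to the stated 35887. "Happiness" accounts for roughly a quarter of all images while "Disgust" accounts for under 2%, so the seven categories are far from evenly represented.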

One motivation for representation learning is that learning algorithms can design features better and faster than humans can. To this end, we will hold one challenge that does not explicitly require that entries use representation learning. Rather, we will introduce an entirely new dataset and invite competitors from all related communities to solve it. The dataset for this challenge will be a facial expression classification dataset that we have assembled from the internet and has not yet been distributed publicly.

The first place winner of each contest will receive $300 and the second place winner will receive $150. The prize money was generously provided by Google, Inc.

The "labels"

(0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral)
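In code, the mapping above is typically a small lookup table from integer to name. The sketch below takes the seven integer codes directly from the list above. The `parse_pixels` helper is hypothetical and rests on an assumption not stated in the text: that each image is stored as a single string of space-separated grayscale values (0-255) in row-major order, 48x48 = 2304 values per image.

```python
# Integer-to-name mapping, exactly as listed above.
EMOTION_LABELS = {
    0: "Angry",
    1: "Disgust",
    2: "Fear",
    3: "Happy",
    4: "Sad",
    5: "Surprise",
    6: "Neutral",
}


def decode_label(index: int) -> str:
    """Return the emotion name for an integer label, or raise ValueError."""
    if index not in EMOTION_LABELS:
        raise ValueError(f"unknown label index: {index}")
    return EMOTION_LABELS[index]


def parse_pixels(pixel_string: str, side: int = 48) -> list[list[int]]:
    """Hypothetical helper: turn a pixel string into a side x side grid.

    Assumes space-separated integers in row-major order; this storage
    format is an assumption, not something the text above specifies.
    """
    values = [int(v) for v in pixel_string.split()]
    if len(values) != side * side:
        raise ValueError(f"expected {side * side} values, got {len(values)}")
    return [values[r * side:(r + 1) * side] for r in range(side)]


# Usage with a made-up, all-black image labeled 3.
fake_pixels = " ".join(["0"] * (48 * 48))
image = parse_pixels(fake_pixels)
print(decode_label(3), len(image), len(image[0]))
```

A plain dictionary keeps the numbering explicit, which matters here because the integer codes carry no meaning of their own: a model trained on this data only ever sees the numbers 0 through 6.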