Here is a useful summary of classifier results for the main data classification tasks:
– MNIST: “a large database of handwritten digits that is commonly used for training various image processing systems”.
– CIFAR 10 and CIFAR 100: the CIFAR-10 dataset has “60000 32×32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.” The classes are ‘airplane [sic], automobile, bird, cat, deer, dog, frog, horse, ship and truck’. CIFAR 100 is similar but with 100 classes containing 600 images each.
– STL 10 is “inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training.”
– SVHN is a “real-world image dataset for developing machine learning and object recognition algorithms…obtained from house numbers in Google Street View images.”
– ILSVRC2012 task 1 is part of the annual software contest built on ImageNet, “a large visual database designed for use in visual object recognition software research. As of 2016, over ten million URLs of images have been hand-annotated by ImageNet to indicate what objects are pictured.”
Software classifiers for MNIST and SVHN, both of which deal with numbers, are now achieving error rates of 0.21% (MNIST: handwriting) and 1.69% (SVHN: house numbers).
Classifiers for the others, which require images to be assigned to categories (is this a dog?), are less effective: accuracy is 96.53% for CIFAR 10, falling off to 75.72% for CIFAR 100 and 74.33% for STL 10. Figures for ILSVRC are not given.
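The figures above mix two measures: MNIST and SVHN are quoted as error rates, the other three as accuracy. A minimal Python sketch, assuming that error rate is simply 100% minus accuracy, puts them on one scale. The names `reported` and `to_error_rate` are illustrative and do not come from the summary itself.

```python
# Figures quoted in the text, all in percent.
# MNIST and SVHN are reported as error rates, the rest as accuracy.
reported = {
    "MNIST": ("error", 0.21),
    "SVHN": ("error", 1.69),
    "CIFAR 10": ("accuracy", 96.53),
    "CIFAR 100": ("accuracy", 75.72),
    "STL 10": ("accuracy", 74.33),
}


def to_error_rate(kind, value):
    """Return the error rate in percent for a reported figure.

    Assumes error rate = 100 - accuracy (a single predicted label per image).
    """
    if kind == "error":
        return value
    if kind == "accuracy":
        return 100.0 - value
    raise ValueError(f"unknown measure: {kind}")


# Round to two decimals to match the precision of the quoted figures.
error_rates = {
    name: round(to_error_rate(kind, value), 2)
    for name, (kind, value) in reported.items()
}

# List the datasets from lowest to highest error rate.
for name, err in sorted(error_rates.items(), key=lambda item: item[1]):
    print(f"{name:<10} {err:6.2f}% error")
```

On this scale the gap is easier to see: the two number-reading tasks sit below 2% error, CIFAR 10 is at about 3.5%, and CIFAR 100 and STL 10 are both at roughly 25%.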
It seems clear that where the image is relatively simple (a house number, viewed more or less full on) or the options are limited (one of ten digits, one of ten categories), accuracy is now very good. Where the categories are more complex or more numerous, there is still work to be done.