Wolfram Language Gets Image Identification Capabilities
Stephen Wolfram's latest tool is a function called ImageIdentify, built into the Wolfram Language, that lets you ask, "What is this a picture of?" and get an answer. The Wolfram Language Image Identification Project is currently online and lets anyone take any picture (drag it from a web page, snap it on your phone, or load it from a file) and see what ImageIdentify thinks it is.
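A minimal sketch of what that looks like in the Wolfram Language itself (the file name here is illustrative, not from the project):

    (* Load a picture and ask ImageIdentify what it shows; the result is an
       Entity representing the identified kind of object. *)
    img = Import["photo.jpg"];  (* illustrative path *)
    what = ImageIdentify[img]

    (* CommonName turns the entity into a plain-English name, e.g. "cheetah". *)
    CommonName[what]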
"It won’t always get it right, but most of the time I think it does remarkably well. And to me what’s particularly fascinating is that when it does get something wrong, the mistakes it makes mostly seem remarkably human," Wolfram says in a very interesting blog post.
The project is a practical example of artificial intelligence. But for Stephen Wolfram, what’s more important is that this kind of AI operation can be integrated into the Wolfram Language, to use as a building block for knowledge-based programming.
With ImageIdentify built right into the Wolfram Language, it’s easy to create APIs or apps that use it. And with the Wolfram Cloud, it’s also easy to create websites, like the Wolfram Language Image Identification Project.
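As a rough illustration (not the project's actual source code), documented Wolfram Language functions such as APIFunction, FormFunction, and CloudDeploy are enough to wire ImageIdentify into a web API or a simple upload-and-identify page:

    (* Deploy a public web API: submit an image to the returned cloud URL
       and get back the name of what ImageIdentify thinks it is. *)
    CloudDeploy[
      APIFunction[{"image" -> "Image"},
        CommonName[ImageIdentify[#image]] &, "String"],
      Permissions -> "Public"]

    (* Or deploy a simple web form in the spirit of the Image Identification
       Project site: upload a picture, see the identification. *)
    CloudDeploy[
      FormFunction[{"image" -> "Image"}, ImageIdentify[#image] &],
      Permissions -> "Public"]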
The project is capable of recognizing about 10,000 common kinds of objects, though Wolfram notes that it still has difficulty recognizing specific people, art, and things that are not "real everyday objects."
The Image Identification Project is based on neural networks. The name evokes brains and biology, but for these purposes neural networks are computational systems that consist of compositions of multi-input functions with continuous parameters and discrete thresholds.
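In other words, each "neuron" is just a small numerical function. A toy sketch in the Wolfram Language (weights and inputs made up for illustration, not taken from the actual ImageIdentify network):

    (* One artificial neuron: a multi-input function with continuous
       parameters (weights w, bias b) and a discrete threshold on the output. *)
    neuron[w_List, b_?NumericQ][x_List] := UnitStep[w . x + b]

    (* A network composes many such functions, layer after layer. *)
    neuron[{0.4, -1.2, 0.7}, 0.1][{1.0, 0.5, -2.0}]   (* returns 0 *)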
Humans can readily recognize a few thousand kinds of things, roughly the number of picturable nouns in human languages. Lower animals likely distinguish vastly fewer kinds of things. "But if we’re trying to achieve 'human-like' image identification—and effectively map images to words that exist in human languages—then this defines a certain scale of problem, which, it appears, can be solved with a 'human-scale' neural network," Wolfram says.
There are certainly differences between computational and biological neural networks—although after a network is trained, the process of, say, getting a result from an image seems rather similar. But the methods used to train computational neural networks are significantly different from what it seems plausible for biology to use.
Still, in the actual development of ImageIdentify, Wolfram was struck by how much was reminiscent of the biological case. For a start, the number of training images seemed very comparable to the number of distinct views of objects that humans get in their first couple of years of life.
Probably much like the brain, the ImageIdentify neural network has many layers, containing a variety of different kinds of neurons. Even Wolfram finds it hard to say meaningful things about much of what’s going on inside the network. However, "if one looks at the first layer or two, one can recognize some of the features that it’s picking out. And they seem to be remarkably similar to features we know are picked out by real neurons in the primary visual cortex," he says.
"I think it’s of great interest to look at what happens at later layers in the neural network—because if we can recognize them, what we should see are "emergent concepts" that in effect describe classes of images and objects in the world—including ones for which we don’t yet have words in human languages."
Wolfram admits that ImageIdentify will never truly be finished. He will continue training and developing it, not least based on feedback and statistics from the site. "Without actual usage by humans there’s no real way to realistically assess progress—or even to define just what the goals should be for 'natural image understanding,'" he says.