Training the Untrained Eye with AI to Classify Fine Art
Beauty, it is said, resides in the eye of the beholder.
What if that beholder is a machine learning model being trained to describe and classify fine works of art? That’s what AI researchers at Zhejiang University of Technology in China are attempting to find out by comparing the ability of different models trained on a growing list of image data sets to classify artwork by genre and style.
Whether these models can be trained to respond emotionally remains to be seen.
Preliminary results from one study published earlier this month in a journal of the Public Library of Science (PLOS) highlighted the utility of using convolutional neural networks (CNNs) for demanding tasks like art classification. The researchers stressed the importance of pre-training models, and their network weights, on generalized image data sets.
One outgrowth of the image classification research was the development of a search engine used to classify and retrieve digitized paintings by style and genre.
The impetus for the CNN research is the growing digitization of the world’s art collections through initiatives such as the Google Art Project. With many art museums shuttered by the COVID-19 pandemic, demand for online galleries has soared. That demand has created a requirement for automated tools for classifying works of art added to digital collections.
The trained human eye can identify individual artists (the distinctive brush strokes of the Post-Impressionist Vincent van Gogh, for example) along with styles and genres. The fundamental problem in using machine learning to classify images, particularly masterpieces filled with human emotion, is that the untrained eye of a machine sees only pixel values.
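To make that concrete, here is a minimal sketch of what a digitized painting looks like to a model: a grid of numbers and nothing more. The tiny 2x2 "painting" below is invented for illustration, not drawn from any real collection.

```python
import numpy as np

# A "painting" digitized at 2x2 pixels in RGB: to the model it is
# nothing but a grid of numbers between 0 and 255, with no notion of
# brush stroke, style, or emotion.
painting = np.array(
    [[[212, 180, 96], [33, 41, 120]],
     [[58, 92, 44], [240, 233, 210]]],
    dtype=np.uint8,
)

print(painting.shape)  # (2, 2, 3): height, width, color channels
print(painting[0, 0])  # [212 180  96] -- one pixel's RGB values
```

Everything a CNN learns about style or genre has to be built up from arrays like this one.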
In order to train CNNs to classify and describe visual art in human terms, large data sets and machine learning models have emerged. Just as a human observer learns to appreciate and interpret art by viewing many paintings and styles, researchers say neural networks can be trained using large, annotated data sets.
The ArtEmis data set, for example, contains more than 80,000 works of art along with 439,000 annotations and “emotional attributes.” The captions produced by systems trained on it often succeed in reflecting the semantic and abstract content of an image, going well beyond systems trained on existing data sets, according to the data set’s authors.
At the Zhejiang University of Technology, researchers tested seven different CNN models on three different data sets to determine their ability to classify art. They also compared art classification performance when using transfer learning, a machine learning technique in which a model trained on one task is repurposed for a second, related task.
The CNNs initially used only color information to classify paintings. Later, spatial information was included to help the model distinguish, for example, between a portrait and a landscape. Models pre-trained on the ImageNet database produced the best art classification results.
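Why spatial information matters can be shown with a toy example: two made-up 4x4 "paintings" (not the study's data) that contain exactly the same pixel values, so color-only features cannot separate them, while a crude spatial feature can.

```python
import numpy as np

# Two synthetic 4x4 grayscale "paintings" built from the same pixels,
# arranged differently: bright band on top (think sky over land) vs.
# bright and dark rows interleaved.
sky_on_top = np.array([[0.9] * 4] * 2 + [[0.1] * 4] * 2)
interleaved = np.array([[0.9] * 4, [0.1] * 4] * 2)

# Color-only features: a histogram of pixel values. Both images yield
# the identical histogram, so a color-only model cannot tell them apart.
h1, _ = np.histogram(sky_on_top, bins=4, range=(0, 1))
h2, _ = np.histogram(interleaved, bins=4, range=(0, 1))
print(np.array_equal(h1, h2))  # True

# Adding spatial information: compare the mean brightness of the top
# half with the bottom half. Now the two layouts separate cleanly.
def top_bottom_contrast(img):
    half = img.shape[0] // 2
    return img[:half].mean() - img[half:].mean()

print(f"{top_bottom_contrast(sky_on_top):.1f}")   # 0.8
print(f"{top_bottom_contrast(interleaved):.1f}")  # 0.0
```

A portrait and a landscape can share a palette; it is the arrangement of that palette across the canvas that a spatial feature captures.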
That “real-world” image classification ability was found to transfer to a specific task like classifying a work of art, the Chinese researchers said.
Future work will focus on using larger data sets like ArtEmis to determine which pretrained CNN models perform best in classifying paintings—bringing another trained eye to the appreciation of art.
While AI researchers are upbeat about the application of machine learning to disciplines like art cataloging, others remain skeptical of the technology’s ability to replicate human emotions elicited by a painting.
“I like the idea of computers doing the basic work of identifying the style and genre of artwork to the end that it will allow art collections to become more widely available,” said Marie Dauenheimer, a board-certified medical illustrator and collector of fine art.
While lauding efforts like the Google Art Project that produced high-resolution digital masterpieces, Dauenheimer added, “I am less convinced that having computers assess artwork on an emotional level is useful.”
George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).