MIT and Microsoft algorithm determines correlations in vast art collections

A team of MIT researchers has created an algorithm to identify similar works of art. Their work could help drive innovation in datasets, survey systems, and more.


In recent years, researchers have applied computer algorithms to a multitude of tasks in all sectors, including the arts. In 2018, “Edmond de Belamy, from La Famille de Belamy,” an original work created by an artificial intelligence (AI) algorithm, sold at Christie’s for $432,500. Now, a team of researchers is harnessing algorithms to shed light on similarities in art across cultures, styles, and mediums. MIT researchers have developed an image retrieval system that sifts through large art collections and identifies similar works of art.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in partnership with Microsoft, created the algorithm, known as “MosAIc,” to search the collections of the Metropolitan Museum of Art and the Rijksmuseum in Amsterdam. The image search system was then used to determine the best match, or “analogous” work, for a given piece across these two collections.

Using a particular piece as the reference for a given analysis, the system can be tuned to find a similar piece under a wide range of filter conditions. This allows the team to take a particular image and find the closest object filtered by style or medium. The algorithm can then return, for example, the closest piece of glassware or the closest Egyptian coin to a selected work.

As the team explained in a recent release, if the image retrieval system were asked “which musical instrument is closest to this painting of a blue and white dress”, MosAIc would retrieve a photo of a white and blue porcelain violin. The report’s authors note that, although the two objects share similar stylistic form and motifs, these objects “also draw their roots from a wider cultural exchange of porcelain between the Dutch and the Chinese”.

“Image retrieval systems allow users to find semantically similar images to a query image, serving as the backbone of reverse image search engines and many product recommendation engines,” says MIT CSAIL Ph.D. student Mark Hamilton, lead author of a paper on MosAIc. “Restricting an image retrieval system to particular subsets of images can yield new insights into relationships in the visual world. We aim to encourage a new level of engagement with creative artifacts.”


A look inside the deep networks

To understand how these deep networks perceived similarities between works, the researchers needed to analyze the networks’ activations. According to the report, the proximity of these activations, also called “features,” is how the researchers determined how similar two images are.
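
As a rough illustration of the idea, the sketch below ranks items by the closeness of their feature vectors and restricts the search to items that satisfy a filter. It is not the MosAIc implementation: the choice of cosine similarity, the three-number feature vectors, the toy collection, and the names `cosine_similarity` and `conditional_nearest` are all assumptions made for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def conditional_nearest(query, items, condition):
    """Return the item most similar to `query` among items passing `condition`."""
    candidates = [item for item in items if condition(item)]
    if not candidates:
        return None
    return max(candidates,
               key=lambda item: cosine_similarity(query, item["features"]))

# Hypothetical toy collection; real features would come from a deep network.
collection = [
    {"title": "porcelain violin", "medium": "instrument", "features": [0.9, 0.1, 0.8]},
    {"title": "brass trumpet", "medium": "instrument", "features": [0.1, 0.9, 0.2]},
    {"title": "delft vase", "medium": "ceramics", "features": [0.95, 0.05, 0.85]},
]
query_features = [1.0, 0.0, 0.9]  # stand-in for the painting's features

# Unrestricted search versus a search limited to musical instruments.
overall = conditional_nearest(query_features, collection, lambda item: True)
instrument = conditional_nearest(
    query_features, collection, lambda item: item["medium"] == "instrument"
)
print(overall["title"], "|", instrument["title"])
```

In this toy data the unrestricted search picks the vase, while restricting the search to instruments surfaces the violin, which mirrors the kind of conditional query described above.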

The team also created a “Conditional KNN Tree” data structure with similar elements grouped into portions of particular branches within the larger framework. According to the report, “the data structure improves upon its predecessors by allowing the tree to quickly ‘carve’ itself to suit a particular culture, artist or collection, quickly yielding responses to new types of requests”.
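
The report does not spell out how the tree is built, so the following is only a sketch of one way a tree could skip branches that cannot satisfy a condition. It assumes a plain k-d tree in which each node records the set of labels found beneath it; the names `Node`, `build`, and `nearest`, the two-dimensional points, and the collection labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    point: tuple
    label: str
    axis: int
    labels: set  # every label that appears in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(items, depth=0):
    """Build a k-d tree from a list of (point, label) pairs."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    point, label = items[mid]
    left = build(items[:mid], depth + 1)
    right = build(items[mid + 1:], depth + 1)
    labels = {label}
    for child in (left, right):
        if child is not None:
            labels |= child.labels
    return Node(point, label, axis, labels, left, right)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, wanted, best=None):
    """Closest point labelled `wanted`, as (squared distance, point, label)."""
    # Skip any subtree that holds no item with the wanted label.
    if node is None or wanted not in node.labels:
        return best
    if node.label == wanted:
        d = sq_dist(node.point, query)
        if best is None or d < best[0]:
            best = (d, node.point, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, wanted, best)
    # Visit the far side only if it could still hold a closer match.
    if best is None or diff * diff < best[0]:
        best = nearest(far, query, wanted, best)
    return best

# Hypothetical 2-D "features" tagged with the collection they belong to.
items = [((1.0, 1.0), "rijks"), ((1.5, 1.2), "met"), ((2.0, 2.0), "met"),
         ((8.0, 8.0), "rijks"), ((9.0, 9.0), "met")]
root = build(items)
print(nearest(root, (1.1, 1.0), "rijks")[1])
print(nearest(root, (1.1, 1.0), "met")[1])
```

The same tree answers queries for either collection without being rebuilt, which is the property the quote describes; how MosAIc achieves this in practice may differ.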

The researchers found that this data structure could also be used for purposes other than comparative analysis between the two art collections.

Future applications of these findings

Today, Generative Adversarial Networks (GANs) are commonly used to develop so-called deepfake images. This clustering data structure framework can be leveraged to identify where these probabilistic models excel in creating deepfake images and areas where these models are less refined.

“The idea is that instead of filling this tree with art, you are filling this tree with deepfakes and actual images. If you look at where the deepfakes cluster, those are the areas where these algorithms [GANs] are particularly good at making pictures,” Hamilton said.
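
One simple way to picture this is to measure, around any point in feature space, what share of the nearest samples are deepfakes. The sketch below does that with a brute-force search rather than a tree; the function `fake_fraction`, the two-dimensional features, and the sample data are invented for illustration and are not taken from the paper.

```python
def fake_fraction(query, samples, k=3):
    """Share of deepfakes among the k samples closest to `query`.

    `samples` is a list of (features, is_fake) pairs.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ranked = sorted(
        samples,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )
    closest = ranked[:k]
    return sum(1 for _, is_fake in closest if is_fake) / len(closest)

# Hypothetical features: deepfakes bunch near (0, 0), real images near (5, 5).
samples = [
    ((0.0, 0.0), True), ((0.2, 0.1), True), ((0.1, 0.3), True),
    ((5.0, 5.0), False), ((5.2, 4.9), False), ((4.8, 5.1), False),
]

print(fake_fraction((0.1, 0.1), samples))  # region dense with deepfakes
print(fake_fraction((5.0, 5.0), samples))  # region with real images only
```

Regions where the share is high correspond to the clusters of deepfakes Hamilton describes; regions containing real images but few or no deepfakes would be candidates for closer inspection.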


While these systems are particularly adept in some areas, at other times these models create rather peculiar images. The researchers called areas where these models are less sophisticated “blind spots.” Blind spots can “give us insight into how biased GANs can be,” according to the report.

“In the future, we hope this work will inspire others to think about how information retrieval tools can help other fields such as the arts, humanities, social sciences, and medicine,” Hamilton said. “These domains are rich in information that has never been processed with these techniques and can be a source of inspiration for computer scientists and experts in the field. This work can be expanded in terms of new datasets, new types of queries, and new ways of understanding the connections between the works.”