Mohammad Haft-Javaherian planned to spend just an hour at the Green AI Hackathon, long enough to familiarize himself with Satori, MIT's supercomputer. Three days later, he left with a $1,000 prize for his approach to reducing the environmental impact of artificial intelligence models designed to detect heart conditions.
"I had never previously considered the kilowatt-hours my research consumed," Haft-Javaherian admits. "This competition provided me with the opportunity to evaluate my carbon footprint and discover methods to exchange minimal model accuracy for substantial energy conservation."
Haft-Javaherian was among six teams recognized at the event, which the MIT Research Computing Project and the MIT-IBM Watson AI Lab co-hosted from January 28 to 30. The hackathon aimed to introduce students to Satori, the computing cluster IBM donated to MIT, and to encourage new approaches to building energy-efficient AI models that emit less carbon dioxide.
The competition also highlighted Satori's energy-efficient design. With an architecture engineered to minimize data transfer, among other energy-saving features, Satori recently placed fourth on the Green500 supercomputer ranking. Its location adds to its environmental credentials: it sits on a restored brownfield site in Holyoke, Massachusetts, now the Massachusetts Green High Performance Computing Center, which is powered primarily by low-carbon hydroelectric, wind, and nuclear energy.
A postdoctoral researcher at MIT and Harvard Medical School, Haft-Javaherian came to the hackathon to learn more about Satori. He stayed for the challenge of reducing the energy demands of his own work, which focuses on AI methods for detecting coronary artery disease. An imaging technique, optical coherence tomography, has given cardiologists an advanced tool for visualizing irregularities in the artery walls that can impede the flow of oxygenated blood to the heart. But even specialists can overlook subtle patterns that computers are good at identifying.
During the competition, Haft-Javaherian experimented with his model and found he could cut its energy consumption eightfold by reducing the time Satori's graphics processing units sat idle. He also tried modifying the model's layers and features, trading varying amounts of accuracy for lower energy use.
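The arithmetic behind idle time is simple: a GPU that is allocated but waiting still draws power, so a job's energy is power multiplied by time, summed over its busy and idle phases. The Python sketch below illustrates the idea under stated assumptions. The function name and every wattage and hour figure are hypothetical, chosen only for illustration; they are not measurements from Satori or from Haft-Javaherian's model, and they are not meant to reproduce his eightfold result.

```python
def job_energy_kwh(busy_hours, idle_hours, busy_watts, idle_watts):
    """Energy for one job in kWh: power (W) x time (h), summed over both phases."""
    watt_hours = busy_hours * busy_watts + idle_hours * idle_watts
    return watt_hours / 1000.0


# Hypothetical job: the same 2 hours of useful GPU work in both cases,
# but the GPU waits 6 hours in one run and 0.5 hours in the other.
before_kwh = job_energy_kwh(busy_hours=2.0, idle_hours=6.0,
                            busy_watts=300.0, idle_watts=50.0)
after_kwh = job_energy_kwh(busy_hours=2.0, idle_hours=0.5,
                           busy_watts=300.0, idle_watts=50.0)

print(f"before: {before_kwh} kWh, after: {after_kwh} kWh")
```

The useful work is identical in both runs; only the waiting changes. How much idle time costs in practice depends on the hardware's idle power draw and on how many GPUs a job holds, neither of which the figures above attempt to model.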
A second team, Alex Andonian and Camilo Fosco, also won $1,000 by showing they could train a classification model nearly ten times faster by optimizing their code, at the cost of a small loss in accuracy. Graduate students in the Department of Electrical Engineering and Computer Science (EECS), Andonian and Fosco are developing a classifier to distinguish real videos from AI-manipulated fakes for Facebook's Deepfake Detection Challenge. Facebook launched the contest last autumn to crowdsource ideas for stopping the spread of misinformation on its platform ahead of the 2020 presidential election.
If a technical solution to deepfakes emerges, it will need to run on millions of machines at once, says Andonian, which makes energy efficiency crucial. "Every optimization we discover to train and implement more efficient models will create significant impact," he notes.
To accelerate the training process, they experimented with streamlining their code and reducing the resolution of their 100,000-video training set by eliminating certain frames. They hadn't anticipated finding a solution within three days, but Satori's processing capacity proved advantageous. "We could execute 10 to 20 experiments concurrently, enabling us to refine potential concepts and obtain results rapidly," Andonian explains.
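Dropping frames can be sketched in a few lines. The article does not say which frames the team removed or how they chose them, so the example below assumes the simplest rule, keeping every n-th frame. The function name, the `step` parameter, and the stand-in video are all hypothetical.

```python
def subsample_frames(frames, step):
    """Keep every `step`-th frame, starting with the first."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return frames[::step]


# A stand-in "video": frame indices instead of real image data.
video = list(range(12))

# Keeping one frame in four cuts the data passed to training to a quarter.
reduced = subsample_frames(video, step=4)
print(len(video), "frames ->", len(reduced), "frames")
```

A uniform stride is only one option. Whether a classifier keeps its accuracy after this kind of reduction depends on the task and has to be checked experimentally, as the team did.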
As artificial intelligence continues advancing in tasks such as analyzing medical images and interpreting videos, models have expanded in size and computational requirements, consequently increasing their energy demands. According to one estimate, training a large language-processing model generates nearly as much carbon dioxide as the complete lifecycle emissions of five American automobiles. While the typical model's footprint remains comparatively modest, the environmental impact of AI applications grows as they proliferate.
One way to make AI greener, and to manage the exponential growth in demand for AI training, is to build smaller models. A third hackathon participant, EECS graduate student Jonathan Frankle, took this approach. Frankle is looking for signals early in training that point to subnetworks within the larger, fully trained network that can do the same job. The idea builds on his award-winning Lottery Ticket Hypothesis research from last year, which found that a neural network could perform with 90% fewer connections if the right subnetwork was identified early in training.
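To make "90% fewer connections" concrete, the sketch below shows one-shot magnitude pruning: rank a trained network's weights by absolute value, keep the largest 10 percent, and zero the rest. This is a simplified illustration of the general pruning idea, not Frankle's hackathon method, which the article describes only as looking for early training signals. The weight values and function names are hypothetical.

```python
def magnitude_mask(weights, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Rank weight positions from largest to smallest absolute value.
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if i in kept else 0 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero out every connection the mask removes."""
    return [w * m for w, m in zip(weights, mask)]


# Hypothetical weights for a tiny 10-connection layer.
trained = [0.9, -0.05, 0.02, -1.3, 0.4, 0.01, -0.03, 0.07, 0.6, -0.002]
initial = [0.1, 0.2, -0.1, 0.3, -0.2, 0.05, 0.15, -0.25, 0.12, -0.08]

mask = magnitude_mask(trained, keep_fraction=0.1)  # drop 90% of connections
ticket = apply_mask(initial, mask)  # surviving connections, reset to initial values
print(sum(mask), "of", len(mask), "connections kept")
```

In the Lottery Ticket Hypothesis experiments, the mask is derived from a trained network and then applied to the original initial weights, and the resulting sparse subnetwork is retrained; the last two lines mirror that reset step. Real experiments typically prune iteratively over several training rounds rather than in one shot.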
The hackathon participants were judged by John Cohn, chief scientist at the MIT-IBM Watson AI Lab; Christopher Hill, director of MIT's Research Computing Project; and Lauren Milechin, a research software engineer at MIT.
The judges acknowledged four additional teams: Department of Earth, Atmospheric and Planetary Sciences (EAPS) graduate students Ali Ramadhan, Suyash Bire, and James Schloss for adapting the Julia programming language for Satori; MIT Lincoln Laboratory postdoc Andrew Kirby for modifying his graduate-level code for Satori using a library designed for simplified programming of computing architectures; and Department of Brain and Cognitive Sciences graduate students Jenelle Feather and Kelsey Allen for implementing a technique that dramatically simplifies models by reducing their parameter count.
IBM developers were present to address inquiries and collect feedback. "We challenged the system—constructively," states Cohn. "Ultimately, we enhanced the machine, its documentation, and the surrounding tools."
Looking ahead, Satori will be joined in Holyoke by TX-Gaia, Lincoln Laboratory's new supercomputer. Together, they will provide insights regarding the energy consumption of their workloads. "We aim to increase awareness and motivate users to discover innovative approaches to make all computing more environmentally sustainable," Hill concludes.