Scientific research is facing an unprecedented challenge as massive data streams threaten to overwhelm projects ranging from neutrino studies to supernova observations and brain research. The exponential growth of information calls for new approaches to real-time data analysis.
When gravitational-wave detectors like LIGO capture signals from distant cosmic collisions, researchers race against time to identify accompanying light signals. Meanwhile, neural sensors monitoring brain activity generate data faster than current computing systems can handle, and the Large Hadron Collider (LHC) will soon produce data at rates exceeding 1 petabit per second.
To address this critical data bottleneck, a consortium of nine institutions led by the University of Washington, including MIT, has secured $15 million in funding to establish the Accelerated AI Algorithms for Data-Driven Discovery (A3D3) Institute. The MIT team comprises Philip Harris, assistant professor of physics and deputy director of the A3D3 Institute; Song Han, assistant professor of electrical engineering and computer science and co-PI; and Erik Katsavounidis, senior research scientist with the MIT Kavli Institute for Astrophysics and Space Research.
Supported by a five-year Harnessing the Data Revolution Big Idea grant from the National Science Foundation, jointly funded by its Office of Advanced Cyberinfrastructure, A3D3 will concentrate on three data-intensive fields: multi-messenger astrophysics, high-energy particle physics, and brain imaging neuroscience. By pairing AI algorithms with advanced processors, A3D3 aims to speed up data processing enough to tackle fundamental problems in collider physics, neutrino physics, astronomy, gravitational-wave physics, computer science, and neuroscience.
“I am very excited about the new Institute's opportunities for research in nuclear and particle physics,” says Laboratory for Nuclear Science Director Boleslaw Wyslouch. “Modern particle detectors produce an enormous amount of data, and we are looking for extraordinarily rare signatures. The application of extremely fast processors to sift through these mountains of data will transform what we can measure and discover.”
The foundation for A3D3 was established in 2017 when Harris and colleagues at Fermilab and CERN began implementing real-time AI algorithms to process the incredible data rates at the LHC. Through collaboration with Han, Harris' team developed HLS4ML, a compiler that translates AI algorithms into a form that can run in nanoseconds on specialized hardware.
“Before HLS4ML, the fastest AI processing we knew of was roughly a millisecond per inference, maybe slightly faster,” explains Harris. “We realized existing AI algorithms were designed for much slower problems like image and voice recognition. To achieve nanosecond inference times, we developed smaller algorithms and utilized custom implementations with Field Programmable Gate Array (FPGA) processors—a fundamentally different approach from others in the field.”
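To make the "smaller algorithms" idea concrete, here is a minimal sketch of one technique often used when neural networks are moved onto FPGAs: replacing floating-point math with integer fixed-point arithmetic. The article does not say which techniques the team used, and this is not HLS4ML's API or the team's code. The layer sizes, the 8 fractional bits, and every name below are assumptions chosen for illustration.

```python
# Illustrative only: one tiny dense layer evaluated in integer fixed-point
# arithmetic. All names, sizes, and values here are hypothetical.

FRAC_BITS = 8            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS   # 256: the real value 1.0 is stored as integer 256


def to_fixed(x: float) -> int:
    """Quantize a real number to the nearest fixed-point integer."""
    return round(x * SCALE)


def dense_relu_fixed(inputs, weights, biases):
    """One dense layer followed by ReLU, using integers only.

    inputs:  list of fixed-point ints
    weights: one row of fixed-point ints per output neuron
    biases:  one fixed-point int per output neuron
    """
    outputs = []
    for row, bias in zip(weights, biases):
        # A product of two fixed-point numbers carries 2 * FRAC_BITS
        # fractional bits, so shift right once after accumulating.
        acc = sum(w * x for w, x in zip(row, inputs)) >> FRAC_BITS
        outputs.append(max(0, acc + bias))  # ReLU clamps negatives to zero
    return outputs


# Made-up example: 3 inputs feeding 2 output neurons.
x = [to_fixed(v) for v in (0.5, -1.0, 2.0)]
w = [[to_fixed(v) for v in row] for row in ((1.0, 0.5, 0.25), (-1.0, 1.0, 0.5))]
b = [to_fixed(0.0), to_fixed(0.5)]

y = dense_relu_fixed(x, w, b)
print([v / SCALE for v in y])  # convert back to real values for display
```

Because every operation is an integer multiply, add, or shift, a layer like this can in principle be laid out as dedicated logic rather than executed as a sequence of software instructions, which is the kind of trade-off Harris describes.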
Months later, Harris presented their research at a physics faculty meeting, catching Katsavounidis' interest. During a discussion in Building 7, they explored combining Harris' FPGA technology with Katsavounidis' machine learning applications for gravitational-wave detection. FPGAs and other advanced processors like graphics processing units (GPUs) can accelerate AI algorithms to analyze massive datasets more rapidly.
“I worked with the first FPGAs on the market in the early '90s and witnessed how they revolutionized front-end electronics and data acquisition in high-energy physics experiments,” recalls Katsavounidis. “The potential to apply them to gravitational-wave data analysis has been on my mind since joining LIGO over 20 years ago.”
Two years ago, the team received their initial grant, with the University of Washington's Shih-Chieh Hsu joining the effort. They established the Fast Machine Lab, published approximately 40 papers, expanded to about 50 researchers, and “launched an entirely new field exploring previously uncharted territory in AI,” notes Harris. “We began without any funding, securing small grants for various projects over time. A3D3 represents our first major grant to support this initiative.”
“What makes A3D3 particularly suited to MIT is its exploration of a technical frontier where AI is implemented not in high-level software but in lower-level firmware, reconfiguring individual gates to address specific scientific questions,” says Rob Simcoe, director of the MIT Kavli Institute for Astrophysics and Space Research and the Francis Friedman Professor of Physics. “We're in an era where experiments generate torrents of data. The acceleration gained from tailoring reprogrammable, custom computers at the processor level can advance real-time analysis to unprecedented levels of speed and sophistication.”
Managing Massive Data from the Large Hadron Collider
With data rates already exceeding 500 terabits per second, the LHC processes more information than any other scientific instrument in the world. Its aggregate data rates will soon surpass 1 petabit per second, making it the world's largest data stream.
“Through AI, A3D3 aims to perform advanced analyses like anomaly detection and particle reconstruction on all collisions occurring 40 million times per second,” says Harris.
The objective is to identify the few collisions, among the 3.2 billion that occur each second, that could reveal new forces, explain dark matter formation, or complete our understanding of how fundamental forces interact with matter. Processing this information requires a customized computing system capable of interpreting collider data with ultra-low latency.
“The challenge of running this analysis on hundreds of terabits per second in real-time is daunting and requires completely rethinking how we design and implement AI algorithms,” Harris explains. “With detector resolution improvements leading to even higher data rates, the challenge of finding that one significant collision among many will become increasingly complex.”
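The rates quoted in this section imply a very tight per-event budget, which a few lines of arithmetic make explicit. The sketch below uses only figures that appear in this article. It assumes that "40 million times per second" refers to how often the beams cross and that the 3.2 billion collisions per second are spread evenly across those crossings; the article does not spell this out. Since the data rates are quoted as lower bounds, the per-crossing data volumes are lower bounds too.

```python
# Back-of-the-envelope arithmetic using only figures quoted in this article.
# Reading "40 million times per second" as the beam-crossing rate is an
# interpretation, not something the article states outright.

CROSSINGS_PER_S = 40_000_000        # "40 million times per second"
COLLISIONS_PER_S = 3_200_000_000    # "3.2 billion per second"
CURRENT_BITS_PER_S = 500 * 10**12   # "500 terabits per second"
FUTURE_BITS_PER_S = 10**15          # "1 petabit per second"
SLOW_INFERENCE_NS = 1_000_000       # "roughly a millisecond per inference"


def lhc_budget():
    """Derive per-crossing quantities from the quoted rates."""
    ns_between_crossings = 1_000_000_000 // CROSSINGS_PER_S
    return {
        "ns_between_crossings": ns_between_crossings,
        "collisions_per_crossing": COLLISIONS_PER_S // CROSSINGS_PER_S,
        "megabits_per_crossing_now": CURRENT_BITS_PER_S / CROSSINGS_PER_S / 1e6,
        "megabits_per_crossing_future": FUTURE_BITS_PER_S / CROSSINGS_PER_S / 1e6,
        "crossings_during_slow_inference": SLOW_INFERENCE_NS // ns_between_crossings,
    }


for name, value in lhc_budget().items():
    print(f"{name}: {value}")
```

Under these assumptions, a new crossing arrives every 25 nanoseconds, carrying about 80 collisions and at least 12.5 megabits of data at today's rate (25 megabits at 1 petabit per second). An algorithm that needs a millisecond per inference would see 40,000 further crossings arrive before it finished one, which is why inference times on the scale of nanoseconds matter here.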
Exploring Brain Function and Cosmic Phenomena
Neuroscience is also experiencing a data revolution, with advances in medical imaging and electrical recordings from implanted electrodes generating unprecedented information about how neural networks respond to stimuli and process motor information. A3D3 plans to develop and implement high-throughput, low-latency AI algorithms to process, organize, and analyze massive neural datasets in real time, enabling new experiments and therapies in brain research.
In multi-messenger astrophysics (MMA), A3D3 aims to rapidly identify astronomical events by efficiently processing data from gravitational waves, gamma-ray bursts, and neutrinos detected by telescopes and detectors worldwide.
The A3D3 research team includes a multidisciplinary group of 15 researchers from the University of Washington (project lead), Caltech, Duke University, Purdue University, UC San Diego, University of Illinois Urbana-Champaign, University of Minnesota, and the University of Wisconsin-Madison. The initiative will incorporate neutrino research at IceCube and DUNE and visible astronomy at the Zwicky Transient Facility, and it will organize deep-learning workshops and boot camps to train students and researchers to contribute to the framework and to expand the application of fast AI strategies across scientific disciplines.
“We've reached a point where detector network growth will be transformative in terms of event rates, astrophysical reach, and ultimately, discoveries,” says Katsavounidis. “'Fast' and 'efficient' is the only way to combat the 'faint' and 'fuzzy' signals in the universe and maximize the potential of our detectors. A3D3 will bring production-scale AI to gravitational-wave physics and multi-messenger astronomy while aspiring to become the national resource for applications of accelerated AI to data-driven scientific disciplines.”