Virtual reality headsets have struggled for years to replace conventional screens as the primary medium for visual content. A significant barrier has been user discomfort: nausea and eye strain often result from VR's attempt to simulate 3D viewing while actually displaying fixed-distance 2D images. A breakthrough for authentic three-dimensional visualization may come from a surprising source: holography, a 60-year-old technology transformed by artificial intelligence.
Holograms offer an extraordinary representation of our three-dimensional world that's not only visually stunning but also physiologically comfortable. Unlike traditional displays, holograms provide shifting perspectives based on the viewer's position and allow natural eye focusing between foreground and background elements, just like in real life.
For decades, scientists have attempted to create computer-generated holograms, but the process has traditionally demanded massive supercomputers running complex physics simulations, a slow approach that often yields subpar results. Now, MIT researchers have transformed this field by developing an AI-powered method that generates holograms almost instantly. Their deep learning approach is so efficient that it runs in real time on a standard laptop.
"The industry consensus was that real-time 3D holography was impossible with existing consumer hardware," explains Liang Shi, lead author of the study and a PhD candidate in MIT's Department of Electrical Engineering and Computer Science. "Commercially viable holographic displays have perpetually been predicted to arrive in ten years - a timeline that has remained unchanged for decades."
Shi believes their innovative approach, dubbed "tensor holography," will finally transform this long-awaited vision into reality. This advancement could catalyze the integration of holography into numerous fields, including virtual reality, augmented reality, and advanced 3D printing technologies.
Published in Nature, the research was conducted by Shi alongside his advisor Wojciech Matusik. The team also included Beichen Li from MIT's Computer Science and Artificial Intelligence Laboratory, along with former MIT researchers Changil Kim (now at Facebook) and Petr Kellnhofer (now at Stanford University).
The Evolution of 3D Visualization
Conventional photography captures only the brightness of light waves, faithfully reproducing colors but ultimately delivering flat, two-dimensional images. In contrast, holography encodes both the brightness and phase of each light wave, creating a true representation of a scene's parallax and depth. While a photograph of Monet's "Water Lilies" might capture the color palette, a hologram could reveal the three-dimensional texture of each brushstroke, bringing the masterpiece to life. Despite their superior realism, creating and sharing holograms has traditionally been exceptionally challenging.
Originally developed in the mid-20th century, early holograms were created optically by splitting laser beams, with one portion illuminating the subject while the other served as a phase reference. This process generated holograms' distinctive sense of depth but produced only static images that couldn't capture movement. Additionally, these holograms existed only in physical form, making reproduction and distribution difficult.
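The split-beam recording described above can be sketched numerically. In this minimal numpy illustration (not from the paper; the wavelength, reference tilt, and point-object geometry are illustrative assumptions), the "film" records only intensity, yet the interference cross term between the object wave and the reference wave imprints the object's phase as a fringe pattern:

```python
import numpy as np

# Toy sketch of optical hologram recording. Assumptions for illustration:
# HeNe laser wavelength, a 1-degree tilted reference beam, and a single
# point object 5 cm behind a 2 mm strip of "film".
n = 4096
x = np.linspace(-1e-3, 1e-3, n)          # positions along the film (meters)
wavelength = 633e-9
k = 2 * np.pi / wavelength

reference = np.exp(1j * k * np.sin(np.deg2rad(1.0)) * x)  # tilted plane wave
z = 0.05                                                   # object distance
obj = np.exp(1j * k * np.sqrt(x**2 + z**2))                # spherical object wave

# The film records intensity only: |obj + ref|^2 = 2 + 2*cos(phase difference).
# A plain photograph of either beam alone would be featureless; the cross
# term encodes the object's phase as bright/dark fringes.
intensity = np.abs(obj + reference) ** 2
```

The fringes swing between near-zero and near-maximum intensity, which is exactly the phase information a photograph discards.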
Computer-generated holography addresses these limitations by digitally simulating the optical process, but the computational requirements have been prohibitively intensive. "Each point in a scene has a different depth, preventing the application of uniform operations across the entire image," Shi explains. "This dramatically increases computational complexity." Even with clustered supercomputers, physics-based simulations could require seconds or minutes to generate a single holographic image. Furthermore, existing algorithms couldn't model occlusion with photorealistic accuracy. This led Shi's team to pioneer a novel approach: enabling the computer to learn physics independently.
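Shi's point about per-point depth can be seen in a brute-force simulation. The sketch below (a naive illustration, not the team's algorithm) superposes a spherical wave from each scene point onto the hologram plane; because every point sits at its own depth, each needs its own distance calculation, so the cost scales as points × pixels. The grid size, pixel pitch, and wavelength are illustrative assumptions:

```python
import numpy as np

def point_cloud_hologram(points, amplitudes, grid_size=64,
                         pitch=8e-6, wavelength=633e-9):
    """Naive physics-based CGH: sum a spherical wave from every scene
    point at every hologram pixel. Cost is O(points * pixels), which is
    why brute-force simulation of dense scenes is slow."""
    k = 2 * np.pi / wavelength
    xs = (np.arange(grid_size) - grid_size / 2) * pitch
    X, Y = np.meshgrid(xs, xs)                       # hologram-plane coordinates
    field = np.zeros((grid_size, grid_size), dtype=complex)
    for (px, py, pz), a in zip(points, amplitudes):
        r = np.sqrt((X - px) ** 2 + (Y - py) ** 2 + pz ** 2)
        field += a * np.exp(1j * k * r) / r          # spherical wave contribution
    return field  # complex wavefield: amplitude and phase at each pixel

# Two points at different depths, 5 cm and 8 cm from the hologram plane.
pts = [(0.0, 0.0, 0.05), (1e-4, 0.0, 0.08)]
field = point_cloud_hologram(pts, [1.0, 1.0])
```

Occlusion handling, which this naive sum ignores entirely, is one of the effects the researchers' new physics-based calculations were designed to capture.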
The researchers employed deep learning to accelerate computer-generated holography, achieving real-time hologram generation. They designed a convolutional neural network that uses trainable tensors to approximate human visual processing. However, training such a network typically requires extensive, high-quality datasets, which didn't exist for 3D holograms.
To overcome this obstacle, the team constructed a custom database comprising 4,000 pairs of computer-generated images. Each pair matched a conventional image (containing color and depth information for each pixel) with its corresponding hologram. To create these holograms, the researchers utilized scenes featuring complex, varied shapes and colors, with pixel depths distributed evenly from background to foreground. They implemented new physics-based calculations to handle occlusion, resulting in photorealistic training data. With this foundation, their algorithm began its work.
By learning from each image pair, the tensor network continuously refined its computational parameters, progressively improving its hologram generation capabilities. The fully optimized network operated orders of magnitude faster than traditional physics-based calculations - an efficiency that even surprised the research team.
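The learn-from-pairs procedure can be illustrated with a toy stand-in. The real system is a deep convolutional network trained on 4,000 RGB-D/hologram pairs; here, a single linear map fitted by gradient descent shows the same principle of progressively refining parameters against matched input/target examples. The data, the "true" rule, and all hyperparameters below are synthetic assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: inputs are depth maps, targets
# are "hologram" maps produced by a rule the model must recover. (The
# actual tensor-holography network is a deep CNN; a linear model is used
# here only to illustrate supervised refinement from paired examples.)
true_w, true_b = 2.5, 0.3
depths = rng.uniform(0.0, 1.0, size=(4000, 16, 16))   # 4,000 training pairs
targets = true_w * depths + true_b

w, b = 0.0, 0.0
lr = 0.5
for _ in range(200):
    err = (w * depths + b) - targets
    w -= lr * np.mean(err * depths)   # gradient of mean-squared error wrt w
    b -= lr * np.mean(err)            # gradient of mean-squared error wrt b
```

After a few hundred updates the parameters converge to the generating rule, the toy analogue of the network's parameters converging toward accurate hologram synthesis.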
"We were astonished by how effectively it performs," says Matusik. "In just milliseconds, tensor holography can create holograms from images containing depth information - which is readily available from computer-generated images and can be captured using multicamera setups or LiDAR sensors (now standard in many smartphones)." This breakthrough enables real-time 3D holography. Moreover, the compact tensor network requires less than 1 MB of memory. "This is negligible compared to the tens or hundreds of gigabytes available in modern smartphones," he notes.
Joel Kollin, a principal optical architect at Microsoft who wasn't involved in the research, comments that "this study demonstrates that true 3D holographic displays are practical with only moderate computational requirements." He adds that "the paper shows significant improvement in image quality over previous work," which will "enhance realism and comfort for viewers." Kollin also suggests that holographic displays could potentially be customized to individual users' visual prescriptions. "Holographic displays can correct for optical aberrations in the eye, potentially creating display images sharper than what users can achieve with contact lenses or glasses, which only address basic aberrations like focus and astigmatism."
A Transformative Breakthrough
Real-time 3D holography would revolutionize numerous systems, from virtual and augmented reality to advanced manufacturing. The researchers suggest their new system could immerse VR users in more realistic environments while eliminating the eye strain and other discomforts associated with prolonged VR use. The technology could be readily implemented on displays that modulate the phase of light waves. While most current consumer-grade displays modulate only brightness, the cost of phase-modulating displays would likely decrease with widespread adoption.
Three-dimensional holography could also accelerate the development of volumetric 3D printing, which could prove faster and more precise than traditional layer-by-layer approaches by simultaneously projecting the entire 3D pattern. Additional applications include advanced microscopy, medical data visualization, and the design of surfaces with unique optical properties.
"This represents a significant leap that could fundamentally transform how people perceive holography," Matusik concludes. "We believe neural networks were perfectly suited for this challenge."
The research received partial funding from Sony.