Researchers from the Massachusetts Institute of Technology (MIT) and the Woods Hole Oceanographic Institution (WHOI) have developed an autonomous robotic system that identifies the most scientifically valuable sampling locations in vast, unexplored waters.
Environmental scientists often want to collect samples at "maxima," the locations of greatest scientific interest in an environment. One example is the source of a chemical leak, where the concentration is highest and least altered by external factors. But a maximum can be defined for any measurable quantity researchers want to study, such as water depth or the parts of a coral reef most exposed to the atmosphere.
Efforts to deploy maximum-seeking robots have suffered from efficiency and accuracy problems. Conventional robots typically sweep a designated area in a back-and-forth, lawnmower-like pattern, which is time-consuming and collects many samples of little scientific interest. Other robots detect a high-concentration trail and follow it toward its source, but they can be misled. Chemicals can accumulate in crevices far from their actual source, for instance, and a robot may wrongly identify such a spot as the origin.
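To make the two failure modes concrete, here is a minimal sketch in Python. It is not taken from the researchers' work: the one-dimensional `concentration` field, its decoy peak at position 2, and its true source at position 7 are all invented for illustration. A greedy trail follower started near the decoy stops there, while an exhaustive sweep finds the source only by paying for a sample at every cell.

```python
import math

def concentration(x):
    """Hypothetical 1-D field: a small decoy peak at x=2, the true source at x=7."""
    decoy = 0.6 * math.exp(-((x - 2) ** 2))
    source = 1.0 * math.exp(-((x - 7) ** 2) / 2.0)
    return decoy + source

CELLS = range(0, 11)  # assumed integer sampling positions 0..10

def follow_trail(start):
    """Greedy trail follower: step to the better neighbor until none improves."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if n in CELLS]
        best = max(neighbors, key=concentration)
        if concentration(best) <= concentration(x):
            return x  # no neighbor is higher, so the follower stops here
        x = best

def lawnmower_sweep():
    """Exhaustive sweep: sample every cell, return (best cell, samples taken)."""
    samples = [(concentration(x), x) for x in CELLS]
    return max(samples)[1], len(samples)

if __name__ == "__main__":
    print("trail follower from x=1 stops at", follow_trail(1))
    print("sweep finds (cell, samples used):", lawnmower_sweep())
```

In this toy field the follower never sees past the decoy because both of its neighbors read lower, which is the trap the article describes. The sweep avoids the trap but spends most of its samples far from the maximum.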
At the International Conference on Intelligent Robots and Systems (IROS), the researchers presented "PLUMES," a system that enables autonomous mobile robots to locate maxima faster and more efficiently than these earlier approaches. Using probabilistic techniques, PLUMES predicts which paths are most likely to lead to a maximum while navigating obstacles, shifting currents, and other environmental variables. As it collects samples, the system weighs what it has learned to decide whether to continue along a promising path or to search unexplored areas that may hold more valuable samples.
Importantly, PLUMES reaches its destination without becoming trapped in deceptive high-concentration spots. "This capability proves essential, as it's remarkably easy to mistakenly identify fool's gold as genuine treasure," explains co-first author Victoria Preston, a doctoral candidate in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT-WHOI Joint Program.
The researchers built a PLUMES-equipped robotic vessel that identified the most exposed coral formation in the Bellairs Fringing Reef in Barbados, that is, the one at the shallowest point, which is useful for studying how sun exposure affects coral organisms. In 100 simulated trials across diverse underwater environments, a virtual PLUMES robot consistently collected seven to eight times more samples of maxima than traditional coverage methods within the allotted time.
"PLUMES executes the minimal exploration necessary to identify the maximum before rapidly focusing on collecting valuable samples at that location," notes co-first author Genevieve Flaspohler, a doctoral student in CSAIL and the MIT-WHOI Joint Program.
Joining Preston and Flaspohler on the research team are Anna P.M. Michel and Yogesh Girdhar, both scientists in WHOI's Department of Applied Ocean Physics and Engineering, and Nicholas Roy, a professor in CSAIL and the Department of Aeronautics and Astronautics.
Mastering the Exploration-Exploitation Balance
A key idea behind PLUMES is its use of probabilistic techniques to balance exploiting what the robot has already learned about the environment against exploring unknown areas that may prove more valuable.
"The primary challenge in maximum-seeking operations involves enabling the robot to balance exploiting information from known high-concentration areas while exploring regions with limited available data," Flaspohler explains. "Excessive exploration results in insufficient valuable sample collection at the maximum, while inadequate exploration may cause the robot to miss the maximum entirely."
When deployed in a new environment, a PLUMES-powered robot uses a probabilistic model called a Gaussian process to predict environmental variables, such as chemical concentrations, and to estimate the uncertainty of those predictions. PLUMES then generates a distribution of possible paths the robot can take and uses the estimated values and uncertainties to rank each path by how well it allows the robot to explore and exploit.
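The following is a minimal sketch of Gaussian process regression, the kind of model described above, written in plain Python so it runs without extra libraries. It is not the researchers' implementation: the squared-exponential kernel, the zero prior mean, the length scale, the noise level, and every function name are assumptions chosen for illustration. Given a few samples along a line, it returns a predicted value and a variance for each query point.

```python
import math

def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice of covariance function)."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L * L^T == A, for a positive-definite matrix A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, queries, noise=1e-4):
    """Return a (mean, variance) pair per query, assuming a zero prior mean."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, ys))  # (K + noise*I)^-1 ys
    results = []
    for q in queries:
        k_star = [rbf(q, x) for x in xs]
        mean = sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = rbf(q, q) - sum(t * t for t in v)
        results.append((mean, max(var, 0.0)))
    return results

if __name__ == "__main__":
    # Three hypothetical concentration readings at positions 0, 1 and 2.
    for q, (m, v) in zip([1.0, 10.0],
                         gp_posterior([0.0, 1.0, 2.0], [0.1, 0.5, 0.9], [1.0, 10.0])):
        print(f"x={q}: predicted {m:.3f}, variance {v:.3f}")
```

Near a sampled position the variance collapses toward the assumed noise level, while far from any sample it returns to the prior variance. That gap between "known" and "unknown" regions is the signal a planner can use to rank paths.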
At first, PLUMES chooses paths that explore the environment at random. Each sample provides new information about the targeted values in the surrounding environment, such as spots with the highest chemical concentrations or the shallowest depths. The Gaussian process model uses that data to narrow down the paths the robot can follow from its current position to reach locations with even higher values. PLUMES then applies a novel objective function, of a kind commonly used in machine learning to maximize rewards, to decide whether the robot should exploit existing knowledge or explore new areas.
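The article does not give the formula for the objective function PLUMES uses, so the sketch below substitutes a standard rule from the machine-learning literature, the upper confidence bound (UCB), purely to illustrate how a single score can trade off exploitation against exploration. The candidate names, their predicted values and variances, and the `beta` weights are all hypothetical.

```python
import math

def ucb_score(mean, variance, beta):
    """Upper confidence bound: predicted value plus a bonus for uncertainty."""
    return mean + beta * math.sqrt(variance)

def pick_target(candidates, beta):
    """Return the candidate name with the highest UCB score.

    candidates maps a name to a (predicted mean, predicted variance) pair.
    """
    return max(candidates, key=lambda name: ucb_score(*candidates[name], beta))

# Hypothetical model output for two places the robot could sample next.
CANDIDATES = {
    "known_hotspot": (0.8, 0.01),    # high predicted value, well explored
    "unvisited_region": (0.3, 0.9),  # modest prediction, very uncertain
}

if __name__ == "__main__":
    print("cautious (beta=0.1):", pick_target(CANDIDATES, beta=0.1))
    print("curious  (beta=2.0):", pick_target(CANDIDATES, beta=2.0))
```

With a small `beta` the score is dominated by the predicted value and the robot stays at the known hotspot. With a large `beta` the uncertainty bonus wins and the robot heads for the unvisited region. An objective of this general shape lets one number encode the explore-or-exploit decision.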
"Hallucinating" Potential Pathways
The decision about where to collect the next sample relies on the system's ability to "hallucinate" all possible future actions from its current position. To do so, PLUMES uses a modified version of Monte Carlo Tree Search (MCTS), a path-planning technique popularized by artificial intelligence systems that master complex games such as Go and chess.
MCTS uses a decision tree, a map of connected nodes and lines, to simulate a path, or sequence of moves, needed to reach a final winning action. In games, however, the space of possible paths is finite. In unknown environments whose dynamics change in real time, the space is effectively infinite, which makes planning extremely difficult. The researchers developed "continuous-observation MCTS," which uses the Gaussian process and the novel objective function to search this unwieldy space of possible real-world paths.
The root of this MCTS decision tree is a "belief" node, which represents the robot's next immediate action. This node contains the complete history of the robot's actions and observations up to that point. The system then expands the tree from the root into new lines and nodes, looking several steps ahead at future actions that lead to both explored and unexplored areas.
Next, the system simulates what would happen if it took a sample from each newly generated node, based on patterns learned from previous observations. Depending on the value of the final simulated node, the entire path receives a reward score, with higher values indicating more promising actions. Reward scores from all paths are rolled back up to the root node. The robot selects the highest-scoring path, takes a step, and collects an actual sample. It then uses the real data to update its Gaussian process model and repeats the "hallucination" process.
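The planning loop described above can be sketched with ordinary Monte Carlo Tree Search. This is a simplified illustration under stated assumptions, not the researchers' continuous-observation variant: the robot moves along a hypothetical one-dimensional transect, the two-element `ACTIONS` set and the `believed_value` function (a fixed stand-in for the model's predictions, peaked at position 8) are invented, and the belief is not updated during planning. Each simulated path is scored by the believed value at its final node, and scores are backed up to the root.

```python
import math
import random

ACTIONS = (-1, 1)  # hypothetical action set: one step left or right

def believed_value(x):
    """Stand-in for the model's prediction at x (assumed single peak at x=8)."""
    return math.exp(-((x - 8) ** 2) / 18.0)

class Node:
    def __init__(self, x, depth, parent=None, action=None):
        self.x, self.depth = x, depth
        self.parent, self.action = parent, action
        self.children = []
        self.untried = list(ACTIONS)
        self.visits = 0
        self.total = 0.0

def rollout(x, depth, horizon, rng):
    """Take random steps to the horizon; score the path by its final position."""
    while depth < horizon:
        x += rng.choice(ACTIONS)
        depth += 1
    return believed_value(x)

def uct_child(node, c=1.4):
    """Child with the best mean reward plus an exploration bonus (UCT rule)."""
    return max(node.children,
               key=lambda ch: ch.total / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def plan(start, horizon=4, iterations=500, seed=0):
    """Return (best first action, root node) after the given number of simulations."""
    rng = random.Random(seed)
    root = Node(start, 0)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while not node.untried and node.children and node.depth < horizon:
            node = uct_child(node)
        # Expansion: try one new action from this node, if any remain.
        if node.untried and node.depth < horizon:
            a = node.untried.pop()
            child = Node(node.x + a, node.depth + 1, node, a)
            node.children.append(child)
            node = child
        # Simulation: "hallucinate" the rest of the path.
        reward = rollout(node.x, node.depth, horizon, rng)
        # Backpropagation: roll the score back up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    best = max(root.children, key=lambda ch: ch.total / ch.visits)
    return best.action, root

if __name__ == "__main__":
    action, root = plan(start=3)
    print("first step chosen:", action, "after", root.visits, "simulations")
```

In a full system the robot would execute the chosen step, take a real sample, refit its model, and call the planner again from the new position. Here the planner picks the first action with the highest average score, one of several common selection rules.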
"As long as the system continues to hallucinate the possibility of higher values in unobserved world regions, it must persist with exploration," Flaspohler explains. "When it finally converges on a location estimated to be the maximum – because it cannot hallucinate a higher value along the path – it ceases exploration."
The researchers are now working with WHOI scientists to deploy PLUMES-powered robots to localize chemical plumes at volcanic sites and to study methane releases in melting coastal estuaries in the Arctic. Scientists want to identify the sources of chemical gases released into the atmosphere, but these test sites can span hundreds of square miles.
"They can leverage PLUMES to minimize exploration time across these vast areas and truly focus on collecting scientifically valuable samples," Preston concludes.