In today's fast-paced medical environment, a radiologist leverages an artificial intelligence system to assist in diagnosing patient conditions through X-ray analysis. While this AI technology accelerates the diagnostic process, determining when to trust its outputs remains a critical challenge.
Medical professionals often lack clear guidance on AI reliability, instead depending on personal expertise, system-generated confidence metrics, or algorithmic explanations that may appear convincing yet still lead to incorrect conclusions.
To address this fundamental issue, MIT researchers have developed an innovative onboarding methodology designed to help humans develop more accurate mental models about AI systems—specifically understanding when machine predictions are reliable versus when they may be flawed.
By demonstrating how AI can complement human capabilities rather than replace them, this training approach helps people make better decisions more quickly when collaborating with AI agents.
"We've created a teaching framework where we gradually introduce humans to AI models, allowing them to independently discover the system's limitations and strengths," explains Hussein Mozannar, a doctoral candidate in MIT's Social and Engineering Systems program within IDSS. "This process mirrors real-world human-AI interactions while providing targeted feedback to enhance understanding of each exchange."
Mozannar collaborated with Arvind Satyanarayan, a computer science assistant professor leading CSAIL's Visualization Group, and senior author David Sontag, an MIT associate professor and head of the Clinical Machine Learning Group. Their findings will be presented at the Association for the Advancement of Artificial Intelligence conference in February.
Understanding Mental Models
This research focuses on the mental frameworks humans construct about AI systems. When facing uncertain cases, radiologists might consult colleagues with specialized expertise, drawing upon previous experiences to assess the reliability of their advice.
Similar cognitive processes occur when humans interact with AI agents, making the accuracy of these mental models essential. Cognitive science research indicates that humans base complex task decisions on remembered interactions and experiences. Consequently, the researchers designed an onboarding process that provides representative examples of human-AI collaboration, creating reference points for future interactions. They developed an algorithm to identify examples that most effectively teach humans about AI capabilities.
"We first analyze a human expert's cognitive biases and strengths through observation of their unassisted decisions," Mozannar notes. "By combining this understanding with our knowledge of the AI's capabilities, we identify scenarios where the human should trust the AI's input and similar situations where they should rely on their own judgment."
The researchers evaluated their onboarding technique using a passage-based question-answering task: participants receive written passages containing the answers to specific questions and can either answer independently or request AI assistance. The catch is that users cannot preview the AI's answer before deciding whether to rely on it, so they must fall back on their mental model of the system. The onboarding process begins by presenting representative examples, with users attempting predictions alongside the AI. After each attempt, regardless of the outcome, users see the correct answer and an explanation of the AI's decision-making process. To reinforce the lesson, contrasting examples demonstrate why the AI succeeded or failed.
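A loose sketch of such an onboarding loop appears below; the function names and the structure of each example are assumptions made for illustration, not the study's actual implementation.

```python
def run_onboarding(teaching_examples, ai_model, ask_user, show):
    """Walk a user through the teaching examples, giving feedback after each one."""
    for ex in teaching_examples:
        # The user answers first, optionally requesting the AI's help,
        # but without seeing the AI's answer in advance.
        user_answer = ask_user(ex["passage"], ex["question"])
        ai_answer, rationale_terms = ai_model(ex["passage"], ex["question"])

        # Feedback is shown whether or not the user was right, so every
        # round refines the user's mental model of the AI.
        show({
            "correct_answer": ex["answer"],
            "user_answer": user_answer,
            "ai_answer": ai_answer,
            "ai_was_correct": ai_answer == ex["answer"],
            "highlighted_terms": rationale_terms,  # words the AI keyed on
        })

        # A contrasting example reinforces why the AI succeeded or failed here.
        if ex.get("contrast"):
            show({"contrasting_example": ex["contrast"]})
```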
For example, participants might receive a complex botanical passage and a question about which plant species is native to more continents. After answering independently or with AI assistance, they would see follow-up examples highlighting the AI's performance on different subjects, perhaps incorrect about fruits but accurate regarding geological topics. The system highlights the key terms the AI used in its decision-making, helping users understand its limitations.
To enhance knowledge retention, participants document inferred rules like "This AI struggles with botanical predictions," creating reference guidelines for future interactions. These rules effectively formalize the user's mental model of the AI system.
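As a toy illustration, those written rules can be thought of as a small lookup that formalizes the mental model, with the user falling back on their own judgment for topics no rule covers. The topics and rules below are invented for the example.

```python
# Invented topics and rules, purely to illustrate the idea of recorded guidelines.
mental_model = {
    "botany":  {"trust_ai": False, "rule": "This AI struggles with botanical predictions."},
    "geology": {"trust_ai": True,  "rule": "The AI is reliable on geological topics."},
}

def should_defer_to_ai(topic):
    """Fall back on the user's own judgment for topics with no recorded rule."""
    return mental_model.get(topic, {}).get("trust_ai", False)
```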
Educational Impact
The researchers tested this teaching methodology with three participant groups: one receiving complete onboarding, another getting partial training without comparison examples, and a baseline group with no training but preview access to AI answers.
"Participants who completed our comprehensive training performed equally well as those who could see AI answers in advance," Mozannar observes. "This suggests they developed the ability to predict AI responses as effectively as if they had direct access to its outputs."
Further analysis revealed that nearly 50% of trained participants developed an accurate understanding of the AI's capabilities. Those with accurate mental models achieved 63% accuracy, compared with 54% for those with inaccurate models and 57% for the baseline group that could preview the AI's answers.
"When effective, this teaching approach produces significant results," Mozannar emphasizes. "Successfully trained participants outperform those given direct access to AI answers."
However, the research also identified remaining challenges: only half of trained participants built accurate mental models, and even these individuals achieved just 63% accuracy. Despite understanding the AI's capabilities, users didn't consistently apply their knowledge—a puzzle researchers hope to solve in future studies.
The team plans to refine the onboarding process to reduce time requirements and conduct user studies with more complex AI models, particularly in healthcare applications.
"When humans collaborate with other humans, we heavily rely on understanding our collaborators' strengths and limitations—this helps determine when to seek assistance," notes Carrie Cai, a research scientist in Google's People + AI Research and Responsible AI groups who wasn't involved in the study. "I'm encouraged to see this research applying similar principles to human-AI interactions. Teaching users about AI capabilities is essential for successful human-AI collaboration."
This research received partial funding from the National Science Foundation.