Strand 1: Technical Foundations for Effective and Expansive Interactions with AI

NSF iSAT team members demonstrate a collaborative activity with real-time multimodal analysis.
Strand 1 researchers focus on the technical foundations required for AI systems that can perceive, interpret, and productively support student collaborations in authentic classroom environments. The goal is to move from raw classroom signals toward actionable, interactive, and privacy-preserving models of student group activity, attention, understanding, and epistemic alignment. Two research themes guide this work: Multimodal Perception and Control Policies for Multimodal Interactive Agents. Together, these themes define the core architecture for Student-AI Teaming: AI Partners that can observe collaborations as they unfold, reason about what students are attending to and where they are misaligned, and provide support in ways that are helpful, timely, and educationally productive.
Communication in the classroom takes place through an interplay of speech, gesture, and other nonverbal signals. For AI to reason about collaboration and learning in real-world classrooms, it must combine information across multiple modalities, aggregate it over time, and interpret it in the context of the underlying learning conversations grounded in the curriculum. We will develop new multimodal, multiparty modeling methods to understand who communicated what, with whom, and when, as well as the common ground: the mutually shared knowledge state of each group of students.
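To make the fusion idea concrete, here is a minimal sketch of combining per-modality detector outputs into a "who addressed whom" estimate over a time window. All names, modalities, and weights here are illustrative assumptions, not the project's actual models; real multiparty fusion would use learned temporal models rather than a fixed weighted sum.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Observation:
    t: float           # timestamp in seconds
    student: str       # ID of the student producing the signal
    modality: str      # hypothetical detector stream: "speech", "gesture", or "gaze"
    target: str        # who (or what) the signal is directed at
    confidence: float  # detector confidence in [0, 1]

def fuse_window(observations, window_start, window_end, weights=None):
    """Aggregate multimodal evidence of 'who addressed whom' in a time window.

    Illustrative fusion rule: a weighted sum of detector confidences per
    (student, target) pair, normalized so the scores sum to 1.
    """
    weights = weights or {"speech": 0.5, "gesture": 0.3, "gaze": 0.2}  # assumed weights
    scores = defaultdict(float)
    for obs in observations:
        if window_start <= obs.t < window_end:
            scores[(obs.student, obs.target)] += weights.get(obs.modality, 0.0) * obs.confidence
    total = sum(scores.values())
    return {pair: s / total for pair, s in scores.items()} if total else {}
```

For example, if student A both speaks to and gazes at student B while student C only gestures toward B, `fuse_window` ranks the A-to-B interaction higher, since evidence accumulates across modalities.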
We are developing computational models that train AI agents to facilitate effective student-AI collaborations by encouraging participation from all students and ensuring smooth coordination among students, teachers, and the AI. The AI will support students' learning and collaboration by interacting unobtrusively and naturalistically with learners and by integrating seamlessly into teachers' existing workflows and the curriculum.
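A toy sketch of one such control decision: whether to unobtrusively invite a quiet student into the conversation. The talk-time heuristic, the `min_share` threshold, and the prompt wording are all assumptions for illustration; the actual control policies would be learned and would weigh many more signals.

```python
def participation_prompt(talk_seconds, min_share=0.15):
    """Hypothetical control policy for encouraging balanced participation.

    If any student's share of the group's total talk time falls below
    min_share, return a gentle prompt inviting that student in; otherwise
    return None so the conversation proceeds uninterrupted.
    """
    total = sum(talk_seconds.values())
    if total == 0:
        return None  # no speech observed yet; stay silent
    quietest = min(talk_seconds, key=talk_seconds.get)
    if talk_seconds[quietest] / total < min_share:
        return f"{quietest}, what do you think about the group's current idea?"
    return None  # participation is balanced enough; do not interrupt
```

The design choice worth noting is the default of silence: the policy only acts when evidence of imbalance crosses a threshold, which mirrors the goal of interacting unobtrusively rather than steering every exchange.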