Adaptive Pedagogical Agents for VR History Learning

LLM-powered role and action switching in the Pavilion of Prince Teng

Exploring LLM-Powered Role and Action-Switching Pedagogical Agents for History Education in Virtual Reality

Zihao Zhu, Ao Yu, Xin Tong, Pan Hui

Abstract

Multi-role pedagogical agents can create engaging and immersive learning experiences, helping learners better understand historical knowledge. However, existing pedagogical agents often struggle with multi-role interactions due to complex controls, limited feedback forms, and difficulty dynamically adapting to user inputs. In this study, we developed a VR prototype with LLM-powered adaptive role-switching and action-switching pedagogical agents to help users learn about the history of the Pavilion of Prince Teng. A 2 × 2 between-subjects study was conducted with 84 participants to assess how adaptive role-switching and action-switching affect participants’ learning outcomes and experiences. The results suggest that adaptive role-switching enhances participants’ perception of the pedagogical agent’s trustworthiness and expertise but may lead to inconsistent learning experiences. Adaptive action-switching increases participants’ perceived social presence, expertise, and humanness. The study did not uncover any effects of role-switching or action-switching on usability, learning motivation, or cognitive load. Based on the findings, we propose five design implications for incorporating adaptive role-switching and action-switching into future VR history education tools.

Background and Challenges

VR technology enables immersive, time-travel-like experiences that reconstruct historical scenes and make abstract knowledge tangible. Pedagogical agents (PAs) are central to this transformation, guiding learners through narratives, spatial exploration, and contextual storytelling. However, existing multi-role PAs often remain cumbersome to control, offer rigid feedback, and fail to adapt to individual learner needs. The rapid evolution of large language models (LLMs) opens new opportunities for designing PAs that can personalize instruction and respond dynamically to learner inputs.

  • Operational complexity: Learners must manually switch roles or dialogue options, which disrupts flow.
  • Limited feedback forms: PAs rarely combine verbal, visual, and gestural cues to enrich explanations.
  • Lack of adaptivity: Static scripts cannot address diverse questions or prior knowledge levels.
System overview diagram showing voice input and adaptive modules
Figure 1. System overview illustrating how voice input drives the adaptive role-switching and action-switching modules.

Our Solution: An Adaptive VR Prototype

We created a VR learning experience centered on the history of the Pavilion of Prince Teng. The prototype integrates LLM-driven reasoning with immersive storytelling so that learners receive personalized responses and contextual demonstrations. Two adaptive modules form the core of the system, enabling the PA to select the most appropriate role and action sequence based on each learner’s questions.

Comparison of pedagogical agent conditions and role appearances
Figure 2. (a) Pedagogical agent across four experimental conditions. (b) Three adaptive roles and four action categories.

Adaptive Role-Switching

Triggered by learner prompts, the PA can automatically adopt the most suitable persona. Each role comes with distinct attire, vocal characteristics, and narrative tone to reinforce authenticity and credibility.

  • Prince Teng: Provides architectural and historical insights into the pavilion’s construction and legacy.
  • Poet Wang Bo: Shares literary context, especially the famed preface and poetic symbolism.
  • Archaeological Expert: Offers contemporary interpretations, excavation findings, and preservation perspectives.

Role transitions happen seamlessly, maintaining learner immersion while tailoring explanations to question intent.
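
The role-selection step can be pictured as a constrained classification over the three personas. The sketch below is an illustration under stated assumptions, not the prototype's actual implementation: the `Role` labels, the `build_role_prompt` and `parse_role` helpers, the prompt wording, and the fallback to the Archaeological Expert are all hypothetical. The LLM call itself is left out; only prompt construction and defensive parsing of the reply are shown.

```python
from enum import Enum


class Role(Enum):
    """The three personas described above; the string values are hypothetical labels."""
    PRINCE_TENG = "prince_teng"
    WANG_BO = "wang_bo"
    ARCHAEOLOGIST = "archaeologist"


# Short scope descriptions paraphrased from the role list above.
ROLE_SCOPE = {
    Role.PRINCE_TENG: "architecture, construction, and legacy of the pavilion",
    Role.WANG_BO: "literary context, the preface, and poetic symbolism",
    Role.ARCHAEOLOGIST: "modern interpretation, excavation, and preservation",
}


def build_role_prompt(question: str) -> str:
    """Build a prompt asking the model to answer with exactly one role label."""
    options = "\n".join(f"- {r.value}: {scope}" for r, scope in ROLE_SCOPE.items())
    return (
        "Choose the persona best suited to answer the learner's question.\n"
        f"{options}\n"
        f"Question: {question}\n"
        "Reply with the label only."
    )


def parse_role(reply: str, fallback: Role = Role.ARCHAEOLOGIST) -> Role:
    """Map a raw model reply to a Role, falling back when the reply is unrecognized."""
    label = reply.strip().lower()
    for role in Role:
        if role.value == label:
            return role
    return fallback  # assumption: an unrecognized reply defaults to one fixed persona
```

Restricting the reply to a closed set of labels and validating it before switching avatars keeps a malformed model output from producing an undefined persona mid-conversation.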

Adaptive Action-Switching

Beyond dialogue, the PA selects from 18 animated actions grouped into four categories. The LLM determines which gestures best complement the spoken response, making explanations more expressive and spatially grounded.

Table 1. Action categories, their purposes, and representative gestures.
Action Category | Purpose | Representative Gestures
Pointing | Guide attention to architectural details or spatial references. | Highlight inscriptions, trace roof curves, indicate river orientation.
Natural Expressive | Convey enthusiasm and empathy to sustain engagement. | Open-arm welcome, thoughtful nods, inviting hand waves.
Descriptive | Visualize processes or imagery embedded in narratives. | Mimic writing poetry, demonstrate calligraphy strokes, outline skyline silhouettes.
Character-Specific | Reinforce persona identity and historical authenticity. | Formal bow of a court official, scholarly contemplation, modern analytical gestures.

By aligning verbal explanations with context-aware gestures (see Figure 2(b) and Table 1), the PA deepens learner comprehension and social presence.
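
Because the LLM chooses gestures from a fixed animation set, its output presumably has to be checked against that set before anything is played. The following is a minimal sketch of such a check, assuming the model returns a JSON list of action names. The action identifiers are hypothetical, loosely named after the representative gestures; the prototype's 18 actual action names are not given here, and `parse_actions` and its `max_actions` limit are illustrative.

```python
import json

# Hypothetical action identifiers grouped by the four categories above.
# Only a few examples per category are listed, not the full set of 18.
ACTION_CATALOG = {
    "pointing": {"point_inscription", "trace_roof_curve", "indicate_river"},
    "natural_expressive": {"open_arm_welcome", "thoughtful_nod", "hand_wave"},
    "descriptive": {"mimic_writing", "calligraphy_stroke", "outline_skyline"},
    "character_specific": {"formal_bow", "scholarly_contemplation", "analytical_gesture"},
}

# Reverse lookup: action name -> category.
KNOWN_ACTIONS = {name: cat for cat, names in ACTION_CATALOG.items() for name in names}


def parse_actions(reply: str, max_actions: int = 2) -> list[tuple[str, str]]:
    """Parse a JSON list of action names and keep only known ones.

    Returns (category, action) pairs in the order the model gave them,
    dropping unknown names and duplicates, then truncating to max_actions.
    """
    try:
        names = json.loads(reply)
    except json.JSONDecodeError:
        return []  # malformed reply: play no gesture rather than crash
    if not isinstance(names, list):
        return []
    selected = []
    seen = set()
    for name in names:
        if isinstance(name, str) and name in KNOWN_ACTIONS and name not in seen:
            selected.append((KNOWN_ACTIONS[name], name))
            seen.add(name)
    return selected[:max_actions]
```

For example, a reply of `["trace_roof_curve", "fly", "formal_bow"]` would keep the pointing and character-specific gestures and silently drop the unknown `fly`.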

Study Design

We employed a 2 × 2 between-subjects design with 84 participants, crossing two factors: role-switching (R) versus no role-switching (N), and action-switching (A) versus no action-switching (N). Crossing the two factors yields four conditions:

  • RA: Role-switching + Action-switching
  • RN: Role-switching + No action-switching
  • NA: No role-switching + Action-switching
  • NN: No role-switching + No action-switching
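
The four condition codes are simply the cross product of the two factors. The sketch below enumerates them and deals participants into groups at random. It is illustrative only: the source states the total of 84 participants but not the group sizes or the randomization procedure, so the equal split, the `assign` helper, and the fixed seed are assumptions.

```python
import random
from itertools import product

ROLE_LEVELS = ("R", "N")    # role-switching vs. no role-switching
ACTION_LEVELS = ("A", "N")  # action-switching vs. no action-switching


def conditions() -> list[str]:
    """Cross the two factors to get the four condition codes."""
    return [r + a for r, a in product(ROLE_LEVELS, ACTION_LEVELS)]


def assign(participant_ids: list[int], seed: int = 0) -> dict[int, str]:
    """Shuffle participants and deal them round-robin into the conditions.

    Equal group sizes are an assumption of this sketch, not a reported fact.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    codes = conditions()
    return {pid: codes[i % len(codes)] for i, pid in enumerate(shuffled)}
```

Calling `assign(list(range(1, 85)))` places each of 84 hypothetical participant IDs in exactly one condition, which is what makes the design between-subjects: no participant experiences more than one agent configuration.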

Participants experienced the VR prototype (see Figure 4) and completed pre- and post-tests covering conceptual and factual knowledge, alongside questionnaires on social presence, trustworthiness, expertise, and humanness. Usability, learning motivation, and cognitive load were also examined.

User study procedure flowchart from introduction to interviews
Figure 3. Study procedure: introduction, pre-test, VR experience, post-test, experience questionnaire, and interview.
VR user interface showing goals, agent, and interaction buttons
Figure 4. VR user interface with learning goals on the left, the pedagogical agent in the center, and interaction controls on the right.

Main Findings

Quantitative Results

  • Learning outcomes: Role-switching significantly boosted factual knowledge gains (Figure 5b).
  • Trustworthiness & expertise: Role-switching enhanced perceived trustworthiness (Figure 7b) and expertise (Figure 8a).
  • Social presence & humanness: Action-switching improved social presence (Figure 6b) and humanness (Figure 8b), reinforcing the value of expressive gestures.

Qualitative Results

Participants described adaptive role-switching as “novel” and “fun,” which reinforced their trust in the PA. As one learner noted, “It’s more believable to hear Poet Wang Bo himself describe the background of his preface.”

However, frequent and abrupt role transitions could occasionally feel incoherent, highlighting the importance of pacing controls.

Across action types, pointing gestures were considered the most helpful for navigation, whereas descriptive actions (e.g., mimicking writing) were viewed as supplementary rather than essential.

Bar chart showing factual knowledge improvements by condition
Figure 5. Factual knowledge gains across experimental conditions.
Bar chart comparing social presence scores
Figure 6. Social presence outcomes emphasize the impact of action-switching.
Bar chart summarizing trustworthiness and expertise ratings
Figure 7. Trustworthiness and expertise ratings highlight the benefits of role-switching.
Composite chart illustrating humanness perceptions
Figure 8. Humanness perceptions improve when action-switching is enabled.

Design Implications

  1. Leverage social cues beyond actions. Incorporate facial expressions, gaze, and micro-gestures to deepen rapport.
  2. Emphasize navigational actions. Prioritize precise pointing behaviors to help learners orient within expansive virtual scenes.
  3. Optimize information delivery. Pair verbal explanations with visual aids—such as virtual artifacts or embedded media—to reduce reliance on descriptive gestures.
  4. Utilize proactive PAs. Allow agents to proactively cue learners toward critical historical details, maintaining narrative coherence.
  5. Promote multi-perspective understanding. Use role-switching strategically at key narrative moments so learners can appreciate events from diverse viewpoints.