Funded by UT REAL Health AI · University of Texas System

Scaling AI-Based OSCE Assessment Across Six UT Medical Schools

The MAPLES AI grading platform is being prepared for multi-site validation across UT System medical schools. Engagement with the UT Health Intelligence Platform (UT-HIP) is underway, and the work is funded by the UT REAL Health AI initiative.

UTSW Production: LIVE

UT Partner Sites

Encounters Graded: 7,000+

Participating Medical Schools

Partnering across the UT System to validate AI-enabled clinical assessment.

Latest Updates

Milestones, publications, and announcements from the project.

Wayfinder: AI-Assisted Rubric Authoring in MAPLES

Watch Minhan Park and Licheng Yi demonstrate Wayfinder: create a structured rubric from a short description, enhance it with accept/reject suggestions, transform it for a new scenario, and continue in chat.

Reimagining OSCE Grading and Medical Education at UT Southwestern

On Healthcare AI Pioneers, Thomas Dalton and Andrew Jamieson discuss UT Southwestern’s AI-enabled OSCE assessment work: rubrics, multimodal evidence, staged rollout, human review, and precision education.

MAPLES Walkthrough: From Rubric Upload to Faculty Review

Watch Ameer Hamza Shakur walk through the MAPLES workflow: upload a rubric, group OSCE encounters, run AI-assisted grading, review cited evidence, and export validated results.

Six UT Medical Schools

Collaborating across the UT System to validate AI-enabled clinical assessment.

Our Mission

Empowering Educators, Improving Doctors

Our mission is to give medical educators better tools so they can give students better feedback — faster, more consistent, and at a scale that wasn't possible before.

Share Best Practices for AI-Ready Rubrics

Share strategies and best practices for designing AI-compatible OSCE rubrics, while allowing each school to customize assessment to its own educational philosophy and clinical needs.

Scale AI Grading Infrastructure

Prepare MAPLES for governed deployment across partner sites, with the UT Health Intelligence Platform (UT-HIP) as the preferred infrastructure path, currently in active planning.

Validate AI vs. Human Concordance

Conduct rigorous validation studies comparing AI-assigned grades against those of expert human raters across diverse patient scenarios and institutions.

Multi-Site Data Collection Planning

Evaluate UT-HIP as the preferred hosting path for a governed multi-institutional OSCE dataset, pending site-specific institutional review board (IRB), data use agreement (DUA), and hosting decisions.

Cross-Institutional Collaboration

Six UT medical schools share rubrics, best practices, and lessons learned, building a community of practice around AI-assisted clinical education.

Publication & Dissemination

Publish findings in high-impact venues (NEJM AI, JMIR AI) and present at national conferences to advance the field of AI in medical education.

UTSW Pilot Results

Production metrics from UTSW — the foundation we're scaling across the UT System.

AI-Human Agreement (kappa): 0.830

Encounters Graded at UTSW: 7,000+

Partner Institutions

Grant Award: $300K
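
The kappa figure above is a chance-corrected agreement statistic. This page does not say which variant was used (Cohen's, weighted, or Fleiss' kappa are all common), so the following is only a minimal Python sketch of unweighted Cohen's kappa for two raters. The function name `cohen_kappa` and the `ai_scores` and `human_scores` lists are hypothetical and the labels are made up for illustration; none of this is taken from MAPLES.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labeling the same items.

    Assumption: this is the textbook two-rater statistic, not necessarily
    the variant behind the figures reported on this page.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)

    # Observed agreement: fraction of items where both raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores for ten rubric items (1 = met, 0 = not met).
ai_scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
human_scores = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]
kappa = cohen_kappa(ai_scores, human_scores)
```

For these made-up labels, observed agreement is 0.80 and chance agreement is 0.58, giving a kappa of about 0.52. Raw percent agreement overstates concordance whenever one label dominates, which is why agreement is usually reported as kappa.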

Phase Progress

From single-site proof-of-concept to UT System-wide deployment.

Phase 1: UTSW Proof-of-Concept (Complete)

AI grading system developed and deployed in production at UT Southwestern. More than 7,000 encounters graded and more than 3,200 students assessed. Published in NEJM AI and JMIR AI. AI-human agreement (kappa = 0.830) exceeds human inter-rater reliability (kappa = 0.732).

Phase 2: Award & Planning (Current)

Funded by the UT REAL Health AI Pilot Program, March 2026 ($300K, 18 months). Award setup complete. Infrastructure engagement with UT-HIP and UTSW Enterprise Data Services underway. IRB template protocol in review. Site inventory and governance planning in progress.

Phase 3: Multi-Site Deployment

Phased onboarding of partner sites beginning with Wave 1 institutions. Site-specific technical audits, IRB/DUA routing, and data ingestion pipeline setup.

Phase 4: Validation & Dissemination

Cross-institutional validation studies, publication of multi-site results, open-source governance playbook, and framework for national adoption.

Where It All Started

Born at the UT Southwestern Simulation Center, the MAPLES platform has graded over 7,000 clinical encounters in production, showing that AI-human agreement (kappa = 0.830) can exceed human inter-rater reliability (kappa = 0.732). The work is published in NEJM AI and JMIR AI, with multimodal assessment research available as a preprint. Now, through the UT REAL Health AI initiative and the UT Health Intelligence Platform, we're scaling that capability across the UT System.
