ClinicalTrials.Veeva

Scalable Clinical Oversight of Large Language Models Via Uncertainty Triangulation (SCOUT)

National Center for Cardiovascular Diseases

Status

Not yet enrolling

Conditions

Coronary Heart Disease (CHD)

Treatments

Diagnostic Test: Standard Manual Review Workflow
Diagnostic Test: SCOUT-Assisted Review Workflow

Study type

Interventional

Funder types

Other

Identifiers

NCT07414966
2025-2702-1

Details and patient eligibility

About

This prospective, multi-reader, randomized crossover trial evaluates SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable large language model (LLM) predictions to clinicians by triangulating three orthogonal uncertainty signals: model heterogeneity, stochastic inconsistency, and reasoning critique. The trial assesses whether SCOUT-assisted review can reduce physician review time compared with standard manual review of AI-generated diagnoses while maintaining non-inferior diagnostic accuracy in coronary heart disease (CHD) subtyping.

Full description

Background: Large language models are increasingly deployed in clinical workflows, yet requiring clinician review of every AI output negates the efficiency gains that motivate their adoption. SCOUT addresses this efficiency-safety paradox through algorithmic meta-verification.

The SCOUT framework triangulates three orthogonal external signals to determine case-level uncertainty: (1) Model Heterogeneity: whether a structurally different auxiliary LLM agrees with the primary model; (2) Stochastic Inconsistency: whether repeated sampling from the same model yields divergent outputs; (3) Reasoning Critique: whether an external checker model identifies logical flaws in the chain-of-thought reasoning.
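The three signals described above can be sketched in Python as follows. This is only an illustration, not the trial's implementation: the names `UncertaintySignals` and `triangulate`, the placeholder labels `subtype_A` and `subtype_B`, and above all the rule that any single raised signal defers the case are assumptions, since the listing does not state how the signals are combined.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class UncertaintySignals:
    """Hypothetical container for the three SCOUT signals."""
    model_heterogeneity: bool       # auxiliary LLM disagrees with the primary model
    stochastic_inconsistency: bool  # repeated samples diverge from the primary output
    reasoning_critique: bool        # checker model reports a flaw in the reasoning

    @property
    def defer(self) -> bool:
        # ASSUMPTION: a conservative OR rule; the source does not specify
        # how the signals are aggregated into a deferral decision.
        return (self.model_heterogeneity
                or self.stochastic_inconsistency
                or self.reasoning_critique)


def triangulate(primary: str,
                auxiliary: str,
                resamples: Sequence[str],
                checker_found_flaw: bool) -> UncertaintySignals:
    """Derive the three signals from already-collected model outputs."""
    return UncertaintySignals(
        model_heterogeneity=(auxiliary != primary),
        stochastic_inconsistency=any(s != primary for s in resamples),
        reasoning_critique=checker_found_flaw,
    )


# Placeholder labels, not real CHD subtype names.
signals = triangulate("subtype_A", "subtype_B", ["subtype_A", "subtype_A"], False)
print(signals.defer)
```

In this example the auxiliary model disagrees with the primary model, so the case would be deferred to a clinician under the assumed OR rule.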

In this crossover trial, 7 clinicians of varying seniority (2 junior residents, 3 senior residents, 2 attending physicians) each review all 110 cases under both standard manual review and SCOUT-assisted review workflows. The study evaluates workflow efficiency (primary endpoint) and diagnostic accuracy (secondary endpoint).

Enrollment

7 estimated participants (clinician readers)

Sex

All

Ages

18+ years old

Volunteers

No Healthy Volunteers

Inclusion criteria

  • Board-certified or in-training cardiologists at Fuwai Hospital
  • Spanning three experience strata: junior residents, senior residents, attending physicians

Exclusion criteria

  • Clinicians involved in the development or optimization of the SCOUT framework
  • Clinicians involved in the gold-standard adjudication process

Trial design

Primary purpose

Diagnostic

Allocation

Randomized

Interventional model

Crossover Assignment

Masking

None (Open label)

7 participants in 2 groups

Control (Standard Manual Review)
Active Comparator group
Description:
Physicians manually review all cases in the control set (n=54) with access to AI predictions and reasoning. No selective deferral.
Treatment:
Diagnostic Test: Standard Manual Review Workflow

Experimental (SCOUT-Assisted Review)
Experimental group
Description:
Physicians process the intervention set (n=56) through the SCOUT framework. Low-uncertainty cases are auto-accepted; high-uncertainty cases undergo physician review with full audit trail.
Treatment:
Diagnostic Test: SCOUT-Assisted Review Workflow
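
The routing in the experimental arm (auto-accept low-uncertainty cases, send high-uncertainty cases to a physician, keep an audit trail) might look roughly like the sketch below. Everything named here is hypothetical: `AuditEntry`, `route_cases`, `review_fraction`, the `high_uncertainty` flag, and the placeholder diagnoses. Logging auto-accepted cases alongside reviewed ones is also an assumption; the listing only says that reviewed cases carry a full audit trail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AuditEntry:
    """Hypothetical audit record for one case."""
    case_id: str
    ai_diagnosis: str
    route: str            # "auto_accept" or "physician_review"
    final_diagnosis: str


def route_cases(cases: List[Dict],
                physician_review: Callable[[Dict], str]) -> List[AuditEntry]:
    """Auto-accept low-uncertainty cases; defer the rest to a physician."""
    log: List[AuditEntry] = []
    for case in cases:
        if case["high_uncertainty"]:
            route = "physician_review"
            final = physician_review(case)   # clinician supplies the diagnosis
        else:
            route = "auto_accept"
            final = case["ai_diagnosis"]     # AI output accepted unchanged
        log.append(AuditEntry(case["case_id"], case["ai_diagnosis"], route, final))
    return log


def review_fraction(log: List[AuditEntry]) -> float:
    """Share of cases that required physician review."""
    return sum(e.route == "physician_review" for e in log) / len(log)


# Toy input with placeholder labels; not trial data.
cases = [
    {"case_id": "c1", "ai_diagnosis": "subtype_A", "high_uncertainty": False},
    {"case_id": "c2", "ai_diagnosis": "subtype_A", "high_uncertainty": True},
    {"case_id": "c3", "ai_diagnosis": "subtype_B", "high_uncertainty": False},
    {"case_id": "c4", "ai_diagnosis": "subtype_B", "high_uncertainty": False},
]
log = route_cases(cases, physician_review=lambda case: "subtype_B")
print(review_fraction(log))
```

The review fraction is one simple way to express how much physician workload the selective deferral removes; the trial's actual primary endpoint is workflow efficiency, whose exact measurement is not described in this listing.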

Trial contacts and locations

Central trial contact

Xiaojin Gao, Dr.

Data sourced from clinicaltrials.gov

© Copyright 2026 Veeva Systems