Head-to-Head Evaluation of the Cancer Ontology Supervised Multimodal Orchestration (COSMO) AI System Versus Pathologist-Only Review (COSMO)

December 13, 2025 updated by: Kun-Hsing Yu, Harvard Medical School (HMS and HSDM)

This study evaluates the diagnostic performance of the Cancer Ontology Supervised Multimodal Orchestration (COSMO) AI system for cancer subtype classification and compares it head-to-head with pathologist-only review. Pathologists will independently review de-identified whole-slide images derived from up to 300 patients across three anatomical sites (brain, lung, kidney) and provide diagnostic assessments. In parallel, COSMO will process the same cases offline to generate independent predictions, enabling direct comparison of diagnostic accuracy between human experts and the AI system.

The study will characterize the diagnostic accuracy of COSMO and pathologists, inter-observer agreement, and variations in performance across anatomical sites and cancer types with different incidence rates. Results will establish how COSMO compares to pathologists on identical cases and will inform the development of AI-assisted diagnostic systems in clinical practice.

Study Overview

Status

Enrolling by invitation

Detailed Description

Study Rationale and Background: Diagnostic accuracy in cancer subtype classification varies significantly among pathologists due to differences in expertise, experience, and access to diagnostic resources. The emergence of AI systems in pathology offers the potential to enhance diagnostic performance and consistency in cancer classification. However, direct empirical comparisons of AI-based predictions and pathologists' diagnostic performance on identical cases remain limited in the literature.

Study Aims: This head-to-head comparative study aims to: (1) evaluate the diagnostic performance of the COSMO AI system in cancer subtype classification across multiple anatomical sites; (2) characterize the diagnostic accuracy of experienced pathologists on the same cases; (3) directly compare diagnostic performance metrics between COSMO and pathologists; and (4) examine concordance patterns and performance variation by anatomical site, cancer incidence category, pathologist experience, and case complexity.

Study Setting and Participants: The study will involve up to 25 board-certified pathologists with 3 to 10+ years of diagnostic experience, recruited from institutions across North America, Europe, and the Asia-Pacific region. Participating pathologists will have domain expertise in neuropathology, pulmonary pathology, urologic pathology, or general anatomical pathology.

Cases and Stratification: The study will employ de-identified archival whole-slide images representing up to 300 patients with confirmed reference diagnoses, including 100 brain cancers, 100 lung cancers, and 100 kidney cancers. Cases will be stratified by cancer type and incidence category (common vs. rare or uncommon), consistent with World Health Organization (WHO) guidelines.

Data Collection: Pathologists will independently review each case and provide diagnostic classifications along with confidence assessments using a 5-point scale. The digital pathology interface will automatically record time-to-diagnosis metrics. COSMO will process the same cases offline to generate independent diagnostic predictions and confidence scores. Both pathologist and AI predictions will be evaluated against established reference standard diagnoses.

Analysis Framework: The primary analysis will characterize diagnostic performance for both pathologists (at the individual and aggregated levels) and the COSMO system. Metrics will include accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUROC). Secondary analyses will assess performance stratified by anatomical site, cancer incidence category, and pathologist experience level.
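
As a minimal sketch of how the per-subtype metrics named above could be computed, the Python below treats one cancer subtype as the positive class (one-vs-rest) and derives the metrics from the resulting 2x2 counts. The function name, the placeholder subtype labels, and the toy data are illustrative assumptions; the record does not state how multiclass subtype predictions will be reduced to binary metrics.

```python
def one_vs_rest_metrics(reference, predicted, positive):
    """Binary metrics for one subtype versus all other subtypes.

    Hypothetical helper: the one-vs-rest reduction is an assumption,
    not something the study record specifies.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive and pred == positive:
            tp += 1
        elif ref == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # An empty denominator means the metric is undefined for this class.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data with placeholder labels (not study data).
reference = ["A", "A", "A", "B", "B", "C", "C", "C"]
predicted = ["A", "A", "B", "B", "B", "C", "A", "C"]
metrics_for_a = one_vs_rest_metrics(reference, predicted, positive="A")
```

Repeating the call for each subtype and averaging would give macro-averaged values; whether the study uses macro, micro, or another form of averaging is not stated in the record.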

Study Type

Observational

Enrollment (Estimated)

30

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

    • Massachusetts
      • Boston, Massachusetts, United States, 02115
        • Harvard Medical School

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Child
  • Adult
  • Older Adult

Accepts Healthy Volunteers

No

Sampling Method

Non-Probability Sample

Study Population

We will recruit pathologists from international academic medical centers, hospital systems, and diagnostic pathology practices in North America (United States), Europe (Austria, Hungary), and the Asia-Pacific region (Taiwan, Hong Kong, South Korea, China, India). Participating sites will include major academic institutions with established pathology departments, with recruitment targeting expertise in neuropathology, pulmonary pathology, and urologic pathology.

Description

Inclusion Criteria:

  • Board-certified pathologist with expertise in neuropathology, pulmonary pathology, urologic pathology, or general anatomical pathology
  • Minimum of 3 years of clinical diagnostic experience
  • Active clinical practice involving diagnostic pathology slide review
  • Willingness to independently review and diagnose up to 300 de-identified whole-slide images
  • Ability to access the study platform and complete case reviews within the specified study timeline
  • Provision of informed consent for study participation

Exclusion Criteria:

  • Prior involvement in the design or validation of the COSMO AI system
  • Inability to commit sufficient time to complete assigned case reviews
  • Presence of significant financial conflicts of interest related to the study outcomes

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Cohorts and Interventions

Group / Cohort

  • AI-Based Evaluation using COSMO
  • Pathologist-Based Evaluation

Intervention / Treatment

  • Digital Pathology Evaluation

What is the study measuring?

Primary Outcome Measures

Outcome Measure: Diagnostic performance
Measure Description: Diagnostic performance of the COSMO AI system and pathologists in identifying cancer subtypes across brain, lung, and kidney tumors, as assessed by accuracy, balanced accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUROC). We will include both overall comparisons and stratified evaluations by anatomical site and cancer incidence category (common vs. rare or uncommon).
Time Frame: Periprocedural (at the time of slide review)
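
Two of the primary metrics, balanced accuracy and AUROC, can be sketched as follows. Balanced accuracy is taken here as the unweighted mean of per-class recall, and AUROC is computed from its rank interpretation (the probability that a positive case scores higher than a negative one, with ties counted as one half). Function names and toy values are assumptions for illustration. The record does not say how AUROC will be derived for pathologists, who report a 5-point confidence rather than a continuous score.

```python
from collections import defaultdict


def balanced_accuracy(reference, predicted):
    """Unweighted mean of per-class recall over the classes in `reference`."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for ref, pred in zip(reference, predicted):
        totals[ref] += 1
        if ref == pred:
            hits[ref] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def auroc(scores, is_positive):
    """AUROC for one class via pairwise comparison of positive and negative cases.

    Quadratic in the number of cases, which is acceptable for a few
    hundred cases; ties between a positive and a negative count as 0.5.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data with placeholder labels and scores (not study data).
reference = ["A", "A", "A", "B", "B", "C", "C", "C"]
predicted = ["A", "A", "B", "B", "B", "C", "A", "C"]
bal_acc = balanced_accuracy(reference, predicted)

scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [True, True, True, False, False, False]
auc = auroc(scores, labels)
```

Balanced accuracy is the more informative of the two accuracy figures when rare subtypes are included, because each subtype contributes equally regardless of how many cases it has.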

Secondary Outcome Measures

Outcome Measure: Inter-Observer Agreement Among Pathologists
Measure Description: Diagnostic concordance among participating pathologists, measured by Fleiss' kappa, intraclass correlation coefficient (ICC), and pairwise concordance rates.
Time Frame: Periprocedural (at the time of slide review)

Outcome Measure: Pathologist-COSMO AI Concordance
Measure Description: Agreement patterns between pathologist diagnoses and COSMO AI predictions, including proportion of concordant cases overall and stratified by anatomical site, cancer incidence category, and pathologist experience level.
Time Frame: Periprocedural (at the time of slide review)

Outcome Measure: Diagnostic Confidence
Measure Description: Mean confidence scores (5-point scale) reported by pathologists during diagnostic assessment, stratified by anatomical site, cancer incidence category, and diagnostic correctness (correct vs. incorrect).
Time Frame: Periprocedural (at the time of slide review)

Outcome Measure: Time-to-Diagnosis
Measure Description: Mean diagnostic time (in seconds) required by pathologists to provide cancer subtype classification, stratified by anatomical site, cancer incidence category, and pathologist experience level.
Time Frame: Periprocedural (at the time of slide review)

Outcome Measure: Diagnostic Performance Stratified by Pathologist Experience
Measure Description: Diagnostic accuracy of pathologists stratified by years of clinical experience (3-5 years, 6-10 years, >10 years) to assess the relationship between experience level and diagnostic performance in cancer subtype classification.
Time Frame: Periprocedural (at the time of slide review)
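
For the inter-observer agreement measure, Fleiss' kappa can be sketched as below. The sketch assumes every case is rated by the same number of pathologists, which is a requirement of the statistic itself and not something the record confirms for this study. The function name and toy ratings are illustrative.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical ratings.

    `ratings` is a list of cases; each case is a list holding one
    diagnosis label per rater. All cases must have the same number
    of raters (at least two).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number of raters (>= 2)")

    category_totals = Counter()
    agreement_sum = 0.0
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this case.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))

    observed = agreement_sum / len(ratings)
    total_ratings = len(ratings) * n_raters
    expected = sum((c / total_ratings) ** 2 for c in category_totals.values())
    if expected == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Toy ratings: 4 cases, 3 raters each, placeholder labels (not study data).
toy_ratings = [
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["A", "A", "A"],
    ["B", "B", "B"],
]
kappa = fleiss_kappa(toy_ratings)  # roughly one third for this toy data
```

If pathologists end up reviewing different subsets of cases, a statistic that tolerates missing ratings would be needed; the record does not say which approach the study will take in that situation.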

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Investigators

  • Principal Investigator: Kun-Hsing Yu, MD, PhD, Harvard Medical School (HMS and HSDM)

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

June 12, 2025

Primary Completion (Estimated)

January 31, 2026

Study Completion (Estimated)

January 31, 2026

Study Registration Dates

First Submitted

December 13, 2025

First Submitted That Met QC Criteria

December 13, 2025

First Posted (Actual)

December 29, 2025

Study Record Updates

Last Update Posted (Actual)

December 29, 2025

Last Update Submitted That Met QC Criteria

December 13, 2025

Last Verified

December 1, 2025

More Information

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

NO

IPD Plan Description

Individual pathologist diagnostic assessments will not be shared to protect evaluator anonymity and privacy. De-identified case data and aggregated performance metrics will be made available through published results and supplementary materials. The protocol document will be uploaded to enable full methodological transparency.

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
