

Development and Validation of an Artificial Intelligence-assisted System for Bowel Cleanliness Assessment Based on Withdrawal Distance Weighting


Fudan University

Status

Not yet enrolling

Conditions

Colorectal Adenoma

Treatments

Other: No Intervention: Observational Cohort

Study type

Observational

Funder types

Other

Identifiers

NCT07150130
2024K272-F251

Details and patient eligibility

About

To address the limitations of current AI-based systems that rely on the assumption of a "constant withdrawal speed," this study proposes the integration of the UPD-3 endoscopic positioning system. By using colonoscope withdrawal videos in combination with UPD-3 imaging data as training samples, we aim to develop an AI-powered bowel cleanliness assessment system that incorporates "withdrawal distance" as a weighting factor. This approach is expected to yield a more reliable, objective, and clinically applicable intelligent assessment system that better aligns with real-world clinical practice and endoscopists' operational habits.

Full description

This study aims to develop an intelligent bowel cleanliness assessment system that uses colonoscope withdrawal distance as a weighting factor. The system consists of the following four modules:

  1. Module 1: Exclusion of Unqualified Frames in Colonoscopy Videos

    1.1 A total of 20 randomly selected colonoscope withdrawal videos (from 20 different subjects) were retrospectively collected from the Endoscopy Center database of Huadong Hospital between January 2018 and June 2024. Images were extracted at a rate of 5 frames per second. Clear frames suitable for BBPS scoring and unqualified frames (e.g., blurred, under irrigation, with instrument manipulation, images from the small intestine, outside the patient, or chromoendoscopy images) were manually labeled.

    1.2 The labeled images were split into training and validation sets at a 7:3 ratio. A Transformer-based AI classification model was trained on the training set and validated on the validation set.

    1.3 An additional 10 independent colonoscope withdrawal videos (from 10 different subjects) were retrospectively collected using the same method for image extraction and manual labeling. These served as an external validation set to assess the accuracy of the AI model in classifying qualified vs. unqualified frames, thereby evaluating its clinical applicability.
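The frame sampling and 7:3 split in Module 1 can be sketched as two small helpers. This is an illustrative sketch, not the study's code: the function names, the fixed random seed, and the rounding choices are assumptions, and the actual video decoding (e.g., via ffmpeg or OpenCV) is left to the pipeline.

```python
import random

def frames_to_keep(total_frames: int, source_fps: float,
                   target_fps: float = 5.0) -> list[int]:
    """Module 1.1 extracts images at 5 frames per second. Given a video's
    frame count and native frame rate, return the frame indices to keep."""
    # e.g., a 25 fps source keeps every 5th frame to reach ~5 fps
    step = max(1, round(source_fps / target_fps))
    return list(range(0, total_frames, step))

def split_7_3(items: list, seed: int = 0) -> tuple[list, list]:
    """Module 1.2 splits the labeled images into training and validation
    sets at a 7:3 ratio (the seed here is an illustrative assumption)."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]
```

For example, a 2-second clip at 25 fps (50 frames) keeps indices 0, 5, 10, ..., 45, i.e., 10 frames.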

  2. Module 2: BBPS 0-3 Scoring for Qualified Colonoscopy Images

    2.1 Colonoscopy images were randomly collected from the same database between January 2018 and June 2024. Three expert endoscopists (each with over 5 years of experience) independently assigned BBPS scores (0-3) to each image. Images for which at least two of the three ratings agreed were included. Data collection concluded once each of the four categories (BBPS 0-3) had at least 500 labeled images, sourced from at least 500 different subjects.

    2.2 The labeled images were split into training and validation sets at a 7:3 ratio. A CLIP-based AI classification model was trained and validated accordingly.

    2.3 Ten additional independent withdrawal videos (from 10 different subjects) were retrospectively collected. Images were extracted at 1 frame per second and manually labeled with BBPS 0-3 scores. These were used as an external validation set to evaluate the model's BBPS classification accuracy and clinical relevance.

    2.4 Human-Machine Contest: A set of 120 colonoscopy images (30 for each BBPS score 0-3), labeled by three expert endoscopists, was retrospectively selected (from at least 30 different subjects). The AI system, two junior endoscopists (<5 years of experience), and two expert endoscopists (>5 years of experience) independently assigned BBPS scores to all 120 images. The accuracy of the AI system was compared with that of the junior and expert endoscopists.

  3. Module 3: Prediction of Hepatic and Splenic Flexure Locations

    3.1 Forty colonoscope withdrawal videos (from 40 different subjects) containing UPD-3 positioning data were collected. An expert endoscopist used the UPD-3 system and the video to identify and extract 5-15s video clips representing the transition through the hepatic flexure (ascending to transverse colon) and the splenic flexure (transverse to descending colon).

    3.2 These 40 videos were split into training and validation sets at a 3:1 ratio. For training, the 5-15s clips representing flexure transitions, along with 5s of video before and after the clip, were used to train a Video-LLaMA-based AI model. In the validation set, the model predicted flexure transition clips, and its consistency with expert annotations was measured. Manual verification of whether predicted clips contained the actual transition process was also performed to compute accuracy.

    3.3 Human-Machine Contest: Ten independent colonoscope withdrawal videos (from 10 different subjects) with UPD-3 positioning were collected. The UPD-3 overlay was masked, and the AI system, two junior endoscopists, and two expert endoscopists were asked to extract 10s clips they believed represented transitions through the hepatic and splenic flexures. An expert endoscopist then used the UPD-3 data and the videos to judge whether each 10s clip indeed included the respective transition, comparing the accuracy among the AI system, junior, and expert endoscopists.
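The accuracy checks in 3.2 and 3.3 (judging whether a predicted clip actually includes the flexure transition) can be formalized as interval containment. This is a minimal sketch under the assumption that both the predicted clip and the expert-annotated transition are expressed as (start, end) times in seconds; the function names are hypothetical.

```python
def clip_contains_transition(clip: tuple[float, float],
                             transition: tuple[float, float]) -> bool:
    """A predicted clip counts as correct when it fully covers the
    expert-annotated flexure transition interval (per Module 3, an
    expert verifies each clip against the UPD-3 positioning data)."""
    (c0, c1), (t0, t1) = clip, transition
    return c0 <= t0 and t1 <= c1

def accuracy(predictions: list, annotations: list) -> float:
    """Fraction of videos whose predicted clip contains the transition."""
    hits = sum(clip_contains_transition(p, a)
               for p, a in zip(predictions, annotations))
    return hits / len(annotations)
```

A 10 s clip from 30-40 s covers a transition annotated at 32-38 s, but misses one at 39-45 s.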

  4. Module 4: Real-Time Prediction of Withdrawal Distance

    4.1 Dataset Construction: Video segments were randomly sampled from full withdrawal videos. OCR was used to extract the real-time insertion depth values displayed by the UPD-3 system, and the difference in insertion depth between the first and last frames of each segment was recorded as the ground-truth label. Segments were manually screened to exclude invalid clips, yielding a final dataset of over 2,000 valid segments covering a range of clip lengths and withdrawal distances.

    4.2 Teacher Model Training: The teacher model consists of a 3D feature extractor (3D-GAN) and a video feature extractor (Transformer), and estimates withdrawal distance from UPD-3 imaging together with video clips. The 3D-GAN extracts features from multi-view UPD-3 images, and the Transformer extracts video features; the 3D features guide the training of the Transformer to predict withdrawal distance.

    4.3 Student Model Training via Knowledge Distillation: The student model is a single Transformer trained on video-only inputs. It learns from the teacher model through knowledge distillation, improving its prediction accuracy without access to 3D data at inference time.

    4.4 Model Validation: The student model was evaluated on the validation set using video input alone (no 3D data), to better reflect real-world deployment. Accuracy was measured as the mean squared error between predicted and actual withdrawal distances.
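The losses in 4.3 and 4.4 can be sketched numerically. A common form of regression distillation trains the student against a blend of the ground-truth labels and the teacher's predictions; the protocol does not specify the loss or its weighting, so the `alpha` hyperparameter and function names below are illustrative assumptions. The MSE metric itself is the one named in 4.4.

```python
def mse(pred: list[float], target: list[float]) -> float:
    """Mean squared error, the validation metric in Module 4.4:
    average squared difference between predicted and actual
    withdrawal distances."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def distillation_loss(student_pred: list[float],
                      teacher_pred: list[float],
                      ground_truth: list[float],
                      alpha: float = 0.5) -> float:
    """Sketch of Module 4.3: the video-only student regresses toward
    both the OCR-derived ground-truth distances and the teacher's
    (video + UPD-3) predictions; `alpha` balances the two terms and
    is an assumed hyperparameter, not specified in the protocol."""
    return (alpha * mse(student_pred, ground_truth)
            + (1 - alpha) * mse(student_pred, teacher_pred))
```

At inference, only the student's video branch is needed, which matches the video-only validation in 4.4.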

Enrollment

700 estimated patients

Sex

All

Volunteers

No Healthy Volunteers

Inclusion criteria

  • Clear colonoscopy images suitable for BBPS scoring
  • Complete and clear colonoscopy videos suitable for BBPS scoring
  • Clear colonoscopy videos with a stable UPD-3 positioning system, without signal drift, disappearance, or other disruptions

Exclusion criteria

  • Blurred colonoscopy images
  • Colonoscopy images taken from the small intestine or outside the patient's body
  • Colonoscopy images captured during irrigation or instrument manipulation
  • Colonoscopy images obtained during chromoendoscopy
  • Colonoscopy videos that do not contain the complete withdrawal process
  • Videos in which the UPD-3 colonoscopic positioning system exhibited signal drift, disappearance, or other instability

Trial design

700 participants in 4 patient groups

Cohort for Module 1
Description:
Development and Validation of Module for Exclusion of Unqualified Frames in Colonoscopy Videos
Treatment:
Other: No Intervention: Observational Cohort
Cohort for Module 2
Description:
Development and Validation of Module for BBPS 0-3 Scoring for Qualified Colonoscopy Images
Treatment:
Other: No Intervention: Observational Cohort
Cohort for Module 3
Description:
Development and Validation of Module for Prediction of Hepatic and Splenic Flexure Locations
Treatment:
Other: No Intervention: Observational Cohort
Cohort for Module 4
Description:
Development and Validation of Module for Real-Time Prediction of Withdrawal Distance
Treatment:
Other: No Intervention: Observational Cohort

Trial contacts and locations


Central trial contact

Danian Ji, M.D.; Zhiyu Dong, M.D.

Data sourced from clinicaltrials.gov
