ClinicalTrials.Veeva

An Interpretable Fundus Disease Report Generation System Based on Weak Labeling

Sun Yat-sen University

Status

Begins enrollment this month

Conditions

Choroidal Neovascularization
Age Related Macular Degeneration
Pathological Myopia
Choroidal Disease
Retinal Diseases
Diabetic Retinopathy

Treatments

Other: With fundus diseases
Other: Without fundus diseases

Study type

Observational

Funder types

Other

Identifiers

NCT06918028
2024KYPJ005

Details and patient eligibility

About

This study aims to establish a multimodal fundus image report generation model trained on weakly labeled data, realizing an interpretable system for multi-disease, multimodal image analysis, diagnosis, and automated treatment-decision reporting. We will construct an interpretable feature fusion network for the clinical and imaging features of fundus lesions, aiming to extract new imaging markers that can predict the occurrence and progression of various fundus lesions at an early stage. These markers will then be validated in real clinical data, providing possible directions for exploring the molecular mechanisms of refractory fundus lesions and potentially offering new ideas for their precise prevention and treatment.

Full description

  1. AI models for multimodal fundus imaging in the diagnosis of retinal diseases. AI models have demonstrated significant potential in assisting the diagnosis of various retinal diseases based on multimodal fundus imaging. In recent years, AI has rapidly advanced in the field of fundus disease imaging diagnostics. High-accuracy diagnostic models can be developed using large datasets of precisely annotated single-modal images. Fundus photography, which provides clinicians with an initial diagnostic impression, is widely accessible and can be obtained using simple imaging devices or even mobile devices. For instance, Cen et al. trained a deep learning system using 249,620 precisely annotated fundus photographs to diagnose 39 common retinal diseases, achieving diagnostic accuracies exceeding 90% for each condition.

    However, fundus photography alone offers limited disease information, making it challenging to differentiate between diseases with similar manifestations. In addition, its diagnostic accuracy depends heavily on image quality and clinician expertise, which may lead to missed or misdiagnosed cases.

    Optical coherence tomography (OCT) provides a three-dimensional analysis of the retinal layers, clearly revealing the severity and location of pathologies such as intraretinal and subretinal fluid. OCT has become a standard tool for the diagnosis and differential diagnosis of retinal diseases and is essential for guiding the precise treatment and follow-up of conditions such as age-related macular degeneration and diabetic macular edema. AI-assisted OCT analysis can further enhance follow-up and personalized treatment. For example, De Fauw et al. utilized 14,884 OCT scans to diagnose more than 10 retinal diseases and map the location of lesions.

    Accurate diagnosis of retinal diseases also relies on dynamic and functional evidence. Fundus fluorescein angiography (FFA) and indocyanine green angiography (ICGA) are indispensable for localizing and characterizing lesions and for evaluating their vascular function. However, because angiographic images are complex to interpret, the application of AI to FFA and ICGA analysis has only recently gained traction.

    Moreover, most cases require multimodal imaging, including OCT, fundus photography, and angiography, to comprehensively locate and analyze the disease pathology. Additionally, integrating clinical data and patient medical history is crucial for an accurate diagnosis. Therefore, there is an urgent need to develop new AI models capable of integrating multi-modal data to assist clinicians in accurately diagnosing complex retinal diseases.

  2. AI-Assisted Generation of Complex Medical Reports. The complexity and specialized nature of fundus imaging make image interpretation challenging, and the shortage of clinicians capable of generating precise reports further increases the workload of ophthalmologists and hinders early diagnosis and treatment of retinal diseases. AI-assisted report generation has the potential to address these challenges. Compared to simple image recognition and classification, developing AI models that generate textual interpretations from images is more complex, as it requires the machine to mimic a human-like understanding of image content. "Image-to-text" models incorporate various deep learning algorithms, including computer vision and natural language processing. Early models required millions of natural images to achieve satisfactory text generation, which is impractical for medical imaging due to limited datasets and extensive textual information. However, clinical reports generated by clinicians, based on comprehensive clinical data, detailed image analysis, and experience, can serve as high-quality training datasets for such models.

    Currently, significant research efforts are focused on areas with large datasets and standardized report formats, such as chest X-rays, chest CT scans, and brain MRI. Widely used report generation databases include Open-i, MIMIC-CXR, and PadChest, with MIMIC-CXR containing more than 270,000 chest radiograph reports. In ophthalmology, research in this area is limited, owing to the complexity of fundus imaging interpretation and the relatively small datasets, particularly for highly specialized modalities such as fundus angiography. Our team has developed a fundus fluorescein angiography report generation dataset (FFA-IR) based on angiography images and the corresponding reports. Our report generation model can produce accurate bilingual reports (Chinese and English) for common and rare retinal diseases, with accuracy comparable to that of human retinal specialists, while significantly reducing report generation time.
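As a rough illustration of how such an "image-to-text" pipeline is shaped, the sketch below (plain Python with NumPy, random placeholder weights, and a toy vocabulary, none of which reflect the project's actual model) shows an encoder producing image features and a greedy decoder emitting report tokens one at a time:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image-to-text" pipeline: an encoder maps an image to a feature
# vector, and a decoder emits report tokens step by step. All weights
# are random placeholders -- a sketch of the architecture's shape,
# not a trained model.
VOCAB = ["<start>", "<end>", "no", "macular", "edema", "drusen", "present"]

def encode(image):
    """Stand-in encoder: global average pooling over pixel features."""
    return image.mean(axis=(0, 1))          # (channels,)

def decode(features, W, max_len=6):
    """Greedy decoding: pick the highest-scoring next token each step."""
    tokens, prev = [], VOCAB.index("<start>")
    for _ in range(max_len):
        logits = W[prev] @ features          # toy next-token scores
        prev = int(np.argmax(logits))
        if VOCAB[prev] == "<end>":
            break
        tokens.append(VOCAB[prev])
    return tokens

image = rng.random((8, 8, 4))                # fake 8x8 image, 4 channels
W = rng.random((len(VOCAB), len(VOCAB), 4))  # per-token projection weights
print(decode(encode(image), W))
```

Real systems replace the pooling encoder with a convolutional or Transformer backbone and the lookup decoder with an attention-based language model, but the encode-then-decode structure is the same.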

  3. Weak Annotation: A Promising Approach for Training New Disease Diagnosis and Classification Models. Traditional AI-assisted diagnostic models for medical imaging typically rely on large-scale, precisely annotated, high-quality images, as the accuracy of the training dataset is critical to model performance. When high annotation accuracy cannot be guaranteed, increasing the dataset size is often the only way to improve performance. However, for complex datasets such as fundus angiography images, the large number of diseases and their variable manifestations increase annotation difficulty exponentially. Additionally, clinical reports often contain uncertainties due to incomplete clinical data or varying levels of clinician expertise, leading to inaccuracies in diagnostic reports.

Weak annotation offers a promising way to reduce annotation costs, improve annotation efficiency, and enhance model generalizability. In natural image processing, weak annotation has been widely studied, with large-scale natural image datasets enabling the training of such models. In medical AI research, Guo et al. pioneered the use of weakly annotated datasets derived from brain CT reports, automatically extracting low-quality keyword information to accurately identify and locate four common brain pathologies; the approach generalized well across centers and imaging devices. However, these systems were limited to broad disease categories and single-modal images, lacking the ability to diagnose specific diseases or generate detailed reports.
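The keyword-extraction idea behind weak annotation can be illustrated with a minimal sketch; the keyword dictionary and label names below are invented for illustration and are not taken from the study:

```python
# Hypothetical lesion keywords mapped to weak labels. A real system
# would use curated ontologies and NLP over free-text reports,
# not a fixed dictionary like this one.
KEYWORDS = {
    "neovascularization": "CNV",
    "drusen": "AMD",
    "microaneurysm": "DR",
    "macular edema": "DME",
}

def weak_labels(report_text):
    """Derive weak image-level labels from a free-text report."""
    text = report_text.lower()
    return sorted({label for kw, label in KEYWORDS.items() if kw in text})

print(weak_labels("Fundus shows drusen and early choroidal neovascularization."))
# ['AMD', 'CNV']
```

The labels inherit the report's uncertainty, which is exactly why they are "weak": the model must learn to tolerate noisy or incomplete supervision rather than pixel-precise annotation.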

Based on preliminary findings and the literature, we propose a hypothesis: weakly annotated imaging reports, combined with deep learning techniques such as knowledge graphs and Transformers, can be used to build an interpretable, multimodal, multi-disease fundus report generation system. This project aims to refine existing AI models and develop a system that helps generate imaging diagnoses and reports for multiple retinal diseases. The results should not only reduce the workload of ophthalmologists but also promote the wider adoption of advanced fundus imaging techniques, ultimately improving the early diagnosis and treatment of blinding retinal diseases.
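The interpretability role of a knowledge graph in such a system can be sketched as follows; the triples and the "suggests" relation are hypothetical examples, not the study's actual graph:

```python
# Hypothetical miniature knowledge graph relating imaging findings to
# diseases. A real system would curate or learn such relations at far
# larger scale and attach them to generated report sentences.
KG = {
    ("choroidal neovascularization", "suggests"): "wet AMD",
    ("intraretinal fluid", "suggests"): "macular edema",
    ("microaneurysms", "suggests"): "diabetic retinopathy",
}

def explain(findings):
    """Attach a graph-backed rationale to each detected finding, making
    the report's diagnostic suggestions traceable to explicit relations."""
    lines = []
    for f in findings:
        disease = KG.get((f, "suggests"))
        if disease:
            lines.append(f"{f} -> suggests -> {disease}")
    return lines

print(explain(["intraretinal fluid", "microaneurysms"]))
# ['intraretinal fluid -> suggests -> macular edema',
#  'microaneurysms -> suggests -> diabetic retinopathy']
```

Grounding each generated statement in an explicit finding-to-disease relation is one way a report generator can be made interpretable rather than a black box.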

Enrollment

9,999 estimated patients

Sex

All

Volunteers

Accepts Healthy Volunteers

Inclusion criteria

  • Disease group: All multimodal fundus examination images containing fundus lesions, acquired from January 2011 to December 2023, including fundus photography as well as OCT, OCTA, FFA, ICGA, and B-scan ultrasonography, with the corresponding imaging reports. Images may be clear or unclear, and reports may be complete or incomplete.
  • Normal group: All multimodal fundus examination images without fundus lesions, acquired from January 2011 to December 2023, including fundus photography as well as OCT, OCTA, FFA, ICGA, and B-scan ultrasonography, with the corresponding imaging reports. Images may be clear or unclear, and reports may be complete or incomplete.

Exclusion criteria

  • Disease group: 1. The image has serious quality problems; 2. The diagnostic report lacks key information.
  • Normal group: 1. The image has serious quality problems; 2. The diagnostic report lacks key information.

Trial design

9,999 participants in 2 patient groups

Disease group
Description:
With retinal or choroidal diseases
Treatment:
Other: With fundus diseases
Normal group
Description:
Without retinal or choroidal diseases
Treatment:
Other: Without fundus diseases

Trial contacts and locations

1 location

Central trial contact

Wenjia Cai, Ph.D.

Data sourced from clinicaltrials.gov
