ClinicalTrials.Veeva

Prognosis and Management of Infective Endocarditis Using the Clinical Data Warehouse From AP-HP (ENDO-EDS)

Assistance Publique - Hôpitaux de Paris

Status

Completed

Conditions

Endocarditis Infective

Study type

Observational

Funder types

Other

Identifiers

NCT06957119
APHP240997

Details and patient eligibility

About

Infective endocarditis (IE) is a rare but severe condition with significant morbidity and mortality: in-hospital mortality reaches 20% and 1-year mortality up to 40%. The epidemiological profile of IE has changed profoundly in recent years, both in the responsible microorganisms and in the affected populations, shifting from young adults with post-rheumatic valvulopathy to older patients with degenerative valve disease or prosthetic implants. Prophylaxis and management recommendations have also evolved, underscoring the importance of monitoring the evolution of IE profiles in France.

Despite these changes, there is no standardized surveillance for IE in France, and existing studies often rely on data from specialized centers, which introduces selection bias. Another important limitation of studying IE with medico-administrative databases, such as the French nationwide claims database (SNDS), is the poor performance of administrative coding (ICD-10) in accurately identifying IE cases.

The "ENDO-EDS" project aims to leverage the extensive data available in the AP-HP Clinical Data Warehouse (CDW) to study IE in a real-world, unbiased context. The AP-HP CDW, which covers 11 million patients, offers the opportunity to identify IE cases across a large population base and, because it includes both expert and non-expert IE hospitals, to describe the characteristics of the disease while reducing the risk of bias. In addition, the AP-HP CDW contains medical reports and other documents produced during patients' hospital stays, which makes it possible to overcome the limitations of ICD-10 coding by combining a large-scale clinical data repository with advanced Natural Language Processing (NLP) algorithms.

The anticipated benefits include improving knowledge of the epidemiological profile of IE, describing diagnostic and therapeutic management practices, and studying their impact on patient prognosis. These efforts aim to contribute to the improvement of IE diagnosis and management in France. The results will be published as scientific articles in open-access peer-reviewed journals. Additionally, the NLP-based algorithms developed and validated to identify patients with IE within the AP-HP data warehouse could be shared to extend subsequent analyses to other French or even European health data warehouses.

Full description

1. Introduction

Infective endocarditis (IE) is a rare but serious disease, with an incidence of 3-6 cases per 100,000 per year in France and a hospital mortality rate of 20%, increasing to 40% at five years. Over recent decades, the epidemiological profile of IE has significantly evolved. Previously identified as a disease of young adults with well-defined predisposing valvular conditions, such as post-rheumatic valvulopathy, IE now primarily affects older patients, many without identifiable valvular disease.

IE results from bacterial colonization of a sterile fibrin-platelet vegetation on damaged endocardium. Changes in both the sources of bacteremia and valvular abnormalities have been noted. Historically, rheumatic valvular disease and cyanotic congenital heart defects were the predominant predisposing factors. The decline in rheumatic fever and early surgical correction of congenital heart defects have reduced their role. However, new risk factors, such as prosthetic valves, degenerative valvular sclerosis, and invasive medical procedures, have emerged.

Despite these changes, the incidence of IE has not decreased. Microbiological profiles have also shifted. Meta-analyses indicate that staphylococci have surpassed oral streptococci as the leading causative organisms. This shift varies geographically; for instance, IE caused by Staphylococcus aureus is more prevalent in the United States than in Europe. Factors contributing to these differences include dialysis, diabetes, and intravenous drug use in certain regions.

Prophylaxis guidelines have undergone significant revisions. In France, since 2002, antibiotic prophylaxis has been limited to high-risk patients, such as those with prosthetic valves or prior IE, reflecting evidence-based recommendations to reduce unnecessary antibiotic use and resistance. This paradigm shift may influence the epidemiology of IE, necessitating ongoing surveillance.

Diagnosing IE remains challenging due to its diverse clinical presentations. While echocardiography is the primary diagnostic tool, other imaging modalities such as PET-CT, MRI, and multi-slice CT are increasingly used for complex cases. Therapeutic approaches have also evolved. Surgical intervention during acute IE episodes has become more frequent, facilitated by advancements in surgical techniques and improved patient management. Nevertheless, questions remain regarding optimal timing, surgical indications, and patient selection criteria.

The growing interest in oral antibiotic regimens during late treatment phases, particularly following the POET trial, offers potential to reduce hospitalization duration. However, these strategies require further validation.

Given the high morbidity and mortality associated with IE, its evolving characteristics, and the challenges in diagnosis and treatment, continuous monitoring of IE epidemiology is critical. In France, surveillance faces several obstacles: IE is not a notifiable disease, it is excluded from rare disease reference centers, and population studies are costly and underfunded. Harnessing medico-administrative databases offers a cost-effective alternative for tracking IE trends, minimizing biases inherent in data from specialized centers.

Clinical data warehouses, and in particular the one developed by AP-HP, which includes 11 million individuals, offer the opportunity to use text search methods to identify patients with infective endocarditis and describe their characteristics. The AP-HP data warehouse will make it possible to confirm changes in the epidemiological profile, including an increasing number of cases of staphylococcal and nosocomial endocarditis, to describe diagnostic and therapeutic management practices, and to study their impact on patient prognosis. This is the objective we have set for the period 2018-2023, which precedes the publication of the new European recommendations of August 2023, to whose development we contributed. All this will contribute to improving the diagnosis and management of IE in France.

2. Source data verification

Data Collection:

  • Extract structured (ICD-10, CCAM) and unstructured data (clinical notes, imaging reports) from the APHP Data Warehouse.
  • Pre-select patients using cohort360, a tool developed by APHP for large-scale patient cohort identification.

Algorithm Development and Validation:

  • Create NLP-based algorithms to identify IE cases, clinical characteristics, and treatment pathways.

  • Validate algorithm outputs against gold-standard cases reviewed by expert cardiologists and infectious disease specialists.

3. Statistical Analysis

Analysis of the primary outcome:

A Cox model will be used to assess the strength of the association between covariates (underlying patient characteristics, characteristics of the IE, recourse to surgery, etc.) and the outcome, death from any cause at 1 year after the date of IE diagnosis, after adjustment for age, sex, and main comorbidities.
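
For reference, a Cox model is usually written in the standard proportional hazards form below. This is the textbook formulation, not a specification taken from the study protocol; the symbols (covariate vector x_i for patient i, coefficient vector β, unspecified baseline hazard h_0) are notation introduced here for illustration.

```latex
h(t \mid x_i) = h_0(t)\,\exp\!\left(\beta^{\top} x_i\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Under this form, the adjusted hazard ratio HR_j for covariate j is the quantity that expresses the "strength of the association" with 1-year all-cause death, with age, sex, and comorbidities included among the covariates.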

Analysis of secondary outcomes:

  1. Descriptive statistics of the use of imaging examinations in patients hospitalized for infective endocarditis (IE) (number, percentage ± confidence interval);
  2. Descriptive statistics of the use of cardiac valve replacement surgery in the acute phase of IE (number, percentage ± confidence interval);
  3. Descriptive statistics of the clinical, imaging, and microbiological characteristics of patients hospitalized for IE (number, percentage ± confidence interval for qualitative variables / number, median, range, interquartile range, mean, standard deviation, and confidence interval of the mean for quantitative variables). These analyses will be carried out for each of the following variables: age, sex, history of IE, history of valvulopathy, congenital heart disease, history of valve prosthesis placement/replacement or valve repair, intravenous drug use, history of pacemaker or implantable defibrillator implantation/check/replacement, microorganism(s) responsible for the IE, Charlson score, vascular phenomena, glomerulonephritis, vegetations, abscess (valvular/paravalvular), pseudoaneurysm;
  4. Descriptive statistics of the rates of patients who were switched to oral antibiotic treatment in the acute phase of IE (number, percentage ± confidence interval);
  5. Descriptive statistics of the in-hospital mortality rates of patients hospitalized for IE (number, percentage ± confidence interval);
  6. Descriptive statistics of the 1-year morbidity rates of patients hospitalized for IE: re-hospitalization rate, relapse rate, recurrence rate, and valve replacement surgery rate (number, percentage ± confidence interval);
  7. Sankey flow diagrams representing the care pathways of patients hospitalized for IE, and descriptive statistics of the rates of stays in the different medical specialties and hospital departments (number, percentage ± confidence interval) as well as the characteristics of the stays: length of stay, mode of admission, mode of discharge (number, percentage ± confidence interval for qualitative variables / number, median, range, interquartile range, mean, standard deviation, and confidence interval of the mean for quantitative variables);
  8. Descriptive statistics of the rates of patients whose case was presented at an endocarditis multidisciplinary team meeting (RCP) (number, percentage ± confidence interval, and comparisons across specialty subgroups);
  9. Descriptive statistics of the performance metrics of the diagnostic codes (ICD-10) of the PMSI stays to identify an IE (sensitivity, specificity, predictive values, correct classification rate, F1 score, etc.).
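
The performance metrics named in item 9, and the "percentage ± confidence interval" reported throughout the list, can be illustrated with a minimal Python sketch. It assumes a 2x2 confusion table of ICD-10 coding against the reference standard. The Wilson score interval is an assumption made for this example, since the record does not state which interval method is used, and the names `ConfusionCounts`, `wilson_interval`, and `diagnostic_metrics` are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class ConfusionCounts:
    """ICD-10 coding versus the reference standard (hypothetical container)."""
    tp: int  # coded as IE, IE confirmed
    fp: int  # coded as IE, IE not confirmed
    fn: int  # not coded as IE, IE confirmed
    tn: int  # not coded as IE, IE not confirmed


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%).

    Assumption: the record does not name an interval method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Point estimates; assumes every denominator below is non-zero."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn),
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Illustrative counts only, not study data.
example = ConfusionCounts(tp=80, fp=20, fn=10, tn=890)
metrics = diagnostic_metrics(example)
ppv_low, ppv_high = wilson_interval(example.tp, example.tp + example.fp)
```

The same `wilson_interval` helper applies to any of the percentages in items 1 to 8, for example the share of patients who underwent valve surgery.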

Level of statistical significance:

A significance threshold of 0.05 will be used, with control of the alpha risk (type I error) in the event of multiple testing.

Management of changes made to the initial statistical plan:

Any changes made to the initial statistical plan will be discussed and decided by the scientific committee of the study after discussion with the team in charge of the analyses. They may be suggested by the statisticians/data scientists in charge of the analyses. They will be justified and documented in the final statistical report.

4. Quality assurance

The validation of the NLP and multimodal algorithms follows a rigorous multi-step process to ensure data accuracy and reliability within the registry. Each algorithm undergoes a structured evaluation based on predefined metrics, including positive predictive value (PPV), negative predictive value (NPV), specificity, and sensitivity.

  1. Development and Internal Validation:

    • The algorithms are trained on 80% of the reference population.
    • They are built using keyword-based regular expressions curated by two domain experts (an infectious disease specialist and a cardiologist).
    • Iterative refinements are conducted in collaboration with clinicians to optimize detection accuracy.
  2. External Validation with Expert Review:

    • The remaining 20% of the reference population serves as a validation set.
    • A random sample of 100 patients is selected from this validation set.
    • For each selected patient, relevant medical documents (hospitalization reports, imaging reports, pathology reports, microbiology data) are manually reviewed by two independent physicians to establish a reference standard.
    • Algorithm outputs are then compared against this reference, and performance metrics are calculated.
  3. Registry Integration and Monitoring:

    • The validated algorithms are integrated into the EDS-NLP and EDS-SCIKIT open-source libraries to ensure reproducibility and transparency.
    • The overall process adheres to AP-HP data governance policies and CNIL (French Data Protection Authority) regulations for data reuse.
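
The development and validation steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's algorithm: the keyword pattern is a hypothetical stand-in for the expert-curated expressions (which are not given in this record), the sketch does not use the EDS-NLP or EDS-SCIKIT APIs, and the function names and fixed random seed are invented for the example.

```python
import random
import re

# Hypothetical pattern: "endocarditis"/"endocardite" in any case, or the
# upper-case abbreviation "IE". Real expressions would also need to handle
# negation, family history, and so on.
IE_PATTERN = re.compile(r"(?i:\bendocardit(?:is|e)\b)|\bIE\b")


def mentions_ie(report_text: str) -> bool:
    """Return True if the report contains a keyword hit."""
    return IE_PATTERN.search(report_text) is not None


def split_reference_population(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle reproducibly, then split into development and validation sets."""
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def sample_for_expert_review(validation_ids, n=100, seed=0):
    """Draw the random sample that physicians review manually."""
    return random.Random(seed).sample(validation_ids, min(n, len(validation_ids)))


# Illustrative identifiers only, not study data.
development_ids, validation_ids = split_reference_population(range(1000))
review_ids = sample_for_expert_review(validation_ids)
```

Fixing the seed keeps the 80/20 split and the 100-patient review sample reproducible, which matches the stated aim of reproducibility and transparency.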

This structured quality assurance framework ensures the reliability of automated data extraction, supports reproducibility, and enhances registry integrity for infective endocarditis research.

5. Ethical considerations

This study has been approved by the Scientific and Ethical Committee of the AP-HP Clinical Data Warehouse (EDS).

Concerning AP-HP's Clinical Data Warehouse (EDS), the collection and processing of health data within the EDS are authorized by the French Data Protection Authority (CNIL).

In addition, since the creation of the EDS in 2017, AP-HP has conducted an extensive information campaign for patients, including:

  • Individual notifications via postal and electronic means,
  • Collective information through its website and press releases.

Since then, AP-HP has continuously informed patients about the collection and potential reuse of their health data within the EDS, as well as their associated rights, through posters, website information, and the welcome booklet.

Enrollment

3,000 patients

Sex

All

Ages

18 to 130 years old

Volunteers

No Healthy Volunteers

Inclusion and exclusion criteria

The selection of patients will be carried out in 2 successive stages.

A first stage of pre-selection of patients (Population 0) based on the following criteria:

  • Patients aged 18 or over hospitalized in acute care (MCO: medicine, surgery, obstetrics) in an AP-HP department between 1 August 2017 and 31 December 2023;
  • Having undergone a CCAM-coded transthoracic and/or transesophageal echocardiography, or
  • Presenting a textual mention of "endocarditis" or "IE" in one of the reports associated with the hospitalization, or
  • Having an ICD-10 diagnostic code for infective endocarditis in the PMSI (I33, I38, T826, B376, I39)
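
The pre-selection rule can be sketched as a simple filter. The sketch reads the criteria as "adult stay within the study window AND at least one of the three signals (echocardiography, textual mention, ICD-10 code)", which is an interpretation of the list above. It also assumes that the window applies to the admission date, that stays are already restricted to AP-HP acute-care departments, and that ICD-10 codes are matched by prefix. The `Stay` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

IE_ICD10_PREFIXES = ("I33", "I38", "T826", "B376", "I39")
WINDOW_START = date(2017, 8, 1)
WINDOW_END = date(2023, 12, 31)


@dataclass(frozen=True)
class Stay:
    """One acute-care hospital stay (hypothetical structure)."""
    age_years: int
    admission_date: date
    has_echo_procedure: bool      # CCAM-coded TTE and/or TEE
    has_text_mention: bool        # "endocarditis" or "IE" found in a report
    icd10_codes: tuple[str, ...]  # PMSI diagnosis codes, written without dots


def in_population_0(stay: Stay) -> bool:
    """Apply the pre-selection rule under the assumptions stated above."""
    eligible = (
        stay.age_years >= 18
        and WINDOW_START <= stay.admission_date <= WINDOW_END
    )
    # str.startswith accepts a tuple of prefixes, so "I330" matches "I33".
    has_ie_code = any(code.startswith(IE_ICD10_PREFIXES) for code in stay.icd10_codes)
    return eligible and (
        stay.has_echo_procedure or stay.has_text_mention or has_ie_code
    )


# Illustrative stays only, not study data.
coded_adult = Stay(45, date(2020, 5, 1), False, False, ("I330",))
minor = Stay(17, date(2020, 5, 1), True, True, ("I330",))
too_early = Stay(60, date(2016, 1, 1), True, True, ())
no_signal = Stay(60, date(2019, 1, 1), False, False, ("I10",))
```

Keeping the three signals as an "or" makes Population 0 deliberately broad, so that the second-stage algorithm, rather than the pre-selection, decides which patients actually have IE.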

A second stage will identify the study population (Population 1) of patients presenting with infective endocarditis (the gold standard) using a multimodal algorithm, based notably on natural language processing (NLP; TAL in French) algorithms applied to hospitalization, imaging, and pathology reports. Population 1 will be used to meet all of the study objectives, with the exception of secondary objective no. 9, which will be carried out using Population 0.

Patients with no usable hospitalization reports will be excluded.

Data sourced from clinicaltrials.gov
