Comparison of Different Feature Engineering Methods for Automated ICD Coding

National Center for Cardiovascular Diseases

Status

Unknown

Conditions

Cardiovascular Diseases

Treatments

Other: No intervention

Study type

Observational

Funder types

Other

Identifiers

NCT04849195
2021-1425-02

Details and patient eligibility

About

Using traditional machine learning classifiers, this study compares bag-of-words, word2vec, and RoBERTa features for automated ICD coding of cardiovascular diseases in a Chinese corpus.

Full description

ICD coding is important because it underpins a wide range of economic and academic applications. Coding is currently done mostly by hand, which is time-consuming and error-prone, and this has made automated ICD coding via machine learning an active research topic.

Feature engineering is an unavoidable step in machine learning and plays a crucial role in coding performance. Although existing studies have reached useful conclusions, they have not compared different feature engineering methods. Identifying which methods perform better under which circumstances would help promote practical applications of automated coding.

The investigators will base this study on inpatient data collected from the electronic medical records of Fuwai Hospital, the world's largest medical center for cardiovascular disease. Bag-of-words, word2vec, and RoBERTa will each be used to extract features from the training data. Code-wise logistic regression classifiers and support vector machine classifiers will then be trained to assign codes automatically. Finally, the models' performance on the test data will be evaluated.
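The bag-of-words branch of this pipeline, combined with code-wise logistic regression, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the investigators' implementation: the records are invented toy examples in whitespace-tokenized English (real Chinese text would first need word segmentation), the function names are hypothetical, and the classifier is a plain gradient-descent logistic regression written with the standard library only. "Code-wise" is taken here to mean one independent binary classifier per ICD code.

```python
import math
from collections import Counter

# Toy, invented records (NOT study data): tokenized text and assigned ICD-10 codes.
TRAIN = [
    ("chest pain st elevation troponin high", {"I21"}),
    ("acute myocardial infarction troponin high", {"I21"}),
    ("blood pressure high long term hypertension", {"I10"}),
    ("hypertension blood pressure uncontrolled", {"I10"}),
    ("hypertension chest pain troponin high infarction", {"I10", "I21"}),
    ("routine follow up no complaint", set()),
]

def build_vocab(texts):
    """Map each distinct token to a column index."""
    tokens = sorted({tok for text in texts for tok in text.split()})
    return {tok: i for i, tok in enumerate(tokens)}

def bow_vector(text, vocab):
    """Bag-of-words: raw token counts; out-of-vocabulary tokens are ignored."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(text.split()).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, b, x):
    """Predicted probability that the code applies to feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logreg(X, y, lr=0.5, epochs=300):
    """Binary logistic regression fitted by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = score(w, b, xi) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def train_codewise(records):
    """Fit one independent binary classifier per ICD code."""
    vocab = build_vocab([text for text, _ in records])
    X = [bow_vector(text, vocab) for text, _ in records]
    codes = sorted({code for _, assigned in records for code in assigned})
    models = {
        code: train_logreg(X, [1.0 if code in assigned else 0.0 for _, assigned in records])
        for code in codes
    }
    return vocab, models

def predict_codes(text, vocab, models, threshold=0.5):
    """Return every code whose classifier scores at or above the threshold."""
    x = bow_vector(text, vocab)
    return {code for code, (w, b) in models.items() if score(w, b, x) >= threshold}

vocab, models = train_codewise(TRAIN)
predicted = predict_codes("chest pain troponin high", vocab, models)
```

Swapping `bow_vector` for averaged word2vec vectors or a RoBERTa sentence embedding would leave the per-code training loop unchanged, which is what makes a like-for-like comparison of feature engineering methods possible.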

Enrollment

6,947 estimated patients

Sex

All

Volunteers

No Healthy Volunteers

Inclusion criteria

  • Admissions to Fuwai Hospital from January 1, 2019, to February 28, 2019

Exclusion criteria

Trial design

6,947 participants in 1 patient group

Model training and test group
Description:
The data set will be split into a training group and a test group; the training group will be used for model building and the test group for subsequent evaluation and verification.
Treatment:
Other: No intervention
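
The split and evaluation described for this group might look like the sketch below. The record does not state the split ratio, the random seed, or the evaluation metric, so the 80/20 split, the seed, and the micro-averaged precision/recall/F1 used here are all assumptions chosen for illustration; the admission IDs and code sets are invented, and the function names are hypothetical.

```python
import random

def split_records(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and cut it into (train, test).

    test_fraction and seed are assumed values, not taken from the study.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def micro_prf(gold_sets, predicted_sets):
    """Micro-averaged precision, recall and F1 over per-admission code sets."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_sets, predicted_sets):
        tp += len(gold & pred)   # codes assigned correctly
        fp += len(pred - gold)   # codes assigned but not in the gold set
        fn += len(gold - pred)   # gold codes the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical admission IDs standing in for real records.
admissions = [f"adm-{i:03d}" for i in range(10)]
train, test = split_records(admissions)

# Invented gold and predicted code sets for three test admissions.
gold = [{"I10", "I21"}, {"I10"}, {"I50"}]
pred = [{"I10"}, {"I10", "I25"}, {"I50"}]
precision, recall, f1 = micro_prf(gold, pred)
```

Fixing the seed keeps the same admissions in the test group across the bag-of-words, word2vec, and RoBERTa runs, so differences in the scores reflect the features and not the split.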

Data sourced from clinicaltrials.gov