ClinicalTrials.Veeva

Sex-Specific Machine Learning Models to Predict Distant Metastasis in Liver Cancer (GENDER-HCC-MET)

Tongji University

Status

Completed

Conditions

Hepatocellular Carcinoma
Neoplasm Metastasis
Liver Cancer

Study type

Observational

Funder types

Other

Identifiers

NCT07386639
82504454 (Other Grant/Funding Number)
SEER-HCC-DM-Gender-2024

Details and patient eligibility

About

This study looked at whether male and female patients with liver cancer (hepatocellular carcinoma, HCC) have different risks of the cancer spreading to distant parts of the body (distant metastasis). Liver cancer is much more common in men than in women, and women often have better survival rates. However, it was unclear if the factors that predict this spread are the same for both sexes.

To answer this question, researchers analyzed records from a large national cancer database (SEER) covering diagnoses from 2004 to 2022, including 19,019 patients diagnosed with liver cancer. They studied factors like age, race, tumor stage, treatment received, and where patients lived. The team used advanced computer models (machine learning) to build separate prediction tools for men and women to estimate their risk of distant metastasis at the time of diagnosis.

Full description

Study Design and Objective:

This was a retrospective, population-based cohort study utilizing data from the Surveillance, Epidemiology, and End Results (SEER) database. The primary objective was to systematically compare the incidence and identify sex-specific determinants of distant metastasis in patients with hepatocellular carcinoma (HCC). A secondary objective was to develop and validate separate, high-performance machine learning (ML) prediction models for distant metastasis risk tailored to male and female patients.

Data Source and Participants:

Data were extracted from 22 SEER registries covering patients diagnosed with HCC between 2004 and 2022. Inclusion required a pathologically confirmed diagnosis of HCC. Key exclusion criteria were: missing data on race, marital status, tumor grade, or surgical status; non-first primary malignancy; and incomplete TNM staging data. After applying these criteria, 19,019 patients were included in the final analysis (14,575 males, 4,444 females).

Variables and Definitions:

The outcome variable was distant metastasis status at diagnosis, dichotomized as M0 (no metastasis) or M1 (metastasis) based on consistent AJCC criteria. Predictor variables included: age, sex, race, tumor grade, marital status, surgical treatment (categorized as non-surgery, local therapy, surgical resection, or liver transplantation), radiotherapy, chemotherapy, annual household income, and residential location (based on population size). To ensure comparability across different editions of the AJCC staging manual, T stage was grouped as T0-2 (localized) vs. T3-4 (locally advanced), and N stage as N0 vs. N1.
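The stage grouping described above can be sketched in a few lines of Python. This is only an illustration: the function names and the assumption that stages arrive as strings such as "T3a" or "M1" are hypothetical, and the record does not state how the SEER fields were actually coded.

```python
import re


def _stage_digit(stage: str, prefix: str, allowed: str) -> int:
    """Extract the numeric part of a stage such as 'T3a' (hypothetical coding)."""
    pattern = rf"{prefix}([{allowed}])[a-c]?"
    match = re.fullmatch(pattern, stage.strip(), flags=re.IGNORECASE)
    if match is None:
        # Unstaged values (for example 'TX') are rejected here, in line with
        # the exclusion of patients with incomplete TNM staging data.
        raise ValueError(f"unrecognized {prefix} stage: {stage!r}")
    return int(match.group(1))


def group_t_stage(t_stage: str) -> str:
    """Collapse T stage into T0-2 (localized) vs. T3-4 (locally advanced)."""
    return "T0-2" if _stage_digit(t_stage, "T", "0-4") <= 2 else "T3-4"


def group_n_stage(n_stage: str) -> str:
    """Collapse N stage into N0 vs. N1."""
    return "N0" if _stage_digit(n_stage, "N", "01") == 0 else "N1"


def metastasis_label(m_stage: str) -> int:
    """Dichotomize the outcome: 0 for M0 (no metastasis), 1 for M1."""
    return _stage_digit(m_stage, "M", "01")
```

Grouping at this coarse level is what lets cases staged under different AJCC editions share one coding, since substage letters and edition-specific subdivisions are discarded.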

Statistical and Machine Learning Analysis:

Univariate and multivariable logistic regression analyses were performed to identify factors independently associated with distant metastasis, stratified by sex.
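As a rough illustration of the sex-stratified univariate step, the sketch below computes an unadjusted odds ratio per sex for one binary predictor. For a single binary predictor, the odds ratio from a univariate logistic regression equals the cross-product ratio of the 2x2 table, so no model fitting is needed here. The record layout (keys "sex" and "m1") and the function name are assumptions for illustration; the multivariable analysis is not reproduced.

```python
from collections import Counter


def odds_ratio_by_sex(records, predictor):
    """Return {sex: unadjusted odds ratio of M1 for predictor present vs. absent}.

    Each record is a dict with hypothetical keys: 'sex', 'm1' (0/1 outcome),
    and the named binary predictor (0/1).
    """
    tables = {}
    for record in records:
        table = tables.setdefault(record["sex"], Counter())
        table[(bool(record[predictor]), bool(record["m1"]))] += 1

    result = {}
    for sex, table in tables.items():
        exposed_m1 = table[(True, True)]
        exposed_m0 = table[(True, False)]
        unexposed_m1 = table[(False, True)]
        unexposed_m0 = table[(False, False)]
        denominator = exposed_m0 * unexposed_m1
        # The ratio is undefined when a denominator cell is empty.
        result[sex] = (
            (exposed_m1 * unexposed_m0) / denominator if denominator else None
        )
    return result
```

Comparing the per-sex ratios for the same predictor gives a first, unadjusted view of whether an association differs between men and women.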

For predictive modeling, the dataset was randomly split into a training set (80%) and an internal testing set (20%). Eight machine learning algorithms were trained and compared: Logistic Regression, Random Forest, XGBoost, LightGBM, AdaBoost, Decision Tree, Gradient Boosting Decision Tree (GBDT), and Multilayer Perceptron. Model hyperparameters were optimized using 10-fold cross-validation on the training set. The final models were evaluated on the held-out testing set. Model performance was assessed using the Area Under the Receiver Operating Characteristic Curve (AUC), accuracy, sensitivity, specificity, F1 score, calibration curves, and Decision Curve Analysis (DCA). The interpretability of the best-performing model was enhanced using Shapley Additive Explanations (SHAP).
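Several of the evaluation measures listed above follow directly from predicted probabilities and true labels. The sketch below uses plain Python and standard textbook definitions; the function names and the default 0.5 classification threshold are assumptions, since the record does not say which threshold or software was used.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN when predicting M1 for probabilities >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, and F1 at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0,
    }


def auc(y_true, y_prob):
    """AUC as the probability that a random M1 case outranks a random M0 case.

    Ties count as half. Assumes both classes are present. This pairwise form
    is quadratic in sample size and is meant for illustration only.
    """
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    return wins / (len(positives) * len(negatives))


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at a threshold probability strictly below 1."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * (threshold / (1 - threshold))
```

Plotting `net_benefit` over a range of thresholds, alongside the "treat all" and "treat none" strategies, gives the decision curve; calibration curves and SHAP values require the fitted model itself and are not sketched here.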

Enrollment

19,019 patients

Sex

All

Ages

18 to 100 years old

Volunteers

No Healthy Volunteers

Inclusion criteria

  • Pathologically confirmed diagnosis of Hepatocellular Carcinoma (HCC).
  • Diagnosis year between 2004 and 2022, inclusive.
  • Case identified within the 22 registries of the Surveillance, Epidemiology, and End Results (SEER) database.

Exclusion criteria

  • Missing information on race, marital status, tumor grade, or surgical status.
  • Non-first primary malignancy or presence of multiple primary tumors.
  • Incomplete TNM staging data.
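The inclusion and exclusion criteria above amount to a record-level filter. A minimal sketch follows, assuming each case is a Python dict; every field name, and the "Unknown" sentinel for missing values, is hypothetical and does not reflect actual SEER variable names.

```python
# Fields that must be present for a case to be retained (hypothetical names).
REQUIRED_FIELDS = (
    "race",
    "marital_status",
    "tumor_grade",
    "surgical_status",
    "t_stage",
    "n_stage",
    "m_stage",
)

# Values treated as missing; the real coding of missing data is an assumption.
MISSING_VALUES = (None, "", "Unknown")


def is_eligible(record: dict) -> bool:
    """Apply the listed inclusion and exclusion criteria to one case record."""
    if not record.get("pathologically_confirmed_hcc", False):
        return False
    if not 2004 <= record.get("diagnosis_year", 0) <= 2022:
        return False
    # Excludes non-first primaries and cases with multiple primary tumors.
    if not record.get("first_primary_only", False):
        return False
    return all(record.get(name) not in MISSING_VALUES for name in REQUIRED_FIELDS)
```

Applied as `[r for r in cases if is_eligible(r)]`, a filter of this shape would yield the analysis cohort; the record reports 19,019 patients after these criteria.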

Trial design

19,019 participants in 1 patient group

Male and Female HCC Patients
Description:
Male cohort (n=14,575): male patients diagnosed with hepatocellular carcinoma (HCC) between 2004 and 2022, identified from the SEER database. This group was analyzed separately to identify sex-specific determinants and build a prediction model for distant metastasis.

Female cohort (n=4,444): female patients diagnosed with hepatocellular carcinoma (HCC) between 2004 and 2022, identified from the SEER database. This group was analyzed separately to identify sex-specific determinants and build a prediction model for distant metastasis.

Data sourced from clinicaltrials.gov
