About
This study looked at whether male and female patients with liver cancer (hepatocellular carcinoma, HCC) have different risks of the cancer spreading to distant parts of the body (distant metastasis). Liver cancer is much more common in men than in women, and women often have better survival rates. However, it was unclear whether the factors that predict this spread are the same for both sexes.
To answer this question, researchers analyzed information from a large, national cancer database (SEER) from 2004 to 2022, including 19,019 patients diagnosed with liver cancer. They studied factors like age, race, tumor stage, treatment received, and where patients lived. The team used advanced computer models (machine learning) to build separate prediction tools for men and women to estimate their risk of distant metastasis at the time of diagnosis.
Full description
Study Design and Objective:
This was a retrospective, population-based cohort study utilizing data from the Surveillance, Epidemiology, and End Results (SEER) database. The primary objective was to systematically compare the incidence and identify sex-specific determinants of distant metastasis in patients with hepatocellular carcinoma (HCC). A secondary objective was to develop and validate separate, high-performance machine learning (ML) prediction models for distant metastasis risk tailored to male and female patients.
Data Source and Participants:
Data were extracted from 22 SEER registries covering patients diagnosed with HCC between 2004 and 2022. Inclusion required a pathologically confirmed diagnosis of HCC. Key exclusion criteria were: missing data on race, marital status, tumor grade, or surgical status; non-first primary malignancy; and incomplete TNM staging data. After applying these criteria, 19,019 patients were included in the final analysis (14,575 males, 4,444 females).
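The exclusion criteria listed above could be applied with a filter along the following lines. This is a minimal sketch, not the study's actual code: the field names (race, marital_status, grade, surgery, t_stage, n_stage, m_stage, first_primary, pathologically_confirmed) are hypothetical and would need to be mapped to the real SEER export columns.

```python
# Hypothetical field names; the real SEER export columns may differ.
REQUIRED_FIELDS = (
    "race", "marital_status", "grade", "surgery",
    "t_stage", "n_stage", "m_stage",
)


def is_eligible(record):
    """Return True if a record passes the criteria described in the text."""
    # Exclude records missing race, marital status, grade, surgical status,
    # or any TNM component (None or empty string counts as missing).
    if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return False
    # Exclude tumors that are not the patient's first primary malignancy.
    if not record.get("first_primary", False):
        return False
    # Require a pathologically confirmed HCC diagnosis.
    return bool(record.get("pathologically_confirmed", False))


def apply_criteria(records):
    """Keep only the eligible records, preserving their order."""
    return [record for record in records if is_eligible(record)]
```

Keeping the eligibility check in one function makes it easy to count how many records each criterion removes when building a cohort flow diagram.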
Variables and Definitions:
The outcome variable was distant metastasis status at diagnosis, dichotomized as M0 (no distant metastasis) or M1 (distant metastasis) based on consistently applied American Joint Committee on Cancer (AJCC) criteria. Predictor variables were age, sex, race, tumor grade, marital status, surgical treatment (categorized as no surgery, local therapy, surgical resection, or liver transplantation), radiotherapy, chemotherapy, annual household income, and residential location (classified by population size). To ensure comparability across editions of the AJCC staging manual, T stage was grouped as T0-2 (localized) vs. T3-4 (locally advanced), and N stage as N0 vs. N1.
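The stage grouping described above can be illustrated with a few small helpers. This is a sketch under assumptions: the input labels are assumed to look like "T2", "T3a", "N0", or "M1", which may not match how the registry actually codes them, and the function names are made up for illustration.

```python
def group_t_stage(t_stage):
    """Collapse a T label such as 'T2' or 'T3a' into 'T0-2' or 'T3-4'."""
    label = t_stage.strip().upper()
    if len(label) < 2 or label[0] != "T" or not label[1].isdigit():
        raise ValueError(f"unrecognized T stage: {t_stage!r}")
    return "T0-2" if int(label[1]) <= 2 else "T3-4"


def group_n_stage(n_stage):
    """Map an N label to 'N0' or 'N1'."""
    label = n_stage.strip().upper()
    if label.startswith("N0"):
        return "N0"
    if label.startswith("N1"):
        return "N1"
    raise ValueError(f"unrecognized N stage: {n_stage!r}")


def encode_outcome(m_stage):
    """Return 1 for M1 (distant metastasis) and 0 for M0."""
    label = m_stage.strip().upper()
    if label.startswith("M1"):
        return 1
    if label.startswith("M0"):
        return 0
    raise ValueError(f"unrecognized M stage: {m_stage!r}")
```

Unknown labels such as "TX" raise an error rather than being silently assigned to a group, which matches the study's exclusion of records with incomplete TNM staging.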
Statistical and Machine Learning Analysis:
Univariate and multivariable logistic regression analyses were performed to identify factors independently associated with distant metastasis, stratified by sex.
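To make the sex-stratified analysis concrete, the sketch below computes a crude odds ratio for a single binary predictor within each sex. For one binary predictor, this cross-product odds ratio equals the odds ratio from a univariate logistic regression; the 95% interval uses the standard log-odds (Woolf) approximation. It does not reproduce the study's multivariable models, and the record keys ("sex", "m1") and the function name are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def odds_ratio_by_sex(records, predictor):
    """Crude odds ratio of M1 for a binary predictor, separately per sex.

    Returns {sex: (odds_ratio, ci_low, ci_high)}, or None for a stratum
    where any cell of the 2x2 table is empty.
    """
    # counts[sex] = [a, b, c, d]:
    # exposed M1, exposed M0, unexposed M1, unexposed M0
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for record in records:
        index = (0 if record[predictor] else 2) + (0 if record["m1"] else 1)
        counts[record["sex"]][index] += 1

    results = {}
    for sex, (a, b, c, d) in counts.items():
        if 0 in (a, b, c, d):
            results[sex] = None  # undefined without a continuity correction
            continue
        odds_ratio = (a * d) / (b * c)
        # Standard error of log(OR) under the Woolf approximation.
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        low = math.exp(math.log(odds_ratio) - 1.96 * se)
        high = math.exp(math.log(odds_ratio) + 1.96 * se)
        results[sex] = (odds_ratio, low, high)
    return results
```

Comparing the per-sex intervals for the same predictor gives a first, unadjusted impression of whether an association differs between men and women; an adjusted comparison requires fitting the multivariable model in each stratum.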
For predictive modeling, the dataset was randomly split into a training set (80%) and an internal testing set (20%). Eight machine learning algorithms were trained and compared: Logistic Regression, Random Forest, XGBoost, LightGBM, AdaBoost, Decision Tree, Gradient Boosting Decision Tree (GBDT), and Multilayer Perceptron. Model hyperparameters were optimized using 10-fold cross-validation on the training set. The final models were evaluated on the held-out testing set. Model performance was assessed using the Area Under the Receiver Operating Characteristic Curve (AUC), accuracy, sensitivity, specificity, F1 score, calibration curves, and Decision Curve Analysis (DCA). Shapley Additive Explanations (SHAP) were used to interpret the best-performing model.
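The 80/20 split and several of the reported metrics can be sketched in plain Python as follows. This is an illustrative sketch only: the study's software, random seed, and classification threshold are not stated, so the function names and the default seed here are assumptions. AUC is computed through its rank interpretation, the probability that a randomly chosen M1 case receives a higher score than a randomly chosen M0 case.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into (train, test); the seed is an assumption."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise AUC is quadratic in the number of cases, which is fine for a demonstration but slow on a full test set; a sorted-rank implementation or an established library routine would be the usual choice in practice.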
19,019 participants in 1 patient group
Data sourced from clinicaltrials.gov