This study predicts survival outcomes for glioblastoma patients using machine learning and deep learning techniques. Using data from the Surveillance, Epidemiology, and End Results (SEER) database, the researchers developed several machine learning models and a feed-forward deep neural network (DNN) for two tasks: multiclass classification and regression prediction of survival. The DNN consistently outperformed the other machine learning models on both tasks, reaching accuracies of 90.25% and 90.22% under the holdout and cross-validation sampling strategies, respectively. Age at diagnosis was the most influential feature in the survival predictions.

Glioblastomas are aggressive brain tumors with short median survival despite traditional treatments. Predicting survival matters for treatment planning, clinical decision-making, and reducing patient anxiety. Previous studies have used machine learning and statistical models for survival prediction; according to the authors, this study is the first to apply deep learning algorithms to the SEER database for glioblastoma patients. It also introduces clinically meaningful survival classes and improves model interpretability using SHapley Additive exPlanations (SHAP).
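The difference between the holdout and cross-validation sampling strategies mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`holdout_split`, `kfold_splits`, `majority_baseline_accuracy`), the 80/20 split, the choice of five folds, and the toy labels are all assumptions made for illustration, and a trivial majority-class baseline stands in for the DNN.

```python
import random
from collections import Counter


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices once and cut off a single held-out test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def majority_baseline_accuracy(labels, train_idx, test_idx):
    """Accuracy of a placeholder model that always predicts the most
    common class seen in the training indices."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    hits = sum(labels[i] == majority for i in test_idx)
    return hits / len(test_idx)


# Toy labels standing in for survival classes (hypothetical, not SEER data).
labels = ["short"] * 60 + ["medium"] * 30 + ["long"] * 10

# Holdout: one train/test split, one accuracy estimate.
train_idx, test_idx = holdout_split(len(labels), test_fraction=0.2)
holdout_acc = majority_baseline_accuracy(labels, train_idx, test_idx)

# Cross-validation: k splits, accuracy averaged over the folds.
fold_accs = [
    majority_baseline_accuracy(labels, tr, te)
    for tr, te in kfold_splits(len(labels), k=5)
]
cv_acc = sum(fold_accs) / len(fold_accs)

print(f"holdout accuracy: {holdout_acc:.3f}")
print(f"5-fold mean accuracy: {cv_acc:.3f}")
```

A holdout estimate depends on which samples happen to land in the single test set, whereas the cross-validated estimate averages over every sample being tested once. Reporting both, as the study does, gives a rough check that the result is not an artifact of one particular split.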
The results show significant associations between the features and predicted survival, with two exceptions. To address skew in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) and the Synthetic Minority Over-sampling Technique for Regression with Gaussian Noise (SMOGN) were applied. The classification models were evaluated with AUC plots and confusion matrices, and showed promising results under both the holdout and cross-validation sampling strategies.
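The core idea behind SMOTE can be sketched in a few lines: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. The code below is an illustrative, standard-library-only sketch and not the implementation used in the study. The function name `smote`, the parameters `n_new` and `k`, and the toy feature values are assumptions, and SMOGN (the regression variant, which adds Gaussian noise) is not covered.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` within the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy 2-D features for an under-represented class (hypothetical values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=3)

for point in new_points:
    print(tuple(round(v, 3) for v in point))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. In practice, oversampling of this kind is normally applied to the training portion only, so that the test data remain untouched by synthetic samples.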