SOAR-ML: Synthetic Optimization and Augmentation for Robust Machine Learning in Oral Cancer Prediction

Abstract:

Oral squamous cell carcinoma (OSCC) remains a significant healthcare concern because it is often detected late and its treatment outcomes are complex. Traditional diagnostic methods are essential, but their predictive accuracy is limited. Artificial intelligence (AI) and machine learning (ML) offer a way to improve OSCC diagnosis and care. However, data scarcity and bias in ML models underscore the need for robust, reliable algorithms. In this study, we combine synthetic data augmentation with state-of-the-art ML techniques in a novel approach to OSCC identification. By integrating CatBoost, XGBoost, LightGBM, Random Forest, and deep neural networks, we aim to maximize predictive accuracy. The method addresses shortcomings of existing models, particularly their handling of limited positive-case samples. The approach is intended to advance early OSCC detection and to support improved oral cancer screening.