A Novel Method for Voice Disorder Detection Using Feature Selection and Data Augmentation Techniques

Abstract:

This paper presents a novel approach to voice disorder detection. The study uses a combination of features: Mel-frequency cepstral coefficients (MFCCs) with their first and second derivatives, fundamental frequency, harmonic-to-noise ratio (HNR), energy, glottal features, and patient information such as age, gender, and status. Two feature selection methods were applied to identify the relevant features in this set: a combination of mutual information and sequential backward selection (SBS), and a combination of SBS and the Spearman test. The VOICED (VOice ICar fEDerico II) database was used for the experiments. Because the VOICED dataset is not balanced, an augmentation method was applied using the Saarbruecken Voice Database (SVD). The models were evaluated on both the unbalanced and the balanced datasets. The best results were achieved on the balanced dataset, with an accuracy of 86%.
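
As an illustration of the first selection strategy (mutual information combined with SBS), the following is a minimal sketch and not the implementation used in the study. It assumes scikit-learn, a synthetic feature matrix in place of the VOICED features, an SVM as the wrapped classifier, and the hypothetical parameters `k_filter` and `n_final`; none of these choices are stated in the abstract.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.feature_selection import (
    SelectKBest,
    SequentialFeatureSelector,
    mutual_info_classif,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def select_features(X, y, k_filter=10, n_final=5, seed=0):
    """Two-stage selection: mutual-information filter, then backward selection.

    k_filter and n_final are illustrative values, not taken from the paper.
    Returns the column indices (into X) of the retained features.
    """
    # Stage 1: keep the k_filter features with the highest mutual information
    # with the class label.
    score_func = partial(mutual_info_classif, random_state=seed)
    mi_filter = SelectKBest(score_func=score_func, k=k_filter).fit(X, y)
    kept = mi_filter.get_support(indices=True)

    # Stage 2: sequential backward selection, removing one feature at a time
    # based on cross-validated accuracy of the wrapped classifier
    # (an SVM here, chosen only as an assumption).
    classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    sbs = SequentialFeatureSelector(
        classifier,
        n_features_to_select=n_final,
        direction="backward",
        cv=3,
    )
    sbs.fit(X[:, kept], y)

    # Map the SBS mask back to indices in the original feature matrix.
    return kept[sbs.get_support()]


# Synthetic stand-in for an acoustic feature matrix (not VOICED data).
X_demo, y_demo = make_classification(
    n_samples=120, n_features=15, n_informative=4, random_state=0
)
selected = select_features(X_demo, y_demo, k_filter=8, n_final=3)
print("Selected feature indices:", selected)
```

The filter stage cheaply discards features that carry little information about the label, so that the more expensive wrapper stage only has to search a reduced set.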