Abstract:
This paper studies unemployment patterns in Portugal at the municipality level using machine learning techniques. The study followed the CRISP-DM methodology, covering its first five phases: business understanding, data understanding, data preparation, modeling, and evaluation. The data cover unemployed people living in municipalities of mainland Portugal in February and August 2024 and were collected from the Instituto do Emprego e Formação Profissional, Portugal’s national public employment service. The variables included gender, age category, educational level, duration of registration with employment centers, and job search status. After data cleaning and preparation, K-means clustering was used to identify groups of municipalities with similar unemployment rates. The findings revealed distinct seasonal patterns, with considerable disparities between winter (February) and summer (August). In February, the unemployed were primarily from the younger age groups, whereas in August unemployment was concentrated among adults. Regional differences were also found: coastal areas had a higher concentration of unemployed people in February, whereas urban areas had higher unemployment rates in August.
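To illustrate the clustering step, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not the study's implementation. The municipality names, the two features (share of unemployed under 25 and share registered for more than 12 months), their values, and the choice of k = 2 are all hypothetical, since the abstract does not state the feature encoding, scaling, or number of clusters used.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct points drawn from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


# Hypothetical municipalities with invented feature values:
# (share of unemployed under 25, share registered for more than 12 months).
names = ["A", "B", "C", "D", "E", "F"]
features = [
    (0.30, 0.20), (0.28, 0.22), (0.32, 0.18),
    (0.10, 0.55), (0.12, 0.50), (0.08, 0.60),
]

centroids, labels = kmeans(features, k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

With real data, the features would typically be standardized before clustering so that no single variable dominates the distance, and the same procedure would be run separately on the February and August snapshots to compare the resulting groups.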