Abstract:
Breast cancer is one of the world's most notorious cancer types among women, and together with the other cancer types it forms the world's second most lethal disease for humans [1]. Much research effort has been put into predicting and even preventing it. While the field of cancer prediction has benefited greatly from machine learning over the past decades, with many research papers addressing the problem, there is still room for improvement. This paper aims to review some of the existing research in the field of breast cancer, gather a set of useful methods and processes, and finally apply them to a publicly available breast cancer dataset, with a focus on understanding how different features, or their absence, can influence the outcome of a prediction.