A data mining study of predictive models among Stack Overflow developers: what makes them earn more?

Abstract:

Knowledge of the latest methods and techniques in programming, software development and data science is shared through strong communities such as Stack Overflow and Kaggle, which bring together people from all over the world and answer questions from both professional and enthusiast members. This study uses the 2018 Stack Overflow Developer Survey, which contains responses from people all over the world regarding their profession, background, job satisfaction and experience in the field. An issue of general interest is the level of salaries in the IT industry, which largely depends on specialization and experience. In this paper we therefore investigate the drivers that make some people earn more than others by building predictive models that classify developers according to their salary, using a Classification Tree and Logistic Regression. Preliminary results show that most developers code as a hobby, which suggests that they are passionate about programming. The main findings reveal that developers who are 35 years or older, who are highly satisfied with their careers and who are no longer students are the most likely to earn more. These results might also be explained by experience growing over the years, as people accumulate more knowledge with age and may benefit from a senior position in a company. Finally, we use the Area Under the Curve (AUC) performance metric to compare the classifiers.
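
As an illustration of the comparison metric mentioned above, the following minimal sketch computes the AUC in plain Python, assuming the curve in question is the ROC curve. The function name `roc_auc`, the labels and both score lists are hypothetical, made-up values for illustration only; they are not data or results from this study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive example receives a higher score than a randomly chosen
    negative one (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = higher salary class, 0 = lower salary class.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities from two classifiers (not study output).
tree_scores = [0.8, 0.4, 0.8, 0.2, 0.4, 0.2, 0.8, 0.2]
logit_scores = [0.91, 0.35, 0.62, 0.48, 0.51, 0.12, 0.77, 0.30]

auc_tree = roc_auc(labels, tree_scores)
auc_logit = roc_auc(labels, logit_scores)
print(f"Classification Tree AUC: {auc_tree:.4f}")
print(f"Logistic Regression AUC: {auc_logit:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two salary classes, so the classifier with the higher value ranks higher earners above lower earners more consistently.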