Author Name Disambiguation Using Predictive Models

Abstract:

Name ambiguity is one of the biggest challenges for knowledge management systems and digital libraries. It arises mainly from the lack of a universally accepted standard and from the polysemy of names, where the same name may refer to several different people. The problem is further complicated by the practice of publications and journals of listing authors by short names, i.e. the family name and the initial letter(s) of the given name(s). In this paper we present a machine learning-based method for identifying researchers with predictive models built on the publishing profile extracted from existing work, academic affiliation, research domain, and meta attributes such as email address, ORCID, or ResearcherID. Initial experimental results were obtained using CART and conditional inference tree methods.
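
To make the short-name problem concrete, the following minimal Python sketch shows how reducing given names to initials can collapse distinct authors into one ambiguous key. It is an illustration only and not the method presented in the paper: the function names `short_name` and `group_by_short_name`, the "Family I. J." output format, and the sample author names are all hypothetical.

```python
from collections import defaultdict


def short_name(given_names: str, family_name: str) -> str:
    """Build a short form: family name plus the initial of each given name.

    The 'Family I. J.' layout is an assumption; journals use several variants.
    """
    parts = given_names.replace("-", " ").split()
    initials = " ".join(part[0].upper() + "." for part in parts)
    return f"{family_name} {initials}".strip()


def group_by_short_name(authors):
    """Map each short name to the list of full names that produce it."""
    groups = defaultdict(list)
    for given, family in authors:
        groups[short_name(given, family)].append(f"{given} {family}")
    return dict(groups)


# Hypothetical sample data, invented for this illustration.
authors = [
    ("Maria", "Ionescu"),
    ("Mihai", "Ionescu"),
    ("Ana Maria", "Popescu"),
    ("Andrei Mihai", "Popescu"),
    ("Elena", "Radu"),
]

groups = group_by_short_name(authors)

# A short name is ambiguous when more than one full name maps to it.
ambiguous = {key: names for key, names in groups.items() if len(names) > 1}

for key, names in ambiguous.items():
    print(key, "->", names)
```

In this sample, two pairs of distinct authors share a short name, so the name string alone cannot separate them. Additional attributes such as affiliation, research domain, or email address are what a predictive model would need to tell them apart.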