Authorship Identification: A Hybrid Method Based on Stylistic and Statistical Analysis

Abstract:

This paper addresses the problem of identifying the author of a document of unknown origin. We introduce a hybrid method that combines stylistic analysis with statistical analysis. The proposed method draws on a large set of stylistic and statistical features to identify the document’s author, and these features serve as input to a machine learning process. Experiments on a corpus of English literature yielded promising results.