Methodological Approaches to Forecasting Stock Market Dynamics Based on Text Mining

Abstract:

The development of the information society generates large volumes of information whose impact on socio-economic institutions researchers have yet to assess. Work is under way in many areas, one of which is text mining, the part of data mining that provides tools for analyzing unstructured data of varied content. The purpose of this study is to determine the influence of media messages on the dynamics of a stock index and, on the basis of the relations identified, to develop methodological approaches to forecasting price movements on the exchange. The study rests on the hypothesis of information markets, from which a number of research hypotheses are derived to test the relation between media reports and the dynamics of the stock index (RTS, intraday, 1-minute timeframe). The hypotheses are evaluated by the correlation coefficient, whose ranges of values are matched to levels of information efficiency. The Pearson correlation is computed between the sentiment of media reports and the exchange index (RTS). As an alternative, the correlation between an event index and the exchange index was also assessed; the event index is an integral evaluation of various event characteristics combined in a multiplicative model (using the author's database of events as a training sample for regression analysis to form the evaluation criteria for the target sample). The results showed that the dependence of exchange dynamics on media messages is determined somewhat more accurately with the estimated event characteristics than with the traditional sentiment-based approach, which is explained by the general-purpose nature of the lexicons used in assessing sentiment (the Liu Hu and VADER methods). The author's methodology made it possible to produce estimates for the target sample automatically using the corresponding software product (“Orange”).
The results obtained and the target sample, with appropriate modifications, can be used to address an important scientific problem: improving the predictability of both the stock market and the financial market as a whole.
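
As an illustration of the correlation criterion described above, the following minimal sketch computes the Pearson coefficient between a per-minute sentiment series and the corresponding index values. The function name `pearson` and both data series are hypothetical and serve only as an example; they are not taken from the author's database or from the RTS index, and the sketch does not reproduce the author's Orange workflow.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical one-minute observations (illustrative values only):
# a sentiment score per minute and the index level at the same minute.
sentiment = [0.10, -0.20, 0.35, 0.05, -0.40, 0.25]
index_level = [1150.0, 1149.2, 1150.9, 1150.4, 1148.7, 1150.1]

r = pearson(sentiment, index_level)
print(f"Pearson r = {r:.3f}")
```

The same function could be applied to the event index in place of the sentiment series, so that the two coefficients can be compared on the same observations.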