Abstract:
This paper studies the efficiency of investment strategies generated by a machine learning algorithm for the 50 largest US companies. Random Forest, a representative of algorithms based on decision trees, has many features that are advantageous for time series analysis, in particular for regression and classification problems. In this study, Random Forest was applied to produce a signal to buy, sell or hold a specific stock over a period of one month. The decisions are based on classification and take into account multiple macroeconomic and market variables. The problem was analyzed for three classes, which correspond to investors' expectations regarding forecast price changes. The focus is on the accuracy measure, used to verify how the algorithm performs, namely how often it correctly predicts the outcome among the three investment decisions: buy, hold or sell. The Matthews correlation coefficient, which is derived from the confusion matrix, was used as a synthetic measure of forecast quality. As expected, the results support the thesis that the market is becoming increasingly efficient. Moreover, contrary to common expectations, even a large data set combined with complex machine learning tools does not lead to satisfactory results in practice.
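
To make the two evaluation measures concrete, the following minimal sketch computes accuracy and the multiclass Matthews correlation coefficient from a three-class confusion matrix. It is an illustration only, not the implementation used in the study: the function names, the label strings and the toy signals are assumptions, and the coefficient follows the standard multiclass generalization of the Matthews formula.

```python
from math import sqrt

# Hypothetical label set for the three investment decisions.
LABELS = ["buy", "hold", "sell"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def accuracy(matrix):
    """Share of observations on the main diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def multiclass_mcc(matrix):
    """Multiclass Matthews correlation coefficient from a confusion matrix."""
    k = len(matrix)
    s = sum(sum(row) for row in matrix)           # all observations
    c = sum(matrix[i][i] for i in range(k))       # correct predictions
    t = [sum(matrix[i]) for i in range(k)]        # true count per class
    p = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # predicted count per class
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    # Convention: return 0.0 when the coefficient is undefined,
    # e.g. when only one class is ever predicted.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Made-up monthly signals, for illustration only.
    actual = ["buy", "hold", "sell", "buy", "hold", "sell"]
    always_hold = ["hold"] * len(actual)
    matrix = confusion_matrix(actual, always_hold)
    print("accuracy:", accuracy(matrix))
    print("MCC:", multiclass_mcc(matrix))
```

In this toy case a classifier that always answers "hold" is right one time in three yet receives a coefficient of zero, which shows why a measure built on the full confusion matrix complements accuracy.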