# Machine Learning Methods in Stock Price Prediction

This machine learning article was written by Yuxiao Yang, Financial Analyst at I Know First.

## Highlights:

• Machine Learning Methods help us find patterns from historical data and then apply them to predictions and algorithmic trading strategies.
• Major Machine Learning Methods in Stock Price Prediction can be divided into Traditional Machine Learning Methods such as regression methods, Deep Learning methods, Time Series Analysis methods, and Graph-Based methods.
• The I Know First AI algorithm provides us with the tool to select the most promising stocks.

## Overview of Machine Learning Methods in Stock Price Prediction

Stock prices depend on many interacting factors, which makes predicting them complex and difficult. Machine Learning Methods can help address this complexity: they allow computers to learn from historical stock data and recognize patterns in it.

Machine learning algorithms learn from training data drawn from historical information, in which known outcomes serve as the target variables. The fitted model is then tested against the target variables in a validation dataset to verify its accuracy. The goal is to create a model that can accurately predict the target variable when it is exposed to relevant predictive variables. Machine learning algorithms can find patterns in large volumes of data based on mathematical calculations over information such as prices and volatility, and those patterns can then be applied to algorithmic trading strategies.

Traditional methods mainly include linear regression, logistic regression, Random Forest (RF), Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbor (KNN). Regression analysis is a statistical tool for defining the relationship between a dependent variable and one or more independent variables. Historical stock data is chosen as the input and split into two sets, a training data set and a testing data set. The training data set is used to fit the model, that is, to estimate the unknown coefficients of the regression equation. The estimated coefficients are then used to predict the future price of a stock, and the predictions on the testing data set are compared with the actual prices to assess accuracy.
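The regression workflow described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production model: the price series is made up, the single predictor (today's close) and the 70/30 chronological split are assumptions, and `fit_simple_ols` is a hypothetical helper name.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 102.5, 103.0, 104.5, 105.0, 106.5, 107.0, 108.5, 109.0]

# Predictor: today's close. Target: the next day's close.
x_all = prices[:-1]
y_all = prices[1:]

# Chronological split (assumed 70/30) so the test set lies after the training set.
split = int(len(x_all) * 0.7)
x_train, y_train = x_all[:split], y_all[:split]
x_test, y_test = x_all[split:], y_all[split:]

# Estimate the coefficients on the training set only.
intercept, slope = fit_simple_ols(x_train, y_train)

# Predict on the test set and compare with the actual prices.
predicted = [intercept + slope * xi for xi in x_test]
mae = sum(abs(p - t) for p, t in zip(predicted, y_test)) / len(y_test)
print(f"intercept={intercept:.3f} slope={slope:.3f} test MAE={mae:.3f}")
```

Splitting by time rather than at random keeps future prices out of the training set, which matters for any price forecast.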

Random Forest Regression is a supervised learning algorithm that uses ensemble learning, combining the predictions of many decision trees. Support Vector Machines (SVM) are supervised learning models with associated learning algorithms that analyze data for classification and regression. Naive Bayes methods are based on applying Bayes' theorem with strong independence assumptions between the features. K-Nearest Neighbor (KNN) is a non-parametric supervised learning method that predicts from the most similar examples in the training data.
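As one concrete example of these methods, KNN regression is simple enough to write out directly. The sketch below assumes each sample is described by its two previous daily returns and that the target is the next day's return; the numbers and the `knn_predict` helper are hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training rows."""
    # Pair each Euclidean distance with its target, then sort nearest first.
    distances = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical features: the two previous daily returns (in percent).
train_X = [[0.5, 0.2], [0.4, 0.3], [-0.6, -0.1], [-0.5, -0.3], [0.1, 0.0]]
# Hypothetical targets: the next day's return (in percent).
train_y = [0.3, 0.2, -0.4, -0.2, 0.0]

forecast = knn_predict(train_X, train_y, query=[0.45, 0.25], k=2)
print(f"forecast next-day return: {forecast:.2f}%")
```

Because KNN is non-parametric, there are no coefficients to estimate; the training data itself is the model, and `k` is the main setting to tune.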

## Deep Learning Methods

Deep Learning Methods mainly include the Long Short-Term Memory network (LSTM) and the Convolutional Neural Network (CNN). LSTMs are a type of Recurrent Neural Network designed to learn long-term dependencies, and they are commonly used for processing and predicting time-series data. An LSTM model is built by first choosing a specific time step, the number of past observations in each input sequence. Hyperparameters such as the number of neurons, epochs, learning rate, batch size, and time step are then set for the model. Once the hyperparameters are tuned, the input data is fed into the LSTM model to predict the price of the stock.
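Before any LSTM can be trained, the price series has to be cut into fixed-length input sequences according to the chosen time step. The sketch below shows only that preparation step, with no network attached; `make_windows` is a hypothetical helper and the prices are illustrative.

```python
def make_windows(series, time_step):
    """Turn a series into (inputs, targets) for sequence models.

    Each input holds `time_step` consecutive values and its target is the
    value that immediately follows.
    """
    inputs, targets = [], []
    for start in range(len(series) - time_step):
        inputs.append(series[start:start + time_step])
        targets.append(series[start + time_step])
    return inputs, targets


# Hypothetical closing prices, for illustration only.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

# With a time step of 3, each sample is three days of history.
X, y = make_windows(prices, time_step=3)
for window, target in zip(X, y):
    print(window, "->", target)
```

A longer time step gives the model more history per sample but leaves fewer samples to train on, which is one reason it is treated as a hyperparameter.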

A Convolutional Neural Network is a feed-forward neural network. Like a traditional neural network, it has an input layer, hidden layers, and an output layer. In addition, the input to each convolution layer is the output of the previous convolution or pooling layer.
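The chaining of convolution and pooling layers can be illustrated on a one-dimensional price series. This is a bare sketch with hand-picked kernels rather than learned ones, and without activations or biases; `conv1d` and `max_pool1d` are hypothetical helpers, not a library API.

```python
def conv1d(signal, kernel):
    """Valid 1-D cross-correlation, the operation CNN libraries call convolution."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


def max_pool1d(signal, size):
    """Non-overlapping max pooling; trailing values that do not fill a window are dropped."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]


# Hypothetical price series, for illustration only.
prices = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]

# First convolution: the kernel [-1, 1] extracts day-over-day changes.
layer1 = conv1d(prices, [-1.0, 1.0])

# Pooling: keep the largest change in each pair of days.
pooled = max_pool1d(layer1, size=2)

# Second convolution takes the pooling output as its input.
layer2 = conv1d(pooled, [-1.0, 1.0])

print(layer1, pooled, layer2)
```

In a trained network the kernel values are learned from data, but the flow of outputs from one layer into the next is the same.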

## Time Series Analysis Methods and Graph-Based Methods

Time Series Analysis Methods forecast and project discrete-time data; they mainly include ARIMA, the Generalized Additive Model (GAM), and time series supervised learning methods. Graph-Based Methods focus on forming relationships among nodes based on correlation and causation, which is useful for uncovering previously hidden insights.
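One simple way to form a graph from correlations is to treat each stock as a node and connect two stocks when their returns are strongly correlated. The sketch below assumes Pearson correlation and a cutoff of 0.8; the tickers, returns, and helper names are all hypothetical, and it does not address causation.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_graph(returns, threshold=0.8):
    """Return {(node_a, node_b): correlation} for pairs with |correlation| >= threshold."""
    edges = {}
    for a, b in combinations(sorted(returns), 2):
        r = pearson(returns[a], returns[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical daily returns for three made-up tickers.
returns = {
    "AAA": [0.01, -0.02, 0.03, 0.00],
    "BBB": [0.02, -0.04, 0.06, 0.00],
    "CCC": [0.01, 0.01, -0.01, -0.01],
}

edges = correlation_graph(returns)
for (a, b), r in edges.items():
    print(f"{a} -- {b}: r={r:.2f}")
```

The resulting edge list can then be passed to whatever graph method is used downstream; the threshold controls how dense the graph becomes.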

## Stock Picking with the I Know First AI Algorithm

The I Know First predictive algorithm is a successful attempt to discover the rules of the market that enable us to make accurate stock market forecasts. Taking advantage of artificial intelligence and machine learning, and using insights from chaos theory and self-similarity (fractals), the algorithmic system is able to predict the behavior of over 10,500 markets. The key principle of the algorithm lies in the fact that a stock's price is a function of many factors interacting non-linearly. Therefore, it is advantageous to use elements of artificial neural networks and genetic algorithms.

How does it work? First, the inputs are analyzed and ranked according to their significance in predicting the target stock price. Then multiple models are created and tested using 15 years of historical data. Only the best-performing models are kept, while the rest are rejected. Models are refined every day as new data becomes available. As the algorithm is purely empirical and self-learning, there is no human bias in the models, and the market forecast system adapts to the new reality every day while still following general historical rules.
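The general idea of creating several candidate models, testing them on historical data, and keeping only the best performers can be shown with a toy example. This is not the I Know First algorithm, whose details are not described here; the candidates are plain moving-average forecasters, and the data, window lengths, and function names are all assumptions made for illustration.

```python
def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` values."""
    return sum(history[-window:]) / window


def backtest_error(series, window):
    """Mean absolute error of one-step-ahead forecasts over the series."""
    errors = []
    for t in range(window, len(series)):
        forecast = moving_average_forecast(series[:t], window)
        errors.append(abs(forecast - series[t]))
    return sum(errors) / len(errors)


def keep_best(series, windows, keep=2):
    """Score every candidate window on history and keep the lowest-error ones."""
    scored = sorted((backtest_error(series, w), w) for w in windows)
    return [w for _, w in scored[:keep]]


# Hypothetical historical prices, for illustration only.
history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Candidate models: moving averages over 1, 2, and 3 days.
survivors = keep_best(history, windows=[1, 2, 3], keep=2)
print("kept windows:", survivors)
```

Re-running the selection whenever new data arrives mirrors the daily refinement described above, in a much simplified form.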

I Know First has used algorithmic outputs to provide an investment strategy for institutional investors. Below you can see the investment result of our S&P 100 Stocks package, which was recommended to our clients for the period from November 24th, 2019 to July 31st, 2022 (you can access our forecast packages here).