Machine Learning Stock Sector Risk vs Classical Risk Sector Measures

This article “Machine learning stock sector risk vs classical risk sector measures” was written by Sergey Okun – Senior Financial Analyst at I Know First, Ph.D. in Economics.



  • The rapid development and adoption of AI algorithms in investing requires reconsidering the concept of risk.
  • The effectiveness of ML training and subsequent forecasting depends on the quality of the training data.
  • From a machine learning perspective, XLE is the least risky sector for effective learning and forecasting, despite its high volatility and high required risk premium.

Classical Aspects of Risk

The stock market, a bustling hub of buying and selling securities, holds undeniable allure for investors seeking to grow their wealth. However, behind the potential for handsome returns, there exists an intricate web of risks. In the context of the stock market, risk signifies the likelihood of financial losses or adverse outcomes incurred by investors as they participate in the buying and selling of stocks and other equity-related securities. Stocks represent ownership in companies, and their values fluctuate based on a myriad of factors, making risk an intrinsic aspect of stock market investing. The main known types of risks are market risk (systematic risk), company-specific risk (unsystematic risk), liquidity risk, volatility risk, regulatory and political risk, earnings, and financial risk.

Classically, stock risk refers to the uncertainty or potential for loss associated with investing in individual stocks or the stock market as a whole. The main measures of stock risk are volatility and the Beta coefficient. Volatility refers to the degree of variation or fluctuation in the price over time. It measures the extent to which the price of the asset moves up and down, indicating the level of risk or uncertainty associated with that asset. Beta is a measure of a stock’s sensitivity to market movements. According to portfolio theory, an investor requires compensation only for bearing risk that cannot be diversified away (systematic risk). A stock with a beta of 1.0 is expected to move in line with the overall market, usually represented by an index like the S&P 500. Stocks with betas greater than 1.0 are expected to move more than the market, while those with betas less than 1.0 are expected to move less. The Beta coefficient enables us to estimate a risk premium. Moreover, in the CAPM, Beta is the only parameter that explains differences in risk premiums among stocks.
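The two classical measures described above, and the CAPM relation that links Beta to a required return, can be sketched in a few lines. This is only an illustration: the 252-day annualization factor, the function names, and the synthetic return series are assumptions, not the data or code used for this article.

```python
import numpy as np

TRADING_DAYS = 252  # assumed number of trading days per year


def annual_volatility(returns):
    """Annualized standard deviation of daily returns."""
    returns = np.asarray(returns, dtype=float)
    return returns.std(ddof=1) * np.sqrt(TRADING_DAYS)


def beta(asset_returns, market_returns):
    """Beta = Cov(asset, market) / Var(market)."""
    asset = np.asarray(asset_returns, dtype=float)
    market = np.asarray(market_returns, dtype=float)
    cov = np.cov(asset, market, ddof=1)
    return cov[0, 1] / cov[1, 1]


def capm_expected_return(beta_value, risk_free_rate, market_return):
    """CAPM: E[R_i] = R_f + beta_i * (E[R_m] - R_f)."""
    return risk_free_rate + beta_value * (market_return - risk_free_rate)


# Synthetic daily returns (hypothetical, for illustration only):
# the asset is built to move about 1.5 times as much as the market.
rng = np.random.default_rng(0)
market = rng.normal(0.0004, 0.01, size=1024)
asset = 1.5 * market + rng.normal(0.0, 0.005, size=1024)

print("annual volatility:", annual_volatility(asset))
print("beta:", beta(asset, market))
print("CAPM required return:", capm_expected_return(beta(asset, market), 0.02, 0.08))
```

In this sketch the risk premium is the term beta_i * (E[R_m] - R_f), so two assets with the same Beta receive the same premium regardless of how noisy their own return series are.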

White Noise and Memory in the Stock Market

The stock market, a dynamic and ever-evolving financial ecosystem, has long captured the imagination of investors and traders seeking to unlock its potential for profit. In recent years, the integration of machine learning and artificial intelligence (AI) has emerged as a game-changer, providing sophisticated tools to analyze data, identify patterns, and make more informed investment decisions. Machine learning focuses on the development of algorithms and models enabling computers to learn from and make predictions or decisions based on data. However, the effectiveness of training and subsequent forecasting depends on the quality of the training material. If the training material consists of white noise, the effectiveness of the training will be negligible. White noise refers to a sequence of random numbers that cannot be predicted. On the other hand, if the training data includes information about the probabilistic development of events in addition to white noise, the algorithm can use this information to draw conclusions and make predictions based on it.
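The point about training on white noise can be illustrated with a toy experiment. The sketch below fits the simplest possible predictor (today's value from yesterday's value) on pure white noise and on a series with built-in dependence, then scores both out of sample. The AR(1) process, its coefficient of 0.6, and all names are assumptions chosen for illustration; they are not a model of real stock returns.

```python
import numpy as np


def lag1_out_of_sample_r2(series, train_fraction=0.5):
    """Fit x[t] ~ a + b * x[t-1] on the first part, return R^2 on the rest."""
    x = np.asarray(series, dtype=float)
    prev, curr = x[:-1], x[1:]
    split = int(len(prev) * train_fraction)
    b, a = np.polyfit(prev[:split], curr[:split], deg=1)  # slope, intercept
    pred = a + b * prev[split:]
    actual = curr[split:]
    ss_res = np.sum((actual - pred) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(42)
n = 4000
noise = rng.normal(size=n)  # white noise: nothing to learn

# Series with memory: each value depends partly on the previous one.
phi = 0.6
structured = np.empty(n)
structured[0] = noise[0]
for t in range(1, n):
    structured[t] = phi * structured[t - 1] + noise[t]

r2_noise = lag1_out_of_sample_r2(noise)
r2_structured = lag1_out_of_sample_r2(structured)
print("out-of-sample R^2 on white noise:", r2_noise)
print("out-of-sample R^2 on structured series:", r2_structured)
```

On white noise the out-of-sample R^2 stays close to zero, while on the series with memory the same predictor explains a meaningful share of the variance.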

The classic financial theory, based on the Efficient Market Hypothesis (EMH), states that stock prices reflect all available information. In its classical random-walk form, this is usually taken to mean that stock returns are independent and approximately follow a normal (Gaussian) distribution, with the familiar bell-shaped curve. Therefore, this hypothesis denies that the market has a memory and, moreover, that sustained profitable investment strategies can be built on analyzing past trends. So, according to the EMH, today’s stock price already incorporates all available information, and we cannot extract additional information from past stock data to make a useful price prediction, because everything we could find is already priced in.

However, chaos theory offers an alternative to the EMH, the Fractal Market Hypothesis (here we discuss only those aspects of chaos theory that are used to analyze financial markets). A fractal is an attractor (limit set) of a generating rule (information process). It exhibits a kind of self-similarity in which smaller parts resemble the whole. An attractor is a structure in which infinite possibilities are contained within a finite range. The Hurst exponent (H) enables us to estimate the fractal nature of time series data. There are three possible cases to consider:

  • If 0 ≤ H < 0.5, then a time series is characterized as mean-reverting.
  • If H = 0.5, then there is a completely uncorrelated time series with normal distribution. Stock returns are random, which is an example of Brownian motion.
  • If 0.5 < H ≤ 1, then a time series has positive autocorrelation and clusters in one direction (there is a persistent component). The closer the H value is to 1, the more predictable a time series is.

The persistent time series defined as 0.5<H≤1 is a fractal because it can be described as a generalized Brownian motion. In generalized Brownian motion, there is a correlation between events on the time scale.
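One common way to estimate H is classical rescaled-range (R/S) analysis: split the series into windows, compute the range of cumulative deviations divided by the standard deviation in each window, and read H off the slope of log(R/S) against log(window size). The sketch below is a minimal version under stated assumptions. The article does not say which estimator was used, the window scheme (powers of two, from 8 up to a quarter of the sample) is a choice made here, and the test series are synthetic fractional Gaussian noise generated by Cholesky factorization rather than market data. The R/S estimate is known to be biased in small samples, so for uncorrelated data it typically lands somewhat above 0.5.

```python
import numpy as np


def hurst_rs(returns, min_window=8):
    """Estimate the Hurst exponent with classical rescaled-range analysis."""
    x = np.asarray(returns, dtype=float)
    n = len(x)
    window_sizes, rs_means = [], []
    size = min_window
    while size <= n // 4:
        rs_values = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            cumulative = np.cumsum(chunk - chunk.mean())
            value_range = cumulative.max() - cumulative.min()
            spread = chunk.std(ddof=0)
            if spread > 0:
                rs_values.append(value_range / spread)
        if rs_values:
            window_sizes.append(size)
            rs_means.append(np.mean(rs_values))
        size *= 2
    # Slope of log(R/S) versus log(window size) is the estimate of H.
    slope, _ = np.polyfit(np.log(window_sizes), np.log(rs_means), deg=1)
    return float(slope)


def fractional_gaussian_noise(n, hurst, rng):
    """Synthetic increments of fractional Brownian motion (illustration only)."""
    k = np.arange(n)
    gamma = 0.5 * (np.abs(k + 1) ** (2 * hurst)
                   - 2 * np.abs(k) ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))
    cov = gamma[np.abs(np.subtract.outer(k, k))]  # Toeplitz covariance matrix
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)


rng = np.random.default_rng(7)
h_anti = hurst_rs(fractional_gaussian_noise(2048, 0.20, rng))
h_white = hurst_rs(fractional_gaussian_noise(2048, 0.50, rng))
h_persistent = hurst_rs(fractional_gaussian_noise(2048, 0.85, rng))
print("mean-reverting:", h_anti)
print("uncorrelated:", h_white)
print("persistent:", h_persistent)
```

The absolute values depend on the estimator and the window scheme, so the ordering of the three estimates is more informative than any single number.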

Alternative Risk Approach

Above, we discussed the classic understanding of risk in the stock market. However, in the context of machine learning, we need to shift our focus from the characteristics of individual shares (such as volatility and Beta) to the characteristics of the data series on which artificial intelligence should learn and make predictions. The Hurst exponent measures the degree of jaggedness in a time series. A smaller H value indicates more noise in the system and a greater resemblance to randomness. Conversely, a larger H value suggests less noise, more persistence, and clearer trends in the data. Therefore, while stocks A and B may differ in terms of volatility or Beta, if stock A has a higher H value, it implies that there is less random behavior in its data series. As a result, a machine learning process is more effective when applied to stock A’s data series, leading to better forecast quality and reduced forecasting risk. Stocks with higher H values should be more predictable to machine learning models and, consequently, entail lower risk for investors. Now, let’s estimate the volatility, Beta, and the Hurst exponent for the S&P500 and sector ETF SPDRs.

Ticker | Description
S&P500 | Tracks the stock performance of 500 large companies listed on stock exchanges in the United States.
XLF | Tracks an index of S&P 500 financial stocks, weighted by market cap.
XLI | Tracks a market-cap-weighted index of industrial-sector stocks drawn from the S&P 500.
XLK | Tracks an index of S&P 500 technology stocks.
XLV | Tracks healthcare stocks from within the S&P 500 Index, weighted by market cap.
XLB | Tracks a market-cap-weighted index of US basic materials companies. The fund includes only the materials components of the S&P 500.
XLP | Tracks a market-cap-weighted index of consumer-staples stocks drawn from the S&P 500.
XLU | Tracks a market-cap-weighted index of US utility stocks drawn exclusively from the S&P 500.
XLE | Tracks a market-cap-weighted index of US energy companies in the S&P 500.
XLY | Tracks a market-cap-weighted index of consumer-discretionary stocks drawn from the S&P 500.
XLRE | Tracks a market-cap-weighted index of REITs and real estate stocks, excluding mortgage REITs, from the S&P 500.
XLC | Tracks a market-cap-weighted index of US telecommunication and media & entertainment components of the S&P 500 index.
(Table 1: ETFs Descriptions)

Below, we have estimated annual volatility, Beta, and the Hurst exponent for the S&P500 and sector ETF SPDRs from August 15th, 2019 to September 8th, 2023 (1,024 return observations).

(Table 2: Annual Volatility, Beta, and Hurst Exponent for the period of August 15th, 2019 – September 8th, 2023)

According to Table 2, XLP (consumer-staples sector) has the lowest annual volatility and the lowest risk premium required by investors. By contrast, the highest Hurst exponent during the analyzed period is found in XLE (energy sector). Furthermore, since both the S&P500 and the ETFs have an H value above 0.5, their time structure is considered fractal. We previously investigated the entropy of SPDR ETFs in a separate article.

*Values are sorted from highest to lowest
(Table 3: Ranking ETF SPDRs)

According to Table 3, XLE has the highest Hurst exponent, indicating that the XLE return series contains less noise, exhibits greater persistence, and shows clearer trends. Consequently, from a machine learning perspective, XLE carries the lowest risk because its time series data has the least noise. At the same time, XLE experiences significant price fluctuations, as reflected in its annual volatility, and investors demand a high risk premium for XLE compared to other sectors. XLK carries the highest risk premium and is exceeded only by XLE in terms of volatility. XLP, on the other hand, has the lowest risk premium and the lowest volatility. However, XLP and XLK have the lowest values of the Hurst exponent.
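The contrast between the two views of risk comes down to sorting the same assets by different keys. The sketch below makes that explicit. The assets "A", "B", and "C" and all of their numbers are hypothetical placeholders chosen for illustration, not the values from Table 2 or Table 3, and breaking Beta ties by volatility is an assumption made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskProfile:
    name: str
    annual_volatility: float
    beta: float
    hurst: float


def rank_by_classical_risk(profiles):
    """Least risky first: lowest Beta, ties broken by lowest volatility."""
    ordered = sorted(profiles, key=lambda p: (p.beta, p.annual_volatility))
    return [p.name for p in ordered]


def rank_by_forecasting_risk(profiles):
    """Least risky first under the alternative view: highest Hurst exponent."""
    ordered = sorted(profiles, key=lambda p: p.hurst, reverse=True)
    return [p.name for p in ordered]


# Hypothetical placeholder values, NOT the article's estimates.
profiles = [
    RiskProfile("A", annual_volatility=0.35, beta=1.3, hurst=0.62),
    RiskProfile("B", annual_volatility=0.15, beta=0.6, hurst=0.52),
    RiskProfile("C", annual_volatility=0.28, beta=1.2, hurst=0.53),
]

print("classical ranking:", rank_by_classical_risk(profiles))
print("forecasting ranking:", rank_by_forecasting_risk(profiles))
```

With these placeholder numbers, asset "A" is the riskiest by classical measures and the least risky for forecasting, which mirrors the pattern described for XLE.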

To sum up, the evidence suggests that investing in XLE on the basis of machine learning algorithms carries lower risk despite its high volatility and beta values. This is because the XLE time series contains clearer trends that can be recognized by AI algorithms for effective learning and forecasting. XLK, one of the riskiest sectors by classical measures alongside XLE, also presents a higher level of risk in the context of machine learning, and so does XLP, even though it is the least risky sector by classical measures.

Investing in Stock Sectors with the IKF AI Algorithm

I Know First provides stock market forecasts based on chaos theory approaches. Previously, we discussed the Conceptual Framework of Applying ML and AI Models to Analyze and Forecast Financial Assets. The I Know First predictive algorithm is a successful attempt to discover the rules of the market that enable us to make accurate stock market forecasts. Taking advantage of artificial intelligence and machine learning and using insights from chaos theory and self-similarity (fractals), the algorithmic system is able to predict the behavior of over 13,500 markets. The key principle of the algorithm lies in the fact that a stock’s price is a function of many factors interacting non-linearly. Therefore, it is advantageous to use elements of artificial neural networks and genetic algorithms. How does it work? First, an analysis of inputs is performed, ranking them according to their significance in predicting the target stock price. Then multiple models are created and tested using 15 years of historical data. Only the best-performing models are kept, while the rest are rejected. Models are refined every day as new data becomes available. As the algorithm is purely empirical and self-learning, there is no human bias in the models, and the market forecast system adapts to the new reality every day while still following general historical rules.
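The general workflow described above (rank the inputs, build several candidate models, keep the one that performs best on held-out history) can be illustrated with a deliberately simple toy. This is not the I Know First algorithm, which the text says relies on neural networks and genetic algorithms. Here the ranking uses absolute correlation, the candidate models are ordinary least-squares fits, and the data are synthetic; all of these are assumptions made only to show the shape of the loop.

```python
import numpy as np


def rank_inputs(X, y):
    """Column indices of X ordered by |correlation| with y, strongest first."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return [int(j) for j in np.argsort(scores)[::-1]]


def fit_linear(X, y):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def select_best_model(X_train, y_train, X_val, y_val, max_inputs):
    """Try models using the top 1..max_inputs inputs; keep the best on validation."""
    order = rank_inputs(X_train, y_train)
    best = None
    for k in range(1, max_inputs + 1):
        cols = order[:k]
        coef = fit_linear(X_train[:, cols], y_train)
        errors = predict_linear(coef, X_val[:, cols]) - y_val
        mse = float(np.mean(errors ** 2))
        if best is None or mse < best["mse"]:
            best = {"inputs": cols, "coef": coef, "mse": mse}
    return best


# Synthetic data: only inputs 0 and 3 actually drive the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * rng.normal(size=600)

best = select_best_model(X[:400], y[:400], X[400:], y[400:], max_inputs=6)
print("kept inputs:", best["inputs"])
print("validation MSE:", best["mse"])
```

Rerunning the selection as new observations arrive would correspond, loosely, to the daily refinement step mentioned in the text.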

Basic Principle of the "I Know First" Predictive Algorithm

I Know First has used algorithmic outputs to provide an investment strategy for institutional investors. Below you can see the investment result of our sector rotation strategy for the period from January 21st, 2020, to June 26th, 2023. We discuss sector rotation in more detail in a separate article.

(Figure 1 – The IKF Sector Rotation Strategy from January 21st, 2020, to June 26th, 2023)

The strategy delivered a return of 254.76%, exceeding the S&P 500 return by 221.35 percentage points. Below we can see the strategy's behavior for each year.
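The arithmetic behind these figures can be checked directly. The sketch below assumes the 221.35% figure is a difference in percentage points between the two cumulative returns, and it annualizes using calendar days divided by 365.25. Both are assumptions made here, not statements from the strategy report.

```python
from datetime import date


def implied_benchmark_return(strategy_return_pct, excess_pct_points):
    """Benchmark return implied by a strategy return and its excess over the benchmark."""
    return strategy_return_pct - excess_pct_points


def annualized_return(total_return_pct, start, end):
    """Compound annual growth rate over the calendar period, as a fraction."""
    years = (end - start).days / 365.25
    return (1.0 + total_return_pct / 100.0) ** (1.0 / years) - 1.0


start, end = date(2020, 1, 21), date(2023, 6, 26)
strategy_total = 254.76
benchmark_total = implied_benchmark_return(strategy_total, 221.35)

strategy_annual = annualized_return(strategy_total, start, end)
benchmark_annual = annualized_return(benchmark_total, start, end)

print(f"implied S&P 500 return: {benchmark_total:.2f}%")
print(f"strategy, annualized: {strategy_annual:.1%}")
print(f"S&P 500, annualized: {benchmark_annual:.1%}")
```

Under these assumptions the implied S&P 500 return over the period is 254.76 − 221.35 = 33.41%.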


Risk is one of the fundamental concepts in the investment process. Classical risk measures are volatility and the Beta coefficient. However, the rapid development and adoption of AI algorithms in investing requires reconsidering the concept of risk, shifting the focus from the characteristics of shares to the characteristics of the time series data on which artificial intelligence learns and makes predictions. We have estimated annual volatility, Beta, and the Hurst exponent for the S&P500 and sector ETF SPDRs, and found that XLE is the least risky sector for effective learning and forecasting, despite its high volatility and high required risk premium.

Please note: for trading decisions, use the most recent forecast.