Making the Factor Factory work

Since the Capital Asset Pricing Model (CAPM) was constructed almost 40 years ago, both academics and quantitative practitioners have discovered many anomalies. The insights generated by these empirical findings prompted the development of factor-based investing and shaped the way academics think about asset-pricing theory. It also led to a proliferation of potential (risk) factors: by now, hundreds have been documented. Not only does their number keep increasing, the factors also become more and more exotic. John Cochrane, a well-known financial economist, refers to this collection of factors as a “zoo”.

There seems to be no end to this proliferation: even today, academics and quant practitioners work in their own “Factor Factory”, trying to produce a successful back-tested factor that can then be used for publication or trading purposes. While this hunt might be exciting for some, it is troublesome for others.

To advance asset-pricing research and make a meaningful impact on the industry, I argue in this article that we need a robust framework to scrutinize and evaluate new factors, with the aim of ending this proliferation. Don’t get me wrong: I love factor investing, but my point here is to filter out redundant and useless factors. Factor investing itself, when implemented correctly, may improve a portfolio’s risk-return profile.

A first point of concern is that factors may look significant on paper due to our good friend the “false positive” (also known as the type I error). A simple example, in plain language, is the trial of an accused criminal. The null hypothesis is that the person is innocent; a type I error means the person is found guilty while he is not. Likewise, an asset-pricing factor can be deemed statistically significant while it is not. At the standard 5% significance level, this happens in 5% of cases on average. Suppose we conduct 1,000 asset-pricing studies: in about 50 cases we end up with a “p-hacked” factor that has no real meaning, yet may be published in a top-tier finance journal. One partial solution to this problem is to impose a higher threshold. Harvey, Liu and Zhu (2016) suggest using a t-statistic of 3 as a hurdle, which corresponds to a type I error of approximately 0.1%: on average, only 1 in 1,000 asset-pricing studies will produce a “lucky” factor, rather than 50. The authors do point out, however, that factors grounded in deep economic or theoretical models and motivation should not be subject to this higher hurdle. By now, multiple studies take this higher hurdle into account.
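The arithmetic behind these hurdles can be sketched with a standard normal approximation to the t-statistic's distribution (a simplifying assumption; the exact tail probability depends on degrees of freedom):

```python
import math

def p_value(t, one_sided=False):
    """Approximate p-value for a t-statistic, using a standard
    normal approximation (accurate for large samples)."""
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided tail probability
    return p / 2 if one_sided else p

n_studies = 1000
# Conventional 5% hurdle (t ≈ 1.96, two-sided):
print(round(n_studies * p_value(1.96)))                    # ~50 lucky factors
# Harvey, Liu and Zhu (2016) hurdle (t = 3, one-sided tail):
print(round(n_studies * p_value(3.0, one_sided=True), 1))  # ~1.3, roughly 1 in 1,000
```

Raising the hurdle from t ≈ 2 to t = 3 thus cuts the expected number of false discoveries from about 50 to roughly one per thousand studies.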

A second point of concern is the lack of replication. This replication crisis is a wider problem within economics, but it is also strongly present within empirical asset pricing. Typically, when a new factor is discovered and published, post-publication results are not provided and the factor is rarely updated. It is hard for other researchers to perfectly replicate these factors, and there is little incentive to reinvent the “wheel”. A handful of recent replication studies, however, do exist. For example, Hou, Xue and Zhang (2018) replicate more than 450 asset-pricing anomalies. They find that 65% of these factors do not even clear the 5% significance threshold; imposing more stringent hurdles raises the failure rate to 82%. And among the factors that do survive the tests, the economic magnitude is smaller than originally reported. My suggestion for tackling the replication crisis within empirical asset pricing, beyond conducting replication studies, is that top-tier journals collaborate and create an online repository containing the data and algorithms that create published factors, updated on a regular basis. By doing so, other researchers can easily replicate these factors and test them further.

This will increase the quality and transparency of the original idea, and inform investors regarding the real-time out-of-sample performance of these factors. 

In addition to increasing the statistical hurdle and tackling the replication crisis in this field, I also propose increasing the scrutiny of factors by using multiple decision heuristics. These rules of thumb can be used to select factors that really deliver risk premia, and to determine whether the Factor Factory produced a real factor or a lucky one:

First of all, a factor should be persistent over time and across business cycles. For example, the momentum factor has been shown to be present in a sample spanning more than 200 years of data (Baltussen, Swinkels & van Vliet, 2019). Second, a true factor is pervasive in the sense that it holds across multiple countries, industries and even asset classes. Again, the momentum factor is an example of a pervasive factor (see, for example, Grinblatt & Moskowitz (1999), Asness, Moskowitz & Pedersen (2013), and Fama & French (2012)).

Third, a factor premium should be robust to minor variations in factor construction or definition. For example, there is an ongoing debate about whether an ESG (Environmental, Social, and Governance) factor carries a risk premium; results are mixed, most likely due to differences in how ESG is measured. Fourth, a factor should be investable, in the sense that its premium can still be harvested after accounting for liquidity issues, trading costs and other constraints. For example, some factors induce high turnover at each rebalancing, which increases the corresponding trading costs substantially. Moreover, some pricing anomalies are concentrated in small, illiquid and distressed firms for which trading and shorting costs are relatively high.
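To make the investability point concrete, here is a minimal back-of-the-envelope sketch; the premium, turnover and cost figures are purely hypothetical assumptions, not estimates from any study:

```python
# Hypothetical illustration of how turnover erodes a factor premium.
gross_premium = 0.04     # assumed 4% annual gross factor premium
turnover = 2.0           # assumed 200% annual one-way turnover
cost_per_trade = 0.0025  # assumed 25 bps one-way trading cost

# Net premium after paying trading costs on every unit of turnover:
net_premium = gross_premium - turnover * cost_per_trade
print(f"net premium: {net_premium:.2%}")  # 4% gross shrinks to 3.50% net
```

With these assumed numbers, an eighth of the gross premium is lost to trading costs; a high-turnover strategy in illiquid names can easily lose far more.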

Lastly, and perhaps most importantly, a true factor has an intuitive reason for its existence, backed by solid economic and financial theory. A factor can be related to a macroeconomic risk exposure (the “risk-based explanation”), a deep-rooted behavioural bias among investors, or an institutional feature. The momentum factor, again, has numerous behavioural and risk-based explanations.

To summarize, we observe a proliferation of factors that arguably can price assets. In this article, I argue that we should be very careful in assessing the value of these factors. I have proposed multiple ways to put the “Factor Factory” under higher scrutiny: use more stringent statistical hurdles, promote replication studies, and impose decision criteria to select factors. The latter suggestion, especially, is of the most practical use to the average investor, like you and me. Maintain a healthy level of scepticism when it comes to factors created by the Factor Factory.

Amar Soebhag