A Closer Look at Value Premium: Literature Review and Synthesis

This paper provides a systematic review of value premium literature that examines the performance difference between value and growth stocks and the possible reasons for it. We compare and synthesize the results from the different regional stock markets and different sample periods. The literature is categorized according to stock selection criteria that are based on either individual valuation ratios, such as E/P, B/P, D/P, S/P, CF/P, and enterprise value-based multiples, or composite value criteria that aim to capture more than one dimension of relative value simultaneously or combine them with other classification criteria. We also compare the efficacy of various selection criteria to each other and synthesize the literature on the explanations for the value anomalies. The overall evidence shows that the best criterion varies over time and across the markets. The relative efficacy of different valuation criteria also seems to depend on numerous methodological choices. Recent studies have given mild evidence that combining traditional valuation ratios either with each other or with some financial statement variables could at least in some cases enhance the value premium, although very few studies have provided transparent comparisons between the results based on individual valuation ratios and those based on composite value criteria.


Introduction
The debate on the value anomalies is an excellent example of the fruitful interplay between scholars and investment practitioners. As early as the 1930s, after the stock market crash that preceded the Great Depression, academics started to develop theories of a fair value of common stocks. These pricing theories motivated investors to chase abnormal returns by using trading strategies that were based on the mispricing of the stocks. Soon after the introduction of the Capital Asset Pricing Model (CAPM), the first contrarian results according to which the relationship between risk and return is not linear were published: Lintner (1965), who is acknowledged as one of the developers of the CAPM, already documented that the security market line was too flat in comparison with the predictions. The follow-up anomaly studies began the new (and current) era of stock market research. In recent decades, several investment strategies have been shown to generate abnormal returns. In almost every case, proponents of the CAPM have downplayed such results by invoking data mining, methodological flaws, or even a misinterpretation of the results. However, new evidence against stock market efficiency is continuously being published in academic studies. For example, numerous studies have identified the existence of price momentum in stock returns (e.g., see Jegadeesh and Titman, 1993, 2001; Rouwenhorst, 1998; Chan et al., 2000; Korajczyk and Sadka, 2004; Gutierrez and Kelly, 2008; Chui et al., 2010; Israel and Moskowitz, 2013), which refers to the tendency of recent winner stocks to generate abnormal returns in the near future. Similarly, there is plenty of international evidence of a value premium in stock returns (e.g., see Chan and Lakonishok, 2004; Fama and French, 2006, 2012; Cakici et al., 2013), which refers to the tendency of value stocks to outperform glamour stocks.
Recently, new evidence of the added value of combining value and momentum strategies has also been documented (e.g., see Bird and Casavecchia, 2007a; Bettman et al., 2009; Leivo and Pätäri, 2011; Guerard, Jr. et al., 2012; Asness et al., 2013). The empirical results of academic studies have formed the basis of many investment strategies commonly used in equity markets. Conversely, dilemmas encountered by portfolio managers and consultants, such as the style identification of value and growth investments and the tailoring of style-specific benchmark indices, have in turn contributed to the academic literature.
The reason value anomalies have piqued a broader interest among economists is that their existence would challenge the semi-strong-form market efficiency, which asserts that security prices fully reflect all publicly available information (Fama, 1970, 1991). Another reason for the increased attention to the value anomalies is their persistence. Though all pricing anomalies should disappear in efficient markets soon after they have been discovered, the value anomaly has not done so (e.g., see Fama and French, 2006; Israel and Moskowitz, 2013). Therefore, academic interest in the value premium has exploded since the seminal papers of Fama and French (1992) and Lakonishok et al. (1994), although the origins of value anomaly literature can be traced back to the 1960s (e.g., see Nicholson, 1960, 1968; McWilliams, 1966; Breen, 1968). The reason for the proliferation was Fama and French's (1992) results that strongly challenged the validity of the standard single-factor CAPM, particularly in the minds of the CAPM proponents. The debate stemming from that specific paper has made it one of the most influential articles in the current financial literature.
In this article, we review and update the empirical literature on value anomalies and value premium. Our survey covers studies that examine the portfolio performance of value investment strategies as well as those that examine the cross-sectional explanatory power of various factor combinations on expected or realized returns. Due to space limitations, this literature review focuses on those articles that use variables from individual companies as input data and therefore omits many articles that are based on aggregate stock market data. Because several preceding articles have already provided extensive reviews on the theoretical issues related to the debate over value versus growth investing (e.g., see Fama, 1998; Campbell, 2000), we focus on empirical papers.
The remainder of the paper is structured as follows. Section 2 gives an overview of the evidence of the earnings yield (E/P) anomaly and the related value premium. Section 3 reviews the literature on the book-to-price (B/P) anomaly, whereas Section 4 discusses the dividend yield (D/P) anomaly. Section 5 summarizes and synthesizes the evidence of the sales-to-price (S/P) anomaly, whereas Section 6 focuses on the cash flow-to-price (CF/P) anomaly. Section 7 overviews the enterprise value-based anomalies, and Section 8 introduces evidence of the added value of using composite value criteria. Section 9 discusses the explanations for the value anomalies and the related value premium. Concluding remarks and suggestions of directions for future research are given in Section 10.
... can get against one unit of input, the better in terms of relative value. Therefore, high valuation ratios, given that they have been denoted as output/input ratios, are characteristics of value stocks, whereas the corresponding ratios of glamour stocks are low.
Table 1, which summarizes the studies on the E/P anomaly based on U.S. sample data, includes 12 papers that have compared at least three alternative portfolio formation criteria to each other and also included E/P as one of the criteria. In two out of the 12 aforementioned studies, the E/P criterion has generated the highest value premium. Furthermore, in one of these two (i.e., Davis, 1994), the sample period is from 1940 to 1963 and covers the pre-Compustat era. Based on value portfolio returns, the U.S. evidence of the relative efficacy of E/P is parallel, because in three out of 11 studies, the E/P value portfolio generated the highest return. At the other extreme, in four out of 12 cases, the lowest value premium has been generated by E/P, whereas the use of the same criterion has resulted in the lowest value portfolio returns in three out of 11 studies. Table 2 includes 19 comparable papers that have employed non-U.S. sample data. In three out of 18 of them, the greatest value premium has been generated by E/P, whereas the highest value portfolio return is based on E/P in only one out of 17 comparable studies. At the other end, the use of E/P has resulted in the lowest value premium in six out of 18 cases and in the lowest value portfolio return in six out of 17 cases. Based on overall evidence, E/P has not been among the most efficient selection criteria, because in only 5 out of 30 cases has it generated the highest value premium, whereas it has generated the lowest in 10 cases. In only three out of 28 cases has the value portfolio return been the highest based on E/P, whereas it has been the lowest in 10 cases. However, it must be noted that this type of comparison includes some selection bias, because only B/P besides E/P has been included among the portfolio formation criteria in all of the 31 cases in which at least three different valuation ratios have been compared. The relative efficacy of E/P also varies across the samples because of different treatment practices of negative-earnings stocks. Moreover, the impact of the inclusion or exclusion of those stocks varies across the samples and sample periods.
In spite of the mixed evidence of the relative efficacy of E/P, many recent papers have shown its applicability for stock selection. Chen and Zhang (2007), for example, concluded that besides the Fama-French factors, E/P ratios were useful in explaining price movements of U.S. stocks. Parallel results were reported by Penman and Reggiani (2013) for the same market and by Artmann et al. (2012) for the German stock market. According to the latter authors, the explanatory power of the standard Fama-French 3-factor model on the cross-section of average stock returns in Germany has not been strong. An alternative three-factor model in which the size factor was replaced with an E/P factor explained the returns better, and adding the momentum factor further increased the explanatory power. The explanatory power of different portfolio formation criteria on subsequent stock returns seems to vary across both the stock markets and the sample periods (see also Barbee et al., 2008 for recent U.S. evidence and Hou et al., 2011 for global evidence). Besides Penman and Reggiani (2013), recent evidence of the E/P anomaly in the U.S. stock market is also documented by Li et al. (2009), Athanassakos (2011), and Israel and Moskowitz (2013), and in the U.K. stock market by Anderson and Brooks (2006).
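The portfolio-sort logic behind the value premium figures discussed above can be illustrated with a minimal sketch. It assumes equal-weighted quantile portfolios formed on E/P, defines the value premium as the return of the top quantile minus that of the bottom quantile, and shows how the treatment of negative-earnings stocks can change the result. The function name, field names, and all figures are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import mean


def value_premium(stocks, ratio_key="ep", n_groups=5, drop_negative=True):
    """Return (value return, glamour return, value premium) for one period.

    Assumption: equal-weighted portfolios; the top quantile on the ratio is
    the value portfolio and the bottom quantile is the glamour portfolio.
    """
    # Optionally exclude stocks with a negative ratio (e.g., negative earnings).
    universe = [s for s in stocks if not (drop_negative and s[ratio_key] < 0)]
    ranked = sorted(universe, key=lambda s: s[ratio_key])
    size = len(ranked) // n_groups
    if size == 0:
        raise ValueError("not enough stocks for the requested number of groups")
    glamour = ranked[:size]   # lowest ratios
    value = ranked[-size:]    # highest ratios
    value_ret = mean(s["ret"] for s in value)
    glamour_ret = mean(s["ret"] for s in glamour)
    return value_ret, glamour_ret, value_ret - glamour_ret


# Hypothetical one-year sample: E/P at formation and subsequent return.
sample = [
    {"ticker": "A", "ep": 0.12, "ret": 0.18},
    {"ticker": "B", "ep": 0.10, "ret": 0.15},
    {"ticker": "C", "ep": 0.08, "ret": 0.11},
    {"ticker": "D", "ep": 0.07, "ret": 0.12},
    {"ticker": "E", "ep": 0.06, "ret": 0.09},
    {"ticker": "F", "ep": 0.05, "ret": 0.10},
    {"ticker": "G", "ep": 0.04, "ret": 0.07},
    {"ticker": "H", "ep": 0.03, "ret": 0.08},
    {"ticker": "I", "ep": 0.02, "ret": 0.05},
    {"ticker": "J", "ep": -0.05, "ret": 0.20},  # negative-earnings stock
]

premium_excluding = value_premium(sample, drop_negative=True)[2]
premium_including = value_premium(sample, drop_negative=False)[2]
print(f"premium, negative E/P excluded: {premium_excluding:.3f}")
print(f"premium, negative E/P included: {premium_including:.3f}")
```

In this toy sample, the single high-return negative-earnings stock falls into the glamour quintile when it is included, which narrows the measured premium. This is one mechanical reason why results based on E/P are sensitive to the treatment of such stocks.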

Book-to-Price (B/P) Anomaly
The book value of equity provides a relatively stable, intuitive measure of value that can be compared to the market value of the equity that reflects the market's expectations of the firm's earning power and cash flows. High B/P ratios are sometimes considered to provide a "margin of safety," because book value is deemed to be a "floor" supporting the market price (Bodie et al., 2010). However, the relationship between book value and price is much more complex because book values are not necessarily reliable indicators of the assets' current fair value or liquidation value. In spite of that, B/P is the most frequently used valuation ratio in the value premium literature, although recent evidence has also put forth several other criteria as competitive alternatives to it (e.g., see Loughran and Wellman, 2011; Gray and Vogel, 2012; Gharghori et al., 2013; Pätäri et al., 2016).
Basu (1977) Performance comparison of quintile pfs (the separate low E/P pf including and negative earnings' firms is also included).
The value pf return is 16.3%. Jensen alphas are significantly positive (negative) for the top-two (bottom-two) quintile pfs. Basu (1983) Performance comparison of both E/P quintile pfs and double-sorted (based on E/P and size and vice versa) 5×5 quantile pfs (negative firms excluded).

CRSP/ Compustat
From 352 (1950) to 1,309 (1974) 1951-1986 E/P anomaly exists but its intensity is higher among small-cap stocks than among large-cap stocks. In addition, evidence of consistently high returns for firms of all sizes with negative earnings.

700-1,770
July 1963-Dec 1990 The E/P-based VP exists but it is not as wide as its B/P-based counterpart. Abnormal returns of stocks with negative earnings is explained by the size effect. Fuller et al. (1993) Performance comparison of EW quintile pfs (based on abnormal returns of industry-adjusted E/P quintiles).
Roll (1995) Performance comparison of 8 pfs including the stocks in the top and bottom halves based on 3 criteria (size, E/P and B/P) and monthly reformation frequency. Roll and Ross Asset Management database/CRSP. From 2,160 (appr.) to 3,160 (appr.) stocks listed in NYSE, AMEX and OTC. April 1984-Mar 1994. The best criterion is E/P. In raw return comparisons, the best 4 pfs are all high E/P pfs (the best pf is the small-cap high E/P & high B/M pf with a 21.2% return).
Fama and French (1998) VW return comparison between the top and the bottom 30% pfs formed on B/P, CF/P, D/P & E/P. 3,333-6,258 NYSE, NASDAQ and AMEX stocks. 1975-1995. The second highest and significant VP (6.71%), as well as the second highest value pf return (14.09%).
Dhatt et al. (1999) Performance comparison of B/P-, E/P- and S/P-based tertile pfs. Firms with negative IVRs excluded from tertiles but included in separate pf for each ratio.
D/P anomaly is stronger than E/P anomaly, but together they subsume size and share price anomalies.
Comparison of B/P-, CF/P-, E/P- and size-based quartile pf returns and C-SRs.
To our knowledge, Stattman (1980) was the first to report a significant B/P anomaly in the U.S. stock market, although his results are prone to both survivorship and look-ahead biases due to the sample selection criteria employed. A comparison of four portfolio formation criteria (i.e., the CF/P, E/P, B/P, and size criteria) in the Japanese stock market led to the conclusion that B/P had the best discriminatory power on value and glamour stocks. In addition, the best performance in terms of both absolute and risk-adjusted returns was also reported for B/P value quartile portfolios. Parallel results from the U.S. markets were documented by Fama and French (1992), who found B/P to have the best explanatory power on expected returns. The authors further demonstrated that together with the market value of equity (i.e., firm size), these two variables captured the explanatory power of E/P. The dramatic dependence of returns on B/P was independent of β, suggesting either that high B/P firms are relatively underpriced, or that B/P is serving as a proxy for a risk factor that affects equilibrium expected returns. After controlling for the size and B/P effects, β seemed to have no power to explain average security returns, indicating that systematic risk seems not to matter, whereas the B/P ratio seems to be capable of predicting future returns.
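The cross-sectional regression tests referred to above can be sketched as follows. The example follows the general two-step logic of period-by-period cross-sectional regressions: returns are regressed on B/P across stocks in each period, and the time series of estimated slopes is then averaged and tested against zero. For brevity it uses a single regressor, whereas the studies reviewed here typically include several (e.g., β and size). The data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cross_sectional_slope(x, y):
    """OLS slope of y on x (with intercept) for one cross-section."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def average_slope(periods):
    """Average the per-period slopes and compute a simple t-statistic.

    `periods` is a list of (characteristic values, subsequent returns)
    pairs, one pair per period. The t-statistic treats the slopes as
    independent draws, which is a simplifying assumption.
    """
    slopes = [cross_sectional_slope(x, y) for x, y in periods]
    avg = mean(slopes)
    t_stat = avg / (stdev(slopes) / sqrt(len(slopes)))
    return avg, t_stat, slopes


# Hypothetical data: B/P of four stocks and their next-period returns.
bp = [0.5, 1.0, 1.5, 2.0]
periods = [
    (bp, [0.01, 0.02, 0.03, 0.04]),
    (bp, [0.00, 0.02, 0.02, 0.04]),
    (bp, [0.02, 0.01, 0.03, 0.03]),
]

avg, t_stat, slopes = average_slope(periods)
print(f"average slope on B/P: {avg:.4f}, t-statistic: {t_stat:.2f}")
```

A positive and significant average slope in such a test is what is meant above by B/P having explanatory power on subsequent returns. With real data, many more periods and additional control variables would be needed before drawing any inference.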
In line with the seminal paper of Fama and French (1992), Capaul et al. (1993) documented the inverse relationship of return and β in most of the major stock markets when comparing the value premiums and their βs in six developed national markets. The authors also showed that the B/P anomaly was a global phenomenon and even stronger outside the United States. The overall evidence of the studies reviewed in our paper reinforces that this conclusion is also true at a more general level: Tables 3 and 4 include 12 U.S. and 19 non-U.S. studies in which at least three alternative portfolio formation criteria based on individual valuation ratios, including B/P, have been compared to each other. For the U.S. sample data, in four of them, the greatest value premium was generated by B/P, whereas the B/P value portfolio return was the highest in three out of 11 studies. For the non-U.S. sample data, the corresponding proportions are 10 out of 18 and 10 out of 17, respectively. At the other extreme, in two out of 12 U.S. studies, the lowest value premium was generated by B/P rankings, which also resulted in the lowest value portfolio return in four out of 11 cases. For the non-U.S. sample data, the lowest value premium was generated by B/P in five out of 18 studies, whereas the lowest value portfolio return was documented for the same criterion in only two out of 17 cases.
Based on these rough statistics, it seems that the relative efficacy of B/P compared to that of E/P has been somewhat stronger in the studies based on the non-U.S. sample data. If the condition of the inclusion of at least three alternative portfolio formation criteria based on individual valuation ratios is relaxed, and the efficacy comparison is made among the studies that include B/P and E/P comparisons, Tables 1-4 include 35 such cases. In 21 of them, the B/P-based value premium has been higher than its E/P-based counterpart, whereas the reverse holds for 14 cases. In addition, the value portfolio return based on B/P was higher than that based on E/P in 20 out of 33 cases, whereas the reverse conclusion was drawn in 13 studies. If pooled results for the U.S. and non-U.S. samples are compared separately, the proportions based on comparisons of value portfolio returns reinforce the conclusion that B/P has in this sense worked somewhat better than E/P, particularly for the non-U.S. sample data, because the former has generated higher value portfolio returns in 14 out of 20 cases. By contrast, based on the pooled results for the U.S. sample data, the E/P-based value portfolio return was higher in seven out of 14 cases. However, it should be noted that the samples used as bases of quantile divisions based on B/P and E/P are not necessarily identical, because in many studies, negative-earnings stocks have been excluded from the E/P-based division, whereas they have been included in the B/P-based division unless their book values have also been negative (e.g., see Fama and French, 1992; Dhatt et al., 1999).
Like the E/P anomaly, the B/P anomaly is also related to firm size. B/P-based value premium has often been documented to be at its highest among small-cap stocks and at its lowest among large-cap stocks (e.g., see Fama and French, 1993, 2012; Israel and Moskowitz, 2013; Hou et al., 2015). However, this relationship cannot be generalized: for example, Bauman et al. (1998) documented for the large international sample that the value premium was higher among large-cap stocks than among small-cap stocks.
Table 3. Continued
Author ( French's data library (derived from CRSP/ Compustat data) All US stocks included in decile pfs formed by French July 1951-Dec 2011 The second lowest VP (5.89% in terms of excess returns), but the highest value pf return (12.54% excess of risk-free rate) among 4 IVRs.
Hou et al.

CRSP/ Compustat
All non-financial US stocks July 1972-Dec 2012 The highest and significant VP (.7% p.m.) among the first 4 IVRs. The returns are given on an annual basis, unless indicated by p.m. in the first context within each summary description of the findings (In those cases, the returns are given on a monthly basis throughout the description). (2017)

Morgan
Stanley/UBS The sample includes appr. 60 % of the total market cap of each country Jan 1981-Jun 1992 VP is the highest in France (.53%), the lowest in Germany (.13%), and globally significant. The highest market-adjusted abnormal return for the value pf in Japan.

Miles and Timmermann
(1996) Performance comparison of EW decile pfs formed on 3 IVRs (D/P, E/P & B/P) and monthly C-SRs.
Extel Company Accounts data/ LSPD 457 UK non-financial firms May 1979-April 1991 The highest VP (.79% p.m.) and value pf return (1.54%) among the 3 IVRs. In terms of risk-adjusted returns, B/P is the only IVR that generates significant outperformance for the value pf. Cai (1997) Return comparison between the top and bottom decile pfs formed on 4 criteria (B/P, E/P, CF/P and sales growth) for 5-year hps and C-SR tests.

PACAP/Daiwa Securities/Nihon
Keizai Shimbun From 1,186 to 1,651 Japanese non-financial and non-utility stocks June 1971-Dec 1993 The highest VP (11.2%) and value pf return (25.5%) among the 3 IVRs. The best explanatory power in C-SRs (size factor included). Doeswijk (1997) Return comparison of B/P-, CF/P-, D/P- and E/P-based quintile pfs. Hp lengths 1 year and 3 years.
Daniel et al. (2001) reported the highest value premium in the middle-size quintile for the comprehensive sample of Japanese stocks. According to their results, the B/P-based value premium was significant in all size quintiles other than the smallest-cap quintile, for which the value portfolio return was the highest, indicating that the small-cap anomaly existed among the sample stocks. Parallel results based on recent data were also provided by Fama and French (2012) who, for the sample of 23 developed stock markets divided into 4 regional stock markets, also documented that unlike in other countries, the value premiums in Japan were higher in large-cap quantiles than in small-cap quantiles. By contrast, Fama and French (2006) found in their earlier study that in the U.S. markets, significant B/P-based value premiums (return difference between the top-2 vs. the bottom-2 B/P quintiles) existed for all five size groups, except for the largest-cap group. They also reported that the U.S. value premiums were higher based on B/P than based on E/P in the three biggest size groups, whereas the reverse held for the two smallest-cap groups. However, for the merged data from 14 non-U.S. developed national markets, the value premium was almost equal among the largest-cap and the smallest-cap groups, according to the same authors. Cakici et al. (2013) reported a similar finding for their sample of 18 emerging stock markets.
Recent U.S. evidence of B/P-based value premium is provided by Loughran and Wellman (2011) and Israel and Moskowitz (2013), whereas the corresponding global evidence is offered by Fama and French (2012) from developed markets and by Cakici et al. (2013) from emerging markets. Both of the last-mentioned studies document a strong and significant value premium in most of the regional markets examined, but there are some fundamental differences between the developed and emerging markets. According to Fama and French's results, the value premium decreased when moving from small-cap samples toward the larger-cap ones. By contrast, Cakici et al. (2013) documented for the pooled emerging market sample that the value premium was almost identical among the large-cap and small-cap samples. The value premiums, as well as the top value portfolio returns, were higher in emerging markets than in developed markets.

Dividend Yield (D/P) Anomaly
The hypothesis that D/P predicts returns has been the subject of considerable theoretical (e.g., see Boudoukh et al., 2008) and empirical research (e.g., see Ball, 1978;Hodrick, 1992;Goetzmann and Jorion, 1993;Kothari and Shanken, 1997;Ang and Bekaert, 2006). Actually, there are at least three central competing hypotheses: the tax-effect hypothesis, the dividend-neutrality hypothesis, and the signalling hypothesis. The tax-effect hypothesis proposed by Brennan (1970) states that investors receive higher before-tax, risk-adjusted returns on stocks with higher anticipated dividend yields to compensate for the historically higher taxation of dividend income relative to capital gain income. In contrast, the dividend-neutrality hypothesis proposed by Black and Scholes (1974) states that if investors require higher returns for holding higher dividend-yield stocks, firms would adjust their dividend policy to restrict the quantity of dividends paid, lower their cost of capital, and thus increase their stock price. Correspondingly, if investors required a lower return on high dividend-yield stocks, value-maximizing firms would increase their dividend payouts to increase their stock price. In equilibrium, value-maximizing behaviour would result in an aggregate supply of dividends to equal the aggregate demand for dividend income from investors that prefer dividends at least as much as capital gains. As a consequence, a predictable relationship between anticipated dividend yields and risk-adjusted stock returns should not exist. According to the signaling hypothesis, dividend yields and their changes reflect the management's beliefs about the future prospects of the firm (e.g., see Dielman and Oppenheimer, 1984;Denis et al., 1994;Sant and Cowan, 1994). Therefore, higher D/Ps could be assumed to signal the management's trust in the continuity of dividend-paying ability.
The prediction power of D/P on stock returns can also be reasoned on the basis of the dividend growth model (Gordon and Shapiro, 1956), according to which the total return of a stock will be determined based on its initial dividend yield and the dividend growth rate. In an efficient market, if all stocks with equal risk offered the same total return, the stocks with a low dividend growth would have to offer higher initial dividend yields. However, if investors are incapable of assessing growth prospects correctly, it is possible that the growth rate assumed for high growth rate stocks will be too high and that for low growth rate stocks will be too low. Therefore, overoptimistic growth extrapolations might explain why high D/P stocks would offer a higher total return. A related explanation is that investors do not necessarily behave according to the dividend growth model when pricing stocks. For example, investors might not be indifferent between a stock with a 2% higher initial yield and a stock with a 2% faster growth rate, as put forth by Lofthouse (2001).
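The reasoning above can be written out under the constant-growth form of the dividend growth model. The notation ($P_0$ for the current price, $D_1$ for the next-period dividend, $g$ for the constant dividend growth rate, and $r$ for the required total return) is introduced here for illustration only.

```latex
P_0 = \frac{D_1}{r - g}, \quad r > g
\qquad\Longrightarrow\qquad
r = \frac{D_1}{P_0} + g .
```

Under this model, and reading the 2% differences as percentage points, a stock with an initial yield of 5% and growth of 3% and a stock with a yield of 3% and growth of 5% both offer $r = 8\%$, so a higher yield exactly offsets an equally lower growth rate. The behavioral argument above is that investors may not price stocks as if this trade-off were one-for-one.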
Evidence of differences in returns among stocks with high and low or zero-dividend yields has been mixed. Blume (1980) and Keim (1985) documented a U-shaped relationship between risk-adjusted returns and D/Ps, with zero-dividend stocks generating larger returns than dividend-paying stocks and higher D/P stocks realizing larger risk-adjusted returns than lower D/P stocks. By contrast, Christie (1990) showed that the anomalous returns of zero-dividend stocks were largely due to the performance of stocks with a price of less than two dollars during the 1930s. By comparing the returns between zero-dividend and dividend-paying stocks of equal market capitalization, he documented significantly higher size-adjusted returns for the latter type of stocks. Elton et al. (1983) also documented a strong positive relationship between D/P and expected returns. In addition, Keim (1985, 1986) found a significant though not linear relationship between D/P and abnormal returns (i.e., Jensen's alphas) in the U.S. market.
In the U.K., Levis (1989) examined the relationship between D/Ps and returns and found that D/P and returns were monotonically positively related. According to his results, the D/P anomaly was stronger than the size, E/P, and stock price anomalies. Although a large degree of interdependency between all four anomalies was documented, Levis reported the D/P and E/P anomalies to subsume the size and stock price anomalies. Controlling for firm size, intra-year seasonality, and market risk, Morgan and Thomas (1998) also found a significant positive relation between D/Ps and returns in the U.K. stock market. Parallel results from the same stock market were also reported by Chan and Chui (1996) for the 1973-1990 period on the basis of monthly cross-sectional regressions, whereas based on the annual data, the explanatory power of D/P was insignificant. By contrast, the results of Miles and Timmermann (1996) showed that the explanatory power of D/P on the subsequent returns was not significant even in monthly cross-sectional regressions, and in addition, that the D/P-based value premium was negative. McManus et al. (2004) introduced the payout ratio into the empirical relationship between stock returns and D/P and found that it had an important impact on the statistical significance of the dividend yield itself in explaining returns, and furthermore, that it conveyed signaling information beyond that of D/P. Naranjo et al. (1998) found that both absolute and risk-adjusted returns for NYSE stocks increased with an increasing dividend yield. Consistent with Blume (1980) and Keim (1985), the authors documented higher absolute returns for zero-dividend stocks than for low-dividend stocks, but after the Fama-French 3-factor risk-adjustment, the former stocks performed the worst. Naranjo et al. (1998) showed further that tax effects could not explain their findings.
Fama and French (1998) compared the value premiums obtained from using four different portfolio formation criteria (i.e., B/P, CF/P, E/P, and D/P) in 13 major stock markets. According to their results, the D/P criterion resulted in the greatest value premium in only one out of 13 national stock markets during the 1975-1995 period. Moreover, the value premium based on D/P was statistically significant in only two national markets. By contrast, a comparison of the same four valuation ratios by Bauman et al. (1998) documented the greatest value premium based on D/P for a large pooled sample of international stocks whose fiscal year end (FYE) was in March. However, the Sharpe ratios (Sharpe, 1966) of the value quartile portfolios formed on CF/P and B/P were slightly higher than that of the D/P value portfolio for this subsample. For the subsample of the stocks with FYE in December, the highest Sharpe ratio was shared by the E/P and D/P value quartile portfolios. Based on that evidence, the relative performance of value portfolios based on different valuation ratios also seems to be dependent on the time of the FYE of the sample companies. When interpreting the results of Bauman et al. (1998), however, it should also be noted that due to varying FYE practices across countries, the different FYE subsamples are "country-biased": for example, the most common FYE for Japanese companies is the end of March and in Australia it is the end of September, whereas for the majority of U.S. companies the fiscal year equals the calendar year.
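The distinction drawn above between the highest value premium (or raw return) and the highest Sharpe ratio can be illustrated with a small sketch. It computes the Sharpe ratio as the mean excess return divided by the sample standard deviation of excess returns. The two return series and the risk-free rate are hypothetical and do not correspond to any portfolio in the reviewed studies.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return per unit of excess-return volatility."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


# Hypothetical periodic returns of two value portfolios.
portfolio_a = [0.03, 0.01, 0.02, 0.02]    # lower mean, low volatility
portfolio_b = [0.06, -0.02, 0.04, 0.02]   # higher mean, high volatility
rf = 0.001                                # hypothetical per-period risk-free rate

sharpe_a = sharpe_ratio(portfolio_a, rf)
sharpe_b = sharpe_ratio(portfolio_b, rf)
print(f"A: mean={mean(portfolio_a):.4f}, Sharpe={sharpe_a:.2f}")
print(f"B: mean={mean(portfolio_b):.4f}, Sharpe={sharpe_b:.2f}")
```

Portfolio B has the higher average return, yet portfolio A has the higher Sharpe ratio. Rankings of selection criteria can therefore differ depending on whether raw or risk-adjusted performance is compared.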
Tables 5 and 6 include four U.S. and nine non-U.S. studies in which at least three alternative portfolio formation criteria based on individual valuation ratios, including D/P, have been compared to each other. Somewhat surprisingly, such a literature is very scant, particularly for the U.S. sample data, which might be at least partially explained by the relatively high ratio of non-dividend-paying stocks, as well as by huge variability in the proportion of dividend-paying stocks in the U.S. over time (e.g., see Christie, 1990;Fama and French, 2001). In all such U.S. studies, D/P has generated the lowest value premium as well as the lowest value portfolio return. Instead, for the non-U.S. sample data, the results are more mixed. In two out of nine studies, the highest value premium has been generated by D/P, whereas the highest value portfolio return has been documented for the D/P criterion in three out of eight studies. It is noteworthy that all of the evidence for the superiority of D/P is from the small European national stock markets. At the other end, the lowest value premium and the lowest value portfolio return based on D/P have been documented in three non-U.S. studies.
In addition, when comparing the evidence on the D/P anomaly, the reader should also note that many different practices have been employed in the calculation of dividend yields. For example, Naranjo et al. (1998) multiplied a firm's most recently declared quarterly dividend by four and divided the resulting product by the previous month's closing price, whereas Keim (1985) and Christie (1990) used the ratio of the sum of dividends paid over the last 12-month period to the stock price at the beginning of that period. By contrast, Fama and French (1993) and Hou et al. (2015) divided the similarly calculated sum of total dividends by the stock price at the end of the period preceding the moment of portfolio formation. Moreover, the calculation practices of dividend yields can also vary across the countries in the same databases. For example, according to Hou et al. (2011), Worldscope presents all price and per share data (including dividends) on a calendar year-end basis for U.S. firms, but on a fiscal year-end basis for non-U.S. firms. In addition, the group of non-dividend-paying stocks makes the sizes of D/P portfolios very unequal compared to the quantile portfolios formed on some other individual valuation ratios. In spite of weak overall evidence on the efficacy of D/P as a single value measure, D/P may still add value to the investor as a complementary value measure, as argued by Dimson et al. (2003). There is also some evidence that high D/P stocks might be less risky (e.g., see Naranjo et al., 1998; Leivo and Pätäri, 2011).
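The three dividend yield conventions described above can be compared in a minimal sketch. The function names and all input figures are hypothetical; the sketch only mirrors the verbal definitions given in the text and is not the exact procedure of any of the cited studies.

```python
def dy_annualized_latest_quarter(latest_quarterly_dividend, prev_month_close):
    """Latest declared quarterly dividend times four, over last month's close."""
    return 4 * latest_quarterly_dividend / prev_month_close


def dy_trailing_over_start_price(dividends_12m, price_at_start):
    """Dividends paid over the last 12 months, over the price at period start."""
    return sum(dividends_12m) / price_at_start


def dy_trailing_over_end_price(dividends_12m, price_at_end):
    """Dividends paid over the last 12 months, over the price at period end."""
    return sum(dividends_12m) / price_at_end


# Hypothetical stock: rising quarterly dividend and rising price.
dividends = [0.20, 0.20, 0.25, 0.25]   # four quarterly dividends per share
price_start, price_end, prev_close = 20.0, 25.0, 24.0

yields = {
    "annualized latest quarter": dy_annualized_latest_quarter(dividends[-1], prev_close),
    "trailing / start price": dy_trailing_over_start_price(dividends, price_start),
    "trailing / end price": dy_trailing_over_end_price(dividends, price_end),
}
for label, dy in yields.items():
    print(f"{label}: {dy:.4f}")
```

For the same stock, the three conventions give visibly different yields when dividends and prices change during the year. Quantile assignments based on D/P may therefore differ across studies even for identical underlying data.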

Sales-to-Price (S/P) Anomaly
Influenced by Fisher (1984), the use of the S/P criterion became popular during the era of the new economy at the turn of the millennium. In those days, analysts found it hard to justify their recommendations on the basis of earnings and book value multiples because negative earnings and low book values were very common among many information, communication, and technology companies. By contrast, sales multiples could be calculated even for the most distressed and for very new firms. The use of sales multiples is also motivated in the financial literature by their stability compared to other valuation multiples (e.g., see Bodie et al., 2010), or by the fact that sales are relatively difficult to manipulate, unlike earnings and book values (e.g., see Damodaran, 2012). The biggest disadvantage of using sales multiples is that if a firm generates high sales growth while simultaneously losing significant amounts of money, S/P could erroneously indicate a low relative valuation for such a firm. Sales can also be increased by increasing debt, which in most cases increases S/P. 14 However, the sales multiples do not indicate whether the sales have been generated without leverage or with maximum leverage, which certainly makes a difference to the risks of the firms being compared. In spite of the above-mentioned disadvantages, evidence has shown that S/P has worked surprisingly well as a relative valuation criterion (e.g., see Bird and Casavecchia, 2007b; Barbee et al., 2008; Gharghori et al., 2013). Senchack and Martin (1987) were the first to examine the efficacy of S/P for value portfolio selection. 
They compared the relative performance of high S/P and high E/P strategies among NYSE and AMEX stocks and found that high S/P stocks produced abnormal risk-adjusted returns compared to both low S/P stocks and the market portfolio. However, high E/P stocks dominated high S/P stocks in terms of both absolute and risk-adjusted returns because the relative performance of the high E/P stocks was more consistent than that of the high S/P stocks. Barbee et al. (1996), on the other hand, found that S/P explained U.S. stock returns better than B/P or firm size. The authors stated further that S/P captures the role of the debt/equity ratio in explaining the returns. By contrast, Mukherji et al. (1997), who found evidence of the S/P and the B/P anomaly in the Korean stock markets, showed that the positive relationship of the debt-to-equity ratio persisted in portfolios formed on B/P and S/P. Dhatt et al. (1999) found that for small-cap U.S. stocks, S/P was a better indicator of value than B/P, which in turn was superior to E/P. The same authors also reported the superiority of S/P over B/P, E/P, and CF/P in terms of both value premium and value portfolio returns for the sample of larger-cap U.S. stocks, although the composite value measure based on combining the S/P and E/P criteria generated marginally higher returns. Leledakis and Davidson (2001) also reported higher value decile returns and a higher value premium for the portfolios formed on S/P than for those based on B/P. For the U.K. sample data, the authors also found that S/P had the best explanatory power on subsequent returns among the four criteria examined (the others were B/P, firm size, and debt-to-equity ratio). Bird and Casavecchia (2007b) documented the superiority of S/P in the European markets, but found no evidence of added value from combining valuation ratios. Barbee et al. (2008) found that in the U.S. 
stock markets, S/P has the most consistently significant positive relation and the highest explanatory power with subsequent annual returns. According to the authors, S/P is an undervalued value measure, because investors may tend to focus more on E/P and B/P than on CF/P or S/P, resulting in the information contained in the first two multiples being more efficiently incorporated into stock returns than the information in the last two multiples. Recent evidence for the S/P-based value premium has also been provided from the Australian stock market by Gharghori et al. (2013), who reported the highest value-weighted value premium based on S/P, as well as the highest value-weighted return for the value decile portfolio formed on S/P. Support for S/P was also given by cross-sectional regression tests in which it appeared to be a significant explanatory factor, although not as significant as B/P. Table 7 indicates that so far there are only three such U.S. studies in which at least three alternative portfolio formation criteria based on individual valuation ratios, including S/P, have been compared to each other. The number of such studies is surprisingly low in the light of the fact that in every one of these three studies, the highest value premium has been generated by S/P, and in addition, the highest value portfolio return is based on the same criterion in two out of three comparable cases. 15 In addition to the evidence based on the portfolio formation approach, S/P was also documented to have the best explanatory power on subsequent returns among the individual valuation ratios in cross-sectional regression tests by Barbee et al. (1996, 2008). For the non-U.S. sample data, the evidence is less favourable for S/P. Table 8 includes eight such comparable studies that have employed non-U.S. sample data. 
In two out of seven of them, the highest value premium has been generated by S/P, 16 whereas the highest value portfolio return is based on the same criterion in three out of eight cases. By contrast, the S/P-based value premium and the corresponding value portfolio return have been the lowest in an equal number of cases, but it is noteworthy that all this evidence is from the Finnish stock market and based on overlapping sample periods.
Table note: hp refers to holding period. The returns are given on an annual basis, unless indicated by p.m. in the first context within each summary description of the findings (in those cases, the returns are given on a monthly basis throughout the description). The length of holding periods is one year unless otherwise indicated.

Cash Flow-to-Price (CF/P) Anomaly
Many financial analysts as well as scholars are cautious with earnings figures because of differences in companies' practices for calculating accruals such as depreciation and amortization (e.g., see Bildersee et al., 1990; Chan et al., 2006), and, in addition, because of differences over time in the calculation principles of those figures stemming from accounting standards (e.g., Cheng et al., 1996). Many scholars have also shown that accounting losses (i.e., negative earnings) can be regarded as temporary by nature; therefore, they are not reflected in cash flow expectations (e.g., see Hayn, 1995; Martikainen, 1997; Kallunki et al., 1998). The shortcomings of accounting earnings have motivated a number of scholars to explore the relationship between cash flow yields and stock returns (for the first attempts, see, e.g., Wilson, 1986; Bernard and Stober, 1989).
To the best of our knowledge, the use of CF/P as a basis of a value investment strategy was first adopted by , who compared the efficacy of the CF/P criterion with the E/P, B/P, and size criteria in the Japanese stock market. Their results showed that of the four criteria considered, B/P and CF/P had the most significant positive impact on expected returns. Lakonishok et al. (1994) documented parallel results from the U.S. stock markets, with the exception that CF/P was somewhat more efficient for their sample data than B/P, whereas the opposite held for the Japanese sample employed by . Both of these cornerstone studies concluded that the observed value premium was not explained by the higher risk (measured by volatility) of value stocks. In the cross-country comparison of value premiums based on four different individual valuation ratios, Fama and French (1998) reported the highest value premium for CF/P in four out of 13 national stock markets. The highest value portfolio return was documented for CF/P in five out of 13 national stock markets, whereas for E/P, it was not documented in any of the markets examined.
The strong performance of CF/P-based strategies relative to E/P-based strategies is also consistent with recent evidence. For example, for the large sample of tradable NYSE and NASDAQ stocks, Dhatt et al. (2004) found that among 16 different portfolio formation criteria, which included the size criterion, the B/P, CF/P, E/P, and S/P ratios, and 11 combination criteria formed as the combinations of the four last-mentioned ratios, the lowest risk and the highest return/risk ratio were documented for the CF/P value portfolio. Desai et al. (2004) noted that the average annual return for the simple E/P-based market-neutral long/short strategy was 10.2%, whereas for the corresponding CF/P-based strategy, it was 15.3%. Dissanaike and Lim (2010) compared the performance of value strategies based on relatively simple measures, such as B/P, CF/P, E/P, and past return, and some more sophisticated measures, such as those based on the Ohlson (1995) model and the residual income model (suggested by Dechow et al., 1999). For the comprehensive sample of U.K. stocks, the authors found that simple cash flow-to-price measures appeared to perform almost as well as, and in some cases even better than, the more sophisticated alternatives. The value premiums based on both standard CF/P and operating cash flow/price were substantially higher than those based on either B/P or E/P. By contrast, Gregory et al. (2001) reported the superiority of B/P and E/P over CF/P in selecting value portfolios in the same stock market. Hou et al. (2011) examined a large number of firm-level characteristics that might explain the cross-sectional and time-series variation in global stock returns. Their analysis included the size, D/P, E/P, CF/P, B/P, leverage, and momentum factors using monthly returns for over 27,000 individual stocks from 49 countries from 1981 to 2003. 
Using cross-sectional Fama and MacBeth (1973) tests of individual stock returns and time-series regression-based tests of multifactor models, the authors confirmed the strong and reliable explanatory power of a value-based factor in global stock returns. In contrast to almost all preceding comparable studies, this factor was somewhat surprisingly based on CF/P instead of B/P, E/P, or D/P. In addition, the incremental explanatory power of a B/P factor-mimicking portfolio, over and above that based on CF/P, turned out to be negligible. Table 9 summarizes the studies on the CF/P anomaly executed with the U.S. sample data; it introduces eight such papers in which at least three alternative portfolio formation criteria based on individual valuation ratios, including CF/P, have been compared to each other. In two out of eight such studies, CF/P has generated the highest value premium, whereas the CF/P-based value portfolio return has been the highest in two out of seven cases. 17 By contrast, only one of the U.S. studies included in Table 9 has documented the lowest value premium and the lowest value portfolio return for CF/P (i.e., Barbee et al., 2008). For the non-U.S. sample data (Table 10), CF/P has generated the highest value premium in one out of 10 cases 18; in addition, based on size-adjusted returns, it has done this in another case, which is actually the only case where the CF/P-based value portfolio return has also been the highest outside the U.S. markets (i.e., Gregory et al., 2001, for the U.K. data). At the other end, the lowest CF/P-based value premium has been documented in two out of 10 non-U.S. studies, whereas the lowest value portfolio return has been based on the same criterion in only one out of eight cases. Thus, evidence for the relative efficacy of CF/P is slightly stronger in the United States than elsewhere in the world. 
Based on the overall evidence and compared to E/P, CF/P has generated a higher value premium in 11 out of 17 cases, whereas the value portfolio return has been higher based on CF/P in 10 out of 15 cases. 19 Analogous to the E/P-based value premium, the CF/P-based value premium can also vary remarkably depending on whether the firms with negative cash flows are included in the sample, which complicates the comparison of CF/P-based results with the results based on other individual valuation ratios. Because cash flows are in most cases higher than the corresponding earnings, the samples of stocks with non-negative cash flows are generally larger than the samples of stocks with non-negative earnings (e.g., Chan et al., 1993), which makes the former samples more consistent with B/P samples than the latter samples (i.e., samples of non-negative E/P stocks). Based on the overall evidence, the comparison between CF/P and B/P shows the superiority of CF/P-based value premiums in seven out of 17 cases, whereas based on value portfolio returns, the superiority of B/P has been documented in 11 out of 15 cases. 20 Although the overall evidence slightly favours B/P, the same does not hold for the U.S. evidence, according to which CF/P has resulted in a higher value premium in four out of seven studies, whereas based on value portfolio returns, B/P has performed better in an equal number of cases.

Enterprise Value-Based Anomalies
So far, enterprise value-based multiples have seldom been examined as a basis for a value investing strategy. The reason for the increasing popularity of enterprise value multiples is that they can be compared more easily across firms with diverging leverage, because enterprise value also takes the net value of a company's debt into account. The use of enterprise value multiples as a basis of a value investment strategy has been justified by the fact that if an acquirer is going to buy the entire company, he/she must also take responsibility for repaying its debt and must therefore allow for debt in the purchase price. Correspondingly, an investor should not ignore debt, because by buying stocks of a company, he/she is actually buying a part of the company. The most commonly used enterprise value-based multiples introduced in valuation textbooks are earnings before interest and taxes-to-enterprise value (EBIT/EV), earnings before interest, taxes, depreciation and amortization-to-enterprise value (EBITDA/EV), and sales-to-enterprise value (S/EV). However, their relative efficacy in separating value stocks from glamour or growth stocks has been discussed in only a handful of studies.
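The leverage argument can be made concrete with a small numerical sketch. Enterprise value is taken here as market capitalization plus interest-bearing debt minus cash, which is a common textbook simplification and an assumption of this example; the studies reviewed may apply further adjustments. The class, its field names, and the two firms are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    name: str
    market_cap: float
    debt: float
    cash: float
    sales: float
    ebitda: float
    ebit: float

    @property
    def enterprise_value(self) -> float:
        # Assumed definition: equity market value plus net debt (debt minus cash).
        return self.market_cap + self.debt - self.cash

    def multiples(self) -> dict:
        ev = self.enterprise_value
        return {
            "S/P": self.sales / self.market_cap,
            "S/EV": self.sales / ev,
            "EBITDA/EV": self.ebitda / ev,
            "EBIT/EV": self.ebit / ev,
        }


# Two hypothetical firms that differ only in the amount of debt they carry.
unlevered = Firm("Unlevered", market_cap=100.0, debt=0.0, cash=0.0,
                 sales=200.0, ebitda=30.0, ebit=20.0)
levered = Firm("Levered", market_cap=100.0, debt=150.0, cash=0.0,
               sales=200.0, ebitda=30.0, ebit=20.0)

for firm in (unlevered, levered):
    print(firm.name, {k: round(v, 3) for k, v in firm.multiples().items()})
```

In this toy case both firms have the same S/P, so a price-based sort would treat them as equally cheap, whereas every enterprise value-based multiple ranks the levered firm as more expensive.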
One reason supporting EBITDA/EV as a measure of relative valuation is its use of operating income before depreciation as the profitability measure. Differences in depreciation methods across different companies can cause differences in net income, but they do not affect EBITDA. Of course, the limitations of EBITDA as a measure of profitability should also be borne in mind. As stated by Penman (2013), depreciation is a real economic cost; therefore, pricing a company without considering plant, copyright, and patent expenses would imply that a business could be run without these expenses. Therefore, some scholars have argued for EBIT/EV because EBIT includes depreciation and amortization, which reflect a firm's capital expenditure in previous years. According to Chan and Lui (2011), EBIT figures can give investors better guidance on profit growth and future sustainability, which makes EBIT/EV ratios more reflective of a company's true profitability than EBITDA/EV ratios. However, that does not necessarily imply that EBITDA/EV would be inferior to EBIT/EV as a selection criterion for value portfolios. Actually, the recent results of Gray and Vogel (2012) show that even more rudimentary profitability measures than EBITDA can be used as a profit component in enterprise value-based multiples. Based on value-weighted returns calculated for quintile portfolios of U.S. stocks, the authors documented both the highest value premium and the highest value portfolio return for the gross profit-to-enterprise value (GP/EV) ratio, in which gross profit is calculated by subtracting the cost of goods sold from total sales. Their finding is not so surprising in the light of the fact that S/P has also worked well as a relative valuation criterion even though S/P does not account for the leverage differences between the firms being compared. Although S/EV ratios do so, it is probable that they would also work well as a relative valuation criterion.
As EBITDA/EV also worked well for the same purpose in Gray and Vogel (2012) (particularly based on equally weighted quintile portfolios), it is not surprising that GP/EV also did so (because GP lies between sales and EBITDA in the income statement). By contrast, based on value-weighted quintile returns, free cash flow-to-enterprise value (FCF/EV) was not as successful as GP/EV and EBITDA/EV as a value portfolio selection criterion, although based on equally weighted portfolios, it slightly outperformed the GP/EV criterion.
To the best of our knowledge, the first published journal article that examined the performance of EBITDA/EV-ranked quantile portfolios and compared it to the performance of portfolios formed on more commonly used valuation ratios is Leivo et al. (2009). Among 20 quintile portfolios formed on four individual valuation ratios (i.e., EBITDA/EV, E/P, B/P, and S/P), the best return/total risk ratios (i.e., the Sharpe and the Sortino ratios) 21 in the Finnish stock markets were documented for the EBITDA/EV value portfolio. The results are consistent with Gray and Vogel (2012), who found the top-quintile EBITDA/EV portfolio to be the best-performing one among 25 quintile portfolios formed on five individual valuation ratios (i.e., EBITDA/EV, free cash flow/EV, E/P, B/P, and GP/EV). So far, the latest evidence of the superiority of enterprise value-based multiples over price-based multiples is provided by Pätäri et al. (2016), who, in a performance comparison of tertile portfolios formed on four individual valuation ratios (i.e., EBIT/EV, E/P, B/P, and S/P), documented the best performance for the EBIT/EV value portfolio. The value premium was also higher based on EBIT/EV than on the basis of any of the price-based valuation multiples examined. Table 11 summarizes the findings of the studies that have examined enterprise value-based anomalies. Both of the studies based on U.S. data included here have documented very promising results on the applicability of EBITDA/EV as a portfolio selection criterion. According to Loughran and Wellman (2011), the EBITDA/EV-based value premium was the highest in a comparison of value-weighted decile portfolios formed on four individual valuation ratios. Based on the corresponding comparison of equally weighted portfolios, EBITDA/EV was the second best after the B/P criterion. EBITDA/EV-based value portfolios also generated the second highest returns after their B/P-based counterparts on both equally and value-weighted bases. 
Gray and Vogel (2012) reported the highest return based on equally weighted portfolios for EBITDA/EV, followed by FCF/EV and GP/EV. For value-weighted portfolios, the same enterprise value-based multiples also dominated the two price-based multiples (i.e., E/P and B/P), but the rank order based on absolute returns was in this case GP/EV, followed by EBITDA/EV and FCF/EV, whereas based on the Fama-French three-factor alphas, the rank order of the three best individual valuation ratios was the same as in the case of equally weighted portfolios. Among three studies from the Finnish stock market, Leivo et al. (2009) documented the highest risk-adjusted return among the value portfolios formed on individual valuation ratios for the EBITDA/EV value portfolio, whereas in Pätäri et al. (2016), the EBIT/EV-based value portfolio was the best in terms of absolute returns, Sharpe ratios, skewness- and kurtosis-adjusted Sharpe ratios, 22 and size-adjusted alphas among all tertile portfolios formed on four individual valuation ratios. In Leivo and Pätäri (2011), EBITDA/EV generated the second highest value premium after D/P, but the corresponding value sextile portfolio return was only the fourth best in a comparison of the six selection criteria based on individual valuation ratios (the other four were E/P, CF/P, B/P, and S/P). Table 12 summarizes the numbers of the top and bottom rankings given to each valuation ratio among such studies in which at least three different individual valuation ratios have been compared (the corresponding numbers for all the enterprise value-based multiples are documented in the same row). Panel A provides the numbers for the studies that are based on U.S. sample data, whereas Panel B does the same for the studies that employed non-U.S. sample data. In Table 12, the marker a refers to size-adjusted findings and the marker b to risk-adjusted findings.

Evidence of the Benefits of Composite Value Criteria
The idea of combining value indicators to enhance the value portfolio performance and/or value premium is not new (e.g., see Graham, 1973; Rosenberg et al., 1985). The combination may add value if the value indicators are not highly correlated. Early empirical evidence of the benefits from combining individual valuation ratios was provided by Guerard, Jr. and Takano (1992). However, the existing literature on the empirical tests of the benefits of composite value criteria is relatively young (see Table 13 for the U.S. evidence and Table 14 for the non-U.S. evidence). To the best of our knowledge, Dhatt et al. (1999) were the first to report the results of performance comparisons between value portfolios based on both individual valuation ratios and a composite value criterion. The authors formed tertile portfolios of stocks included in the Russell 2000 Index, which is the commonly used U.S. small-cap benchmark, on E/P, B/P, and S/P. The portfolios based on the composite value criterion were formed by combining stocks with consistently low positive values by all three aforementioned valuation ratios into one portfolio, consistently medium positive values into another portfolio, and consistently high positive values into a third portfolio. 23 The composite value portfolio generated the highest return/risk ratios among all of the portfolios that were compared. However, the performance of the tertile portfolios based on individual valuation ratios and those based on the composite value criterion is not totally comparable, because the number of stocks was remarkably lower in the latter case. 24

Summary table of studies on composite value criteria (recoverable entries only; abbreviations as in the original table):
[...] July 1979-June 1997. The average annual return of the combined value pf slightly higher (1.56%) than for the value pf based on the best IVR (E/P). No gain in VP compared to the best IVR (S/P based on VP). No reduction in risk compared to E/P and B/P value pfs.
Piotroski (2000): Performance comparison between EW B/P value quintile pf and more concentrated double-sorted sub-pfs formed of the same value stocks (first based on B/P, and then based on F_Score). 1976-1996. Double-sorted sub-pf performed clearly better than single-sorted B/P quintile pf (the corresponding returns were 31.3% vs. 23.9%). However, the double-sorted pf included (on average) only 1/10 of the stocks of the B/P value quintile pf.
Chan and Lakonishok: Cross-sectional models for predicting future returns based on preceding B/P, CF/P, E/P, and S/P ratios. The estimated slope coefficients determine the weights for each IVR in the composite value indicator.
Dhatt et al. (2004): Performance comparison of the quintile pfs formed on 4 IVRs (B/P, E/P, CF/P and S/P) and all their median-adjusted 2-, 3- and 4-combinations. CRSP/Compustat 1,280 (1980)-2,314 (1997). 1980-1998. Composite value measures expand the set of efficient pf in the mean-variance framework. Among the 11 combination criteria, the highest VP (8.52%), as well as the highest value pf return (21.12%), is generated by the 2-combination of S/P and E/P.
Bartov and Kim (2004): [...]

A different methodological approach was introduced by Piotroski (2000), who examined whether the performance of the B/P-based value strategy could be boosted by accounting-based fundamental variables. He used the sum of nine individual binary signals (referred to as F_Score) as the measure for three areas (i.e., profitability, financial leverage or liquidity, and operating efficiency) of the firm's financial condition. Piotroski (2000) tested the F_Score and its ability to separate "winners from losers" within a quintile portfolio of high B/P U.S. companies. He showed that the mean return earned by a high B/P quintile investor could be increased by 7.5 percentage points annually through the selection of financially strong high B/P firms. However, in spite of numerous robustness tests, Piotroski did not compare the performance of the portfolio formed from the highest F_Score stocks picked from the top-quintile B/P stocks to that of equally sized portfolios formed from the highest B/P stocks. 25 To the best of our knowledge, the first study comparing the performance of value portfolios formed from individual valuation ratios with composite value portfolios of equal size is van der Hart et al. (2003). For the global sample of emerging market stocks, they used the combination of normalized B/Ps and E/Ps as the basis of a composite value measure and found a slight enhancement both in the value premium and in the average return of the top value portfolio in comparison with the results based on the individual valuation ratios. However, the added value from using the composite value measure was rather small for this particular sample. 
Some other studies analyzing emerging market data have also employed combined value measures, but they have used somewhat simpler nonparametric methodology for allocation of stocks into quantile portfolios. For example, Ding et al. (2005) and Brown et al. (2008a,b) allocated the stocks into portfolios in accordance with their average rankings based on three or four individual valuation ratios. However, no comparison of the results based on that kind of allocation system and those based on individual valuation ratios is made in any of these three studies. Chan and Lakonishok (2004) examined the efficacy of combining B/P, CF/P, E/P, and S/P ratios. By employing robust regression methods, they first estimated cross-sectional models that predicted future annual returns from the beginning-year values of each valuation ratio. The estimated slope coefficients determined the weights to be applied to the valuation ratio to arrive at the composite value criterion. The authors tested the efficacy of the above-described portfolio formation criterion with the three different samples, of which the first consisted of six largest-cap deciles of NYSE stocks, the second of the stocks that were in the sixth through the ninth deciles in the same stock exchange, and the third of largest-cap stocks in the MSCI EAFE Europe (Europe/Australasia/Far East) Index of non-U.S. countries. For all the samples examined, Chan and Lakonishok (2004) concluded that the use of the composite value criterion boosted the performance of the value strategy. The authors showed further that the outperformance was not explained by greater risk of value portfolios.
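The average-ranking allocation mentioned above can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the procedure of any cited study: each stock is ranked separately on every ratio (rank 1 for the highest ratio), the ranks are averaged, and the stocks are cut into equally sized portfolios. The function names, the tie handling, and the six-stock sample are hypothetical, and the treatment of negative or missing ratios is ignored.

```python
from statistics import mean


def ranks_descending(values):
    """Rank 1 goes to the highest ratio (the cheapest stock); ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def average_rank_portfolios(ratios, n_portfolios):
    """Sort stocks by their average rank across ratios and cut them into quantile portfolios."""
    tickers = list(ratios)
    ratio_names = list(ratios[tickers[0]])
    per_ratio = [ranks_descending([ratios[t][name] for t in tickers]) for name in ratio_names]
    avg_rank = {t: mean(r[i] for r in per_ratio) for i, t in enumerate(tickers)}
    ordered = sorted(tickers, key=lambda t: avg_rank[t])
    size = len(ordered) // n_portfolios
    # The first portfolio holds the best average ranks (value); leftovers go to the last one.
    portfolios = [ordered[k * size:(k + 1) * size] for k in range(n_portfolios - 1)]
    portfolios.append(ordered[(n_portfolios - 1) * size:])
    return avg_rank, portfolios


# Hypothetical cross-section of six stocks with three valuation ratios each.
sample = {
    "A": {"E/P": 0.12, "B/P": 1.5, "S/P": 3.0},
    "B": {"E/P": 0.10, "B/P": 1.2, "S/P": 2.5},
    "C": {"E/P": 0.08, "B/P": 0.9, "S/P": 1.0},
    "D": {"E/P": 0.06, "B/P": 1.0, "S/P": 2.0},
    "E": {"E/P": 0.04, "B/P": 0.5, "S/P": 0.8},
    "F": {"E/P": 0.02, "B/P": 0.3, "S/P": 0.5},
}
avg_rank, portfolios = average_rank_portfolios(sample, n_portfolios=3)
print(portfolios)
```

In this toy sample, stock D ends up ahead of stock C in the composite ordering although C has the higher E/P, which shows how an average-rank criterion can reorder stocks relative to any single ratio.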
Unfortunately, Chan and Lakonishok (2004) did not report the results based on individual valuation ratios, which would have been an interesting extension for comparability purposes. In contrast, Dhatt et al. (2004) did so for the same valuation ratios employed by the former authors. However, the latter authors used a somewhat simpler methodology and constructed quintile portfolios instead of the decile portfolios employed by the former ones. At the first stage, Dhatt et al. (2004) standardized each of the valuation ratios of a firm in a particular year by the median value of that ratio for all the firms in their final sample in that year. At the second stage, composite value measures were computed as simple averages of different combinations of these relative valuation ratios. According to their results, the highest return during the sample period was reported for the value portfolio that was based on the combination of E/P and S/P ratios. Although the best return/risk trade-off, as well as the lowest risk, was documented for the CF/P value portfolio, the authors concluded that using composite value measures can expand the set of efficient portfolios, enabling investors to achieve a wider range of return/risk trade-offs. Bartov and Kim (2004) examined whether the return of the B/P value portfolio could be boosted by including accruals as another selection criterion alongside B/P. The authors showed that by picking the stocks whose accruals are in the lowest quintile (within the full sample) from the highest B/P quintile, the average annual return for the value portfolio could have been enhanced by 2.4% p.a. 
The impact of adding accruals as another independent sorting dimension on the value premium was even more dramatic, because the value premium between the aforementioned value portfolio and the comparable glamour portfolio (consisting of those high-accruals quintile stocks that simultaneously belong to the low B/P quintile) was 20.6%, whereas based on the return difference between the top and bottom B/P quintile portfolios, it was 14.1% for the same sample period. However, Bartov and Kim (2004) did not compare the performance of portfolios formed on two independent sorts with that of the quantile portfolios of same size, which leaves the real benefits of the methodology employed open, like in the majority of studies on composite value criteria. Bird and Gerlach (2006) examined the extent to which fundamental accounting information can be used to better identify truly undervalued stocks. They used Gibbs sampling and model averaging in a logistic regression setting, employing fundamental accounting information as explanatory variables to enhance the performance of value investment strategies in the United States, the United Kingdom, and the Australian stock markets. According to the results, the stocks in the value portfolio that were most likely to show positive abnormal (i.e., above market) returns could be predicted more successfully through the use of fundamental accounting information. The highest added value from including accounting variables beside B/P in portfolio selection was documented in the United Kingdom, whereas in Australia, it was the lowest. Again, the authors did not compare the double-sorted (first based on B/P and then based on accounting variables) sub-portfolios with such B/P value portfolios that would have included the same number of constituent stock series as the former sub-portfolios. Guerard, Jr. (2006) tested the methodology introduced in Guerard, Jr. 
and Takano (1992), which they developed to improve the prediction power of valuation ratios on subsequent returns in Japanese stock markets. By using several variants of cross-sectional regression models, he documented a slight improvement in prediction accuracy (compared to that based on the use of either each individual valuation ratio alone or all four of them as explanatory factors) when the relative historical levels of individual valuation ratios, quoted as the ratios of their current level to their corresponding five-year averages, are added as explanatory factors to the regression model. By contrast, no improvement in prediction power (compared to that of the best individual valuation ratios) was documented without them. Moreover, Guerard, Jr. (2006) showed that the prediction power of the model could be further increased by also including one- and two-year projected E/Ps and their monthly revisions in the model.
Using a comprehensive sample of Finnish stocks, Leivo et al. (2009) compared the performance of tertile portfolios formed on the two combinations of B/P and EBITDA/EV, the three combinations of B/P, EBITDA/EV, and S/P, and the inverse of the Graham ratio (i.e., the product of E/P and B/P) with the performance of corresponding portfolios formed on individual valuation ratios. The first two combinations appeared to improve risk/return ratios of value portfolios. In addition, the authors noted that abnormal returns (i.e., Jensen's alphas) of value portfolios formed on composite value measures were generally less sensitive to changing stock market sentiment than those based on individual valuation ratios.
Following the methodology developed by Dhatt et al. (2004), Leivo and Pätäri (2011) reported the results for nine different value-only strategies that included three combination strategies. The authors compared the performance of sextile portfolios formed on the two combinations of D/P and two three combinations of which the first is based on D/P, B/P, and EBITDA/EV, and the second on D/P, B/P, and E/P. However, none of the combinations examined added value to value portfolio selection because of the superior performance of the D/P-based value portfolio.
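The two-stage procedure of Dhatt et al. (2004) described above (median standardization of each ratio, followed by a simple average) can be illustrated with a minimal Python sketch. The function names, the six tickers, and all ratio values are hypothetical and serve only to show the mechanics; the sketch also assumes that every firm has a positive value for every ratio, which the original studies handle through their own sample filters.

```python
from statistics import mean, median

def composite_scores(ratios):
    """ratios: {ticker: {ratio_name: value}} for one formation year.

    Stage 1: divide each ratio by the cross-sectional median of that ratio.
    Stage 2: take the simple average of the standardized ratios per firm.
    """
    names = sorted({name for firm in ratios.values() for name in firm})
    med = {name: median(firm[name] for firm in ratios.values()) for name in names}
    return {t: mean(firm[name] / med[name] for name in names)
            for t, firm in ratios.items()}

def quantile_portfolios(scores, k):
    """Split tickers into k groups, highest composite score (value) first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = len(ranked) // k
    # The last group absorbs any remainder when len(ranked) is not divisible by k.
    return [ranked[i * size:(i + 1) * size] if i < k - 1 else ranked[i * size:]
            for i in range(k)]

# Hypothetical E/P and S/P values for six firms (illustration only).
sample = {
    "A": {"E/P": 0.10, "S/P": 2.0},
    "B": {"E/P": 0.08, "S/P": 1.0},
    "C": {"E/P": 0.05, "S/P": 0.5},
    "D": {"E/P": 0.04, "S/P": 1.5},
    "E": {"E/P": 0.02, "S/P": 0.4},
    "F": {"E/P": 0.12, "S/P": 0.8},
}
scores = composite_scores(sample)
tertiles = quantile_portfolios(scores, 3)
print(tertiles)  # value tertile first, glamour tertile last
```

Because each ratio is scaled by its own median before averaging, no single multiple dominates the composite merely because of its units.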
To our best knowledge, Pätäri et al. (2010) were the first to show the applicability of data envelopment analysis (DEA) 26 for separating value stocks from other types of stocks (i.e., neutral and glamour) on the basis of input and output factors derived from the components of three traditional valuation ratios (i.e., E/P, B/P, and D/P). In their further study, Pätäri et al. (2012) examined the added value of using DEA as formation criteria for equity portfolio selection by including several new variables. Although most of their results are for criteria that combine the value and price momentum criteria, two criteria are based on composite value-only measures. The results for the Finnish sample data showed that these two criteria were very selective in identifying the best-performing stocks of the future to the extent that not only the DEA glamour tertile portfolio but also the DEA middle tertile portfolio was significantly outperformed by the corresponding value portfolio. However, neither of these studies compares the performance between DEA value portfolios and value portfolios based on other selection criteria.
Novy-Marx (2013) stated that controlling for gross profitability dramatically increases the performance of value strategies. He formed 5×5 portfolios based on independent sorts of gross profit-to-assets (GP/A) and B/P; furthermore, he documented the best performance for the portfolio that consists of stocks with both high B/P and high GP/A and the worst for the low B/P and low GP/A portfolio. The performance enhancement was particularly evident among the largest, most liquid stocks. However, Novy-Marx (2013) did not report the performance statistics for the B/P portfolios of comparable size. Moreover, the excess returns of the stocks with the lowest profitability in the highest B/P quintile were higher than those of the stocks with the highest profitability in the lowest B/P quintile, although the first-mentioned returns were the lowest among the highest B/P quintile and the latter returns the highest among the lowest B/P quintile.
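The independent double sort used in such studies, and the portfolio-size caveat raised above, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the eight stocks and their B/P, GP/A, and return figures are invented, and a 2×2 grid is used instead of the 5×5 grid so that the example stays readable.

```python
from statistics import mean

def quantile_of(values, k):
    """Map each key to a quantile index 0..k-1 (0 = lowest values) by rank."""
    ranked = sorted(values, key=values.get)
    n = len(ranked)
    return {key: rank * k // n for rank, key in enumerate(ranked)}

def independent_double_sort(first, second, k):
    """Sort on each variable separately, then intersect the two groupings."""
    q1, q2 = quantile_of(first, k), quantile_of(second, k)
    cells = {}
    for key in first:
        cells.setdefault((q1[key], q2[key]), []).append(key)
    return cells

# Hypothetical data for eight stocks (illustration only).
bp  = {"S1": 0.2, "S2": 0.3, "S3": 0.4, "S4": 0.5,
       "S5": 0.9, "S6": 1.0, "S7": 1.2, "S8": 1.5}
gpa = {"S1": 0.10, "S2": 0.40, "S3": 0.15, "S4": 0.45,
       "S5": 0.20, "S6": 0.50, "S7": 0.05, "S8": 0.35}
ret = {"S1": 0.02, "S2": 0.08, "S3": 0.03, "S4": 0.09,
       "S5": 0.10, "S6": 0.16, "S7": 0.11, "S8": 0.18}

cells = independent_double_sort(bp, gpa, 2)

def avg_ret(tickers):
    return mean(ret[t] for t in tickers)

# Spread between the two corner portfolios of the double sort.
corner_spread = avg_ret(cells[(1, 1)]) - avg_ret(cells[(0, 0)])

# Plain B/P premium: high-B/P half minus low-B/P half.
bp_q = quantile_of(bp, 2)
bp_premium = (avg_ret([t for t in bp if bp_q[t] == 1])
              - avg_ret([t for t in bp if bp_q[t] == 0]))

print(f"corner spread: {corner_spread:.4f} ({len(cells[(1, 1)])} stocks per corner)")
print(f"B/P premium:   {bp_premium:.4f} (4 stocks per side)")
```

In this toy data the corner spread exceeds the plain B/P premium, but each corner portfolio holds only half as many stocks as a B/P portfolio, so the two figures are not directly comparable without a B/P benchmark of matching size.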
As several previous papers have shown that firm size, financial leverage, and industry can affect the relative valuation of stocks, Pätäri et al. (2016) introduced a new and innovative methodology for equity portfolio selection by adjusting the traditional individual valuation ratios on the basis of firm size, financial leverage, and/or industry classification and combining them as hybrid selection criteria after harmonizing the resulting valuation scores. Their results showed that the simultaneous combining of valuation scores based on different multiples coupled with size, leverage, and industry adjustments pays off to the value investor. The top-tertile portfolios formed on the best combination criteria significantly outperformed not only the stock market portfolio but also the corresponding middle and bottom tertile portfolios. Moreover, the middle-tertile portfolios outperformed the comparable bottom-tertile portfolios for the same criteria. Consistent with previous literature, the division of the full sample period into bullish and bearish periods revealed that the outperformance of top-tertile value portfolios against the market portfolio is mostly attributed to the fact that they lose far less of their value during the bearish conditions than the market portfolio, or the middle- or bottom-tertile portfolios. 27 Thus, the use of multidimensional criteria offered better downside protection against stock market declines than one-dimensional valuation criteria or traditional valuation ratios.
Although almost all of the studies on the efficacy of composite value criteria have concluded that the combination adds value to the investor, only a few of them have provided impartial performance comparisons between portfolios based on such criteria and those formed on individual valuation ratios. To the best of our knowledge, the only exceptions are van der Hart et al. (2003), Guerard, Jr. (2006), Bird and Casavecchia (2007b), Leivo et al. (2009), and Pätäri et al. (2016). However, the overall evidence of the added value of combined strategies is relatively weak, because in only one of these six studies did the combination result in both higher returns and lower risk in comparison with the best criterion based on individual valuation ratios (i.e., in Pätäri et al., 2016). Nevertheless, this branch of literature is relatively young; therefore, further empirical studies are required to see the real potential of composite value criteria.

Explanations of Value Premium
The reasons for both value anomalies and value premium are widely discussed in the financial literature. The explanatory studies are usually divided into three categories. The first of these relies on risk-based explanations stating that the value premiums and/or related anomalies stem from hidden risk factors that have not been taken into account in the risk-adjustment procedures employed in the value premium studies. The second group of explanations is based on the assumption of irrational behavior of investors, which results in the mispricing of assets. The third group of studies asserts that value premium or related anomalies are artefacts of data snooping bias or other biases related to data or data processing. Of course, these explanations can also be intertwined so that the value premium and/or related anomalies are partially explained by omitted risk-based factors and partially by behavioural reasons, as well as by data-related biases. This section reviews and synthesizes the literature on explanations of value premium starting from risk-based reasons, which are often connected to the analysis of B/P-based value premium. Fama and French (1992) reported in their seminal study that size and B/P explain most of the anomalous differences in future stock returns. However, Daniel and Titman (1997) showed that, after controlling for size and B/P, returns are not strongly related to market βs calculated on the basis of the Fama-French 3-factor model (for a contrary view on this inference, see Davis et al., 2000). In contrast, Ang and Chen (2007) argued that when the tests allow for time-varying market βs, no evidence against a CAPM story for the value premium is left. However, Fama and French (2006) showed that the inferences of Ang and Chen (2007) were valid only for the 1926-1963 period, and furthermore, that during the 1963-2004 period, the value stocks had lower βs than the growth stocks, contrary to the CAPM requirements for explaining the value premiums. 
Moreover, contradicting the findings of Loughran (1997), Fama and French (2006) showed that the value premium is not restricted to small-cap stocks by rejecting the CAPM pricing formed on size, B/P, and market β during the 1928-2004 period. Daniel and Titman (2006) argued that the B/P anomaly is driven by overreaction to the part of the B/P ratio that is not related to accounting fundamentals. The other part of the B/P ratio that is related to the fundamentals did not appear to forecast returns, thus casting doubts on the explanation according to which violations of the CAPM could be captured by controlling for size and B/P effects that have been interpreted to represent proxies for distress risk by the proponents of market efficiency. Fama and French (1993) suggested that the value premium exists to compensate investors for the risks inherent in value stocks relative to growth stocks, which are not captured by the traditional CAPM of Sharpe (1964), Lintner (1965), and Mossin (1966). Using the neoclassical framework with rational expectations and competitive equilibrium, Zhang (2005) came to a parallel conclusion, but explained the value premium with the difference between value and growth companies in their ability to adjust the level of production to match the demand in varying economic conditions (i.e., the differences in operating leverage). The conclusion that the B/P anomaly relates to operating leverage is also supported by Carlson et al. (2004) and García-Feijóo and Jorgensen (2010). Moreover, Petkova and Zhang (2005) showed that the economic fundamentals of value firms responded negatively to economic shocks, whereas the same did not hold for growth stocks. They interpreted this as evidence that value stocks are riskier than growth stocks, at least in the adverse states of the world. Cooper (2006), Li et al. (2009), and Gulen et al. 
(2011) also agreed that the value premium is explained by the lesser flexibility of value firms in adjusting to worsening economic conditions compared to growth firms. Empirical evidence of the "operating leverage hypothesis" is also given by Novy-Marx (2011), who found that the value premium is driven by intra-industry differences in firms, and not by industry characteristics. By contrast, Fong (2012) found no evidence of macroeconomic risks explaining the value premium. Guthrie (2013) also showed that the value premium exists even without operating leverage or an industry-wide investment effect, according to which investments by other firms in the same industry buffer demand shocks, reducing the risk of assets in place in high-demand states (e.g., see Kogan, 2004; Aguerrevere, 2009). Though both operating leverage and industry-wide investment contribute to the observed B/P anomaly, they are not entirely satisfactory as explanations of this phenomenon (Guthrie, 2013).
When seeking further explanations for the B/P anomaly, Fama and French (1995) concluded that low B/P firms typically have high average returns on capital, and moreover, that high B/P companies are relatively financially distressed. The authors showed further that low B/P companies remained more profitable for at least five years after the portfolio formation, but that the earnings growth rates of high B/P firms became more similar to low B/P firms after the portfolio formation. They also found evidence that the market does not understand this convergence of earnings growth and that the market seems merely to extrapolate the strong earnings growth of low B/P firms and the weaker growth of high B/P firms. Similar findings were also reported by Chan et al. (2003), who showed that the market estimates the earnings growth of high B/P stocks too low, leading to a mispricing of stocks due to an over-pessimistic extrapolation of previous growth. The interpretation is also consistent with the conclusion of Penman (1996), who used the residual income valuation framework to illustrate expectations embedded in the price of a high B/P company. Vassalou and Xing (2004) showed that the B/P anomaly only exists in the two quintiles of the highest default risk, indicating that the B/P anomaly is for the most part related to the financial distress variable. In contrast, the role of financial distress as an explanation for the B/P anomaly is questioned or rejected in many papers. For example, Dichev (1998) found that the relation between bankruptcy risk and book-to-market is not monotonic. Although distressed firms generally have high B/Ps, the most distressed firms have lower B/Ps; therefore, a return premium related to default risk cannot fully explain the B/P anomaly even if distress were rewarded by higher returns. Similar conclusions were also drawn by Griffin and Lemmon (2002) and Campbell et al. 
(2008), who used Ohlson's (1980) O-score model and a dynamic logit model, respectively, to predict defaults. Piotroski (2000) also supported the argument made by Fama and French (1995) and suggested that accounting fundamentals such as leverage, liquidity, profitability trends, and cash flow adequacy could also be used in discriminating between companies within the high B/P set of firms. Previous literature has also shown that an average high B/P firm is in many cases neglected by the market and followed by fewer investors or analysts (see, e.g., Griffin and Lemmon, 2002; Jegadeesh et al., 2004; Doukas et al., 2005). This would also support the usefulness of financial statement analysis on high B/P firms, because the market is more likely to misprice companies that are not actively followed by investors. Penman et al. (2007) suggested that the B/P ratio could be decomposed into an enterprise B/P ratio and a leverage component reflecting financial risk. The authors also showed that as high B/Ps are associated with high returns, the leverage component is negatively associated with the returns. This suggests that the B/P-based value premium could be further enhanced if the leverage-related factors could be taken into account in the portfolio formation. However, this result is contrary to the belief that a higher amount of leverage and risk should yield higher excess return as a reward for the leverage risk, when the effect is in fact the opposite (for recent evidence of this, see e.g., Campbell et al., 2008; George and Hwang, 2010; Obreja, 2013). Penman et al. (2007) suggested that this finding could be due to one or more of the following explanations: measurement error in leverage, omitted operating risk factors that are negatively correlated with leverage, or mispricing of leverage by the market. 
Although the reason for the leverage effect was not explained, its existence at least on some level supported the conclusion that mispricing could happen within high B/P companies and that it might be exploited. Fama and French (2007a) traced three sources of the value premium: first, it is contributed by the value stocks that improve in type because their companies are acquired by other companies or because they earn high returns and migrate to a neutral or growth portfolio. Secondly, the value premium is attributed to the poor performance of some growth stocks earning low returns and thus moving to a neutral or value portfolio. The third reason for the value premium is the slightly higher returns of value stocks that do not migrate compared with the returns of corresponding growth stocks. In another related study, Fama and French (2007b) found the convergence in B/Ps of value and growth portfolios, which is caused by mean reversion in profitability and expected returns: B/Ps of value portfolios tend to fall as some value companies become more profitable, whereas B/Ps of growth portfolios rise as growth companies cannot reach the profitability level that is expected from them.
An alternative explanation for the value anomalies is based on the irrational behavior of investors, first proposed in the 1930s by Graham and Dodd (1934). This interpretation was re-launched in the theory of investments in the form of De Bondt and Thaler's (1985) overreaction hypothesis. The conclusion was supported by the results of Chopra et al. (1992) and Lakonishok et al. (1994), who applied it in the context of examining the value premium and drew conclusions parallel to the reasoning of the original hypothesis, finding little, if any, support for the view that value strategies are fundamentally riskier than glamour strategies. A similar conclusion was also drawn by Haugen and Baker (1996) and Brennan et al. (1998). Barberis et al. (1998) also stated that the naïve extrapolation of past growth causes stock prices to overreact in both directions, resulting in return predictability on the basis of valuation ratios.
In contrast, Doukas et al. (2002) showed that the value premium is not explained by over-optimism in analysts' EPS forecasts, thus rejecting their non-risk-based explanation of the value premium. In their follow-up paper, the same authors found support for the risk factor explanation as the source of value premium when using the standard deviation of analysts' EPS forecasts as a risk proxy (Doukas et al., 2004). The authors suggested that the abnormal returns of value stocks reflect compensation for higher risk as measured by the dispersion in analysts' EPS forecasts.
According to Daniel et al. (2001), investors' overconfidence induces overreaction, and extreme B/P ratios are caused by overreactions to private signals. Phalippou (2008) showed that the value premium is concentrated in stocks mostly held by individual investors and declines from the lowest to the highest institutional ownership decile, consistent with behavioural explanations of the value premium. Parallel conclusions were also drawn by Bartov and Kim (2004), who found that the B/P anomaly is stronger among firms held primarily by small (unsophisticated) investors and followed less closely by market participants than among firms with considerable institutional ownership and analyst coverage. The authors divided their sample into two subsamples based on stock price; they documented clearly higher value premiums for the sample of stocks whose price was below $10 than for those whose price was at least $10. 28 Thus, the mispricing explanation for the B/P anomaly held primarily for a subset of stocks with unsophisticated ownership, as investment professionals are typically unable to invest in firms with a stock price less than $10 due to institutional restrictions. Consistent with the mispricing explanation, Ali et al. (2003) showed that the B/P anomaly is stronger among stocks with a higher idiosyncratic return volatility, higher transaction costs, and lower investor sophistication.
The recent results of Piotroski and So (2012) and Hwang and Rubesam (2013) also support the mispricing hypothesis. The first-mentioned authors found that prices of glamour (value) firms reflect systematically optimistic (pessimistic) expectations. Thus, the value/glamour effect should be concentrated (absent) among firms with (without) ex ante identifiable expectation errors. Classifying firms based upon whether expectations implied by current valuation multiples are congruent with the strength of their fundamentals, Piotroski and So (2012) documented that value/glamour returns and ex post revisions to market expectations were predictably concentrated (absent) among firms with ex ante biased (unbiased) market expectations. Hwang and Rubesam (2013) found that the large and positive average value premium comes from the correction of mispricing, which is accelerated during the bearish market sentiment. According to their results, the correction is more severe for value stocks than for growth stocks due to the higher uncertainty of value stocks. As a result, value stocks tend to outperform growth stocks during bad times, because the underpricing of value stocks is corrected faster during bear markets when volatility increases. Arnott and Hsu (2008) also argued that both size and value anomalies are driven by pricing noise, but the authors did not exclude the possibility that such anomalies could also be partially driven by hidden risk factors or behavioural irrationalities.
A third group of explanations for the existence of the value premium relies on the data snooping bias or other biases related to data (e.g., see Black, 1993; Conrad and Kaul, 1993; Ball et al., 1995; Kothari et al., 1995; Conrad et al., 2003). However, in the light of recent results on the value premium documented all around the world, it seems unlikely that all of the evidence of its existence could be explained by these types of biases (e.g., see Markowitz and Xu, 1994; Guerard, Jr. et al., 2012). As the ongoing academic debate on the reasons for the value premium indicates, the research community is still far from a consensus in this respect.

Conclusions
This paper provides an extensive literature review on value premium and the related anomalies. The current literature is based mainly on two different approaches, the first of which uses cross-sectional regression models in explaining future returns with the variables based on the most recent information from stock markets and/or financial statements. The other approach divides the full sample of stocks into quantile portfolios based on the same type of information and then compares the performance of quantile portfolios to each other and/or to the stock market average. It is noteworthy that these two approaches can produce different conclusions about relative efficacy of individual valuation ratios even for the same sample data because the outperformance of a quantile portfolio may stem from the superior performance of the minority of stocks included in that specific quantile, whereas the cross-sectional regression may not find significant causality due to the diversity of performance of individual stocks included in the same quantile.
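The way the two approaches can diverge on the same data can be illustrated with a small Python sketch. All numbers are invented for illustration: nine stocks are ranked by a valuation ratio, and one stock in the top tertile earns a very high subsequent return. The sketch uses a single cross-section with a plain OLS slope and its t-statistic, which is a simplification of the regression designs used in the literature.

```python
from math import sqrt
from statistics import mean

# Hypothetical cross-section: nine stocks ranked by a valuation ratio
# (1 = lowest, 9 = highest) and their subsequent returns. Invented data.
ratio_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9]
future_ret = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.00, 0.00, 0.60]

# Portfolio approach: top-tertile mean return minus bottom-tertile mean return.
bottom, top = future_ret[:3], future_ret[-3:]
tertile_spread = mean(top) - mean(bottom)

# Regression approach: OLS of return on rank, with the slope's t-statistic.
n = len(ratio_rank)
x_bar, y_bar = mean(ratio_rank), mean(future_ret)
s_xx = sum((x - x_bar) ** 2 for x in ratio_rank)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ratio_rank, future_ret))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
sse = sum((y - intercept - slope * x) ** 2
          for x, y in zip(ratio_rank, future_ret))
t_stat = slope / sqrt(sse / (n - 2) / s_xx)  # slope divided by its standard error

print(f"tertile spread: {tertile_spread:.3f}")
print(f"OLS slope: {slope:.4f}, t-statistic: {t_stat:.2f}")
```

In this toy example the top tertile beats the bottom tertile by 15 percentage points, entirely because of a single stock, while the slope's t-statistic of roughly 1.4 stays below the conventional 5% significance threshold for seven degrees of freedom.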
Among individual valuation ratios, the two most-examined are clearly B/P and E/P, of which the former has in most cases generated both the higher value premium and the higher value portfolio return. However, the pairwise efficacy comparisons of individual valuation ratios based on overall evidence are troublesome, because the samples used for the E/P criterion are often much smaller than those used for B/P due to the commonly used methodological choice to exclude negative-earnings firms from the former samples. The same kind of dilemma is also faced in portfolio divisions based on D/Ps, due to a remarkable subsample of zero-dividend stocks. However, based on overall evidence, D/P has been clearly the least efficient criterion among individual valuation ratios, particularly in the U.S. markets, though some exceptions have been documented in small European national markets, where it has been the best. By contrast, the S/P criterion has proven to be very successful in the U.S. markets in those few studies in which it has been employed. In the light of those results, the number of studies that have included S/P as one portfolio formation criterion is surprisingly low, which provides a clear research gap for forthcoming studies. For the non-U.S. sample data, the evidence on the efficacy of S/P is more mixed than it is for the U.S. sample data.
Based on overall evidence, the CF/P criterion has clearly performed better than E/P, particularly in that it has very seldom been the worst criterion in such studies where three or more individual valuation ratios have been compared. Analogous to E/P-based value premiums, the CF/P-based value premiums can also vary remarkably depending on whether or not the firms with negative cash flows are included in the sample. In comparison with B/P, CF/P has performed approximately equally in U.S. markets, whereas in non-U.S. markets, overall evidence is somewhat inclined toward B/P.
For the present, the academic literature on the efficacy of enterprise value-based valuation ratios has been scant, but the results on their relative efficacy have been promising. In the light of those few results, it is very likely that enterprise value-based multiples will get more emphasis in forthcoming studies. One clear research gap related to enterprise value-based multiples is that so far, to the best of our knowledge, none of the studies have included the S/EV criterion in their comparisons. This is even more surprising in the light of the good performance of the S/P criterion, because S/EV has more solid theoretical foundations than S/P, and in addition, may benefit from the lack of academic research on the related anomaly. 29 This literature survey reviewed 10 such studies that reported the performance enhancement for value portfolios formed on composite value criteria compared to their counterparts formed on individual valuation ratios. However, in the majority of these studies, the value portfolios based on the former criteria were much narrower than the portfolios of individual valuation ratios. Thus, the real performance enhancement stemming from the use of composite value criteria is questionable in all such studies. After careful review, we identified only one study in which the use of composite measures had resulted in both higher return and lower risk without narrowing the value portfolios being compared. However, the literature on composite value criteria is young and based for the most part on very simple methods of combination. Therefore, it is too early to draw such a conclusion that the composite value criteria could not add value to the investor, 30 although on the other hand, the use of more sophisticated portfolio formation criteria does not necessarily result in better performance. However, it is likely that their use in academic research will proliferate in the near future. 
In conclusion, overall evidence shows that the best criteria may vary over time 31 and across markets. The efficacy ranking of the valuation criteria may also depend on methodological choices such as general research design (cross-sectional regression or portfolio formation approach), frequency of quantile division of portfolios, return calculation practice (equally or value-weighting), performance metrics employed, treatment of firms with negative valuation ratios (to include or exclude), and the lengths of the holding and selection periods, for example.

Notes
sum of which Dhatt et al. (2004) subtracted preferred dividends. Instead, Israel and Moskowitz (2013) and Hou et al. (2015) used income before extraordinary items plus equity's share of depreciation plus deferred taxes).
19. In addition, Fama and French (1998) documented that in nine out of 12 developed non-U.S. stock markets, the value premium was higher based on CF/P than on the basis of E/P, whereas in comparisons of value portfolio returns, CF/P was superior to E/P in eight out of 12 cases.
20. According to Fama and French (1998), in five out of 12 national developed non-U.S. stock markets, the value premium was higher based on CF/P than based on B/P, whereas in comparisons of value portfolio returns, CF/P was superior to B/P in six out of 12 cases.
21. The Sortino ratio is calculated by dividing the average excess return of the portfolio by the semistandard deviation of the corresponding period returns (see Sortino and van der Meer, 1991, for details).
22. The skewness- and kurtosis-adjusted Sharpe ratio was first introduced in a hedge fund study of Pätäri and Tolvanen (2009).
23. Stocks with negative values for any of the three valuation ratios were excluded from the sample. The average total number of companies included in three portfolios formed on the composite value measure decreased to 536 from 1958, indicating that the great majority of companies did not have consistently low, medium, or high values of the three ratios used as the basis of the composite value criterion.
24. If the value portfolios based on individual valuation ratios had included only the same number of stocks as the composite value portfolio, their performance might have been as good as or even better than the performance of the composite value portfolio. In the light of the fact that the value premium is usually stronger when the frequency of quantile division is higher, and on the other hand, that the reported performance differences between tertile portfolios based on individual valuation ratios and those based on the composite value criterion are small, this is likely.
25. This kind of comparison would have been particularly valid and interesting in the light of the fact that, on average, the former portfolios included only one-tenth of the stocks included in the highest B/P quintile portfolios (see Table 3 in Piotroski 2000). Therefore, the seemingly dramatic added value of the use of F_scores beside B/P criterion might be fully or partially explained by the size differences of the portfolios being compared, analogous to the aforementioned case of Dhatt et al. (1999).
26. DEA was originally developed by operations researchers for efficiency comparison purposes (e.g., see Charnes et al., 1978).
27. In this sense, the results of Pätäri et al. (2016) are parallel to those of Lakonishok et al. (1994), Bird and Whitaker (2003), and Hwang and Rubesam (2013).
28. For the first subsample, the value premium based on the return difference of top and bottom quintile portfolios was 20.7%, whereas for the second subsample it was only 5.2%. The high B/P quintile return for the first subsample was 20.9% and 16.3% for the second, indicating that the value premium difference is for the most part explained by poor performance of low B/P stocks whose price is below $10.
29. For example, McLean and Pontiff (2014) documented the degradation of profits from anomalies after their publication.
30. For example, see the empirical evidence in Leivo et al. (2009), Israel and, and Pätäri et al. (2016), according to whom the combination strategies make the value portfolio performance more stable over time and less prone to varying stock market conditions.
31. See also Davis (2001) for evidence of time-variability in relative value premiums calculated on the basis of four individual valuation ratios (B/P, CF/P, E/P, and S/P).