ARE FAMILY-FRIENDLY WORKPLACE PRACTICES A VALUABLE FIRM RESOURCE?

We study the determinants and consequences of family-friendly workplace practices (FFWP) using a sample of over 450 manufacturing firms in Germany, France, U.K., and U.S. We find a positive correlation between firm productivity and FFWP. This association disappears, however, once we control for a measure of the quality of management practices. We further find that firms with a higher proportion of female managers and more skilled workers, as well as well-managed firms, tend to implement more FFWP. Conversely, a firm's environment does not have a significant impact on the FFWP it provides.


Introduction
Strategists' key interest is in studying the performance impact of (strategic) firm actions.
Recently, actions aimed at leveraging and securing firm resources have been studied intensively.
Here, actions were measured against their ability to make a resource valuable, rare, and/or inimitable (Barney, 1991). Human capital is often considered a potential firm resource (Pfeffer, 1994; Koch and McGrath, 1996; Conner and Prahalad, 1996; Lee and Miller, 1999), sparking intense interest among strategy scholars in the performance effects of (strategic) human resource management (SHRM) (Cappelli and Singh, 1992; Pfeffer, 1994; Ulrich, 1991; Wright and McMahan, 1992; Huselid, 1995; Fey, Björkman, and Pavlovskaya, 2000; Batt, 2002). Some empirical studies of SHRM forge links to the specific elements of a VRIO framework (resources must be valuable, rare, inimitable, and require organizational support), typically to policies leveraging the value (V) of a firm's workforce (Koch and McGrath, 1996) and increasing the inimitability (I) of human resources (HR) by increasing employee retention (Perry-Smith and Blum, 2000). 1 However, many actions taken by firms do not seem to affect their financial performance much, which led scholars to widen the definition of firm performance to include corporate social performance (Brammer and Millington, 2008; David, Bloom, and Hillman, 2007) or environmental response (Murillo-Luna, Garcés-Ayerbe, and Rivera-Torres, 2008), and to consider antecedents of actions affecting other performance dimensions (Young, Charns, and Shortell, 2001; Murillo-Luna et al., 2008; Coombs and Gilley, 2005).
Generally speaking, the link between nonfinancial stakeholder demands and firm actions suggests that firms consider multiple factors when choosing their actions. From a strategic perspective, what is especially interesting is whether actions influenced by nonfinancial stakeholders are detrimental to firm financial performance (i.e. if they are inconsistent with firms maximizing profits), or whether they complement other policies in making firm resources valuable, rare, and inimitable.
1 For example, by encouraging and rewarding feedback from lower-ranked employees a firm can achieve superior market knowledge. Similarly, providing long-term financial incentives aligns executives' incentives with shareholder goals, both of which will help a firm get the most out of its employees (raising V). Conversely, giving employees the option to scale down their working hours, or offering childcare support within the firm, can help a firm in retaining valuable employees (increasing I) without necessarily increasing their productivity.
The management of human capital plays a crucial role in this line of research (Conner and Prahalad, 1996). On the one hand, the link between employees and performance seems obvious, but many strategies designed to improve workforce productivity are only expected to translate into 'hard' performance measures like firm value (Perry-Smith and Blum, 2000; Arthur, 2003; Delaney and Huselid, 1996; Edmans, 2007) or accounting profits (Huselid, 1995; Huselid, Jackson, and Schuler, 1997) through 'soft' channels like organizational commitment (Lee and Miller, 1999; Eaton, 2003) or employee turnover (Guthrie, 2001; Huselid, 1995). On the other hand, the reasons for implementing certain employee-friendly practices are not clear-cut, as employees are an input factor as well as a stakeholder group. This is especially salient for family-friendly workplace practices (FFWP), which do not directly affect the workplace, but rather enhance the ability of employees to combine working and personal life. Nevertheless, prior work looking at the association between FFWP and firm performance (Perry-Smith and Blum, 2000; Gray and Tudball, 2003; Milliken, Martins, and Morgan, 1998) has generally found a positive association whether performance was measured in terms of work attitudes (Kossek and Ozeki, 1998; Lobel, 1999; Van Yperen and Hagedoorn, 2003), organizational citizenship (Organ and Konovsky, 1989; Schnake, 1991; Smith, Organ, and Near, 1983), or even firm productivity (Konrad and Mangel, 2000; George, 2005; Collins and Clark, 2003).
The positive link between FFWP and performance may be problematic for several reasons. First, FFWP may simply be part of a wider set of management practices found in well-performing firms. If a well-managed firm uses a number of performance-enhancing management practices and concurrently uses FFWP, omitting the set of other practices in performance regressions creates a spurious correlation between FFWP and performance, a so-called 'false positive'. 2 Second, while the provision of FFWP may improve morale and employee retention, the link between FFWP and a set of hard performance measures is more controversial and has to be studied rigorously. For example, employee retention may improve after FFWP are introduced, but there may be costs of production inflexibility reducing productivity or costs of implementation reducing profitability. Third, the FFWP-performance link may not give a conclusive picture because firms may provide FFWP for reasons other than enhancing financial performance. FFWP provision may be due to firm characteristics unrelated to firm performance, which would be ignored in a narrowly defined study on FFWP and performance.
Further, a positive overall assessment of FFWP, if confirmed, would leave us with a puzzle: If FFWP were unambiguously good for firm performance, all firms should introduce them of their own accord (which they do not). One potential explanation is that FFWP carry significant costs of implementation borne by the firms while the returns accrue to both firms and workers, which creates a divergence of private and public returns and the resulting underinvestment (or underprovision) problem. Another explanation is that firms do not know the extent of benefits of FFWP, and therefore prefer to delay introducing them until they know more about them (Bryson et al., 2007). A third is that firms have differing benefits from FFWP and therefore some choose to adopt them, while it is not profitable for others to do so due to their employee base (Konrad and Mangel, 2000; Gray and Tudball, 2003) or their strategy (De Cieri et al., 2005; Becker, Huselid, and Ulrich, 1996). A fourth possible reason for this puzzle is that firms have different preferences regarding the balance of their employees' well-being and the financial performance of the firm. This weighting is influenced by both internal and external circumstances of the firm (Perry-Smith and Blum, 2000; Brammer and Millington, 2008).
There has been little or no research that distinguishes among these factors and gives an explanation for the differential use of FFWP. In this paper, we study two connected aspects related to the strategic use of FFWP. First, we investigate the 'effect' of FFWP on firm performance measured in multiple ways while controlling for multiple factors, including the quality of management practices (Bloom and Van Reenen, 2007, 2009). 3 This imposes more stringent conditions on identifying FFWP as a bundle of activities constituting a firm resource (Koch and McGrath, 1996; Perry-Smith and Blum, 2000). Second, we study 'determinants' of FFWP. That is, we look for differences in firm characteristics and external circumstances affecting firms' propensity to implement FFWP (Konrad and Mangel, 2000; Perry-Smith and Blum, 2000). Specifically, we ask two key questions:
i) Are FFWP positively correlated with firm performance?
ii) Which firms are likely to adopt FFWP?
The first question studies the FFWP-firm performance link. While the direction of causality is difficult to establish in cross-sectional data, we subject the claim of a positive correlation between FFWP and firm performance to a rigorous analysis, running a large number of robustness tests with different measures of our dependent and independent variables. In particular, we control for the quality of management practices, thus presenting a more rigorous test of FFWP as a firm resource (Arend, 2006). Using a novel survey tool on the provision of FFWP as well as management practices, and matching this data with detailed information on firm financial performance, we address two common problems in testing for firm resources (Arend, 2006; Newbert, 2007): first, the confirmatory bias arising from omitting other performance-enhancing resources, and second, the problems associated with using narrow performance measures. We thus believe that our study gives a nuanced picture of the provision and impact of FFWP and its validity as a firm resource.
3 Note that we will use 'performance' to indicate a firm's financial performance unless noted otherwise.
We surveyed over 450 firms in Europe (Germany, France, and the U.K.) and the U.S. to gather data on firm performance, management, and FFWP, and uncover a number of surprising results. First, we find that, contrary to much of the existing literature on FFWP, the positive association between firm performance and FFWP disappears once controls for management practices are included. This suggests that some of the earlier results on the positive performance impact of FFWP (Batt, 2002; Fey et al., 2000; Perry-Smith and Blum, 2000; Gray and Tudball, 2003; Milliken et al., 1998) may be due in part to omitted variable bias, as these studies do not control adequately for management quality. This calls for recasting FFWP as a nonmarket strategy affecting outcomes other than financial performance. Second, we find that 'internal' factors like the gender and skills composition of the workforce and the overall quality of management play an important role in FFWP provision.
Our paper is structured as follows: In the next section, we introduce a general framework for the provision of FFWP and their impact on firm performance. We then derive testable hypotheses on the impact and determinants of FFWP before we give a detailed discussion of our dataset and the procedures used to collect it. We subsequently present and discuss our results, and finally provide concluding comments.

A General Framework
Although FFWP are often considered part of high-performance HR strategy and good management in general, the first-order effect of FFWP is to 'provide relief for non-work concerns' (Perry-Smith and Blum, 2000: 1108). To conceptualize this, we separate FFWP and a set of 'good' management practices shown to improve firm productivity. Consider a simple approach of characterizing the effects of good management and FFWP:
w = f(X, M, D) (1)
y = g(X, M, D) (2)
where w = work-life balance (WLB) outcomes and y = performance outcomes (such as productivity or profitability). X is an index of FFWP (such as child/family care flexibility and subsidy; a complete list is given in Table 1a), and M is an index of good management practices (such as better shop-floor operations like lean manufacturing, sensible targets, and merit-based promotion procedures; see the section on scoring FFWP and management practices below).
We model these as being composite measures of several underlying practices, so that X = x(X_1, X_2, X_3, …) over the individual FFWP and M = m(M_1, M_2, M_3, …) over different management 'best practices'. The x(.) and m(.) are non-decreasing functions of the arguments. Finally, D is a vector of control variables. Note that we consider WLB an 'outcome' and FFWP an 'input'.
We expect better management practices to be associated with better performance: ∂y / ∂M ≥ 0. We also expect more available family-friendly policies to be associated with improved WLB outcomes, i.e. ∂w / ∂X ≥ 0 (Kossek and Ozeki, 1998; Lobel, 1999; Van Yperen and Hagedoorn, 2003; Organ and Konovsky, 1989; Schnake, 1991; Smith et al., 1983). In Appendix A1, we confirm this using a measure of self-reported WLB. The focus of this study, however, is on the role of FFWP ('X') in equation (2), especially ∂y / ∂X, the conditional association of FFWP with performance. If FFWP are implemented predominantly to improve workers' well-being rather than to improve firm performance, we expect ∂y / ∂X ≤ 0, so that there is no positive association between more FFWP provided by the firm and superior performance. Firms may then still implement FFWP because ∂w / ∂X ≥ 0 (i.e. more FFWP imply better WLB) and they care about workers' well-being as well as financial performance. This could be due to the firm owners' preferences, because of regulatory pressures, or labor unions. If FFWP help make human capital a more valuable or more inimitable firm resource by helping employees work more productively or help the firm retain talented staff, we expect ∂y / ∂X ≥ 0, i.e. better FFWP are positively correlated with performance. 4 We also consider the drivers of FFWP provision. Consider a set of factors Z = (Z_1, Z_2, Z_3, …) that may affect these practices. These factors can be internal or external to the firm and may proxy for pressure by nonfinancial stakeholders to implement certain organizational strategies such as FFWP.
We therefore model FFWP as a function of these factors:
X = h(Z_1, Z_2, Z_3, …) (3)
Especially in the context of internal factors and characteristics, this allows for a simple test of whether FFWP provision is used to increase the value of a resource: If an interaction term between X (FFWP) and a factor Z_i (say, the proportion of skilled employees) is positive in equation (2), this would indicate that Z_i is a valuable resource that is made more effective by providing favorable FFWP. We investigate this in more detail when we discuss our regression results.
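The interaction test can be written out for a linear version of equation (2). The linear form and the coefficient names below are illustrative assumptions, not the specification estimated in this paper.

```latex
\begin{align}
y &= \alpha + \beta X + \gamma Z_i + \delta\,(X \times Z_i) + \theta M + \lambda' D + \varepsilon, \\
\frac{\partial y}{\partial X} &= \beta + \delta Z_i .
\end{align}
```

Under this assumed form, a positive \delta means that the association between FFWP and performance is stronger in firms with more of Z_i, which is the pattern the test looks for.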
This framework is useful for a number of reasons: First, we can separate the effects of firm and environmental characteristics on FFWP and performance. Second, we can draw some conclusions about why firms provide FFWP. If there is a negative, or zero, correlation between FFWP provision and firm performance, FFWP may be provided because firms take other factors (like employee well-being or corporate social responsibility) into account or because firm characteristics or circumstances dictate the implementation of FFWP.
In the following section, we derive hypotheses on the correlation between FFWP and performance (Hypothesis 1) and on the determinants of FFWP (Hypotheses 2-5).

Uncovering the FFWP-performance link
We first derive our hypothesis on the link between FFWP and firm performance. The management of human capital and its link to firm performance has generated significant academic interest (Conner and Prahalad, 1996). SHRM more generally (Cappelli and Singh, 1992; Pfeffer, 1994; Ulrich, 1991; Wright and McMahan, 1992; Huselid, 1995) and high-commitment workplace practices specifically have been studied in much detail (Perry-Smith and Blum, 2000; Arthur, 2003; Delaney and Huselid, 1996; Huselid, 1995; Huselid et al., 1997; Lee and Miller, 1999; Eaton, 2003; Guthrie, 2001), and a subset of these looks at the impact of family-friendly policies on firm performance (Perry-Smith and Blum, 2000; Gray and Tudball, 2003; Milliken et al., 1998). Using a number of different performance measures such as work attitudes (Kossek and Ozeki, 1998; Lobel, 1999; Van Yperen and Hagedoorn, 2003), organizational citizenship (Organ and Konovsky, 1989; Schnake, 1991; Smith et al., 1983), and firm productivity (George, 2005; Collins and Clark, 2003), the literature is united in its view that FFWP positively affect firm performance, which is summarized in our first hypothesis:
H1: There is a positive association between more FFWP and firm performance.
To accurately capture the performance effects of FFWP, however, we have to control for factors that may be correlated both with FFWP and performance and may therefore generate spurious correlation if omitted. To avoid this problem and following equation (2), we include a set of management practices that have previously been shown to be positively correlated with firm performance. We outline this approach in more detail when we discuss our results.
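The omitted-variable concern can be illustrated with a small simulation. The sketch below uses synthetic data under an assumed data-generating process, in which management quality M drives both FFWP provision X and performance y while X has no direct effect on y. All parameter values and function names are hypothetical and are not taken from our data.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

def slope_simple(y, x):
    """Coefficient on x in an OLS regression of y on x and a constant."""
    return cov(x, y) / cov(x, x)

def slope_controlled(y, x, m):
    """Coefficient on x in an OLS regression of y on x, m, and a constant."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    return (cov(x, y) * smm - cov(m, y) * sxm) / (sxx * smm - sxm ** 2)

rng = random.Random(0)
n = 5000
# Assumed process (illustrative only): M raises both X and y; X does not enter y.
M = [rng.gauss(0, 1) for _ in range(n)]
X = [0.6 * m + rng.gauss(0, 1) for m in M]
y = [1.0 * m + rng.gauss(0, 1) for m in M]

beta_short = slope_simple(y, X)         # management omitted
beta_long = slope_controlled(y, X, M)   # management included

print(f"FFWP coefficient, M omitted:  {beta_short:.3f}")
print(f"FFWP coefficient, M included: {beta_long:.3f}")
```

With these assumed parameters, the coefficient on X is clearly positive when M is omitted and close to zero once M is included, which is the spurious-correlation pattern the control for management practices is meant to rule out.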

Determinants of FFWP provision
In the following four hypotheses, we identify a number of covariates Z_1, Z_2, etc. in equation (3).

Workers' skill levels
The literature on SHRM emphasizes the importance of a highly qualified workforce as a factor for competitive advantage (Pfeffer, 1994; Wright, McMahan, and McWilliams, 1994; Greenhaus and Parasuraman, 1999; Lobel, 1999). Training and development of firm-specific knowledge make skilled workers a scarce resource for the firm (Legge, 1998; Snell and Dean Jr, 1992; Kleiner et al., 1987; Terpstra and Rozell, 1993) as their knowledge would not be easily replaceable (Barney, 1991; Wright and McMahan, 1992; Wright et al., 1994; Conner and Prahalad, 1996). Since college-educated workers are more likely to receive such training, they may be able to extract a larger part of the firm's quasi-rents (Freidson, 1970; Raelin, 1986).
While part of these rents will be distributed by way of higher salaries, highly skilled workers may also demand more FFWP as a result of their bargaining position (Osterman, 1995). We therefore hypothesize the following:
H2: Firms with a higher proportion of skilled employees offer more FFWP.

Female participation
The majority of family and caring duties are fulfilled by women (Shelton and John, 1996; Greenhaus and Parasuraman, 1999; Abbott, De Cieri, and Iverson, 1998; Borrill and Kidd, 1994; Judge, Boudreau, and Bretz, 1994; Konrad and Mangel, 2000), and FFWP are affected by the proportion of employees likely to take them up when offered. We expect the proportion of female employees to affect the provision of FFWP for two related reasons: First, female employees may demand more favorable FFWP at their workplace (Konrad and Mangel, 2000; De Cieri et al., 2005), which makes it advantageous and/or necessary for a firm to provide them.
Note that this is irrespective of whether FFWP affect performance or not. Following our basic model introduced in our general framework, equation (2) allows for other motivations such as employee WLB to guide FFWP provision. Assuming that women demand and benefit relatively more from FFWP than men, it would simply be a case of enhancing WLB for a large proportion of a firm's employees. Second, the set of female employees best placed to implement improvements in FFWP are managers (Harel, Tzafrir, and Baruch, 2003; Harrigan, 1981; Daily, Certo, and Dalton, 1999). Managers are also likely to be in a better bargaining position to negotiate practices beneficial to their own well-being. Both these factors imply that, irrespective of the overall proportion of female employees, a higher proportion of female managers will be associated with more favorable FFWP (Goodstein, 1994, 1995; Ingram and Simons, 1995). In other words, we expect female managers to be associated with more FFWP, first because they are likely to be considered 'important' employees by top management, but also because they are more likely to overcome resistance by top management to implement these practices. We summarize our hypotheses on the role of female employees, particularly managers, on FFWP as follows:
H3a: Firms with a higher proportion of female employees offer more FFWP.
H3b: Firms with a higher proportion of female managers offer more FFWP.

Management practices
We argued that good management and a multitude of FFWP offered fulfill different roles for the firm. For example, use of 'Total Quality Management' (Young et al., 2001), bundles of HR practices (Huselid, 1995), and specific practices like mentoring (Ragins, Cotton, and Miller, 2000), as well as use of FFWP may simply be signs that an organization can effectively meet the needs of different stakeholders: investors, employees, or even society as a whole (Brammer and Millington, 2008; Murillo-Luna et al., 2008; David et al., 2007). This may be either because the decision-makers within an organization have the skills and abilities to successfully implement firm practices benefiting any group of stakeholders, or because there is a financial incentive for management to implement them (Coombs and Gilley, 2005). Further, if both good management in general and FFWP specifically contribute to resource-building as argued above (Barney, 1991), we would expect both to occur jointly. We therefore hypothesize the following:
H4: Firms with good management practices offer more FFWP.

Competitive pressure
In addition to internal characteristics of the firm associated with the provision of more FFWP, there may also be external pressure to provide more (or less) of these practices. The firm's environment and pressure to focus on financial results is expected to play a particularly important role. Such pressures are likely to originate either from a firm's exposure to foreign competition or from deregulation or other structural features of its primary market. Previous work shows that the degree of competition and other structural features of a firm's product market may affect the provision of HR practices in general (Youndt et al., 1996; Koch and McGrath, 1996; Datta, Guthrie, and Wright, 2005) and FFWP in particular (Perry-Smith and Blum, 2000; Milliken et al., 1998), not least because pressure on financial results demands practices like long hours (De Cieri et al., 2005; Kirby and Krone, 2002; Nord et al., 2002; Smith, 1994; Wolcott and Glezer, 1995), which are incompatible with a wide range of FFWP in a firm (Ouchi, 1980; Tsui et al., 1997; Ehrenberg and Smith, 1997; Gerhart and Milkovich, 1992; Coff, 1997). Further, previous research has found that competitive pressure forces firms to improve their management practices (Bloom and Van Reenen, 2007, 2009), which suggests that underutilized resources or practices delivering modest (if any) financial benefits will be avoided.
We summarize these arguments in the following hypothesis:
H5: Firms in more competitive product markets offer fewer FFWP.

Data and Procedures
To investigate these issues, we construct robust measures of FFWP, WLB, management practices, and our independent and control variables across our four sampled countries (Germany, France, U.K., and U.S.). We first discuss the collection of FFWP, WLB, and management data, which was undertaken using an innovative firm survey tool, and then the collection of performance data and firm characteristics taken from more standard firm and industry data sources. This data is also freely available on-line to enable replication of the results in the paper. 5 Our variables are defined in Table 1a, and descriptive statistics are in Table 1b. The sampling procedure is detailed in Table A4 in the Appendix. The correlation matrix is given in Table 1c.
Insert Tables 1a, b, and c here.

Scoring FFWP and management practices
Measuring FFWP and management practices requires codifying these concepts into something applicable across different firms and countries. This is difficult, as FFWP and good management are hard to define. To do this, we combine concepts that have been used previously, e.g. in (i) the tool developed by a leading international management consultancy, and (iii) the prior management and economics literature. While our focus here is on the determinants and consequences of FFWP, we use good management practices as control variables in our performance regressions.

FFWP and WLB perceptions
In Appendix A3, we detail the HR interview guide, which was used to collect a range of detailed FFWP and characteristics from firms. Focusing on the use of voluntary FFWP, we minimize the influence of different regulatory regimes on FFWP provision. We collected three types of data:
• The first was managers' WLB perceptions data on their own firm's WLB versus that of other firms in the industry. This was used as our WLB outcome measure, defined as the response to the question: 'Relative to other companies in your industry, how much does your company emphasize WLB?', scored as: much less (1); slightly less (2); the same (3); slightly more (4); much more (5). We use this variable to validate the claim that ∂w / ∂X ≥ 0, i.e. FFWP work in terms of improving perceived employee WLB.
• The second was data on key FFWP variables including childcare flexibility, home-working entitlements, part-time to full-time job flexibility, job-sharing schemes, and childcare subsidy schemes. This was gathered by asking the following question on childcare flexibility: 'How much flexibility is there if an employee needed to take a day off at short notice due to childcare problems or their child was sick?', and entitlements to 'working at home in normal working hours', 'switching from full-time to part-time work', 'job sharing schemes', and 'financial subsidy to help pay for childcare'. 6
• The third was workforce characteristics data on variables including average employee age, hours, holidays, and the proportion of female employees, plus information on skills (the proportion of college-educated), training, and unionization. This data was used to test our hypotheses on the internal determinants of FFWP.
We subsequently constructed our FFWP measure as the composite z-score (see below) of all five dimensions on FFWP as well as the hours worked and holidays taken. Using alternative measures of FFWP (see Table 2) gives similar results.
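A composite z-score of this kind can be computed as sketched below. The exact construction is not spelled out in this section, so the steps here (standardize each item across firms, average the standardized items within each firm, then standardize the average again) are an assumption, and the example scores are made up. Any reverse-coding of items such as hours worked is left out of the sketch.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize a sequence to mean 0 and sample standard deviation 1."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def composite_zscore(rows):
    """Composite score per firm from a list of per-firm item scores.

    rows: one list per firm, each holding that firm's score on every item.
    """
    columns = list(zip(*rows))                        # one tuple per item
    z_columns = [zscores(col) for col in columns]     # standardize each item
    averages = [mean(firm) for firm in zip(*z_columns)]
    return zscores(averages)                          # re-standardize the average

# Hypothetical scores for four firms on three items (illustrative only).
scores = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 5, 4],
    [5, 4, 5],
]
ffwp = composite_zscore(scores)
print([round(v, 3) for v in ffwp])
```

Standardizing each item first keeps items measured on different scales (for example, a 1-5 rating and hours per week) from dominating the composite.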

Management practices
We follow Bloom and Van Reenen (2007) in our definition of good management practices. They find that the external validity of their management score is high given its strong and positive correlation with firm performance. 7 We group management practices into four areas: 'operations' (three practices), 'monitoring' (five practices), 'targets' (five practices), and 'incentives' (five practices). The operations management section focuses on the introduction of lean manufacturing techniques, the documentation of process improvements, and the rationale behind introductions of improvements. The monitoring section focuses on the tracking of the performance of individuals, reviewing performance (e.g. through regular appraisals and job plans), and consequence management (e.g. making sure that plans are kept and appropriate sanctions and rewards are in place).
The targets section examines the type of targets (whether goals are simply financial, or operational, or more holistic), the realism of targets (stretching, unrealistic, or nonbinding), the transparency of targets (simple or complex), and the range and interconnection of targets (e.g. whether they are given consistently throughout the organization). Finally, incentives (or people management) includes promotion criteria, pay and bonuses, and fixing or firing bad performers, where best practice is deemed to be an approach that gives strong rewards for those with both ability and effort. These practices are all ranked on a scale of 1-5.
6 Note that our measure of FFWP provision does not measure the actual take-up of these practices within the firm, which depends on social and firm-wide norms (Drago and Wooden, 1992; Thompson, Beauvais, and Lyness, 1999) and personal and family characteristics (Gray, 1989; Grover and Crooker, 1995; Lobel, 1991). This is because we focus on the supply of FFWP as a decision variable by firms. However, our FFWP score is significantly positively correlated with self-reported WLB in the firm, suggesting that FFWP are taken up to improve firm WLB.
7 They also do survey re-rater tests by re-interviewing 10 percent of the sample using different interviewers and interviewees (different plant managers) in the same firm. They find these independent surveys to be highly significantly correlated. For example, the management scores for the 64 firms interviewed repeatedly were correlated at 0.734 (p-value of 0.001), suggesting the two different interviews were providing broadly consistent information about firm practices.
The key step to scoring management practices is the use of 'double-blind' surveys. The first part of double-blind is that the survey was conducted by telephone without telling managers they were being scored. This enabled scoring to be based on the interviewers' evaluation of actual firm practices, rather than the firms' aspirations, managers' perceptions, or the interviewers' impressions. To run this blind scoring, we used open questions (e.g. 'Can you tell me how you promote your employees?') rather than closed ones (e.g. 'Do you promote your employees on tenure [yes/no]?'). These questions target actual practices and examples, with the discussion continuing until the interviewer can accurately assess the firm's typical practices. In most cases, three or four questions were needed to score each practice. The survey was targeted at plant managers, who are typically senior enough to have an overview of management practices, but not so senior as to be detached from day-to-day operations of the enterprise.
The second part of double-blind is that the interviewers did not know anything about the firms' financial information or performance prior to the interview. This was achieved by selecting medium-sized manufacturing firms (which interviewers have typically not heard of before) and by providing only firm names and contact details (but no financial details) to the interviewers. The interviewers were specially trained graduate students from top European and U.S. business schools. All interviews were conducted in the respective manager's native language. Since each interviewer ran over 50 interviews on average, we could include interviewer fixed effects in all specifications to address potential concerns over inconsistent interpretation of categorical responses.
Finally, detailed information was collected on the interview process itself (number and type of prior contacts before obtaining the interview, duration, local time-of-day, date, and day-of-the-week), the manager (gender, seniority, nationality, company and job tenure, internal and external employment experience, and location), and the interviewer (we include interviewer fixed effects, time-of-day, and a subjective reliability score assigned by the interviewer). Some of these survey controls are significantly informative about the management score, and when we use these as controls for interview noise in our estimations, the coefficients on the management score and FFWP typically increase, suggesting we are removing survey noise.

Obtaining interviews with managers
Interviews took about 50 minutes on average and were run from a single U.K. site. Overall, we obtained a high response rate of 54 percent, which was achieved through four steps:
• First, the interview was introduced as 'a piece of work' 8 without discussion of the firm's financial position or company accounts, making it relatively uncontroversial for managers to participate. Interviewers did not discuss financials in the interviews, both to maximize managers' participation and to ensure our interviewers were truly blind on the firm's financial position.
• Second, questions were ordered to begin with the least controversial (shop-floor management) and finish with the most controversial (pay, promotions, and firings). The FFWP questions were placed at the end of the interview to ensure the most candor in managers' responses.
• Third, interviewers' performance was monitored as was the proportion of interviews achieved, so they were persistent in chasing firms (the median number of contacts each interviewer had per interview was 6.4). The questions are also about practices within the firm that any plant manager can respond to, so there were potentially several managers per firm who could be contacted. 9
• Fourth, written endorsements from the 'Bundesbank' (in Germany), the 'Banque de France', and the 'Treasury' (in the U.K.) helped demonstrate to managers this was an important exercise with official support.
8 Words like 'survey' or 'research' were avoided, as these are used by switchboards to block market research calls.

Sampling frame and additional data
We focus on the manufacturing sector, where most economists regard productivity as easier to measure than in the non-manufacturing sector (see Griliches (1994) for a discussion). Moreover, we focused on medium-sized firms, selecting a sample where employment ranged between 50 and 10,000 workers (with a median of 700). Very small firms have little publicly available data, so measuring performance from public sources would be difficult. On the other hand, very large firms are likely to be more heterogeneous across plants, and it would be difficult to get a representative picture of FFWP in the firm as a whole from just one or two plant interviews. We drew a sampling frame from each country to be representative of medium-sized manufacturing firms, and then randomly chose the order in which firms were contacted (see Appendix A4 for details). We excluded any clients of our partnering consultancy firm from our sampling frame.
Comparing the responding firms with those in the sampling frame, we found no evidence that responders were systematically different from non-responders on any of the performance measures. They were also statistically similar on all other observables in our dataset, except on size where our firms were on average slightly larger than those in the sampling frame.

Productivity and competition data
Quantitative information on firm sales, employment, capital, materials, etc. came from company accounts and proxy statements, and was used to calculate firm-level labor and total factor productivity and profitability. The details are provided in Appendix A4. To measure competition, we follow Nickell (1996) and Aghion et al. (2005) in using two broad measures. The first measure is obtained by calculating the three-digit industry Lerner index of competition by country, which is (1 - profits/sales), calculated as the average across the entire firm-level database (excluding each firm itself). 10 This is constructed for the period 1995-1999 to remove any potential contemporaneous feedback. The second measure of competition is the survey question on the number of competitors a firm faces (see Appendix A3), valued 0 for 'no competitors', 1 for 'less than five competitors', and 2 for 'five or more competitors'. We also use three-digit import penetration of sales to measure exposure to international competition.
10 Note that in constructing this we draw on all firms in the population database, not just those in the survey.
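As an illustration only, the leave-one-out construction of the industry Lerner index might be sketched in Python as follows. The record fields (`id`, `country`, `sic3`, `profits`, `sales`) are hypothetical names, and averaging firm-level profit margins without weights is an assumption, since the text does not specify how the average is formed.

```python
from collections import defaultdict


def leave_one_out_lerner(firms):
    """Return {firm id: 1 - mean profit margin of all OTHER firms in the
    same country / three-digit industry cell}.

    Assumption: the cell average is an unweighted mean of profits/sales.
    Firms that are alone in their cell get None.
    """
    # Accumulate the sum of margins and the firm count per cell.
    totals = defaultdict(lambda: [0.0, 0])
    for f in firms:
        cell = (f["country"], f["sic3"])
        totals[cell][0] += f["profits"] / f["sales"]
        totals[cell][1] += 1

    index = {}
    for f in firms:
        cell = (f["country"], f["sic3"])
        margin_sum, n = totals[cell]
        if n < 2:
            index[f["id"]] = None  # no other firms to average over
            continue
        own_margin = f["profits"] / f["sales"]
        others_mean = (margin_sum - own_margin) / (n - 1)
        index[f["id"]] = 1.0 - others_mean
    return index
```

Excluding the firm's own margin keeps the competition measure from mechanically reflecting the firm's own profitability.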

Validation issues and descriptive statistics
Using FFWP z-scores
As mentioned above, we convert our survey responses on FFWP into z-scores. That is, we transform each firm's response on a FFWP, $X_i$, as follows:

$$z_i = \frac{X_i - \bar{X}_i}{\sigma_{X_i}} \qquad (4)$$

By subtracting the sample mean of practice $X_i$ and dividing by the standard deviation, we eliminate problems of consistently different levels of provision of specific practices. For example, if a childcare subsidy is provided much less frequently than childcare flexibility, a simple sum of the raw FFWP scores would imply that an above-average score in a childcare subsidy may have the same contribution to the composite score as a below-average score in childcare flexibility. Constructing a z-score avoids these problems. From our individual scores, we then construct a 'double-z-score' by summing all individual z-scores and performing the same procedure on the sum, i.e.:

$$zz = \frac{\sum_i z_i - \overline{\sum_i z_i}}{\sigma_{\sum_i z_i}} \qquad (5)$$

The FFWP score is therefore measured in standard deviations, so that the coefficient on an independent variable gives the change in the FFWP score, in standard deviations, associated with a change in that variable.
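The z-score and double-z-score construction described above can be illustrated with a minimal Python sketch. The practice names and raw scores used in the usage note are made up, and the use of the sample (n - 1) standard deviation is an assumption. The sketch sums the z-scores as given and does not reverse the sign of any practice.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one practice across firms: subtract the sample mean,
    divide by the sample standard deviation (assumed n - 1 version)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def double_z_score(practices):
    """practices maps a practice name to a list of raw scores, one per
    firm, with firms in the same order in every list.

    Step 1: z-score each practice separately.
    Step 2: sum the z-scores within each firm.
    Step 3: z-score the firm-level sums.
    """
    per_practice = [z_scores(vals) for vals in practices.values()]
    firm_sums = [sum(firm_values) for firm_values in zip(*per_practice)]
    return z_scores(firm_sums)
```

For instance, `double_z_score({"childcare_flexibility": [1, 2, 3, 4], "childcare_subsidy": [0, 0, 0, 1]})` returns one composite score per firm, with mean zero and unit standard deviation across the four firms.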

Robustness of our FFWP score
Our measure of FFWP implies that there is a bundle of FFWP that is comparable across our four sampled countries and that is affected in the same way by our independent variables. However, different regulatory environments may imply that some practices are more common in some countries than others. For example, hours worked are affected in Germany and France by the EU Working Time Directive, limiting employees to a maximum of 48 hours per working week. To account for this, we constructed a number of different measures of FFWP, omitting and including practices that may differ systematically across countries. We also performed a factor analysis on our individual practices' z-scores and used the highest-loading factor as an alternative FFWP score. Table 2 gives the correlations between the respective FFWP scores. 11
Insert Table 2 here.

Descriptive statistics
The first issue we address is the association between FFWP and perceived WLB. In Appendix A1, we show that many individual FFWP are significantly associated with the WLB score reported by plant managers, and that all signs go in the expected direction (i.e. negative for hours worked, positive for all others). Most importantly, our aggregate z-score of FFWP has a coefficient of 0.258 and is highly significant, so that a one-standard-deviation improvement in the FFWP score will be associated with a 0.258 improvement in self-reported WLB.
11 We ran all regressions with our alternative FFWP scores with no qualitative differences. Results are available on request.
The second question we examine in more detail is the connection between our two composite variables, FFWP and management practices. As FFWP are often regarded as part of SHRM, one might expect a positive correlation between FFWP and good management practices.
The regression results for individual management practices on our FFWP score are reported in Appendix A2. We find that a number of individual management practices are positively and significantly associated with the provision of FFWP. Most notably, when running our preferred regression with all individual dimensions of management practices, we find that the incentives category is significant and positive in our basic FFWP regression. Thus, firms offering more FFWP also have better people management practices, suggesting that firms for which human capital is an important resource tend to both treat and manage their employees better.

Regression results - consequences of FFWP
In our first set of regressions, we focus on the performance effects of FFWP. Our results on the association of firm performance with FFWP are given in Table 3.
Insert Table 3 here.
Column (1) in Table 3 finds a positive association between our FFWP score and a firm's labor productivity (i.e. sales per employee). 12 This is in line with a number of previous studies (Perry-Smith and Blum, 2000; Gray and Tudball, 2003; Milliken et al., 1998). However, in column (2), we find that once we control for the management practices used in a firm, the coefficient's magnitude drops drastically and becomes completely insignificant. This is in line with our Hypothesis 4 that better-run firms offer more FFWP. It does, however, suggest that omitting the set of management practices typically found jointly with FFWP will lead to a spurious correlation between FFWP and firm performance. Our remaining columns (3-8) confirm this result: there seems to be no positive and significant correlation between FFWP and firm performance when controlling for all inputs (3) and measuring performance as a return on capital employed (5-8) once management practices are controlled for. Allowing all production-related coefficients to vary by country (columns (4) and (8)) does not change results either. Hypothesis 1 therefore finds no support in our regressions. Note that an insignificant (or positive) result on our narrow performance measure (sales per employee) and a negative one on our broader measure (return on capital employed (ROCE)) would have implied that firms offering generous FFWP do not receive the targeted benefits (through increased sales per employee), but bear the cost of implementing them (which would result in a negative correlation between FFWP and ROCE).
12 The coefficient's p-value is 1.50.
Our results suggest that although there is no positive effect on labor productivity, there is no negative one on profits either, implying that FFWP pay for themselves.
This leaves the question of why firms implement FFWP if they have no apparent effect on firm performance. One reason would be that some key employees can (or will) work more productively if they are presented with a bundle of FFWP. In other words, providing FFWP can help turn certain groups of employees into a valuable resource by improving retention or making them work more productively. In this case, firms with many FFWP and a large number of employees who work more productively with more FFWP should perform better. For example, an interaction term between the percentage of female managers or skilled employees and FFWP should then carry a positive and significant sign in a performance regression. We report our results with interaction terms in Table 4 and find that none of the interaction terms (with management, skills, or the proportion of female managers) is significant, suggesting that FFWP are not provided to keep valuable groups of employees or motivate them to work more productively.
Insert Table 4 here.
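To make the interaction specification concrete, the following Python sketch estimates a performance regression with an FFWP interaction term by ordinary least squares. It is a stripped-down illustration under stated assumptions, not the specification behind Table 4: the function and argument names are hypothetical, there are no controls or dummies, and it returns point estimates only, without the standard errors needed to judge significance.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("design matrix is (numerically) rank deficient")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def interaction_design(ffwp, group_share):
    """Regressors: constant, FFWP score, group share (for example the
    share of female managers), and their product."""
    return [[1.0, f, g, f * g] for f, g in zip(ffwp, group_share)]


def fit_interaction(performance, ffwp, group_share):
    """Return [constant, b_ffwp, b_group, b_interaction]."""
    return ols(interaction_design(ffwp, group_share), performance)
```

The last coefficient returned by `fit_interaction` is the one of interest in the argument above: a positive value would indicate that FFWP are associated with higher performance in firms where the group in question is larger.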
A second possible explanation is that firms end up implementing FFWP because a number of key employees demand them and firms value the well-being, and specifically the WLB, of their employees as well as the financial performance of the firm. This is consistent with our model presented in our general framework, where two processes generate two outputs: WLB and financial performance. Note that in all our regressions, the use of performance-enhancing management practices is positively correlated with FFWP provision. That is, although FFWP alone do not have a tangible effect on firm performance, it is still the well-managed firms that implement them more readily.
Finally, it may be that firms are optimally choosing the correct degree of FFWP for their firms, so there is no systematic variation to identify the performance equations. If there were no optimization errors, no exogenous shocks, and no adjustment costs, we could not identify the coefficient of FFWP on performance, even if one existed. Although this is possible theoretically, it is unlikely to be the whole story in empirical practice.

Regression results - determinants of FFWP
We now turn to our regressions on the determinants of FFWP provision. Our results are given in Table 5.
Insert Table 5 here.
We can see from Table 5 that the proportion of skilled employees has a robust positive association with the provision of FFWP. Column (1) uses basic controls only, while column (8) includes all variables of interest. We find that although the level of significance decreases once management and the proportion of female managers are included, 13 the coefficient is consistently positive. Hypothesis 2 is therefore broadly supported.
The percentage of female managers is also positively and significantly correlated with FFWP provision in columns (3) and (8). The overall proportion of female employees is not significant, as seen in column (2), which suggests a complex relationship between female employees and FFWP provision. It is not simply the proportion of female employees that results in provision of FFWP. Rather, it is the presence of a group of key employees (female managers) who place a comparatively higher value on FFWP in an organization. There are at least two possible interpretations of this. Either female managers negotiate a bundle of advantageous FFWP when in the firm (bargaining effect), or they choose to work only for firms that provide sufficiently generous FFWP (self-selection effect). As always with cross-sectional data, the direction of causality is hard to establish, but the positive coefficient of the proportion of female managers on FFWP is robust and significant, suggesting tangible differences in FFWP policies among firms with different gender workforce composition. We therefore find support for Hypothesis 3b, but not Hypothesis 3a.
The final variable relating to the internal characteristics of the workplace is the set of management practices in a firm. In all our regressions, the coefficient on good management practices is positive and highly significant (see columns (4) and (8)). This suggests that well-run firms offer their employees more FFWP. Hypothesis 4 on the concurrent use of good management practices and advantageous FFWP is therefore supported. However, as shown above, FFWP have no independent effect on financial performance.
13 Omitting either the proportion of female managers or management results in a significant coefficient on skills, which is not surprising as using all three internal variables simultaneously may cause collinearity problems.
Our hypotheses on external determinants of FFWP are tested in columns (5)-(8) of Table 5. We find that import penetration (column (5)) and the general degree of competition in an industry (columns (6) and (7)) are not associated with more or less FFWP. In our regression with all covariates (column (8)), the external determinants remain insignificant. The rejection of Hypothesis 5 is in contrast to the strong support for our earlier hypotheses on internal determinants of FFWP. It suggests that external pressure to provide or abandon FFWP is not as significant as the influence of factors at the workplace itself, either through the composition of the workforce (i.e. more skilled workers and more female managers) or the general use of good management practices.
Further, our results also suggest that firms operating in tough product markets try to cope with the situation by means other than reducing the number of FFWP available to their employees. This is especially interesting as these product market competition variables have been found to be strongly correlated with good management practices.

Conclusions and Implications
In this paper, we studied the impact of FFWP on firm performance, and found that increased provision of FFWP is only (weakly) positively correlated with better firm performance if we omit management quality. Once we control for general management quality, there is no significant association between FFWP and performance measured in different ways. This raises the question of why firms would want to implement them in the first place. To investigate this, we studied the firm and environmental characteristics that are correlated with increased FFWP use, and found that firms with a higher proportion of skilled workers and female managers, as well as better management practices, offer more FFWP. One interpretation of this is that firms must offer FFWP to avoid losing key employees, which may constitute a valuable resource for the firm. In further analysis, however, we find that this is unlikely to be a driving factor for FFWP provision, as firms with more female managers or skilled workers do not benefit more from FFWP than others. Instead, our results are consistent with firms valuing more than just financial performance when choosing their strategies. This resonates with the recent work on corporate social responsibility (Brammer and Millington, 2008; David et al., 2007). FFWP should not be criticized for their lack of positive financial impact, as they do have a tangible effect on employee well-being. Therefore, FFWP should be treated as policies that improve firm performance in terms of the satisfaction of a particular stakeholder group (the firm's employees), and financial performance should not be the primary goal of implementing FFWP. Our work also feeds into a wider, emerging research field of nonmarket strategies affecting auxiliary performance dimensions. Researchers face a complex problem with firms choosing strategies to serve multiple stakeholders.
Firm characteristics affect these strategic actions in complex ways, and there may be interactions between the different performance dimensions which have to be identified empirically rather than assumed (or ignored) a priori.
Our results carry some limitations. First, we focus on manufacturing firms. This is mainly to avoid problems in measuring firm performance, but it would be interesting to compare our results with data from the service sector. Second, we do not measure the degree to which FFWP are taken up by employees. While this has some disadvantages, it has the advantage of capturing the provision of FFWP rather than their take-up in the workforce, which may be conflated with questions of corporate culture, peer pressure, etc. In other words, while provision of FFWP is a decision variable by the firm, take-up of FFWP will at the very least be the result of a combination of FFWP supply (by the firm) and demand (by its employees). Third, we only sample firms with under 10,000 employees, so we lose the very largest firms (although since we use subsidiaries, some of the parents of our firms are very large). We do not think this biases our results, but studying the very largest firms would also be of interest.
Our work could be extended in several ways. We used country dummies in all our regressions and found significant differences in FFWP provision across our four sampled countries (Germany, France, U.K., and U.S.). 14 We do not discuss these results in detail, but it is a promising avenue of future research to study cross-country differences in FFWP scores to find different international models of FFWP. Further, the role of female managers in FFWP provision is interesting. Specifically, our result that it is not the overall proportion of female employees but rather the proportion of female managers that matters warrants further study. Will female managers not work for firms that do not provide sufficient FFWP, or are they more successful in 'pushing through' the provision of such practices? The well-known stylized fact that female managers are paid less than their male peers 15 suggests that firms compensate their female workforce with more FFWP. This is a highly relevant topic for future research.
14 The signs of our country dummies show that firms in European countries offer significantly more FFWP than U.S.-based firms after controlling for observables, while labor productivity in Continental Europe tends to be higher after controlling for input factors. Further results on the country dummy coefficients are available on request.
15 The correlation between the (log of) average wages and the proportion of female managers in our sample is -0.2 and significant at the one percent level.

FFWP variables

FFWP Score
Composite z-score of hours worked, holidays taken, childcare flexibility, working from home, job switching, job sharing, and childcare subsidy.

Total Hrs/Week
Total hours worked per week, averaged and weighted over managerial and non-managerial staff.

Mgmt Hrs/Week
Total hours worked per week, managerial staff only.

Non-Mgmt Hrs/Week
Total hours worked per week, non-managerial staff only.

Holidays/Year
Days of holidays taken per year, averaged and weighted over managerial and non-managerial staff.

Childcare Flexibility
Degree of flexibility in case of unexpected childcare emergency.

Childcare Subsidy
Presence of subsidy to help pay for childcare.

Working from Home
Entitlement to working from home during normal working hours.

Job Switching
Entitlement to switch from full-time to part-time work.

Job Sharing
Entitlement to job sharing schemes.

Output/performance variables

ROCE
Return on capital employed.

Ln(S)
(Natural) log of sales.

Ln(M)
(Natural) log of material costs.

Ln(S/L)
(Natural) log of sales per employee.

Ln(K/L)
(Natural) log of capital per employee.

Ln(M/L)
(Natural) log of material costs per employee.

Firm characteristics

Skills
(Natural) log of percent of employees with university degree.

Female Total
Percentage of female employees as part of total workforce.

Female Mngrs
Percentage of female managers as part of management layer.

Mgmt
Composite z-score of management questions.

Industry characteristics

ImPen
(Natural) log of average imports/production by country/SIC pair for 1995-1999 (= import penetration).

Lerner Index
1 - (average rents of all other firms in the industry, 1995-1999).

Competitors
= 0 if none, = 1 if less than five, = 2 if five or more.

• Baseline FFWP score is the double-z-score (see equation (5)) of hours worked, holidays taken, childcare flexibility, working from home, job switching, job sharing, and childcare subsidy.
• Alternative 1 FFWP score is the double-z-score of childcare flexibility, working from home, job switching, job sharing, and childcare subsidy, i.e. the baseline FFWP score without hours worked and holidays taken.
• Alternative 2 FFWP score is the double-z-score of childcare flexibility, working from home, job switching, and job sharing, i.e. the baseline FFWP score without hours worked, holidays taken, and childcare subsidy (childcare subsidy is mandatory for some firms in our sample).
• FFWP factor is the highest-loading factor in a factor analysis of individual z-scores (see equation (4)) of average hours worked, average holidays taken, childcare flexibility, working from home, job switching, job sharing, and childcare subsidy.
• Variable definitions follow Table 1a.
• Columns (4) and (8) include a set of country dummies interacted with ln(capital-labor ratio), ln(materials-labor ratio), and ln(employment). The reported coefficients are the ones for the U.K. (the base country).
• Variable definitions follow Table 1a.
• 'Country dummies' includes four country dummies.
• '3-digit SIC dummies' includes 98 industry dummies.
• 'Standard controls' includes a dummy for public listings, the ln(age) of the firm, and a dummy for consolidated accounts.
18 The positive and significant coefficient on management quality is robust to including a number of noise controls pertaining to the interview process, including interviewer dummies, the seniority, gender, tenure, and number of countries worked in of the manager who responded, the day of the week the interview was conducted, the time of the day the interview was conducted, the duration of the interview, and an indicator of the reliability of the information as coded by the interviewer.
19 The total percentage of female employees was omitted due to the high degree of collinearity between the total percentage and the percentage of female managers. Coefficients and significance of the other variables remain virtually unaffected by the inclusion of the total percentage of female employees.
• Variable definitions follow Table 1a.
• 'WLB outcome score' is the response to the question: 'Relative to other companies in your industry, how much does your company emphasize WLB?', scored as: much less (1); slightly less (2); the same (3); slightly more (4); much more (5).
• 'Country dummies' includes four country dummies.
• '3-digit SIC dummies' includes 98 industry dummies.
• 'Standard controls' includes a dummy for public listings, the ln(age) of the firm, and a dummy for consolidated accounts.
20 Note that we lose five observations from our baseline regression in Table 3 as five managers did not answer the self-reported WLB question while HR representatives in the same firm responded to the provision of FFWP.
21 This result is robust to the inclusion of interview noise controls.
22 A test of joint significance of managerial and non-managerial hours confirms joint significance at the five percent level.
• Variable definitions follow Table 1a.
• 'Country dummies' includes four country dummies.
• '3-digit SIC dummies' includes 98 industry dummies.
• 'Standard controls' includes a dummy for public listings, the ln(age) of the firm, and a dummy for consolidated accounts.
• 'Full controls' includes the standard controls and the share of workforce with degrees, the share of female managers and non-managers, the share of workforce with MBAs, and a U.S. MNE as well as a non-U.S. MNE dummy.

Sampling frame construction
Our sampling frame was based on the Amadeus dataset for Europe (Germany, France, and U.K.) and the Compustat dataset for the U.S. These all have information on company accounting data.
We chose firms whose principal industry was in manufacturing and who employed (on average between 2000 and 2003) no less than 50 employees and no more than 10,000 employees. We also removed any clients of the consultancy firm we worked with from the sampling frame (33 out of 1,353 firms).
Our sampling frame is reasonably representative of medium-sized manufacturing firms.
The European firms in Amadeus include both private and public firms, whereas Compustat only includes publicly listed firms. There is no U.S. database of privately held firms with information on sales, labor, and capital. Fortunately, there is a much larger proportion of firms listed on the stock exchange in the U.S. than in Europe, so we were able to go substantially down the size distribution using Compustat. Nevertheless, the U.S. firms in our sample are slightly larger than those of the other countries, so we were always careful to control for size and public listing in the analyses.
Another concern is that we conditioned on firms where we have information on sales, employment, and capital. These items are not compulsory for firms below certain size thresholds, so disclosure is voluntary to some extent for the smaller firms. Luckily, the firms in our sampling frame (over 50 workers) are past the threshold for voluntary disclosure (the only exception is for capital in Germany).
success rate given the voluntary nature of participation. Respondents were not significantly more productive than non-responders. French firms were slightly less likely to respond than firms in the other three countries, and all respondents seemed randomly spread around our sampling frame.

Firm level data
Our firm accounting data on sales, employment, capital, profits, shareholder equity, long-term debt, market values (for quoted firms), and wages (where available) came from Amadeus (Germany, France, and the U.K.) and Compustat (U.S.). For other data fields, we did the following:

Materials
In Germany and France, these are line items in the accounts. In the U.K., these were constructed by deducting the total wage bill from the cost of goods sold. In the U.S., these were constructed following the method in Bresnahan, Brynjolfsson, and Hitt (2002). We start with costs of goods sold (COGS) less depreciation (DP) less labor costs (XLR). For firms that do not report labor expenditures, we use average wages and benefits at the four-digit industry level (from Bartelsman, Becker, and Gray (2000) until 1996, and then Census Average Production Worker Annual Payroll by four-digit NAICS code) and multiply this by the firm's reported employment level.
This constructed measure is highly correlated at the industry level with materials. Obviously, there may be problems with this measure of materials (and therefore value added), which is why we check robustness to measures without materials.
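The U.K. and U.S. constructions of materials described above amount to simple accounting identities, which the following Python sketch spells out. The function and argument names are hypothetical, and all monetary inputs are assumed to be in the same currency unit and to refer to the same accounting year.

```python
def materials_uk(cost_of_goods_sold, total_wage_bill):
    """U.K.: cost of goods sold minus the total wage bill."""
    return cost_of_goods_sold - total_wage_bill


def materials_us(cogs, depreciation, labor_costs, employment,
                 industry_avg_wage):
    """U.S.: COGS less depreciation (DP) less labor costs (XLR).

    If the firm does not report labor costs (labor_costs is None), they
    are imputed as the four-digit industry average wage and benefits per
    worker times the firm's reported employment.
    """
    if labor_costs is None:
        labor_costs = industry_avg_wage * employment
    return cogs - depreciation - labor_costs
```

For a firm that reports its labor costs, the industry wage argument is ignored; the imputation only applies when that item is missing.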

Industry level data
This comes from the OECD STAN database of industrial production. This is provided at the country ISIC Rev. 3 level and is mapped into U.S. SIC (1997) three-digits (which is our common industry definition in all four countries).