One theory for why underdeveloped countries remain poor is the wide dispersion in productivity between firms. This is misallocation: small, unproductive firms take market share away from efficient ones, which drags down aggregate productivity. Hsieh and Klenow (2009) look at manufacturing plants in India and China, and show a big tail of unproductive firms. If they were chased out of the market by competition, and the dispersion of firm productivity (not the average, merely the variance) were to be like that of the US, total factor productivity would increase 30-50% in China, and 40-60% in India. What is needed for India and China to grow is hard, but simple: any increase in competition will have enormous benefits. The allocative efficiency is so poor as to make improvement simple. So, it is to my chagrin that along comes this paper to kick it right where it hurts: in the measurements.
I think it worthwhile to digress into how we measure productivity. It is not something one wishes to see (never can you regain the easy confidence in your statistics afterwards) but it is important. A statistical bureau, like the Census Bureau here in the United States, sends out a survey to all the firms that it can find, asking them to provide a lot of information. The big one in the US is the Census of Manufactures, which is done every five years, so 2002, 2007, 2012, and so on. It covers 300,000 manufacturing plants. The bureau needs to know how much you sell; how much you spend, broken down by input; how many people you employ; and so on and so on. This requires quite a lot of “shoe leather”, because firms screw up. They input the wrong values and leave things blank. The first check is when the statistical bureau asks questions whose answers overlap with each other: perhaps they ask for the total value of all products bought, and the value of each product bought. If these sum to different numbers, then you know something is off and can adjust the figures. If a firm leaves some sections blank, the values can be imputed from the responses of firms with very similar characteristics. Analysts need to go out and check the outliers, and indeed they physically check in on some of the firms and plants at random. Finally, you can check their responses against their tax data. Some of the discrepancies might be tax avoidance, but it seems more likely that some of them are error. Either way, they can highlight which values are suspicious and need to be checked. The cleaning is far-reaching and substantial. 80% of manufacturing plants have a value in their cleaned data which is different from the raw data.
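To make the first two checks concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the plant records, the field names, the 1% tolerance, and the choice of a same-industry median for imputation are all my assumptions, not the Census Bureau's actual procedure, which is far more involved.

```python
from statistics import median

# Hypothetical survey responses; all fields and values are illustrative only.
plants = [
    {"id": 1, "industry": "textiles", "total_inputs": 100.0,
     "inputs": {"cotton": 60.0, "energy": 40.0}, "employees": 20},
    {"id": 2, "industry": "textiles", "total_inputs": 500.0,
     "inputs": {"cotton": 30.0, "energy": 20.0}, "employees": None},
    {"id": 3, "industry": "textiles", "total_inputs": 80.0,
     "inputs": {"cotton": 50.0, "energy": 30.0}, "employees": 30},
]


def flag_inconsistent_totals(records, tolerance=0.01):
    """Return ids of plants whose reported total differs from the itemized sum."""
    flagged = []
    for r in records:
        itemized = sum(r["inputs"].values())
        # Relative tolerance is an assumed threshold, not an official one.
        if abs(r["total_inputs"] - itemized) > tolerance * max(itemized, 1.0):
            flagged.append(r["id"])
    return flagged


def impute_missing(records, field):
    """Fill blank answers with the median answer of same-industry plants."""
    cleaned = []
    for r in records:
        r = dict(r)  # shallow copy, so the raw records stay untouched
        if r[field] is None:
            peers = [p[field] for p in records
                     if p["industry"] == r["industry"] and p[field] is not None]
            if peers:
                r[field] = median(peers)
        cleaned.append(r)
    return cleaned


suspicious = flag_inconsistent_totals(plants)
cleaned = impute_missing(plants, "employees")
```

Plant 2 reports total inputs of 500 but itemizes only 50, so it gets flagged, and its blank employment figure is filled in from plants 1 and 3. Note how much judgment is buried in the tolerance and in the definition of a "similar" firm. Two agencies making different choices here will produce different cleaned datasets from the same raw responses.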
If every country cleaned data in the same way, there would be no problem. Unfortunately, they do not. Applying the same cleaning procedures to each country's data substantially reduces the measured variance in firm productivity. This is the key finding of Martin Rotemberg and T. Kirk White’s recent paper, “Measuring Cross-country Differences in Misallocation”. In the United States, measured misallocation is vastly lower in the cleaned data than in the raw data. In some years, measured misallocation in the raw data is 87 times what it is in the cleaned data. India, and I cannot emphasize this enough, does not clean its data. That’s the whole effect of Hsieh and Klenow!! If you apply the same procedures to both datasets, India might have less misallocation than the US. The key finding of Hsieh and Klenow, then, might be a mirage.
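Why would uncleaned data look like misallocation? A toy simulation shows the mechanism. I am assuming here that reporting errors are additive in logs and independent of true productivity, in which case the variance of measured productivity is the true variance plus the error variance. The function name, the standard deviations, and the sample size are all invented for illustration; none of the numbers come from either paper.

```python
import random
from statistics import pvariance


def measured_dispersion(n_plants, true_sd, error_sd, seed=0):
    """Simulate log productivity with and without reporting error.

    Returns (variance of true values, variance of measured values).
    """
    rng = random.Random(seed)
    true_log_tfp = [rng.gauss(0.0, true_sd) for _ in range(n_plants)]
    # Assumed error model: measured = true + independent noise, in logs.
    measured = [x + rng.gauss(0.0, error_sd) for x in true_log_tfp]
    return pvariance(true_log_tfp), pvariance(measured)


# Same underlying dispersion, different amounts of reporting error.
true_var, raw_var = measured_dispersion(20000, true_sd=0.5, error_sd=1.0)
_, cleaned_var = measured_dispersion(20000, true_sd=0.5, error_sd=0.1)
```

Both runs use identical true productivity, yet the noisy one shows several times the dispersion. A country whose statistics carry more noise will look more misallocated even if its firms are no different, which is the worry with comparing raw Indian data to cleaned American data.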
In truth, I’m a bit disappointed by this. The first thing to point out is that applying the same procedures to clean data in different countries may result in different outcomes, if the nature of the errors differs between the two countries. We cannot, from this paper alone, say with certainty that there is no more misallocation in India than in the US. The result also contradicts our intuition. Intuition isn’t all bad: we have an image of doing business in India for a reason. Certainly firms in India are less productive than in the US. There is some micro-evidence to suggest misallocation: firms grow more slowly in India (and Mexico), according to Hsieh and Klenow (2014). Bartelsman, Haltiwanger, and Scarpetta (2013) found substantial dispersion and misallocation in firm productivity using data from European countries, and surely the statistical agencies of Germany are as adept as those in America.
Still, I think the basic charge of Rotemberg and White is correct. The misallocation is largely an artifact of measurement. The basic justice of this is admitted by Klenow, who is a co-author on a very recent paper (with Bils and Ruane) attempting to recalculate the losses from misallocation, arguing that growth should be less sensitive to error. For Indian firms, the potential gains from reallocation are lowered by 20%, and that is a lower bound.
So what are the takeaways? For those interested in reducing global poverty, we oughtn’t expect that there are simple fixes to low productivity, like reallocating resources between firms. The low productivity of poor countries runs deeper than that. This is not to say that there do not exist solutions which are relatively straightforward — the usual solutions of “don’t expropriate people” remain good always — but we cannot simply get higher productivity from small firms folding. There are constraints on large firms that make it unprofitable to expand further.
For the development economist, as an economist, the lesson is to investigate your fucking data. Don’t assume. You will get burned. Your data may not be accurately measuring the thing you want it to measure, or it may be measuring different things in different countries, or it may be shaped by unseen assumptions which differ between countries. You need to poke at your data, and see if anything is off. We can do better. Don’t wait a decade for someone to point this out. Do it now.