Abhijit Banerjee, Esther Duflo, and Michael Kremer are most famous, justly, for their work on randomized controlled trials, or RCTs, for which they were awarded the Nobel Prize in 2019. An RCT, for those who do not know, is the same set-up as that by which we test a medicine. We give the treatment to a group chosen at random, and compare their outcomes to an untreated group. In the hands of Banerjee, Duflo, and Kremer, and their many collaborators at the Abdul Latif Jameel Poverty Action Lab, that method is used to test policy. How should we provide public health services? How should we conduct education? How do we get an efficient government? How do we disseminate information? The policies informed by their experiments have affected the lives of over 600 million people; a conservative estimate of lives saved would run into the millions.
They are far more than great administrators and organizers of experiments, though. They are also first-rate economists who have created work of incredible depth and insight using a variety of methods, in theory and in practice, and have been enormously influential both on me and on the field at large.
Of the husband and wife team of Banerjee and Duflo, Banerjee is the older of the two by 11 years, and in fact supervised Duflo’s dissertation. It is perhaps surprising that two of his three most cited works are not related to RCTs at all – “A Simple Model of Herd Behavior”, from 1992, and “Occupational Choice and the Process of Development”, from 1993 (with Andrew Newman). Most of his early research focused on theoretical matters of how information is spread, which would later be put into practice in RCTs, such as “Using Gossips to Spread Information”, in the context of vaccination, and “When Less is More”, on communicating the details of India’s 2016 demonetisation.
His most cited paper, “A Simple Model of Herd Behavior” is indeed simple, and quite elegant. In the model, people make decisions sequentially, and can observe what other people choose. Because people possess some private information, we are best served incorporating their signals into our decisions; but since they are also incorporating information from others, we may converge upon outcomes which are wrong. He gives a simple example of choosing between two restaurants, A and B. Everyone starts with the prior that there’s a 51% chance restaurant A is better, and 49% restaurant B is, and then receives a signal favoring one side or another. 99 out of the 100 people might think that restaurant B is better – but if the first person received a possibly inaccurate signal that restaurant A is better, then everyone will choose A. Everyone would be at least as well off if they had no information from other people’s actions at all. Thus, we can get bubbles and other such inaccurate behavior.
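The restaurant example can be traced step by step. The sketch below is a simplified binary version of the story, not Banerjee's full model: it assumes every private signal is correct with the same probability (`precision=0.6` is an illustrative value, not one from the paper), and that each diner works out which earlier choices actually revealed a signal. The function names are hypothetical.

```python
import math

def choose(prior, step, net_public, own):
    """Pick the restaurant with the higher posterior probability."""
    return "A" if prior + (net_public + own) * step > 0 else "B"

def herd(signals, prior_a=0.51, precision=0.6):
    """Sequential choices when each diner sees earlier choices, not signals.

    prior_a   -- common prior that A is better (0.51, as in the example)
    precision -- assumed probability that a private signal is correct
    """
    prior = math.log(prior_a / (1 - prior_a))     # prior log-odds for A
    step = math.log(precision / (1 - precision))  # log-odds weight of one signal
    net_public = 0   # inferred A-signals minus inferred B-signals
    choices = []
    for s in signals:
        own = 1 if s == "A" else -1
        choices.append(choose(prior, step, net_public, own))
        # A choice reveals the chooser's signal only if the other signal
        # would have led to the other restaurant.
        if choose(prior, step, net_public, 1) != choose(prior, step, net_public, -1):
            net_public += own
    return choices

if __name__ == "__main__":
    # First diner draws A, the other 99 draw B.
    choices = herd(["A"] + ["B"] * 99)
    print(choices.count("A"))  # 100: everyone ends up at restaurant A
```

The second diner's B signal cancels the first diner's A signal, so the prior tips her toward A. From then on no choice reveals anything, and the herd is locked in.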
I see it as a sort of reverse “Bayesian Persuasion”. There, we get sub-optimal outcomes because the actions which the receiver can take are non-linear, and if you push someone over a threshold you get a big change in response. Here we get inefficiencies because the signals which people send are discrete. We can only go to one restaurant or the other, not some percentage of both. Perfect efficiency is really only possible in a continuous world.
“The Economics of Rumours” would construct a reason why a bubble wouldn’t be bought into by everyone. There are a number of investors who might invest in a project, the returns of which are either a or b, with some probability assigned to each. Investors face different costs, which for simplicity are “high” and “low”, with the payoffs such that low cost investors always want to invest, and high cost investors only want to invest if the returns are a, but not b. Some small portion of the population hears about the opportunity, and knows the returns – everyone else knows only that other people have invested. If hearing about the opportunity is a function of whether other people have invested, then the optimal decision rule for high-cost investors is to invest if they hear of it early, and to not invest if they hear of it late. The reason is simple – if it spreads slowly and you hear of it late, it’s more likely to be in state b.
He also has an intriguing, if entirely too long, paper with Eric Maskin giving an account of why money exists. Money famously avoids the need for a “double coincidence of wants” – if I wish for apples, and have bananas, money lets me get apples even if the apple seller doesn’t want bananas. The trouble is that, once you allow for durable goods, the goods themselves can be money. Standard goods, like rice or salt, do often take the place of fiat money in developing economies. Note that I say standard goods, though. According to Maskin and Banerjee, the reason for money is that its quality is extremely easy to judge; otherwise people would pass off low-quality goods to get the better of the other party. Thus, metals as money, for purity and size can be easily ascertained.
I do not think the jump from theory to empirical work was as much of a break as it would seem. His early papers were largely practically minded models, seeking to simply explain a commonplace observation. His paper on herding, for example, came from noticing how people waiting for the train at Princeton would often form long lines for the wrong train. His motivations for RCTs often lie not in obtaining simply an unbiased estimate, but as a way to directly test a theory with a treatment which lines up 1 to 1 with the theoretical hypothesis.
One of his early papers, with Andrew Newman, explored the possibility of poverty traps. Can countries have a large divergence in outcomes from a small difference in initial conditions? Can initial conditions be changed to start the process of development? They assume that there are credit frictions of some kind. (Were there not, the first welfare theorem would hold and we’d be maximally efficient – but that’s not interesting, now is it!) Poor people choose either to become self-employed, to enter into employment contracts, or to become entrepreneurs. If people are too poor, or insufficiently unequal, then society never takes off into business, and it remains a nation of cottage industry. They also point out, in an AEA P&P, that the poor being closer to the lower bound of possible utility makes it harder to enforce the repayment of loans, and so they will naturally find it harder to borrow.
Duflo would come to MIT in 1995 on the advice of Thomas Piketty – though she was nearly rejected by, of all people, Abhijit Banerjee. I see Duflo’s work as relentlessly optimistic. She does not think that poverty is inevitable. There is no fundamental difference between the developed and developing world. Rather, poverty is the result of particular problems, each of which can be overcome by determined action, and it is her job – no, her duty – to find the solution. She writes in her biographical statement upon accepting the Nobel, “I felt that the only way I could ever repay this huge cosmic debt to the world was first, to nourish and exploit my own unremarkable talent, and second, to play some role in helping others get the opportunity to find and nurture their talents.”
Banerjee and Duflo’s first paper together is on the limits of contracting. Indian courts are notoriously ineffective, and so firms cannot rely upon them to settle disputes. This makes building a good reputation of utmost importance, especially when the products (software packages) are much too complicated to even explain to the court. How burdensome is being unable to make the precise contracts you want?
They have a few strong and testable predictions. Firms without a good reputation will need to take on fixed cost contracts, so that the burden of cost overruns falls upon them; when they build up reputations for honorable dealing and paying when they are at fault, they can agree to contracts which allow for adjustments in the case of adverse contingencies.
I think we can see their strengths combined in this. It marries clean theorizing about a model with dogged empirical work. (They conducted 125 interviews with software CEOs in three months!) It allows them to very cleanly test the question that Banerjee had in mind, an extension of the Grossman-Hart-Moore theory of the firm work. And it gives a sympathetic account of the actions of people in the poorest parts of the world — they have never been ones to consider anyone lesser to them.
Duflo is much more than simply an organizer of experiments, of course. Her most-cited paper, with Marianne Bertrand and Sendhil Mullainathan, merely completely changed the way we handle differences-in-differences. Differences-in-differences is a statistical technique for inferring the effects of something from observational data. Suppose, as in the example they give in the paper, we are interested in the effect of legislation on the wages of women. Difference-in-differences is predicated on the assumption that women’s wages in different states, before the passage of that law, were following the same trend, and that you can attribute the divergence after the law to that event. Somebody interested in studying this might get data from the Current Population Survey (or CPS) on a panel of women, observe their wages each year, and look at how they change after a law is passed.
The trouble with this is that the wages of a particular person are correlated over time, in what is called serial autocorrelation. They do not randomly vary over time! This means that each year is not an independent observation. The extra years add to our sample size when we compute standard errors, but provide much less new information than that sample size implies. We will reject the null hypothesis, and conclude that the legislation changed wages, far more often than we should.
Bertrand, Duflo and Mullainathan demonstrate this by taking the datasets mentioned, and randomly assigning legislative changes to years and states. At a 5% significance level, we should reject the null hypothesis about 5% of the time. Instead, they can reject it 45% of the time. It is still somewhat baffling to me why this took so long for people to figure out. As they say in the paper, serial autocorrelation was well-understood in theory, and these were not the first fumbling explorations of a new technique either. In six journals alone, they counted 92 papers over 10 years! Nowadays you cannot use one of the most common statistical techniques in modern-day empirical work without taking her criticism, and proposed corrections, into account, and I think the entire profession is better off for it.
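The placebo-law exercise is easy to imitate on made-up data. The sketch below does not use the CPS or the authors' code. It assumes a synthetic balanced panel of 50 states over 20 years, outcomes that follow an AR(1) process with no true policy effect, and a fake "law" switched on in a random half of the states from a random year. It then runs the usual two-way fixed effects regression with conventional standard errors. All parameter values and function names are illustrative assumptions.

```python
import math
import random

def demean(x):
    """Remove state (row) and year (column) means from a balanced panel."""
    n, t = len(x), len(x[0])
    row = [sum(r) / t for r in x]
    col = [sum(x[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row) / n
    return [[x[i][j] - row[i] - col[j] + grand for j in range(t)]
            for i in range(n)]

def placebo_rejection_rate(rho, n_states=50, n_years=20, n_sims=300, seed=0):
    """Share of placebo laws found 'significant' at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Outcomes: AR(1) noise within each state, no policy effect at all.
        y = []
        for _i in range(n_states):
            e = rng.gauss(0, 1) / math.sqrt(1 - rho ** 2)  # stationary start
            series = [e]
            for _t in range(1, n_years):
                e = rho * e + rng.gauss(0, 1)
                series.append(e)
            y.append(series)
        # Placebo law: random half of the states, random start year.
        treated = set(rng.sample(range(n_states), n_states // 2))
        start = rng.randrange(5, 15)
        d = [[1.0 if (i in treated and t >= start) else 0.0
              for t in range(n_years)] for i in range(n_states)]
        # Two-way fixed effects OLS via demeaning (valid for balanced panels).
        yd, dd = demean(y), demean(d)
        pairs = [(a, b) for ra, rb in zip(dd, yd) for a, b in zip(ra, rb)]
        sxx = sum(a * a for a, _ in pairs)
        beta = sum(a * b for a, b in pairs) / sxx
        rss = sum((b - beta * a) ** 2 for a, b in pairs)
        dof = n_states * n_years - n_states - n_years  # NT - (N + T - 1) - 1
        se = math.sqrt(rss / dof / sxx)  # conventional, ignores autocorrelation
        if abs(beta / se) > 1.96:
            rejections += 1
    return rejections / n_sims

if __name__ == "__main__":
    print("independent errors:", placebo_rejection_rate(rho=0.0))
    print("AR(1), rho = 0.8:  ", placebo_rejection_rate(rho=0.8))
```

With independent errors the rejection rate should sit near the nominal 5%, while the serially correlated panel rejects far more often, which is the pattern the paper documents on real data.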
Her doctoral dissertation was a differences-in-differences study itself. I suspect that that is what started her thinking on the topic. In 1973, the Indonesian government embarked on a massive school building campaign, building 61,000 primary schools in six years. Enrollment greatly increased. The placement of schools was non-random, and very obviously non-random, but it is still possible to make use of the data. Exposure to education varied across both regions and time, so if you treat the effect of exposure to education as additive and linear you can compare the difference in the change of educational exposure to estimate the effect on wages. She estimates that a year of schooling has a rate of return between 6.8 and 10.6 percent, making education well worthwhile. These effects would get revised down by her later work – she considers how changes in schooling change which skills are rewarded in the economy, which lowers the wages of prior generations; and in any event physical capital accumulation did not keep pace with human capital accumulation – but even that still regards education as plainly worthwhile. This was a big improvement over prior work on education, which was certainly trying hard, but was largely very bad.1
Duflo would continue to do exceptional work with non-RCT methods. “Dams”, with Rohini Pande, is one of my favorites, which introduced a completely novel (and brilliant) instrumental variable strategy we see used even today. Water, as is well-known in the literature, flows downhill. If a dam is built at an elevation of a thousand feet above sea level, it can easily irrigate downstream places which are at an elevation of 950 feet, but it cannot irrigate the places upstream at 1050 feet. In addition, you cannot simply build a dam anywhere. Dams for irrigation purposes can’t be too steep, while with dams for power generation, the steeper the better. When you have exogenous placement of dams, and otherwise similar places up and downstream, you can infer the effect of dams.
According to Duflo and Pande, dams benefit the places downstream, but actually make the districts upstream worse off. These effects are mediated through institutions, and places where landlords are responsible for tax collection saw bigger increases in poverty, even when the productivity shock was the same. They build off a paper by Banerjee and Lakshmi Iyer, who used differences in colonial era institutions due to takeovers by the British during the Princely State era. People would later use this method to study other things, notably the effect of sewers.
Still, they wanted to do more to help the world. They wanted to go out and test the theories. For that, I think we need to turn to their frequent coauthor, and co-laureate, Michael Kremer. Kremer is, by general acknowledgement, the intellectual origin behind running RCTs in development. Banerjee writes in his biographical statement upon receiving the Nobel, “…in particular I remember a conversation where he wondered aloud why we economists don’t do randomized controlled trials (RCTs). It turned he was doing one [sic]. This was I think in 1995. I went home and thought hard about it and was entirely persuaded. RCTs offered so many possibilities that one could never get to otherwise. Emboldened by the fact that I had some confidence in our ability to collect data, I started organizing my own first RCT – and persuaded Michael join me.”
Kremer’s background is as a macroeconomist, and his first few contributions are in this vein. The earliest works on endogenous growth did not take into account population – rather, they treated human capital like any other capital, which must be invested in and accumulated. Kremer, in “Population Growth and Technological Change”, argues that what matters are ideas, and that these ideas are randomly generated as a function of the number of people in the world. The more people, the greater the rate of technological change. We cannot directly observe this, but in a Malthusian world where population expands to eat the whole surplus, we don’t have to. Knowing the population is a sufficient statistic for the level of technological progress, and this we can (very roughly) estimate. Calibrating his model on historical population estimates, stretching from 1 million B.C. to 1990, he is able to show that – slowly but surely – the growth rate of population rose with the level of population.
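A stylized reduced form shows why this differs from ordinary exponential growth. The sketch below assumes the growth rate of population is simply proportional to its level, so that dP/dt = g·P². The constant `g`, the starting population, and the function names are illustrative assumptions, not Kremer's calibration. Under this assumption each doubling takes about half as long as the one before, whereas with a constant growth rate every doubling would take the same time.

```python
def simulate(p0, g, years, dt=1.0):
    """Euler steps of dP/dt = g * P**2 (growth rate proportional to level)."""
    p = p0
    path = [p]
    for _ in range(int(years / dt)):
        p += g * p * p * dt
        path.append(p)
    return path

def doubling_times(path, dt=1.0):
    """Time taken by each successive doubling of the initial population."""
    target = 2 * path[0]
    crossings = [0.0]
    for i, p in enumerate(path):
        while p >= target:
            crossings.append(i * dt)
            target *= 2
    return [b - a for a, b in zip(crossings, crossings[1:])]

if __name__ == "__main__":
    path = simulate(p0=1.0, g=0.001, years=990)
    for k, t in enumerate(doubling_times(path), start=1):
        print(f"doubling {k}: {t:.0f} years")
```

In continuous time the equation has the solution P(t) = P0 / (1 − g·P0·t), which goes to infinity at a finite date. That is a reminder that the reduced form can only describe the Malthusian era, not the world after the demographic transition.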
The implications for today’s great fertility crisis should be clear. Stagnating population means a slower rate of technological discovery, and a slower rise in living standards. Kremer does not directly consider this in his model, but if you allow for knowledge to depreciate over time, then a sufficiently small population will slide quickly into extinction. The technological stock of isolated islands like Tasmania dropped over time, including losing the knowledge of fire, while on very small islands like Flinders Island, humans went extinct.
Kremer also has some non pareils, the best of which is “Elephants”, with Charles Morcom. It has essentially no connection to the rest of his work; it is merely brilliant. The possibility of a population going extinct leads to interesting dynamics, where the survival of elephants is determined by everyone’s beliefs about their survival. If everyone thinks that the elephants will go extinct, they will – everyone is incentivized to rush out and kill an elephant now before there are no more. The government can then prevent elephants from going extinct simply by credibly committing itself to crack down on poaching if the elephant population falls below a certain level.
Of course, poor African governments – or any government, really – are not known for their powers of commitment. Kremer and Morcom propose a solution – the government should stockpile captured ivory, and promise to release it if and only if the number of elephants falls below a certain level. This is extremely easy to hold to, and puts the government’s incentives in the right place. This does, notably, only work for durable goods, like ivory – however, non-durable goods have different dynamics, since everyone has to sell in the same time period. In that world, people rushing out to kill the animals drives down prices too, preventing it from being rational to go hunting.
A strand of Kremer’s work, much of it with his wife Rachel Glennerster, is on how we should fund the research and development of pharmaceuticals. So much of the improvement in living conditions in the poorest nations has been due to new medicine. In his JEP paper “Pharmaceuticals and the Developing World” he gives the example of how Vietnam, despite having a GDP per capita similar to that of the United States in 1900, has a life expectancy 22 years longer than America’s was then. Kremer is concerned not just with distributing new medicines, but making sure they are developed to begin with. Since the developing world is poor for reasons largely beyond its control, there is a great disconnect between which medicines most raise utility, and which generate the most profit. In the year 2000, America accounted for 40% of the entire market for pharmaceuticals, and Africa accounted for 1%, when the marginal value of spending more on medicine in Africa would be far higher. Who demands it does indeed have a big effect on which drugs are developed – Acemoglu and Linn (2004) would use plausibly exogenous shifts in the demographic composition of America to infer how much R&D responds to demand, finding that a 1% rise in potential market size leads to a 4% increase in drugs approved.
The triumph of this work is advance market commitments, largely written with Rachel Glennerster. You may not have heard that term before, but you have seen it work – its best known form is Operation Warp Speed. When the invention of a medicine has massive positive externalities (and the Covid vaccine did – estimates from the stock market give an approximate value of 5 to 15% of global wealth) technological progress from many competing companies will come too slowly. Each fears that the others will succeed, and make their investments not worthwhile, in what is called the “business-stealing” externality. To solve this, the government needs to commit in advance to buying a certain number of doses at a pre-negotiated price, and buy this regardless of how many companies produce a sufficiently good product within a timeframe.
Kremer has also proposed novel methods for aligning social costs and benefits in drug development, though the method could be extended to all innovation. If the government wishes to incentivize drug discovery, it can use prizes, patents, or subsidies. Prizes work great when what you want done can be perfectly specified in advance, and you know how much the market would value it, but these are rarely known. They also discourage changes in drug quality, which are hard to specify, and do not reward failed efforts which make it easier for others to create the drug. Patents mean that you do not pay for valueless innovation, but they create a massive distortion in the market, cannot ensure the optimal amount of research, and still do not reward failed efforts. Direct subsidies to research do reward unlucky searches, but it is difficult to align people’s incentives toward actually producing the drug, and we often do not know what we are searching for.
Kremer proposes patent auctions. Whenever someone files a patent, it is automatically put up for sale. 90% of the time, the government buys it for the winning bid, and puts it in the public domain – the other 10% of the time, the winner buys it at the winning bid price. (The exact form of the auction would likely be an ascending auction, because people’s values are correlated and it is probable that they are risk-averse. Under those conditions, an ascending price auction maximizes the auction revenue; see Milgrom and Weber (1982)). Conceivably, the system could be used for every kind of patent. I think the primary objection to this would be if there are multiple patents used in an interlocking system, and only the whole is useful. A product market rival could then wreck the whole system, or we would not meaningfully get the benefits of freed patents. It is, nevertheless, a very clever idea that would be an improvement over our present system.
His other great theory paper of his early years totally reshaped my worldview. In “The O-Ring Theory of Economic Development”, he changes the production function from one in which goods are produced in one step, to one where it requires many steps, an error in any one of which wrecks the whole process, like how an error in the O-ring blew up the Challenger space shuttle. It is a small change, but one with profound implications. The probability of error no longer has a linear effect on income, but an exponential one. If we think of the error rate as a function of human capital – or perhaps even being human capital – it is possible to explain very large differences in income between countries with considerably smaller differences in skill.
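The arithmetic behind that claim is short. The sketch below is a stripped-down version of the idea rather than Kremer's full production function: it assumes every task is done by a worker with the same success probability `q`, that the product is worth `value` only if all `n_tasks` steps succeed, and it leaves out capital entirely. The function name and the numbers are illustrative.

```python
def oring_output(q, n_tasks, value=1.0):
    """Expected output when all n_tasks steps must succeed, each with prob q."""
    return value * q ** n_tasks

if __name__ == "__main__":
    for n in (1, 5, 20):
        high = oring_output(0.99, n)
        low = oring_output(0.95, n)
        print(f"{n:2d} tasks: high-skill {high:.3f}, low-skill {low:.3f}, "
              f"ratio {high / low:.2f}")
```

With one task, a four-point gap in skill is a gap of about 4% in output. With twenty tasks, the same gap in skill makes the high-skill team roughly 2.3 times as productive.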
Kremer implicitly uses this in a now almost forgotten paper with Olivier Blanchard, called “Disorganization”. The fall in output in the Soviet Union upon removal of many of the price controls is strange, because the effect should be the same as removing a tariff or any other distortion. Kremer and Blanchard argue that the Soviet Union had many firms which often had single, highly specific, suppliers, and when the central planner was no longer around to keep them all in line, the relations between firms broke down. They show that goods which have longer, more complicated production processes saw greater declines in output. (In footnote 16, they thank Esther Duflo for providing the data. She had tracked down unpublished data. It seems quite in character.)
The O-ring paper has deeper implications than just that, though. At the firm level, like will cluster with like. One company will have the best people at every task, and other companies will have the next best people, and so on. We should also expect the lower-skill firms to be smaller, and thus for firms in poorer countries to be smaller. If production takes place sequentially, then lower skilled workers (and firms, and countries) will work at the beginning of the process, where mistakes matter less, and higher skilled entities will produce at the end of the process. All of a sudden, the order of production matters in economics, and shocks to productivity at different stages will have heterogeneous effects. One can generalize the idea of exponential error to institutions, and grasp how small differences in culture might lead to very large differences in corruption and such.
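The clustering result follows from the same multiplication. In the toy example below (four workers, two two-person firms, skill levels chosen purely for illustration), total output is higher when the two best workers share a firm than when each firm gets one strong and one weak worker, because q_H² + q_L² exceeds 2·q_H·q_L whenever the two skill levels differ.

```python
def firm_output(skills):
    """Output of one firm: the product of its workers' success probabilities."""
    out = 1.0
    for q in skills:
        out *= q
    return out

def total_output(firms):
    """Sum of output across firms for a given assignment of workers."""
    return sum(firm_output(f) for f in firms)

if __name__ == "__main__":
    sorted_firms = [(0.9, 0.9), (0.5, 0.5)]  # like works with like
    mixed_firms = [(0.9, 0.5), (0.9, 0.5)]   # one strong, one weak per firm
    print("sorted:", total_output(sorted_firms))
    print("mixed: ", total_output(mixed_firms))
```

Since sorting produces more in total, firms with strong workers can outbid everyone else for other strong workers, which is why like ends up with like.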
Even more strikingly, there are now multiple equilibria of skill attainment. If your returns to human capital acquisition are dependent upon what everyone else around you is doing, then low skill is just as much an equilibrium as high skill. A “big push” to obtain skills can raise everyone’s living standards, but won’t happen on its own. Access to immigration can actually raise the skill level of people in the country which people are emigrating from, because the returns to gaining skill are now higher.
If you believe that poor countries are stuck in bad equilibria, as Kremer does, then the natural question is how we get them out. Education is the big one, and so much of his RCT work focuses on education. Is it cost-effective? How do we make it better at forming skills? How do we get teachers to show up? How do we make kids able to actually learn? This last question is the motivation behind the seminal paper “Worms”, with Ted Miguel.
Intestinal helminths (better known as worms) are endemic in much of the world, and cause children to be malnourished. Giving deworming medicines to everyone is often cheaper than testing for infection, so the most efficient way to treat them is to simply give everyone the medicine. Many prior studies show surprisingly small effects, however, which is extremely weird. Everyone agrees that having intestinal worms is bad for you. Everyone agrees that the medicine does treat the ailment (though reinfection will occur). So why small effect sizes?
Kremer and Miguel find that there are enormous externality effects from treatment, and that the untreated control group is also benefiting from the treatment. Studies which randomize who in a school gets it would not be able to show the effect, because the treatment interferes with the transmission of the disease. They avoid this by randomizing when different schools receive the treatment en masse, and then showing the positive spillovers to nearby schools.
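The logic can be laid out with a toy linear model. In the sketch below, a child's infection burden falls by `direct` if she takes the medicine and by `spill` times the share of her schoolmates who take it. These numbers are invented for illustration and are not Miguel and Kremer's estimates, and the sketch ignores spillovers between schools and incomplete take-up.

```python
def infection_burden(own_treated, share_treated, base=1.0, direct=0.3, spill=0.2):
    """Toy outcome: own treatment and schoolmates' treatment both reduce burden."""
    return base - direct * own_treated - spill * share_treated

def within_school_estimate(share=0.5, **kw):
    """Randomize pupils inside one school; compare treated with untreated."""
    treated = infection_burden(1, share, **kw)
    control = infection_burden(0, share, **kw)  # controls still get the spillover
    return control - treated

def school_level_estimate(**kw):
    """Randomize whole schools; compare a treated school with an untreated one."""
    treated_school = infection_burden(1, 1.0, **kw)
    control_school = infection_burden(0, 0.0, **kw)
    return control_school - treated_school

if __name__ == "__main__":
    print("within-school estimate:", round(within_school_estimate(), 3))
    print("school-level estimate: ", round(school_level_estimate(), 3))
```

The within-school comparison recovers only the direct effect, because the spillover lowers the burden of treated and control pupils alike and cancels out. The school-level comparison picks up both.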
They found massive positive effects very cheaply. The cost of getting one extra year of school attendance was $3.50, and their estimate of the cost of saving one disability-adjusted life year was $5. Spillovers accounted for 76% of the treatment effect. Later follow-up of the long term effects showed that the earlier school attendance led to increased wages in better jobs.2
Kremer has several other studies in Kenya – the one that he was referring to in his conversation with Banerjee was on textbooks and school uniforms. These came back with extremely mixed results. A lot of inputs simply didn’t matter. The best interventions were not textbooks or flipcharts – they were a warm meal, and a full belly. I have neither time nor patience to follow up in any comprehensive manner on the thousands of educational studies done since, but I will state my impression that very often, what we care about for improving outcomes are just the simple things. Investing in better HVAC systems is often much better than the new fad for pedagogy.
Much of Banerjee and Duflo’s RCT agenda has been education as well. With Kremer and Dupas (and not Banerjee), Duflo provided some of the best evidence for the efficacy of tracking, and its interaction with teacher incentives. They found that splitting classes by ability, and allowing curriculum to differ between them, substantially improved student outcomes. Strikingly, the tracking benefited all students. There is no difference in average outcomes between the best students in the lower achieving cohorts, and the worst students in the higher achieving cohorts.
Importantly, tracking only works when it changes the curriculum taught. Some studies of tracking find null results, but this is only because they are teaching the same material at the same pace to the students. If you don’t allow the smarter students to race on ahead, you are holding them back. Analyzing the same experiment, they also found that students who were taught by contract teachers, rather than regular civil service teachers, had better results. They did find worse results for free education in Ghana, with it leading to income gains only through greater access to government jobs, where it is questionable that that actually improves economic outcomes.
They care a lot about their advice being useful to policymakers, and actually leading to change. Duflo’s Ely Lecture, “The Economist as Plumber”, can be taken as an expression of her ethos. It isn’t enough to try and find the one clever change which, when plugged into the economic machine, will lead to big, positive, indirect changes. Your clever change can be foiled by lots of little details. In order for economics to be good, it must be willing to roll up its sleeves and attend to the minutiae. For example, Banerjee (and co-authors) found that simply mailing cards with more information about what a subsidized rice program actually did substantially increased the program’s efficiency. Beforehand they were being defrauded by local leaders, who misrepresented what the program did! Or in another instance, reduced-smoke cooking stoves are a highly promising way to reduce mortality. Yet, if they are fiddly and you don’t work hard enough to encourage adoption, people will stop using them. Duflo (with Rema Hanna and Michael Greenstone) found that the positive effects on smoke inhalation disappeared entirely by the second year of the study.
Banerjee and Duflo’s RCT work has also had some incredibly important null results. Their findings are not merely trivial confirmations of obvious things, but genuinely change our policies. In particular, their research has been critical in moving the international aid community away from microfinance, and toward things which work.
The theory is appealing, and I have no doubt they came into it hoping that it would indeed work. Many of the distortions in the economies of developing countries are summarized under the heading “credit market imperfections”. People want to smooth consumption over time, but cannot. This leads to them choosing inefficient, but less risky, production techniques. Policymakers thought that this could be averted at little cost with “microfinance”, or small loans. So highly was this regarded that Muhammad Yunus was in fact awarded the Nobel Peace Prize for his work in Bangladesh.
The only trouble is, it doesn’t work. Banerjee, Duflo, Glennerster, and Cynthia Kinnan tested it with an RCT, and found simply no positive effects from microcredit. It just doesn’t do anything. There is no discernible difference in health, consumption, well-being, anything. There have been a number of studies evaluating this, and they consistently agree. Microfinance is not worth doing. If you want to benefit the poor, you are going to need to just give them money, which there is good evidence does work. The best work on this was led by Kremer’s coauthor, Ted Miguel, and can be found under “General Equilibrium Effects of Cash Transfers”. Simply giving out a thousand dollars at random leads to families, including non-treated families, getting a total increase in consumption equal to about $2500. One of Duflo’s early papers touches on this as well – in her estimation, the rapid equalization of white and black pensions in South Africa led to an increase in the height of children who had more exposure to higher incomes.
Something which can work is a nudge: a relatively small change in incentives which tries to correct a behavioral bias. Duflo cites nudges favorably in “The Economist as Plumber”, because they are not very harmful if we misjudge how irrational people are. If people do not have behavioral biases, then the nudge does nothing; if they do have behavioral biases, then it raises utility. Duflo, Kremer, and Jonathan Robinson investigate fertilizer purchasing in Kenya. Fertilizer is incredibly profitable when used by farmers in moderate amounts, and is manifestly underutilized. If you give people a subsidy, however, people overuse fertilizer. Since the reason why people underinvest is a behavioral bias – people defer making the purchase for too long – offering a small subsidy, but only at the beginning of the season, can have a big effect at a much lower cost than subsidizing it year round.
The last great detail which will frustrate the aims of economists is that of ineffective, or even hostile, government. Some of Banerjee and Duflo’s early work focused on “Addressing Absence”. Government employees in a lot of places in the world simply wouldn’t show up to work. They might not even exist! The Indian healthcare system is notionally one with free, universally provided primary healthcare. However, since most of the time the nurses didn’t show up, nobody even bothers going – they pay for private healthcare instead. In “Putting a Band-aid on a Corpse”, they implemented an attendance monitoring scheme, with initially excellent results. In 18 months, though, the program was all but defunct – the local government did not have the stomach to go through with actually punishing delinquent nurses. The status quo was preferred by those in power.
Or take an anti-corruption campaign in Bihar. They, with Santhosh Mathew as the driving force behind it, implemented better accounting practices in the government. Whenever someone signed off on an expense, the identity of the person who authorized it was recorded and kept. It had a lot of positive effects – in particular, the reported expenditures on government work dropped by a fifth, without any change in the actual amount of work done. Many of the supposed employees never existed to begin with. Unfortunately, this is what led to the program being canceled – local politicians disliked their graft being shut off, and lobbied for it to be undone.
The RCT agenda is not a panacea, insofar as it can only lead policymakers (the horses) to water, but cannot make them drink. In many places, they have been frustrated by recalcitrant bureaucrats, by entrenched special interests, and by those in privileged positions who would not want to accept the truth. But is there any other method by which you can make a government do what it does not want to do? All we can do then is give the public ammunition to lobby for the right policy, and improve the world to the extent to which we are permitted. One of the great advantages of RCTs is that they are incredibly easy to understand, and rely on nothing which is not obvious. They are only part of the toolkit, but they are a very important part of the toolkit indeed.
The agenda of RCTs is an agenda which invites criticism, but not refutation. No one, not even the harshest critics, argues that it is not on some level useful to know whether a program works or doesn’t work in a particular context. The arguments against it are that it is an error in emphasis, and has limited generalizability. Above all, a particular intervention produces a change in levels, but not a change in growth rates. Development economists should focus on the fundamental causes of growth, and should not waste their talents debugging a welfare program.
I think that this criticism is fundamentally misguided. Growth is a series of iterative improvements. To be the man in the arena actually creating the gains is as consequential as creating the environment where those advances might happen. We are judged not by the elegance of our solutions, but by their effects.
And in our concern that it may not treat the fundamental causes of stagnation, let us not lose sight of what it has done. The policies informed by J-PAL’s experiments have improved the lives of over 600 million people. They are the intellectual progenitors of effective altruism in practice, and they are my heroes. If I should do one millionth of the good that they have done for the world, that would be enough.
There are some background sources which I found useful for understanding the topic, but did not explicitly link in this essay’s text. Benjamin Olken wrote the Nobel Prize essay, which I found quite helpful for sorting through the mass of RCTs, and which corrected a misconception of mine over the sequence of Kremer’s experiments in Kenya. Christopher Udry’s essay in honor of Duflo was lucid and informative. I also read many more papers than could reasonably be included, even under the incredibly expansive inclusion standards of this essay. In particular, I would like to draw attention to “The Economic Lives of the Poor”; to “How High Are the Rates of Return to Fertilizer?”, which found that the returns were highly concave, such that bad recommendations turned returns negative; “On The Road”, which is a very late and out of character work quantifying the benefits of highways in China; “Good Policy or Good Luck?”, which argues that growth rates decade to decade are too wildly unstable to actually be attributed to policy over good luck; the work on auditing and pollution in Gujarat, which would later turn into a test of pollution markets in developing countries; “There Cannot Be Adverse Selection If There Is No Demand” (which is an objectively funny way for a trial to flop); “Why Does Misallocation Persist”, which points out just how extremely weird persistent between-firm misallocation is, as compared to between-industry misallocation; and lastly, “Odious Debt”, which is an elegant attempt in the line of elephants to prevent loans which will just be looted.
I remain baffled by how Card’s method of taking distance from college as an instrumental variable ever made it past the drawing board. (People move for non-random reasons!)
There would later be some controversy over whether the effects were accurately measured. I do not believe the criticism, however, which, after correcting a few small errors, hinges on treating groups which were in line to be treated, but were not yet treated, as part of the treatment group rather than the control group. This is bizarre to me, as the not-yet-treated groups are otherwise indistinguishable from the control groups. Also, the critics do not cover the externalities argument in any depth. As Chris Blattman writes, “To be quite frank, you have throw so much crazy sh*t at Miguel-Kremer to make the result go away that I believe the result even more than when I started.” For more, read “Worm Wars” from Berk Oezler, and the synoptic “Worm Wars: The Anthology”. In general, we need to separate out what we mean by “fails to replicate” and by “is not robust”. Arguing about which statistical practice is best is very different from claiming that the data are fake, or that the stated analysis plan does not produce the reported results. I agree entirely with Michael Clemens here.
Thanks Nicholas, this is a fantastic essay!