Price Indices, Inflation Inequality, and CPI Bias
The challenges which affect the construction of a price index
One of the fundamental tasks of economics is measuring the level of real income in a country. We need to measure the effects of policies in order to be able to say whether they work or not. Finding the real change in income is also necessary to be able to say what the rate of inflation is, since inflation is simply the change in nominal GDP minus the change in real GDP. Conducting monetary policy requires a prompt and accurate assessment of the actual state of the nation.
Calculating real income is much harder than it looks, however. For starters, it is only with great difficulty that we observe the quality of goods. The ideal price index is “hedonic”, and adjusts the price for how much utility it gives someone. This can be recovered from actual data if what gives people utility is a linear function of an observable attribute. An easy example of this is with computers – the current approach is to regress the price paid on computing power, and extract a price per unit of compute. As we move into other periods, we then adjust the “prices” accordingly, such that a computer becoming available at the same price but with twice the computing power counts as the price falling by half. This can work in multiple dimensions too – one of the early papers by Zvi Griliches (who pioneered this approach) tried to find the real price of fertilizer. Eventually he realized that you can break down fertilizer into a few important ingredients (nitrogen, phosphoric acid, and potash), get a coefficient for each ingredient, and use those to adjust the prices. (He discusses it in a somewhat rambly reminiscence here). Another method of adjusting for quality growth, which can be generalized to more goods than computers and fertilizer and cars and the like, is to estimate a “quality Engel curve” by seeing how much more the rich are willing to pay for a unit of a good, as proposed by Bils and Klenow (2001). For example, cars have a higher slope than vacuums, which indicates that there is more of a difference in the quality of cars than in the quality of vacuum cleaners. If we assume that there is no screwy stuff going on with signaling and Veblen goods, and that the quality slope stays stable over time, then we can infer quality from how consumers change their shares of consumption over time.
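To make the hedonic approach concrete, here is a minimal sketch of such a regression. Everything in it – the data, the log-log functional form, the variable names – is an illustrative assumption of mine, not the actual BLS or Griliches procedure.

```python
# Minimal hedonic regression sketch (illustrative data, not a real estimation).
import numpy as np

# Hypothetical computer models: price in dollars and computing power (e.g. gigaflops).
prices = np.array([800.0, 1000.0, 1500.0, 2400.0])
power = np.array([100.0, 140.0, 230.0, 400.0])

# Regress log price on log computing power: log(p) = a + b*log(q) + e.
# The slope b is the implicit price of the quality attribute.
X = np.column_stack([np.ones_like(power), np.log(power)])
a, b = np.linalg.lstsq(X, np.log(prices), rcond=None)[0]

# Stripping out the part of the price explained by the attribute gives a
# quality-adjusted price: a machine sold at the same price with twice the
# power registers as a price decline.
quality_adjusted_log_price = np.log(prices) - b * np.log(power)
print(b, quality_adjusted_log_price)
```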
This is still unsatisfying when utility and the attribute have a non-linear relationship. We are also left adrift when the good is entirely new, and cannot be thought of as a combination of attributes. For example, before electronic computers were invented, there were “computers” – people assigned to do tedious calculations for physics and such. They could do perhaps one calculation per second. Meanwhile a standard desktop computer can do around ten trillion computations per second. Yet, it is to some degree absurd to say that computers provide ten trillion times the utility at the same price as before. This is due to non-homotheticity, which is that the goods people consume change in their relative quantities as income changes. This is a profound challenge to our aims, but I will bury discussion of it till later in the article.
Estimating the value of new goods is commonly done with “structural” assumptions. For example, you could assume a constant elasticity of substitution between varieties, and say that consumers gain from having more choices. (Constant elasticity of substitution is abbreviated as CES. You should learn this acronym, I use it all the time). Generally you use a nested-CES function, which accounts for the fact that goods are more similar to each other within categories than between categories, so you might say that there is one elasticity for substituting between different types of fruit, another for substituting between different types of food, and another for substituting from food to totally different expenditures.
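To fix ideas, here is a minimal sketch of a nested-CES price index; the goods, taste weights, and elasticities are all invented for illustration.

```python
# Nested CES price index sketch (all numbers invented for illustration).
def ces_price_index(prices, weights, sigma):
    # Exact CES price index: P = (sum_i w_i * p_i**(1 - sigma)) ** (1 / (1 - sigma)),
    # with taste weights w_i summing to one and sigma != 1.
    return sum(w * p ** (1 - sigma) for p, w in zip(prices, weights)) ** (1 / (1 - sigma))

# Lower nests: varieties within a category substitute easily (high sigma).
fruit_index = ces_price_index([1.0, 1.2, 0.9], [0.5, 0.3, 0.2], sigma=6.0)
grain_index = ces_price_index([2.0, 2.5], [0.6, 0.4], sigma=6.0)

# Upper nest: whole categories substitute less easily (lower sigma).
food_index = ces_price_index([fruit_index, grain_index], [0.7, 0.3], sigma=2.0)
print(fruit_index, grain_index, food_index)
```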
Alternatively, you could just guesstimate it. Dead serious, that’s the best description of the procedure at the Bureau of Labor Statistics. Referring to Bils and Klenow here: 46% of the time the BLS simply says that a new good is comparable to an old good and makes no quality adjustment, 22% of the time it uses either hedonic pricing or the manufacturer’s estimate of how much it cost to make, and for the remainder it sets the quality-adjusted price at whatever is necessary to make the good’s inflation for that month the same as the other goods in its narrow category. This can lead to obvious silliness – when Walmart entered markets, offering goods at cheaper prices than traditional grocery stores, the BLS didn’t actually register this as a price change at all. Rather, it simply assumed that Walmart must be offering lower-quality goods at the same quality-adjusted price, and so there was no improvement from the new establishment. This is obvious nonsense – but how is anyone supposed to know exactly how much utility people derive from two grades of ham when they are never observed at the same price?
In discussing how to compute a price index, it would be useful to start with the simplest, and then proceed by seeing how later methods fix the inadequacies of prior ones. The simplest is to fix the consumption bundle at the beginning and see how its price changes. Alternatively, you could fix the bundle at the end, and then calculate how prices changed over the past. These are, respectively, the Laspeyres and the Paasche price indices. Both of them will err substantially, though, both because new goods are introduced, and because consumers will substitute between goods as their prices change. If we spend 1% of our income on apples in the first period, and then the price of apples relative to oranges rises, we will substitute some of our purchases from apples to oranges. You can take account of this, to some degree, by bundling goods in categories together and using a geometric mean to find the average price change of that bundle. The geometric mean both damps the effect of outliers and implicitly allows for some substitution within the category (it is exact when the elasticity of substitution is one).
A still better way around this is to combine information from both the beginning and the end. The Fisher index is the geometric mean of the Laspeyres and Paasche indices; the Törnqvist index weights each good’s log price change by the average of its expenditure shares in the two periods. Chaining these indices across many periods approximates the idealized Divisia index, which implements the same approach in continuous time – the discrete indices are not the true rate of change of the cost of the bundle, but a breaking of it up into discrete periods. The US government uses both types of index, with the CPI being (approximately) a fixed basket, and the PCE price index allowing for substitution. Obviously, the PCE is better, but it is more administratively complex to construct – the CPI can be estimated using only easily available price data.
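A toy computation, with invented prices and quantities, makes the differences between these indices concrete.

```python
# Toy two-good, two-period example of the Laspeyres, Paasche, Fisher, and
# Törnqvist indices (all numbers invented).
import math

p0 = {"apples": 1.00, "oranges": 1.00}  # period-0 prices
p1 = {"apples": 1.50, "oranges": 1.00}  # period-1 prices
q0 = {"apples": 10, "oranges": 10}      # period-0 quantities
q1 = {"apples": 6, "oranges": 14}       # period-1 quantities (substitution toward oranges)
goods = p0.keys()

# Laspeyres: cost of the old basket at new prices. Paasche: cost of the new basket.
laspeyres = sum(p1[g] * q0[g] for g in goods) / sum(p0[g] * q0[g] for g in goods)
paasche = sum(p1[g] * q1[g] for g in goods) / sum(p0[g] * q1[g] for g in goods)

# Fisher: geometric mean of the two.
fisher = math.sqrt(laspeyres * paasche)

# Törnqvist: log price changes weighted by average expenditure shares.
def shares(p, q):
    total = sum(p[g] * q[g] for g in goods)
    return {g: p[g] * q[g] / total for g in goods}

s0, s1 = shares(p0, q0), shares(p1, q1)
tornqvist = math.exp(sum(0.5 * (s0[g] + s1[g]) * math.log(p1[g] / p0[g]) for g in goods))

# Laspeyres overstates the cost-of-living change and Paasche understates it;
# Fisher and Törnqvist land in between.
print(laspeyres, paasche, fisher, tornqvist)
```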
This is something of an aside, but given the importance of sales, I am not actually convinced by this line of reasoning. I am not convinced you can really calculate the CPI without the retail scanner data, and if you are waiting around for retail scanner data then you may as well wait around to learn the exact quantities purchased. Nakamura and Steinsson (2008) estimate that half of all price changes are temporary sales – you can’t just assume that the posted price is the price consumers actually paid, and in any event, you have to weight by the share of people buying at sale and non-sale prices.
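To see why the weighting matters, a trivial example with made-up numbers: if an index samples only the posted regular price, it misses that a large share of units actually sell at the sale price.

```python
# Average price actually paid when some units sell on sale (numbers invented).
regular_price, sale_price = 5.00, 3.50
share_bought_on_sale = 0.40  # fraction of units bought during the sale

avg_price_paid = share_bought_on_sale * sale_price + (1 - share_bought_on_sale) * regular_price
print(avg_price_paid)  # 4.40, well below the posted regular price of 5.00
```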
An implication of non-homotheticity is that it is possible for different groups to have different inflation experiences. The goods which rich people buy – or possibly those which poor people buy, though in practice it’s the former – could fall in quality-adjusted price at a faster rate than the bundle of goods the other group consumes. Xavier Jaravel (2019) is the go-to reference here. The primary channel for this, in practice, is that increased demand leads to the entry of additional goods. The rich have greater demand, so there are more new goods catering to their preferences. We cannot observe the utilities for each of these new goods, so we have to make structural assumptions, including the CES mentioned earlier. Jaravel tries out a number of ways in which these new goods could enter into utility, and shows that the result is robust across them.
He establishes that it is in fact the increase in demand causing the change in innovation by product group, using a shift-share design along the lines of Acemoglu and Linn. He relaxes the assumption that preferences vary only with income, and grants that households with different characteristics have different demand patterns in the initial period, which then stay constant as the numbers of each household type change. So, for example, where Acemoglu and Linn found that an increasing number of old people increased the supply of new drugs which cater to the diseases of old age, Jaravel finds this for all goods.
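For readers unfamiliar with the design, here is a minimal sketch of a generic shift-share (“Bartik”) predicted demand change. This is my own toy construction with invented numbers, not Jaravel’s actual specification.

```python
# Generic shift-share (Bartik) sketch: predicted demand growth for each product
# group = initial exposure to each household type x national growth of that type.
# All numbers are invented for illustration.
import numpy as np

# Rows: product groups; columns: household types.
# initial_shares[g, h] = share of group g's initial spending coming from type h.
initial_shares = np.array([
    [0.7, 0.3],  # product group A, mostly bought by type 0
    [0.2, 0.8],  # product group B, mostly bought by type 1
])

# National growth in total spending by each household type (the "shift").
type_growth = np.array([0.05, 0.20])  # type 1 is growing much faster

# Predicted (demand-driven) growth for each product group (the instrument).
predicted_demand_growth = initial_shares @ type_growth
print(predicted_demand_growth)  # group B receives the larger predicted shock
```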
It’s important to note that this difference in inflation experiences shows up only with extremely disaggregated goods. The CPI’s expenditure weights come from the Consumer Expenditure Survey, which lumps goods together into categories and does not report exact quantities for specific goods. The retail scanner data used by Jaravel and others is at the barcode level, and so can pick up much more fine-grained variation in the products which people buy. Of course, services do not have a barcode and so are not included in retail scanner data. Services make up over half of our expenditures, so the cost of omitting them is considerable.
What’s striking is that, with non-homothetic preferences, it is possible for two groups to face persistently different inflation rates, with no difference in the rate of consumption growth for either group. Ezra Oberfield has a paper currently in R&R at Econometrica demonstrating how this is possible. Suppose prices are falling for two reasons – there is a secular trend of technology improving, and there is also a demand-driven channel for innovation: having more people demand a good leads producers to pay the fixed costs of adopting new technologies. Everyone has identical preferences, differing only in income. In a world where everyone is equal, everyone shares a consumption basket, and prices fall at the same rate for everyone. There is no difference across people in either inflation rates or consumption growth rates.
Now suppose that there is arbitrary income inequality. The goods which the rich are consuming will see more technological progress than the goods which the poor are consuming, and so their prices will fall at a faster rate. Yet, this implies nothing about the rate of consumption growth. When the income of the poor increases, they will purchase the goods formerly purchased only by the rich at the new, lower prices. Consumption is growing at exactly the same rate as before for everyone, but inflation is lower for the rich because the price of the bundle of goods they buy is falling while they are buying it, not before they start buying it.
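A toy numeric example – my own, not Oberfield’s calibration – shows the mechanism.

```python
# Toy illustration: the rich face lower measured inflation because the goods
# they already buy are the ones whose prices are falling, yet everyone's
# consumption grows at the same rate. All numbers invented.

# Prices of a "basic" and a "fancy" variety in two periods.
p_basic = {0: 1.00, 1: 1.00}  # no progress in the basic variety
p_fancy = {0: 4.00, 1: 2.00}  # demand-driven progress in the fancy variety

# Period-0 baskets: the poor buy only the basic variety, the rich only the fancy.
# Inflation measured over each household's own period-0 basket:
poor_inflation = p_basic[1] / p_basic[0] - 1  #  0% for the poor
rich_inflation = p_fancy[1] / p_fancy[0] - 1  # -50% for the rich
print(poor_inflation, rich_inflation)

# When the poor's income eventually rises, they buy the fancy variety at its
# *new* price of 2.00: they enjoy the same consumption growth the rich did,
# but the price decline never showed up in their own measured inflation.
```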
The natural question is how much this matters. It is absolutely possible for inflation inequality to be a meaningful contributor to consumption inequality. Lashkari and Jaravel (2024) propose a method for correcting for non-homotheticity, which works so long as there is at least one period in which people at all incomes purchase at least a little bit of the good. The difference in expenditure shares forms the basis of the correction. An odd implication is that it leads us to revise growth rates down. Inflation inequality means that the cost of achieving the rich’s bundle of goods falls relative to the cost of the poor’s bundle. Basically, in 1955 (when they start the data) consumers had a stronger preference for necessities, since they didn’t have much, and those necessities were also cheaper. The implications for policy are really quite exciting.
That is where my story ends. I do want to cite, for those interested, Baqaee and Burstein (2023), which is sort of the immediate precursor to Lashkari and Jaravel; I refrain from discussing it largely because I have yet to deeply understand a David Baqaee paper. Perhaps in a year I will have written his Wikipedia page.
Writing this article has brought to mind some questions which I intend to explore in future articles, but which I include here both to stimulate the minds of my readers, and to commit myself to writing about them. First, how volatile is income? If we assume that people perfectly smooth consumption, then an increase in the volatility of income would show up as an increase in income inequality with no actual change in consumption inequality. Similar things happen with wealth, which is more obviously a poor proxy for welfare. A recent tweet went viral alleging that the bottom half of Americans have less wealth than the bottom half of the Chinese. Even if the data are accurate, this is obviously nonsense: Americans borrow much more at the start of their lives, while having higher incomes at every point. We also don’t want to mistake a fall in the real interest rate for evidence of an increase in inequality – if people can borrow more at lower cost, they will appear to be less wealthy but will in fact be consuming more.
Second, how much do preferences change over time? We are willing to grant that preferences vary based on observables. Can we even measure if they vary over time? At the very least, allowing for advertising revealing new information about our desires makes life very difficult indeed. If rich and poor people simply have different preferences over what to consume, we cannot make a non-homotheticity adjustment. We could not assume that poor people would eventually mature into the same consumption bundle.