Discussion about this post

Atanas Pekanov:

This is a really great post, congrats! The only thing missing is a couple of the killer charts from some of these now-canonical papers, which make it easy to grasp why and how heterogeneity matters (I think the main figures from Kaplan-Moll-Violante, Patterson, and Bilbiie et al. make an instant impression; I used them at a recent talk to show why heterogeneity matters). Maybe for a next post.

throwing_away_bits_and_bytes:

"The number of possible paths to solve for increases as an exponent of the number of agents"

How could that be? The number of agents that any one agent can interact with and get unique information from is going to be some fixed number in a given simulation tick, assuming we are modeling humans. And that fixed number is going to be small, too: the number of people I talk with in a day is certainly a lot smaller than the number of people in the United States. An agent could respond to aggregate information about a collection of agents, but that aggregate information should be shared rather than unique per agent, so it corresponds to the number of layers of aggregation, which would be log(n) in the population (assuming each layer covers a multiple of the people covered by the layer below it, e.g., the team at the company the agent works at, the company's division, the company, the company's industry, the whole economy), making the sim still only n*log(n).
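To make that concrete, here's a minimal sketch of the kind of layered aggregation I mean (purely illustrative; the function names, branching factor, and coefficients are all made up, not taken from any real model). Each agent reads a fixed number of sampled peers plus one aggregate per layer, so a tick costs roughly n*(k + log n):

```python
# Minimal sketch of the layered-aggregation idea, purely illustrative.
# Each agent reads a fixed number of sampled peers plus one aggregate per
# layer, so the work per tick is O(n * (k + log n)), nothing exponential in n.
import math
import random

def simulate_tick(agents, peers_per_agent=100, branching=10):
    n = len(agents)
    num_layers = max(1, math.ceil(math.log(n, branching)))  # ~log(n) layers of aggregation

    # Build aggregates bottom-up: layer 0 averages groups of `branching` agents,
    # layer 1 averages groups of layer-0 aggregates, and so on
    # (team, division, company, industry, whole economy).
    layers, level = [], agents
    for _ in range(num_layers):
        level = [sum(level[i:i + branching]) / len(level[i:i + branching])
                 for i in range(0, len(level), branching)]
        layers.append(level)

    # Each agent reacts to a handful of sampled peers plus one number per layer.
    updated = []
    for i, state in enumerate(agents):
        peers = random.sample(range(n), min(peers_per_agent, n))
        peer_avg = sum(agents[j] for j in peers) / len(peers)
        layer_avg = sum(layer[min(i // branching ** (d + 1), len(layer) - 1)]
                        for d, layer in enumerate(layers)) / num_layers
        updated.append(0.5 * state + 0.3 * peer_avg + 0.2 * layer_avg)
    return updated

agents = [random.random() for _ in range(100_000)]
agents = simulate_tick(agents)  # cost per tick scales like n * (peers + log n)
```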

The models I see in economics (not that I'm particularly specialized in the field) are typically math equations, which don't translate into very many instructions, maybe, what, 10-100 per agent-agent interaction tops? If each agent interacts with 100 other agents, that's maybe 1k-10k cycles per agent. On top of that, the modelling per agent shouldn't be too hard to keep parallel: agents would probably be pulling information in relatively straightforward ways, and branching probably wouldn't be too much of an issue. So it's not that wild to think that, say, 10 million agents, which I would think would put little strain on RAM/cache capacity, would be ~100 billion cycles per tick tops, so maybe 100 simulation ticks per second on a few-hundred-USD gaming GPU.

Obviously that's a very very rough estimate!
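For what it's worth, here is the same back-of-envelope arithmetic written out, where every constant is one of my rough guesses rather than a measurement:

```python
# Back-of-envelope compute estimate; all constants are rough assumptions.
agents = 10_000_000            # 10 million agents
interactions = 100             # agent-agent interactions per tick
instr_per_interaction = 100    # high end of the 10-100 instruction guess

cycles_per_tick = agents * interactions * instr_per_interaction  # ~1e11, i.e. ~100 billion
gpu_useful_ops_per_s = 10e12   # assumed ~1e13 useful ops/s on a few-hundred-USD gaming GPU

print(f"{cycles_per_tick:.1e} cycles/tick, "
      f"~{gpu_useful_ops_per_s / cycles_per_tick:.0f} ticks/s")
# -> 1.0e+11 cycles/tick, ~100 ticks/s
```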

Now going from 10 million to 10 billion agents isn't just 3 orders of magnitude slower, because unless you have some remarkable compression scheme (not impossible depending on the specifics), you'll exceed the memory capacity of the VRAM. You can put enough SSDs on a RAID card to saturate the bandwidth of a PCIe slot, which can work if the memory access patterns are regular enough, but if each agent uses 1,000 bytes, that's 10 TB, which would take in the ballpark of hundreds of seconds to read in and write out (without getting too terribly fancy, you can get about a couple dozen GB/s of bandwidth).
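Same style of estimate for the memory-bound case at 10 billion agents (again, every number here is an assumption):

```python
# Back-of-envelope bandwidth estimate for out-of-VRAM state; rough assumptions only.
agents = 10_000_000_000          # 10 billion agents
bytes_per_agent = 1_000          # assumed state size per agent
state_bytes = agents * bytes_per_agent          # 1e13 bytes = 10 TB

pcie_ssd_bandwidth = 25e9        # ~a couple dozen GB/s from SSDs behind one PCIe slot
seconds_per_pass = state_bytes / pcie_ssd_bandwidth

print(f"{state_bytes / 1e12:.0f} TB of state, ~{seconds_per_pass:.0f} s to stream it once")
# -> 10 TB of state, ~400 s to stream it once (and again to write it back)
```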

You could probably get some benefit from going multi-GPU and to more expensive GPUs too. Still, even if it is hours per simulation tick at 8 billion agents, the solar system should still be around by the end.

