According to Digital Trends, new research reveals that popular AI models like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 4 have a flawed view of human intelligence. The study, which tested these systems in a classic game theory scenario called the Keynesian beauty contest, found they consistently assume people are more rational and logical than they actually are. In the “Guess the Number” variant of the game, the AIs were given profiles of human opponents, ranging from first-year students to expert game theorists, and asked to predict their choices. While the models did adjust their guesses based on the opponent’s described experience, they still “played too smart,” overestimating the depth of human strategic thinking. This gap between AI expectation and human reality could impact how these systems are used in economic modeling and other fields that require predicting human decisions.
AI Meets Game Theory
Here’s the thing about the Keynesian beauty contest: it’s not about what *you* think is beautiful. It’s about what you think *other people* will think is beautiful. And then what you think they think other people will think. You see how this gets messy fast? The research basically dropped these language models into a digital version of this, where the goal was to pick the number closest to half the average of everyone’s guesses. It’s a perfect test for strategic reasoning.
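To make the rules concrete, here’s a minimal sketch of a single round of that half-the-average game in Python. The player names and guesses are invented for illustration; they’re not data from the study.

```python
# One round of the "guess half the average" game.
# Guesses are illustrative, not from the study.
guesses = {"Alice": 50, "Bob": 33, "Cara": 22, "Dan": 4}

# The winning target: half the mean of all submitted guesses.
target = 0.5 * (sum(guesses.values()) / len(guesses))  # 0.5 * 27.25 = 13.625

# Whoever guessed closest to the target wins the round.
winner = min(guesses, key=lambda name: abs(guesses[name] - target))
print(winner)  # Cara: 22 is only 8.375 away from 13.625
```

Notice the trap: if everyone reasons one step deeper, the average drops, so the target drops too, and the only stable point is everyone guessing zero.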
And the models did try. They read a bio of a “first-year undergrad” and thought, “Okay, this person might not think too deeply.” They read about an “experienced game theorist” and cranked up the strategic calculus. But they still assumed a baseline of logic that most humans just… don’t use. We get impulsive, we overthink, we follow hunches. The AI, trained on vast amounts of text that often explains *ideal* rational behavior, seems to have internalized that as the human default. It’s a classic case of the map not being the territory. The research paper argues this highlights a core calibration problem.
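That “depth of thinking” maps neatly onto the level-k model from behavioral game theory: a level-0 player anchors on the midpoint of the range, and each level above best-responds to a crowd one level below. The sketch below illustrates the miscalibration; it is not the paper’s actual method, and the specific depth values are assumptions.

```python
def level_k_guess(k: int, factor: float = 0.5, anchor: float = 50.0) -> float:
    """Guess of a level-k player in the 'factor times the average' game.

    Level 0 anchors on the midpoint of a 0-100 range; each higher level
    best-responds to a population of players one level below it.
    """
    return anchor * factor**k

# Assumed depths for illustration: experiments on games like this
# typically put real humans around one or two steps of reasoning.
human_depth = 1        # what a typical player actually does
assumed_depth = 4      # what an over-calibrated model might expect

print(level_k_guess(human_depth))    # 25.0   <- where people tend to land
print(level_k_guess(assumed_depth))  # 3.125  <- what a "too smart" AI predicts
# The gap between 25.0 and 3.125 is the miscalibration: the model is
# best-responding to opponents far more strategic than the real ones.
```

Push k toward infinity and the guess converges to zero, the Nash equilibrium; real players rarely get anywhere near it, which is the kind of gap the study found.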
Why This Matters Beyond the Game
So why should we care if a chatbot is bad at a number-guessing game? Because this isn’t just a game. This is about any system that needs to anticipate human choices. Think about algorithmic trading, supply chain forecasting, or even content recommendation engines. If the model powering that system assumes we’re all cool, rational actors, its predictions are going to be off. Way off.
It echoes other worrying findings, like the fact that even top AI systems are only about 69% accurate in some evaluations, or that they can convincingly mimic human personality—a scary tool for manipulation. As the coverage on TechXplore points out, this isn’t an abstract issue. We’re already plugging these models into complex domains. If they don’t understand our inherent irrationality, how can we trust their output?
The Calibration Challenge
This gets at a fundamental challenge in AI development. We’re building systems trained on a compressed version of human knowledge and communication. But that corpus is full of how we *say* we think, or how experts say we *should* think. It’s not a perfect dataset of how we *actually* think in the moment, with all our biases and shortcuts.
Fixing this is incredibly hard. Do you train the AI on more data from real, messy human decision-making? And then risk it learning and replicating our worst, most irrational impulses? It’s a tightrope walk. The researchers suggest this “overestimation” could lead AIs to propose strategies that are too complex for humans to follow or trust in real-world collaborations. They’re playing 4D chess while we’re over here playing checkers and sometimes just knocking the board over.
Look, the takeaway isn’t that AI is dumb. In fact, it’s the opposite—it’s too *theoretically* smart for its own good when dealing with us. The next big leap might not be in making models more logical, but in making them more psychologically astute. Until then, maybe take its predictions about human behavior with a grain of salt.
