Pushing aside GDP for a measure of human well-being turns out to be very, very difficult. Ask Dan Benjamin
UCLA Anderson’s Daniel Benjamin is at the forefront of an international movement to focus economic policy on helping people live happier, more satisfying lives, rather than just maximizing gross domestic product, or GDP.
The United Nations, the Organisation for Economic Co-operation and Development, the European Commission and at least four Nobel Prize-winning economists are on board with multiple initiatives to create new measures of well-being worthy of driving economic agendas. Benjamin and his research partners are working with New Zealand and Israeli government agencies to test alternative measures that could elevate things people really care about — health, social connections, work-life balance and the environment, for example — in making policy decisions.
Yet Benjamin, who has been with campaign happiness almost since its inception in the early aughts, is possibly its biggest killjoy. His research on the subject reads like a litany of reasons no one should take data from today’s happiness and well-being studies too literally, much less use it to try to change the world.
Here’s a sample of findings by Benjamin and his co-authors, in paper after paper: Seemingly straightforward survey questions in major studies of happiness and well-being are rife with unintended ambiguity. People repeatedly misinterpret what researchers are asking, and researchers repeatedly miscalculate what those answers mean. None of the standard research questions consistently prompt the kind of key information the scientists think they are collecting. Survey questions intended to capture everything we actually do care about don’t even come close. In fact, the most popular surveys used in research don’t even ask about some of the issues nearest to our hearts.
Benjamin, who is an expert in survey design, insists none of these problems is deadly to the goal. He and his colleagues gamely offer possible fixes in every study in which they expose flaws. For example, they demonstrate that in some cases, tweaking the wording of survey questions can get economists and their subjects speaking the same language around the topic. Sort of.
But those are baby steps in what Benjamin sees as a very long process to build a measure of national well-being respectable enough to be the basis of policy decisions.
He and his co-authors have long argued that what the world really needs is not a single happiness measure, but an index of well-being aspects, made up of the things that are actually important in our lives, and weighted accordingly. That means meticulously addressing all the above issues and more, for not one but dozens of population subsets, on thousands of survey questions, and calibrating interpretation of the data accordingly.
Benjamin does agree that the need for new measures is urgent. Policymakers worldwide currently work hard to maximize GDP per capita, a widely accepted measure of progress in economics but a terrible yardstick for well-being. The goal encourages more and more consumption and production even if most goods go to only a tiny fraction of the population, or all that production destroys our leisure time and the environment. Wars and natural disasters often become positive events with this formula. There’s simply nothing about a giant GDP per capita that ensures that citizens’ lives are happy and fulfilling.
Right now, he continues in a video interview: “There are serious problems with these (new) measures that we haven’t solved yet … I think the enthusiasm for the happiness type measures has outstripped what we know.” He’d like everyone to take a pause on applying the measures for policy to focus on basic research needed to improve their accuracy.
Meanwhile, some of the policy community has moved on from “can we accurately measure happiness and well-being?” to “why isn’t well-being the main goal of public policy yet?” The U.N.’s “World Happiness Report 2022” tells us that Finland (again) is the happiest nation on earth. The latest report from the OECD’s Better Life Initiative says the populations of one-third of member nations were more satisfied with life than they were a few years ago. In England, members of Parliament are calling for use of the U.K.’s Gross Domestic Wellbeing score as a “guiding star” to set policies to address the country’s collective decline in well-being, as measured annually in that study.
Benjamin sticks with cautious support. “I’m hopeful that eventually these (types of) measures will become useful supplementary measures… whose limitations are well understood,” he says. “I think there are good reasons to be enthusiastic about their promise.”
But even that tepid optimism comes with these enormous caveats.
Happiness Is Overrated
“I don’t think that people maximize happiness in that sense. They actually want to maximize their satisfaction with themselves and with their lives. And that leads in completely different directions than the maximization of happiness.”—Nobel Laureate Daniel Kahneman in a 2018 interview with Tyler Cowen, on explaining why he walked away from happiness research.
Conventional economic theory suggests we make choices in our own best interest, essentially to maximize our own happiness and well-being. When we choose badly, it’s because we didn’t have enough information, or maybe we lacked the intellect to understand the best choice. Modern behavioral economics, including Kahneman’s work, has spent 30-plus years exploring why we make these mistakes.
But what if these off choices aren’t really mistakes? What if scientists instead are misinterpreting what we want?
Some years before Kahneman gave up on happiness, Benjamin and his colleagues began questioning whether the pursuit of happiness is really what drives choices. Their research found that even very informed, very intelligent people don’t always prioritize it in their decisions.
Their first study collaboration in 2012 asked participants to choose between pairs of hypothetical situations, such as a job that pays less but allows for more sleep versus the opposite. Other questions were designed to get at which situation the respondent expected would most improve their happiness. The work was conducted by Benjamin, University of Colorado’s Miles Kimball, Cornell’s and Hebrew University of Jerusalem’s Ori Heffetz and University of Pennsylvania’s Alex Rees-Jones.
Although the choices usually lined up with people maximizing their own happiness, there were systematic discrepancies. Factors such as family happiness and social status, for example, appeared to contribute heavily in decisions for many people.
In 2014, the team followed up with a paper looking at real-life, high-stakes scenarios. They surveyed students from 23 medical schools after the students had selected the residency programs they hoped to land. “The match” is the pinnacle of the med school experience. After a lengthy application and interview process, students rank their residency preferences, residencies rank the students, and overlap determines the outcome. The students often start thinking about the match even before starting med school, and schools prep students on how best to handle the process.
Each participant answered questions about their expectations for things like their social lives, anxiety levels, prestige and spousal happiness if they landed a particular residency. (Repeated for each of their top four choices.) Then they were asked a series of questions to gauge anticipated levels of their own happiness during and after each specific residency.
The students’ choice rankings lined up with the residencies they thought would bring them the most happiness and life satisfaction about 70% to 80% of the time. So if researchers picked which residency would make one of these applicants happiest solely by looking at her choices, they’d get it wrong on about 1 out of 4 tries. Some students favored prestige, for example, or their spouse’s preferred location.
It seems unlikely that so many soon-to-be physicians mistakenly ranked their choices in one of the most important decisions of their lives. Instead, Benjamin says, they intentionally traded happiness, or life satisfaction, to pursue other goals.
Economists Don’t Know Your Happy Place
“People care about more than just what is measured by standard, single-question survey measures of ‘happiness’ or even ‘life satisfaction.’” —2020 Behavioural Public Policy study by Benjamin, Gordon College’s Kristen Cooper, Heffetz and Kimball.
The holy grail of the “Beyond GDP” movement is a new indicator that directly measures the collective level of lifetime satisfaction as you and I might define it. It will assess and track how we’re doing in aspects of our lives that are really important to us, such as friends, families and freedoms. It will not ding a nation’s progress for prioritizing, say, leisure time over production or green energy over consumption, if those things are important to its population. Improving this indicator year after year would become a core policy goal that tempers the push to raise GDP per capita at all costs.
Unfortunately, no one is sure what ingredients make your life satisfying. They’re probably different from mine.
To get around this hurdle, many studies ask individuals to self-assess their levels of well-being via survey questions. The U.K., for example, added four questions about well-being to its annual household survey, including, “Overall, how satisfied are you with your life nowadays?” The World Values Survey, the European Social Survey and several other country-specific surveys rely on similar questions. Gallup provides survey answers for the World Happiness Report. The University of Chicago’s General Social Survey has provided data for thousands of academic and legislative works involving well-being.
Other strategies use educated guesses about our hearts’ desires and quantitative indicators that measure these aspects. The most famous of these, The U.N.’s Human Development Index, devised by Nobel laureate Amartya Sen and Pakistani economist Mahbub ul Haq, gives equal weight to indicators in three categories: longevity, education and standard of living. The OECD’s annual How’s Life? publication and Better Life Index are derived from indicators and survey data in 11 areas, including health and work-life balance.
But these popular measures of well-being miss key aspects of what we actually care about, according to Benjamin’s research. And they overweight things that are way down our own lists of priorities.
Benjamin, Heffetz, Kimball and Cooper lay out examples of this problem in an article published by the International Monetary Fund in December 2021. They return to findings in their early studies, conducted by Benjamin, Heffetz, Kimball, and Cornell’s Nichole Szembrot, in which they listed 136 aspects they thought might rank high in importance to us and asked survey participants to choose between them in pairs. For example, would you choose slightly more love in your life, or slightly more sense of control over your life?
Those early studies found health of major importance to people, but not longevity. Feeling happy wasn’t particularly high up either. The participants cared about living morally and participating in politics and community life. They cared much less about knowledge, skills and understanding the world — factors that weigh heavily in the HDI calculation. Anxiety levels, the subject of one of the U.K.’s four survey questions, barely registered in importance.
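One simple way to turn pairwise choices like these into a rough ranking is to count how often each aspect wins when it is offered. The sketch below is illustrative only: the aspect names and responses are invented, and the win-share tally is a stand-in chosen for simplicity, not the estimation method used in the studies.

```python
from collections import Counter


def win_shares(choices):
    """Return each aspect's share of the pairwise contests it won.

    `choices` is a list of (option_a, option_b, chosen) tuples.
    """
    wins = Counter()
    appearances = Counter()
    for option_a, option_b, chosen in choices:
        if chosen not in (option_a, option_b):
            raise ValueError(f"{chosen!r} is not one of the offered options")
        appearances[option_a] += 1
        appearances[option_b] += 1
        wins[chosen] += 1
    return {aspect: wins[aspect] / appearances[aspect] for aspect in appearances}


# Hypothetical responses, invented purely for illustration.
responses = [
    ("health", "longevity", "health"),
    ("health", "feeling happy", "health"),
    ("longevity", "feeling happy", "feeling happy"),
    ("health", "longevity", "health"),
]

shares = win_shares(responses)
# Aspects ordered from most to least often chosen.
ranking = sorted(shares, key=shares.get, reverse=True)
```

A tally like this ignores how strongly people prefer one option over another, which is one reason a real index would need a more careful model.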
In numerous papers and articles, Benjamin and co-authors lay out detailed plans for constructing their own ideal well-being indicator. But these collaborators know that their approach isn’t ready for prime time either. They consider it a good first crack at devising a comprehensive, properly weighted index that gets at population levels of well-being, encompassing happiness and life satisfaction. They envision their setup will be discussed, criticized and improved upon until it truly reflects our progress toward well-being goals GDP measures lack.
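As a rough illustration of what “properly weighted” could mean, the sketch below combines several aspect scores into one number using importance weights. Everything in it is an assumption made for illustration: the aspect names, the 0-to-1 scores and the weights are invented, and a weighted average is only one possible way to aggregate. It is not the authors’ formula.

```python
def weighted_index(scores, weights):
    """Weighted average of aspect scores (each score on a 0-to-1 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same aspects")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[a] * weights[a] for a in scores) / total_weight


# Hypothetical aspect scores for one population (invented values).
scores = {"health": 0.8, "family": 0.6, "leisure": 0.5}

# Equal weights treat every aspect as equally important.
equal_weights = {"health": 1, "family": 1, "leisure": 1}
# Hypothetical weights reflecting what respondents say matters most.
survey_weights = {"health": 3, "family": 2, "leisure": 1}

equal_index = weighted_index(scores, equal_weights)
survey_index = weighted_index(scores, survey_weights)
```

The same underlying scores produce different index values under the two weightings, which is why getting the weights right matters as much as getting the measurements right.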
We Don’t Understand the Survey Questions
“The questions that are being used now are not being interpreted in ways that economists think that they are, or want them to be, for the purposes that economists are using them… People are interpreting the questions differently from each other. And that’s a problem.” —Benjamin in a 2019 interview at Institutet för framtidsstudier.
Researchers have attempted to skirt the problem of pegging our individual preferences by asking one or a few broad questions that might cover whatever conditions we need for happiness. They read something like this one, from the oft-used General Social Survey: “Taken all together, how would you say things are these days — would you say that you are very happy, pretty happy or not too happy?”
But there are so many different ways we could answer that. For example, I think they’re asking for an assessment of my very personal status, but my family members’ struggles and joys play heavily in my own happiness. I can’t separate the two. And since I’m temporarily just OK (I’ve had a crummy couple of weeks but a “very happy” life so far), does that make me “pretty happy” or “not too happy” here?
Now consider that every individual answering these survey questions runs them through their own personal interpretations of “taken all together” and “you” and “these days” and “pretty happy.” You end up with useless data because people are essentially answering a lot of different questions.
This holy mess of crossed translation showed up repeatedly when Benjamin and his collaborators asked respondents what they had in mind when they were answering questions on well-being surveys.
None of the questions consistently elicited answers that reflect the “self-centered utility” that the researchers were after, according to the findings in a 2021 working paper by Benjamin, Amherst College’s Jakina Debnam Guzman, Paris School of Economics’ Marc Fleurbaey, Heffetz and Kimball. Many people consider family or friends in that equation no matter how the question is worded. When answering common scale questions — “On a scale of 0-10, how happy are you?,” for example — there was no uniform understanding of, say, a 6, or a move from a 4 to a 5.
And the time period under consideration? We all have our own ideas.
Even Economists Don’t Know What They’re Asking
“Collectively, the literature (research) makes all sorts of different assumptions about the same question, and, you know, they can’t all be right.” —Benjamin, in an interview with UCLA Anderson Review.
One of the biggest disconnects between the questions economists think they are asking in these surveys, and the questions we actually answer, involves the period of life under consideration. When asked, “How satisfied are you with your life these days,” do you think about your well-being right now (which economists call flow utility); in coming months or years (forward-looking utility); or over your whole life, including past, present and future expectations (lifetime utility)? (“Utility” is econ-speak for “well-being.”) It’s exceedingly important, for reasons we’ll get into later, that researchers know which nerdy distinction of time your answers reflect.
Yet Benjamin and his co-authors found it all but impossible for researchers to know which time perspective survey participants were using. Neither the standard questions, nor new ones they devised, consistently elicit responses for any particular time horizon, according to their results.
They found it likely that people change the way they interpret the time element around life events and aging. That means one person might respond from a here-and-now perspective one year and then respond, to the same question, as a lifetime assessment the next.
Worse yet, researchers throughout the field seem unclear about which time frame particular questions trigger. That “taken all together” question in a previous section? Benjamin’s team found it used as a proxy for flow utility in one study and forward-looking utility in another. In one case, they found it interpreted differently at different times within the same study.
Co-author Kimball sounds the alarm on such, shall we say, imprecise work in a blog post about the study earlier in 2022. “Papers using happiness, life satisfaction or, ‘Where do you stand on the ladder of life?’ data (self-reported well-being data), make strong assumptions about how that data relates to theoretical utility notions,” he writes at Confessions of a Supply-Side Liberal. “In this paper, my co-authors Dan Benjamin, Marc Fleurbaey, Jakina Debnam Guzman, Ori Heffetz and I bend over backwards to make that OK, but it just isn’t.”
Applying This Data Could Seriously Screw Up Policy
“While we agree that ‘measurement does not need to be perfect to be useful,’ we worry that fetishizing an imperfect measure could be damaging — as demonstrated by the obsession with GDP.” —Behavioural Public Policy study by Benjamin, Cooper, Heffetz and Kimball, 2019.
Wouldn’t a well-being index just measure stuff the government already knows? Shifts in prices, income and employment, for example, are already compiled quantitatively and regularly without so much drama over the formulas. Why do we need surveys to find out, say, a lot of people are struggling with rising prices and stagnant incomes?
Ideally, the surveys capture something we’ve grown immune to in statistics: How these economic changes make us feel. It could be that small income declines don’t depress us nearly as much as the government thinks, but a steady rise in our grocery bills worries us immensely.
If economists saw the depth of emotion we experience with these data changes, they could make policy adjustments at scales we’d appreciate. The changes on the well-being index would become omens throwing off calls for action, similar to the way a GDP decline over two consecutive quarters signals a recession: time for the government to throw some resources at averting dark times.
Of course, there’s no guarantee that the U.S. government would be any less frozen by contention over policies to address well-being than other issues of the day. But a respectable index would give the country, for the first time, the data it needs to at least know what would make us feel like our lives were improving.
Unfortunately, economists still understand very little about how we, the public, really feel. Today’s well-being measures don’t tell us nearly enough to predict how we would react to policies they might prompt.
Consider unemployment, which, as Benjamin explains, makes people miserable. When the jobless are asked, “How happy are you these days,” they tend to slide their markers way left to the “very unhappy” end of the scale. Working people generally assess themselves as much happier than the unemployed.
Now imagine you’re an economist looking at those results while advising the Fed on inflation tactics. Unemployment is statistically low but, for those experiencing it, it feels just devastating. So do you advise a soft touch on interest rate hikes in order to preserve jobs, even though it means higher prices for everyone? And maybe an increase in unemployment benefits too, even though it might require a tax increase or balloon the deficit?
And is unemployment unhappiness a permanent scar on well-being or a period of such extreme stress that it temporarily clouds a more accurate assessment of one’s lifetime satisfaction? The answer, Benjamin says, “makes a big difference in terms of what the size of the unemployment check should be.”
The problem, he continues, is that those survey answers told researchers very little about how painful unemployment is or whether people prioritize high levels of employment over low prices. The differences in answers between jobless and employed may just reflect different interpretations of the question.
The unemployed are hard-focused on the misery at hand, Benjamin and colleagues found when they asked them. They answer the happiness question to reflect how they feel at this moment, even though they recently would have scored themselves much higher. They tend to revert to previous levels of well-being shortly after starting a new job. “If the unemployed had focused on their entire life, their answer might be different,” Benjamin says.
Employed respondents tap less from the here and now. Some will answer at 60 or 70 on a 100-point happiness scale, even if they’re having a horribly difficult year, because they’re optimistic that this struggle will soon pass. Older people usually bleed even more of their entire lives into that answer.
Such rampant variations in interpreting survey questions can create misleading policy cues in multiple ways. Maybe you’re living through some grueling years of medical training. You will drag down collective well-being if your answer is based on current hardship but perhaps boost it if considering your future life as a highly paid surgeon.
Single people repeatedly score themselves lower on questions meant to assess life satisfaction than those with romantic partners, but happier on questions related to work. Does government need to get in the business of encouraging partnerships or making single life better? No. The work by Benjamin and his co-authors suggests that some people just care more about work.
Such layers and layers of apples-to-oranges comparisons, and the miscues they promote, would be embarrassments in traditional economic research. To use this data to direct policy? Benjamin’s team keeps publishing study after study after study that makes that look like a really bad idea.
Real Science Is Really Slow
“Accurate national measures of any indicator cannot be developed quickly — after all, it took decades to refine GDP into the national statistic it is today.” —Benjamin, Samantha Cunningham, Heffetz, Kimball and Szembrot in a 2015 essay at the World Economic Forum.
Benjamin and his collaborators contend that this entire field of research, including their own efforts to peg population well-being, is in its infancy. That elusive indicator, they believe, cannot be simplified into a few excellent survey questions. “We weren’t able to succeed at finding the question that captured everything,” he says.
The team continues to run experiments to find out what, exactly, we need for well-being. (That list of 136 possibilities they tested in the 2010s? It’s grown to more than 2,000.) They fiddle with endless variations on survey questions in search of some that cleanly measure those aspects. They constantly address a host of issues in aggregating the data to reflect our priorities, a necessity for both weighting the index and understanding which policy trade-offs we would prefer.
How do you measure happiness? It’s not easy or simple, it turns out. A wealthier country doesn’t actually mean happier people. For example, Finland has been No. 1 for five years running, according to the World Happiness Report (table below), but Luxembourg, with a higher GDP per capita ($135,683), doesn’t crack the top five. After 10 years of annual happiness rankings, we still don’t know exactly what the difference maker is. Benjamin, Cooper, Heffetz and Kimball argue that “instead, a measure that captures many dimensions of well-being and complements GDP is needed.” Refining which metrics are monitored and how those metrics are weighted could help us answer the happiness question and create more effective public policy.
U.S. Out-GDPs These Countries, But Trails Them in Contentment
Happiness rank and country | GDP per capita |
1. Finland | $53,983 |
2. Denmark | $67,803 |
3. Iceland | $63,384 |
5. The Netherlands | $58,061 |
15. Canada | $52,051 |
16. United States | $69,288 |
17. United Kingdom | $47,334 |
20. France | $43,519 |
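The figures in the table make the comparison easy to check. This short sketch uses only the numbers listed above and expresses each country’s GDP per capita as a percentage of the U.S. figure; the variable names are arbitrary.

```python
# GDP per capita in U.S. dollars, copied from the table above.
gdp_per_capita = {
    "Finland": 53_983,
    "Denmark": 67_803,
    "Iceland": 63_384,
    "The Netherlands": 58_061,
    "Canada": 52_051,
    "United States": 69_288,
    "United Kingdom": 47_334,
    "France": 43_519,
}

us_gdp = gdp_per_capita["United States"]

# Each country's GDP per capita as a percentage of the U.S. value,
# rounded to one decimal place.
share_of_us = {
    country: round(100 * value / us_gdp, 1)
    for country, value in gdp_per_capita.items()
}

for country, share in share_of_us.items():
    print(f"{country}: {share}% of U.S. GDP per capita")
```

Every other country in the table comes in below 100%, including the three that outrank the U.S. at the top of the happiness list.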
And they continue to publish papers and essays aimed at convincing the broader scientific community that this work is nowhere near ready for policy use. Right now, Benjamin says, they’re looking into differences in the ways people use scales. For example, those happy Dutch may think of a 7 on a 0-to-10 scale as merely content while Russians might consider a 7 blissful.
In the meantime, Finnish GDP per capita clocks in at about 78% of that in the U.S., and the country still swamps us in the happiness rankings. All those other Nordic countries that consistently hover at the top cite a lot of leisure time and reliable financial safety nets that perhaps we should emulate.
Benjamin wishes that governments would first take these reports as a prompt to fund more research into improving the data that goes into them.
How should the rest of us read them? Benjamin says: “I tell people to take the results with a grain of salt.”