Think Critically About the Wisdom of Experts

Some lessons about assessing the claims of people whose opinions seem unassailable.

Expert analysis informs the decisions we make as leaders and managers — and in our everyday lives. We can’t see red blood cells, but we trust scientists who say we have them and doctors who order blood tests to count them. We suspect that cognitive biases affect our choices, not because we have done the analysis ourselves, but because we believe social scientists who conduct experimental research. Much of our knowledge is ultimately garnered from the testimony of teachers, mentors, colleagues, and authors who write for publications like this one.

But we also live in a world where, almost daily, some expert’s previous certainty is discredited by new analysis. Diets once thought to be foolproof are ridiculed; management practices once decried are suddenly praised. So how should we treat the next piece of advice we get from a scholar or a consultant?

Philosophers of science, who study this issue, generally recommend that we simply trust what we hear from well-credentialed people who seem competent and sincere. But I think we can do better. We should always think critically about what we hear or read.

In my experience, “fresh eyes” often find errors that have eluded expert minds. We owe it to ourselves to handle each item of expertise the way we would a piece of fruit we’re about to buy — gauging how wholesome and ripe it is. Here are my thoughts on how to do that.

Dare to Doubt

In the second most popular TED talk of all time,1 social psychologist Amy Cuddy tells us that holding certain physical postures boosts our power hormones and makes us more courageous; however, attempts to replicate that result have failed.2 European governments chose to adopt austerity policies in part because esteemed Harvard economists Carmen Reinhart and Kenneth Rogoff told them that high debt levels cause a sudden drop in economic growth.3 Then a graduate student, Thomas Herndon, discovered that their claim was influenced by an Excel spreadsheet error.4

Experts fool themselves all the time, especially when a problem is messy and the analysis is difficult. The replication crisis — whereby scientific findings are increasingly being revealed as tough to reproduce — is plaguing psychology, economics, and medical research.5

No one really knows the extent to which empirical findings can be trusted, but some people have tried to guess. Stanford professor John Ioannidis argues that most medical research results are false.6 Economists J. Bradford DeLong and Kevin Lang make a similar claim about the field of economics.7 In a Strategic Management Journal article, my coauthor Brent Goldfarb and I estimate, very roughly, that about 20% of the research findings in business management are based on little more than random noise.8 Would you trust the word of someone who gave you bad advice one time in five?

Lesson: Don’t hesitate to challenge expert analysis.

Distinguish Stories From Predictions

Most of what we read from scholars, scientists, and other experts consists of stories that emerge from analyzing patterns in data. Experts ask, “Which companies succeed?” or “Which people make good leaders?” and then weave narratives that describe the patterns: “Companies that ‘stick to their knitting’ succeed” or “Authentic people are better leaders.” These stories are conjectures, Sherlock Holmes-style.

All of us, including experts, fall in love with our guesses and the stories we tell about them. I once asked a world-famous business scientist if he had ever tested his theory by trying to predict future events. He said he didn’t need to, because his theory predicted the past so well. He had forgotten that a story explains and a theory predicts.

To make sure our theories are predictive, we must test them against new information. Big-data analysts have learned this the hard way, after seeing exciting discoveries debunked as products of chance. So, nowadays, the best analysts split their data in two, first developing the story or model on one half (the “training set”) and then evaluating it on the other half (the “validation set”). If they don’t get the same result for both halves, they conclude they don’t have a predictive result.9
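
Here is a minimal sketch of that split in Python. The simulated data, the variable names, and the use of scikit-learn are my own illustration, not drawn from any study cited in this article.

```python
# Hypothetical sketch of a training/validation split (data are simulated).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                 # five candidate predictors
y = 2.0 * X[:, 0] + rng.normal(size=1000)      # only the first one matters

# Develop the model on one half (the "training set") ...
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.5, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# ... then evaluate it on the half it has never seen (the "validation set").
print("R^2 on training data:  ", round(model.score(X_train, y_train), 3))
print("R^2 on validation data:", round(model.score(X_valid, y_valid), 3))
# If the two scores diverge sharply, the "discovery" may be noise.
```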

Lesson: When an expert links a cause to a supposed effect, ask whether it’s a story to make sense of the past or a theory to forecast the future.

Question Assumptions

Analyzing empirical evidence always requires assumptions, sometimes so many that the process has been described as a garden of forking paths.10 At each fork, the analyst must make an assumption that may influence the final outcome. One problematic and common type of assumption involves how to assign values to variables that cannot be measured directly.

When analysts cannot conduct randomized experiments and must instead rely on observational data, guessing is especially common. Such is the case for researchers who study Alzheimer’s disease, because its slow and delayed progression makes it difficult to implement interventions and study their effects. Instead, analysts must sift through patient histories in search of possible causes, and such sleuthing requires many assumptions about missing information. For example, people who play bridge are less likely to develop Alzheimer’s, but the interpretation of this relationship depends on your guess about the hidden attributes of those who play bridge.11 If players and nonplayers are otherwise alike, then bridge playing may indeed help prevent Alzheimer’s; if the two groups differ in other ways, those hidden attributes may be the real explanation.12
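
To see how a hidden attribute can masquerade as a cause, consider a toy simulation. Everything in it is invented (it uses none of the data from the studies cited above); one unobserved trait drives both the behavior and the outcome.

```python
# Toy simulation of a hidden confounder; all numbers are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# An unobserved trait that influences both whether someone takes up a
# hobby and their later health risk.
hidden_trait = rng.normal(size=n)

plays = (hidden_trait + rng.normal(size=n)) > 0.5   # "plays bridge"
risk = -0.5 * hidden_trait + rng.normal(size=n)     # hobby has NO effect on risk

print("Mean risk among players:   ", round(risk[plays].mean(), 3))
print("Mean risk among nonplayers:", round(risk[~plays].mean(), 3))
# Players show lower risk even though playing has no causal effect;
# the hidden trait, not the hobby, explains the gap.
```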

Of course, the need to make assumptions clouds the work of management researchers too. Many variables needed for robust analysis are hard to measure, so analysts often make guesses to fill in the gaps.

Nonexperts, as unbiased outsiders, sometimes are better at vetting the logic of such guesses than the experts who made them. Researchers want to believe that they are on the right path — a delusion I myself have suffered — and may convince themselves that some guess is reasonable when it is not. Nobel laureate Richard Feynman articulated this aphorism for empirical scientists: “The first principle is that you must not fool yourself, and you are the easiest person to fool.”13

Lesson: Unearth assumptions that experts have used to get from the raw data to a set of conclusions.

Seek Alternative Explanations

To link effects with their true causes, researchers must rigorously try to rule out rival hypotheses. But most scientists are no more creative than you or I, and they tend to fall in love with a particular explanation and not look hard enough for an alternative. For example, a group of scholars who study gender stereotypes reported that hurricanes kill more people when they have female names, rather than male ones, because the female names made them seem less dangerous.14 The idea evoked such deep gender stereotypes that many people accepted the explanation. Then a host of scholars showed that the data record did not support the claim.15

I know from my own experience that considering alternative explanations is hard work, and it is easy to become complacent. During a project in which I was estimating the determinants of demand for certain types of entertainment, I clung to my favorite predictors and stopped considering other causes. The result: a model that failed to accurately predict future demand.

To avoid this pitfall, don’t assume the analyst exhaustively considered rival explanations. Make a list of your own conjectures, and ask whether they were ruled out. This querying is easy if the analyst is a consultant or an employee, but you can also do it with published work. Research authors usually write a few paragraphs on alternative explanations; if they haven’t done so, or if their list seems incomplete, email them. If they have an answer, they’ll write back.

Lesson: Identify alternative explanations for a particular conclusion, and ask why each one is not a better answer.

Know the Limits of Inference

Trying to prove that a suspected cause is the true one is fraught with difficulty, as the philosopher David Hume made clear. As a result, researchers often use logical judo in conducting their analyses. Rather than directly seeking evidence to support the cause, they flip the analysis and measure the probability that the supporting evidence is just a mistake.

Consider the notion of “statistical significance,” which many people think measures confidence in a proposed cause for some observed effect. In fact, it gauges something closer to the opposite: roughly, how likely it is that a pattern at least as strong as the one observed would arise from chance alone. Thus, an estimate that is statistically significant isn’t necessarily true, and one that is “not significant” isn’t necessarily false. At most, significance is a flag that says, “Hey, there might be a real pattern here,” not a verdict either way.

Significance also gets confused with importance. With a large enough sample, almost any difference becomes statistically significant, but that does not mean the difference matters. For example, per mile traveled, the safety record of U.S. air travel is statistically significantly better than that of U.S. train travel. Should we then worry about traveling by rail? No: The difference is unimportant because both modes of travel have an extremely low death rate (0.07 deaths per billion miles for air, 0.43 deaths per billion miles for rail). U.S. motorcycle travel, in contrast, is both significantly and substantively more deadly (213 deaths per billion miles).16
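
A quick simulation makes the distinction concrete. The numbers below are invented for illustration (they are not the transportation figures above); the point is that a trivially small difference becomes “statistically significant” once the sample is large enough.

```python
# Significance vs. importance: a tiny, unimportant difference becomes
# statistically significant in a huge sample. Numbers are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 1_000_000
group_a = rng.normal(loc=100.0, scale=15, size=n)
group_b = rng.normal(loc=100.1, scale=15, size=n)    # tiny true difference

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"p-value: {p_value:.2e}")                      # very likely "significant"
print(f"difference in means: {group_b.mean() - group_a.mean():.2f}")
# The test flags a real pattern, but a 0.1-point gap against a
# 15-point spread rarely makes a material difference.
```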

Lesson: Always ask, “Does this finding make a material difference in the real world?”

Demand a Robustness Analysis

Even a well-crafted study provides just one estimate of many that are possible. When estimates are not consistent across an array of assumptions, a study’s findings are considered to be less than robust, a term of art in research statistics.

A classic example involves the analysis of the notion that gun possession deters crime. An early study suggested that crime rates fell in areas where laws allowed people to carry a concealed firearm.17 But later studies, using the same data and slightly different assumptions, yielded different conclusions.18 Each side accused the other of being political stooges. The National Research Council tried to adjudicate the debate, but even its members could not agree.19

Finally, a group of scholars showed that slightly different assumptions produced widely varying conclusions: Gun possession caused less crime, more crime, or the same amount of crime, depending on an array of factors. The researchers even used a fancy statistical method, Bayesian analysis, to evaluate whether a best answer existed given the assumptions in each of the studies. They concluded that, given the available data, we just can’t tell what effect concealed weapons have on crime.20 In short, the finding was not robust.
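
Here is what a basic robustness check can look like in practice: estimate the same relationship under several plausible specifications and see whether the key estimate moves. This sketch uses simulated data and generic variable names of my own invention; it is not a reanalysis of the concealed-carry studies.

```python
# Generic robustness sketch with simulated data (not the gun-law studies).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5_000
df = pd.DataFrame({
    "x": rng.normal(size=n),      # variable of interest
    "c1": rng.normal(size=n),     # optional control 1
    "c2": rng.normal(size=n),     # optional control 2
})
df["y"] = 0.3 * df["x"] + 0.5 * df["c1"] + rng.normal(size=n)

specifications = {
    "no controls": ["x"],
    "add c1":      ["x", "c1"],
    "add c1 & c2": ["x", "c1", "c2"],
}
for name, cols in specifications.items():
    result = sm.OLS(df["y"], sm.add_constant(df[cols])).fit()
    print(f"{name:12s} estimated effect of x = {result.params['x']:.3f}")
# If the estimate swings widely across specifications, treat the finding
# as fragile; if it barely moves, it is more likely to be robust.
```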

In contrast, the work of the late economist Steven Klepper on geographic industry clusters has stood up to repeated tests of robustness. Klepper and his colleagues showed, for example, that spinoff companies tend to locate near their parents, dropping like fruit from the founding tree, and often grow into strong organizations in their own right.21

Lesson: Confirm that findings persist across a variety of assumptions.

Avoid Overapplication

Even if study findings are robust in a particular sample (such as a specific population), they may not apply to other settings or groups. For instance, concealed-carry laws may have different effects in the U.S., Canada, and Bermuda. Educational tools that work in one culture may not work in another. Indeed, most studies provide information only about a given sample drawn from a particular population.

If you run a marketing test on, say, U.S. college freshmen, you may have good data on the appeal of a product for that group, but not necessarily for a broader demographic. Draw your conclusions about other groups only after they too have been studied.

Lesson: The population and sample matter.

Be Skeptical of Hearsay

We all want to understand our world, so we tend to see patterns where none exist — canals on Mars, faces on the moon, old men on mountainsides. Mark Twain famously lampooned this tendency in Life on the Mississippi: “There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.” Today’s experts on empirical analysis agree with him: We humans want to believe we know things and to be perceived as knowledgeable, which often leads us to make stronger claims than we should.

For example, I often hear experts say they “know” something on the basis of evidence — when they actually don’t know and, given the limitations of statistical analysis, could not possibly know. They may have good reason to suspect something, but they take the inference too far.

Worse still, other people repeat the inflated claims because they came from an expert. Then with each retelling, the exaggeration spreads. I’ve seen this happen with my own work — to the point where I couldn’t recognize my original thoughts. We all need to be more careful. A simple solution is to banish the word know — and use suspect or suggest instead.

Lesson: Avoid the language of certainty.

When Richard Feynman said, “Science is the belief in the ignorance of experts,”22 he was not disparaging scientists but reminding us that we all can help in advancing knowledge. As we use data to learn about the world and make the best possible business and management decisions, we all engage in scientific inquiry. As we consume others’ learning, we can be useful critics. In whatever role we happen to occupy, we should always question our inferences, think critically about the evidence and the arguments we hear, and admit our own fallibility when we proffer our own conclusions.

At least that is my advice to you, subject to your evaluation and careful critical analysis. If you engage seriously in that effort, you will realize I have told you a story, assumed many things, left alternatives unconsidered, failed to show that my analysis is robust, and made my own unsupported claims.

I hope the few ideas I have shared, based on my experience, will be useful to you. But you must decide that for yourself.

References

1. A. Cuddy, “Your Body Language May Shape Who You Are,” presentation at TEDGlobal 2012: Radical Openness, Edinburgh, United Kingdom, June 25-29, 2012.

2. M.W. Berger, “Power Poses Don’t Help and Could Potentially Backfire, Penn Study Shows,” Penn Today, Nov. 23, 2016.

3. C.M. Reinhart and K.S. Rogoff, “Growth in a Time of Debt,” American Economic Review 100, no. 2 (May 2010): 573-578.

4. K. Roose, “Meet the 28-Year-Old Grad Student Who Just Shook the Global Austerity Movement,” Daily Intelligencer, April 18, 2013.

5. S. Vasishth, “The Replication Crisis in Science,” The Wire, Dec. 29, 2017.

6. J.P.A. Ioannidis, “Why Most Published Research Findings Are False,” PLoS Medicine 2, no. 8 (Aug. 30, 2005).

7. J.B. De Long and K. Lang, “Are All Economic Hypotheses False?” Journal of Political Economy 100, no. 6 (December 1992): 1,257-1,272.

8. B. Goldfarb and A.A. King, “Scientific Apophenia in Strategic Management Research: Significance Tests and Mistaken Inference,” Strategic Management Journal 37, no. 1 (January 2016): 167-176.

9. J. Han, M. Kamber, and J. Pei, “Mining Frequent Patterns, Associations, and Correlations,” chap. 5 in “Data Mining: Concepts and Techniques,” 2nd ed. (San Francisco: Morgan Kaufmann, 2006): 227-283.

10. A. Gelman and E. Loken, “The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No ‘Fishing Expedition’ or ‘P-Hacking’ and the Research Hypothesis Was Posited Ahead of Time,” unpublished ms, Nov. 14, 2013.

11. D.F. Hultsch, C. Hertzog, B.J. Small, and R.A. Dixon, “Use It or Lose It: Engaged Lifestyle as a Buffer of Cognitive Decline in Aging?” Psychology and Aging 14, no. 2 (June 1999): 245-263.

12. J. Weuve, C. Proust-Lima, M.C. Power, A.L. Gross, S.M. Hofer, R. Thiébaut, G. Chêne, M.M. Glymour, C. Dufoil, and MELODEM Initiative, “Guidelines for Reporting Methodological Challenges and Evaluating Potential Bias in Dementia Research,” Alzheimer’s & Dementia 11, no. 9 (September 2015): 1,098-1,109.

13. R.P. Feynman, “Cargo Cult Science,” Engineering and Science (June 1974): 10-13.

14. K. Jung, S. Shavitt, M. Viswanathan, and J.M. Hilbe, “Female Hurricanes Are Deadlier Than Male Hurricanes,” Proceedings of the National Academy of Sciences 111, no. 24 (June 17, 2014): 8,782-8,787.

15. D. Malter, “Female Hurricanes Are Not Deadlier Than Male Hurricanes,” Proceedings of the National Academy of Sciences 111, no. 34 (Aug. 24, 2014): E3,496; and L.A. Bakkensen and W. Larson, “Population Matters When Modeling Hurricane Fatalities,” Proceedings of the National Academy of Sciences 111, no. 50 (Dec. 16, 2014): E5,331-5,332.

16. I. Savage, “Comparing the Fatality Risks in United States Transportation Across Modes and Over Time,” Research in Transportation Economics 43, no. 1 (July 1, 2013): 9-22.

17. J.R. Lott and D.B. Mustard, “Crime, Deterrence, and Right-to-Carry Concealed Handguns,” Journal of Legal Studies 26, no. 1 (January 1997): 1-68.

18. D.A. Black and D.S. Nagin, “Do Right-to-Carry Laws Deter Violent Crime?” Journal of Legal Studies 27, no. 1 (January 1998): 209-219; and I. Ayres and J.J. Donohue III, “Shooting Down the More Guns, Less Crime Hypothesis,” working paper 9,336, National Bureau of Economic Research, November 2002.

19. National Research Council, “Firearms and Violence: A Critical Review” (Washington, D.C.: National Academies Press, 2005).

20. S.N. Durlauf, S. Navarro, and D.A. Rivers, “Model Uncertainty and the Effect of Shall-Issue Right-to-Carry Laws on Crime,” European Economic Review 81 (January 2016): 32-67.

21. R. Golman and S. Klepper, “Spinoffs and Clustering,” RAND Journal of Economics 47, no. 2 (summer 2016): 341-365.

22. R.P. Feynman, “The Pleasure of Finding Things Out: The Best Short Works of Richard P. Feynman” (New York: Basic Books, 2005).
