Column: Should we believe more in Big Data or in magic?

Allison Schrager

November 6, 2013 at 1:19 PM

A magician performs with a model who presents a creation by Indian designer Manish Arora as part of his Fall-Winter 2011/2012 women's ready-to-wear fashion collection during Paris Fashion Week March 3, 2011. REUTERS/Benoit Tessier

One year I spent a lot of time with professional magicians. A few showed me the secrets to their tricks. Whenever they did, the skill and dexterity required for sleight-of-hand struck me as far more impressive than the idea that magic had been performed. It reminded me of my own experience with statistics.

Data analysis is very similar to performing magic. With great skill you can pull things together and create the perception of surprising relationships. Often the magic is getting people to look at one thing when they should be seeing another. Similarly with statistics, it's often not the correlation that's interesting but what you did to find it.

This is important to keep in mind as the world embarks on the big data revolution. Big data refers to the very large data sets collected by governments, corporations and institutions, which are becoming more widely available. Using this data, firms and policymakers can figure out what programs work (such as which health treatments are effective and which people respond to government incentives) and what consumers want. The deluge of information is expected to increase efficiency and lower prices.

In a recent report, the McKinsey Global Institute encourages the increased availability of big data. It estimates that greater access to big data has the potential to create $3 trillion a year in value. It is generally true that more information is better, though big data comes at a cost in terms of privacy and data collection.

Yet what concerns me is the proper interpretation of big data. An earlier McKinsey report addresses this issue. It notes a dearth of trained statisticians, estimating that America is short 140,000 to 190,000 workers with the skills to handle data. But lack of talent is not just an impediment; it's a potential source of danger. People, even those who know better, often take correlations literally and make decisions based on them, without appreciating the magic behind the numbers.

Interpreting data is more of an art than a science. But unlike magicians, most researchers do not intentionally mislead people. Two big concerns when you run statistics are bias (over- or understating a relationship) and mistaking correlation for causation (concluding that X causes Y when the two merely tend to occur at the same time). You might get biased results by using either the wrong data or an inappropriate estimation technique. Minimizing bias requires making subjective judgments. If you ran numbers on a large data set without inspecting it, removing outliers and choosing the best model, you'd have much more bias than if you used some discretion.

The process is complicated by human nature. It is easy to be seduced by your own results when they validate your prior expectation of what you'll find. Take the financial crisis, in which bad statistics played a large role. Many quants priced exotic housing securities using models that were fed data from areas where house prices never fell. This made the price of risk look very attractive, but the products couldn't remain viable once house prices fell. In most cases the oversight was not intentional. It reflected the data available and the industry standard at the time. Without a significant drop in housing prices in recent memory, it was an easy mistake to make.

Often what's most interesting isn't the statistical relationship itself, but the data that was required to find it. Take the oft-cited statistic that American life expectancy is lower than that of many other OECD countries. That would suggest that American healthcare is not as successful as other systems. But when you look more deeply at the data, a different story emerges. Once you account for people who died from injury (like violence or car accidents) or obesity-related disease, American life expectancy is similar to Canada's.

America's lower life expectancy is alarming and should get the attention of policymakers. But to remedy it, we need to understand what's causing more car fatalities and obesity, and what factors, like poverty or arcane drug laws, lead to so much violence. American healthcare is certainly inefficient, but depending on how you parse the data, it's not clear that it's delivering worse results in terms of mortality compared to other OECD countries.

Such examples may seem straightforward, but in practice these problems are hard to spot, even for the most experienced and well-intentioned professionals. That's why in academia, statistical work undergoes a rigorous peer review process. In the same way that it takes a magician to discern an impressive or dirty trick, it takes a community with the same expertise to spot sources of bias. But expert peer review won't be realistic as data becomes more widely available and used commercially. It should be a serious concern that people without adequate experience might unknowingly produce biased results and make important decisions based on them.

But the use of big data is worth the risk. Statistical analysis is an imperfect process, but it's all we have to make sense of big data. With any new, transformative innovation there exists potential to take it too far or use it incorrectly. The same can be said for cars, airplanes or new financial products. The benefits of more innovation and information usually outweigh the costs.

We can minimize these risks with greater awareness of an innovation's limitations. McKinsey advocates more training and apprenticeships so we have more people who can run and manage data. This is certainly necessary, but not sufficient. We must also view any statistical result with the same humility and skepticism we experience when we see a magic trick.

(Allison Schrager is a Reuters columnist. The opinions expressed are her own.)
