Deputy Chief Technology Officer for Data Policy DJ Patil with his skateboard March 19, 2015. (Photo: Chris Usher for Yahoo News)
DJ Patil is still settling into his White House office. It’s by far the most prestigious place he’s ever worked, but it looks more like a bootstrapping startup than a precinct of Washington power. His desk is tidy, and on top of his overflowing bookshelf sits a watchful Pokémon Pikachu doll, adding a touch of whimsy to the otherwise drab, government-issue office. Propped up against the wall is his bamboo longboard (a hybrid trickboard with a school of fish painted on the underbelly). The skateboard is more than just an emblem, a reminder of his days as a Silicon Valley technologist when he sometimes zipped across the campuses of LinkedIn and eBay. It’s how he gets to work on nice days, hurtling down the streets of downtown D.C., weaving around potholes, tourists and protesters as he feels the breeze in his dark brown hair, before slipping behind the security gates of 1600 Pennsylvania Avenue.
In February, President Barack Obama added Patil, a cheerful Californian who still laments the lack of authentic Mexican food in D.C., to the growing roster of Silicon Valley-bred digerati who have been recruited to shore up the White House’s tech capabilities. His official job title is chief data scientist and deputy chief technology officer for data policy. Unofficially, he is responsible for nothing short of harnessing the extraordinary power of the federal government’s hundreds of thousands of data sets.
How much data is that, exactly? The scope of the government’s collection is so staggering that no one really knows. Thanks to the explosive use of affordable mobile electronics in the 21st century, data proliferates in such a frenzy that researchers estimate over 4 zettabytes of it exists in the world. To put that in context, if 1 byte equaled a character of text, the 1,250-page tome War and Peace would fit into a zettabyte about 323 trillion times. The U.S. is the largest collector in the Northern Hemisphere. And Patil’s job is to be both the wielder and the protector of that data, to unlock its vast potential for progress while guarding against risk of government abuse.
Today, government data powers some of the most basic parts of our lives. The weather app you look at each morning before getting dressed filters statistics from the National Oceanic and Atmospheric Administration into the form of a shining sun or angry cloud. On your way to work, your local transportation office feeds information onto a screen that tells you when your train will arrive, or how much longer it’ll take to drive to your destination. The Bureau of Transportation Statistics even keeps track of how horribly delayed your flight is.
How Patil commutes to work each morning.
But Patil, a guy who’s spent his life molding data into powerful, cutting-edge products for companies like eBay and LinkedIn, wants to do much more than improve your morning commute. From day one on the job, he’s spearheaded a far-reaching project called precision medicine: a form of care that uses your DNA to dictate a personalized approach to treatment. Patil imagines that one day, mapping your genome may be as routine as getting a cavity filled. Doctors will then be able to tailor care based on your own genetic makeup. Cross-reference that with statistics the government provides about the environment you grew up in — the air quality, the water quality, the likeliness of disease — and, as Patil says, his brown eyes widening, “literally, data can save lives.”
Though Patil is focusing on health care for now, nearly every government entity could be revolutionized with the help of data. After collaborating with companies like Waze during Hurricane Sandy, the Federal Emergency Management Agency has invested in data-driven infrastructure that swiftly identifies areas of need during a disaster. Researchers at the University of California, Berkeley, are currently investigating data relating to mandatory sentencing in the criminal system, calculations that could save prisons money and reduce recidivism rates. Most recently, a study that found a correlation between climate change and violent conflict made its way into Obama’s State of the Union speech.
On the flip side, Patil must also ensure that the enormous amounts of data collected under the government’s watch are both secure and free from the exploitive tendencies of private companies. Recently, for instance, the administration has focused on educational apps tailored to individual students. By tracking students’ online habits, educators can zero in on their needs. But once the data is collected, under current law it can also be sold for marketing purposes. Patil is a fierce opponent of such practices. A newcomer to Washington politics, he’ll have to wade into this and other contentious policy debates.
By turning to Patil, Obama is placing a bet on an iconoclastic risk taker, a classic disrupter in the mold of Silicon Valley techies. His trailblazing path to the pinnacle of data science involved skirting the rules and tilting against traditional institutions. He was suspended from school and roundly rejected by admissions offices, and he befuddled HR directors who couldn’t figure out where he would fit in. But through a combination of blinding intelligence, a laserlike focus on the real-world applications of science and an uncanny ability to see alternative paths, he willed his way to the top of this cutting-edge profession. Now the question is: Can he succeed inside the U.S. government, the mother of all rulebound institutions, a risk-averse bureaucracy and stifler of innovation?
DJ Patil chats with colleagues at the White House. (Photo: Chris Usher for Yahoo News)
“The biggest challenge for DJ is going to be human and organizational, rather than technical,” said Steven Weber, a professor at Berkeley’s School of Information and Political Science who has interacted with Patil in the field. “I don’t think he’s going to be able to walk into that office and say, ‘Look at all this great data we have, we can build this and we can build that, we can build this and we can build that.’ But I know that’s going to go through his head.”
Silicon Valley High
Patil was born in 1974 as Dhanurjay Patil, and grew up in the unharvested suburbs of Cupertino back when the Bay Area was better known for its fruit orchards than its office parks. His father, Suhas, immigrated from Jamshedpur, India, to earn his PhD in electrical engineering at MIT, studying under greats like Harold Edgerton (best known for his mind-bending strobe flash photography of milk drops). After graduating, he set out for Silicon Valley to build his own successful semiconductor business, Cirrus Logic.
Growing up, DJ was bored by rote science and math lessons — the step-by-step instructions he was taught in school. But at home he became steeped in the ethos of garage-based businessmen like Dave Packard and Bill Hewlett. Patil and his father cleared out an extra room in his childhood home to re-create Edgerton’s experiments, tinkering away for hours on weekends. Captivated by the unpredictability of physical movement, he devoured James Gleick’s book “Chaos: Making a New Science,” a seminal study of chaos theory.
“I started really reading and I thought: ‘This is fascinating,’” he told me last month in an empty conference room in the Eisenhower Executive Office Building, where we were surrounded by framed pictures of iconic moments in science and tech history. “But I didn’t have the mathematical skills to really understand it.”
It was a pattern in Patil’s life: He was too intellectually restless to settle down and focus in class. He loved computers and quickly found their subversive power. By middle school he’d managed to hack into his English course’s grading system. Within the first six months of attending Monta Vista High School, he was kicked out of his algebra class for speaking out of turn (he was forced to repeat the class that summer). Later on, he was suspended from school for setting off a stink bomb in class. Despite this rebellious streak, the very assistant principal who suspended him for that incident saw his talent.
“I don’t think he was one who was disrespectful of the system or was actively defiant against rules for the sake of being defiant against the rules,” Rich Knapp, a now retired school administrator, told me in a recent interview. “If he thought he had a better way of doing things, he wasn’t afraid to step out and say, ‘I think there’s a better way to do this’ and take a risk and do it.”
Nevertheless, by the time Patil graduated from high school in the spring of 1992, his high jinks and poor SAT scores had put him at the bottom of his class. He received a pile of thin envelopes — rejection letters from every college he wanted to attend. He cried, and now recalls it as a “soul-crushing” experience.
At the encouragement of his father, however, he pushed on. First he appealed his rejection from the University of California. At the same time, he followed his girlfriend to De Anza Community College, enrolling in the same classes she did. On the first day of their calculus course, he listened intently to the professor’s lecture but understood nothing.
DJ Patil in high school. (Photo: Courtesy DJ Patil)
“It was this kind of moment when you realize: ‘Oh, my gosh, I am that stupid,’” he said. “I had a choice: Either get with the program or you’re not going to be able to understand these concepts that you’ve been passionate about.”
Deeply embarrassed, he went to the Cupertino library, checked out every single high school math book he could get his hands on and — in a scene straight from Isaac Newton’s biography — taught himself math. After years of feeling clueless in the classroom, he finally found that the core mathematical concepts that had always fascinated him seemed to stick. It was a humbling moment for Patil, but also, as he recalls it, “really fun.” Meanwhile, much to his surprise, his appeal to get into college worked, and in 1993 he transferred to the University of California, San Diego, majoring in mathematics.
Patil fit in well at UCSD, a college located in the sunny, affluent beach town of La Jolla. His dorm — a newly built canyonside housing area nicknamed “Snoopy Camp” — was a quick 15-minute walk from Black’s Beach. But he was still disengaged in class. By 1996, he’d completed his degree and found himself in a place similar to the one he’d been in four years earlier — without the grades to get into any of the programs he’d been eyeing, particularly a mathematics and physics doctorate track at the University of Maryland. He again turned to his father for advice.
“In his infinite wisdom he suggested one thing: road trip,” Patil recalled in a commencement speech at UC Berkeley’s School of Information. “A special one that might just ‘happen’ to take us near the campus.”
The two set out east with an aim to meet James Yorke, regarded by many as the father of chaos theory. Patil had cold-emailed Yorke before the trip, to a tepid response. But Yorke agreed to meet them for dinner at a nearby Chinese restaurant. By the time the check arrived, Patil had impressed Yorke enough to make it into the program.
Yorke saw something different in Patil, namely his drive to take his research further than the math problems on a page. “He was unusually focused on the question of ‘Why are we doing this and where do we want to go?’” Yorke, now 73, recalled during a recent interview. “One can get lost in mathematics by creating results that really don’t have impact. So he wants to know, what’s the impact? Why are we doing this? What’s the worst thing that could happen?”
To pay his way through school, Patil took a gig as a lecturer while juggling his research on nonlinear weather patterns at night. Each day around 5 p.m. he’d go to bed and sleep until midnight. Then he’d wake up and head to the campus computer lab, where he’d parse pages and pages of public data sets from the National Oceanic and Atmospheric Administration. It was during these early hours of the morning that he came to appreciate open access to the government’s vast databases, later admitting that they “helped get me through school!”
DJ Patil with his father after Patil’s commencement address at UC Santa Cruz’s Jack Baskin School of Engineering. (Photo: Courtesy DJ Patil)
By the time he’d completed his doctorate, Patil had made a significant improvement on mathematical models used for numerical weather forecasting, finding a more efficient way to predict chaotic temporal patterns. Finally, he saw firsthand how his work in the field could effect actual change.
He joined the Defense Department in 2004, at the height of the U.S.’s involvement with Iraq. There, he and two other research fellows took part in something called the Threat Anticipation Project, which researched how to combine computer science and social science in order to anticipate emerging terrorist threats. The experience was yet another instance where Patil saw the limitless power of data in almost any situation.
After the fellowship and a short stint as a professor at the University of Maryland, Patil headed home to Silicon Valley in 2006. By then, the cherry orchards he grew up with had been ripped down and replaced with a hybrid shopping center/apartment complex, complete with a Chipotle and a Borders bookstore. Over the previous 20 years, the sleepy suburban sprawl of his childhood hometown had expanded to accommodate the massive number of tech workers who were flocking west to join companies like Facebook, Google and Skype. Patil was one of them. But as he interviewed at startup after startup, it became clear that he was not as desirable a candidate as he presumed.
“I’d go to eBay and all these other companies, and I’d be like: ‘Look, these are the problems I’m interested in and I think I can help,’” he said. But he’d always receive the same response: “We don’t know what to do with you.”
Through a family connection he was able to land a job at eBay. The company created a role for him as principal architect. Eventually Patil was placed at the crux of product development for every company that fell under eBay’s purview, including Skype, PayPal and StubHub. There he used his expertise in data to strategically pinpoint problem areas in the company and developed products to improve its core operations. In just a two-year stint there, he filed eight patents relating to customer service support, machine learning, human-computer interaction, visualization, behavior insights and social network analysis.
Around the time that Patil joined Skype, LinkedIn had its own data problems. The company recognized that, though it had grown to a network of about 8 million accounts, people didn’t seem to be connecting with one another as automatically as hoped. One LinkedIn manager compared the situation to “arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink — and you probably leave early.” Employees like Jonathan Goldman were playing with user information to invent features like People You May Know — a built-in website feature that practically every social network uses today. But these were side projects, and sometimes openly contested within the company.
DJ Patil speaks at the Digital Life Design Conference on Jan. 20, 2013, in Munich, Germany. (Photo: Tobias Hase/DPA/Zumapress.com)
When then-CEO Reid Hoffman recruited Patil to join the company in 2008, all that changed. Patil assembled a team of sharp data scientists, focusing their efforts on missed opportunities within an individual’s browsing experience. Once they identified a weak point, they’d use data to offer a personalized fix.
“What we realized is data, when used responsibly, is a force multiplier,” Patil said.
Patil and his team approached their projects organically, bouncing around ideas, moving quickly to test them and just as quickly throwing out what flopped. The result was the creation of well-known tools like Who’s Viewed My Profile, Jobs You Might Be Interested In and visualizations of a person’s professional network called InMaps. In each new feature, Patil drove home the idea that the best sign of a good data product is no obvious evidence of the data itself.
“The user doesn’t want to see raw data, they want the data in a usable form, and that usable form should help them do something more creative, be more efficient, give them superpowers,” he said. “Something you could never conceive of before.”
To Patil, it was just plain logic that regular people didn’t want a bunch of statistical noise getting in the way of their online experience. But in a profession filled with statisticians and programmers, his rare eye for the bigger picture set him apart.
Coining a profession
Around the same time, Jeff Hammerbacher was developing similarly personalized data products for Facebook. And in 2008, he and Patil separately began using the term “data scientist” when hiring employees. In 2012, Patil co-authored a Harvard Business Review article with academic Thomas Davenport titled “Data Scientist: The Sexiest Job of the 21st Century.” The piece, intended as a way to recruit talent to LinkedIn, argued that most data-collecting entities could benefit from having a data scientist to make sense of it all.
No organization faced a larger challenge in this realm than the U.S. government. In 2012, a survey of government IT workers showed just how little data was put to use: one-third of all the information the government collected was “unstructured and therefore substantially less useful.” Not only were agencies failing to put to work a large chunk of the statistics they’d collected, but a huge portion of those statistics were of practically no value to begin with.
It was around this time that the Obama administration began scouring Silicon Valley for talent that could give the White House a digital boost. In 2012, prodigy health-tech entrepreneur Todd Park was poached to be Obama’s second chief technology officer, setting off a spree of similar hires. In the past two years alone the administration has hired Megan Smith, the VP of Google’s business development, to replace Park as CTO; Alexander Macgillivray, previously Twitter’s general counsel, was named her deputy. In March, one of Facebook’s lead engineers, David Recordon, joined the White House as its director of IT. And Jason Goldman, a Silicon Valley veteran who’s worked at Blogger, Twitter and Medium, was just named the White House’s first-ever chief digital officer.
When it came time to add a data scientist to the roster, Patil was the obvious choice.
“He’s an incredibly respected leader from the entire technology community, and being able to bring somebody in of his caliber is a signal to all of the technologists in the United States that [the White House is] really serious about bringing in some of the smartest people from outside of government,” Brian Forde, a former senior adviser for mobile and data innovation at the Office of Science and Technology Policy (OSTP), told Yahoo News.
Silicon Valley transplant
In February 2015, after years of ducking the traditional rules of educational and professional institutions, Patil got a job at the White House. Rather than simply put out a press release, the OSTP made digital waves with his hiring. The news first broke on Wired, and was followed up by a “memo” from Patil on Medium, a publishing platform darling among the media and Silicon Valley technologists. The post, which included a listicle made of SoundCloud embeds, declared that “the data age has arrived.” The next day, at the annual Strata + Hadoop World conference (a Comic-Con for data scientists), Patil gave a talk that was prefaced with a video of Obama welcoming him to the OSTP, stressing that “understanding and innovating with data has the potential to change almost anything for the better.”
Even before Patil joined, Obama made a concerted effort to make the White House more data-centric. In 2013, the president signed a long-overdue executive order that required machine-readable, open data to be the record-collecting standard in every government agency. Over the past few years, his administration has also released more than 135,000 data sets to the public, hosting events like hackathons, “Data Jams” and “Datapaloozas,” meetings in which statisticians, academics, industry leaders and bureaucrats gather to imagine new, helpful data applications and sometimes even build them. It has also developed the Presidential Innovation Fellows program, assigning young data scientists to start projects in agencies ranging from the Internal Revenue Service to the National Aeronautics and Space Administration. In 2014, the executive office of the president published its first “Big Data report,” outlining the major benefits and concerns of the swaths of information it had accumulated over the years.
As Forde sees it, Obama’s decision to make these data sets public and foster a creative community around them marked the beginnings of a data revival after a long period of inactivity.
“You had this really fertile farmland that was cemented over like a parking lot,” Forde said, referring to the government’s data sets. “We had to come in and — working with our agency partners — jackhammer all of that cement and just clear it out. Now we have the fertile farmland. You need someone like DJ Patil, who can harvest that crop. That’s what he’s doing right now.”
Patil’s first petri-dish initiative is to marry bioinformatics and health care in precision medicine. Doctors have been using this technique in a range of capacities for more than a century. Blood typing, for example, allows physicians to offer blood transfusions. Studying the unique genetic changes in cancer cells has led to the development of new drugs for degenerative diseases. By collecting data from these treatments, it’s possible to predict diseases in direct descendants of patients, or those who have similar genetic makeup.
The health care private sector already does this to some extent. Organizations like Kaiser Permanente maintain records about which treatments are most cost-effective and encourage doctors to diagnose accordingly. But Patil’s goals go beyond the realm of cost efficiency, exploring the ways that, say, mapping someone’s genome could prevent future health risks in a family. Or demonstrating how the pollution of a certain area may affect the health of a community.
Patil is collaborating with one of his personal heroes, Dr. Francis Collins, the man who mapped the human genome, to explore the ways that data could inform medical treatments. A little over a month on the job, Patil is still finding his way. But he’s brimming with ideas for the future of health care.
“The patient has to be at the center of this,” he said, pausing for a second to think and diving into all the questions running through his mind. “What does it mean to have the opportunity to have care that is really tailored to a specific population? What are the right privacy mechanics? Will you get a magic pill that’s customized for you? That’s a long way of saying that it’s not obvious what it means when you walk in and you get your genome sequenced.”
But even the “Big Data report,” authored by counselor to the president John Podesta and other White House staffers, admits that the health care industry’s current privacy infrastructure might not yet be secure enough to ensure bioinformatics are used responsibly. “The nation needs to adopt universal standards and an architecture that will facilitate controlled access to information across many different types of records,” the report says. “Modernizing the health care data privacy framework will require careful negotiation between the many parties involved in delivering health care and insurance to Americans.”
DJ Patil ascends the stairs at the White House. (Photo: Chris Usher for Yahoo News)
Yet another area that requires careful negotiation is the indiscriminate use of data collection by private companies in public settings. Take, for instance, educational apps that offer personalized learning experiences. By recording students’ online activities, these tools are able to adjust lesson plans based on the students’ strengths and weaknesses and evaluate their interests based on their Web habits. A recent report from The New York Times revealed that educational companies have bypassed school district privacy rules by marketing this technology directly to teachers, using student data at their own discretion. Though a federal law exists to protect students’ privacy in the classroom, critics argue that it doesn’t adequately address some of the advanced tracking techniques that companies are using.
In his January State of the Union address, Obama called on Congress to “protect our children’s information,” outlining a new bill to protect consumers and students under the age of 18. Though legislation supporting these efforts was supposed to be submitted on March 23, the bill remains stuck in limbo. Its latest draft would keep companies from using the data of students 18 and under for marketing purposes but allow it to be given away for “employment opportunities.” It would also allow companies to change privacy policies after a school commits to a contract for their services.
In his position, Patil won’t be a major driver of policy or politics, even in the Big Data realm. The best he can hope for is influencing the debate, at least on the margins, through the force of his arguments. Still, he is pleased with the White House’s stance on the issue.
“The thing that I’m really happy to see in the Podesta report is the fact that they called it out,” he said, referring to the “Big Data” report’s warning against the insidious collection techniques used by educational companies. “Let’s make sure the student data is not only utilized for the benefit of a student, but to make sure that student isn’t being marketed to in a way that I think we would all fundamentally say is not acceptable for the public good.”
In some ways it’s a topic close to his heart, considering that Patil himself could’ve benefited from targeted learning as a restless kid back at Monta Vista High School.
“If someone had just given me more flexibility to understand and look at the world differently and try new things, I probably would’ve gravitated to that,” he said.
This is how Patil sees the world: not as a dichotomy between data and intuition, but as a combination of both. Each project is an opportunity to readjust his perspective and try things a different way. It was that approach that got him into college, grad school, LinkedIn and ultimately the White House.