Synthetic data set of human trafficking victims could allow big data work without privacy compromises

Devin Coldewey

Updated September 23, 2021 at 5:30 PM

To combat human trafficking effectively, one must first understand it, and these days that means data. Unfortunately, for obvious reasons there is no convenient index of trafficking victims, though this confidential information is in some ways abundant. Microsoft and the International Organization for Migration may have found a way forward with a new synthetic database that has all the important characteristics of the real trafficking data, but is completely artificial.

While each victim is unquestionably individual, basic high-level questions like which countries are increasingly the source or means of trafficking, which routes and methods are used, and where the victims end up are a matter of statistics. The evidence to identify trends and patterns, crucial to prevention, is locked up in thousands of these individual stories that most would prefer not to publicize.

"Administrative data on identified cases of human trafficking represent one of the main sources of data available but such information is highly sensitive," said IOM program coordinator Harry Cook in a news release describing the data set. "IOM has been delighted to work with Microsoft Research over the past two years to make progress on the critical challenge of sharing such data for analysis while protecting the safety and privacy of victims."

Historically, for things like crime databases and medical info, the strategy has been to redact liberally, but this method of anonymizing has been shown to be ineffective against any serious attempt to reconstruct the data. With numerous databases public and leaked and computing power on tap, the redacted information can be filled back in quite reliably.

The option taken by Microsoft Research is to use the original data as the basis for a synthetic data set that retains all the important statistical relationships of the source but none of the identifiable information. And it's not just turning "Jane Doe" into "Janet Doeman" and her hometown from Cleveland to Queens. Instead, groups of no fewer than 10 people with similar or overlapping data are merged to create a set of attributes that accurately represent them statistically but can't be used to identify them individually.
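To make the grouping idea concrete, here is a minimal sketch in Python. It is not Microsoft Research's actual method, which the article does not spell out; it only illustrates the general principle of emitting synthetic rows solely from attribute combinations shared by at least 10 real records. The function name `synthesize`, the field names and the sample values are all hypothetical.

```python
from collections import Counter
import random

K = 10  # minimum group size, following the article's "no fewer than 10"


def synthesize(records, k=K, seed=0):
    """Return synthetic rows drawn only from attribute combinations
    that at least k real records share (illustrative, not the real pipeline)."""
    # Count how many real records share each exact combination of attributes.
    counts = Counter(tuple(sorted(r.items())) for r in records)

    # Combinations seen fewer than k times are dropped entirely, so no
    # rare (and therefore potentially identifying) profile is emitted.
    common = {combo: n for combo, n in counts.items() if n >= k}
    if not common:
        return []

    # Sample new rows in proportion to how often each common combination
    # occurred; the output describes groups, not individual people.
    rng = random.Random(seed)
    combos = list(common)
    weights = [common[c] for c in combos]
    picks = rng.choices(combos, weights=weights, k=sum(weights))
    return [dict(combo) for combo in picks]


# Hypothetical input: 12 records share one profile, 3 share another.
real = (
    [{"origin": "Country X", "exploitation": "labor"}] * 12
    + [{"origin": "Country Z", "exploitation": "labor"}] * 3
)
synthetic = synthesize(real)
```

In this toy run the three-record profile never appears in the output, because it falls below the threshold. That is the trade-off the article goes on to describe: some granularity is lost in exchange for data that can be shared.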

Caption: Statistics relating to human trafficking around the world.

Image Credits: Microsoft Research / IOM

Naturally this doesn't have the granularity of the original data, but unlike the sensitive source, this data can actually be used. It's not necessarily for some task force to analyze and say "okay the next smuggling operation will be based out of..." but rather this data, based on firsthand evidence, can be pointed to as a factual record for addressing the problem at a policy and diplomacy level. Where before one may have had to say in a more general way that Country X or Government Z was neglectful or complicit in these matters, having hard data to back that up allows one to say "36 percent of sex trafficking victims pass through your jurisdiction."

Not that the data has to be used in strongarm tactics: simply understanding the global trade in human misery as a system and not just a series of disconnected events is valuable in and of itself. You can peruse the data and request to use it, and learn more about the process for creating it at the program's GitHub.
