In Brief
It's impossible to have truly anonymous genetic data. As genetic databases fill, they will become targets for hackers. Do we need new ways to protect them, or will we stop caring?

People all over the world are starting to learn the secrets encoded in their DNA. Some want to know what diseases they might someday develop, or which ones they might pass on to their children. Others want to discover more about their ancestry.

Whether it’s a medical institution or a direct-to-consumer company sequencing their genomes, people tend to skim the fine print about what happens to their data. They don’t think much about how it’s stored or protected. The organization, they might assume, will keep it safe in a database, isolating their identifying information from their genetic data.

In most cases, they’d be wrong.

Recently, experts have shown that it’s impossible to completely de-identify genetic information. And though genetic testing is still in its infancy, it will soon become more routine in medicine and beyond, amplifying patient fears about breaches and leaks. Those fears range from discrimination to the abstract uneasiness that comes with a loss of privacy. Decades in the future, there’s a chance that protections will improve — or that our notion of privacy will shift altogether (and maybe even disappear).

The Privacy Myth

To understand the role genes play in most diseases (so that they can, hopefully, develop new treatments), scientists need huge databases of genetic information. People can contribute their genetic material for general use, as part of clinical practice to treat diseases like cancer, or for private use, like when they send in cheek swabs to a direct-to-consumer company such as 23andMe.

Research institutions and the federal government regulate how organizations should safeguard patient information. The stewards of these databases decide how to best protect patient privacy while still making the data useful to scientists. Sometimes they don’t collect any patient information at all; other times they don’t sequence the entire genome so that it’s harder to identify the person who donated it.

More often, scientists need complete genomes or more context about the individual. So some databases separate patients’ identifying information from their genomic information.
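The separation described above can be sketched in a few lines. This is a minimal, hypothetical schema, not any institution’s actual design: a secret key turns patient IDs into pseudonyms, so the table researchers see carries genomes but no names.

```python
# Minimal sketch of identifier separation (hypothetical schema and key).
import hashlib
import hmac

# Assumption: in practice the steward keeps this key offline, apart from both stores.
SECRET_KEY = b"held-by-the-data-steward-only"

def pseudonym(patient_id: str) -> str:
    """Derive a stable pseudonymous ID; without the key, it can't be reversed."""
    return hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256).hexdigest()[:16]

identity_store = {}  # pseudonym -> name; stays with the steward
genome_store = {}    # pseudonym -> genome; shared with researchers

def ingest(patient_id: str, name: str, genome: str) -> None:
    pid = pseudonym(patient_id)
    identity_store[pid] = name
    genome_store[pid] = genome

ingest("patient-001", "A. Donor", "ACGTACGT")
# Researchers receive only genome_store; re-linking a genome to a name
# requires both the secret key and the identity table.
```

The catch, as the studies discussed next make clear, is that the genome itself can serve as an identifier, sidestepping this machinery entirely.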

Recent studies have shown that none of these techniques can guarantee protection. In one 2013 study, researchers re-identified about 50 individuals from de-identified genetic information in an online research database, using only a small set of markers found on the Y chromosome. Because surnames and Y chromosomes are both typically passed from father to son, those markers often correlate with last names. From there, simple online searches of public records revealed the participants’ identities.
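The attack that study describes can be sketched simply. Everything below is hypothetical: toy marker values, invented genealogy and public-record tables, and fictional names, just to show how little metadata is needed to close the loop.

```python
# Hypothetical sketch of surname-based re-identification. All data here is
# invented; real attacks query public genealogy services that index dozens
# of Y-chromosome markers across hundreds of thousands of families.

# Toy genealogy index: Y-STR haplotype -> surnames observed with it.
GENEALOGY_DB = {
    ("DYS19=14", "DYS390=24", "DYS391=11"): ["doe"],
    ("DYS19=15", "DYS390=23", "DYS391=10"): ["roe", "rowe"],
}

# Toy public records: (surname, birth year, state) -> full name.
PUBLIC_RECORDS = {
    ("doe", 1952, "UT"): "John Doe",
    ("roe", 1980, "TX"): "Jane Roe",
}

def reidentify(y_str_markers, birth_year, state):
    """Candidate identities for a 'de-identified' genome plus its metadata."""
    surnames = GENEALOGY_DB.get(tuple(y_str_markers), [])
    return [
        PUBLIC_RECORDS[(s, birth_year, state)]
        for s in surnames
        if (s, birth_year, state) in PUBLIC_RECORDS
    ]

# Age and state are exactly the metadata many research databases retain.
print(reidentify(["DYS19=14", "DYS390=24", "DYS391=11"], 1952, "UT"))
# prints ['John Doe']
```

The genome supplies the surname; ordinary public records do the rest.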

“It’s hard to strip [genomics] of any identifying information and keep any utility,” Yaniv Erlich, an assistant professor of computer science at Columbia University and the senior author of the 2013 study, tells Futurism.

Increasingly, scientists are realizing it’s impossible to guarantee that genetic information can be kept anonymous. “The concept of anonymity is completely disingenuous. It’s inaccurate. Your DNA is you. You can’t really be anonymous with your DNA,” Ifeoma Ajunwa, a legal scholar and sociologist who studies genetic discrimination, tells Futurism.

To be fair, you might think this doesn’t apply to you at the moment — today, only about six percent of Americans say they have had their genomes sequenced. But in the next few decades that will likely change. The cost to sequence a genome will drop, and more people than ever before will have access to their own genomic information.

To scientists, a large database of genomic information may look like a treasure trove. But to hackers, it’s a target.

Image credit: Staff Sgt. Joshua J. Garcia, U.S. Air Force

A Genetic Match

If a hacker were able to match someone’s genetic information with her personal information, the hacker would have plenty of uses for it. If the target happens to be in a position of power, genetic information could undermine their standing in politics or be used to question their royal lineage; in this respect, it’s a tool of espionage. It could also aid people working to create biological weapons, or be used for identity theft. Malicious actors could place an innocent person’s DNA at a crime scene to frame them (this already happened by accident in New Orleans in 2015). DNA could reveal ethnic identities, upend family histories, and raise doubts about religious or political affiliations. In parts of the world where ethnic persecution continues, that kind of information can put people’s lives in danger, Erlich suggests.

In the hands of employers, genetic information could lead to workplace discrimination. An applicant might not get a job because his or her genetic information reveals a propensity towards addiction, or the likely early onset of Alzheimer’s, or criminal behavior. Fortunately, there is already a law to prevent this: the Genetic Information Nondiscrimination Act of 2008 (GINA). In the years since the bill became law, plaintiffs have already brought (and, often, won) several lawsuits against employers that have violated GINA. Patient privacy laws, notably the Health Insurance Portability and Accountability Act (HIPAA), further protect genetic privacy.

But these laws as they stand don’t completely protect citizens from genetic discrimination (and definitely not hackers). “If a savvy life insurance company was able to find genetic information associated with an individual [it was covering], it may be able to hike up premiums,” Zubin Master, a bioethicist at the Mayo Clinic, tells Futurism. No laws currently protect against sequencing DNA that one finds, Ajunwa notes, though such laws do exist in the United Kingdom.

So theoretically, someone could take a coffee cup you used and have your DNA sequenced. That’s especially scary since artificial intelligence can now make 3D renderings of people’s faces based on their genomic information alone. In 2013, the Supreme Court ruled that police can swab suspects’ cheeks to add their genetic information to a database, even if a suspect hasn’t been convicted of a crime (and despite the fact that DNA evidence is not a perfect tool).

Original faces, left, and AI renderings, right. Image credit: Lippert et al, PNAS 2017

Scientists still have much to learn about the human genome, which means we’ll be able to do even more with people’s genetic information in the future. Genomes could point the way to new cures for diseases, allowing people to lead longer and healthier lives. But that growing power also creates new vulnerabilities, which could be exploited by nefarious actors or by companies seeking to amplify profits by suppressing citizens’ rights.

No one knows exactly what would happen if a criminal got a person’s genomic code, says Nilay Shah, the division chair of healthcare policy and research at Mayo Clinic. “But [the genome holds] so much personal information that if it were to get compromised… people would have a perpetual fear that it could be used in the future,” he tells Futurism. It’s easy to replace a credit card that becomes compromised, but not so with a person’s DNA.

There are plenty of concrete things to fear if the wrong person can access another’s genomic code. But some fear is simply the abstract kind that comes with a loss of privacy. “A lot of evidence out there says those fears are for the most part unfounded. And there’s not a lot of evidence to suggest that DNA is being used against individuals,” Brad Malin, a professor of biomedical informatics, biostatistics, and computer science at Vanderbilt University, tells Futurism. “I think there’s a latent knowledge problem with DNA: You don’t know what it will be useful for.”

The Meaning of Protection

Matching identifying information to genomic information, as Erlich did in his study, still requires a lot of time and resources. So it’s not so easy for nefarious actors to accomplish. But as databases expand with more patient genomes, and more hackers test the security of those databases, it might not be so difficult.

Medical institutions and companies have techniques to defend their databases against these attacks. Digital security experts can inspect requests to pull from the database as they come in, using AI to look for suspicious patterns. They can also protect the data itself cryptographically; 23andMe, for instance, encrypts all its genetic data, according to the company’s privacy policy.
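Screening incoming requests can start with simple rule-based anomaly detection. Production systems use trained models; this hypothetical sketch, with invented thresholds, just shows the idea: flag users who query too often or touch too many distinct genomes within a time window.

```python
# Hypothetical sketch of query screening for a genomic database.
# Real deployments use learned models; these two thresholds are invented.
from collections import Counter, defaultdict

def flag_suspicious(query_log, max_requests=100, max_distinct=50):
    """query_log: iterable of (user, record_id) pairs from one time window."""
    requests = Counter()
    touched = defaultdict(set)
    for user, record_id in query_log:
        requests[user] += 1
        touched[user].add(record_id)
    flagged = set()
    for user in requests:
        # Rule 1: an unusually high request rate.
        if requests[user] > max_requests:
            flagged.add(user)
        # Rule 2: touching many distinct genomes suggests bulk scraping.
        if len(touched[user]) > max_distinct:
            flagged.add(user)
    return flagged

# A scraper pulling 60 different records trips rule 2; a clinician
# re-checking one patient's record does not.
log = [("scraper", f"genome-{i}") for i in range(60)] + [("clinician", "genome-7")] * 5
print(flag_suspicious(log))
# prints {'scraper'}
```

The thresholds here stand in for what a trained model would learn from normal access patterns.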

“Privacy and security are our top priorities when it comes to customer information,” Kate Black, the privacy officer at 23andMe, wrote in an emailed statement to Futurism. “On the technical side, 23andMe employs software, hardware, and physical security measures to protect the computers where customer data is stored. We use robust authentication methods to access our systems. Personal information and genetic data are stored in physically separate computing environments.”

But all that protection comes at a cost: The data is much harder to access and use, even for the scientists using it for research. And in the end, all that protection might not impede the most intrepid or skilled hackers. “There’s a security paradigm that the more layers you introduce to protect something, the better off you’ll be,” Mat Wiepert, a manager of genomics systems at Mayo Clinic, tells Futurism. “But you shouldn’t be [over] confident because there’s probably a hacker out there who can break into what we built. Vigilance is key.”

The front line of the battle for genetic privacy. Image credit: Getty Images

There are alternatives to locking genetic data down ever more tightly. One is simply to be more open. That’s what’s happening in Iceland, where about 45 percent of the country’s citizens have donated their genomes to a central database.

Genomic information isn’t the only thing we need to protect, Malin points out; medical records might hold information that’s much more attractive to hackers. “Your DNA, barring a few disorders, is more probabilistic in nature [than many people believe],” Malin says. In his view, there’s no need to give patient genomes special treatment.

In the years since his study, Erlich believes more strongly in the value of transparency. “With genetic data, you can learn so much about yourself. We want to enable people to share this info and feel comfortable about it. That’s where the field is going,” Erlich says. But to get there, we have to build trust — between government regulators, scientific establishments, researchers, and participants. A genetic test is like eating out at a restaurant, Erlich says. You can chow down with confidence because you trust that the government will intervene if there’s wrongdoing, like if the food makes you sick.

Transparency with research participants is a good start. Master suggests that researchers should explain privacy risks as part of informed consent, the form every patient signs before participating in a scientific study. Shah notes that researchers are considering apps and digital portals to better engage patients; a patient might feel more control over her genome if she has easy access to that information.

It will likely still be a few years, or even decades, before it’s common practice to have your genome sequenced. Our understanding of privacy could change between now and then. A 2016 Pew survey shows that people under age 50 are generally more open to sharing data if it will benefit them (though Pew found this didn’t apply to medical information, about which younger people tended to be more conservative). Social media makes it easy to share increasingly personal data like our location and our lunch. And we’re more and more comfortable disseminating it, often without reading the privacy policies on the sites on which we share it.

“We are not living in isolated cubes in our digital lives. The notion of privacy changes. The notion of genetic privacy will change in a similar way,” Erlich says.

Disclaimer: the author of this piece was a recipient of the Mayo Clinic’s 2017 journalist residency for surgery, where she met a number of the sources quoted in this piece. The residency was paid for by the Mayo Clinic; however, neither the clinic nor any of its affiliates has editorial review privileges.