Twins get some 'mystifying' results when they put 5 DNA ancestry kits to the test
Geneticist at a popular ancestry company admits it's 'kind of a science and an art'
One set of identical twins, two different ancestry profiles.
At least that's the suggestion from one of the world's largest ancestry DNA testing companies.
Last spring, Marketplace host Charlsie Agro and her twin sister, Carly, bought home kits from AncestryDNA, MyHeritage, 23andMe, FamilyTreeDNA and Living DNA, and mailed samples of their DNA to each company for analysis.
Despite having virtually identical DNA, the twins did not receive matching results from any of the companies.
In most cases, the results from the same company traced each sister's ancestry to the same parts of the world — albeit by varying percentages.
But the results from California-based 23andMe seemed to suggest each twin had unique twists in their ancestry composition.
According to 23andMe's findings, Charlsie has nearly 10 per cent less "broadly European" ancestry than Carly. She also has French and German ancestry (2.6 per cent) that her sister doesn't share.
The identical twins also apparently have different degrees of Eastern European heritage — 28 per cent for Charlsie compared to 24.7 per cent for Carly. And while Carly's Eastern European ancestry was linked to Poland, the country was listed as "not detected" in Charlsie's results.
"The fact that they present different results for you and your sister, I find very mystifying," said Dr. Mark Gerstein, a computational biologist at Yale University.
Twins' DNA 'shockingly similar'
Marketplace sent the results from all five companies to Gerstein's team for analysis.
He says any results the Agro twins received from the same DNA testing company should have been identical.
And there's a simple reason for that: The raw data collected from both sisters' DNA is almost exactly the same.
"It's shockingly similar," he said.
The team at Yale was able to download and analyze the raw data set that each company used to perform its calculations.
An entire DNA sample is made up of about three billion parts, but companies that provide ancestry tests look at about 700,000 of those to spot genetic differences.
According to the raw data from 23andMe, 99.6 per cent of those parts were the same, which is why Gerstein and his team were so confused by the results. They concluded the raw data used by the other four companies was also statistically identical.
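To make the 99.6 per cent figure concrete, here is a minimal Python sketch of how the share of matching markers between two raw data files could be computed. It is an illustration only, not the Yale team's method: the `concordance` function, the `--` no-call code and the ten toy genotypes are assumptions made for this example. If the figure applies to all of the roughly 700,000 markers, about 2,800 of them would differ.

```python
def concordance(calls_a, calls_b):
    """Return the fraction of markers with identical genotype calls.

    Markers where either sample has a no-call ("--", a hypothetical code)
    are skipped rather than counted as mismatches.
    """
    compared = 0
    matched = 0
    for a, b in zip(calls_a, calls_b):
        if a == "--" or b == "--":
            continue
        compared += 1
        # Sort the alleles so "AG" and "GA" count as the same genotype.
        if sorted(a) == sorted(b):
            matched += 1
    return matched / compared if compared else 0.0


# Toy genotypes for illustration only; these are not the twins' data.
twin_a = ["AA", "AG", "GG", "CT", "--", "TT", "CC", "AG", "GG", "TT"]
twin_b = ["AA", "GA", "GG", "CT", "CC", "TT", "CC", "AA", "GG", "TT"]
print(f"Toy concordance: {concordance(twin_a, twin_b):.1%}")

# Back-of-envelope arithmetic using the figures quoted in the article.
markers_compared = 700_000   # approximate number of markers examined
share_identical = 0.996      # share reported as the same in 23andMe's raw data
differing = round(markers_compared * (1 - share_identical))
print(f"Markers that would differ: {differing}")
```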
Still, none of the five companies provided the same ancestry breakdown for the twins.
"We think the numbers should be spot on the same," Gerstein said.
While he can't say for certain what accounts for the difference, Gerstein suspects it has to do with the algorithms each company uses to crunch the DNA data.
"The story has to be the calculation. The way these calculations are run are different."
Asked why the twins didn't get the same results when their DNA is so similar, 23andMe told Marketplace in an email that even minor variations in the raw data can lead its algorithm to assign slightly different ancestry estimates.
The company said it approaches the development of its tools and reports with scientific rigour, but admits its results are "statistical estimates."
Differences across all 5 companies
The Agro sisters had been told by family that their ancestors came from Sicily, Poland and Ukraine.
However, the results each sister received from the ancestry companies revealed some surprising — and, in some cases, conflicting — family history.
AncestryDNA found the twins have predominantly Eastern European ancestry (38 per cent for Carly and 39 per cent for Charlsie).
But the results from MyHeritage trace the majority of their ancestry to the Balkans (60.6 per cent for Carly and 60.7 per cent for Charlsie).
One of the more surprising findings was in Living DNA's results, which pointed to a small percentage of ancestry from England for Carly, but Scotland and Ireland for Charlsie.
Another twist came courtesy of FamilyTreeDNA, which assigned 13-14 per cent of the twins' ancestry to the Middle East — significantly more than the other four companies, two of which found no trace at all.
Dr. Paul Maier, population geneticist at FamilyTreeDNA, acknowledges that identifying genetic distinctions in people from different places is a challenge.
"Finding the boundaries is itself kind of a frontiering science, so I would say that makes it kind of a science and an art," Maier said in a phone interview.
How it works
In order to determine someone's ancestry, companies like 23andMe compare a DNA sample to what is commonly referred to as a reference panel. A reference panel is made up of a select number of DNA samples, from previous customers who have taken the test and/or from publicly available DNA databases.
Dr. Simon Gravel, a population geneticist with McGill University who is also part of the 1000 Genomes Project, says ancestry companies will take 700,000 or so of your DNA segments and use an algorithm to compare your segments to those in their reference panel.
"They're going to match it to different parts of the world," he said. "In the end, there's going to be some overall of these [reference panel] contributions where your DNA matched better, and that's going to be their estimate of how much ancestry you have."
Different companies use different panels, so they're each likely to provide the same customer with different ancestry results.
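The matching Gravel describes can be pictured with a deliberately simplified Python sketch. It is not any company's actual algorithm: the three regions, the allele frequencies in `REFERENCE_PANEL` and the nearest-match rule are all invented for illustration. Each marker is assigned to the region whose reference frequency fits it best, and the tallies become the ancestry percentages.

```python
from collections import Counter

# Hypothetical reference panel: for each region, the frequency of allele "A"
# at four markers. Real panels cover hundreds of thousands of markers.
REFERENCE_PANEL = {
    "Eastern Europe": [0.9, 0.2, 0.7, 0.1],
    "Balkans":        [0.6, 0.5, 0.3, 0.4],
    "Italy":          [0.1, 0.8, 0.2, 0.9],
}


def best_region(marker_index, a_count):
    """Pick the region whose expected count of allele "A" (2 x frequency)
    is closest to the count observed in the sample (0, 1 or 2)."""
    return min(
        REFERENCE_PANEL,
        key=lambda region: abs(2 * REFERENCE_PANEL[region][marker_index] - a_count),
    )


def estimate_ancestry(a_counts):
    """Assign every marker to its best-matching region and return the
    share of markers assigned to each region."""
    tally = Counter(best_region(i, count) for i, count in enumerate(a_counts))
    total = sum(tally.values())
    return {region: tally[region] / total for region in REFERENCE_PANEL}


# Two toy samples that differ at only the last marker.
twin_a = [2, 1, 0, 2]
twin_b = [2, 1, 0, 1]
print(estimate_ancestry(twin_a))
print(estimate_ancestry(twin_b))
```

In this toy set-up the two samples differ at a single marker, yet their breakdowns come out differently, which is the kind of sensitivity 23andMe described in its email.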
In a statement to Marketplace, AncestryDNA acknowledged that the size of the reference panel is key. The company said it is "always working to improve its science" and that its "new, larger reference panel will give customers more precise results."
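One statistical reason panel size can matter, among others, is sampling error: the fewer people in a panel, the shakier its estimate of how common a genetic variant is in a region. The sketch below uses the textbook binomial standard error for an allele frequency. The frequency of 0.3 and the panel sizes are hypothetical and do not describe any company's panel.

```python
import math


def allele_freq_standard_error(p, n_people):
    """Binomial standard error of an allele frequency estimated from
    n_people samples, each contributing two copies of the marker."""
    return math.sqrt(p * (1 - p) / (2 * n_people))


# Hypothetical panel sizes, chosen only to show the trend.
for n_people in (100, 1_000, 10_000):
    error = allele_freq_standard_error(0.3, n_people)
    print(f"{n_people:>6} people: standard error about {error:.4f}")
```

Each tenfold increase in panel size shrinks the error by a factor of about three, since it scales with the square root of the number of samples. That captures only precision; it says nothing about how representative the panel is.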
Why so different?
There are a variety of factors that can affect the accuracy of results from an ancestry company, Gravel says, but of particular importance is the size and quality of its reference panel. The larger and more representative it is, the more accurate the results, he says.
"If you have fewer people that you can compare to, then you make more shortcuts," he said.
"You also run more of a risk of having missed diversity that you might not know existed in one particular region."
Another reason for discrepancies in the results from different companies is the arbitrary way each company defines the world's regions, Gravel says.
"They kind of need to take a pencil more or less and say, 'That's a region.' And different companies draw different circles."
Gravel also says the tests tend to be more accurate for people with European ancestry, as more people with that particular background have been tested.
He cautions people not to interpret their test results as definitive. He says a testing company can use DNA ancestry kits to trace a person's ancestry to a particular continent with statistical accuracy, but anything more specific than that, like pinpointing a country or town, is less reliable.
Lack of oversight
The biggest DNA ancestry companies have tested millions of people. MyHeritage, for example, says it expects sales of well over $100 million this year.
Despite the popularity of ancestry testing, there is absolutely no government or professional oversight of the industry to ensure the validity of the results.
It's a situation Gravel finds troubling.
"Usually in science we have a process like peer review and make the data accessible, and make the algorithms accessible, that's how we ensure the high quality of the data," he said.
"In this case, we don't have access to that because the companies keep the data private."
That's why Gravel says consumers should take the results generated by these tests with a grain of salt. People need to understand these tests are not subject to the same standard as diagnostic medical testing. They are more like a "recreational scientific activity," he said.
Similar to 23andMe, MyHeritage says its results are "ethnicity estimates."
When MyHeritage spokesperson Rafi Mendelson was asked why the company presents results with such certainty (video results sent to customers declare, "You are," before listing a person's ancestry), he said he believes the messaging makes clear that results are only estimates, and that North American consumers in particular understand this.
Results subject to change
Whatever your ancestry results, don't get too attached to them. They could change.
In September, AncestryDNA informed customers that it had updated their estimates with the following message:
"Your DNA doesn't change, but we now have 13,000 additional reference samples and powerful, new science to give you better ethnicity results."
The ancestry estimates used in this story are from Nov. 6, 2018, after the company updated the twins' results.
The new estimates included previously undetected ancestral ties to Russia, Greece, the Balkans and Baltics.
— With files from Jeannie Stiglic