23andMe vs AncestryDNA: which raw data is better for analysis?
If you've taken a consumer DNA test you can almost always download the raw data — a plain text file with hundreds of thousands of genetic markers. Once you have it, you can run third-party analyses, contribute to research, or just understand what your sample actually contains. But the file from 23andMe is not the same as the file from AncestryDNA. Here's how they differ and which is better for what.
The chips behind the files
Both companies genotype your sample on a microarray: a chip carrying hundreds of thousands of probes, each of which detects a specific position in the genome. The chip determines which variants appear in your raw file, and the two companies do not use the same chip:
- 23andMe currently uses a customised Illumina Global Screening Array (GSA) chip targeting around 650,000 SNPs. The chip has been refreshed multiple times; newer kits include extra medically relevant variants that the older v3/v4 chips lacked.
- AncestryDNA uses its own customised Illumina chip, also based on GSA technology, with roughly 700,000 SNPs. Its design is biased toward ancestry-relevant markers (population-defining variants).
Overlap and unique coverage
The two chips share roughly 60–70% of their SNPs. The non-overlapping portion is where the practical difference lives:
- 23andMe covers more pharmacogenomic variants (drug-response markers) and many of the medically actionable variants flagged by ClinVar.
- AncestryDNA covers more ancestry-informative markers and certain population-specific variants useful for genealogical inference.
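If you have both files, you can measure the overlap yourself instead of relying on the rough figure above. The sketch below assumes you have already extracted the rsIDs from each file into two collections; the function name `overlap_report` and the toy identifiers are illustrative only.

```python
def overlap_report(rsids_a, rsids_b):
    """Summarise shared and unique markers between two collections of rsIDs."""
    a, b = set(rsids_a), set(rsids_b)
    shared = a & b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        # Share of each file's markers that also appear in the other file.
        "shared_pct_of_a": 100 * len(shared) / len(a) if a else 0.0,
        "shared_pct_of_b": 100 * len(shared) / len(b) if b else 0.0,
    }


# Toy example with made-up identifiers, not real chip content.
report = overlap_report(
    ["rs1", "rs2", "rs3", "rs4"],
    ["rs3", "rs4", "rs5"],
)
print(report)
```

The percentage is reported against each file separately because "60–70% shared" means something slightly different depending on which chip you take as the denominator.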
File format
Both ship a zipped .txt file. The internal format differs slightly:
- 23andMe: tab-separated, four columns (`rsid · chromosome · position · genotype`). The genotype is two letters like `AG`, with `--` for no-calls.
- AncestryDNA: tab-separated, five columns (`rsid · chromosome · position · allele1 · allele2`). Alleles are one letter each, with `0` for no-calls.
Both formats are trivially convertible. Any reasonable analysis tool — including ours — accepts either and auto-detects which is which.
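As a rough illustration of how such auto-detection could work, here is a minimal sketch that tells the two layouts apart by column count and normalises both to a single `(rsid, chromosome, position, genotype)` shape. It is not our production parser: the function name `parse_raw` is hypothetical, and the code assumes that comment lines start with `#` and that a column header row begins with `rsid`.

```python
import io


def parse_raw(handle):
    """Yield (rsid, chromosome, position, genotype) from either layout.

    The layout is inferred per data row from the number of tab-separated
    fields: 4 means 23andMe-style, 5 means AncestryDNA-style.
    """
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # assumed comment/header convention
        fields = line.split("\t")
        if fields[0].lower() == "rsid":
            continue  # uncommented column header row
        if len(fields) == 4:
            rsid, chrom, pos, genotype = fields
        elif len(fields) == 5:
            rsid, chrom, pos, a1, a2 = fields
            # Normalise the AncestryDNA no-call marker "0" to "--".
            genotype = "--" if "0" in (a1, a2) else a1 + a2
        else:
            raise ValueError(f"unexpected column count: {len(fields)}")
        yield rsid, chrom, int(pos), genotype


# Made-up rows in each layout; both should normalise to the same tuples.
sample_23andme = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs123\t1\t1000\tAG\n"
    "rs456\t2\t2000\t--\n"
)
sample_ancestry = (
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs123\t1\t1000\tA\tG\n"
    "rs456\t2\t2000\t0\t0\n"
)
rows_a = list(parse_raw(io.StringIO(sample_23andme)))
rows_b = list(parse_raw(io.StringIO(sample_ancestry)))
print(rows_a == rows_b)  # True
```

For a real download you would open the unzipped .txt file and pass the file object to `parse_raw` in place of the `io.StringIO` samples.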
Which is "better" depends on why you're analysing
| Use case | Better source |
|---|---|
| Polygenic risk scores (type 2 diabetes, heart disease, etc.) | Tie: both have enough coverage |
| Pharmacogenomics (drug response) | 23andMe |
| ClinVar pathogenic variants | 23andMe |
| Ancestry / population inference | AncestryDNA |
| Matching with relatives | AncestryDNA (larger database) |
A note on quality
Microarray genotyping is not the same as whole-genome sequencing. Both 23andMe and AncestryDNA are highly accurate (>99% concordance with sequencing) at the positions they actually measure, but they measure only about 0.02% of your genome: roughly 650,000 to 700,000 positions out of about 3.2 billion base pairs. If you want a complete picture, consider a whole-genome sequencing service. For everything in our risk reports, however, microarray data is more than enough.
The practical answer
If you already have one file and your goal is not one that the table above assigns to the other test, analyse what you have. The reports you'll get are very similar. If you're choosing a new test specifically for downstream analysis, 23andMe edges ahead for health and AncestryDNA edges ahead for genealogy. The very best option, if budget allows, is to do both and merge the files: going by the chip sizes and overlap quoted above, the merged file holds roughly 900,000 distinct SNPs.
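Merging is conceptually a union keyed on rsID. The sketch below shows one way to do it, assuming each file has already been parsed into a `{rsid: genotype}` dictionary with `--` as the no-call marker. The function name `merge_genotypes` and the policy of keeping the primary file's call when the two disagree are illustrative choices, not a standard.

```python
def merge_genotypes(primary, secondary):
    """Merge two {rsid: genotype} dicts and report disagreements.

    "--" marks a no-call. Where both files have a call and they differ,
    the primary file's call is kept and the rsID is recorded as a conflict.
    """
    merged = dict(primary)
    conflicts = []
    for rsid, call in secondary.items():
        existing = merged.get(rsid)
        if existing is None or existing == "--":
            # Marker missing or uncalled in the primary file: take the other.
            merged[rsid] = call
        elif call != "--" and sorted(call) != sorted(existing):
            # Allele order is ignored, so "AG" and "GA" count as agreement.
            conflicts.append(rsid)
    return merged, conflicts


# Made-up genotypes for illustration.
file_a = {"rs1": "AG", "rs2": "--", "rs3": "CC"}
file_b = {"rs1": "GA", "rs2": "TT", "rs3": "CT", "rs4": "GA"}
merged, conflicts = merge_genotypes(file_a, file_b)
print(merged)
print(conflicts)
```

Conflicting calls are flagged rather than silently resolved, so you can decide whether to drop those markers before running any downstream analysis.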
Already have a raw file?
Upload your 23andMe, AncestryDNA or MyHeritage file and get a free personalised report. Our parser auto-detects the format and works with the zipped download.