What is DNA Raw Data?
Your ancestry DNA raw data is a lab-generated text file. It usually comes in one of the following formats:
- .txt file of size 14-17 MB
- zip file of size 5-7 MB
Whole-genome files are much larger, at 200-400 MB. The top at-home DNA testing companies that test for ancestry, such as 23andMe, Ancestry DNA, and Family Tree DNA, let their customers download their raw data directly from the website or request a download. You also have the option to import your DNA raw data directly from the 23andMe database instead of having to upload the file yourself. This is separate from buying their genetic test kit.
The raw data files from different at-home ancestry DNA testing companies contain different numbers of genetic markers, depending on the microarray chip used. For example, 23andMe raw data from the v5 chip has around 650,000 SNPs, while Ancestry DNA raw data has around 700,000 SNPs. Log in to your 23andMe account and download your raw data now! It does not take long.
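To get a feel for what is inside the file, here is a minimal Python sketch that counts the markers in a raw data file. It assumes the tab-separated layout commonly seen in 23andMe exports: comment lines starting with `#`, then one row per marker with rsid, chromosome, position, and genotype, where `--` marks a position the lab could not call. Other providers arrange their columns differently, so treat this as an illustration rather than a universal parser. The function name `count_markers` and the sample rows are made up for this example.

```python
import io

def count_markers(lines):
    """Count total markers and no-calls in a 23andMe-style raw data file."""
    total = 0
    no_calls = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#' comment header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a marker row in the assumed layout
        rsid, chromosome, position, genotype = fields[:4]
        total += 1
        # Assumption: '--' means the genotype could not be called.
        if genotype == "--":
            no_calls += 1
    return total, no_calls

# Illustrative rows only, not real results.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\tAG\n"
    "rs0000003\t2\t3000\t--\n"
)
print(count_markers(sample))  # (3, 1)
```

To run it on your own download, replace `sample` with an open file handle, for example `open("my_raw_data.txt")` (a hypothetical file name).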
Before you read on about the interpretation of Promethease reports, read the following update carefully.
Update: In 2019, MyHeritage acquired Promethease and SNPedia.
MyHeritage offered Promethease free of charge through the end of 2019 and continues to maintain SNPedia as a free resource for academic and non-profit users. For non-European users, the DNA raw data will be moved to new MyHeritage accounts created for them. However, users will retain ownership of their DNA raw data file and are free to delete it from MyHeritage’s server.
What is Promethease?
First, the correct spelling is Promethease, not Prometheus, as wrongly propagated by the lazy or uninformed.
Promethease is an online literature retrieval system based on the wiki-styled data repository called SNPedia. It does not perform raw data analysis as such. Though popular, it is often criticized as too technical and difficult to read. You can take a look at Promethease's sample report.
How did Promethease gain popularity?
Promethease was one of the early companies to offer DNA raw data analysis based on 23andMe and Ancestry DNA raw data. When the FDA (briefly) banned 23andMe from providing health reports, Promethease was the only alternative and gained popularity because of its free service and its reasonably short turnaround time.
It ranked among the best third-party tools but, going by several user reviews, was criticized for its user-unfriendly interface. Some well-known competitors for this tool are Xcode Life, Codgen, Interpretive, Nutrahacker, and GEDMatch (mostly for DNA matches across databases and ethnicity).
What does Promethease provide?
Promethease offered health reports based on raw DNA data from 23andMe, Ancestry DNA, Family Tree DNA, and other ancestry testing providers for a price of $12. Once you uploaded your DNA raw data, you would get back your interpretation in about 20 minutes.
On occasion, the report was offered for free or at a discount. Many users tried the Promethease reports because of the free or low-cost aspect and the apparent accuracy suggested by its vast content. However, the report left individuals more confused than clear. So if you were a customer of Promethease and still cannot make heads or tails of the report, read on.
What exactly do you get in a Promethease report?
Promethease is a wiki-style collection of peer-curated genetic information, in contrast to reports that are expert-curated (like Xcode Life's reports). You can search genes, variants, and diseases as you would on Wikipedia. The results are pages of information. This is great for people who have degrees in genetics and want to geek out and learn about all sorts of diseases they could possibly get.
However, if you want organized information that you can use in specific aspects of daily life such as Nutrition, Fitness or Ancestry, etc., to enhance your health and wellbeing, then you may not find the Promethease reports all that helpful. Promethease’s emphasis largely seems to be the variety of different diseases that one can get.
How to read Promethease results?
Once you transfer your DNA raw data to Promethease you will get back a vast report. Though on the face of it, it may look like a massive report, the amount of potentially useful information in the Promethease report is quite limited. Understanding and interpreting this report is a challenge. Let’s break it down:
The total number of entries in a typical Promethease report = Approximately 25,000
Number of entries
- With Good repute = 5817
- With Bad repute = 416
- With Unknown repute = 18668
The “Summary” component of the Promethease report is where the information pertaining to your genotype is given. This column for the vast majority of the entries has information that will not be useful to most people. It contains terms like common, normal, common/normal, average, common in affy, and other terms that carry no useful information for the regular user.
The number of entries remaining after removing terms such as “common in ClinVar”, “common on affy axiom”, “none”, “complete genomics”, “common/normal” and dropping the empty entries (no annotation) = 1482
After further removing terms such as normal, common, etc = ~800
In other words, only about 800 of the 25,000 entries (roughly 3%) in the Promethease report are potentially informative.
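The arithmetic behind this breakdown, and the kind of filter that produces it, can be sketched in a few lines of Python. The counts are the ones quoted above. The `entries` list and the `is_informative` helper are hypothetical stand-ins: a real report would first have to be parsed from whatever export you have, and this is not how Promethease or Xcode Life actually implement their filtering.

```python
# Counts quoted in the breakdown above.
repute_counts = {"good": 5817, "bad": 416, "unknown": 18668}
total_entries = sum(repute_counts.values())

# Summary terms the article treats as uninformative (compared in lowercase).
UNINFORMATIVE = {
    "", "none", "common", "normal", "common/normal", "average",
    "common in affy", "common in clinvar", "common on affy axiom",
    "complete genomics",
}

def is_informative(summary):
    """Return True if a summary is not one of the boilerplate terms."""
    return summary.strip().lower() not in UNINFORMATIVE

# Hypothetical entries standing in for rows of a parsed report.
entries = [
    {"rsid": "rs0000001", "summary": "common in ClinVar"},
    {"rsid": "rs0000002", "summary": "normal"},
    {"rsid": "rs0000003", "summary": "1.3x increased risk (illustrative)"},
    {"rsid": "rs0000004", "summary": ""},
]
kept = [e for e in entries if is_informative(e["summary"])]

print(total_entries)              # 24901, i.e. approximately 25,000
print(f"{800 / 25000:.1%}")       # 3.2%
print([e["rsid"] for e in kept])  # ['rs0000003']
```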
The “Simplified Promethease” report from Xcode filters your Promethease report down to the most useful information and organizes this information for you in an understandable way.
How to read a Promethease report?
Before you get into reading your Promethease report, it will help to know the meaning of a few terms that are frequently used in it. They are tabulated below.
|Repute||This is a parameter that is applied to a genotype. It can be good, bad or not set. This is based on a general opinion.|
|Magnitude||It tells you how interesting your genotype is. It is purely a subjective parameter. It is a numerical value between 0 and 10, with 0 assigned to boring information and 10 being assigned to really significant information. This can be set to whichever value you prefer. Most users set this value up to a moderate 5.|
|Orientation||Indicates which strand of the DNA (plus or minus) an SNP is reported on, relative to a reference human genome. At the time of writing, the reference build was GRCh38 patch 12, released in December 2017 (you don’t have to remember this!).|
|Frequency||How rare or common a genotype is in your population.|
|Geno Time||Timestamp when the genotype page was last updated. This is with reference to SNPedia.|
|ClinVar||If you select that button, only health conditions that are found in ClinVar will be displayed. ClinVar is a public archive of reports of the relationships among human gene variations and phenotypes, with supporting evidence.|
|Genosets (gs IDs)||A genoset is a defined set of genotypes that are associated with a particular phenotype.|
|SNPs for the other sex||Many people are taken aback by this term. Don’t worry about this. It simply tells you that the effects of these SNPs are irrelevant to your sex. For example, how an SNP can increase the risk for prostate cancer is irrelevant to a woman.|
|GMAF||Stands for Global Minor Allele Frequency. It is the frequency of the allele (an alternative form of a gene) that is less common in the global gene pool, as per the 1000 Genomes Project.|
|Carrier||Many of you may already be familiar with this term. Carriers are individuals who carry only one copy of the risk allele. It is of significance mainly in recessive disorders that are known to be caused by a single-gene mutation, like cystic fibrosis (a.k.a. Mendelian disorders). Carriers do not manifest the disease but may pass on the mutant allele to the next generation.|
|CI||Stands for Confidence Interval. The range within which the true value is expected to lie with a stated degree of confidence, usually 95%.|
|de novo mutation||True to the name, it means a new mutation that was not found in either of your parents.|
|CNV||Stands for Copy Number Variant. It refers to a difference in the number of copies of a gene you have in your genome. High copy numbers of certain genes are commonly found in many forms of cancer.|
|Haplotype||A specific combination of SNPs all occurring on the same chromosome, and therefore inherited together from either your mother or your father.|
|Diplotype||It refers to a specific combination of haplotypes.|
|Familial risk||It is a comparison between the risk of individuals with relatives with a disease to the risk of individuals with relatives who do not have the same disease.|
|GWAS||Stands for Genome-Wide Association Study. A study of the markers (usually SNPs) across the entire genome to see which ones are statistically more or less common in one group of people (often patients with a specific disease) compared to another group of people (typically people unaffected by that disease).|
|In/dels||Short for insertions/deletions. They are a type of polymorphism that involves the insertion or deletion of one or more bases.|
|SNP||Stands for Single Nucleotide Polymorphism. As the name suggests, it is a change occurring in only one base.|
|Odds ratio (OR)||It is a useful number to evaluate the association between an allele or genotype with a phenotype. The phenotype might be an inherited trait (like widow’s peak) or a disease (like Type 2 Diabetes). This number is arrived at by comparing carriers of a less common allele to people with 2 copies of the most common allele.|
|Penetrance||The degree to which you're likely to actually have a given trait or phenotype when you carry a given variation. Penetrance is considered high if over 80%, and moderate if between 20 - 80%.|
|Position||Location of an SNP on the chromosome. This position is however only relative to positions in the reference genome sequence. You cannot assign absolute positions to SNPs on a chromosome.|
|Prevalence||The proportion of people with a given condition at a given time. Usually a percentage or a ratio.|
|rs#||All SNPs have an rs identifier that is officially assigned by dbSNP, a public database of SNPs maintained by the National Institutes of Health (NIH).|
|Wild-type||The genotype or the allele that is most common in the population and serves as a reference to evaluate all other types of alleles.|
If there are still terms you need explained, you can look them up here.
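Several of these terms are easier to grasp with numbers. The Python sketch below computes an odds ratio and an approximate 95% confidence interval from a 2x2 table of carriers and non-carriers among cases and controls, using the standard log-odds (Woolf) approximation. The counts are invented for illustration, and the function name `odds_ratio_ci` is ours; it is not something Promethease or SNPedia provides.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf's log method)."""
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR); requires all four cells to be non-zero.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Invented counts: 60 of 200 cases and 40 of 200 controls carry the allele.
or_value, low, high = odds_ratio_ci(60, 140, 40, 160)
print(f"OR = {or_value:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

An odds ratio above 1 means the allele is more common among people with the condition. When the whole interval sits above 1, as it does for these made-up counts, the association is conventionally treated as statistically significant at the 5% level; an interval that straddles 1 is not.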
How is the Xcode Life report different from the Promethease report?
Phew! All you wanted was some simple tips to align your lifestyle to your genes, and you got a lesson in biostatistics instead.
Xcode Life, on the other hand, draws on expert-curated references from several large databases and leading scientific journals to curate the variant annotations. The information is then organized systematically into topical reports such as Nutrition, Health, Fitness, Skin, Allergy, Ancestry, etc. Each report is further organized into traits, which provide actionable insights into your genetic type along with specific recommendations for you. Xcode reports are easily readable, understandable, and implementable. The core philosophy of Xcode reports, in contrast with Promethease, is to empower the user with actionable genetic information that they can use to enhance health and wellbeing. Each report is reasonably priced at around $30; additionally, there are package discounts if the user buys multiple reports together.
There you have it! If you want to satisfy your curiosity and are not really looking for anything specific, then Promethease may be the way to go. Even then, you will likely be left scratching your head. But if you want specific, organized, and actionable insights from your genetic data about your health and wellbeing, then you certainly must try the Xcode Health reports.