The production of high-quality chocolate will benefit from the recent sequencing and assembly of the genome of the chocolate tree, Theobroma cacao. The variety sequenced, 'Criollo', is considered by chocolate experts to produce the world's finest chocolate. The Maya domesticated Criollo about 3,000 years ago in Central America, and it is among the oldest domesticated tree crops, though today many growers prefer hybrid cacao trees ('Trinitario') that produce chocolate of lower quality but are more resistant to disease.

A key benefit of this kind of biology is that it can give us the best of both worlds: better chocolate that farmers can afford to grow because the trees won't die easily. Currently, most cacao farmers earn about $2 per day, but producers of fine cacao earn more. Increasing the productivity and ease of growing cacao can help to develop a sustainable cacao economy. If you're the type who thinks nature should never be optimized, take heart: cacao trees are an environmentally beneficial crop because they grow best under forest shade, allowing for land rehabilitation and enriched biodiversity, so growing more of them will be good.

The researchers identified a variety of gene families that may help improve cacao trees and fruit in the future, either by enhancing their attributes or by providing protection from the fungal diseases and insects that affect cacao trees.

Cacao flowers on a tree. Photo: Mark Guiltinan, Penn State

"Our analysis of the Criollo genome has uncovered the genetic basis of pathways leading to the most important quality traits of chocolate -- oil, flavonoid and terpene biosynthesis," said Siela Maximova, associate professor of horticulture, Penn State, and a member of the research team. "It has also led to the discovery of hundreds of genes potentially involved in pathogen resistance, all of which can be used to accelerate the development of elite varieties of cacao in the future."

Because the Criollo trees are self-pollinating, they are generally highly homozygous, possessing two identical forms of each gene, making this particular variety a good choice for accurate genome assembly.

The researchers assembled 84 percent of the genome, identifying 28,798 genes that code for proteins. They assigned 88 percent, or 23,529, of these protein-coding genes to one of the 10 chromosomes in the Criollo cacao tree. They also looked at microRNAs, short noncoding RNAs that regulate genes, and found that microRNAs in Criollo are probably major regulators of gene expression.

"Interestingly, only 20 percent of the genome was made up of transposable elements, one of the natural pathways through which genetic sequences change," said Mark Guiltinan, professor of plant molecular biology at Penn State. "They do this by moving around the chromosomes, changing the order of the genetic material. Smaller amounts of transposons than found in other plant species could lead to slower evolution of the chocolate plant, which was shown to have a relatively simple evolutionary history in terms of genome structure."

Guiltinan and his colleagues are interested in specific gene families that could be linked to specific cocoa qualities or disease resistance. They hope that mapping these gene families will lead to a source of genes directly involved in the plant variations that are useful for accelerating plant breeding programs. "Fine cocoa production is estimated to be less than 5 percent of the world cocoa production because of low productivity and disease susceptibility."

Citation: Xavier Argout, Jerome Salse, Jean-Marc Aury, Mark J Guiltinan, Gaetan Droc, Jerome Gouzy, Mathilde Allegre, Cristian Chaparro, Thierry Legavre, Siela N Maximova, Michael Abrouk, Florent Murat, Olivier Fouet, Julie Poulain, Manuel Ruiz, Yolande Roguet, Maguy Rodier-Goud, Jose Fernandes Barbosa-Neto, Francois Sabot, Dave Kudrna, Jetty Siva S Ammiraju, Stephan C Schuster, John E Carlson, Erika Sallet, Thomas Schiex, et al, 'The genome of Theobroma cacao', Nature Genetics (2010) doi:10.1038/ng.736 (free to read)