Cacao has been moved around the globe by people for thousands of years. That makes it surprisingly difficult to tell whether a population of cacao trees is truly wild or was introduced through human transport and cultivation. This paper tackles exactly that question using genomic data.
What motivated the authors
Traditionally, cacao varieties were categorised into two main groups, Criollo and Forastero, with a third, hybrid group called Trinitario. Starting in 2018, further studies showed that this classification does not fully cover the genetic diversity present in the eleven populations identified so far: Amelonado, Caquetá, Contamana, Criollo, Curaray, Guiana, Iquitos, Marañón, Nacional, Nanay, and Purús.
The authors of this paper explain that these cacao populations range from wild, to naturalised, to cultivated. This blurs the boundary between what we call “wild” and “introduced” cacao.
What they did
First, let me explain some biological terms to help you understand what the authors did.
DNA (deoxyribonucleic acid) is a molecule with a double helix structure. It is written in a four-letter code (four types of nucleotides): Adenine (A), Thymine (T), Cytosine (C), and Guanine (G). The order of these letters stores an organism’s genetic instructions. It contains genes that tell the cells what proteins to make and when. Small differences in DNA create variation between individuals.
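To make "small differences in DNA" concrete, here is a minimal Python sketch that counts the positions at which two short DNA strings differ. The two sequences and the function name count_differences are made up for illustration; they are not real cacao data and not something taken from the paper.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions where two equal-length DNA sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Two invented sequences (assumption: purely illustrative, not cacao DNA).
tree_1 = "ATGCCGTA"
tree_2 = "ATGTCGTA"

print(count_differences(tree_1, tree_2))  # prints 1 (the fourth letter differs)
```

Real comparisons work on millions of positions and have to deal with missing data and sequencing errors, but the basic idea is the same: line the letters up and see where they disagree.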
Sequencing is a technique used to learn the order of these DNA letters, so we can compare one individual to another. This paper used RAD sequencing (RAD-seq, short for restriction site-associated DNA sequencing). Instead of reading all of the DNA (which is huge), RAD-seq reads the same selected pieces of DNA across many samples. That gives enough genetic markers to see how populations are related, without sequencing the whole genome. Why the shortcut? The cacao genome consists of roughly 440 million DNA letters. That’s a lot of reading…
The authors zoom in on three major groups (listed below) that matter for cacao history, using genome-wide data to compare which populations look wild and which show signs of domestication.
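The idea of reading the same selected pieces in every sample can be sketched in a few lines of Python. This is only a toy simulation under stated assumptions: real RAD-seq is a laboratory protocol, the recognition site GAATTC (that of the enzyme EcoRI) is chosen here purely as an example and is not necessarily the enzyme the authors used, and the function rad_tags and both toy genomes are invented for illustration.

```python
SITE = "GAATTC"  # assumption: EcoRI recognition site, used only as an example


def rad_tags(genome: str, site: str = SITE, tag_length: int = 10) -> list[str]:
    """Return the short stretch of DNA that follows each occurrence of the site."""
    tags = []
    start = genome.find(site)
    while start != -1:
        tag_start = start + len(site)
        tag = genome[tag_start:tag_start + tag_length]
        if len(tag) == tag_length:  # skip tags that run off the end
            tags.append(tag)
        start = genome.find(site, start + 1)
    return tags


# Two invented toy genomes: different overall, but sharing the same cut sites.
sample_a = "CC" + SITE + "AAACCCGGGT" + "TTACGT" + SITE + "TTTGGGCCCA" + "AATT"
sample_b = "GG" + SITE + "AAACCTGGGT" + "TTACGT" + SITE + "TTTGGGCCCA" + "AATT"

print(rad_tags(sample_a))  # ['AAACCCGGGT', 'TTTGGGCCCA']
print(rad_tags(sample_b))  # ['AAACCTGGGT', 'TTTGGGCCCA']
```

Because both samples are cut at the same sites, the resulting tags line up: the first tag differs by one letter between the two samples, the second is identical. That is how a small fraction of the genome can still yield comparable markers across many trees.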
1) a group of 28 cacao samples from different wild populations in the Upper Amazon basin (known for high diversity)
2) a group of 8 cacao samples from the Guiana population
3) a group of 6 cacao samples from the Amelonado population, introduced into eastern Brazil
For these 42 samples, they extracted DNA from dried leaves. Using the RAD sequencing explained above, they examined genome-wide genetic variation for:
a) signals consistent with selection or domestication.
b) genetic diversity: how much variation exists within each of the three groups?
c) population structure: do some groups cluster together?
d) differentiation between groups: how different are the groups from each other?
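Points b) and d) can be illustrated with two textbook quantities: expected heterozygosity as a measure of diversity within a group, and the fixation index FST as a measure of differentiation between groups. The Python sketch below uses the simplest definitions for a single site with two variants and two equally weighted groups. It is an assumption-laden illustration with invented allele frequencies, not the estimator or the data used by the authors.

```python
def expected_heterozygosity(p: float) -> float:
    """Diversity at one two-variant site, given the frequency p of one variant."""
    return 2 * p * (1 - p)


def fst(p1: float, p2: float) -> float:
    """Textbook FST for one site and two equally weighted groups.

    FST = (H_T - H_S) / H_T, where H_S is the mean within-group diversity
    and H_T is the diversity of the two groups pooled together.
    """
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2
    h_t = expected_heterozygosity((p1 + p2) / 2)
    if h_t == 0:  # the site does not vary at all
        return 0.0
    return (h_t - h_s) / h_t


# Invented variant frequencies in two hypothetical groups.
print(f"{fst(0.5, 0.5):.2f}")  # 0.00: identical frequencies, no differentiation
print(f"{fst(0.6, 0.4):.2f}")  # 0.04: slightly different, low differentiation
print(f"{fst(1.0, 0.0):.2f}")  # 1.00: each group fixed for a different variant
```

Values near 0 mean two groups are genetically very similar and values near 1 mean they are strongly separated. In practice such statistics are averaged over many thousands of sites rather than computed from one.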
What they found
They recovered a large number of genomic differences between the groups (944,958 different sites). Breaking it down per group, they found the following:
group 1) Upper Amazonian populations look like true wild diversity hubs.
The data show a broad genetic spread and patterns consistent with a long-term wild history. This supports the widely accepted view that the Upper Amazon is very likely the main region of origin of cacao. In other words, a lot of rare diversity is found here.
group 2) The interesting find: Guiana looks more like an isolated wild lineage than a recent introduction.
This group forms a tight cluster and shows very low differentiation from the group 1 Upper Amazonian populations. Its shared ancestry patterns fit better with wild relatedness than with a clearly introduced group. The authors’ conclusion is that the Guiana group represents geographically isolated wild cacao, and not simply human movement of cacao.
group 3) Amelonado looks like domesticated cacao shaped by its introduction.
The Amelonado individuals cluster together and are dominated by a single ancestry component in the structure analysis. The authors interpret these patterns as consistent with a domestication history, following its historical introduction.
A nuance the authors are honest about is that the group boundaries aren’t perfectly clean: some samples are intermediates (falling between groups) or outliers. This does fit the fact that cacao has a complex history of mixing between varieties and movement across the globe.
Summary
Using RAD-seq genomic data from 42 cacao leaf samples, divided into three heritage groups, the authors compare wild Upper Amazon populations, the debated Guiana group, and the historically introduced Amelonado variety. They once more confirm the Upper Amazon as a major reservoir of wild cacao diversity. They show that the Amelonado group carries patterns consistent with a history of domestication and introduction. They also provide evidence that the Guiana cacao group is likely an isolated wild lineage, and not a recent human introduction.
The broader takeaway of this paper is that cacao doesn’t per se fall neatly into “wild vs cultivated” categories, but rather sits on a continuum shaped by long-term human influence.
Paper details
Full title: Wild or Introduced? Investigating the Genetic Landscape of Cacao Populations in South America.
Authors: Matheus Colli-Silva, James Edward Richardson, José Rubens Pirani & Antonio Figueira
Journal: Ecology and Evolution Volume 15, Issue 7, Jul 2025
Official citation: Colli-Silva, M., J. E. Richardson, J. R. Pirani, and A. Figueira. 2025. “Wild or Introduced? Investigating the Genetic Landscape of Cacao Populations in South America.” Ecology and Evolution 15, no. 7: e71746.
Link to full article: https://doi.org/10.1002/ece3.71746

