Until we learn whether or not life exists on other planets, we extrapolate on the basis of our single living world. Just how long it took life to develop is a vital question, with implications that extend to other planetary systems. In today’s essay, Alex Tolley brings his formidable background in the biological sciences to bear on the matter of Earth’s first living things, which may well have emerged far earlier than was once thought. In particular, what was the last universal common ancestor — LUCA — from which bacteria, archaea, and eukarya subsequently diverged? Without the evidence future landers and space telescopes will give us, we remain ignorant of so fundamental a question as whether life itself — not to mention intelligence — is a rarity in the cosmos. But we’re piecing together a framework that reveals Earth’s surprising ability to spring into early life.

by Alex Tolley

Once upon a time, the history of life on Earth seemed so much simpler. Darwin had shown how natural selection of traits could create new species given enough time, although he did not address the origin of life beyond suggesting that it would start in a “warm pond”. Extant animals and plants had been classified starting with Linnaeus, and evolution was inferred by comparing traits of organisms. Fossils of ancient animals added to the idea of evolution in deep time. Oparin in 1924, and Haldane in 1929, suggested that a primordial soup would accumulate in a sterile ocean, due to the formation of organic molecules from reduced gases and energy. This would be the milieu for life to emerge.

With the Miller-Urey experiment (1952), which demonstrated that amino acids, the “basic building blocks of life”, could be created quickly in the lab from a primordial atmosphere gas mixture and electricity, it was assumed that the proteins that form the basis of most of life’s structure and function would follow. The time available for the evolution of life grew from less than 10,000 years in the Biblical Old Testament, to 100 million years (my) in the late 19th century, to about 4.5 billion years (Ga) once radioisotopic dating was established by 1953. Fossil evidence relied on the mineralization of hard structures, which started to appear in the Cambrian period around 550 million years ago (mya).

The Apollo lunar samples indicated that the Moon, which formed 4.5 Ga, had been subjected to a late heavy bombardment (LHB) by impactors from around 4.1 to 3.8 Ga. With the Earth assumed to have been sterilized by the LHB, there still seemed to be plenty of time for life to appear afterwards. Then the dating of stromatolites pushed the earliest known life back to nearly 3.5 Ga and reduced the time for abiogenesis to just a few hundred million years after the LHB. This seemed to leave too little time for abiogenesis. There was a reprieve when it was argued that the LHB was an artifact of lunar sample collection, with the later Imbrium impact adding its younger age to the older samples. If the LHB was not a sterilizing event, then another 500 million to a billion years could be allowed for life to appear.

Even though the structure of DNA was determined by Watson and Crick in 1953, and with it the site of genes, sequencing even short lengths of DNA was a slow process. This changed during the 1990s with the arrival of gene sequencing machines and algorithms and the sequencing of the human genome. Sequencing costs have fallen sharply, and gene databases are being filled. We now have vast numbers of sequenced genes from a range of organisms, and full genomes from selected species.

The resulting inexpensive gene sequencing kickstarted the genomics revolution. With gene sequences from a large number of extant species, Richard Dawkins suggested that even if there were no fossils, evolution could be inferred from the changes in the nucleotide base sequences in modern organisms, with evolution represented by the incremental changes in species’ genomes. His magnum opus The Ancestor’s Tale was an exploration of the tree of life moving backwards in time [6].

The slow change over time in the sequences of key functional genes that appear in all organisms is called the “molecular clock”. The greater the difference in sequences between the genes in 2 species, the greater their evolutionary separation. However, unlike atomic clocks, the molecular clock does not tick at the same rate for each organism, or gene. If it did, all the divergences would sum to the same length of time. As Figure 1 demonstrates, they do not. Nevertheless, evolutionary trees for all organisms with sequenced conserved functional genes were built to show how species evolved from each other, and these could be compared with phylogenetic trees created using the fossil record.
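To make the idea of sequence difference as a clock concrete, here is a minimal sketch. It is not the method used in any of the papers discussed; it assumes the simple Jukes-Cantor correction, and the function names, the toy sequences, and the substitution rates are all hypothetical values chosen for illustration.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(p: float) -> float:
    """Correct an observed difference p for repeated changes at the same site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the Jukes-Cantor correction needs 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance: float, rate_per_site_per_my: float) -> float:
    """Time since the split; both lineages accumulate changes, hence the 2."""
    return distance / (2.0 * rate_per_site_per_my)


# Toy aligned sequences (hypothetical, not real gene data).
gene_a = "ACGTACGTAC"
gene_b = "ACGTTCGTAA"
d = jukes_cantor_distance(p_distance(gene_a, gene_b))

# The same distance implies different ages under different assumed rates
# (both rates are made up), which is why the clock needs calibrating.
for rate in (0.001, 0.002):
    print(f"rate={rate}: about {divergence_time(d, rate):.0f} my")
```

Doubling the assumed rate halves the inferred age, so a sequence distance on its own cannot supply a date.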

Figure 1. Rooted and unrooted phylogenetic trees. (Source: Creative Commons Chiswick Chap).

While this phylogenetic tree shows evolutionary separation, it has no timeline. These trees converge back in time to a Last Universal Common Ancestor (LUCA) at the point where the 2 most distantly related domains of life, the Bacteria and the Archaea, are joined. However, fossils can provide a means to calibrate the timeline for the tree branches and to place LUCA in time. For example, if we can find and date human fossils and chimpanzee fossils, we can be confident that their common ancestor lived at an earlier age. That common ancestor would in turn be younger than the ancestor from which all the apes diverged, and that ancestor would be younger than the ancestor of all primates. The phylogenetic trees based on gene sequences can be compared to trees based on morphology. Generally, they match. With fossil evidence, these new phylogenetic trees can be calibrated to date the branches.
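The logic of fossil calibration can be sketched in a few lines: a dated fossil sets a minimum age for its own lineage, and every ancestor of that lineage must be at least as old. The tree shape, the node names, and the ages below are placeholders invented for illustration, not real fossil dates.

```python
def minimum_ages(children, fossil_min_age, node, out=None):
    """Propagate fossil minimum ages up a tree.

    children: dict mapping a node to the list of its descendant nodes.
    fossil_min_age: dict mapping a node to the age of its oldest fossil.
    Returns a dict giving the youngest age each node could possibly have.
    """
    if out is None:
        out = {}
    age = fossil_min_age.get(node, 0.0)
    for child in children.get(node, []):
        minimum_ages(children, fossil_min_age, child, out)
        # An ancestor cannot be younger than any of its descendants.
        age = max(age, out[child])
    out[node] = age
    return out


# Hypothetical tree and placeholder ages in arbitrary units.
tree = {
    "primates": ["apes", "monkeys"],
    "apes": ["human_chimp", "gibbon"],
    "human_chimp": ["human", "chimp"],
}
fossils = {"human": 3.0, "chimp": 1.0, "gibbon": 5.0, "monkeys": 4.0}

bounds = minimum_ages(tree, fossils, "primates")
for name in ("human_chimp", "apes", "primates"):
    print(name, "is at least", bounds[name], "units old")
```

These are lower bounds only. How much older than its oldest fossil a node really is must come from the molecular clock.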

Without good fossil evidence to calibrate the phylogenetic tree, it is harder to date the tree of life as we approach its root, where we believe LUCA must be present. Several attempts have been made to determine this timeline. In 2018, a paper by Betts et al. indicated that LUCA could be dated to about the age of the Earth [2]. Mahendrarajah et al., analyzing the gene for ATP synthase, estimated a similarly early date for its appearance, before the separation of the Archaea and Bacteria, placing LUCA at over 4 Ga [3].

The new paper by Moody et al. extends the work of the aforementioned 2 co-authors, as well as others, to create the best estimate of the timeline of life, the dating of LUCA, a description of LUCA, and its environment. The approach relies on cross-bracing: duplicated copies of ancient functional genes (paralogs) anchor different dated trees, so that each tree provides mutual support for the dating of the other [12]. This firms up both the phylogenetic tree and the fossil calibrations.

The 2 different trees arise from gene duplications that occurred before LUCA appeared, and are shown in Figure 2. The analysis dates LUCA to between at least 4 Ga and the age of the Earth, 4.5 Ga. Most theories of abiogenesis require a watery environment, and the earliest dating of surface water and the appearance of oceans on Earth is fairly early, about 4.4 Ga, within 100 million years (my) of Earth’s formation [11]. The relaxed Bayesian analysis used hard (no 2.5% tail distribution) and soft (include 2.5% tail distribution) dates for the boundary calibrations. The maximum likelihood age of LUCA was placed at 4.2 Ga, 200 my after the oceans formed and about 300 my after the Earth formed and after the impact that formed the Moon and sterilized the Earth.
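The difference between hard and soft calibration bounds can be illustrated with a deliberately simplified sketch. It assumes a uniform prior between the two bounds, applies the 2.5% tail to each side (an assumption, since the text does not say), and gives the tails an exponential shape with an arbitrary scale. The 3.5 and 4.5 Ga bounds are taken from the text purely as example values; none of this reproduces the authors' actual software or settings.

```python
import random


def sample_calibration(min_age, max_age, rng, soft=True, tail=0.025, tail_scale=0.05):
    """Draw one node age (Ga) from a simplified calibration prior."""
    u = rng.random()
    if soft and u < tail:
        # Soft minimum: a small chance the node is younger than the bound.
        return min_age - rng.expovariate(1.0 / tail_scale)
    if soft and u > 1.0 - tail:
        # Soft maximum: a small chance the node is older than the bound.
        return max_age + rng.expovariate(1.0 / tail_scale)
    # Otherwise (and always for hard bounds) the age lies between the bounds.
    return rng.uniform(min_age, max_age)


def fraction_outside(samples, min_age, max_age):
    """Share of sampled ages that fall outside the calibration bounds."""
    outside = sum(1 for age in samples if age < min_age or age > max_age)
    return outside / len(samples)


rng = random.Random(42)
MIN_AGE, MAX_AGE = 3.5, 4.5  # illustrative bounds in Ga
hard = [sample_calibration(MIN_AGE, MAX_AGE, rng, soft=False) for _ in range(20000)]
soft = [sample_calibration(MIN_AGE, MAX_AGE, rng, soft=True) for _ in range(20000)]

print("hard bounds, fraction outside:", fraction_outside(hard, MIN_AGE, MAX_AGE))
print("soft bounds, fraction outside:", fraction_outside(soft, MIN_AGE, MAX_AGE))
```

A hard bound forbids any age beyond it, however strong the molecular signal, whereas a soft bound lets the data overrule a calibration at a small cost in prior probability.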

Figure 2 shows the new timeline. The dendrogram indicates the degree of gene sequence divergence as a horizontal line from each node. The longer the line, the more the molecular clock has ticked as the sequence changed relative to nearby species’ lines, and the longer the species have been separated by evolution. LUCA is dated within the Hadean eon, a time once thought to be devoid of life due to hellish surface conditions from impactor bombardment as well as the heat from Earth’s formation and radioactivity. The 4.5 Ga calibration date is a hard constraint, as terrestrial abiogenesis is impossible before then.

Figure 2. The calibrated phylogenetic tree shows the 2 lineages for the gene duplications, with each of the 2 trees acting as cross braces. The 2 algorithm variants with distributions in gold and teal converge to close overlaps with the dating of LUCA. Note the small purple stars that are the fossil calibrations. The calibrations for LUCA use the age of the Earth and prior fossil evidence as there is no fossil evidence for LUCA unless the controversial carbon isotope evidence demonstrates life and not an abiotic process. Credit: Moody et al.

The paper also uses the gene sequence evidence to paint a picture of LUCA as very similar to a prokaryote bacterium. It has all the important cellular machinery of a contemporary bacterium but with several cellular pathways absent or of low probability. It was probably a chemoautotroph, meaning that it could use free hydrogen and carbon dioxide to reduce and fix carbon as well as extract energy, from either geochemical processes or other contemporary organisms.

Because LUCA was not a protocell but a likely prokaryote, the sequence of abiogenesis from inanimate chemistry to a functioning prokaryote cell must have taken no more than 300 my, and more likely 200 my.
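The arithmetic behind those two figures is simple, and a short sketch keeps the dates straight. It uses only the dates quoted in the text, converted to millions of years ago; the variable and function names are my own.

```python
# Dates from the text, in millions of years ago (Ma); integers avoid rounding noise.
EARTH_FORMED_MA = 4500
OCEANS_FORMED_MA = 4400
LUCA_MA = 4200


def interval_my(older_ma: int, younger_ma: int) -> int:
    """Elapsed time in millions of years between two dates given in Ma."""
    return older_ma - younger_ma


since_earth_formed = interval_my(EARTH_FORMED_MA, LUCA_MA)
since_oceans_formed = interval_my(OCEANS_FORMED_MA, LUCA_MA)

print("Earth's formation to LUCA:", since_earth_formed, "my")
print("First oceans to LUCA:", since_oceans_formed, "my")
```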

As the authors state:

How evolution proceeded from the origin of life to early communities at the time of LUCA remains an open question, but the inferred age of LUCA (~4.2 Ga) compared with the origin of the Earth and Moon suggests that the process required a surprisingly short interval of geologic time. (emphasis mine).

The issue of the rapid appearance of life was back in play.

Figure 3 shows the hypothetical progression of abiogenesis to the Tree of Life and the steps needed to get from a habitable world to LUCA at the base of the Tree of Life.

Figure 3. The hypothetical development of life from the habitable planet through simpler stages and eventually to the radiation of species we see today. (Source: Creative Commons Chiswick Chap).

Given that the complexity of LUCA appears to be great, why is the timeline to evolve it so short when the timeline to the last archaeal and last bacterial common ancestors (LACA, LBCA) is so prolonged, at a billion years? Are the genomic divergences between bacteria and archaea so great not because a slow molecular clock ticked for a very long time, but because evolution was rapid? If the molecular clock was ticking faster, LUCA would be younger than it appears.

It is important to understand that LUCA was not a single organism, but a representative of a population. It probably lived in an ecosystem with other organisms, none of whose lineages survived. This is shown below in Figure 4. The red lines indicate that other, now extinct lineages may have transferred genes to each of the archaeal and bacterial lineages after LUCA evolved. In principle, this could have exaggerated the divergence of these 2 lineages, and with it the depth of the timeline from LUCA. This is a purely speculative explanation of the authors’ findings.

Figure 4. LUCA must have had ancestors and likely contemporary organisms. The gray lineage includes LUCA’s ancestors as well as other lineages that became extinct. The red lines indicate horizontal gene transfer across lineages.

A key question is whether the calibrated timeline is correct. While the number and authority of the authors are impressive, and the many checks on their analysis are substantial, the method may be simply inaccurate. We have a similar methodological issue with the Hubble Tension between 2 methods of determining the Hubble constant for the universe’s rate of expansion. Molecular clock rates are not uniform between species, and estimated timelines for the divergence of species can vary when compared to the oldest fossils. DNA sequences can be extracted from relatively recent fossils to calibrate the phylogenetic tree more accurately. However, this is not possible beyond a few million years due to DNA degradation. Purely mineralized fossils, impressions in rocks, and isotopic biosignature evidence rule this tight calibration out. Fossils are relatively rare and usually prove younger than the node that starts their particular lineage. This is to be expected, although the discovery of older fossils can modify the picture.

Because molecular clock rates are not fixed, various means are used to estimate rates, using Bayesian probability. These rely on different distributions. The authors use 2 methods:

1. Geometric Brownian motion (GBM)

2. Independent log-normal (ILN)

In Figure 2, the distributions are indicated by color. For the younger nodes, these methods clearly diverge, and in the case of the last eukarya common ancestor, the 2 distributions do not overlap. The distributions converge deeper in time, with the GBM maximum probability now a little older than the ILN one. The authors selected the GBM peak as the best dating for LUCA, although using the ILN method makes almost no difference.
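The difference between the two rate models can be sketched with a small simulation. As I understand them, GBM lets a descendant branch inherit its rate from its ancestor with some drift, whereas ILN draws every branch rate independently from the same log-normal distribution. The parameter values and function names below are illustrative assumptions, not those of Moody et al.

```python
import math
import random


def iln_rate(mean_rate, sigma2, rng):
    """Independent log-normal rate whose expected value is mean_rate."""
    return math.exp(rng.gauss(math.log(mean_rate) - sigma2 / 2, math.sqrt(sigma2)))


def gbm_child_rate(parent_rate, branch_time, sigma2, rng):
    """Geometric Brownian motion: the child rate drifts away from its parent rate."""
    variance = sigma2 * branch_time
    mean_log = math.log(parent_rate) - variance / 2  # keeps E[child] = parent
    return math.exp(rng.gauss(mean_log, math.sqrt(variance)))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def parent_child_correlations(n_pairs=2000, seed=1):
    """Correlation of log rates between parent and child branches, per model."""
    rng = random.Random(seed)
    gbm_parent, gbm_child, iln_parent, iln_child = [], [], [], []
    for _ in range(n_pairs):
        parent = iln_rate(1.0, 0.25, rng)
        gbm_parent.append(math.log(parent))
        gbm_child.append(math.log(gbm_child_rate(parent, 0.5, 0.25, rng)))
        iln_parent.append(math.log(iln_rate(1.0, 0.25, rng)))
        iln_child.append(math.log(iln_rate(1.0, 0.25, rng)))
    return pearson(gbm_parent, gbm_child), pearson(iln_parent, iln_child)


gbm_corr, iln_corr = parent_child_correlations()
print(f"GBM parent-child correlation: {gbm_corr:.2f}")
print(f"ILN parent-child correlation: {iln_corr:.2f}")
```

Under GBM, neighboring branches tick at similar rates, while under ILN a fast branch can sit next to a slow one. Neither assumption can be checked directly for nodes that have no fossil calibration.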

While the Bayesian method has become the standard method for calibrated phylogenetic tree dating, the question remains whether it is accurate. All the genes and cross-bracing used would be false support if there is a flaw in the methodology. A 2023 paper by Budd et al highlights the problem. In particular, based on fossils, the divergence of mammals occurs after the K-T event that is associated with the extinction of the non-avian dinosaurs, whereas the genomic data support a much older divergence without any fossil evidence. The paper argues that the same applies to the emergence of animals. Fossils from the Cambrian period are much younger than the calibrated phylogenetic data suggest.

Budd states that:

Overall, the clear implication is that the molecular part of the analysis does not allow us to distinguish between different times of origin of the clade, and thus does not contradict the general picture provided by the fossil record….

…we believe that our results must cast severe doubt on all relaxed clock outcomes that substantially predate well-established fossil records, including those affected by mass extinctions.

This becomes extremely problematic when there are no fossils to compare with. In the Moody paper the LACA and LBCA nodes have no calibrations at all, and LUCA has somewhat ad hoc calibration points. If Budd is correct, and he makes a good case, then all the careful analyses of the Moody paper are ineffective, due to fundamental flaws in the tools.

Given the paucity of hard fossil evidence and the known issues of calibrated Bayesian priors for molecular clock dating of phylogenetic trees, set against the careful testing by the authors of the LUCA paper, the best we can do is look at the consequences of the paper giving an overestimate or a correct estimate of the age of LUCA.

The easy case is that the age of LUCA has been overestimated, and that LUCA was represented by a population living between 3.4 and 4 Ga, with a peak probability somewhere in between. This would allow up to a billion years for abiogenesis to reach this point before the various taxa of archaea and bacteria separated hundreds of millions of years later, and before the eukarya separated from the archaea later still.

This would grant a comfortable period to postulate that at least one abiogenesis happened on Earth and that all life on Earth is local. Conventional ideas on the likely sequence of events remain reasonably intact. Other planets may have their abiogenesis events, with any possibility of panspermia increasingly unlikely with distance. For example, any life discovered in the Enceladan ocean would be a local event with a biology different from Earth’s.

The harder consequences follow from assuming the short timeline for abiogenesis is correct. What are the implications?

First, it strengthens the argument that under the right conditions, life emerges very quickly. While we do not know exactly what those conditions are, it does suggest that our neighbor Mars, which has evidence of surface water as lakes and a boreal sea, could also have spawned life. As Mars did not suffer an early collision like the one that formed the Moon, its water bodies may predate the oceans on Earth by another 100 my. As Mars’s gravity is lower than Earth’s, impacts could more easily eject material from its surface, and the transfer of material containing any life might have seeded Earth with life.

If we find life in the subsurface of Mars’s crust, it would be important to determine if its biology was the same as or different from Earth’s life. If different, that would be the most exciting result, as it would argue for the ease of abiogenesis. If the same, then a common origin is possible. The same applies to any life that might be found in the subsurface oceans of the icy moons of the outer planets. Different origins imply abiogenesis is common. Astrobiologist Nathalie Cabrol seems quite optimistic about possible life on Mars, and on any [dwarf] planet with a subsurface ocean [8]. Radiogenic heating can also ensure liquid water on planets that are well outside the traditional habitable zone (HZ) [10].

If abiogenesis is common, then we should detect biosignatures in many exoplanets in the HZ with the conditions we expect for life to start and thrive. Carr has suggested, rather controversially, that Mars was the better environment for abiogenesis, and therefore terrestrial life was due to panspermia from Mars [5].

What if the rest of the solar system is sterile, with no sign of either extant or extinct life? This would imply the conditions on Earth suitable for abiogenesis are narrower than we thought, which would suggest exoplanet biosignatures would be rarer than we might expect from the detected conditions on those worlds.

The last option is one we would prefer not to be the case if the aim is to work on how abiogenesis occurred on Earth. This option is to accept that LUCA appeared after just a few hundred million years, but that this time was too short. It would imply that the location of abiogenesis, however it occurred, was not on Earth. It would imply that the same probably applies to other bodies in the solar system and therefore life originated in another star system.

Leslie Orgel and Francis Crick’s early suggestion was that terrestrial life was spawned by panspermia [4]. Would that derail studies on the origin of life, or would such studies go on assuming only plausible terrestrial conditions? How would we determine the truth of panspermia? I think it could only be demonstrated by sampling life on exoplanets and determining that they all shared the same biology fairly exactly. The consequences of that might be profound.

A last thought, one that surprised me in my thinking about the time for abiogenesis being seemingly impossibly short: Cabrol states, with no supporting evidence [9]:

…how much time it takes for the building blocks of life to transition to biology… estimates range between 10 million years and as little as a few thousand years.

If true, then life could appear anywhere with suitable conditions, however transient those conditions are. What state that life would be in, for example protocells or some other state prior to LUCA, is not explained [but see Figure 3]. If the estimate is correct, however, it appears to offer more time for LUCA to evolve. That is indeed food for thought.


