Each copy of the human genome consists of about 3,200,000,000 base pairs, and includes about 500,000 repeats of the LINE-1 transposable element (a LINE) and twice as many copies of Alu (a SINE), as compared to around 20,000 protein-coding genes.

Whereas protein-coding regions represent about 1.5% of the genome, about half is made up of LINE-1, Alu, and other transposable element sequences. These begin as parasites, and some continue to behave as detrimental mutagens implicated in disease. However, most of those in the human genome are no longer mobile, and it is possible that many of these persist as commensal freeloaders.
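To make the proportions concrete, the rough figures quoted above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: every input is the approximate value given in the text, the variable names are illustrative, and the results should be read as orders of magnitude rather than precise counts.

```python
# Approximate figures quoted in the text; none of these are exact.
GENOME_BP = 3_200_000_000        # base pairs per copy of the human genome
LINE1_COPIES = 500_000           # LINE-1 repeats
ALU_COPIES = 2 * LINE1_COPIES    # "twice as many copies of Alu"
PROTEIN_CODING_GENES = 20_000

# Integer arithmetic avoids floating-point rounding on the percentages.
coding_bp = GENOME_BP * 15 // 1000   # about 1.5% is protein-coding
te_bp = GENOME_BP // 2               # "about half" is transposable elements

# LINE-1 and Alu copies alone, per protein-coding gene.
line1_alu_copies_per_gene = (LINE1_COPIES + ALU_COPIES) / PROTEIN_CODING_GENES

# How much more sequence is transposable-element-derived than protein-coding.
te_to_coding_ratio = te_bp / coding_bp

print(f"protein-coding sequence:       ~{coding_bp:,} bp")
print(f"transposable element sequence: ~{te_bp:,} bp")
print(f"LINE-1 + Alu copies per gene:  ~{line1_alu_copies_per_gene:.0f}")
print(f"TE-to-coding sequence ratio:   ~{te_to_coding_ratio:.0f} to 1")
```

On these numbers, LINE-1 and Alu copies alone outnumber protein-coding genes by roughly 75 to 1, and transposable element sequence exceeds protein-coding sequence by roughly 33 to 1.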

Finally, it has long been expected that a significant subset of non-coding elements would be co-opted by the host and take on functional roles at the organism level, and there is increasing evidence to support this. A notable fraction of the non-genic portion of human DNA is undoubtedly involved in regulation, chromosomal function, and other important processes, but based on what we know about non-coding DNA sequences, it remains a reasonable default assumption -- though one that should continue to be tested empirically -- that much or perhaps most of it is not functional at the organism level.

This does not mean that a search for the functional segments is futile or irrelevant -- far from it, as many non-genic regions are critical for normal genomic operation and some have played an important role in many evolutionary transitions. It simply means that one must not extrapolate without warrant from discoveries involving a small fraction of sequences to the genome as a whole.

More generally, it has been known for more than 50 years that the total quantity of DNA in the genome is linked to nucleus size, cell size, cell division rate, and a wide range of organism-level characteristics that derive from these cytological features. Thus, large amounts of DNA tend to be found in large, slowly dividing cells, which in turn typically make up the bodies of organisms with low metabolic rates, slow development, or other such traits.

On this basis alone, one would expect to see consequences for the organism if a large quantity of non-coding DNA were eliminated from or added to the genome, even if most of the particular elements in question were neutral or detrimental under normal circumstances. Non-functional is not equivalent to inconsequential.

This is especially true when there are factors operating at different levels, for example when an abundant and diverse collective of entities includes components that are variously neutral, beneficial, and detrimental to a host.

Though they cannot prove an argument, analogies are often useful for understanding an issue. In this capacity, consider the following:
  • There are roughly 10^13 to 10^14 individual microorganisms living in your digestive tract (Gill et al. 2006), which is on par with, or perhaps even 10 times larger than, the number of cells making up your own body. It is also two or three orders of magnitude larger than the number of humans who have ever lived, and than the number of stars in the Milky Way galaxy.
  • The assemblage of microorganisms in your intestines comprises some 500 species, most of which have never been cultured in the lab or studied in detail (Gilmore and Ferretti 2003). To put this diversity in perspective, there are only about 5,000 species of mammals on Earth today.
  • The combined "metagenome" of the microorganisms in your gut contains at least 100 times as many genes as your own genome (Gill et al. 2006).
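The "two or three orders of magnitude" comparison in the first point can be checked with a short calculation. This is a sketch under stated assumptions: the microbial range is the one quoted above, but the figures of about 10^11 for humans who have ever lived and for stars in the Milky Way are not given in the text. They are rough, commonly cited orders of magnitude that I am assuming for illustration.

```python
import math

# Range quoted in the text (Gill et al. 2006).
MICROBES_LOW, MICROBES_HIGH = 10**13, 10**14

# Assumptions for illustration only: rough orders of magnitude,
# not values taken from the text.
HUMANS_EVER_LIVED = 10**11
MILKY_WAY_STARS = 10**11


def orders_of_magnitude(larger: int, smaller: int) -> float:
    """Return how many powers of ten separate two quantities."""
    return math.log10(larger / smaller)


for label, reference in [("humans ever lived", HUMANS_EVER_LIVED),
                         ("Milky Way stars", MILKY_WAY_STARS)]:
    low = orders_of_magnitude(MICROBES_LOW, reference)
    high = orders_of_magnitude(MICROBES_HIGH, reference)
    print(f"gut microbes vs. {label}: {low:.0f} to {high:.0f} orders of magnitude")
```
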
We do not know the specific characteristics of many of the microorganisms in the gut. However, we do know that at least some of them are essential, or at least highly beneficial, for human health. Several of the species found in the gut are important mutualists, assisting with digestion and in return drawing nutrients from the food that we consume.

In this sense, it is hard not to agree with Gill et al. (2006), who argue that "humans are superorganisms whose metabolism represents an amalgamation of microbial and human attributes".

The question is, are all 10,000,000,000,000+ microbial cells that we carry with us functional for our well-being? Some certainly are. But many, maybe even most, are probably commensal freeloaders who neither harm nor benefit us, though of course their total abundance is limited to what can be carried by the host without deleterious consequences.

By contrast, some gut bacteria are implicated in gastrointestinal disorders. A few are actively parasitic, but their numbers may be kept in check by our own immune system or through competition with non-pathogenic species, or because they kill the host or are killed by antibiotics. Some, such as the well-known Escherichia coli, can be harmless or deadly depending on the presence of particular genes. Thus, the total number of microorganisms, and the relative diversity of species that this encompasses, is influenced by a complex interaction of factors internal to the gut (e.g., who invades, which microorganisms are already present, how efficiently they reproduce) and higher-level conditions (e.g., human immune response, dietary effects on which nutrients are present, positive or negative effects on the host).

What we know about bacteria and other microorganisms makes for a reasonable default assumption that much or even most of what is found in the gut is not there because it provides a direct benefit to humans. On the flip side, we have good reason to expect that some, perhaps even a large fraction, of these organisms are beneficial.

Therefore, we require evidence to show that any particular species is functional from the human point of view, and that its abundance is determined on this basis. The search for such evidence is important, but it occurs against a backdrop of realizing that bacteria could be there for their own benefit only, whether or not that has any adverse effects on our well-being as hosts.

Establishing that a specific strain of bacteria in the digestive tract is beneficial does not justify the conclusion that all bacteria in the gut are mutualistic. It does not even imply that all individuals of the helpful strain are essential, because the optimal abundance for the host and the pressures for reproduction of the microorganisms may not converge on the same quantity.

If one were to remove the microorganisms from the gut, or to significantly alter their species composition or abundance, one would expect to see consequences for host health. This would be true even if most of the particular organisms in question were neutral or detrimental in normal circumstances.

As with non-genic elements in the genome, this means that even if many organisms in the gut are non-functional from the host's perspective, their presence is not inconsequential for the biology of an animal carrying them.