Horizontal transfer of genes between microorganisms

Horizontal Gene Transfer (HGT) describes the lateral movement of genes between organisms. In contrast to the vertical transmission of genes during organismal reproduction, genes transferred horizontally do not always become genes that pass to organismal offspring. HGT is also known as infectious transfer, exemplified by viruses and plasmids. The three classical mechanisms of HGT in microbes are transformation, the uptake of nucleic acids; transduction, virus-mediated gene transfer; and conjugation, plasmid-mediated gene transfer. Conjugation alone probably occurs between all bacteria, and between bacteria and plants, animals, and fungi. Possibly all organisms, microbial or not, are significantly affected by HGT.


I. THINKING ABOUT GENES, NOT GENOTYPES 
It is difficult for biologists, and particularly for geneticists, to consider genes separately from organisms and phenotypes. The conceptual independence of genotype and phenotype was first introduced only in the early twentieth century by W. Johannsen, who coined the term “gene.” That revolution in thought is now most pertinent to those thinking about evolution. Although genes are still discovered by their effects on organismal phenotype, these effects are only indirectly related to the history and ancestry of the gene and organism. Genes that are parts of chromosomes reproduce when the chromosome is replicated. Chromosomal replication is tightly associated with organismal reproduction. If the offspring die at any stage of the reproductive process, from incorporation of the first nucleotide at a replication fork to their last encounter with a predator, then all the genes on the chromosomes of that organism are at an evolutionary end. Genes in chromosomes share the fate of the organism. To the extent that gene reproduction is synchronous with organism reproduction, the evolution of particular genes and organisms is undoubtedly due to the effect of the gene on the organism. However, not all genes reproduce in this vertical fashion, that is, synchronously with organisms, or at least not all of the time. Some reproduce horizontally by transferring between organisms. When that transfer results in a gene that will be reproduced vertically, that is, inherited by the recipient organism, then the gene has been horizontally transmitted. When genes can reproduce horizontally, they can evolve somewhat independently of their effects on the host. If genes transfer horizontally at rates that exceed vertical reproduction, then it is possible for genes to evolve functions that cannot be determined by studying the effects of the gene on a host.
This last point is especially relevant if the host–gene relationship is studied under conditions where the gene is mostly confined to reproducing at the rate of the host. For HGT to contribute to the evolution of genes, it must occur: (a) at least infrequently, but produce strongly selected phenotypes in organisms (and probably leave records of the event through maintenance of particular DNA sequences in organismal descendants); (b) so frequently that most often the effects on organisms are unimportant to the genes reproducing horizontally (and particular DNA sequences may not accumulate in offspring); (c) frequently, but leave short nucleotide sequence records in organisms; or (d) infrequently most of the time and extremely frequently for short periods of time (e.g., during the age when mitochondria and chloroplasts first entered the ancestors of most eukaryotes). Of course, these various possibilities are not mutually exclusive. The remainder of this article will be devoted to reviewing the mechanisms of HGT, barriers to gene transmission, estimates of transfer/transmission rates, and the difficulty of determining such rates. The article will conclude with considerations of the importance HGT has for studying evolution and assessing the risk of new biotechnologies, including the introduction of new antibiotics and genetically modified organisms. 


II. THE WAY GENES REPRODUCE BETWEEN ORGANISMS 
A. Mechanisms
Genes are transferred between organisms by three known routes: transformation, transduction, and conjugation. Transduction and conjugation are conducted by viral and plasmid vectors, respectively. These vectors are themselves groups of genes that reproduce horizontally, possibly far more often than they do vertically. Transformation may have a vector of sorts, such as membrane-bound vesicles, escort proteins, or “uptake sequences.” The vectors are sometimes also transmitted, but at other times the vectors are only transferred with the genes. For example, transducing viruses can package chromosomal or plasmid DNA during infection or incorporate nonviral DNA into their own genome. Subsequent infections can result in a new host receiving anything from all to none of the virus, with subsequent incorporation and inheritance of the transferred nonviral DNA. Transmission by transformation and transduction is often limited to closely related organisms because these mechanisms usually require DNA–DNA recombination and, in the case of transduction, DNA delivery is mediated by viruses that may infect only a small number of species. It would be premature, however, to exclude the contribution of the growing number of broad-host-range viruses being described, or to equate the transfer range of viruses with the more limited range of hosts that support their infectious cycle. Of the three mechanisms, conjugation can move the largest DNA fragments (as much as an entire bacterial genome may be moved in one conjugative encounter). Conjugation also has the broadest known transfer range, mediating exchanges between all eubacteria and transfers from prokaryotes to eukaryotes, including human cells. Conjugation, a process determined by plasmids or transposable elements, is not usually dependent on homologous recombination to achieve the formation of a recombinant.
B. Host ranges
Comparing the sequence of particular genes in different organisms has become a taxonomic tool for inferring organismal homology. Comparisons are complicated by DNA sequences that suggest a lineage different from other genes in the same organism. If such a gene has been acquired by horizontal transmission, then it would truly be a rogue, and its exceptional sequence signature could be explained. The origin of genes by HGT has often been discounted, however, when the host range of known vectors is thought not to overlap with the putative donor and recipient species involved and no other vector or ecological relationship is obvious. R. F. Doolittle (1998) calls this the “opportunity” factor. “The possibility for gene transfer is often given wider berth whenever parasitism, symbiosis or endosymbiosis is involved.” Host range determinations are generally the result of studies that require a vector or gene to be transmitted in order to determine retrospectively whether genes had transferred. Thus, when certain plasmids or viruses do not cause demonstrable infections in an organism, they are assumed not to have transferred to that organism. The history of host range studies using the Ti plasmid of Agrobacterium tumefaciens and the conjugative plasmids of the Gram-negative bacteria, like IncF and IncP, illustrates the lesson well.



The transfer of DNA from A. tumefaciens to dicotyledonous plants is determined by the Ti plasmid. Indeed, that process, which results in tumors in certain susceptible plant species, was prematurely thought to be limited to those species. When the relevant DNA was supplied with sequences that would maintain it in other species, the host range was extended to monocots and then to fungi. The mechanism of Ti-mediated DNA transfer is biochemically and genetically equivalent to bacterial conjugation. Since the two mechanisms were equated, it has been demonstrated that even mundane bacterial conjugative plasmids transfer to eukaryotes. Once again, demonstration required engineering the plasmids with a selectable marker and a strategy for replication in the eukaryotic host (either replication autonomous from the chromosomes or integration into the chromosomes). These changes had no obvious effect on transfer. Thus, the transfer range of genes can be remarkably different from host range. A recent finding that a DNA virus that infects animals evolved via recombination between a DNA virus that infects plants and an RNA virus that infects animals provides an illustration for the viruses (Gibbs and Weiller, 1999). The plant virus must have been able to transfer to animals without causing an obvious phenotype. The many transfer events preceding the evolution of the new variant virus were not detected by selecting or observing a recombinant animal (and possibly would not have been detected even with current DNA amplification technologies). The transmission event could be detected, but it provides no quantitative information about the frequency of transfers of the original virus to animals. Clearly, Ti and some viruses can mediate transfer to more species than normally display the effects of that transfer, because those effects are not frequently heritable or selectable.
The “opportunity” factor in determining the likelihood of HGT is often less amenable to test than DNA sequence comparisons, making opportunity a very distant secondary consideration for evaluating the possibility of HGT. 



III. GENE ARCHAEOLOGY 
Determining which genes are most closely related, and how that reflects organismal relationships, is difficult. The underlying assumption in sequence comparisons is that sequence is the best indicator of homology. The difficulty of establishing relatedness independently of sequence information makes testing the proposition problematic. Even when sequence information is available, the identity and history of the gene are not always obvious (Fig. 50.1). Comparisons must be made between homologous genes and not just genes with homologous names, that is, those with functional similarity (discussed by Doolittle in Horizontal Gene Transfer, 1998). For example, did two genes with similar structure and function diverge from a single sequence or converge from very different sequences? Defining genes with sufficiently similar sequences as homologous begs the question of the adequacy of the criteria for determining homology. Those who compare sequences have to beware of several common problems, as will be discussed.
A. Sequence evidence of ancestry
1. Orthologous or paralogous?
Are the genes orthologous, having diverged when the species diverged, rather than paralogous, having diverged since a duplication? The differences between orthologs can represent the divergence of the two organisms from a common ancestor. When gene duplication yields paralogs, each copy can change separately, reflecting the history of the genes within, instead of between, lineages. Identifying which paralogs in different organisms reflect organismal histories is sometimes done by assuming that the most similar of the paralogs in the different species are the orthologs. However, accepting this a priori assumption can render the comparison redundant. Moreover, that test can never be independently verified because, no matter how close the sequence match, the possibility that one species lost the true ortholog since the speciation event can never be excluded.
Indeed, the genes may be orthologous but, because the gene was transferred horizontally, the comparison misrepresents the ancestry of the organism. Mistakes have been made by the inadvertent comparison of paralogs (discussed by Doolittle, 1998).
2. Mutation or recombination?
Do the differences and similarities in sequences result from historical events other than time since divergence? Sequences can be maintained by selection or diverge rapidly through selection at rates that differ from other genes in the same lineage. Recombination can also maintain or disrupt sequences. Recombination, particularly with sequences obtained horizontally, can convert a portion of a DNA sequence.


The resulting mosaic could have an overall average sequence similarity that supported the phylogeny suggested by comparing other genes between the organisms but actually be the product of genes that evolved independently. Reports of mosaic genes are becoming increasingly common. Perhaps one of the most instructive examples of the impact of HGT on the evolution of organismal phenotypes is the story of β-lactam resistance evolution in Neisseria and Streptococcus (Fig. 50.2). Alterations in their target penicillin-binding proteins (PBPs) are a common and growing means of resistance to the drugs. Not only should this type of resistance arise slowly, if at all, but it should be species-specific. The reasons for these expectations follow from the nature of the drugs and the mechanisms of resistance. β-Lactams bind to several different and essential PBPs; binding to any one is usually sufficient for therapy. Thus, each PBP must change to confer phenotypic resistance. Moreover, a single point mutation that both conferred resistance and maintained protein function may be impossible. Thus, up to 4 genes (as in Streptococcus pneumoniae) must each accumulate at least two changes simultaneously for the PBPs to retain function and for a cell to display phenotypic resistance. Based on a simple calculation using a reasonable mutation rate of around 10⁻⁹ per base per generation, the absurdly small probability of a pre-existing penicillin-resistant strain would be 10⁻⁷² per cell per generation, that is, (10⁻⁹)⁸ for 4 genes × 2 changes each. How did these pathogens become resistant? Individual PBPs of resistant pathogens are composed of segments of homologous PBPs from up to three different species! HGT has done what is a probabilistic impossibility for vertical evolution by mixing parts of different PBPs from various species of bacteria that have individual PBPs with low binding affinities for β-lactams.
3. Big rare events or small common events?
How large does a horizontally transmitted sequence have to be to reveal itself as having been acquired horizontally? Herein lies the most difficult problem in both assessing the validity of gene comparisons for determinations of ancestry and determining the rate and extent of horizontal gene transmission. A horizontally transferred gene preserved in a vertically reproducing lineage may be identified by the anomalous phylogenetic tree its sequence creates, a significant deviation from the average G+C content of the host, an unusual organization of genes, or a deviation from the normal codon bias of the host.
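One of the indicators named above, deviation from the host's average G+C content, can be illustrated with a minimal sliding-window sketch. The function names, the window and step sizes, and the z-score cutoff are all illustrative assumptions rather than values taken from the text, and a real analysis would combine several indicators.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_windows(genome, window=1000, step=500, z_cutoff=2.0):
    """Return (start, gc, z) for windows whose G+C content deviates from
    the mean of all windows by at least z_cutoff standard deviations.

    window, step and z_cutoff are hypothetical defaults for illustration.
    """
    starts = range(0, len(genome) - window + 1, step)
    values = [(s, gc_fraction(genome[s:s + window])) for s in starts]
    if not values:
        return []
    gcs = [gc for _, gc in values]
    mu = mean(gcs)
    sigma = pstdev(gcs)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing stands out
    flagged = []
    for start, gc in values:
        z = (gc - mu) / sigma
        if abs(z) >= z_cutoff:
            flagged.append((start, gc, z))
    return flagged
```

As the surrounding text cautions, a screen of this kind only detects long tracts with large deviations; short or compositionally ordinary inserts pass unnoticed, and older inserts may have drifted toward the host norm.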


Most indicators are quantitative. Thus, they are useful if large tracts of sequences with one or more large deviations from accepted norms are being analyzed. The origin of sequences becomes increasingly difficult to determine the shorter they are or the closer to accepted ranges they appear to be. The proper conclusion in those cases is uncertainty as to how much of the sequences’ reproductive history has been either horizontal or vertical. Moreover, some of the characteristic differences between transferred and endogenous genes, like codon bias and G+C content, can change to more closely resemble organismal norms with time. Sequence comparisons limit analysis both to large tracts of DNA and to recently transmitted genes. How many bases are retained in the average HGT event? Horizontally transferred sequences are retained through vertical reproduction only by selection or chance. Sequences that reduce fitness will likely disappear with the organism. Protein coding sequences, the source of most DNA sequences from organisms, are the least likely repository of horizontally acquired sequences. Selectable changes could be preserved in gene-regulating sequences or in sequences important for chromosome structure. Those preserved by chance are most likely to accumulate in regions already known for their high rates of change: intergenic regions and junk. Overall, only sequences much shorter than necessary to encode a protein should normally be retained in organisms. Transferred sequences that must ultimately recombine with those in chromosomes may also be excluded by the homologous recombination and mismatch repair machinery (Table 50.1). Although such enzymes are generally necessary for efficient recombination, they also discourage the retention of dissimilar sequences. The E. coli RecA protein, for example, aborts recombination between sequences with less than an overall 90% identity or more than three mismatches in a row.
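The two RecA thresholds quoted above (at least 90% overall identity, no more than three consecutive mismatches) can be expressed as a simple filter over a pair of pre-aligned sequences. This is only a sketch of the stated rule, not a model of RecA biochemistry; the function name and the requirement of equal-length, gap-free input are assumptions made for illustration.

```python
def passes_reca_like_filter(donor, recipient,
                            min_identity=0.90, max_mismatch_run=3):
    """Apply the two thresholds stated in the text to aligned sequences.

    Returns True when overall identity is at least min_identity and no
    run of consecutive mismatches is longer than max_mismatch_run.
    Assumes gap-free, equal-length input (a simplification).
    """
    if not donor or len(donor) != len(recipient):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = 0
    current_run = 0
    longest_run = 0
    for a, b in zip(donor.upper(), recipient.upper()):
        if a == b:
            matches += 1
            current_run = 0
        else:
            current_run += 1
            longest_run = max(longest_run, current_run)
    identity = matches / len(donor)
    return identity >= min_identity and longest_run <= max_mismatch_run
```

Under this rule a tract can be rejected even at 96% identity if four mismatches happen to be adjacent, which is one way to see why incorporated tracts are expected to be short.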
Taking the biochemical stringency of RecA as roughly representative of most organisms, then the tracts of sequences incorporated will, in the main, be short, even if the donor and recipient are closely related. The tracts of heterologous sequences, as much as 24% diverged, in the mosaic PBPs of Neisseria meningitidis and S. pneumoniae, stretch hundreds of base pairs. Given the barriers to transmission, large mosaic structures should be created extremely rarely. Their existence is evidence of gene transfer frequencies on a scale large enough to produce such unlikely recombination events randomly. Mutations or physiological variation in mismatch repair function (the gene products that correct mispaired bases that arise from polymerase errors or recombination between homologous but not identical sequences of DNA) dramatically increase the mutation rate of individuals and similarly decrease the barrier to interspecies recombination. For example, reducing the stringency of sequence comparisons made by mismatch repair enzymes dramatically increases the frequency of recombination in interspecies crosses of E. coli and Salmonella typhimurium (discussed in Matic et al., 1996). Similar effects have been noted in eukaryotes as well as prokaryotes (Kolodner, 1996). The mismatch repair genes themselves are highly mosaic. This is to be expected because rare mutations in mismatch repair genes would make the organism more likely to retain recombination products created by horizontally transferred DNA from other organisms. When such recombinations also restored mismatch repair function, then the recombinant organism would regain its controlled mutation rate (Denamur et al., 2000). HGT can increase the proportion of organisms within populations that have high mutation rates due to defects in mismatch repair functions by up to 10,000-fold in some environments (Funchain et al., 2001). 


Thus, transient mutator phenotypes, caused by mutational or physiological cycling of mismatch repair stringency, may impact gene transfer profoundly but frequently only over short stretches of DNA, and may only rarely introduce a new gene in its entirety while simultaneously restoring, or creating new, biochemical functions.
4. Homology more than sequence
Is sequence structure necessarily conserved by evolution? The genomes of horizontally mobile elements (HMEs) tend to be fluid. Over time, particular HMEs are replaced by relatives carrying different genes, such as plasmids with an expanding repertoire of antibiotic resistance genes. Tracing the ancestry of HMEs by structure allows relationships to be determined over only very short periods of time. For example, the structure of infectious retroviruses can change at 10⁴–10⁶ times the rate of other genes and of defective retroviruses reproducing in synchrony with the host. The phylogeny of these viruses is confined to tracing the residues of defective viruses trapped in chromosomes of organisms or monitoring divergences on decade scales. This observation led Doolittle et al. (1989) to lament that “As is the case in the rest of the biological world, rapid evolutionary change appears to be associated with rapid extinction.” That statement cannot help but be true if the HME is defined only by its primary nucleic acid structure. Although contrary to their preferred conclusion, they did acknowledge that “the remarkable constellation of enzymes and structural proteins that constitute infectious retroviruses may have been assembled . . . in the quite recent past . . . 


The erratic occurrence of retrovirus-like entities in the biological world could be the result of widespread distant horizontal transfers” (Doolittle et al., 1989).
B. Genome structure
How would the ancestry of organisms be determined without, or with much less, reliance on nucleic acid or protein sequences? In the cases of ribonuclease H (Doolittle et al., 1989) and the A8 subunit of mitochondria (Jacobs, 1991), ancestry was inferred from three-dimensional structures or other biophysical characteristics because the primary nucleic acid or amino acid sequences had lost such information. Approaches that rely only indirectly on sequences are emerging. The overall composition of genes carried by organisms, their relative positions, and their regulation establish grouping patterns. As discussed above, determining the impact and extent of HGT will require looking at the effects of the process rather than at conservation of particular sequences. The recently introduced “Competition Model” (Cooper and Heinemann, 2000) and “Selfish Operon Model” (Lawrence and Ochman, 1998) are producing robust tests of predictions of genome organization. The Selfish Operon Model explains the organization of prokaryotic genes in operons as the result of HGT. The genes collected into operons are, and would need to be, nonessential for survival in at least one environment or only weakly selected, like traditional HME-borne genes such as those for antibiotic resistance and novel virulence traits. Operon organization is not selected by clonal dissemination of hosts benefiting from this particular organization of their genes. Instead, the collection of genes in the operon reproduces faster by horizontal than by vertical reproduction, and the genes are preserved in organisms when they transfer together. The genes in operons are functionally codependent. Individually, they provide no selective benefit to a cell.
So whereas they may be transferred horizontally as individual genes, they are lost in time from vertically reproducing lineages. As an example, imagine that the genes of the lac operon were once distributed around the chromosome of some ancient bacterium. That bacterium could survive occasional deletions of some intervening nonessential genes, rearrangements of genes by intrachromosomal recombination and transposition. Eventually, the genes of the lac operon may have come close enough together to be mobilized by a single HME or efficiently taken up by transformation. As the lac genes together could provide a selective benefit to a recipient cell that none of the genes could provide individually, then the recipient that maintained the cluster would be selected. Over time, the new lineage and other recipients of the transferred cluster could tolerate occasional deletions of material between the lac genes until the modern minimal structure of the operon emerged. Thus, the genes were maintained vertically because of their contribution to the organismal phenotype but were organized in such a way as a result of HGT. For the model to work, HGT must occur frequently, with occasional retention of particular sequences in vertically reproducing lineages. 


The Competition Model, which seeks to explain HME organization, supports Selfish Operon expectations of high gene transfer frequencies. Normally, microbes are cultured under conditions that favor organismal reproduction. In tests of the Competition Model, conditions that favored HME reproduction were maintained. Sometimes gene transfer was allowed to occur as fast as or faster than organismal reproduction. When multiple HMEs were mixed under such conditions, individuals with strategies to eliminate other HMEs emerged and dominated. The phenotypic expression of the genes (e.g., antibiotic resistance genes) that made HMEs successful during horizontal competition was sometimes detrimental and sometimes beneficial to the host. Studying those genes during clonal culture of the host, especially in the absence of competing HMEs, can lead to significantly different perceptions of their function and evolutionary history.


IV. ESTIMATING RATES
The emergence of microbial cells as legitimate entities in biology was delayed by centuries because of their size. The scale of the microbial community is still largely unknown. Similarly, the recalcitrance of horizontally transferred genes to study using existing technology has led to the untenable impression that they are rare. Judging from transmission rates, HGT could be quite common. For instance, it has been estimated that 18% of the E. coli genome was acquired horizontally since it diverged from S. typhimurium 100 million years ago (Lawrence and Ochman, 1998). As much as 5% of the mammalian genome is contained between copies of the two long terminal repeats characteristic of retroviruses. As much as 30% of the mammalian genome, and 10% of the human, was created by the action of reverse transcriptase. Plant genomes may be almost half the product of reverse transcriptase.
Twenty-three percent of the human major histocompatibility complex class II region is of retroviral origin, strongly suggesting that transfer alters important genetic characteristics.
A. Organism clonality
How frequently could genes be reproducing horizontally? The apparent clonality of many microorganisms amenable to the techniques used to estimate diversity would seem to support the perception of a vertical world.



Evidence of clonal distribution is inappropriate evidence for conclusions about horizontal transfer, though, for several reasons. First, the number of clonal types is evidence, not of the amount of recombination that occurs between cells, but of the amount of recombination and subsequent selection of particular individuals. Second, the techniques for estimating diversity, which rely on alterations in genome sequences or protein conformations detectable by electrophoresis (e.g., restriction fragment length polymorphism (RFLP) and isozyme analysis), focus at a level of resolution that cannot detect recombination of short nucleotide sequences. Finally, the techniques are preoccupied with chromosomal characteristics. The mosaic structure of plasmids within bacteria argues for extensive interstrain recombination that is not preserved in the chromosomes. Whereas a lack of clonality, when it exists, can be concluded from such analyses, apparent clonality cannot at present preclude still enormous frequencies of HGT.
B. Viruses
Viruses in the environment provide some insight into the amount of horizontal gene flow. Since the viral life cycle can include a stable, extracellular period that other HME life cycles may not, the viruses are uniquely amenable to monitoring. The summertime free viral load of the world’s oceans has been estimated at between 5×10⁶ and 1.5×10⁷ per ml at up to 30 m depth. These viral titers were 5–100 times the estimated concentration of organisms. Up to 70% of marine prokaryotes are infected by viruses at any given time. Each of these infected hosts produces an estimated 10–100 new viruses. The absolute numbers vary somewhat between studies and seasons, but are consistent with counts in fresh water of 10⁴ Pseudomonas- and Chlorella-specific viruses per ml. Finding viruses is difficult without the ability to culture each possible virus; counting the number of plasmid and transposon generations is, at present, impossible.
Still, the ease of isolating plasmid-infected bacteria from natural habitats would suggest that plasmids and transposons are replicating no less than viruses. Summing the volume of water in the top 30 m of the oceans, and adding a factor for viral turnover on land and the contribution of nonviral HMEs, produces an admittedly crude estimated HGT frequency of 10³⁰ per day. The limited technologies available for surveying and then unambiguously culturing to purity the units that mostly reproduce horizontally leave us with an exaggerated awareness of cellular life.
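The first step of the crude estimate above, the standing stock of free viruses in the top 30 m of the oceans, can be checked with a short back-of-envelope calculation. The viral titers and the depth come from the text; the global ocean surface area (roughly 3.6×10¹⁴ m²) is an outside figure assumed here, and applying a single titer range to all ocean water is a deliberate simplification. The turnover, land, and nonviral HME factors mentioned in the text are not modeled.

```python
# Assumption (not from the text): approximate global ocean surface area.
OCEAN_AREA_M2 = 3.6e14
DEPTH_M = 30.0                      # depth quoted in the text
ML_PER_M3 = 1e6                     # unit conversion: 1 m^3 = 10^6 ml
TITER_LOW_PER_ML = 5e6              # lower viral titer quoted in the text
TITER_HIGH_PER_ML = 1.5e7           # upper viral titer quoted in the text


def standing_stock(titer_per_ml, area_m2=OCEAN_AREA_M2, depth_m=DEPTH_M):
    """Free virus particles in a surface layer of the given area and depth."""
    volume_ml = area_m2 * depth_m * ML_PER_M3
    return titer_per_ml * volume_ml


low = standing_stock(TITER_LOW_PER_ML)
high = standing_stock(TITER_HIGH_PER_ML)
print(f"standing stock: {low:.1e} to {high:.1e} free viruses")

# Combined factor (turnover, land, nonviral HMEs) that would be needed to
# reach the 10^30 per day quoted in the text, under these assumptions.
print(f"implied combined factor: {1e30 / high:.0f} to {1e30 / low:.0f} per day")
```

Under these assumptions the standing stock alone comes within roughly an order of magnitude of the quoted daily figure, so the remaining factors need only contribute a modest multiple.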



V. THE RISKS AND CONSEQUENCES 
A. Effects of transfer versus transmission
The difference between transfer and transmission affects more than experimental design. The difference is the essence of the debate over the risk of gene escape from genetically modified organisms and the evolution of resistance to new-generation antimicrobial agents. In both cases, we are interested in the formation of genotypes that produce organismal phenotypes better avoided. Experiments that have failed to detect the exchange of genes between organisms because no recombinants were detected have suffered from two important flaws. First, the scale of the experiments and the exposure to HME invasions were both too limited to represent the fate of genes in time and on global scales. The pace and extent of antibiotic resistance evolution is a clear indication of how extremely rare events (10⁻⁷²) can become certainties for populations as large as those of the HMEs and microbial cells. Second, preservation of horizontally transferred genes is the last step in the process of generating recombinant organisms. The rate-limiting step in gene transmission is compromising the barriers to inheritance and expression of recombinant genes. Thus, recombinants are an inaccurate way of assaying gene transfer. Unless a risk assessment experiment can be conducted under constant selection with a flux of organisms and vectors, the effect of transfer on potentiating the creation of recombinant types cannot be estimated.
B. Potentiating conditions
Other molecules are transferred concomitantly with nucleic acids. Viruses and conjugative plasmids carry proteins into recipients. In natural vesicle-mediated transformation, proteins and DNA enter recipient cells. Except for prions, these other types of transferred molecules are genetically inert because they do not direct their own replication. Nevertheless, escort molecules, whether they be proteins or polymers of nucleotides, can instigate heritable states de novo.
For example, proteins, double-stranded RNA (e.g., RNAi, siRNA) and the methyl groups on DNA provoke heritable changes in gene expression patterns. The transfer of the E. coli RecA protein to recA⁻ λ lysogens by conjugative plasmids can activate the latent virus, as can the transfer of damaged DNA. Once activated, the virus remains active until a new environmental signal causes its conversion to a prophage. In mammalian cells, the introduction of DNA fragments with preexisting methylation patterns different from the pattern on the homologous endogenous sequences can cause the methylation or demethylation of chromosomal sequences. In all eukaryotes so far tested, RNAi effects can be transmitted not just horizontally from cell to cell within an organism, but across generations (Cogoni and Macino, 2000). Thus, HGT processes are capable of more than the creation of recombinants. The impact, surely underestimated, includes the use of transferred nucleic acids for recombination with, and repair of, endogenous genes, the creation of recombinants with novel and potentially selectable phenotypes, and the potential to alter gene expression and heritable states.
