Comprehensive and Updated Computer Programming!: Escherichia coli and Salmonella, Genetics

Over 2300 wild-type (naturally occurring) varieties of Salmonella are known, and an estimated 10,000 or more (perhaps more than 100,000) wild-type varieties of Escherichia coli exist as well. There is overwhelming evidence that this striking diversity arose from a complex spectrum of vertically inherited mutational events and horizontal gene transfer events over the course of evolution, since the time these two species diverged from a common ancestor approximately 150 million years ago. The resulting population of genetic variants includes mostly innocuous inhabitants of avian, human, and other animal intestinal tracts (E. coli cells normally constitute about 1% of the bacterial flora in the human intestine), and also a large number of highly pathogenic variants that are able to invade the intestine, blood, and other organs, exacting a huge toll in terms of disease and death (there are 3.6 million deaths annually from Salmonella infections worldwide) and the associated economic burden (over a billion dollars annually in the United States alone). Certain strains of E. coli and Salmonella have served since the 1940s as laboratory workhorses for studies of mutation, physiology, gene transfer, and fundamental life processes such as mechanisms of genetic inheritance, macromolecular synthesis, biosynthetic pathways, and gene regulation. The huge families of laboratory-derived strains and the seminal findings from these experiments are a crucial counterpart to the lives of E. coli and Salmonella in the wild. The genetics of these organisms is based on the study of their genomic structure and of the mechanisms of variation observed both in the laboratory and in nature.

I. TAXONOMIC NICHE OF ESCHERICHIA COLI AND SALMONELLA

The description of the phylogeny of Escherichia coli and Salmonella has had a particularly tortuous and confusing history. This is in part because of the medical relevance of the many pathogenic varieties of these strains and the ensuing detail of study, and in part due to the intrinsically vast spectrum of subtle antigenic variations that have evolved on the surface of these cells, and, furthermore, the ability of the cells of even one strain to alternate between antigenic types in their normal course of growth. The larger family of bacterial genera to which E. coli and Salmonella belong is the Enterobacteriaceae (or Enterobacteria), defined in 1937 by Rahn, which has diverse natural habitats including animal intestines, as well as plants, soil, and water. The number of genera, let alone species, defined to be in this family has varied repeatedly for almost 100 years, depending on available diagnostic measures and criteria of relatedness agreed to and disagreed to by many investigators. For example, the number of genera of Enterobacteriaceae listed in Bergey’s Manual of Determinative Bacteriology, 5th edition (1939), a widely used standard, is nine, with 67 species (not including Salmonella, whose taxonomic history has varied widely depending on the concept of species), whereas the 8th edition (1974) lists 12 genera and 37 species. As of the 1990s, the use of methods of DNA–DNA hybridization to determine relatedness has led to the clear resolution of over 30 genera and 100 species within the “extended” family of Enterobacteriaceae. Within this family, certain genera such as E. coli and Shigella are very closely related (70–100% homologous); yet they are maintained in separate genera because of the practical difference in clinical diseases they cause. The next closest relatives of E. coli are the Salmonella and Citrobacter genera, then Klebsiella and Enterobacter. 
More distantly related genera include Erwinia, Hafnia, Serratia, Morganella, Edwardsiella, Yersinia, Providencia and, most distantly, Proteus.

The above genera were all known before 1965 (starting with Serratia in 1823), mostly as a result of infection in and the fecal content of humans or animals. Since the late 1970s, 17 more genera in this family have been discovered, some from clinical origin and some from water, plants, or insects. The major characteristics that serve to identify the Enterobacteriaceae are that they are gram-negative rod-shaped bacteria that, with minor exceptions, can grow aerobically and anaerobically, produce a catalase but not oxidase, ferment glucose and convert nitrates to nitrites, contain a common enterobacterial antigen, are usually motile by means of flagella, do not require sodium for growth, and are not spore-forming. Further subclassification depends on other subtleties in metabolic capacities, such as the ability to use various sugars. For practical reasons, many medically important strains are diagnosed also with antisera specific to components of the outer membrane. Horizontal gene transfer is believed to occur between virtually all members of the family.

E. coli was discovered in feces from breastfed infants by Theodor Escherich in 1885, and was first named Bacterium coli. By the early 1900s it became clear that it differed from another similar organism known primarily as a pathogen, Bacillus typhi (known since 1930 as Salmonella typhi), and hence it was renamed Escherichia in honor of its discoverer. E. coli was found to ferment lactose, whereas the (Salmonella) typhi strains could not. In hindsight, this ability to ferment lactose and the divergence of E. coli and Salmonella roughly 150 million years ago coincide rather strikingly with the assumed appearance of mammals, and the assumed first synthesis of lactose in nature, at about the same evolutionary time. The differing abilities of E. coli and Salmonella to ferment lactose and to invade mammals in a pathogenic fashion suggest a rapid adaptation to the newly emerging families of mammals on Earth.
In contrast to the discovery of E. coli as a common and compatible inhabitant of the human intestine, Salmonella, named in 1900 after Salmon, was first defined as the causative agent for hog cholera (hence the initial strain name, “choleraesuis”). Investigations of the varied and varying antigenic properties of many similar strains from diseased animals and humans led to a classification scheme (Kauffmann–White) based on three types of antigens.

Further classification was based on differences in sensitivity to various bacteriophages. The three broad classes of antigens used to categorize both E. coli and Salmonella isolates are based on three types of surface structures. The outermost are the flagella that are present in motile strains. The antigens of flagella are termed H antigens (Hauch, meaning “cloud” or “film,” describing colonies of motile bacteria). Next is the group of outermost surface layers including the capsule (in some strains), envelope, and fimbriae. The antigens of this group are called K (for Kapsule). Internal to the K group is a group of antigens associated with the lipopolysaccharide in the outer membrane, termed O antigens because strains that lose their motility (and are thus “ohne Hauch,” that is, without cloud or film as colonies) through loss of flagella can expose the lower-lying (O) antigens.

What makes this overall system feasible is that the outer antigens can be removed or inactivated stepwise by selective treatment: by alcohol (for the H antigen, i.e., the flagella) and by heating to 100 °C for 2 h (which inactivates most or all of the K antigens), which leaves material containing only the O antigens. By preparing antibodies to the bacteria in all three states, the antisera can be subtractively purified to obtain fractions specific for each of the three antigen classes. Decades of effort in this direction in the early 1900s (continuing into the present for more specific isolates) produced a well-defined spectrum of antigenic determinants that define a multitude of E. coli and Salmonella strains. In E. coli at least 173 distinct O antigens, 80 K antigens, and 56 H antigens are known. Although it is not likely that all conceivable combinations exist in nature, it is thought that well over 10,000 variants exist, more than for any other species. Note that the common laboratory strain, E. coli K-12 (also written K12), does not show any K or O antigenic determinants. The name K-12 was applied in a Stanford clinic where it was isolated in 1922, long before the K, O, H system was developed.

As for E. coli, Salmonella strains from natural sources display a huge variety in O and H (and sometimes K) antigenic determinants, and over 2300 combinations have now been registered, second only to E. coli in complexity. Some of this complexity is due to the ability of Salmonella to present more than one form of H antigen (phase variation).
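To put the E. coli antigen counts quoted above in perspective, the short sketch below multiplies them to get the largest number of O:K:H combinations that could exist if every pairing were possible. It is only an upper bound under that assumption; the function and variable names are illustrative, and strains lacking one antigen class are ignored.

```python
from math import prod

# Numbers of distinct E. coli antigens, as quoted in the text.
ANTIGEN_COUNTS = {"O": 173, "K": 80, "H": 56}


def max_serotype_combinations(counts):
    """Upper bound on O:K:H combinations, assuming every pairing can occur."""
    return prod(counts.values())


upper_bound = max_serotype_combinations(ANTIGEN_COUNTS)
print(upper_bound)  # 775040
```

The estimate of well over 10,000 natural variants is therefore a small fraction of the combinatorial ceiling, which is consistent with the remark that not all conceivable combinations are likely to exist.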

Additional variation is introduced by the phenomenon of lysogenic conversion, whereby certain bacteriophages infect a strain, become lysogenized as a stable prophage, and express one or more new genes that alter the O surface antigens. Until the early 1940s, the names used for different isolates were taken to mean separate species, thus hundreds of Salmonella species are described in the early literature. In 1973, a major simplification occurred as a result of DNA-relatedness studies of Crosa and colleagues, who found that virtually all Salmonella strains are closely related and can be considered a single species, divided into six subgenera (subspecies). Officially, this species is named Salmonella choleraesuis; however a subsequent proposal is pending to name the species Salmonella enterica and thus use a name not associated with any of the earlier isolates. Many investigators are already using this terminology, wherein the particular strain is defined with a serovar name that corresponds in most cases with the older “species” name, that is, Salmonella enterica serovar Typhimurium, or simply Salmonella Typhimurium.

II. THE GENOME

A. The chromosome

The genomes of E. coli and Salmonella strains consist of a single major circular chromosome of over 4 Mb of DNA (the genomes from natural isolates of E. coli range in size from 4.5–5.5 Mb), and commonly one or more small independently replicating (extrachromosomal) DNA plasmids, usually 100 kb or less. Thus, these organisms are haploid, although a second copy of a portion of the chromosome can be carried on a plasmid, thus creating a partial diploid (merodiploid) state. Strains of this type have been used extensively in the laboratory in genetic studies of dominance and regulation.

Though the genome is haploid, the number of copies of the main chromosome varies, ranging between approximately one per cell in the resting (stationary phase) state to sometimes four per cell in rapidly dividing cells in rich growth media. In this state there is more than one set of replication growth forks along the chromosome, which replicates divergently beginning at a site called the origin (of replication) and ending in a region called the terminus (see Fig. 35.1). Thus, in rapidly growing cultures there are at any given time more copies of genes located near the origin than near the terminus. As a result of this multiplicity of growth forks, rapidly growing cells divide and produce two daughter cells as often as every 20 min, even though the time required for any one growth fork to move from the origin to the terminus is 40 min. Genes whose products are used in high concentration during rapid growth (such as genes for components of the ribosome) tend to be located nearer the origin than the terminus, thus allowing relatively high amounts of product formation.

The chromosomes of E. coli and Salmonella are clearly related, as can be seen from Fig. 35.1, which shows the general congruence of the sequence of genes around the chromosomes for the two most commonly used laboratory strains (originally isolated from the wild), E. coli K-12 and Salmonella Typhimurium LT2.
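The gene-dosage effect of overlapping replication forks described above can be made concrete with a small calculation. The sketch below assumes the Cooper–Helmstetter model of steady-state exponential growth, which is not named in the text: under that model the average ratio of origin-proximal to terminus-proximal gene copies is 2 raised to the power (fork transit time / doubling time). The function name is illustrative.

```python
def origin_to_terminus_ratio(fork_transit_min, doubling_time_min):
    """Average origin:terminus gene-copy ratio in a growing population.

    Assumes the Cooper-Helmstetter model (steady-state exponential growth),
    in which the ratio equals 2 ** (fork transit time / doubling time).
    """
    return 2 ** (fork_transit_min / doubling_time_min)


# Values from the text: 40 min fork transit, 20 min doubling time.
print(origin_to_terminus_ratio(40, 20))  # 4.0

# A hypothetical slower culture (80 min doubling) gives a smaller ratio.
slow_ratio = origin_to_terminus_ratio(40, 80)
print(slow_ratio)
```

With the figures given in the text, genes next to the origin would be present at about four times the copy number of genes next to the terminus, which fits the observation that heavily used genes tend to lie near the origin. The slower-growth case is only indicative, since the fork transit time need not stay constant at low growth rates.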

The vast majority of the genes in E. coli K-12 have counterparts at the corresponding regions of the Salmonella Typhimurium chromosome, and vice versa. At the base-sequence level, the DNA homology varies from approximately 75–99% for various protein-coding genes. The G+C to A+T DNA base ratios are also similar (50.8% G+C content for E. coli K-12; 52% for Salmonella Typhimurium). Recombination can occur between the E. coli and Salmonella chromosomes, albeit rarely (see later discussion), to form viable hybrid strains that carry portions from each chromosome, and most of the gene functions from one organism can substitute for the analogous function in the other. Superimposed on the general congruence of these two related genomes are a number of sites in which an insertion or deletion shows a distinct difference between the two, as indicated in Fig. 35.1 (and see later discussion).

Figure 35.1 also shows a region of the chromosome, in the 26–40 min region, where a large inversion has reversed the gene sequence in E. coli K-12 relative to Salmonella Typhimurium. It is believed that the Salmonella Typhimurium configuration of this region is the more distantly ancestral one because certain other types of Enterobacteria related to E. coli and Salmonella Typhimurium carry inversions analogous to the 26–40 min inversion but differing greatly in extent, ranging from a much smaller inversion (in Klebsiella aerogenes) to a much larger inversion (in Salmonella Enteritidis). Thus, it is believed that over the course of evolution a number of independent inversion events in this region occurred, starting from a configuration similar to that of Salmonella Typhimurium. A summary of types of genes and other genetic elements on the E. coli K-12 chromosome is listed in Table 35.1. For structural genes that encode proteins, Table 35.2 indicates the major classes based on broad functional role in the cell.

B. Extrachromosomal elements

Most natural isolates of Salmonella, and many of E. coli, carry extrachromosomal plasmids. These are broadly classified as either self-transmissible (conjugative) or not (nonconjugative). Conjugative plasmids carry a number of genes needed for conjugative DNA transfer between cells (see later discussion). Some nonconjugative plasmids can be conjugationally transferred (mobilized) when a conjugative plasmid is also present to provide the necessary gene functions. Plasmids can be classified into groups based on whether they can coexist stably in the same cell. (Two different plasmids that can coexist are assigned to different incompatibility, or Inc, groups.) There are more than 30 Inc groups among the Enterobacterial plasmids. About two-thirds of natural Salmonella Typhimurium strains carry a conjugative plasmid roughly 90 kb in size, similar to pSLT, which is present in strain LT2 as indicated in Fig. 35.1.

These plasmids usually carry some of the genes that contribute to the virulence of Salmonella as a pathogen. Some wild-type E. coli strains (roughly 10% or more) also carry conjugative plasmids. The one denoted F from E. coli K-12 is about 100 kb in size and is unusual in that it is self-transmissible at very high frequency (see later discussion). The F factor and other F-like factors from E. coli strains have a number of genes involved in conjugation that are homologous to those of pSLT and similar plasmids from Salmonella Typhimurium strains. Another type of plasmid, found in about 30% of wild-type E. coli strains, is the colicin factor. Colicins are toxins produced by one bacterium to kill or inhibit another bacterium. At least 20 distinct types of colicin factor, some of which are self-transmissible, have been found in E. coli.

III. WAYS IN WHICH THE GENOME CAN CHANGE

A. Mutation

Various changes in base sequence occur spontaneously at the rate of approximately 6 × 10⁻¹⁰ per base pair per generation (i.e. about 0.003 per genome) during vegetative growth. Much higher rates of mutagenesis are caused by mutagenic agents such as irradiation or exposure to certain chemicals, or even by simple prolonged starvation, as may occur in nature. The consequences of mutation can range from almost no effect (silent mutations, e.g. a base substitution in an amino acid-coding triplet in which the encoded amino acid does not change, although the change in tRNA used can lead to subtle changes in translation efficiency) to mild phenotypic changes (e.g. temperature sensitivity, or a leaky requirement for a growth factor, i.e. a leaky auxotroph or bradytroph, due to point mutations that alter just one or a few amino acids) to severe phenotypic changes (e.g. the inactivation of one or more genes due to deletions, frameshift mutations, or nonsense mutations). Sometimes a single mutation can result in a change in more than one phenotype or a change in production of a number of gene products, in which case the mutation is called pleiotropic. A broad range of phenotypes can be observed in E. coli or Salmonella mutants, listed in Table 35.3.

B. Recombination

Three distinct types of rearrangements of genetic elements (i.e. recombination) are observed throughout the living world: homologous, site-specific, and transpositional. Figure 35.2 shows diagrammatically the main topological features of these classes.

1. Homologous recombination

This refers to the interaction of two very similar DNA molecules (i.e. having almost the same base sequence for many hundreds of continuous bases) at equivalent sites along their DNA chains, which can result in crossover or DNA-repair events almost randomly at any site along the pair of parental DNA molecules. This can thereby result in a change in linkage, or proximity on the same chromosome, for various small (but nevertheless significant) chromosomal differences situated along the lengths of the two recombining DNA molecules. If the two recombining DNA chains are generally homologous but contain scattered differences from one to the other (e.g. a few percent difference or more for the E. coli versus Salmonella chromosomes), recombination can still occur, but it becomes less frequent as the sequences diverge. Recombination between such nearly homologous sequences is known as homeologous recombination.
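Returning briefly to the spontaneous mutation rate quoted under A. Mutation, the conversion from a per-base-pair rate to a per-genome rate is a single multiplication. The sketch below reproduces it; the genome size of 4.6 Mb is an assumption (roughly that of E. coli K-12, within the 4.5–5.5 Mb range given earlier), and the names are illustrative.

```python
PER_BP_RATE = 6e-10     # mutations per base pair per generation (from the text)
GENOME_SIZE_BP = 4.6e6  # assumed genome size, roughly E. coli K-12


def per_genome_rate(per_bp_rate, genome_size_bp):
    """Expected number of new mutations per genome per generation."""
    return per_bp_rate * genome_size_bp


rate = per_genome_rate(PER_BP_RATE, GENOME_SIZE_BP)
print(round(rate, 4))   # 0.0028, i.e. about 0.003 as stated in the text
print(round(1 / rate))  # 362 generations per mutation, on average
```

In other words, under these figures a given cell lineage acquires a new spontaneous mutation only about once every few hundred generations.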

Homologous recombination can thus occur following gene transfer between closely related organisms (see later discussion). However, homologous recombination also occurs, very frequently, even between elements within one cell. For instance, crossovers between some of the seven homologous copies of rRNA operons can occur during vegetative growth to produce large genetic duplications or inversions. These rearrangements are unstable and readily recombine back to the normal haploid state. Nevertheless, their mere existence (in up to one-third of a growing population of cells) makes possible various changes in gene dosage or gene arrangement, which likely contribute to the evolutionary flux over long periods of time due to rare accompanying recombination events that are sometimes nonhomologous (or illegitimate).

2. Site-specific recombination

As the name implies, this form of recombination is a crossover event between two DNA double helices, which is catalyzed to occur at unique sites on the DNA strands, defined by the local DNA sequence, usually involving 20–30 bases. A specialized enzyme recognizes these sequences on both parental molecules and causes a double-strand breakage and rejoining event that exchanges the partners of flanking DNA. The crossover is completely conservative, that is, no bases are added or lost at the crossover site. One configuration of site-specific recombination enables the circular form of a lysogenic phage, such as lambda (found in E. coli K-12) or P22 (one of a large family of related bacteriophages found in most natural Salmonella isolates), to integrate into the chromosome (and thus become a stable lysogen), or to be excised from it (and thus begin to replicate vegetatively). See Fig. 35.2 (site-specific, upper). Another DNA configuration involved in site-specific recombination involves a hairpin-like intermediate such that the end result of the crossover is an inversion of the stretch of DNA located between the two crossover sites (Fig. 35.2, site-specific, lower).
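The inversion outcome described above can be illustrated on a toy sequence. The following is a minimal sketch, not a model of any real recombinase: the crossover positions are passed in as plain indices (hypothetical), and the segment between them is replaced by its reverse complement, which is how an inverted stretch reads on the original top strand.

```python
# Translation table for complementing the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(dna, start, end):
    """Invert dna[start:end] as read on the top strand.

    start and end stand in for the two crossover sites (hypothetical
    coordinates; recognition sequences are not modeled).
    """
    return dna[:start] + reverse_complement(dna[start:end]) + dna[end:]


original = "AAAAGATTCTTTT"
inverted = invert_segment(original, 4, 9)
print(inverted)  # AAAAGAATCTTTT
```

Two properties of the text's description are visible here: the result has the same length as the input (the crossover is conservative), and applying the same inversion a second time restores the original sequence, which is what allows such a switch to flip back and forth.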

An example of this occurs in Salmonella and results in a switch in expression between two different genes for the production of flagellin, which is assembled into filaments on the surface of the cell (see Fig. 35.3). By switching the gene expression in this way, Salmonella is able to change its filament antigenic structure (i.e. undergo phase variation) and thus reduce attack by host immune systems. Another mechanism for phase variation also exists in some strains, in which methylation of certain critical adenine residues will activate or deactivate gene expression, and this methylation pattern can be quasi-stably inherited for many generations, until it flip-flops back to a nonmethylated state, which reverses the antigenic phase.

3. Transpositional recombination

As in the case of site-specific recombination, certain specialized short DNA sequences are involved in recombination involving DNA elements called transposons. Transposons are sequences of DNA, usually in the size range of 700–10,000 bases, which can move by recombination from one location in the genome to any of a multitude of other sites (targets). The specialized sequences that promote this recombination are usually 15–30 bases long and located at the two ends of the transposon as an inverted repeat. The DNA between the two ends can encode a number of genes whose functions are either required for the transpositional recombination (e.g. transposase, resolvase) or alter cell phenotype in other ways, such as antibiotic resistance. Transposons that do not encode any internal genes except those needed for transposition are called insertion sequences (IS). The E. coli chromosome contains several copies of five IS, and approximately 15% of the spontaneous mutations in E. coli occur due to the movement of these IS to new sites. None of these particular IS are normally found in Salmonella, but a different IS specific to Salmonella strains is normally present somewhere on its chromosome.
Transposition events can involve either the movement of the entire original transposon DNA sequence to a new site (conservative transposition) or a replicative process in which the transposon is replicated to form two copies, one of which is inserted at the new (target) site and the other of which is retained at the original site (replicative transposition) (see Fig. 35.2). Even in the case of conservative transposition, a small amount of DNA synthesis is involved, in which a few bases at the target site are copied to create a small duplication at each end of the newly inserted transposon.
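The target-site duplication mentioned above is easy to see on a toy sequence. The sketch below is illustrative only: the genome string, the transposon string (with short inverted-repeat ends, GGTT and AACC), the insertion position, and the duplication length of 5 bases are all hypothetical values chosen for the example, and the transposition chemistry itself is not modeled.

```python
def insert_transposon(genome, transposon, target_pos, dup_len):
    """Insert a transposon so that dup_len target bases flank it on both sides.

    The target bases genome[target_pos:target_pos + dup_len] end up copied
    on each side of the insert, mimicking the target-site duplication.
    """
    target = genome[target_pos:target_pos + dup_len]
    left = genome[:target_pos]
    right = genome[target_pos + dup_len:]
    return left + target + transposon + target + right


genome = "AAACGTACTTT"
transposon = "GGTTacgtacAACC"  # lowercase marks the internal (cargo) region
result = insert_transposon(genome, transposon, 3, 5)
print(result)  # AAACGTACGGTTacgtacAACCCGTACTTT
```

The product is longer than the genome plus the transposon by exactly the duplication length, reflecting the small amount of DNA synthesis that accompanies even conservative transposition.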

IV. GENE TRANSFER BETWEEN CELLS

A. Conjugation

In Section II, it was noted that most Salmonella strains and many E. coli strains carry fertility factors (conjugative plasmids) that can promote gene transfer from cell to cell by conjugation. Furthermore, the F-like plasmids in E. coli and the pSLT-like plasmids in Salmonella have a number of genes in common, and considerable homology between them. These facts strongly imply that over evolutionary history there has been considerable cross-talk and gene transfer among these strains. The particular F factor in E. coli strain K-12 is (fortunately for J. Lederberg and other early investigators in the discovery and study of conjugation) highly efficient for promoting conjugation between F⁺ cells (those that carry the F factor as a plasmid) and F⁻ cells (those that have no F factor in any form). This is due to a mutation on the K-12 F factor in the regulatory system that, for the vast majority of fertility factors, represses the genes for conjugation. With these “normal” conjugative plasmids, this repression keeps all the cells except about 1/1000 from acting as donors. At any given time any one of the cells can, with a probability of 1/1000, become temporarily derepressed for conjugative functions and cause the transfer of its plasmid into an F⁻ cell (thus converting it into an F⁺ cell). Immediately following this transfer into the F⁻ cell, there is a delay before the repression of the conjugative functions builds up. Thus, such a newly formed F⁺ cell is temporarily able to conjugate (act as a donor) efficiently again, with another F⁻ cell. In this way a conjugative plasmid can spread rapidly, like an epidemic, through a new population of F⁻ cells when the opportunity arises. On rare occasions, an F-like plasmid can recombine with the chromosome to form an Hfr (high frequency of recombination) cell (see Fig. 35.4).
In the process of conjugation from such a cell, the transfer of the normal leading region (next to a specific site called the origin of transfer, indicated by the small arrowhead in Fig. 35.4) leads to the transfer of chromosomal genes because in the Hfr state the F factor and chromosome form one continuous DNA molecule. Considerable amounts of chromosomal genetic information can be transferred in this way, although the transfer process tends to be interrupted spontaneously so that the genes furthest around the chromosome have the least chance of being transferred.

B. Transduction

Recall that a large family of lysogenic bacteriophages related to P22 exists among wild-type Salmonella strains. When phage P22 is replicating lytically in a cell and being packaged into new phage particles, occasionally (about once in every 30 times) the packaging process makes a mistake and packages a segment (about 1%) of the bacterial genome instead of the newly replicated P22 DNA. These aberrant particles are called transducing particles and can come out of the cell along with the burst of normal new phage particles. The transducing particles can travel to another cell and inject the genomic DNA, followed in some cases by recombination into the new cell’s genome.
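The P22 figures above allow a rough estimate of how often a phage particle carries any one particular bacterial gene. The sketch below assumes, purely for illustration, that every part of the genome is equally likely to be packaged; the text does not state this, and real packaging may well be non-uniform. The function name is hypothetical.

```python
def fraction_carrying_gene(transducing_fraction, genome_fraction_packaged):
    """Fraction of all phage particles expected to carry a given gene.

    Assumes uniform packaging: a transducing particle holding a fraction f
    of the genome contains any given gene with probability f.
    """
    return transducing_fraction * genome_fraction_packaged


# P22 values from the text: about 1 particle in 30, each with about 1% of the genome.
p22 = fraction_carrying_gene(1 / 30, 0.01)
print(f"{p22:.2e}")    # 3.33e-04
print(round(1 / p22))  # 3000
```

Under this assumption roughly one particle in 3000 would carry a given gene, before accounting for the further requirement that the injected DNA recombine into the recipient genome.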

Analogous processes occur in E. coli strains with a phage known as P1, which can package about 2% of the genome into transducing particles. The process, discovered in 1952 by Zinder and Lederberg, is called generalized transduction, and it most likely has had a major role in the horizontal transfer of genetic information among enteric bacteria. In some cases, the transduced chromosomal DNA fragment does not recombine into the chromosome but forms a stable (nonreplicating) structure that continues to be unilaterally inherited by only one of the daughter cells at each division. If the transducing fragment carries a functional gene that allows the transduced cell to grow on selective medium, only a tiny colony will grow because only one of the two daughter cells after each division will divide again. This is termed abortive transduction and has been useful in determining functional complementation between mutated genes.

Another mode of transduction, termed specialized transduction, is distinctly different from generalized transduction. Specialized transduction involves the excision of a chromosomally located lysogenic bacteriophage (i.e. in the prophage state) to form a circular replicating form. Rarely, this excision event is imprecise and involves a crossover within the adjacent bacterial chromosomal DNA (analogous to F-prime formation, Fig. 35.4), such that the replicating phage and the phage particles subsequently produced contain a small amount (a few genes) of chromosomal DNA. This DNA can thus be carried by the progeny phage particles and injected into new host cells, followed by possible recombination events. Thus, in specialized transduction only genes close to the prophage integration site can be transduced. Moreover, the altered phage particles carrying the chromosomal fragment can be grown and amplified to produce large quantities of the specific chromosomal genes for experimental purposes.

C. Transformation

The uptake of naked DNA by bacteria and the subsequent inheritance of its genetic information in the genome occur naturally in some gram-positive and gram-negative bacterial species such as Streptococcus pneumoniae, Bacillus subtilis, Haemophilus, and Acinetobacter, but only under certain growth conditions that induce competence, or the ability to take up DNA. In contrast, E. coli and Salmonella are not known to have this natural competence for DNA uptake, and it is unlikely that transformation has contributed significantly to their evolutionary genomic development. However, laboratory methods have been devised that promote the uptake of DNA into E. coli and Salmonella, thus greatly facilitating their use in molecular biological studies. These methods for artificially induced transformation competence generally involve either treatment with di- or multivalent cations (e.g., Ca²⁺, Mg²⁺) and heat shock, which is believed to partially disrupt the lipopolysaccharide in the outer membrane (chemical competence), or treatment with a brief pulse of high-voltage electricity (electroshock, also called electroporation), which renders the membrane temporarily permeable to a variety of macromolecules including DNA.
