What information does the genetic code encode? Degeneracy of the genetic code: general information


The series of articles on the origin of the genetic code can be regarded as an investigation of events that left very few traces. Understanding these articles, however, takes some effort to learn the molecular mechanisms of protein synthesis. This article introduces a series of the author's publications devoted to the origin of the genetic code, and it is the best place to start getting acquainted with the topic.
The genetic code (GC) is usually defined as a method (rule) of encoding a protein in the primary structure of DNA or RNA. In the literature, it is most often described as a one-to-one correspondence between a sequence of three nucleotides in a gene and one amino acid in the synthesized protein, or the end point of protein synthesis. This definition implies the 20 so-called canonical amino acids, which are part of the proteins of all living organisms without exception. These amino acids are the monomers of proteins. However, the definition contains two errors:

1) There are not 20 canonical amino acids but only 19. An amino acid is a substance that contains both an amino group (-NH2) and a carboxyl group (-COOH). One protein monomer, proline, is not an amino acid in this sense: it contains an imino group instead of an amino group, so it is more correct to call proline an imino acid. Nevertheless, for convenience, all further articles on the GC will speak of 20 amino acids, with this nuance implied. The amino acid structures are shown in Fig. 1.

Fig. 1. Structures of the canonical amino acids. Each amino acid has a constant part, marked in black in the figure, and a variable part (the radical), marked in red.

2) The correspondence between codons and amino acids is not always unambiguous. The cases in which unambiguity is violated are described below.

The emergence of the GC means the emergence of encoded protein synthesis. This event is one of the key steps in the evolutionary formation of the first living organisms.

The structure of the GC is shown in circular form in Fig. 2.



Fig. 2. The genetic code in circular form. The inner circle is the first letter of the codon, the second circle is the second letter, the third circle is the third letter, and the fourth circle gives the amino acids in three-letter abbreviation; P - polar amino acids, NP - non-polar amino acids. The chosen order of symbols, U-C-A-G, is important for making the symmetry clear.

So, let's proceed to the main properties of the GC.

1. Triplet nature. Each amino acid is encoded by a sequence of three nucleotides.

2. The presence of intergenic punctuation marks. Intergenic punctuation marks are the nucleic acid sequences at which translation begins or ends.

Translation cannot begin at just any codon, only at a strictly defined start codon. The start codon is the AUG triplet. In this position the triplet encodes either methionine or another amino acid, formylmethionine (in prokaryotes), which can be incorporated only at the beginning of protein synthesis. At the end of each gene encoding a polypeptide there is at least one of the 3 termination codons, or stop signals: UAA, UAG, UGA. They terminate translation (the name for protein synthesis on the ribosome).

3. Compactness, or lack of intragenic punctuation marks. Within a gene, each nucleotide is part of a significant codon.

4. Non-overlapping. Codons do not overlap with each other, each has its own ordered set of nucleotides, which does not overlap with similar sets of neighboring codons.
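Properties 2-4 can be illustrated with a minimal Python sketch. It assumes the standard code table written in U-C-A-G order and a toy mRNA string invented for this example; the names `CODE` and `translate` are illustrative, not taken from any library. Reading starts at the first AUG, proceeds in non-overlapping triplets with no gaps, and ends at the first stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in U-C-A-G order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    # Step by 3: codons do not overlap and no nucleotide is skipped.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("GGAUGGCUUGGUAAGC"))  # prints MAW
```

The sketch ignores formylmethionine and the alternative start codons GUG and UUG described below; it only shows the reading rule itself.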

5. Degeneracy. The reverse correspondence, from amino acid to codon, is ambiguous: one amino acid can be encoded by several codons. This property is called degeneracy. A series is the set of codons encoding one amino acid; in other words, it is a group of equivalent codons. Think of a codon as XYZ. If XY alone determines the "meaning" (i.e. the amino acid), the codon is called strong. If a particular Z is needed to determine the meaning, the codon is called weak.
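The strong/weak distinction can be checked directly against the standard code table. The following Python sketch (variable names are illustrative) lists the XY doublets for which the third letter Z does not matter:

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U-C-A-G order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# A doublet XY is strong if all four codons XYZ have the same meaning.
doublets = ["".join(p) for p in product(BASES, repeat=2)]
strong = [xy for xy in doublets if len({CODE[xy + z] for z in BASES}) == 1]
weak = [xy for xy in doublets if xy not in strong]

print("strong:", strong)
print("weak:  ", weak)
```

With the standard table, exactly half of the 16 doublets come out strong (UC, CU, CC, CG, AC, GU, GC, GG) and half weak.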

The degeneracy of the code is closely related to the ambiguity of codon-anticodon pairing. An anticodon is a sequence of three nucleotides on a tRNA that can pair complementarily with a codon on messenger RNA (for more details, see two articles: "Molecular Mechanisms for Ensuring Code Degeneracy" and "Lagerkvist's rule. Physico-chemical substantiation of symmetries and Rumer's relations"). A single tRNA anticodon can recognize from one to three mRNA codons.

6. Unambiguity. Each triplet encodes only one amino acid or is a translation terminator.

There are three known exceptions.

First. In prokaryotes, the AUG codon in the first position (the "capital letter") encodes formylmethionine, and in any other position it encodes methionine. At the beginning of a gene, formylmethionine is encoded not only by the usual methionine codon AUG but also by the valine codon GUG or the leucine codon UUG, which within the gene encode valine and leucine, respectively.

In many proteins, formylmethionine is cleaved off or the formyl group is removed, as a result of which formylmethionine is converted to ordinary methionine.

Second. In 1986, several groups of researchers simultaneously discovered that the UGA termination codon in mRNA can encode selenocysteine (see Fig. 3), provided that it is followed by a special nucleotide sequence.

Fig. 3. Structure of the 21st amino acid, selenocysteine.

In E. coli (Escherichia coli), selenocysteyl-tRNA recognizes the UGA codon in mRNA during translation, but only in a certain context: for the UGA codon to be recognized as a sense codon, a sequence 45 nucleotides long located after the UGA codon is important.

This example shows that, if necessary, a living organism can change the meaning of the standard genetic code. In this case, the genetic information contained in the genes is encoded in a more complex way. The meaning of the codon is determined in context, by a certain long nucleotide sequence and with the participation of several highly specific protein factors. It is important that selenocysteine tRNA has been found in representatives of all three branches of life (archaea, eubacteria and eukaryotes), which indicates the antiquity of selenocysteine synthesis and possibly its presence in the last universal common ancestor (to be discussed in other articles). Most likely, selenocysteine is found in all living organisms without exception. But in any individual organism it occurs in no more than a couple of dozen proteins. It is part of the active sites of enzymes, in a number of whose homologues ordinary cysteine can function at the same position.

Until recently, it was believed that the UGA codon could be read either as selenocysteine or as a terminator, but it has recently been shown that in the ciliate Euplotes the UGA codon codes for either cysteine or selenocysteine. See "Genetic code allows for inconsistencies".

Third. In some prokaryotes (5 species of archaea and one eubacterium; the information on Wikipedia is very outdated) there is a special amino acid, pyrrolysine (Fig. 4). It is encoded by the UAG triplet, which in the canonical code serves as a translation terminator. It is assumed that here, as with selenocysteine, the reading of UAG as a pyrrolysine codon depends on a special structure in the mRNA. Pyrrolysine tRNA contains the anticodon CUA and is aminoacylated by class II aminoacyl-tRNA synthetases (for the classification of aminoacyl-tRNA synthetases, see the article "Codases help to understand how genetic code ").

UAG is rarely used as a stop codon, and if it is, it is often followed by another stop codon.

Fig. 4. Structure of the 22nd amino acid, pyrrolysine.

7. Universality. After the deciphering of the GC was completed in the mid-1960s, it was long believed that the code is the same in all organisms, which indicates the common origin of all life on Earth.

Let's try to understand why the GC is universal. If even one coding rule changed in an organism, the structure of a significant part of its proteins would change. Such a change would be too dramatic and therefore almost always lethal, since a change in the meaning of just one codon can affect, on average, 1/64 of all positions in amino acid sequences.

One very important idea follows from this: the GC has hardly changed since its formation more than 3.5 billion years ago. Its structure therefore bears traces of its origin, and analysis of this structure can help to understand how exactly the GC could have arisen.

In reality, the GC may differ slightly in bacteria, in mitochondria, and in the nuclear code of some ciliates and yeasts. At least 17 genetic codes are currently known that differ from the canonical one by 1-5 codons. In total, across all known deviations from the universal GC, 18 different reassignments of sense codons are used. Most deviations from the standard code, 10 of them, are known in mitochondria. It is noteworthy that the mitochondria of vertebrates, flatworms and echinoderms use different codes, while those of molds, protozoa and coelenterates use the same one.

Evolutionary closeness of species is by no means a guarantee that they have similar GCs. Genetic codes can differ even between different species of mycoplasmas (some species have the canonical code, while others deviate from it). A similar situation is observed in yeasts.

It is important to note that mitochondria are descendants of symbiotic organisms that have adapted to live inside cells. They have a highly reduced genome, and some of their genes have moved to the cell nucleus. Therefore, changes in the GC are no longer so dramatic for them.

The exceptions discovered later are of particular interest from an evolutionary point of view, as they can help shed light on the mechanisms of code evolution.

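Table 1. Mitochondrial codes in various organisms.

| Codon | Universal code | Mitochondrial: vertebrates | Mitochondrial: invertebrates | Mitochondrial: yeast | Mitochondrial: plants |
|-------|----------------|----------------------------|------------------------------|----------------------|-----------------------|
| UGA   | STOP           | Trp                        | Trp                          | Trp                  | STOP                  |
| AUA   | Ile            | Met                        | Met                          | Met                  | Ile                   |
| CUA   | Leu            | Leu                        | Leu                          | Thr                  | Leu                   |
| AGA   | Arg            | STOP                       | Ser                          | Arg                  | Arg                   |
| AGG   | Arg            | STOP                       | Ser                          | Arg                  | Arg                   |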
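One convenient way to think about such dialects is as a small set of overrides applied to the universal code. The Python sketch below encodes only the five codons listed in Table 1; the dictionary and function names are illustrative, and real mitochondrial codes may contain further differences not shown in the table.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U-C-A-G order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Reassignments taken from Table 1 only (one-letter amino acid symbols).
MITO_OVERRIDES = {
    "vertebrates": {"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"},
    "invertebrates": {"UGA": "W", "AUA": "M", "AGA": "S", "AGG": "S"},
    "yeast": {"UGA": "W", "AUA": "M", "CUA": "T"},
    "plants": {},  # identical to the universal code for these codons
}


def variant_code(lineage: str) -> dict:
    """Return the universal code with the lineage's overrides applied."""
    return {**CODE, **MITO_OVERRIDES[lineage]}


for lineage in MITO_OVERRIDES:
    code = variant_code(lineage)
    changed = sorted(c for c in CODE if code[c] != CODE[c])
    print(lineage, changed)
```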

There are three mechanisms for changing the amino acid encoded by a codon.

The first is when some codon is not used (or almost not used) by an organism because of the uneven occurrence of some nucleotides (G+C content) or combinations of nucleotides. As a result, such a codon may fall out of use completely (for example, through loss of the corresponding tRNA), and later it can be used to encode another amino acid without causing significant damage to the organism. This mechanism is probably responsible for the appearance of some dialects of the code in mitochondria.

The second is the conversion of a stop codon into a sense codon. In this case, some of the translated proteins may acquire extensions. However, the situation is partially saved by the fact that many genes end with not one but two stop codons, since translation errors are possible in which stop codons are read as amino acids.

The third is the possible ambiguous reading of certain codons, as occurs in some fungi.

8. Connectivity. Groups of equivalent codons (that is, codons that code for the same amino acid) are called series. The GC contains 21 series, including the stop codons. In what follows, for definiteness, a group of codons will be called connected if from each codon of the group one can pass to every other codon of the same group by successive single-nucleotide substitutions without leaving the group. Of the 21 series, 18 are connected, 2 contain one codon each, and only 1 series, that of the amino acid serine, is not connected: it splits into 2 connected subseries.


Fig. 5. Connectivity graphs for some code series: a - the connected series of valine; b - the connected series of leucine; the serine series is not connected and splits into two connected subseries. The figure is taken from the article by V.A. Ratner, "The genetic code as a system".

The connectivity property can be explained by the fact that, during its formation, the GC captured new codons that differed minimally from those already in use.
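The connectivity counts can be reproduced with a short Python sketch. It treats each series as a graph whose edges join codons differing in exactly one nucleotide and counts connected components; the helper names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in U-C-A-G order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

series = defaultdict(set)  # meaning -> set of equivalent codons
for codon, meaning in CODE.items():
    series[meaning].add(codon)


def neighbors(codon):
    """All codons that differ from `codon` in exactly one position."""
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                yield codon[:i] + b + codon[i + 1:]


def components(codons):
    """Split a set of codons into connected subseries (depth-first search)."""
    remaining, parts = set(codons), []
    while remaining:
        stack = [remaining.pop()]
        part = set(stack)
        while stack:
            for n in neighbors(stack.pop()):
                if n in remaining:
                    remaining.remove(n)
                    part.add(n)
                    stack.append(n)
        parts.append(part)
    return parts


single = sorted(m for m, cs in series.items() if len(cs) == 1)
unconnected = sorted(m for m, cs in series.items() if len(components(cs)) > 1)
connected = [m for m, cs in series.items()
             if len(cs) > 1 and len(components(cs)) == 1]
print(len(series), len(connected), single, unconnected)
```

For the standard code this gives 21 series, of which 18 are connected, 2 are single-codon (M and W), and only serine (S) is split, into subseries of 4 and 2 codons.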

9. Regularity of amino acid properties by triplet roots (the root here is the second nucleotide of the triplet, and the stem is the XY doublet). All amino acids encoded by triplets with root U are non-polar, not extreme in properties or size, and have aliphatic radicals. All triplets with root C have strong stems, and the amino acids they encode are relatively small. All triplets with root A have weak stems and encode polar amino acids that are not small. Codons with root G are characterized by extreme and anomalous variants of amino acids and series: they encode the smallest amino acid (glycine), the longest and flattest (tryptophan), the longest and "clumsiest" (arginine), the most reactive (cysteine), and form an anomalous subseries for serine.

10. Block structure. The universal GC is a "block" code: amino acids with similar physicochemical properties are encoded by codons that differ from each other by one base. The block structure of the code is clearly visible in the following figure.


Fig. 6. Block structure of the GC. Amino acids with an alkyl group are marked in white.


Fig. 7. Color representation of the physicochemical properties of amino acids, based on the values given in Stryer's book "Biochemistry". Left: hydrophobicity. Right: the ability to form an alpha helix in a protein. Red, yellow and blue indicate amino acids with high, medium and low hydrophobicity (left) or the corresponding degree of ability to form an alpha helix (right).

The block structure and regularity can also be explained by the fact that, during its formation, the GC captured new codons that differed minimally from those already in use.

Codons with the same first base (codon prefix) code for amino acids with similar biosynthetic pathways. The codons of amino acids belonging to the shikimate, pyruvate, aspartate, and glutamate families have prefixes U, G, A, and C, respectively. For the pathways of ancient amino acid biosynthesis and their connection with the properties of the modern code, see "Ancient doublet genetic code was predetermined by the pathways for the synthesis of amino acids". Based on these data, some researchers conclude that the formation of the code was greatly influenced by the biosynthetic relationships between amino acids. However, similarity of biosynthetic pathways does not at all mean similarity of physicochemical properties.

11. Noise immunity. In its most general form, the noise immunity of the GC means that random point mutations and translation errors do not change the physicochemical properties of amino acids very much.

The replacement of one nucleotide in a triplet in most cases either does not lead to a replacement of the encoded amino acid, or leads to a replacement with an amino acid with the same polarity.

One of the mechanisms that ensure the noise immunity of the GC is its degeneracy. The average degeneracy is the total number of codons divided by the number of encoded signals, where the encoded signals are the 20 amino acids and the translation termination sign. The average degeneracy over all amino acids and the termination sign is 64/21, about three codons per encoded signal.

In order to quantify noise immunity, we introduce two concepts. Nucleotide substitution mutations that do not change the class of the encoded amino acid are called conservative. Nucleotide substitution mutations that change the class of the encoded amino acid are called radical.

Each triplet allows 9 single substitutions. There are 61 triplets encoding amino acids, so the number of possible nucleotide substitutions over all codons is 61 x 9 = 549. Of these:

23 nucleotide substitutions result in stop codons.

134 substitutions do not change the encoded amino acid.
230 substitutions do not change the class of the encoded amino acid.
162 substitutions lead to a change in the amino acid class, i.e. are radical.
Of the 183 substitutions of the 3rd nucleotide, 7 lead to the appearance of translation terminators, and 176 are conservative.
Of the 183 substitutions of the 1st nucleotide, 9 lead to the appearance of terminators, 114 are conservative and 60 are radical.
Of the 183 substitutions of the 2nd nucleotide, 7 lead to the appearance of terminators, 74 are conservative, and 102 are radical.

Based on these calculations, we obtain a quantitative estimate of the noise immunity of the code as the ratio of the number of conservative substitutions (134 + 230 = 364) to the number of radical substitutions (162). It is approximately 364/162 = 2.25.
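The part of this tally that does not depend on how amino acids are divided into classes can be checked by brute force. The Python sketch below enumerates all single-nucleotide substitutions in the 61 sense codons of the standard code and counts those that produce a stop codon or leave the amino acid unchanged. The split of the remaining substitutions into conservative and radical depends on the amino acid classification used by the author, which is not reproduced here.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U-C-A-G order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

sense = [c for c in CODE if CODE[c] != "*"]
total = to_stop = synonymous = 0
stop_by_pos = [0, 0, 0]  # substitutions to a stop codon, per codon position

for codon in sense:
    for pos in range(3):
        for b in BASES:
            if b == codon[pos]:
                continue
            mutant = codon[:pos] + b + codon[pos + 1:]
            total += 1
            if CODE[mutant] == "*":
                to_stop += 1
                stop_by_pos[pos] += 1
            elif CODE[mutant] == CODE[codon]:
                synonymous += 1

print(len(sense), total, to_stop, synonymous, stop_by_pos)
# prints 61 549 23 134 [9, 7, 7]
print(round(364 / 162, 2))  # ratio quoted in the text, prints 2.25
```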

In a real assessment of the contribution of degeneracy to noise immunity, it is necessary to take into account the frequency of occurrence of amino acids in proteins, which varies in different species.

What is the reason for the noise immunity of the code? Most researchers believe that this property is the result of selection among alternative GCs.

Stephen Freeland and Laurence Hurst generated such random codes and found that only one in a hundred alternative codes is at least as noise-immune as the universal GC.
An even more interesting fact came to light when these investigators introduced an additional constraint to take into account actual trends in DNA mutation patterns and translational errors. Under such conditions, ONLY ONE CODE IN A MILLION POSSIBLE turned out to be better than the canonical code.
Such remarkable robustness of the genetic code is most easily explained by its having been shaped by natural selection. Perhaps at one time there were many codes in the biological world, each with its own sensitivity to errors. The organism that coped with errors better was more likely to survive, and the canonical code simply won the struggle for existence. This assumption seems quite realistic; after all, we know that alternative codes do exist. For more information about noise immunity, see "Coded Evolution" (S. Freeland, L. Hurst, "Code Evolution" // In the World of Science. - 2004, No. 7).

In conclusion, I propose to count the number of possible genetic codes that can be generated for the 20 canonical amino acids. For some reason I have never come across this number. So, in the generated GCs each of the 20 amino acids and the stop signal must be encoded by AT LEAST ONE CODON.

Let us mentally number the codons in some order and reason as follows. If we have exactly 21 codons, then each amino acid and the stop signal occupies exactly one codon. In this case there are $21!$ possible GCs.

If there are 22 codons, an extra codon appears that can have any of the 21 meanings, and this codon can be located in any of 22 places, while the remaining codons each have exactly one distinct meaning, as in the case of 21 codons. The number of combinations is then $21! \times (21 \times 22)$.

If there are 23 codons, then, arguing similarly, 21 codons each have exactly one distinct meaning ($21!$ options), and two codons can each have any of 21 meanings ($21^2$ options at a FIXED position of these codons). The number of different positions for these two codons is $23 \times 22$. The total number of GC variants for 23 codons is $21! \times 21^2 \times 23 \times 22$.

If there are 24 codons, the number of GCs is $21! \times 21^3 \times 24 \times 23 \times 22$, and so on.

If there are 64 codons, the number of possible GCs is $21! \times 21^{43} \times 64!/21! = 21^{43} \times 64! \approx 9.1 \times 10^{145}$.
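As a cross-check on this estimate, the count can also be computed exactly. The Python sketch below (the function name `surjections` is illustrative) uses inclusion-exclusion to count assignments of 21 meanings to n codons in which every meaning is used at least once. Note that the number of all assignments without any restriction is only 21^64, about 4 x 10^84, so the step-by-step product above, which counts the same code more than once (for 22 codons, for instance, each code is counted twice), is best read as a loose upper bound rather than the exact number.

```python
from math import comb, factorial


def surjections(n: int, k: int) -> int:
    """Number of ways to assign k meanings to n codons using every meaning."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))


all_maps = 21 ** 64                      # every codon gets any of 21 meanings
exact = surjections(64, 21)              # each meaning has at least one codon
text_estimate = 21 ** 43 * factorial(64)  # the product derived in the text

print(f"all assignments:        {all_maps:.2e}")
print(f"every meaning is used:  {exact:.2e}")
print(f"estimate from the text: {text_estimate:.2e}")
```

The small cases are easy to verify by hand: for 22 codons there are 21! x 231 such codes, since one must choose which pair of codons shares a meaning (231 ways) and then distribute the 21 meanings.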

Thanks to the processes of transcription and translation in a cell, information is transferred from DNA to protein: DNA - mRNA - protein. The genetic information in DNA and mRNA is contained in the sequence of nucleotides in these molecules. How is information translated from the "language" of nucleotides into the "language" of amino acids? This translation is carried out using the genetic code. A code, or cipher, is a system of symbols for translating one form of information into another. The genetic code is a system for recording information about the sequence of amino acids in proteins using the sequence of nucleotides in messenger RNA. How important the order of the same elements (four nucleotides in RNA) is for understanding and preserving the meaning of information can be seen from a simple example: by rearranging the letters in the Russian word "kod" (code), we get a word with a different meaning, "dok" (dock). What are the properties of the genetic code?

1. The code is triplet. RNA consists of 4 nucleotides: A, G, C, U. If we tried to designate one amino acid with one nucleotide, then 16 out of 20 amino acids would remain unencoded. A two-letter code would encode 16 amino acids (four nucleotides give 16 different combinations of two nucleotides each). Nature has created a three-letter, or triplet, code. This means that each of the 20 amino acids is coded for by a sequence of three nucleotides called a triplet or codon. From 4 nucleotides, you can create 64 different combinations of 3 nucleotides each (4*4*4=64). This is more than enough to encode 20 amino acids, and it would seem that 44 codons are superfluous. However, this is not so.
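The arithmetic behind this argument is easy to reproduce. A minimal Python sketch (the function name is illustrative) enumerates all words of length 1, 2 and 3 over the four-letter RNA alphabet and compares their number with the 20 amino acids:

```python
from itertools import product

NUCLEOTIDES = "AGCU"
AMINO_ACIDS = 20


def code_words(length: int) -> list:
    """All ordered nucleotide words of the given length."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=length)]


for length in (1, 2, 3):
    n = len(code_words(length))
    verdict = "enough" if n >= AMINO_ACIDS else "not enough"
    print(f"{length}-letter code: {n} words, {verdict} for {AMINO_ACIDS} amino acids")
```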

2. The code is degenerate. This means that each amino acid is coded for by more than one codon (two to six). The exceptions are the amino acids methionine and tryptophan, each of which is encoded by only one triplet. (This can be seen from the table of the genetic code.) The fact that methionine is encoded by the single triplet AUG has a special meaning, which will become clear to you later (16).

3. The code is unambiguous. Each codon codes for only one amino acid. In all healthy people, in the gene that carries information about the hemoglobin beta chain, the GAA or GAG triplet in sixth place encodes glutamic acid. In patients with sickle cell anemia, the second nucleotide in this triplet is replaced by U. As can be seen from the table, the triplets GUA or GUG that are formed in this case encode the amino acid valine. What such a replacement leads to, you already know from the section on DNA.
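The sickle cell example can be traced through the code table in a few lines of Python. This is only an illustration of the table lookup; the helper name `substitute` is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U-C-A-G order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def substitute(codon: str, pos: int, base: str) -> str:
    """Replace the nucleotide at 0-based position `pos` with `base`."""
    return codon[:pos] + base + codon[pos + 1:]


for codon in ("GAA", "GAG"):
    mutant = substitute(codon, 1, "U")  # second nucleotide replaced by U
    print(codon, CODE[codon], "->", mutant, CODE[mutant])
# prints:
# GAA E -> GUA V
# GAG E -> GUG V
```

Here E is glutamic acid and V is valine in the one-letter notation.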

4. There are "punctuation marks" between genes. In printed text, there is a period at the end of each phrase. Several related phrases make up a paragraph. In the language of genetic information, such a paragraph is an operon and its complementary mRNA. Each gene in the operon encodes one polypeptide chain, a phrase. Since in a number of cases several different polypeptide chains are synthesized one after another on the same mRNA template, they must be separated from each other. For this purpose the genetic code has three special triplets, UAA, UAG and UGA, each of which signals the end of synthesis of one polypeptide chain. Thus, these triplets act as punctuation marks. They are at the end of every gene.

5. There are no "punctuation marks" inside a gene. Since the genetic code is like a language, let's analyze this property using a phrase composed of three-letter words, for example: the fat cat ate the big rat. The meaning of what is written is clear despite the absence of punctuation marks. If we remove one letter from the first word (one nucleotide in the gene) but still read in triples of letters, we get nonsense: hef atc ata tet heb igr at. The same nonsense arises when one or two nucleotides are missing from a gene. The protein read from such a damaged gene will have nothing in common with the protein encoded by the normal gene.
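The effect of losing one letter can be reproduced with a short Python sketch. The phrase of three-letter words is purely illustrative, and `read_in_triplets` is a hypothetical helper that regroups the letters into triples the way a ribosome reads codons.

```python
def read_in_triplets(text: str) -> str:
    """Drop the spaces and regroup the letters into consecutive triples."""
    letters = text.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))


phrase = "the fat cat ate the big rat"

print(read_in_triplets(phrase))      # prints the fat cat ate the big rat
print(read_in_triplets(phrase[1:]))  # prints hef atc ata tet heb igr at
```

Every triple after the deletion point changes, which is why a frameshift usually destroys the whole remainder of the protein rather than a single amino acid.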

6. The code is universal. The genetic code is the same for all creatures living on Earth. In bacteria and fungi, wheat and cotton, fish and worms, frogs and humans, the same triplets encode the same amino acids.

The genetic code is usually understood as a system of signs that denotes the sequential arrangement of nucleotides in DNA and RNA and corresponds to another sign system, the one representing the sequence of amino acids in a protein molecule.

It is important!

When scientists managed to study the properties of the genetic code, universality was recognized as one of the main ones. Yes, strange as it may sound, all living things are united by one universal genetic code. It took shape over a long period of time, and the process ended about 3.5 billion years ago. Traces of its evolution can therefore be found in the structure of the code, from its inception to the present day.

When we speak of the sequence of elements in the genetic code, we mean that it is far from chaotic and has a strictly defined order. This also largely determines the properties of the genetic code. It is comparable to the arrangement of letters and syllables in words: break the usual order, and most of what we read on the pages of books or newspapers turns into ridiculous gibberish.

Basic properties of the genetic code

Usually the code carries some information encrypted in a special way. In order to decipher the code, you need to know the distinguishing features.

So, the main properties of the genetic code are:

  • triplet nature;
  • degeneracy, or redundancy;
  • unambiguity;
  • continuity;
  • the universality already mentioned above.

Let's take a closer look at each property.

1. Triplet nature

Three consecutive nucleotides within a molecule (DNA or RNA) form a triplet, which encodes one amino acid and its position in the peptide chain.

Codons (the code words) differ in the order and in the type of the nucleotides that make them up.

In genetics, it is customary to distinguish 64 codon types. They are formed from combinations of four types of nucleotides, three in each, which is equivalent to raising the number 4 to the third power. Thus, 64 nucleotide combinations are possible.

2. Redundancy of the genetic code

This property means that one amino acid is encoded by several codons, usually from 2 to 6. Only methionine and tryptophan are each encoded by a single triplet.

3. Unambiguity

Each codon corresponds to only one amino acid. For example, the GAA triplet in the sixth position of the gene for the hemoglobin beta chain encodes glutamic acid, which is the case in healthy people with normal hemoglobin. In a person with sickle cell anemia, one of the nucleotides of this triplet is replaced by another letter of the code, U, and the triplet then encodes a different amino acid, which is the cause of the disease.

4. Continuity

In describing this property of the genetic code, it should be remembered that codons, like the links of a chain, are located not at a distance from each other but in direct proximity, one after another in the nucleic acid chain, with no gaps or separators between them.

5. Universality

It should never be forgotten that all life on Earth is united by a common genetic code. Therefore, in a primate and a human, in an insect and a bird, in a hundred-year-old baobab and a blade of grass that has barely emerged from the ground, the same amino acids are encoded by the same triplets.

It is in the genes that the basic information about the properties of an organism is stored, a kind of program that the organism inherits from those who lived earlier and which exists as a genetic code.

The genetic code is a special encoding of hereditary information with the help of molecules. Based on this, genes appropriately control the synthesis of proteins and enzymes in the body, thereby determining metabolism. In turn, the structure of individual proteins and their functions are determined by the location and composition of amino acids - the structural units of the protein molecule.

In the middle of the last century, genes were identified as separate sections of deoxyribonucleic acid (abbreviated as DNA). Its nucleotide units form a characteristic double chain twisted into a helix.

Scientists have found a connection between genes and the chemical structure of individual proteins, the essence of which is that the structural order of amino acids in protein molecules fully corresponds to the order of nucleotides in the gene. Having established this connection, scientists decided to decipher the genetic code, i.e. establish the laws of correspondence between the structural orders of nucleotides in DNA and amino acids in proteins.

There are only four types of nucleotides in DNA:

1) A - adenylic;

2) G - guanylic;

3) T - thymidylic;

4) C - cytidylic.

Proteins contain twenty types of standard amino acids. Deciphering the genetic code was difficult because there are far fewer nucleotides than amino acids. To solve this problem, it was suggested that amino acids are encoded by various combinations of three nucleotides (the so-called codon or triplet).

In addition, it was necessary to explain exactly how the triplets are located along the gene. Thus, three main groups of theories arose:

1) triplets follow each other continuously, i.e. form a continuous code;

2) triplets are arranged with alternation of "meaningless" sections, i.e. the so-called "commas" and "paragraphs" are formed in the code;

3) triplets can overlap, i.e. the end of the first triplet may form the beginning of the next.

Currently, the theory of code continuity is generally accepted.

The genetic code and its properties

1) The code is triplet: it consists of combinations of three nucleotides that form codons.

2) The genetic code is redundant. One amino acid can be encoded by several codons, since, according to mathematical calculations, there are about three times more codons than amino acids. Some codons act as punctuation: some are "stop signals" that mark the end of the synthesis of an amino acid chain, while others mark the start of code reading.

3) The genetic code is unambiguous - only one amino acid can correspond to each of the codons.

4) The genetic code is collinear, i.e. the sequence of nucleotides and the sequence of amino acids clearly correspond to each other.

5) The code is written continuously and compactly, there are no "meaningless" nucleotides in it. It begins with a certain triplet, which is replaced by the next one without a break and ends with a termination codon.

6) The genetic code is universal - the genes of any organism encode information about proteins in exactly the same way. This does not depend on the level of complexity of the organization of the organism or its systemic position.

Modern science suggests that the genetic code arose directly when living organisms originated from inert (non-living) matter. Random changes and evolutionary processes make any variant of the code possible, i.e. amino acids could be assigned in any order. Why did this particular code survive in the course of evolution, and why is the code universal and similar in structure everywhere? The more science learns about the phenomenon of the genetic code, the more new mysteries arise.

Ministry of Education and Science of the Russian Federation Federal Agency for Education

State Educational Institution of Higher Professional Education "Altai State Technical University named after I.I. Polzunov"

Department of Natural Science and System Analysis

Essay on the topic "Genetic code"

1. The concept of the genetic code

2. Properties of the genetic code

3. Genetic information

Bibliography


1. The concept of the genetic code

The genetic code is a single system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides, characteristic of living organisms. Each nucleotide is denoted by a capital letter, which begins the name of the nitrogenous base that is part of it: - A (A) adenine; - G (G) guanine; - C (C) cytosine; - T (T) thymine (in DNA) or U (U) uracil (in mRNA).

The implementation of the genetic code in the cell occurs in two stages: transcription and translation.

The first of these takes place in the nucleus (in eukaryotes); it consists in the synthesis of mRNA molecules on the corresponding sections of DNA. In this case, the DNA nucleotide sequence is "rewritten" into the RNA nucleotide sequence. The second stage takes place in the cytoplasm, on ribosomes; here the nucleotide sequence of the mRNA is translated into the sequence of amino acids in the protein. This stage proceeds with the participation of transfer RNA (tRNA) and the corresponding enzymes.
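The two stages can be illustrated with a minimal Python sketch. It assumes the standard codon table written as a compact string of one-letter amino acid codes ("*" marks a stop codon) and a DNA template strand written in the 3' to 5' direction. The names transcribe, translate and CODON_TABLE are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(template_dna):
    """Stage 1: build the mRNA complementary to a DNA template strand.

    Assumption: the template is given 3'->5', so the result reads 5'->3'.
    """
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_dna)


def translate(mrna):
    """Stage 2: read the mRNA triplet by triplet until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("TACGGTAAAATT")
print(mrna)             # AUGCCAUUUUAA
print(translate(mrna))  # MPF
```

The sketch ignores everything the cell does around these steps (promoters, RNA processing, tRNA charging); it only shows how one sequence is rewritten into another.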

2. Properties of the genetic code

1. Triplet nature.

Each amino acid is encoded by a sequence of 3 nucleotides.

A triplet or codon is a sequence of three nucleotides that codes for one amino acid.


The code cannot be singlet, since 4 (the number of different nucleotides in DNA) is less than 20. The code cannot be doublet, because 16 (the number of ordered pairs of 4 nucleotides, 4 x 4) is less than 20. The code can be triplet, because 64 (the number of ordered triples of 4 nucleotides, 4 x 4 x 4) is greater than 20.
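The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the function name codeword_count is a hypothetical helper, not part of any library.

```python
NUCLEOTIDES = 4    # A, G, C, T (or U)
AMINO_ACIDS = 20   # canonical amino acids to be encoded


def codeword_count(length):
    """Number of distinct ordered nucleotide words of the given length."""
    return NUCLEOTIDES ** length


for n in (1, 2, 3):
    print(n, codeword_count(n), codeword_count(n) >= AMINO_ACIDS)

# Shortest word length that provides at least 20 distinct codewords.
min_length = next(n for n in range(1, 10) if codeword_count(n) >= AMINO_ACIDS)
print(min_length)  # 3
```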

2. Degeneracy.

All amino acids, with the exception of methionine and tryptophan, are encoded by more than one triplet:

2 amino acids with 1 triplet each = 2 triplets
9 amino acids with 2 triplets each = 18 triplets
1 amino acid with 3 triplets = 3 triplets
5 amino acids with 4 triplets each = 20 triplets
3 amino acids with 6 triplets each = 18 triplets

Total: 61 triplets code for 20 amino acids.
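These counts can be reproduced from the standard codon table. The sketch below assumes the table is written as a compact string of one-letter amino acid codes with "*" for stop codons; the variable names are chosen only for this illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# How many codons encode each amino acid (stop codons excluded).
codons_per_amino_acid = Counter(a for a in CODON_TABLE.values() if a != "*")

# How many amino acids fall into each degeneracy class.
degeneracy_classes = Counter(codons_per_amino_acid.values())

for n_codons in sorted(degeneracy_classes):
    n_acids = degeneracy_classes[n_codons]
    print(f"{n_acids} amino acids x {n_codons} codons = {n_acids * n_codons}")

print("sense codons:", sum(codons_per_amino_acid.values()))  # sense codons: 61
```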

3. The presence of intergenic punctuation marks.

A gene is a section of DNA that codes for one polypeptide chain or one molecule of tRNA, rRNA, or sRNA.

The tRNA, rRNA, and sRNA genes do not code for proteins.

At the end of each gene encoding a polypeptide, there is at least one of 3 termination codons, or stop signals: UAA, UAG, UGA. They terminate translation.

By convention, the AUG codon, the first codon after the leader sequence, is also counted among the punctuation marks. It plays the role of a capital letter. In this position, it codes for formylmethionine (in prokaryotes).
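How these punctuation marks delimit the coding part of a message can be sketched in Python. This is a simplification: it treats the first AUG in the string as the "capital letter", whereas a real cell recognizes the start site with the help of the leader sequence. The function name find_open_reading_frame and the example sequence are made up for illustration.

```python
START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_open_reading_frame(mrna):
    """Return the codons from the first AUG up to (not including) a stop codon.

    Returns None if there is no AUG or no in-frame stop codon after it.
    """
    start = mrna.find(START_CODON)
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i]
    return None


# Leader "GGAGGU", then AUG GCU UGG, then the stop codon UAA.
print(find_open_reading_frame("GGAGGUAUGGCUUGGUAACC"))  # AUGGCUUGG
```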

4. Uniqueness.

Each triplet encodes only one amino acid or is a translation terminator.

The exception is the AUG codon. In prokaryotes, in the first position (capital letter), it codes for formylmethionine, and in any other position, it codes for methionine.

5. Compactness, or the absence of intragenic punctuation marks.

Within a gene, each nucleotide is part of a significant codon.

In 1961 Francis Crick, Sydney Brenner and their colleagues, using the phage rII system developed by Seymour Benzer, showed experimentally that the code is triplet and compact.

The essence of the experiment: a "+" mutation is the insertion of one nucleotide; a "-" mutation is the loss of one nucleotide. A single "+" or "-" mutation at the beginning of a gene disrupts the entire gene. Two mutations of the same sign also disrupt the entire gene. Three mutations of the same sign at the beginning of the gene disrupt only part of it, because the reading frame is restored after the third one. Four mutations of the same sign again disrupt the entire gene.

The experiment proves that the code is triplet and there are no punctuation marks inside the gene. The experiment was carried out on two adjacent phage genes and showed, in addition, the presence of punctuation marks between the genes.
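The logic of the experiment can be imitated on a toy sequence. The sketch below inserts one to four nucleotides near the beginning of a made-up "gene" (not the real phage sequence) and counts how many codons at the end of the gene are still read as in the original. The helper names are hypothetical.

```python
def split_codons(seq):
    """Cut a sequence into consecutive triplets, dropping an incomplete tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def insert_bases(seq, positions, base="A"):
    """Insert one base at each of the given positions ("+" mutations)."""
    for p in sorted(positions, reverse=True):
        seq = seq[:p] + base + seq[p:]
    return seq


def intact_tail_codons(original, mutant):
    """Count codons, from the end, that are read identically in both."""
    count = 0
    pairs = zip(reversed(split_codons(original)), reversed(split_codons(mutant)))
    for a, b in pairs:
        if a != b:
            break
        count += 1
    return count


GENE = "AUGGCUUGGCAUGAACGUCCAUUAGGC"  # toy sequence of 9 codons

for k in (1, 2, 3, 4):
    mutant = insert_bases(GENE, range(3, 3 + k))
    print(k, intact_tail_codons(GENE, mutant))
# prints: 1 0, 2 0, 3 7, 4 0 (one pair per line)
```

Only three insertions restore the original reading frame downstream of the mutations, which is what one expects if the code is read in triplets without gaps.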

3. Genetic information

Genetic information is a program of the properties of an organism, received from ancestors and embedded in hereditary structures in the form of a genetic code.

It is assumed that the formation of genetic information proceeded according to the scheme: geochemical processes - mineral formation - evolutionary catalysis (autocatalysis).

It is possible that the first primitive genes were microcrystals of clay, in which each new layer lines up in accordance with the structural features of the previous one, as if receiving information about the structure from it.

The realization of genetic information occurs during the synthesis of protein molecules with the help of three RNAs: messenger (mRNA), transfer (tRNA) and ribosomal (rRNA). Information is transferred through a direct channel, DNA - RNA - protein, and through a feedback channel, environment - protein - DNA.

Living organisms are able to receive, store and transmit information. Moreover, living organisms tend to use the information received about themselves and the world around them as efficiently as possible. Hereditary information embedded in genes and necessary for a living organism's existence, development and reproduction is transmitted from each individual to its descendants. This information determines the direction of development of the organism, and in the course of its interaction with the environment the individual's responses may be altered, which contributes to the evolution of its descendants. In the course of evolution, new information arises and is retained, and the value of this information for the organism increases.

In the course of the implementation of hereditary information under certain environmental conditions, the phenotype of organisms of a given biological species is formed.

Genetic information determines the morphological structure, growth, development, metabolism, mental makeup, predisposition to diseases and genetic defects of the body.

Many scientists, rightly emphasizing the role of information in the formation and evolution of living things, noted this circumstance as one of the main criteria of life. So, V.I. Karagodin believes: "The living is such a form of existence of information and the structures encoded by it, which ensures the reproduction of this information in suitable environmental conditions." The connection of information with life is also noted by A.A. Lyapunov: "Life is a highly ordered state of matter that uses information encoded by the states of individual molecules to develop persistent reactions." Our well-known astrophysicist N.S. Kardashev also emphasizes the informational component of life: "Life arises due to the possibility of synthesizing a special kind of molecules that are able to remember and use at first the simplest information about the environment and their own structure, which they use for self-preservation, for reproduction and, which is especially important for us, for obtaining more information." The physicist F. Tipler draws attention to this ability of living organisms to store and transmit information in his book "Physics of Immortality": "I define life as some kind of coded information that is preserved by natural selection." Moreover, he believes that if this is so, then the life-information system is eternal, infinite and immortal.

The discovery of the genetic code and the establishment of the laws of molecular biology showed the need to combine modern genetics and the Darwinian theory of evolution. Thus, a new biological paradigm was born - the synthetic theory of evolution (STE), which can already be considered as non-classical biology.

The main ideas of Darwin's theory of evolution with its triad (heredity, variability, natural selection) are supplemented in the modern view of the evolution of the living world by the idea of selection that is genetically determined. The beginning of the development of the synthetic, or general, theory of evolution can be considered the work of S.S. Chetverikov on population genetics, in which it was shown that selection acts not on individual traits and individuals but on the genotype of the entire population, although it is carried out through the phenotypic traits of individuals. This leads to the spread of beneficial changes throughout the population. Thus, the mechanism of evolution is implemented both through random mutations at the genetic level and through the inheritance of the most valuable traits (the value of information!), which determine the adaptation of mutational traits to the environment, providing the most viable offspring.

Seasonal climate changes and various natural or man-made disasters lead, on the one hand, to a change in gene frequencies in populations and, as a result, to a decrease in hereditary variability; this process is sometimes called genetic drift. On the other hand, they lead to changes in the concentration of various mutations and a decrease in the diversity of genotypes in the population, which can change the direction and intensity of selection.


4. Deciphering the human genetic code

In May 2006, scientists working on sequencing the human genome published a complete genetic map of chromosome 1, which was the last incompletely sequenced human chromosome.

A preliminary human genetic map was published in 2003, marking the formal end of the Human Genome Project. Within its framework, genome fragments containing 99% of human genes were sequenced. The accuracy of gene identification was 99.99%. However, at the end of the project, only four of the 24 chromosomes had been fully sequenced. The fact is that in addition to genes, chromosomes contain fragments that do not encode any traits and are not involved in protein synthesis. The role that these fragments play in the life of the organism is still unknown, but more and more researchers are inclined to believe that their study requires the closest attention.
