The Twenty Canonical Amino Acids: Structure and Classification

The twenty canonical amino acids encoded by the universal genetic code are the chemical vocabulary of all ribosomally synthesized peptides and proteins. Their structures, classifications, and individual chemical personalities are the foundation on which all of peptide science is built. This article provides the reference framework and introduces the chemical logic that distinguishes each group.

Key Terms

Canonical amino acid
One of the twenty amino acids specified by the standard genetic code and incorporated into proteins by ribosomal translation. All twenty share a common backbone architecture: an α-amino group, an α-carboxyl group, a hydrogen, and a variable side chain attached to a central α-carbon. Proline is the one partial exception, because its side chain cyclizes onto the backbone nitrogen to give a secondary amine.
Side chain, R group
The variable substituent attached to the α-carbon of an amino acid, distinguished from the invariant backbone. The chemical character of the side chain determines the amino acid's physical, chemical, and biological properties.
Residue mass
The molecular weight of an amino acid minus water, reflecting its mass contribution when incorporated into a peptide chain by condensation. Residue masses range from 57.0 Da for glycine to 186.2 Da for tryptophan.
Zwitterion
The dipolar ionic form of an amino acid at physiological pH, in which the α-amino group is protonated and positively charged and the α-carboxyl group is deprotonated and negatively charged. Free amino acids exist predominantly as zwitterions at neutral pH.
Isoelectric point, pI
The pH at which the net charge of an amino acid or peptide is zero. At pH values above the pI the molecule carries a net negative charge; below the pI it carries a net positive charge.
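
The definition of pI can be made concrete with a short numerical sketch. The Python below treats each ionizable group as independent, applies the Henderson-Hasselbalch relation, and finds the pH of zero net charge by bisection. The function names are hypothetical, and the glycine pKa values used in the example (about 2.34 for the α-carboxyl group and 9.60 for the α-amino group) are common textbook figures assumed for illustration, not values taken from this article.

```python
def net_charge(ph, acidic_pkas, basic_pkas):
    """Net charge of a molecule modeled as independent ionizable groups."""
    # Acidic groups (e.g. carboxyl) are neutral when protonated, -1 when deprotonated.
    negative = sum(-1.0 / (1.0 + 10 ** (pka - ph)) for pka in acidic_pkas)
    # Basic groups (e.g. amine) are +1 when protonated, neutral when deprotonated.
    positive = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in basic_pkas)
    return positive + negative


def isoelectric_point(acidic_pkas, basic_pkas, lo=0.0, hi=14.0, tol=1e-4):
    """Bisect for the pH at which the net charge crosses zero."""
    # Net charge falls monotonically as pH rises, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, acidic_pkas, basic_pkas) > 0:
            lo = mid  # still net positive: the pI lies at higher pH
        else:
            hi = mid  # net negative or zero: the pI lies at lower pH
    return (lo + hi) / 2.0


# Assumed textbook pKa values for free glycine (not from this article).
glycine_pi = isoelectric_point(acidic_pkas=[2.34], basic_pkas=[9.60])
print(round(glycine_pi, 2))
```

For a molecule with one acidic and one basic group, the result is simply the midpoint of the two pKa values, so the sketch gives about 5.97 for glycine. Consistent with the definition, net_charge is positive below that pH and negative above it.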

The Logic of Classification

The twenty canonical amino acids are not twenty equally dissimilar molecules. They form natural groups defined by the chemical character of their side chains, and the properties of those groups, not the properties of individual residues, are what peptide chemists most often reason about. The standard classification divides the twenty into five groups: nonpolar aliphatic, aromatic, polar uncharged, positively charged, and negatively charged. This classification is now universal in biochemistry and is used alongside the names and symbols standardized by the IUPAC-IUB Joint Commission, but it is a working tool rather than a rigid taxonomy. [1] Several amino acids sit at the boundaries of these groups, and understanding why is as instructive as knowing the classification itself.

Nonpolar Aliphatic Residues

Glycine, alanine, valine, leucine, isoleucine, proline, and methionine form the nonpolar aliphatic group, though each is distinctive. Glycine carries no side chain beyond a hydrogen, making it the smallest amino acid, the most conformationally flexible backbone residue, and the only one with no α-carbon chirality. Alanine, with its methyl group, is the simplest chiral amino acid and a benchmark for helix propensity studies. Valine, leucine, and isoleucine carry branched aliphatic side chains; all three are strongly hydrophobic and abundant in protein hydrophobic cores. Leucine and isoleucine are structural isomers and therefore isobaric, sharing the same molecular formula and residue mass, which creates the well-known ambiguity in mass spectrometric sequencing discussed in Article 1.2.

Proline's unique cyclic structure, discussed in depth in Articles 2.3 and 2.4, sets it apart from the rest of this group. Its pyrrolidine ring restricts backbone conformation, eliminates the amide NH, and makes it a helix breaker and β-turn nucleator. Methionine, despite its aliphatic classification, contains a thioether sulfur that is susceptible to oxidation. It is also the amino acid encoded by the start codon AUG, so ribosomal protein synthesis begins with methionine (N-formylmethionine in bacteria) at the N-terminus before any processing occurs.

Aromatic Residues

Phenylalanine, tyrosine, and tryptophan carry aromatic side chains and share characteristic UV absorbance properties that are essential for protein quantification and spectroscopic analysis. Phenylalanine is purely hydrophobic, contributing only dispersion and stacking interactions. Tyrosine carries a phenolic hydroxyl that gives it partial polar character, a side chain pKa of approximately 10.1, and the ability to serve as a phosphorylation site in signaling. Tryptophan is the largest of the twenty amino acids by residue mass, carries an indole ring system with a polar NH, and is the primary contributor to protein absorbance at 280 nm and to intrinsic protein fluorescence, properties exploited routinely in concentration measurement and protein folding studies.
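
The role of the aromatic residues in quantification can be sketched as a composition-based estimate of the molar extinction coefficient at 280 nm, followed by the Beer-Lambert law. The per-residue coefficients below (roughly 5500 M^-1 cm^-1 per tryptophan, 1490 per tyrosine, and 125 per disulfide) are widely used literature approximations rather than values from this article, and the function names and example sequence are hypothetical.

```python
# Approximate per-chromophore contributions at 280 nm, in M^-1 cm^-1.
# These are assumed literature approximations, not values from this article.
E280_TRP = 5500
E280_TYR = 1490
E280_DISULFIDE = 125


def estimate_e280(sequence, disulfides=0):
    """Estimate the molar extinction coefficient at 280 nm from composition."""
    seq = sequence.upper()
    return (E280_TRP * seq.count("W")
            + E280_TYR * seq.count("Y")
            + E280_DISULFIDE * disulfides)


def concentration_molar(a280, e280, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    if e280 <= 0:
        raise ValueError("no Trp, Tyr, or disulfide: A280 cannot quantify this peptide")
    return a280 / (e280 * path_cm)


peptide = "GWYAF"  # hypothetical sequence with one Trp and one Tyr
e280 = estimate_e280(peptide)
print(e280)
print(concentration_molar(0.699, e280))
```

Phenylalanine does not appear in the estimate because its absorbance at 280 nm is negligible; a peptide with no tryptophan or tyrosine needs a different quantification method.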

Polar Uncharged Residues

Serine, threonine, cysteine, asparagine, and glutamine are polar but carry no net charge at physiological pH, though they differ considerably in reactivity. Serine and threonine carry hydroxyl groups that serve as phosphorylation, glycosylation, and O-acylation sites. Cysteine carries a thiol group with a side chain pKa of approximately 8.3, so a fraction of it is ionized at physiological pH. The resulting thiolate makes cysteine the most nucleophilic of the twenty canonical side chains and the predominant target for thiol-selective bioconjugation chemistry. The ability of two cysteines to form a disulfide bond is structurally fundamental to many peptide and protein architectures, particularly in extracellular environments. Asparagine and glutamine carry amide side chains that are polar but uncharged and participate in hydrogen bonding; asparagine additionally serves as the attachment site for N-linked glycosylation.

Charged Residues

Lysine and arginine carry positive charge at physiological pH, histidine is positively charged only when its imidazole is protonated, and aspartate and glutamate carry negative charge. Lysine provides a primary amine with a side chain pKa of approximately 10.5, making it protonated and positively charged at physiological pH and a primary target for NHS-ester-based bioconjugation chemistry. Arginine carries a guanidinium group with a pKa of approximately 12.5, remaining positively charged across essentially the entire physiological range and contributing strongly to protein-nucleic acid interactions and cell-penetrating peptide activity. Histidine is the most chemically versatile of the charged residues: its imidazole side chain has a pKa of approximately 6.0, close enough to physiological pH that protonated and neutral forms coexist, allowing it to function as both a proton donor and acceptor under physiological conditions. This makes histidine the canonical general acid-base catalyst in enzyme active sites. Aspartate and glutamate differ only in side chain length by one methylene group; both are negatively charged at physiological pH and are distinguished by the two-letter mass spectrometric ambiguity codes discussed in Article 1.2.
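
The pKa arguments above can be checked with the Henderson-Hasselbalch relation. The sketch below uses the approximate free amino acid side chain pKa values quoted in this article and assumes pH 7.4 as a stand-in for "physiological pH"; the function and dictionary names are hypothetical, and real values in folded proteins can differ substantially.

```python
# Approximate side chain pKa values quoted in this article (free amino acids).
SIDE_CHAIN_PKA = {
    "Asp": 3.9, "Glu": 4.1, "His": 6.0, "Cys": 8.3,
    "Tyr": 10.1, "Lys": 10.5, "Arg": 12.5,
}

PHYSIOLOGICAL_PH = 7.4  # assumed value for illustration


def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


for residue, pka in SIDE_CHAIN_PKA.items():
    frac = fraction_protonated(pka, PHYSIOLOGICAL_PH)
    print(f"{residue}: {frac:.1%} protonated at pH {PHYSIOLOGICAL_PH}")
```

Under these assumptions lysine and arginine are almost entirely protonated, aspartate and glutamate almost entirely deprotonated, roughly one cysteine thiol in nine is present as thiolate, and only a few percent of free histidine side chains carry a positive charge at pH 7.4.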

Reference Table of the Twenty Canonical Amino Acids

Amino Acid     Three-Letter  One-Letter  Residue Mass (Da)  Side Chain Class        Side Chain pKa
Glycine        Gly           G           57.0               Nonpolar, aliphatic     n/a
Alanine        Ala           A           71.1               Nonpolar, aliphatic     n/a
Valine         Val           V           99.1               Nonpolar, aliphatic     n/a
Leucine        Leu           L           113.2              Nonpolar, aliphatic     n/a
Isoleucine     Ile           I           113.2              Nonpolar, aliphatic     n/a
Proline        Pro           P           97.1               Nonpolar, cyclic        n/a
Methionine     Met           M           131.2              Nonpolar, thioether     n/a
Phenylalanine  Phe           F           147.2              Aromatic                n/a
Tyrosine       Tyr           Y           163.2              Aromatic, phenolic      ~10.1
Tryptophan     Trp           W           186.2              Aromatic, indole        n/a
Serine         Ser           S           87.1               Polar, hydroxyl         n/a
Threonine      Thr           T           101.1              Polar, hydroxyl         n/a
Cysteine       Cys           C           103.1              Polar, thiol            ~8.3
Asparagine     Asn           N           114.1              Polar, amide            n/a
Glutamine      Gln           Q           128.1              Polar, amide            n/a
Lysine         Lys           K           128.2              Positive, ε-amine       ~10.5
Arginine       Arg           R           156.2              Positive, guanidinium   ~12.5
Histidine      His           H           137.1              Positive, imidazole     ~6.0
Aspartate      Asp           D           115.1              Negative, carboxyl      ~3.9
Glutamate      Glu           E           129.1              Negative, carboxyl      ~4.1

Residue masses are average (not monoisotopic) values rounded to one decimal place, calculated as the average molecular weight of the free amino acid minus 18.0 Da for the water lost on condensation. Side chain pKa values are for free amino acids in aqueous solution at 25 °C; values in folded proteins deviate substantially depending on local environment. An entry of n/a means the side chain has no group that ionizes in the normal aqueous pH range. See Article 3.2 for full discussion.
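
As a worked illustration of how residue masses are used, the following Python sketch sums the one-decimal residue masses from the reference table and adds back one water (18.0 Da) for the free N- and C-termini of a linear, unmodified peptide. The function name and the example sequences are hypothetical, and the result inherits the one-decimal rounding of the table.

```python
# One-decimal residue masses (Da) from the reference table, keyed by one-letter code.
RESIDUE_MASS = {
    "G": 57.0, "A": 71.1, "V": 99.1, "L": 113.2, "I": 113.2,
    "P": 97.1, "M": 131.2, "F": 147.2, "Y": 163.2, "W": 186.2,
    "S": 87.1, "T": 101.1, "C": 103.1, "N": 114.1, "Q": 128.1,
    "K": 128.2, "R": 156.2, "H": 137.1, "D": 115.1, "E": 129.1,
}
WATER = 18.0  # Da, restores the terminal H and OH of a linear peptide


def peptide_mass(sequence):
    """Approximate mass of a linear, unmodified peptide from residue masses."""
    seq = sequence.upper()
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError(f"non-canonical residue codes: {unknown}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


print(round(peptide_mass("GAVL"), 1))  # 57.0 + 71.1 + 99.1 + 113.2 + 18.0
print(peptide_mass("GL") == peptide_mass("GI"))  # Leu and Ile are indistinguishable by mass
```

The second line of output makes the leucine/isoleucine ambiguity explicit: swapping one for the other leaves the computed mass unchanged.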

How Classification Serves Practice

The five-group classification is not merely taxonomic. It directly informs decisions in peptide design, synthesis, and analysis at every stage. A peptide dominated by nonpolar residues will aggregate readily in aqueous solution and may require special handling. A sequence rich in charged residues will be sensitive to pH and ionic strength. A cysteine-containing peptide requires attention to oxidation state. The aromatic residues provide the spectroscopic handles used to quantify the peptide. These group-level properties operate even when the behavior of individual residues is unpredictable from first principles, and recognizing the chemical character of a sequence at a glance is one of the most practically useful skills in peptide science. The subsequent articles in this chapter address each of these group properties in depth.
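
Reading the chemical character of a sequence at a glance can also be approximated in code. The minimal sketch below assigns each residue to one of the five groups used in this article and reports the fraction of the sequence in each group. The function name and example sequence are hypothetical, and the hard group boundaries ignore the borderline cases discussed above, such as tyrosine and histidine.

```python
from collections import Counter

# The five-group classification used in this article, by one-letter code.
GROUPS = {
    "nonpolar aliphatic": "GAVLIPM",
    "aromatic": "FYW",
    "polar uncharged": "STCNQ",
    "positively charged": "KRH",
    "negatively charged": "DE",
}
CLASS_OF = {aa: group for group, letters in GROUPS.items() for aa in letters}


def class_composition(sequence):
    """Return the fraction of residues falling in each of the five groups."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - set(CLASS_OF))
    if unknown:
        raise ValueError(f"non-canonical residue codes: {unknown}")
    counts = Counter(CLASS_OF[aa] for aa in seq)
    return {group: counts[group] / len(seq) for group in GROUPS}


example = "KLAKLAKWDE"  # hypothetical sequence
for group, fraction in class_composition(example).items():
    print(f"{group:20s} {fraction:.0%}")
```

A profile dominated by the nonpolar group would flag likely aggregation in aqueous solution, a high charged fraction would flag sensitivity to pH and ionic strength, and a zero aromatic fraction would flag that absorbance at 280 nm will not be available for quantification.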

References

  • [1] IUPAC-IUB Joint Commission on Biochemical Nomenclature (1984). Nomenclature and symbolism for amino acids and peptides. European Journal of Biochemistry, 138(1), 9–37.