Lecture 4 – Introduction to Protein Structure (1)
Introduction
Proteins are the functional forms of polypeptides.
- They represent all levels of the hierarchy of macromolecular structure (1° to 4°).
- Protein structure is defined by:
  - the chemical properties of the polypeptide chain.
  - the environment.
Several distinct classes of proteins:
1. globular proteins (water-soluble).
2. fibrous proteins (water-insoluble).
3. proteins that associate with membranes.
These differ by tendencies in amino acid sequence and composition, but…
- can all be described using the same basic principles.
Proteins Built from Amino Acids
There are 20 common amino acids.
- all are α-amino acids:
  - amino and carboxylic acid groups separated by a single carbon, Cα.
- all are L-amino acids (except Glycine, which is achiral).
- predominantly zwitterions at pH 7.
- distinguished by the chemical nature of R, the ‘side chain’.
Proteins also have other components:
- D-amino acids.
  - e.g., bacterial antibiotics, such as gramicidin.
- Covalent modifications, following synthesis.
  - Disulfide bonds common in eukaryotic…
Adoption of the L-form Structurally Significant
Consider a natural protein, such as Rubredoxin:
- a protein in sulfur-metabolizing bacteria.
  - Fe-S complex shown in green and yellow.
  - constructed of L-amino acids.
Will Rubredoxin made of D-form amino acids have an inverted structure?
- in 1993, L- and D-forms of Rubredoxin were synthesized.
  - structures determined by X-ray crystallography.
  - they are exact mirror images.
- L- and D-HIV protease were also synthesized.
Amino Acids
Distinguished by the chemical nature of the side chain:
- size and shape.
- charge.
- hydrogen-bonding ability.
- ability to form disulfide bridges, etc.
Amino acids can be broadly classified into 5 groups:
1. Aliphatic: R = hydrocarbon side-chain.
2. Nonpolar: R = other hydrophobic side-chain.
3. Aromatic: R = aromatic ring.
4. Polar: R = uncharged, polar group.
5. Charged: R carries a charge in solution, at pH 7.
- note: other classification schemes are also in use...
The Hydrophobic Effect
Most proteins are amphipathic:
- they include both hydrophobic and hydrophilic residues…
Folding generally results in a partitioning of residues between aqueous (Aq) and non-aqueous (non-Aq) environments.
- hydrophobic residues:
  - will each ‘desire’ to avoid water.
  - tend to reside in a membrane or protein interior.
- hydrophilic residues:
  - will each ‘desire’ to interact with water.
  - tend to remain hydrated and reside on a protein exterior.
This partitioning b/w Aq and non-Aq environments:
- leads to the hydrophobic effect, which drives protein folding.
The Partition Coefficient
The Partition Coefficient, P:
- measures the partitioning of a residue between Aq and non-Aq environments.
- Consider a two-solvent system…
  - with separate Aq and non-Aq environments;
  - that are in contact, at equal conditions (e.g., Temperature).
- P answers the question,
  - ‘how much of a given amino acid will reside in each environment?’
For a given amino acid, P is measured by:
  $P = \chi_{\text{nonaq}} / \chi_{\text{aq}}$
  - where $\chi_i$ = mole fraction residing in environment i.
- conceptually simple, but there are some practical…
Hydrophobicity Scale
Numerous scales of amino acid hydrophobicity have been proposed.
- most are based on the $\Delta G^\circ$ of transfer from water to octanol.
- which is ‘best’ is controversial, but one popular scale is…
The Hydrophobicity Scale of Fauchère and Pliska (1983):
- hydrophobicity parameters, derived from:
  - transfer from water to octanol.
  - N-acetyl-amino acid amides.
- Hydrophobicity parameter for a given amino acid:
  $\pi = \ln P = \ln(\chi_{\text{nonaq}} / \chi_{\text{aq}})$
  - Then: Hydrophobic: $\pi > 0$; Hydrophilic: $\pi < 0$.
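The two definitions above (P as a ratio of mole fractions, and hydrophobicity as ln P) can be sketched numerically. This is a minimal illustration only: the function names and the mole fractions in the example are hypothetical, not measured values.

```python
import math

def partition_coefficient(x_nonaq, x_aq):
    """P = (mole fraction in non-Aq phase) / (mole fraction in Aq phase)."""
    if x_nonaq <= 0 or x_aq <= 0:
        raise ValueError("mole fractions must be positive")
    return x_nonaq / x_aq

def hydrophobicity(p):
    """Hydrophobicity parameter as defined in the lecture: ln P."""
    return math.log(p)

# Hypothetical example: 75% of the solute ends up in the non-Aq phase.
P = partition_coefficient(0.75, 0.25)
value = hydrophobicity(P)
label = "hydrophobic" if value > 0 else "hydrophilic"
print(f"P = {P:.2f}, ln P = {value:.2f} ({label})")
```

A residue that partitions equally between the two phases has P = 1 and a hydrophobicity of exactly zero, which is the dividing line between the two labels.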
Aliphatic Amino Acids
These amino acids have alkyl side-chains:
- so that R is a hydrocarbon.
All are hydrophobic (P > 1).
- hence, have hydrophobicity $\pi = \ln P > 0$.
Hydrophobicity increases with side-chain length:
- $\pi$ = 0.31, 1.22, 1.70, 1.80 for Ala, Val, Leu, and Ile, respectively.
Nonpolar Amino Acids
Have a nonpolar side chain, other than a hydrocarbon.
- (Almost) all are hydrophobic (hydrophobicity $\pi > 0$).
- no simple correlation with chain ‘length’.
  - $\pi$ = 0, 0.72, 1.54, 1.23 for Gly, Pro, Cys, and Met, respectively.
Otherwise, have more distinct characters:
- Glycine is achiral; very flexible (a ‘helix-breaker’).
- The side chain of Proline is a closed ring;
  - strong influence on the nature of the peptide bond.
- In some proteins, Cysteine residues form disulfide links.
Aromatic Amino Acids
All highly hydrophobic.
- Phe has a hydrophobicity of $\pi$ = 1.79.
- Tyr is less hydrophobic, at $\pi$ = 0.96,
  - due to its reactive hydroxyl group.
- Trp is the most hydrophobic residue ($\pi$ = 2.25).
Rings are bulky… and tend to interact with other rings.
- due to pi-pi interaction.
- in Aq. solution, rings are perpendicular.
  - entropically favored.
- This is in contrast with the stacked rings in DNA…
  - stacking minimizes solvent exposure.
Polar Amino Acids
Each contains groups with partial charges,
- and therefore tends to form hydrogen bonds...
Thus, much less averse to water:
- Asn, Gln, Ser all have negative hydrophobicity values.
  - $\pi$ = -0.60, -0.22, and -0.04, respectively.
- Thr is slightly hydrophobic, at $\pi$ = 0.26.
  - in some scales, Thr is hydrophilic (e.g., in the ‘hydropathy’ scale).
Charged Amino Acids
Amino acids that carry a charge are very hydrophilic:
- Lys and Arg are (+) charged at pH 7.
  - very hydrophilic ($\pi$ = -0.99 and -1.01, respectively).
- His can also be (+) charged at pH 7 (environment-sensitive).
  - intermediate hydrophobicity ($\pi$ = 0.13).
- Aspartic acid and Glutamic acid are (-) charged at pH 7.
  - very hydrophilic ($\pi$ = -0.77 and -0.64, respectively).
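The class assignments and hydrophobicity values quoted in the preceding slides can be collected into one lookup table. The sketch below uses only the numbers given in the lecture; the helper names and the idea of averaging over a peptide are illustrative assumptions, not part of the original scale.

```python
# (class, hydrophobicity) for the 20 common amino acids, as quoted in the lecture.
HYDROPHOBICITY = {
    "Ala": ("aliphatic", 0.31), "Val": ("aliphatic", 1.22),
    "Leu": ("aliphatic", 1.70), "Ile": ("aliphatic", 1.80),
    "Gly": ("nonpolar", 0.00), "Pro": ("nonpolar", 0.72),
    "Cys": ("nonpolar", 1.54), "Met": ("nonpolar", 1.23),
    "Phe": ("aromatic", 1.79), "Tyr": ("aromatic", 0.96),
    "Trp": ("aromatic", 2.25),
    "Asn": ("polar", -0.60), "Gln": ("polar", -0.22),
    "Ser": ("polar", -0.04), "Thr": ("polar", 0.26),
    "Lys": ("charged", -0.99), "Arg": ("charged", -1.01),
    "His": ("charged", 0.13),
    "Asp": ("charged", -0.77), "Glu": ("charged", -0.64),
}

def classify(residue):
    """Label a residue by the sign of its hydrophobicity value."""
    value = HYDROPHOBICITY[residue][1]
    if value > 0:
        return "hydrophobic"
    if value < 0:
        return "hydrophilic"
    return "neutral"

def mean_hydrophobicity(residues):
    """Average hydrophobicity of a list of three-letter residue codes."""
    return sum(HYDROPHOBICITY[r][1] for r in residues) / len(residues)

# Hypothetical short peptide, used only to exercise the helpers.
peptide = ["Leu", "Lys", "Phe", "Asp", "Gly"]
print([classify(r) for r in peptide], round(mean_hydrophobicity(peptide), 3))
```

Averaging is only a crude summary; it ignores where each residue sits in the folded structure.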
Overall Charge of a Protein
In passing, we note that the overall protein charge:
- depends on the number of acidic and basic residues...
- at the experimental pH of interest.
The Isoelectric Point (pI)
- = the pH at which the total charge of a protein is zero.
- Protein charge density can be estimated from pI:
  $c = (\text{pI} - \text{pH}) / \text{MW}$
  - here, MW is the molecular weight of the protein.
- For pH > pI, overall charge is negative (deprotonation).
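The charge-density estimate above is simple enough to check by hand, but a short sketch makes the sign behaviour explicit. The pI, pH, and molecular weight used here are hypothetical values chosen for illustration.

```python
def charge_density(pI, pH, mw):
    """Estimate c = (pI - pH) / MW, following the lecture's formula."""
    if mw <= 0:
        raise ValueError("molecular weight must be positive")
    return (pI - pH) / mw

def charge_sign(pI, pH):
    """Sign of the overall charge: negative above the pI, positive below it."""
    if pH > pI:
        return "negative"
    if pH < pI:
        return "positive"
    return "zero"

# Hypothetical protein: pI = 9.0, MW = 14000, examined at pH 7.0.
c = charge_density(9.0, 7.0, 14000.0)
print(f"c = {c:.2e} ({charge_sign(9.0, 7.0)})")
```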
Protein Structure
Proteins may have up to 4 levels of structure:
- Primary structure:
  - the N to C amino acid sequence of the polypeptide.
- Secondary structure:
  - helices resulting from local folding.
- Tertiary structure:
  - global folding of secondary structures into a larger structure.
- Quaternary structure:
  - association of several, independent polypeptides.
We will look at each, in detail…
- 1° and 2° structure (this Lecture)
- 3° and 4° structure (next Lecture)
Protein Primary Structure
A polypeptide is a covalently linked chain of amino acids.
- each linked amino acid is called a ‘residue’.
- each pair of residues is connected by a peptide bond.
The N-terminal to C-terminal sequence of residues:
- is the primary structure (1°) of the encoded protein.
Anfinsen’s Principle
Anfinsen’s Principle is basic to biochemistry:
- ‘The information needed to fold a macromolecule into its native, 3-D structure is contained in its sequence’.
  - Denatured ribonuclease spontaneously refolds into the enzymatically active form, in vitro (Anfinsen, 1963).
1° structure then specifies the higher structure:
- Each sequence corresponds to a well-defined 3-D structure,
  - or to a family of closely related structures with activity.
  - the ‘native state’.
- On the other hand… a unique structure does not require a unique sequence.
  - level of sequence homology required for similar structure is only about 25%-30%.
  - non-homologous sequences can also have similar structures.
Protein 2° Structure: The Helices of Polypeptides
Protein secondary structure (2°):
- refers to the regular and repeating structure of a polypeptide.
  - here, regular means symmetric.
- A linear chain of chiral building blocks can form only one symmetric structure… a helix.
  - Note: a β-strand is also a helix.
Protein 2° structure thus refers to the helices formed by polypeptide chains.
- all are held together by hydrogen bonds.
  - H-bond donors = amino-Hydrogens.
  - H-bond acceptors = carbonyl-Oxygens.
- The importance of H-bonding to biopolymer stability:
  - first recognized by Linus Pauling.
Traditional Names for Helices
Determined by the nature of the repeating H-bond:
1. N = number of residues in 1 helix turn.
2. d = number of atoms in the ring formed by each H-bond donor (amino-H) and acceptor (keto-O).
- each helix is assigned the name, ‘$N_d$’.
- e.g., in a $3_{10}$ helix,
  - each turn contains N = 3 residues;
  - each H-bonded donor/acceptor pair forms a ring of d = 10 atoms.
This notation has several shortcomings:
- a β-sheet cannot be described in these terms:
  - each strand is a 2-fold helix, but the H-bonds are between strands.
- No information about handedness (left or right).
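The ring size d in the traditional names can be counted directly. The sketch below walks the backbone from the keto-O of residue i to the amino-H of residue i+n and lists every atom in the ring closed by that H-bond. The function name and atom labels are illustrative, and the atom ordering is an assumption based on the standard N, Cα, C backbone repeat.

```python
def hbond_ring_atoms(n):
    """Atoms in the ring closed by an H-bond from O of residue i to H-N of residue i+n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    atoms = ["O(i)", "C(i)"]                     # acceptor oxygen and its carbonyl carbon
    for k in range(1, n):                        # complete backbone units between the two ends
        atoms += [f"N(i+{k})", f"CA(i+{k})", f"C(i+{k})"]
    atoms += [f"N(i+{n})", f"H(i+{n})"]          # donor nitrogen and the shared hydrogen
    return atoms

for n in (3, 4):
    print(n, len(hbond_ring_atoms(n)))
```

Under these assumptions the count works out to d = 3n + 1, which gives the 10-atom ring for an i to i+3 H-bond and 13 atoms for i to i+4.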
Helical Symmetry
Helical symmetry refers to discrete screw symmetry.
- residues rotate and rise in a repeating manner along an axis.
- This is a special case of screw symmetry:
  - symmetry refers to monomer positions, at discrete values of θ.
For n-fold helical symmetry, monomer positions are related by:
  $C_n\,(x,y,z)_i + T = (x,y,z)_{i+1}$
- $C_n$ = n-fold rotational symmetry operator.
  - T = resulting translation down the helix axis.
- $(x,y,z)_i$ = position of stair, i.
- total translation after n applications of $C_n$: …
The Standard Terminology
For displacements down the helical axis:
- total rise/turn = pitch, P.
- rise/step = helical rise, h.
- steps/turn = helical repeat, c (= n).
  $P = c\,h$
For angular displacements about the axis:
- angle/step = θ = helical angle, or twist.
  $\theta = 2\pi / c = 2\pi h / P$
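For helical symmetry about the z-axis,
- positions of adjacent steps are then related by:
  $$\begin{pmatrix} x \\ y \\ z \end{pmatrix}_{i+1} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}_i + \begin{pmatrix} 0 \\ 0 \\ h \end{pmatrix}$$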
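The relation between adjacent steps (a rotation by the twist about z, followed by a rise h along z) can be applied repeatedly to generate monomer positions. This is a minimal sketch under stated assumptions: c = 3.6 comes from the lecture's α-helix slides, while the rise h = 1.5 (in angstroms, a commonly quoted α-helix value) and the starting radius are assumptions introduced only for the example.

```python
import math

def helix_parameters(c, h):
    """Return (pitch P, twist theta in radians) from helical repeat c and rise h."""
    pitch = c * h                  # P = c h
    theta = 2.0 * math.pi / c      # theta = 2 pi / c = 2 pi h / P
    return pitch, theta

def screw_step(point, theta, h):
    """Rotate a point by theta about the z-axis, then translate it by h along z."""
    x, y, z = point
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (cos_t * x - sin_t * y, sin_t * x + cos_t * y, z + h)

def build_helix(start, c, h, n_steps):
    """Positions of n_steps + 1 successive monomers, starting from `start`."""
    _, theta = helix_parameters(c, h)
    points = [start]
    for _ in range(n_steps):
        points.append(screw_step(points[-1], theta, h))
    return points

# Assumed values: h = 1.5 and a radius of 2.3 are illustrative, not from the lecture.
pitch, theta = helix_parameters(3.6, 1.5)
positions = build_helix((2.3, 0.0, 0.0), 3.6, 1.5, 18)
print(f"P = {pitch:.1f}, twist = {math.degrees(theta):.0f} degrees")
```

With c = 3.6 the monomer returns to its starting x, y position only after 18 steps, which is five full turns.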
Helix Handedness
The Symmetry Matrix does not uniquely specify a helix:
- a helix can be left or right-handed.
- in principle, handedness is given by θ:
  - right-handed: θ > 0.
  - left-handed: θ < 0.
- but we have the convention: θ ≥ 0.
  - all rotations defined as right-handed.
  - Our notation is degenerate…
We distinguish right and left helices by:
- the axial displacement: P or h.
  - For right-handed helices, P and h > 0.
  - For left-handed helices, P and h < 0.
The helix shown here is right-handed.
Naming Helices by Symmetry
Helical symmetry is denoted by $N_T$:
- N denotes its N-fold rotation operator.
- T = translation generated by the symmetry operator, in repeats (monomers).
- compare with $N_d$ notation.
Example: Take the helix at right...
- Each residue rotated by +120°.
  - 3-fold helical symmetry (N = 3).
- 1 application of the symmetry operator:
  - translates a residue by +1 repeat, or P/3.
  - thus, T = 1.
- This helix has $3_1$ symmetry.
  - note that it is a right-handed helix.
  - right-handed helices with integer N are $N_1$ helices.
The $3_{10}$ Helix has $3_1$ Symmetry
Example: The $3_{10}$ Helix
- one of the two common helices in proteins.
- Commonly named by $N_d$ notation:
  - N = 3 residues/turn.
  - d = 10 atoms in each H-bond closed ring.
    - note: green line includes a shared H.
  - ith keto-O H-bonded with the (i+3)th Nitrogen.
The $3_{10}$ helix has $3_1$ helical symmetry.
- Each residue rotated by +120°.
  - exactly 3 residues/turn.
The α-helix has $18_5$ Symmetry
The most common helix in globular proteins.
- ith keto-O H-bonded with the (i+4)th Nitrogen.
  - 13 atoms b/w H-bond donor and acceptor.
The α-helix has $3.6_1$ helical symmetry.
- Each residue rotated by +100°.
  - 3.6 residues/turn (N = 3.6).
- 1 application of the symmetry operator:
  - translates a residue by +1 repeat, or P/3.6.
  - thus, a right-handed helix, with T = 1.
For helices with non-integral symmetry:
- N and T are converted to integers…
- the α-helix is said to have $18_5$ symmetry.
  - 18 residues in 5 full turns.
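The conversion to integers amounts to finding the smallest whole number of turns that contains a whole number of residues. A minimal sketch, assuming a right-handed helix with T = 1 before conversion and a residues-per-turn value supplied as a decimal string (the function name is hypothetical):

```python
from fractions import Fraction

def integer_symmetry(residues_per_turn):
    """Convert residues/turn (as a decimal string, with T = 1) to integer (N, T)."""
    ratio = Fraction(residues_per_turn)      # e.g. "3.6" becomes 18/5 in lowest terms
    return ratio.numerator, ratio.denominator

n, t = integer_symmetry("3.6")
print(f"{n} residues in {t} full turns")
```

Passing the value as a string avoids binary floating-point rounding; an integral input such as "3" is returned unchanged as (3, 1).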
Left-Handed Helices
How do we construct a left-handed helix…
- using only right-handed rotations?
Consider the helix at right:
- Each residue rotated by +120° (N = 3).
- But, application of this operator:
  - translates each residue by 2 repeats (T = 2).
- The resulting helix (units 1, 2, 3):
  - has $3_2$ symmetry.
  - has gaps at 1’, 2’, 3’.
Copying this unit a distance P along the axis:
- fills the gaps, generating a left-handed helix…
- …but, using only right-handed rotations.
A left-handed helix with N-fold helical symmetry: …
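The construction above can be checked numerically. In this sketch each unit is rotated by +360°/N and translated by T repeats, and heights are then folded back into a single pitch, which plays the role of copying the unit a distance P along the axis. The function names are hypothetical, and the sketch assumes N and T share no common factor.

```python
def units_in_one_pitch(n_fold, t):
    """(height in units of P/N, angle in degrees) for each unit, folded into one pitch."""
    units = []
    for k in range(n_fold):
        angle = (360.0 * k / n_fold) % 360.0   # right-handed rotation only
        height = (k * t) % n_fold              # translation by T repeats, folded by P
        units.append((height, angle))
    return sorted(units)

def twist_per_rise(n_fold, t):
    """Signed rotation (degrees) between the units at heights 0 and P/N."""
    angle = units_in_one_pitch(n_fold, t)[1][1]
    return angle - 360.0 if angle > 180.0 else angle

print(twist_per_rise(3, 1))   # 3_1 symmetry
print(twist_per_rise(3, 2))   # 3_2 symmetry
```

Climbing the axis in steps of P/3, the units of the T = 2 helix turn in the opposite sense to those of the T = 1 helix, even though every rotation applied was right-handed.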
The ‘trans-conformation’ is a $2_1$ helix
Consider the fully extended polypeptide:
- keto-O and amino-H of each peptide bond in the trans configuration, and…
- for each residue, both φ and ψ = 180°.
  - …remember our convention:
    - Polymer chemistry: cis = 0°.
  - Biopolymer in a ‘fully’ trans conformation…
    - Somewhat expanded use of the term, ‘trans’.
    - …since cis/trans are defined for configurations.
The β-strand is also a $2_1$ Helix
Differs from the trans-conformation:
- a β-strand is pleated, like a curtain.
- shorter (smaller rise, h).
In order to be stable, β-strands combine to form sheets:
- by strand-to-strand hydrogen bonding.
- two general types:
  - anti-parallel sheets (A).
  - parallel sheets (P).
Generally, these sheets are twisted.
Standard Helices of Biopolymers
- these include all of the standard 2° structures of biopolymers.
Thus, our original contention is correct:
- the only symmetric building block formed by a chain of chiral units is a helix.
Question: can the backbone adopt these structures?
The Peptide Bond
The peptide bond…
- links each pair of adjacent residues;
- is an amide linkage,
  - with a partial double-bond character.
- this bond is not freely rotating:
  - 6 atoms are constrained to a plane:
    $C^\alpha_i,\ C_i,\ O_i,\ N_{i+1},\ H_{i+1},\ C^\alpha_{i+1}$.
The peptide bond can adopt one of 2 configurations:
- based on the positions of $C^\alpha_i$ and $C^\alpha_{i+1}$.
- The trans-configuration ($\omega_i = 180^\circ$):
  - energetically favored… usually adopted.
- The cis-configuration ($\omega_i = 0^\circ$):
  - sterically hindered… energetically unfavorable.
  - exception: Proline (cis and trans nearly…)
The Peptide Bond (cont.)
Fixing the peptide bond in either cis or trans:
- leaves the backbone with only two degrees of freedom at $R_i$.
- these are expressed by two torsion angles:
  - φ = angle about the $N_i$-$C^\alpha_i$ bond.
    - defined by $C_{i-1}$-$N_i$-$C^\alpha_i$-$C_i$.
  - ψ = angle about the $C^\alpha_i$-$C_i$ bond.
    - defined by $N_i$-$C^\alpha_i$-$C_i$-$N_{i+1}$.
- bond lengths are nearly constant… therefore, the (φ, ψ) pair at each residue specifies the backbone conformation.
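Each torsion angle is defined by four consecutive backbone atoms, so it can be computed from their coordinates. The sketch below is a generic dihedral-angle calculation in plain Python. The function names and test coordinates are illustrative, and the sign is chosen so that a clockwise rotation, viewed along the central bond from the second atom toward the third, is positive.

```python
import math

def _sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p0, p1, p2, p3):
    """Torsion angle (degrees) about the p1-p2 bond, defined by atoms p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = tuple(c / norm for c in b1)                        # unit vector along the central bond
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))    # part of b0 perpendicular to the axis
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))    # part of b2 perpendicular to the axis
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))

# Idealized test geometries (not real peptide coordinates).
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(round(cis), round(abs(trans)))
```

For φ the four points would be the coordinates of C(i-1), N(i), Cα(i), C(i); for ψ, N(i), Cα(i), C(i), N(i+1).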
The Ramachandran Plot
The $N_i$-$C^\alpha_i$ bond and the $C^\alpha_i$-$C_i$ bond are single bonds:
- in principle, each may rotate freely…
  - and could then assume any values b/w +/- 180°.
  - however, this is true only for Glycine (R = H).
For other R-groups, non-covalent interactions b/w adjacent side-chains:
- place energetic constraints on φ and ψ.
  - thus, some conformations (φ, ψ) are sterically disallowed.
Sasisekharan and Ramachandran (1968):
- first plotted the van der Waals energies of interaction vs. (φ, ψ).
  - using poly-L-Alanine
    - R = Me, the minimally constrained group (with a $C^\beta$ carbon).
  - with the trans-configuration for each peptide bond.
- the resulting plot is a ‘Ramachandran Plot’.
The Ramachandran Plot (cont.)
Plot shown in terms of allowed regions…
- here, c = helical repeat, with +/- = right/left-handed.
- dark regions = sterically allowed.
- lightly shaded regions = moderately disfavored.
- others = disallowed.
Torsion angles of helices:
- all in allowed regions.
- Note: $\alpha_L$ = left-handed α-helix.
  - Glycine only (achiral).
Not a good model for:
- Glycine or Proline.
- residues w/ bulky side-chains.
Conclusion
In this Lecture, we have discussed:
- Amino Acid Residues
  - Characteristics and Hydrophobicities
- Protein 2° structure:
  - The local, helical structures adopted by polypeptides.
We will continue our discussion in the next Lecture:
- The intermediate level structure of Proteins:
  - Super-2° and Domainal structure of Proteins.
- Methods of Visualization of the overall 3-D structures of Proteins.
- Protein 3° structure:
  - contact plots.
- Protein 4° structure.