Garima Kushwaha BBI-VIII BBI-03005
Agenda
HYDROGENASE, Molecular Dynamics Simulation, NAMD, GROMACS, PyMOL, VMD, CAVER
Periodic oil crises have become the norm for today's market.
Solution: the use of molecular hydrogen as an alternative to fossil fuels is an active area of research.
The long-term goal of this project is to develop efficient and economical technology for the biological conversion of solar energy into molecular hydrogen.
HYDROGENASE
Hydrogenase enzymes were discovered in enteric bacteria by Stephenson and Strickland.
They are found in a wide array of organisms, ranging from aerobes to anaerobes, autotrophs to heterotrophs, prokaryotes to eukaryotic photosynthetic organisms, fermentative organisms, sulfate reducers, and others.
The current research focuses on prokaryotes.
Hydrogenase enzymes catalyze the reversible oxidation of molecular hydrogen to protons and electrons: H2 <=> 2H+ + 2e-
HYDROGENASE (Cont..)
Hydrogenase enzymes are classified by their metal center.
The two most studied classes are: 1. Ni-Fe hydrogenases 2. Fe-only hydrogenases
In general, Ni-Fe hydrogenase enzymes consume molecular hydrogen as a fuel source, whereas Fe-only ("iron-only", or [FeFe]) hydrogenase enzymes produce molecular hydrogen.
Current attempts to identify the structure of the active site and to formulate a mechanism have been fruitful. The x-ray crystal structures of two Fe-only hydrogenases have been obtained thus far.
HYDROGENASE (Cont..)
Much of the recent scientific interest in [FeFe]-hydrogenases, however, concerns a different role entirely: the H2 production properties of [FeFe]-hydrogenases offer the promise of a means for affordable large-scale production of H2 as a source of renewable energy.
New developments in research on Fe-only hydrogenases have piqued interest in using these enzymes for hydrogen production.
They offer a simple and short pathway for water photolysis in a biological organism, which may therefore deliver an attractive conversion efficiency.
Hydrogenase from Clostridium pasteurianum
The Fe-only hydrogenases have been identified in a small group of microbes.
One such enzyme, the soluble, monomeric hydrogenase isolated from the Gram-positive anaerobe Clostridium pasteurianum, has been purified and extensively characterized both biochemically and spectroscopically.
Results from various spectroscopic methods have suggested that the Fe and S are organized into five distinct metal clusters. One of these, termed the H-cluster (hydrogen cluster), is proposed to be the site of hydrogen activation. X-ray crystallographic methods were used to determine the structure of the C. pasteurianum Fe-only hydrogenase (CpI) to 1.8 Å resolution, revealing the structure of the active-site cluster.
Overall Structure
The overall structure of CpI resembles a mushroom, with a large cap connected to a stem. This overall structure can be subdivided into four distinct nonoverlapping domains.
• The largest of the four domains, designated the active-site domain, makes up the mushroom cap.
• The remaining three smaller domains constitute the stem, which contains the accessory [Fe-S] clusters termed FS4A, FS4B, FS4C, and FS2.
• The active-site domain contains about two-thirds of the total protein (amino acid residues 210 to 574). Its fold consists of two four-stranded twisted β sheets, each flanked by a number of α helices, forming what appear to be two nearly equivalent lobes, with one β sheet and its associated helices contained within each lobe.
(C1) Nomenclature for the [Fe–S] clusters of Fe-only hydrogenase CpI.
(C2) GRASP representation of the relative charge on the surface of CpI, with regions of acidic residues (aspartic acid and glutamic acid) indicated in red and regions of basic residues (lysine and arginine) indicated in blue [3].
Hydrogen production in CpI happens at the H-cluster, a metallic cluster bound to and embedded inside the CpI protein matrix.
H-Cluster
The Promise of Cheap Renewable Energy
Cultures of hydrogenase-containing microorganisms have the ability to produce a constant output of hydrogen gas (H2) from just sunlight and water. If harnessed properly, hydrogenase and/or hydrogenase-containing organisms could be used to supply affordable and renewable H2 as an energy fuel, and thus solve the "supply" aspect of the future hydrogen economy.
This idealistic picture is not without problems. Notably, hydrogenase's H-cluster is extremely sensitive to the presence of oxygen gas (O2), which will bind to it permanently. In the presence of O2, hydrogen production is maintained for only a few minutes before the hydrogenases become deactivated. An anaerobic environment is required, making hydrogenase a costly and impractical source of H2.
If we identify the pathways through which O2 reaches the H-cluster, we can create an engineered version of hydrogenase in which these O2 pathways are blocked, thereby decreasing hydrogenase's sensitivity towards O2.
Molecular Dynamics Simulation
Molecular dynamics (MD) is a computational method that calculates the time-dependent behavior of a molecular system. It provides detailed information on the fluctuations and conformational changes of proteins and nucleic acids. MD is a form of computer simulation in which atoms and molecules are allowed to interact for a period of time under known laws of physics.
Because molecular systems generally consist of a vast number of particles, it is impossible to find the properties of such complex systems analytically; MD simulation circumvents this problem by using numerical methods. Molecular dynamics is a multidisciplinary field. Its laws and theories stem from mathematics, physics, and chemistry, and it employs algorithms from computer science and information theory. In the broadest sense, molecular dynamics is concerned with molecular motion.
The driving force for chemical processes is described by thermodynamics. The mechanism by which chemical processes occur is described by kinetics. Thermodynamics dictates the energetic relationships between different chemical states, whereas the sequence or rate of events that occur as molecules transform between their various possible states is described by kinetics:
Conformational transitions and local vibrations are the usual subjects of molecular dynamics studies. Molecular dynamics alters the intramolecular degrees of freedom in a step-wise fashion, analogous to energy minimization. The individual steps in energy minimization are merely directed at establishing a down-hill direction to a minimum. The steps in molecular dynamics, on the other hand, meaningfully represent the changes in atomic position, r_i, over time (i.e. velocity). Newton's equation is used in the molecular dynamics formalism to simulate atomic motion:
$$F_i = m_i a_i$$
The rate and direction of motion (velocity) are governed by the forces that the atoms of the system exert on each other, as described by Newton's equation.
The force on an atom can be calculated from the change in energy between its current position and its position a small distance away. This can be recognized as the negative derivative of the energy with respect to the change in the atom's position:
$$F_i = -\frac{dE}{dr_i}$$
Knowledge of the atomic forces and masses can then be used to solve for the positions of each atom along a series of extremely small time steps (on the order of femtoseconds, 1 fs = 10^-15 seconds). The resulting series of snapshots of structural changes over time is called a trajectory. The use of this method to compute trajectories can be more easily seen when Newton's equation is expressed in the following form:
$$-\frac{dE}{dr_i} = m_i \frac{d^2 r_i}{dt^2}$$
In practice, trajectories are not directly obtained from Newton's equation due to the lack of an analytical solution. First, the atomic accelerations are computed from the forces and masses, $a_i = F_i / m_i$. The velocities are next calculated from the accelerations based on the following relationship:
$$v_i(t + \Delta t) = v_i(t) + a_i(t)\,\Delta t$$
Lastly, the positions are calculated from the velocities:
$$r_i(t + \Delta t) = r_i(t) + v_i(t)\,\Delta t$$
A trajectory between two states can be subdivided into a series of sub-states separated by a small time step, "delta t" (e.g. 1 femtosecond).
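The update cycle described above (force from the energy derivative, acceleration from force and mass, then velocity, then position) can be sketched for a single particle in one dimension. This is only an illustrative forward-Euler sketch: the function names, the harmonic test potential and the step size are assumptions made for the example, and production MD codes use more accurate Verlet-type integrators.

```python
import math


def force(energy, r, h=1e-6):
    """Force as the negative numerical derivative of the energy at position r."""
    return -(energy(r + h) - energy(r - h)) / (2.0 * h)


def simulate(energy, r0, v0, mass, dt, n_steps):
    """Return the list of positions (the trajectory) for n_steps time steps."""
    r, v = r0, v0
    trajectory = [r]
    for _ in range(n_steps):
        a = force(energy, r) / mass   # acceleration from force and mass
        r_next = r + v * dt           # position from the current velocity
        v_next = v + a * dt           # velocity from the current acceleration
        r, v = r_next, v_next
        trajectory.append(r)
    return trajectory


def harmonic(r, k=1.0):
    """Hypothetical test potential: E(r) = k r^2 / 2."""
    return 0.5 * k * r * r


if __name__ == "__main__":
    traj = simulate(harmonic, r0=1.0, v0=0.0, mass=1.0, dt=0.01, n_steps=100)
    # For this potential the exact position at t = 1 is cos(1).
    print(traj[-1], math.cos(1.0))
```

With a small time step the numerical trajectory stays close to the analytical one for this simple potential; with larger steps the forward-Euler error grows quickly, which is one reason real simulations use time steps of about a femtosecond.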
NAMD
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR. You can build NAMD yourself or download binaries for a wide variety of platforms. It has been developed jointly by the Theoretical and Computational Biophysics Group (TCB) and the Parallel Programming Laboratory (PPL) at the University of Illinois at Urbana-Champaign.
NAMD (Cont..)
NAMD requires four files for any MD simulation run:
1. PDB file
2. PSF file
3. Parameter file
4. Configuration file
PDB File Format
Files in the PDB include information such as the name of the compound, the species and tissue from which it was obtained, authorship, revision history, journal citation, references, amino acid sequence, stoichiometry, secondary structure locations, crystal lattice and symmetry group, and finally the ATOM and HETATM records containing the coordinates of the protein and any waters, ions, or other heterogeneous atoms in the crystal. Some PDB files include multiple sets of coordinates for some or all atoms. NAMD and VMD ignore everything in a PDB file except for the ATOM and HETATM records, and when writing PDB files the ATOM record type is used for all atoms in the system, including solvent and ions.
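As an illustration of how the ATOM and HETATM records carry the coordinates, the following minimal sketch reads them using the fixed column positions of the PDB format. The function name and the returned dictionary layout are assumptions made for this example; this is not how NAMD or VMD parse the file internally.

```python
def parse_pdb_atoms(lines):
    """Return a list of atom dictionaries from the ATOM/HETATM records."""
    atoms = []
    for line in lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # header, remarks and other records are skipped
        atoms.append({
            "record": record,
            "name": line[12:16].strip(),     # atom name, columns 13-16
            "resname": line[17:20].strip(),  # residue name, columns 18-20
            "chain": line[21:22].strip(),    # chain identifier, column 22
            "resid": int(line[22:26]),       # residue number, columns 23-26
            "x": float(line[30:38]),         # coordinates in angstroms
            "y": float(line[38:46]),
            "z": float(line[46:54]),
        })
    return atoms


if __name__ == "__main__":
    # Hypothetical file name; any PDB file can be used here.
    with open("protein.pdb") as handle:
        atoms = parse_pdb_atoms(handle)
    print(len(atoms), "atoms read")
```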
Protein Structure File
A PSF file, also called a protein structure file, contains all of the molecule-specific information needed to apply a particular force field to a molecular system. The CHARMM force field is divided into a topology file, which is needed to generate the PSF file, and a parameter file, which supplies specific numerical values for the generic CHARMM potential function. The PSF file contains five main sections of interest: atoms, bonds, angles, dihedrals, and impropers. The X-ray structure of a protein from the Protein Data Bank does not contain information about the hydrogen atoms of that protein. NAMD provides the psfgen utility, which is capable of generating the required PSF and PDB files by merging PDB files and guessing coordinates for missing atoms. The PDB file generated by psfgen along with the PSF file will therefore contain guessed coordinates for the hydrogen atoms of the structure.
Forcefield Parameter File
A force field is a mathematical expression of the potential which atoms in the system experience. CHARMM, X-PLOR, AMBER, and GROMACS are four types of force fields, and NAMD is able to use all of them. The CHARMM force field, which contains topology and parameter information, was used here. The parameter file contains all of the numerical constants needed to evaluate forces and energies, given a PSF structure file and atomic coordinates. The current versions of the CHARMM forcefield are CHARMM22 for proteins and CHARMM27 for lipids and nucleic acids. The individual parameter files are named, respectively, par_all22_prot.inp, par_all27_lipid.inp, and par_all27_na.inp. To enable hybrid systems, combinations are also provided, named par_all27_na_lipid.inp, par_all27_prot_lipid.inp, and par_all27_prot_na.inp. The parameter file used was par_all27_prot_lipid.inp.
NAMD Configuration File
NAMD uses a file referred to as the configuration file. This file contains a set of options and values that specify which dynamics settings NAMD should use, such as the number of timesteps to perform and the initial temperature. These options and values determine the exact behavior of NAMD: how the system will be simulated, which features are active or inactive, how long the simulation should continue, etc.
Steps Involved in Making the Protein Structure File
1. The PDB file of the desired protein is downloaded from the Protein Data Bank.
2. The water molecules are deleted from that PDB file to create a PDB file of the protein alone.
3. A file with the .pdb extension, containing the coordinates of the protein alone without hydrogens, is created by typing the following commands in the VMD TkConsole window:
    set protein [atomselect top protein]
    $protein writepdb proteinp.pdb
4. The PSF file of the protein is created using the psfgen package of VMD. To create a PSF file, a pgn file is first written, which will be the input to psfgen.
5. Open a text editor and type the following commands:
    package require psfgen
    topology top_all27_prot_lipid.inp
    pdbalias residue HIS HSE
    pdbalias atom ILE CD1 CD
    segment U {pdb proteinp.pdb}
    coordpdb proteinp.pdb U
    guesscoord
    writepdb protein.pdb
    writepsf protein.psf
6. Save the file with the .pgn extension (i.e. protein.pgn).
7. In a terminal window, type the following command:
    vmd -dispdev text -e protein.pgn
This runs the psfgen package on the file protein.pgn and generates the PSF and PDB files of the protein with hydrogens.
Topology Files
A CHARMM forcefield topology file contains all of the information needed to convert a list of residue names into a complete PSF structure file. It also contains internal coordinates that allow the automatic assignment of coordinates to hydrogens and other atoms missing from a crystal PDB file. The current versions of the CHARMM forcefield topology files are top_all22_prot.inp, top_all27_lipid.inp, and top_all27_na.inp. To enable computation on hybrid systems, combinations are also provided, named top_all27_na_lipid.inp, top_all27_prot_lipid.inp, and top_all27_prot_na.inp.
Solvating the Protein
The protein needs to be solvated, i.e., put inside water, to more closely resemble the cellular environment. This can be done in two ways, by placing the protein in:
• A water sphere in surrounding vacuum, in preparation for minimization and equilibration without periodic boundary conditions.
• A water box, in preparation for minimization and equilibration with periodic boundary conditions.
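To make the idea of periodic boundary conditions concrete, the sketch below shows, for one coordinate of a cubic box, how a position is wrapped back into the box and how the shortest separation between two atoms is taken across the box edges (the minimum-image convention). The cubic box and the function names are assumptions made for illustration; this is not the implementation used by any particular MD package.

```python
import math


def wrap(x, box):
    """Map a coordinate into the interval [0, box)."""
    return x - box * math.floor(x / box)


def minimum_image(dx, box):
    """Shortest separation along one axis, allowing for periodic copies."""
    return dx - box * round(dx / box)


if __name__ == "__main__":
    box = 10.0  # hypothetical box edge length in angstroms
    print(wrap(10.5, box))           # an atom leaving on one side re-enters on the other
    print(minimum_image(9.0, box))   # two atoms 9 apart are only 1 apart through the boundary
```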
Simulation with Periodic Boundary Conditions
This step covers the minimization and equilibration of hydrogenase in a water box with periodic boundary conditions. The configuration file, 1feh_wb_eq.conf, is opened by typing nedit 1feh_wb_eq.conf. In it, the simulation parameters and the output section are modified. The command used for running the simulation on a parallel machine is as follows:
    nohup (full path of NAMD executable files)/charmrun +p(number of processors) (full path of NAMD executable files)/namd2 (name of configuration file) > (name of output file)
GROMACS: GROningen MAchine for Chemical Simulations
GROMACS is a molecular dynamics simulation package originally developed at the University of Groningen and now maintained and extended at different places, including the University of Uppsala, the University of Stockholm and the Max Planck Institute for Polymer Research.
The GROMACS project was originally started to construct a dedicated parallel computer system for molecular simulations, based on a ring architecture. The molecular dynamics specific routines were rewritten in the C programming language from the Fortran 77-based program GROMOS, which had been developed in the same group. Its highly optimized code makes GROMACS one of the fastest programs for molecular simulations. In addition, the support for different force fields and the open source (GPL) license make GROMACS very flexible. GROMACS is a high-end, high-performance research tool designed for the study of protein dynamics using classical molecular dynamics theory. It can be downloaded from http://www.gromacs.org. GROMACS runs on Linux, Unix, and Windows (a recent development).
GROMACS Input Files
In order to start a molecular dynamics simulation using GROMACS we need three input files:
The atomic coordinates (and, optionally, velocities) are stored in a file that is conventionally called conf.gro
The molecular topology file (conventionally called topol.top) that describes the chemical composition of the system, including information on the force field parameters, bond lengths, etc.
The molecular dynamics parameter file (conventionally called grompp.mdp) which holds parameters like the number of integration steps, treatment of cut-offs and so on.
Steps for creating GROMACS Simulation
1. Creating a GROMACS topology from the PDB file
The PDB file is processed with the pdb2gmx command, which converts the PDB file to a GROMACS coordinate file and writes the topology. The -ignh option makes pdb2gmx ignore any hydrogen atoms present in the input file and add hydrogens according to the force field. The x-y-z coordinates of the atoms are stored in the coordinate file, and the atomic masses, charges, and bonds are stored in a .top file. Command used:
    pdb2gmx_d -ignh -ff G43a1 -f 1FEH.pdb -o feh.pdb -p feh.top
2. Adding solvent water around the protein
editconf takes the coordinate file and the dimensions of the box which you specify, and appends the box dimensions to the coordinate file. Command used:
    editconf -bt cubic -f feh.pdb -o feh.pdb -c -d 0.9
Now that the box dimensions have been specified, it is possible to add more than one of the molecules found in the .pdb file, and to fill the rest of the box with solvent molecules. Both of these operations are handled by the genbox command. Command used
genbox_d -cp feh.pdb -cs spc216.gro -o feh_b4em.pdb -p feh.top
(On periodic images: GROMACS by default uses periodic boundary conditions. )
Ions are added to the system with the genion utility. However, genion does not work directly on .gro files. You must first translate the text-based coordinate file into a binary run input file with the grompp utility. Command used:
    grompp -f em.mdp -c feh_b4em.pdb -p feh.top -o feh_em.tpr
Once the ions are in place, it is time to start running simulations. There are three types of simulation, although the first two are optional.
All three types are executed the same way, by first using grompp and then mdrun.
• grompp takes the .gro, .top and .mdp files and produces a .tpr file, which is the input to mdrun.
• mdrun outputs a large binary file containing the state of the system at regular time intervals (.trr) and also outputs a .gro file which contains the state of the system at the last time step.
3. Running energy minimization
All it does is nudge the atoms in the solute molecules only, until the bond lengths and angles are in their minimum potential energy configuration (completely ignoring the other atoms in the system). Commands used:
    grompp_d -f em.mdp -c feh_ion.pdb -p feh.top -o feh_em.tpr
    mpirun -np 6 /home/class/gromacs3.3.1/src/kernel/mdrun_d -v -s feh_em.tpr -o feh_em.trr -c feh_b4md.pdb -g em.log -e em.edr
4. Carefully equilibrating the water around the protein
The second simulation is the "position restraints" simulation. This one fixes the solute molecules in place and allows the solvent and ions to "relax" into their minimum potential energy positions. Commands used:
    grompp_d -np 6 -f pr.mdp -c feh_b4pr.pdb -r feh_b4pr.pdb -p feh.top -o feh_pr.tpr
    mpirun -np 6 /home/class/gromacs3.3.1/src/kernel/mdrun_d -v -s feh_pr.tpr -o feh_pr.trr -c feh_b4md.pdb -g pr.log -e pr.edr &
5. Running the production simulation
The third simulation is the "full" simulation, and is the one in which the full molecular dynamics is calculated. When running GROMACS on a high performance computer, this is the step which will be parallelised. Commands used:
    grompp_d -np 6 -f md.mdp -c feh_b4md.pdb -r feh_b4md.pdb -p feh.top -o feh_md.tpr
    mpirun -np 6 /home/class/gromacs3.3.1/src/kernel/mdrun_d -v -s feh_md.tpr -o 1sk8_md.trr -c 1sk8_pmd.pdb -g md.log -e md.edr
Making a movie
A normal movie uses roughly 30 frames/second, so a 10-second movie requires 300 simulation trajectory frames. To make a smooth movie the frames should not be more than 1-2 ps apart, or it will just appear to shake nervously. Export a short trajectory from the first 2.5 ns in PDB format (readable by PyMOL) as:
    trjconv -s confout.gro -f traj.xtc -e 2500.0 -o movie.pdb
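The frame arithmetic above can be checked with a few lines of code. This is only a small illustrative calculation; the function names are made up for the example and are not part of GROMACS or PyMOL.

```python
def frames_needed(duration_s, fps=30):
    """Number of trajectory frames for a movie of the given length."""
    return int(round(duration_s * fps))


def simulated_span_ps(n_frames, spacing_ps):
    """Simulated time covered when consecutive frames are spacing_ps apart."""
    return n_frames * spacing_ps


if __name__ == "__main__":
    n = frames_needed(10)                # 10-second movie at 30 frames/second
    print(n)                             # 300
    print(simulated_span_ps(n, 1.0))     # 300.0 ps at 1 ps spacing
    print(simulated_span_ps(n, 2.0))     # 600.0 ps at 2 ps spacing
```

In other words, a smooth 10-second movie at this frame spacing shows only a few hundred picoseconds of simulated time.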
Choose the protein group for output rather than the entire system. If you open this trajectory in PyMOL you can immediately play it using the VCR-style controls on the bottom right, adjust visual settings in the menus, and even use photorealistic ray-tracing for all images in the movie.
PyMOL
PyMOL is a molecular visualization and manipulation system. The program is designed to meet a variety of the molecular graphics, animation, and presentation needs of research scientists in academia and industry, including display of structure information.
Introduction to PyMol
What is PyMol for? – looking at pdb files (protein, nucleic acid, ligands etc.) – making publication quality figures (of models and maps) – NOT for model building
System requirements? – computer (faster is better): PC (Windows/Linux), Mac (OS X) – a 3-button scroll mouse – works with hardware stereo
Where can I get it? – pymol.sourceforge.net – current version: 0.99 – pymol.sourceforge.net/html/ - for the manual
How to start the program? Double-click the application icon (?) or From a terminal window, type “pymol” You should see a command window and a graphics window (may be combined)
Demo
Useful Display Settings Di spl ay > Ba ckg ro un d > wh it e ----set background colour Di spl ay > or tho sc op ic vie w ----no perspective distortion
Using mouse in graphics window • Unmodified controls – Left - rotate molecule (x, y and, at edges, z) – Middle - translate molecule (x, y) – Right - zoom (=MovZ) – Wheel - slab/clip Menu at bottom right • With shift key – Right - up/down: clip front - left/right: clip back
VMD
VMD is designed for the visualization and analysis of biological systems such as proteins, nucleic acids, lipid bilayer assemblies, etc. VMD can read standard Protein Data Bank (PDB) files and display the contained structure. VMD provides a wide variety of methods for rendering and coloring a molecule: simple points and lines, CPK spheres and cylinders, licorice bonds, backbone tubes and ribbons, cartoon drawings, and others. VMD can also be used to animate and analyze the trajectory of a molecular dynamics (MD) simulation. In particular, VMD can act as a graphical front end for an external MD program by displaying and animating a molecule undergoing simulation on a remote computer.
Loading a Molecule
1 Choose the File -> New Molecule... menu item, Fig. 2(a), in the VMD Main window. Another window, the Molecule File Browser (b), will appear on your screen.
2 Use the Browse... button (c) to find the file. Note that when you select the file, you will be back in the Molecule File Browser window. In order to actually load the file you have to press Load (d).
Displaying the Protein
In order to see the 3D structure of our protein we use the mouse in multiple modes.
1 In the OpenGL Display, pressing the first (left) mouse button and moving the mouse is the rotation mode, which rotates the molecule around an axis parallel to the screen, Fig. 3(a).
2 If you press the second (right) button and repeat the previous step, the rotation is done around an axis perpendicular to your screen (b).
3 In the VMD Main window, look at the Mouse menu (Fig. 4). Here, you will be able to switch the mouse mode from Rotation to Translation or Scale modes.
4 The Translation mode will allow you to move the molecule around the screen while holding the first (left) button down.
5 The Scale mode will allow you to zoom in or out by moving the mouse horizontally while holding the first (left) button down.
Exploring Different Drawing Styles
1 Choose the Graphics -> Representations... menu item. A window called Graphical Representations will appear, and you will see highlighted in yellow, Fig. 5(a), the current graphical representation used to display your molecule.
2 In the Draw Style tab (b) you can change the style (d) and color (c) of the representation. This section focuses on the drawing style (the default is Lines).
3 Each Drawing Method has its own parameters. For instance, change the Thickness of the lines by using the controls on the bottom right part (e) of the Graphical Representations window.
4 Now, choose VDW (van der Waals) from Drawing Method. Each atom is now represented by a sphere. In this way you can see more easily the volumetric distribution of the protein.
5 To see the arrangement of atoms in the interior of the protein, use the new controls on the bottom right part of the window (e) to change the Sphere Scale to 0.5 and the Sphere Resolution to 13.
6 Note that with Coloring Method Name, each atom has its own color, i.e. O is red, N is blue, C is cyan and S is yellow.
7 Press the Default button. This allows you to return to the default properties of the drawing method.
8 Choose the Tube style under Drawing Method and observe the backbone of your protein. Set the Radius to 0.8.
9 By looking at your protein in the tube mode, can you distinguish how many helices, sheets and coils are present in the protein? The last drawing method we will explore is NewCartoon. It gives a simplified representation of a protein based on its secondary structure.
10 Choose Drawing Method NewCartoon.
Exploring Different Coloring Methods 1 Choose Coloring Method ResType Fig. 5(c). This allows you to distinguish non-polar residues (white), basic residues (blue), acidic residues (red) and polar residues (green). 2 Select Coloring Method Structure (c) and confirm that the NewCartoon representation displays colors consistent with secondary structure.
Exploring Different Selections 1 In the Selected Atoms text entry Fig. 5(f) of the Graphical Representations window delete the word all, type helix and press the Apply button or hit the Enter/Return key on your keyboard (do this every time you type something). VMD will show just the helices present in our molecule. 2 In the Graphical Representations window choose the Selections tab Fig. 7(a). In section Singlewords (b) you will find a list of possible selections you can type. For instance, try to display sheets instead of helices by typing the appropriate word in the Selected Atoms text entry.
Combinations of boolean operators can also be used when writing a selection.
3 In order to see the molecule without helices and sheets, type the following in Selected Atoms: (not helix) and (not betasheet)
4 In the section Keyword (c) of the Selections tab (a) you can see properties that can be used to select parts of a protein, with their possible values. Look at the possible values of the Keyword resname (d). Display all the lysines and glycines present in the protein by typing (resname LYS) or (resname GLY). Lysines play a fundamental role in the configuration of polyubiquitin chains.
5 In order to see which water molecules are closest to the protein you can use the command within. Type water and within 3 of protein. This selects all the water molecules that are within a distance of 3 angstroms of the protein.
Multiple Representations
The button Create Rep, Fig. 8(a), in the Graphical Representations window allows you to create multiple representations.
1 For the current representation, set the Drawing Method to NewCartoon and the Coloring Method to Structure.
2 In Selected Atoms type protein.
3 Press the Create Rep button (a).
4 Now, using the menu items of the Draw Style tab and the Selected Atoms text entry, modify the new representation in order to get VDW as the Drawing Method, ResType as the Coloring Method, and resname LYS typed in as the current selection.
5 Create a final representation by pressing the Create Rep button again. Select Drawing Method Surf and Coloring Method Molecule, and type protein in the Selected Atoms entry. For this last representation choose the Transparent menu item in the Material section (c).
6 Note that with the mouse you can select the different representations you have created and modify each one independently. You can also switch each one on or off by double-clicking on it, or delete it by using the Delete Rep button (b). Turn off the second and last representations. At the end of this section, the Graphical Representations window should look like Fig. 8.
Sequence Viewer Extension 1 Choose the Extensions ->Analysis -> Sequence Viewer menu item. A window Fig. 9(a) with a list of the amino acids (e) and their properties (b)&(c) will appear in your screen. 2 With the mouse, click over different residues (e) in the list and see how they are highlighted. In addition, the highlighted residue will appear in your OpenGL Display window in yellow and bond drawing method, so you can visualize it easily. Use the right button of the mouse to unselect residues. 3 Using the Zoom controls (f) you can display the entire list of residues in the window. This is especially useful for larger proteins 4 Using the shift key while pressing the mouse button allows you to pick multiple residues at the same time. Look at residues 48, 63, 11 and 29 (e).
Saving your Work In the VMD Main window, choose the File ->Save State menu item. Write an appropriate name (e.g., myfirststate.vmd) and save it. The File -> Load State menu item will allow you to load a previously saved VMD state, just like the file you saved.
CAVER
CAVER provides rapid, accurate and fully automated calculation of pathways leading from buried cavities to outside solvent in static and dynamic protein structures. Calculated pathways can be visualized in the graphics program PyMOL, dissecting the anatomy and dynamics of entrance tunnels. CAVER allows analysis of any molecular structure including proteins, nucleic acids, inorganic materials, etc.
CAVER Versions
• PyMOL plug-in, suitable for calculation of pathways in discrete protein structures.
• Stand-alone version, enabling analysis of trajectories from molecular dynamics simulations.
Plugin integrates PyMOL with CAVER
1. Load a protein structure into PyMOL (File -> Open -> Browse).
2. Select the residues or atoms forming the active site. Click Display and select Sequence. This will display the single-letter sequence of the protein in the PDB file as well as ligands, waters, and ions. Click on FeS4, FeS, HC1 in the sequence. The selection is made under the default name (sele).
3. Start the plugin (Plugin -> Caver Tools).
4. Check the name of your selection, "(sele)".
5. Make sure that the path to the CAVER binary is correct in the "CAVER Location" entry field.
6. Specify the starting point as the geometric centre of the selected atoms/residues by clicking the radio button "Use average point from centres of given selections" and writing "(sele)" in the Selection list.
7. Click the "Show starting point" button. You should see a small crisscross object in the PyMOL Viewer window.
8. You can now switch to the choice "Use X, Y, Z coordinates to specify starting point" and set the starting point more precisely by clicking the arrows. Move the crisscross object into an empty space near the active site in the molecule.
9. Click the "Run CAVER" button, wait some time, and view the results.
Stand-alone version
Molecule preparation:
1. Molecular dynamics run
2. Atom radius preparation
3. Atom types preparation
Molecular dynamics run
The dynamical behaviour of macromolecules can be studied by many different mathematical-physical models. One of them is classical molecular dynamics (MD). Program packages such as AMBER, CHARMM, GROMOS and many others can be used for MD simulation.
Atom radius preparation
CAVER considers a molecule as a group of ball-shaped atoms. Therefore, the radius of every atom type has to be specified. The van der Waals radii (in Å) are used for this purpose. An example of an atom radius file named radius.dat can be seen below.
    H    0.6000
    HP   1.1000
    HA   1.4590
    CE2  1.9080
    CE3  1.9080
    CG2  1.9080
    CH2  1.9080
    ...
Atom types preparation
The atom type is necessary for the appropriate assignment of the atom radius. Types are specified in the file types.dat. The first line specifies the number of atoms in the molecule. This file can be easily generated from a PDB file using gawk. An example of an atom types file named types.dat can be seen below.
    4696
    N
    H1
    H2
    H3
    CA
    HG3
    ...
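As an alternative to gawk, the same kind of file could be produced with a short script. The sketch below is an assumption-laden illustration: it takes the atom type to be the atom name field of each ATOM/HETATM record (PDB columns 13-16) and assumes a layout with the atom count on the first line followed by one type per line; both should be checked against the CAVER documentation and the naming used in radius.dat. The function and file names are hypothetical.

```python
def atom_names_from_pdb(lines):
    """Collect the atom name field of every ATOM/HETATM record."""
    names = []
    for line in lines:
        if line[0:6].strip() in ("ATOM", "HETATM"):
            names.append(line[12:16].strip())
    return names


def format_types_file(names):
    """Atom count on the first line, then one atom type per line (assumed layout)."""
    return "\n".join([str(len(names))] + list(names)) + "\n"


if __name__ == "__main__":
    # Hypothetical input and output file names.
    with open("protein.pdb") as handle:
        names = atom_names_from_pdb(handle)
    with open("types.dat", "w") as out:
        out.write(format_types_file(names))
```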
Configuration of CAVER
1. Grid cell resolution
2. Name of file with radii
3. Name of file with atomic types
4. Name of file with trajectory
5. Trajectory traversing
6. Specification of starting point
7. Cropping number
Configuration of Caver CAVER options are specified in a file named config.dat. An example of config.dat file is shown below. 0.8 radius.dat types.dat simulation.trj 20 1 0 2 3 3 1652 1667 2285 4
Running CAVER
    caver [-h | --help] [-o ] [-e | --enable-outputtrajectory
Options:
    -h, --help                       Display this help screen.
    -i file, --input file
    -e, --enable-output-trajectory   Outputs PDB trajectory of shortest way for PyMOL output, and for GNUPLOT output.
    --enable-output-vmd              Outputs for VMD visualization.
    -o file, --Output file           Writes resulting table of radii.
    -d dir, --Output-dir dir         Set directory for output files.
    -t num, --tun num                Tries to find num tunnels from the active site (for version 0.99.2 and higher).
CAVER configuration is stored in config.dat and must be present in the working directory.
What is a profile of a protein tunnel? A tunnel connects a protein cavity with bulk solvent. The shape of protein tunnel can be approximated as a pipeline with a varying width of cross section. This approximation is useful for estimation of the biggest probe accessing the deepest place in the pocket. The size of a probe able to access internal cavity is limited by radius of tunnel gorge, i.e. the most narrow place in the tunnel. The tunnel profile is a graph of cross section radius (radius of maximally inscribed ball) versus tunnel length measured from its deepest place to the surface opening.
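The role of the tunnel gorge can be illustrated with a short sketch: given a tunnel profile as a list of (distance, radius) pairs, the narrowest point sets the largest probe that can pass. The function names and the sample numbers are hypothetical and only serve the illustration; they are not CAVER output.

```python
def bottleneck(profile):
    """Return the (distance, radius) pair at the narrowest point of the tunnel."""
    if not profile:
        raise ValueError("profile must contain at least one point")
    return min(profile, key=lambda point: point[1])


def probe_fits(profile, probe_radius):
    """True if a spherical probe of this radius can pass the whole tunnel."""
    return probe_radius <= bottleneck(profile)[1]


if __name__ == "__main__":
    # Hypothetical profile: distance from the deepest point and radius, both in angstroms.
    profile = [(0.0, 2.1), (2.0, 1.6), (4.0, 1.2), (6.0, 1.9), (8.0, 2.8)]
    print(bottleneck(profile))        # narrowest point of this made-up tunnel
    print(probe_fits(profile, 1.0))   # a 1.0 A probe passes
    print(probe_fits(profile, 1.5))   # a 1.5 A probe is blocked at the gorge
```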
Sample Profile
NAMD: OUTPUT FOR SIMULATION
The following files were created:
    1feh_wb_eq.coor.BAK
    1feh_wb_eq.dcd
    1feh_wb_eq.dcd.BAK
    1feh_wb_eq.log
    1feh_wb_eq.restart.coor
    1feh_wb_eq.restart.coor.old
    1feh_wb_eq.restart.vel
    1feh_wb_eq.restart.vel.old
    1feh_wb_eq.restart.xsc
    1feh_wb_eq.restart.xsc.old
    1feh_wb_eq.rocks.log
    1feh_wb_eq.vel
    1feh_wb_eq.vel.BAK
    1feh_wb_eq.xsc
    1feh_wb_eq.xsc.BAK
    1feh_wb_eq.xst
    1feh_wb_eq.xst.BAK
RMSD for Individual Residues
The root mean square deviation (RMSD) of a residue measures how far its atomic positions deviate from a reference structure, compared over the time steps of the simulation.
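A minimal sketch of the RMSD calculation is given below. It assumes that the atoms of one residue are supplied as (x, y, z) tuples in the same order in the reference and in every frame, and that the frames have already been aligned to the reference; no superposition is performed here. The function names are hypothetical, and this is not the routine used by VMD or GROMACS.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two equally ordered sets of atoms."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(reference, frame):
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(total / len(reference))


def rmsd_over_time(reference, frames):
    """RMSD of every trajectory frame with respect to the reference."""
    return [rmsd(reference, frame) for frame in frames]


if __name__ == "__main__":
    # Hypothetical two-atom residue, displaced rigidly by (3, 4, 0) in the second frame.
    reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    frames = [reference, [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]]
    print(rmsd_over_time(reference, frames))
```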
Maxwell-Boltzmann Energy Distribution
Temperature distribution
Energies
GROMACS :Radius of Gyration
Root Mean Square Deviation
CAVER: Via Pymol plug-in
Surface on the protein
Cartoon structure of the protein with the surface on tunnels
Surface on the protein with transparency
Results from Standalone version: Tunnel profile
Thus, for CpI, despite the lack of permanently open gas channels, the pathways taken by gas molecules such as O2 and H2 are predefined to lie in areas of the protein that have a natural disposition towards greater density fluctuations, and the number of fluctuating channels is much greater than the number of permanent ones.
THANK YOU!!