1 \documentclass[12pt]{article}
2 %\usepackage{latex2html}
14 \setcounter{secnumdepth}{5}
15 \setcounter{tocdepth}{5}
19 \title{UNRES - A PROGRAM FOR COARSE-GRAINED SIMULATIONS OF PROTEINS}
21 \author{Department of Molecular Modeling\\ Faculty of Chemistry\\ University of Gdansk\\ Sobieskiego 18\\ 80-952 Gdansk, Poland\\
24 Scheraga Group\\ Baker Laboratory of Chemistry \\
25 and Chemical Biology\\ Cornell University\\ Ithaca, NY 14853-1303, USA}
37 %3. General information
39 % 3.2. Functions of the program
40 % 3.3. Companion programs
41 % 3.4. Programming language
44 %5. Customizing your batch and C-shell script
45 %6. Command line and files
48 % 8.1. Main input data file
50 % 8.1.2. Control data (data list format; READ_CONTROL subroutine)
51 % 8.1.2.1 Keywords to chose calculation type
52 % 8.1.2.2 Specification of protein and structure output in non-MD applications
53 % 8.1.2.3. Miscellaneous
54 % 8.1.3. Minimizer options (data list, subroutine READ_MINIM)
55 % 8.1.4. CSA control parameters
56 % 8.1.5. MCM data (data list, subroutine MCMREAD)
57 % 8.1.6. MD data (subroutine READ_MDPAR)
58 % 8.1.7. REMD/MREMD data (subroutine READ_REMDPAR)
59 % 8.1.8. Energy-term weights (data list; subroutine MOLREAD)
60 % 8.1.9. Input and/or reference PDB file name (text format; subroutine MOLREAD)
61 % 8.1.10. Amino-acid sequence (free and text format)
62 % 8.1.11. Disulfide-bridge information (free format; subroutine READ_BRIDGE)
63 % 8.1.12. Dihedral-angle restraint data (free format; subroutine MOLREAD)
64 % 8.1.13. Distance restraints (subroutine READ_DIST_CONSTR)
65 % 8.1.14. Internal coordinates of the reference structure (free format; subroutine READ_ANGLES)
66 % 8.1.15. Internal coordinates of the initial conformation (free format; subroutine READ_ANGLES)
67 % 8.1.15.1. File name with internal coordinates of the conformations to be processed
68 % 8.1.16 Control data for energy map construction (data lists; subroutine MAP_READ)
69 % 8.2. Input coordinate files
70 % 8.3. Other input files
72 % 9.1. Coordinate files
73 % 9.1.1. The internal coordinate (INT) files
74 % 9.1.2. The plain Cartesian coordinate (X) files
75 % 9.1.3. The compressed Cartesian coordinate (CX) files
76 % 9.1.4. The Brookhaven Protein Data Bank format (PDB) files
77 % 9.1.5. The SYBYLL (MOL2) files
78 % 9.2. The summary (STAT) file
80 % 9.2.2. MD and MREMD runs
81 % 9.3. CSA-specific output files
82 %10. Technical support contact information
87 \section{LICENSE TERMS}
93 This software is provided free of charge to academic users, subject to the condition that no part of it be sold or used otherwise for commercial purposes, including, but not limited to its incorporation into commercial software packages, without written consent from the authors. For permission contact Prof. H. A. Scheraga, Cornell University.
96 This software package is provided on an ``as is'' basis. We in no way warrant either this software or results it may produce.
99 Reports or publications using this software package must contain an acknowledgment to the authors and the NIH Resource in the form commonly used in academic research.
108 The current and former developers of UNRES are listed in this section in alphabetic
109 order together with their current or former affiliations.
112 Maurizio Chinchio (formerly Cornell Univ., USA)
113 Cezary Czaplewski (Univ. of Gdansk, Poland)
114 Carlo Guardiani (Georgia State Univ., USA)
115 Yi He (Cornell Univ., USA)
116 Justyna Iwaszkiewicz (Swiss Institute of Bioinformatics, Switzerland)
117 Dawid Jagiela (Univ. of Gdansk, Poland)
118 Stanislaw Jaworski (deceased)
119 Sebastian Kalinowski (Univ. of Gdansk, Poland)
120 Urszula Kozlowska (deceased)
121 Rajmund Kazmierkiewicz (Univ. of Gdansk, Poland)
122 Jooyoung Lee (Korea Institute for Advanced Studies, Korea)
123 Adam Liwo (Univ. of Gdansk, Poland)
124 Mariusz Makowski (Univ. of Gdansk, Poland)
125 Marian Nanias (formerly Cornell Univ., USA)
126 Stanislaw Oldziej (Univ. of Gdansk, Poland)
127 Jaroslaw Pillardy (Cornell Univ., USA)
128 Daniel Ripoll (formerly Cornell Univ., USA)
129 Jeff Saunders (Schrodinger Inc., USA)
130 Harold A. Scheraga (Cornell Univ., USA)
131 Hujun Shen (Dalian Institute of Chemical Physics, P.R. China)
132 Adam Sieradzan (Univ. of Gdansk, Poland)
133 Ryszard Wawak (formerly Cornell Univ., USA)
134 Bartlomiej Zaborowski (Univ. of Gdansk, Poland)
139 \section{GENERAL INFORMATION}
143 \label{sect:geninfo:purpose}
145 Run coarse-grained calculations of polypeptide chains with the UNRES force field.
146 There are two versions of the package which should be kept separate because of
147 non-overlapping functions: version which runs global optimization (Conformational
148 Space Annealing, CSA) and version that runs coarse-grained molecular dynamics and
149 its extension. Because the installation, input file preparation and running CSA
150 and MD versions are similar, a common manual is provided. Items specific
151 for the CSA and MD version are marked ``CSA'' and ``MD'', respectively.
153 MD version can be used to run multiple-chain proteins (however, that version of
154 the code is a new release and might fail if yet un-checked functions are used).
155 The multi-chain CSA version for this purpose is another package (written largely in
158 \subsection{Functions of the program}
159 \label{sect:geninfo:functions}
164 Perform energy evaluation of a single or multiple conformations (serial and parallel) (CSA and MD).
167 Run canonical mesoscopic molecular dynamics (serial and parallel) (MD).
170 Run replica exchange (REMD) and multiplexing replica exchange (MREMD) dynamics (parallel only) (MD).
173 Run multicanonical molecular dynamics (parallel only) (MD).
176 Run energy minimization (serial and parallel) (CSA and MD).
179 Run conformational space annealing (CSA search) (parallel only) (CSA).
182 Run Monte Carlo plus Minimization (MCM) (parallel only) (CSA).
185 Run conformational family Monte Carlo (CFMC) calculations (CSA).
188 Thread the sequence against a database from the PDB and minimize energy of each structure (CSA).
192 Energy and force evaluation is parallelized in MD version.
195 \subsection{Companion programs}
196 \label{sect:geninfo:companion}
198 The structures produced by UNRES can be used as inputs to the following programs provided
199 with this package or separately:
203 \item{xdrf2pdb} -- converts the compressed coordinate files from MD (but not MREMD)runs into
206 \item{xdrf2pdb-m} -- same for MREMD runs (multiple trajectory capacity).
208 \item{xdrf2x} -- converts the plain Cartesian coordinate files into PDB format.
210 \item{WHAM} -- processes the coordinate files from MREMD runs and computes temperature profiles
211 of ensemble averages and computes the probabilities of conformations at selected
212 temperatures; also prepares data for CLUSTER and ZSCORE.
214 \item{CLUSTER} -- does the cluster analysis of the conformations; for MREMD runs takes the
215 coordinate files from WHAM which contain information to compute probabilities
216 of conformations at any temperature.
218 \item{PHOENIX} -- conversion of UNRES conformations to all-atom conformations.
220 \item{ZSCORE} -- force field optimization (for developers).
224 Please consult the manuals of the corresponding packages for details. Note that not
225 all of these packages are released yet; they will be released depending on their
226 readiness for distribution. Contact Adam Liwo, Cezary Czaplewski or Stanislaw Oldziej
227 for developmental versions of these programs.
229 \subsection{Programming language}
230 \label{sect:geninfo:language}
232 This version of UNRES is written almost exclusively in Fortran 77; some subroutines
233 for data management are in ansi-C. The package was parallelized with MPI.
237 \subsection{References}
238 \label{sect:geninfo:references}
240 Citing the following references in your work that makes use of UNRES is gratefully
244 \renewcommand{\section}[2]{}%
245 \begin{thebibliography}{10}
248 A. Liwo, S. Oldziej, M.R. Pincus, R.J. Wawak, S. Rackovsky, H.A. Scheraga.
249 A united-residue force field for off-lattice protein-structure simulations.
250 I: Functional forms and parameters of long-range side-chain interaction potentials
251 from protein crystal data. {\it J. Comput. Chem.}, {\bf 1997}, 18, 849-873.
253 \bibitem{liwo_1997_02}
254 A. Liwo, M.R. Pincus, R.J. Wawak, S. Rackovsky, S. Oldziej, H.A. Scheraga.
255 A united-residue force field for off-lattice protein-structure simulations.
256 II: Parameterization of local interactions and determination
257 of the weights of energy terms by Z-score optimization.
258 {\it J. Comput. Chem.}, {\bf 1997}, 18, 874-887.
260 \bibitem{liwo_1997_03}
261 A. Liwo, S. O{\l}dziej, R. Ka\'zmierkiewicz, M. Groth, C. Czaplewski.
262 Design of a knowledge-based force field for off-lattice simulations of protein
264 {\it Acta Biochim. Pol.}, {\bf 1997}, 44, 527-548.
268 A. Liwo, R. Kazmierkiewicz, C. Czaplewski, M. Groth, S. Oldziej, R.J. Wawak,
269 S. Rackovsky, M.R. Pincus, H.A. Scheraga.
270 United-residue force field for off-lattice protein-structure simulations.
271 III. Origin of backbone hydrogen-bonding cooperativity in united-residue potentials.
272 {\it J. Comput. Chem.}, {\bf 1998}, 19, 259-276.
275 A. Liwo, C. Czaplewski, J. Pillardy, H.A. Scheraga.
276 Cumulant-based expressions for the multibody terms for the correlation between
277 local and electrostatic interactions in the united-residue force field.
278 {\it J. Chem. Phys.}, {\bf 2001}, 115, 2323-2347.
281 J. Lee, D.R. Ripoll, C. Czaplewski, J. Pillardy, W.J. Wedemeyer, H.A. Scheraga,
282 Optimization of parameters in macromolecular potential energy functions by
283 conformational space annealing. {\it J. Phys. Chem. B}, {\bf 2001}, 105, 7291-7298
285 \bibitem{pillardy_2001}
286 J. Pillardy, C. Czaplewski, A. Liwo, W.J. Wedemeyer, J. Lee, D.R. Ripoll,
287 P. Arlukowicz, S. Oldziej, Y.A. Arnautova, H.A. Scheraga,
288 Development of physics-based energy functions that predict medium-resolution
289 structures for proteins of the $\alpha, \beta$, and $\alpha/\beta$ structural classes.
290 {\it J. Phys. Chem. B}, {\bf 2001}, 105, 7299-7311
293 A. Liwo, P. Arlukowicz, C. Czaplewski, S. Oldziej, J. Pillardy, H.A. Scheraga.
294 A method for optimizing potential-energy functions by a hierarchical design
295 of the potential-energy landscape: Application to the UNRES force field.
296 {\it Proc. Natl. Acad. Sci. U.S.A.}, {\bf 2002}, 99, 1937-1942.
298 \bibitem{saunders_2003}
299 J. A. Saunders and H.A. Scheraga.
300 Ab initio structure prediction of two $\alpha$-helical oligomers
301 with a multiple-chain united-residue force field and global search.
302 {\it Biopolymers}, {\bf 2003}, 68, 300-317.
304 \bibitem{saunders_2003_02}
305 J.A. Saunders and H.A. Scheraga.
306 Challenges in structure prediction of oligomeric proteins at the united-residue
307 level: searching the multiple-chain energy landscape with CSA and CFMC procedures.
308 {\it Biopolymers}, {\bf 2003}, 68, 318-332.
310 \bibitem{oldziej_2003}
311 S. Oldziej, U. Kozlowska, A. Liwo, H.A. Scheraga.
312 Determination of the potentials of mean force for rotation about C$^\alpha$-C$^\alpha$
313 virtual bonds in polypeptides from the ab initio energy surfaces of terminally
314 blocked glycine, alanine, and proline. {\it J. Phys. Chem. A}, {\bf 2003}, 107, 8035-8046.
317 A. Liwo, S. Oldziej, C. Czaplewski, U. Kozlowska, H.A. Scheraga.
318 Parameterization of backbone-electrostatic and multibody contributions
319 to the UNRES force field for protein-structure prediction from ab initio
320 energy surfaces of model systems. {\it J. Phys. A}, {\bf 2004}, 108, 9421-9438.
322 \bibitem{oldziej_2004}
323 S. Oldziej, A. Liwo, C. Czaplewski, J. Pillardy, H.A. Scheraga.
324 Optimization of the UNRES force field by hierarchical design of the
325 potential-energy landscape. 2. Off-lattice tests of the method with single
326 proteins. {\it J. Phys. Chem. B.}, {\bf 2004}, 108, 16934-16949.
328 \bibitem{oldziej_2004_02}
329 S. Oldziej, J. Lagiewka, A. Liwo, C. Czaplewski, M. Chinchio,
330 M. Nanias, H.A. Scheraga.
331 Optimization of the UNRES force field by hierarchical design of the
332 potential-energy landscape. 3. Use of many proteins in optimization.
333 {\it J. Phys. Chem. B.}, {\bf 2004}, 108, 16950-16959.
335 \bibitem{oldziej_2004_03}
336 M. Khalili, A. Liwo, F. Rakowski, P. Grochowski, H.A. Scheraga.
337 Molecular dynamics with the united-residue model of polypeptide chains.
338 I. Lagrange equations of motion and tests of numerical stability in the
339 microcanonical mode, {\it J. Phys. Chem. B}, {\bf 2005}, 109, 13785-13797.
341 \bibitem{khalili_2005}
342 M. Khalili, A. Liwo, A. Jagielska, H.A. Scheraga.
343 Molecular dynamics with the united-residue model of polypeptide chains.
344 II. Langevin and Berendsen-bath dynamics and tests on model $\alpha$-helical
345 systems. {\it J. Phys. Chem. B}, {\bf 2005}, 109, 13798-13810.
347 \bibitem{khalili_2005_02}
348 A. Liwo, M. Khalili, H.A. Scheraga.
349 Ab initio simulations of protein-folding pathways by molecular dynamics with
350 the united-residue model of polypeptide chains.
351 {\it Proc. Natl. Acad. Sci. U.S.A.}, {\bf 2005}, 102, 2362-2367.
353 \bibitem{rakowski_2006}
354 F. Rakowski, P. Grochowski, B. Lesyng, A. Liwo, H. A. Scheraga.
355 Implementation of a symplectic multiple-time-step molecular dynamics algorithm,
356 based on the united-residue mesoscopic potential energy function.
357 {\it J. Chem. Phys.}, {\bf 2006}, 125, 204107.
359 \bibitem{nanias_2006}
360 M. Nanias, C. Czaplewski, H.A. Scheraga.
361 Replica exchange and multicanonical algorithms with the coarse-grained
362 united-residue (UNRES) force field.
363 {\it J. Chem. Theory and Comput.}, {\bf 2006}, 2, 513-528.
366 A. Liwo, M. Khalili, C. Czaplewski, S. Kalinowski, S. Oldziej, K. Wachucik, H.A. Scheraga.
367 Modification and optimization of the united-residue (UNRES) potential energy
368 function for canonical simulations. I. Temperature dependence of the effective
369 energy function and tests of the optimization method with single training
371 {\it J. Phys. Chem. B}, {\bf 2007}, 111, 260-285.
373 \bibitem{kozlowska_2007}
374 U. Kozlowska, A. Liwo, H.A. Scheraga.
375 Determination of virtual-bond-angle potentials of mean force for coarse-grained
376 simulations of protein structure and folding from ab initio energy surfaces of
377 terminally-blocked glycine, alanine, and proline.
378 {\it J. Phys.: Condens. Matter}, {\bf 2007}, 19, 285203.
380 \bibitem{chichio_2007}
381 M. Chinchio, C. Czaplewski, A. Liwo, S. Oldziej, H.A. Scheraga.
382 Dynamic formation and breaking of disulfide bonds in molecular dynamics
383 simulations with the UNRES force field.
384 {\it J. Chem. Theory Comput.}, {\bf 2007}, 3, 1236-1248.
387 A.V. Rojas, A. Liwo, H.A. Scheraga.
388 Molecular dynamics with the united-residue force field: Ab Initio folding
389 simulations of multichain proteins.
390 {\it J. Phys. Chem. B}, {\bf 2007}, 111, 293-309.
393 A. Liwo, C. Czaplewski, S. Oldziej, A.V. Rojas, R. Kazmierkiewicz,
394 M. Makowski, R.K. Murarka, H.A. Scheraga.
395 Simulation of protein structure and dynamics with the coarse-grained UNRES
396 force field. In: Coarse-Graining of Condensed Phase and Biomolecular
397 Systems., ed. G. Voth, Taylor \& Francis, 2008, Chapter 8, pp. 107-122.
399 \bibitem{czaplewski_2009}
400 C. Czaplewski, S. Kalinowski, A. Liwo, H.A. Scheraga.
401 Application of multiplexed replica exchange molecular dynamics
402 to the UNRES force field: tests with $\alpha$ and $\alpha+\beta$ proteins.
403 {\it J. Chem. Theory Comput.}, {\bf 2009}, 5, 627-640.
406 Y. He, Y. Xiao, A. Liwo, H.A. Scheraga.
407 Exploring the parameter space of the coarse-grained UNRES force field by random
408 search: selecting a transferable medium-resolution force field.
409 {\it J. Comput. Chem.}, {\bf 2009}, 30, 2127-2135.
411 \bibitem{kozlowska_2010}
412 U. Kozlowska, A. Liwo. H.A. Scheraga.
413 Determination of side-chain-rotamer and side-chain and backbone
414 virtual-bond-stretching potentials of mean force from AM1 energy surfaces of
415 terminally-blocked amino-acid residues, for coarse-grained simulations of
416 protein structure and folding. 1. The Method.
417 {\it J. Comput. Chem.}, {\bf 2010}, 31, 1143-1153.
419 \bibitem{kozlowska_2010_02}
420 U. Kozlowska, G.G. Maisuradze, A. Liwo, H.A. Scheraga.
421 Determination of side-chain-rotamer and side-chain and backbone
422 virtual-bond-stretching potentials of mean force from AM1 energy surfaces of
423 terminally-blocked amino-acid residues, for coarse-grained simulations of
424 protein structure and folding. 2. Results, comparison with statistical
425 potentials, and implementation in the UNRES force field.
426 {\it J. Comput. Chem.}, {\bf 2010}, 31, 1154-1167.
429 A. Liwo, S. Oldziej, C. Czaplewski, D.S. Kleinerman, P. Blood, H.A. Scheraga.
430 Implementation of molecular dynamics and its extensions with the coarse-grained
431 UNRES force field on massively parallel systems; towards millisecond-scale
432 simulations of protein structure, dynamics, and thermodynamics.
433 {\it J. Chem. Theory Comput.}, {\bf 2010}, 6, 890-909.
435 \end{thebibliography}
440 \section{INSTALLATION}
443 The distribution is contained in the UNRES.tar.gz file. To uncompress say:
446 gzip -cd UNRES.tar.gz | tar xf -
449 This will produce a directory named UNRES with the following subdirectories:
453 \item{src\_CSA} -- the CSA-version source directory.
455 \item{src\_MD} -- the MD-version source directory, single chains.
457 \item{src\_MD-M} -- the MD-version source directory, oligomeric proteins
459 \item{bin} -- the binaries/scripts directory; its BATCH\_SCRIPTS directory contains the
460 batch scripts (at present the only example is for PBS: unres\_3P\_PBS.csh,
461 which is an UNRES calling script and start.mat, which is the batch script
462 submitted to the PBS system).
464 \item{doc} -- documentation (this file and EXAMPLES.TXT)
466 \item{examples} -- sample input files (see EXAMPLES.TXT for description).
470 To produce the executable do the following:
472 \begin{enumerate}[(a)]
475 To build parallel version, make sure that MPI is installed in your system.
476 Note that the package will have limited functions when compiled in a single-CPU mode.
477 On linux cluster the command source \$HOME/.env should be added to .tcshrc
478 or equivalent file to use parallel version of the program, the
479 alternative is to use queuing system like PBS.
480 In some cases the FORTRAN library subroutine GETENV does not work properly
481 with MPI, if the script is run interactively. In such a case try to
482 add the source mygentenv.F and turn on the -DMYGETENV preprocessor flag.
485 Change directory to the respective source directory.
488 Edit the appropriate Makefile (parallel program that includes CSA
489 procedure, the serial version is no longer supported, for serial task
490 parallel program can be run using only one processor) to customize to your
491 system. Makefiles for the following systems are provided:
493 Makefile\_osf\_f90 - OSF1/Tru64 UNIX HP Alphaserver with f90 compiler,
494 Makefile\_lnx\_pgf90 - Linux, the pgf90 compiler,
495 Makefile\_lnx\_ifc - Linux, ifc compiler.
496 Makefile\_win\_pgf90 - Windows, the pgf90 compiler.
498 Other systems should not cause problems; all you have to do is to change
499 the compiler, compiler options, and preprocessor options. Also, change the
500 BIN variable, if you want to put your binaries in other place than
501 PROTARCH/BIN. In the case of Makefile make sure that the MPI directories are
504 The following architectures are defined in the .F source files:
508 \item{AIX} -- AIX systems (put -DAIX as one of the preprocessor options, if
509 this is your system).
511 \item{LINUX} -- Linux (put -DLINUX).
513 \item{G77} -- Gnu-Fortran compilers (might require sum moderate source code editing)
514 (put -DG77). The recommended compiler is gfortran and not g77.
516 \item{PGI} -- PGI compilers.
518 \item{WINPGI} -- additional setting for PGI compilers for MS Windows.
520 \item{SGI} -- all SGI platforms; should also be good for SUN platforms (put -DSGI).
522 \item{WIN} -- MS Windows with Digital Fortran compiler (put -DWIN)
526 For other platforms, the only problems might appear in connection with
527 machine-specific I/O instructions. Many files are opened in the append
528 mode, whose specification in the OPEN statement is quite machine-dependent.
529 In this case you might need to modify the source code accordingly.
530 The other platform dependent routines are the timing routines contained
531 in timing.F. In addition to the platforms specified above, ES9000, SUN,
532 KSR, and CRAY are defined there.
534 For parallel build -DMP and -DMPI must be set (these are set in Makefile).
536 IMPORTANT! Apart from this, two define flags: -DCRYST\_TOR and -DMOMENT
537 define earlier versions of the force field. The MUST NOT be entered, if
538 the CASP5 and later versions of the force field are used.
542 Build the unres executables by typing at your UNIX prompt:
545 make # will build unres
546 make clean # will remove the object files
549 The bin directory contains pre-built binaries for Red Hat Linux. These
550 executables are specified in the csh scripts listed in section 4.
554 Customize the C-shell scripts unres.unres (to run the parallel version on
555 set of workstation). See the next section of this manual for guidance.
557 After the executables are build and C-shell scripts customized, you can run the
558 test examples contained in UNRES/examples.
564 \section{CUSTOMIZING YOUR C-SHELL SCRIPT}
567 IMPORTANT NOTE -- The unres.csh script is for Linux and should also be easily
568 adaptable to other systems running MPICH. This script is for interactive
569 parallel jobs. Examples of scripts compatible with PBS (pbs.sub) and LoadLever
570 (sp2.sub) queuing systems are also provided.
572 Edit the following lines in your unres.csh script:
575 set DD = your_database_directory
578 e.g., if you installed the package on the directory /usr/local, this line
582 set DD = /usr/local/UNRES/PARAM
583 set BIN = your_binaries_directory
584 set FGPROCS = number_of_processors_per_energy/force_evaluation (MD)
587 e.g., if the root directory is as above:
590 set BIN = /usr/local/UNRES/bin
593 \section{COMMAND LINE AND FILES}
596 To run UNRES interactively enter the following command at your Unix prompt
597 or put it in the batch script:
600 unres.csh POTENTIAL INPUT N_PROCS
605 POTENTIAL specifies the side-chain interaction potential type and must be
606 one of the following:
610 \item{LJ} -- 6-12 radial Lennard-Jones.
612 \item{LJK} -- 6-12 radial Lennard-Jones-Kihara (shifted Lennard Jones).
614 \item{BP} -- 6-12 anisotropic Berne-Pechukas based on Gaussian overlap (dilated
617 \item{GB} -- 6-12 anisotropic Gay-Berne (shifted Lennard-Jones).
619 \item{GBV} -- 6-12 anisotropic Gay-Berne-Vorobjev (shifted Lennard-Jones).
621 See section \ref{sect:forcefields} (Force Fields) for explanation and usage.
623 At present, only the LJ and GB potentials are applied. The LJ potential
624 is used in the ``CASP3'' version of the UNRES force field that is able
625 to predict only $\alpha$-helical structures. All further version of the
626 UNRES force field use the GB potential. For the description of all above-mentioned
627 potentials see ref. \cite{liwo_1997_02}.
629 \item{INPUT} is the prefix for input and output files (see below)
631 \item{N\_PROCS} is the number of processors; for a CSA or REMD/MREMD run it MUST be at least 2.
635 Note! The script takes one more variable, FGPROCS, as the fourth argument,
636 which is the number of fine-grain processors to parallelize energy
637 evaluations. The corresponding code is in UNRES/CSA, but it was written
638 using MPL instead of MPI and therefore is never used in the present version.
639 At present we have no plans to rewrite fine-grain parallelization using MPI,
640 because we found that the scalability for up to 200 residue polypeptide
641 chains was very poor, due to a small number of interactions and,
642 correspondingly, unfavorable ratio of the overhead to the computation time.
646 \item{INPUT.inp} contains the main input data and the control parameters of the CSA
649 \item{INPUT.out\_POTENTIAL\_xxx} is the main output files from different processors; xxx
650 denotes the number of the processor
652 \item{INPUT\_POTENTIALxxx.stat} is the summary files with the energies, energy components,
653 and RMS deviations of the conformations produced by each of the processors;
654 not used in CSA runs; also it outputs different quantity in MD/MREMD runs.
656 CSA version specific files:
658 \item{INPUT\_POTENTIALxxx.int} is the internal coordinates; in the CSA run
660 \item{INPUT\_POTENTIAL\_000.int} contains the coordinates of the conformations,
661 and the other files are empty
663 \item{INPUT.CSA.history} is the history file from a CSA run. This is an I/O file, because
664 it can be used to restart an interrupted CSA run.
666 \item{INPUT.CSA.seed} stores the random seed generated in a CSA run; written for
669 \item{INPUT.CSA.bank} is the current bank of conformations obtained in CSA calculations
670 (expressed as internal coordinates). This information is also stored in
671 INPUT\_POTENTIAL000.int
673 \item{INPUT.CSA.rbank} -- as above, but contains random-generated conformations.
677 MD version specific files:
681 \item{INPUT\_MDyyy.pdb} is the Cartesian coordinates of the conformations in PDB format.
683 \item{INPUT\_MDyyy.x} is the Cartesian coordinates of the conformations in ASCII format.
685 \item{INPUT\_MDyyy.cx} is the Cartesian coordinates of the conformations in compressed format
686 (need xdr2pdb to convert to PDB format).
689 The program currently produces some more files, but they are not used
690 for any purposes and most of them are scratched after a run is completed.
692 The run script also contains definitions of the parameter files through the
693 following environmental variables:
697 \item{SIDEPAR} -- parameters of the SC-SC interaction potentials ($U_{SC SC}$);
699 \item{SCPPAR} -- parameters of the SC-p interaction potential ($U_{SCp}$); this file can
700 be ignored by specifying the -DOLDSCP preprocessor flag, which means that the
701 built-in parameters are used; at present they are the same as the parameters
702 in the file specified by SCPPAR;
704 \item{ELEPAR} -- parameters of the p-p interaction potentials ($U_{pp}$);
706 \item{FOURIER} -- parameters of the multibody potentials of the coupling between the
707 backbone-local and backbone-electrostatic interactions ($U_{corr}$);
709 \item{THETPAR} -- parameters of the virtual-bond-angle bending potentials ($U_b$);
711 \item{ROTPAR} -- parameters of the side-chain rotamer potentials ($U_{rot}$);
713 \item{TORPAR} -- parameters of the torsional potentials ($U_{rot}$);
715 \item{TORDPAR} -- parameters of the double-torsional potentials.
717 \item{SCCORPAR} -- parameters of the supplementary torsional sequence-specific potentials.
718 (implemented recently).
724 \section{FORCE FIELDS}
725 \label{sect:forcefields}
727 UNRES is being developed since 1997 and several versions of the force field
728 were produced. The settings and references to these force fields are
731 Force fields for CSA version (can be used in MD but haven't been parameterized for this
735 \hspace{-2cm}\begin{longtable}{|l|l|l|l|l|l|l|}\hline
737 %---------------------------------------------------------------------------------------
738 & Additional & SC-SC & Example script & Structural &\\
739 Force field & compiler flags& potential& and executables & classes covered& References\\
740 & & & (Linux; PGF90 &&\\
741 & & & and IFC) &&\\ \hline
742 %---------------------------------------------------------------------------------------
743 CASP3 & -DCRYST\_TOR & LJ & unres\_CASP3.csh &only $\alpha$ &\cite{liwo_1997,liwo_1997_02,liwo_1998}\\
744 & -DCRYST\_BOND & &unres\_pgf90\_cryst\_tor.exe&&\\
745 & -DCRYST\_THETA & &unres\_ifc6\_cryst\_tor.exe &&\\
749 ALPHA & -DMOMENT & GB & unres\_CASP4.csh &only $\alpha$ &\cite{liwo_2001,lee_2001,pillardy_2001}\\
750 & -DCRYST\_BOND & &unres\_pgf90\_moment.exe &&\\
751 & -DCRYST\_THETA & &unres\_ifc6\_moment.exe &&\\
754 BETA & -DMOMENT & GB & unres\_CASP4.csh &only $\beta$ &\cite{liwo_2001,lee_2001,pillardy_2001}\\
755 & -DCRYST\_BOND & &unres\_pgf90\_moment.exe &&\\
756 & -DCRYST\_THETA & &unres\_ifc6\_moment.exe &&\\
759 ALPHABETA & -DMOMENT & GB & unres\_CASP4.csh & all &\cite{liwo_2001,lee_2001,pillardy_2001}\\
760 & -DCRYST\_BOND & &unres\_pgf90\_moment.exe &&\\
761 & -DCRYST\_THETA & &unres\_ifc6\_moment.exe &&\\
764 CASP5 & -DCRYST\_BOND & GB & unres\_CASP5.csh & all &\cite{liwo_2002,saunders_2003,saunders_2003_02,liwo_2004}\\
765 & -DCRYST\_THETA & & unres\_pgf90.exe &&\\
766 & -DCRYST\_SC & & unres\_ifc6.exe &&\\
768 3P & -DCRYST\_BOND & GB & unres\_3P.csh & all &\cite{oldziej_2004,oldziej_2004_02}\\
769 & -DCRYST\_THETA & & unres\_pgf90.exe &&\\
770 & -DCRYST\_SC & & unres\_ifc6.exe &&\\
772 4P & -DCRYST\_BOND & GB & unees\_4P.csh & all &\cite{oldziej_2004,oldziej_2004_02}\\
773 & -DCRYST\_THETA & & unres\_pgf90.exe&&\\
774 & -DCRYST\_SC & & unres\_ifc6.exe&&\\ \hline
775 %---------------------------------------------------------------------------------------
781 Force fields for MD version \cite{khalili_2005,khalili_2005_02}.
784 \begin{longtable}{|l|l|l|l|l|l|l|}\hline
785 %---------------------------------------------------------------------------------------
786 & Additional & SC-SC & Example script & Structural &\\
787 Force field & compiler flags& potential& and executables & classes covered& References\\
788 & & & (Linux; PGF90&&\\
789 & & & and IFC)&&\\ \hline
790 %---------------------------------------------------------------------------------------
791 GAB & -DCRYST\_BOND & GB & unres\_GAB.csh & mostly $\alpha$ & \cite{liwo_2007}\\
792 & -DCRYST\_THETA &&&&\\
795 E0G & -DCRYST\_BOND & GB & unres\_E0G.csh & mostly $\alpha$ & \cite{liwo_2007}\\
796 & -DCRYST\_THET &&&&\\
799 1L2Y\_1LE1 & none & GB & unres\_ab.csh & all & \cite{liwo_2007,kozlowska_2007,he_2009,kozlowska_2010,kozlowska_2010_02}\\ \hline
800 %---------------------------------------------------------------------------------------
804 The example scripts (the *.csh filed) contain all appropriate parameter files, while
805 the energy-term weights are provided in the example input files listed in EXAMPLES.TXT
806 (*.inp; see section \ref{sect:input}. for description of the input files). However, it is user's
807 responsibility to specify appropriate compiler flags. Note that a version WILL NOT work,
808 if the force-field specific compiler flags are not set. The parameter files specified
809 in the run script also must strictly correspond to the energy-term weights specified in
810 the input file. The parameter files for specific force fields are also specified below
811 and the energy-term weights are specified in section \ref{sect:input}.
813 The parameter files are as follows (the environment variables from section \ref{sect:command} are
814 used to identify the parameters):
818 \begin{longtable}{ll}
819 BONDPAR &bond.parm \\
820 THETPAR &thetaml.5parm\\
821 ROTPAR &scgauss.parm\\
822 TORPAR &torsion\_cryst.parm\\
823 TORDPAR &torsion\_double\_631Gdp.parm (not used)\\
824 SIDEPAR &scinter\_LJ.parm\\
825 ELEPAR &electr.parm\\
827 FOURIER &fourier\_GAP.parm (not used)\\
828 SCCORPAR&rotcorr\_AM1.parm (not used)\\
831 ALPHA, BETA, ALPHABETA (CASP4):
833 \begin{longtable}{ll}
834 BONDPAR &bond.parm \\
835 THETPAR &thetaml.5parm\\
836 ROTPAR &scgauss.parm\\
837 TORPAR &torsion\_ecepp.parm\\
838 TORDPAR &torsion\_double\_631Gdp.parm (not used)\\
839 SIDEPAR &scinter\_GB.parm\\
840 ELEPAR &electr.parm\\
842 FOURIER &fourier\_GAP.parm\\
843 SCCORPAR&rotcorr\_AM1.parm (not used)\\
848 \begin{longtable}{ll}
850 THETPAR &thetaml.5parm\\
851 ROTPAR &scgauss.parm\\
852 TORPAR &torsion\_631Gdp.parm\\
853 TORDPAR &torsion\_double\_631Gdp.parm\\
854 SIDEPAR &scinter\_GB.parm\\
855 ELEPAR &electr\_631Gdp.parm\\
857 FOURIER &fourier\_opt.parm.1igd\_iter7n\_c\\
858 SCCORPAR&rotcorr\_AM1.parm (not used)\\
863 \begin{longtable}{ll}
865 THETPAR &thetaml.5parm\\
866 ROTPAR &scgauss.parm\\
867 TORPAR &torsion\_631Gdp.parm\\
868 TORDPAR &torsion\_double\_631Gdp.parm\\
869 SIDEPAR &sc\_GB\_opt.3P7\_iter81\_1r\\
870 ELEPAR &electr\_631Gdp.parm\\
872 FOURIER &fourier\_opt.parm.1igd\_hc\_iter3\_3\\
873 SCCORPAR&rotcorr\_AM1.parm (not used)\\
878 \begin{longtable}{ll}
880 THETPAR &thetaml.5parm\\
881 ROTPAR &scgauss.parm\\
882 TORPAR &torsion\_631Gdp.parm\\
883 TORDPAR &torsion\_double\_631Gdp.parm\\
884 SIDEPAR &sc\_GB\_opt.4P5\_iter33\_3r\\
885 ELEPAR &electr\_631Gdp.parm\\
887 FOURIER &fourier\_opt.parm.1igd\_hc\_iter3\_3\\
888 SCCORPAR&rotcorr\_AM1.parm (not used)\\
893 \begin{longtable}{ll}
895 THETPAR &thetaml.5parm\\
896 ROTPAR &scgauss.parm\\
897 TORPAR &torsion\_631Gdp.parm\\
898 TORDPAR &torsion\_double\_631Gdp.parm\\
899 SIDEPAR &sc\_GB\_opt.1gab\_3S\_qclass5no310-shan2-sc-16-10-8k\\
900 ELEPAR &electr\_631Gdp.parm\\
902 FOURIER &fourier\_opt.parm.1igd\_hc\_iter3\_3\\
903 SCCORPAR&rotcorr\_AM1.parm\\
908 \begin{longtable}{ll}
910 THETPAR &thetaml.5parm\\
911 ROTPAR &scgauss.parm\\
912 TORPAR &torsion\_631Gdp.parm\\
913 TORDPAR &torsion\_double\_631Gdp.parm\\
914 SIDEPAR &sc\_GB\_opt.1e0g-52-17k-2k-newclass-shan1e9\_gap8g-sc\\
915 ELEPAR &electr\_631Gdp.parm\\
917 FOURIER &fourier\_opt.parm.1igd\_hc\_iter3\_3\\
918 SCCORPAR&rotcorr\_AM1.parm\\
923 \begin{longtable}{ll}
924 BONDPAR &bond\_AM1.parm\\
925 THETPAR &theta\_abinitio.parm\\
926 ROTPAR &rotamers\_AM1\_aura.10022007.parm\\
927 TORPAR &torsion\_631Gdp.parm\\
928 TORDPAR &torsion\_double\_631Gdp.parm\\
929 SIDEPAR &scinter\_\${POT}.parm\\
930 ELEPAR &electr\_631Gdp.parm\\
932 FOURIER &fourier\_opt.parm.1igd\_hc\_iter3\_3\\
933 SCCORPAR&rotcorr\_AM1.parm\\
936 Additionally, for 1L2Y\_1LE1, the following environment variables and files are required
937 to generate random conformations:
939 THETPARPDB thetaml.5parm\\
940 ROTPARPDB scgauss.parm
942 For CSA, the best force field is 4P. For MD, the 1L2Y\_1LE1 force field is best for
943 ab initio prediction but provides medium resolution (5 A for 60-residue proteins) and
944 overemphasizes $\beta$-structures and has to be run with secondary-structure-prediction
945 information. For prediction of the structure of mostly $\alpha$-protein, and for running
946 dynamics of large proteins, the best is the GAB force field. All these force fields
947 were trained by using our procedure of hierarchical optimization \cite{oldziej_2004,oldziej_2004_02}.
948 The 4P and 1L2Y\_1LE1 force fields have considerable power independent of structural class.
949 The ALPHA, BETA, and ALPHABETA force fields (for CSA) were used in the CASP4 exercises
950 and the CASP5 force field was used in the CASP5 exercise with some success; ALPHA
951 predicts reasonably the structure of $\alpha$-helical proteins and is still not obsolete,
952 while for $\beta$- and $\alpha+\beta$-structure prediction
953 3P or 4P should be used, because they are cheaper and more reliable than BETA and
954 ALPHABETA. The early CASP3 force field is included for historical reasons only.
958 \section{INPUT FILES}
961 \subsection{Main input data file}
962 \label{sect:input:main}
964 Most of the data are organized as data lists, where the data can be put
965 in any order, using a series of statements of the form:
969 for simple non-logical variables
975 to indicate that the corresponding option is turned on. For array variables
976 the assignment statement is:
978 KEYWORD=value1,value2,...
980 However, the data lists are unnamed and that must be placed EXACTLY in the
981 order indicated below. The presence of an \& in the 80th column of a line
982 indicates that the next line will belong to the same data group. The parser
983 subroutines that interpret the keywords are case insensitive.
985 Each group of data organized as a data list is indicated as data list format
988 \subsubsection{Title}
989 \label{sect:input:main:title}
991 Any string containing up to 80 characters. The first input line is always
992 interpreted as title.
994 \subsubsection{Control data}
995 \label{sect:input:main:control}
997 This data section is in data list format and is read in the READ\_CONTROL subroutine.
999 \paragraph{Keywords to chose calculation type}
1003 \item{OUT1FILE} -- only the master processor prints the output file in a parallel job
1005 \item{MINIMIZE} -- if present, energy minimization will be carried out.
1007 \item{REGULAR} -- regularize the read in conformation (usually a crystal or
1008 NMR structure) by doing a series of three constrained minimizations,
1009 to keep the structure as close as possible to the starting
1010 (experimental) structure. The constraints are the CA-CA distances
1011 of the initial structure. The constraints are gradually diminished
1012 and removed in the last minimization.
1014 \item{SOFTREG} -- regularize the read in conformation (usually a crystal or NMR
1015 structure) by doing a series of constrained minimizations, with
1016 additional use of soft potential and secondary structure
1017 freezing, to keep the structure as close as possible to the
1018 starting (experimental) structure.
1021 \item{CSA} -- if present, the run is a CSA run. At present, this is the only
1022 reliable mode of doing global conformational search with this
1023 package; it is NOT recommended to use MCM or THREAD for this
1026 \item{MCM} -- if present, this is a Monte Carlo Minimization (MCM) run.
1028 \item{MULTCONF} -- if present, conformations will be read from the INPUT.intin
1031 \item{MD} -- run canonical MD (single or multiple trajectories).
1033 \item{RE} -- run REMD or MREMD (parallel jobs only).
1035 \item{MUCA} -- run multicanonical MD calculations (parallel jobs only).
1037 \item{MAP=number} (integer) --
1038 Conformational map will be calculated in chosen angles.
1040 \item{THREAD=number} (integer) --
1041 Threading or threading-with-minimization run, using a database of structures
1042 contained in the \$DD/patterns.cart pattern data base (502 chains or chain
1043 fragments), using a total number patterns. It is recommended to use this with
1044 energy minimization; this implies regularization of each minimized pattern.
1045 See refs. \cite{liwo_1997_02} and \cite{liwo_1997_03}.
1047 \item{CHECKGRAD} -- compare numerical and analytical gradient; to be followed by:
1049 \item{CART} -- energy gradient in virtual-bond vectors (Cartesian coordinates)
1051 \item{INT} -- energy gradient in internal coordinates (default)
1053 \item{CARINT} -- derivatives of the internal coordinates in the virtual-bond vectors.
1057 \paragraph{Specification of protein and structure output in non-MD applications}
1061 \item{ONE\_LETTER} -- one-letter and not three-letter code of the amino-acid residues
1064 \item{SYM} (1) -- number of chains with same sequence (for oligomeric proteins only).
1066 \item{PDBSTART} -- the initial conformation is read in from a PDB file.
1068 \item{UNRES\_PDB} -- the starting conformation is in UNRES representation (C$^\alpha$
1069 and SC coordinates only). This keyword MUST appear in such a case
1070 or the program will generate erroneous and unrealistic side-chain
1073 \item{RAND\_CONF} -- start from a random conformation.
1075 \item{EXTCONF} -- start from an extended chain conformation.
1077 \item{PDBOUT} -- if present, conformations will be output in PDB format. Note that
1078 this keyword affects only the output from single energy evaluation,
1079 energy minimization and multiple-conformation data. To request
1080 conformations from MD/MREMD runs in PDB format, the MDPDB keyword
1081 must be placed on the MD input record.
1083 \item{MOL2OUT} -- if present, conformations will be output in SYBYL mol2 format.
1085 \item{REFSTR} -- if present, reference structure will be read (e.g., to monitor
1086 the RMS deviation from the crystal structure).
1088 \item{PDBREF} -- if present, a reference structure will be read in to compare
1089 the calculated conformations with it.
1091 \item{UNRES\_PBD} -- the starting/reference structure is read from an UNRES-generated
1096 Keywords: PDBOUT, MOL2OUT, PDBREF, and PDBSTART are ignored for a CSA run.
1097 Output mode for MD version is specified in MD input (see section \ref{sect:input:main:MD}).
1099 \paragraph{Miscellaneous}
1103 \item{CONSTR\_DIST=number}
1106 \item{0} -- no distance restraints,
1107 \item{$>0$} -- imposes harmonic restraints on selected distances; see section 5.12.
1108 In MD version, also restraints on the q variable \cite{liwo_2007} can be used.
1111 \item{WEIDIS=number} (real)
1112 the weight of the distance term; applies for REGULARIZE and THREAD, otherwise
1115 \item{USE\_SEC\_PRED} -- use secondary-structure prediction information.
1117 \item{SEED=number} (integer) (no default)
1118 Random seed (required, even if the run is not a CSA, MCM, MD or MREMD run).
1120 \item{PHI} -- only the virtual-bond dihedral angles $\gamma$ are considered as
1121 variables in energy minimization.
1123 \item{BACK} -- only the backbone virtual angles (virtual-bond angles theta and
1124 virtual-bond dihedral angles $\gamma$) are considered as variables
1125 in energy minimization.
1127 By default, all internal coordinates: $\theta$, $\gamma$, and the side-chain
1128 centroid polar angles $\alpha$ and $\beta$ are considered as variables in energy
1131 \item{RESCALE\_MODE=number} (real)
1132 Choice of the type of temperature dependence of the force field.
1134 \item{0} -- no temperature dependence
1135 \item{1} -- homographic dependence (not implemented yet with any force field)
1136 \item{2} -- hyperbolic tangent dependence \cite{liwo_2007}.
1139 \item{T\_BATH=number} (real)
1140 temperature (for MD runs and temperature-dependent force fields).
1143 The following keywords apply to MCM only:
1147 \item{MAXGEN=number} (integer) (10000)
1148 maximum number of conformations generated in a single MCM iteration
1150 \item{MAXOVERLAP=number} (integer) (1000)
1151 maximum number of conformations with ``bad'' overlaps allowed to appear in a
1152 row in a single MCM iteration.
1154 \item{DISTCHAINMAX} -- (multi-chain capacity only) maximum distance between the
1155 last residue of a given chain and the first residue of the
1156 next chain such that restraints will not be imposed; quartic
1157 restraints will be imposed for greater distances.
1159 \item{ENERGY\_DEC} -- detailed energies will be printed for each interacting pair
1160 or each virtual bond, virtual-bond angle and dihedral angle,
1161 side chain, etc. DO NOT use unless a single energy evaluation
1165 \subsubsection{Minimizer options}
1167 This data section is in data list format and is read in the READ\_MINIM subroutine.
1169 This data group is present, if MINIMIZE was specified on the control card.
1170 Otherwise, it must not appear.
1174 \item{CART} -- minimize in virtual-bond vectors instead of angles.
1176 \item{MAXMIN=number} (integer) (2000)
1177 maximum number of iterations of the SUMSL minimizer.
1179 \item{MAXFUN=number} (integer) (5000)
1180 maximum number of function evaluations in a single minimization.
1182 \item{TOLF=number} (real) (1.0e-2)
1183 Tolerance on function.
1185 \item{RTOLF=number} (real) (1.0d-4)
1186 Relative tolerance on function.
1188 \item{PRINT\_INI} -- turns on printing nondefault minimization parameters,
1189 initial variables, and gradients in the SUMSL procedures.
1191 \item{PRINT\_FINAL} -- turns on printing final variables and gradients in
1194 \item{PRINT\_STAT} -- turns on printing abbreviated minimization protocol.
1198 The SUMSL minimizer is used in UNRES/CSA. For detailed description of
1199 the control parameters see the source file cored.f and sumsld.f
1202 \subsubsection{CSA control parameters}
1203 \label{sect:input:main:CSA}
1205 This data group should be present only, if CSA was specified on the control
1206 card. It is recommended that the readers to read publications on CSA method
1207 for more complete description of the parameters. Brief description of
1212 \item{NCONF=number} (integer) (50)
1213 This corresponds to the size of the bank at the beginning of the
1214 CSA procedure. The size of the bank, nbank, is set to nconf.
1215 If necessary (at much later stages of the CSA: see icmax below),
1216 nbank increases by multiple of nconf.
1218 \item{JSTART=number} (integer) (1)
1220 \item{JEND}=number (integer) (1)
1221 This corresponds to the limit values of do loop, each of which
1222 corresponds to an separate CSA run. If jstart=1, and jstart=100,
1223 this routine will repeat 100 separate CSA runs (limited by CPU)
1224 each one with separate random number initialization.
1225 The only difference between two CSA runs (one with jstart=jend=1
1226 and another one with jstart=jend=2) would be different random
1227 number initializations if other parameters are identical.
1229 \item{NSTMAX=number} (integer) (500000)
1230 This is to set a limit the total number of local minimizations of CSA
1235 N1=number (integer) (6)\\
1236 N2=number (integer) (4)\\
1237 N3=number (integer) (0)\\
1238 N4=number (integer) (0)\\
1239 N5=number (integer) (0)\\
1240 N6=number (integer) (10)\\
1241 N7=number (integer) (0)\\
1242 N8=number (integer) (0)\\
1243 N9=number (integer) (0)\\
1244 IS1=number (integer) (1)\\
1245 IS2=number (integer) (8)\\
1247 These numbers are used to generate trial conformations for each seed.
1248 See the file newconf.f for more details.
1251 \item{n1:} the total number of trial conformations for each seed by substituting
1252 nran number of variable angles (see subroutine newconf1ab and
1253 subroutine newconf1ar),
1254 \item{n2:} the total number of trial conformations for each seed by substituting
1255 nran number of groups of variable angles (see subroutine newconf1bb and
1256 subroutine newconf1br),
1257 \item{n3:} the total number of trial conformations for each seed by substituting
1258 a window of residues which forms a $\beta$-hairpin, if there is no enough
1259 $\beta$-hairpins uses the same algorithm as n6,
1260 \item{n4:} the total number of trial conformations for each seed by shifting the
1261 turn in $\beta$-hairpin by +/- 1 or 2 residues, if there is no enough
1262 $\beta$-hairpins uses the same algorithm as n6,
1263 \item{n5:} not used,
1264 \item{n6:} the total number of trial conformations for each seed by substituting
1265 a window of residues [is1,is2] inclusive. The size of the window is
1266 determined in a random fashion (see subroutine newconf\_residue for
1267 generation of the trial conformations),
1268 \item{n7:} the total number of trial conformations for each seed by copying a
1269 remote strand pair forming nonlocal $\beta$-sheet contact,
1270 \item{n8:} the total number of trial conformations for each seed by copying an
1271 $\alpha$-helical segment,
1272 \item{n9:} the total number of trial conformations for each seed by shifting the
1273 $\alpha$-helical segment by +/- 1 or 2 residues.
1276 Typical values used for a 75-residue helical protein is
1277 (6 4 0 0 0 10 1 26) for (n1,n2,n3,n4,n5,n6,is1,is2), respectively.
1278 In this example, a total of 20 trial conformations are generated for a seed
1279 Usually is1=1 is used for all applications, and the value of is2 is set about
1280 to 1/3 of the total number of residues. n3, n4 and n7 are design to help in
1281 case of proteins with $\beta$-sheets
1283 NRAN0=number (integer) (4)\\
1284 NRAN1=number (integer) (2)\\
1285 IRR=number (integer) (1)\\
1287 These numbers are used to determine if the CSA stage is very early.
1288 One can use (4 2 1) for these values. For more details one should look into
1289 the file, newconf.f, for more details.
1291 NTOTAL=number (integer) (10000)\\
1292 CUT1=number (real) (2.0)\\
1293 CUT2=number (real) (5.0)\\
1295 Annealing schedule is set in following fashion.
1296 The value of D\_cut is reduced geometrically from 1/cut1 of D\_ave (at the
1297 beginning) to 1/cut2 of D\_ave (after ntotal number of minimizations) where
1298 D\_ave is the average distance between two conformations in the First\_bank.
1302 \item{ESTOP=number} (real) (-3000.0)
1303 The CSA procedure stops if a conformations with energy lower than estop is
1304 obtained. If the do-loop set by jstart and jend requires more than one loop,
1305 the program will go on until the do-loop is finished.
1307 \item{ICMAX=number} (integer) (3)
1308 The maximum value of cycle (see the original publications for details).
1309 If the number of cycle exceeds this value the program will add nconf
1310 more conformations to Bank and First\_bank to continue CSA procedure if
1311 the new size of the nbank is within the maximum set by nbankm (see above).
1312 If the size of nbank exceeds the maximum set by nbankm the CSA procedure
1313 for this run will stop and next CSA will begin depending on the do-loop
1314 set by jstart and jend.
1316 \item{IRESTART=number} (integer) (0)
1317 This tells you if the run is fresh start (irestart=0) or a restart (irestart=1)
1318 starting from an old results
1320 \item{NDIFF=number} (integer) (2)
1321 The number of variables use in comparison when structure is added to the
1322 bank,4 - all angels, 2 - only backbone angles $\gamma$ and $\theta$
1324 \item{NBANKTM=number} (integer) (0)
1325 The maximum number of structures saved in *.CSA.bankt as history of the run
1326 Do not use bankt on massively parallel computation as it kills scalability.
1328 \item{DELE=number} (real) (20.0)
1329 Energy cutoff for bankt.
1331 \item{DIFCUT=number} (real) (720.0)
1332 Angle cutoff for bankt.
1334 \item{IREF=number} (integer) (0)
1335 0 - normal run, 1 - local CSA which generates only structures close to the
1336 reference one read from *.CSA.native.int file.
1338 \item{RMSCUT=number} (real) (4.0)
1339 CA RMSD cut off used in local CSA
1341 \item{PNCCUT=number} (real) (0.5)
1342 Percentage of native contact used in local CSA
1344 \item{NCONF\_IN=number} (integer) (0)
1345 The number of conformation read for the first bank from the input file
1349 Optionally, the CSA parameters can be read from file INPUT.CSA.in, if
1350 this file exists. If so, they are read in free format in the following
1356 n1,n2,n3,n4,n5,n6,n7,n8,is1,is2\\
1362 ntbankm,dele,difcut\\
1363 iref,rmscut,pnccut\\
1367 \subsubsection{MCM data}
1368 \label{sect:input:main:MCM}
1370 (Data list format, subroutine MCMREAD.)
1372 This data group is present, if MCM was specified on the control card.
1373 Otherwise it must not appear.
1377 \item{MAXACC=number} (integer) (100)
1378 Maximum number of accepted conformations.
1380 \item{MAXTRIAL=number} (integer) (100)
1381 Maximum number of unsuccessful trials in a row.
1383 \item{MAXTRIAL\_ITER=number} (integer) (1000)
1384 Maximum number of unsuccessful trials in a single iteration.
1386 \item{MAXREPM=number} (integer) (200)
1387 Maximum number of repetitions of the same minimum.
1389 \item{RANFRACT=number} (real) (0.5d0)
1390 Fraction of chain-rebuild motions.
1392 \item{OVERLAP=number} (real) (1.0d3)
1393 Bad contact energy criterion.
1395 \item{NSTEPH=number} (integer) (0)
1396 Number of heating step in adaptive sampling.
1398 \item{NSTEPC=number} (integer) (0)
1399 Number of cooling step in adaptive sampling.
1401 \item{TMIN=number} (real) (298.0d0)
1402 Minimum temperature in adaptive-temperature sampling).
1404 \item{TMAX=number} (real) (298.0d0)
1405 Maximum temperature in adaptive-temperature sampling).
1407 The temperature is changed according to the formula:
1409 T = TMIN*EXP(ISTEPH*(TMAX-TMIN)/NSTEPH) when heating
1413 T = TMAX*EXP(-ISTEPC*(TMAX-TMIN)/NSTEPC) when cooling
1415 The default is to use a constant temperature.
1417 \item{NWINDOW=number} (integer) (0)
1418 Number of windows in which the variables will be perturbed; the windows are
1419 defined by the numbers of the respective amino-acid residues. If NWINDOW
1420 is nonzero, after specifying all MCM input the next lines must define the
1421 windows. Each line looks like this:
1423 winstart winend (free format)
1425 e.g. if NWINDOW=2, the input:
1430 will mean that only the variables of residues 4-10 and 15-20 will be perturbed.
1431 However, in general, all variables will be considered in minimization.
1433 \item{PRINT\_MC=number} (0)
1434 Printout level in MCM. 0 - no intermediate printing, 1 and 2 - moderate
1435 printing, 3 - extensive printing.
1437 \item{NO\_PRINT\_STAT} -- no output to INPUT\_POTENTIALxxx.stat.
1439 \item{NO\_PRINT\_INT} -- no internal-coordinate output to INPUT\_POTENTIALxxx.int.
1443 \subsubsection{MD data}
1444 \label{sect:input:main:MD}
1446 (Mixed format; subroutine READ\_MDPAR.)
1450 \item{NSTEP} (1000000) number of time steps per trajectory.
1452 \item{NTWE} (100) NTWX (1000) frequency of energy and coordinate output, respectively.
1453 The coordinates are dumped in the pdb or compressed Gromacs (cx) format,
1454 depending on the next keyword.
1455 NTWE=0 means no energy dump.
1457 \item{MDPDB} - dump coordinates in the PDB format (cx otherwise)
1459 \item{TRAJ1FILE} only the master processor outputs coordinates. This feature pertains
1460 only to REMD/MREMD jobs and overrides NTWX; coordinates are dumped at every
1463 \item{REST1FILE} only the master writes the restart file
1465 \item{DT} (real) (0.1) time step; the unit is ``molecular time unit'' (mtu); 1 mtu = 48.9 fs
1467 \item{DAMAX} (real) (1.0) maximum allowed change of acceleration during a single time step.
1468 The time step gets scaled down, if this is exceeded.
1470 \item{DVMAX} (real) (20.0) -- maximum allowed velocity (in A/mtu)
1472 \item{EDRIFTMAX} (real) (10.0) -- maximum allowed energy drift in a single MD step (10 kcal/mol)
1474 \item{REST} -- restart flag. The calculation is restarted if present.
1476 \item{LARGE} -- very detailed output. Don't use except for debugging.
1478 \item{PRINT\_COMPON} -- prints energy components.
1480 \item{RESET\_MOMENT} (1000) -- frequency of zeroing out the total angular momentum when
1481 running Berendsen mode calculations (for Langevin calculations meaningless).
1483 \item{RESET\_VEL}=number (integer) (1000) -- frequency of resetting velocities to values
1484 from Gaussian distribution.
1486 \item{RATTLE} -- use the RATTLE algorithm (constraint bonds); not yet implemented.
1488 \item{RESPA} -- use the Multiple Time Step (MTS) or Adaptive Multiple Time Step (A-MTS)
1489 algorithm \cite{rakowski_2006}. Without this flag the variable time step (VTS) \cite{khalili_2005} is run.
1491 \item{NTIME\_SPLIT=number} (integer) (1) -- initial number of time-split steps
1493 \item{MAXTIME\_SPLIT=number} (integer) (64) -- maximum number of time-split step
1495 If NTIME\_SPLIT==MAXTIME\_SPLIT, MTS is run.
1497 \item{R\_CUT=number} (real) (2.0) -- the cut-off distance in splitting the forces into short- and
1498 long-range in site-site VDW distance units.
1500 \item{LAMBDA} (real) (0.3) -- the transition length (in site-site VDW distance units) between
1501 short- and long-range forces.
1503 \item{XIRESP} -- flag to use MTS/A-MTS with Nos\'e-Hoover/Nos\'e-Poincar\'e thermostats.
1505 \item{LANG=number} (integer) (0) Langevin dynamics flag:
1508 \item{0} -- No explicit Langevin dynamics.
1509 \item{1} -- Langevin with direct integration of the equations of motion (recommended
1510 for Langevin calculations)
1511 \item{2} -- Langevin calculation with analytical pre-integration of the friction and
1512 stochastic part of the equations of motion using an algorithm adapted from TINKER.
1513 This is MUCH MORE time- and memory-consuming than 1 and requires compiling without
1514 the -DLANG0 flag and enormously increases memory requirements.
1515 \item{3} -- The stochastic integrator developed by Cicotti and coworkers.
1516 \item{4} -- for other stochastic integrators (not used at present).
1519 Note: With the enclosed code, the -DLANG0 compiler flag is included which disables
1522 \item{TBF} -- Berendsen thermostat.
1524 \item{TAU\_BATH} (1.0) (units are mtus; 1mtu=48.9 fs) -- constant of the coupling to the thermal bath
1525 used with the Berendsen thermostat.
1527 \item{NOSEPOINCARE99} -- the Nose-Poincare thermostat as of 1999 will be used.
1529 \item{NOSEPOINCARE01} -- the Nose-Poincare thermostat as of 2001 will be used.
1531 \item{NOSEHOOVER96} -- the Nose-Hoover thermostat will be used.
1533 \item{Q\_NP=number} (real) (0.1) -- the value of the mass of the fictitious particle in the calculations
1534 with the Nose-Poincare thermostat.
1536 \item{T\_BATH} (300.0) (in K) -- temperature of canonical simulation or temperature to generate
1539 \item{ETAWAT} (0.8904) -- viscosity of water (in centipoises).
1541 \item{RWAT} (1.4) -- radius of water molecule (in A)
1543 \item{SCAL\_FRIC=number} (real) (0.02) -- scaling factor of the friction coefficients.
1545 \item{SURFAREA} -- scale friction acting on atoms by atoms' solvent accessible area.
1547 \item{RESET\_FRICMAT=number} (integer) (1000) -- recalculate friction matrix every RESET\_FRICMAT MD steps.
1549 \item{USAMPL} -- restraints on q (see reference 5 for meaning) will be imposed (see section .
1550 In this case, the next records specify the restraints; these records are
1551 placed before the list of temperatures or numbers of trajectories.
1553 \item{EQ\_TIME=number} (real) (1.0e4) -- time (in mtus; 1 mtu=48.9 fs) after which restraints
1554 on q will start to be in force.
1558 If USAMPL has been specified, the following information must be supplied after the
1559 main MD input data record (subroutine READ\_FRAGMENTS):
1561 Line 1: nset, npair, nfrag\_back (number of sets of restraints, number of restrained
1562 fragments, number of restrained pairs, number of restrained backbone fragments
1563 (in terms of $\theta$ and $\gamma$ angles)
1565 For each set of restraints (1, 2,..., nset):
1569 \item{mset(iset)} -- how many times the set is multiplied.
1571 \item{wfrag(i,iset), ifrag(1,i,iset), ifrag2(2,i,iset),qfrag(i,iset)} --
1572 weight of the restraint, first and last residue of the fragment, target q value.
1573 This information is repeated through nfrag.
1575 \item{wpair(i,iset), ipair(1,i,iset), ipair(2,i,iset),qinpair(i,iset)} --
1576 weight of the restraint, first and second fragment of the pair (according to fragment
1577 list), target q value. This information is repeated through npair
1579 \item{wfrag\_back(1,i,iset), wfrag\_back(2,i,iset), wfrag\_back(3,i,iset),
1580 ifrag\_back(1,i,iset),ifrag\_back(2,i,iset)} --
1581 weight of the restraints on $\theta$ angles, weight on the restraints on $\gamma$ angles,
1582 weight of the restraints on side-chain rotamers, first residue of the fragment,
1583 last residue of the fragment. This information is repeated through nfrag\_back.
1587 \subsubsection{REMD/MREMD data}
1588 label{sect:input:main:MREMD}
1590 (Miced format; subroutine READ\_REMDPAR.)
1594 \item{NREP} (3) -- number of replicas in a REMD/MREMD run.
1596 \item{NSTEX} (1000) -- number of steps after which exchange is performed in REMD/MREMD
1599 The temperatures in replicas can be specified through
1601 \item{RETMIN} (10.0) -- minimum temperature in a REMD/MREMD run,
1603 \item{RETMAX} (1000.0) -- maximum temperature in a REMD/MREMD run.
1607 Then the range from retmin to retmax is divided into equal segments and
1608 temperature of the replicas assigned accordingly,
1614 \item{TLIST} means that the NREP temperature of the replicas will be input in the
1617 \item{MLIST} numbers of trajectories per each of the NREP temperatures will be
1618 specified in the record after the list of temperatures; this specifies
1623 Important! The number of processors must be exactly equal to the number of
1624 trajectories, i.e., NREP for a REMD run or $\sum_i mlist(i)$ for a MREMD run.
1628 \item{SYNC} -- all trajectories will be synchronized every NSTEX time steps
1629 (by default, they are not synchronized).
1631 \item{TRAJ1FILE} -- only the master processor outputs coordinates. This feature pertains
1632 only to REMD/MREMD jobs and overrides NTWX; coordinates are dumped at every
1635 \item{REST1FILE} -- only the master writes the restart file.
1637 \item{HREMD} -- Hamiltonian replica exchange flag; not only temperatures but also
1638 sets energy-term weights are exchanged between conformations.
1640 \item{TONLY} -- run a ``fake'' HREMD with many sets of energy-term weights in a
1641 single run but only temperature exchange.
1645 \subsubsection{Energy-term weights}
1646 \label{sect:input:main:weights}
1648 (Data list format; subroutine MOLREAD.)
1652 \item{WLONG=number} (real) (1.0d0) --
1653 common weight of the U(SC-SC) (side-chain side-chain interaction)
1654 and U(SC,p) (side-chain peptide-group) term.
1656 \item{WSCC=number} (real) (WLONG) --
1657 weight of the U(SC-SC) term.
1659 \item{WSCP=number} (real) (WLONG)
1660 weight of the U(SC-p) term.
1662 \item{WELEC=number} (real) (1.0d0)
1663 weight of the U(p-p) (peptide-group peptide-group interaction) term.
1665 \item{WEL\_LOC=number} (real) (1.0d0)
1666 weight of the $U_{el;loc}^3$ (local-electrostatic cooperativity, third-order) term.
1668 \item{WCORRH=number} (real) (1.0d0)
1669 weight of the U(corr) (cooperativity of hydrogen-bonding interactions, fourth-order) term.
1671 \item{WCORR5=number} (real) (0.0d0) --
1672 weight of the $U_{el;loc}^5$ (local-electrostatic cooperativity, 5th order
1675 \item{WCORR6=number} (real) (0.0d0) --
1676 weight of the $U_{el;loc}^6$ (local-electrostatic cooperativity, 6th order
1679 \item{WTURN3=number} (real) (1.0d0) --
1680 weight of the $U_{turn}^3$ (local-electrostatic cooperativity within 3 residue
1681 segment, 3rd order contribution).
1683 \item{WTURN4=number} (real) (1.0d0) --
1684 weight of the $U_{turn}^4$ (local-electrostatic cooperativity within 4 residue
1685 segment, 4rd order contributions).
1687 \item{WTURN6=number} (real) (1.0d0) --
1688 weight of the $U_{turn}^6$ (local-electrostatic cooperativity within 6 residue
1689 segment, 6rd order contributions).
1691 \item{WTOR=number} (real) (1.0d0) --
1692 weight of the torsional term, $U_{tor}$.
1694 \item{WANG=number} (real) (1.0d0) --
1695 weight of the virtual-bond angle bending term, $U_b$.
1697 \item{WSCLOC=number} (real) (1.0d0) --
1698 weight of the side-chain rotamer term, $U_{SC}$.
1700 \item{WSTRAIN=number} (real) (1.0d0) --
1701 scaling factor of the distance-constrain or disulfide-bond strain energy term.
1703 \item{SCALSCP=number} (real) (1.0d0) --
1704 scaling factor of $U_{SCp}$; this is an alternative to specifying WSCP; in
1705 this case WSCP will be calculated as WLONG*SCALSCP.
1707 \item{SCAL14=number} (real) (1.0d0) --
1708 scaling factor of the 1,4 SC-p interactions.
1710 \item{CUTOFF} (7.0) -- cut-off on backbone-electrostatic interactions to compute 4-
1711 and higher-order correlations.
1713 \item{DELT\_CORR} (0.5) - thickness of the distance range in which the energy is
1718 The defaults are NOT the recommended values. No ``working'' default values
1719 have been set, because the force field is still under development. The values
1720 corresponding to the force fields listed in section 4 are as follows:
1724 WELEC=1.5 WSTRAIN=1.0 WTOR=0.08617 WANG=0.10384 WSCLOC=0.10384 WCORR=1.5 &
1725 WTURN3=0 WTURN4=0 WTURN6=0 WEL_LOC=0 WCORR5=0 WCORR6=0 SCAL14=0.40 SCALSCP=1.0 &
1726 CUTOFF=7.00000 WSCCOR=0.0
1731 WSC=1.00000 WSCP=0.72364 WELEC=1.10890 WANG=0.68702 WSCLOC=1.79888 &
1732 WTOR=0.30562 WCORRH=1.09616 WCORR5=0.17452 WCORR6=0.36878 WEL_LOC=0.19508 &
1733 WTURN3=0.00000 WTURN4=0.55588 WTURN6=0.11539 CUTOFF=7.00000 WCORR4=0.0000 &
1734 WTORD=0.0 WSCCOR=0.0
1739 WSC=1.00000 WSCP=1.10684 WELEC=0.70000 WANG=0.80775 WSCLOC=1.91939 &
1740 WTOR=3.36070 WCORRH=2.50000 WCORR5=0.99949 WCORR6=0.46247 WEL_LOC=2.50000 &
1741 WTURN3=1.80121 WTURN4=4.35377 WTURN6=0.10000 CUTOFF=7.00000 WCORR4=0.00000 &
1747 WSC=1.00000 WSCP=1.43178 WELEC=0.41501 WANG=0.37790 WSCLOC=0.12880 &
1748 WTOR=1.98784 WCORRH=2.50526 WCORR5=0.23873 WCORR6=0.76327 WEL_LOC=2.97687 &
1749 WTURN3=0.09261 WTURN4=0.79171 WTURN6=0.01074 CUTOFF=7.00000 WCORR4=0.00000 &
1755 WSC=1.00000 WSCP=1.54864 WELEC=0.20016 WANG=1.00572 WSCLOC=0.06764 &
1756 WTOR=1.70537 WTORD=1.24442 WCORRH=0.91583 WCORR5=0.00607 WCORR6=0.02316 &
1757 WEL_LOC=1.51083 WTURN3=2.00764 WTURN4=0.05345 WTURN6=0.05282 WSCCOR=0.0 &
1758 CUTOFF=7.00000 WCORR4=0.00000 WSCCOR=0.0
1763 WSC=1.00000 WSCP=2.85111 WELEC=0.36281 WANG=3.95152 WSCLOC=0.15244 &
1764 WTOR=3.00008 WTORD=2.89863 WCORRH=1.91423 WCORR5=0.00000 WCORR6=0.00000 &
1765 WEL_LOC=1.72128 WTURN3=2.99827 WTURN4=0.59174 WTURN6=0.00000 &
1766 CUTOFF=7.00000 WCORR4=0.00000 WSCCOR=0.0
1771 WSC=1.00000 WSCP=2.73684 WELEC=0.06833 WANG=4.15526 WSCLOC=0.16761 &
1772 WTOR=2.99546 WTORD=2.89720 WCORRH=1.98989 WCORR5=0.00000 WCORR6=0.00000 &
1773 WEL_LOC=1.60072 WTURN3=2.36351 WTURN4=1.34051 WTURN6=0.00000 &
1774 CUTOFF=7.00000 WCORR4=0.00000 WSCCOR=0.0
1779 WLONG=1.35279 WSCP=1.59304 WELEC=0.71534 WBOND=1.00000 WANG=1.13873 &
1780 WSCLOC=0.16258 WTOR=1.98599 WTORD=1.57069 WCORRH=0.42887 WCORR5=0.00000 &
1781 WCORR6=0.00000 WEL_LOC=0.16036 WTURN3=1.68722 WTURN4=0.66230 WTURN6=0.00000 &
1782 WVDWPP=0.11371 WHPB=1.00000 &
1783 CUTOFF=7.00000 WCORR4=0.00000
1788 WLONG=1.70905 WSCP=2.18310 WELEC=1.06684 WBOND=1.00000 WANG=1.17536 &
1789 WSCLOC=0.22070 WTOR=2.65798 WTORD=2.00646 WCORRH=0.23541 WCORR5=0.00000 &
1790 WCORR6=0.00000 WEL_LOC=0.42789 WTURN3=1.68126 WTURN4=0.75080 WTURN6=0.00000 &
1791 WVDWPP=0.27044 WHPB=1.00000 WSCP14=0.00000 &
1792 CUTOFF=7.00000 WCORR4=0.00000
1797 WLONG=1.00000 WSCP=1.23315 WELEC=0.84476 WBOND=1.00000 WANG=0.62954 &
1798 WSCLOC=0.10554 WTOR=1.84316 WTORD=1.26571 WCORRH=0.19212 WCORR5=0.00000 &
1799 WCORR6=0.00000 WEL_LOC=0.37357 WTURN3=1.40323 WTURN4=0.64673 WTURN6=0.00000 &
1800 WVDWPP=0.23173 WHPB=1.00000 WSCCOR=0.0 &
1801 CUTOFF=7.00000 WCORR4=0.00000
1804 \subsubsection{Input and/or reference PDB file name}
1805 \label{sect:input:main:PDB}
1807 (Text format; subroutine MOLREAD.)
1809 If PDBSTART or PDBREF was specified in the control card, this line contains
1810 the PDB file name. Trailing slashes to specify the full path are permitted.
1811 The file name can contain up to 64 characters.
1813 \subsubsection{Amino-acid sequence}
1814 \label{sect:input:main:sequence}
1818 This data appears, if PDBSTART was not specified, otherwise must not be present
1819 because the sequence would be taken from the PDB file. The first line contains
1820 the number of amino-acid residues, including the end groups (free format),
1821 the next lines contain the sequence in 20(1X,A3) format for the three-letter
1822 or 80A1 format for the one-letter code. There are two types of end-groups:
1823 Gly (three-letter code) or G (one-letter code), if an end group contains a full
1824 peptide bond (e.g., the acetyl N-terminal group or the carboxyamide C-terminal
1825 group) and D (in the three-letter code) or X (in the one-letter code), if the
1826 end group does not contain a peptide group (e.g., the NH2 N-terminal end group
1827 or the COOH C-terminal end group). (Note the Gly or G also denotes the regular
1828 glycine residue, if found in the middle of a chain).
1829 In the second case the end group is considered as a ``dummy'' group and serves
1830 only to define the first (last) virtual-bond dihedral angle $\gamma$ for the
1831 first (last) full amino-acid residue.
1833 Consider, for example, the Ac-Ala(19)-NHMe polypeptide. The three-letter code
1834 input will look like this:
1838 Gly Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala
1842 And the one-letter code input will be:
1846 GAAAAAAAAAAAAAAAAAAAG
1849 If the sequence is changed to NH3(+)-Ala(19)-COO(-), the inputs will look
1854 D Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala
1862 XAAAAAAAAAAAAAAAAAAAX
1865 The sequence input is case-insensitive, because the present version of UNRES
1866 considers each amino-acid residue as an L-residue (there are no torsional
1867 parameters for the combinations of the D- and L-residues yet). Furthermore,
1868 each peptide group is considered as a trans group.
1870 If the version of UNRES has multi-chain capacity, placing a dummy residue
1871 inside the sequence indicates start of a new chain. For example, a system
1872 composed of two Ala(10) chains can be specified as follows (3-letter code):
1876 D Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala D Ala Ala Ala Ala Ala Ala Ala Ala
1884 XAAAAAAAAAAXAAAAAAAAAAX
1887 \subsubsection{Disulfide-bridge information}
1888 \label{sect:input:main:disulphide}
1890 (Free format; subroutine READ\_BRIDGE.)
1898 \item{NS} -- the number of half-cystines (required even if no half-cystines are present).
1900 \item{ISS(i)} -- the position of ith half-cystine in the sequence (starting from the
1901 N-terminal end group)
1905 Next line(s) (present only, if $ns>0$ and must not appear otherwise):
1907 NSS,(IHPB(i),JHPB(i),i=1,NSS)
1911 \item{NSS} -- the number of disulfide bridges; must not be greater than NS/2.
1913 \item{IHPB(i),JHPB(i)} -- the cystine residue forming the ith bridge.
1917 The program will check, whether the residues specified in the ISS list
1918 are cystines and terminate with error, if any of them is not. The program
1919 also checks, if the numbers from the IHPB and the JHPB lists have appeared
1922 \subsubsection{Dihedral-angle restraint data}
1923 \label{sect:input:main:dihedral-restraints}
1925 (Free format; subroutine MOLREAD.)
1927 This set of data specifies the harmonic constraints (if any) imposed on selected
1928 virtual-bond dihedral angles $\gamma$.
1934 \item{NDIH\_CONSTR} -- the number of restrained $\gamma$ angles (required even if no
1935 restrains are applied).
1939 2nd line (present only, if NDIH\_CONSTR$>$0; must not appear otherwise):
1940 FTORS - the force constant expressed in kcal/(mol*rad**2)
1942 next NDIH\_CONSTR lines (present only, if NDIH\_CONSTR$>$0):
1944 IDIH\_CONSTR(i),PHI0(i),DRANGE(i)
1948 \item{IDIH\_CONSTR(i)} -- the number of ith restrained $\gamma$ angle. The angles are
1949 numbered after the LAST $\alpha$-carbons. Thus, the first ``real'' angle has number
1950 4 and it corresponds to the rotation about the CA(2)-CA(3) virtual-bond axis
1951 and the last angle has the number NRES and corresponds to the rotation about
1952 the CA(NRES-2)-CA(NRES-1) virtual-bond axis.
1954 \item{PHI0(i)} -- the ``center'' of the restraint (expressed in degrees).
1956 \item{DRANGE(i)} -- the ``flat well'' range of the restraint (in degrees).
1960 The restraint energy for the ith restrained angle is expressed as:
1963 E_{dih} = \begin{cases}
1964 \rm FTORS\times(\gamma_{IDIH\_CONSTR(i)}-PHI0(i)+DRANGE(i))^2&\mbox{if}\ \ \rm \gamma_{IDIH\_CONSTR(i)}\\
1965 &<PHI0(i)+DRANGE(i)\\
1967 0 &\rm if\ \ PHI0(i)-DRANGE(i) \\
1968 &\le \gamma_{IDIH\_CONSTR(i)} \\
1969 &\le PHI0(i)+DRANGE(i)\\
1971 \rm FTORS\times(\gamma_{IDIH\_CONSTR(i)}-PHI0(i)+DRANGE(i))^2&\mbox{if}\ \ \rm \gamma_{IDIH\_CONSTR(i)}\\
1976 Applying dihedral-angle constraints also implies that for ith constrained
1977 $\gamma$ angle the sampling be carried out from the
1978 [PHI0(i)-DRANGE(i)..PHI0(i)+DRANGE(i)] interval and not from the $[-\pi..\pi]$
1979 interval, if random conformations are generated. If only this and not
1980 restrained minimization is required, just set FTORS to 0.
1982 \subsubsection{Distance restraints}
1983 \label{sect:input:main:disance-restraints}
1985 (Mixed format; subroutine READ\_DIST\_CONSTR.)
1987 Restraints are imposed on C$^\alpha\cdots$C$^\alpha$ SC$\cdots$SC distances (C$^\beta\cdots$C$^\beta$.
1991 \item{NDIST=number} (integer) (0) -- number of restraints on specific distances.
1993 \item{NFRAG=number} (integer) (0) -- number of distance-restrained protein segments.
1995 \item{NPAIR=number} (integer) (0) -- number of distance-restrained pairs of segments.
1996 Specifying NPAIR requires specification of segments.
1998 \item{IFRAG=start(1),end(1),start(2),end(2)...start(NFRAG),end(NFRAG)} (integers) --
1999 First and last residues of the distance restrained segments.
2001 \item{WFRAG=w(1),w(2),...,w(NFRAG) (reals)} -- force constants or bases for force
2002 constant calculation corresponding to fragment restraints.
2004 \item{IPAIR=start(1),end(1),start(2),end(2),...,start(NPAIR),end(NPAIR)} (integers)
2005 -- numbers of segments (consecutive numbers of start or end pairs in IFRAG
2006 specification), the distances between which will be restrained.
2008 \item{WPAIR=w(1),w(2),...,w(NFRAG)} (reals) -- force constants or bases for force
2009 constant calculation corresponding to pair restraints.
2011 \item{DIST\_CUT=number} (real) (5.0) -- the cut-off distance in angstroms for force-
2012 constant calculations.
2014 The force constants within fragments/between pairs of fragments are calculated
2015 depending on the value of DIST\_CONSTR described in section 5.1:
2019 \item{1} -- all force constants are equal to the respective entries of WFRAG/WPAIR
2021 \item{2} -- the force constants are equal to the respective entries of WFRAG/WPAIR
2022 when the distance between the C$^\alpha$ atoms in the reference structure
2023 $\le$D\_CUT, 0 otherwise.
2025 \item{3} -- the force constants are calculated from the formula:
2029 \item{$k(C^\alpha_j,C^\alpha_k)=W\times\exp{-[d(C^\alpha_j,C^\alpha_k)/DIST\_CUT)]^2/2}$}
2031 where $k(C^\alpha_j,C^\alpha_k)$ is the force constant between the respective C$^\alpha$ atoms,
2032 $d(C^\alpha_j,C^\alpha_k)$ is the distance between these C$^\alpha$ atoms in the reference
2033 structure, and W is the basis for force-constant calculation (see above).
2037 The above restraints are harmonic resatraints of the form
2040 E_{dis} = \sum_i k_i \left(d_i - d_i^{ref}\right)^2
2043 where $d_i$ is the distance in the calculated structure and $d_i^{ref}$ is the respective
2044 distance in the reference (PDB) structure. The reference structure is required.
2046 If NDIST$>$0, the restraints on specific distance are input explicitly (no reference structure is requires).
2047 The restraints are quartic restraints of a similar form as that in section
2048 \ref{sect:input:main:dihedral-restraints} but with angles replaced with distances.
2050 ihpb(i), jhpb(i), dhpb(i), dhpb1(i), ibecarb(i), forcon(i), i=1,NDIST
2054 \item{ihpb(i)} and jhpb(i) are the numbers of the residues the distance
2055 between the C$^\alpha$ atoms of which will be distance restrained,
2057 \item{dhpb(i)} and dhpb1(i) are the lower and upper distance-restraint,
2059 \item{ibecarc(i)} is the restraint-type flag;
2060 ibecarb(i)==0 indicates that the restraints are imposed on the
2061 C$^\alpha\cdots$C$^\alpha$ distances; otherwise restraints on the
2062 SC$\cdots$SC distances are imposed,
2065 is the respective force constant.
2069 \subsubsection{Internal coordinates of the reference structure}
2070 \label{sect:input:main:internalref}
2072 (Free format; subroutine READ\_ANGLES.)
2074 This part of the data is present, if REFSTR, but not PDBREF was specified,
2075 otherwise must not appear. It contains the following group of variables:
2078 \item{(THETA(i),i=3,NRES)} -- the virtual-bond valence angles THETA.
2079 \item{(PHI(i),i=4,NRES)} -- the virtual-bond dihedral angles GAMMA.
2080 \item{(ALPH(i),i=2,NRES-1)} -- the ALPHA polar angles of consecutive side chains.
2081 \item{(OMEG(i),i=2,NRES-1)} -- the BETA polar angles of consecutive side chains.
2084 ALPHA(i) and OMEG(i) correspond to the side chain attached to CA(i). THETA(i)
2085 is the CA(i-2)-CA(i-1)-CA(i) virtual-bond angle and PHI(i) is the
2086 CA(i-3)-CA(i-2)-CA(i-1)-CA(i) virtual-bond dihedral angle $\gamma$.
2088 \subsubsection{Internal coordinates of the initial conformation}
2089 \label{sect:input:main:intcoord}
2091 (Free format; subroutine READ\_ANGLES.)
2093 This part of the data is present, if RAND\_CONF, MULTCONF, THREAD, or PDBSTART
2094 were not specified, otherwise must not appear. This input is as in section \ref{sect:support}.
2096 \paragraph{File name with internal coordinates of the conformations to be processed}
2097 \label{sect:input:main:intcord:files}
2099 (Text format; subroutine MOLREAD.)
2101 This data is present only, if MULTCONF was specified. It contains the name of
2102 the file with the internal coordinates. Up to 64 characters are allowed.
2103 The structure of the file is that of the *.int file produced by UNRES/CSA.
2104 See section ``The structure of the INT files'' for details.
2106 \subsubsection{Control data for energy map construction}
2107 \label{sect:input:main:map}
2109 (Data list format; subroutine MAP\_READ.)
2111 These data lists appear, if NMAP=n was specified, where n is the number of
2112 variables that will be grid-searched. One list is per one variable or a
2113 group of variables set equal (see below):
2116 \item{PHI} -- the variable is a virtual-bond dihedral angle $\gamma$.
2117 \item{THE} -- the variable is a virtual-bond angle $\theta$.
2118 \item{ALP} -- the variable is a side-chain polar angle $\alpha$.
2119 \item{OME} -- the variable is a side-chain polar angle $\beta$.
2123 \item{RES1=number} (integer)
2124 \item{RES2=number} (integer)
2127 The range of residues for which the values will be set; all these variables
2128 will be set at the same value. It is required that RES2$>$RES1.
2131 \item{FROM=angle} (real)
2132 \item{TO=angle} (real)
2135 Lower and upper limit of scanning in grid search (in degrees)
2138 \item{NSTEP=number} (integer)
2141 Number of steps in scanning along this variable/group of variables.
2143 \subsection{Input coordinate files}
2144 \label{sect:input:coordfiles}
2146 (Text format; subroutine MOLREAD.)
2148 At present, geometry can be input either from the external files in the PDB
2149 format (with the PDBSTART option) or multiple conformations can be read
2150 as virtual-bond-valence and virtual-bond dihedral angles when the MULTCONF
2151 option is used (the latter, however, implies using standard virtual-bond
2152 lengths as initial values). The structure of internal-coordinate files
2153 is the same as that of output internal-coordinate files described in section
2156 \subsection{Other input files}
2157 \label{sect:input:otherfiles}
2159 CSA parameters can optionally be read in free format from file INPUT.CSA.in
2160 (see section 8.1.4). When a CSA run is restarted, the CSA-specific output files
2161 also serve as input files. INPUT is the prefix of input and output files
2162 as explained in section \ref{sect:command}.
2164 Restart files for MD and REMD simulations. They are read when the keyword
2165 RESTART appears on the MD/REMD data group (section \ref{sect:input:main:MD}).
2169 \section{OUTPUT FILES}
2172 UNRES ``main'' output files (INPUT.out\_\$\{POT\}[processor]) are log files from
2173 a run. They contain the information of the molecule, force field, calculation
2174 type, control parameters, etc.; however, not the structures produced during
2175 the run or their energies except single-point energy evaluation and
2176 minimization-related runs. The structural information is included in
2177 coordinate files (*.int, *.x, *.pdb, *.mol2, *.cx) and statistics files (*.stat),
2178 respectively; these files are further processed by other programs (WHAM,
2179 CLUSTER) or can be viewed by molecular viewers (pdb or mol2 files).
2181 \subsection{Coordinate files}
2182 \label{sect:output:coord}
2184 \subsubsection{The internal coordinate (INT) file}
2185 \label{sect:output:coord:int}
2187 This file contains the internal coordinates of the conformations produced
2188 by UNRES in non-MD runs. The virtual-bond lengths are assumed constant so
2189 only the angular variables are provided.
2191 IT,ENER,NSS,(IHPB(I),JHPB(I),I=1,NSS)\\
2192 (I5,F12.5,I2,9(1X,2I3))
2195 \item{IT} -- the number of the conformation.
2196 \item{ENER} -- total energy.
2197 \item{NSS} -- the number of disulfide bridges.
2198 \item{(IHPB(I),JHPB(I),I=1,NSS)} -- the positions of the pairs of half-cystines .
2199 forming the bridges. If NSS$>9$9, the remaining pairs are written in the
2200 following lines in the (3X,11(1X,2I3)) format.
2203 (THETA(I),I=3,NRES)\\
2206 The virtual-bond angles THETA (in degrees)
2211 The virtual-bond dihedral angles GAMMA (in degrees)
2213 (ALPH(I),I=2,NRES-1)\\
2214 (OMEG(I),I=2,NRES-1)\\
2217 The polar angles ALPHA and BETA of the side-chain centers (in degrees).
2219 \subsubsection{The plain Cartesian coordinate (X) files}
2220 \label{sect:output:coord:cart}
2222 (Subroutine CARTOUT.)
2224 This file contains the Cartesian coordinates of the $\alpha$-carbon and
2225 side-chain-center coordinates. All conformations from an MD/MREMD
2226 trajectory are collated to a single file. The structure of each
2227 conformation's record is as follows:
2229 1st line: time, potE, uconst, t\_bath,nss, (ihpb(j), jhpb(j), j=1,nss),
2230 nrestr, (qfrag(i), i=1,nfrag), (qpair(i), i=1,npair),
2231 (utheta(i), ugamma(i), uscdiff(i), i=1,nfrag\_back)
2234 \item{time:} MD time (in ``molecular time units'' 1 mtu = 4.89 fs),
2235 \item{potE:} potential energy,
2236 \item{uconst:} restraint energy corresponding to restraints on Q and backbone geometry,
2237 (see section \ref{sect:input:main:MD}),
2238 \item{t\_bath:} thermostat temperature,
2239 \item{nss:} number of disulfide bonds,
2240 \item{ihpb(j), jhpb(j):} the numbers of linked cystines for jth disulfide bond,
2241 \item{nrestr:} number of restraints on q and local geometry,
2242 \item{qfrag(i):} q value for ith fragment,
2243 \item{qpair(i):} q value for ith pair,
2244 \item{utheta(i):} sum of squares of the differences between the theta angles
2245 of the current conformation from those of the experimental conformation,
2246 \item{ugamma(i):} sum of squares of the differences beaten the gamma angles
2247 of the current conformation from those of the experimental conformation,
2248 \item{uscdiff(i):} sum of squares of the differences between the Cartesian difference
2249 of the unit vector of the C$^\alpha$-SC axis of the current conformation from
2250 those of the experimental conformation.
2253 Next lines: Cartesian coordinates of the C$^\alpha$ atoms (including dummy atoms)
2254 (sequentially, 10 coordinates per line)
2255 Next lines: Cartesian coordinates of the SC atoms (including glycines and
2256 dummy atoms) (sequentially, 10 coordinates per line)
2258 \subsubsection{The compressed Cartesian coordinate (CX) files}
2259 \label{sect:output:coord:cx}
2261 These files are compressed binary files (extension cx). For each conformation,
2262 the items are written in the same order as specified in section \ref{sect:output:coord:cx}. For
2263 MREMD runs, if TRAJ1FILE is specified on MREMD record (see section \ref{sect:input:main:MD}),
2264 snapshots from all trajectories are written every time the coordinates
2265 are dumped. Thus, the file contains snapshot 1 from trajectory 1, ...,
2266 snapshot 1 from trajectory M, snapshot 2 from trajectory 1, ..., etc.
2268 The compressed cx files can be converted to pdb file by using the xdrf2pdb
2269 auxiliary program (single trajectory files) or xdrf2pdb-m program (multiple
2270 trajectory files from MREMD runs generated by using the TRAJ1FILE option).
2271 The multiple-trajectory cx files are also input files for the auxiliary
2274 \subsubsection{The Brookhaven Protein Data Bank format (PDB) files}
2275 \label{sect:output:coord:PDB}
2277 (Subroutine PDBOUT.)
2280 These files are written in PDB standard (see. e.g.,
2281 \href{ftp://ftp.wwpdb.org/pub/pdb/doc/format_descriptions/Format_v33_Letter.pdf}{\textcolor{blue}{ftp://ftp.wwpdb.org/pub/pdb\-/doc/\-format\_descriptions}}). %\-/Format\_v33\_Letter.pdf}.
2282 The REMARK, ATOM, SSBOND, HELIX, SHEET, CONECT, TER, and ENDMDL are used.
2283 The C$^\alpha$ (marked CA) and SC (marked CB) coordinates are output. The CONECT
2284 records specify the C$^\alpha\cdots$C$^\alpha$ and C$^\alpha\cdots$SC virtual bonds. Secondary
2285 structure is detected based on peptide-group contacts, as specified in
2286 ref 12. Dummy residues are omitted from the output. If the program has
2287 multiple-chain function, the presence of a dummy residue in a sequence
2288 starts a new chain, which is assigned the next alphabet letter as ID, and
2289 residue numbering is started over.
2291 \subsubsection{The SYBYLL (MOL2) files}
2292 \label{sect:output:coord:subyll}
2294 See the description of mol2 format (e.g.,
2295 \href{http://tripos.com/data/support/mol2.pdf}{http://tripos.com/data/support/mol2.pdf}.
2296 Similar remarks apply as for
2297 the PDB format (section \ref{sect:output:coord:PDB}).
2299 \subsection{The summary (STAT) file}
2301 \subsubsection{Non-MD runs}
2303 This file contains a short summary of the quantities characterizing the
2304 conformations produced by UNRES/CSA. It is created for MULTCONF and MCM.
2306 NOUT,EVDW,EVDW2,EVDW1+EES,ECORR,EBE,ESCLOC,ETORS,ETOT,RMS,FRAC\\
2310 \item{NOUT} -- the number of the conformations
2311 \item{EVDW,EVDW2,EVDW1+EES,ECORR,EBE,ESCLOC,ETORS} -- energy components
2312 \item{ETOT} -- total energy
2313 \item{RMS} -- RMS deviation from the reference structure (if REFSTR was specified)
2314 \item{FRAC} -- fraction of side chain - side chain contacts of the reference
2315 structure present in this conformation (if REFSTR was specified)
2318 \subsubsection{MD and MREMD runs}
2319 \label{sect:output:coord:MD}
2321 Each line of the stat file generated by MD/MREMD runs contains the following
2325 \item{step} -- the number of the MD step
2326 \item{time} -- time [unit is MTU (molecular time unit) equal to 48.9 fs]
2327 \item{Ekin} -- kinetic energy [kcal/mol]
2328 \item{Epot} -- potential energy [kcal/mol]
2329 \item{Etot} -- total energy (Ekin+Epot)
2330 \item{H-H0} -- the difference between the cureent and initial extended Hamiltionian
2331 in Nose-Hoover or Nose-Poincare runs; not present for other thermostats.
2332 \item{RMSD} -- root mean square deviation from the reference structure (only in
2333 REFSTR has been specified)
2334 item{damax} -- maximum change of acceleration between two MD steps
2335 \item{fracn} -- fraction of native side-chain concacts (very crude, based on
2336 SC-SC distance only)
2337 \item{fracnn} -- fraction of non-native side-chain contacts
2338 \item{co} -- contact order
2339 \item{temp} -- actual temperature [K]
2340 \item{T0} -- initial (microcanonical runs) or thermostat (other run types)
2342 \item{Rgyr} -- radius of gyration based on C$^\alpha$ coordinates [A]
2343 \item{proc} -- in MREMD runs the number of the processor (the number of the
2344 trajectory less 1); not present for other runs.
2347 For an USAMPL run, the following items follow the above list:
2350 \item{iset} -- the number of the restraint set
2351 \item{uconst} -- restraint energy pertaining to q-values
2352 \item{uconst\_back} -- restraint energy pertaining to virtual-backbone restraints
2353 \item{(qfrag(i),i=1,nfrag)} -- q values of the specified fragments
2354 \item{(qpair(ii2),ii2=1,npair)} -- q values of the specified pairs of fragments
2355 \item{(utheta(i),ugamma(i),uscdiff(i),i=1,nfrag\_back)} -- virtual-backbone and
2356 side-chain-rotamer restraint energies of the fragments specified
2359 If PRINT\_COMPON has been specified, the energy components are printed
2360 after the items described above.
2362 \subsection{CSA-specific output files}
2363 \label{sect:output:coord:CSA}
2365 There are several output files from the CSA routine:
2366 INPUT.CSA.seed, INPUT.CSA.history, INPUT.CSA.bank, INPUT.CSA.bank1,
2367 INPUT.CSA.rbank INPUT.CSA.alpha, INPUT.CSA.alpha1.
2369 The most informative outfile is INPUT.CSA.history. This file first write down
2370 the parameters in INPUT.CSA.csa file. Later it shows the energies of random
2371 minimized conformations in its generation. After sorting the First\_bank
2372 in energy (ascending order), the energies of the First\_bank is re-written here.
2373 After this the output looks like:
2376 1 0 100 6048.2 1 100-224.124-114.346 202607 100 100
2377 1 0 700 5882.6 2 29-235.019-203.556 1130308 100 100
2378 1 0 1300 5721.5 2 18-242.245-212.138 2028008 100 100
2379 1 0 1900 5564.8 13 54-245.185-218.087 2897988 98 100
2380 1 0 2500 5412.4 13 61-246.214-222.068 3706478 97 100
2381 1 0 3100 5264.2 13 89-248.715-224.939 4514196 96 100
2384 Each line is written between each iteration (just after selection
2385 of seed conformations) containing following data:
2386 jlee,icycle,nstep,cutdif,ibmin,ibmax,ebmin,ebmax,nft,iuse,nbank
2387 ibmin and ibmax lists the index of bank conformations corresponding to the
2388 lowest and highest energies with ebmin and ebmax.
2389 nft is the total number of function evaluations so far.
2390 iuse is the total number of conformations which have not been used as seeds
2391 prior to calling subroutine select\_is which select seeds.
2393 Therefore, in the example shown above, one notes that so far 3100
2394 minimizations has been performed corresponding to the total of 4514196
2395 function evaluations. The lowest and highest energy in the Bank is
2396 -248.715 (\#13) and -224.939 (\#89), respectively. The number of conformations
2397 already used as seeds (not including those selected as seeds in this iteration)
2398 so far is 4 (100-96).
2400 The files INPUT.CSA.bank and INPUT.CSA.rbank contains data of Bank and
2401 First\_bank. For more information on these look subroutines write\_bank
2402 and write\_rbank. The file INPUT.CSA.bank is overwritten between each
2403 iteration whereas Bank is accumulated in INPUT.CSA.bank1 (not for every
2404 iteration but as specified in the subroutine together.f).
2406 The file INPUT.CSA.seed lists the index of the seed conformations with their
2407 energies. Files INPUT.CSA.alpha, INPUT.CSA.alpha1 are written only once
2408 at the beginning of the CSA run. These files contain some arrays used
2413 \section{TECHNICAL SUPPORT CONTACT INFORMATION}
2414 \label{sect:support}
2417 Faculty of Chemistry, University of Gdansk\\
2418 ul. Sobieskiego 18, 80-952 Gdansk Poland.\\
2419 phone: +48 58 523 5430\\
2420 fax: +48 58 523 5472\\
2421 e-mail: \href{mailto:adam@chem.univ.gda.pl}{adam@chem.univ.gda.pl}\\
2423 Dr. Cezary Czaplewski\\
2424 Faculty of Chemistry, University of Gdansk\\
2425 ul. Sobieskiego 18, 80-952 Gdansk Poland.\\
2426 phone: +48 58 523 5430\\
2427 fax: +48 58 523 5472\\
2428 e-mail: \href{mailto:czarek@chem.univ.gda.pl}{czarek@chem.univ.gda.pl}\\
2430 Dr. Stanislaw Oldziej\\
2431 Intercollegiate Faculty of Biotechnology\\
2432 University of Gdansk, Medical University of Gdansk\\
2433 ul. Kladki 22, 80-922 Gdansk, Poland\\
2434 phone: +48 58 523 5361\\
2435 fax: +48 58 523 5472\\
2436 e-mail: \href{mailto:stan@biotech.ug.gda.pl}{stan@biotech.ug.gda.pl}\\
2439 Korea Institute for Advanced Study\\
2440 207-43 Cheongnyangni 2-dong, Dongdaemun-gu,\\
2441 Seoul 130-722, Korea\\
2442 phone: +82-2-958-3890\\
2443 fax: +82-2-958-3731\\
2444 email: \href={mailto:jlee@kias.re.kr}{jlee@kias.re.kr}
2447 Prepared by Adam Liwo and Jooyoung Lee, 7/17/99\\
2448 Revised by Cezary Czaplewski 1/4/01\\
2449 Revised by Cezary Czaplewski and Adam Liwo 8/26/03\\
2450 Revised by Cezary Czaplewski and Adam Liwo 11/26/11\\
2451 Revised by Adam Liwo 02/19/12\\
2452 LaTeX version by Adam Liwo 09/25/12