X-Git-Url: http://mmka.chem.univ.gda.pl/gitweb/?p=unres.git;a=blobdiff_plain;f=doc%2FUNRES_all.TXT;fp=doc%2FUNRES_all.TXT;h=0000000000000000000000000000000000000000;hp=ff5d4af35ee91f8b6ffa31c9c4b5463fefb9b6e5;hb=7e75cfee188bb6255d455c46c87b9f34cf46aa00;hpb=e62983fb4a02b190d8fbfda27a26ec7ce96799d8 diff --git a/doc/UNRES_all.TXT b/doc/UNRES_all.TXT deleted file mode 100644 index ff5d4af..0000000 --- a/doc/UNRES_all.TXT +++ /dev/null @@ -1,2093 +0,0 @@ - UNRES - A PROGRAM FOR COARSE-GRAINED SIMULATIONS OF PROTEINS - ------------------------------------------------------------ - -TABLE OF CONTENTS ------------------ - -1. License terms - -2. Credits - -3. General information - 3.1. Purpose - 3.2. Functions of the program - 3.2. Companion programs - 3.4. Programming language - 3.5. References - -4. Installation - -5. Customizing your batch and C-shell script - -6. Command line and files - -7. Force fields - -8. Input files - 8.1. Main input data file - 8.1.1. Title - 8.1.2. Control data (data list format; READ_CONTROL subroutine) - 8.1.2.1 Keywords to chose calculation type - 8.1.2.2 Specification of protein and structure output in non-MD - applications - 8.1.2.3. Miscellaneous - 8.1.3. Minimizer options (data list, subroutine READ_MINIM) - 8.1.4. CSA control parameters - 8.1.5. MCM data (data list, subroutine MCMREAD) - 8.1.6. MD data (subroutine READ_MDPAR) - 8.1.7. REMD/MREMD data (subroutine READ_REMDPAR) - 8.1.8. Energy-term weights (data list; subroutine MOLREAD) - 8.1.9. Input and/or reference PDB file name (text format; subroutine MOLREAD) - 8.1.10. Amino-acid sequence (free and text format) - 8.1.11. Disulfide-bridge information (free format; subroutine READ_BRIDGE) - 8.1.12. Dihedral-angle restraint data (free format; subroutine MOLREAD) - 8.1.13. Distance restraints (subroutine READ_DIST_CONSTR) - 8.1.14. Internal coordinates of the reference structure (free format; - subroutine READ_ANGLES) - 8.1.15. Internal coordinates of the initial conformation (free format; - subroutine READ_ANGLES) - 8.1.15.1. File name with internal coordinates of the conformations - to be processed - 8.1.16 Control data for energy map construction (data lists; - subroutine MAP_READ) - 8.2. Parameter files - 8.3. Input coordinate files - 8.4. Other input files - -9. Output files - 9.1. Coordinate files - 9.1.1. The internal coordinate (INT) files - 9.1.2. The plain Cartesian coordinate (X) files - 9.1.3. The compressed Cartesian coordinate (CX) files - 9.1.4. The Brookhaven Protein Data Bank format (PDB) files - 9.1.5. The SYBYLL (MOL2) files - 9.2. The summary (STAT) file - 9.2.1. Non-MD runs - 8.2.2. MD and MREMD runs - 9.3. CSA-specific output files - -10. Technical support contact information - -1. LICENSE TERMS ----------------- - -* This software is provided free of charge to academic users, subject to the - condition that no part of it be sold or used otherwise for commercial - purposes, including, but not limited to its incorporation into commercial - software packages, without written consent from the authors. For permission - contact Prof. H. A. Scheraga, Cornell University. - -* This software package is provided on an "as is" basis. We in no way warrant - either this software or results it may produce. - -* Reports or publications using this software package must contain an - acknowledgment to the authors and the NIH Resource in the form commonly used - in academic research. - -2. CREDITS ----------- - -The current and former developers of UNRES are listed in this section in alphabetic -order together with their current or former affiliations. - -Maurizio Chinchio (formerly Cornell Univ., USA) -Cezary Czaplewski (Univ. of Gdansk, Poland) -Carlo Guardiani (Georgia State Univ., USA) -Yi He (Cornell Univ., USA) -Justyna Iwaszkiewicz (Swiss Institute of Bioinformatics, Switzerland) -Dawid Jagiela (Univ. of Gdansk, Poland) -Stanislaw Jaworski (deceased) -Sebastian Kalinowski (Univ. of Gdansk, Poland) -Urszula Kozlowska (deceased) -Rajmund Kazmierkiewicz (Univ. of Gdansk, Poland) -Jooyoung Lee (Korea Institute for Advanced Studies, Korea) -Adam Liwo (Univ. of Gdansk, Poland) -Mariusz Makowski (Univ. of Gdansk, Poland) -Marian Nanias (formerly Cornell Univ., USA) -Stanislaw Oldziej (Univ. of Gdansk, Poland) -Jaroslaw Pillardy (Cornell Univ., USA) -Daniel Ripoll (formerly Cornell Univ., USA) -Jeff Saunders (Schrodinger Inc., USA) -Harold A. Scheraga (Cornell Univ., USA) -Hujun Shen (Dalian Institute of Chemical Physics, P.R. China) -Adam Sieradzan (Univ. of Gdansk, Poland) -Ryszard Wawak (formerly Cornell Univ., USA) -Bartlomiej Zaborowski (Univ. of Gdansk, Poland) - -3. GENERAL INFORMATION ----------------------- - -3.1. Purpose ------------- - -Run coarse-grained calculations of polypeptide chains with the UNRES force field. -There are two versions of the package which should be kept separate because of -non-overlapping functions: version which runs global optimization (Conformational -Space Annealing, CSA) and version that runs coarse-grained molecular dynamics and -its extension. Because the installation, input file preparation and running CSA -and MD versions are similar, a common manual is provided. Items specific -for the CSA and MD version are marked "CSA" and "MD", respectively. - -MD version can be used to run multiple-chain proteins (however, that version of -the code is a new release and might fail if yet un-checked functions are used). -The multi-chain CSA version for this purpose is another package (written largely in -C++). - -3.2. Functions of the program ------------------------------ - -1. Perform energy evaluation of a single or multiple conformations - (serial and parallel) (CSA and MD) - -2. Run canonical mesoscopic molecular dynamics (serial and parallel) (MD). - -3. Run replica exchange (REMD) and multiplexing replica exchange (MREMD) - dynamics (parallel only) (MD). - -4. Run multicanonical molecular dynamics (parallel only) (MD). - -5. Run energy minimization (serial and parallel) (CSA and MD). - -6. Run conformational space annealing (CSA search) (parallel only) (CSA). - -7. Run Monte Carlo plus Minimization (MCM) (parallel only) (CSA). - -8. Run conformational family Monte Carlo (CFMC) calculations (CSA). - -9. Thread the sequence against a database from the PDB and minimize energy of - each structure (CSA). - -Energy and force evaluation is parallelized in MD version. - -3.3. Companion programs ------------------------ - -The structures produced by UNRES can be used as inputs to the following programs provided -with this package or separately: - -xdrf2pdb - converts the compressed coordinate files from MD (but not MREMD)runs into - PDB format. - -xdrf2pdb-m - same for MREMD runs (multiple trajectory capacity). - -xdrf2x - converts the plain Cartesian coordinate files into PDB format. - -WHAM - processes the coordinate files from MREMD runs and computes temperature profiles - of ensemble averages and computes the probabilities of conformations at selected - temperatures; also prepares data for CLUSTER and ZSCORE. - -CLUSTER - does the cluster analysis of the conformations; for MREMD runs takes the - coordinate files from WHAM which contain information to compute probabilities - of conformations at any temperature. - -PHOENIX - conversion of UNRES conformations to all-atom conformations. - -ZSCORE - force field optimization (for developers). - -Please consult the manuals of the corresponding packages for details. Note that not -all of these packages are released yet; they will be released depending on their -readiness for distribution. Contact Adam Liwo, Cezary Czaplewski or Stanislaw Oldziej -for developmental versions of these programs. - -3.4. Programming language -------------------------- - -This version of UNRES is written almost exclusively in Fortran 77; some subroutines -for data management are in ansi-C. The package was parallelized with MPI. - -3.5. References ---------------- - -Citing the following references in your work that makes use of UNRES is gratefully -acknowledged: - -[1] A. Liwo, S. Oldziej, M.R. Pincus, R.J. Wawak, S. Rackovsky, H.A. Scheraga. - A united-residue force field for off-lattice protein-structure simulations. - I: Functional forms and parameters of long-range side-chain interaction potentials - from protein crystal data. J. Comput. Chem., 1997, 18, 849-873. - -[2] A. Liwo, M.R. Pincus, R.J. Wawak, S. Rackovsky, S. Oldziej, H.A. Scheraga. - A united-residue force field for off-lattice protein-structure simulations. - II: Parameterization of local interactions and determination - of the weights of energy terms by Z-score optimization. - J. Comput. Chem., 1997, 18, 874-887. - -[3] A. Liwo, R. Kazmierkiewicz, C. Czaplewski, M. Groth, S. Oldziej, R.J. Wawak, - S. Rackovsky, M.R. Pincus, H.A. Scheraga. - United-residue force field for off-lattice protein-structure simulations. - III. Origin of backbone hydrogen-bonding cooperativity in united-residue potentials. - J. Comput. Chem., 1998, 19, 259-276. - -[4] A. Liwo, C. Czaplewski, J. Pillardy, H.A. Scheraga. - Cumulant-based expressions for the multibody terms for the correlation between - local and electrostatic interactions in the united-residue force field. - J. Chem. Phys., 2001, 115, 2323-2347. - -[5] J. Lee, D.R. Ripoll, C. Czaplewski, J. Pillardy, W.J. Wedemeyer, H.A. Scheraga, - Optimization of parameters in macromolecular potential energy functions by - conformational space annealing. J. Phys. Chem. B, 2001, 105, 7291-7298 - -[6] J. Pillardy, C. Czaplewski, A. Liwo, W.J. Wedemeyer, J. Lee, D.R. Ripoll, - P. Arlukowicz, S. Oldziej, Y.A. Arnautova, H.A. Scheraga, - Development of physics-based energy functions that predict medium-resolution - structures for proteins of the alpha, beta, and alpha/beta structural classes. - J. Phys. Chem. B, 2001, 105, 7299-7311 - -[7] A. Liwo, P. Arlukowicz, C. Czaplewski, S. Oldziej, J. Pillardy, H.A. Scheraga. - A method for optimizing potential-energy functions by a hierarchical design - of the potential-energy landscape: Application to the UNRES force field. - Proc. Natl. Acad. Sci. U.S.A., 2002, 99, 1937-1942. - -[8] J. A. Saunders and H.A. Scheraga. - Ab initio structure prediction of two $\alpha$-helical oligomers - with a multiple-chain united-residue force field and global search. - Biopolymers, 2003, 68, 300-317. - -[9] J.A. Saunders and H.A. Scheraga. - Challenges in structure prediction of oligomeric proteins at the united-residue - level: searching the multiple-chain energy landscape with CSA and CFMC procedures. - Biopolymers, 2003, 68, 318-332. - -[10] S. Oldziej, U. Kozlowska, A. Liwo, H.A. Scheraga. - Determination of the potentials of mean force for rotation about Calpha-Calpha - virtual bonds in polypeptides from the ab initio energy surfaces of terminally - blocked glycine, alanine, and proline. J. Phys. Chem. A, 2003, 107, 8035-8046. - -[11] A. Liwo, S. Oldziej, C. Czaplewski, U. Kozlowska, H.A. Scheraga. - Parameterization of backbone-electrostatic and multibody contributions - to the UNRES force field for protein-structure prediction from ab initio - energy surfaces of model systems. J. Phys. A, 2004, 108, 9421-9438. - -[12] S. Oldziej, A. Liwo, C. Czaplewski, J. Pillardy, H.A. Scheraga. - Optimization of the UNRES force field by hierarchical design of the - potential-energy landscape. 2. Off-lattice tests of the method with single - proteins. J. Phys. Chem. B., 2004, 108, 16934-16949. - -[13] S. Oldziej, J. Lagiewka, A. Liwo, C. Czaplewski, M. Chinchio, - M. Nanias, H.A. Scheraga. - Optimization of the UNRES force field by hierarchical design of the - potential-energy landscape. 3. Use of many proteins in optimization. - J. Phys. Chem. B., 2004, 108, 16950-16959. - -[14] M. Khalili, A. Liwo, F. Rakowski, P. Grochowski, H.A. Scheraga. - Molecular dynamics with the united-residue model of polypeptide chains. - I. Lagrange equations of motion and tests of numerical stability in the - microcanonical mode, J. Phys. Chem. B, 2005, 109, 13785-13797. - -[15] M. Khalili, A. Liwo, A. Jagielska, H.A. Scheraga. - Molecular dynamics with the united-residue model of polypeptide chains. - II. Langevin and Berendsen-bath dynamics and tests on model $\alpha$-helical - systems. J. Phys. Chem. B, 2005, 109, 13798-13810. - -[16] A. Liwo, M. Khalili, H.A. Scheraga. - Ab initio simulations of protein-folding pathways by molecular dynamics with - the united-residue model of polypeptide chains. - Proc. Natl. Acad. Sci. U.S.A., 2005, 102, 2362-2367. - -[17] F. Rakowski, P. Grochowski, B. Lesyng, A. Liwo, H. A. Scheraga. - Implementation of a symplectic multiple-time-step molecular dynamics algorithm, - based on the united-residue mesoscopic potential energy function. - J. Chem. Phys., 2006, 125, 204107. - -[18] M. Nanias, C. Czaplewski, H.A. Scheraga. - Replica exchange and multicanonical algorithms with the coarse-grained - united-residue (UNRES) force field. - J. Chem. Theory and Comput., 2006, 2, 513-528. - -[19] A. Liwo, M. Khalili, C. Czaplewski, S. Kalinowski, S. Oldziej, K. Wachucik, - H.A. Scheraga. - Modification and optimization of the united-residue (UNRES) potential energy - function for canonical simulations. I. Temperature dependence of the effective - energy function and tests of the optimization method with single training - proteins. - J. Phys. Chem. B, 2007, 111, 260-285. - -[20] U. Kozlowska, A. Liwo, H.A. Scheraga. - Determination of virtual-bond-angle potentials of mean force for coarse-grained - simulations of protein structure and folding from ab initio energy surfaces of - terminally-blocked glycine, alanine, and proline. - J. Phys.: Condens. Matter, 2007, 19, 285203. - -[21] M. Chinchio, C. Czaplewski, A. Liwo, S. Oldziej, H.A. Scheraga. - Dynamic formation and breaking of disulfide bonds in molecular dynamics - simulations with the UNRES force field. - J. Chem. Theory and Comput., 2007, 3, 1236-1248. - -[22] A.V. Rojas, A. Liwo, H.A. Scheraga. - Molecular dynamics with the united-residue force field: Ab Initio folding - simulations of multichain proteins. - J. Phys. Chem. B, 2007, 111, 293-309. - -[23] A. Liwo, C. Czaplewski, S. Oldziej, A.V. Rojas, R. Kazmierkiewicz, - M. Makowski, R.K. Murarka, H.A. Scheraga. - Simulation of protein structure and dynamics with the coarse-grained UNRES - force field. In: Coarse-Graining of Condensed Phase and Biomolecular - Systems., ed. G. Voth, Taylor & Francis, 2008, Chapter 8, pp. 107-122. - -[24] C. Czaplewski, S. Kalinowski, A. Liwo, H.A. Scheraga. - Application of multiplexed replica exchange molecular dynamics - to the UNRES force field: tests with $\alpha$ and $\alpha+\beta$ proteins. - J. Chem. Theor. Comput., 2009, 5, 627-640. - -[24] Y. He, Y. Xiao, A. Liwo, H.A. Scheraga. - Exploring the parameter space of the coarse-grained UNRES force field by random - search: selecting a transferable medium-resolution force field. - J. Comput. Chem., 2009, 30, 2127-2135. - -[25] U. Kozlowska, A. Liwo. H.A. Scheraga. - Determination of side-chain-rotamer and side-chain and backbone - virtual-bond-stretching potentials of mean force from AM1 energy surfaces of - terminally-blocked amino-acid residues, for coarse-grained simulations of - protein structure and folding. 1. The Method. - J. Comput. Chem., 2010, 31, 1143-1153. - -[26] U. Kozlowska, G.G. Maisuradze, A. Liwo, H.A. Scheraga. - Determination of side-chain-rotamer and side-chain and backbone - virtual-bond-stretching potentials of mean force from AM1 energy surfaces of - terminally-blocked amino-acid residues, for coarse-grained simulations of - protein structure and folding. 2. Results, comparison with statistical - potentials, and implementation in the UNRES force field. - J. Comput. Chem., 2010, 31, 1154-1167. - -[27] A. Liwo, S. Oldziej, C. Czaplewski, D.S. Kleinerman, P. Blood, H.A. Scheraga. - Implementation of molecular dynamics and its extensions with the coarse-grained - UNRES force field on massively parallel systems; towards millisecond-scale - simulations of protein structure, dynamics, and thermodynamics. - J. Chem. Theor. Comput., 2010, 6, 890-909. - -4. INSTALLATION ---------------- - -The distribution is contained in the UNRES.tar.gz file. To uncompress say: - -gzip -cd UNRES.tar.gz | tar xf - - -This will produce a directory named UNRES with the following subdirectories: - -src_CSA - the CSA-version source directory. - -src_MD - the MD-version source directory, single chains. - -src_MD-M - the MD-version source directory, oligomeric proteins - -bin - the binaries/scripts directory; its BATCH_SCRIPTS directory contains the - batch scripts (at present the only example is for PBS: unres_3P_PBS.csh, - which is an UNRES calling script and start.mat, which is the batch script - submitted to the PBS system). - -doc - documentation (this file and EXAMPLES.TXT) - -examples - sample input files (see EXAMPLES.TXT for description). - -To produce the executable do the following: - -a) To build parallel version, make sure that MPI is installed in your system. - Note that the package will have limited functions when compiled in a single-CPU mode. - On linux cluster the command source $HOME/.env should be added to .tcshrc - or equivalent file to use parallel version of the program, the - alternative is to use queuing system like PBS. - In some cases the FORTRAN library subroutine GETENV does not work properly - with MPI, if the script is run interactively. In such a case try to - add the source mygentenv.F and turn on the -DMYGETENV preprocessor flag. - -b) Change directory to the respective source directory. - -c) Edit the appropriate Makefile (parallel program that includes CSA - procedure, the serial version is no longer supported, for serial task - parallel program can be run using only one processor) to customize to your - system. Makefiles for the following systems are provided: - - Makefile_osf_f90 - OSF1/Tru64 UNIX HP Alphaserver with f90 compiler, - Makefile_lnx_pgf90 - Linux, the pgf90 compiler, - Makefile_lnx_ifc - Linux, ifc compiler. - Makefile_win_pgf90 - Windows, the pgf90 compiler. - - Other systems should not cause problems; all you have to do is to change - the compiler, compiler options, and preprocessor options. Also, change the - BIN variable, if you want to put your binaries in other place than - PROTARCH/BIN. In the case of Makefile make sure that the MPI directories are - correctly specified. - - The following architectures are defined in the .F source files: - - AIX - AIX systems (put -DAIX as one of the preprocessor options, if - this is your system) - - LINUX - Linux (put -DLINUX) - - G77 - Gnu-Fortran compilers (might require sum moderate source code editing) - (put -DG77). The recommended compiler is gfortran and not g77. - - PGI - PGI compilers - - WINPGI - additional setting for PGI compilers for MS Windows - - SGI - all SGI platforms; should also be good for SUN platforms (put -DSGI) - - WIN - MS Windows with Digital Fortran compiler (put -DWIN) - - For other platforms, the only problems might appear in connection with - machine-specific I/O instructions. Many files are opened in the append - mode, whose specification in the OPEN statement is quite machine-dependent. - In this case you might need to modify the source code accordingly. - The other platform dependent routines are the timing routines contained - in timing.F. In addition to the platforms specified above, ES9000, SUN, - KSR, and CRAY are defined there. - - For parallel build -DMP and -DMPI must be set (these are set in Makefile). - - IMPORTANT! Apart from this, two define flags: -DCRYST_TOR and -DMOMENT - define earlier versions of the force field. The MUST NOT be entered, if - the CASP5 and later versions of the force field are used. - -d) Build the unres executables by typing at your UNIX prompt: - - make # will build unres - - make clean # will remove the object files - - The bin directory contains pre-built binaries for Red Hat Linux. These - executables are specified in the csh scripts listed in section 4. - -e) Customize the C-shell scripts unres.unres (to run the parallel version on - set of workstation). See the next section of this manual for guidance. - -After the executables are build and C-shell scripts customized, you can run the -test examples contained in UNRES/examples. - -5. CUSTOMIZING YOUR C-SHELL SCRIPT ----------------------------------- - -IMPORTANT NOTE - The unres.csh script is for Linux and should also be easily -adaptable to other systems running MPICH. This script is for interactive -parallel jobs. Examples of scripts compatible with PBS (pbs.sub) and LoadLever -(sp2.sub) queuing systems are also provided. - -Edit the following lines in your unres.csh script: - -set DD = your_database_directory - -e.g., if you installed the package on the directory /usr/local, this line -looks like this: - -set DD = /usr/local/UNRES/PARAM - -set BIN = your_binaries_directory - -set FGPROCS = number_of_processors_per_energy/force_evaluation (MD) - -e.g., if the root directory is as above: - -set BIN = /usr/local/UNRES/bin - -6. COMMAND LINE AND FILES -------------------------- - -To run UNRES interactively enter the following command at your Unix prompt -or put it in the batch script: - -unres.csh POTENTIAL INPUT N_PROCS - -where: - -POTENTIAL specifies the side-chain interaction potential type and must be -one of the following: - -LJ - 6-12 radial Lennard-Jones -LJK - 6-12 radial Lennard-Jones-Kihara (shifted Lennard Jones) -BP - 6-12 anisotropic Berne-Pechukas based on Gaussian overlap (dilated - Lennard-Jones) -GB - 6-12 anisotropic Gay-Berne (shifted Lennard-Jones) -GBV - 6-12 anisotropic Gay-Berne-Vorobjev (shifted Lennard-Jones) - -See section 4. (Force Fields) for explanation and usage. - -At present, only the LJ and GB potentials are applied. The LJ potential -is used in the "CASP3" version of the UNRES force field that is able -to predict only alpha-helical structures. All further version of the -UNRES force field use the GB potential. For the description of all above-mentioned -potentials see A. Liwo, St. Oldziej, M.R. Pincus, R.J. Wawak, S. Rackovsky, -H.A. Scheraga, J. Comput. Chem., 1997, 18, 849-873. - -INPUT is the prefix for input and output files (see below) - -N_PROCS is the number of processors; for a CSA or REMD/MREMD run it MUST be at least 2. - -Note! The script takes one more variable, FGPROCS, as the fourth argument, -which is the number of fine-grain processors to parallelize energy -evaluations. The corresponding code is in UNRES/CSA, but it was written -using MPL instead of MPI and therefore is never used in the present version. -At present we have no plans to rewrite fine-grain parallelization using MPI, -because we found that the scalability for up to 200 residue polypeptide -chains was very poor, due to a small number of interactions and, -correspondingly, unfavorable ratio of the overhead to the computation time. - -INPUT.inp contains the main input data and the control parameters of the CSA - method. - -INPUT.out_POTENTIAL_xxx - main output files from different processors; xxx - denotes the number of the processor - -INPUT_POTENTIALxxx.stat - summary files with the energies, energy components, - and RMS deviations of the conformations produced by each of the processors; - not used in CSA runs; also it outputs different quantity in MD/MREMD runs. - -CSA version specific files: - -INPUT_POTENTIALxxx.int - internal coordinates; in the CSA run - INPUT_POTENTIAL_000.int contains the coordinates of the conformations, - and the other files are empty - -INPUT.CSA.history - history file from a CSA run. This is an I/O file, because - it can be used to restart an interrupted CSA run. - -INPUT.CSA.seed - stores the random seed generated in a CSA run; written for - restart purposes. - -INPUT.CSA.bank - current bank of conformations obtained in CSA calculations - (expressed as internal coordinates). This information is also stored in - INPUT_POTENTIAL000.int - -INPUT.CSA.rbank - as above, but contains random-generated conformations. - -MD version specific files: - -INPUT_MDyyy.pdb - Cartesian coordinates of the conformations in PDB format. - -INPUT_MDyyy.x - Cartesian coordinates of the conformations in ASCII format. - -INPUT_MDyyy.cx - Cartesian coordinates of the conformations in compressed format - (need xdr2pdb to convert to PDB format). - -The program currently produces some more files, but they are not used -for any purposes and most of them are scratched after a run is completed. - -The run script also contains definitions of the parameter files through the -following environmental variables: - -SIDEPAR - parameters of the SC-SC interaction potentials (U_{SC SC}); -SCPPAR - parameters of the SC-p interaction potential (U_{SCp}); this file can - be ignored by specifying the -DOLDSCP preprocessor flag, which means that the - built-in parameters are used; at present they are the same as the parameters - in the file specified by SCPPAR; -ELEPAR - parameters of the p-p interaction potentials (U_{pp}); -FOURIER - parameters of the multibody potentials of the coupling between the - backbone-local and backbone-electrostatic interactions (U_{corr}); -THETPAR - parameters of the virtual-bond-angle bending potentials (U_b); -ROTPAR - parameters of the side-chain rotamer potentials (U_{rot}); -TORPAR - parameters of the torsional potentials (U_{rot}); -TORDPAR - parameters of the double-torsional potentials. -SCCORPAR - parameters of the supplementary torsional sequence-specific potentials - (not implemented yet). - -7. FORCE FIELDS ---------------- - -UNRES is being developed since 1997 and several versions of the force field -were produced. The settings and references to these force fields are -summarized below. - -Force fields for CSA version (can be used in MD but haven't been parameterized for this -purpose). - ---------------------------------------------------------------------------------------- - Additional SC-SC Example script Structural -Force field compiler flags potential and executables classes covered References - (Linux; PGF90 - and IFC) ---------------------------------------------------------------------------------------- - -CASP3 -DCRYST_TOR LJ unres_CASP3.csh only alpha [1-3] - -DCRYST_BOND unres_pgf90_cryst_tor.exe - -DCRYST_THETA unres_ifc6_cryst_tor.exe - -DCRYST_SC - -DMOMENT - -ALPHA -DMOMENT GB unres_CASP4.csh only alpha [4-6] - -DCRYST_BOND unres_pgf90_moment.exe - -DCRYST_THETA unres_ifc6_moment.exe - -DCRYST_SC - -BETA -DMOMENT GB unres_CASP4.csh only beta [4-6] - -DCRYST_BOND unres_pgf90_moment.exe - -DCRYST_THETA unres_ifc6_moment.exe - -DCRYST_SC - -ALPHABETA -DMOMENT GB unres_CASP4.csh all [4-6] - -DCRYST_BOND unres_pgf90_moment.exe - -DCRYST_THETA unres_ifc6_moment.exe - -DCRYST_SC - -CASP5 -DCRYST_BOND GB unres_CASP5.csh all [7,8,11] - -DCRYST_THETA unres_pgf90.exe - -DCRYST_SC unres_ifc6.exe - -3P -DCRYST_BOND GB unres_3P.csh all [12,13] - -DCRYST_THETA unres_pgf90.exe - -DCRYST_SC unres_ifc6.exe - -4P -DCRYST_BOND GB unees_4P.csh all [12,13] - -DCRYST_THETA unres_pgf90.exe - -DCRYST_SC unres_ifc6.exe ---------------------------------------------------------------------------------------- - -Force fields for MD version - ---------------------------------------------------------------------------------------- - Additional SC-SC Example script Structural -Force field compiler flags potential and executables classes covered References - (Linux; PGF90 - and IFC) ---------------------------------------------------------------------------------------- - -GAB -DCRYST_BOND GB unres_GAB.csh mostly alpha [19] - -DCRYST_THETA - -DCRYST_SC - -E0G -DCRYST_BOND GB unres_E0G.csh mostly alpha [19] - -DCRYST_THET - -DCRYST_SC - -1L2Y_1LE1 none GB unres_ab.csh all [20,25-27] - ---------------------------------------------------------------------------------------- - -The example scripts (the *.csh filed) contain all appropriate parameter files, while -the energy-term weights are provided in the example input files listed in EXAMPLES.TXT -(*.inp; see section 5. for description of the input files). However, it is user's -responsibility to specify appropriate compiler flags. Note that a version WILL NOT work, -if the force-field specific compiler flags are not set. The parameter files specified -in the run script also must strictly correspond to the energy-term weights specified in -the input file. The parameter files for specific force fields are also specified below -and the energy-term weights are specified in section 5. - -The parameter files are as follows (the environment variables from section 3 are -used to identify the parameters): - -CASP3: - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_cryst.parm -TORDPAR torsion_double_631Gdp.parm (not used) -SIDEPAR scinter_LJ.parm -ELEPAR electr.parm -SCPPAR scp.parm -FOURIER fourier_GAP.parm (not used) -SCCORPAR rotcorr_AM1.parm (not used) - -ALPHA, BETA, ALPHABETA (CASP4): - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_ecepp.parm -TORDPAR torsion_double_631Gdp.parm (not used) -SIDEPAR scinter_GB.parm -ELEPAR electr.parm -SCPPAR scp.parm -FOURIER fourier_GAP.parm -SCCORPAR rotcorr_AM1.parm (not used) - -CASP5: - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_631Gdp.parm -TORDPAR torsion_double_631Gdp.parm -SIDEPAR scinter_GB.parm -ELEPAR electr_631Gdp.parm -SCPPAR scp.parm -FOURIER fourier_opt.parm.1igd_iter7n_c -SCCORPAR rotcorr_AM1.parm (not used) - -3P: - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_631Gdp.parm -TORDPAR torsion_double_631Gdp.parm -SIDEPAR sc_GB_opt.3P7_iter81_1r -ELEPAR electr_631Gdp.parm -SCPPAR scp.parm -FOURIER fourier_opt.parm.1igd_hc_iter3_3 -SCCORPAR rotcorr_AM1.parm (not used) - -4P: - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_631Gdp.parm -TORDPAR torsion_double_631Gdp.parm -SIDEPAR sc_GB_opt.4P5_iter33_3r -ELEPAR electr_631Gdp.parm -SCPPAR scp.parm -FOURIER fourier_opt.parm.1igd_hc_iter3_3 -SCCORPAR rotcorr_AM1.parm (not used) - -GAB: - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_631Gdp.parm -TORDPAR torsion_double_631Gdp.parm -SIDEPAR sc_GB_opt.1gab_3S_qclass5no310-shan2-sc-16-10-8k -ELEPAR electr_631Gdp.parm -SCPPAR scp.parm -FOURIER fourier_opt.parm.1igd_hc_iter3_3 -SCCORPAR rotcorr_AM1.parm - -E0G: - -BONDPAR bond.parm -THETPAR thetaml.5parm -ROTPAR scgauss.parm -TORPAR torsion_631Gdp.parm -TORDPAR torsion_double_631Gdp.parm -SIDEPAR sc_GB_opt.1e0g-52-17k-2k-newclass-shan1e9_gap8g-sc -ELEPAR electr_631Gdp.parm -SCPPAR scp.parm -FOURIER fourier_opt.parm.1igd_hc_iter3_3 -SCCORPAR rotcorr_AM1.parm - -1L2Y_1LE1: - -BONDPAR bond_AM1.parm -THETPAR theta_abinitio.parm -ROTPAR rotamers_AM1_aura.10022007.parm -TORPAR torsion_631Gdp.parm -TORDPAR torsion_double_631Gdp.parm -SIDEPAR scinter_${POT}.parm -ELEPAR electr_631Gdp.parm -SCPPAR scp.parm -FOURIER fourier_opt.parm.1igd_hc_iter3_3 -SCCORPAR rotcorr_AM1.parm - -Additionally, for 1L2Y_1LE1, the following environment variables and files are required -to generate random conformations: - -THETPARPDB thetaml.5parm -ROTPARPDB scgauss.parm - -For CSA, the best force field is 4P. For MD, the 1L2Y_1LE1 force field is best for -ab initio prediction but provides medium resolution (5 A for 60-residue proteins) and -overemphasizes beta structures and has to be run with secondary-structure-prediction -information. For prediction of the structure of mostly alpha-protein, and for running -dynamics of large proteins, the best is the GAB force field. All these force fields -were trained by using our procedure of hierarchical optimization [5]. -The 4P and 1L2Y_1LE1 force fields have considerable power independent of structural class. -The ALPHA, BETA, and ALPHABETA force fields (for CSA) were used in the CASP4 exercises -and the CASP5 force field was used in the CASP5 exercise with some success; ALPHA -predicts reasonably the structure of alpha-helical proteins and is still not obsolete, -while for beta and alpha+beta structure prediction -3P or 4P should be used, because they are cheaper and more reliable than BETA and -ALPHABETA. The early CASP3 force field is included for historical reasons only. - -7. INPUT FILES --------------- - -7.1. Main input data file -------------------------- - -Most of the data are organized as data lists, where the data can be put -in any order, using a series of statements of the form: - -KEYWORD=value - -for simple non-logical variables - -or just - -KEYWORD - -to indicate that the corresponding option is turned on. For array variables -the assignment statement is: - -KEYWORD=value1,value2,... - -However, the data lists are unnamed and that must be placed EXACTLY in the -order indicated below. The presence of an "&" in the 80th column of a line -indicates that the next line will belong to the same data group. The parser -subroutines that interpret the keywords are case insensitive. - -Each group of data organized as a data list is indicated as "data list format" -input. - -8.1.1. Title ------------- -Any string containing up to 80 characters. The first input line is always -interpreted as title. - -8.1.2. Control data (data list format; READ_CONTROL subroutine) ---------------------------------------------------------------- - -8.1.2.1 Keywords to chose calculation type ------------------------------------------- - -OUT1FILE - only the master processor prints the output file in a parallel job - -MINIMIZE - if present, energy minimization will be carried out. - -REGULAR - regularize the read in conformation (usually a crystal or - NMR structure) by doing a series of three constrained minimizations, - to keep the structure as close as possible to the starting - (experimental) structure. The constraints are the CA-CA distances - of the initial structure. The constraints are gradually diminished - and removed in the last minimization. - -SOFTREG - regularize the read in conformation (usually a crystal or NMR - structure) by doing a series of constrained minimizations, with - additional use of soft potential and secondary structure - freezing, to keep the structure as close as possible to the - starting (experimental) structure. - - -CSA - if present, the run is a CSA run. At present, this is the only - reliable mode of doing global conformational search with this - package; it is NOT recommended to use MCM or THREAD for this - purpose. - -MCMA - if present, this is a Monte Carlo Minimization (MCM) run. - -MULTCONF- if present, conformations will be read from the INPUT.intin - file. - -MD - run canonical MD (single or multiple trajectories) - -RE - run REMD or MREMD (parallel jobs only) - -MUCA - run multicanonical MD calculations (parallel jobs only) - -MAP=number (integer) -Conformational map will be calculated in chosen angles. - -THREAD=number (integer) -Threading or threading-with-minimization run, using a database of structures -contained in the $DD/patterns.cart pattern data base (502 chains or chain -fragments), using a total number patterns. It is recommended to use this with -energy minimization; this implies regularization of each minimized pattern. -For references see A. Liwo, M.R. Pincus, R.J. Wawak, -S. Rackovsky, St. Oldziej, H.A. Scheraga, J. Comput. Chem., 1997, 18, 874-887 -and A. Liwo, St. Oldziej, R. Kazmierkiewicz, M. Groth, C. Czaplewski, -Acta Biochim. Pol., 1997, 44, 527-547. - -CHECKGRAD - compare numerical and analytical gradient; to be followed by: - CART - energy gradient in virtual-bond vectors (Cartesian coordinates) - INT - energy gradient in internal coordinates (default) - CARINT - derivatives of the internal coordinates in the virtual-bond vectors. - -8.1.2.2 Specification of protein and structure output in non-MD applications ----------------------------------------------------------------------------- - -ONE_LETTER - one-letter and not three-letter code of the amino-acid residues - is used - -SYM (1) - number of chains with same sequence (for oligomeric proteins only), - -PDBSTART - the initial conformation is read in from a PDB file - -UNRES_PDB - the starting conformation is in UNRES representation (Calpha - and SC coordinates only). This keyword MUST appear in such a case - or the program will generate erroneous and unrealistic side-chain - coordinates. - -RAND_CONF- start from a random conformation - -EXTCONF - start from an extended chain conformation - -PDBOUT - if present, conformations will be output in PDB format. Note that - this keyword affects only the output from single energy evaluation, - energy minimization and multiple-conformation data. To request - conformations from MD/MREMD runs in PDB format, the MDPDB keyword - must be placed on the MD input record. - -MOL2OUT - if present, conformations will be output in SYBYL mol2 format - -REFSTR - if present, reference structure will be read (e.g., to monitor - the RMS deviation from the crystal structure) - -PDBREF - if present, a reference structure will be read in to compare - the calculated conformations with it - -UNRES_PBD - the starting/reference structure is read from an UNRES-generated - PDB file - -Keywords: PDBOUT, MOL2OUT, PDBREF, and PDBSTART are ignored for a CSA run. -Output mode for MD version is specified in MD input (see section 5.5). - -8.1.2.3. Miscellaneous ----------------------- - -CONSTR_DIST=number -0 - no distance restraints ->0 imposes harmonic restraints on selected distances; see section 5.12. -In MD version, also restraints on the q variable [18] can be used. - -WEIDIS=number (real) -the weight of the distance term; applies for REGULARIZE and THREAD, otherwise -ignored. - -USE_SEC_PRED - use secondary-structure prediction information. - -SEED=number (integer) (no default) -Random seed (required, even if the run is not a CSA, MCM, MD or MREMD run) - -PHI - only the virtual-bond dihedral angles gamma are considered as - variables in energy minimization - -BACK - only the backbone virtual angles (virtual-bond angles theta and - virtual-bond dihedral angles gamma) are considered as variables - in energy minimization - -By default, all internal coordinates: theta, gamma, and the side-chain -centroid polar angles alpha and beta are considered as variables in energy -minimization. - -RESCALE_MODE=number (real) -Choice of the type of temperature dependence of the force field. -0 - no temperature dependence -1 - homographic dependence (not implemented yet with any force field) -2 - hyperbolic tangent dependence [18]. - -T_BATH=number (real) -temperature (for MD runs and temperature-dependent force fields). - -The following keywords apply to MCM only: - -MAXGEN=number (integer) (10000) -maximum number of conformations generated in a single MCM iteration - -MAXOVERLAP=number (integer) (1000) -maximum number of conformations with "bad" overlaps allowed to appear in a -row in a single MCM iteration. - -DISTCHAINMAX - (multi-chain capacity only) maximum distance between the - last residue of a given chain and the first residue of the - next chain such that restraints will not be imposed; quartic - restraints will be imposed for greater distances. - -ENERGY_DEC - detailed energies will be printed for each interacting pair - or each virtual bond, virtual-bond angle and dihedral angle, - side chain, etc. DO NOT use unless a single energy evaluation - was requested. - -8.1.3. Minimizer options (data list, subroutine READ_MINIM) ------------------------------------------------------------ - -This data group is present, if MINIMIZE was specified on the control card. -Otherwise, it must not appear. - -CART - minimize in virtual-bond vectors instead of angles - -MAXMIN=number (integer) (2000) -maximum number of iterations of the SUMSL minimizer - -MAXFUN=number (integer) (5000) -maximum number of function evaluations in a single minimization - -TOLF=number (real) (1.0e-2) -Tolerance on function - -RTOLF=number (real) (1.0d-4) -Relative tolerance on function - -The SUMSL minimizer is used in UNRES/CSA. For detailed description of -the control parameters see the source file cored.f and sumsld.f - - -8.1.4 CSA control parameters ----------------------------- - -This data group should be present only, if CSA was specified on the control -card. It is recommended that the readers to read publications on CSA method -for more complete description of the parameters. Brief description of -parameters: - -NCONF=number (integer) (50) -This corresponds to the size of the bank at the beginning of the -CSA procedure. The size of the bank, nbank, is set to nconf. -If necessary (at much later stages of the CSA: see icmax below), -nbank increases by multiple of nconf. - -JSTART=number (integer) (1) -JEND=number (integer) (1) -This corresponds to the limit values of do loop, each of which -corresponds to an separate CSA run. If jstart=1, and jstart=100, -this routine will repeat 100 separate CSA runs (limited by CPU) -each one with separate random number initialization. -The only difference between two CSA runs (one with jstart=jend=1 -and another one with jstart=jend=2) would be different random -number initializations if other parameters are identical. - -NSTMAX=number (integer) (500000) -This is to set a limit the total number of local minimizations of CSA -before termination. - -N1=number (integer) (6) -N2=number (integer) (4) -N3=number (integer) (0) -N4=number (integer) (0) -N5=number (integer) (0) -N6=number (integer) (10) -N7=number (integer) (0) -N8=number (integer) (0) -N9=number (integer) (0) -IS1=number (integer) (1) -IS2=number (integer) (8) -These numbers are used to generate trial conformations for each seed. -See the file, "newconf.f", for more details. - n1: the total number of trial conformations for each seed by substituting - nran number of variable angles (see subroutine newconf1ab and - subroutine newconf1ar) - n2: the total number of trial conformations for each seed by substituting - nran number of groups of variable angles (see subroutine newconf1bb and - subroutine newconf1br) - n3: the total number of trial conformations for each seed by substituting - a window of residues which forms a beta-hairpin, if there is no enough - beta-hairpins uses the same algorithm as n6 - n4: the total number of trial conformations for each seed by shifting the - turn in beta-hairpin by +/- 1 or 2 residues, if there is no enough - beta-hairpins uses the same algorithm as n6 - n5: not used - n6: the total number of trial conformations for each seed by substituting - a window of residues [is1,is2] inclusive. The size of the window is - determined in a random fashion (see subroutine newconf_residue for - generation of the trial conformations) - n7: the total number of trial conformations for each seed by copying a - remote strand pair forming nonlocal beta-sheet contact - n8: the total number of trial conformations for each seed by copying an - alpha-helical segment - n9: the total number of trial conformations for each seed by shifting the - alpha-helical segment by +/- 1 or 2 residues - -Typical values used for a 75-residue helical protein is -(6 4 0 0 0 10 1 26) for (n1,n2,n3,n4,n5,n6,is1,is2), respectively. -In this example, a total of 20 trial conformations are generated for a seed -Usually is1=1 is used for all applications, and the value of is2 is set about -to 1/3 of the total number of residues. n3, n4 and n7 are design to help in -case of proteins with beta-sheets - -NRAN0=number (integer) (4) -NRAN1=number (integer) (2) -IRR=number (integer) (1) -These numbers are used to determine if the CSA stage is very early. -One can use (4 2 1) for these values. For more details one should look into -the file, "newconf.f", for more details. - -NTOTAL=number (integer) (10000) -CUT1=number (real) (2.0) -CUT2=number (real) (5.0) -Annealing schedule is set in following fashion. -The value of D_cut is reduced geometrically from 1/cut1 of D_ave (at the -beginning) to 1/cut2 of D_ave (after ntotal number of minimizations) where -D_ave is the average distance between two conformations in the First_bank. - -ESTOP=number (real) (-3000.0) -The CSA procedure stops if a conformations with energy lower than estop is -obtained. If the do-loop set by jstart and jend requires more than one loop, -the program will go on until the do-loop is finished. - -ICMAX=number (integer) (3) -The maximum value of cycle (see the original publications for details). -If the number of cycle exceeds this value the program will add nconf -more conformations to Bank and First_bank to continue CSA procedure if -the new size of the nbank is within the maximum set by nbankm (see above). -If the size of nbank exceeds the maximum set by nbankm the CSA procedure -for this run will stop and next CSA will begin depending on the do-loop -set by jstart and jend. - -IRESTART=number (integer) (0) -This tells you if the run is fresh start (irestart=0) or a restart (irestart=1) -starting from an old results - -NDIFF=number (integer) (2) -The number of variables use in comparison when structure is added to the -bank,4 - all angels, 2 - only backbone angles gamma and theta - -NBANKTM=number (integer) (0) -The maximum number of structures saved in *.CSA.bankt as history of the run -Do not use bankt on massively parallel computation as it kills scalability. - -DELE=number (real) (20.0) -Energy cutoff for bankt. - -DIFCUT=number (real) (720.0) -Angle cutoff for bankt. - -IREF=number (integer) (0) -0 - normal run, 1 - local CSA which generates only structures close to the -reference one read from *.CSA.native.int file - -RMSCUT=number (real) (4.0) -CA RMSD cut off used in local CSA - -PNCCUT=number (real) (0.5) -Percentage of native contact used in local CSA - -NCONF_IN=number (integer) (0) -The number of conformation read for the first bank from the input file -*.intin - -Optionally, the CSA parameters can be read from file INPUT.CSA.in, if -this file exists. If so, they are read in free format in the following -order: - -nconf -jstart,jend -nstmax -n1,n2,n3,n4,n5,n6,n7,n8,is1,is2 -nran0,nran1,irr -nseed -ntotal,cut1,cut2 -estop -icmax,irestart -ntbankm,dele,difcut -iref,rmscut,pnccut -ndiff - - -8.1.5. MCM data (data list, subroutine MCMREAD) ------------------------------------------------ - -This data group is present, if MCM was specified on the control card. -Otherwise it must not appear. - -MAXACC=number (integer) (100) -Maximum number of accepted conformations - -MAXTRIAL=number (integer) (100) -Maximum number of unsuccessful trials in a row - -MAXTRIAL_ITER=number (integer) (1000) -Maximum number of unsuccessful trials in a single iteration - -MAXREPM=number (integer) (200) -Maximum number of repetitions of the same minimum - -RANFRACT=number (real) (0.5d0) -Fraction of chain-rebuild motions - -OVERLAP=number (real) (1.0d3) -Bad contact energy criterion - -NSTEPH=number (integer) (0) -Number of heating step in adaptive sampling - -NSTEPC=number (integer) (0) -Number of cooling step in adaptive sampling - -TMIN=number (real) (298.0d0) -Minimum temperature in adaptive-temperature sampling) - -TMAX=number (real) (298.0d0) -Maximum temperature in adaptive-temperature sampling) - -The temperature is changed according to the formula: - -T = TMIN*EXP(ISTEPH*(TMAX-TMIN)/NSTEPH) when heating - -and - -T = TMAX*EXP(-ISTEPC*(TMAX-TMIN)/NSTEPC) when cooling - -The default is to use a constant temperature. - -NWINDOW=number (integer) (0) -Number of windows in which the variables will be perturbed; the windows are -defined by the numbers of the respective amino-acid residues. If NWINDOW -is nonzero, after specifying all MCM input the next lines must define the -windows. Each line looks like this: - -winstart winend (free format) - -e.g. if NWINDOW=2, the input: - -4 10 -15 20 - -will mean that only the variables of residues 4-10 and 15-20 will be perturbed. -However, in general, all variables will be considered in minimization. - -PRINT_MC=number (0) -Printout level in MCM. 0 - no intermediate printing, 1 and 2 - moderate -printing, 3 - extensive printing. - -NO_PRINT_STAT - no output to INPUT_POTENTIALxxx.stat. - -NO_PRINT_INT - no internal-coordinate output to INPUT_POTENTIALxxx.int. - -8.1.6. MD data (subroutine READ_MDPAR) --------------------------------------- - -NSTEP (1000000) number of time steps per trajectory. - -NTWE (100) NTWX (1000) frequency of energy and coordinate output, respectively. -The coordinates are dumped in the pdb or compressed Gromacs (cx) format, -depending on the next keyword. -NTWE=0 means no energy dump. - -MDPDB - dump coordinates in the PDB format (cx otherwise) - -TRAJ1FILE only the master processor outputs coordinates. This feature pertains - only to REMD/MREMD jobs and overrides NTWX; coordinates are dumped at every - exchange in MREMD. - -REST1FILE only the master writes the restart file - -DT (real) (0.1) time step; the unit is "molecular time unit" (mtu); 1 mtu = 48.9 fs - -DAMAX (real) (1.0) maximum allowed change of acceleration during a single time step. -The time step gets scaled down, if this is exceeded. - -DVMAX (real) (20.0) maximum allowed velocity (in A/mtu) - -EDRIFTMAX (real) (10.0) maximum allowed energy drift in a single MD step (10 kcal/mol) - -REST restart flag. The calculation is restarted if present. - -LARGE very detailed output. Don't use except for debugging. - -PRINT_COMPON prints energy components. - -RESET_MOMENT (1000) frequency of zeroing out the total angular momentum when -running Berendsen mode calculations (for Langevin calculations meaningless). - -RESET_VEL=number (integer) (1000) - frequency of resetting velocities to values -from Gaussian distribution. - -RATTLE - use RATTLE algorithm (constraint bonds); not yet implemented. - -RESPA - use the Multiple Time Step (MTS) or Adaptive Multiple Time Step (A-MTS) -algorithm [17]. Without this flag the variable time step (VTS) [14] is run. - -NTIME_SPLIT=number (integer) (1) - initial number of time-split steps - -MAXTIME_SPLIT=number(integer) (64) - maximum number of time-split step - -If NTIME_SPLIT==MAXTIME_SPLIT, MTS is run. - -R_CUT=number (real) (2.0) - the cut-off distance in splitting the forces into short- and -long-range in site-site VDW distance units. - -LAMBDA (real) (0.3) - the transition length (in site-site VDW distance units) between -short- and long-range forces. - -XIRESP - flag to use MTS/A-MTS with Nose-Hoover/Nose-Poincare thermostats. - -LANG=number (integer) (0) Langevin dynamics flag: - -0 - No explicit Langevin dynamics. -1 - Langevin with direct integration of the equations of motion (recommended - for Langevin calculations) -2 - Langevin calculation with analytical pre-integration of the friction and - stochastic part of the equations of motion using an algorithm adapted from TINKER. - This is MUCH MORE time- and memory-consuming than 1 and requires compiling without - the -DLANG0 flag and enormously increases memory requirements. -3 - The stochastic integrator developed by Cicotti and coworkers. -4 - for other stochastic integrators (not used at present). - -Note: With the enclosed code, the -DLANG0 compiler flag is included which disables -LANG=2 and LANG=3 - -TBF Berendsen thermostat. - -TAU_BATH (1.0) (units are mtus; 1mtu=48.9 fs) constant of the coupling to the thermal bath - used with the Berendsen thermostat. - -NOSEPOINCARE99 - the Nose-Poincare thermostat as of 1999 will be used. - -NOSEPOINCARE01 - the Nose-Poincare thermostat as of 2001 will be used. - -NOSEHOOVER96 - the Nose-Hoover thermostat will be used. - -Q_NP=number (real) (0.1) - the value of the mass of the fictitious particle in the calculations - with the Nose-Poincare thermostat. - -T_BATH (300.0) (in K) temperature of canonical simulation or temperature to generate -velocities. - -ETAWAT (0.8904) viscosity of water (in centipoises) - -RWAT (1.4) radius of water molecule (in A) - -SCAL_FRIC=number (real) (0.02) - scaling factor of the friction coefficients. - -SURFAREA - scale friction acting on atoms by atoms' solvent accessible area. - -RESET_FRICMAT=number (integer) (1000) - recalculate friction matrix every RESET_FRICMAT MD steps. - -USAMPL restraints on q (see reference 5 for meaning) will be imposed (see section . -In this case, the next records specify the restraints; these records are -placed before the list of temperatures or numbers of trajectories. - -EQ_TIME=number (real) (1.0e4) time (in mtus; 1 mtu=48.9 fs) after which restraints -on q will start to be in force. - -If USAMPL has been specified, the following information must be supplied after the -main MD input data record (subroutine READ_FRAGMENTS): - -Line 1: nset, npair, nfrag_back (number of sets of restraints, number of restrained -fragments, number of restrained pairs, number of restrained backbone fragments -(in terms of theta and gamma angles) - -For each set of restraints (1, 2,..., nset): - -mset(iset) - how many times the set is multiplied - -wfrag(i,iset), ifrag(1,i,iset), ifrag2(2,i,iset),qfrag(i,iset) -weight of the restraint, first and last residue of the fragment, target q value. -This information is repeated through nfrag. - -wpair(i,iset), ipair(1,i,iset), ipair(2,i,iset),qinpair(i,iset) -weight of the restraint, first and second fragment of the pair (according to fragment -list), target q value. This information is repeated through npair - -wfrag_back(1,i,iset), wfrag_back(2,i,iset), wfrag_back(3,i,iset), -ifrag_back(1,i,iset),ifrag_back(2,i,iset) -weight of the restraints on theta angles, weight on the restraints on gamma angles, -weight of the restraints on side-chain rotamers, first residue of the fragment, -last residue of the fragment. This information is repeated through nfrag_back. - -8.1.7 REMD/MREMD data (subroutine READ_REMDPAR) ------------------------------------------------ - -NREP (3) number of replicas in a REMD/MREMD run - -NSTEX (1000) number of steps after which exchange is performed in REMD/MREMD - runs - -The temperatures in replicas can be specified through - -RETMIN (10.0) minimum temperature in a REMD/MREMD run - -RETMAX (1000.0) maximum temperature in a REMD/MREMD run - -Then the range from retmin to retmax is divided into equal segments and -temperature of the replicas assigned accordingly, - -or - -TLIST means that the NREP temperature of the replicas will be input in the -next record - -MLIST numbers of trajectories per each of the NREP temperatures will be -specified in the record after the list of temperatures; this specifies -a MREMD run. - -Important! The number of processors must be exactly equal to the number of -trajectories, i.e., NREP for a REMD run or sum_i mlist(i) for a MREMD run. - -SYNC - all trajectories will be synchronized every NSTEX time steps -(by default, they are not synchronized) - -TRAJ1FILE only the master processor outputs coordinates. This feature pertains - only to REMD/MREMD jobs and overrides NTWX; coordinates are dumped at every - exchange in MREMD. - -REST1FILE only the master writes the restart file - -HREMD - Hamiltonian replica exchange flag; not only temperatures but also -sets energy-term weights are exchanged between conformations. - -TONLY - run a "fake" HREMD with many sets of energy-term weights in a -single run but only temperature exchange. - -8.1.8 Energy-term weights (data list; subroutine MOLREAD) ---------------------------------------------------------- - -WLONG=number (real) (1.0d0) -common weight of the U(SC-SC) (side-chain side-chain interaction) -and U(SC,p) (side-chain peptide-group) term - -WSCC = number (real) (WLONG) -weight of the U(SC-SC) term - -WSCP = number (real) (WLONG) -weight of the U(SC-p) term - -WELEC=number (real) (1.0d0) -weight of the U(p-p) (peptide-group peptide-group interaction) term - -WEL_LOC=number (real) (1.0d0) -weight of the U_el_loc^3 (local-electrostatic cooperativity, third-order) term - -WCORRH=number (real) (1.0d0) -weight of the U(corr) (cooperativity of hydrogen-bonding interactions, fourth-order) term - -WCORR5=number (real) (0.0d0) -weight of the U_el_loc^5 (local-electrostatic cooperativity, 5th order -contributions) - -WCORR6=number (real) (0.0d0) -weight of the U_el_loc^6 (local-electrostatic cooperativity, 6th order -contributions) - -WTURN3=number (real) (1.0d0) -weight of the U_turn^3 (local-electrostatic cooperativity within 3 residue -segment, 3rd order contribution) - -WTURN4=number (real) (1.0d0) -weight of the U_turn^4 (local-electrostatic cooperativity within 4 residue -segment, 4rd order contributions) - -WTURN6=number (real) (1.0d0) -weight of the U_turn^6 (local-electrostatic cooperativity within 6 residue -segment, 6rd order contributions) - -WTOR=number (real) (1.0d0) -weight of the torsional term U(tor) - -WANG=number (real) (1.0d0) -weight of the virtual-bond angle bending term U(b) - -WSCLOC=number (real) (1.0d0) -weight of the side-chain rotamer term U(SC) - -WSTRAIN=number (real) (1.0d0) -scaling factor of the distance-constrain or disulfide-bond strain energy term - -SCALSCP=number (real) (1.0d0) -scaling factor of U(SC,p); this is an alternative to specifying WSCP; in -this case WSCP will be calculated as WLONG*SCALSCP - -SCAL14=number (real) (1.0d0) -scaling factor of the 1,4 SC-p interactions - -CUTOFF (7.0) - cut-off on backbone-electrostatic interactions to compute 4- -and higher-order correlations - -DELT_CORR (0.5) - thickness of the distance range in which the energy is -decreased to zero - -The defaults are NOT the recommended values. No "working" default values -have been set, because the force field is still under development. The values -corresponding to the force fields listed in section 4 are as follows: - -CASP3: -WELEC=1.5 WSTRAIN=1.0 WTOR=0.08617 WANG=0.10384 WSCLOC=0.10384 WCORR=1.5 & -WTURN3=0 WTURN4=0 WTURN6=0 WEL_LOC=0 WCORR5=0 WCORR6=0 SCAL14=0.40 SCALSCP=1.0 & -CUTOFF=7.00000 WSCCOR=0.0 - -ALPHA: -WSC=1.00000 WSCP=0.72364 WELEC=1.10890 WANG=0.68702 WSCLOC=1.79888 & -WTOR=0.30562 WCORRH=1.09616 WCORR5=0.17452 WCORR6=0.36878 WEL_LOC=0.19508 & -WTURN3=0.00000 WTURN4=0.55588 WTURN6=0.11539 CUTOFF=7.00000 WCORR4=0.0000 & -WTORD=0.0 WSCCOR=0.0 - -BETA: -WSC=1.00000 WSCP=1.10684 WELEC=0.70000 WANG=0.80775 WSCLOC=1.91939 & -WTOR=3.36070 WCORRH=2.50000 WCORR5=0.99949 WCORR6=0.46247 WEL_LOC=2.50000 & -WTURN3=1.80121 WTURN4=4.35377 WTURN6=0.10000 CUTOFF=7.00000 WCORR4=0.00000 & -WSCCOR=0.0 - -ALPHABETA: -WSC=1.00000 WSCP=1.43178 WELEC=0.41501 WANG=0.37790 WSCLOC=0.12880 & -WTOR=1.98784 WCORRH=2.50526 WCORR5=0.23873 WCORR6=0.76327 WEL_LOC=2.97687 & -WTURN3=0.09261 WTURN4=0.79171 WTURN6=0.01074 CUTOFF=7.00000 WCORR4=0.00000 & -WSCCOR=0.0 - -CASP5: -WSC=1.00000 WSCP=1.54864 WELEC=0.20016 WANG=1.00572 WSCLOC=0.06764 & -WTOR=1.70537 WTORD=1.24442 WCORRH=0.91583 WCORR5=0.00607 WCORR6=0.02316 & -WEL_LOC=1.51083 WTURN3=2.00764 WTURN4=0.05345 WTURN6=0.05282 WSCCOR=0.0 & -CUTOFF=7.00000 WCORR4=0.00000 WSCCOR=0.0 - -3P: -WSC=1.00000 WSCP=2.85111 WELEC=0.36281 WANG=3.95152 WSCLOC=0.15244 & -WTOR=3.00008 WTORD=2.89863 WCORRH=1.91423 WCORR5=0.00000 WCORR6=0.00000 & -WEL_LOC=1.72128 WTURN3=2.99827 WTURN4=0.59174 WTURN6=0.00000 & -CUTOFF=7.00000 WCORR4=0.00000 WSCCOR=0.0 - -4P: -WSC=1.00000 WSCP=2.73684 WELEC=0.06833 WANG=4.15526 WSCLOC=0.16761 & -WTOR=2.99546 WTORD=2.89720 WCORRH=1.98989 WCORR5=0.00000 WCORR6=0.00000 & -WEL_LOC=1.60072 WTURN3=2.36351 WTURN4=1.34051 WTURN6=0.00000 & -CUTOFF=7.00000 WCORR4=0.00000 WSCCOR=0.0 - -GAB: -WLONG=1.35279 WSCP=1.59304 WELEC=0.71534 WBOND=1.00000 WANG=1.13873 & -WSCLOC=0.16258 WTOR=1.98599 WTORD=1.57069 WCORRH=0.42887 WCORR5=0.00000 & -WCORR6=0.00000 WEL_LOC=0.16036 WTURN3=1.68722 WTURN4=0.66230 WTURN6=0.00000 & -WVDWPP=0.11371 WHPB=1.00000 & -CUTOFF=7.00000 WCORR4=0.00000 - -E0G: -WLONG=1.70905 WSCP=2.18310 WELEC=1.06684 WBOND=1.00000 WANG=1.17536 & -WSCLOC=0.22070 WTOR=2.65798 WTORD=2.00646 WCORRH=0.23541 WCORR5=0.00000 & -WCORR6=0.00000 WEL_LOC=0.42789 WTURN3=1.68126 WTURN4=0.75080 WTURN6=0.00000 & -WVDWPP=0.27044 WHPB=1.00000 WSCP14=0.00000 & -CUTOFF=7.00000 WCORR4=0.00000 - -1L2Y_1LE1: -WLONG=1.00000 WSCP=1.23315 WELEC=0.84476 WBOND=1.00000 WANG=0.62954 & -WSCLOC=0.10554 WTOR=1.84316 WTORD=1.26571 WCORRH=0.19212 WCORR5=0.00000 & -WCORR6=0.00000 WEL_LOC=0.37357 WTURN3=1.40323 WTURN4=0.64673 WTURN6=0.00000 & -WVDWPP=0.23173 WHPB=1.00000 WSCCOR=0.0 & -CUTOFF=7.00000 WCORR4=0.00000 - -8.1.9. Input and/or reference PDB file name (text format; subroutine MOLREAD) ------------------------------------------------------------------------------ - -If PDBSTART or PDBREF was specified in the control card, this line contains -the PDB file name. Trailing slashes to specify the full path are permitted. -The file name can contain up to 64 characters. - -8.1.10. Amino-acid sequence (free and text format) --------------------------------------------------- - -This data appears, if PDBSTART was not specified, otherwise must not be present -because the sequence would be taken from the PDB file. The first line contains -the number of amino-acid residues, including the end groups (free format), -the next lines contain the sequence in 20(1X,A3) format for the three-letter -or 80A1 format for the one-letter code. There are two types of end-groups: -Gly (three-letter code) or G (one-letter code), if an end group contains a full -peptide bond (e.g., the acetyl N-terminal group or the carboxyamide C-terminal -group) and D (in the three-letter code) or X (in the one-letter code), if the -end group does not contain a peptide group (e.g., the NH2 N-terminal end group -or the COOH C-terminal end group). (Note the Gly or G also denotes the regular -glycine residue, if found in the middle of a chain). -In the second case the end group is considered as a "dummy" group and serves -only to define the first (last) virtual-bond dihedral angle gamma for the -first (last) full amino-acid residue. - -Consider, for example, the Ac-Ala(19)-NHMe polypeptide. The three-letter code -input will look like this: - -21 - Gly Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala - Gly - -And the one-letter code input will be: - -21 -GAAAAAAAAAAAAAAAAAAAG - -If the sequence is changed to NH3(+)-Ala(19)-COO(-), the inputs will look -like this: - -21 - D Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala - D - -and - -21 -XAAAAAAAAAAAAAAAAAAAX - -The sequence input is case-insensitive, because the present version of UNRES -considers each amino-acid residue as an L-residue (there are no torsional -parameters for the combinations of the D- and L-residues yet). Furthermore, -each peptide group is considered as a trans group. - -If the version of UNRES has multi-chain capacity, placing a dummy residue -inside the sequence indicates start of a new chain. For example, a system -composed of two Ala(10) chains can be specified as follows (3-letter code): - -23 - D Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala D Ala Ala Ala Ala Ala Ala Ala Ala - Ala Ala D - -or (1-letter code) - -23 -XAAAAAAAAAAXAAAAAAAAAAX - - -8.1.11. Disulfide-bridge information (free format; subroutine READ_BRIDGE) --------------------------------------------------------------------------- - -1st line: -NS,(ISS(i),i=1,NS) - -NS - the number of half-cystines (required even if no half-cystines are present) - -ISS(i) - the position of ith half-cystine in the sequence (starting from the -N-terminal end group) - -next line(s) (present only, if ns>0 and must not appear otherwise): -NSS,(IHPB(i),JHPB(i),i=1,NSS) - -NSS - the number of disulfide bridges; must not be greater than NS/2 - -IHPB(i),JHPB(i) - the cystine residue forming the ith bridge. - -The program will check, whether the residues specified in the ISS list -are cystines and terminate with error, if any of them is not. The program -also checks, if the numbers from the IHPB and the JHPB lists have appeared -in the ISS list. - -8.1.12. Dihedral-angle restraint data (free format; subroutine MOLREAD) ------------------------------------------------------------------------ - -This set of data specifies the harmonic constraints (if any) imposed on selected -virtual-bond dihedral angles gamma. - -1st line: -NDIH_CONSTR - the number of restrained gamma angles (required even if no -restrains are applied). - -2nd line (present only, if NDIH_CONSTR > 0; must not appear otherwise): -FTORS - the force constant expressed in kcal/(mol*rad**2) - -next NDIH_CONSTR lines (present only, if NDIH_CONSTR > 0): - -IDIH_CONSTR(i),PHI0(i),DRANGE(i) - -IDIH_CONSTR(i) - the number of ith restrained gamma angle. The angles are -numbered after the LAST alpha-carbons. Thus, the first "real" angle has number -4 and it corresponds to the rotation about the CA(2)-CA(3) virtual-bond axis -and the last angle has the number NRES and corresponds to the rotation about -the CA(NRES-2)-CA(NRES-1) virtual-bond axis. - -PHI0(i) - the "center" of the restraint (expressed in degrees) - -DRANGE(i) - the "flat well" range of the restraint (in degrees) - -The restraint energy for the ith restrained angle is expressed as: - - / - | FTORS*(GAMMA(IDIH_CONSTR(i))-PHI0(i)+DRANGE(i))**2, - | if GAMMA(IDIH_CONSTR(i))PHI0(i)+DRANGE(i) - \ - -Applying dihedral-angle constraints also implies that for ith constrained -gamma angle the sampling be carried out from the -[PHI0(i)-DRANGE(i)..PHI0(i)+DRANGE(i)] interval and not from the [-Pi..Pi] -interval, if random conformations are generated. If only this and not -restrained minimization is required, just set FTORS to 0. - -8.1.13 Distance restraints (subroutine READ_DIST_CONSTR) --------------------------------------------------------- - -Restraints are imposed on Calpha...Calpha distances. - -NDIST=number (integer) (0) - number of restraints on specific distances. - -NFRAG=number (integer) (0) - number of distance-restrained protein segments. - -NPAIR=number (integer) (0) - number of distance-restrained pairs of segments. - Specifying NPAIR requires specification of segments. - -IFRAG=start(1),end(1),start(2),end(2)...start(NFRAG),end(NFRAG) (integers) -First and last residues of the distance restrained segments. - -WFRAG=w(1),w(2),...,w(NFRAG) (reals) - force constants or bases for force -constant calculation corresponding to fragment restraints. - -IPAIR=start(1),end(1),start(2),end(2),...,start(NPAIR),end(NPAIR) (integers) -numbers of segments (consecutive numbers of start or end pairs in IFRAG -specification), the distances between which will be restrained. - -WPAIR=w(1),w(2),...,w(NFRAG) (reals) - force constants or bases for force -constant calculation corresponding to pair restraints. - -DIST_CUT=number (real) (5.0) - the cut-off distance in angstroms for force- -constant calculations. - -The force constants within fragments/between pairs of fragments are calculated -depending on the value of DIST_CONSTR described in section 5.1: - -1 - all force constants are equal to the respective entries of WFRAG/WPAIR - -2 - the force constants are equal to the respective entries of WFRAG/WPAIR - when the distance between the Calpha atoms in the reference structure - <=D_CUT, 0 otherwise. - -3 - the force constants are calculated from the formula: - -k(CA_j,CA_k)=W*exp{-[d(CA_j,CA_k)/DIST_CUT)]**2/2} - -where k(CA_j,CA_k) is the force constant between the respective Calpha atoms, -d(CA_j,CA_k) is the distance between these Calpha atoms in the reference -structure, and W is the basis for force-constant calculation (see above). - -If NDIST>0, the restraints on specific distance are subsequently input: - -ihpb(i), jhpb(i), forcon(i), i=1,NDIST - -where ihpb(i) and jhpb(i) are the numbers of the residues the distance -between the Calpha atoms of which will be distance restrained and forcon(i) -is the respective force constant. - -8.1.14 Internal coordinates of the reference structure (free format; --------------------------------------------------------------------- - subroutine READ_ANGLES) - ----------------------- - -This part of the data is present, if REFSTR, but not PDBREF was specified, -otherwise must not appear. It contains the following group of variables: - -(THETA(i),i=3,NRES) - the virtual-bond valence angles THETA -(PHI(i),i=4,NRES) - the virtual-bond dihedral angles GAMMA -(ALPH(i),i=2,NRES-1)- the ALPHA polar angles of consecutive side chains -(OMEG(i),i=2,NRES-1)- the BETA polar angles of consecutive side chains. - -ALPHA(i) and OMEG(i) correspond to the side chain attached to CA(i). THETA(i) -is the CA(i-2)-CA(i-1)-CA(i) virtual-bond angle and PHI(i) is the -CA(i-3)-CA(i-2)-CA(i-1)-CA(i) virtual-bond dihedral angle gamma. - -8.1.15 Internal coordinates of the initial conformation (free format; ---------------------------------------------------------------------- - subroutine READ_ANGLES) - ----------------------- - -This part of the data is present, if RAND_CONF, MULTCONF, THREAD, or PDBSTART -were not specified, otherwise must not appear. This input is as in section 10. - -8.1.15.1 File name with internal coordinates of the conformations to be processed ---------------------------------------------------------------------------------- - (text format; subroutine MOLREAD) - --------------------------------- - -This data is present only, if MULTCONF was specified. It contains the name of -the file with the internal coordinates. Up to 64 characters are allowed. -The structure of the file is that of the *.int file produced by UNRES/CSA. -See section "The structure of the INT files" for details. - -8.1.16 Control data for energy map construction (data lists; subroutine MAP_READ) ---------------------------------------------------------------------------------- - -These data lists appear, if NMAP=n was specified, where n is the number of -variables that will be grid-searched. One list is per one variable or a -group of variables set equal (see below): - -PHI - the variable is a virtual-bond dihedral angle gamma -THE - the variable is a virtual-bond angle theta -ALP - the variable is a side-chain polar angle alpha -OME - the variable is a side-chain polar angle beta - -RES1=number (integer) -RES2=number (integer) - -The range of residues for which the values will be set; all these variables -will be set at the same value. It is required that RES2 > RES1. - -FROM=angle (real) -TO=angle (real) - -Lower and upper limit of scanning in grid search (in degrees) - -NSTEP=number (integer) - -Number of steps in scanning along this variable/group of variables. - -8.2. Input coordinate files ---------------------------- - -At present, geometry can be input either from the external files in the PDB -format (with the PDBSTART option) or multiple conformations can be read -as virtual-bond-valence and virtual-bond dihedral angles when the MULTCONF -option is used (the latter, however, implies using standard virtual-bond -lengths as initial values). The structure of internal-coordinate files -is the same as that of output internal-coordinate files described in section -9.1.1. - -8.3. Other input files ----------------------- - -CSA parameters can optionally be read in free format from file INPUT.CSA.in -(see section 8.1.4). When a CSA run is restarted, the CSA-specific output files -also serve as input files. INPUT is the prefix of input and output files -as explained in section 6. - -Restart files for MD and REMD simulations. They are read when the keyword -RESTART appears on the MD/REMD data group (section 8.1.6). - -8. OUTPUT FILES ---------------- - -UNRES "main" output files (INPUT.out_${POT}[processor]) are log files from -a run. They contain the information of the molecule, force field, calculation -type, control parameters, etc.; however, not the structures produced during -the run or their energies except single-point energy evaluation and -minimization-related runs. The structural information is included in -coordinate files (*.int, *.x, *.pdb, *.mol2, *.cx) and statistics files (*.stat), -respectively; these files are further processed by other programs (WHAM, -CLUSTER) or can be viewed by molecular viewers (pdb or mol2 files). - -9.1. Coordinate files ---------------------- - -9.1.1. The internal coordinate (INT) file ------------------------------------------- - - -This file contains the internal coordinates of the conformations produced -by UNRES in non-MD runs. The virtual-bond lengths are assumed constant so -only the angular variables are provided (see ref - -IT,ENER,NSS,(IHPB(I),JHPB(I),I=1,NSS) -(I5,F12.5,I2,9(1X,2I3)) - -IT - the number of the conformation -ENER - total energy -NSS - the number of disulfide bridges -(IHPB(I),JHPB(I),I=1,NSS) - the positions of the pairs of half-cystines -forming the bridges. If NSS>9, the remaining pairs are written in the -following lines in the (3X,11(1X,2I3)) format. - -(THETA(I),I=3,NRES) -(8F10.4) - -The virtual-bond angles THETA (in degrees) - -(PHI(I),I=4,NRES) -(8F10.4) - -The virtual-bond dihedral angles GAMMA (in degrees) - -(ALPH(I),I=2,NRES-1) -(OMEG(I),I=2,NRES-1) -(8F10.4) - -The polar angles ALPHA and BETA of the side-chain centers (in degrees). - -9.1.2. The plain Cartesian coordinate (X) files (subroutine CARTOUT) --------------------------------------------------------------------- - -This file contains the Cartesian coordinates of the alpha-carbon and -side-chain-center coordinates. All conformations from an MD/MREMD -trajectory are collated to a single file. The structure of each -conformation's record is as follows: - -1st line: time,potE,uconst,t_bath,nss,(ihpb(j),jhpb(j),j=1,nss), -nrestr,(qfrag(i),i=1,nfrag),(qpair(i),i=1,npair), -(utheta(i),ugamma(i),uscdiff(i),i=1,nfrag_back) - -time: MD time (in "molecular time units"; 1 mtu = 4.89 fs), -potE: potential energy, -uconst: restraint energy corresponding to restraints on Q and backbone geometry, -(see section ??), -t_bath: thermostat temperature, -nss: number of disulfide bonds, -ihpb(j), jhpb(j): the numbers of linked cystines for jth disulfide bond, -nrestr: number of restraints on q and local geometry, -qfrag(i): q value for ith fragment, -qpair(i): q value for ith pair, -utheta(i): sum of squares of the differences between the theta angles - of the current conformation from those of the experimental conformation, -ugamma(i): sum of squares of the differences beaten the gamma angles - of the current conformation from those of the experimental conformation, -uscdiff(i): sum of squares of the differences between the Cartesian difference - of the unit vector of the Calpha-SC axis of the current conformation from - those of the experimental conformation. - -Next lines: Cartesian coordinates of the Calpha atoms (including dummy atoms) -(sequentially, 10 coordinates per line) -Next lines: Cartesian coordinates of the SC atoms (including glycines and -dummy atoms) (sequentially, 10 coordinates per line) - -9.1.3. The compressed Cartesian coordinate (CX) files ------------------------------------------------------ - -These files are compressed binary files (extension cx). For each conformation, -the items are written in the same order as specified in section 9.1.2. For -MREMD runs, if TRAJ1FILE is specified on MREMD record (see section 8.1.6), -snapshots from all trajectories are written every time the coordinates -are dumped. Thus, the file contains snapshot 1 from trajectory 1, ..., -snapshot 1 from trajectory M, snapshot 2 from trajectory 1, ..., etc. - -The compressed cx files can be converted to pdb file by using the xdrf2pdb -auxiliary program (single trajectory files) or xdrf2pdb-m program (multiple -trajectory files from MREMD runs generated by using the TRAJ1FILE option). -The multiple-trajectory cx files are also input files for the auxiliary -WHAM program. - -9.1.4. The Brookhaven Protein Data Bank format (PDB) files (subroutine PDBOUT) ------------------------------------------------------------------------------- - -These files are written in PDB standard (see. e.g., -ftp://ftp.wwpdb.org/pub/pdb/doc/format_descriptions/Format_v33_Letter.pdf). -The REMARK, ATOM, SSBOND, HELIX, SHEET, CONECT, TER, and ENDMDL are used. -The Calpha (marked CA) and SC (marked CB) coordinates are output. The CONECT -records specify the Calpha...Calpha and Calpha...SC virtual bonds. Secondary -structure is detected based on peptide-group contacts, as specified in -ref 12. Dummy residues are omitted from the output. If the program has -multiple-chain function, the presence of a dummy residue in a sequence -starts a new chain, which is assigned the next alphabet letter as ID, and -residue numbering is started over. - -9.1.5. The SYBYLL (MOL2) files ------------------------------- - -See the description of mol2 format (e.g., -http://tripos.com/data/support/mol2.pdf). Similar remarks apply as for -the PDB format (section 9.1.4). - -9.2. The summary (STAT) file ----------------------------- - -9.2.1. Non-MD runs ------------------- - -This file contains a short summary of the quantities characterizing the -conformations produced by UNRES/CSA. It is created for MULTCONF and MCM. - -NOUT,EVDW,EVDW2,EVDW1+EES,ECORR,EBE,ESCLOC,ETORS,ETOT,RMS,FRAC -(I5,9(1PE14.5)) - -NOUT - the number of the conformations - -EVDW,EVDW2,EVDW1+EES,ECORR,EBE,ESCLOC,ETORS - energy components - -ETOT - total energy - -RMS - RMS deviation from the reference structure (if REFSTR was specified) - -FRAC - fraction of side chain - side chain contacts of the reference - structure present in this conformation (if REFSTR was specified) - -9.2.2. MD and MREMD runs -------------------------- - -Each line of the stat file generated by MD/MREMD runs contains the following -items in sequence: - -step - the number of the MD step - -time - time [unit is MTU (molecular time unit) equal to 48.9 fs] - -Ekin - kinetic energy [kcal/mol] - -Epot - potential energy [kcal/mol] - -Etot - total energy (Ekin+Epot) - -H-H0 - the difference between the cureent and initial extended Hamiltionian - in Nose-Hoover or Nose-Poincare runs; not present for other thermostats. - -RMSD - root mean square deviation from the reference structure (only in - REFSTR has been specified) - -damax - maximum change of acceleration between two MD steps - -fracn - fraction of native side-chain concacts (very crude, based on - SC-SC distance only) - -fracnn - fraction of non-native side-chain contacts - -co - contact order - -temp - actual temperature [K] - -T0 - initial (microcanonical runs) or thermostat (other run types) - temperature [K] - -Rgyr - radius of gyration based on Calpha coordinates [A] - -proc - in MREMD runs the number of the processor (the number of the - trajectory less 1); not present for other runs. - -For an USAMPL run, the following items follow the above list: - -iset - the number of the restraint set - -uconst - restraint energy pertaining to q-values - -uconst_back - restraint energy pertaining to virtual-backbone restraints - -(qfrag(i),i=1,nfrag) - q values of the specified fragments - -(qpair(ii2),ii2=1,npair) - q values of the specified pairs of fragments - -(utheta(i),ugamma(i),uscdiff(i),i=1,nfrag_back) - virtual-backbone and - side-chain-rotamer restraint energies of the fragments specified - -If PRINT_COMPON has been specified, the energy components are printed -after the items described above. - -9.3. CSA-specific output files ------------------------------- - -There are several output files from the CSA routine: -INPUT.CSA.seed, INPUT.CSA.history, INPUT.CSA.bank, INPUT.CSA.bank1, -INPUT.CSA.rbank INPUT.CSA.alpha, INPUT.CSA.alpha1. - -The most informative outfile is INPUT.CSA.history. This file first write down -the parameters in INPUT.CSA.csa file. Later it shows the energies of random -minimized conformations in it's generation. After sorting the First_bank -in energy (ascending order), the energies of the First_bank is re-written here. -After this the output looks like: - 1 0 100 6048.2 1 100-224.124-114.346 202607 100 100 - 1 0 700 5882.6 2 29-235.019-203.556 1130308 100 100 - 1 0 1300 5721.5 2 18-242.245-212.138 2028008 100 100 - 1 0 1900 5564.8 13 54-245.185-218.087 2897988 98 100 - 1 0 2500 5412.4 13 61-246.214-222.068 3706478 97 100 - 1 0 3100 5264.2 13 89-248.715-224.939 4514196 96 100 - -Each line is written between each iteration (just after selection -of seed conformations) containing following data: -jlee,icycle,nstep,cutdif,ibmin,ibmax,ebmin,ebmax,nft,iuse,nbank -ibmin and ibmax lists the index of bank conformations corresponding to the -lowest and highest energies with ebmin and ebmax. -nft is the total number of function evaluations so far. -iuse is the total number of conformations which have not been used as seeds -prior to calling subroutine select_is which select seeds. - -Therefore, in the example shown above, one notes that so far 3100 -minimizations has been performed corresponding to the total of 4514196 -function evaluations. The lowest and highest energy in the Bank is --248.715 (#13) and -224.939 (#89), respectively. The number of conformations -already used as seeds (not including those selected as seeds in this iteration) -so far is 4 (100-96). - -The files INPUT.CSA.bank and INPUT.CSA.rbank contains data of Bank and -First_bank. For more information on these look subroutines write_bank -and write_rbank. The file INPUT.CSA.bank is overwritten between each -iteration whereas Bank is accumulated in INPUT.CSA.bank1 (not for every -iteration but as specified in the subroutine together.f). - -The file INPUT.CSA.seed lists the index of the seed conformations with their -energies. Files INPUT.CSA.alpha, INPUT.CSA.alpha1 are written only once -at the beginning of the CSA run. These files contain some arrays used -in CSA procedure. - -10. TECHNICAL SUPPORT CONTACT INFORMATION ------------------------------------------ - - Dr. Adam Liwo - Faculty of Chemistry, University of Gdansk - ul. Sobieskiego 18, 80-952 Gdansk Poland. - phone: +48 58 523 5430 - fax: +48 58 523 5472 - e-mail: adam@chem.univ.gda.pl - - Dr. Cezary Czaplewski - Faculty of Chemistry, University of Gdansk - ul. Sobieskiego 18, 80-952 Gdansk Poland. - phone: +48 58 523 5430 - fax: +48 58 523 5472 - e-mail: czarek@chem.univ.gda.pl - - Dr. Stanislaw Oldziej - Intercollegiate Faculty of Biotechnology - University of Gdansk, Medical University of Gdansk - ul. Kladki 22, 80-922 Gdansk, Poland - phone: +48 58 523 5361 - fax: +48 58 523 5472 - e-mail: stan@biotech.ug.gda.pl - - Dr. Jooyoung Lee - Korea Institute for Advanced Study - 207-43 Cheongnyangni 2-dong, Dongdaemun-gu, - Seoul 130-722, Korea - phone: +82-2-958-3890 - fax: +82-2-958-3731 - email: jlee@kias.re.kr - -Prepared by Adam Liwo and Jooyoung Lee, 7/17/99 -Revised by Cezary Czaplewski 1/4/01 -Revised by Cezary Czaplewski and Adam Liwo 8/26/03 -Revised by Cezary Czaplewski and Adam Liwo 11/26/11 -Revised by Adam Liwo 02/19/12 -