Manual

The following sections describe the method implemented in DAMMIN, how to run DAMMIN on the supported platforms, and the required input and generated output files.

Introduction

The program DAMMIN implements a method to restore, ab initio, the low-resolution shape of randomly oriented particles in solution (e.g., biological macromolecules) from their small-angle X-ray scattering. A search volume that encloses the particle (e.g., a sphere of sufficiently large radius R) is filled with N densely packed spheres of radius r, referred to as dummy atoms. Given the fixed spatial positions, the shape of the dummy atom model is completely described by a vector X with N components, which assigns each dummy atom either to the solute phase (i.e. protein in this case) or to the solvent phase. (In the more general approach implemented in the program MONSA, the number of phases can be up to 4 in order to deal with protein complexes and protein-RNA complexes.) For an adequate description of a structure, the number of dummy atoms usually reaches a few thousand.

The task of shape reconstruction from the scattering data is thus transformed into the problem of finding a configuration X that minimizes a goal function f(X). The goal function takes into account the discrepancy between the experimental data and the calculated scattering of the dummy atom model, as well as several aspects of the model quantified as penalties. To guarantee a compact and interconnected model, the looseness penalty and the disconnectivity penalty are introduced. The peripheral penalty ensures that the model stays close to the center of the search volume. The contribution of the penalties to the goal function is expected to be 10-50% for the final model. Simulated annealing (SA) is used to perform the global minimization of the goal function.
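The idea of annealing a vector of phase assignments can be pictured with a toy sketch. The Python code below is not DAMMIN's implementation: the goal function (a mismatch count against a known pattern plus a crude looseness-like penalty), the function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

def toy_goal(x, target, penalty_weight=0.05):
    """Hypothetical goal function: discrepancy plus a looseness-like penalty."""
    discrepancy = sum(a != b for a, b in zip(x, target)) / len(x)
    # Phase boundaries between neighbours serve as a stand-in for looseness.
    boundaries = sum(x[i] != x[i + 1] for i in range(len(x) - 1)) / len(x)
    return discrepancy + penalty_weight * boundaries

def anneal(target, t0=1.0, factor=0.9, steps=60, moves_per_step=200, seed=1):
    """Toy simulated annealing over a binary vector (0 = solvent, 1 = particle)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in target]      # randomized start
    f = toy_goal(x, target)
    initial_f = f
    best_x, best_f = x[:], f
    t = t0
    for _ in range(steps):
        for _ in range(moves_per_step):
            i = rng.randrange(len(x))
            x[i] ^= 1                            # flip the phase of one element
            f_new = toy_goal(x, target)
            delta = f_new - f
            # Always accept improvements; accept worse moves with probability exp(-delta/T).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                f = f_new
                if f < best_f:
                    best_x, best_f = x[:], f
            else:
                x[i] ^= 1                        # rejected: undo the flip
        t *= factor                              # cool down
    return best_x, best_f, initial_f

target = [0] * 10 + [1] * 20 + [0] * 10
best_x, best_f, initial_f = anneal(target)
print(f"goal function: {initial_f:.3f} -> {best_f:.3f}")
```

Accepting occasional uphill moves at high temperature is what lets the search leave local minima; as the temperature drops, the procedure behaves more and more like a plain descent.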

Running DAMMIN

Command-Line Arguments and Options

DAMMIN runs in batch mode when arguments are given:

$ dammin GNOMFILE [OPTIONS]

DAMMIN accepts the following command line arguments:

Argument Description
GNOMFILE A relative or absolute path to regularised SAS data (.out).

DAMMIN recognizes the following command-line options:

Short Option Long Option Description
  --mo <MODE> Configuration of the annealing procedure, one of FAST (bigger beads, cooling down quickly), SLOW (smaller beads, cooling down slowly), or KEEP (keeps up to 15 best models fitting the data); default: FAST.
  --sy <SYMMETRY> Specify the point symmetry of the particle. Point groups P1, …, P19, Pn2 (n = 2, …, 12), P23, P432 or PICO (icosahedral) are supported. By default, no symmetry is enforced (P1).
  --an <ANISOMETRY> Particle anisometry: oblate (O), prolate (P) or unknown (default).
  --dr <DIRECTION> Direction of anisometry, applicable with P2 symmetry only: along (L), across (C) or unknown (default).
  --un <UNIT> Angular unit of the input file, either ‘1’ (Å^-1) or ‘2’ (nm^-1); undefined by default.
  --lo <LOG_FILE> Prefix to prepend to output filenames. Default is the name of the DAMMIN input file without extension.
  --id <DESCRIPTION> Project description. By default, the command line content is used.
  --seed <INT> Set the seed for the random number generator.
  --lm <INT> Maximum order of harmonics (default 10).
  --ra <X> Packing radius of dummy atoms (in angstrom).
  --sv <SHAPE> Search volume, one of: sphere, ellipsoid, cylinder, parallelepiped. Default is ‘sphere’.
  --param1 <X> ellipsoid semi-axis/cylinder outer radius/parallelepiped length (in angstrom).
  --param2 <X> ellipsoid semi-axis/cylinder inner radius/parallelepiped width (in angstrom).
  --param3 <X> ellipsoid semi-axis/cylinder height/parallelepiped height (in angstrom).
  --svfile <FILE> File with the custom search volume.
  --model-format <FMT> Format of 3D models, one of: cif, pdb (default: cif)
-h --help Print a summary of arguments and options, then exit.
-v --version Print the ATSAS version and exit.
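When many batch runs are scripted, it can help to assemble the command line programmatically. The sketch below only uses long options listed in the table above; the helper name `build_dammin_command` and its keyword arguments are hypothetical, and the actual call is left commented out because it requires an ATSAS installation.

```python
def build_dammin_command(gnom_file, mode=None, symmetry=None, prefix=None, seed=None):
    """Assemble an argument list for a batch-mode DAMMIN run (hypothetical helper)."""
    cmd = ["dammin", gnom_file]
    if mode is not None:
        cmd += ["--mo", mode]          # FAST, SLOW or KEEP
    if symmetry is not None:
        cmd += ["--sy", symmetry]      # e.g. P1, P2, P23
    if prefix is not None:
        cmd += ["--lo", prefix]        # prefix of the output filenames
    if seed is not None:
        cmd += ["--seed", str(seed)]   # seed of the random number generator
    return cmd

cmd = build_dammin_command("ly01.out", mode="SLOW", symmetry="P2", prefix="ly0100")
print(" ".join(cmd))  # dammin ly01.out --mo SLOW --sy P2 --lo ly0100

# To run it for real (assumes dammin is on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems, for example with a project description that contains spaces.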

Interactive Configuration

Alternatively, DAMMIN starts in interactive mode when no arguments are given:

$ dammin
Screen Text Default Description
Mode: <[F]>ast, [S]low, [J]ag, [K]eep, [E]xpert Fast Configuration of the annealing procedure, one of Fast (bigger beads, cooling down quickly), Slow (smaller beads, cooling down slowly), Jag (more dummy atoms and more spherical harmonics, annealing in repeated cycles), Expert or Keep (keeps up to 15 best models fitting the data).
Log file name None Prefix to the DAMMIN output files.
Input data, GNOM output file name None DAMMIN input file.
Enter project description None Short description of the run.
Angular units in the input file 4*pi*sin(theta)/lambda [1/angstrom] (1) 4*pi*sin(theta)/lambda [1/nm] (2) 1 Angular units of the input file, one of [1/angstrom] or [1/nm]. Default is [1/angstrom].
Portion of the curve to be fitted 1.000 Percentage of the scattering curve to fit, starting at the first point. The whole curve is used by default.
Number of knots in the curve to fit 20 Experimental data is smoothed by spline interpolation before fitting. This defines the number of supporting points of the spline.
Constant subtraction procedure. Enter Positive number: value to be subtracted, OR Negative number: to skip subtraction, OR Zero for automatic subtraction 0.0 A constant is subtracted from the data to force the s^-4 decay of the intensity at higher angles. By default, an appropriate constant is determined automatically.
Maximum order of harmonics 20 The default maximum order of spherical harmonics used to compute the scattering intensity, L=10, is sufficient in most practical applications. If you wish to fit a scattering curve over a very broad range (e.g. more than 15 Shannon channels) or if the particle is expected to be very anisometric, it might be useful to compute with larger L (maximum L=20 is supported). Please note that the computational time is proportional to L^2, i.e. a run with L=20 takes 4 times more CPU time than the default run.
Initial DAM: type S for sphere [default], E for ellipsoid, C for cylinder, P for parallelepiped or start file name S Define a search volume, one of sphere (S), ellipsoid (E), cylinder (C), parallelepiped (P) or a user-defined pdb file. - When a spherical search volume is used, the diameter of the sphere is by default the maximum size of the particle given in the input file; it can be modified when running DAMMIN in Expert mode. - When an ellipsoid is used as the search volume, the values of the three semi-axes can be defined. - When a cylinder is used, the outer radius, the inner radius (in the case of a hollow cylinder) and the height can be defined. - When a parallelepiped is used, its length, width and height can be defined. - Alternatively, a pdb file of dummy atoms can be used as the search volume. This is helpful, for example, when refining the shape starting from an averaged ab initio model.
Symmetry: P1…19 or Pn2 (n=1,..,12) or P23 or P432 or PICO P1 Specify the point symmetry of the particle, one of P1…19, Pn2 (n=1,…,12), P23, P432 or PICO (icosahedral).
Packing radius of dummy atoms variable The default radius of dummy atoms is determined by the maximum size of the particle because the program packs approximately 1500 dummy atoms in the search volume (for each asymmetric unit in case of symmetry). It is possible to change the radius in the expert mode, which effectively modifies the number of dummy atoms.
Expected particle shape: <P>rolate, <O>blate, or <U>nknown Unknown Expected particle anisometry, either prolate, oblate or unknown.
Looseness penalty weight variable The looseness penalty is introduced to penalize loosely connected dummy atom models in order to obtain compact models. The default weight is of the order of 5e-3 depending on the running mode, e.g. 6e-3 in the fast mode and 2e-3 in the expert mode. It is not recommended to change the default value. However, the looseness penalty can be disabled by setting this weight to 0.
Disconnectivity penalty weight variable Disconnectivity is introduced to penalize models consisting of isolated bodies. The default weight is of the order of 1e-3 depending on the running mode, e.g. 6e-3 in the fast mode and 2e-3 in the expert mode. It is not recommended to change the default value. The disconnectivity penalty can be disabled by setting this weight to 0.
Peripheral penalty weight variable Peripheral penalty is to ensure the model stays at the center of the search volume. It is gradually decreasing with the annealing temperature. It is not recommended to change the default value.
Fixing thresholds Los and Rf 0.0, 0.0 When the shape is already well defined, some of the dummy atoms can be fixed as particle or solvent in order to prevent unnecessary rotations and movements of the entire model and thus to improve the convergence. The temperature for doing so is selected using the thresholds of “Looseness fixing” and “R-factor fixing”. It is recommended to disable this feature by accepting the default values <0.0, 0.0>.
Randomize the structure Yes The initial structure is randomized, i.e., the spheres in the search volume are assigned randomly to 0s (=solvent) and 1s (=particle).
Weight: 0=s^2, 1=Emphas.s->0, 2=Log 1 Choose the weighting function for the SAXS data: 0 - W(s) = s^2 (Porod weighting); 1 - Porod weighting with emphasis of initial points (default); 2 - logarithmic scale weighting. With option 0 (Porod weighting), the very low angle points are somewhat underweighted for very anisometric (1:10) particles, which leads to poor fits in this range. Options 1 and 2 give better results than option 0 for very anisometric objects.
Initial scale factor variable The scale factor is a factor between the computed scattering intensity and the experimental data in the least squares fitting. The initial scale factor has no significant effect on the modeling process because it is calculated after each change of the model configurations.
Fix the scale factor No It is not recommended to fix the scale factor.
Initial annealing temperature variable The initial annealing temperature is by default of the order of 1e-3. In the fast mode, it is tuned automatically by the program and can be defined in the expert mode. The annealing temperature decreases typically to a value of the order of 1e-6. So the initial annealing temperature can be varied between 1e-5 and 1e-4. It is not recommended to change the default value.
Annealing schedule factor 0.95 Factor by which the temperature is decreased; 0.95 is a good value for the annealing process. Faster cooling for smaller systems is possible by setting the factor to 0.9.
# of independent atoms to modify 1 Number of independent dummy atoms changing the phase during one iteration. Default is 1; it is not recommended to change this value.
Max # of iterations at each T variable Complete a temperature step and cool after this number of iterations at the latest.
Max # of successes at each T variable Finalize temperature step and cool after at most this many successful phase changes.
Min # of successes to continue variable Stop if fewer than this many successful phase changes occur within a single temperature step.
Max # of annealing steps 200 Stop if simulated annealing is not finished after this number of steps.
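To get a feeling for the annealing schedule factor, assume that the temperature is simply multiplied by the factor after every step (a geometric schedule, which is how the description above reads). Under that assumption, the sketch below estimates how many steps it takes to cool from an initial temperature of 1e-3 to a final temperature of the order of 1e-6. The function name is hypothetical.

```python
import math

def steps_to_cool(t_start, t_end, factor):
    """Smallest n with t_start * factor**n <= t_end, for 0 < factor < 1."""
    return math.ceil(math.log(t_end / t_start) / math.log(factor))

print(steps_to_cool(1e-3, 1e-6, 0.95))  # 135
print(steps_to_cool(1e-3, 1e-6, 0.90))  # 66
```

With these numbers both schedules fit within a limit of 200 annealing steps, and a factor of 0.9 roughly halves the number of temperature steps compared to 0.95.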

Runtime Output

At runtime, two lines of output are generated for each temperature step:

 j:   1 T: 0.108E-02 Suc: 13818 Eva:    16326 CPU:  0.642E+01 SqF: 0.6070
  Rf: 0.45421 Los:0.1120 Dis:0.0228 Per:  0.5380 Sca: 0.981E-14

The fields can be interpreted as follows, top-left to bottom-right:

Field Description
j Step number. Starts at 1, increases monotonically.
T Temperature measure, starts at an arbitrary high value, decreases each step.
Suc Number of successful phase changes in this temperature step.
Eva Accumulated number of function evaluations.
CPU Elapsed CPU time since the annealing procedure was started.
SqF Square root of the sum of the goodness of fit and the penalty.
Rf Goodness of fit of simulated data versus experimental data, not taking penalties into account.
Los Contribution of the looseness penalty.
Dis Contribution of the disconnectivity penalty.
Per Contribution of the peripheral penalty.
Sca Scale factor providing the best least square fit between the experimental data and the simulated data.
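For monitoring long runs it can be convenient to turn the two runtime lines into numbers. The following sketch assumes that every field is written as a name, a colon and a number, as in the sample output shown in this manual; the function name `parse_step` is hypothetical and the layout of other DAMMIN versions may differ.

```python
import re

# A field is a short alphabetic name, a colon, optional spaces and a number.
FIELD = re.compile(r"([A-Za-z]+):\s*([-+0-9.eE]+)")

def parse_step(line1, line2):
    """Parse the two runtime lines of one temperature step into a dict."""
    values = {name: float(number) for name, number in FIELD.findall(line1 + " " + line2)}
    for counter in ("j", "Suc", "Eva"):   # integer-valued fields
        values[counter] = int(values[counter])
    return values

step = parse_step(
    " j:   1 T: 0.108E-02 Suc: 13818 Eva:    16326 CPU:  0.642E+01 SqF: 0.6070",
    "  Rf: 0.45421 Los:0.1120 Dis:0.0228 Per:  0.5380 Sca: 0.981E-14",
)
print(step["j"], step["T"], step["Rf"])
```

The resulting dictionary can be appended to a list per step, for example to plot Rf against the step number.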

dammin Input Files

DAMMIN requires regularised SAS data (.out) as generated by GNOM.

dammin Output Files

DAMMIN writes a set of output files; each filename starts with a customizable prefix. If a prefix has been used before, existing files are overwritten without warning.

Extension Description
.log Contains the same information as the screen output and is updated during execution of the program.
-0.pdb The file ‘-0.pdb’ contains the beads of the solvent inside the search volume.
-1.pdb The file ‘-1.pdb’ represents the modeled particle. The REMARK sections of both files contain information about the application used and about invariants of the particle, e.g. Rg, volume and molecular mass of the particle.
.fit Fit of the simulated scattering curve versus the smoothed experimental data (spline interpolation). Columns in the output file are: ‘s’, ‘I_exp’ and ‘I_sim’.
.fir Fit of the simulated scattering curve versus the experimental data. Columns in the output file are: ‘s’, ‘I_exp’, ‘Err_exp’ and ‘I_sim’.
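The four columns of the .fir file are enough to judge the fit numerically. The sketch below is an illustration under assumptions: it treats the file as whitespace-separated text, skips every line that does not consist of exactly four numbers (such as a title line), and computes a mean squared normalized residual. The function names are hypothetical, and the demo input is synthetic, not DAMMIN output.

```python
import io

def read_fir(handle):
    """Return rows (s, I_exp, Err_exp, I_sim) from a .fir-like text stream."""
    rows = []
    for line in handle:
        parts = line.split()
        if len(parts) != 4:
            continue                      # not a data line
        try:
            rows.append(tuple(float(p) for p in parts))
        except ValueError:
            continue                      # header or comment with four words
    return rows

def chi_square(rows):
    """Mean of ((I_exp - I_sim) / Err_exp)^2 over all points with a positive error."""
    terms = [((i_exp - i_sim) / err) ** 2 for _, i_exp, err, i_sim in rows if err > 0]
    return sum(terms) / len(terms)

# Synthetic demo data, made up for illustration only.
demo = io.StringIO(
    "synthetic demo, not real DAMMIN output\n"
    "0.01 10.0 0.5 10.5\n"
    "0.02  8.0 0.4  7.6\n"
)
rows = read_fir(demo)
chi2 = chi_square(rows)
print(len(rows), round(chi2, 3))
```

For a real file, replace the `io.StringIO` object with `open("ly0100.fir")` or the path matching the chosen prefix.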

Examples

The following examples show how to do shape reconstruction using DAMMIN given the GNOM output file ly01.out.

Running in batch mode

Run DAMMIN and get online help.

$ dammin -h

Run DAMMIN on the GNOM out file ly01.out using default values for all parameters.

$ dammin ly01.out

Run DAMMIN on the GNOM out file ly01.out defining the prefix to the output files as ly0100 and using default values for all other parameters.

$ dammin ly01.out --lo ly0100

Run DAMMIN in slow mode on the GNOM out file ly01.out defining the prefix to the output files as ly0100 and using default values for all other parameters.

$ dammin ly01.out --lo ly0100 --mo SLOW

Run DAMMIN in slow mode on the GNOM out file ly01.out defining the prefix to the output files as ly0100, applying P2 symmetry, giving “this is a test run for lysozyme” as project description and using default values for all other parameters.

$ dammin ly01.out --lo ly0100 --mo SLOW --sy P2 --id "this is a test run for lysozyme"

Interactive configuration

Run DAMMIN on the GNOM out file ly01.out interactively in fast mode.

$ dammin
  ***     Ab inito shape determination by simulated      ***
  ***  annealing using a single phase dummy atoms model  ***
  ***  Win 9x/NT, UNIX/Linux/Mac release version  5.3    ***
  ***      Last modified     ---  13/02/07 18:00         ***
  ***  Please reference: D.Svergun (1999). Biophys. J.   ***
  ***                             76, 2879-2886.         ***
  ***   Copyright (c) ATSAS Team                         ***
  ***   EMBL, Hamburg Outstation, 1999 - 2007            ***

   Type dammin /help for batch mode use

   ====== DAMMIN  started on           18-Aug-2009 09:34:18

 Mode: <[F]>ast, [S]low, [J]ag, [K]eep, [E]xpert <         Fast >:
 Log file name .......................... <          log >: t01
 Input data, GNOM output file name ...... <         .out >: ly01
 Project identificator .................................. : t01
 Enter project description .............. :
 Random sequence initialized from ....................... : 93435
  ** Information read from the GNOM file **
 Data set title:   Lysozyme, high angles (>.22) 46 mg/ml, small angles (<.22 mg/
 Raw data file name:  ly01exp.dat
 Maximum diameter of the particle ....................... : 50.00
 Solution at Alpha =  0.107E+01   Rg :  0.154E+02   I(0) :   0.657E+01
 Radius of gyration read ................................ : 15.40
 Number of GNOM data points ............................. : 213
 Angular units in the input file:
 4*pi*sin(theta)/lambda [1/angstrom] (1)
 4*pi*sin(theta)/lambda [1/nm      ] (2)  <            1 >: 2
 Angular units multiplied by ............................ : 0.1000
 Dmax and Rg divided by ................................. : 0.1000
 Maximum s value [1/angstrom] ........................... : 4.960e-2
 Number of Shannon channels ............................. : 7.894
 Portion of the curve to be fitted ...... <        1.000 >:
 Number of knots in the curve to fit .................... : 20
  *** Warning: constant reduced to avoid oversubtraction
 A constant was subtracted .............................. : 3.454e-2
 Maximum order of harmonics ............................. : 10
  Initial DAM: type S for sphere [default],
  E for ellipsoid, C for cylinder, P for parallelepiped
  or start file name .................... <         pdb >: S
 Symmetry: P1...19 or Pn2 (n=1,..,12)
 or P23 or P432 or PICO ................. <           P1 >:
 Sphere  diameter [Angstrom] ............................ : 500.0
 Packing radius of dummy atoms .......................... : 17.90
 Radius of the sphere generated ......................... : 250.0
 Number of dummy atoms .................................. : 1974
 Number of equivalent positions ......................... : 1
 Expected particle shape: <P>rolate, <O>blate,
  or <U>nknown .......................... <      Unknown >:
 Excluded volume per atom ............................... : 3.247e+4
 Radius of 1st coordination sphere ...................... : 50.48
 Minimum number of contacts ............................. : 5
 Maximum number of contacts ............................. : 12
 Looseness penalty weight ............................... : 6.000e-3
 No of non-solvent atoms ................................ : 1974
 Initial DAM looseness .................................. : 6.949e-3
 Disconnectivity penalty weight ......................... : 6.000e-3
 Initial DAM # of graphs ................................ : 1
 Discontiguity   value .................................. : 0.0
 Center of the initial DAM:    0.0000   0.0000   0.0000
 Peripheral penalty weight .............................. : 0.3000
 Peripheral penalty value ............................... : 0.5944
 Looseness fixing threshold ............................. : 5.000e-2
 R-factor  fixing threshold ............................. : 1.500e-2
 No of non-solvent atoms ................................ : 981
 Randomized DAM looseness ............................... : 0.1054
 Randomized DAM # of graphs ............................. : 5
 Discontiguity   value .................................. : 5.110e-3
 Randomized peripheral penalty value .................... : 0.5960
 Initial DAM shape anisometry ........................... : 1.973e-2
 Initial DAM non-prolateness ............................ : 0.0
 Initial DAM non-oblateness ............................. : 7.403e-3
 Weight: 0=s^2, 1=Emphas.s->0, 2=Log .................... : 1
 *** Porod weight with emphasis at low s ***
 Initial scale factor ................................... : 9.224e-15
 Initial R^2 factor ..................................... : 0.2739
 Initial R   factor ..................................... : 0.5234
 Initial penalty ........................................ : 0.1795
 Initial fVal ........................................... : 0.4534
  Tuning the annealing parameters. Please wait...
 Variation of the target function ....................... : 3.594e-4
 CPU per function call, seconds ......................... : 4.063e-4
 Initial annealing temperature .......................... : 1.078e-3
 Annealing schedule factor .............................. : 0.9000
 # of independent atoms to modify ....................... : 1
 Max # of iterations at each T .......................... : 138180
 Max # of successes at each T ........................... : 13818
 Min # of successes to continue ......................... : 46
 Max # of annealing steps ............................... : 100
  ====  Simulated annealing procedure started  ====
 j:   1 T: 0.108E-02 Suc: 13818 Eva:    16326 CPU:  0.642E+01 SqF: 0.6070
  Rf: 0.45421 Los:0.1120 Dis:0.0228 Per:  0.5380 Sca: 0.981E-14
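Two of the values in the log above can be reproduced by hand. Because unit 2 (1/nm) was selected, the log reports that Dmax and Rg are divided by 0.1 to convert them to angstrom. The number of Shannon channels is then consistent with the usual definition N = s_max * Dmax / pi. That DAMMIN evaluates exactly this expression is an assumption based on the matching numbers; the variable names are hypothetical.

```python
import math

dmax_input = 50.00        # "Maximum diameter of the particle" as read from the GNOM file
unit_factor = 0.1000      # "Dmax and Rg divided by" (input units were 1/nm)
s_max = 4.960e-2          # "Maximum s value [1/angstrom]"

dmax_angstrom = dmax_input / unit_factor      # sphere diameter in angstrom
shannon_channels = s_max * dmax_angstrom / math.pi

print(round(dmax_angstrom, 1))     # 500.0
print(round(shannon_channels, 3))  # 7.894
```

With fewer than 8 Shannon channels, this data set is well below the 15 channels mentioned as a reason to raise the maximum order of harmonics.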

Run DAMMIN on the GNOM out file ly01.out interactively in expert mode where all parameters can be tuned.

$ dammin
  ***     Ab inito shape determination by simulated      ***
  ***  annealing using a single phase dummy atoms model  ***
  ***  Win 9x/NT, UNIX/Linux/Mac release version  5.3    ***
  ***      Last modified     ---  13/02/07 18:00         ***
  ***  Please reference: D.Svergun (1999). Biophys. J.   ***
  ***                             76, 2879-2886.         ***
  ***   Copyright (c) ATSAS Team                         ***
  ***   EMBL, Hamburg Outstation, 1999 - 2007            ***
   Type dammin /help for batch mode use

   ====== DAMMIN  started on           18-Aug-2009   12:07:24
 Mode: <[F]>ast, [S]low, [J]ag, [K]eep, [E]xpert <         Fast >: E
 Log file name .......................... <         .log >: t01
 Input data, GNOM output file name ...... <         .out >: ly01
 Project identificator .................................. : t01
 Enter project description .............. : This is a test run for lysozyme
 Random sequence initialized from ....................... : 120818
  ** Information read from the GNOM file **
 Data set title:   Lysozyme, high angles (>.22) 46 mg/ml, small angles (<.22) 15 mg/
 Raw data file name:  ly01exp.dat
 Maximum diameter of the particle ....................... : 50.00
 Solution at Alpha =  0.107E+01   Rg :  0.154E+02   I(0) :   0.657E+01
 Radius of gyration read ................................ : 15.40
 Number of GNOM data points ............................. : 213
 Angular units in the input file:
 4*pi*sin(theta)/lambda [1/angstrom] (1)
 4*pi*sin(theta)/lambda [1/nm      ] (2)  <            1 >: 2
 Angular units multiplied by ............................ : 0.1000
 Dmax and Rg divided by ................................. : 0.1000
 Maximum s value [1/angstrom] ........................... : 4.960e-2
 Number of Shannon channels ............................. : 7.894
 Portion of the curve to be fitted ...... <         1.000 >:
 Number of knots in the curve to fit .... <            20 >:
 Constant subtraction procedure. Enter
 Positive number: value to be subtracted, OR
 Negative number: to skip subtraction   , OR
 Zero for automatic subtraction ......... <           0.0 >:
  *** Warning: constant reduced to avoid oversubtraction
 A constant was subtracted .............................. : 3.454e-2
 Maximum order of harmonics ............. <           20 >: 15
  Initial DAM: type S for sphere [default],
  E for ellipsoid, C for cylinder, P for parallelepiped
  or start file name .................... <         .pdb >: S
 Symmetry: P1...19 or Pn2 (n=1,..,12)
 or P23 or P432 or PICO ................. <           P1 >: P1
 Sphere  diameter [Angstrom] ............ <        500.0 >:
 Packing radius of dummy atoms .......... <        10.90 >: 10
 Radius of the sphere generated ......................... : 250.2
 Number of dummy atoms .................................. : 11590
 Number of equivalent positions ......................... : 1
 Expected particle shape: <P>rolate, <O>blate,
  or <U>nknown .......................... <      Unknown >:
 Excluded volume per atom ............................... : 5661.
 Radius of 1st coordination sphere ...... <        28.20 >:
 Minimum number of contacts ............................. : 5
 Maximum number of contacts ............................. : 12
 Looseness penalty weight ............... <     2.000e-3 >: 2.0E-2
 No of non-solvent atoms ................................ : 11590
 Initial DAM looseness .................................. : 3.635e-3
 Disconnectivity penalty weight ......... <     2.000e-2 >:
 Initial DAM # of graphs ................................ : 1
 Discontiguity   value .................................. : 0.0
 Center of the initial DAM:    0.0000   0.0000   0.0000
 Peripheral penalty weight .............. <       0.3000 >: 0.5
 Peripheral penalty value ............................... : 0.5997
 Fixing thresholds Los and Rf <          0.0,        0.0 >: 0,0
 Randomize the structure [ Y / N ] ...... <          Yes >:
 No of non-solvent atoms ................................ : 5749
 Randomized DAM looseness ............................... : 9.166e-2
 Randomized DAM # of graphs ............................. : 14
 Discontiguity   value .................................. : 2.787e-3
 Randomized peripheral penalty value .................... : 0.5953
 Initial DAM shape anisometry ........................... : 4.752e-3
 Initial DAM non-prolateness ............................ : 4.005e-3
 Initial DAM non-oblateness ............................. : 0.0
 Weight: 0=s^2, 1=Emphas.s->0, 2=Log .... <            1 >:
 *** Porod weight with emphasis at low s ***
 Initial scale factor ................... <    8.809e-15 >:
 Fix the scale factor [ Y / N ] ......... <           No >:
 Initial R^2 factor ..................................... : 0.2918
 Initial R   factor ..................................... : 0.5402
 Initial penalty ........................................ : 0.2995
 Initial fVal ........................................... : 0.5913
  Tuning the annealing parameters. Please wait...
 Variation of the target function ....................... : 6.450e-5
 CPU per function call, seconds ......................... : 2.250e-3
 Initial annealing temperature .......... <     1.000e-3 >:
 Annealing schedule factor .............. <       0.9500 >:
 # of independent atoms to modify ....... <            1 >:
 Max # of iterations at each T .......... <       811300 >:
 Max # of successes at each T ........... <        81130 >:
 Min # of successes to continue ......... <          270 >:
 Max # of annealing steps ............... <          200 >:
  ====  Simulated annealing procedure started  ====
 j:   1 T: 0.100E-02 Suc: 81130 Eva:    83447 CPU:  0.191E+03 SqF: 0.7583
  Rf: 0.52963 Los:0.0913 Dis:0.0021 Per:  0.5852 Sca: 0.868E-14
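The numbers printed in the two example runs are consistent with a goal function of the form Rf^2 plus the weighted sum of the three penalty values, with SqF being its square root. This relation is inferred from the logged values and is not stated explicitly in this manual, so treat the sketch below as a plausibility check rather than as DAMMIN's definition. The function name is hypothetical.

```python
import math

def goal_value(rf, penalties, weights):
    """Rf^2 plus the weighted sum of the (looseness, disconnectivity, peripheral) values."""
    return rf ** 2 + sum(w * p for w, p in zip(weights, penalties))

# First temperature step of the fast-mode run; weights 6e-3, 6e-3 and 0.3 from its log.
fast = goal_value(0.45421, (0.1120, 0.0228, 0.5380), (6.0e-3, 6.0e-3, 0.3))
# First temperature step of the expert-mode run; weights 2e-2, 2e-2 and 0.5 from its log.
expert = goal_value(0.52963, (0.0913, 0.0021, 0.5852), (2.0e-2, 2.0e-2, 0.5))

print(f"{math.sqrt(fast):.3f}")    # 0.607, logged SqF: 0.6070
print(f"{math.sqrt(expert):.3f}")  # 0.758, logged SqF: 0.7583
```

In both runs the peripheral penalty dominates the penalty term at the first step, which fits the statement that its weight decreases gradually with the annealing temperature.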