Manual

The following sections describe the method implemented in DAMMIN, how to run DAMMIN on the supported platforms, and the required input files and generated output files.

Introduction

The program DAMMIN implements a method to restore ab initio the low-resolution shape of randomly oriented particles in solution (e.g., biological macromolecules) from their small-angle X-ray scattering. A search volume that encloses the particle (e.g., a sphere of sufficiently large radius R) is filled with N densely packed spheres of radius r, referred to as dummy atoms. Given the fixed spatial positions, the shape of the dummy atom model is completely described by a vector X with N components, which assigns each dummy atom either to the solute phase (i.e., protein in this case) or to the solvent phase. (In the more general approach implemented in the program MONSA, the number of phases can be up to 4 in order to deal with protein complexes and protein-RNA complexes.) For an adequate description of a structure, the number of dummy atoms usually reaches a few thousand. The task of shape reconstruction from the scattering data is thus transformed into the problem of finding a configuration X that minimizes a goal function f(X). The goal function takes into account the discrepancy between the experimental data and the calculated scattering of the dummy atom model, as well as several aspects of the model quantified as penalties. The looseness penalty and the disconnectivity penalty are introduced to guarantee a compact and interconnected model. The peripheral penalty keeps the model close to the center of the search volume. The contribution of the penalties to the goal function is expected to be 10-50% for the final model. Simulated annealing (SA) is used to perform the global minimization of the goal function.
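The minimization scheme described above can be illustrated with a toy sketch. The Python code below is not DAMMIN's implementation: the goal function `goal`, the target pattern, the penalty and all parameter values are hypothetical stand-ins, chosen only so that the example runs. It shows the general idea: a binary vector X, a goal function combining a discrepancy term with a weighted penalty, single-component phase flips, Metropolis acceptance and geometric cooling.

```python
import math
import random

def goal(x, target, penalty_weight=0.01):
    """Toy goal function: discrepancy to a target pattern plus a weighted penalty.

    The penalty counts phase changes between neighbouring components, a crude
    stand-in (assumption) for the looseness/disconnectivity penalties.
    """
    n = len(x)
    discrepancy = sum((a - b) ** 2 for a, b in zip(x, target)) / n
    penalty = sum(x[i] != x[i + 1] for i in range(n - 1)) / n
    return discrepancy + penalty_weight * penalty

def anneal(target, t0=0.1, factor=0.95, steps=100, iters_per_step=200, seed=1):
    """Minimize goal() by simulated annealing; returns (best_x, best_f, initial_f)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in target]      # randomized starting phases
    f = goal(x, target)
    initial_f = f
    best_x, best_f = x[:], f
    t = t0
    for _ in range(steps):
        for _ in range(iters_per_step):
            i = rng.randrange(len(x))
            x[i] ^= 1                            # flip the phase of one component
            f_new = goal(x, target)
            delta = f_new - f
            # Metropolis criterion: always accept improvements, sometimes accept worse
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                f = f_new
                if f < best_f:
                    best_x, best_f = x[:], f
            else:
                x[i] ^= 1                        # rejected: undo the flip
        t *= factor                              # geometric cooling
    return best_x, best_f, initial_f

target = [0] * 10 + [1] * 20 + [0] * 10          # hypothetical "true" phase pattern
best_x, best_f, initial_f = anneal(target)
print(f"goal function: {initial_f:.4f} -> {best_f:.4f}")
```

In this toy the discrepancy is computed against a known pattern; in the real problem it is computed against the experimental scattering curve, which is far more expensive to evaluate.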

Running DAMMIN

Command-Line Arguments and Options

Usage:

$ dammin GNOMFILE [OPTIONS]

DAMMIN requires the following command line arguments:

Argument Description
GNOMFILE A relative or absolute path to regularised SAS data (.out).

DAMMIN recognizes the following command-line options:

Short Option Long Option Description
  --mo <MODE> Configuration of the annealing procedure, one of FAST (bigger beads, cooling down quickly), SLOW (smaller beads, cooling down slowly), or KEEP (keeps up to 15 best models fitting the data); default: FAST.
  --sy <SYMMETRY> Specify the point symmetry of the particle. Point groups P1, …, P19, Pn2 (n = 2, …, 12), P23, P432 or PICO (icosahedral) are supported. By default, no symmetry is enforced (P1).
  --an <ANISOMETRY> Particle anisometry: oblate (O), prolate (P) or unknown (default).
  --dr <DIRECTION> Direction of anisometry, applicable with P2 symmetry only: along (L), across (C) or unknown (default).
  --un <UNIT> Angular unit of the input file, either ‘1’ (\(\AA^{-1}\)) or ‘2’ (\(\text{nm}^{-1}\)); if not given, the application will attempt to guess the units from the data.
  --lo <LOG_FILE> Prefix to prepend to output filenames. Default is the base name of the DAMMIN input file without extension.
  --id <DESCRIPTION> Project description. By default, the command line content is used.
  --seed <INT> Set the seed for the random number generator.
  --lm <INT> Maximum order of harmonics (default 10).
  --ra <X> Packing radius of dummy atoms (in angstrom).
  --sv <SHAPE> Search volume, one of: sphere, ellipsoid, cylinder, parallelepiped. Default is ‘sphere’.
  --param1 <X> ellipsoid semi-axis/cylinder outer radius/parallelepiped length (in angstrom).
  --param2 <X> ellipsoid semi-axis/cylinder inner radius/parallelepiped width (in angstrom).
  --param3 <X> ellipsoid semi-axis/cylinder height/parallelepiped height (in angstrom).
  --svfile <FILE> File with the custom search volume.
  --model-format <FMT> Format of 3D models, one of: cif, pdb (default: cif)
-h --help Print a summary of arguments and options, then exit.
-v --version Print the ATSAS version and exit.
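When DAMMIN is driven from a script, the command line can be assembled programmatically. The following sketch uses only option names listed in the table above; the helper `build_dammin_command` and its keyword names are hypothetical and not part of ATSAS. It only builds the argument list and does not start DAMMIN.

```python
import shlex

def build_dammin_command(gnomfile, mode=None, symmetry=None, prefix=None,
                         description=None, seed=None):
    """Assemble a dammin argument list from a subset of the documented options."""
    argv = ["dammin", gnomfile]
    options = [("--mo", mode), ("--sy", symmetry), ("--lo", prefix),
               ("--id", description), ("--seed", seed)]
    for flag, value in options:
        if value is not None:                 # omitted options keep their defaults
            argv += [flag, str(value)]
    return argv

argv = build_dammin_command("ly01.out", mode="SLOW", symmetry="P2",
                            prefix="ly0100", seed=42)
print(shlex.join(argv))
# prints: dammin ly01.out --mo SLOW --sy P2 --lo ly0100 --seed 42
```

The resulting list can be passed unchanged to `subprocess.run(argv)`, which avoids shell quoting problems with descriptions that contain spaces.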

Interactive Configuration

If the argument ‘GNOMFILE’ is omitted, the settings available through command-line arguments and options can instead be configured interactively, as shown in the table below. Otherwise these questions are skipped.

Screen Text Default Description
Mode:<[F]>ast, [S]low, [J]ag, [K]eep, [E]xpert Fast Configuration of the annealing procedure, one of Fast (bigger beads, cooling down quickly), Slow (smaller beads, cooling down slowly), Jag (more dummy atoms and more spherical harmonics, annealing in repeated cycles), Expert or Keep (keeps up to 15 best models fitting the data).
Log file name None Prefix to the DAMMIN output files.
Input data, GNOM output file name None DAMMIN input file.
Enter project description None Short description of the run.
Angular units in the input file 4*pi*sin(theta)/lambda [1/angstrom] (1) 4*pi*sin(theta)/lambda [1/nm ] (2) 1 Angular units of the input file, one of \(\AA^{-1}\) or \(\text{nm}^{-1}\). Default is \(\AA^{-1}\).
Portion of the curve to be fitted 1.000 Fraction of the scattering curve to fit, starting at the first point. The whole curve (1.000) is used by default.
Number of knots in the curve to fit 20 Experimental data is smoothed by spline interpolation before fitting. This defines the number of supporting points of the spline.
Constant subtraction procedure. Enter Positive number: value to be subtracted, OR Negative number: to skip subtraction Zero for automatic subtraction 0.0 A constant is subtracted from the data to force the \(s^{-4}\) decay of the intensity at higher angles. By default, an appropriate constant is determined automatically.
Maximum order of harmonics 20 The default value for the maximum order of spherical harmonics used to compute the scattering intensity, L=10, is sufficient in most practical applications. If the scattering curve is to be fitted over a very broad range (e.g., more than 15 Shannon channels) or the particle is expected to be very anisometric, a larger L may be useful (up to L=30 is supported).
Initial DAM: type S for sphere [default], E for ellipsoid, C for cylinder, P for parallelepiped or start file name S Define a search volume, one of sphere (S), ellipsoid (E), cylinder (C), parallelepiped (P) or a user-defined PDB file.
- When a spherical search volume is used, the diameter of the sphere is by default the maximum size of the particle given in the input file and it can be modified when running DAMMIN in Expert mode.
- When an ellipsoid is used as the search volume, the values for the three semi-axes can be defined.
- When a cylinder is used, the outer radius, the inner radius (in the case of a hollow cylinder) and the height can be defined.
- The search volume can also be a parallelepiped, whose length, width and height can be defined. Alternatively, a PDB file of dummy atoms can be used as the search volume. This is helpful, for example, when refining the shape starting from an averaged ab initio model.
Symmetry: P1…19 or Pn2 (n=1,..,12) or P23 or P432 or PICO P1 Specify the point symmetry of the particle, one of P1…19, Pn2 (n=1,…,12), P23, P432 or PICO (icosahedral).
Packing radius of dummy atoms variable The default radius of dummy atoms is determined by the maximum size of the particle because the program packs approximately 1500 dummy atoms in the search volume (for each asymmetric unit in case of symmetry). It is possible to change the radius in the expert mode, which effectively modifies the number of dummy atoms.
Expected particle shape:<P>rolate,<O>blate, or<U>nknown Unknown Expected particle anisometry, either prolate, oblate or unknown.
Looseness penalty weight variable Looseness is a penalty term that discourages loosely connected models in order to promote compactness. The default weight is approximately \(5 \times 10^{-3}\), varying with the selected mode (e.g., \(6 \times 10^{-3}\) in fast mode, \(2 \times 10^{-3}\) in expert mode). Changing this value is generally not recommended. To disable the looseness penalty entirely, set the weight to 0.
Disconnectivity penalty weight variable Disconnectivity is a penalty term that discourages models composed of disconnected or isolated components. By default, this penalty is applied with a weight on the order of \(10^{-3}\), depending on the running mode (e.g., \(6 \times 10^{-3}\) in fast mode and \(2 \times 10^{-3}\) in expert mode). Changing this value is generally not recommended. To disable the disconnectivity penalty entirely, set the weight to 0.
Peripheral penalty weight variable The peripheral penalty keeps the model centered within the search volume. Its effect gradually decreases with the annealing temperature. It is not recommended to change this default behavior.
Fixing thresholds Los and Rf 0.0, 0.0 When the overall shape is already well defined, some dummy atoms can be fixed as particle or solvent to suppress unnecessary global movements and improve convergence. This feature is controlled via the Looseness fixing and R-factor fixing thresholds. To disable it (recommended), leave both thresholds at their default values <0.0, 0.0>.
Randomize the structure Yes Whether or not to randomize the initial structure, i.e., the spheres in the search volume are assigned randomly to solvent and particle phases.
Weight: 0=s^2, 1=Emphas.s->0, 2=Log 1 Three weighting schemes are available for the least-squares fit:
0 — Porod weighting: \(W(s) = s^2\)
1 — Porod weighting with increased emphasis on low-angle points (default)
2 — Weighting on a logarithmic scale
Option 0 may underweight the lowest-angle data, especially for highly anisometric particles (e.g. 1:10 aspect ratio), leading to poor fits in that region. Options 1 and 2 generally produce better results in such cases.
Initial scale factor variable This is the multiplicative factor relating the computed scattering intensity to the experimental data during least-squares fitting. The initial scale factor has no significant effect on the modeling process, as it is recalculated after each model update.
Fix the scale factor No It is not recommended to fix the scale factor.
Initial annealing temperature variable By default, the starting annealing temperature is on the order of \(10^{-3}\). In fast mode, it is adjusted automatically; in expert mode, it can be set manually. During optimization, the temperature typically decreases to values around \(10^{-6}\). A practical range for the initial value is \(10^{-5}\) to \(10^{-4}\). Changing the default is not recommended.
Annealing schedule factor 0.95 Factor by which the temperature is decreased; 0.95 is a good value for the annealing process. Faster cooling for smaller systems is possible by setting the factor to 0.9.
# of independent atoms to modify 1 Number of independent dummy atoms changing the phase during one iteration. Default is 1; it is not recommended to change this value.
Max # of iterations at each T variable A temperature step is completed, and the system cooled, after at most this number of iterations.
Max # of successes at each T variable A temperature step is completed, and the system cooled, after at most this number of successful phase changes.
Min # of successes to continue variable Annealing stops if fewer than this number of successful phase changes occur within a single temperature step.
Max # of annealing steps 200 Stop if simulated annealing is not finished after this number of steps.
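The annealing parameters in the table can be related to each other with a short calculation. The sketch below assumes that the temperature is multiplied by the annealing schedule factor once per step (a geometric schedule, as the description of the factor suggests) and estimates how many steps are needed to cool from an initial temperature of \(10^{-3}\) to about \(10^{-6}\). The function name `steps_to_reach` is illustrative only.

```python
import math

def steps_to_reach(t_initial, t_final, factor):
    """Smallest n such that t_initial * factor**n <= t_final (geometric cooling)."""
    return math.ceil(math.log(t_final / t_initial) / math.log(factor))

n_095 = steps_to_reach(1e-3, 1e-6, 0.95)   # default schedule factor
n_090 = steps_to_reach(1e-3, 1e-6, 0.90)   # faster cooling
print(n_095, n_090)
# prints: 135 66
```

Under this assumption, both schedules reach \(10^{-6}\) well within the default maximum of 200 annealing steps.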

Runtime Output

At runtime, two lines of output are generated for each temperature step:

 j:   1 T: 0.100E-02 Suc: 13482 Eva:    16503 CPU:  0.779E+00 SqF: 0.5947
  Rf: 0.43667 Los:0.0749 Dis:0.0198 Per:  0.5413 Sca: 0.109E-07

The fields can be interpreted as follows, top-left to bottom-right:

Field Description
j Step number. Starts at 1, increases monotonically.
T Temperature measure; starts at an arbitrary high value and decreases at each step.
Suc Number of successful phase changes in this temperature step.
Eva Accumulated number of function evaluations.
CPU Elapsed CPU time since the annealing procedure was started.
SqF Square root of the sum of the goodness of fit and the penalty.
Rf Goodness of fit of the simulated data versus the experimental data, not taking penalties into account.
Los Contribution of the looseness penalty.
Dis Contribution of the disconnectivity penalty.
Per Contribution of the peripheral penalty.
Sca Scale factor providing the best least square fit between the experimental data and the simulated data.
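To follow the progress of a run from a script, the two lines printed per temperature step can be parsed into named fields. The following is a minimal sketch that assumes the `Name: value` layout shown in the sample above; the regular expression and the helper `parse_step` are illustrative and not part of ATSAS.

```python
import re

# A field name, a colon, optional spaces, then a number (possibly in E notation).
FIELD = re.compile(r"(\w+):\s*([-+0-9.Ee]+)")
INTEGER_FIELDS = ("j", "Suc", "Eva")

def parse_step(two_lines):
    """Parse the two output lines of one temperature step into a dict."""
    record = {}
    for name, value in FIELD.findall(two_lines):
        record[name] = int(value) if name in INTEGER_FIELDS else float(value)
    return record

sample = """ j:   1 T: 0.100E-02 Suc: 13482 Eva:    16503 CPU:  0.779E+00 SqF: 0.5947
  Rf: 0.43667 Los:0.0749 Dis:0.0198 Per:  0.5413 Sca: 0.109E-07"""

step = parse_step(sample)
print(step["j"], step["Suc"], step["Rf"])
# prints: 1 13482 0.43667
```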

dammin Input Files

DAMMIN requires regularised SAS data (.out) as generated by GNOM.

dammin Output Files

DAMMIN writes a set of output files whose names all start with a customizable prefix (see the --lo option). If a prefix has been used before, existing files are overwritten without warning.

Extension Description
.log A copy of the screen output
-1.pdb or -1.cif The model, in either PDB or mmCIF format, depending on the --model-format option.
.fit Fit of the simulated scattering curve versus a smoothed version of the experimental data. The number of supporting points of the spline interpolation can be changed in interactive mode.
.fir Fit of the simulated scattering curve versus the experimental data.
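Because existing files are overwritten without warning, it can be useful to check for earlier results before starting a run. The sketch below assumes that output names are formed by appending the extensions from the table directly to the prefix (e.g. `ly0100.log`, `ly0100-1.pdb`); this naming and the helper `existing_outputs` are assumptions for illustration, not part of ATSAS.

```python
import tempfile
from pathlib import Path

# Extensions as listed in the table above (assumed to be appended to the prefix).
SUFFIXES = (".log", "-1.pdb", "-1.cif", ".fit", ".fir")

def existing_outputs(prefix, directory="."):
    """Return the output files for `prefix` that already exist in `directory`."""
    base = Path(directory)
    candidates = [base / (prefix + suffix) for suffix in SUFFIXES]
    return [path for path in candidates if path.exists()]

# Demonstration in a temporary directory holding one earlier log file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ly0100.log").touch()
    found_names = [path.name for path in existing_outputs("ly0100", tmp)]

print(found_names)
# prints: ['ly0100.log']
```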

Examples

The following examples show how to do shape reconstruction using DAMMIN given the GNOM output file ly01.out.

Running in batch mode

Run DAMMIN and get online help.

$ dammin -h

Run DAMMIN on the GNOM out file ly01.out using default values for all parameters.

$ dammin ly01.out

Run DAMMIN on the GNOM out file ly01.out defining the prefix to the output files as ly0100 and using default values for all other parameters.

$ dammin ly01.out --lo ly0100

Run DAMMIN in slow mode on the GNOM out file ly01.out defining the prefix to the output files as ly0100 and using default values for all other parameters.

$ dammin ly01.out --lo ly0100 --mo SLOW

Run DAMMIN in slow mode on the GNOM out file ly01.out defining the prefix to the output files as ly0100, applying P2 symmetry, giving “this is a test run for lysozyme” as project description and using default values for all other parameters.

$ dammin ly01.out --lo ly0100 --mo SLOW --sy P2 --id "this is a test run for lysozyme"

Interactive configuration

Run DAMMIN on the GNOM out file ly01.out interactively in fast mode.

$ dammin
  ***     Ab inito shape determination by simulated      ***
  ***  annealing using a single phase dummy atoms model  ***
  ***  Win 9x/NT, UNIX/Linux/Mac release version  5.3    ***
  ***      Last modified     ---  13/02/07 18:00         ***
  ***  Please reference: D.Svergun (1999). Biophys. J.   ***
  ***                             76, 2879-2886.         ***
  ***   Copyright (c) ATSAS Team                         ***
  ***   EMBL, Hamburg Outstation, 1999 - 2007            ***

   Type dammin /help for batch mode use

   ====== DAMMIN  started on           18-Aug-2009 09:34:18

 Mode: <[F]>ast, [S]low, [J]ag, [K]eep, [E]xpert <         Fast >:
 Log file name .......................... <          log >: t01
 Input data, GNOM output file name ...... <         .out >: ly01
 Project identificator .................................. : t01
 Enter project description .............. :
 Random sequence initialized from ....................... : 93435
  ** Information read from the GNOM file **
 Data set title:   Lysozyme, high angles (>.22) 46 mg/ml, small angles (<.22 mg/
 Raw data file name:  ly01exp.dat
 Maximum diameter of the particle ....................... : 50.00
 Solution at Alpha =  0.107E+01   Rg :  0.154E+02   I(0) :   0.657E+01
 Radius of gyration read ................................ : 15.40
 Number of GNOM data points ............................. : 213
 Angular units in the input file:
 4*pi*sin(theta)/lambda [1/angstrom] (1)
 4*pi*sin(theta)/lambda [1/nm      ] (2)  <            1 >: 2
 Angular units multiplied by ............................ : 0.1000
 Dmax and Rg divided by ................................. : 0.1000
 Maximum s value [1/angstrom] ........................... : 4.960e-2
 Number of Shannon channels ............................. : 7.894
 Portion of the curve to be fitted ...... <        1.000 >:
 Number of knots in the curve to fit .................... : 20
  *** Warning: constant reduced to avoid oversubtraction
 A constant was subtracted .............................. : 3.454e-2
 Maximum order of harmonics ............................. : 10
  Initial DAM: type S for sphere [default],
  E for ellipsoid, C for cylinder, P for parallelepiped
  or start file name .................... <         pdb >: S
 Symmetry: P1...19 or Pn2 (n=1,..,12)
 or P23 or P432 or PICO ................. <           P1 >:
 Sphere  diameter [Angstrom] ............................ : 500.0
 Packing radius of dummy atoms .......................... : 17.90
 Radius of the sphere generated ......................... : 250.0
 Number of dummy atoms .................................. : 1974
 Number of equivalent positions ......................... : 1
 Expected particle shape: <P>rolate, <O>blate,
  or <U>nknown .......................... <      Unknown >:
 Excluded volume per atom ............................... : 3.247e+4
 Radius of 1st coordination sphere ...................... : 50.48
 Minimum number of contacts ............................. : 5
 Maximum number of contacts ............................. : 12
 Looseness penalty weight ............................... : 6.000e-3
 No of non-solvent atoms ................................ : 1974
 Initial DAM looseness .................................. : 6.949e-3
 Disconnectivity penalty weight ......................... : 6.000e-3
 Initial DAM # of graphs ................................ : 1
 Discontiguity   value .................................. : 0.0
 Center of the initial DAM:    0.0000   0.0000   0.0000
 Peripheral penalty weight .............................. : 0.3000
 Peripheral penalty value ............................... : 0.5944
 Looseness fixing threshold ............................. : 5.000e-2
 R-factor  fixing threshold ............................. : 1.500e-2
 No of non-solvent atoms ................................ : 981
 Randomized DAM looseness ............................... : 0.1054
 Randomized DAM # of graphs ............................. : 5
 Discontiguity   value .................................. : 5.110e-3
 Randomized peripheral penalty value .................... : 0.5960
 Initial DAM shape anisometry ........................... : 1.973e-2
 Initial DAM non-prolateness ............................ : 0.0
 Initial DAM non-oblateness ............................. : 7.403e-3
 Weight: 0=s^2, 1=Emphas.s->0, 2=Log .................... : 1
 *** Porod weight with emphasis at low s ***
 Initial scale factor ................................... : 9.224e-15
 Initial R^2 factor ..................................... : 0.2739
 Initial R   factor ..................................... : 0.5234
 Initial penalty ........................................ : 0.1795
 Initial fVal ........................................... : 0.4534
  Tuning the annealing parameters. Please wait...
 Variation of the target function ....................... : 3.594e-4
 CPU per function call, seconds ......................... : 4.063e-4
 Initial annealing temperature .......................... : 1.078e-3
 Annealing schedule factor .............................. : 0.9000
 # of independent atoms to modify ....................... : 1
 Max # of iterations at each T .......................... : 138180
 Max # of successes at each T ........................... : 13818
 Min # of successes to continue ......................... : 46
 Max # of annealing steps ............................... : 100
  ====  Simulated annealing procedure started  ====
 j:   1 T: 0.108E-02 Suc: 13818 Eva:    16326 CPU:  0.642E+01 SqF: 0.6070
  Rf: 0.45421 Los:0.1120 Dis:0.0228 Per:  0.5380 Sca: 0.981E-14
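Some of the values in the log above can be cross-checked by hand. The sketch below reproduces the number of Shannon channels from the maximum s value and the maximum diameter (\(N_s = s_{max} D_{max} / \pi\), with \(D_{max}\) = 500 after the unit conversion shown in the log) and estimates the number of dummy atoms from the sphere radius and the packing radius. The estimate assumes ideal close packing of spheres (packing fraction \(\pi / (3\sqrt{2}) \approx 0.74\)); this is an assumption that is consistent with the printed excluded volume per atom, not a statement about DAMMIN's internals, and it ignores boundary effects.

```python
import math

def shannon_channels(s_max, d_max):
    """Number of Shannon channels covered by the data: s_max * D_max / pi."""
    return s_max * d_max / math.pi

def approx_dummy_atoms(search_radius, packing_radius):
    """Rough count of close-packed spheres of radius r inside a sphere of radius R."""
    packing_fraction = math.pi / (3 * math.sqrt(2))   # about 0.7405 (assumption)
    return packing_fraction * (search_radius / packing_radius) ** 3

n_shannon = shannon_channels(4.960e-2, 500.0)   # log reports 7.894
n_atoms = approx_dummy_atoms(250.0, 17.90)      # log reports 1974
print(f"{n_shannon:.3f} Shannon channels, about {n_atoms:.0f} dummy atoms")
```

The estimated atom count agrees with the logged value to within a few percent; the exact number depends on how the packing grid meets the surface of the search volume.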

Run DAMMIN on the GNOM out file ly01.out interactively in expert mode where all parameters can be tuned.

$ dammin
  ***     Ab inito shape determination by simulated      ***
  ***  annealing using a single phase dummy atoms model  ***
  ***  Win 9x/NT, UNIX/Linux/Mac release version  5.3    ***
  ***      Last modified     ---  13/02/07 18:00         ***
  ***  Please reference: D.Svergun (1999). Biophys. J.   ***
  ***                             76, 2879-2886.         ***
  ***   Copyright (c) ATSAS Team                         ***
  ***   EMBL, Hamburg Outstation, 1999 - 2007            ***
   Type dammin /help for batch mode use

   ====== DAMMIN  started on           18-Aug-2009   12:07:24
 Mode: <[F]>ast, [S]low, [J]ag, [K]eep, [E]xpert <         Fast >: E
 Log file name .......................... <         .log >: t01
 Input data, GNOM output file name ...... <         .out >: ly01
 Project identificator .................................. : t01
 Enter project description .............. : This is a test run for lysozyme
 Random sequence initialized from ....................... : 120818
  ** Information read from the GNOM file **
 Data set title:   Lysozyme, high angles (>.22) 46 mg/ml, small angles (<.22) 15 mg/
 Raw data file name:  ly01exp.dat
 Maximum diameter of the particle ....................... : 50.00
 Solution at Alpha =  0.107E+01   Rg :  0.154E+02   I(0) :   0.657E+01
 Radius of gyration read ................................ : 15.40
 Number of GNOM data points ............................. : 213
 Angular units in the input file:
 4*pi*sin(theta)/lambda [1/angstrom] (1)
 4*pi*sin(theta)/lambda [1/nm      ] (2)  <            1 >: 2
 Angular units multiplied by ............................ : 0.1000
 Dmax and Rg divided by ................................. : 0.1000
 Maximum s value [1/angstrom] ........................... : 4.960e-2
 Number of Shannon channels ............................. : 7.894
 Portion of the curve to be fitted ...... <         1.000 >:
 Number of knots in the curve to fit .... <            20 >:
 Constant subtraction procedure. Enter
 Positive number: value to be subtracted, OR
 Negative number: to skip subtraction   , OR
 Zero for automatic subtraction ......... <           0.0 >:
  *** Warning: constant reduced to avoid oversubtraction
 A constant was subtracted .............................. : 3.454e-2
 Maximum order of harmonics ............. <           20 >: 15
  Initial DAM: type S for sphere [default],
  E for ellipsoid, C for cylinder, P for parallelepiped
  or start file name .................... <         .pdb >: S
 Symmetry: P1...19 or Pn2 (n=1,..,12)
 or P23 or P432 or PICO ................. <           P1 >: P1
 Sphere  diameter [Angstrom] ............ <        500.0 >:
 Packing radius of dummy atoms .......... <        10.90 >: 10
 Radius of the sphere generated ......................... : 250.2
 Number of dummy atoms .................................. : 11590
 Number of equivalent positions ......................... : 1
 Expected particle shape: <P>rolate, <O>blate,
  or <U>nknown .......................... <      Unknown >:
 Excluded volume per atom ............................... : 5661.
 Radius of 1st coordination sphere ...... <        28.20 >:
 Minimum number of contacts ............................. : 5
 Maximum number of contacts ............................. : 12
 Looseness penalty weight ............... <     2.000e-3 >: 2.0E-2
 No of non-solvent atoms ................................ : 11590
 Initial DAM looseness .................................. : 3.635e-3
 Disconnectivity penalty weight ......... <     2.000e-2 >:
 Initial DAM # of graphs ................................ : 1
 Discontiguity   value .................................. : 0.0
 Center of the initial DAM:    0.0000   0.0000   0.0000
 Peripheral penalty weight .............. <       0.3000 >: 0.5
 Peripheral penalty value ............................... : 0.5997
 Fixing thresholds Los and Rf <          0.0,        0.0 >: 0,0
 Randomize the structure [ Y / N ] ...... <          Yes >:
 No of non-solvent atoms ................................ : 5749
 Randomized DAM looseness ............................... : 9.166e-2
 Randomized DAM # of graphs ............................. : 14
 Discontiguity   value .................................. : 2.787e-3
 Randomized peripheral penalty value .................... : 0.5953
 Initial DAM shape anisometry ........................... : 4.752e-3
 Initial DAM non-prolateness ............................ : 4.005e-3
 Initial DAM non-oblateness ............................. : 0.0
 Weight: 0=s^2, 1=Emphas.s->0, 2=Log .... <            1 >:
 *** Porod weight with emphasis at low s ***
 Initial scale factor ................... <    8.809e-15 >:
 Fix the scale factor [ Y / N ] ......... <           No >:
 Initial R^2 factor ..................................... : 0.2918
 Initial R   factor ..................................... : 0.5402
 Initial penalty ........................................ : 0.2995
 Initial fVal ........................................... : 0.5913
  Tuning the annealing parameters. Please wait...
 Variation of the target function ....................... : 6.450e-5
 CPU per function call, seconds ......................... : 2.250e-3
 Initial annealing temperature .......... <     1.000e-3 >:
 Annealing schedule factor .............. <       0.9500 >:
 # of independent atoms to modify ....... <            1 >:
 Max # of iterations at each T .......... <       811300 >:
 Max # of successes at each T ........... <        81130 >:
 Min # of successes to continue ......... <          270 >:
 Max # of annealing steps ............... <          200 >:
  ====  Simulated annealing procedure started  ====
 j:   1 T: 0.100E-02 Suc: 81130 Eva:    83447 CPU:  0.191E+03 SqF: 0.7583
  Rf: 0.52963 Los:0.0913 Dis:0.0021 Per:  0.5852 Sca: 0.868E-14
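The first annealing step of each of the two runs above can be used to check how the runtime fields relate to each other. Judging from the printed numbers, SqF equals the square root of Rf squared plus the penalty values multiplied by their weights. This relation is inferred from the logs, not taken from the DAMMIN sources, and the function `sqf` below is only an illustration of it.

```python
import math

def sqf(rf, los, dis, per, w_los, w_dis, w_per):
    """Square root of the goal function: Rf^2 plus the weighted penalties."""
    return math.sqrt(rf ** 2 + w_los * los + w_dis * dis + w_per * per)

# Fast-mode run: weights 6.000e-3, 6.000e-3 and 0.3000 as printed in its log.
fast = sqf(0.45421, 0.1120, 0.0228, 0.5380, 6.0e-3, 6.0e-3, 0.3)

# Expert-mode run: weights 2.0E-2, 2.000e-2 and 0.5 as entered or accepted.
expert = sqf(0.52963, 0.0913, 0.0021, 0.5852, 2.0e-2, 2.0e-2, 0.5)

print(f"fast: {fast:.4f} (log: 0.6070), expert: {expert:.4f} (log: 0.7583)")
```

In both runs the peripheral penalty dominates the penalty contribution at the first step, in line with its effect being strongest at high temperature.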