Manual

The following describes the method implemented in GLOBSYMM, details of the dialog prompt as well as the required input and the produced output files.

Introduction

GLOBSYMM builds symmetric oligomers from a single subunit by fitting their calculated scattering to experimental SAXS data.

At a high level, the program places one monomer in 3D (rotation + translation), applies the chosen symmetry to generate all copies, computes the theoretical scattering of the resulting assembly, and calculates the fit to the data. This process is repeated over many candidate arrangements using an efficient grid search; models with clashes or disconnected subunits are penalized, and optional residue–residue contact information can be used to guide the search.
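
The symmetry step described above can be illustrated with a minimal sketch. It assumes only the axis convention given in this manual (n-fold axis along Z, optional two-fold axis along Y); the function names are hypothetical and this is not GLOBSYMM's actual code.

```python
import math

def rotate_z(point, angle):
    """Rotate a point (x, y, z) about the Z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def flip_y(point):
    """Rotate a point by 180 degrees about the Y axis."""
    x, y, z = point
    return (-x, y, -z)

def symmetry_copies(point, n, two_fold=False):
    """Return the symmetry-related copies of one point.

    n        -- order of the rotation axis along Z (Pn)
    two_fold -- add the two-fold axis along Y (Pn2, e.g. P222 for n=2)
    """
    copies = [rotate_z(point, 2.0 * math.pi * k / n) for k in range(n)]
    if two_fold:
        copies += [flip_y(c) for c in copies]
    return copies

# P222 turns one position into four symmetry mates.
for mate in symmetry_copies((1.0, 2.0, 3.0), n=2, two_fold=True):
    print(tuple(round(v, 3) for v in mate))
```

Applying the same operations to every atom of the placed monomer yields the full assembly; Pn gives n copies and Pn2 gives 2n copies.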

GLOBSYMM produces a ranked list of symmetric arrangements. The best-fitting model is saved together with its calculated fit, and further top solutions can be retrieved from the log file.

Running GLOBSYMM

Usage:

$ globsymm [OPTIONS] [<LOGFILE> <SOLUTION_NUMBER>]

The OPTIONS known by GLOBSYMM are described in the next section. Configuration is done entirely in interactive mode.

Command-Line Arguments and Options

GLOBSYMM recognizes the following command-line options. Mandatory arguments to long options are mandatory for short options too.

| Short Option | Long Option | Description |
|---|---|---|
| | --model-format=<FMT> | Format of 3D models, one of: cif, pdb (default: cif) |
| -h | --help | Print a summary of arguments, options, and exit. |
| -v | --version | Print version information and exit. |

Modelling with GLOBSYMM is done in interactive dialog mode. Retrieval of non-best solutions is performed from the command line.

Interactive Configuration

Configurations may be recorded and replayed, enabling repeatable runs without re-entering parameters.

| Screen Text | Default | Description |
|---|---|---|
| Output file | N/A | Project identifier; used as a prefix for all output file names. |
| Enter file name, experimental data | N/A | Name of the data file containing the experimental SAXS profile of the oligomer. |
| Angular units in the input file: 4*pi*sin(theta)/lambda [1/angstrom] (1), 4*pi*sin(theta)/lambda [1/nm] (2), 2*sin(theta)/lambda [1/angstrom] (3), 2*sin(theta)/lambda [1/nm] (4) | 1 | Formula for the scattering vector in the data file and its units. |
| Fitting range in fractions of Smax | 1.0 | Fraction of the scattering curve to fit, starting at the first point. The default is the entire curve. |
| Amplitudes of a monomer | N/A | Name of the file with partial scattering amplitudes of the monomer, computed by CRYSOL. |
| Symmetry: Pn(2) (n=2-6) | P222 | Supported symmetries are P2-P6, P222, and P32-P62. The n-fold axis is Z; an additional two-fold axis, if present, coincides with Y. |
| Average distance from the origin | var | The search is performed in the vicinity of the specified value. The default is the square root of the difference between the squared experimental radius of gyration and the squared radius of gyration of the monomer. This ensures a proper R~g~ value for the intact oligomer. |
| Spatial increment in angstroems | 1.0 | Step of the grid search over translations. |
| No of spatial steps in one direction | 1 | Number of steps along the radius towards the origin or in the opposite direction. |
| Angular increment in degrees | 5.0 | Step of the grid search over rotations. |
| Number of angular steps | var | Number of clockwise (or counterclockwise) rotations. The default is 180/step. |
| Fibonacci grid order for positioning | 12 for Pn2, 1 for Pn | The translational search moves the monomer along radial directions. The higher the order of the grid, the more directions are considered. For Pn symmetries, shifting along one axis (grid order = 1) is enough to construct an arbitrary multimer. |
| Fibonacci grid order for rotation axes | 12 | A separate grid of angular directions is generated as a set of rotation axes for the search over the monomer's orientation. |
| RMSD threshold for grouping solutions | var | During the run the program keeps a list of the 20 best solutions, which are grouped after the minimization is finished. All solutions within a group must differ from the best model of that group by an r.m.s.d. smaller than the specified threshold. The default value is 20% of the experimental R~g~. |
| String with the 1st atom in the pair | N/A | The line from the atomic coordinates file containing the Cα atom of the first residue of a contacting pair (for use when the oligomerization interface is known). This question may be asked several times. An empty string ends the input of contact information. |
| String with the 2nd atom in the pair | N/A | The line from the atomic coordinates file containing the Cα atom of the second residue of the pair. This question may be asked several times. An empty string ends the input of contact information. |
| Contact distance | 5.0 | The desired maximal distance between the two residues. This question may be asked several times. |
| Contact weight | 1.0 | How strongly the contacts penalty influences the target function. A value of 0.0 disables the penalty. If unsure, use the default value. If the desired contacts are not observed, try increasing the weight. |
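
The three defaults marked "var" in the table above follow simple formulas, and the sketch below reproduces them from the descriptions given there. The function names are hypothetical; the input values (experimental R~g~ 37.82, subunit R~g~ 26.17, angular increment 5.0) are taken from the example session at the end of this manual.

```python
import math

def default_distance(rg_exp, rg_monomer):
    """Default 'Average distance from the origin':
    sqrt(Rg_exp^2 - Rg_monomer^2)."""
    if rg_exp < rg_monomer:
        raise ValueError("experimental Rg is smaller than the monomer Rg")
    return math.sqrt(rg_exp ** 2 - rg_monomer ** 2)

def default_angular_steps(increment_deg):
    """Default 'Number of angular steps': 180 / angular increment."""
    return round(180.0 / increment_deg)

def default_rmsd_threshold(rg_exp):
    """Default 'RMSD threshold for grouping solutions': 20% of Rg_exp."""
    return 0.2 * rg_exp

print(f"{default_distance(37.82, 26.17):.2f}")   # 27.30
print(default_angular_steps(5.0))                # 36
print(f"{default_rmsd_threshold(37.82):.2f}")    # 7.56
```

The first two values match the defaults offered in the example session (27.30 and 36). The session shows 7.563 for the threshold, presumably because the program works with an unrounded R~g~.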

Runtime Output

At runtime, the progress, the current value of the target function, and the elapsed CPU time are displayed:

  22% done, fVal =   3.87015, CPU =    7 sec

When the minimization procedure is finished, the number of grid steps done and the number of skipped configurations (overlapping or not interconnected) are shown:

Total number of steps done ............................. : 1716
Number of structures skipped ........................... : 1695
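
For scripted runs it can be convenient to extract these numbers from captured output. The sketch below is an illustration only: the regular expressions are written against the two output samples shown above, and the function names are hypothetical. Other versions of the program may format these lines differently.

```python
import re

# Patterns derived from the sample lines in this section (an assumption).
PROGRESS_RE = re.compile(
    r"(\d+)% done, fVal =\s*([-+0-9.eE]+), CPU =\s*(\d+) sec")
SUMMARY_RE = re.compile(
    r"(Total number of steps done|Number of structures skipped)"
    r"\s*\.+\s*:\s*(\d+)")

def parse_progress(line):
    """Return (percent, target function value, CPU seconds) or None."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), int(m.group(3))

def parse_summary(text):
    """Return a dict mapping each summary label to its integer count."""
    return {label: int(count) for label, count in SUMMARY_RE.findall(text)}

sample = """\
Total number of steps done ............................. : 1716
Number of structures skipped ........................... : 1695
"""
print(parse_progress("  22% done, fVal =   3.87015, CPU =    7 sec"))
counts = parse_summary(sample)
skipped = counts["Number of structures skipped"]
total = counts["Total number of steps done"]
print(f"{100.0 * skipped / total:.1f}% of configurations skipped")
```

In the sample above, 1695 of 1716 configurations (about 98.8%) were skipped as overlapping or not interconnected.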

GLOBSYMM Input Files

GLOBSYMM requires experimental SAS data (.dat) as well as binary amplitude data (.alm) of the monomer, as calculated by CRYSOL.
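
The four choices offered by the "Angular units in the input file" prompt differ only by a constant factor. The sketch below shows the conversion to 4*pi*sin(theta)/lambda in 1/angstrom. It assumes that this is the convention used internally, which the example session suggests (option 1 reports "Angular units multiplied by 1.000") but which this manual does not state explicitly. The names are hypothetical.

```python
import math

# Multipliers that bring each option to 4*pi*sin(theta)/lambda in 1/angstrom.
UNIT_FACTORS = {
    1: 1.0,                  # 4*pi*sin(theta)/lambda [1/angstrom]
    2: 0.1,                  # 4*pi*sin(theta)/lambda [1/nm]
    3: 2.0 * math.pi,        # 2*sin(theta)/lambda    [1/angstrom]
    4: 2.0 * math.pi * 0.1,  # 2*sin(theta)/lambda    [1/nm]
}

def to_inverse_angstrom(values, option=1):
    """Scale scattering-vector values according to the chosen unit option."""
    if option not in UNIT_FACTORS:
        raise ValueError("option must be 1, 2, 3 or 4")
    factor = UNIT_FACTORS[option]
    return [v * factor for v in values]

# A point at 1.0 1/nm lies at 0.1 1/angstrom.
print(to_inverse_angstrom([1.0], option=2))
```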

GLOBSYMM Output Files

GLOBSYMM creates a set of output files. Each file name consists of a customizable prefix with an extension appended. If a prefix has been used before, existing files are overwritten without warning.

| Extension | Description |
|---|---|
| .log | Contains the information about the modelling parameters and the resulting models. |
| .pdb or .cif | The model is provided in either PDB or mmCIF format, depending on the model-format option. The header of the file contains information about the modelling parameters. |
| .fit | Fit of the scattering calculated from the model to the experimental scattering curve. |

To retrieve a model from the list of kept models in the log file, run:

$ globsymm LOGFILE SOLUTION

LOGFILE is the name of the generated log file and SOLUTION is the solution number. The SOLUTION.pdb and SOLUTION.fit files will be produced.
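
When several solutions are to be retrieved, the command lines can be generated by a small script. The sketch below only builds the argument lists and does not run GLOBSYMM. It follows the usage line given earlier (globsymm [OPTIONS] [<LOGFILE> <SOLUTION_NUMBER>]). The helper name, the log file name pdc01.log (prefix from the example session plus the .log extension), and the assumption that --model-format also applies to retrieval are illustrative and not confirmed by this manual.

```python
import shlex

def retrieval_command(logfile, solution, model_format=None):
    """Build the argument list for retrieving one kept solution.

    model_format -- optional 'cif' or 'pdb'; passing it during retrieval
                    is an assumption, not documented behaviour.
    """
    argv = ["globsymm"]
    if model_format is not None:
        if model_format not in ("cif", "pdb"):
            raise ValueError("model_format must be 'cif' or 'pdb'")
        argv.append(f"--model-format={model_format}")
    argv += [logfile, str(solution)]
    return argv

# Print one command line per solution number of interest.
for number in (2, 3):
    print(shlex.join(retrieval_command("pdc01.log", number, "pdb")))
```

The resulting lists can be passed to subprocess.run if the commands are to be executed directly.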

Examples

Tetrameric pyruvate decarboxylase from Z. mobilis with P222 symmetry

 Output file ............................ <         .log >: pdc01
 Enter file name, experimental data ..... <         .dat >: zymz
 Angular units in the input file :
 4*pi*sin(theta)/lambda [1/angstrom] (1)
 4*pi*sin(theta)/lambda [1/nm      ] (2)
 2*   sin(theta)/lambda [1/angstrom] (3)
 2*   sin(theta)/lambda [1/nm      ] (4)  <            1 >:
 Angular units multiplied by ............................ : 1.000
 Fitting range in fractions of Smax ..... <        1.000 >:
 Number of experimental points found .................... : 51
 Experimental radius of gyration ........................ : 37.82
 Number of points in the Guinier Plot ................... : 9
 Amplitudes, fixed particle ............. <         .alm >: z100
 Radius of gyration of the subunit ...................... : 26.17
 Maximum order of harmonics ............................. : 15
 Number of points in partial amplitudes ................. : 51
  Theoretical  points from            1     to           22    used
 Symmetry: Pn(2) (n=2-6) ................ <         P222 >:
 Average distance from the origin ....... <        27.30 >:
 Spatial increment in angstroems ........ <        1.000 >:
 No of spatial steps in one direction ... <            1 >:
 Angular increment in degrees ........... <        5.000 >:
 Number of angular steps ................ <           36 >:
 Fibonacci grid order for positioning ... <           12 >:
 Fibonacci grid order for rotation axes . <           12 >:
 RMSD threshold for grouping solutions .  <        7.563 >:
  Interatomic contacts information:
  Paste the strings of the monomer pdb file
  containing the atoms forming the pairs
  or press enter to skip further input
  Pair #            1
 String with the 1st atom in the pair ....................:

  Minimization started. Please wait...

 ALMGRZ --- :  110976 summation coefficients used

 ...