Manual

The program OLIGOMER fits an experimental scattering curve from a multicomponent mixture of proteins to find the volume fraction of each component in the mixture.

Introduction

The input data for OLIGOMER are an experimental SAXS or SANS scattering curve from a mixture of several distinct components and a set of form-factors, one for each component of this mixture. The experimental scattering intensity I(s) from a mixture of K different particles (components) is written as: I(s) = ∑ ( w~i~ I~i~(s) ), where the sum runs over the K components and w~i~ and I~i~(s) are the volume fraction and the scattering intensity of the i-th component, respectively. Given the intensities of the components (form-factors), OLIGOMER finds the volume fractions by solving a system of linear equations with a nonnegative or unconstrained least-squares algorithm that minimizes the discrepancy between the experimental and calculated scattering curves. X-ray or neutron form-factors of proteins with known structure or of dummy atom models can be computed using the programs CRYSOL and CRYSON, respectively, both of which are part of the ATSAS package.
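The least-squares step described above can be illustrated with a small pure-Python sketch. It is not OLIGOMER's implementation: the function names are hypothetical, the non-negative solution is found by simply trying every subset of components (practical only for a handful of curves), the discrepancy is assumed to be a reduced chi-square, and the returned values are the raw linear coefficients rather than normalized volume fractions.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def chi_square(i_exp, sigma, curves, weights):
    """Reduced discrepancy (assumed definition) between data and model."""
    total = 0.0
    for j, (value, error) in enumerate(zip(i_exp, sigma)):
        model = sum(w * curve[j] for w, curve in zip(weights, curves))
        total += ((value - model) / error) ** 2
    return total / (len(i_exp) - 1)


def fit_weights(i_exp, sigma, curves, non_negative=True):
    """Return (chi2, weights) for I(s) = sum_i w_i * I_i(s).

    With non_negative=True every subset of components is fitted by
    weighted least squares and the best all-non-negative result is kept;
    components outside the chosen subset get weight 0.
    """
    k = len(curves)
    points = range(len(i_exp))
    sizes = range(1, k + 1) if non_negative else [k]
    best = None
    for size in sizes:
        for subset in combinations(range(k), size):
            # Weighted normal equations restricted to this subset.
            a = [[sum(curves[p][j] * curves[q][j] / sigma[j] ** 2 for j in points)
                  for q in subset] for p in subset]
            b = [sum(curves[p][j] * i_exp[j] / sigma[j] ** 2 for j in points)
                 for p in subset]
            try:
                solution = solve_linear(a, b)
            except ValueError:
                continue
            if non_negative and any(w < 0.0 for w in solution):
                continue
            weights = [0.0] * k
            for index, w in zip(subset, solution):
                weights[index] = w
            chi2 = chi_square(i_exp, sigma, curves, weights)
            if best is None or chi2 < best[0]:
                best = (chi2, weights)
    return best
```

Calling `fit_weights(i_exp, sigma, [curve1, curve2])` on curves sampled at the same s-values returns the discrepancy and one coefficient per component; passing `non_negative=False` gives the unconstrained solution, which may contain negative coefficients.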

Running OLIGOMER

Usage:

$ oligomer [OPTIONS]

The OPTIONS known by OLIGOMER are described in the section “Command-Line Arguments and Options” below. Prior to running the application, a file containing the scattering intensities of the components (form-factors) must be prepared in the following format:

  • 1^st^ line: comment
  • from the 2^nd^ line on: scattering vector (s), iForm1, iForm2…

Here s = 4π sin(θ)/λ [1/angstrom], and iForm1, iForm2 etc. are the scattering intensities of the components. They should be normalized in such a way that iForm(s=0) is equal (or proportional) to the square of the molecular weight MW (or total scattering length). OLIGOMER estimates the molecular weights, used for the calculation of the volume fractions, and the radii of gyration from the experimental or model scattering curves of the components of the mixture. The radii of gyration are used only to determine the average radius of gyration for all mixtures. Therefore, one must enter the real values of the molecular weight of each component (for example, based on the primary sequence) in the following cases:

  • scattering curves from components (i.e. sets of form-factors) are standardized to _I~exp~(0)_ = 1
  • scattering curves from components are calculated by CRYSOL from dummy atom models built by the programs DAMMIN or GASBOR with beads of unequal size
  • sets of form-factors are standardized to equal values of _I~exp~(0)_

If the iForm curves are not properly scaled, the fit will not be affected but the volume fractions will be incorrect. If one just uses predicted CRYSOL curves from different oligomers, no additional normalization of the form-factors is required.

To quickly create a form-factor file from PDB files and/or scattering data files, one can use the program FFMAKER, which creates the form-factor file automatically and can be run in interactive or batch mode. Scattering data can be read either from ASCII *.dat files (column 1 is taken as the s-axis, column 2 as the intensities) or from GNOM output files (the desmeared curve in the “Ireg” column is taken as the intensity). There is no internal scaling in FFMAKER: the curves correspond to the CRYSOL output files (if calculated from PDB models) or to the input scattering curves as they are.

FFMAKER (ATSAS 2.6 and later) also calculates the contrast of each component (for input PDB files). OLIGOMER (ATSAS 2.6 and later) then uses these contrasts to properly normalize the volume fractions of the components. This additional adjustment (a correction of up to 3-4% of the volume fractions) mainly affects heterogeneous systems (e.g. mixtures of protein-DNA or protein-RNA molecules). For homogeneous systems (e.g. mixtures of proteins) it makes no difference whether the contrast information is taken into account, so earlier and later versions of FFMAKER and OLIGOMER give the same results in that case.

OLIGOMER and FFMAKER support absolute and relative paths for the input file names (which can be important for Linux/OS users); by default the programs look for the input files in the current working directory.

To process multiple data sets in the batch mode, simply pass multiple file names on the command line. Most shells will automatically expand wildcard expressions such as *.dat into all matching file names.
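The normalization requirement stated earlier in this section (iForm(s=0) proportional to the square of the molecular weight) can be applied to a manually prepared curve with a one-line scaling. The sketch below is only an illustration: the function name is hypothetical, and it assumes that the first point of the curve is at s = 0 (or close enough to stand in for it).

```python
def rescale_to_molecular_weight(intensities, molecular_weight):
    """Scale a form-factor curve so that its first point equals MW squared.

    Assumes intensities[0] is the forward scattering I(0).
    """
    i0 = intensities[0]
    if i0 <= 0.0:
        raise ValueError("I(0) must be positive")
    factor = molecular_weight ** 2 / i0
    return [factor * value for value in intensities]
```

A curve standardized to I(0) = 1 for a component of molecular weight 10 would then start at 100, with every other point scaled by the same factor.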

Running OLIGOMER with volume/number fraction constraints

NOTE: this option is available in OLIGOMER/FFMAKER only in ATSAS 2.6 and later releases.

If the ratio between the fractions of the components is known, one can enforce it during the OLIGOMER analysis and find the best fit that still satisfies the desired fraction ratio. In general, the fit may become worse than in the unconstrained analysis, but the constraint ensures physically meaningful values of the component volume fractions. The ratios between the components should be provided in the input file for FFMAKER using its option ‘/constrain’ in the following format:


     Volume Fraction Relation Coefficients: 1.0 -1.0 0.0

As an example, the above condition can be applied to an arbitrary three-component system in which the volume fraction of the first component is forced to be equal to that of the second component (i.e. the condition 1.0 × V(1) - 1.0 × V(2) + 0.0 × V(3) = 0.0 is kept). Please note that the number of coefficients in the ratio constraint file must be the same as the number of form-factor curves. In some cases, when the components are mixed in a certain molar ratio, one can constrain the number fractions of the components instead (i.e. keep a certain ratio between the numbers of molecules of the different components). For this, use the following format for the input constraint file:


     Number Fraction Relation Coefficients: 0.0 1.0 -1.0

The above condition means that the number of molecules of component 2 is equal to the number of molecules of component 3. In practice, such a condition can correspond to a partially dissociated binary complex whose subunits were mixed in a 1:1 molar ratio (here the first component is the AB complex, the second is molecule A and the third is molecule B). After running FFMAKER with the option ‘/constrain confileName’, a form-factor file with the following header lines will be produced:


 FormFactor file for OLIGOMER created from: monomer.pdb dimer.pdb tetramer.pdb
Contrasts:  0.42541E+00  0.42513E+00  0.42492E+00
VolumeFractionConstrains:  1.0000 -1.0000  0.0000
 0.00000E+00  0.15004E+08  0.39378E+07  0.19715E+09
 0.25000E-02  0.14993E+08  0.39355E+07  0.19617E+09
 0.50000E-02  0.14959E+08  0.39289E+07  0.19328E+09
 0.75000E-02  0.14904E+08  0.39178E+07  0.18857E+09
-----------   ----------    ---------   ----------
 0.49000E+00  0.53580E+05  0.26081E+05  0.13573E+06
 0.49250E+00  0.52695E+05  0.25858E+05  0.13167E+06
 0.49500E+00  0.51828E+05  0.25639E+05  0.12746E+06
 0.49750E+00  0.50979E+05  0.25425E+05  0.12318E+06
 0.50000E+00  0.50153E+05  0.25215E+05  0.11893E+06

The subsequent run of OLIGOMER with the form-factor file shown above will find the best fit to the data from a linear combination of the components whose volume (or number) fractions satisfy the input ratio constraints.
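For checking a form-factor file by hand, the header layout shown in the listing can be read with a few lines of Python. This is a minimal sketch based only on that listing: the function names are hypothetical, only the `Contrasts:` and `VolumeFractionConstrains:` header lines that appear there are recognized, and lines that are not purely numeric are skipped.

```python
def parse_form_factor_file(text):
    """Split a form-factor file into comment, header values and data rows."""
    lines = text.splitlines()
    parsed = {
        "comment": lines[0].strip(),
        "contrasts": None,
        "volume_constraint": None,
        "rows": [],  # each row: [s, I_1(s), I_2(s), ...]
    }
    for line in lines[1:]:
        stripped = line.strip()
        if stripped.startswith("Contrasts:"):
            values = stripped.split(":", 1)[1].split()
            parsed["contrasts"] = [float(v) for v in values]
        elif stripped.startswith("VolumeFractionConstrains:"):
            values = stripped.split(":", 1)[1].split()
            parsed["volume_constraint"] = [float(v) for v in values]
        else:
            try:
                row = [float(v) for v in stripped.split()]
            except ValueError:
                continue  # not a numeric data row
            if row:
                parsed["rows"].append(row)
    return parsed


def satisfies_constraint(coefficients, fractions, tolerance=1e-6):
    """Check that sum(c_i * fraction_i) is zero within the tolerance."""
    total = sum(c * f for c, f in zip(coefficients, fractions))
    return abs(total) <= tolerance
```

With the coefficients `1.0 -1.0 0.0`, fractions such as 0.3, 0.3 and 0.4 satisfy the relation, while 0.5, 0.3 and 0.2 do not.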

Command-Line Arguments and Options

OLIGOMER accepts the following command line arguments:

| Argument | Description |
|----------|-------------|
| FILE(S) | Experimental data file(s). |

| Option | Description |
|--------|-------------|
| --ff | Name of the form-factor input file. |
| --fit | Name of the output fit file. |
| --un | Angular units (1: Å^-1^, default; 2: nm^-1^; 3: 2πs Å^-1^; 4: 2πs nm^-1^). |
| --smin | Minimum value of the angular axis s [Å^-1^]. |
| --smax | Maximum value of the angular axis s [Å^-1^]. |
| --out | Name of the output file; the default is oligomer.log. |
| --svd | Singular value decomposition analysis of the components for multiple data curves, e.g. concentration series or data measured at different pH, time, temperature, pressure, salt concentration, buffer composition, ligand addition, etc. |
| --compar | Comparative analysis of the fits from all possible combinations of the form-factor curves. This option is useful to separate the most significant components contributing to the fitted curve from the components that only slightly influence the fit. |
| --cst | Adds a constant as an additional component. The default value is 10 times greater than the maximum I(0) value in the form-factor file. This choice keeps the volume fraction of the “constant” component equal to 0 while correcting the scattering curve for a parasitic constant background. In batch mode only the option “--cst” is given (without a value for the constant); the value of the constant can be changed only in interactive mode. |
| --ws | Overwrite the output log (default: append). |
| --brief | Brief output in the .log file. |

Please note: if OLIGOMER is run in batch mode on several data sets that have different starting s-values and the ‘--smin’ option is not used, it automatically determines the largest of the starting s-values and uses it as the lower limit for fitting all data sets. If the ‘--smin’ option is given explicitly, all data sets are fitted starting from the specified ‘smin’ value. If OLIGOMER is run on a single data set without the ‘--smin’ option, the complete angular range of the data is used for fitting.
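The rule for the lower fitting limit in batch mode can be written out in a few lines. This is only a sketch of the behaviour described in the note, with a hypothetical function name; each data set is represented by the list of its s-values.

```python
def common_smin(datasets, smin=None):
    """Lower fitting limit shared by all data sets.

    datasets: list of lists of s-values, one list per data set.
    smin: explicitly requested lower limit, or None if not given.
    """
    if smin is not None:
        return smin  # an explicit value is applied to every data set
    # Otherwise use the largest of the individual starting s-values.
    return max(min(s_values) for s_values in datasets)
```

For a single data set without an explicit limit the function returns that data set's first s-value, i.e. the complete angular range is used.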

Interactive Configuration

OLIGOMER reads experimental data and form-factors (both should be in ASCII format). After starting OLIGOMER you may specify:

| Prompt | Possible value(s) | Default value | Description |
|--------|-------------------|---------------|-------------|
| Program option | 0 or 1 | 0 | Option 0 runs OLIGOMER for multiple data sets using one form-factor file; option 1 uses multiple form-factor files for fitting one data file. |
| Input data, form-factor file name | filename | ff.dat | The input file should contain intensity values from at least one component. It can be prepared by the FFMAKER program from *.pdb, *.dat and *.out (GNOM) files, or manually by composing the following columns: 1st column - s-axis, 2nd column - intensity from component 1, 3rd column - intensity from component 2, etc. |
| Use constant as additional component | Y or N | N | If ‘Yes’, an additional constant component is added, which can improve the fit at higher angles if required. |
| Calculated MW and Rg | X1,X2 | var | Default values are estimated from the form-factor intensity curves. |
| Experimental data file name | filename | lys.dat | The name of the file containing the experimental SAXS profile of the multicomponent system. |
| Angular units in the input file: 4π sin(θ)/λ [1/angstrom] (1), 4π sin(θ)/λ [1/nm] (2), 2 sin(θ)/λ [1/angstrom] (3), 2 sin(θ)/λ [1/nm] (4) | 1-4 | 1 | Formula for the scattering vector in the data file and its units. |
| Range for evaluation of scattering | X1,X2 | var | Default values are estimated from the full range of the experimental data file. |
| Use of non-negativity condition | Y or N | Y | If ‘Yes’, the volume fractions are required to be positive. |
| Output fit file | filename | *.fit | By default: experimental file name + .fit extension. |
| Plot the result | Y or N | Y | If ‘Yes’, the fit curve and the experimental curve are plotted. |
| Process another data set | Y or N | Y | If ‘Yes’, the fitting procedure can be repeated for another data set. |
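The four angular unit choices differ only by constant factors, which follow from s = 4π sin(θ)/λ and 1 nm = 10 Å. The sketch below converts an s-axis given in any of the four conventions of the interactive prompt to 4π sin(θ)/λ in 1/angstrom; the function name is hypothetical and the code does not claim to reproduce OLIGOMER's internal handling.

```python
import math

# Multiplicative factors that bring each convention to 4*pi*sin(theta)/lambda in 1/angstrom.
UNIT_FACTORS = {
    1: 1.0,             # 4*pi*sin(theta)/lambda, 1/angstrom
    2: 0.1,             # 4*pi*sin(theta)/lambda, 1/nm
    3: 2.0 * math.pi,   # 2*sin(theta)/lambda, 1/angstrom
    4: 0.2 * math.pi,   # 2*sin(theta)/lambda, 1/nm
}


def to_inverse_angstrom(s_values, unit_code):
    """Convert an angular axis to s = 4*pi*sin(theta)/lambda in 1/angstrom."""
    if unit_code not in UNIT_FACTORS:
        raise ValueError("unit code must be 1, 2, 3 or 4")
    factor = UNIT_FACTORS[unit_code]
    return [factor * s for s in s_values]
```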

Runtime Output

At runtime, two lines of output are generated:

Chisquare  MW   Rg   Volume fractions+-errors

 1.5     22.5    1.7     0.600+-0.001 0.400+-0.001

The fields can be interpreted as follows, top-left to bottom-right:

| Field | Description |
|-------|-------------|
| Chisquare | The quality of the fit (discrepancy value). |
| MW | The averaged molecular mass of the system. |
| Rg | The averaged radius of gyration of the system. |
| Volume fractions+-errors | The volume fractions of the components in the mixture and their errors. The sequence of the components is the same as in the form-factor file. |
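When the runtime output is post-processed by a script, the result line can be split into its fields as sketched below. The function name and the dictionary keys are hypothetical; the parsing relies only on the layout shown above (three numbers followed by `value+-error` pairs) and does not handle the log-file rows that start with a file name or contain ‘N/A’ entries.

```python
def parse_result_line(line):
    """Parse 'chi2 MW Rg f1+-e1 f2+-e2 ...' into a dictionary."""
    tokens = line.split()
    chi2, mw, rg = (float(token) for token in tokens[:3])
    fractions = []
    for token in tokens[3:]:
        value, _, error = token.partition("+-")
        fractions.append((float(value), float(error)))
    return {"chi2": chi2, "mw": mw, "rg": rg, "fractions": fractions}
```

Applied to the example line above, the `fractions` entry holds one `(value, error)` pair per component, in the order of the form-factor file.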

OLIGOMER Input Files

The experimental data file is composed as follows:


SAXS experimental data
experimental description
0.177473E+00  0.737508E+02  0.169651E+01
0.180129E+00  0.728071E+02  0.155912E+01
.....            ....          ....

The first two lines are reserved for the sample description and further details. Starting from the third line, the first column contains the s-values, the second column the scattering intensities and the third column the experimental errors.

The form-factor file consists of the individual form-factors of the components. If they are derived with the program CRYSOL, the second column of the calculated form-factors has to be used.


0.00E+00  4.30E+08  3.31E+09  ...
1.00E-02  4.23E+08  3.19E+09  ...
2.00E-02  4.03E+08  2.87E+09  ...
 ....        ...       ...    ...

The first column is the s-value, the second column is the first form-factor and the third column is the second form-factor. The two input files can have different s-values and different lengths.
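Because the two files may be sampled on different s-grids, a script that combines them has to bring the form-factors onto the experimental grid. The sketch below reads the experimental file layout described above and uses linear interpolation; the function names are hypothetical, and linear interpolation is an assumption made for illustration, not a statement about how OLIGOMER resamples the curves.

```python
from bisect import bisect_right


def read_experimental_data(text):
    """Return (s, intensity, error) tuples, skipping the two header lines."""
    rows = []
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            s, intensity, error = (float(p) for p in parts[:3])
        except ValueError:
            continue  # not a numeric data row
        rows.append((s, intensity, error))
    return rows


def interpolate(grid, values, s):
    """Linearly interpolate values given on an ascending grid at the point s."""
    if not grid[0] <= s <= grid[-1]:
        raise ValueError("s lies outside the tabulated range")
    upper = bisect_right(grid, s)
    if upper == len(grid):
        return values[-1]  # s coincides with the last grid point
    lower = upper - 1
    t = (s - grid[lower]) / (grid[upper] - grid[lower])
    return values[lower] + t * (values[upper] - values[lower])
```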

OLIGOMER Output Files

OLIGOMER creates an output file named oligomer.log by default, or with the name specified using the --out option. If the file already exists, the results are appended to it.


         ***   O L I G O M E R ***
================================================================================
                                              01-Sep-2009   17:26:17
 Option 0: a set of form-factors and several sets of experimental data
 Form-factor file              test.dat
 Real oligomer weights          20849   58193
 Oligomer radii of gyration    22.57  34.37
 Experimental data file        Merge00.dat    Range of Scattering angle:    0.02    0.27
 Using non-negativity condition
 Output file                   Merge00.fit
 ChiSquare     <MW><Rg>        Volume fractions +- errors
--------------------------------------------------------------------------------
Merge00.da    2.17      52869     33.82      0.143+-0.004  0.857+-0.002
================================================================================

If the brief option (--brief) is used, only the last line with the results is written.

If the --svd option (singular value decomposition analysis) is used, the files Ncomp.log, S.dat, U.dat and V.dat are created. They contain the singular values (S.dat), the singular vectors (U.dat and V.dat) and a statistical estimate of the number of components (Ncomp.log), obtained from the singular value decomposition of the experimental data series in the angular range defined by the --smin and --smax options.

If the --compar option is used, OLIGOMER calculates the fits and the volume fractions for all possible combinations of the form-factor components. The output file oligomer.log then contains this information as a table in which the missing (non-present) components are denoted as ‘N/A’ (an example of oligomer.log with four components is shown below). One can use these results to determine the most significant components present in the mixture.


     ExpData         chi       <MW><Rg>   monomer.pdb   dimer.pdb     tetramer.pdb  octamer.pdb
    atx1_021.dat    2.68       3912     28.34  0.214+-0.001  0.656+-0.002  0.130+-0.001  N/A  +-  N/A
    atx1_021.dat    2.68       3912     21.26  0.214+-0.001  0.656+-0.002  0.130+-0.001  0.000+-0.000
    atx1_021.dat    3.83       4026     25.63  0.170+-0.001  0.783+-0.001  N/A  +-  N/A  0.047+-0.000
    atx1_021.dat    4.67       4128     18.62  N/A  +-  N/A  0.926+-0.001  0.074+-0.001  N/A  +-  N/A
    atx1_021.dat    4.67       4128     24.61  N/A  +-  N/A  0.926+-0.001  0.074+-0.001  0.000+-0.000
    atx1_021.dat    4.96       3690     18.80  0.097+-0.001  0.903+-0.001  N/A  +-  N/A  N/A  +-  N/A
    atx1_021.dat    4.97       4157     18.70  N/A  +-  N/A  0.972+-0.001  N/A  +-  N/A  0.028+-0.000
    atx1_021.dat    5.34       3874     18.92  N/A  +-  N/A  1.000+-0.000  N/A  +-  N/A  N/A  +-  N/A
    atx1_021.dat   10.53       3939     17.29  0.632+-0.001  N/A  +-  N/A  0.368+-0.000  N/A  +-  N/A
    atx1_021.dat   10.53       3939     40.50  0.632+-0.001  N/A  +-  N/A  0.368+-0.000  0.000+-0.000
    atx1_021.dat   16.76       4302     17.42  0.808+-0.000  N/A  +-  N/A  N/A  +-  N/A  0.192+-0.000
    atx1_021.dat   25.50       7294     16.47  N/A  +-  N/A  N/A  +-  N/A  1.000+-0.000  0.000+-0.000
    atx1_021.dat   25.50       7294     18.92  N/A  +-  N/A  N/A  +-  N/A  1.000+-0.000  N/A  +-  N/A
    atx1_021.dat   28.66       1984     18.92  1.000+-0.000  N/A  +-  N/A  N/A  +-  N/A  N/A  +-  N/A
    atx1_021.dat   43.25      14023     18.92  N/A  +-  N/A  N/A  +-  N/A  N/A  +-  N/A  1.000+-0.000
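The number of rows in such a table follows from the number of non-empty subsets of the components: four form-factor curves give 2^4 - 1 = 15 combinations, which matches the 15 rows of the example. A minimal sketch of the enumeration, with a hypothetical function name, is:

```python
from itertools import combinations


def component_subsets(names):
    """List every non-empty combination of components, largest subsets first."""
    subsets = []
    for size in range(len(names), 0, -1):
        subsets.extend(combinations(names, size))
    return subsets
```

The order produced here is by subset size; the table above appears to be ordered by the discrepancy value instead.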

OLIGOMER also creates a *.fit file that contains 4 columns: s-axis (1st column), experimental intensity Iexp (2nd column), errors of the intensity Ierr (3rd column) and the fit from OLIGOMER Ifit (4th column). NOTE: The above format is valid for ATSAS 2.7 and later versions. In earlier versions (up to ATSAS 2.6) the output fit file contained 5 columns: s-axis (1st column), experimental intensity Iexp (2nd column), fit from OLIGOMER Ifit (3rd column), errors of the intensity Ierr (4th column) and the difference curve Iexp-Ifit (5th column).
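Scripts that read *.fit files from different ATSAS versions need to account for the two column orders. The sketch below normalizes a single data row to the order (s, Iexp, Ierr, Ifit) based on the number of columns; the function name is hypothetical, and any header lines a fit file may contain have to be skipped by the caller.

```python
def parse_fit_row(line):
    """Return (s, i_exp, i_err, i_fit) from a 4- or 5-column fit-file row."""
    values = [float(token) for token in line.split()]
    if len(values) == 4:
        # ATSAS 2.7 and later: s, Iexp, Ierr, Ifit
        s, i_exp, i_err, i_fit = values
    elif len(values) == 5:
        # Up to ATSAS 2.6: s, Iexp, Ifit, Ierr, Iexp-Ifit
        s, i_exp, i_fit, i_err, _difference = values
    else:
        raise ValueError("expected 4 or 5 columns")
    return s, i_exp, i_err, i_fit
```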

Examples

Example 1

To fit the experimental data set data.dat using the form-factor file form-factors.dat with the angular axis in nm^-1^:

$ oligomer --ff form-factors.dat --un=2 data.dat

Example 2

To process all experimental data files whose names begin with “m” and have the “dat” extension, using the form-factor file ff2.dat. The experimental curves are fitted starting from smin=0.5:

$ oligomer --ff ff2.dat --smin 0.5 m*.dat
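Where the shell does not expand wildcards, or when OLIGOMER is started from a script, the file list of Example 2 can be assembled explicitly. The sketch below only builds the argument list and does not run the program; the function name is hypothetical and the options are spelled exactly as in the examples above.

```python
import glob


def build_oligomer_command(form_factor_file, pattern, smin=None):
    """Assemble the argument list for a batch run over all matching files."""
    command = ["oligomer", "--ff", form_factor_file]
    if smin is not None:
        command += ["--smin", str(smin)]
    # Expand the wildcard ourselves; sorting keeps the order reproducible.
    return command + sorted(glob.glob(pattern))
```

The returned list can be handed to `subprocess.run` on a system where OLIGOMER is installed and on the search path.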