datclass
Manual
The following sections briefly describe how to run DATCLASS from the command line, the required input, and its runtime output.
Introduction
DATCLASS applies machine learning methods to rapidly classify the particle shape and estimate Molecular Weight (Da) and Dmax (\(\AA\)) from SAXS patterns. Possible shape classes are:
- compact
- extended
- flat
- ring
- compact-hollow
- hollow-sphere
- random-chain
- unknown
Please note that no parameter estimates are provided for objects classified as either “random-chain” or “unknown”. The output will show “N/A” instead.
DATCLASS requires external files to run. If DATCLASS reports:
error: shape classifier initialization failed
please verify your installation and/or make sure that the ATSAS environment variable is set correctly.
Running datclass
Usage:
$ datclass [OPTIONS] <SASDATA(S)>
The OPTIONS known by DATCLASS are described in the next section, and the required argument(s) SASDATA(S) in the section on input files.
Command-Line Arguments and Options
DATCLASS requires the following command line arguments:
Argument | Description |
---|---|
SASDATA | One or more experimental SAS data (.dat) or regularised SAS data (.out) files. |
Absolute as well as relative paths to data files are accepted. Instead of a file name, one of the arguments may be given as ‘-’ to read data from stdin.
DATCLASS recognizes the following command-line options:
| Short option | Long option | Description |
|---|---|---|
| | --rg <VALUE> | Experimental Radius of Gyration in the units of the data. This option is mandatory for experimental SAS data (.dat). |
| | --i0 <VALUE> | Experimental forward scattering in the units of the data. This option is mandatory for experimental SAS data (.dat). |
| | --first <N> | Index of the first point to be used. Default: 1. |
| | --query <N> | Query and print the N nearest neighbours of the input data and exit. Default: N=5. |
| | --features | Print feature vector of input data and exit. |
| -v | --version | Print version information and exit. |
| -h | --help | Print a summary of arguments, options, and exit. |
Runtime output
DATCLASS output consists of one result line for each input file with the following values: shape classification, MW (Da), Dmax (\(\AA\)), file name.
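The result line described above can be split into its four values with a few lines of Python. This is a minimal sketch, not part of DATCLASS: the names `DatclassResult` and `parse_result_line` are hypothetical, and it assumes that the fields are separated by whitespace and that “N/A” appears in place of both MW and Dmax for “random-chain” and “unknown” objects.

```python
from typing import NamedTuple, Optional


class DatclassResult(NamedTuple):
    """One DATCLASS result line (hypothetical container, not an ATSAS type)."""
    shape: str
    mw_da: Optional[float]   # None when DATCLASS printed "N/A"
    dmax: Optional[float]    # None when DATCLASS printed "N/A"
    filename: str


def _number_or_none(field: str) -> Optional[float]:
    # "N/A" marks a parameter that was not estimated.
    return None if field == "N/A" else float(field)


def parse_result_line(line: str) -> DatclassResult:
    # Assumption: fields are whitespace-separated and the file name comes last.
    # Splitting at most three times keeps file names containing spaces intact.
    shape, mw, dmax, filename = line.strip().split(None, 3)
    return DatclassResult(shape, _number_or_none(mw), _number_or_none(dmax), filename)


print(parse_result_line("compact 77594 103.48 bsa.dat"))
```

Returning `None` rather than a sentinel number keeps missing estimates from being mistaken for real values in later calculations.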
datclass input files
DATCLASS expects background-subtracted experimental SAS data (.dat) or regularised SAS data (.out) files.
If SASDATA is a regularised SAS data (.out) file, the reciprocal-space \(R_g\) and \(I(0)\) values stated in the file are used, but they may be overridden by the corresponding command-line options.
Multiple inputs may be provided at once, but note that any --rg and --i0 values specified are applied identically to all inputs.
Examples
$ datclass --rg=3 --i0=65.1 bsa.dat
compact 77594 103.48 bsa.dat
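Because any --rg and --i0 values given on the command line apply to every input, experimental (.dat) files with different \(R_g\) and \(I(0)\) need one invocation each. The following Python sketch shows one way to script this. The helper names `build_commands` and `run_all` are hypothetical, as are the file `lysozyme.dat` and its values; the `--rg=VALUE` spelling follows the example above, and `datclass` is assumed to be on the PATH.

```python
import subprocess
from typing import Dict, List, Tuple


def build_commands(inputs: Dict[str, Tuple[float, float]]) -> List[List[str]]:
    """Return one datclass argument list per file, each with its own Rg and I(0)."""
    commands = []
    for path, (rg, i0) in inputs.items():
        commands.append(["datclass", f"--rg={rg}", f"--i0={i0}", path])
    return commands


def run_all(inputs: Dict[str, Tuple[float, float]]) -> List[str]:
    """Run datclass once per file and collect the result lines."""
    results = []
    for argv in build_commands(inputs):
        # check=True raises CalledProcessError if datclass exits with an error.
        done = subprocess.run(argv, capture_output=True, text=True, check=True)
        results.append(done.stdout.strip())
    return results


# Illustrative input: file name -> (Rg, I(0)), in the units of the data.
commands = build_commands({"bsa.dat": (3, 65.1), "lysozyme.dat": (1.5, 12.0)})
for argv in commands:
    print(" ".join(argv))
```

Regularised (.out) files carry their own \(R_g\) and \(I(0)\), so they can be passed together in a single call without these options.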