Manual

The following sections briefly describe how to run DATCLASS from the command line, the required input, and the runtime output.

Introduction

DATCLASS applies machine learning methods to rapidly classify the particle shape and to estimate the molecular weight (MW, in Da) and Dmax (in \(\AA\)) from SAXS patterns. The possible shape classes are:

  • compact
  • extended
  • flat
  • ring
  • compact-hollow
  • hollow-sphere
  • random-chain
  • unknown

Please note that no parameter estimates are provided for objects classified as either “random-chain” or “unknown”. The output will show “N/A” instead.

DATCLASS requires external files to run. If DATCLASS reports:

error: shape classifier initialization failed

please verify your installation and/or make sure that the ATSAS environment variable is set correctly.
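
A quick way to rule out the environment variable as the cause is to inspect it before running DATCLASS. The sketch below only reports whether ATSAS is set and non-empty. The helper name check_atsas_env is made up for illustration, and the value the variable should hold depends on your installation and is not specified here.

```shell
# Hypothetical helper: report whether the ATSAS environment variable is set.
# It does not validate the value; the correct value is installation-specific.
check_atsas_env() {
    if [ -z "${ATSAS:-}" ]; then
        echo "ATSAS is not set or is empty"
        return 1
    fi
    echo "ATSAS=${ATSAS}"
}

# Show the current state; '|| true' keeps a strict shell (set -e) from aborting.
check_atsas_env || true
```

If the variable is reported as set and the error persists, the installation itself is the next thing to check.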

Running datclass

Usage:

$ datclass [OPTIONS] <SASDATA(S)>

The OPTIONS known by DATCLASS are described in the next section; the required argument(s) SASDATA(S) are described in the section on input files.

Command-Line Arguments and Options

DATCLASS requires the following command line arguments:

Argument    Description
SASDATA     One or more experimental SAS data (.dat) or regularised SAS data (.out) files.

Both absolute and relative paths to data files are accepted. Instead of a file name, one of the arguments may be given as ‘-’ to read data from stdin.

DATCLASS recognizes the following command-line options:

Short option  Long option   Description
              --rg <VALUE>  Experimental radius of gyration in the units of the data. This option is mandatory for experimental SAS data (.dat).
              --i0 <VALUE>  Experimental forward scattering in the units of the data. This option is mandatory for experimental SAS data (.dat).
              --first <N>   Index of the first point to be used. Default: 1.
              --query <N>   Query and print the N nearest neighbours of the input data and exit. Default: N=5.
              --features    Print the feature vector of the input data and exit.
-v            --version     Print version information and exit.
-h            --help        Print a summary of arguments and options, then exit.

Runtime output

DATCLASS prints one result line per input file, containing the following values: shape classification, MW (Da), Dmax (\(\AA\)), file name.
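
Because every result line carries the same four values, the output is easy to post-process. The following is a minimal sketch, assuming the fields are separated by whitespace (as in the example run in the Examples section) and that missing estimates appear as the single token N/A. The function name results_to_csv and the CSV column order are illustrative choices, not part of DATCLASS.

```shell
# Hypothetical helper: convert datclass result lines read from stdin into CSV
# rows of the form file,shape,mw_da,dmax.
results_to_csv() {
    awk 'NF >= 4 {
        # Field 1: shape class, field 2: MW, field 3: Dmax.
        # Everything from field 4 onward is treated as the file name.
        name = $4
        for (i = 5; i <= NF; i++) name = name " " $i
        printf "%s,%s,%s,%s\n", name, $1, $2, $3
    }'
}
```

Used as `datclass --rg=3 --i0=65.1 bsa.dat | results_to_csv`, the result line shown in the Examples section would become `bsa.dat,compact,77594,103.48`. File names are not quoted, so names containing commas would need extra handling.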

datclass input files

DATCLASS expects background-subtracted experimental SAS data (.dat) or regularised SAS data (.out) files.

If SASDATA is a regularised SAS data (.out) file, the reciprocal-space \(R_g\) and \(I(0)\) stated in the file are used, but they may be overridden by the corresponding command-line options.

Multiple inputs may be provided at once, but note that any --rg and --i0 values specified are applied identically to all inputs.
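
When the inputs have different \(R_g\) and \(I(0)\) values, one workaround is to run DATCLASS once per file. The sketch below assumes a hypothetical list format, one whitespace-separated line FILE RG I0 per input, read from stdin. The function name classify_each is likewise made up, and file names containing spaces are not handled.

```shell
# Hypothetical helper: run datclass once per input so that each file gets
# its own --rg and --i0 values. Reads lines "FILE RG I0" from stdin.
classify_each() {
    while read -r file rg i0; do
        # Skip blank lines and lines starting with '#'.
        case "$file" in
            ''|'#'*) continue ;;
        esac
        # Detach datclass from stdin so it cannot consume the remaining list.
        datclass --rg="$rg" --i0="$i0" "$file" < /dev/null
    done
}
```

With the list stored in a file (here an assumed name, inputs.txt), the call would be `classify_each < inputs.txt`, producing one result line per listed file.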

Examples

$ datclass --rg=3 --i0=65.1 bsa.dat
compact      77594     103.48   bsa.dat