Name

sxk_means - K-means classification of a set of images

Usage

Usage on the command line:

sxk_means.py stack outdir <maskfile> --K=10 --trials=2 --debug --opt_method='cla' --maxit=100 --rand_seed=10 --crit='all' --F=0.9 --T0=2.0 --init_method='rnd' --normalize --CTF --MPI --CUDA

Usage in Python:

k_means_main(stack, out_file, maskname, opt_method, K1, K2, rand_seed, maxit, trials, CTF, F, T0, MPI=False, CUDA=False, DEBUG=False, flagnorm=False)

Example:

sxk_means.py hri_stack.hdf RES mask2d_23.hdf --opt_method="SSE" --K=128 --maxit=500 --crit="D"

sxk_means.py bdb:hri_stack RES mask2d_23.hdf --opt_method="SSE" --K=128 --maxit=1000 --rand_seed=100 --T0=2.5 --F=0.995 --MPI

sxk_means.py bdb:hri_stack RES mask2d_13.hdf --K=212 --maxit=10000 --rand_seed=10 --T0=-1 --F=0.995 --CUDA

Note: the 2D input images have to be aligned (see sxali2d).
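When the program is launched from a script, it can help to assemble the argument list programmatically. The following is a minimal Python sketch under stated assumptions: the helper name build_sxk_means_command is hypothetical and not part of sxk_means, and it only formats options in the --name=value and --switch forms shown in the usage line above. It does not run the program or validate option names.

```python
import shlex


def build_sxk_means_command(stack, outdir, maskfile=None, **options):
    """Assemble an sxk_means.py argument list (hypothetical helper).

    Options given as True become bare switches (e.g. --MPI);
    options given as False or None are left out;
    everything else is written as --name=value.
    """
    cmd = ["sxk_means.py", stack, outdir]
    if maskfile is not None:
        cmd.append(maskfile)  # the mask file is an optional positional argument
    for name, value in options.items():
        if value is True:
            cmd.append("--" + name)
        elif value is False or value is None:
            continue
        else:
            cmd.append("--%s=%s" % (name, value))
    return cmd


if __name__ == "__main__":
    cmd = build_sxk_means_command(
        "bdb:hri_stack", "RES", "mask2d_23.hdf",
        opt_method="SSE", K=128, maxit=1000, rand_seed=100,
        T0=2.5, F=0.995, MPI=True,
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.run once the output directory name has been checked.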

Input

stack
The input stack of images
maskfile
optional mask file to be used
outdir
name of the directory where the results are written
  • The parameters preceded by -- are optional; default values are given in parentheses.

  • K
    The requested number of clusters (2).
    trials
    number of trials of K-means (see description below) (default one trial). NOT USED in CUDA version.
    opt_method
    optimization method: 'SSE' or 'cla' (default is SSE) (see description below). NOT USED in CUDA version.
    maxit
    maximum number of iterations the program will perform (default 100)
    CTF
    if set, CTF information stored in file headers will be used (default no CTF). NOT USED in CUDA version.
    rand_seed
    the seed used to generate random numbers (default -1, meaning a different pseudo-random seed each time)
    crit

    names of the criteria to use: 'all' for all criteria, 'C' Coleman, 'H' Harabasz or 'D' Davies-Bouldin. These criteria return values measuring the quality of the classification; see also sxk_means_groups. They can be freely combined, e.g. 'CD', 'HC', 'CHD'. The CUDA version always returns all criteria, equivalent to 'CHD'.

    T0
    simulated annealing: the initial temperature at which the algorithm starts (default 0.0, which turns simulated annealing off)
    F
    simulated annealing: the cooling factor by which the temperature is decreased after each iteration, T = T * F (default 0.0, which turns simulated annealing off)
    MPI
    use the MPI version of k-means (default False; can be combined with the CUDA option to run on a GPU cluster)
    CUDA
    use the CUDA version of k-means (default False; can be combined with the MPI option to run on a GPU cluster)
    normalize
    Normalize images under the mask
    init_method
    Method used to initialize the partition: "rnd" for random initialization or "d2w" for d2 weighting initialization (default is rnd)
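The T0 and F options describe a geometric cooling schedule, T = T * F after each iteration. A minimal Python sketch of that schedule alone follows. The function names are hypothetical and not part of sxk_means, the stopping threshold t_min is an illustrative assumption, and how the temperature is used inside the clustering itself is not shown.

```python
def cooling_schedule(t0, f, n_iter):
    """Return the temperature at the start of each of n_iter iterations.

    Applies T = T * F once per iteration, starting from T0.
    """
    temps = []
    t = t0
    for _ in range(n_iter):
        temps.append(t)
        t *= f
    return temps


def iterations_to_cool(t0, f, t_min):
    """Count the iterations needed for the temperature to fall below t_min.

    t_min is a hypothetical threshold used only for this illustration.
    """
    if t0 <= 0.0 or t_min <= 0.0 or not (0.0 < f < 1.0):
        raise ValueError("expected t0 > 0, t_min > 0 and 0 < f < 1")
    n = 0
    t = t0
    while t >= t_min:
        t *= f
        n += 1
    return n


if __name__ == "__main__":
    # Values taken from the example command lines: --T0=2.5 --F=0.995
    print(cooling_schedule(2.5, 0.995, 5))
    print(iterations_to_cool(2.5, 0.995, 0.01))
```

A cooling factor close to 1 (such as 0.995) lowers the temperature slowly, so it may be worth comparing the number of iterations the schedule needs with the value given to --maxit.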

    Output

    outdir
    The directory to which the averages of the K clusters and the variances are written. The classification charts are written to the logfile. In the CUDA version, the classification charts and the variances are not exported.

    Warning: If the output directory already exists, the program will abort with an error message. Please choose a different directory name and restart the program.

    The program will write two kinds of image stack files:

    The averages have the following attributes set:

    The variances have the following attributes set:

    Description

    Reference

    Author / Maintainer

    Julien Bert

    Keywords

    category 1
    APPLICATIONS

    Files

    statistics.py, sxk_means.py

    See also

    sxk_means_groups sxk_means_stable

    Maturity

    beta
    works for author, often works for others.

    Bugs

    HDF file: an HDF file has a limit on the number of items the header can contain (~16000). If 'members' (the list of images assigned to each class) has more than 16000 elements, all assignments are automatically exported to text files: kmeans_grp_00.txt, kmeans_grp_01.txt, etc. Each file contains the list of image IDs assigned to that class.
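When the assignments are exported to text files, they can be read back with a few lines of Python. This is a minimal sketch under stated assumptions: the function name read_kmeans_groups is hypothetical, the files are assumed to contain whitespace-separated integer image IDs, and the directory holding them has to be supplied by the caller, since this page does not state the exact file layout or location.

```python
import glob
import os


def read_kmeans_groups(directory):
    """Read kmeans_grp_NN.txt files into a {class number: [image IDs]} dict.

    Assumes each file holds whitespace-separated integer image IDs.
    """
    prefix = "kmeans_grp_"
    suffix = ".txt"
    groups = {}
    pattern = os.path.join(directory, prefix + "*" + suffix)
    for path in sorted(glob.glob(pattern)):
        name = os.path.basename(path)
        # The class number is the part between the prefix and the extension.
        class_id = int(name[len(prefix):-len(suffix)])
        with open(path) as handle:
            groups[class_id] = [int(token) for token in handle.read().split()]
    return groups
```

For example, read_kmeans_groups("RES") would return a dictionary keyed by class number, provided the files were written to a directory named RES.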

    sxk means (last edited 2010-07-27 20:01:40 by ranlin)