SelvarClust software


Context :

SelvarClust is software implemented in object-oriented C++ and devoted to variable selection in model-based clustering. It implements the greedy algorithm associated with the SR modeling proposed by C. Maugis, G. Celeux and M.-L. Martin-Magniette in [1] and [2], which modifies the method of Raftery and Dean [3]. The software handles data in which individuals are described by quantitative block variables. It returns a clustering of the data and the selected model, which consists of the number of clusters, the mixture form and the variable partition.

Main references :



Installing in Linux :

  1. SelvarClust uses the Mixmod software (version 2.1.1), available here.
    First, install Mixmod (see the Quick Start for installation help).
    In the following, mixmodDir denotes the full path of the directory where Mixmod is located.


  2. Add Mixmod to the path by running the following command in the bash shell: export PATH=mixmodDir/Mixmod/BIN:$PATH


  3. Download the following .zip file containing the .cpp files, the .hpp files and the Makefile for Linux.

    SelvarClust.zip

    Unzip SelvarClust.zip into a directory. In the following, the full path of this directory is called SelvarClustDir.
    Compile with the command make, which creates the executable SelvarClust. You can add this executable to the path in the bash shell with the command export PATH=SelvarClustDir:$PATH
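
The two export commands above can be combined and checked in one step. The following is a minimal sketch, assuming hypothetical install locations under the home directory; replace the values of mixmodDir and SelvarClustDir with the real paths on your machine.

```shell
# Hypothetical locations (assumptions for this sketch, not taken from the page).
mixmodDir="$HOME/mixmod"
SelvarClustDir="$HOME/SelvarClust"

# Prepend the Mixmod binaries and the SelvarClust directory to PATH
# for the current shell session.
export PATH="$mixmodDir/Mixmod/BIN:$SelvarClustDir:$PATH"

# Report whether the SelvarClust executable is now visible.
if command -v SelvarClust >/dev/null 2>&1; then
    echo "SelvarClust found at: $(command -v SelvarClust)"
else
    echo "SelvarClust not found on PATH; check SelvarClustDir" >&2
fi
```

These settings last only for the current session; adding the export line to a shell startup file such as ~/.bashrc would make them persistent.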

Arguments and Usage in Linux:

To run the SelvarClust algorithm, use the following command: nohup ./SelvarClust Arg1 Arg2 Arg3 Arg4 Arg5

with the following arguments :

Arg1 : path of the file containing the data (Example : /home/example/Data.txt)
Arg2 : path of the file containing the block variable sizes (Example : /home/example/Variablesize.txt)
Variablesize.txt contains a column giving the size of each block variable.
Arg3 : path of the file containing the considered cluster numbers (Example : /home/example/NbClusters.txt)
NbClusters.txt contains a column giving the considered numbers of Gaussian mixture components.
Arg4 : path of the file containing the considered Gaussian mixture forms (Example : /home/example/MixtureForms.txt)
MixtureForms.txt contains a column giving the number of each considered Gaussian mixture form according to the correspondence table.
Arg5 : path of the directory where the results will be saved (Example : /home/Results/)
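
As an illustration of the arguments above, the following sketch builds the three single-column input files and then calls SelvarClust. It reads "a column" as one value per line, which is an assumption. The working directory, the block sizes, the cluster numbers and the mixture-form numbers are all made-up placeholders; in particular, the mixture-form numbers must be taken from the correspondence table. The data file is not created here because its layout is not described in this section.

```shell
# Hypothetical working directory for this sketch.
workDir="${TMPDIR:-/tmp}/selvarclust_example"
mkdir -p "$workDir/Results"

# Arg2: one block size per line (placeholder values: three blocks of sizes 1, 2, 1).
printf '%s\n' 1 2 1 > "$workDir/Variablesize.txt"

# Arg3: one candidate number of mixture components per line (placeholder values).
printf '%s\n' 2 3 4 > "$workDir/NbClusters.txt"

# Arg4: one mixture-form number per line. The values 1 and 2 are placeholders;
# look up the real numbers in the correspondence table.
printf '%s\n' 1 2 > "$workDir/MixtureForms.txt"

# Run only if the executable is on PATH and the user has supplied Data.txt (Arg1).
if command -v SelvarClust >/dev/null 2>&1 && [ -f "$workDir/Data.txt" ]; then
    nohup SelvarClust "$workDir/Data.txt" \
        "$workDir/Variablesize.txt" \
        "$workDir/NbClusters.txt" \
        "$workDir/MixtureForms.txt" \
        "$workDir/Results/"
else
    echo "Skipping run: SelvarClust or $workDir/Data.txt is missing"
fi
```

Because the command is started with nohup, it keeps running if the terminal is closed, and any console output that is not redirected goes to the file nohup.out.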

Results :

After the SelvarClust algorithm has run, the results directory given in Arg5 contains the following files:

Examples :

Three examples are described in the following PDF file: Description.pdf
Data.zip Results1.zip Results2.zip Results3.zip

Installing in Windows :

  1. SelvarClust uses Mixmod software (version 2.1.1) available here.
    First, install Mixmod in the folder C:\Program Files\Mixmod, under the name Mixmod. See the Quick Start for installation help.


  2. Add Mixmod to the path: from the desktop, right-click My Computer and click Properties. In the System Properties window, click the Advanced tab, then click the Environment Variables button. In the Environment Variables window, select the Path variable in the System variables section and click Edit. At the end of the existing value, add a semicolon followed by the path C:\Program Files\Mixmod\BIN.


  3. Download the following executable into a directory. In the following, the full path of this directory is called SelvarClustDir.

    SelvarClustWindows.exe

Arguments and Usage in Windows:

To run the SelvarClust algorithm, use the following command: SelvarClustDir\SelvarClustWindows.exe Arg1 Arg2 Arg3 Arg4 Arg5

with the following arguments :

Arg1 : path of the file containing the data
Arg2 : path of the file containing the block variable sizes
For instance, Variablesize.txt contains a column giving the size of each block variable.
Arg3 : path of the file containing the considered cluster numbers
For instance, NbClusters.txt contains a column giving the considered numbers of Gaussian mixture components.
Arg4 : path of the file containing the considered Gaussian mixture forms
For instance, MixtureForms.txt contains a column giving the number of each considered Gaussian mixture form according to the correspondence table.
Arg5 : path of the directory where the results will be saved

Results :

The directory given in Arg5 for saving results contains the following files:


Bugs and Feedback - Contacts

Send an e-mail with the subject "Bugs-SelvarClust" to cathy.maugis -AT- insa-toulouse.fr