SelvarClustIndep software


Context :

SelvarClustIndep is software implemented in C++ using object-oriented programming. It is devoted to variable selection in model-based clustering. It implements a greedy algorithm associated with the SRUW modeling proposed by C. Maugis, G. Celeux and M.-L. Martin-Magniette in [1] and [2], which modifies the method of Raftery and Dean [3] and improves on our SelvarClust algorithm [4]. The SRUW modeling takes into account three possible roles for the variables: relevant, redundant and independent. The software handles datasets in which observations are described by quantitative variables. It returns a clustering of the data and the selected model, which consists of the number of clusters, the mixture form, the form of the variance matrix for the linear regression and for the independent Gaussian density, and the variable partition.

Main references :



Installing in Linux :

  1. SelvarClustIndep uses the Mixmod software (version 2.1.1), available here.
    First install Mixmod (see the Quick Start for installation help).
    In the following, mixmodDir denotes the full path of the directory where the Mixmod software is located.


  2. Declare the path of Mixmod by running the following command in the bash shell : export PATH=mixmodDir/Mixmod/BIN:$PATH


  3. Download the following .zip file containing the .cpp files, the .hpp files and the Makefile for Linux.

    SelvarClustIndep.zip

    Unzip SelvarClustIndep.zip into a directory. In the following, the full path of this directory is called SelvarClustIndepDir.
    Compile with the command make, which creates the executable SelvarClustIndep. You can make this executable available in the bash shell with the command export PATH=SelvarClustIndepDir:$PATH
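
Steps 2 and 3 can be collected into a short bash sketch. The two directory values are hypothetical placeholders, not paths given in these instructions, and the sketch assumes that the Makefile ends up at the top level of SelvarClustIndepDir after unzipping. The unzip and make steps are skipped when the archive is not in the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical locations: replace them with your own directories.
mixmodDir="$HOME/software"                             # directory that contains Mixmod/
SelvarClustIndepDir="$HOME/software/SelvarClustIndep"  # where the sources are unpacked

# Step 2: make the Mixmod binaries visible to the shell.
export PATH="$mixmodDir/Mixmod/BIN:$PATH"

# Step 3: unpack and compile, only if the archive has been downloaded
# into the current directory.
if [ -f SelvarClustIndep.zip ]; then
    mkdir -p "$SelvarClustIndepDir"
    unzip -o SelvarClustIndep.zip -d "$SelvarClustIndepDir"
    # Assumption: the Makefile sits at the top level of the unpacked directory.
    make -C "$SelvarClustIndepDir"
fi

# Make the SelvarClustIndep executable visible to the shell as well.
export PATH="$SelvarClustIndepDir:$PATH"

# Report whether the Mixmod directory is now part of PATH.
case ":$PATH:" in
    *":$mixmodDir/Mixmod/BIN:"*) echo "Mixmod BIN directory is on PATH" ;;
    *) echo "Mixmod BIN directory is missing from PATH" ;;
esac
```

The export commands only affect the current shell session. Adding them to ~/.bashrc is a common way to keep them across sessions.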

Arguments and Usage in Linux:

To run the SelvarClustIndep algorithm, use the following command : nohup ./SelvarClustIndep Arg1 Arg2 Arg3 Arg4 Arg5 Arg6

with the following arguments :

Arg1 : path of the file containing the data (Example : /home/example/Data.txt)
Arg2 : path of the file containing the considered cluster numbers (Example : /home/example/NbClusters.txt)
NbClusters.txt contains a column giving the considered numbers of Gaussian mixture components.
Arg3 : path of the file containing the considered Gaussian mixture forms (Example : /home/example/MixtureForms.txt)
MixtureForms.txt contains a column giving the number of each considered Gaussian mixture form according to the correspondence table.
Arg4 : path of the file containing the considered forms of the regression covariance matrix (Example : /home/example/RegForms.txt)
RegForms.txt contains a column giving the number of each considered form (1: spherical form, 2: diagonal form, 3: general form)
Arg5 : path of the file containing the considered forms for the variance matrix of the independent Gaussian density (Example : /home/example/IndepForms.txt)
IndepForms.txt contains a column giving the number of each considered form (1: spherical form, 2: diagonal form)
Arg6 : path of the directory where the results will be saved (Example : /home/Results)
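
As an illustration, the four option files can be written from the shell before the command is assembled. This is a minimal sketch under several assumptions: the working directory is a hypothetical temporary location, "a column" is read as one value per line, the results directory is created in advance as a precaution (these instructions do not say whether the program creates it), and the mixture form code 1 is only a placeholder to be replaced by codes from the correspondence table. The command is printed rather than executed, since Data.txt must be supplied by the user.

```shell
#!/usr/bin/env bash
# Hypothetical working directory; the examples above use /home/example.
workDir="${TMPDIR:-/tmp}/selvar_example"
mkdir -p "$workDir/Results"

# Arg2: candidate numbers of clusters (here 2, 3 and 4), one per line.
printf '%s\n' 2 3 4 > "$workDir/NbClusters.txt"

# Arg3: mixture form codes. The value 1 is only a placeholder:
# take the real codes from the correspondence table.
printf '%s\n' 1 > "$workDir/MixtureForms.txt"

# Arg4: regression covariance forms (1: spherical, 2: diagonal, 3: general).
printf '%s\n' 1 2 3 > "$workDir/RegForms.txt"

# Arg5: independent Gaussian variance forms (1: spherical, 2: diagonal).
printf '%s\n' 1 2 > "$workDir/IndepForms.txt"

# Assemble the command line in the order Arg1 ... Arg6 and print it.
cmd="nohup ./SelvarClustIndep $workDir/Data.txt $workDir/NbClusters.txt \
$workDir/MixtureForms.txt $workDir/RegForms.txt $workDir/IndepForms.txt \
$workDir/Results"
echo "$cmd"
```

Listing all three regression forms and both independent forms asks the algorithm to consider every form documented above. Shorter files restrict the model collection.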

Results :

After running the SelvarClustIndep algorithm, the directory given in Arg6 contains the following files:

Examples :

Three examples are given below. DATA.zip contains the files needed to run SelvarClustIndep with the command nohup ./SelvarClustIndep DATAxxx.txt k.txt m.txt reg.txt indep.txt Resultsxxx/.
DATA1.txt, DATA2.txt and DATA3.txt contain datasets simulated according to Scenario 1, Scenario 5 and Scenario 6 respectively (see Section "Seven simulated situations" in [1]).
Data.zip Results1.zip Results2.zip Results3.zip
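
The three example runs can be scripted as a loop in which xxx takes the values 1, 2 and 3. The sketch below is a dry run under stated assumptions: it expects to be launched from the directory holding the unzipped example files and the executable, it only executes a command when both the executable and the data file are present, and it creates each results directory beforehand as a precaution, since these instructions do not say whether the program creates it.

```shell
#!/usr/bin/env bash
# Collect the three example commands; run each one only if the executable
# and the corresponding data file are present in the current directory.
commands=""
for i in 1 2 3; do
    cmd="nohup ./SelvarClustIndep DATA$i.txt k.txt m.txt reg.txt indep.txt Results$i/"
    commands="$commands$cmd
"
    if [ -x ./SelvarClustIndep ] && [ -f "DATA$i.txt" ]; then
        mkdir -p "Results$i"   # assumption: created in advance as a precaution
        $cmd
    fi
done

# Print the commands, one per line, so they can be checked or copied.
printf '%s' "$commands"
```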

Installing in Windows :

  1. SelvarClustIndep uses the Mixmod software (version 2.1.1), available here.
    First install Mixmod in the folder C:\Program Files\Mixmod, under the name Mixmod. See the Quick Start for installation help.


  2. Declare the path of Mixmod : from the desktop, right-click My Computer and click Properties. In the System Properties window, click the Advanced tab. In the Advanced section, click the Environment Variables button. Finally, in the Environment Variables window, select the Path variable in the System Variables section and click Edit. Append a semicolon followed by the path C:\Program Files\Mixmod\BIN.


  3. Download the following executable into a directory whose full path is called SelvarClustIndepDir in the following.

    SelvarClustIndepWindows.exe

Arguments and Usage in Windows:

To run the SelvarClustIndep algorithm, use the following command : SelvarClustIndepDir\SelvarClustIndepWindows.exe Arg1 Arg2 Arg3 Arg4 Arg5 Arg6

with the following arguments :

Arg1 : path of the file containing the data
Arg2 : path of the file containing the considered cluster numbers
NbClusters.txt contains a column giving the considered numbers of Gaussian mixture components.
Arg3 : path of the file containing the considered Gaussian mixture forms
MixtureForms.txt contains a column giving the number of each considered Gaussian mixture form according to the correspondence table.
Arg4 : path of the file containing the considered forms of the regression covariance matrix
RegForms.txt contains a column giving the number of each considered form (1: spherical form, 2: diagonal form, 3: general form)
Arg5 : path of the file containing the considered forms for the variance matrix of the independent Gaussian density
IndepForms.txt contains a column giving the number of each considered form (1: spherical form, 2: diagonal form)
Arg6 : path of the directory where the results will be saved

Results :

After running the SelvarClustIndep algorithm, the directory given in Arg6 contains the following files:


Bugs and Feedback - Contacts

Send an e-mail with the subject "Bugs-SelvarClustIndep" to cathy.maugis -AT- insa-toulouse.fr