Fast Correlation-Based Filter Software for Feature Selection
Arizona State University, Computer Science and Engineering, Data Mining and
Machine Learning
Overview
Feature selection is a preprocessing technique
frequently used in data mining and machine learning tasks. It can
reduce dimensionality, remove irrelevant data, increase learning
accuracy, and improve the comprehensibility of results. FCBF is a
fast correlation-based filter algorithm designed for high-dimensional
data. It has been shown to be effective at removing both irrelevant
and redundant features.
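As described in the Yu and Liu paper listed under References, FCBF measures correlation between discrete variables with symmetrical uncertainty, SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)), where H is entropy and IG is information gain. The sketch below is not taken from the FCBF package. It is a minimal illustration of that measure, and the class and method names (SymmetricalUncertaintySketch, entropy, jointEntropy, su) are hypothetical. It assumes both variables are already discretized into integer codes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the FCBF software package.
class SymmetricalUncertaintySketch {

    // Shannon entropy (base 2) of a discrete variable given as integer codes.
    static double entropy(int[] x) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int v : x) {
            counts.merge(v, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / x.length;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Joint entropy H(X, Y), counting each (x, y) pair as one symbol.
    static double jointEntropy(int[] x, int[] y) {
        Map<Long, Integer> counts = new HashMap<>();
        for (int i = 0; i < x.length; i++) {
            // Pack the two int codes into one long key.
            long key = ((long) x[i] << 32) | (y[i] & 0xffffffffL);
            counts.merge(key, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / x.length;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)), with IG(X|Y) = H(X) + H(Y) - H(X, Y).
    static double su(int[] x, int[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        double hx = entropy(x);
        double hy = entropy(y);
        if (hx + hy == 0.0) {
            return 0.0; // both variables are constant; treated as uncorrelated here
        }
        double gain = hx + hy - jointEntropy(x, y);
        return 2.0 * gain / (hx + hy);
    }

    public static void main(String[] args) {
        int[] feature = {0, 1, 0, 1, 1, 0};
        int[] label = {0, 1, 0, 1, 0, 0};
        System.out.println("SU(feature, label) = " + su(feature, label));
    }
}
```

SU ranges from 0 (independent) to 1 (each variable fully determines the other). Returning 0 when both variables are constant is a choice made for this sketch, not something stated on this page.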
About FCBF Software
This software package is written in Java. It is provided
free of charge to the research community as an academic software package, with no
commitment to support or maintenance. Data may need to be discretized
before running FCBF. Please refer to the References below. A discretization package
is also available on our web site.
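To illustrate what discretizing a continuous feature means, the sketch below applies simple equal-width binning. This is an assumption chosen for brevity. It is not the method used by the discretization package mentioned above, and the class and method names (EqualWidthDiscretizerSketch, discretize) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the FCBF or discretization packages.
class EqualWidthDiscretizerSketch {

    // Maps each value to a bin index in [0, bins - 1] using equal-width intervals
    // spanning the observed minimum and maximum.
    static int[] discretize(double[] values, int bins) {
        if (bins < 1) {
            throw new IllegalArgumentException("bins must be at least 1");
        }
        int[] codes = new int[values.length];
        if (values.length == 0) {
            return codes;
        }
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double width = (max - min) / bins;
        for (int i = 0; i < values.length; i++) {
            if (width == 0.0) {
                codes[i] = 0; // constant feature: everything falls into one bin
            } else {
                int bin = (int) ((values[i] - min) / width);
                codes[i] = Math.min(bin, bins - 1); // the maximum value lands in the last bin
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        double[] feature = {0.0, 1.0, 2.0, 3.0, 4.0};
        System.out.println(Arrays.toString(discretize(feature, 2)));
    }
}
```

Equal-width binning ignores the class labels. Supervised discretization methods, such as those surveyed in the Liu et al. paper under References, may be a better fit for real data.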
Downloads
- The Java package can be downloaded
here.
- A version for Linux was released before 3/15/04.
It is freely available for academic use
and can be downloaded here.
Run "tar xvfz FCBF.tar.gz" to extract all the files,
including README and some sample data files.
The README explains how to run the software, how to
prepare the data, and how to read the results.
People
- ASU Data Mining and Machine Learning Lab: Huan Liu, Lei Yu
- Ravi Bhimavarapu, Manoranjan Dash, Farhad Hussain
References
- L. Yu and H. Liu. "Feature Selection for High-Dimensional Data: A
Fast Correlation-Based Filter Solution". In Proceedings of the
Twentieth International Conference on Machine Learning (ICML-03), pp.
856-863, Washington, D.C., August 21-24, 2003. pdf
- H. Liu, F. Hussain, C.L. Tan, and M. Dash. "Discretization: An
Enabling Technique". Journal of Data Mining and Knowledge Discovery,
6(4): 393-423, Oct. 2002.
Created on 3/15/2004
Updated on 4/14/2004