Gene expression data classification using genetic algorithm-based feature selection

Authors: ÖZNUR SİNEM SÖNMEZ, MUSTAFA DAĞTEKİN, TOLGA ENSARİ

Abstract: In this study, hybrid methods are proposed for feature selection and classification of gene expression datasets. In the proposed genetic algorithm/support vector machine (GA-SVM) and genetic algorithm/k-nearest neighbor (GA-KNN) hybrid methods, the genetic algorithm is improved using Pearson's correlation coefficient, Relief-F, or mutual information, and its crossover and selection operations are specialized. Eight different gene expression datasets are used in the classification experiments. The classification performance of the proposed methods is compared with that of the traditional GA-KNN and GA-SVM wrapper methods and of other studies in the literature. The results show that the proposed methods achieve higher accuracy than the other methods on all datasets.

Keywords: Feature selection, gene expression datasets, hybrid method, genetic algorithm, support vector machine, cancer classification
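
Illustrative sketch: the abstract does not say how the filter scores enter the genetic algorithm or how crossover and selection are specialized, so the following is only a minimal, assumption-laden illustration of the general idea of a filter-assisted GA wrapper, not the authors' method. It assumes that absolute Pearson correlation with the class label biases the initial population, that fitness is leave-one-out accuracy of a 1-nearest-neighbor classifier (standing in for KNN), and that plain tournament selection and uniform crossover are used. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def ga_select(X, y, pop_size=12, generations=10, mutation_rate=0.05, seed=0):
    """Toy GA wrapper: filter scores bias the initial population (an assumption)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    # Filter step (assumed): |Pearson r| between each feature and the class label.
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    top = max(scores) or 1.0
    probs = [0.1 + 0.8 * s / top for s in scores]  # per-feature inclusion probability
    pop = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]

    def fitness(mask):
        # Small size penalty so that compact subsets win ties.
        return loo_1nn_accuracy(X, y, mask) - 0.001 * sum(mask)

    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = ranked[:2]  # elitism: carry over the two best masks
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(ranked, 3), key=fitness)
            p2 = max(rng.sample(ranked, 3), key=fitness)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, loo_1nn_accuracy(X, y, best)

def make_toy_data(n_samples=20, n_features=6, seed=1):
    """Synthetic stand-in for a gene expression matrix (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 4.0 * label  # only feature 0 carries class information
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
mask, acc = ga_select(X, y)
print("selected features:", [j for j, bit in enumerate(mask) if bit])
print("leave-one-out accuracy:", round(acc, 3))
```

On real gene expression data, which has thousands of features and few samples, one would likely swap in an established SVM or KNN implementation with proper cross-validation and cache fitness values; the sketch keeps everything in the standard library only to stay self-contained.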
