Abstract

This paper presents an approach that fuses several feature selection methods for pattern recognition problems. The methods examined are neighborhood component analysis, the Fisher discriminant criterion, the ReliefF method, stepwise fit, the Kolmogorov-Smirnov criterion, the T2-test, the Kruskal-Wallis test, feature correlation with class, and SVM recursive feature elimination. Their sensitivity to noisy data and the repeatability of the most important features they select are studied. Based on this study, the best selection methods are chosen and applied to select the most important genes and gene sequences in gene expression microarray datasets for prostate and ovarian cancers. The results of their fusion are presented and discussed. The small set of genes selected in this way can be treated as cancer biomarkers.
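The abstract does not state how the individual rankings are combined, so the following is only a minimal sketch of one possible fusion scheme, not the authors' procedure. It scores features with two of the listed criteria (a common two-class form of the Fisher discriminant criterion, and absolute correlation with the class label) and fuses them by averaging ranks. The mean-rank rule, the synthetic data, and all function names (`fisher_score`, `class_correlation`, `ranks`, `fuse`, `make_data`) are assumptions made for illustration.

```python
import numpy as np

def fisher_score(X, y):
    """Two-class Fisher criterion per feature: |m0 - m1| / (s0 + s1)."""
    a, b = X[y == 0], X[y == 1]
    gap = np.abs(a.mean(axis=0) - b.mean(axis=0))
    return gap / (a.std(axis=0) + b.std(axis=0) + 1e-12)

def class_correlation(X, y):
    """Absolute Pearson correlation of each feature with the class label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / denom

def ranks(scores):
    """Convert scores to ranks, where rank 0 is the highest score."""
    order = np.argsort(-scores)
    r = np.empty_like(order)
    r[order] = np.arange(len(scores))
    return r

def fuse(score_list, k):
    """Assumed fusion rule: average the ranks and keep the k best features."""
    mean_rank = np.mean([ranks(s) for s in score_list], axis=0)
    return np.argsort(mean_rank)[:k]

def make_data(seed=0):
    """Synthetic stand-in for expression data: 3 informative features out of 50."""
    rng = np.random.default_rng(seed)
    n, d = 100, 50
    y = np.repeat([0, 1], n // 2)
    X = rng.normal(size=(n, d))
    X[y == 1, :3] += 3.0  # shift the first three features in class 1
    return X, y

X, y = make_data()
selected = fuse([fisher_score(X, y), class_correlation(X, y)], k=3)
print(sorted(selected.tolist()))
```

Averaging ranks rather than raw scores sidesteps the fact that the criteria live on different scales. Other rules, such as keeping only the features selected by a majority of methods, would fit the same structure.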

Authors and Affiliations

Fabian Gil (1)
Stanislaw Osowski (1, 2)

  1. Warsaw University of Technology, Pl. Politechniki 1, 00-661 Warsaw, Poland
  2. Military University of Technology, ul. gen. Sylwestra Kaliskiego 2, 00-908 Warsaw, Poland
