Lung cancer classification based on support vector machine-recursive feature elimination and artificial bee colony
DOI:
https://doi.org/10.33292/amm.v3i1.26
Keywords:
Artificial bee colony, Lung cancer classification, Support vector machine
Abstract
Early detection of cancerous cells can increase survival rates for patients by more than 97%. Microarray data, used for cancer classification, are composed of many thousands of features and from tens to hundreds of instances. Handling these huge datasets is the most important challenge in data classification, so feature selection or reduction is an essential task. We propose a cancer diagnostic tool that uses a support vector machine for both classification and feature selection. First, we use support vector machine-recursive feature elimination (SVM-RFE) to prefilter the genes. This prefiltering is then enhanced with the artificial bee colony (ABC) algorithm. We ran four simulations using the Ontario and Michigan lung cancer datasets. The combined approach provides higher classification accuracy than a support vector machine without feature selection, with SVM-RFE alone, or with the ABC algorithm alone. The accuracy of a support vector machine using recursive feature elimination combined with the ABC algorithm reached 98% with the 100 best features for the Michigan lung cancer dataset and 97% with the 70 best features for the Ontario lung cancer dataset. SVM with RFE-ABC as the feature selection method therefore gives an accurate diagnosis of lung cancer from microarray data.
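The two-stage selection described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's `RFE` with a linear `SVC` for the prefilter, a simplified binary artificial bee colony (one bit flip per neighbour move) for the refinement, and synthetic data from `make_classification` in place of the Ontario and Michigan datasets. The helper names `svm_rfe_prefilter`, `fitness`, and `abc_refine`, and all parameter values, are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def svm_rfe_prefilter(X, y, n_keep, step=0.1):
    """Return column indices of the n_keep features kept by linear-SVM RFE."""
    selector = RFE(SVC(kernel="linear"), n_features_to_select=n_keep, step=step)
    selector.fit(X, y)
    return np.flatnonzero(selector.support_)


def fitness(X, y, mask):
    """Mean 3-fold cross-validated accuracy of a linear SVM on the masked columns."""
    if not mask.any():
        return 0.0
    return float(cross_val_score(SVC(kernel="linear"), X[:, mask], y, cv=3).mean())


def abc_refine(X, y, n_sources=6, n_cycles=5, limit=3, seed=0):
    """Simplified binary ABC (an assumption): each food source is a boolean mask."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    sources = rng.random((n_sources, n_feat)) < 0.5
    scores = np.array([fitness(X, y, m) for m in sources])
    trials = np.zeros(n_sources, dtype=int)
    best_mask, best_score = sources[scores.argmax()].copy(), float(scores.max())

    def try_neighbour(i):
        # Flip one randomly chosen bit; keep the candidate only if it improves.
        cand = sources[i].copy()
        j = rng.integers(n_feat)
        cand[j] = not cand[j]
        s = fitness(X, y, cand)
        if s > scores[i]:
            sources[i], scores[i], trials[i] = cand, s, 0
        else:
            trials[i] += 1

    for _ in range(n_cycles):
        # Scouts: replace sources that failed to improve more than `limit` times.
        for i in np.flatnonzero(trials > limit):
            sources[i] = rng.random(n_feat) < 0.5
            scores[i] = fitness(X, y, sources[i])
            trials[i] = 0
        # Employed bees: one neighbour move per source.
        for i in range(n_sources):
            try_neighbour(i)
        # Onlookers: pick sources with probability proportional to fitness.
        probs = scores + 1e-12
        probs = probs / probs.sum()
        for i in rng.choice(n_sources, size=n_sources, p=probs):
            try_neighbour(i)
        # Memorise the best source found so far.
        if scores.max() > best_score:
            best_mask, best_score = sources[scores.argmax()].copy(), float(scores.max())
    return best_mask, best_score


# Synthetic stand-in for a microarray matrix: few samples, many features.
X, y = make_classification(n_samples=60, n_features=200, n_informative=10,
                           n_redundant=0, random_state=0)
kept = svm_rfe_prefilter(X, y, n_keep=20)          # stage 1: SVM-RFE prefilter
best_mask, best_score = abc_refine(X[:, kept], y)  # stage 2: ABC refinement
print("selected columns:", kept[best_mask])
print("cross-validated accuracy:", round(best_score, 3))
```

Because the features in this sketch are selected on the same samples that are later used for cross-validation, the printed accuracy is likely optimistic; a fairer estimate would nest both selection stages inside each training fold.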