Performance Comparison of Tree-Based Algorithms for In-silico Mutagenicity Prediction

Published in Akademik Bilisim 2018, Karabuk, Turkey, 2018

Abstract: Observing the biological activities of chemical compounds is a time-consuming and costly process. In-silico experiments are simulations carried out on a computer to speed up this process and reduce its cost, which makes biological molecular systems easier to study. Mutagenicity, the capacity of a compound to cause genetic change in the cell, can be predicted in silico by using data mining algorithms. In this study, a data set of 8208 observations and 155 variables is used, and tree-based data mining algorithms are applied to it to predict mutagenicity. Among the algorithms used, CART achieved a classification success rate of 71.67%, GBM 77.9%, XGBoost 84.21%, and Random Forest 84.68%.
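All four algorithms compared in the abstract are built from decision-tree splits, so a minimal sketch of a single CART-style split may help illustrate the idea. This is not the authors' code. The toy rows and labels are invented for illustration and are unrelated to the study's 8208-observation data set. The helper names (`gini`, `best_split`, `fit_stump`, `accuracy`) are hypothetical. The sketch assumes the reported success rate is the fraction of correctly classified observations, and it uses Gini impurity as the split criterion, which is a common choice for CART but is not confirmed by the abstract.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty or pure list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return the (feature_index, threshold) with the lowest weighted Gini impurity."""
    best, best_score = (None, None), float("inf")
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send observations to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_score, best = score, (j, t)
    return best


def fit_stump(rows, labels):
    """Fit a one-split tree: (feature, threshold, left_label, right_label)."""
    j, t = best_split(rows, labels)
    left = [y for r, y in zip(rows, labels) if r[j] <= t]
    right = [y for r, y in zip(rows, labels) if r[j] > t]

    def majority(ys):
        return int(2 * sum(ys) >= len(ys))

    return j, t, majority(left), majority(right)


def predict(stump, row):
    """Predict the label of one row with a fitted stump."""
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label


def accuracy(stump, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(stump, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


# Invented toy data: two numeric descriptors per compound, label 1 = "mutagenic".
train_rows = [
    [0.10, 0.9], [0.40, 0.2], [0.35, 0.7], [0.20, 0.4],
    [0.80, 0.3], [0.90, 0.8], [0.70, 0.1], [0.60, 0.6],
]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]
test_rows = [[0.30, 0.5], [0.75, 0.5], [0.50, 0.9], [0.05, 0.1]]
test_labels = [0, 1, 1, 0]

stump = fit_stump(train_rows, train_labels)
train_accuracy = accuracy(stump, train_rows, train_labels)
test_accuracy = accuracy(stump, test_rows, test_labels)
print("stump:", stump)
print("train accuracy:", train_accuracy, "test accuracy:", test_accuracy)
```

A full CART model applies this split search recursively to each side. Random Forest, GBM, and XGBoost combine many such trees, by bagging in the first case and by boosting in the other two.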