SELECTION OF A MINIMAL NUMBER OF SIGNIFICANT PORCINE SNPs BY AN INFORMATION GAIN AND GENETIC ALGORITHM HYBRID MODEL

Wanthanee Rathasamuth
Kitsuchart Pasupa
Sissades Tongsima

Abstract

A panel of a large number of common single nucleotide polymorphisms (SNPs) distributed across the entire porcine genome is widely used to represent the genetic variability of pigs. With the advent of SNP-array technology, a genome-wide genetic profile of a specimen can be obtained easily. Within this large set of variations, there exists a much smaller subset of the SNP panel that can identify the corresponding breed equally well. This work presents a SNP selection heuristic that reduces the panel to such a subset while remaining effective for breed classification. Features were selected by combining a filter method (information gain) and a wrapper method (a genetic algorithm), followed by a feature frequency selection step; classification used a support vector machine. We were able to reduce the number of significant SNPs to 0.86% of the total number of SNPs in a swine dataset while achieving 94.80% classification accuracy.
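
To make the filter stage concrete, the following is a minimal sketch of information-gain ranking for discrete SNP genotypes. It is not the authors' implementation: the 0/1/2 genotype coding, the toy data, and the function names (`entropy`, `information_gain`, `rank_snps`) are assumptions made for illustration only. The genetic algorithm wrapper, the feature frequency selection step, and the support vector machine classifier are not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Label entropy minus label entropy conditioned on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_snps(genotypes, labels, keep):
    """Return indices of the `keep` SNP columns with the highest information gain."""
    n_snps = len(genotypes[0])
    gains = [
        information_gain([row[j] for row in genotypes], labels)
        for j in range(n_snps)
    ]
    return sorted(range(n_snps), key=lambda j: gains[j], reverse=True)[:keep]


# Toy data invented for illustration: 6 animals x 3 SNPs, genotypes coded 0/1/2.
genotypes = [
    [0, 1, 2],
    [0, 2, 2],
    [0, 1, 0],
    [2, 2, 0],
    [2, 1, 2],
    [2, 2, 0],
]
breeds = ["A", "A", "A", "B", "B", "B"]

# The first SNP separates the two hypothetical breeds perfectly, so it ranks first.
top = rank_snps(genotypes, breeds, keep=1)
```

In a hybrid scheme of the kind the abstract describes, a ranking like this would only shrink the candidate pool; a wrapper method would then search that pool for subsets that score well under the classifier.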

Article Details

How to Cite
Rathasamuth, W., Pasupa, K., & Tongsima, S. (2019). SELECTION OF A MINIMAL NUMBER OF SIGNIFICANT PORCINE SNPs BY AN INFORMATION GAIN AND GENETIC ALGORITHM HYBRID MODEL. Malaysian Journal of Computer Science, 79–95. https://doi.org/10.22452/mjcs.sp2019no2.5