Abstract
Mutations are structural changes in DNA that can cause protein malfunction and genetic disease. This paper describes a machine learning approach to analyzing mutations associated with Osteogenesis Imperfecta (OI), also known as brittle bone disease. We apply SORCER, a second-order rule induction system, to predict clinical phenotypes of OI from mutation and neighboring amino acid sequences in the COL1A1 gene. Averaged over ten runs of 10-fold cross-validation, SORCER gives more accurate results than C4.5, with an average accuracy of about 81.2%. The paper discusses the advantages and limitations of SORCER and demonstrates its use for initial exploration of biological sequences.
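The evaluation protocol named in the abstract, ten runs of 10-fold cross-validation, can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the data is synthetic, and a majority-class placeholder stands in for the learner, since neither SORCER nor C4.5 is reproduced here.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


def majority_class_learner(train_labels):
    """Placeholder learner (an assumption, NOT SORCER or C4.5):
    always predicts the most common label seen in training."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _example: most_common


def repeated_cross_validation(examples, labels, repeats=10, k=10, seed=0):
    """Return (mean accuracy over all runs, list of per-run accuracies)."""
    rng = random.Random(seed)
    run_accuracies = []
    for _ in range(repeats):
        folds = k_fold_indices(len(examples), k, rng)
        correct = 0
        for held_out in folds:
            held = set(held_out)
            # Train on every example outside the held-out fold.
            train_labels = [lab for i, lab in enumerate(labels) if i not in held]
            predict = majority_class_learner(train_labels)
            # Score only on the held-out fold.
            correct += sum(predict(examples[i]) == labels[i] for i in held_out)
        run_accuracies.append(correct / len(examples))
    return mean(run_accuracies), run_accuracies


if __name__ == "__main__":
    # Synthetic stand-in data; real inputs would be mutation sequence windows.
    toy_examples = [f"seq{i}" for i in range(100)]
    toy_labels = ["class_a"] * 70 + ["class_b"] * 30
    average, per_run = repeated_cross_validation(toy_examples, toy_labels)
    print(f"mean accuracy over {len(per_run)} runs: {average:.3f}")
```

Each run reshuffles the data before splitting, so averaging over the ten runs reduces the dependence of the reported accuracy on any single partition into folds.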
Original language | English |
---|---|
Pages (from-to) | 721-740 |
Number of pages | 20 |
Journal | International Journal of Pattern Recognition and Artificial Intelligence |
Volume | 17 |
Issue number | 5 |
State | Published - Aug 2003 |
Keywords
- Bioinformatics
- Data mining algorithms
- Induction
- Machine learning
- Mutations