Mutations are structural changes in DNA that can cause protein malfunction and genetic disease. This paper describes a machine learning approach to analyzing mutations associated with Osteogenesis Imperfecta (OI), also known as brittle bone disease. We apply SORCER, a second-order rule induction system, to predict clinical phenotypes of OI from mutations and neighboring amino acid sequences in the COLIA1 gene. Averaged over ten 10-fold cross-validations, SORCER gives more accurate results than C4.5, with an average accuracy of about 81.2%. The paper discusses the advantages and limitations of SORCER and demonstrates its use for initial exploration of biological sequences.
|Number of pages||20|
|Journal||International Journal of Pattern Recognition and Artificial Intelligence|
|State||Published - Aug 2003|
- Data mining algorithms
- Machine learning