A robust model order estimation and segmentation technique for classification of biopsies in breast cancer

Enrique Corona, Brian Nutter, Sunanda Mitra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


The difficult problem of identifying dominant structures in unknown data sets has been elegantly addressed recently by a non-parametric information-theoretic approach, the "Jump" method. The method applies an appropriate but fixed power transformation to the distortion-rate curve, D(R), estimated by the popular K-means algorithm. Although this approach yields good results asymptotically for higher-dimensional spaces, in many practical cases involving lower-dimensional spaces a transformation function with a fixed power may not find the correct model order. The work presented here develops an objective function to derive a more suitable transformation function that minimizes classification error in low-dimensional data sets. In addition, a number of carefully chosen K-means seeding methods based upon proper heuristic choices have been used to enhance the detection sensitivity and to allow a more accurate estimation. The proposed method has been evaluated on a large variety of datasets and compared with the original Jump method and other well-known order estimation methods such as Minimum Description Length (MDL), the Akaike Information Criterion (AIC), and the Consistent Akaike Information Criterion (CAIC), demonstrating superior overall performance. Comparative results for the Wisconsin Diagnostic Breast Cancer Dataset have been included. This modified information-theoretic approach to model order estimation is expected to improve and validate diagnostic classification and detection of pre-cancerous lesions. Other applications, such as finding a plausible number of segments in image segmentation scenarios, are also possible.
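
As an illustration of the baseline that the abstract refers to, the following is a minimal sketch of the original fixed-power Jump method as it is usually described: run K-means for each candidate K, raise the per-dimension distortion to the power -Y, and pick the K with the largest increase ("jump"). It is not the authors' modified method. The function names (`farthest_point_seeds`, `kmeans_distortion`, `jump_method`), the default Y = p/2, the convention that the transformed distortion at K = 0 is zero, and the farthest-point seeding are all assumptions made for this example; the seeding in particular is a simple stand-in, not one of the seeding heuristics used in the paper.

```python
import numpy as np


def farthest_point_seeds(X, k):
    """Assumed seeding heuristic: start at the point nearest the data mean,
    then repeatedly add the point farthest from all seeds chosen so far."""
    first = np.argmin(((X - X.mean(axis=0)) ** 2).sum(axis=1))
    seeds = [X[first]]
    d2 = ((X - seeds[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = np.argmax(d2)
        seeds.append(X[nxt])
        d2 = np.minimum(d2, ((X - X[nxt]) ** 2).sum(axis=1))
    return np.array(seeds, dtype=float)


def kmeans_distortion(X, k, n_iter=100):
    """Run Lloyd's K-means and return the mean squared distance to the
    nearest centre, divided by the dimension p."""
    _, p = X.shape
    centres = farthest_point_seeds(X, k)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = centres.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                new[j] = members.mean(axis=0)
        if np.allclose(new, centres):
            break
        centres = new
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean() / p


def jump_method(X, max_k=6, power=None):
    """Return (estimated K, jumps) using a fixed power transformation.

    Assumes the distortion is strictly positive for every K tried,
    i.e. max_k is smaller than the number of distinct points.
    """
    _, p = X.shape
    Y = p / 2.0 if power is None else power  # assumed default exponent
    transformed = np.zeros(max_k + 1)        # transformed[0] = 0 by convention
    for k in range(1, max_k + 1):
        transformed[k] = kmeans_distortion(X, k) ** (-Y)
    jumps = np.diff(transformed)             # jumps[k-1] belongs to K = k
    return int(np.argmax(jumps)) + 1, jumps
```

Calling `jump_method(X)` on an array of shape (n, p) returns the estimated model order together with the jump values for K = 1 to `max_k`. The `power` argument is exposed only so that a different exponent can be tried on low-dimensional data; it does not implement the objective function the paper develops for choosing the transformation.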

Original language: English
Title of host publication: Medical Imaging 2010
Subtitle of host publication: Image Processing
Edition: PART 1
State: Published - 2010
Event: Medical Imaging 2010: Image Processing - San Diego, CA, United States
Duration: Feb 14 2010 - Feb 16 2010

Publication series

Name: Progress in Biomedical Optics and Imaging - Proceedings of SPIE
Number: PART 1
ISSN (Print): 1605-7422


Conference: Medical Imaging 2010: Image Processing
Country/Territory: United States
City: San Diego, CA


  • Distortion rate function
  • Gaussian mixture model order
  • classification error
  • diagnostic classification

