Optimizing computer-aided colonic polyp detection for CT colonography by evolving the Pareto front

Jiang Li, Adam Huang, Jack Yao, Jiamin Liu, Robert L Van Uitert, Nicholas Petrick, Ronald M Summers

Abstract

A multiobjective genetic algorithm is designed to optimize a computer-aided detection (CAD) system for identifying colonic polyps. Colonic polyps appear as elliptical protrusions on the inner surface of the colon. Curvature-based features for colonic polyp detection have proven successful in several CT colonography (CTC) CAD systems. Our CTC CAD program uses a sequential classifier to form initial polyp detections on the colon surface. The classifier applies a set of thresholds on curvature-based features to cluster suspicious colon surface regions into polyp candidates. The thresholds were previously chosen experimentally by using feature histograms. The chosen thresholds were effective for detecting polyps 10 mm or larger in diameter. However, many medium-sized polyps, 6-9 mm in diameter, were missed in the initial detection procedure. In this paper, we formulated the task of finding optimal thresholds as a multiobjective optimization problem and solved it with a genetic algorithm that evolves the Pareto front of the Pareto optimal set. The new CTC CAD system was tested on 792 patients. The sensitivities of the optimized system improved significantly at comparable false positive rates: from 61.68% to 74.71%, an increase of 13.03% (95% CI [6.57%, 19.5%], p = 7.78 × 10^-5), for 6-9 mm polyps; from 65.02% to 77.4%, an increase of 12.38% (95% CI [6.23%, 18.53%], p = 7.95 × 10^-5), for polyps 6 mm or larger; and from 82.2% to 90.58%, an increase of 8.38% (95% CI [0.75%, 16%], p = 0.03), for polyps 8 mm or larger. The sensitivities of the optimized system are nearly equivalent to those of expert radiologists.
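The notion of a Pareto front used above can be illustrated with a minimal sketch. It is not the genetic algorithm from the paper; it only shows the dominance test that defines which threshold settings are Pareto optimal when both objectives (false positives per patient and missed true detections) are minimized. The three non-dominated points are the operating points reported in the Figure 2 caption; the two extra candidates and all function names are hypothetical, added only for illustration.

```python
from typing import List, Tuple

# (false positives per patient, missed true detections); both are minimized.
Point = Tuple[float, int]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by false positive rate."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


candidates: List[Point] = [
    (77.6, 51), (179.0, 32), (296.0, 28),  # operating points from the Figure 2 caption
    (200.0, 40), (300.0, 30),              # hypothetical, dominated threshold settings
]

front = pareto_front(candidates)
for fp_rate, missed in front:
    print(f"{fp_rate:6.1f} false positives/patient, {missed} missed detections")
```

Under these assumptions the two hypothetical settings are discarded, because each is matched or beaten on both objectives by a reported operating point. A multiobjective genetic algorithm repeats this kind of comparison across generations of candidate thresholds so that the surviving population approaches the front.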

Figures

Figure 1
A block diagram representation of our CTC CAD system.
Figure 2
The Pareto front from the SPEA2 algorithm on 134 patients selected from the training data set. The horizontal axis represents the false positive rate per patient and the vertical axis denotes the number of missed true detections. Three potential operating points are also shown. Operating point 1 has 51 missed true detections with 77.6 false positives per patient, point 2 has 32 missed true detections with 179 false positives per patient, and point 3 has 28 missed true detections with 296 false positives per patient.
Figure 3
Free-response receiver operating characteristic (FROC) curves for patients in the training data set at each of the three chosen Pareto front operating points.
Figure 4
Final FROC curves for the two systems on patients in the training and testing data sets. A filled square or diamond denotes the classifier operating point chosen from the training FROC curve. Error bars on the testing results represent two standard deviations and are obtained at the same classifier operating points.
Figure 5
Medium-sized polyps in the training data set detected by the Pareto front optimized CTC CAD system but not by our prior system. Surface-rendered 3D endoluminal images for two polyps (A), (C) without and (B), (D) with CTC CAD detections. (A), (B) 6 mm pedunculated adenoma in the sigmoid colon on supine CTC (72-year-old female). (C), (D) 7 mm sessile adenoma in the descending colon on prone CTC (57-year-old male). The dark gray dots in (B), (D) are ground truth vertices on the colon surface. Detected vertices that match the ground truth are shown in light gray. Vertices clustered in the true detection but not marked as ground truth are shown as small, light gray dots.

Source: PubMed
