3D Slicer (Slicer) [6] is a free, open source software package for visualization and image computing. It provides functionality for segmentation, registration, and 3D visualization of multi-modal medical image data. Slicer began as an image-guided surgery system developed at the MIT AI Lab in collaboration with the Surgical Planning Laboratory at the Brigham and Women’s Hospital in 1999 [7] and has evolved to support a wide variety of clinical applications. Slicer includes routines to read and write various file formats, manipulate 2D and 3D coordinate systems, and present a consistent user interface paradigm and visualization metaphor [8]. Slicer’s component architecture is based on the concept of modules that can provide new user interface components, new software services, or a combination thereof. Almost all of Slicer’s functionality is implemented as modules (command-line modules or interactive modules). In this paper, we describe our implementation of an interactive module in Slicer to assist CTC researchers in discovering causes of false positive detections and designing methods to reduce them.

To the best of the authors’ knowledge, the most recent work on CTC in Slicer was published by Nain et al.
[9]. The 3D virtual endoscopy system they proposed allows the user to interactively explore the internal surface of a 3D anatomical model and to create and update a fly-through trajectory through the model to simulate endoscopy. The goal in that work was to combine the strength of 2D imaging techniques with 3D visualization in order to simulate the surgical environment and provide the user with navigational and path creation options. Like most commercial colon CAD products, that system assists radiologists in finding polyps in CTC rather than helping CTC researchers discover causes of false positive detections. In our work, we developed a free, open source colon CAD system to meet the researchers’ requirements.

2. METHODOLOGY

2.1 Overview of colon CAD

A colon CAD system is a multi-step procedure, typically consisting of (1) colon wall segmentation, (2) intermediate polyp candidate generation, (3) classification for detection of final candidates, and (4) final polyp candidate presentation [10]. The purpose of segmentation is to limit the search area for polyps, which reduces primary processing time and removes sources of false positive detections in the small bowel and extracolonic structures.
Intermediate polyp candidate generation finds regions that are likely polyps; each intermediate polyp candidate is represented by features gathered from the candidate region. The goal of this step is to identify as many true polyps as possible while minimizing false positives. However, since high sensitivity is important at this stage, many false positives are generated. Thus, the intermediate polyp candidates are fed into a classifier that is trained to reduce false positives. Candidates that the classifier labels positive are presented to users as final polyp candidates.

In this paper, we focus on the last step (final polyp candidate presentation) and describe a Slicer module which shows a list of polyp candidates and displays a 3D view for each polyp candidate.

2.2 Segmentation, intermediate polyp candidate generation, and classification

The details of segmentation, intermediate polyp candidate generation, and classification are published in [11,12,13,14]. In this section, we briefly describe the procedure in each step.

A fully automatic segmentation method proposed in [11] was used in our colon CAD. The algorithm uses the geometry of the colon as features to detect and segment the lumen. First, the volume background is segmented using a region grower. The resulting connected region is used as a mask to eliminate the background during further processing.
Next, a set of thresholds is applied to generate a binary image of gas and tissue. A distance transform is then used to locate a point inside the gas regions with a maximal distance from other tissues. This seed point is used by the region growing algorithm to segment the gas-filled portion of the colon. An estimate of the amount of elongation of the gas-filled object is obtained during the segmentation process and is used with the location of the seed point to decide if the object is bowel or stomach. If the object is relatively elongated or occurs below the top one-quarter of the volume, it is considered part of the bowel and is added to the segmentation. The algorithm then proceeds by masking the previously grown region from the distance transform image and searching for another seed point in the remaining gas-filled segments. Besides gas, the colon is also filled with retained residue after bowel preparation, which is homogeneous contrast enhanced fluid (CEF). To include these CEF-filled lumen sections, the mean and Gaussian curvature at each point on the surface are computed; relatively large areas of low curvature indicate a fluid boundary. Selective dilation across the boundary is used to extend the gas segmentation into the residual CEF.

Proc. of SPIE Vol. 7624 762421-2. Downloaded from SPIE Digital Library on 21 Jan 2012 to 128.103.149.52. Terms of Use: http://spiedl.org/terms
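The seed search and region growing loop described above can be illustrated with a small sketch. It is not the implementation from [11]: it works on a 2D binary grid instead of a CT volume, uses a city-block distance computed by breadth-first search, and omits the elongation and location test (bowel versus stomach) as well as the CEF dilation step. All function names are hypothetical.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected neighbourhood


def distance_to_tissue(gas):
    """City-block distance from each gas pixel to the nearest tissue pixel."""
    rows, cols = len(gas), len(gas[0])
    dist = [[None if gas[r][c] else 0 for c in range(cols)] for r in range(rows)]
    queue = deque((r, c) for r in range(rows) for c in range(cols) if not gas[r][c])
    while queue:  # breadth-first wave spreading outwards from tissue
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def find_seed(dist, masked):
    """Deepest gas pixel that has not been segmented yet, or None."""
    best, best_depth = None, 0
    for r, row in enumerate(dist):
        for c, depth in enumerate(row):
            if depth is not None and depth > best_depth and (r, c) not in masked:
                best, best_depth = (r, c), depth
    return best


def grow_region(gas, seed):
    """Collect all gas pixels 4-connected to the seed."""
    rows, cols = len(gas), len(gas[0])
    region, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            n = (r + dr, c + dc)
            if 0 <= n[0] < rows and 0 <= n[1] < cols and gas[n[0]][n[1]] and n not in region:
                region.add(n)
                queue.append(n)
    return region


def segment_gas(gas):
    """Repeat seed search and region growing until no gas pixel is left."""
    dist = distance_to_tissue(gas)
    masked, regions = set(), []
    while (seed := find_seed(dist, masked)) is not None:
        region = grow_region(gas, seed)
        regions.append(region)
        masked |= region  # mask the grown region before the next search
    return regions


# Toy binary image: 1 = gas, 0 = tissue (illustrative data only).
gas = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
regions = segment_gas(gas)
```

In this toy image the first seed falls at the center of the larger gas pocket, because that pixel is farthest from tissue; the isolated pixel is picked up in the second pass after the first region has been masked.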
In generating intermediate polyp candidates, a polyp candidate consists of a group of connected voxels, from which geometric features are calculated and classified to identify a polyp. These candidates are initially detected using a subset of features and segmented roughly using various heuristic means to form candidate polyp regions, over which statistics can be computed per polyp. At each colonic wall voxel, the principal, Gaussian, and mean curvatures, the shape index (SI), and the curvedness (CV) are computed. Voxels whose SI and CV values fall within predefined ranges are extracted as seed voxels by thresholding (SI: [0, 0.11], CV: [0.075, 0.2]). From those seed voxels, a region growing procedure then extracts the major part of a polyp. It is reasonable to relax the ranges (SI: [0, 0.22], CV: [0.05, 0.25]) around the polyps’ peripheral regions. Then fuzzy c-means clustering [15] is applied to remove polyp candidates caused by image noise and to group voxels belonging to the same polyp into one large cluster. The clustering is based on the voxels’ feature values, which include SI, CV, magnitude of the gradient, CT intensity value, and spatial coordinates. Further features such as gradient concentration, directional gradient concentration, sphericity, compactness, wall/region density, and polyp radius are computed after the clustering step. These features are further characterized by statistical operations (mean, max, min, variance, etc.) to form the polyp candidates’ features.
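A minimal sketch of the seed thresholding and region growing step follows. It assumes that SI and CV have already been computed per wall voxel, and it reads the text as meaning that the relaxed ranges are applied while growing outwards from the strict seed voxels; that reading, the 6-connected neighbourhood, and all names are assumptions. The fuzzy c-means step is not shown.

```python
SEED_SI, SEED_CV = (0.0, 0.11), (0.075, 0.2)  # strict ranges for seed voxels
GROW_SI, GROW_CV = (0.0, 0.22), (0.05, 0.25)  # relaxed ranges for the periphery

# 6-connected neighbourhood in a 3D voxel grid (an assumption of this sketch)
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def grow_candidates(features):
    """Group wall voxels into candidate regions.

    features maps a voxel coordinate (x, y, z) to its (SI, CV) pair.
    """
    seeds = [v for v, (si, cv) in features.items()
             if in_range(si, SEED_SI) and in_range(cv, SEED_CV)]
    visited, candidates = set(), []
    for seed in sorted(seeds):
        if seed in visited:
            continue  # already absorbed into an earlier region
        region, stack = {seed}, [seed]
        visited.add(seed)
        while stack:
            x, y, z = stack.pop()
            for dx, dy, dz in NEIGHBORS:
                n = (x + dx, y + dy, z + dz)
                if n in visited or n not in features:
                    continue
                si, cv = features[n]
                if in_range(si, GROW_SI) and in_range(cv, GROW_CV):
                    visited.add(n)
                    region.add(n)
                    stack.append(n)
        candidates.append(region)
    return candidates


# Illustrative values only: one seed, one peripheral voxel, one rejected
# voxel, and one isolated voxel that passes only the relaxed ranges.
wall = {
    (0, 0, 0): (0.05, 0.10),
    (1, 0, 0): (0.18, 0.06),
    (2, 0, 0): (0.50, 0.10),
    (5, 5, 5): (0.20, 0.10),
}
candidates = grow_candidates(wall)
```

The isolated voxel at (5, 5, 5) does not form a candidate, since a region can only start from a voxel that satisfies the strict ranges.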
An intermediate polyp candidate is considered a true positive detection if the distance between it and the center of a polyp (detected in CTC/OC) is equal to or less than the polyp size measured in CTC/OC.

In classification for detection of final candidates, several algorithms have been investigated in our CAD system, including support vector machines (SVMs), AdaBoost, artificial neural networks, and C4.5 decision trees. We use the classification results from SVMs in this paper because of their balanced performance in terms of sensitivity, specificity, and area under the ROC curve (AUC) [13,14]. For the classification task in a colon CAD system, the training data is extremely imbalanced because there are many more false positive detections than true positive ones among the intermediate polyp candidates. Simply training a classifier with this imbalanced data will produce an uninformative result, since the classifier can reach very high accuracy by classifying all polyp candidates as false positive detections. To overcome this imbalanced data problem, we employed the SMOTE oversampling technique [16] during classifier training. Another issue in the classification task is choosing parameter values for each classification algorithm, e.g., which kernel function should be used in SVMs. In our colon CAD, we tuned the parameters experimentally.
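The imbalance problem and the idea behind SMOTE can be illustrated as follows. This is a simplified sketch of SMOTE-style interpolation between minority-class neighbours, not the WEKA implementation used in the paper; the class counts and feature vectors are made up for illustration.

```python
import math
import random


def majority_baseline_accuracy(n_true, n_false):
    """Accuracy of a classifier that labels every candidate as false positive."""
    return n_false / (n_true + n_false)


def smote(minority, n_new, k=2, rng=None):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical counts: 10 true polyps among 1000 intermediate candidates.
baseline = majority_baseline_accuracy(10, 990)

# Hypothetical 2D feature vectors for three true positive candidates.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote(minority, n_new=6, rng=random.Random(42))
```

With these made-up counts the trivial classifier already reaches 99% accuracy while detecting no polyps, which is why accuracy alone is uninformative here. Each synthetic sample lies on a line segment between two real minority samples, so oversampling fills in the minority region of feature space rather than duplicating points.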
Classifiers with different parameter values are trained and tested in a computer cluster environment. Trained classifiers are evaluated using 10-fold cross-validation and are ranked according to different combinations of criteria: sensitivity, specificity, and AUC. The output of each classifier is mapped to a prediction value ranging from 0 to 1. In the implementation, we use the open source machine learning software WEKA [17].

2.3 Presentation of the polyp candidates – a Slicer module to integrate and visualize polyp candidates

The input to the Slicer module includes three parts:
1) The segmentation file of the colon wall, which is a label map indicating the inside and outside of the colon wall.
2) The intermediate polyp candidates’ feature files. As mentioned above, each intermediate polyp candidate is generated from clustering seed voxels. The features for intermediate polyp candidates and the features for seed voxels are stored separately, and the relationship between each intermediate polyp candidate and its seed voxels is established through an assigned polyp id.
3) The classification file, which includes the prediction value for each intermediate polyp candidate and the CTC/OC finding (“1” indicates an intermediate polyp candidate is a true positive; “0” indicates a false positive).
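How the polyp id ties these inputs together can be sketched as a simple join. The record layouts below are hypothetical in-memory stand-ins (the actual text file formats are defined in the appendix and are not reproduced here), the prediction is assumed to be the positive-class score, and the 0.5 cutoff is an arbitrary illustrative choice.

```python
from collections import defaultdict


def integrate(candidates, seed_voxels, classifications):
    """Merge candidate features, seed voxels, and classification results
    on the shared polyp id."""
    voxels_by_id = defaultdict(list)
    for polyp_id, voxel in seed_voxels:
        voxels_by_id[polyp_id].append(voxel)
    merged = {}
    for polyp_id, features in candidates.items():
        prediction, target = classifications[polyp_id]
        merged[polyp_id] = {
            "features": features,
            "seed_voxels": voxels_by_id[polyp_id],
            "prediction": prediction,  # assumed: score for the positive class
            "target": target,          # CTC/OC finding: 1 = true polyp, 0 = none
        }
    return merged


def false_positives(merged, threshold=0.5):
    """Ids of candidates scored as polyps that have no matching CTC/OC finding."""
    return sorted(pid for pid, rec in merged.items()
                  if rec["prediction"] >= threshold and rec["target"] == 0)


# Hypothetical data for two intermediate polyp candidates.
candidates = {1: {"mean_SI": 0.08}, 2: {"mean_SI": 0.10}}
seed_voxels = [(1, (10, 20, 30)), (1, (11, 20, 30)), (2, (40, 50, 60))]
classifications = {1: (0.9, 1), 2: (0.8, 0)}

merged = integrate(candidates, seed_voxels, classifications)
to_review = false_positives(merged)
```

Once the records are merged, a researcher can list the false positive candidates and go straight to their seed voxels, which is the kind of lookup that is awkward when the data sits in separate files.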
Normally these three inputs are stored in different files, making it difficult to analyze the false positive intermediate polyp candidates. In the next sections, we describe a Slicer module that integrates this information into one workbench and allows one to visualize the colon wall, the polyp candidate surface, and the voxels in one user interface. This helps the user discover the basis for the many false positives.

Figure 1 illustrates how to use this Slicer module. The workflow includes the following steps: load CT study, load intermediate polyp candidates (including their seed voxels), load segmentation result, locate a polyp candidate, observe the polyp candidate in 2D views (sagittal, coronal, and axial), and visualize the polyp candidate in a 3D view. The first step, load CT study, is done through Slicer’s Volumes module. In loading intermediate polyp candidates, the filenames for the polyp candidates’ features, the seed voxels’ features, and the classifier’s prediction values are specified in a panel, as shown in Figure 2. Each file is a text file; the appendix describes its format. After this step, a list of intermediate polyp candidates is shown, as in the lower part of Figure 2. The first column is the assigned polyp id. The next three columns are the DICOM coordinates of each intermediate polyp candidate. The next column, pred 0, is the classifier’s prediction value for this intermediate polyp candidate.
A value close to 1 means this intermediate polyp candidate is most likely a negative detection. In the column after pred 0, a value close to 1 means this intermediate polyp candidate is most likely positive. The last column, target, is the CTC/OC finding: a value of “1” indicates that a true polyp was found near the location of that intermediate polyp candidate in CTC/OC, while “0” indicates no nearby polyp. In the third step, load segmentation