REAL-TIME EXTRACTION OF LOCAL PHASE FEATURES FROM VOLUMETRICMEDICAL IMAGE DATAAlborz Amir-khalili ⋆ Antony J. Hodgson † Rafeef Abugharbieh ⋆⋆ Biomedical Signal and Image Computing Lab (BiSICL),Department of Electrical and Computer Engineering; and† Department of Mechanical Engineering, The University of British ColumbiaABSTRACTWe present a novel real-time implementation of local phasefeature extraction from volumetric image data based on 3Ddirectional (log-Gabor) filters. We achieve drastic performancegains without compromising the signal-to-noise ratioby pre-computing the filters and adaptive noise estimation parameters,and streamlining the remainder of the computationsto efficiently run on a multi-processor graphic processingunit (GPU). We validate our method on clinical ultrasounddata and demonstrate a 15-fold speedup in computation timeover state-of-the art methods, which could potentially facilitatea wide range of practical applications for real-timeimage-guided procedures.Index Terms— Local Phase Features, Real-Time, Segmentation,CUDA, GPU, Ultrasound, 3D Feature Extraction1. INTRODUCTIONImage guided medical procedures commonly deployed inminimally-invasive surgeries and, more recently, in emergingrobot assisted interventions are generating increasingdemands for real-time data processing and analysis. Withthe steady improvement in spatial and temporal scanningresolution and the continuing increase in complexity of dataanalysis algorithms, real-time processing is becoming a challengingbottleneck that requires thorough consideration inmost practical applications. Feature extraction from 3D imagesfor critical tasks such as real-time segmentation andregistration are prime examples. For instance, image featuresextracted from local phase information have recently beenshown to be robust in computer aided orthopedic applications.In previous works  we demonstrated the effectiveness ofphase symmetry (PS) features for segmentation and localizationof bone fractures in 3D ultrasound. In an extension tothis work we implemented a step towards clinical ultrasoundguided intervention , in which we used PS features toregister intra-operative ultrasound to pre-operative computedtomography (CT) images. Other local phase features suchas phase-asymmetry (PA) and phase-congruency (PC) can beused to localize soft-tissue boundaries in imaging modalitiessuch as ultrasound, CT, and magnetic resonance imaging .In a previous work by Eklund et al.  a fast GPU implementationof phase feature extraction was proposed for thepurpose of multimodal image registration. Although their approachperforms in real-time, it suffers from being highly specializedwithin a registration framework which only considersphases features at a single scale in the three principal directions.In this paper, we present a novel and generalized toolfor real-time extraction of local image phase features that deploysefficient pre-computations and GPU processing.2. METHODOLOGYIn most stages of local phase feature extraction, each pixelmay be processed independently. The CUDA parallel programmingmodel developed by Nvidia Corporation (SantaClara, CA, USA) takes advantage of the hundreds of coresavailable on a GPU to run a computationally expensive algorithmin parallelized chunks, each operating independentlyon a separate stream processor of the GPU. The libraries designedfor CUDA, which are included in its freely availableSoftware Development Kit, also contain many fundamentaltools for scientific computation, such as the fast Fouriertransform (CuFFT).There is strength in numbers, but a single core of a GPUis still significantly slower than a CPU. Furthermore, there issignificant overhead associated with data transfer to and fro aswell as sharing of information. In this section, we present themethods used in optimizing the implementation of local phasefeature extraction to fully take advantage of GPU capabilities.2.1. Local Phase Feature ExtractionPhase features are local, unitless measures that can be derivedusing quadrature odd and even (denoted o and e respectively)wavelet pairs . Such features can be extracted across a setof different wavelengths (denoted s for scales) and at differentangles in 3D space (denoted r for orientations). Currentlyavailable implementations of 3D local phase image feature
Table 1. Averaged double precision runtimes (mean of 20 trials) of our algorithm (MATLAB and CUDA) compared to Hatt Orientations Scales Benchmark Hatt Proposed Algorithm Speed-up factor Speed-up factorMATLAB (CPU) C++ (CPU) CUDA (GPU) CUDA/MATLAB CUDA/Hattr = 1 s = 1 430 ms 674 ms 30 ms 14.3 22.5s = 2 625 ms 949 ms 36 ms 17.4 26.4s = 3 787 ms 1204 ms 42 ms 18.7 28.7s = 4 964 ms 1450 ms 48 ms 20.1 30.2r = 2 s = 1 639 ms 993 ms 52 ms 12.3 19.1s = 2 989 ms 1472 ms 66 ms 15.0 22.3s = 3 1270 ms 2086 ms 78 ms 16.3 26.7s = 4 1609 ms 2545 ms 90 ms 17.9 28.3r = 3 s = 1 918 ms 1294 ms 74 ms 12.4 17.5s = 2 1396 ms 2049 ms 96 ms 14.5 21.3s = 3 1851 ms 2787 ms 116 ms 16.0 24.0s = 4 2306 ms 3556 ms 136 ms 17.0 26.1r = 4 s = 1 1198 ms 1523 ms 97 ms 12.4 15.7s = 2 1825 ms 2655 ms 127 ms 14.4 20.9s = 3 2447 ms 3646 ms 155 ms 15.8 23.5data demonstrating that our CUDA implementation results ina minimum of 12-fold speedup over our MATLAB benchmarkand 15-fold speedup over Hatt’s  implementation.The performance of our algorithm is expected to increasein the future as it scales very well the hardware capabilitiesof the GPU device. Figure 5 provides a comparison betweenthe MATLAB benchmark, an entry level GPU (Nvidia NVS5400M mobile graphic card with 1GB of memory), and a professionalGPU (Nvidia Tesla c2050). Both of these GPUsare based on Nvidia’s previous hardware architecture, Fermi.Nvidia is reporting a 3-fold increase in single-precision floatingpoint performance per Watt of power with their new Keplerarchitecture. This algorithm will perform considerablyfaster on the next generation of Kepler based Tesla cards, expectedto become available by the end of 2012.Professional GPUEntry Level GPUMATLAB0 50 100 150 200 250 300 350 400Time (ms)FFTtransformationsNoise CalculationOtherFig. 5. On-line runtime of PS with 3 orientations and 1 scale.We will extend our work by incorporating the automaticparameterization methods  in real-time such that they canboth be implemented within our Ultrasound to CT registrationframework .5. REFERENCES I. Hacihaliloglu, R. Abugharbieh, A.J. Hodgson, andR.N. Rohling, “Bone surface localization in ultrasoundusing image phase-based features,” UMB, vol. 35, no. 9,pp. 1475–1487, 2009. A. Brounstein, I. Hacihaliloglu, P. Guy, A.J. Hodgson,and R. Abugharbieh, “Towards real-time 3D US to CTbone image registration using phase and curvature featurebased GMM matching,” MICCAI, pp. 235–242, 2011. A. Wong and W. Bishop, “Efficient least squares fusionof MRI and CT images using a phase congruency model,”Pattern Recognition Letters, vol. 29, no. 3, pp. 173–180,2008. A. Eklund, M. Andersson, and H. Knutsson, “Phasebased volume registration using CUDA,” in ICASSP.IEEE, 2010, pp. 658–661. P. Kovesi, “Symmetry and asymmetry from local phase,”Tenth Australian Joint Converence on Artificial Intelligence,pp. 2–4, 1997. I. Hacihaliloglu, R. Abugharbieh, A. Hodgson, andR. Rohling, “Automatic data-driven parameterization forphase-based bone localization in US using log-gabor filters,”Advances in Visual Computing, pp. 944–954, 2009. P. Kovesi, “Image features from phase congruency,”Videre: Journal of Computer Vision Research, vol. 1, no.3, pp. 1–26, 1999. G. Beliakov, “Parallel calculation of the median and orderstatistics on GPUs with application to robust regression,”arXiv preprint arXiv:1104.2732, 2011. C. Hatt, “Multi-scale steerable phase-symmetry filters forITK,” 2012.