A NEW ALGORITHM FOR CONTENT-BASED REGION ... - ICMCC

icmcc.org

A NEW ALGORITHM FOR CONTENT-BASED

REGION QUERY IN DATABASES WITH

MEDICAL IMAGES

Dumitru Dan BURDESCU, Liana STANESCU

Faculty of Automation, Computers and Electronics

University of Craiova, Romania



• This article presents an original implementation of the color set back-projection algorithm, one of the most efficient methods for the automated detection of color regions in an image.

• The detected regions are then used in the content-based region query.

• The query is performed on one or more regions, taking the color feature into account.

• The efficiency of the method was studied through a number of experiments carried out with a software system built for this purpose, on a collection of medical images acquired with an endoscope.

• The new implementation of the algorithm is compared with the traditional one in terms of both execution time and the quality of the retrieval process.



Introduction

• At present, massive databases of gray-level or color images have been created in a variety of fields. One of these is the medical field.

• For querying these image collections, the traditional simple text-based methods are not sufficient. The information in images, and in multimedia data in general, is unstructured, so its content cannot be described with a few attributes.

• Hence the strong need for alternative methods that retrieve the relevant information from a massive image collection quickly and accurately, so that the user's query can be satisfied.

• These techniques are known as content-based visual information retrieval, and they have attracted the attention of many researchers in recent years.



• In the medical field, images, and especially digital images, are produced and used in large quantities for diagnosis and therapy.

• In some medical areas, hundreds or even thousands of images are produced daily.

• Many of them are color images, such as those acquired with an endoscope, so taking the color feature into account in content-based visual retrieval is important.



Several important reasons explain the need for additional image retrieval methods:

• In clinical decision making, it may be very important to specify an image or some regions as the query and to retrieve the database images that are most similar to that query image or query region, together with their associated diagnoses.

• Education and research can be improved by using visual access methods.

• Visual features allow the retrieval not only of patients having the same disease, but also of cases where visual similarity exists but the diagnosis differs.



• In a content-based region query, images are compared on the basis of their regions.

• Performing a content-based region query on a database of medical images requires an automated algorithm for detecting the color regions that are significant for the diagnosis.

• The color set back-projection algorithm was chosen. It was introduced initially by Swain and Ballard and then developed in research projects at Columbia University in the field of content-based visual retrieval. This technique provides automated extraction of regions and a representation of their color content.



The extraction system for color regions has four steps:

• Image transformation, quantization and filtering (transformation from the RGB to the HSV color space and quantization to 166 colors)

• Back-projection of binary color sets

• Labeling of regions

• Extraction of region features (the binary color set, the area, the centroid coordinates and the minimum bounding rectangle coordinates)
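The first step can be illustrated with a minimal Python sketch. The slides give only the total of 166 colors; the bin layout used here (18 hues x 3 saturations x 3 values, plus 4 grays), the function name `quantize_hsv` and the threshold `sat_gray` are assumptions made for illustration, not the authors' code.

```python
import colorsys

def quantize_hsv(r, g, b, sat_gray=0.1):
    """Map an 8-bit RGB pixel to a color index in 0..165.

    Assumed layout (not specified in the slides): 18 hues x 3 saturations
    x 3 values = 162 chromatic bins (0..161), then 4 gray bins (162..165).
    sat_gray is a hypothetical threshold below which a pixel counts as gray.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < sat_gray:
        # Nearly colorless pixel: choose one of 4 gray levels by brightness.
        return 162 + min(int(v * 4), 3)
    h_bin = min(int(h * 18), 17)
    s_bin = min(int((s - sat_gray) / (1.0 - sat_gray) * 3), 2)
    v_bin = min(int(v * 3), 2)
    return (h_bin * 3 + s_bin) * 3 + v_bin
```

Applying such a function to every pixel would give the matrix of indices from 0 to 165 that the later steps operate on.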



A method for the implementation of the color set back-projection algorithm

• In the first implementation of the color set back-projection algorithm (Method1), the image is read in .bmp format.

• Each pixel of the initial image is converted to HSV and quantized. This processing yields the global histogram and the color set of the image.

• A 5x5 median filter is applied to the matrix that stores only the quantized colors, from 0 to 165, in order to eliminate isolated points.

• Once the quantized HSV matrix is available, the region extraction process presented above can begin.
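The filtering step can be sketched as follows. This is an illustrative version only: the function name `median_filter` is hypothetical, and clipping the window at the image border is an assumption, since the slides do not say how borders are handled.

```python
from statistics import median_low

def median_filter(q, size=5):
    """Apply a size x size median filter to a 2-D list of color indices.

    The window is clipped at the image border (an assumption).
    median_low always returns a value that occurs in the window,
    so the result is still a valid quantized color index.
    """
    h, w = len(q), len(q[0])
    r = size // 2
    out = [row[:] for row in q]
    for y in range(h):
        for x in range(w):
            window = [q[yy][xx]
                      for yy in range(max(0, y - r), min(h, y + r + 1))
                      for xx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = median_low(window)
    return out
```

For example, a single pixel of color 99 surrounded by pixels of color 7 is replaced by 7, which is the "isolated point" case the filter is meant to remove.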



• In the first implementation (Method1), this process is in fact a depth-first traversal, described in pseudocode as follows:

    procedure FindRegions (Image I, colorset C) is:
        InitStack(S)
        Visited = ∅
        for each node P in I do
            if color of P is in C then
                PUSH(P)
                Visited ← Visited ∪ {P}
                while not Empty(S) do
                    CrtPoint
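A runnable reading of this depth-first traversal might look like the sketch below. It is not the authors' code: the function name `find_regions_dfs`, the use of 4-connectivity and the rule that a neighbor joins a region when its color belongs to the color set are all assumptions.

```python
def find_regions_dfs(q, color_set):
    """Return the connected regions of q whose color is in color_set.

    q is a 2-D list of quantized color indices. Each region is a list
    of (row, col) pixels. 4-connectivity is assumed.
    """
    h, w = len(q), len(q[0])
    visited = set()
    regions = []
    for y in range(h):
        for x in range(w):
            if q[y][x] not in color_set or (y, x) in visited:
                continue
            stack = [(y, x)]            # explicit stack gives depth-first order
            visited.add((y, x))
            region = []
            while stack:
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and (ny, nx) not in visited
                            and q[ny][nx] in color_set):
                        visited.add((ny, nx))
                        stack.append((ny, nx))
            regions.append(region)
    return regions
```

Because each pixel is added to `visited` before it is pushed, no pixel enters the stack more than once in this sketch.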



Proposition 1

• The total running time of a call of the procedure FindRegions (Image I, colorset C) is $O(m^2 n^2)$, where $m$ is the width and $n$ is the height of the image.

Proof

• Recall that the image has $m \cdot n$ pixels. The outer FOR loop of the algorithm is executed at most once for each pixel P in the image, so the total time spent in this loop is $O(m \cdot n)$. The WHILE loop processes the stack S for each pixel that has the same color as its neighbor, and the inner FOR loop processes the unvisited neighbors of that pixel. The total time spent in these loops is also $O(m \cdot n)$, because every pixel of the image is processed at most once. Combining the two bounds, the total running time of this procedure is $O(m^2 n^2)$.



A new method for the implementation of the color set back-projection algorithm

• In the new, original implementation of the algorithm (Method2), the image pixels are arranged into hexagons.

• The edge of a hexagon has a fixed number of pixels (3, 4 or 5).

• Only the pixels that correspond to the vertices of the hexagons with the chosen edge are taken into consideration.

• The image is viewed as a graph rather than as a pixel matrix. The vertices represent the pixels and the edges represent the neighborhood relations between pixels.



For each binary color set, the following steps are executed:

• the graph is inspected until the first vertex with a color from the color set is found

• starting from this vertex, all the adjacent vertices with the same color are found

• the process continues in the same manner for each neighbor, until no more vertices with the same color are found

• the detected region is checked against the imposed thresholds; if it satisfies them, the region is labeled and stored in the database



• This process of region extraction from a graph is in fact a breadth-first traversal, described in pseudocode as follows:

    procedure construct_graph (Image I, Graph g, Edge edge) is:
        for i = 0 to width/edge do
            for j = 0 to height/edge do
                if (i mod 3 == 0) then
                    if (j mod 2 == 0) then g[i][j] = I[edge*i][edge*j+edge-1]
                    if (j mod 2 == 1) then g[i][j] = I[edge*i][edge*j+edge+2]
                if (i mod 3 == 1) then
                    if (j mod 2 == 0) then g[i][j] = I[edge*i-1][edge*j-edge]
                    if (j mod 2 == 1) then g[i][j] = I[edge*i-1][edge*j+edge*2]
                if (i mod 3 == 2) then
                    if (j mod 2 == 0) then g[i][j] = I[edge*i-2][edge*j+edge-1]
                    if (j mod 2 == 1) then g[i][j] = I[edge*i-2][edge*j+edge+2]
            // end for j
        // end for i
        output the graph g



    procedure FindRegions (Graph G, colorset C) is:
        InitQueue(Q)
        Visited = ∅
        for each node P in G do
            if color of P is in C then
                PUSH(P)
                Visited ← Visited ∪ {P}
                while not Empty(Q) do
                    CrtPoint
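The breadth-first variant on a graph can be sketched in Python as below. The graph representation (a `colors` dictionary and a `neighbors` adjacency dictionary), the function name `find_regions_bfs` and the `min_size` threshold are hypothetical; the slides mention thresholds on regions without giving their values.

```python
from collections import deque

def find_regions_bfs(colors, neighbors, color_set, min_size=1):
    """Return regions of same-colored adjacent nodes whose color is in color_set.

    colors:    dict mapping node -> quantized color index
    neighbors: dict mapping node -> iterable of adjacent nodes
    min_size:  hypothetical threshold; smaller regions are discarded
    """
    visited = set()
    regions = []
    for node in colors:
        if node in visited or colors[node] not in color_set:
            continue
        queue = deque([node])           # FIFO queue gives breadth-first order
        visited.add(node)
        region = []
        while queue:
            current = queue.popleft()
            region.append(current)
            for nb in neighbors.get(current, ()):
                if nb not in visited and colors[nb] == colors[current]:
                    visited.add(nb)
                    queue.append(nb)
        if len(region) >= min_size:
            regions.append(region)
    return regions
```

Compared with a pixel-matrix traversal, the only structural change is that adjacency comes from the graph rather than from pixel coordinates, so the same routine works for any hexagon edge length.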



Proposition 2

• The total running time of a call of the procedure FindRegions (Graph G, colorset C) is $O(n^2)$, where $n$ is the number of nodes of the graph attached to an image.

Proof

• The outer FOR loop of the algorithm is executed at most once for each node of the graph, so the total time spent in this loop is $O(n)$. The WHILE loop processes the queue Q for each node that has the same color as its neighbor, and the inner FOR loop processes the unvisited neighbors of that node. The total time spent in these loops is also $O(n)$, because every node of the graph is processed at most once.

• Combining the two bounds, the total running time of this procedure is $O(n^2)$.



Since the color information of each region is stored as a binary color set, the color similarity between two regions may be computed either with the quadratic distance or with the Hamming distance between color sets.

Here, the quadratic distance between the binary sets $s_q$ and $s_t$ was used. It is given by the following equation:

$$d_1 = \sum_{m_0=0}^{M-1} \sum_{m_1=0}^{M-1} (s_q[m_0] - s_t[m_0]) \, a_{m_0,m_1} \, (s_q[m_1] - s_t[m_1]) \qquad (1)$$

where $M$ is the number of quantized colors and $a_{m_0,m_1}$ measures the similarity between colors $m_0$ and $m_1$.
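Equation (1) and the Hamming alternative can be written directly in Python. This is a minimal sketch: the function names are hypothetical, and the similarity matrix `a` is supplied by the caller because the slides do not specify its values.

```python
def quadratic_distance(sq, st, a):
    """Quadratic distance of equation (1) between binary color sets sq and st.

    sq, st: lists of 0/1 values of equal length M
    a:      M x M matrix, a[m0][m1] = similarity between colors m0 and m1
    """
    diff = [x - y for x, y in zip(sq, st)]
    m = len(diff)
    return sum(diff[m0] * a[m0][m1] * diff[m1]
               for m0 in range(m) for m1 in range(m))

def hamming_distance(sq, st):
    """Number of positions where the two binary color sets differ."""
    return sum(x != y for x, y in zip(sq, st))
```

With the identity matrix as `a`, the quadratic distance between binary sets equals the Hamming distance. Off-diagonal similarities reduce the distance when the differing colors are perceptually close.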



Experiments and results

• To test the efficiency of the new method and to compare the two implementations of the color set back-projection algorithm, several experiments were carried out on the medical image collection.

• For Method2, the hexagon edge can be equal to 3 or 4.

• For each query, the images in the database were inspected and assigned a relevance value (1 for relevant, 0 for irrelevant), and the retrieval effectiveness was recorded using recall and precision.
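Precision and recall from such 1/0 relevance judgments can be computed as in the sketch below. The function name `precision_recall` and the sample judgments in the usage note are hypothetical, and the slides do not state at which cut-off points the authors measured these values.

```python
def precision_recall(relevance, total_relevant):
    """Return (precision, recall) after each retrieved image.

    relevance:      list of 1/0 judgments for the retrieved images, in rank order
    total_relevant: number of relevant images in the whole database
    """
    hits = 0
    points = []
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        points.append((hits / k, hits / total_relevant))
    return points
```

For instance, `precision_recall([1, 1, 0, 1], 4)` ends with a precision of 0.75 and a recall of 0.75, because three of the four retrieved images are relevant and three of the four relevant images were retrieved.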



Experiment 1

The image with detected color regions using Method2 and edge=3; Region6

(representing the sick area) marked for the content-based region query



The retrieved images using Method2 and edge equal to 3, for the Region6

as query region. All the images are relevant.



The graph of the retrieval efficiency for Method1 and Method2 with the hexagon edge equal to 3



Experiment 2:

The image with detected color regions using Method2 and edge=3; Region8, Region9 and Region10 marked for the content-based region query



The images retrieved using Method2 with the hexagon edge equal to 3, for Region8, Region9 and Region10 as query regions. Only the last image is irrelevant.



The graph of the retrieval efficiency for the two methods, with the hexagon edge equal to 3 for Method2.



Conclusion

• This article presents an original implementation of the color set back-projection algorithm, which allows the automated detection of color regions in a color medical image. The detected regions are then used in the content-based region query.

• The very good results obtained in the experiments indicate that each of the two implementations (Method1 and Method2) of the color set back-projection algorithm can be used for processing content-based visual queries.

• The experiments show that the results obtained with Method2 and edge=3 are close in quality to those obtained with Method1.

• The advantage of the second method (Method2 with edge equal to 3) is that detecting the color regions does not require a pixel-by-pixel traversal of the image, but only a traversal of the pixels located at the vertices of hexagons with an edge of 3 pixels.

• While the processing time of Method1 is $O(m^2 n^2)$ ($m$ is the width and $n$ is the height of the image), the processing time of Method2 presented here is $O(n^2)$ ($n$ is the number of nodes of the graph attached to an image).



Software system for content-based region query

http://193.226.37.211:8080/Medical/index.HTM
