24.11.2014 Views

Elektronika 2009-11.pdf - Instytut Systemów Elektronicznych

Associated with every N-point, M-dimensional vector quantizer is a partition of $\mathbb{R}^M$ into N regions or cells, $R_i$, $i = 1, \ldots, N$. The i-th cell is defined by:

\[ R_i = \{ x \in \mathbb{R}^M : Q(x) = y_i \} \tag{7} \]

For a given codebook Y of size N, the optimal partition cells satisfy:

\[ R_i \subseteq \{ x : d(x, y_i) \le d(x, y_j) \} \tag{8} \]

where $i, j = 1, \ldots, N$ and $i \ne j$. That is, $Q(x) = y_i$ only if $d(x, y_i) \le d(x, y_j)$. Thus, given the codebook Y, the encoder performs a minimum-distortion, or nearest-neighbor, mapping with:

\[ d(x, Q(x)) = \min_{1 \le i \le N} d(x, y_i) \tag{9} \]

To measure distances, the center of gravity of each region of similar vectors must be determined. The centroid $\mathrm{cent}(R_0)$ of any nonempty set $R_0 \subset \mathbb{R}^M$ is defined as the vector $y_0$ (if it exists) that minimizes the distortion between a point $X \in R_0$ and $y_0$, averaged over the probability distribution of X given $X \in R_0$. Thus:

\[ y_0 = \mathrm{cent}(R_0) \quad \text{if} \quad E\{ d(X, y_0) \mid X \in R_0 \} \le E\{ d(X, y) \mid X \in R_0 \} \tag{10} \]

for every $y \in \mathbb{R}^M$. For a given partition $\{R_i;\ i = 1, \ldots, N\}$, the optimal code vectors satisfy:

\[ y_i = \mathrm{cent}(R_i) \tag{11} \]
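The nearest-neighbor mapping and the centroid condition described above can be illustrated with a minimal sketch. It assumes squared Euclidean distance as the distortion measure d (the text does not fix a particular d) and a finite set of training points in place of a probability distribution; the names `sq_dist`, `encode`, and `centroid` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(x: Vector, y: Vector) -> float:
    """Squared Euclidean distortion d(x, y) (an assumed choice of d)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def encode(x: Vector, codebook: List[Vector]) -> int:
    """Nearest-neighbor mapping: index i that minimizes d(x, y_i).

    Ties are broken in favor of the lowest index.
    """
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))


def centroid(cell: List[Vector]) -> List[float]:
    """Centroid of a nonempty cell; for squared error this is the mean."""
    n = len(cell)
    return [sum(v[k] for v in cell) / n for k in range(len(cell[0]))]


# Toy data: two code vectors and four input points in the plane.
codebook = [(0.0, 0.0), (4.0, 4.0)]
points = [(0.0, 1.0), (1.0, 0.0), (3.0, 5.0), (5.0, 3.0)]

# Partition the points into cells, then recompute each code vector.
labels = [encode(p, codebook) for p in points]
cells = [[p for p, l in zip(points, labels) if l == i]
         for i in range(len(codebook))]
new_codebook = [centroid(c) for c in cells]

print(labels)        # [0, 0, 1, 1]
print(new_codebook)  # [[0.5, 0.5], [4.0, 4.0]]
```

With squared error, the vector that minimizes the average distortion over a cell is the arithmetic mean of the cell, which is why `centroid` reduces to a per-coordinate average here; other distortion measures would need a different centroid rule.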

Practical results<br />

In this work, a speech recognition method is proposed for controlling the movement of the camera of a closed-circuit television system, together with a speaker verification method for logging on to this system. A scheme of the system is shown in the figure. During registration, the user must enter a unique login and read three long, randomly generated expressions. When a registered user logs on, the user must enter the registered login and read one long, randomly generated expression. If this utterance is similar, with the highest probability, to at least two of the utterances recorded for that login, the system accepts the login. After a new user is added, the system builds a codebook based on the recorded speech. Next, the user must train the system on the selected commands for controlling camera movement. The following control commands were selected: left, right, top, bottom, zoom, plus, minus, start, stop, recording, and the numerical values of the angle of rotation. During training, the user must repeat each command five times. For a registered user, the system controls the movement of the camera using the command database created by that user.
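The acceptance rule for logging on (a match with at least two of the three enrolled utterances) can be sketched as a simple vote. The source does not say how similarity is scored, so this sketch assumes one distortion value per enrolled utterance, where lower means more similar, and an assumed cutoff `threshold`; the function name `accept_login` and all numbers are hypothetical.

```python
from typing import Sequence


def accept_login(distortions: Sequence[float],
                 threshold: float,
                 min_matches: int = 2) -> bool:
    """Accept when at least `min_matches` enrolled utterances are close enough.

    `distortions` holds one hypothetical score per enrolled utterance
    (lower means more similar); `threshold` is an assumed cutoff.
    """
    matches = sum(1 for d in distortions if d <= threshold)
    return matches >= min_matches


# Three enrolled utterances, as in the registration step.
print(accept_login([0.8, 1.1, 3.5], threshold=1.5))  # True: two of three match
print(accept_login([0.8, 2.7, 3.5], threshold=1.5))  # False: only one matches
```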

Samples were taken at a sampling frequency of 8 kHz with 16-bit encoding. The system includes thirty-five different expressions, which are selected at random during registration or login. To eliminate harmful interference, the expressions are recorded with a microphone equipped with a filter code and an external DSP processor. For registered users, the system correctly recognizes all the commands for controlling the camera movement.

Besides a center of gravity, a region of similar elements must also have a boundary. Given the partition cells $R_i$, $i = 1, \ldots, N$, the boundary set is defined as:

\[ B = \{ x : d(x, y_i) = d(x, y_j),\ i \ne j \} \tag{12} \]

for all $i, j = 1, \ldots, N$. Thus, the boundary consists of points that are equally close to both $y_j$ and some other $y_i$ and hence do not have a unique nearest neighbor. A necessary condition for a codebook to be optimal for a given source distribution is:

\[ B = \emptyset \tag{13} \]

That is, the boundary set must be empty. Alternatively:

\[ P\{ x : d(x, y_i) = d(x, y_j),\ i \ne j \} = 0 \tag{14} \]

for all $i, j = 1, \ldots, N$.

Suppose that the boundary set is nonempty, so that there is at least one x that is equidistant from the code vectors $y_i$ and $y_j$. Mapping x to $y_i$ or to $y_j$ yields two encoding schemes with the same average distortion. Including the nonzero-probability input point x in either cell ($R_i$ or $R_j$, associated with $y_i$ and $y_j$, respectively) necessarily modifies the centroids of $R_i$ and $R_j$, which means that the codebook is no longer optimal for the new partition. Given a codebook Y of size N, the goal is to find the input partition cells $R_i$ and the codewords such that the average distortion $D = E\{d(X, Q(X))\}$ is minimized, where X is the input random vector with a given PDF (probability density function). The complete vector quantization method using the Lloyd algorithm is described following [4].
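The design problem stated above is usually approached by alternating the nearest-neighbor and centroid conditions until the average distortion stops decreasing. The following is a generic sketch of such a Lloyd-style iteration, not the implementation from [4] or from this system. It assumes squared Euclidean distortion, a finite training set in place of the PDF, and a caller-supplied initial codebook; the names `lloyd` and `sq_dist`, the stopping tolerance, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(x: Vector, y: Vector) -> float:
    """Squared Euclidean distortion (an assumed choice of d)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def lloyd(data: List[Vector],
          codebook: List[Vector],
          max_iter: int = 100,
          tol: float = 1e-9) -> Tuple[List[List[float]], float]:
    """Alternate partition and centroid steps on a training set.

    Returns the codebook and the average distortion measured in the
    last nearest-neighbor step.
    """
    book = [list(y) for y in codebook]
    prev = float("inf")
    distortion = prev
    for _ in range(max_iter):
        # Nearest-neighbor step: assign each point to its closest code vector.
        cells: List[List[Vector]] = [[] for _ in book]
        total = 0.0
        for x in data:
            i = min(range(len(book)), key=lambda k: sq_dist(x, book[k]))
            cells[i].append(x)
            total += sq_dist(x, book[i])
        distortion = total / len(data)
        # Stop once the average distortion no longer improves.
        if prev - distortion <= tol:
            break
        prev = distortion
        # Centroid step: replace each code vector by the mean of its cell.
        # An empty cell keeps its old code vector (one simple convention).
        for i, cell in enumerate(cells):
            if cell:
                dim = len(cell[0])
                book[i] = [sum(v[k] for v in cell) / len(cell) for k in range(dim)]
    return book, distortion


data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_codebook, final_distortion = lloyd(data, [(0.0, 0.0), (1.0, 0.0)])
print(final_codebook)    # [[0.0, 1.0], [10.0, 1.0]]
print(final_distortion)  # 1.0
```

Each step can only keep or lower the average distortion on the training set, so the loop terminates; the result is a local optimum that depends on the initial codebook.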

Scheme of the closed-circuit television system with speech recognition and speaker verification

Conclusion and future work<br />

Since speech signals are recorded at a sampling frequency of 8000 Hz, the next step in developing the system will be to allow users to log in and control the camera movement over the Internet. Encoding the recorded sound and sending it over the network should not be a problem. Next, the system will be equipped with a module that automatically locates the face, eyes, and mouth of a person in the camera frame. These data will be used to verify the identity of persons on the basis of facial asymmetry.

ELEKTRONIKA 11/2009 67
