Elektronika 2009-11.pdf - Instytut Systemów Elektronicznych
Associated with every N-point, M-dimensional vector quantizer is a partition of $\mathbb{R}^M$ into N regions or cells $R_i$, $i = 1, \dots, N$. The i-th cell is defined by:

$$R_i = \{ x \in \mathbb{R}^M : Q(x) = y_i \} \tag{7}$$

For a given codebook Y of size N, the optimal partition cells satisfy:

$$R_i \subset \{ x : d(x, y_i) \le d(x, y_j) \} \tag{8}$$

where $i, j = 1, \dots, N$ and $i \ne j$. That is, $Q(x) = y_i$ only if $d(x, y_i) \le d(x, y_j)$. Thus, given the codebook Y, the encoder performs a minimum-distortion (nearest-neighbor) mapping with:

$$d(x, Q(x)) = \min_{i} d(x, y_i) \tag{9}$$

Measuring distance requires defining the center of gravity of a region whose points are regarded as similar. The centroid $\mathrm{cent}(R_o)$ of any nonempty set $R_o \subset \mathbb{R}^M$ is defined as the vector $y_o$ (if it exists) that minimizes the distortion between a point $X \in R_o$ and $y_o$, averaged over the probability distribution of X given $X \in R_o$. Thus:

$$E[d(X, y_o) \mid X \in R_o] \le E[d(X, y) \mid X \in R_o] \tag{10}$$

for every $y \in \mathbb{R}^M$. For a given partition $\{R_i;\ i = 1, \dots, N\}$, the optimal code vectors satisfy:

$$y_i = \mathrm{cent}(R_i) \tag{11}$$
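The nearest-neighbor mapping and the centroid described above can be illustrated with a minimal Python sketch. It assumes the squared Euclidean distance as the distortion measure d, which the text does not specify; for that choice the centroid of a finite cell is simply its mean. The function names are illustrative only.

```python
from typing import List, Sequence

Vector = Sequence[float]

def squared_error(x: Vector, y: Vector) -> float:
    """Squared Euclidean distortion d(x, y) (an assumed choice of d)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def encode(x: Vector, codebook: List[Vector]) -> int:
    """Nearest-neighbor rule: return the index i that minimizes d(x, y_i)."""
    return min(range(len(codebook)), key=lambda i: squared_error(x, codebook[i]))

def centroid(cell: List[Vector]) -> List[float]:
    """Centroid of a nonempty cell; for squared error this is the mean vector."""
    n = len(cell)
    return [sum(column) / n for column in zip(*cell)]

codebook = [(0.0, 0.0), (4.0, 4.0)]
print(encode((1.0, 0.5), codebook))          # index of the closest code vector
print(centroid([(0.0, 0.0), (2.0, 4.0)]))    # mean of the two points
```

When two code vectors are equally close, `min` returns the lower index, which is one arbitrary way of breaking ties.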
Practical results<br />
In this work, a speech recognition method is proposed to control the camera movement of a closed-circuit television system, and a speaker verification method is used to log on to this system. The scheme of the system is shown in the figure.
During registration, the user must enter a unique login and read three long, randomly generated expressions. When a registered user logs on, he must likewise enter his registered login and read one long, randomly generated expression. If this utterance is similar, with the highest probability, to at least two of the utterances recorded for that login, the system accepts the login. After a new user is added, the system builds a codebook based on the recorded speech. Next, the user must teach the system the selected commands for controlling camera movement. The following commands were selected: left, right, top, bottom, zoom, plus, minus, start, stop, recording, and the numerical values of the rotation angle. During training, the user must repeat each command five times. For a registered user, the system controls the camera movement using the command database created by that user.
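The acceptance rule for logging in (at least two of the three enrolled utterances must match) can be sketched as follows. This is only an illustration: it assumes that each comparison yields a distortion score where lower means more similar, and the threshold is a hypothetical tuning parameter that does not come from the paper.

```python
from typing import Sequence

def accept_login(distortions: Sequence[float], threshold: float,
                 min_matches: int = 2) -> bool:
    """Accept when at least `min_matches` enrolled utterances are close enough.

    `distortions` holds one score per enrolled utterance (lower = more similar).
    `threshold` is a hypothetical parameter, not a value from the paper.
    """
    matches = sum(1 for d in distortions if d <= threshold)
    return matches >= min_matches

print(accept_login([0.8, 1.1, 3.5], threshold=1.5))  # two of three utterances match
print(accept_login([0.8, 2.9, 3.5], threshold=1.5))  # only one utterance matches
```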
Samples were taken at a frequency of 8 kHz with 16-bit encoding. The system includes thirty-five different expressions, which are randomly selected during registration or login. To eliminate harmful interference, expressions are recorded with a microphone equipped with a filter and an external DSP processor. For registered users, the system correctly recognizes all the camera control commands.
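As a rough check on the recording format, the raw data rate implied by 8 kHz sampling and 16-bit encoding can be computed directly. The sketch below assumes a single (mono) channel and uncompressed PCM, neither of which is stated in the text; the 3-second duration is a hypothetical example.

```python
SAMPLE_RATE_HZ = 8000   # sampling frequency given in the text
BITS_PER_SAMPLE = 16    # sample resolution given in the text
CHANNELS = 1            # assumption: mono recording

def bit_rate() -> int:
    """Raw PCM bit rate in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

def pcm_bytes(duration_s: float) -> int:
    """Size in bytes of an uncompressed recording of the given duration."""
    samples = round(duration_s * SAMPLE_RATE_HZ)
    return samples * CHANNELS * BITS_PER_SAMPLE // 8

print(bit_rate())       # 128000 bit/s
print(pcm_bytes(3.0))   # 48000 bytes for a hypothetical 3-second command
```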
Besides a center of gravity, a region containing similar elements must also have a boundary. Given the partition cells $R_i$, $i = 1, \dots, N$, the boundary set is defined as:

$$B = \{ x : d(x, y_i) = d(x, y_j),\ i \ne j \} \tag{12}$$

for all $i, j = 1, \dots, N$. Thus, the boundary consists of points that are equally close to both $y_j$ and some other $y_i$ and hence do not have a unique nearest neighbor. A necessary condition for a codebook to be optimal for a given source distribution is:

$$B = \emptyset \tag{13}$$

That is, the boundary set must be empty. Alternatively:

$$P(\{ x : d(x, y_i) = d(x, y_j),\ i \ne j \}) = 0 \tag{14}$$

for all $i, j = 1, \dots, N$.
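Whether a given point lies on the boundary can be tested by checking for a tie between its two smallest distortions. The sketch below again assumes the squared Euclidean distance; the `tol` parameter is a hypothetical tolerance for floating-point comparisons.

```python
from typing import List, Sequence

Vector = Sequence[float]

def squared_error(x: Vector, y: Vector) -> float:
    """Squared Euclidean distortion d(x, y) (an assumed choice of d)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def on_boundary(x: Vector, codebook: List[Vector], tol: float = 0.0) -> bool:
    """True when x has no unique nearest neighbor in the codebook."""
    dists = sorted(squared_error(x, y) for y in codebook)
    return len(dists) > 1 and dists[1] - dists[0] <= tol

codebook = [(0.0, 0.0), (2.0, 0.0)]
print(on_boundary((1.0, 5.0), codebook))  # equidistant from both code vectors
print(on_boundary((0.5, 0.0), codebook))  # strictly closer to the first one
```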
Suppose that the boundary set is nonempty, so that there is at least one x equidistant from the code vectors $y_i$ and $y_j$. Mapping x to $y_i$ or to $y_j$ yields two encoding schemes with the same average distortion. Including the nonzero-probability input point x in either cell ($R_i$ or $R_j$, associated with $y_i$ and $y_j$, respectively) will necessarily modify the centroid of that cell, meaning that the codebook is no longer optimal for the new partition. Given a codebook Y of size N, it is desired to find the input partition cells $R_i$ and codewords such that the average distortion $D = E\{d(X, Q(X))\}$ is minimized, where X is the input random vector with a given PDF (probability density function). The complete vector quantization method using the Lloyd algorithm is described in [4].
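The alternation between the nearest-neighbor and centroid conditions can be sketched as a Lloyd-style iteration over a finite training set. This is not the implementation from [4]: it assumes the squared Euclidean distance and replaces the expectation over the PDF with an empirical average over the training vectors. All names are illustrative.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]

def squared_error(x: Vector, y: Vector) -> float:
    """Squared Euclidean distortion d(x, y) (an assumed choice of d)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def lloyd(training: List[Vector], codebook: List[Vector],
          iterations: int = 20) -> Tuple[List[Vector], float]:
    """Refine a codebook; return it with the empirical average distortion."""
    codebook = list(codebook)
    for _ in range(iterations):
        # Nearest-neighbor condition: assign each vector to its closest code vector.
        cells: List[List[Vector]] = [[] for _ in codebook]
        for x in training:
            i = min(range(len(codebook)),
                    key=lambda k: squared_error(x, codebook[k]))
            cells[i].append(x)
        # Centroid condition: move each code vector to the mean of its cell;
        # an empty cell keeps its previous code vector.
        new_codebook = [
            tuple(sum(col) / len(cell) for col in zip(*cell)) if cell else y
            for cell, y in zip(cells, codebook)
        ]
        if new_codebook == codebook:
            break  # the codebook no longer changes
        codebook = new_codebook
    distortion = sum(min(squared_error(x, y) for y in codebook)
                     for x in training) / len(training)
    return codebook, distortion

training = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
print(lloyd(training, [(0.0, 0.0), (10.0, 10.0)]))
```

The result depends on the initial codebook, so in practice the iteration is usually started from several initial guesses.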
Scheme of the closed-circuit television system with speech recognition and speaker verification
Conclusion and future work<br />
Since speech signals are recorded at a sampling frequency of 8000 Hz, the next step in developing this system will be to enable logging in and controlling the camera movement over the Internet. Encoding the recorded sound and sending it over the network should not be a problem. Next, the system will be equipped with a module that automatically locates the face, eyes and mouth of a person in the camera frame. These data will be used to verify the identity of persons on the basis of facial asymmetry.
ELEKTRONIKA 11/2009 67