Elektronika 2009-11.pdf - Instytut Systemów Elektronicznych

definite temporary partition. Using these two methods together allows the audio signal to be captured properly and the remaining stages of the system to run correctly. A suitable combination of both methods correctly determines the beginning and end of the signal. Once the beginning and end of the Polish word have been determined, the signal passed on for further processing contains no superfluous silence.

The speech signal is non-stationary because of the dynamic properties of human speech, so the next stage depends on dividing the input signal into stationary frames [5]. The signal is stationary over short time intervals (10 to 30 ms) [7]. Every such stationary frame was replaced by an observation symbol when the observation vectors were created. In the system, every frame was assumed to be 30 ms long, which at the given sampling rate (8 kHz) amounts to 240 samples. For speech recognition, to preserve the stationarity of the signal, each frame overlaps the previous one: the last 80 samples of one frame are also the first 80 samples of the next. For speaker verification, to keep all the detail of the signal, the frames do not overlap.

The mechanism of cepstral speech analysis

Speech processing applications require specific representations of speech information. A wide range of possibilities exists for parametrically representing the speech signal. Among these, the most important parametric representation of speech is the short-time spectral envelope [5,7]. Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) spectral analysis models have been widely used in speech recognition applications. First- and second-order derivatives are usually used together with the MFCC coefficients to take into account the dynamic evolution of the speech signal, which carries information relevant to speech recognition.
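The framing scheme described above (30 ms frames of 240 samples at 8 kHz, with an 80-sample overlap for speech recognition and no overlap for speaker verification) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `split_into_frames` and the decision to discard a trailing partial frame are assumptions.

```python
import numpy as np

SAMPLE_RATE_HZ = 8000
FRAME_MS = 30
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 240 samples per frame


def split_into_frames(signal, overlap):
    """Split a 1-D signal into FRAME_LEN-sample frames.

    `overlap` is the number of samples shared by consecutive frames
    (80 for recognition, 0 for verification in the text above).
    A trailing partial frame is discarded (an assumption of this sketch).
    """
    signal = np.asarray(signal, dtype=float)
    hop = FRAME_LEN - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than the frame length")
    if len(signal) < FRAME_LEN:
        return np.empty((0, FRAME_LEN))
    n_frames = (len(signal) - FRAME_LEN) // hop + 1
    return np.stack([signal[i * hop:i * hop + FRAME_LEN] for i in range(n_frames)])


one_second = np.arange(SAMPLE_RATE_HZ)
recognition_frames = split_into_frames(one_second, overlap=80)
verification_frames = split_into_frames(one_second, overlap=0)
```

With these settings one second of audio yields 49 overlapping frames, which agrees with the "about 50 values" per second quoted for the coded signal.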
In the mel-cepstrum, the spectrum was first passed through mel-frequency band-pass filters before the cepstral transform was applied [8]. The characteristics of the filters followed those of the human auditory system [8]. The filters had triangular band-pass frequency responses. For speech recognition, the filters were 300 mels wide and each was shifted by 150 mels relative to the previous one. For speaker verification, the filters were 200 mels wide and shifted by 100 mels, for a more detailed analysis of low frequencies. The filter bands were spaced linearly below 1000 Hz and logarithmically above 1000 Hz. On the mel-frequency scale all the filter bands had the same width, equal to the intended characteristic of the filters on the normal frequency scale. The spectrum of every frame, obtained with a 512-point Fast Fourier Transform (FFT), was filtered by the filter bank. The next step was to calculate the members of each filter by multiplying the filter's amplitude by the average power spectrum at the corresponding frequency of the voice input. Summing all members of a filter gave the mel-power-spectrum coefficient $S_k$ of that filter.

Finally, the mel-frequency cepstrum coefficients (MFCC) were derived by taking the logarithm of the mel-power-spectrum coefficients $S_k$ and converting them back to the time domain using the Discrete Cosine Transform (DCT). The number of mel coefficients K used for speaker recognition was usually from 12 to 20 [8]. In practice, removing this one from the formula gave better performance for both speech recognition and user verification. In this work, twenty-dimensional MFCCs were used as the standard audio features for speech coding. For speech recognition, the aim of the audio signal analysis was to code the signal and obtain input data in the form of observation vectors.
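The processing chain described above (512-point FFT, triangular mel filters of fixed width and shift, logarithm, DCT) can be sketched as follows. Several details are assumptions, since the original formulas are not reproduced here: the mel scale uses the common analytic approximation 2595·log10(1 + f/700) rather than the piecewise linear/logarithmic spacing mentioned in the text, each triangle peaks at the middle of its band, the DCT is the plain DCT-II including the zeroth coefficient, and no window function is applied. The names `hz_to_mel`, `mel_filter_bank` and `mfcc` are hypothetical.

```python
import numpy as np


def hz_to_mel(freq_hz):
    """Analytic mel-scale approximation (an assumption of this sketch)."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=float) / 700.0)


def mel_filter_bank(width_mel, shift_mel, sample_rate=8000, n_fft=512):
    """Triangular filters, `width_mel` wide, each shifted by `shift_mel`."""
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bin_mels = hz_to_mel(bin_freqs)
    # Lower band edges, kept while the whole band fits below the Nyquist frequency.
    starts = np.arange(0.0, bin_mels[-1] - width_mel + 1e-9, shift_mel)
    centres = starts + width_mel / 2.0
    distance = np.abs(bin_mels[None, :] - centres[:, None])
    # Weight 1 at the band centre, falling linearly to 0 at the band edges.
    return np.clip(1.0 - distance / (width_mel / 2.0), 0.0, None)


def mfcc(frame, bank, n_fft=512):
    """Log mel-power spectrum of one frame followed by a DCT-II."""
    power = np.abs(np.fft.rfft(frame, n=n_fft)) ** 2  # frame is zero-padded to n_fft
    mel_power = bank @ power                          # one S_k per filter
    log_mel = np.log(mel_power + 1e-12)               # small offset avoids log(0)
    K = len(log_mel)
    n = np.arange(K)[:, None]
    k = np.arange(K)[None, :]
    basis = np.cos(np.pi * n * (k + 0.5) / K)
    return basis @ log_mel


recognition_bank = mel_filter_bank(width_mel=300, shift_mel=150)
verification_bank = mel_filter_bank(width_mel=200, shift_mel=100)

t = np.arange(240) / 8000.0
frame = np.sin(2 * np.pi * 440.0 * t)  # synthetic 30 ms test frame
coefficients = mfcc(frame, verification_bank)
```

Under these assumptions the 200-mel/100-mel bank contains 20 filters at 8 kHz, so one frame produces a 20-element coefficient vector; whether this matches how the authors arrived at their twenty-dimensional features is not stated in the text.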
The Polish language contains 37 different phonemes, so a codebook of 37 code symbols was used to code the audio signal. At a sampling frequency of 8 kHz, each second of audio is then coded by about 50 values instead of 8000. The Lloyd algorithm was used for vector quantization. One of the basic operations in vector quantization is determining the distance of each successive observation vector from all the centres of gravity (centroids) of the codebook. The Euclidean measure was used to measure distance. For speaker verification, the cepstral coefficients obtained for all frames were summed accordingly. In this way, every independent utterance is coded by twenty cepstral coefficients.

Vector quantization with the Lloyd algorithm

In lossy compression, the data generated by a source have to be represented by one of a small number of code words. The number of possible distinct data values is generally larger than the number of code words available to represent them. The process of representing a large set of values by a considerably smaller set is called quantization [9]. A vector quantizer Q of dimension M and size N is a mapping from a vector x in M-dimensional Euclidean space $\mathbb{R}^M$ into a finite set Y containing N M-dimensional outputs or reproduction points, called code vectors or code words. Thus:

$$Q : \mathbb{R}^M \to Y$$

where:

$$Y = \{y_1, y_2, \dots, y_N\}, \quad y_i \in \mathbb{R}^M, \quad i = 1, \dots, N$$

Y is known as the codebook of the quantizer. The mapping action is written as:

$$Q(x) = y_i$$

66 ELEKTRONIKA 11/2009
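The encoding step described above, in which each observation vector is replaced by the index of the nearest codebook centroid under the Euclidean measure, can be sketched as follows. The function name `quantize` and the random example data are assumptions; the 37 × 20 codebook shape follows the 37 code symbols and twenty-dimensional features mentioned in the text.

```python
import numpy as np


def quantize(observations, codebook):
    """Return, for each observation vector, the index of the nearest code vector."""
    observations = np.atleast_2d(np.asarray(observations, dtype=float))
    codebook = np.asarray(codebook, dtype=float)
    # Euclidean distance from every observation to every code vector.
    distances = np.linalg.norm(
        observations[:, None, :] - codebook[None, :, :], axis=2
    )
    return distances.argmin(axis=1)


rng = np.random.default_rng(0)
example_codebook = rng.normal(size=(37, 20))   # placeholder codebook, not trained
example_frames = rng.normal(size=(49, 20))     # placeholder feature vectors
symbols = quantize(example_frames, example_codebook)
```

Each frame is reduced to a single symbol, so the 49 feature vectors in the example become a sequence of 49 integers in the range 0 to 36.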
Associated with every N-point M-dimensional vector quantizer is a partition of $\mathbb{R}^M$ into N regions or cells $R_i$, $i = 1, \dots, N$. The i-th cell is defined by:

$$R_i = \{x \in \mathbb{R}^M : Q(x) = y_i\}$$

For a given codebook Y of size N, the optimal partition cells satisfy:

$$R_i \subseteq \{x : d(x, y_i) \le d(x, y_j)\}$$

where $i, j = 1, \dots, N$ and $i \ne j$. That is, $Q(x) = y_i$ only if $d(x, y_i) \le d(x, y_j)$. Thus, given the codebook Y, the encoder contains a minimum-distortion or nearest-neighbor mapping with:

$$d(x, Q(x)) = \min_{i} d(x, y_i)$$

To measure distance, the centre of gravity of a region of elements considered similar must be determined. The centroid $\mathrm{cent}(R_o)$ of any nonempty set $R_o \subset \mathbb{R}^M$ is defined as the vector $y_o$ (if it exists) that minimizes the distortion between a point $X \in R_o$ and $y_o$, averaged over the probability distribution of X given $X \in R_o$. Thus:

$$E[d(X, y_o) \mid X \in R_o] \le E[d(X, y) \mid X \in R_o]$$

for every $y \in \mathbb{R}^M$. For a given partition $\{R_i;\ i = 1, \dots, N\}$, the optimal code vectors satisfy:

$$y_i = \mathrm{cent}(R_i)$$

Practical results

In this work, speech recognition is proposed for controlling the movement of the camera of a closed-circuit television system, and user verification for logging on to this system. The scheme of the system is shown in the figure. During registration, the user must enter a unique login and read three long, randomly generated expressions. When a registered user logs on, they must again enter the registered login and read one long, randomly generated expression. If, with the highest probability, this utterance is similar to at least two of the utterances recorded for that login, the system accepts the login. After a new user is added, the system builds a codebook from the recorded speech. Next, the user must train the system on the selected commands for controlling camera movement. The following commands were selected: left, right, top, bottom, zoom, plus, minus, start, stop, recording, and the numerical values of the angle of rotation. While the system is being trained, the user must repeat each command five times.
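The login rule described under Practical results (three enrolment utterances per user, one test utterance, acceptance when at least two enrolment utterances match) admits more than one reading. The sketch below shows one possible interpretation, stated as an assumption: every utterance is a single feature vector, the three enrolled vectors closest to the test vector are found over all users by Euclidean distance, and the login is accepted when at least two of them belong to the claimed login. The function `verify_login` and the sample data are hypothetical.

```python
import numpy as np


def verify_login(claimed_login, test_vector, enrolled, min_matches=2, k=3):
    """Accept when at least `min_matches` of the `k` nearest enrolled
    utterances (over all users) belong to `claimed_login`."""
    labels, vectors = [], []
    for login, utterances in enrolled.items():
        for utterance in utterances:
            labels.append(login)
            vectors.append(utterance)
    vectors = np.asarray(vectors, dtype=float)
    distances = np.linalg.norm(
        vectors - np.asarray(test_vector, dtype=float), axis=1
    )
    nearest = np.argsort(distances)[:k]
    matches = sum(labels[i] == claimed_login for i in nearest)
    return matches >= min_matches


# Hypothetical enrolment data: three utterance vectors per login.
enrolled_users = {
    "alice": [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3]],
    "bob": [[5.0, 5.1], [5.2, 4.9], [4.8, 5.0]],
}
accepted = verify_login("alice", [0.1, 0.1], enrolled_users)
rejected = verify_login("bob", [0.1, 0.1], enrolled_users)
```

A threshold on the distance to the claimed user's own utterances would be an equally plausible reading; the text does not say which comparison the system uses.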
For a registered user, the system controls the movement of the camera using the command database created by that user. Samples were taken at a frequency of 8 kHz with 16-bit encoding. The system includes thirty-five different expressions, which are selected at random during registration or login. To eliminate harmful interference, the expression is recorded with a microphone equipped with a code filter and an external DSP processor. For registered users, the system correctly recognizes all the commands for controlling camera movement.

A region of similar elements must have, besides a centre of gravity, a boundary. Given the partition cells $R_i$, $i = 1, \dots, N$, the boundary set is defined as:

$$B = \{x : d(x, y_i) = d(x, y_j),\ i \ne j\}$$

for all $i, j = 1, \dots, N$. Thus, the boundary consists of points that are equally close to both $y_j$ and some other $y_i$ and hence do not have a unique nearest neighbor. A necessary condition for a codebook to be optimal for a given source distribution is:

$$B = \emptyset$$

That is, the boundary set must be empty. Alternatively:

$$P\big(x : d(x, y_i) = d(x, y_j)\big) = 0, \quad i \ne j$$

for all $i, j = 1, \dots, N$. Suppose that the boundary set is non-empty, so that there is at least one x equidistant from the code vectors $y_i$ and $y_j$. Mapping x to $y_i$ or to $y_j$ yields two encoding schemes with the same average distortion. Including the nonzero-probability input point x in either cell ($R_i$ or $R_j$, associated with $y_i$ and $y_j$, respectively) necessarily modifies the centroids of $R_i$ and $R_j$, meaning that the codebook is no longer optimal for the new partition. Given a codebook Y of size N, it is desired to find the input partition cells $R_i$ and codewords such that the average distortion $D = E\{d(X, Q(X))\}$ is minimized, where X is the input random vector with a given probability density function (PDF). The whole method of vector quantization with the Lloyd algorithm was described on the basis of [4].
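The two optimality conditions from the vector quantization section (nearest-neighbor partition for a fixed codebook, centroid code vectors for a fixed partition) lead directly to the Lloyd iteration, which alternates between them until the average distortion stops decreasing. The sketch below is a minimal illustration under stated assumptions: squared Euclidean distortion (for which the centroid of a cell is its mean), initial code vectors drawn at random from the training data, and empty cells left unchanged. The function name `lloyd` and its parameters are hypothetical; the procedure in [4] may differ in initialization and stopping rule.

```python
import numpy as np


def lloyd(vectors, n_codes, n_iter=50, tol=1e-9, seed=0):
    """Train a codebook by alternating the two optimality conditions."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    # Initial codebook: distinct training vectors chosen at random (assumption).
    codebook = vectors[rng.choice(len(vectors), size=n_codes, replace=False)].copy()
    previous = np.inf
    for _ in range(n_iter):
        # Nearest-neighbor condition: assign each vector to its closest code vector.
        d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        distortion = d2.min(axis=1).mean()  # average distortion D for this codebook
        # Centroid condition: move each code vector to the mean of its cell.
        for i in range(n_codes):
            members = vectors[labels == i]
            if len(members):
                codebook[i] = members.mean(axis=0)
        if previous - distortion <= tol:
            break
        previous = distortion
    return codebook


training = np.array(
    [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]],
    dtype=float,
)
trained = lloyd(training, n_codes=2)
trained = trained[np.argsort(trained[:, 0])]  # order code vectors for display
```

On this toy data the two code vectors settle at the means of the two clusters. For the system described here the same routine would be called with 37 code vectors and the twenty-dimensional feature vectors as training data.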
Scheme of the closed-circuit television system with speech recognition and speaker verification

Conclusion and future work

Since speech signals are recorded at a frequency of 8000 Hz, the next step in the development of this system will be to allow logging into the system and controlling camera movement over the Internet. Encoding the recorded sound and sending it over the network should not be a problem. Next, the system will be equipped with a module that automatically locates the face, eyes and mouth of a person in the camera frame. These data will be used to verify the identity of persons on the basis of facial asymmetry.
