
Motion Based Bird Sensing Using Frame Differencing and Gaussian Mixture

Deep J. Shah

Introduction

In the past, several background segmentation techniques have been used to identify the objects of interest in a scene. An object of interest can be defined as something in a scene that differs from previous scenes. It is found by comparing a scene containing an object to an ideal scene that contains only the background. The regions that show a considerable change between the current scene and the previous scenes are the areas of interest, as they may indicate the location of the new object.

The term “background segmentation” refers to identifying the difference between an image and its background using any of the techniques mentioned in [4] and then thresholding the results to identify an object of interest. There are several papers that describe various methods of performing background subtraction. However, very few papers describe the algorithms in sufficient detail to make re-implementation easy [4].

Hence, this paper thoroughly describes two background segmentation techniques: frame differencing and Gaussian mixture. We then investigate the usefulness and shortcomings of each method in an environment with a constantly moving background due to bird flight, leaf movement, lighting changes, and other motion in the sky. We monitor the birds as our objects of interest.

Methods: Segmentation Techniques

In the following sections, we describe the two approaches that we use for bird detection. The first section describes how frame differencing is performed, and the second presents the framework behind the Gaussian mixture approach.

Frame Differencing

This section describes one of the most common techniques used in background segmentation. As the name suggests, frame differencing involves taking the difference between two frames and using this difference to detect the object. Our approach is very similar and is a two-part process: first, the object is detected using frame differencing; then the detected object is compared with the ground truth to assess the reliability of the approach.
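
The paper does not specify the comparison metric here; one common way to score a detection mask against a ground-truth mask is pixel-wise intersection over union, sketched below under that assumption (both masks are boolean NumPy arrays, and the helper name is ours):

```python
import numpy as np

def mask_iou(detected: np.ndarray, ground_truth: np.ndarray) -> float:
    """Pixel-wise intersection over union between two boolean masks."""
    detected = detected.astype(bool)
    ground_truth = ground_truth.astype(bool)
    intersection = np.logical_and(detected, ground_truth).sum()
    union = np.logical_or(detected, ground_truth).sum()
    return float(intersection) / float(union) if union > 0 else 1.0
```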

In our implementation, we load two consecutive frames from a given sequence of video frames. These color frames are converted to gray-scale intensity images. Otsu’s method [1] is then used to determine a threshold value for the gray-scale images. The threshold is chosen so that the pixel values on either side of it can be classified as either background or foreground pixels.
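
The exact implementation is not given in the text; a minimal sketch of this step, assuming OpenCV is available and the frames are stored as 8-bit colour images (file names below are placeholders):

```python
import cv2

# Load two consecutive frames from the video sequence (placeholder file names).
frame1 = cv2.imread("frame_001.png")
frame2 = cv2.imread("frame_002.png")

# Convert the colour frames to grey-scale intensity images.
gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

# Otsu's method [1] automatically picks the threshold that best separates
# two intensity classes.  OpenCV reports it on the 0-255 scale; dividing by
# 255 puts it on the [0, 1] scale used in the text (0.43-0.45).
otsu_255, _ = cv2.threshold(gray1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
threshold = otsu_255 / 255.0
```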

Following Otsu’s method, the two consecutive gray-scaled images are differenced, and their absolute difference is used to identify the movement between frames. The noise introduced by differencing is removed by applying the threshold value to the images. The threshold value we find in our case varies between 0.43 and 0.45. Pixels below the threshold are removed from the differenced frame, leaving behind our object of interest. As described in equation (1), the absolute difference between the two frames must be greater than the threshold for the object to be detected.

|F_2 − F_1| > T        (1)

where F_1 is the initial frame, F_2 is the following frame, and T is the threshold value.
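
A minimal sketch of the detection step itself, assuming the two grey-scale frames are NumPy arrays scaled to [0, 1] (the helper name is ours, not the paper's):

```python
import numpy as np

def frame_difference_mask(f1: np.ndarray, f2: np.ndarray, threshold: float) -> np.ndarray:
    """Apply equation (1): keep only pixels whose absolute frame-to-frame
    change exceeds the threshold."""
    diff = np.abs(f2.astype(np.float64) - f1.astype(np.float64))
    return diff > threshold

# Example with intensities scaled to [0, 1] and a threshold in the
# 0.43-0.45 range reported above.
# mask = frame_difference_mask(gray1 / 255.0, gray2 / 255.0, threshold)
```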

Gaussian Mixture

In this section, we describe another technique that is commonly used for performing background segmentation. In their paper [5], Stauffer and Grimson suggest a probabilistic approach using a mixture of Gaussians for identifying the background and foreground objects. The probability of observing a given pixel value p_t at time t is given by [5]:

P(p_t) = \sum_{i=1}^{K} \omega_{i,t} \, \eta(p_t, \mu_{i,t}, \Sigma_{i,t})        (2)
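
A small numerical sketch of equation (2) for a single pixel, assuming a one-dimensional (grey-scale) pixel value so that each covariance matrix reduces to a scalar variance; the parameter values below are illustrative only, not the authors':

```python
import numpy as np

def mixture_probability(p_t, weights, means, variances):
    """Evaluate P(p_t) = sum_i w_{i,t} * eta(p_t; mu_{i,t}, sigma_{i,t}^2)."""
    weights = np.asarray(weights, dtype=np.float64)
    means = np.asarray(means, dtype=np.float64)
    variances = np.asarray(variances, dtype=np.float64)
    gaussians = np.exp(-0.5 * (p_t - means) ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    return float(np.sum(weights * gaussians))

# Example with K = 5 components (illustrative parameters only).
print(mixture_probability(0.52,
                          weights=[0.4, 0.3, 0.15, 0.1, 0.05],
                          means=[0.50, 0.20, 0.80, 0.35, 0.65],
                          variances=[0.01, 0.02, 0.01, 0.03, 0.02]))
```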

where K is the number of Gaussian mixtures that are used. The value of K varies depending on the memory allocated for simulations. The normalized Gaussian \eta is a function of \omega_{i,t}, \mu_{i,t}, and \Sigma_{i,t}, which represent the weight, mean, and covariance matrix of the i-th Gaussian at time t, respectively. The weight indicates the influence of the i-th Gaussian at time t. In our case we choose K = 5 to maximize the distinction amongst pixel values.
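
This is not the paper's own code, but OpenCV ships a Gaussian-mixture background subtractor in the spirit of Stauffer and Grimson [5]; a sketch of using it with K = 5 components per pixel (the video path is a placeholder):

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=False)
subtractor.setNMixtures(5)  # K = 5 Gaussians per pixel, as chosen in the text

cap = cv2.VideoCapture("bird_sequence.mp4")  # placeholder file name
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # apply() updates every per-pixel mixture and returns the foreground mask.
    foreground_mask = subtractor.apply(frame)
cap.release()
```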

Since this is an iterative process, all the parameters are updated with the inclusion of every new pixel. Before the update takes place, the new pixel is compared to see if it matches any of the K existing Gaussians. A match is
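
For context, in the Stauffer and Grimson formulation [5] a pixel is typically taken to match a component when it lies within about 2.5 standard deviations of that component's mean; a minimal sketch of that test for a scalar pixel value (an assumption drawn from [5], not the paper's own code):

```python
import numpy as np

def matches_component(p_t, mean, variance, n_sigma=2.5):
    """Match test in the style of [5]: the pixel value lies within n_sigma
    standard deviations of the component mean."""
    return abs(p_t - mean) <= n_sigma * np.sqrt(variance)
```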

