A Framework for Multiple-Instance Learning

[Figure 1: A motivating example for Diverse Density. (a) The different shapes that a molecule can take on are represented as a path. The intersection point of positive paths is where they took on the same shape. (b) Samples taken along the paths. Section B is a high density area, but point A is a high Diverse Density area.]

Figure 1(a) becomes Figure 1(b). The problem of trying to find an intersection changes to the problem of trying to find an area where there is both a high density of positive points and a low density of negative points. The difficulty with using regular density is illustrated in Figure 1(b), Section B. We are not just looking for high density, but for high "Diverse Density". We define the Diverse Density at a point to be a measure of how many different positive bags have instances near that point, and how far the negative instances are from that point.

2.1 Algorithms for multiple-instance learning

In this section, we derive a probabilistic measure of Diverse Density and test it on a difficult artificial data set. We denote positive bags as $B_i^+$, the $j$th point in that bag as $B_{ij}^+$, and the value of the $k$th feature of that point as $B_{ijk}^+$. Likewise, $B_{ij}^-$ represents a negative point. Assuming for now that the true concept is a single point $t$, we can find it by maximizing $\Pr(x = t \mid B_1^+, \ldots, B_n^+, B_1^-, \ldots, B_m^-)$ over all points $x$ in feature space. If we use Bayes' rule and an uninformative prior over the concept location, this is equivalent to maximizing the likelihood $\Pr(B_1^+, \ldots, B_n^+, B_1^-, \ldots, B_m^- \mid x = t)$. By making the additional assumption that the bags are conditionally independent given the target concept $t$, the best hypothesis is
$$\arg\max_x \prod_i \Pr(B_i^+ \mid x = t) \prod_i \Pr(B_i^- \mid x = t).$$
Using Bayes' rule once more (and again assuming a uniform prior over concept location), this is equivalent to
$$\arg\max_x \prod_i \Pr(x = t \mid B_i^+) \prod_i \Pr(x = t \mid B_i^-). \tag{1}$$
This is a general definition of maximum Diverse Density, but we need to define the terms in the products to instantiate it. One possibility is a noisy-or model: the probability that not all points missed the target is $\Pr(x = t \mid B_i^+) = \Pr(x = t \mid B_{i1}^+, B_{i2}^+, \ldots) = 1 - \prod_j \bigl(1 - \Pr(x = t \mid B_{ij}^+)\bigr)$, and likewise $\Pr(x = t \mid B_i^-) = \prod_j \bigl(1 - \Pr(x = t \mid B_{ij}^-)\bigr)$. We model the causal probability of an individual instance on a potential target as related to the distance between them; namely, $\Pr(x = t \mid B_{ij}) = \exp(-\lVert B_{ij} - x \rVert^2)$. Intuitively, if one of the instances in a positive bag is close to $x$, then $\Pr(x = t \mid B_i^+)$ is high. Likewise, if every positive bag has an instance close to $x$ and no negative bags are close to $x$, then $x$ will have high Diverse Density. Diverse Density at an intersection of $n$ bags is exponentially higher than it is at an intersection of $n - 1$ bags, yet all it takes is one well-placed negative instance to drive the Diverse Density down.
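As a concrete illustration, the noisy-or instantiation of Equation (1) can be sketched in a few lines of NumPy. This is our own minimal sketch, not code from the paper: the function name `diverse_density` and the bag representation (one array of shape `(num_instances, num_features)` per bag) are choices made here for clarity.

```python
import numpy as np

def diverse_density(x, positive_bags, negative_bags):
    """Noisy-or Diverse Density at a candidate concept point x.

    Each bag is an array of shape (num_instances, num_features).
    Instance probabilities follow Pr(x = t | B_ij) = exp(-||B_ij - x||^2).
    """
    dd = 1.0
    for bag in positive_bags:
        # Pr(x = t | B_i^+): at least one instance in the bag hit the target.
        p_inst = np.exp(-np.sum((bag - x) ** 2, axis=1))
        dd *= 1.0 - np.prod(1.0 - p_inst)
    for bag in negative_bags:
        # Pr(x = t | B_i^-): every instance in the bag missed the target.
        p_inst = np.exp(-np.sum((bag - x) ** 2, axis=1))
        dd *= np.prod(1.0 - p_inst)
    return dd

# Toy data: two positive bags that each contain an instance near the
# origin, and one negative bag far from it.
pos = [np.array([[0.0, 0.1], [5.0, 5.0]]),
       np.array([[0.1, 0.0], [-4.0, 3.0]])]
neg = [np.array([[6.0, 6.0]])]
```

On these toy bags, the Diverse Density is high near the origin (where every positive bag has a nearby instance and no negative instance is close) and collapses near (5, 5), where only one positive bag has an instance; maximizing `diverse_density` over `x` would then recover the candidate concept point.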
