07.03.2014 Views

7 IR models based on predicate logic


7.1 General considerations

7.1.1 IR as inference

q: query
d: document

retrieval: search for documents which imply the query:

d → q

example:

d = {t1, t2, t3}
q = {t1, t3}

logical view:

d = t1 ∧ t2 ∧ t3
q = t1 ∧ t3

⇒ d → q
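A minimal sketch of this view in Python (the function name `implies` is an assumption, not part of the slides): for purely conjunctive d and q, the implication d → q reduces to a subset test on the term sets.

```python
# Document and query as sets of index terms (conjunctions of atoms).
d = {"t1", "t2", "t3"}
q = {"t1", "t3"}

def implies(doc_terms, query_terms):
    """d -> q holds for conjunctive d and q iff every query term is a conjunct of d."""
    return query_terms <= doc_terms

print(implies(d, q))   # True
print(implies(q, d))   # False: q does not contain t2
```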

Norbert Fuhr

advantage of inference-based approach:

step from term-based to knowledge-based retrieval

e.g. easy incorporation of additional knowledge

example:

d: 'squares'
q: 'rectangles'
thesaurus: 'squares' → 'rectangles'

⇒ d → q
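One possible way to sketch the thesaurus step in Python (the names `thesaurus` and `closure` are hypothetical): document terms are expanded with everything the thesaurus rules imply, then the query is tested against the expanded set.

```python
# Thesaurus rules of the form narrower -> broader (example data from the slide).
thesaurus = {"squares": {"rectangles"}}

def closure(terms, rules):
    """Add every term implied by the thesaurus until nothing new appears."""
    result = set(terms)
    frontier = list(terms)
    while frontier:
        term = frontier.pop()
        for implied in rules.get(term, ()):
            if implied not in result:
                result.add(implied)
                frontier.append(implied)
    return result

d = {"squares"}
q = {"rectangles"}
print(q <= closure(d, thesaurus))   # True: d -> q via the thesaurus rule
```

The rule is directional: a document about 'rectangles' does not imply a query for 'squares'.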

7.1.2 IR as uncertain inference

d: 'quadrangles'
q: 'rectangles'

⇒ uncertain knowledge required

'quadrangles' → 'rectangles' (weight 0.3)

[Rijsbergen 86]: IR as uncertain inference

Retrieval ≙ estimate probability P(d → q) = P(q|d)

[Figure: query q and document d with terms t1, ..., t6]

7.1.3 Propositional vs. predicate logic

spatio-temporal relationships:

• document attributes

query: documents published after 1990?

?- pubyear(D,Y) & Y>1990

• multimedia retrieval

conventional indexing (based on propositional logic):

d = {tree, house}

query: Is there a tree on the left of the house?

⇒ query cannot be expressed in propositional logic

predicate logic:

d: tree(t1). house(h1). tree(t2).
left(t1, h1). left(h1,t2).

?- tree(X) & house(Y) & left(X,Y).
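As an illustration, the predicate-logic query can be evaluated as a join over the stored facts. The Python below is a sketch with assumed names; only the facts themselves come from the slide.

```python
# Facts of the example scene, one set per predicate.
tree = {"t1", "t2"}
house = {"h1"}
left = {("t1", "h1"), ("h1", "t2")}

def tree_left_of_house():
    """Evaluate ?- tree(X) & house(Y) & left(X,Y) by joining the three relations."""
    return sorted((x, y) for (x, y) in left if x in tree and y in house)

print(tree_left_of_house())   # [('t1', 'h1')]
```

The pair (h1, t2) is rejected because h1 is not a tree, which is the distinction the term set {tree, house} cannot express.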


Ontologies

Thesaurus

[Figure: thesaurus hierarchy with the concepts polygon, regular polygon, triangle, quadrangle, ..., rectangle, regular triangle, square]

thesaurus knowledge: can be expressed in propositional logic

square = quadrangle ∧ regular-polygon

description logics

• instances of concepts
• roles (relationships) between concepts/instances

Advantages of predicate logic:

• modelling of spatial and temporal relationships (e.g. for multimedia retrieval)
• instances of concepts ≈ combination of controlled vocabulary and free text search
• roles/relationships between concepts or instances

→ higher expressiveness for concept definition and description of document content

7.2 RDF

7.2.1 RDF: basic concepts

Resource: object on the WWW, e.g. Web page, database
naming of resources: Uniform Resource Identifier (URI)

Literal: special type of resource, with string value, no explicit URI

Property: aspect / attribute / characteristic / relation

Statement: resource + named property + value of property
(subject, predicate, object)

[Figure: statement as a graph: Norbert --visits--> Pisa]

RDF example

[Figure: RDF graph with the resources ISSDL, M.Agosti, IR-Course and N.Fuhr; properties organized-by, isPartOf, title, teaches, Name, Email; literals "Maristella Agosti", "agosti@...", "Introduction to IR", "Norbert Fuhr", "fuhr@cs.uni-..."]

RDF schemas

similar to semantic networks / description logics

describes relationships between types of resources and/or properties

• fundamental concepts
  – rdfs:Resource
  – rdf:Property
  – rdfs:Class
• schema definition concepts
  – rdf:type
  – rdfs:subClassOf
  – rdfs:subPropertyOf
  – rdfs:seeAlso
  – rdfs:isDefinedBy

RDFS example: resource hierarchy

[Figure: class hierarchy with rdfs:Resource, rdfs:Class, xyz:MotorVehicle, xyz:Van, xyz:Truck, xyz:PassengerVehicle, xyz:MiniVan; edge types rdfs:subClassOf and rdf:type]
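A small sketch of how such a class hierarchy could be queried in Python. The edges of the figure are not recoverable from the text; the ones below are an assumption that follows the vehicle example of the W3C RDF Schema specification, and the helper name `superclasses` is hypothetical.

```python
# Direct rdfs:subClassOf edges (ASSUMED; not taken from the slide text).
sub_class_of = {
    "xyz:Van": {"xyz:MotorVehicle"},
    "xyz:Truck": {"xyz:MotorVehicle"},
    "xyz:PassengerVehicle": {"xyz:MotorVehicle"},
    "xyz:MiniVan": {"xyz:Van", "xyz:PassengerVehicle"},
}

def superclasses(cls):
    """All classes reachable from cls via rdfs:subClassOf (transitive closure)."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in sub_class_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(superclasses("xyz:MiniVan")))
# ['xyz:MotorVehicle', 'xyz:PassengerVehicle', 'xyz:Van']
```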

RDFS example: resource and property hierarchies

[Figure: property hierarchy (visits; tourist-visit, business-visit via rdfs:subPropertyOf) and class hierarchy (Person, Place; ISSDL-Tutor, Conf.-Loc. via rdfs:subClassOf), typed by rdf:Property and rdfs:Class via rdf:type; instance level: N. Fuhr, business-visit, Pisa]

RDF example: image description

[Figure: RDF graph describing a picture; labels: picture, contains, rdf:type, parc, above, artifact, rdfs:subClassOf, sculpture, right-of, man, woman, swan, cherub, socle]

Retrieval with RDF

[Figure: query graph with the variables ?X, ?Y, ?Z; labels: artifact, right-of, man, woman]

Subsumption

retrieval as inference (implication) in description logic: subsumption

find implicit subclasses of query concept

Subsumption in RDF:

resource r2 has property rdfs:subClassOf r1 if

1. r2 is subclass of all superclasses of r1
2. each property of r1 subsumes the corresponding property of r2

a property p2 is subsumed by a property p1 if

1. the properties are equal, or the statement p2 rdfs:subPropertyOf p1 holds.
2. the range of p2 is subsumed by the range of p1.

7.3 Modelling IR in Datalog

7.3.1 Introduction

Datalog:

• Horn predicate logic (most IR models are based on propositional logic)
• no functions
• restricted forms of negation allowed
• sound and complete evaluation algorithms

ground facts:

docTerm(d1,ir).
docTerm(d1,db).
docTerm(d2,ir).
docTerm(d2,oop).

rules:

irdoc(D) :- docTerm(D,ir).
iranddb(D) :- docTerm(D,ir) & docTerm(D,db).
irnotdb(D) :- docTerm(D,ir) & not(docTerm(D,db)).

recursive rules:

link(d1,d2). link(d2,d3). link(d3,d1).

linked(X,Y) :- link(X,Y).
linked(X,Y) :- linked(X,Z) & link(Z,Y).

queries:

?- docTerm(D,ir).
?- docTerm(D,ir) & docTerm(D,db).
?- docTerm(D,ir) & not(docTerm(D,db)).
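To make the semantics concrete, here is a rough Python sketch of how these rules could be evaluated bottom-up. It is not a Datalog engine; the set comprehensions and the helper `linked_closure` are illustrative assumptions.

```python
# Ground facts from the example.
doc_term = {("d1", "ir"), ("d1", "db"), ("d2", "ir"), ("d2", "oop")}
link = {("d1", "d2"), ("d2", "d3"), ("d3", "d1")}

# Non-recursive rules become set comprehensions.
irdoc = {d for (d, t) in doc_term if t == "ir"}
iranddb = {d for d in irdoc if (d, "db") in doc_term}
irnotdb = {d for d in irdoc if (d, "db") not in doc_term}

def linked_closure(edges):
    """Naive bottom-up evaluation of the recursive rules for linked/2."""
    linked = set(edges)                      # linked(X,Y) :- link(X,Y).
    while True:
        # linked(X,Y) :- linked(X,Z) & link(Z,Y).
        new = {(x, y) for (x, z) in linked for (z2, y) in edges if z == z2}
        if new <= linked:                    # fixpoint reached
            return linked
        linked |= new

print(sorted(irdoc), sorted(iranddb), sorted(irnotdb))   # ['d1', 'd2'] ['d1'] ['d2']
print(len(linked_closure(link)))   # 9: on this cycle every document reaches every document
```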

7.3.2 Hypertext structure

docTerm(d1,ir). docTerm(d1,db).
link(d1,d2). link(d2,d3). link(d3,d1).

about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).

[Figure: documents d1, d2, d3 with link edges and docTerm edges to the terms ir and db]

?- about(D,ir)
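The recursive about rule terminates even though the links form a cycle, because bottom-up evaluation stops once no new fact is derived. A hedged Python sketch (the function name `about_relation` is an assumption):

```python
doc_term = {("d1", "ir"), ("d1", "db")}
link = {("d1", "d2"), ("d2", "d3"), ("d3", "d1")}

def about_relation(doc_term, link):
    """Fixpoint of the two about/2 rules over the given facts."""
    about = set(doc_term)                    # about(D,T) :- docTerm(D,T).
    changed = True
    while changed:
        changed = False
        for (d, d1) in link:                 # about(D,T) :- link(D,D1) & about(D1,T).
            for (d2, t) in list(about):
                if d2 == d1 and (d, t) not in about:
                    about.add((d, t))
                    changed = True
    return about

about = about_relation(doc_term, link)
print(sorted(d for (d, t) in about if t == "ir"))   # ['d1', 'd2', 'd3']
```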

7.3.3 Aggregation

[Figure: document structure: book, chapter, section]

part(D,P) :- chapter(D,P).
part(D,P) :- section(D,P).

retrieve node if at least one part is about the search term:

about(D,T) :- part(D,P) & about(P,T).

retrieve node if all its parts are about the search term:

about(D,T) :- part(D,X) & about(X,T) & not(anypart(D,T)).
anypart(D,T) :- part(D,P) & not(about(P,T)).
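The two aggregation strategies differ only in the quantifier over parts. The sketch below uses hypothetical data (a book b1 with chapters c1 and c2) and handles a single level of nesting only, unlike the recursive rules.

```python
# Hypothetical structure: book b1 with chapters c1 and c2.
part = {("b1", "c1"), ("b1", "c2")}
about_leaf = {("c1", "ir"), ("c2", "ir"), ("c2", "db")}

def about_any(node, term):
    """Node is about term if at least one part is about it."""
    return any((p, term) in about_leaf for (d, p) in part if d == node)

def about_all(node, term):
    """Node is about term if it has parts and no part fails to be about it
    (the role of not(anypart(D,T)) in the rules)."""
    parts = [p for (d, p) in part if d == node]
    return bool(parts) and all((p, term) in about_leaf for p in parts)

print(about_any("b1", "db"), about_all("b1", "db"))   # True False
print(about_all("b1", "ir"))                          # True
```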

7.4 Probabilistic Datalog

7.4.1 Motivation

powerful retrieval logic

• expressiveness of Datalog
  – predicate logic (spatio-temporal relationships, instances of concepts)
  – recursion (structured documents, hypertext links, terminological structures)
• uncertain inference: probabilistic inference

Syntax

ground facts with probabilistic weights

0.9 docTerm(d1,ir).
0.5 docTerm(d1,db).
0.8 docTerm(d2,ir).
0.3 docTerm(d2,oop).

?- docTerm(D,ir).

gives

d1 0.9
d2 0.8

?- docTerm(D,ir) & docTerm(D,db).

gives

d1 0.45
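A minimal Python sketch of these two queries, assuming the facts are independent events so that a conjunction multiplies their weights (the function name `conj_query` is hypothetical):

```python
# Probabilistic ground facts: (document, term) -> weight.
doc_term = {("d1", "ir"): 0.9, ("d1", "db"): 0.5,
            ("d2", "ir"): 0.8, ("d2", "oop"): 0.3}

def conj_query(terms):
    """Score each document by the product of the weights of all query terms.
    Assumes the underlying events are independent."""
    docs = {d for (d, _) in doc_term}
    result = {}
    for d in docs:
        if all((d, t) in doc_term for t in terms):
            p = 1.0
            for t in terms:
                p *= doc_term[(d, t)]
            result[d] = p
    return result

print(conj_query(["ir"]))         # d1 with 0.9, d2 with 0.8
print(conj_query(["ir", "db"]))   # d1 with 0.45
```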

Example: Image retrieval

IRIS (Univ. Bremen): automatic indexing of images with semantic concepts

imgobj(O,I,N,L,R,B,T)

O        image object
I        image
N        name of semantic concept
L,R,B,T  bounding rectangle of image object

query: images with water in front of stones

?- imgobj(OA,I,water,L1,R1,B1,T1) ,
   imgobj(OB,I,stone,L2,R2,B2,T2),
   B1

7.4.2 Semantics of probabilistic Datalog

Extensional vs. intensional semantics

0.9 docTerm(d1,ir).
0.5 docTerm(d1,db).
0.7 link(d2,d1).

about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).
q(D) :- about(D,ir) & about(D,db).

extensional semantics:
weight of derived fact as function of weights of subgoals

P(q(d2)) = P(about(d2,ir)) · P(about(d2,db)) = (0.7 · 0.9) · (0.7 · 0.5)

Problem: "improper treatment of correlated sources of evidence" [Pearl]

→ extensional semantics only correct for tree-like inference structures

intensional semantics:
weight of IDB fact as function of weights of underlying ground facts
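The difference can be checked numerically. In the sketch below (Python, event names written as plain strings for illustration), the extensional product counts the shared event link(d2,d1) twice, while the intensional computation multiplies each distinct ground event once.

```python
p = {"dT(d1,ir)": 0.9, "dT(d1,db)": 0.5, "l(d2,d1)": 0.7}

# Extensional: multiply the weights of the two subgoals about(d2,ir), about(d2,db).
about_d2_ir = p["l(d2,d1)"] * p["dT(d1,ir)"]
about_d2_db = p["l(d2,d1)"] * p["dT(d1,db)"]
extensional = about_d2_ir * about_d2_db          # l(d2,d1) enters twice

# Intensional: collect the distinct ground events first, then multiply once each
# (events assumed independent).
events = {"l(d2,d1)", "dT(d1,ir)"} | {"l(d2,d1)", "dT(d1,db)"}
intensional = 1.0
for e in events:
    intensional *= p[e]

print(round(extensional, 4), round(intensional, 4))   # 0.2205 0.315
```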

Implementation of intensional semantics

event keys and event expressions

0.9 docTerm(d1,ir). [dT(d1,ir)]
0.5 docTerm(d1,db). [dT(d1,db)]
0.8 docTerm(d2,ir). [dT(d2,ir)]
0.3 docTerm(d2,oop). [dT(d2,oop)]
0.7 link(d2,d1). [l(d2,d1)]

?- docTerm(D,ir) & docTerm(D,db).

gives

d1 [dT(d1,ir) & dT(d1,db)] 0.9 · 0.5 = 0.45

about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).

?- about(D,ir) & about(D,db).

gives

d2 [l(d2,d1) & dT(d1,ir) & l(d2,d1) & dT(d1,db)] 0.7 · 0.9 · 0.5 = 0.315
d1 [dT(d1,ir) & dT(d1,db)] 0.9 · 0.5 = 0.45

about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).

[Figure: link graph with weights l(d1,d2) 0.5, l(d2,d3) 0.4, l(d3,d1) 0.8; docTerm weights dT(d1,ir) 0.9, dT(d1,db) 0.5]

?- about(D,ir)

d1 [dT(d1,ir) | l(d1,d2) & l(d2,d3) & l(d3,d1) & dT(d1,ir) | ...] 0.900
d3 [l(d3,d1) & dT(d1,ir)] 0.720
d2 [l(d2,d3) & l(d3,d1) & dT(d1,ir)] 0.288

?- about(D,ir) & about(D,db)

d1 [dT(d1,ir) & dT(d1,db)] 0.450
d3 [l(d3,d1) & dT(d1,ir) & l(d3,d1) & dT(d1,db)] 0.360
d2 [l(d2,d3) & l(d3,d1) & dT(d1,ir) & dT(d1,db)] 0.144

computation of probabilities for event expressions

1. transformation of expression into disjunctive normal form
2. application of sieve formula:

c_i: conjunct of event keys

$$P(c_1 \vee \ldots \vee c_n) = \sum_{i=1}^{n} (-1)^{i-1} \sum_{1 \le j_1 < \ldots < j_i \le n} P(c_{j_1} \wedge \ldots \wedge c_{j_i})$$
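The sieve (inclusion-exclusion) formula can be sketched in Python as follows. Assumptions: event keys are independent, a conjunct is represented as a set of keys, and the function name `dnf_probability` is hypothetical. The docTerm weights are those used in the examples of this section.

```python
from itertools import combinations

def dnf_probability(conjuncts, p):
    """P(c_1 v ... v c_n) by the sieve (inclusion-exclusion) formula.

    conjuncts: list of sets of event keys; p: probability of each key.
    Event keys are assumed to be independent.
    """
    total = 0.0
    n = len(conjuncts)
    for i in range(1, n + 1):
        sign = (-1) ** (i - 1)
        for subset in combinations(conjuncts, i):
            keys = set().union(*subset)      # conjunction of conjuncts = union of keys
            prob = 1.0
            for k in keys:
                prob *= p[k]
            total += sign * prob
    return total

p = {"dT(d1,ir)": 0.9, "dT(d1,db)": 0.5}
print(round(dnf_probability([{"dT(d1,ir)"}, {"dT(d1,db)"}], p), 2))   # 0.95
# A conjunct that contains another one is absorbed by it:
print(round(dnf_probability([{"dT(d1,ir)"}, {"dT(d1,ir)", "dT(d1,db)"}], p), 2))   # 0.9
```

Taking the union of keys is what removes duplicate events from a conjunction, so each ground event contributes its probability only once.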


Interpretation of probabilistic weights

possible worlds semantics

0.9 docTerm(d1,ir).

P(W1) = 0.9: {docTerm(d1,ir)}
P(W2) = 0.1: {}

0.9 docTerm(d1,ir).
0.5 docTerm(d1,db).

possible interpretations:

I1:
P(W1) = 0.45: {docTerm(d1,ir)}
P(W2) = 0.45: {docTerm(d1,ir), docTerm(d1,db)}
P(W3) = 0.05: {docTerm(d1,db)}
P(W4) = 0.05: {}

I2:
P(W1) = 0.5: {docTerm(d1,ir)}
P(W2) = 0.4: {docTerm(d1,ir), docTerm(d1,db)}
P(W3) = 0.1: {docTerm(d1,db)}

I3:
P(W1) = 0.4: {docTerm(d1,ir)}
P(W2) = 0.5: {docTerm(d1,ir), docTerm(d1,db)}
P(W3) = 0.1: {}

probabilistic logic:
0.4 ≤ P(docTerm(d1,ir) & docTerm(d1,db)) ≤ 0.5

probabilistic Datalog with independence assumptions:
P(docTerm(d1,ir) & docTerm(d1,db)) = 0.45
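The interval and the point value can be reproduced with a few lines of Python. The bounds used here are the standard ones for a conjunction when only the two marginal probabilities are known; the function names are assumptions.

```python
def conjunction_bounds(pa, pb):
    """Bounds on P(A & B) when only P(A) and P(B) are known."""
    lower = max(0.0, pa + pb - 1.0)   # maximal disjointness (interpretation I2)
    upper = min(pa, pb)               # maximal overlap (interpretation I3)
    return lower, upper

def conjunction_independent(pa, pb):
    """P(A & B) under the independence assumption (interpretation I1)."""
    return pa * pb

lo, hi = conjunction_bounds(0.9, 0.5)
print(round(lo, 2), round(hi, 2))                       # 0.4 0.5
print(round(conjunction_independent(0.9, 0.5), 2))      # 0.45
```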

Disjoint events

example: imprecise attribute values

# py(dk,av).
0.2 py(d3,89).
0.7 py(d3,90).
0.1 py(d3,91).

interpretation:

P(W1) = 0.2: {py(d3,89)}
P(W2) = 0.7: {py(d3,90)}
P(W3) = 0.1: {py(d3,91)}

?- py(X,Y) & Y > 89.

d3 [p(d3,90) | p(d3,91)] 0.7 + 0.1 = 0.8
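Because the py events of one document are disjoint, the probability of a disjunction is simply the sum of the matching weights. A short Python sketch (the function name `published_after` is hypothetical):

```python
# Disjoint alternatives for the publication year of d3.
py = {("d3", 89): 0.2, ("d3", 90): 0.7, ("d3", 91): 0.1}

def published_after(year):
    """?- py(X,Y) & Y > year: add the weights of the disjoint matching events."""
    result = {}
    for (doc, y), prob in py.items():
        if y > year:
            result[doc] = result.get(doc, 0.0) + prob
    return result

print({d: round(p, 2) for d, p in published_after(89).items()})   # {'d3': 0.8}
```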

Probabilistic search term weighting

via disjoint events

0.8 docTerm(d1,db). 0.7 docTerm(d1,ir).

# qtw(av).
0.4 qtw(db). 0.6 qtw(ir).

s(D) :- qtw(X) & docTerm(D,X).

?- s(D).

[Figure: disjoint events qtw(db) 0.4 and qtw(ir) 0.6, with docTerm(d1,db) 0.8 and docTerm(d1,ir) 0.7]

d1 [q(db) & dT(d1,db) | q(ir) & dT(d1,ir)] 0.4 · 0.8 + 0.6 · 0.7 = 0.74
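With disjoint query term events the rule for s(D) amounts to a weighted sum over the query terms. A minimal sketch in Python (the function name `score` is an assumption):

```python
doc_term = {("d1", "db"): 0.8, ("d1", "ir"): 0.7}
qtw = {"db": 0.4, "ir": 0.6}     # disjoint events

def score(doc):
    """s(D) :- qtw(X) & docTerm(D,X).
    The disjuncts are mutually exclusive, so their probabilities add."""
    return sum(w * doc_term.get((doc, t), 0.0) for t, w in qtw.items())

print(round(score("d1"), 2))   # 0.74
```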

Probabilistic rules

rules for deterministic facts:

0.7 likes-sports(X) :- man(X).
0.4 likes-sports(X) :- woman(X).
man(peter).

interpretation:

P(W1) = 0.7: {man(peter), likes-sports(peter)}
P(W2) = 0.3: {man(peter)}

rules for uncertain facts:

# sex(dk,av).
0.7 l-s(X) :- sex(X,male).
0.4 l-s(X) :- sex(X,female).
0.5 sex(X,male) :- human(X).
0.5 sex(X,female) :- human(X).
human(peter).

interpretation:

P(W1) = 0.35: {sex(peter,male), l-s(peter)}
P(W2) = 0.15: {sex(peter,male)}
P(W3) = 0.20: {sex(peter,female), l-s(peter)}
P(W4) = 0.30: {sex(peter,female)}
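The four worlds of the second example can be enumerated mechanically. The Python below is a sketch with assumed variable names; it treats the two sex values as disjoint alternatives and applies the rule weight within each alternative.

```python
p_sex = {"male": 0.5, "female": 0.5}          # disjoint values of sex(peter, .)
p_likes = {"male": 0.7, "female": 0.4}        # rule weights for l-s(X)

worlds = []
for sex, ps in p_sex.items():
    # World in which the rule fires, and world in which it does not.
    worlds.append((round(ps * p_likes[sex], 2), {f"sex(peter,{sex})", "l-s(peter)"}))
    worlds.append((round(ps * (1 - p_likes[sex]), 2), {f"sex(peter,{sex})"}))

for prob, facts in worlds:
    print(prob, sorted(facts))

# Probability of l-s(peter): sum over the worlds that contain it.
p_ls = sum(prob for prob, facts in worlds if "l-s(peter)" in facts)
print(round(p_ls, 2))   # 0.55
```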

Vague predicates

pc(m1,486/dx50,8,540,900).
pc(m2,pe60,16,250,1000).
pc(m3,pe90,16,540,1100).

?- pc(MOD, CPU, MEM, DISK, PRICE), PRICE < 1000

vague predicate ˆ< (builtin)

1.00 ˆ


applications of vague predicates:

• vague fact conditions
• proper name search (string similarity), also for OCRed text
• multimedia IR (e.g. audio retrieval, image retrieval)
