7 IR models based on predicate logic
7.1 General considerations
7.1.1 IR as inference
q: query
d: document
retrieval:
search for documents which imply the query:
d → q
example:
d = {t1, t2, t3}
q = {t1, t3}
logical view:
d = t1 ∧ t2 ∧ t3
q = t1 ∧ t3
⇒ d → q
Norbert Fuhr
advantage of inference-based approach:
step from term-based to knowledge-based retrieval
e.g. easy incorporation of additional knowledge
example:
d: 'squares'
q: 'rectangles'
thesaurus: 'squares' → 'rectangles'
⇒ d → q
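A minimal sketch of retrieval as inference, assuming documents and queries are conjunctions of atomic terms and the thesaurus is a list of (premise, conclusion) pairs. The function names closure and implies are illustrative and not part of the slides.

```python
def closure(terms, rules):
    """Forward-chain implication rules (premise -> conclusion) to a fixpoint."""
    known = set(terms)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def implies(doc_terms, query_terms, rules=()):
    """d -> q holds if every query term is derivable from the document terms."""
    return set(query_terms) <= closure(doc_terms, rules)


# Term-based case: d = t1 ∧ t2 ∧ t3 implies q = t1 ∧ t3.
print(implies({"t1", "t2", "t3"}, {"t1", "t3"}))

# Knowledge-based case: the thesaurus rule makes d -> q derivable.
thesaurus = [("squares", "rectangles")]
print(implies({"squares"}, {"rectangles"}))
print(implies({"squares"}, {"rectangles"}, thesaurus))
```

Without the thesaurus rule the second document is not retrieved; with it, the implication holds.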
7.1.2 IR as uncertain inference
d: 'quadrangles'
q: 'rectangles'
⇒ uncertain knowledge required:
'quadrangles' → 'rectangles' with weight 0.3
[Rijsbergen 86]:
IR as uncertain inference
Retrieval ≙ estimate probability P(d → q) = P(q|d)
[Figure: query q and document d over the terms t1 to t6]
7.1.3 Propositional vs. predicate logic
spatio-temporal relationships:
• document attributes
query: documents published after 1990?
?- pubyear(D,Y) & Y>1990.
• multimedia retrieval
conventional indexing (based on propositional logic):
d = {tree, house}
query: Is there a tree on the left of the house?
⇒ query cannot be expressed in propositional logic
predicate logic:
d: tree(t1). house(h1). tree(t2).
left(t1,h1). left(h1,t2).
?- tree(X) & house(Y) & left(X,Y).
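The predicate-logic query above can be sketched in Python by storing each predicate as a set of tuples and testing every variable binding. The dictionary layout and the function name tree_left_of_house are assumptions made for illustration only.

```python
from itertools import product

# Ground facts of the image description d, one set of tuples per predicate.
facts = {
    "tree": {("t1",), ("t2",)},
    "house": {("h1",)},
    "left": {("t1", "h1"), ("h1", "t2")},
}


def tree_left_of_house(facts):
    """Answer ?- tree(X) & house(Y) & left(X,Y) by checking all bindings."""
    trees = [t[0] for t in facts["tree"]]
    houses = [h[0] for h in facts["house"]]
    return sorted((x, y) for x, y in product(trees, houses)
                  if (x, y) in facts["left"])


print(tree_left_of_house(facts))  # [('t1', 'h1')]
```

Only t1 qualifies: t2 is a tree, but it is the house that is left of t2, which the set-of-terms representation {tree, house} could not distinguish.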
Ontologies
Thesaurus
[Figure: thesaurus hierarchy over the concepts polygon, regular polygon, triangle, quadrangle, ..., rectangle, regular triangle, square]
thesaurus knowledge:
can be expressed in propositional logic
square = quadrangle ∧ regular-polygon
description logics
• instances of concepts
• roles (relationships) between concepts/instances
Advantages of predicate logic:
• modelling of spatial and temporal relationships
(e.g. for multimedia retrieval)
• instances of concepts
≈ combination of controlled vocabulary and free text search
• roles/relationships between concepts or instances
→ higher expressiveness for concept definition and description of document content
7.2 RDF
7.2.1 RDF: basic concepts
Resource: object on the WWW, e.g. Web page, database
naming of resources: Uniform Resource Identifier (URI)
Literal: special type of resource, with string value, no explicit URI
Property: aspect / attribute / characteristic / relation
Statement: resource + named property + value of property
(subject, predicate, object)
example: (Norbert, visits, Pisa)
RDF example
[Figure: RDF graph with the resources ISSDL, M.Agosti, IR-Course and N.Fuhr, the properties organized-by, isPartOf, teaches, title, Name and Email, and the literals "Maristella Agosti", "agosti@...", "Introduction to IR", "Norbert Fuhr", "fuhr@cs.uni-..."]
RDF schemas
similar to semantic networks / description logics
describes relationships between types of resources and/or properties
• fundamental concepts
– rdfs:Resource
– rdf:Property
– rdfs:Class
• schema definition concepts
– rdf:type
– rdfs:subClassOf
– rdfs:subPropertyOf
– rdfs:seeAlso
– rdfs:isDefinedBy
RDFS example: resource hierarchy
[Figure: class hierarchy over rdfs:Resource, rdfs:Class, xyz:MotorVehicle, xyz:Van, xyz:Truck, xyz:PassengerVehicle and xyz:MiniVan; edges are labelled rdfs:subClassOf or rdf:type]
RDFS example: resource and property hierarchies
[Figure: RDF graph relating rdf:Property and rdfs:Class to the property visits (with subproperties tourist-visit and business-visit), the classes Person and Place (with subclasses ISSDL-Tutor and Conf.-Loc.), and the instance statement N. Fuhr business-visit Pisa; edges are labelled rdf:type, rdfs:subPropertyOf or rdfs:subClassOf]
RDF example: image description
[Figure: RDF graph describing a picture; nodes include picture, parc, artifact, sculpture, man, woman, swan, cherub and socle; edges are labelled contains, rdf:type, rdfs:subClassOf, above and right-of]
Retrieval with RDF
[Figure: query graph with the variables ?X, ?Y and ?Z, the concepts artifact, man and woman, and a right-of edge]
Subsumption
retrieval as inference (implication) in description logic: subsumption
find implicit subclasses of query concept
Subsumption in RDF:
resource r2 has property rdfs:subClassOf r1 if
1. r2 is a subclass of all superclasses of r1
2. each property of r1 subsumes the corresponding property of r2
a property p2 is subsumed by a property p1 if
1. the properties are equal, or the statement p2 rdfs:subPropertyOf p1 holds
2. the range of p2 is subsumed by the range of p1
7.3 Modelling IR in Datalog
7.3.1 Introduction
Datalog:
• Horn predicate logic
(most IR models are based on propositional logic)
• no functions
• restricted forms of negation allowed
• sound and complete evaluation algorithms
ground facts:
docTerm(d1,ir).
docTerm(d1,db).
docTerm(d2,ir).
docTerm(d2,oop).
rules:
irdoc(D) :- docTerm(D,ir).
iranddb(D) :- docTerm(D,ir) & docTerm(D,db).
irnotdb(D) :- docTerm(D,ir) & not(docTerm(D,db)).
recursive rules:
link(d1,d2). link(d2,d3). link(d3,d1).
linked(X,Y) :- link(X,Y).
linked(X,Y) :- linked(X,Z) & link(Z,Y).
queries:
?- docTerm(D,ir).
?- docTerm(D,ir) & docTerm(D,db).
?- docTerm(D,ir) & not(docTerm(D,db)).
7.3.2 Hypertext structure
docTerm(d1,ir). docTerm(d1,db).
link(d1,d2). link(d2,d3). link(d3,d1).
about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).
[Figure: documents d1, d2 and d3 connected by link edges; docterm edges from d1 to ir and db]
?- about(D,ir).
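The recursive about rules can be evaluated bottom-up until no new facts appear. The sketch below is a naive fixpoint iteration in Python, not the evaluation strategy of any particular Datalog engine; the function name about_fixpoint is an assumption.

```python
doc_term = {("d1", "ir"), ("d1", "db")}
link = {("d1", "d2"), ("d2", "d3"), ("d3", "d1")}


def about_fixpoint(doc_term, link):
    """Naive bottom-up evaluation of the two about/2 rules."""
    about = set(doc_term)  # about(D,T) :- docTerm(D,T).
    while True:
        # about(D,T) :- link(D,D1) & about(D1,T).
        derived = {(d, t)
                   for (d, d1) in link
                   for (d1_other, t) in about
                   if d1 == d1_other}
        if derived <= about:  # nothing new: fixpoint reached
            return about
        about |= derived


about = about_fixpoint(doc_term, link)
print(sorted(d for d, t in about if t == "ir"))  # ['d1', 'd2', 'd3']
```

Because the links form a cycle, every document reaches d1 and is therefore about ir. The iteration still terminates, since the set of derivable facts is finite.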
7.3.3 Aggregation
[Figure: document hierarchy with the levels book, chapter and section]
part(D,P) :- chapter(D,P).
part(D,P) :- section(D,P).
retrieve node if at least one part is about the search term:
about(D,T) :- part(D,P) & about(P,T).
retrieve node if all its parts are about the search term:
about(D,T) :- part(D,X) & about(X,T) & not(anypart(D,T)).
anypart(D,T) :- part(D,P) & not(about(P,T)).
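The two aggregation strategies can be compared on a small hierarchy. The sketch below uses hypothetical sample data (parts, indexed) and plain recursion over the part-of tree in place of Datalog evaluation; the rule with not(anypart(D,T)) corresponds to about_all.

```python
# Hypothetical document structure: node -> list of direct parts.
parts = {"book": ["ch1", "ch2"], "ch1": ["s1", "s2"], "ch2": ["s3"]}
# Hypothetical leaf-level indexing: (node, term) pairs.
indexed = {("s1", "ir"), ("s2", "ir"), ("s3", "db")}


def about_any(node, term):
    """Node is about term if it is indexed with it or at least one part is."""
    if (node, term) in indexed:
        return True
    return any(about_any(p, term) for p in parts.get(node, []))


def about_all(node, term):
    """Node is about term if it is indexed with it or all of its parts are."""
    if (node, term) in indexed:
        return True
    children = parts.get(node, [])
    return bool(children) and all(about_all(p, term) for p in children)


print(about_any("book", "ir"))  # True: ch1 contains sections about ir
print(about_all("book", "ir"))  # False: ch2 has no part about ir
print(about_all("ch1", "ir"))   # True: both sections are about ir
```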
7.4 Probabilistic Datalog
7.4.1 Motivation
powerful retrieval logic
• expressiveness of Datalog
– predicate logic
(spatio-temporal relationships, instances of concepts)
– recursion
(structured documents, hypertext links, terminological structures)
• uncertain inference:
probabilistic inference
Syntax
ground facts with probabilistic weights
0.9 docTerm(d1,ir).
0.5 docTerm(d1,db).
0.8 docTerm(d2,ir).
0.3 docTerm(d2,oop).
?- docTerm(D,ir).
gives
d1 0.9
d2 0.8
?- docTerm(D,ir) & docTerm(D,db).
gives
d1 0.45
Example: Image retrieval
IRIS (Univ. Bremen): automatic indexing of images with semantic concepts
imgobj(O,I,N,L,R,B,T)
O: image object
I: image
N: name of semantic concept
L,R,B,T: bounding rectangle of image object
query: images with water in front of stones
?- imgobj(OA,I,water,L1,R1,B1,T1),
imgobj(OB,I,stone,L2,R2,B2,T2),
B1
7.4.2 Semantics of probabilistic Datalog
Extensional vs. intensional semantics
0.9 docTerm(d1,ir).
0.5 docTerm(d1,db).
0.7 link(d2,d1).
about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).
q(D) :- about(D,ir) & about(D,db).
extensional semantics:
weight of derived fact as function of weights of subgoals
P(q(d2)) = P(about(d2,ir)) · P(about(d2,db)) = (0.7 · 0.9) · (0.7 · 0.5)
Problem:
"improper treatment of correlated sources of evidence" [Pearl]
→ extensional semantics only correct for tree-like inference structures
intensional semantics:
weight of IDB fact as function of weights of underlying ground facts
Implementation of intensional semantics
event keys and event expressions
0.9 docTerm(d1,ir). [dT(d1,ir)]
0.5 docTerm(d1,db). [dT(d1,db)]
0.8 docTerm(d2,ir). [dT(d2,ir)]
0.3 docTerm(d2,oop). [dT(d2,oop)]
0.7 link(d2,d1). [l(d2,d1)]
?- docTerm(D,ir) & docTerm(D,db).
gives
d1 [dT(d1,ir) & dT(d1,db)] 0.9 · 0.5 = 0.45
about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).
?- about(D,ir) & about(D,db).
gives
d2 [l(d2,d1) & dT(d1,ir) & l(d2,d1) & dT(d1,db)] 0.7 · 0.9 · 0.5 = 0.315
d1 [dT(d1,ir) & dT(d1,db)] 0.9 · 0.5 = 0.45
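The gap between the two semantics can be checked numerically for q(d2). The sketch below assumes independent event keys and computes the intensional probability by enumerating all possible worlds; this brute-force helper (event_probability) is illustrative and not how an actual engine evaluates event expressions.

```python
from itertools import product

# Probabilities of the basic events (event keys) from the facts above.
prob = {"dT(d1,ir)": 0.9, "dT(d1,db)": 0.5, "l(d2,d1)": 0.7}


def event_probability(expr, prob):
    """Sum the weights of all possible worlds in which expr holds.

    expr maps a truth assignment (dict: event key -> bool) to a bool.
    Event keys are assumed to be independent.
    """
    keys = sorted(prob)
    total = 0.0
    for values in product([True, False], repeat=len(keys)):
        world = dict(zip(keys, values))
        if expr(world):
            weight = 1.0
            for k in keys:
                weight *= prob[k] if world[k] else 1.0 - prob[k]
            total += weight
    return total


def q_d2(w):
    # Event expression of q(d2): l(d2,d1) occurs twice but is one event.
    return (w["l(d2,d1)"] and w["dT(d1,ir)"]) and (w["l(d2,d1)"] and w["dT(d1,db)"])


intensional = event_probability(q_d2, prob)
extensional = (0.7 * 0.9) * (0.7 * 0.5)  # multiplies l(d2,d1) in twice
print(round(intensional, 4), round(extensional, 4))
```

The extensional product counts the link weight 0.7 twice and yields about 0.2205, while the intensional value is 0.7 · 0.9 · 0.5 = 0.315.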
about(D,T) :- docTerm(D,T).
about(D,T) :- link(D,D1) & about(D1,T).
[Figure: documents d1, d2 and d3 connected by weighted link edges; weighted docterm edges from d1 to ir and db; weights shown: 0.8, 0.4, 0.5, 0.9, 0.5]
?- about(D,ir).
d1 [dT(d1,ir) | l(d1,d2) & l(d2,d3) & l(d3,d1) & dT(d1,ir) | ...] 0.900
d3 [l(d3,d1) & dT(d1,ir)] 0.720
d2 [l(d2,d3) & l(d3,d1) & dT(d1,ir)] 0.288
?- about(D,ir) & about(D,db).
d1 [dT(d1,ir) & dT(d1,db)] 0.450
d3 [l(d3,d1) & dT(d1,ir) & l(d3,d1) & dT(d1,db)] 0.360
d2 [l(d2,d3) & l(d3,d1) & dT(d1,ir) & dT(d1,db)] 0.144
computation of probabilities for event expressions
1. transformation of expression into disjunctive normal form
2. application of sieve formula:
c_i: conjunct of event keys
$$P(c_1 \vee \ldots \vee c_n) = \sum_{i=1}^{n} (-1)^{i-1} \sum_{1 \le j_1 < \ldots < j_i \le n} P(c_{j_1} \wedge \ldots \wedge c_{j_i})$$
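A minimal sketch of the sieve (inclusion-exclusion) formula, assuming each conjunct is a set of independent event keys, so that the probability of a conjunction is the product over the distinct keys involved. The function name sieve_probability is illustrative, and the link weights are read off the cyclic-link example as an assumption; the result for d1 does not depend on them.

```python
from itertools import combinations


def sieve_probability(conjuncts, prob):
    """P(c_1 v ... v c_n) by inclusion-exclusion over DNF conjuncts.

    conjuncts: list of sets of event keys (each set is a conjunction).
    prob: probability of each event key; keys are assumed independent.
    """
    n = len(conjuncts)
    total = 0.0
    for i in range(1, n + 1):
        sign = (-1) ** (i - 1)
        for subset in combinations(conjuncts, i):
            # Conjoining conjuncts merges their key sets, so a key that
            # occurs in several conjuncts is counted only once.
            keys = frozenset().union(*subset)
            term = 1.0
            for k in keys:
                term *= prob[k]
            total += sign * term
    return total


# about(d1,ir): direct indexing, or one trip around the link cycle.
prob = {"dT(d1,ir)": 0.9, "l(d1,d2)": 0.5, "l(d2,d3)": 0.4, "l(d3,d1)": 0.8}
dnf = [
    {"dT(d1,ir)"},
    {"l(d1,d2)", "l(d2,d3)", "l(d3,d1)", "dT(d1,ir)"},
]
print(round(sieve_probability(dnf, prob), 3))
```

The second conjunct is absorbed by the first, so the cycle adds nothing and the probability stays at 0.9. The number of terms grows exponentially with the number of conjuncts, which is why the sketch is only practical for small expressions.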
Interpretation of probabilistic weights
possible worlds semantics
0.9 docTerm(d1,ir).
P(W1) = 0.9: {docTerm(d1,ir)}
P(W2) = 0.1: {}
0.9 docTerm(d1,ir).
0.5 docTerm(d1,db).
possible interpretations:
I1:
P(W1) = 0.45: {docTerm(d1,ir)}
P(W2) = 0.45: {docTerm(d1,ir), docTerm(d1,db)}
P(W3) = 0.05: {docTerm(d1,db)}
P(W4) = 0.05: {}
I2:
P(W1) = 0.5: {docTerm(d1,ir)}
P(W2) = 0.4: {docTerm(d1,ir), docTerm(d1,db)}
P(W3) = 0.1: {docTerm(d1,db)}
I3:
P(W1) = 0.4: {docTerm(d1,ir)}
P(W2) = 0.5: {docTerm(d1,ir), docTerm(d1,db)}
P(W3) = 0.1: {}
probabilistic logic:
0.4 ≤ P(docTerm(d1,ir) & docTerm(d1,db)) ≤ 0.5
probabilistic Datalog with independence assumptions:
P(docTerm(d1,ir) & docTerm(d1,db)) = 0.45
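The interval 0.4 to 0.5 follows from the marginal probabilities alone: a conjunction is at most as likely as its less likely member and at least p1 + p2 − 1 (the Fréchet bounds). The sketch below, with the illustrative name conjunction_bounds, reproduces the interval and the point value under independence.

```python
def conjunction_bounds(p_a, p_b):
    """Bounds on P(A & B) when only the marginals P(A), P(B) are known."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper


p_ir, p_db = 0.9, 0.5
lower, upper = conjunction_bounds(p_ir, p_db)
independent = p_ir * p_db  # probabilistic Datalog: independence assumed

print(round(lower, 2), round(upper, 2), round(independent, 2))
```

Interpretation I2 attains the lower bound, I3 the upper bound, and I1 is the independent case that probabilistic Datalog selects.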
Disjoint events
example: imprecise attribute values
# py(dk,av).
0.2 py(d3,89).
0.7 py(d3,90).
0.1 py(d3,91).
interpretation:
P(W1) = 0.2: {py(d3,89)}
P(W2) = 0.7: {py(d3,90)}
P(W3) = 0.1: {py(d3,91)}
?- py(X,Y) & Y > 89.
d3 [py(d3,90) | py(d3,91)] 0.7 + 0.1 = 0.8
Probabilistic search term weighting
via disjoint events
0.8 docTerm(d1,db). 0.7 docTerm(d1,ir).
# qtw(av).
0.4 qtw(db). 0.6 qtw(ir).
s(D) :- qtw(X) & docTerm(D,X).
?- s(D).
[Figure: events 0.4 qtw(db), 0.6 qtw(ir), 0.7 docTerm(d1,ir), 0.8 docTerm(d1,db)]
d1 [q(db) & dT(d1,db) | q(ir) & dT(d1,ir)]
0.4 · 0.8 + 0.6 · 0.7 = 0.74
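Because the query term events are declared disjoint, the two conjuncts cannot hold in the same world and their probabilities simply add up, which gives a weighted sum over the query terms. A minimal sketch, assuming the dictionary layout and the function name score:

```python
doc_term = {("d1", "db"): 0.8, ("d1", "ir"): 0.7}
qtw = {"db": 0.4, "ir": 0.6}  # disjoint events: the weights sum to 1


def score(doc, doc_term, qtw):
    """P(s(doc)) for s(D) :- qtw(X) & docTerm(D,X) with disjoint qtw events."""
    return sum(weight * doc_term.get((doc, term), 0.0)
               for term, weight in qtw.items())


print(round(score("d1", doc_term, qtw), 2))
```

With independent rather than disjoint query term events, the sieve formula would subtract the overlap of the two conjuncts and the score would be lower.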
Probabilistic rules
rules for deterministic facts:
0.7 likes-sports(X) :- man(X).
0.4 likes-sports(X) :- woman(X).
man(peter).
interpretation:
P(W1) = 0.7: {man(peter), likes-sports(peter)}
P(W2) = 0.3: {man(peter)}
rules for uncertain facts:
# sex(dk,av).
0.7 l-s(X) :- sex(X,male).
0.4 l-s(X) :- sex(X,female).
0.5 sex(X,male) :- human(X).
0.5 sex(X,female) :- human(X).
human(peter).
interpretation:
P(W1) = 0.35: {sex(peter,male), l-s(peter)}
P(W2) = 0.15: {sex(peter,male)}
P(W3) = 0.20: {sex(peter,female), l-s(peter)}
P(W4) = 0.30: {sex(peter,female)}
Vague predicates
pc(m1,486/dx50,8,540,900).
pc(m2,pe60,16,250,1000).
pc(m3,pe90,16,540,1100).
?- pc(MOD, CPU, MEM, DISK, PRICE), PRICE < 1000
vague predicate ˆ< (builtin)
1.00 ˆ
applications of vague predicates:
• vague fact conditions
• proper name search (string similarity)
(also OCRed text)
• multimedia IR
(e.g. audio retrieval, image retrieval)