07.03.2014 Views

7 IR models based on predicate logic


7.1 General considerations

7.1.1 IR as inference

q: query

d: document

retrieval: search for documents that imply the query:

d → q

example:

d = {t1, t2, t3}

q = {t1, t3}

logical view:

d = t1 ∧ t2 ∧ t3

q = t1 ∧ t3

⇒ d → q
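A minimal sketch of this inference view in Python, assuming documents and queries are plain sets of terms. The function name `implies` is illustrative and does not come from the slides. Under the conjunctive reading above, d → q holds exactly when every query term also occurs in d.

```python
def implies(document_terms, query_terms):
    """Return True if d -> q, i.e. every query term occurs in the document."""
    return set(query_terms) <= set(document_terms)


d = {"t1", "t2", "t3"}
q = {"t1", "t3"}

print(implies(d, q))  # the document implies the query
print(implies(q, d))  # the converse fails because t2 is missing from q
```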

Norbert Fuhr


advantage of inference-based approach:

step from term-based to knowledge-based retrieval

e.g. easy incorporation of additional knowledge

example:

d: 'squares'

q: 'rectangles'

thesaurus: 'squares' → 'rectangles'

⇒ d → q
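One way to picture the thesaurus example is to close the document's term set under the thesaurus implications before testing d → q. The sketch below assumes the thesaurus is a dictionary from a narrower term to its broader terms; the names `expand` and `implies_with_thesaurus` are hypothetical.

```python
def expand(terms, thesaurus):
    """Close a term set under implications of the form narrower -> broader."""
    closed = set(terms)
    frontier = list(terms)
    while frontier:
        term = frontier.pop()
        for broader in thesaurus.get(term, ()):
            if broader not in closed:
                closed.add(broader)
                frontier.append(broader)
    return closed


def implies_with_thesaurus(document_terms, query_terms, thesaurus):
    """d -> q holds if every query term follows from the document terms."""
    return set(query_terms) <= expand(document_terms, thesaurus)


thesaurus = {"squares": ["rectangles"]}
print(implies_with_thesaurus({"squares"}, {"rectangles"}, thesaurus))
print(implies_with_thesaurus({"rectangles"}, {"squares"}, thesaurus))
```

The second call shows that the implication is directional: a document about rectangles does not imply a query about squares.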


7.1.2 IR as uncertain inference

d: 'quadrangles'

q: 'rectangles'

⇒ uncertain knowledge required

'quadrangles' → 'rectangles' (with weight 0.3)

[Rijsbergen 86]: IR as uncertain inference

Retrieval ≙ estimate probability P(d → q) = P(q|d)

[Figure: q, terms t1 to t6, d]


7.1.3 Propositional vs. predicate logic

spatio-temporal relationships:

• document attributes

query: documents published after 1990?

?- pubyear(D,Y) & Y > 1990.

• multimedia retrieval

conventional indexing (based on propositional logic):

d = {tree, house}

query: Is there a tree on the left of the house?

⇒ query cannot be expressed in propositional logic

predicate logic:

d: tree(t1). house(h1). tree(t2).

left(t1,h1). left(h1,t2).

?- tree(X) & house(Y) & left(X,Y).
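The predicate-logic query can be read as a join over the stored facts. A small Python sketch, assuming each predicate is kept as a set of constants or tuples (this representation is an illustration, not something the slides prescribe):

```python
# Facts of the image description.
tree = {"t1", "t2"}
house = {"h1"}
left = {("t1", "h1"), ("h1", "t2")}

# ?- tree(X) & house(Y) & left(X,Y).
answers = [(x, y) for (x, y) in sorted(left) if x in tree and y in house]
print(answers)
```

Only the pair (t1, h1) qualifies. The fact left(h1,t2) does not match because its first argument is a house, which is exactly the distinction a term set {tree, house} cannot express.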


Ontologies

Thesaurus

[Figure: thesaurus hierarchy with the nodes polygon, regular polygon, triangle, quadrangle, ..., rectangle, regular triangle, square]

thesaurus knowledge:

can be expressed in propositional logic

square = quadrangle ∧ regular-polygon

description logics

• instances of concepts

• roles (relationships) between concepts/instances


Advantages of predicate logic:

• modelling of spatial and temporal relationships (e.g. for multimedia retrieval)

• instances of concepts

≈ combination of controlled vocabulary and free text search

• roles/relationships between concepts or instances

→ higher expressiveness for concept definition and description of document content


7.2 RDF

7.2.1 RDF: basic concepts

Resource: object on the WWW, e.g. Web page, database

naming of resources: Uniform Resource Identifier (URI)

Literal: special type of resource, with string value, no explicit URI

Property: aspect / attribute / characteristic / relation

Statement: resource + named property + value of property

(subject, predicate, object)

example: (Norbert, visits, Pisa)


RDF example

[Figure: RDF graph. Resources: ISSDL, M.Agosti, IR-Course, N.Fuhr. Properties: organized-by, isPartOf, title, teaches, Name, Email. Literals: "Maristella Agosti", "agosti@...", "Introduction to IR", "Norbert Fuhr", "fuhr@cs.uni-..."]


RDF schemas

similar to semantic networks / description logics

describe relationships between types of resources and/or properties

• fundamental concepts

– rdfs:Resource

– rdf:Property

– rdfs:Class

• schema definition concepts

– rdf:type

– rdfs:subClassOf

– rdfs:subPropertyOf

– rdfs:seeAlso

– rdfs:isDefinedBy


RDFS example: resource hierarchy

[Figure: class hierarchy over rdfs:Resource, rdfs:Class, xyz:MotorVehicle, xyz:Van, xyz:Truck, xyz:PassengerVehicle and xyz:MiniVan; edge types: rdfs:subClassOf, rdf:type]


RDFS example: resource and property hierarchies

[Figure: graph over rdf:Property, rdfs:Class, visits, Person, Place, tourist-visit, business-visit, ISSDL-Tutor, Conf.-Loc., N. Fuhr and Pisa; edge labels: rdf:type, rdfs:subPropertyOf, rdfs:subClassOf, visits, business-visit]




RDF example: image description

[Figure: RDF description of a picture with the nodes picture, park, artifact, sculpture, man, woman, swan, cherub, socle; edge labels: contains, rdf:type, above, rdfs:subClassOf, right-of]


Retrieval with RDF

[Figure: query graph with the variables ?X, ?Y, ?Z, the nodes artifact, man, woman and the edge label right-of]




Subsumption

retrieval as inference (implication) in description logic: subsumption

find implicit subclasses of query concept

Subsumption in RDF:

resource r2 has property rdfs:subClassOf r1 if

1. r2 is a subclass of all superclasses of r1

2. each property of r1 subsumes the corresponding property of r2

a property p2 is subsumed by a property p1 if

1. the properties are equal, or the statement p2 rdfs:subPropertyOf p1 holds

2. the range of p2 is subsumed by the range of p1


7.3 Modelling IR in Datalog

7.3.1 Introduction

Datalog:

• Horn predicate logic (most IR models are based on propositional logic)

• no functions

• restricted forms of negation allowed

• sound and complete evaluation algorithms


ground facts:

docTerm(d1,ir).

docTerm(d1,db).

docTerm(d2,ir).

docTerm(d2,oop).

rules:

irdoc(D) :- docTerm(D,ir).

iranddb(D) :- docTerm(D,ir) & docTerm(D,db).

irnotdb(D) :- docTerm(D,ir) & not(docTerm(D,db)).

recursive rules:

link(d1,d2). link(d2,d3). link(d3,d1).

linked(X,Y) :- link(X,Y).

linked(X,Y) :- linked(X,Z) & link(Z,Y).

queries:

?- docTerm(D,ir).

?- docTerm(D,ir) & docTerm(D,db).

?- docTerm(D,ir) & not(docTerm(D,db)).
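The rules above can be evaluated bottom-up. The following Python sketch is one naive way to do this with sets; it is not meant to describe how an actual Datalog engine works, and the function name `transitive_closure` is illustrative.

```python
# Ground facts from the slide.
doc_term = {("d1", "ir"), ("d1", "db"), ("d2", "ir"), ("d2", "oop")}
link = {("d1", "d2"), ("d2", "d3"), ("d3", "d1")}

# irdoc(D) :- docTerm(D,ir).
irdoc = {d for (d, t) in doc_term if t == "ir"}
# iranddb(D) :- docTerm(D,ir) & docTerm(D,db).
iranddb = {d for d in irdoc if (d, "db") in doc_term}
# irnotdb(D) :- docTerm(D,ir) & not(docTerm(D,db)).
irnotdb = {d for d in irdoc if (d, "db") not in doc_term}


def transitive_closure(edges):
    """Naive fixpoint iteration for the recursive 'linked' rules."""
    linked = set(edges)  # linked(X,Y) :- link(X,Y).
    while True:
        # linked(X,Y) :- linked(X,Z) & link(Z,Y).
        derived = {(x, y) for (x, z) in linked for (z2, y) in edges if z == z2}
        if derived <= linked:
            return linked  # nothing new can be derived
        linked |= derived


linked = transitive_closure(link)
print(sorted(irdoc), sorted(iranddb), sorted(irnotdb))
print(len(linked))
```

Because the three links form a cycle, the fixpoint contains every ordered pair of documents, including pairs such as (d1, d1). The loop still terminates since the set of derivable facts is finite.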


7.3.2 Hypertext structure

docTerm(d1,ir). docTerm(d1,db).

link(d1,d2). link(d2,d3). link(d3,d1).

about(D,T) :- docTerm(D,T).

about(D,T) :- link(D,D1) & about(D1,T).

[Figure: documents d1, d2, d3 with link edges and docTerm edges to ir and db]

?- about(D,ir).


7.3.3 Aggregation

[Figure: document structure: book, chapter, section]

part(D,P) :- chapter(D,P).

part(D,P) :- section(D,P).

retrieve node if at least one part is about the search term:

about(D,T) :- part(D,P) & about(P,T).

retrieve node if all its parts are about the search term:

about(D,T) :- part(D,X) & about(X,T) & not(anypart(D,T)).

anypart(D,T) :- part(D,P) & not(about(P,T)).
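The two aggregation strategies can be compared on a tiny example. The sketch below is written under several assumptions: the structure (`b1` with parts `c1` and `c2`) and the index are hypothetical, the part hierarchy is acyclic, and a node that is indexed directly with a term counts as being about it (the docTerm base case from 7.3.2).

```python
# Hypothetical structure and index, not taken from the slides.
part = {("b1", "c1"), ("b1", "c2")}
doc_term = {("c1", "ir"), ("c2", "ir"), ("c2", "db")}


def parts_of(node):
    """All direct parts of a node."""
    return {p for (d, p) in part if d == node}


def about_any(node, term):
    """Node is about term if it is indexed with it or at least one part is."""
    return (node, term) in doc_term or any(
        about_any(p, term) for p in parts_of(node)
    )


def about_all(node, term):
    """Node is about term if it is indexed with it, or it has parts and all are."""
    if (node, term) in doc_term:
        return True
    children = parts_of(node)
    return bool(children) and all(about_all(p, term) for p in children)


print(about_any("b1", "db"), about_all("b1", "db"))
print(about_any("b1", "ir"), about_all("b1", "ir"))
```

Under the first strategy the book is retrieved for db because one chapter mentions it; under the second it is not, because chapter c1 is not about db. For ir both strategies retrieve the book.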


7.4 Probabilistic Datalog

7.4.1 Motivation

powerful retrieval logic

• expressiveness of Datalog

– predicate logic (spatio-temporal relationships, instances of concepts)

– recursion (structured documents, hypertext links, terminological structures)

• uncertain inference: probabilistic inference


Syntax

ground facts with probabilistic weights

0.9 docTerm(d1,ir).

0.5 docTerm(d1,db).

0.8 docTerm(d2,ir).

0.3 docTerm(d2,oop).

?- docTerm(D,ir).

gives

d1 0.9

d2 0.8

?- docTerm(D,ir) & docTerm(D,db).

gives

d1 0.45


Example: Image retrieval

IRIS (Univ. Bremen): automatic indexing of images with semantic concepts

imgobj(O,I,N,L,R,B,T)

O: image object

I: image

N: name of semantic concept

L,R,B,T: bounding rectangle of image object

query: images with water in front of stones

?- imgobj(OA,I,water,L1,R1,B1,T1),

imgobj(OB,I,stone,L2,R2,B2,T2),

B1




7.4.2 Semantics of probabilistic Datalog

Extensional vs. intensional semantics

0.9 docTerm(d1,ir).

0.5 docTerm(d1,db).

0.7 link(d2,d1).

about(D,T) :- docTerm(D,T).

about(D,T) :- link(D,D1) & about(D1,T).

q(D) :- about(D,ir) & about(D,db).

extensional semantics:

weight of derived fact as function of weights of subgoals

P(q(d2)) = P(about(d2,ir)) · P(about(d2,db)) = (0.7 · 0.9) · (0.7 · 0.5)

Problem:

“improper treatment of correlated sources of evidence” [Pearl]

→ extensional semantics only correct for tree-like inference structures

intensional semantics:

weight of IDB fact as function of weights of underlying ground facts
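The difference between the two semantics shows up numerically for q(d2), because both subgoals depend on the same fact link(d2,d1). The Python sketch below is only an illustration: it multiplies subgoal weights for the extensional case and multiplies the weights of the distinct underlying ground facts for the intensional case, assuming those facts are independent. The string keys are labels chosen for this example.

```python
from math import prod

# Weights of the ground facts from the slide.
p = {"docTerm(d1,ir)": 0.9, "docTerm(d1,db)": 0.5, "link(d2,d1)": 0.7}

# Extensional: combine the weights of the two subgoals directly.
p_about_ir = p["link(d2,d1)"] * p["docTerm(d1,ir)"]
p_about_db = p["link(d2,d1)"] * p["docTerm(d1,db)"]
extensional = p_about_ir * p_about_db

# Intensional: collect the underlying ground facts; the shared link counts once.
facts = {"link(d2,d1)", "docTerm(d1,ir)"} | {"link(d2,d1)", "docTerm(d1,db)"}
intensional = prod(p[f] for f in facts)

print(round(extensional, 4), round(intensional, 4))
```

The extensional value squares the link weight (0.7 enters twice), which is the improper treatment of correlated evidence named above; the intensional value uses it once.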


Implementation of intensional semantics

event keys and event expressions

0.9 docTerm(d1,ir). [dT(d1,ir)]

0.5 docTerm(d1,db). [dT(d1,db)]

0.8 docTerm(d2,ir). [dT(d2,ir)]

0.3 docTerm(d2,oop). [dT(d2,oop)]

0.7 link(d2,d1). [l(d2,d1)]

?- docTerm(D,ir) & docTerm(D,db).

gives

d1 [dT(d1,ir) & dT(d1,db)] 0.9 · 0.5 = 0.45

about(D,T) :- docTerm(D,T).

about(D,T) :- link(D,D1) & about(D1,T).

?- about(D,ir) & about(D,db).

gives

d2 [l(d2,d1) & dT(d1,ir) & l(d2,d1) & dT(d1,db)] 0.7 · 0.9 · 0.5 = 0.315

d1 [dT(d1,ir) & dT(d1,db)] 0.9 · 0.5 = 0.45


about(D,T) :- docTerm(D,T).

about(D,T) :- link(D,D1) & about(D1,T).

[Figure: hypertext graph over d1, d2, d3 with weighted link edges and weighted docTerm edges to ir and db; weights shown: 0.8, 0.4, 0.5, 0.9, 0.5]

?- about(D,ir).

d1 [dT(d1,ir) | l(d1,d2) & l(d2,d3) & l(d3,d1) & dT(d1,ir) | ...] 0.900

d3 [l(d3,d1) & dT(d1,ir)] 0.720

d2 [l(d2,d3) & l(d3,d1) & dT(d1,ir)] 0.288

?- about(D,ir) & about(D,db).

d1 [dT(d1,ir) & dT(d1,db)] 0.450

d3 [l(d3,d1) & dT(d1,ir) & l(d3,d1) & dT(d1,db)] 0.360

d2 [l(d2,d3) & l(d3,d1) & dT(d1,ir) & dT(d1,db)] 0.144


computation of probabilities for event expressions

1. transformation of expression into disjunctive normal form

2. application of sieve formula:

c_i: conjunct of event keys

$$P(c_1 \vee \ldots \vee c_n) = \sum_{i=1}^{n} (-1)^{i-1} \sum_{1 \le j_1 < \ldots < j_i \le n} P(c_{j_1} \wedge \ldots \wedge c_{j_i})$$
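A short Python sketch of the sieve (inclusion-exclusion) computation for an expression already in disjunctive normal form, assuming all event keys are independent. The function name `dnf_probability` is hypothetical, and the link weights in the example are an assumption read off the results listed earlier for d2 and d3.

```python
from itertools import combinations
from math import prod


def dnf_probability(conjuncts, weights):
    """Sieve formula for a disjunction of conjuncts of independent event keys.

    Each conjunct is a set of event keys. The conjunction of several
    conjuncts is the union of their keys, so a repeated key counts once.
    """
    total = 0.0
    for i in range(1, len(conjuncts) + 1):
        sign = (-1) ** (i - 1)
        for chosen in combinations(conjuncts, i):
            keys = frozenset().union(*chosen)
            total += sign * prod(weights[k] for k in keys)
    return total


# Assumed weights (inferred from the listed results, not stated explicitly).
weights = {
    "dT(d1,ir)": 0.9,
    "l(d1,d2)": 0.5,
    "l(d2,d3)": 0.4,
    "l(d3,d1)": 0.8,
}
# First two conjuncts of the event expression for d1 in ?- about(D,ir).
d1_expr = [
    {"dT(d1,ir)"},
    {"l(d1,d2)", "l(d2,d3)", "l(d3,d1)", "dT(d1,ir)"},
]
p_d1 = dnf_probability(d1_expr, weights)
print(round(p_d1, 3))
```

The second conjunct contains the first, so its contribution cancels in the sieve sum and the result stays at the weight of dT(d1,ir). Note that the number of terms grows exponentially with the number of conjuncts.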


Interpretation of probabilistic weights

possible worlds semantics

0.9 docTerm(d1,ir).

P(W1) = 0.9: {docTerm(d1,ir)}

P(W2) = 0.1: {}


0.9 docTerm(d1,ir).

0.5 docTerm(d1,db).

possible interpretations:

I1:

P(W1) = 0.45: {docTerm(d1,ir)}

P(W2) = 0.45: {docTerm(d1,ir), docTerm(d1,db)}

P(W3) = 0.05: {docTerm(d1,db)}

P(W4) = 0.05: {}

I2:

P(W1) = 0.5: {docTerm(d1,ir)}

P(W2) = 0.4: {docTerm(d1,ir), docTerm(d1,db)}

P(W3) = 0.1: {docTerm(d1,db)}

I3:

P(W1) = 0.4: {docTerm(d1,ir)}

P(W2) = 0.5: {docTerm(d1,ir), docTerm(d1,db)}

P(W3) = 0.1: {}

probabilistic logic:

0.4 ≤ P(docTerm(d1,ir) & docTerm(d1,db)) ≤ 0.5

probabilistic Datalog with independence assumptions:

P(docTerm(d1,ir) & docTerm(d1,db)) = 0.45
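The interval 0.4 to 0.5 matches the general bounds on a conjunction when only the two marginal probabilities are known: max(0, P(a) + P(b) − 1) ≤ P(a & b) ≤ min(P(a), P(b)). This is a standard result that the slide does not state explicitly. A minimal Python sketch, with the hypothetical helper name `conjunction_bounds`:

```python
def conjunction_bounds(p_a, p_b):
    """Bounds on P(a & b) when only the marginals P(a) and P(b) are known."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper


lower, upper = conjunction_bounds(0.9, 0.5)
independent = 0.9 * 0.5  # value under the independence assumption

print(round(lower, 2), round(upper, 2), round(independent, 2))
```

Interpretations I2 and I3 realise the lower and upper bound respectively, while I1 is the independent case that probabilistic Datalog assumes.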


Disjoint events

example: imprecise attribute values

# py(dk,av).

0.2 py(d3,89).

0.7 py(d3,90).

0.1 py(d3,91).

interpretation:

P(W1) = 0.2: {py(d3,89)}

P(W2) = 0.7: {py(d3,90)}

P(W3) = 0.1: {py(d3,91)}

?- py(X,Y) & Y > 89.

d3 [p(d3,90) | p(d3,91)] 0.7 + 0.1 = 0.8


Probabilistic search term weighting

via disjoint events

0.8 docTerm(d1,db). 0.7 docTerm(d1,ir).

# qtw(av).

0.4 qtw(db). 0.6 qtw(ir).

s(D) :- qtw(X) & docTerm(D,X).

?- s(D).

[Figure: events 0.4 qtw(db), 0.6 qtw(ir), 0.7 docTerm(d1,ir), 0.8 docTerm(d1,db)]

d1 [q(db) & dT(d1,db) | q(ir) & dT(d1,ir)] 0.4 · 0.8 + 0.6 · 0.7 = 0.74
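Because the qtw events are declared disjoint, the two conjuncts in the event expression exclude each other and their probabilities simply add. The result is a weighted sum over the query terms, which the following Python sketch reproduces; the function name `score` and the dictionary layout are illustrative assumptions.

```python
# Weights from the slide.
doc_term = {("d1", "db"): 0.8, ("d1", "ir"): 0.7}
qtw = {"db": 0.4, "ir": 0.6}  # disjoint events, so the weights sum to 1


def score(doc):
    """Sum of query term weight times document term weight over all query terms."""
    return sum(w * doc_term.get((doc, term), 0.0) for term, w in qtw.items())


print(round(score("d1"), 2))
```

A document that is not indexed with any query term receives a score of 0 under this scheme.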


Probabilistic rules

rules for deterministic facts:

0.7 likes-sports(X) :- man(X).

0.4 likes-sports(X) :- woman(X).

man(peter).

interpretation:

P(W1) = 0.7: {man(peter), likes-sports(peter)}

P(W2) = 0.3: {man(peter)}

rules for uncertain facts:

# sex(dk,av).

0.7 l-s(X) :- sex(X,male).

0.4 l-s(X) :- sex(X,female).

0.5 sex(X,male) :- human(X).

0.5 sex(X,female) :- human(X).

human(peter).

interpretation:

P(W1) = 0.35: {sex(peter,male), l-s(peter)}

P(W2) = 0.15: {sex(peter,male)}

P(W3) = 0.20: {sex(peter,female), l-s(peter)}

P(W4) = 0.30: {sex(peter,female)}
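The four worlds of the second example can be enumerated mechanically. The Python sketch below assumes that the two sex values are disjoint events and that each rule weight is the conditional probability of l-s given the corresponding sex; the variable names are illustrative.

```python
p_sex = {"male": 0.5, "female": 0.5}    # disjoint events for sex(peter, .)
p_likes = {"male": 0.7, "female": 0.4}  # rule weights for l-s(X)

worlds = []
for sex, p in p_sex.items():
    # World in which the rule fires for this sex value.
    worlds.append((p * p_likes[sex], {f"sex(peter,{sex})", "l-s(peter)"}))
    # World in which it does not fire.
    worlds.append((p * (1 - p_likes[sex]), {f"sex(peter,{sex})"}))

# Probability of l-s(peter): sum over the worlds that contain the fact.
p_ls = sum(prob for prob, facts in worlds if "l-s(peter)" in facts)

for prob, facts in worlds:
    print(round(prob, 2), sorted(facts))
print(round(p_ls, 2))
```

Summing the worlds W1 and W3 gives the probability that l-s(peter) holds, and the four world probabilities add up to 1.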


Vague predicates

pc(m1,486/dx50,8,540,900).

pc(m2,pe60,16,250,1000).

pc(m3,pe90,16,540,1100).

?- pc(MOD, CPU, MEM, DISK, PRICE), PRICE < 1000

vague predicate ˆ< (builtin)

1.00 ˆ


applications of vague predicates:

• vague fact conditions

• proper name search (string similarity) (also OCRed text)

• multimedia IR (e.g. audio retrieval, image retrieval)
