10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee

5 Web Linkage Mining

5.2 Co-citation and Bibliographic Coupling

Citation analysis is an area of bibliometric research that studies citations to establish the relationship between authors and their work. When a paper cites another paper, a relationship (or link) is established between the two papers. Citation analysis uses these links to perform various types of analysis. We introduce two specific types of citation analysis related to the HITS algorithm: co-citation and bibliographic coupling.

5.2.1 Co-citation

Co-citation is used to measure the similarity of two documents in clustering. If papers i and j are cited together by many papers, then i and j have a strong relationship or similarity. The more papers cite them together, the stronger their relationship is. Figure 5.1 illustrates this idea.

Fig. 5.1. Papers i and j are co-cited by paper k

Let $L$ be the citation matrix. Each cell of the matrix is defined as follows: $L_{ij} = 1$ if paper i cites paper j, and 0 otherwise. Co-citation (denoted by $C_{ij}$) is defined as the number of papers that cite both i and j, and is computed with

$$C_{ij} = \sum_{k=1}^{n} L_{ki} L_{kj} = (L^{T} L)_{ij} \qquad (5.1)$$

where n is the total number of papers. The values $C_{ij}$ form a square matrix $C$, called the co-citation matrix.

5.2.2 Bibliographic Coupling

Bibliographic coupling is also a similarity measure of two papers in clustering. It can be viewed as the mirror image of co-citation. The main idea is that if paper i and paper j both cite paper k, they may be said to be related, even though they do not cite each other directly (see Fig. 5.2).

$B_{ij}$ is the number of papers that are cited by both paper i and paper j:

$$B_{ij} = \sum_{k=1}^{n} L_{ik} L_{jk} = (L L^{T})_{ij} \qquad (5.2)$$

where n is the total number of papers. The values $B_{ij}$ form a square matrix $B$, called the bibliographic coupling matrix.
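The two definitions can be checked on a small example. The following is a minimal sketch in plain Python, not code from the book: the helper names (`transpose`, `matmul`, `cocitation`, `bibliographic_coupling`) and the four-paper citation matrix are hypothetical and chosen only for illustration. It builds C as the transpose of L times L (Eq. 5.1) and B as L times the transpose of L (Eq. 5.2).

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def cocitation(L):
    """C[i][j] = number of papers k that cite both i and j (Eq. 5.1)."""
    return matmul(transpose(L), L)


def bibliographic_coupling(L):
    """B[i][j] = number of papers k cited by both i and j (Eq. 5.2)."""
    return matmul(L, transpose(L))


# Hypothetical toy data: L[i][j] = 1 if paper i cites paper j.
# Papers 0 and 1 each cite papers 2 and 3; paper 2 cites paper 3.
L = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

C = cocitation(L)
B = bibliographic_coupling(L)

print(C[2][3])  # papers 2 and 3 are co-cited by papers 0 and 1 -> 2
print(B[0][1])  # papers 0 and 1 share the references 2 and 3 -> 2
```

In this sketch both matrices come out symmetric. The diagonal entry C[i][i] counts how many papers cite paper i, and B[i][i] counts how many papers paper i cites. For a large citation graph a sparse matrix library would likely be a better fit than nested lists.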
