Web Mining and Social Networking: Techniques and ... - tud.ttu.ee

4 Web Content Mining

In recent years the growth of the World Wide Web has exceeded all expectations. Today there are several billion HTML documents, pictures and other multimedia files available via the Internet, and the number is still rising. Given the impressive variety of the Web, however, retrieving interesting content has become a very difficult task. Web Content Mining uses the ideas and principles of data mining and knowledge discovery to screen more specific data.

Using the Web as a provider of information is unfortunately more complex than working with static databases. Because of its very dynamic nature and its vast number of documents, there is a need for new solutions that do not depend on accessing the complete data at the outset. Another important aspect is the presentation of query results. Because of the Web's enormous size, a Web query can retrieve thousands of resulting Web pages. Meaningful methods for presenting these large result sets are therefore necessary to help a user select the most interesting content. In this chapter we discuss several basic topics: Web document representation, Web search, short text processing, topic extraction and Web opinion mining.

4.1 Vector Space Model

The representation of a set of documents as vectors in a common vector space is known as the vector space model and is fundamental to a host of information retrieval operations, ranging from scoring documents on a query to document classification and document clustering. We first develop the notion of a document vector that captures the relative importance of the terms in a document.

Towards this end, we assign to each term in a document a weight that depends on the number of occurrences of the term in the document. We would like to compute a score between a query term t and a document d, based on the weight of t in d. The simplest approach is to assign the weight to be equal to the number of occurrences of term t in document d. This weighting scheme is referred to as term frequency and is denoted tf_{t,d}, with the subscripts denoting the term and the document in order.

Raw term frequency as above suffers from a critical problem: all terms are considered equally important when it comes to assessing relevance to a query. In fact, certain terms have little or no discriminating power in determining relevance. For instance, a collection of documents on the auto industry is likely to have the term auto in almost every document. To this end, we introduce a mechanism for attenuating the effect of terms that occur too often in the

G. Xu et al., Web Mining and Social Networking, DOI 10.1007/978-1-4419-7735-9_4, © Springer Science+Business Media, LLC 2011
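
The raw term frequency weighting described in Section 4.1 can be illustrated with a minimal Python sketch. The three-document toy collection, the whitespace tokenizer, and the helper names term_frequencies and score are assumptions made for this illustration and do not come from the book.

```python
from collections import Counter


def term_frequencies(document: str) -> Counter:
    """Return raw term counts tf_{t,d} for one document.

    Tokenization is a simple lowercase whitespace split (an assumption;
    a real system would also handle punctuation, stemming, and so on).
    """
    return Counter(document.lower().split())


# Hypothetical toy collection about the auto industry.
docs = {
    "d1": "auto makers report record auto sales",
    "d2": "auto insurance rates rise",
    "d3": "new auto plant opens",
}

# tf[d][t] is the number of occurrences of term t in document d.
tf = {doc_id: term_frequencies(text) for doc_id, text in docs.items()}


def score(query_term: str, doc_id: str) -> int:
    """Score a document for a single query term by raw term frequency."""
    return tf[doc_id][query_term]  # Counter returns 0 for absent terms


if __name__ == "__main__":
    for doc_id in docs:
        print(doc_id, "auto:", score("auto", doc_id), "sales:", score("sales", doc_id))
```

In this toy collection the term auto occurs in every document, so its raw count says little about which document is relevant, whereas sales occurs in only one. This is the weakness of raw term frequency that the text points out.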
