10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee



2 Theoretical Backgrounds

information management and retrieval. Web researchers and engineers are called on to develop more efficient and effective techniques to satisfy the demands of Web users. Web data mining is one such family of techniques: it handles the tasks of searching for needed information on the Internet, improving Web site structure to raise Internet service quality, and discovering informative knowledge from the Internet for advanced Web applications. In principle, Web mining techniques apply data mining methods to induce and extract useful information from Web information and services. Web mining research has attracted academics and researchers from database management, information retrieval and artificial intelligence, especially from knowledge discovery and machine learning. Many research communities have addressed the topic in recent years, driven by the tremendous growth of data available on the Internet and, in particular, by the urgent needs of e-commerce applications. Depending on the mining target, Web data mining can be categorized into three types: Web content mining, Web structure mining and Web usage mining.
In the following chapters, we will systematically present the research studies and applications carried out in the context of Web content, Web linkage and Web usage mining.

To implement Web mining efficiently, it is essential to first introduce a solid mathematical framework on which the data mining and analysis is performed. Many types of data expression can be used to model the co-occurrence of interactions between Web users and pages, such as matrices, directed graphs and click sequences. Different data expression models have different mathematical and theoretical backgrounds, and therefore lead to different algorithms and approaches. We mainly adopt the commonly used matrix expression as the analytic scheme, since it is widely used in various Web mining contexts. Under this scheme, the interactive observations between Web users and pages, and the mutual relationships between Web pages, are modeled as a co-occurrence matrix, for example a page hyperlink adjacency (inlink or outlink) matrix or a session-pageview matrix. Based on this mathematical framework, a variety of data mining and analysis operations can be employed to conduct Web mining.

2.2 Textual, Linkage and Usage Expressions

As described, the starting point of Web mining is to choose appropriate data models. To accomplish the mining tasks discussed above, different Web data models, expressed as feature vectors, are used in pattern mining and knowledge discovery.
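The matrix expression described above can be illustrated with a minimal sketch. The page names, sessions and links below are hypothetical sample data, not taken from the text, and the choice of visit counts as the weights of the session-pageview matrix is an assumption (other weights, such as binary visits, are equally possible).

```python
# Hypothetical sample data: three pages, two user sessions, three hyperlinks.
pages = ["index.html", "products.html", "contact.html"]
sessions = [
    ["index.html", "products.html", "index.html"],
    ["contact.html"],
]
links = [
    ("index.html", "products.html"),
    ("index.html", "contact.html"),
    ("products.html", "index.html"),
]


def session_pageview_matrix(sessions, pages):
    """Rows are sessions, columns are pages.

    Entry (i, j) counts how often page j was viewed in session i
    (visit counts are an assumed weighting).
    """
    column = {page: j for j, page in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in sessions]
    for i, visited in enumerate(sessions):
        for page in visited:
            matrix[i][column[page]] += 1
    return matrix


def outlink_adjacency_matrix(links, pages):
    """Entry (i, j) is 1 if page i links to page j, else 0."""
    index = {page: i for i, page in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in pages]
    for source, target in links:
        matrix[index[source]][index[target]] = 1
    return matrix


def transpose(matrix):
    """Transposing the outlink matrix gives the inlink matrix."""
    return [list(row) for row in zip(*matrix)]


usage = session_pageview_matrix(sessions, pages)
outlinks = outlink_adjacency_matrix(links, pages)
inlinks = transpose(outlinks)
```

Each row of `usage` is a session expressed as a vector over pages, and each row of `outlinks` or `inlinks` describes one page by the pages it links to or is linked from.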
According to the three identified categories of Web mining, three types of Web data sources, namely content data, structure data and usage data, are mostly considered in the context of Web mining. Before proposing different Web data models, we first briefly discuss these three data types in the following paragraphs.

Web content data is a collection of objects used to convey content information of Web pages to users. In most cases, it comprises textual material and other types of multimedia content, which include static HTML/XML pages, images, sound
