
business intelligence and analytics: from big data to big impact



Chen et al./Introduction: Business Intelligence Research

service (SaaS), infrastructure as a service (IaaS), or platform as a service (PaaS). Only a few leading IT vendors are currently positioned to support high-end, high-throughput BI&A applications using cloud computing. For example, Amazon Elastic Compute Cloud (EC2) enables users to rent virtual computers on which to run their own computer applications. Its Simple Storage Service (S3) provides online storage web service. Google App Engine provides a platform for developing and hosting Java or Python-based web applications. Google Bigtable is used for backend data storage. Microsoft’s Windows Azure platform provides cloud services such as SQL Azure and SharePoint, and allows .Net framework applications to run on the platform. The industry-led web and cloud services offer unique data collection, processing, and analytics challenges for BI&A researchers.

In academia, current web analytics related research encompasses social search and mining, reputation systems, social media analytics, and web visualization. In addition, web-based auctions, Internet monetization, social marketing, and web privacy/security are some of the promising research directions related to web analytics.
Many of these emerging research areas may rely on advances in social network analysis, text analytics, and even economics modeling research.

Network Analytics

Network analytics is a nascent research area that has evolved from the earlier citation-based bibliometric analysis to include new computational models for online community and social network analysis. Grounded in bibliometric analysis, citation networks and coauthorship networks have long been adopted to examine scientific impact and knowledge diffusion. The h-index is a good example of a citation metric that aims to measure the productivity and impact of the published work of a scientist or scholar (Hirsch 2005). Since the early 2000s, network science has begun to advance rapidly with contributions from sociologists, mathematicians, and computer scientists. Various social network theories, network metrics, topology, and mathematical models have been developed that help understand network properties and relationships (e.g., centrality, betweenness, cliques, paths; ties, structural holes, structural balance; random network, small-world network, scale-free network) (Barabási 2003; Watts 2003).

Recent network analytics research has focused on areas such as link mining and community detection. In link mining, one seeks to discover or predict links between nodes of a network. Within a network, nodes may represent customers, end users, products and/or services, and the links between nodes may represent social relationships, collaboration, e-mail exchanges, or product adoptions.
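
As a brief aside on the h-index mentioned above, the metric can be computed in a few lines: a scholar has index h if h of their papers have at least h citations each. The sketch below is illustrative only; the function name and the citation counts are hypothetical and do not come from the article.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
example_citations = [10, 8, 5, 4, 3]
print(h_index(example_citations))
```

Sorting first keeps the logic easy to verify by hand. For very long publication lists a counting-based variant would avoid the sort, but that is rarely a concern at the scale of a single author.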
One can conduct link mining using only the topology information (Liben-Nowell and Kleinberg 2007). Techniques such as common neighbors, Jaccard’s coefficient, Adamic Adar measure, and Katz measure are popular for predicting missing or future links. The link mining accuracy can be further improved when the node and link attributes are considered. Community detection is also an active research area of relevance to BI&A (Fortunato 2010). By representing networks as graphs, one can apply graph partitioning algorithms to find a minimal cut to obtain dense subgraphs representing user communities.

Many techniques have been developed to help study the dynamic nature of social networks. For example, agent-based models (sometimes referred to as multi-agent systems) have been used to study disease contact networks and criminal or terrorist networks (National Research Council 2008). Such models simulate the actions and interactions of autonomous agents (of either individual or collective entities such as organizations or groups) with the intent of assessing their effects on the system as a whole. Social influence and information diffusion models are also viable techniques for studying evolving networks. Some research is particularly relevant to opinion and information dynamics of a society. Such dynamics hold many qualitative similarities to disease infections (Bettencourt et al. 2006). Another network analytics technique that has drawn attention in recent years is the use of exponential random graph models (Frank and Strauss 1986; Robins et al. 2007).
ERGMs are a family of statistical models for analyzing data about social and other networks. To support statistical inference on the processes influencing the formation of network structure, ERGMs consider the set of all possible alternative networks weighted on their similarity to an observed network. In addition to studying traditional friendship or disease networks, ERGMs are promising for understanding the underlying network properties that cause the formation and evolution of customer, citizen, or patient networks for BI&A.

Most of the abovementioned network analytics techniques are not part of the existing commercial BI&A platforms. Significant open-source development efforts are underway from the social network analysis community. Tools such as UCINet (Borgatti et al. 2002) and Pajek (Batagelj and Mrvar 1998) have been developed and are widely used for large-scale network analysis and visualization. New network analytics tools such as ERGM have also been made available to the academic community (Hunter et al. 2008). Online virtual communities, criminal and terrorist networks, social and political networks, and trust and reputation networks are some of the promising new applications for network analytics.

MIS Quarterly Vol. 36 No. 4/December 2012 13
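
To make the topology-only link prediction measures discussed in this section more concrete, the following minimal sketch scores unlinked node pairs with common neighbors, Jaccard's coefficient, and the Adamic Adar measure. It is an illustration under stated assumptions, not the method of any cited work: the toy network, the node names, and all function names are hypothetical, and the Katz measure is omitted because it needs path counts rather than neighbor sets.

```python
import math
from itertools import combinations

# Hypothetical undirected toy network: node -> set of neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def common_neighbors(g, u, v):
    """Number of neighbors that u and v share."""
    return len(g[u] & g[v])


def jaccard(g, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def adamic_adar(g, u, v):
    """Sum of 1 / log(degree) over shared neighbors, so that shared
    neighbors with few connections count more than hubs."""
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v] if len(g[z]) > 1)


def rank_candidate_links(g, score):
    """Rank all currently unlinked pairs by the given score, best first."""
    candidates = [(u, v) for u, v in combinations(sorted(g), 2) if v not in g[u]]
    return sorted(candidates, key=lambda pair: score(g, *pair), reverse=True)


for name, fn in [("common neighbors", common_neighbors),
                 ("Jaccard", jaccard),
                 ("Adamic Adar", adamic_adar)]:
    print(name, rank_candidate_links(graph, fn))
```

In this toy network all three scores place the pair ("a", "d") first, because those two nodes share both of a's neighbors. On real data the measures can disagree, and node or link attributes may improve accuracy further, as the text notes.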
