Open Source meets Business Intelligence An Introduction to Pentaho


Open Source meets Business Intelligence<br />

An Introduction to Pentaho<br />

Seminar “Business Intelligence”<br />

06.02.07 Konstanz<br />

Monika Podolecheva


Agenda<br />

BI concept and goals<br />

BI Market Overview<br />

Proprietary BI Vendors<br />

Open Source and BI Open Source Market<br />

Open Source BI Vendors<br />

OS SW: pros and cons<br />

From Data Gathering, Processing and Analysis to Report<br />

JasperSoft Inc.<br />

Pentaho Products<br />

Pentaho Examples and Demo


BI: Concept and Objectives<br />

BI Concept: Techniques such as data warehousing, data mining, and reporting, based on the gathering, storage, preprocessing, analysis, and reporting of data<br />

Objectives: help companies gain more comprehensive knowledge of the factors affecting their business and make better business decisions.<br />

Huge amount of unstructured data<br />

BI Solutions<br />

Strategic planning, deriving trends, objectives definition


BI Market Segmentation and Overview<br />

• Query, reporting, and analysis software includes ad hoc query and multidimensional analysis tools as well as dashboards and production reporting tools. Query and reporting tools are designed specifically to support ad hoc data access and report building by either IT or business users.<br />

• Advanced analytics software includes data mining and statistical software and uses technologies such as neural networks, rule induction, and clustering to discover relationships in data and make predictions.<br />

Source: IDC, July 2006


BI Tools Revenue Share by Region and by Operating Environment<br />

Source: IDC, July 2006


BI Market Facts and Trends<br />

• The BI market is growing because applying BI tools leads to<br />

– better market analysis<br />

– better budget controlling<br />

– better strategy planning<br />

and therefore brings a higher rate of return.<br />

• Broader adoption of BI software is expected to continue as more end users gain access to query and reporting tools and as organizations embed BI software into operational applications supporting all business processes.<br />

• IDC shows an optimistic trend:<br />

Source: IDC, August 2006<br />

→ The BI market is dominated by larger, full-service companies, such as IBM and Oracle, and specialized vendors, such as SAS, Cognos, Business Objects and Hyperion.


Proprietary BI Vendors<br />

• Arcplan<br />

• Actuate<br />

• Business Objects<br />

• Cognos<br />

• Hyperion Solutions<br />

• Information Builders<br />

• Microsoft<br />

• MicroStrategy<br />

• Oracle<br />

• Panorama Software<br />

• SAP<br />

• SAS Institute, etc.<br />

Source: Gartner (January 2007)


Open Source and BI Open Source Market<br />

First signs that OSS is coming into the BI tools market: vendors such as Pentaho, JasperSoft, and Actuate clearly display the first signs of a potential market niche.<br />

The impact of open source BI tools will be very limited over the next five years.<br />

During the latter part of the current 15-year cycle of the BI market, OSS may develop into a stronger competitive force (IDC, 2006), especially because of the cost of commercial tools.<br />

Trends: OS databases widely used<br />

BI OS Strategy:<br />

JasperSoft idea: BI becomes embedded in individual applications and much more transparent. And by making a complex function affordable, that function becomes universal.


Open Source BI Vendors<br />

• Pentaho – Complete solution with the Pentaho BI Suite<br />

• Palo – MOLAP server (German vendor Jedox)<br />

• JasperSoft – Specialist for reporting tools such as JasperReports<br />

• BIRT – Reporting solution of the Eclipse Foundation (backed by Actuate)<br />

• Weka – Data mining algorithm collection from the University of Waikato


From Data Gathering, Processing and Analysis to Report (1)<br />

• Data gathering and storage: the open source databases MySQL and PostgreSQL, and the Jedox database Palo<br />

• Preprocessing: Extraction, Transformation & Loading (ETL): the tool Kettle, the CloverETL framework and Enhydra Octopus<br />

• Analysis:<br />

OLAP (Online Analytical Processing) – The most popular free OLAP server is the Java-based Mondrian project;<br />

Data mining – Weka is an algorithm collection for data mining


From Data Gathering, Processing and Analysis to Report (2)<br />

• Reporting engines such as the Java library JasperReports (JasperSoft)<br />

• Visualizing the reports: iReport Designer (JasperSoft)<br />

Further: combining these with the OLAP server JasperAnalysis and the ETL tool JasperETL → JasperIntelligence suite BI framework.<br />

• Pentaho: offers ETL, analysis, reporting and workflow solutions that can be combined in the Pentaho BI Suite. These solutions can be integrated into standalone applications (under the Mozilla Public License, Version 1.1).<br />

• The German partner of Pentaho has been Ancud IT since May 2006.


OS SW: pros and cons<br />

Advantages OSS<br />

– no license costs<br />

– reduced dependence on software vendors<br />

– flexible and easier to customize<br />

Disadvantages OSS<br />

– lack of long-term support<br />

– lack of long-term maintenance


JasperSoft Inc.<br />

JasperSoft was founded in 2004 in San Francisco, CA, USA, and has fewer than 50 employees. JasperSoft delivers commercial open source software in the area of Business Intelligence.<br />

Specialist in reporting, analysis and integration<br />

• Open source products:<br />

– JasperReports - delivers reports to the screen, printer or into PDF, HTML, XLS, CSV and XML files; can stand alone or be embedded directly into a user's application to give it advanced reporting capabilities.<br />

– JasperServer - interactive and managed reporting for JasperReports<br />

– JasperAnalysis - interactive data analysis / OLAP server<br />

– JasperETL - high-performance data integration<br />

– iReport - powerful graphical report designer<br />

• Commercial line: JasperDecisions


Pentaho<br />

Pentaho was founded in 2004 in Orlando, USA.<br />

Pentaho manages, facilitates, supports, and takes the lead development role in the Pentaho BI Project - a pioneering initiative by the open source development community to provide organizations with a comprehensive set of Business Intelligence (BI) capabilities that enable them to radically improve business performance, efficiency, and effectiveness.<br />

Experienced team:<br />

Founded by industry veterans with a track record of delivering successful BI products for leading commercial vendors including Business Objects, Cognos, Hyperion, IBM, Oracle, and SAS.


Pentaho Products: Reporting<br />

Pentaho Reporting (formerly known as JFreeReport)<br />

allows organizations to easily access, format, and distribute information to employees, customers, and partners.<br />

Pentaho Reporting has the following features:<br />

– Full on-screen print preview<br />

– Output to the screen, printer or various export formats: PDF, HTML, CSV, Excel<br />

– Support for servlets (uses the JFreeReport extensions)<br />

– Complete source code included (subject to the GNU LGPL)<br />

– Extensive source code documentation<br />

– Unmatched flexibility through a heavily modularized architecture, etc.
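
The idea of rendering one report definition to several export formats can be sketched in a few lines of Python. This is only an illustration of the concept, not the Pentaho Reporting or JFreeReport API; the function names `to_csv` and `to_html` and the sample rows are assumptions made up for this example.

```python
import csv
import html
import io


def to_csv(header, rows):
    """Render the report rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def to_html(header, rows):
    """Render the same rows as a minimal HTML table."""
    def cells(tag, values):
        # Escape every value so that characters like '&' stay valid HTML.
        return "".join(f"<{tag}>{html.escape(str(v))}</{tag}>" for v in values)

    lines = ["<table>", f"<tr>{cells('th', header)}</tr>"]
    lines += [f"<tr>{cells('td', row)}</tr>" for row in rows]
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical sample data, not taken from the slides.
header = ["Region", "Sales"]
rows = [["North", 1200], ["South & West", 950]]

print(to_csv(header, rows))
print(to_html(header, rows))
```

Both functions consume the same `header` and `rows`, which mirrors the separation between report data and output target that a reporting engine provides.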


Pentaho Products: Analysis<br />

Pentaho Analysis (the Mondrian project)<br />

helps organizations operate with maximum effectiveness by gaining the insights and understanding needed to make optimal decisions. Mondrian is an OLAP server that enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.<br />

Pentaho Analysis has the following features:<br />

– Integrates directly with Microsoft Excel PivotTable services, supporting data refresh, drill-down, data pivoting and more<br />

– Provides an easy, interactive way for business users to analyze critical business information, by exploring the data to quickly uncover trends or anomalies. For example, a user looking at sales information for last year could easily “drill down” from the yearly summary to break out sales by quarter, compare sales across product lines, or analyze specific sales performance in different geographic regions<br />

– Allows users to enhance the data with Excel formatting and Excel charts, etc.
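
The drill-down from a yearly summary to quarters described above amounts to re-aggregating the same facts along a finer dimension. The following Python sketch shows that idea on a tiny in-memory fact list; it does not use Mondrian, and the `SALES` data, the `DIMENSIONS` mapping and the `aggregate` helper are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, product_line, amount)
SALES = [
    (2006, "Q1", "Hardware", 100.0),
    (2006, "Q1", "Software", 50.0),
    (2006, "Q2", "Hardware", 120.0),
    (2006, "Q3", "Software", 80.0),
    (2006, "Q4", "Hardware", 90.0),
]

# Position of each dimension inside a fact tuple.
DIMENSIONS = {"year": 0, "quarter": 1, "product_line": 2}


def aggregate(facts, by, where=None):
    """Sum amounts grouped by the dimensions in `by`, optionally filtered."""
    where = where or {}
    totals = defaultdict(float)
    for fact in facts:
        if all(fact[DIMENSIONS[d]] == v for d, v in where.items()):
            key = tuple(fact[DIMENSIONS[d]] for d in by)
            totals[key] += fact[3]
    return dict(totals)


# Yearly summary, then a drill-down of 2006 into quarters.
yearly = aggregate(SALES, by=["year"])
by_quarter = aggregate(SALES, by=["quarter"], where={"year": 2006})

print(yearly)      # {(2006,): 440.0}
print(by_quarter)  # {('Q1',): 150.0, ('Q2',): 120.0, ('Q3',): 80.0, ('Q4',): 90.0}
```

Comparing product lines works the same way, for example `aggregate(SALES, by=["product_line"])`. An OLAP server performs this kind of regrouping against the database so that the user does not have to write the queries by hand.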


Pentaho Products: Data Integration<br />

Pentaho Data Integration (the Kettle project)<br />

delivers powerful Extraction, Transformation and Loading (ETL) capabilities<br />

Pentaho Data Integration is used for:<br />

– Data warehouse population with built-in support for slowly changing dimensions, junk dimensions and much more<br />

– Export of database(s) to text file(s) or other databases<br />

– Import of data into databases, ranging from text files to Excel sheets<br />

– Data migration between database applications<br />

– Exploration of data in existing databases (tables, views, synonyms, ...)<br />

– Information enrichment by looking up data in various information stores (databases, text files, Excel sheets, ...)<br />

– Data cleaning by applying complex conditions in data transformations<br />

– Application integration
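
To make the term "slowly changing dimension" concrete, here is a minimal Python sketch of a Type 2 update, where a changed attribute closes the current dimension row and appends a new version instead of overwriting it. This is not how Kettle is configured or implemented; the `apply_scd2` function, the row layout and the sample customer are assumptions for illustration only.

```python
from datetime import date


def apply_scd2(dimension, key, attributes, today):
    """Type 2 update: close the current row if attributes changed, then add a new version."""
    current = next(
        (r for r in dimension if r["key"] == key and r["valid_to"] is None),
        None,
    )
    if current is not None and current["attributes"] == attributes:
        return dimension  # nothing changed, keep the current version
    if current is not None:
        current["valid_to"] = today  # close the outdated version
    dimension.append(
        {
            "key": key,
            "attributes": dict(attributes),
            "valid_from": today,
            "valid_to": None,  # None marks the currently valid row
        }
    )
    return dimension


# Hypothetical customer dimension loaded on three different days.
customers = []
apply_scd2(customers, "C1", {"city": "Konstanz"}, date(2006, 1, 1))
apply_scd2(customers, "C1", {"city": "Konstanz"}, date(2006, 6, 1))  # no change
apply_scd2(customers, "C1", {"city": "Berlin"}, date(2007, 2, 6))    # new version

for row in customers:
    print(row)
```

After the three loads the dimension holds two rows for `C1`: the closed Konstanz version and the open Berlin version, so reports can still join historical facts to the attribute values that were valid at the time.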


Pentaho Products: Dashboards<br />

Pentaho Dashboards<br />

provides immediate insight into individual, departmental, or enterprise performance and gives business users the critical information for understanding and improving organizational performance


Pentaho Products: Data Mining<br />

Pentaho Data Mining<br />

Weka is integrated into Pentaho Data Mining and provides the BI suite with the following set of machine learning algorithms:<br />

– Clustering<br />

– Neural networks<br />

– Decision trees, etc.<br />

– Graphical data mining design and administration tools are integrated<br />

– Graphical user interfaces are provided for data pre-processing, classification, regression, clustering, association rules, and visualization.
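
As an illustration of what a clustering algorithm does, the following Python sketch runs plain k-means on a handful of 2-D points. It is a teaching example under simplifying assumptions (fixed starting centroids, a fixed number of iterations) and is not Weka's implementation; the `kmeans` function and the sample points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point into the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical data: two visually separate groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

print(centroids)
print(clusters)
```

With these starting centroids the first three points end up in one cluster and the last three in the other. Real tools add better initialization, convergence checks and support for more than two dimensions.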


Pentaho Examples: Budget – Actual<br />

Input Data


Pentaho Analysis: Example<br />

Filtering Options


Pentaho Analysis: Example


Pentaho Analysis: Example


Pentaho Analysis: Example


Pentaho: Dashboard Example


Thank you for your attention!


References<br />

“Worldwide Business Intelligence Tools 2005 Vendor Shares”, July 2006. http://www.sas.com/news/analysts/idc_bi_0706.pdf<br />

“Worldwide Business Intelligence Tools 2005 Vendor Shares”, October 2006. http://www.sas.com/news/analysts/idc_analytics2_1006.pdf<br />

Alexandra Kleijn, “Business Intelligence mit Open Source”, Heise Open, 2006. http://www.heise.de/open/artikel/73725<br />

Martin LaMonica, “Open source meets business intelligence”, CNET News.com, April 23, 2006. http://news.com.com/2100-7344_3-6064045.html<br />

http://www.JasperSoft.org<br />

http://www.pentaho.com
