01.05.2017 Views

632598256894

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Online analytical processing (OLAP) is a powerful tool for end users to use to access data. As<br />

discussed earlier, the data (the facts) is stored with its characteristics (called dimensions). The concept<br />

uses a cube as a visual to the end user. Of course, a cube can‟t be stored as such on a hard drive. The<br />

data for OLAP retrieval is stored using either a star schema or a snowflake schema. A star schema<br />

would include only the primary dimensions and fact table. When we expand our dimensions to include<br />

further details about the dimensions, the schema becomes more complicated—a snowflake schema.<br />

Illustrated in Exhibit 20.8 is the snowflake schema for the example used earlier in Exhibit 20.4.<br />

New technologies have been developed to address the issue of information overload. In the 1970s,<br />

the average database was perhaps 100 megabytes in size. In the 1980s, databases were typically 20<br />

gigabytes. Now, databases are in terabytes (trillions of bytes). Wal-Mart has a data warehouse that<br />

exceeds 4 petabytes (a petabype is a quadrillion bytes or 1,024 terabytes). With all that data, it is<br />

difficult for a user to know where to look. It is not the question that the user knows to ask that is<br />

necessarily important, but, rather, the question that the user does not know to ask that will come back<br />

to haunt him.<br />

Data mining is a set of technologies that allow users to classify, cluster, learn about associations,<br />

and perform regressions. Classification uses algorithms to assign data into a predefined group.<br />

Clustering is similar to classification but the groups are not predefined. Regression analyzes the data<br />

to build a model that can be used for forecasting. Learning about associations searches for<br />

relationships between variables. Market basket analysis is used by supermarkets to analyze the data of<br />

what each customer buys to determine what products are frequently bought together. Data mining is<br />

used in customer relationship management software to identify prospects with a higher likelihood to<br />

respond to an offer.<br />

Data mining can also be used to scan databases for any data that does not fit the business‟s model<br />

and identify any data that the user needs to examine further. For example, auditors might use data<br />

mining to scan client transaction detail to look for transactions that do not conform to company<br />

policies, and stock analysts can use it to scan data on stock prices and company earnings over a period<br />

of time in order to look for opportunities.<br />

Internet Technology<br />

Nothing has impacted technology and society in the past two decades more than the Internet. When<br />

Bill Clinton was inaugurated in January 1993, there were about 50 pages on the Internet. Today, there<br />

are more than 200 billion pages (and that‟s only an estimate—it‟s probably much higher than that!).<br />

The underlying technology behind the Internet has its roots in a project begun by the U.S. government

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!