
Star Schema 

• In a star schema, each dimension table has a single-part primary key that links to one part of the multipart primary key in the fact table.
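The key structure described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names `time_dim`, `product_dim`, and `store_dim` and the sample values are assumptions, loosely modeled on the Sales example in these slides. Each dimension has a single-column primary key, and the fact table's composite primary key is made of the three foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Dimension tables: single-part primary keys (names are hypothetical)
CREATE TABLE time_dim    (time_key    INTEGER PRIMARY KEY);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY);
CREATE TABLE store_dim   (store_key   INTEGER PRIMARY KEY);
-- Fact table: multipart primary key, one part per dimension
CREATE TABLE sales (
    time_key     INTEGER NOT NULL REFERENCES time_dim(time_key),
    product_key  INTEGER NOT NULL REFERENCES product_dim(product_key),
    store_key    INTEGER NOT NULL REFERENCES store_dim(store_key),
    dollars_sold REAL,
    units_sold   INTEGER,
    PRIMARY KEY (time_key, product_key, store_key)
);
INSERT INTO time_dim    VALUES (1);
INSERT INTO product_dim VALUES (10);
INSERT INTO store_dim   VALUES (100);
INSERT INTO sales       VALUES (1, 10, 100, 25.0, 3);
""")

def try_insert(row):
    """Attempt to insert a fact row; report whether the constraints accept it."""
    try:
        conn.execute("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

# Same (time, product, store) combination again: violates the composite key.
duplicate_key = try_insert((1, 10, 100, 9.0, 1))
# Store 999 does not exist in store_dim: violates the foreign key.
unknown_store = try_insert((1, 10, 999, 9.0, 1))
print(duplicate_key, unknown_store)
```

In this sketch both inserts are refused, one by the multipart primary key and one by the link to a dimension table.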

Data Warehouses

Logical design of a multidimensional database using a tabular schema

Star Schema – Example 1

Sales (fact table): Time_key, Product_key, Store_key, Dollars_sold, Units_sold

Time Dimension: Time_key, Day_of_week, Day_number_of_month, Week_number_in_year, Month, Quarter, Year, Holiday_flag, Weekday_flag

Product Dimension: Product_key, Description, Brand, Category

Store Dimension: Store_key, Store_name, Address, Floor_plan_type

Star Schema

Fact Table (mainly numeric and additive): d1_key1, d2_key1, d3_key1, d4_key1, fact1, fact2

Dimension tables (mainly descriptive, textual):

Dimension 1: d1_key1, Att1, Att2

Dimension 2: d2_key1

Dimension 3: d3_key1

Dimension 4: d4_key1

7 Tammuz 5770





Reminder: Normal Forms

Normalization seeks to eliminate data redundancy: a transaction that changes any data only needs to touch the database in one place (optimized for updates).

The Standard Template Query

SELECT p.brand, SUM(f.dollars_sold), SUM(f.units_sold)
FROM sales f, product p, time t
WHERE f.product_key = p.product_key
  AND f.time_key = t.time_key
  AND t.quarter = '1 Q 1995'
GROUP BY p.brand
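To see the template query run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and brand names are invented for illustration, the measure columns follow the Sales table of Example 1 (`dollars_sold`, `units_sold`), and the time table is named `time_dim` here, an assumption made only to avoid relying on `time` as an identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE product  (product_key INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE sales (
    time_key     INTEGER REFERENCES time_dim(time_key),
    product_key  INTEGER REFERENCES product(product_key),
    dollars_sold REAL,
    units_sold   INTEGER
);
-- Invented sample data
INSERT INTO time_dim VALUES (1, '1 Q 1995'), (2, '2 Q 1995');
INSERT INTO product  VALUES (10, 'Acme'), (11, 'Zenith');
INSERT INTO sales VALUES
    (1, 10, 100.0, 4),
    (1, 10,  50.0, 2),
    (1, 11,  30.0, 1),
    (2, 10, 999.0, 9);   -- second quarter: filtered out below
""")

# Join the fact table to two dimensions, constrain on a dimension
# attribute, and aggregate the additive facts by another attribute.
query = """
SELECT p.brand, SUM(f.dollars_sold), SUM(f.units_sold)
FROM sales f, product p, time_dim t
WHERE f.product_key = p.product_key
  AND f.time_key = t.time_key
  AND t.quarter = '1 Q 1995'
GROUP BY p.brand
ORDER BY p.brand
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Acme', 150.0, 6), ('Zenith', 30.0, 1)]
```

The `ORDER BY` is added only to make the output order deterministic; it is not part of the template.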


On the other hand …

1. Complexity of query specification is high. Without normalization, the schema is much clearer to the user (simple query structures).

2. Poor access efficiency – A normalized design is by far the worst for most query access. It is optimized for key-based, record-at-a-time inquiry, or for table-level queries that efficiently use the provided indexes.

Resisting Normalization

1. Eliminate redundancy? – Eliminating duplicate rows is generally good. However, eliminating "redundant" attributes in a star schema dimension table will actually destroy its high access efficiency. Time saving (browsing performance) is much more critical in a data warehouse.

2. Save space? – This corollary to eliminating redundancy is a holdover from another era. The relative impact of storage on cost is way down, and the loss of access efficiency has a far greater cost impact. Furthermore, the fact table in a dimensional schema is naturally highly normalized. Disk space saving due to normalization is typically less than 1%.

3. Support efficient update? – Does not apply at all: a data warehouse is nonvolatile, with no updates of data (only data loading). The load methods for relational tables in a star schema design can actually be more efficient than a load of normalized transaction and snowflaked reference data.

Why does normalization of dimensions not save space?

– A typical example

• Fact table data size: 30GB

• Fact table index size: 20GB

• Largest dimension table size: 0.1GB

• Savings by normalization: 0.05GB

• Total size before: 51GB

• Total size after: 50.5GB

ER - BCNF

Division: Division_id, Division_desc

Dept: Dept_id, Dept_desc, Division_id

Region: Region_id, Region_desc

Market: Market_id, Market_desc, Region_id

Sales Facts: Dept_id, Market_id, Week_id, Sales


Snowflake Schema

• In a snowflake schema, one or more dimension tables are decomposed into multiple tables, with the subordinate dimension tables joined to a primary dimension table instead of to the fact table.

• That is, a snowflake schema is a refinement of the star schema in which some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake.

Dimensional (Denormalization)

Dept. Lookup: Dept_id, Dept_desc, Division_desc

Sales Facts: Dept_id, Market_id, Week_id, Sales

Market Lookup: Market_id, Market_desc, Region_desc
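The difference between the normalized (snowflaked) and denormalized (star) dimension can be sketched with Python's built-in sqlite3 module. The table layouts follow the Dept/Division diagrams in these slides; the sample rows and descriptions are invented. Both queries return the same sales per division, but the snowflaked version needs one more join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflaked (normalized) dimension: Dept refers to Division
CREATE TABLE division (division_id INTEGER PRIMARY KEY, division_desc TEXT);
CREATE TABLE dept (
    dept_id     INTEGER PRIMARY KEY,
    dept_desc   TEXT,
    division_id INTEGER REFERENCES division(division_id)
);
-- Star (denormalized) dimension: division_desc repeated on every dept row
CREATE TABLE dept_lookup (
    dept_id       INTEGER PRIMARY KEY,
    dept_desc     TEXT,
    division_desc TEXT
);
CREATE TABLE sales_facts (dept_id INTEGER, market_id INTEGER,
                          week_id INTEGER, sales REAL);

-- Invented sample data
INSERT INTO division VALUES (1, 'Food'), (2, 'Apparel');
INSERT INTO dept VALUES (10, 'Dairy', 1), (11, 'Bakery', 1), (20, 'Shoes', 2);
INSERT INTO dept_lookup VALUES
    (10, 'Dairy', 'Food'), (11, 'Bakery', 'Food'), (20, 'Shoes', 'Apparel');
INSERT INTO sales_facts VALUES
    (10, 1, 1, 5.0), (11, 1, 1, 7.0), (20, 1, 1, 3.0), (10, 2, 1, 4.0);
""")

# Snowflake: fact -> dept -> division (two joins)
snowflake_sql = """
SELECT v.division_desc, SUM(f.sales)
FROM sales_facts f
JOIN dept d     ON f.dept_id = d.dept_id
JOIN division v ON d.division_id = v.division_id
GROUP BY v.division_desc
ORDER BY v.division_desc
"""
# Star: fact -> dept_lookup (one join)
star_sql = """
SELECT d.division_desc, SUM(f.sales)
FROM sales_facts f
JOIN dept_lookup d ON f.dept_id = d.dept_id
GROUP BY d.division_desc
ORDER BY d.division_desc
"""
snowflake_rows = conn.execute(snowflake_sql).fetchall()
star_rows = conn.execute(star_sql).fetchall()
print(star_rows)  # [('Apparel', 3.0), ('Food', 16.0)]
```

The price of the simpler star query is that 'Food' is stored once per department in `dept_lookup` rather than once in `division`.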


Large Hierarchy

Sales: Time_key, Customer_Key, amount

Customer: Customer_Key, Demo_Key, Name, …

Demographic: Demo_Key, Income_Level, Age_Level, Sex



Mini-Dimension

Sales: Time_key, Customer_Key, Demo_Key, amount

Customer: Customer_Key, Name, …

Demographic: Demo_Key, Income_Level, Age_Level, Sex


Star schemas or Snowflake schemas?

• Both star and snowflake schemas can represent the same dimensional models; the difference is in their RDBMS implementations.

• Snowflake schemas support ease of dimension maintenance because they are more normalized.

• Star schemas are easier for direct user access and often support simpler and more efficient queries.

• The decision to model a dimension as a star or snowflake depends on the nature of the dimension itself, such as how frequently it changes and which of its elements change, and often involves evaluating tradeoffs between ease of use and ease of maintenance.

• In most designs, star schemas are preferable to snowflake schemas because they involve fewer joins for information retrieval.

Dimension Keys

• Surrogate keys

– A surrogate key is the primary key for a dimension table and is independent of any keys provided by source data systems.

– Surrogate keys are created and maintained in the data warehouse and should not encode any information about the contents of records; automatically increasing integers make good surrogate keys.

– The original key for each record may be carried in the dimension table but is not used as the primary key.

– Benefits:

• A layer of isolation between the DW and the source system

• Simple: numeric keys

• Can handle ambiguous IDs

– Drawback: increased ETL processing

• Using original operational keys

– Benefit: reduced transformation effort

– Drawbacks:

• Compound and textual keys

• Dependency on the source systems (OLTP): for instance, what happens if the operational system creates a new key when a customer changes address, while we do not want to create a "new" customer?

• Ambiguous IDs coming from different sources:

– Multiple application systems

– Worldwide companies with many branches: each branch uses its own customer numbering

– Companies that have done mergers or acquisitions
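The surrogate-key idea, including the ambiguous-ID case, can be sketched with Python's built-in sqlite3 module. Everything named here is hypothetical: the `customer_dim` layout, the `source_system` and `source_id` columns, the helper `surrogate_key_for`, and the branch names. The warehouse assigns an automatically increasing integer, and the operational key is carried only as an ordinary attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer_dim (
    customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    source_system TEXT NOT NULL,  -- which operational system sent the row
    source_id     TEXT NOT NULL,  -- original operational key, kept as an attribute
    name          TEXT,
    UNIQUE (source_system, source_id)
)""")

def surrogate_key_for(source_system, source_id, name):
    """ETL lookup step: return the existing surrogate key or assign a new one."""
    row = conn.execute(
        "SELECT customer_key FROM customer_dim "
        "WHERE source_system = ? AND source_id = ?",
        (source_system, source_id),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO customer_dim (source_system, source_id, name) "
        "VALUES (?, ?, ?)",
        (source_system, source_id, name),
    )
    return cur.lastrowid

# Two branches both use the operational ID '1001' for different customers.
k1 = surrogate_key_for("branch_a", "1001", "Alice")
k2 = surrogate_key_for("branch_b", "1001", "Bob")
# Reloading the same source row must not create a "new" customer.
k3 = surrogate_key_for("branch_a", "1001", "Alice")
print(k1, k2, k3)
```

The extra lookup on every load is the "increased ETL processing" listed as the drawback; in exchange, fact rows reference small integer keys that do not depend on any source system's numbering.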


Time/Date Dimension

• For hourly time granularity, the hour breakdown can be incorporated into the date dimension or placed in a separate dimension.

• Business needs influence this design decision.

• If the main use is to extract contiguous chunks of time that cross day boundaries (for example 11/24/2000 10 p.m. to 11/25/2000 6 a.m.), then it is easier if the hour and day are in the same dimension.

• However, it is easier to analyze cyclical and recurring daily events if they are in separate dimensions.

• Unless there is a clear reason to combine date and hour in a single dimension, it is generally better to keep them in separate dimensions!
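The tradeoff can be illustrated with a small sketch using Python's built-in sqlite3 module. It is deliberately simplified: the fact table stores a hypothetical `date_key` (as YYYYMMDD) and `hour_key` (0 to 23) directly, the dimension tables are omitted, and the amounts are invented. With date and hour kept separate, the overnight range from the example needs a two-part condition, while the recurring "10 p.m." question is a single filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact (date_key INTEGER, hour_key INTEGER, amount REAL);
-- Invented sample data
INSERT INTO fact VALUES
    (20001124, 21,  1.0),
    (20001124, 22,  2.0),
    (20001125,  5,  4.0),
    (20001125,  6,  8.0),
    (20001125, 22, 16.0);
""")

# Contiguous range crossing midnight (11/24/2000 10 p.m. up to
# 11/25/2000 6 a.m.): one condition per day when the keys are separate.
overnight = conn.execute("""
SELECT SUM(amount) FROM fact
WHERE (date_key = 20001124 AND hour_key >= 22)
   OR (date_key = 20001125 AND hour_key < 6)
""").fetchone()[0]

# Cyclical analysis: the 10 p.m. hour on every day, one simple filter.
ten_pm = conn.execute(
    "SELECT SUM(amount) FROM fact WHERE hour_key = 22"
).fetchone()[0]

print(overnight, ten_pm)  # 6.0 18.0
```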

Time/Date Dimension

• A date dimension with one record per day will suffice if users do not need time granularity finer than a single day. A date by day dimension table will contain 365 records per year (366 in leap years).

• A separate time dimension table should be constructed if a fine time granularity, such as minute or second, is needed. A time dimension table of one-minute granularity will contain 1,440 rows for a day, and a table of seconds will contain 86,400 rows for a day. If exact event time is needed, it should be stored in the fact table.

• When a separate time dimension is used, the fact table contains one foreign key for the date dimension and another for the time dimension. Separate date and time dimensions simplify many filtering operations. For example, summarizing data for a range of days requires joining only the date dimension table to the fact table. Analyzing cyclical data by time period within a day requires joining just the time dimension table. The date and time dimension tables can both be joined to the fact table when a specific time range is needed.
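The row counts quoted above can be checked with a short Python sketch that generates both dimensions. The function names and the attribute set are assumptions, loosely following the Time Dimension of Example 1; in particular, treating Monday to Friday as weekdays is an assumption that depends on the business calendar.

```python
from datetime import date, timedelta

def build_date_dimension(year):
    """One row per calendar day of the given year (hypothetical layout)."""
    rows = []
    day = date(year, 1, 1)
    key = 1
    while day.year == year:
        rows.append({
            "date_key": key,  # surrogate key: a simple running integer
            "date": day.isoformat(),
            "day_number_of_month": day.day,
            "month": day.month,
            "quarter": (day.month - 1) // 3 + 1,
            "year": day.year,
            # Assumption: Monday..Friday count as weekdays.
            "weekday_flag": day.weekday() < 5,
        })
        day += timedelta(days=1)
        key += 1
    return rows

def build_time_dimension():
    """One row per minute of a day, independent of any date."""
    return [
        {"time_key": hour * 60 + minute, "hour": hour, "minute": minute}
        for hour in range(24)
        for minute in range(60)
    ]

print(len(build_date_dimension(2001)))  # 365
print(len(build_date_dimension(2000)))  # 366 (leap year)
print(len(build_time_dimension()))      # 1440
```

Because the time dimension does not depend on the date, its 1,440 rows are built once, whereas a combined date-and-minute dimension would need 1,440 rows for every day.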

