Expert Cube Development with Microsoft SQL Server 2008

A snapshot fact table records the state of something at different points in time. If we record in a fact table the total sales for each product every month, we are not recording an event but a specific situation. Snapshots can also be useful when we want to measure something not directly related to any other fact. If we want to rank our customers based on sales or payments, for example, we may want to store snapshots of this data in order to analyze how these rankings change over time in response to marketing campaigns.

Using a snapshot table containing aggregated data instead of a transaction table can drastically reduce the number of rows in our fact table, which in turn leads to smaller cubes, faster cube processing and faster querying. The price we pay for this is the loss of any information that can only be stored at the transaction level and cannot be aggregated up into the snapshot, such as the transaction number data we encountered when discussing degenerate dimensions. Whether this is an acceptable price to pay is a question only the end users can answer.

Updating fact and dimension tables

In an ideal world, data that is stored in the data warehouse would never change. Some books suggest that we should only support insert operations in a data warehouse, not updates: data comes from the OLTP system, is cleaned and is then stored in the data warehouse until the end of time, and should never change because it represents the situation at the time of insertion.

Nevertheless, the real world is somewhat different from the ideal one. While some updates are handled by the slowly changing dimension techniques already discussed, there are other kinds of updates needed in the life of a data warehouse.
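As an aside on the snapshot fact tables described above, the following is a minimal sketch of how a monthly snapshot might be derived from a transaction-level fact table. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table and column names (FactSalesTransaction, FactSalesSnapshot, TransactionNumber and so on) are hypothetical, chosen only for illustration.

```python
import sqlite3

def build_monthly_snapshot(conn):
    """Aggregate transaction-level sales into one row per product per month."""
    conn.execute("DROP TABLE IF EXISTS FactSalesSnapshot")
    conn.execute(
        """
        CREATE TABLE FactSalesSnapshot AS
        SELECT ProductKey,
               substr(SaleDate, 1, 7) AS SnapshotMonth,  -- 'YYYY-MM'
               SUM(Amount)            AS TotalSales
        FROM FactSalesTransaction
        GROUP BY ProductKey, substr(SaleDate, 1, 7)
        """
    )
    return conn.execute(
        "SELECT ProductKey, SnapshotMonth, TotalSales "
        "FROM FactSalesSnapshot ORDER BY ProductKey, SnapshotMonth"
    ).fetchall()

# Hypothetical transaction-level fact table with a degenerate dimension
# (TransactionNumber) that cannot survive the aggregation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FactSalesTransaction ("
    "TransactionNumber TEXT, ProductKey INTEGER, SaleDate TEXT, Amount REAL)"
)
conn.executemany(
    "INSERT INTO FactSalesTransaction VALUES (?, ?, ?, ?)",
    [
        ("T001", 1, "2008-01-05", 10.0),
        ("T002", 1, "2008-01-20", 15.0),
        ("T003", 1, "2008-02-03", 7.5),
        ("T004", 2, "2008-01-11", 20.0),
    ],
)
snapshot = build_monthly_snapshot(conn)
for row in snapshot:
    print(row)
```

Four transaction rows collapse into three snapshot rows here, and the TransactionNumber column has no place in the result, which is the trade-off the text describes.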
In our experience, these other types of update in the data warehouse are needed fairly regularly and are of two main kinds:

• Structural updates: when the data warehouse is up and running, we will need to perform updates to add information like new measures or new dimension attributes. This is normal in the lifecycle of a BI solution.
• Data updates: we need to update data that has already been loaded into the data warehouse, because it is wrong. We need to delete the old data and enter the new data, as the old data will inevitably lead to confusion. There are many reasons why bad data comes to the data warehouse; the sad reality is that bad data happens and we need to manage it gracefully.
Now, how do these kinds of updates interact with fact and dimension tables? Let's summarize briefly what the physical distinctions between fact and dimension tables are:

• Dimension tables are normally small, usually with fewer than 1 million rows and very frequently far fewer than that.
• Fact tables are often very large; they can have up to hundreds of millions or even billions of rows. Fact tables may be partitioned, and loading data into them is usually the most time-consuming operation in the whole of the data warehouse.

Structural updates on dimension tables are very easy to make. You simply update the table with the new metadata, make the necessary changes to your ETL procedures and the next time they are run the dimension will reflect the new values. If your users decide that they want to analyze data based on a new attribute on, say, the customer dimension, then the new attribute can be added for all of the customers in the dimension. Moreover, if the attribute is not present for some customers, then they can be assigned a default value; after all, updating one million rows is not a difficult task for SQL Server or any other modern relational database. However, even if updating the relational model is simple, the updates need to go through to Analysis Services and this might result in the need for a full process of the dimension and therefore the cube, which might be very time consuming.

On the other hand, structural updates may be a huge problem on fact tables. The problem is not that of altering the metadata, but determining and assigning a default value for the large number of rows that are already stored in the fact table. It's easy to insert data into fact tables. However, creating a new field with a default value would result in an UPDATE command that will probably run for hours and might even bring down your database server.
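The dimension-table case described above can be sketched as follows. This is only an illustration under stated assumptions: sqlite3 stands in for SQL Server, and the DimCustomer table, the Segment attribute and the 'Unknown' default are hypothetical names. The sketch shows the logic of a structural update on a small dimension, not its performance on any particular database engine.

```python
import sqlite3

def add_segment_attribute(conn, known_segments):
    """Add a Segment attribute to DimCustomer, defaulting to 'Unknown'."""
    # Structural change: every existing customer receives the default value.
    conn.execute(
        "ALTER TABLE DimCustomer "
        "ADD COLUMN Segment TEXT NOT NULL DEFAULT 'Unknown'"
    )
    # Data fill: overwrite the default where the source system has a value.
    conn.executemany(
        "UPDATE DimCustomer SET Segment = ? WHERE CustomerKey = ?",
        [(segment, key) for key, segment in known_segments.items()],
    )
    conn.commit()

# Hypothetical customer dimension with three members.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DimCustomer (CustomerKey INTEGER PRIMARY KEY, Name TEXT)"
)
conn.executemany(
    "INSERT INTO DimCustomer VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Carol")],
)

# Only customers 1 and 3 have a known segment; customer 2 keeps the default.
add_segment_attribute(conn, {1: "Retail", 3: "Wholesale"})
customers = conn.execute(
    "SELECT CustomerKey, Name, Segment FROM DimCustomer ORDER BY CustomerKey"
).fetchall()
for row in customers:
    print(row)
```

On a dimension of this kind the same two steps stay cheap even at a million rows; the relational change would still need to be followed by reprocessing the dimension in Analysis Services.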
Worse, if we do not have a simple default value to assign, then we will need to calculate the new value for each row in the fact table, and so the update operation will take even longer. We have found that it is often better to reload the entire fact table rather than perform an update on it. Of course, in order to reload the fact table, we need to have all of our source data at hand and this is not always possible.

Data updates are an even bigger problem still, both on facts and dimensions. Data updates on fact tables suffer from the same problems as adding a new field: often, the number of rows that we need to update is so high that running even simple SQL commands can take a very long time.
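The "reload rather than update" approach mentioned above can be sketched as follows, assuming the source data is still at hand in a staging table. Again sqlite3 stands in for SQL Server, and the names StagingSales, FactSales, FactSales_New and the Margin measure are hypothetical. The idea is to build a new table that already contains the calculated column and then swap it in, instead of running an UPDATE over every existing row.

```python
import sqlite3

def reload_fact_with_margin(conn):
    """Rebuild FactSales with a new Margin measure instead of updating in place."""
    # Build the replacement table from the source rows, computing the new
    # measure as the rows are loaded.
    conn.execute(
        """
        CREATE TABLE FactSales_New AS
        SELECT OrderKey, ProductKey, Amount,
               Amount - Cost AS Margin
        FROM StagingSales
        """
    )
    # Swap the rebuilt table in place of the old one.
    conn.execute("DROP TABLE FactSales")
    conn.execute("ALTER TABLE FactSales_New RENAME TO FactSales")
    conn.commit()

conn = sqlite3.connect(":memory:")
# Hypothetical staging table holding the full source data.
conn.execute(
    "CREATE TABLE StagingSales ("
    "OrderKey INTEGER, ProductKey INTEGER, Amount REAL, Cost REAL)"
)
conn.executemany(
    "INSERT INTO StagingSales VALUES (?, ?, ?, ?)",
    [(1, 10, 100.0, 60.0), (2, 11, 50.0, 20.0)],
)
# Existing fact table, loaded before the Margin measure was requested.
conn.execute(
    "CREATE TABLE FactSales AS "
    "SELECT OrderKey, ProductKey, Amount FROM StagingSales"
)

reload_fact_with_margin(conn)
facts = conn.execute(
    "SELECT OrderKey, ProductKey, Amount, Margin FROM FactSales ORDER BY OrderKey"
).fetchall()
for row in facts:
    print(row)
```

This only works when all of the source data is available for the reload, which, as noted above, is not always the case.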