Expert Cube Development with Microsoft SQL Server 2008

This is the rule for relational tables. However, you also need to remember that the equivalent measure data type in Analysis Services must be large enough to hold the largest aggregated value of a given measure, not just the largest value present in a single fact table row.

Always remember that there are situations in which the rules must be overridden. If we have a fact table containing 20 billion rows, each composed of 20 bytes and including a column that references a date, then it might be better to use a SMALLINT column for the date, if we find a suitable representation that holds all necessary values. We will gain 2 bytes for each row, which is about 40 GB saved out of roughly 400 GB, a 10% reduction in the size of the whole table.

SQL queries generated during cube processing

When Analysis Services needs to process a cube or a dimension, it sends queries to the relational database in order to retrieve the information it needs. Not all the queries are simple SELECTs; there are many situations in which Analysis Services generates complex queries. Although we do not have enough space to cover all scenarios, we're going to provide some examples relating to SQL Server, and we advise the reader to have a look at the SQL queries generated for their own cube to check whether they can be optimized in some way.

Dimension processing

During dimension processing Analysis Services sends several queries, one for each attribute of the dimension, in the form of SELECT DISTINCT ColName, where ColName is the name of the column holding the attribute. Many of these queries are run in parallel (exactly which ones can be run in parallel depends on the attribute relationships defined on the Analysis Services dimension), so SQL Server can take advantage of its cache system: it performs only one physical read of the table, and all successive scans are performed from memory. Nevertheless, keep in mind that the task of detecting the DISTINCT values of the attributes is done by SQL Server, not Analysis Services.
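The one-query-per-attribute pattern can be sketched as follows. This is only an illustration under stated assumptions: SQLite (through Python's standard sqlite3 module) stands in for SQL Server, the table DimProduct and its columns are hypothetical, and the statements that Analysis Services actually generates may select more columns than the single one shown here.

```python
import sqlite3

# Hypothetical dimension table; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DimProduct ("
    "ProductKey INTEGER PRIMARY KEY, Color TEXT, Category TEXT)"
)
conn.executemany(
    "INSERT INTO DimProduct VALUES (?, ?, ?)",
    [
        (1, "Red", "Bikes"),
        (2, "Red", "Helmets"),
        (3, "Blue", "Bikes"),
        (4, "Blue", "Bikes"),
    ],
)


def distinct_members(connection, table, column):
    """Run one SELECT DISTINCT for a single attribute column."""
    # Identifiers are interpolated only because they are hard-coded below.
    sql = f"SELECT DISTINCT {column} FROM {table}"
    rows = connection.execute(sql).fetchall()
    return sorted(row[0] for row in rows)


# One query per attribute: the relational engine, not the caller,
# does the work of finding the distinct values.
attribute_columns = ["ProductKey", "Color", "Category"]
members = {
    column: distinct_members(conn, "DimProduct", column)
    for column in attribute_columns
}
print(members)
```

Each attribute costs one scan of the source table or view, which is why the complexity of that source matters so much for processing time.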
We also need to be aware that if our dimensions are built from complex views, they might confuse the SQL Server engine and lead to poor SQL query performance. If, for example, we add a very complex WHERE condition to our view, then the condition will be evaluated more than once. We have personally seen a situation where the processing of a simple time dimension with only a few hundred rows, which had a very complex WHERE condition, took tens of minutes to complete.

Dimensions with joined tables

If a dimension contains attributes that come from a joined table, the JOIN is performed by SQL Server, not Analysis Services. This situation arises very frequently when we define snowflakes instead of simpler star schemas. Since some attributes of a dimension are computed by taking their values from another dimension table, Analysis Services will send a query to SQL Server containing the INNER JOIN between the two tables.

Beware that the type of JOIN requested by Analysis Services is always an INNER JOIN. If, for any reason, you need a LEFT OUTER JOIN, then you definitely need to avoid using joined tables inside the DSV and use, as we suggest, SQL views to obtain the desired result.

As long as all the joins are made on the primary keys, this will not lead to any problems but, in cases where the JOIN is not made on the primary key, bad performance might result. As we said before, if we succeed in the goal of exposing to Analysis Services a simple star schema, we will never have to handle these JOINs. As we argue below, if a snowflake is really needed we can still hide it from Analysis Services using views, and in these views we will have full control over, and knowledge of, the complexity of the query used.

Reference dimensions

Reference dimensions, when present in the cube definition, will lead to one of the most hidden and most dangerous types of JOIN. When we define the relationship between a dimension and a fact table, we can use the Referenced relationship type and use an intermediate dimension to relate the dimension to the fact table. Reference dimensions often appear in the design due to snowflakes or due to the need to reduce fact table size.

A referenced dimension may be materialized or not.
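Before looking at materialization, the INNER JOIN behaviour described under "Dimensions with joined tables" can be made concrete with a small sketch. It assumes SQLite as a stand-in for SQL Server; the tables DimProduct and DimSubcategory, the view vProduct and the 'Unknown' placeholder are all hypothetical choices made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimSubcategory (
        SubcategoryKey INTEGER PRIMARY KEY,
        SubcategoryName TEXT
    );
    CREATE TABLE DimProduct (
        ProductKey INTEGER PRIMARY KEY,
        ProductName TEXT,
        SubcategoryKey INTEGER  -- NULL when the product is unclassified
    );
    INSERT INTO DimSubcategory VALUES (10, 'Road Bikes');
    INSERT INTO DimProduct VALUES (1, 'Road-150', 10);
    INSERT INTO DimProduct VALUES (2, 'Unclassified Item', NULL);

    -- A view that hides the snowflake and keeps unmatched products.
    CREATE VIEW vProduct AS
    SELECT p.ProductKey,
           p.ProductName,
           COALESCE(s.SubcategoryName, 'Unknown') AS SubcategoryName
    FROM DimProduct AS p
    LEFT OUTER JOIN DimSubcategory AS s
        ON s.SubcategoryKey = p.SubcategoryKey;
    """
)

# Shape of the query when two joined tables are exposed directly:
# the INNER JOIN silently drops the product without a subcategory.
inner_rows = conn.execute(
    """
    SELECT p.ProductKey, s.SubcategoryName
    FROM DimProduct AS p
    INNER JOIN DimSubcategory AS s
        ON s.SubcategoryKey = p.SubcategoryKey
    ORDER BY p.ProductKey
    """
).fetchall()

# Reading from the view instead returns every product.
view_rows = conn.execute(
    "SELECT ProductKey, SubcategoryName FROM vProduct ORDER BY ProductKey"
).fetchall()

print(inner_rows)
print(view_rows)
```

With the view in place, the join type is under our control and the consumer of the view sees a single flat table.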
If we decide to materialize a reference dimension (as BI Development Studio will suggest) the result is that the fact table query will contain a JOIN to the intermediate dimension, to allow Analysis Services to get the value of the key for the reference dimension. If JOINs are a problem with dimension processing queries, they are a serious problem with fact table processing queries. It might be the case that SQL Server needs to write a large amount of data to its temporary database before returning information to Analysis Services. It all depends on the size of the intermediate table and the number of reference dimensions that appear in the cube design.
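The general shape of such a fact table query can be sketched as follows. This is an approximation under stated assumptions, not the exact SQL text that Analysis Services emits: SQLite stands in for SQL Server, and FactSales, DimCustomer (the intermediate dimension) and DimGeography (the reference dimension) are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimGeography (
        GeographyKey INTEGER PRIMARY KEY,
        Country TEXT
    );
    CREATE TABLE DimCustomer (
        CustomerKey INTEGER PRIMARY KEY,
        GeographyKey INTEGER
    );
    CREATE TABLE FactSales (
        CustomerKey INTEGER,
        SalesAmount REAL
    );
    INSERT INTO DimGeography VALUES (100, 'Italy');
    INSERT INTO DimGeography VALUES (200, 'UK');
    INSERT INTO DimCustomer VALUES (1, 100);
    INSERT INTO DimCustomer VALUES (2, 200);
    INSERT INTO FactSales VALUES (1, 10.0);
    INSERT INTO FactSales VALUES (1, 5.0);
    INSERT INTO FactSales VALUES (2, 7.5);
    """
)

# The fact table has no GeographyKey of its own, so every fact row
# must be joined to the intermediate dimension to pick one up.
fact_rows = conn.execute(
    """
    SELECT f.CustomerKey, c.GeographyKey, f.SalesAmount
    FROM FactSales AS f
    INNER JOIN DimCustomer AS c
        ON c.CustomerKey = f.CustomerKey
    ORDER BY f.CustomerKey, f.SalesAmount
    """
).fetchall()

print(fact_rows)
```

On three rows the join is free; on a fact table with billions of rows, one such join per reference dimension is where the cost described above comes from.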