10.07.2015 Views

Expert Oracle Exadata - Parent Directory

CHAPTER 3 HYBRID COLUMNAR COMPRESSION

Figure 3-5. Layout of an HCC Compression Unit
[Diagram labels: Compression Unit, Block Header (four blocks), Compression Unit Header, Column 1 through Column 6]

Notice that the rows are no longer stored together. Instead the data is organized by column within the compression unit. This is not a true column oriented storage format but rather a cross between column oriented and row oriented. Remember that the sorting is done only within a single CU. The next CU will start over with Column 1 again. The advantage of this format is that it allows any row to be read in its entirety by reading a single CU. With a true column oriented storage format you would have to perform a separate read for each column. The disadvantage is that reading an individual record will require reading a multi-block CU instead of a single block. Of course full table scans will not suffer, because all the blocks will be read. We’ll talk more about this trade-off a little later, but you should already be thinking that this limitation could make HCC less attractive for tables that need to support lots of single row access.

The sorting by column is done to improve the effectiveness of the compression algorithms, not to get the performance benefits of column oriented storage. This is where the name “Hybrid Columnar Compression” comes from and why Exadata has not been marketed as a column oriented database. The name is very descriptive of how the feature actually works.

HCC Performance

There are three areas of concern when discussing performance related to table compression. The first, load performance, is how long it takes to compress the data. Since compression only takes place on direct path loads, this is essentially a measurement of the impact of loading data. The second area of concern, query performance, is the impact of decompression and other side effects on queries against the compressed data. The third area of concern, DML performance, is the impact compression algorithms have on other DML activities such as updates and deletes.

Load Performance

As you might expect, load time tends to increase with the amount of compression applied. As the saying goes, “There is no such thing as a free puppy.” When you compare costs in terms of increased load time with the benefit provided by increased compression ratio, the two Zlib-based options (QUERY HIGH and ARCHIVE LOW) appear to offer the best trade-off. Here’s a listing showing the syntax for generating compressed versions of a 15G table along with timing information.

SYS@SANDBOX1> @hcc_build3
SYS@SANDBOX1> set timing on
SYS@SANDBOX1> set echo on
SYS@SANDBOX1> create table kso.skew3_none nologging parallel 8
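
The compression unit layout described earlier in this section can be sketched in a few lines of Python. This is only a toy model, not Oracle's on-disk format or its actual algorithms: the names build_cus, read_row, and ROWS_PER_CU are hypothetical, the sample rows are made up, and Python's standard zlib module stands in for whatever compression HCC applies. The sketch groups rows into CUs, stores each CU column by column (so every CU starts over with column 1), and shows that fetching one complete row means decompressing exactly one CU.

```python
import zlib

ROWS_PER_CU = 1000  # hypothetical value; real CU sizes are decided by Oracle


def build_cus(rows, rows_per_cu=ROWS_PER_CU):
    """Split rows into CUs; inside each CU store column 1, then column 2, ..."""
    cus = []
    for start in range(0, len(rows), rows_per_cu):
        chunk = rows[start:start + rows_per_cu]
        columns = zip(*chunk)  # pivot rows into columns for this CU only
        payload = "\n".join("|".join(col) for col in columns)
        cus.append(zlib.compress(payload.encode("ascii")))
    return cus


def read_row(cus, row_number, rows_per_cu=ROWS_PER_CU):
    """Return one complete row by decompressing exactly one CU."""
    cu_index, offset = divmod(row_number, rows_per_cu)
    payload = zlib.decompress(cus[cu_index]).decode("ascii")
    columns = [line.split("|") for line in payload.split("\n")]
    return tuple(col[offset] for col in columns)


# Made-up sample data: (id, status, region)
rows = [
    (f"{i:06d}", ("OPEN", "CLOSED", "PENDING")[i % 3], ("EAST", "WEST")[i % 2])
    for i in range(2500)
]

cus = build_cus(rows)
print(len(cus), "compression units")
print(read_row(cus, 1234))
```

Note that read_row never touches the other CUs, which mirrors the advantage over a true column store, but it still has to decompress all 1,000 rows' worth of column data in the target CU to return a single record. That is the single-row access cost the text warns about.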
