05.03.2013 Views

Page Size Selection for OLTP Databases on SSD RAID Storage


12 · Ilia Petrov et al.

Table I. Comparison of enterprise HDDs, SSDs

Metric                        E. HDD    E. SSD
Seq. Read [MB/s] (128 KB)     160       250
Seq. Write [MB/s] (128 KB)    160       180
Rand. Read [ms] (4 KB)        3.2       0.161
Rand. Write [ms] (4 KB)       3.5       0.125
Rand. Read [ms] (16 KB)       3.3       0.294
Rand. Write [ms] (16 KB)      3.4       0.377
Read IOPS (4 KB)              291       35 510
Write IOPS (4 KB)             287       5 953
Read IOPS (16 KB)             288       12 743
Write IOPS (16 KB)            285       3.665
Price [€/GB]                  2.5       10
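The random read figures in Table I can be turned into implied data rates to show why the page size matters differently on the two device types. The following is a minimal sketch, not part of the original measurements: it only multiplies the read IOPS values from Table I by the block size, and the helper names (`bandwidth_mb_s`, `iops_drop`) as well as the convention 1 MB = 1024 KB are assumptions made for illustration.

```python
# Random read IOPS copied from Table I, keyed by block size in KB.
READ_IOPS = {
    "E. HDD": {4: 291, 16: 288},
    "E. SSD": {4: 35510, 16: 12743},
}


def bandwidth_mb_s(iops: int, block_kb: int) -> float:
    """Data rate implied by an IOPS figure (assumes 1 MB = 1024 KB)."""
    return iops * block_kb / 1024


def iops_drop(device: str) -> float:
    """Factor by which read IOPS fall when moving from 4 KB to 16 KB blocks."""
    by_block = READ_IOPS[device]
    return by_block[4] / by_block[16]


for device, by_block in READ_IOPS.items():
    rates = {kb: round(bandwidth_mb_s(iops, kb), 1) for kb, iops in by_block.items()}
    print(device, "MB/s by block size:", rates,
          "IOPS drop 4 KB -> 16 KB:", round(iops_drop(device), 2))
```

Under these assumptions the HDD delivers almost the same number of random reads per second for both block sizes, so a larger page costs little. On the SSD the number of random reads per second falls by a factor of almost three when the block size grows from 4 KB to 16 KB, which matters for an OLTP workload that typically needs only a small part of each page.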

The contributions of the present work can be summarized as follows:

- SSD storage characteristics reverse the trend of increasing page sizes in database systems. We claim that for OLTP databases, a smaller 4 KB page size is a better choice than a larger one, e.g. 16 KB.

- Smaller block sizes reduce the demand for essential buffer space. Larger buffers can then be used to further improve performance by buffering more data or by providing space for maintenance operations such as index rebuilding.

- We claim that all database systems (not only several commercial ones) should support multiple dynamically configurable block sizes. The "default" block size should be smaller (in the range of 4 KB to 8 KB), since it influences the database catalogue.

- Higher CPU utilization can be observed for OLTP databases, which are typically IO-bound environments. This is a result of the lower response times for small block operations. The increased CPU demand is a natural fit for the multi-core CPU trend.
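The buffer space argument in the list above can be made concrete with simple arithmetic. The sketch below is an illustration under stated assumptions, not a result from the paper: it assumes a randomly accessed hot set in which every hot record lives on a different page, so that the buffer must hold one full page per hot record. The function names and the sample numbers (a 1024 MB buffer, 100 000 hot pages) are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def pages_in_buffer(buffer_mb: int, page_kb: int) -> int:
    """Number of whole pages that fit into a buffer of the given size."""
    return (buffer_mb * MB) // (page_kb * KB)


def buffer_mb_for_hot_pages(hot_pages: int, page_kb: int) -> float:
    """Buffer size in MB needed to keep `hot_pages` distinct pages cached."""
    return hot_pages * page_kb * KB / MB


# Hypothetical sample values, chosen only for illustration.
for page_kb in (4, 16):
    print(f"{page_kb} KB pages:",
          pages_in_buffer(1024, page_kb), "pages fit in 1024 MB;",
          buffer_mb_for_hot_pages(100_000, page_kb), "MB needed for 100000 hot pages")
```

With these assumptions, a 4 KB page size holds four times as many distinct pages in the same buffer as a 16 KB page size, or equivalently needs a quarter of the buffer for the same hot set. Workloads with strong locality inside a page would see a smaller difference.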

The present article is structured as follows: we continue by examining the IO properties of Flash SSDs and describing the system under test. Next, we investigate the influence of the database page size on the performance of an OLTP database, using TPC-C as a standard OLTP workload. Finally, we summarize our findings.

2. RELATED WORK<br />

There is a large body of research on the properties of NAND SSDs [Chen et al. 2009; Agrawal et al. 2008] and on the design of the Flash Translation Layer. Research on the influence of Flash SSDs in the database field [Lee et al. 2009; Lee et al. 2008] focuses primarily on logging [Lee and Moon 2007], indexing [Li et al. 2009], and page organization for analytical loads and its influence on joins [Shah et al. 2008; Do and Patel 2009]. New algorithms and data structures are emerging. They address issues such as indices, page formats, logging and log record formats [Nath and Kansal 2007; Lee and Moon 2007; Shah et al. 2008; Li et al. 2009]. [Graefe 2007] outlines the influence of SSDs on the 5-Minute-Rule and discusses the influence of flash properties on the node utility metric and on the page size of a B-Tree database storage. [Graefe 2007] proposes an optimal page size of 2 KB. A detailed analysis of the influence of the database page size on performance does not exist.

3. ENTERPRISE FLASH SSDS AND SSD RAID CONFIGURATIONS

The performance of Flash SSDs is characterized by (Table I): (i) asymmetry, (ii) very high random throughput, (iii) high sequential performance, (iv) low latency, (v) low power consumption. The basic characteristics of Flash SSDs are well documented [Chen et al. 2009; Agrawal et al. 2008; Hudlet and Schall 2011]. These can be summarized as follows: (a) asymmetric read/write performance: the read performance is significantly better than the write performance, by up to an order of magnitude. This is due to the internal organization of the NAND memory and the FTL algorithms. (b) excellent random read throughput (IOPS), especially for small block sizes. (c) acceptable random write throughput

Journal of Information and Data Management, Vol. 2, No. 1, February 2011.

