Overview presentation of TokuDB in slide format (PDF - Tokutek
Introduction to TokuDB v6.6
B-Tree Indexes Can’t Keep Up
B-trees, over 40 years old and used in virtually all databases, are adequate for sequential workloads but fail for more information-rich (random) workloads.
• HDD I/O Reduced: Disk head spends the majority of its time on seeks
• SSD Life Limited: Small blocks written repeatedly
• Limited Compression: Algorithms ineffective on small block sizes
• Database Performance Drops: High-insertion workloads can’t keep up
The Performance Gap
Today’s database categories, OLAP and OLTP, are compromises to get around dated, rigid indexing technology.
[Figure: usage axis running from read-intensive to write-intensive. Labels: OLAP, OLTP, columnar DBs, Hadoop, MongoDB, B-tree challenge.]
Addressing the Gap
[Figure: usage axis running from read-intensive to write-intensive. Labels: real-time OLAP, log analysis, ad analytics, copyright surveillance, cloud storage metadata, analytic OLTP. Callout: "What everybody wants".]
Addressing the Gap
[Figure: usage axis running from read-intensive to write-intensive. Labels: B-tree indexes, real-time OLAP, log analysis, ad analytics, copyright surveillance, cloud storage metadata, analytic OLTP. Callout: "Fractal Trees Eliminate Death Valley". Caption: Fractal Tree® Indexing.]
Smarter Software Algorithms…
Fractal Tree® indexes are a new data structure that performs up to two orders of magnitude faster than B-trees, with optimal read/write characteristics.
[Figure: trade-off curve between faster queries and faster insertions, labeled "Optimal Curve".]
Fractal Tree Indexes Are at the Optimal Point
Enjoy the write performance of LSMs while maintaining the read performance of B-trees
• Intelligently aggregates and rebalances data at tree nodes
• Scales Performance for high-insertion workloads
• Yields Higher Compression, achieved over larger blocks
• Extends Flash Life with larger, less frequent I/O
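The idea of aggregating data at tree nodes before writing it out can be illustrated with a toy model. The sketch below is not TokuDB's implementation: it is a minimal two-level structure with hypothetical names (`BufferedIndex`, `buffer_limit`, `count_leaf_writes`) that holds inserts in a root buffer and pushes them to leaves in batches. It counts leaf writes so that batched flushing can be compared with writing every insert straight to its leaf.

```python
import random
from collections import defaultdict


class BufferedIndex:
    """Toy two-level index: a root buffer in front of fixed key-range leaves."""

    def __init__(self, num_leaves, key_space, buffer_limit):
        self.num_leaves = num_leaves
        self.key_space = key_space
        self.buffer_limit = buffer_limit
        self.buffer = []                      # pending (key, value) pairs
        self.leaves = [dict() for _ in range(num_leaves)]
        self.leaf_writes = 0                  # stands in for leaf-level I/O

    def _leaf_for(self, key):
        # Map a key in [0, key_space) to the leaf that owns its range.
        return key * self.num_leaves // self.key_space

    def insert(self, key, value):
        self.buffer.append((key, value))
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        # Group pending inserts by leaf so each touched leaf is written once.
        batches = defaultdict(list)
        for key, value in self.buffer:
            batches[self._leaf_for(key)].append((key, value))
        for leaf_id, items in batches.items():
            self.leaves[leaf_id].update(items)
            self.leaf_writes += 1
        self.buffer.clear()

    def get(self, key):
        # Buffered entries are newer than leaf contents, so check them first.
        for k, v in reversed(self.buffer):
            if k == key:
                return v
        return self.leaves[self._leaf_for(key)].get(key)


def count_leaf_writes(buffer_limit, keys, num_leaves=16, key_space=1_000_000):
    index = BufferedIndex(num_leaves, key_space, buffer_limit)
    for key in keys:
        index.insert(key, key * 2)
    index.flush()
    return index.leaf_writes


rng = random.Random(0)
keys = [rng.randrange(1_000_000) for _ in range(10_000)]

unbuffered = count_leaf_writes(1, keys)      # every insert goes straight to a leaf
buffered = count_leaf_writes(1_000, keys)    # inserts are batched at the root
print(f"leaf writes without buffering: {unbuffered}")
print(f"leaf writes with a 1,000-entry buffer: {buffered}")
```

With 16 leaves and a 1,000-entry buffer, each flush touches at most 16 leaves, so 10,000 random inserts cost at most 160 leaf writes instead of 10,000. The trade-off is that reads must also consult the buffer.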
…Keep Pace with New Hardware Trends
New technology for today’s denser drives and faster flash technology
[Figure: software stack (Application; MySQL Database: SQL Processing, Query Optimization…; File System) next to a speed/capacity chart of storage hardware. Chart labels: early drives; flash ("Addresses space and wear-life constraints"); multi-TB HDD ("Addresses I/O constraints").]
The Benefits of Better Indexes
• TokuDB® Fractal Tree® indexes change the game with huge insertion rates and richer indexed queries in scalable systems
Indexed insertions
Insert 1 billion rows into a table while maintaining three multi-column secondary indexes.
Terminal rate (last 10MM rows):
InnoDB®: 876 inserts/sec
TokuDB: 16,507 inserts/sec
(19x faster!)
More on how FT indexes work: http://goo.gl/Smu6H
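The terminal rates above can be turned into a speedup and a wall-clock estimate with a few lines of arithmetic. The two rates and the 10MM row count come from the slide. The elapsed-time figures are derived here, assuming the terminal rate holds for the whole final batch, and are not quoted from the benchmark.

```python
INNODB_RATE = 876        # inserts/sec, from the slide
TOKUDB_RATE = 16_507     # inserts/sec, from the slide
LAST_ROWS = 10_000_000   # the final 10MM rows of the 1-billion-row load

speedup = TOKUDB_RATE / INNODB_RATE
innodb_hours = LAST_ROWS / INNODB_RATE / 3600
tokudb_minutes = LAST_ROWS / TOKUDB_RATE / 60

print(f"speedup: {speedup:.1f}x")                          # about 18.8x, rounded to 19x on the slide
print(f"InnoDB, last 10MM rows: {innodb_hours:.1f} h")     # about 3.2 h
print(f"TokuDB, last 10MM rows: {tokudb_minutes:.1f} min") # about 10.1 min
```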
Compression Reduces Effective Storage Cost
• InnoDB: 2-4x compression
– Small block sizes used
– Fixed on-disk block size causes split / recompress operations that further reduce performance
• TokuDB: 5x – 25x compression
– Larger block sizes allow better compression
• TokuDB compression doesn’t sacrifice performance
– InnoDB performance suffers when compression is enabled
– TokuDB compression is always enabled
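To see what these ratios mean for storage, the sketch below applies them to a data set of a given raw size. The compression ranges are the ones quoted on the slide. The 1,000 GB raw size and the helper name `compressed_size_gb` are assumptions made for illustration.

```python
def compressed_size_gb(raw_gb, ratio):
    """On-disk size after compressing raw_gb of data by the given ratio."""
    return raw_gb / ratio


RAW_GB = 1000  # hypothetical raw data size

# (low, high) compression ratios as quoted on the slide.
RANGES = {"InnoDB": (2, 4), "TokuDB": (5, 25)}

on_disk = {
    engine: (compressed_size_gb(RAW_GB, high), compressed_size_gb(RAW_GB, low))
    for engine, (low, high) in RANGES.items()
}

for engine, (smallest, largest) in on_disk.items():
    print(f"{engine}: {smallest:.0f} to {largest:.0f} GB on disk")
```

Under these assumptions the same data occupies 250 to 500 GB at 2-4x and 40 to 200 GB at 5x to 25x.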
Agility at Scale with Hot Schema Changes
TokuDB®: The only MySQL Storage Engine with Hot Schema Support!
• Hot Column Addition/Deletion/Rename
– Provides the capability to add, drop, or rename a column as a fast operation
– Enables database administrators to rapidly define and add new fields
– Allows for much larger tables to be created given simplified maintenance
• Hot Indexing
– Allows for concurrent operation on the database and index
– Enables ad hoc queries to run fast with real-time, optimized index support
– Brings familiar Enterprise Database online operations to MySQL
In Test: InnoDB and TokuDB rate limited to 10k insertions/sec
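The schema changes listed above correspond to ordinary MySQL DDL statements. The sketch below only builds those statements as strings. The table, column, and index names are hypothetical, and any engine-specific settings needed for the operations to run hot are not covered by the slides and are not assumed here.

```python
def add_column(table, column, definition):
    return f"ALTER TABLE {table} ADD COLUMN {column} {definition}"


def rename_column(table, old_name, new_name, definition):
    # MySQL's CHANGE clause renames a column and restates its definition.
    return f"ALTER TABLE {table} CHANGE {old_name} {new_name} {definition}"


def drop_column(table, column):
    return f"ALTER TABLE {table} DROP COLUMN {column}"


def create_index(table, index_name, columns):
    return f"CREATE INDEX {index_name} ON {table} ({', '.join(columns)})"


# Hypothetical table and column names, for illustration only.
statements = [
    add_column("events", "source", "VARCHAR(64)"),
    rename_column("events", "source", "origin", "VARCHAR(64)"),
    drop_column("events", "origin"),
    create_index("events", "idx_events_user_ts", ["user_id", "created_at"]),
]

for statement in statements:
    print(statement + ";")
```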
Improved Life and Utilization for SSDs
• Reduced Write Wear
– B-trees write small blocks, resulting in more writes and increased wear
– Fractal Tree indexes write large blocks, reducing wear
• Better Utilization
– B-trees allow only 75% of capacity to be used: their smaller / more numerous writes make the Flash Translation Layer (FTL) a bottleneck
– Fractal Tree indexes allow 90% utilization because they generate larger / fewer writes
In Test:
InnoDB avg block size 23k
TokuDB avg block size 215k
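The block sizes and utilization figures above can be combined into a rough comparison. The 23k and 215k averages and the 75% and 90% utilization limits come from the slide. The 400 GB drive size is a hypothetical value, and treating "k" as the same unit for both engines is an assumption.

```python
INNODB_BLOCK_K = 23     # average block size in test, from the slide
TOKUDB_BLOCK_K = 215    # average block size in test, from the slide

# For the same volume of data, the number of writes scales inversely with block size.
write_count_ratio = TOKUDB_BLOCK_K / INNODB_BLOCK_K


def usable_gb(capacity_gb, utilization_percent):
    """Capacity that can be filled before the utilization limit is reached."""
    return capacity_gb * utilization_percent / 100


DRIVE_GB = 400  # hypothetical SSD capacity
btree_usable = usable_gb(DRIVE_GB, 75)
fractal_usable = usable_gb(DRIVE_GB, 90)

print(f"B-tree issues about {write_count_ratio:.1f}x as many block writes")
print(f"usable capacity: {btree_usable:.0f} GB vs {fractal_usable:.0f} GB")
```

On these numbers the larger blocks mean roughly 9.3 times fewer writes for the same data, and the higher utilization limit yields 360 GB of usable space against 300 GB.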
TokuDB® Performance, Compression, Agility
• Insertion Speed: 5-20x faster index inserts
• Query Performance: Fast ad-hoc and rich queries
• Scalability: Predictable performance at 10+ billion rows
• Compression: Up to 10x
• Hot Schema Changes: Hot indexing, hot column addition and rename
• Better Life for SSDs: Reduces write wear
• Continuous Operation: Avoid downtime for index de-fragmentation
• Replication: Ensure slaves can keep up
• Support: MySQL and MariaDB
TokuDB Serves a Broad Target Market
• Social Networks
– Evidenzia (P2P Network Monitoring)
– Profile Technology (Advanced Search for Facebook)
– Marketyou (Professional Network)
– Swaylo (Social Media Monitoring)
• Cloud Enablement
– FictionPress (Millions of Pieces of Fiction)
– Frequency.com (Video Services)
– Limelight Networks (Agile Storage Offering)
• eCommerce Solutions
– Intent Media (Online Advertising)
– Paybox (Electronic Payment Solutions)
• Data Processing
– Jawa (Mobile Gaming)
– Mozilla (Data Visualization)
– SwRI (Machine Data for NASA)
Recent Coverage
GigaOM: Twitter open sources its MySQL secret sauce
Silicon Angle: Scaling MySQL for the Era of Big Data
Forbes: Tokutek Makes Big Data Dance
O’Reilly: 2012 Strata Startup Finalist
VentureWire: “BIG DATA: Amid Deluge, Database Start-Ups Find Traction With New Tools”
Xconomy: It was a big week for “Big Data”
Computerworld: Tokutek Boosts MySQL Scalability for Big Data Applications
CTO Edge: Driving MySQL Performance
Network World: Selected for Product of the Week
IT News: MySQL Gets Live Schema Updating with Tokutek Engine
DBTA: Tokutek Enables Big Data Scalability on MySQL with Support for Hot Schema Changes
ReadWriteCloud: Tokutek Updates Its MySQL-based Big Database