New SQL: An Alternative to NoSQL and Old SQL for New OLTP Apps

Full Text 

User Comments (10) 

BLOG@CACM 

New SQL: An Alternative to NoSQL and Old SQL for New OLTP Apps 

Michael Stonebraker 

June 16, 2011 

Historically, Online Transaction Processing 

(OLTP) was performed by customers 

submitting traditional transactions (order 

something, withdraw money, cash a check, 

etc.) to a relational DBMS. Large 

enterprises might have dozens to hundreds 

of these systems. Invariably, enterprises 

wanted to consolidate the information in 

these OLTP systems for business analysis, 

cross selling, or some other purpose. 

Hence, Extract-Transform-and-Load (ETL) 

products were used to convert OLTP data 

to a common format and load it into a data warehouse. Data warehouse activity rarely 

shared machine resources with OLTP because of lock contention in the DBMS and 

because business intelligence (BI) queries were so resource-heavy that they got in the 

way of timely responses to transactions. 

This combination of a collection of OLTP systems, connected to ETL, and connected to 

one or more data warehouses is the gold standard in enterprise computing. I will term it 

“Old OLTP.” By and large, this activity was supported by the traditional RDBMS 

vendors. In the past I have affectionately called them “the elephants”; in this posting I 

refer to them as “Old SQL.” 

As noted by most pundits, “the Web changes everything,” and I have noticed a very 

different collection of OLTP requirements that are emerging for Web properties, which I 

will term “New OLTP.” These sites seem to be driven by two customer requirements: 

The need for far more OLTP throughput. Consider new Web-based applications 

such as multi-player games, social networking sites, and online gambling networks. The 

aggregate number of interactions per second is skyrocketing for the successful Web 

properties in this category. In addition, the explosive growth of smartphones has 

created a market for applications that use the phone as a geographic sensor and 

provide location-based services. Again, successful applications are seeing explosive 

growth in transaction requirements. Hence, the Web and smartphones are driving the 

volume of interactions with a DBMS through the roof, and New OLTP developers need 

vastly better DBMS performance and enhanced scalability. 

ACM.ORG JOIN ACM ABOUT COMMUNICATIONS ALERTS & FEEDS SIGN IN 

The need for real-time analytics. Intermixed with a tidal wave of updates is the need 

for a query capability. For example, a Web property wants to know the number of 

current users playing its game, or a smartphone user wants to know “What is around 

me?” These are not the typical BI requests to consolidated data, but rather real-time 

inquiries to current data. Hence, New OLTP requires a real-time query capability. 

In my opinion, these two characteristics are shared by quite a number of enterprise 

non-Web applications. For example, electronic trading firms often trade securities in 

several locations around the world. The enterprise wants to keep track of the global 

position (short or long and by how much) for each security. To do so, all trading actions 

must be recorded, creating a fire hose of updates. Furthermore, there are occasional 

real-time queries. Some of these are triggered by risk exposure—i.e., alert the CEO if 

Communications Digital Library 

Home News Blogs Opinion Browse by Subject Magazine Archive Careers ACM Resources Subscribe 

Home » Blogs » BLOG@CACM » New SQL: An Alternative to NoSQL and Old SQL for New... » Full Text 

TOOLS FOR READERS 

Comments 

Print 

Text Size 

RELATED NEWS & OPINION 

Robot 'Mission Impossible' 

Wins Video Prize 

New Scientist 

Embracing Noise or Why 

Computer Scientists Should 

Stop Worrying and Learn to 

Love the Errors 

Greg Linden 

What Constitutes an Act of 

Cyberwar? 

Slate 

RELATED ACM RESOURCES 

CONFERENCES: 

ICNB 2011: 2nd International 

Conference on 

Nanotechnology and 

Biosensors - December 28 

– 30, 2011; Dubai, United 

Arab Emirates 

COURSES: 

Networking and Security 

Administration of Red Hat 

Linux 5 - For network 

administrators managing 

security and performance of 

multiple systems on a 

network. Implement 

advanced system and 

network services along with 

security policies to ensure … 

MORE ABOUT ACM 

RESOURCES 

IN THE DIGITAL LIBRARY 

Software systems as cities: a 

controlled experiment 

ICSE '11: Proceeding of the 

33rd international conference 

on Software engineering 

Code Gestalt: a software 

visualization tool for human 

beings 

CHI EA '11: Proceedings of 

the 2011 annual conference 

extended abstracts on 

Human factors in computing 

systems 

Monitoring code quality and 

development activity by 

software maps 

MTD '11: Proceeding of the 

2nd working on Managing 

technical debt

the aggregate risk for or against a particular security exceeds a certain monetary 

threshold. Others come from humans, e.g., “What is the current position of the firm with 

respect to security X?” 

Hence, we expect New OLTP to be a substantial application area, driven by Web 

applications as the early adopters but followed by enterprise systems. Let’s look at the 

deployment options. 

1) Traditional OLTP. This architecture is not ideal for New OLTP for two reasons. 

First, the OLTP workload experienced by New OLTP may exceed the capabilities of Old 

SQL solutions. In addition, data warehouses are typically stale by tens of minutes to 

hours. Hence, this technology is incapable of providing real-time analytics. 

2) NoSQL. There have been a variety of startups in the past few years that call 

themselves NoSQL vendors. Most claim extreme scalability and high performance, 

achieved through relaxing or eliminating transaction support and moving back to a low- 

level DBMS interface, thereby eliminating SQL. 

In my opinion, these vendors have a couple of issues when presented with New 

OLTP. First, most New OLTP applications want real ACID. Replacing real ACID with 

either no ACID or “ACID lite” just pushes consistency problems into the applications 

where they are far harder to solve. Second, the absence of SQL makes queries a lot of 

work. In summary, NoSQL will translate into “lots of work for the application”—i.e., this 

will be the full employment act for programmers for the indefinite future. 

3) New SQL. Systems are starting to appear that preserve SQL and offer high 

performance and scalability, while preserving the traditional ACID notion for 

transactions. To distinguish these solutions from the traditional vendors, we term this 

class of systems “New SQL.” Such systems should be equally capable of high 

throughput as the NoSQL solutions, without the need for application-level consistency 

code. Moreover, they preserve the high-level language query capabilities of SQL. Such 

systems include Clustrix , NimbusDB , and VoltDB . (Disclosure: I am a founder of 

VoltDB.) 

Hence, New SQL should be considered as an alternative to NoSQL or Old SQL for New 

OLTP applications. If New OLTP is as big a market as I foresee, I expect we will see 

many more New SQL engines employing a variety of architectures in the near future. 

Disclosure: Michael Stonebraker is associated with four startups that are either 

producers or consumers of database technology. 

USER COMMENTS (10) 

RavenDB - a document database of the NoSql genre, has great ETL, OLTP support, and is backed by 

ACID storage. See http://t.co/fVxvmSV (and it evolved a lot since then). 

— Anonymous, June 17, 2011 

Was hoping to know the characteristicsmof New SQL, Actuallynnot convinced with your arguments 

about noSQL databases as you have no argures presented. Problems with old SQL are known, 

promises of noSQL are also known, what is new SQL? 

— Anonymous, June 19, 2011 

In the context of transaction processing, I would define a NewSQL DBMS as one having the following 5 

characteristics: 

1) SQL as the primary mechanism for application interaction 

2) ACID support for transactions 

3) A non-locking concurrency control mechanism so real-time reads will not conflict with writes, and 

thereby cause 

them to stall.

4) An architecture providing much higher per-node performance than available from the traditional 

"elephants" 

5) A scale-out, shared-nothing architecture, capable of running on a large number of nodes without 

bottlenecking 

--Michael Stonebraker 

MIT 

— CACM Administrator, June 22, 2011 

About point 3, 

Isn't this achieved in most traditional databases by MVCC. 

About point 4, 

Most databases (Clustrix,Akiban,NimbusDB) which are NewSQL candidates talks only about better 

query performance using distributed query or kind of object storage. I am not sure if they have anything 

in better for DML performance. 

VoltDB is an exception; 

I am not sure if they are much better than TerraData,GreenPlum which are based on OLD RDBMS 

architecture. 

About Point 5, 

Yes, this is a new feature; if I understand correctly, it means scaling the performance by adding new 

nodes without interrupting existing users; 

So does any traditional RDBMS implements following 2 points is candidate for NewSQL: 

1. Better Query or DML Performance by distributing load. 

2. Scale-out, shared nothing architecture. 

— Anonymous, July 1, 2011 

My previous comment suggested 5 criteria that defined a NewSQL DBMS. I would like to stress three 

points that I made previously. First, my posting focused on DBMSs for new OLTP applications. Data 

warehouse vendors, such as Teradata and Greenplum, are focused on a completely different market, 

and are not designed to perform high velocity transactions. Hence, they are not considered as 

NewSQL vendors. 

Second, most of the OldSQL vendors use standard two phase locking, although there are exceptions. 

Hence, there are OldSQL engines which satisfy some of my 5 criteria. 

Third, one of the big problems with OldSQL engines is their mediocre per-node performance. One of 

my criteria for NewSQL vendors is much better per-node performance. The proof of this is via 

performance on standard benchmarks. Hence, whether any particular vendor satisfies this criteria 

would have to be determined experimentally. As such, the list of vendors who satisfy the 5 criteria may 

well change over time. 


MIT 

— CACM Administrator, July 5, 2011 

Incase any database has scale-out architecture, why it is most necessary that per-node performance 

also should be very high; anyway if performance is getting better by adding more nodes it will be 

achieved. 

My main focus is to understand why can't some existing database like PostgreSQL be considered an 

option similar to NewSQL, if we enhance it to support scale-out architecture in it. Saying this doesn't 

mean I have a clear idea about how to achieve it. 

It will have benefit to existing customers as well even though the performance is somewhat less than 

NewSQL engines. It can save lot of effort to change applications to suit to new engines. 


In round numbers,NewSQL engines are a factor of 50 or more faster on New OLTP than OldSQL. That 

means that and OldSQL engine would require 500 nodes to do a task that can be accomplished by a 

NewSQL engine with 10 nodes. 

The downside of 500 nodes is: 

increased hardware cost 

increased power 

increased system administration complexity 

increased data base administration complexity 

high availability complexity (if a node fails once a month, then NewSQL fails every 3rd day, while 

OldSQL 

fails once an hour.) 

less flexibility (if load doubles, have to add 500 more nodes, rather than 10)


MIT 

— CACM Administrator, July 11, 2011 

Name: Marius from Scimore 

2PC doesn't scale, also in-doubt states almost impossible to handle correct, since, at any time the 

coordinator could die and no way to enlist the in-doubt transactions. For example, instead of 2PC, 

ScimoreDB from www.scimore.com uses dynamic 2PC, where nested sub-coordinators coordinate 

transactions in parallel, and, failures of sub-coordinators are tolerated. When combining D2PC with the 

tree execution, massive execution parallelization can be achieved, handling 100's of nodes with 

minimum penalty to compare with a master to nodes query distribution. 

I am interested, if ScimoreDB can be considered as NewSQL, we still based on sub-set of 2PC, since, 

we believe, quorum based algorithms impose performance penalties on read transactions. 


On July 5th you write, "The proof of this is via performance on standard benchmarks." 

Would you point the curious to these results. 

All that you say sounds good and theoretical. Some real numbers would make them something more 

than intellectual entertainment. 


Mike, 

I admire your work and what you've achieved as well as admit that I've greatly benefited from the 

insights you and others (incl. Rick Cattell in [1]) have shared with the community. However, when I 

read the blog post at hand I have to ask myself: do we really need to come up with new marketing 

terms every year (hint: NewSQL) or wouldn't our time be better invested in trying to understand what 

the fundamental issues are and how to overcome them. For example, the MR vs. parallel DB 

discussion (is it still alive?) or, maybe, someone can help me finding answers to my question [2]? 

KUTGW and hope to hear from you! 

Cheers, 

Michael 

[1] http://0-cacm.acm.org.millennium.lib.cyut.edu.tw/magazines/2011/6/108651-10-rules-for-scalable- 

performance-in-simple-operation-datastores 

[2] http://webofdata.wordpress.com/2011/05/02/nosql-linked-data-processing/ 


Post a comment... 

Name: 

Signed and anonymous 

comments submitted to this site 

are moderated and will appear 

if they are relevant to the topic 

and not abusive. Your 

comment will appear with your 

username if you are signed 

into the site, and will be 

anonymous if you are not 

signed in. 

View our policy on comments 

Comments: 

Input error: Invalid referer 

HOME NEWS BLOGS OPINION BROWSE BY SUBJECT MAGAZINE ARCHIVE CAREERS ACM RESOURCES 

About Communications | Join ACM | Renew | Subscribe | Sign In | For Authors | For Advertisers | Privacy | Site Map | Help | Contact Us | Mobile Site 

Copyright © 2011 by the ACM. All rights reserved.

New SQL: An Alternative to NoSQL and Old SQL for New OLTP Apps

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?