Wednesday, April 24, 2013

Concurrency is not parallelism

Not so new, but still a good piece of reading: http://blog.golang.org/2013/01/concurrency-is-not-parallelism.html
"Concurrency is the composition of independently executing processes, while Parallelism is the simultaneous execution of (possibly related) computations"
As I have written several times in the past, in OLTP throughput is king, and concurrency is the main thing being put to the test.

Concurrency is where Facebook has a million "Like"s every second, each "Like" is independent, and they need to be processed concurrently.

Parallelism is where there are only a few concurrent activities, say a few analytic reports running in Oracle Exadata, Vertica or GreenPlum. Every report is sliced into many related computations that execute simultaneously.

Are these the same?

From 50,000 feet, we see many things running at the same time, in parallel, concurrently, maybe even distributed. But we need to be accurate: there is a huge difference, and it is in the source. How many "original" transactions did we have to process? A million "Like"s vs. a few big analytic reports. In both cases I see a million operations coming out at the back, but:
In the "Like"s use case - those are the real transactions, concurrently running, distributed.
In the report use case - those are a million pieces of the same initial single job.
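
To make the difference concrete, here is a minimal Python sketch. Everything in it is a made-up stand-in (`process_like`, `report_slice` and the worker counts are assumptions for illustration, not anyone's real code). In the first function every item is an original transaction, complete on its own; in the second there is a single original job, sliced into pieces whose partial results must be merged back into one answer. Threads are used only to show the structure; in CPython they will not actually speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def process_like(like_id):
    """One independent transaction: a single "Like" (hypothetical stand-in)."""
    return ("like", like_id)


def report_slice(rows):
    """One piece of a single big job: a partial sum for one report."""
    return sum(rows)


def run_likes(like_ids, workers=4):
    # Concurrency: many original transactions, each complete on its own.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_like, like_ids))


def run_report(rows, slices=4, workers=4):
    # Parallelism: ONE original job, sliced into related pieces whose
    # partial results must be combined into a single answer.
    chunks = [rows[i::slices] for i in range(slices)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(report_slice, chunks))


if __name__ == "__main__":
    likes = run_likes(range(1000))
    total = run_report(list(range(1000)))
    print(len(likes), "independent transactions")
    print("1 report, total =", total)
```

Both functions keep 4 workers busy, but only `run_likes` started from many original transactions.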

Important! Not to be confused! Big difference! One is great for throughput scalability and one is not. More in my next post.


Wednesday, April 3, 2013

MySQL thread pool and scalability examples

Nice article about the SimCity outage and ways to defend databases: http://www.mysqlperformanceblog.com/2013/03/16/simcity-outages-traffic-control-and-thread-pool-for-mysql/

The graphs showing throughput with and without the thread pool come from a benchmark performed by Oracle, published here:
http://www.mysql.com/products/enterprise/scalability.html

The main takeaway is this graph (all rights reserved to Oracle, picture original URL):
20x Better Scalability: Read/Write
Scalability is where throughput can grow and grow, as demand grows. I need to get more from the database, the question is: "can it scale to give it to me?". Scalability is where the response time remains "acceptable" while the throughput grows and grows.

Every database has a "knee point".
  1. In the best-case scenario, at the knee point throughput goes into a flat plateau. At the same point, BTW, response time starts climbing, passing the non-acceptable point.
  2. In the worse-case scenario, at the knee point throughput, instead of going flat, takes a plunge. At the same point, BTW, response time starts climbing fast, through the roof.
Actually, the red best-case scenario is pretty bad... There's NO scalability there, throughput has a hard limit! It's around 6,500 transactions per second. I need to do more on my DB, there are additional connections - but the DB is not giving even 1 inch of additional throughput. It doesn't scale.

The thread pool feature is no more than a defense mechanism. It doesn't break the scalability limit of a single machine, rather its job is to defend the database from death.

Real scalability is when the throughput graph is neither dropping nor becoming flat - it goes up and up and up with a stable response time. This can be achieved only by Scale Out. Getting 7,500 TPS with 1 database with 32 connections, then adding an additional database, and the straight line going up will reach, say, 14,000. A system with 3 databases can support 96 connections and 21,000 TPS... and on and on it goes...
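
A toy model of that arithmetic, in Python. This is only a sketch under loud assumptions: the 7,000 TPS ceiling per database and the 220 TPS per connection are numbers I made up so that 2 and 3 databases land on the 14,000 and 21,000 mentioned above; they are not measurements from the benchmark.

```python
def single_db_tps(connections, tps_per_connection=220, ceiling=7000):
    """Throughput of one database: grows with connections, then goes flat.

    Both default numbers are illustrative assumptions, not benchmark results.
    """
    return min(connections * tps_per_connection, ceiling)


def scale_out_tps(connections, databases, tps_per_connection=220, ceiling=7000):
    """Throughput when connections are spread evenly over several databases."""
    per_db = connections // databases
    return databases * single_db_tps(per_db, tps_per_connection, ceiling)


if __name__ == "__main__":
    for connections in (32, 64, 96):
        print(
            connections,
            "connections:",
            "1 DB ->", single_db_tps(connections),
            "| 3 DBs ->", scale_out_tps(connections, 3),
        )
```

With 96 connections the single database stays stuck at its ceiling, while the same connections spread over 3 databases get 3 times that.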

Data needs to be distributed across those databases, so the load can be distributed as well. Maintaining this distributed data on the scaled-out database is the key... I'll touch on that in future posts.

Tuesday, March 26, 2013

They say: "Relational Databases Aren't Dead"

This is a good read, claiming: "Relational Databases Aren't Dead. Heck, They're Not Even Sleeping", http://readwrite.com/2013/03/26/relational-databases-far-from-dead. A key quote:
"While not comprehensive, the uses for NoSQL databases center around the acquisition of fast-growing data or data that does not easily fit within uniform structures."

There were 2 parts in the statement about NoSQL's uses. I'll start with the latter:

"data that does not easily fit within uniform structures" - NoSQL is probably the right choice, although I always encourage thinking and architecting in advance. Also, online structure changes do exist in the RDBMS world, and recently in MySQL too: http://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.html
I would definitely warn about the caveats of NoSQL when it comes to actually using and querying the data that is so easily stored there...

"acquisition of fast-growing data" - is no longer a no-go for RDBMS and MySQL databases. Distributed RDBMS solutions exist today, and they can extract performance and scalability from good old MySQL itself.

What do you think?

Friday, January 4, 2013

Partial replication and sharding

I came across this: http://stackoverflow.com/questions/14136633/difference-between-partial-replication-and-sharding
I was wondering if sharding is an alternate name for partial replication or not. What I have figured out that --
  • Partial Repl. – each data item has only copies at some but not all of the nodes (‘Sharding’?)
  • Pure Partial Repl. – has only copies of a subset of the data item but no node contains a full copy of the database
  • Hybrid Partial Repl. – a set of nodes are full replicas and another set of nodes are partial replicas

I thought it was a good topic, and I wrote a really nice answer, but there was a problem when I pressed "post this answer", probably some error on their side. Anyway - this is what I have to say about partial replication and sharding:

Partial replication is an interesting approach, in which you distribute the data by replicating from a master to slaves, each containing a portion of the data. Eventually you get an array of smaller, read-only DBs, each containing a portion of the data. Reads can very well be distributed and parallelized.

But what about the writes? 

Those are still clogged in 1 big fat lazy master database. Tasks such as buffer management, locking, thread locks/semaphores and recovery are the real bottleneck of OLTP, and they make writes impossible to scale... as I wrote in many previous posts, for example here: http://database-scalability.blogspot.com/2012/08/scale-up-partitioning-scale-out.html

Sharding is where every piece of data lives only in one place, within an array of DBs. Each database is the complete owner of the data: data is read from there, data is written to there. This way, reads and writes are distributed and parallelized, real scale-out can be achieved. 
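
Here is a minimal sketch of that idea in Python, with plain dicts standing in for the databases. `ShardedStore` and its methods are hypothetical names for illustration, and hash-modulo routing is just one simple way to pick the owner of a key; it is not how any particular product does it.

```python
import zlib


class ShardedStore:
    """Toy shard router: every key lives in exactly one shard."""

    def __init__(self, shard_count):
        # Each dict stands in for one independent database.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, key):
        # Stable hash, so the same key always maps to the same shard.
        return zlib.crc32(str(key).encode("utf-8")) % len(self.shards)

    def write(self, key, value):
        # Writes go to the single owner of the key.
        self.shards[self.shard_for(key)][key] = value

    def read(self, key):
        # Reads go to that same owner; no other shard is consulted.
        return self.shards[self.shard_for(key)].get(key)


if __name__ == "__main__":
    store = ShardedStore(4)
    for user_id in range(10):
        store.write("user:%d" % user_id, {"likes": user_id})
    print(store.read("user:7"))
    print([len(shard) for shard in store.shards])
```

Note that with plain modulo routing, changing the number of shards moves most keys to a different owner, which is part of the maintenance mess described next.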

The idea behind sharding is great, the ultimate scaling solution (ask Facebook, Google, Twitter and all the other big guys), but it's a mess to handle and maintain. It's hard as hell if you do it yourself. ScaleBase provides an automatic, transparent scale-out machine that does all that, so you won't have to...



Friday, December 21, 2012

The battle on the OS...

I came across this piece: http://www.zdnet.com/windows-has-fallen-behind-apple-ios-and-google-android-7000008699/, and I'm all nostalgic now...

I had an Atari, my first computer was an Apple IIc, and a bunch of my friends had a Commodore 64 (I envied them, it was much cooler than the Apple!) and an Amiga (wow!!).

Then came almost 2 lost decades in which people knew* that Wintel was the only thing out there, and couldn't believe there was, or ever would be, anything else. Just like in the quote from MIB below...

Thank you smartphone, and thank you RIM, Apple and then Google. You made the world a better place.
I know you're not doing anything for us, you're doing it for your own good, to generate value and yield for your stockholders, while pumping ridiculous paychecks to the executives...

Still, you managed to make the world a better, more interesting place to live in, and more challenges to cope with... Big Data, OLTP velocity, app and data sprawl. More challenges for people like me to seek a solution for! You made scalability a big challenge! So, thank you.

BTW, as it turns out, I have a Wintel laptop and a Nokia Windows (splendid!) phone, but I do it out of open-mindedness, celebrating plurality!


* - The unforgettable quote from "Men In Black":

"...Fifteen hundred years ago everybody knew the Earth was the center of the universe. Five hundred years ago, everybody knew the Earth was flat, and fifteen minutes ago, you knew that humans were alone on this planet. Imagine what you'll know tomorrow"

Monday, December 17, 2012

Database Performance, a Ferrari and a truck

In the last few days I got several queries, from colleagues and customers, about one thing I thought was a given and well known, but I found out differently: "What is database performance?". Is it speed? Is it throughput? What are the metrics and how do you measure?

I tried to refer to an existing link, but then had to write and describe it myself. The thing nearest to describing what I think "Database Performance" really is, is this. It's not bad, yet I was able to make it even simpler for my esteemed colleagues and more esteemed customers.

Database performance, in essence, is derived from 2 major metrics:
Latency: the time we wait for an operation to finish. Measured in milliseconds (ms) or any other time unit.
Throughput: the number of transactions/commands per time unit, usually second or minute.

In the classic world of Data Warehouse and Analytics, throughput is usually a non-issue and latency is king. As the database grows larger and larger, complex analytic queries take longer and longer to finish, and the demand is "I need speed!".

In the world of OLTP, throughput is the important measure. TPC-C benchmarks, for example, measure only throughput (New Order Transactions per Minute). Oracle managed to reach 30,249,688 NO Transactions Per Minute. Nice job, but we as readers of the results have no way to know whether a single transaction took 1ms and they managed to run enough of those in parallel within 1 minute to meet this number, or whether each transaction took exactly 1 minute and Oracle managed to perform 30,249,688 such transactions in parallel. The truth is somewhere in the middle, between 1 millisecond and 1 minute...
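
The two extremes can be put into numbers with Little's Law (average number of transactions in flight = throughput x latency). The law itself is standard queueing theory; the small Python sketch below just applies it to the published figure, and the function name `in_flight` is my own.

```python
def in_flight(throughput_per_minute, latency_seconds):
    """Little's Law: average transactions in flight = throughput * latency."""
    per_second = throughput_per_minute / 60.0
    return per_second * latency_seconds


if __name__ == "__main__":
    tpm = 30_249_688  # New Order transactions per minute, from the result above

    # If every transaction took 1 ms:
    print(round(in_flight(tpm, 0.001)))  # about 504 running at any moment

    # If every transaction took a full minute:
    print(round(in_flight(tpm, 60.0)))   # all 30,249,688 running at once
```

So the same throughput number is compatible with roughly 500 transactions in flight or with 30 million of them; without the latency you can't tell which.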

In OLTP the latency should be bearable (for some it's 50ms, for some it's 500ms) and stable, while throughput must grow and grow as the number of users/sites/devices/accounts/profiles grows and grows.

Another key word is predictability. In my OLTP I need predictable, good enough, bearable, constant latency. I can't afford a 50ms transaction taking 1 minute every once in a while. I need transaction latency to be some X I can live with, and I need it constant and predictable - while throughput is growing.
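
One way to look at predictability is to compare the median with a high percentile of the measured latencies. The Python sketch below uses invented samples (a steady workload, and one where 2% of the transactions take a full minute); the helper names and the nearest-rank percentile method are my choices for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile (pct given as an integer, e.g. 99)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Mean, median, 99th percentile and maximum of latency samples."""
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }


if __name__ == "__main__":
    # Invented samples, in milliseconds.
    steady = [50] * 1000
    spiky = [50] * 980 + [60_000] * 20  # 2% of transactions take 1 minute

    print("steady:", summarize(steady))
    print("spiky: ", summarize(spiky))
```

Both workloads have the same 50ms median; only the 99th percentile and the maximum show that the second one is not something I can live with.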

Not a popular comparison, but very very relevant: A Ferrari and a truck. Both have 500 horsepower.
A Ferrari will do 200 miles per hour! A truck, however, will drive a good legal 70, and she'll go the same 70 miles per hour with 100 pounds, 1 ton or 20 tons. Constant, stable, predictable. Yeah, I'd like to have a Ferrari for my spare time, or to ace a benchmark, but when it comes to backend server infrastructure, servers are more like a truck to me... and they deliver...

Life's not fair sometimes... at least one of these has definitely got the looks:


Wednesday, October 3, 2012

"(Cloud) is complete gibberish. It's insane. When is this idiocy going to stop?"


This is from Larry Ellison's keynote at Oracle OpenWorld in September 2008. Only 4 years ago.
"The computer industry is the only industry that is more fashion-driven than women's fashion. Maybe I'm an idiot, but I have no idea what anyone is talking about. What is it? It's complete gibberish. It's insane. When is this idiocy going to stop?"
"We'll make cloud computing announcements. I'm not going to fight this thing. But I don't understand what we would do differently in the light of cloud."
The above, along with additional gems, are all here: http://news.cnet.com/8301-13953_3-10052188-80.html

Yesterday Larry stood on that same stage and announced Oracle 12c. The c stands for... Cloud!

And what makes Oracle 12c cloud ready?
12c is a "container database." Its function is to hold lots of other databases, keeping their data separate, but allowing them to share underlying hardware resources like memory or file storage. So this way 12c can be used for software-as-a-service tech companies that need a way to let multiple customers access a single database. It's also geared toward large enterprises who may have hundreds of Oracle databases. It would let them consolidate their databases onto less hardware, saving them money on that and making all of those databases easier to manage.

So, in 2 short words: multi-tenancy.

Oracle is still shared-everything, big boxes, and allows virtualization using internal division and allocation of those shared resources to multiple smaller "virtual databases". Indeed, it's great for consolidation and also multi-tenancy.

Cloud? IMHO, cloud is multi-tenancy, but also scale (out) and elasticity. Amazon calls its cloud service EC2, where the E stands for Elastic. The new "DB made for the cloud" has no news about scale-out or about elasticity. Maybe we'll need to wait for Oracle 13e... :)

Also there's a new Exadata database machine, called X3... Yet another bigger box to do all the above. They say it's Oracle's competition to SAP HANA.

And finally, we have a new player in the cloud services... Oracle! They'll have a public cloud offering (like Amazon, Rackspace, HP) and also a private cloud, which is a replica of Oracle's public cloud that is put in the customer's own data center. Oracle would still own the hardware and be responsible for running it, securing it and updating it. "as a Service" in your premise. It's interesting and even rhymes...

So it seems someone regained his composure...