Cassandra: Fact vs fiction

Cassandra has seen some impressive adoption success over the past months, leading some to conclude that Cassandra is the frontrunner in the highly scalable database space (a subset of the hot NoSQL category). Amid all the attention, some misunderstandings have been propagated, which I'd like to clear up.

Fiction: "Cassandra relies on high-speed fiber between datacenters" and can't reliably replicate between datacenters with more than a few ms of latency between them.

Fact: Cassandra's multi-datacenter replication is one of its earliest features and is by far the most battle-tested in the NoSQL space. Facebook has had Cassandra deployed in east and west coast datacenters since before open-sourcing it. SimpleGeo's Cassandra cluster spans 3 EC2 availability zones, and Digg is also deployed on both coasts. Claims that this can't possibly work are an excellent sign that you're reading an article by someone who doesn't know what he's talking about.

Fiction: "It’s impossible to tell when [Cassandra] replicas will be up-to-date."

Fact: Cassandra provides consistency when R + W > N (read replica count + write replica count > replication factor), to use the Dynamo vocabulary. If you do both writes and reads at QUORUM, for example, you can expect consistent data as soon as enough nodes are reachable to form a quorum. Cassandra also provides read repair and anti-entropy, so that even reads at ConsistencyLevel.ONE will be consistent after either of these has run.
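To make the R + W > N rule concrete, here is a minimal Python sketch. It is not Cassandra code and the function names are hypothetical. It only checks the arithmetic: when R + W > N, every possible set of R replicas read must share at least one replica with every possible set of W replicas written, so a read always touches a replica that saw the latest write.

```python
from itertools import combinations


def quorum(n: int) -> int:
    """Smallest majority of n replicas (hypothetical helper, not a Cassandra API)."""
    return n // 2 + 1


def overlap_guaranteed(r: int, w: int, n: int) -> bool:
    """The Dynamo-style rule: reads and writes must overlap when R + W > N."""
    return r + w > n


def overlap_by_enumeration(r: int, w: int, n: int) -> bool:
    """Brute-force check: does every read set intersect every write set?"""
    replicas = range(n)
    return all(
        set(reads) & set(writes)
        for reads in combinations(replicas, r)
        for writes in combinations(replicas, w)
    )
```

With a replication factor of 3, `quorum(3)` is 2, so QUORUM reads plus QUORUM writes give 2 + 2 > 3 and the overlap holds. Reading and writing at ONE gives 1 + 1 ≤ 3, which is why those reads depend on read repair or anti-entropy to converge.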

Fiction: Cassandra has a small community

Fact: Although popularity has never been a good metric for determining correctness, it's true that when using bleeding-edge technology, it's good to have company. As I write this late at night (in the USA), there are 175 people in the Cassandra IRC channel, 60 in the HBase one, 32 in Riak's, and 15 in Voldemort's. (Six months ago, the numbers were 90, 45, and 12 for Cassandra, HBase, and Voldemort. I was not yet hanging out in #riak then.) Mailing list participation tells a similar story.

It's also interesting that the creators of Thrudb and dynomite are both using Cassandra now, indicating that the predicted NoSQL consolidation is beginning.

Fiction: "Cassandra only supports one [keyspace] per install."

Fact: This has not been true for almost a year (since June of 2009).

Fiction: Cassandra cannot support Hadoop, or supporting tools such as Pig.

Fact: It has always been straightforward to send the output of Hadoop jobs to Cassandra, and Facebook, Digg, and others have been using Hadoop like this as a Cassandra bulk-loader for over a year. For 0.6, I contributed a Hadoop InputFormat and related code to let Hadoop jobs process data from Cassandra as well, while cooperating with Hadoop to keep processing on the nodes that actually hold the data. Stu Hood then contributed a Pig LoadFunc, also in 0.6.

Fiction: Cassandra achieves its high performance by sacrificing reliability (alternately phrased: Cassandra is only good for data you can afford to lose)

Fact: Unlike some NoSQL databases (notably MongoDB and HBase), Cassandra offers full single-server durability. Relying on replication is not sufficient for can't-afford-to-lose-data scenarios: if your data center loses power and you are not syncing to disk, you are highly likely to lose data no matter how many replicas you have. And if you run large systems in production long enough, you will realize that power outages, through some combination of equipment failure and human error, are not occurrences you can ignore. But with its fsync'd commitlog design, Cassandra can protect you against that scenario too.
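The general idea behind an fsync'd commit log can be sketched in a few lines of Python. This is an illustration of the technique only, not Cassandra's actual commitlog format or code. The function names and the length-prefixed record layout are assumptions made for the example.

```python
import os


def append_durably(log_path: str, record: bytes) -> None:
    """Append one length-prefixed record and force it to disk before returning."""
    with open(log_path, "ab") as log:
        log.write(len(record).to_bytes(4, "big"))  # 4-byte size header (assumed layout)
        log.write(record)
        log.flush()              # push Python's buffer to the OS
        os.fsync(log.fileno())   # ask the OS to push its cache to the device


def replay(log_path: str) -> list:
    """Read back every complete record, ignoring a torn write at the tail."""
    records = []
    with open(log_path, "rb") as log:
        while True:
            header = log.read(4)
            if len(header) < 4:
                break
            size = int.from_bytes(header, "big")
            payload = log.read(size)
            if len(payload) < size:
                break  # incomplete final record, e.g. power lost mid-write
            records.append(payload)
    return records
```

The point of the sketch is the ordering: a write is acknowledged only after `os.fsync` returns, so a power loss on a single server should not lose acknowledged records, and `replay` can rebuild state on restart. Replication alone gives no such guarantee if every replica is still holding the write in memory.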

What to do after your data is saved, e.g. backups and snapshots, is outside of my scope here but covered in the operations wiki page.

Comments

Is FB using the Apache version of Cassandra or their own internal version?
Toby DiPasquale said…
Totally love Cassandra, however, I would mention that all three EC2 availability zones are currently in the same datacenter on the East Coast so that's not such an awesome example of datacenter distribution.
Unknown said…
Toby: You're wrong that AZs are in the same data center. AZs are physically different locations that are ~10-15 miles apart.
kryton said…
Fact or fiction: Cassandra can brew my coffee in multiple data centers at the same time?
cloudhead said…
note that in mongo you can now use the --syncdelay option to improve the durability
Anonymous said…
A question that is not related to this blog but I guess you are the right person to ask: why Cassandra is written in Java? Excuse me if this question is over asked. Because writing robust apps would be easy with Java? Or any reason else. I just want to know why if there is a why, thank you
Unknown said…
Are there any solutions for using Hadoop with Cassandra?
Platypus said…
For the record, since I've been known to write about this before, I just want to say I've never denied that Cassandra can work across data centers. What I've said is that I don't think the way it works in that environment is the best way, but that's an entirely different kind of statement. Facts are facts, and good enough is good enough. It's just part of my nature to worry about the icebergs we haven't hit yet.
LaggyLuke said…
Just for the record, channel #mongodb on FreeNode has 207 people right now. This makes whole thing look a bit biased.
Anonymous said…
What is the lowest speed boundary?
Rafael Ribeiro said…
Hi Jonathan,

I am in the process of writing an article about Cassandra for one of the Java-focused magazines in Brazil. As I am giving an overview of its architecture before covering its usage in Java, I'd like to cover a little of its gossip protocol. Do you suggest any readings? Any particular paper/post/whatsoever?

Feel free to contact me over my e-mail that you'll probably see in the post.

best regards,
Rafael Ribeiro
Jonathan Ellis said…
For Gossip, you want to read http://wiki.apache.org/cassandra/ArchitectureGossip and http://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf
Unknown said…
So my question is.. can Cassandra replace mysql in all conditions and scenarios?
if it can't which ones that it can't?
Rafael Ribeiro said…
Hi Jonathan!

First of all tks a lot for the article and wiki page link, it clarified a lot of things.
Still... mind if I certify that I got things right? According to the wiki, and using the naming conventions in the article, Cassandra is an anti-entropy gossip protocol, since the wiki only mentions that it gossips each second and does not mention anything about stopping due to a certain criteria, right? Second, the wiki says it randomly picks peers to receive deltas.
Another thing I noticed is that Cassandra employs a protocol similar to the one described on the article, right? And at last... it is a push pull right? Since it pushes deltas upon updates and pulls upon reads (the mentioned read repair thing), right?

tks a lot for all the help!
kindly regards,
Rafael Ribeiro
Robert said…
Currently there is no PHP support for using Cassandra. Any info regarding implementing Cassandra with PHP (a driver or something)?
Jonathan Ellis said…
PHP is supported via Thrift, like everything else. See http://wiki.apache.org/cassandra/ClientOptions
