Skip to main content

Cassandra in Google Summer of Code 2010

Cassandra is participating in the Google Summer of Code, which opened for proposal submission today. Cassandra is part of the Apache Software Foundation, which has its own page of guidelines up for students and mentors.

We have a good mix of project ideas involving both core and non-core areas, from straightforward code bashing to some pretty tricky stuff, depending on your appetite. Core tickets aren't necessarily harder than non-core, but they will require reading and understanding more existing code.

Non-core

  • Create a web ui for cassandra: we have a (fairly minimal) command line interface, but a web gui is more user-friendly. There is the beginnings of such a beast in the Cassandra source tree at contrib/cassandra_browser [pretty ugly Python code] and a gtk-based one at http://github.com/driftx/chiton [also Python, less ugly].
  • First-class commandline interface: if you prefer to kick things old-school, improving the cli itself would also be welcome.
  • Create a Cassandra demo application: we have Twissandra, but we can always use more examples to introduce people to "thinking in Casssandra," which is the hardest part of using it. This one seems to be the most popular with students so far. (So stand out from the crowd, and submit something else too. :)

Almost-core

Core

  • Avro RPC support: currently Cassandra's client layer is the Thrift RPC framework, which sucks for reasons outside our scope here. We're moving to Avro, the new hotness from Doug Cutting (creator of Lucene and Hadoop, you may have heard of those). Basically this means porting org.apache.cassandra.thrift.CassandraServer to org.apache.cassandra.avro.CassandraServer; some examples are already done by Eric Evans.
  • Session-level consistency: In one and two Amazon discusses the concept of "eventual consistency." Cassandra uses eventual consistency in a design similar to Dynamo. Supporting session consistency would be useful and relatively easy to add: we already have the concept of a Memtable to "stage" updates in before flushing to disk; if we applied mutations to a session-level memtable on the coordinator machine (that is, the machine the client is connected to), and then did a final merge from that table against query results before handing them to the client, we'd get it almost for free.
  • Optimize commitlog performance: this is about as low-level as you'll find in Cassandra's code base. fsync, CAS, it's all here. http://wiki.apache.org/cassandra/ArchitectureCommitLog describes the current CommitLog design.

You can comment directly on the JIRA tickets after creating an account (it's open to the public) if you're interested or have other questions. And of course feel free to propose other ideas!

Comments

So why does Thrift suck? Was the reasoning behind switching to Avro written down anywhere? I'm mostly just curious as to why you made the decision you've made.
Jonathan Ellis said…
The short answer is, "Thrift development is broken, and I'm not optimistic that it will get better any time soon." Exhibit A is http://issues.apache.org/jira/browse/THRIFT-347, which has been open over a year with a patch fixing an issue that everyone running PHP + Cassandra runs into, but is still not committed.
Anonymous said…
Is there a easy way to setup the Cassandra browser using thrift interface.

The procedure explained at "README" section of http://svn.apache.org/repos/asf/cassandra/trunk/contrib/cassandra_browser/ does not seems to be working.

Facing issues in generating thrift interfaces.
Please post a sample if you have one.

Thanks a lot.

Popular posts from this blog

PyCon Python IDE review

I presented an IDE review at PyCon last Friday. It was basically a re-review of what I thought were the 3 most promising IDEs from the Utah Python User Group IDE review , to which I added SPE, which was by far the most popular of the ones we left out that time. The versions reviewed are: PyDev 1.0.2 SPE 0.8.2.a Komodo 3.5.2 Wing IDE 2.1 beta 1 I'd intended to base my presentation around a comparison of writing a smallish program in each of the IDEs, but the more I tried to make this not suck, the more I realized it was a losing proposition. Instead, I decided to try to focus on the features in each that most set them apart from the others (both positive and negative); this seemed more likely be useful. (I did a new feature matrix for this review, which is included after my comments. The slides I used are also up, at http://utahpython.org/jellis/pycon-ides.pdf , but aren't very useful absent video of the presentation itself. Hence this post.) PyDev PyDev has g...

Why PHP sucks

(July 8 2005) Apparently I got linked by some PHP sites, and while there were a few well-reasoned comments here I mostly just got people who only knew PHP reacting like I told them their firstborn was ugly. These people tended to give variants on one or more themes: All environments have warts, so PHP is no worse than anything else in this respect I can work around PHP's problems, ergo they are not really problems You aren't experienced enough in PHP to judge it yet As to the first, it is true that PHP is not alone in having warts. However, the lack of qualitative difference does not mean that the quantitative difference is insignificant. Similarly, problems can be worked around, but languages/environments designed by people with more foresight and, to put it bluntly, clue, simply don't make the kind of really boneheaded architecture mistakes that you can't help but run into on a daily baisis in PHP. Finally, as I noted in my original introduction, with PHP, ...

A review of 6 Python IDEs

(March 2006: you may also be interested the updated review I did for PyCon -- http://spyced.blogspot.com/2006/02/pycon-python-ide-review.html .) For September's meeting, the Utah Python User Group hosted an IDE shootout. 5 presenters reviewed 6 IDEs: PyDev 0.9.8.1 Eric3 3.7.1 Boa Constructor 0.4.4 BlackAdder 1.1 Komodo 3.1 Wing IDE 2.0.3 (The windows version was tested for all but Eric3, which was tested on Linux. Eric3 is based on Qt, which basically means you can't run it on Windows unless you've shelled out $$$ for a commerical Qt license, since there is no GPL version of Qt for Windows. Yes, there's Qt Free , but that's not exactly production-ready software.) Perhaps the most notable IDEs not included are SPE and DrPython. Alas, nobody had time to review these, but if you're looking for a free IDE perhaps you should include these in your search, because PyDev was the only one of the 3 free ones that we'd consider using. And if you aren...