tag:blogger.com,1999:blog-116837132024-03-09T18:45:53.755-08:00SpycedJonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.comBlogger218125tag:blogger.com,1999:blog-11683713.post-70221189777971648812021-09-06T07:13:00.031-07:002022-05-13T07:11:49.638-07:00A review of Lambda School from the father of a recent graduate<h1 style="text-align: left;"><span style="background-color: transparent; color: black;">Background</span></h1><p></p><p><span style="background-color: transparent; color: black;">I’ve been a professional developer for twenty years. I exposed my son N to programming a couple times while he was growing up -- Scratch when he was around 8, Khan Academy javascript when he was 12. He learned it easily enough but it didn’t grab him.</span></p><p></p><p><span style="background-color: transparent; color: black;">But his junior year in high school he had a hole in his schedule and I convinced him to try AP CS to fill it. And this time, he got hooked. He started programming for fun in the evenings. You know how it goes.</span></p><p></p><p><span style="background-color: transparent; color: black;">Then in March 2020, Covid hit and his high school went virtual. It was a terrible experience, to the point that instead of going back for more his senior year, he took the last classes he needed to graduate over the summer, and decided to apply to programming boot camps in the fall. 
I think </span><a href="https://www.amazon.com/Case-against-Education-System-Waste/dp/0691174652" style="background-color: transparent; color: #1155cc;" target="_blank">the American college system is broken</a><span style="background-color: transparent; color: black;">, so I was happy to help evaluate his options for something different.</span></p><p></p><h1><span style="background-color: transparent; color: black;">Evaluating boot camps</span></h1><p></p><p><span style="background-color: transparent; color: black;">N and I came up with three criteria for evaluating boot camps. If they didn’t meet these three, we weren’t interested:</span></p><ol><li><span style="background-color: transparent; color: black;">Income-sharing agreement, or similar. The incentives for the school to, in effect, take your money and run are very very strong in for-profit education. ISA means they only get paid if you get a job. This creates a simple but powerful alignment.</span></li><li><span style="background-color: transparent; color: black;">Modern curriculum. If they’re still teaching Ruby on Rails, we’ll pass.</span></li><li><span style="background-color: transparent; color: black;">Pre-work. If a school will admit anyone who applies with a shotgun-style approach to student success, that’s not a good model. It’s a much better sign if they have a rigorous set of pre-admission work to demonstrate some level of interest and aptitude. I spent a little over a year teaching college level CS classes, so I know that for whatever reason programming just doesn’t fit everyone's brain.</span></li></ol><p></p><p><span style="background-color: transparent; color: black;">Our shortlist was App Academy, Hack Reactor, Rithm, and Lambda School. App Academy, Rithm, and Hack Reactor hit all three of the above criteria. 
Lambda School did not have pre-work as </span><span style="background-color: transparent; color: black;"><span style="background-color: transparent; color: black;">rigorous as the others</span>, but they offered the longest curriculum so we thought that could make up for it. (However, Lambda changed from a 9 month course to 6 months just before N applied.)</span></p><p></p><p><span style="background-color: transparent; color: black;">While applying, we found that App Academy had some fine print that they would not do an ISA for students under 20, so we took them off the list. </span></p><p></p><p><span style="background-color: transparent; color: black;">N was accepted to Rithm, Hack Reactor, and Lambda. He decided on Lamba primarily because they have been online-only from the beginning so we thought they probably had an edge over Hack Reactor and Rithm, which had been primarily (HR) or entirely (Rithm) in-person before the pandemic.</span></p><p></p><h1><span style="background-color: transparent; color: black;">Lamba School</span></h1><p><span style="background-color: transparent; color: black;">Lambda (now renamed to Bloom Technology) did a pretty competent job preparing N to be an entry level web developer. Enough HTML, CSS, Node.js, React, and PostgreSQL to be dangerous, plus the basics of Git and Bash. A good foundation that he can build on.</span></p><p></p><p><span style="background-color: transparent; color: black;">Lambda School’s curriculum is a series of six month-long units. After the first unit, most of the coursework was set up to prepare the students to tackle fairly meaty projects done in teams of five or six, divided into presentation layer, front end code (Node), and back end code (SQL). These divisions are by experience level at Lambda. So students X, Y, and Z would work on a project, then next month Z would graduate, X and Y would rotate positions, and W would join as the new guy. 
Multiply this by two to get the six person teams.</span></p><p></p><p><span style="background-color: transparent; color: black;">The quality of instruction was overall solid, but if something went wrong like a version incompatibility with node, getting help troubleshooting depended on who you asked.</span></p><p></p><p><span style="background-color: transparent; color: black;">After the first four units came a month of CS subjects like binary trees and recursion, and then for the final month, they got back into teams for a capstone project. N’s team revamped a web app for a tiny nonprofit. It was good experience, he learned a lot about understanding what an existing system did and how to rebuild it while keeping the good parts.</span></p><p></p><p><span style="background-color: transparent; color: black;">So on balance: while I totally get the standpoint that there is high quality instructional and reference material across the Internet for free or a much lower cost than Lambda School, I think that between the actual instruction, the accountability from a formal curriculum, the project work, and the real-world capstone with the nonprofit, Lambda delivered value for what N is paying.</span></p><p></p><h1><span style="background-color: transparent; color: black;">Post graduation</span></h1><p><span style="background-color: transparent; color: black;">So the coursework and instruction at Lambda was well done. Unfortunately, Lambda came up short in several areas of helping N find a job.</span></p><ol><li><span style="background-color: transparent; color: black;">Before finishing the coursework, there was little communication on how the job application process would go, what to expect, or how to prioritize your time. </span></li><li><span style="background-color: transparent; color: black;">Resume building was a mixed bag. 
They did help N create a plaintext resume, and they coached the students on how to present their prior, non-programming experience, but they did not help create a rich text version for human reviewers.</span></li><li><span style="background-color: transparent; color: black;">N doesn’t know how you get picked for one of the sexy new programs like <a href="https://twitter.com/austen/status/1329539767969615873?lang=en">Lambda Fellows</a>, nobody talked about that and nobody he knows was included.<br /></span></li><li><span style="background-color: transparent; color: black;">N graduated right in the middle of traditional summer intern season, which Lambda ignored completely. Seems like a missed opportunity at the very least -- surely most of the Lambda grads would prefer a paid internship to months of applying to developer positions while working another job.</span></li><li><span style="background-color: transparent; color: black;">Most importantly: after graduating, Lambda emailed N just five job openings over a period of two months to say, contact them here if you’re interested. That’s it, that’s the extent of their post-graduation job search support. (None of the five replied to N’s application.)</span></li></ol><p></p><p><span style="background-color: transparent; color: black;">On the positive side, N thinks Lambda had a great program for interview coaching. They did multiple rounds of one-on-one mock interviews to help the students get used to the kinds of questions they could expect in the interview process.</span></p><p></p><p><span style="background-color: transparent; color: black;">N ended up finding a job through my network, several of whom were willing to interview a new boot camp grad. (Thank you!) 
He just finished his first week working full time as a software developer.</span></p><p></p><h1><span style="background-color: transparent; color: black;">Commentary and educated guesses</span></h1><p><span style="background-color: transparent; color: black;">Given that N thought (and I think, and the people who interviewed him thought) that Lambda’s actual instruction was good, why are there so many reviews online complaining about it? I think there are two big factors:</span></p><p></p><p><span style="background-color: transparent; color: black;">First, Lambda is doing something new and consciously not following traditional instructional design because "that’s how we’ve always done it." Remember, Lambda was </span><i style="background-color: transparent; color: black;">online-only before the pandemic.</i><span style="background-color: transparent; color: black;"> That alone means things are going to be different from other schools. And they’re not trying to help you get a well rounded classical liberal arts education or even necessarily to “learn to learn” -- their goal is to teach you enough practical programming to get a job as an entry level developer. This means, for instance, that they do a lot more project work in teams than your nearest college CS department would. I think that’s a good thing.</span></p><p></p><p><span style="background-color: transparent; color: black;">It also means that nothing is sacred and things can change quickly. So they changed it from 9 months to 6 (which I believe did not affect already-enrolled students) and eliminated paid team leads from their project work (which did). If you try new things, some of them aren’t going to work out. I understand how this would suck as a student, but running a school is expensive, running a school that does something nobody has done before (successfully, and at scale) is even more expensive, so the faster they can iterate on what works and stop what doesn’t, the better. 
I don’t fault Lambda for this.</span></p><p></p><p><span style="background-color: transparent; color: black;">The other factor is students who didn’t have the necessary background to be successful. It makes me sad to see Lambda students writing about “flexing” (repeating) a unit </span><i style="background-color: transparent; color: black;">for a second time.</i><span style="background-color: transparent; color: black;"> I think there’s a high likelihood that they weren’t ready to be admitted. This is something Lambda could fix by increasing the rigor of their relatively short precourse work.</span></p><p></p><p><span style="background-color: transparent; color: black;">It’s a balance -- you don’t want to only admit students who are 100% guaranteed to succeed, but on the other hand it’s not really doing people a favor to admit them if they only have a 10% chance. I’m not sure exactly where the balance is, but it seems likely (based on what I see other bootcamps doing that create high-quality outcomes) that Lambda’s filter should be a bit tighter. (On the other hand, I see other people criticizing Lambda for making money off the students who are so well prepared that they would be able to get a job programming no matter what they did. This is definitely not the case for the typical Lambda student, but if people are saying that then maybe that’s an indicator that Lambda has about the right balance after all.)</span></p><p></p><p><span style="background-color: transparent; color: black;">Based on Lambda’s relative lack of help sourcing job opportunities for N, I also wonder if they’ve scaled too fast, too quickly. Lambda advertises two things: relevant skills, and help finding a job. It seems to me that it’s a lot easier to scale the instructional part, than your pipeline of companies who want to hire graduates from a new and relatively unproven school. 
This would explain the relative lack of referrals that N saw, and it would also explain why Lambda hasn’t released student success metrics since 1H 2019 over a year ago. (And for that, I do fault Lambda.)</span></p><p></p><h1><span style="background-color: transparent; color: black;">TLDR</span></h1><p><span style="background-color: transparent; color: black;">Lambda did a good job with curriculum and instructional design, maybe even a great job. But their job-search program was significantly weaker, or perhaps it just hasn’t been able to scale to meet an increased volume of admissions. I am cheering for Lambda and I hope they can fix it.<br /></span></p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.comtag:blogger.com,1999:blog-11683713.post-2197009325961408642012-03-18T16:01:00.001-07:002016-06-08T09:16:48.613-07:00Speaking to a technical conferenceI just got back from PyCon, and as with all conferences where the talks are delivered by engineers instead of professional speakers, we had a mixed bag. Some talks were great; others <a href="https://twitter.com/#%21/spyced/status/166893273850454018">made me get my laptop out</a>.<br />
<br />
The most important axiom is: a talk is not just an essay without random access. It's a different medium. Respect its strengths instead of wishing it were something it's not.<br />
<br />
Here are some concrete principles that can help:
<br />
<h3>
Don't read your slides</h3>
Advice often repeated, too seldom followed. This is sometimes phrased as "make eye contact with your audience," but I've seen that version interpreted to mean, "make eye contact while reading your slides, so your head pops up and down like a gopher poking out of its hole." So just don't read your slides, no matter what else you're doing.<br />
<br />
Some good presenters go to <a href="http://en.wikipedia.org/wiki/Takahashi_method">extremes</a> with this, with just one or two words per slide. This is fine as a stylistic embellishment, but not necessary for a good talk. You don't need to be <i>that</i> minimalistic. Just remember that with every transition, your audience will read the new slide before returning its attention to whatever you are saying. (Watch for this at the next talk you attend; you will absolutely catch yourself doing it.)<br />
<br />
Other presenters use "builds" to combat this. This can be useful in moderation, but it's more often used as a crutch, especially when presenting a list of related material. Personally, if I have an information-dense topic, like <a href="http://www.slideshare.net/jbellis/apache-cassandra-nosql-in-the-enterprise/8">this one</a> from my Strata talk, I'll put the whole list up at once but I'll leave the details off the slide and speak them instead.<br />
<br />
I'm also not a fan of "presenter notes" displayed on a secondary monitor. Too often this leads to the gopher effect or to underpracticing, or both.<br />
<br />
The one time you <i>do</i> want to explicitly direct attention to your slides is to explain part of one. For example, on <a href="http://www.slideshare.net/jbellis/pycon-2012-what-java-can-learn-from-python/6">this slide</a> I explained that the upper right was an example of DataStax's Opscenter product interfacing with Cassandra over JMX; the upper left was jvisualvm, and so forth. Since it was a large room, I really did say things like, "in the upper right, ..." In a smaller room I like to stand close enough to the screen to just point.
<br />
<h3>
Use visual aids</h3>
One of the best uses of builds is to explain a complicated diagram or sequence a piece at a time. This is difficult-to-impossible to do as effectively in prose alone. Sylvain's talk on <a href="http://mirror.aarnet.edu.au/pub/fosdem/2012/maintracks/janson/cassandra.webm">the Cassandra storage engine at FOSDEM 2012</a> is a good example. Starting at about 22:00, he explains how Cassandra uses log-structured merge trees to turn random writes into sequential I/O. Compare that with the treatment in the <a href="http://research.google.com/archive/bigtable-osdi06.pdf">Bigtable paper</a>, or the original <a href="http://staff.ustc.edu.cn/%7Ejpq/paper/flash/1996-The%20Log-Structured%20Merge-Tree%20%28LSM-Tree%29.pdf">LSM-tree paper</a>. Sylvain's explanation is much clearer by virtue of how it's presented.<br />
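The core idea behind that part of Sylvain's talk fits in a few lines of code: absorb random writes in a sorted in-memory buffer, then flush it sequentially to an immutable on-disk segment. This is a toy sketch of the general log-structured merge technique, not Cassandra's actual implementation:

```python
# Toy LSM write path: random writes land in a memtable; when it fills,
# it is flushed as one sequential write of sorted, immutable data.
# Reads check the memtable first, then segments newest-to-oldest.

class ToyLSM:
    def __init__(self, flush_threshold=4):
        self.memtable = {}             # in-memory writes; arrival order doesn't matter
        self.segments = []             # sorted, immutable "SSTable"-like segments
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.memtable[key] = value     # a random write touches memory only
        if len(self.memtable) >= self.flush_threshold:
            # Flush: sort once, append as an immutable segment (sequential I/O)
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:       # newest data shadows older segments
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None

db = ToyLSM()
for i in range(6):
    db.put(f"k{i}", i)
assert db.get("k2") == 2               # already flushed to a segment
assert db.get("k5") == 5               # still in the memtable
```

A real engine would add a commit log for durability and compaction to merge segments, but the random-writes-become-sequential-I/O point is already visible here.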
<br />
I avoid audio or video during my presentations since using it effectively is a skill I don't yet have, but I've seen it done well by others. I can't imagine <a href="http://pyvideo.org/video/669/militarizing-your-backyard-with-python-computer">my favorite PyCon talk</a> being as effective without the recorded demonstration at the end.<br />
<br />
Finally, pictures can also be more effective than the spoken word at communicating humor. I'm <a href="http://www.google.com/search?q=javascript+good+parts&um=1&ie=UTF-8&hl=en&tbm=isch&source=og&sa=N&tab=wi">not sure</a> who came up with this first, but the juxtaposition <a href="http://www.slideshare.net/jbellis/pycon-2012-what-java-can-learn-from-python/2">here</a> is worth well over 1000 words.
<br />
<h3>
Leave them wanting more</h3>
Your goal in most public speaking is to get people interested enough to learn more on their own, <i>not</i> to make them experts.<br />
<br />
One thing I struggled with early on was, how do you explain code without reading your slides? I realized that the answer was, if you're trying to explain code, you're getting too deep into the weeds. Sometimes I'll use a snippet of code to give the "flavor" of an API, but wall-of-text slides mean you're Doing It Wrong.<br />
<br />
Another common mistake is to start your talk with an outline. (Worse: outline "progress reports" during the talk that tell the audience how far along you are.)<br />
<br />
A much better way to get the audience engaged is to tell a story: How did you come across the problem you are solving? What makes it challenging? What promising approaches didn't actually work out, and why? This is a classic story arc that will get people interested much more than if you dive into the nuts and bolts of your solution.
<br />
<h3>
Practice</h3>
Paul Graham <a href="http://paulgraham.com/speak.html">gets this one wrong</a>: while ad-libbing is indeed the polar opposite of reading your slides, it's also sub-optimal. You need practice to get timings right, to try out different phrasings of your thoughts, and to make transitions smooth. Don't fall for the false dichotomy that either you ad-lib or you practice all the spontaneity out; there's a happy medium in between.
<br />
<h3>
Mechanics</h3>
Finally (last <i>and</i> least?), a brief note on mechanics. Stand where you can gesture freely and naturally; ideally (in a small room) next to the screen. Don't stand behind a podium. Don't speak sitting down. Pacing a little bit is good.<br />
<br />
All these things mean: you need a slide remote. Even if you are right next to your laptop, reaching down to hit the spacebar or arrow key is distracting. But if you are doing it right you are probably not right next to your laptop. The remote included with Macs is unfortunately not enough, since it relies on infrared line-of-sight. If the conference doesn't provide one, borrow one from another presenter. If you speak frequently, it's worth the approximately $40 cost to get your own so you don't have to wrestle with unfamiliar hardware when you go live.<br />
<br />
Good luck!<br />
<br />
<br />Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com3tag:blogger.com,1999:blog-11683713.post-15586965807928329782012-02-02T15:23:00.000-08:002013-07-25T11:53:40.858-07:00Thinkpad 420s review<a href="http://ecx.images-amazon.com/images/I/415xXcJ40UL._SL500_AA300_.jpg"><img alt="" border="0" src="http://ecx.images-amazon.com/images/I/415xXcJ40UL._SL500_AA300_.jpg" style="cursor: hand; cursor: pointer; float: right; height: 300px; margin: 0 0 10px 10px; width: 300px;" /></a>In the last three years my primary machines have run OS X, Linux, Windows, OS X, and now Windows again, in that order. The observant reader may note, "That's a lot of machines in three years." It is, but I also changed jobs twice in that time frame, so that's part of it. Another part is that I'm a bit rough with laptops; the two mac machines broke badly enough that AppleCare told me they weren't going to help. The Dell and Lenovo machines, though, outlasted my use of them.
For this most recent machine, I had several requirements and several nice-to-haves, some of which were in tension.
Requirements:
<br />
<ul>
<li>Able to drive a 30" external monitor</li>
<li>At least 8GB of RAM</li>
<li>At least 1440x900 native resolution
</li>
</ul>
Nice to have:<br />
<ul>
<li>Smaller than my 15" macbook pro, which is too large to use comfortably in coach on an airplane</li>
<li>Larger screen than my wife's 13.3" mbp</li>
<li>A "real" cpu, not the underclocked ones in the Macbook Airs</li>
<li>A graphics card that can do justice to Starcraft II</li>
</ul>
I wasn't picky about my operating system. Linux is by far the best experience for software development, but support for multiple monitors is still dicey, which is bad when you're relying on it to give presentations on unfamiliar projectors. OS X is superficially unix-y, but the lack of package management means in practice it's not really any better than Windows. Windows is ... Windows, although I'm pretty fond of the <a href="http://www.softwarepro.com/articles/win_snap_shake.htm">new-in-Windows-7 window management keyboard shortcuts</a>. But fundamentally I spend 99% of my time in <a href="http://www.jetbrains.org/">an IDE</a>, a web browser, <a href="https://riptano.hipchat.com/">hipchat</a>, and IRC, all of which are cross-platform. <a href="http://www.mingw.org/wiki/MSYS">MSYS</a> gives me about as much of the unix experience on Windows as I got on OS X. (And <a href="http://ninite.com/">ninite</a> gives me more of a package manager than I had on OS X--granted, that isn't saying much.)
<br />
I ended up buying a 14" ThinkPad T420s. I think the S stands for "slim," and it is. My 15" mbp looks and weighs like a ton of bricks next to it, even after I swapped out the ThinkPad's DVD drive for the supplementary battery module, which weighs a little more. The ThinkPad's legendary keyboard lives up to its reputation, and I'm a huge fan of the trackpoint living right there on the home row of the keyboard, for when keyboard shortcuts aren't easily available. The cooling is excellent without the fans ever getting loud.
<br />
For the most part, I'm extremely happy with the hardware. There are two exceptions:<br />
<ul>
<li>The built-in microphone is terrible. Almost without exception, people have trouble hearing me over Skype. Adding insult to injury, there is no mic input. I thought at first the headphone jack was a phone-style out-plus-in jack, but no. I'll have to get a USB mic.</li>
<li>Optimus doesn't work in one important respect: in Optimus mode it won't drive my 30" monitor at full resolution; it picks something weird like 2048x1560 instead. Lenovo said they were going to fix this but haven't yet. To drive this monitor correctly I have to lock it to discrete graphics in the BIOS. In discrete mode it gets about 2h 45m of battery life even with the CPU downclocked and the display dim. So when I travel, I reboot to integrated graphics.</li>
</ul>
The main alternative I considered was the Sony Vaio Z. Ironically, I ended up going with the Thinkpad mostly because Vaio reviewers consistently called out how terrible its built-in speakers were... so I ended up with a system with a terrible mic instead.<br />
On the software side, I'm more than happy with Windows, especially after the Steam holiday sale. I hadn't realized how many fantastic indie games are available these days. (Most recently, I highly recommend <a href="http://supergiantgames.com/?page_id=242">Bastion</a>.)<br />
The one fly in my soup is that I'd anticipated being able to run OS X in a VM for the sake of Keynote. None of Google Docs presentations, Open/LibreOffice Impress, or PowerPoint is an adequate replacement. Unfortunately, the Core Image (?) APIs used by Keynote don't work under virtualization, so for now I'm still using my old mbp to create presentations, and taking them on the road with me as PDFs.
Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com3tag:blogger.com,1999:blog-11683713.post-48423173835981002222011-11-19T09:24:00.001-08:002011-11-19T09:27:39.134-08:00On applying for jobsA friend <a href="http://yamanin.livejournal.com/239406.html">asks</a>,
<blockquote>If [I see a] job I could do, even though I don't meet the stated requirements, should I apply anyway? </blockquote>Short answer: yes.
<p>
Longer answer: companies are all over the map here, although in general the fewer layers of bureaucracy there are between the team the candidate will work with and the hiring process, the more likely the stated requirements are to be actual requirements.
</p><p>
How can you tell?
</p><p>
HR paper pushers like to think in terms of checklists because that lets them go through hundreds of resumes without any real understanding of the position, so they write ads like <a href="http://jobview.monster.com/Lead-Software-Engineer-Java-J2EE-Oracle-MQ-Job-MetroWest-MA-US-103927164.aspx">this one</a> -- lots of really specific "5+ years of X," not much about what the position actually involves.
</p><p>
But if it's the team lead himself writing the description, which you will see at smaller companies, then you get much <a href="http://www.linkedin.com/jobs?viewJob=&jobId=2066287">more about what the position involves</a> and fewer checklist items, because the lead is comfortable determining competence based on skill instead of pattern matching. For a software development position, I don't care if you have a degree in CS if you can code. (Open-source contributions are a better signal for ability and passion than a degree, anyway.) My team ranges from people with no degree to people with PhDs.
</p><p>
Even when dealing with large companies, you have to factor in that people are terrible at distinguishing "want" from "need." A lot of "requirements" are really "nice-to-haves." It can be tough to tell the difference, but the better idea you have of what the job actually involves, the better you can tell which are hard requirements.
</p><p>
For instance: without knowing anything else about a position, my guess is that "native French speaker" really would be a hard requirement. That's not the sort of thing people tend to put down on a whim. But even then, there are shades of grey. For instance, if I were looking for a job and found a "distributed databases developer position, must know Java, be familiar with open source and be a native French speaker" then I might see if they'd give me a pass on the last part because I'm a <span style="font-style: italic;">really</span> good fit for the rest -- and I know they're unlikely to find a lot of candidates with an <span style="font-style: italic;">exact</span> match.
</p><p>
In short, you have little to lose by trying, but don't just shotgun out resumes; include a cover letter that highlights the best matches from your experience to what they are looking for. Follow up with the hiring manager if possible to ask (a) "I sent in my resume a few days ago, and I wanted to see where you are in the hiring process for this position," and if they reply that they got it but you're not a good fit, ask (b) what specifically they were looking for, so you can flesh out your intuition that much more for next time.
</p><p>
Good luck!</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com1tag:blogger.com,1999:blog-11683713.post-61439272868702236082011-01-04T11:53:00.000-08:002011-10-13T15:30:01.428-07:00Apache Cassandra: 2010 in review<p>
In 2010, Apache Cassandra increased its momentum as the leading scalable database. Here is a summary of the notable activity in three areas: code, community and controversy. As always, comments are welcome.
</p><p>
</p><h3>Code</h3>
<p>
2010 started with the release of <a href="http://spyced.blogspot.com/2010/01/cassandra-05.html">Cassandra 0.5</a>, followed by <a href="http://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces3">0.6 and graduation from the ASF incubator</a> a few months later. Seven more stable releases of 0.6 followed, adding <a href="http://www.riptano.com/docs/0.6/appendix/appendix_a_whats_new">many features</a> to improve operations in response to feedback from production users.
</p><p>
0.7 adds highly anticipated features like <a href="http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes">column value indexes</a>, <a href="http://www.riptano.com/blog/whats-new-cassandra-07-live-schema-updates">live schema updates</a>, more efficient cluster expansion, and more control over replication, but didn't quite make it into 2010, with rc4 <a href="http://twitter.com/#%21/cassandra/status/21268489612296192">released on New Year's 2011</a>.
</p><p>
We also committed the <a href="https://issues.apache.org/jira/browse/CASSANDRA-1072">distributed counters</a> patchset, begun at Digg and enhanced by Twitter for their <a href="http://mashable.com/2010/09/23/twitter-real-time-analytics/">real-time analytics product</a>. Notable as the most-involved feature discussion to date, distributed counters started with a <a href="https://issues.apache.org/jira/browse/CASSANDRA-580">vector clock approach</a>, but switched to a <a href="https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf">new design</a> by <a href="http://twitter.com/#%21/kelvin">Kelvin Kakugawa</a> after we realized vector clocks were a <a href="http://pl.atyp.us/wordpress/?p=2601">dead end</a> for anything but the trivial case of monotonic-increments-by-one.
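To see why vector clocks dead-end here: a clock-based resolver can only pick one of two concurrent values, so one of two concurrent increments gets silently dropped. Counter designs that do work keep a separate count per replica and merge by taking the per-replica maximum. A G-counter-style sketch of that general idea (an illustration, not the partitioned-counter design in the linked document):

```python
# G-counter sketch: each replica increments only its own slot, and a
# merge takes the element-wise max. Concurrent increments are never
# lost, unlike pick-one-winner conflict resolution.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}              # replica_id -> that replica's local count

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas can exchange state in any order and converge.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter("a"), GCounter("b")
a.increment()                         # two concurrent increments
b.increment()                         # on two different replicas
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 2    # neither increment is lost
```

With a vector-clock approach, the same scenario ends with one replica's value chosen as the "winner" and the other increment discarded, which is exactly the dead end described above.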
</p><p>
One of the biggest trends was increasing activity <i>around</i> Cassandra as well as in the core database itself. 2010 saw <a href="http://wiki.apache.org/cassandra/HadoopSupport">Hadoop map/reduce integration</a>, as well as Pig support and a <a href="https://issues.apache.org/jira/browse/HIVE-1434">patch for Hive</a>.
</p><p>
We also saw <a href="http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-based-lucene-backend/">Lucandra</a>, which implements a Cassandra back end for Lucene and is used in several high volume production sites, grow up into <a href="https://github.com/tjake/Lucandra">Solandra</a>, embedding Solr and Cassandra in the same JVM for even more performance.
</p><p>
</p><h3>Community</h3>
<p>
Cassandra hit its stride in 2010, starting with <a href="http://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces3">graduation from the ASF incubator</a> in April. 2010 saw 1025 <a href="https://issues.apache.org/jira/browse/CASSANDRA">tickets</a> resolved, nearly twice as many as in 2009 (565).
</p><p>
Like many Apache projects, Cassandra has a relatively small set of <a href="http://wiki.apache.org/cassandra/Committers">committers</a>, but a much larger group of contributors. In 2010 Cassandra passed <a href="http://pastebin.com/DjXhVn2g">over 100 people</a> who have contributed at least one patch. Release manager <a href="http://blog.sym-link.com/">Eric Evans</a> put together a great way to visualize this with a <a href="http://www.youtube.com/watch?v=FWSyoXnWsTQ">Code Swarm video of Cassandra development</a>.
</p><p>
I <a href="http://spyced.blogspot.com/2010/04/and-now-for-something-completely.html">started Riptano</a> with Matt Pfeil in April to provide professional products and services around Cassandra. In October, we announced <a href="http://www.riptano.com/blog/cassandra-investment-lightspeed-sequoia">funding from Lightspeed and Sequoia</a>. From May to December, we conducted eleven public <a href="http://www.eventbrite.com/org/474011012">Cassandra training</a> events, and twice that many private classes on-site with customers.
</p><p>
Riptano is now up to 25 employees, with offices in the San Francisco bay area, Austin, and New York, and engineers working remotely in San Antonio, France, and Belarus.
</p><p>
In August, Riptano and Rackspace organized a very successful inaugural <a href="http://www.riptano.com/blog/cassandra-summit-recap">Cassandra Summit</a>, with about 200 attendees (<a href="http://www.riptano.com/blog/slides-and-videos-cassandra-summit-2010">videos available</a>), followed by <a href="http://us.apachecon.com/c/acna2010/schedule/grid">almost a full track at ApacheCon</a> in November. Cassandra was also represented at many other conferences on <a href="http://en.oreilly.com/rails2010/public/schedule/detail/14740">multiple</a> <a href="http://www.inf.unibz.it/krdb/school/2010/program.html">subjects</a>, <a href="http://my.javaonedevelop.com/events/a2z/JAVAONE">for</a> <a href="http://www.slideshare.net/aaronmorton/b-5857745">several</a> <a href="http://www.slideshare.net/supertom/using-cassandra-with-your-web-application">languages</a>, <a href="http://www.devoxx.com/display/Devoxx2K10/Introduction+to+Cassandra">and</a> <a href="http://www.gemini-bigdata.com/2010/11/brief-reviews-of-nosql-afternoon-in.html">continents</a>.
</p><p>
</p><h3>Controversy</h3>
<p>
Cassandra got a lot of negative publicity when Kevin Rose <a href="http://gilhildebrand.com/afterthought/2010/09/kevin-rose-spreads-fud-blames-cassandra-for-digg-v4-woes/">blamed</a> Cassandra for Digg v4's teething problems. However, there was no deluge of <a href="https://issues.apache.org/jira/browse/CASSANDRA">bug reports</a> coming out of Digg's Cassandra team, and Digg engineers Arin Sarkissian and Chris Goffinet (now working on Cassandra for Twitter) got on Quora <a href="http://www.quora.com/Is-Cassandra-to-blame-for-Digg-v4s-technical-failures">to refute the idea that Cassandra was at fault</a>:
</p><blockquote>
The whole "Cassandra to blame" thing is 100% a result of folks clinging on to the NoSQL vs SQL thing. It's a red herring.
<p>
The new version of Digg has a whole new architecture with a bunch of technologies involved. Problem is, over the last few months or so the only technological change we mentioned (blogged about etc) was Cassandra. That made it pretty easy for folks to cling on to it as the "problem".
</p></blockquote>
<p>
Meanwhile, Digg competitor Reddit <a href="http://twitter.com/#%21/ketralnis/status/658776965255168">has</a> <a href="http://twitter.com/#%21/ketralnis/status/10098518563758080">continued</a> <a href="http://twitter.com/#%21/jedberg/status/10401811135463424">migrating</a> to Cassandra, crediting it with <a href="http://www.reddit.com/r/blog/comments/evmek/2010_we_hardly_knew_ye/c1bbmrq">enabling their 3x traffic growth in 2010</a>.
</p><p>
More importantly, 2010 saw dozens of new Cassandra deployments, including a new contender for the largest-cluster crown when Digital Reasoning announced a <a href="http://www.businesswire.com/news/home/20101006005485/en/Digital-Reasoning-Riptano-Advance-Cassandra-Based-Analytic-Solutions">400-node cluster for the US government.</a>
</p><p>
We look forward to another great year in 2011!</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com3tag:blogger.com,1999:blog-11683713.post-49802796965960312432010-04-26T12:29:00.000-07:002010-05-23T18:51:15.223-07:00And now for something completely different<p>
A month ago I left Rackspace to start <a href="http://riptano.com/">Riptano</a>, a <a href="http://riptano.com/services.php">Cassandra support and services</a> company.
</p><p>
I was in the unusual position of being a technical person looking for a business-savvy co-founder. For whatever reason, the converse seems a lot <a href="http://answers.onstartups.com/questions/35/how-do-i-find-a-technical-co-founder">more</a> <a href="http://www.mattcollins.net/2008/04/how-to-find-a-technical-co-founder">common</a>. Maybe technical people tend to stereotype softer skills as being easy.
</p><p>
But despite some examples to the contrary (notably for me, <a href="http://www.jcoates.org/">Josh Coates</a> at <a href="http://spyced.blogspot.com/2005/09/python-at-mozycom.html">Mozy</a>), I found that <a href="http://www.paulgraham.com/startupmistakes.html">starting a company is too hard for just one person</a>. Unfortunately, everyone in my fairly slim portfolio of business guys I'd want to co-found with was unavailable. So progress was slow, until <a href="http://www.linkedin.com/pub/matt-pfeil/19/71/201">Matt Pfeil</a> heard that I was leaving Rackspace and drove to San Antonio from Austin to talk me out of it. Not only did he fail to talk me out of leaving, he ended up co-founding Riptano. And here we are, with a Riptano mini-faq.
</p><p>
<b>Isn't Cassandra mostly just a web 2.0 thing for ex-mysql shops?</b>
</p><p>
Although most of the <a href="http://spyced.blogspot.com/2010/03/cassandra-in-action.html">early adopters</a> fit this stereotype, we're seeing interest from a lot of Oracle users and a lot of industries. Unlike many "NoSQL" databases, Cassandra <a href="http://spyced.blogspot.com/2010/04/cassandra-fact-vs-fiction.html">doesn't drop durability</a> (the D in <a href="http://en.wikipedia.org/wiki/ACID">ACID</a>), and besides scalability, enterprises are very interested in our support for multiple data centers and <a href="http://wiki.apache.org/cassandra/HadoopSupport">Hadoop analytics</a>.
</p><p>
<b>Are you going to fork Cassandra?</b>
</p><p>
No. Although the ASF license allows doing basically anything with the code, including creating proprietary forks, we think the track record of this strategy in the open source database world is <a href="http://jcole.us/blog/archives/2007/08/09/mysql-community-split-officially-a-failure/">mixed</a> at best.
</p><p>
We might create a (still open-source) Cassandra distribution similar to <a href="http://www.cloudera.com/hadoop/">Cloudera's Distribution for Hadoop</a>, but the mainline Cassandra development is responsive enough that there isn't as much need for a third party to do this as there is with Hadoop.
</p><p>
<b>What does Rackspace think?</b>
</p><p>
<a href="http://rackspace.com/">Rackspace</a> has been the primary driver of Cassandra development recently, employing (until I left) the three most active <a href="http://wiki.apache.org/cassandra/Committers">committers on the project</a>. For the same reasons <a href="http://www.rackspacecloud.com/blog/2009/09/23/the-cassandra-project/">Rackspace supported Cassandra</a> to begin with, Rackspace is excited to see Riptano help take the Cassandra ecosystem to the next level. <a href="http://www.informationweek.com/news/hardware/virtual/showArticle.jhtml?articleID=224600336">Rackspace has invested in Riptano</a> and has been completely supportive in every way.
</p><p>
<b>Where did you get the name "Riptano?" Does it mean anything?</b>
</p><p>
We took a sophisticated, augmented AI approach. By which I mean, we took a <a href="http://www.multicians.org/thvv/gpw.html">program that generated random, pronounceable strings</a>, and put together a couple of fragments that sounded good together. (This is basically the same approach we took at Mozy, only there Josh insisted on a four-letter domain name, which narrowed it down a <i>lot</i>.)
</p><p>
I hope it doesn't mean "your dog has bad breath" somewhere.
</p><p>
And yes, <a href="http://twitter.com/riptano">Riptano is on twitter</a>.
</p><p>
<b>Are you hiring?</b>
</p><p>
Yes. We'll have a jobs page on the site soon. In the meantime you can email me a resume if you can't wait. Prior participation in the Apache Cassandra project is of course a huge plus.</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com2tag:blogger.com,1999:blog-11683713.post-74442810616920942612010-04-07T09:37:00.000-07:002010-04-20T06:17:45.056-07:00Cassandra: Fact vs fiction<p>
<a href="http://cassandra.apache.org/">Cassandra</a> has seen some <a href="http://spyced.blogspot.com/2010/03/cassandra-in-action.html">impressive adoption success</a> over the past months, leading some to conclude that <a href="http://blog.tonybain.com/tony_bain/2009/12/is-cassandra-winning-the-nosql-race.html">Cassandra is the frontrunner</a> in the highly scalable databases space (a subset of the hot <a href="http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosystem/">NoSQL category</a>). Among all the attention, some misunderstandings have been propagated, which I'd like to clear up.
</p><p>
<span style="font-weight: bold;">Fiction</span>: "Cassandra relies on high-speed fiber between datacenters" and can't reliably replicate between datacenters with more than a few ms of latency between them.
</p><p>
<span style="font-weight: bold;">Fact</span>: Cassandra's multi-datacenter replication is one of its earliest features and is by far the most battle-tested in the NoSQL space. Facebook had Cassandra deployed on east and west coast datacenters since before open sourcing it. SimpleGeo's Cassandra cluster <a href="http://permalink.gmane.org/gmane.comp.db.cassandra.user/3462">spans 3 EC2 availability zones</a>, and Digg is also deployed on both coasts. Claims that this can't possibly work are an excellent sign that you're reading an article by someone who doesn't know what he's talking about.
</p><p>
<span style="font-weight: bold;">Fiction</span>: "It’s impossible to tell when [Cassandra] replicas will be up-to-date."
</p><p>
<span style="font-weight: bold;">Fact</span>: Cassandra provides consistency when R + W > N (read replica count + write replica count > replication factor), to use the <a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">Dynamo vocabulary</a>. If you do writes and reads both with QUORUM, for one example, you can expect data consistency as soon as there are enough reachable nodes for a quorum. Cassandra also provides <a href="http://wiki.apache.org/cassandra/ReadRepair">read repair</a> and <a href="http://wiki.apache.org/cassandra/AntiEntropy">anti-entropy</a>, so that even reads at <a href="http://wiki.apache.org/cassandra/API">ConsistencyLevel.ONE</a> will be consistent after either of these events.
</p><p>
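To see why R + W &gt; N gives consistency: any write set of W replicas must then share at least one replica with any read set of R replicas, so a read always touches a replica holding the latest acknowledged write. A brute-force check in Python (not Cassandra code) confirms the rule:

```python
from itertools import combinations

def quorum_overlap(r, w, n):
    """True if every possible write set of w replicas intersects every
    possible read set of r replicas, out of n total replicas."""
    replicas = range(n)
    return all(set(ws) & set(rs)
               for ws in combinations(replicas, w)
               for rs in combinations(replicas, r))

# The brute-force check agrees with the R + W > N arithmetic:
for n in range(1, 6):
    for r in range(1, n + 1):
        for w in range(1, n + 1):
            assert quorum_overlap(r, w, n) == (r + w > n)

# QUORUM on both reads and writes with replication factor 3:
assert quorum_overlap(2, 2, 3)       # 2 + 2 > 3: reads see the latest write
assert not quorum_overlap(1, 1, 3)   # ONE + ONE: eventually consistent only
```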
<span style="font-weight: bold;">Fiction</span>: Cassandra has a small community
</p><p>
<span style="font-weight: bold;">Fact</span>: Although popularity has never been a good metric for determining correctness, it's true that when using bleeding edge technology, it's good to have company. As I write this late at night (in the USA), there are 175 people in the Cassandra irc channel, 60 in the HBase one, 32 in Riak's, and 15 in Voldemort's. (Six months ago, the numbers were 90, 45, and 12 for Cassandra, HBase, and Voldemort; I did not yet hang out in #riak.) Mailing list participation tells a similar story.
<p>
It's also interesting that the creators of <a href="http://code.google.com/p/thrudb/">Thrudb</a> and <a href="http://github.com/cliffmoon/dynomite">dynomite</a> are both using Cassandra now, indicating that the predicted NoSQL consolidation is beginning.
</p><p>
<span style="font-weight: bold;">Fiction</span>: "Cassandra only supports one [keyspace] per install."
</p><p>
<span style="font-weight: bold;">Fact</span>: This has not been true for almost a year (<a href="https://issues.apache.org/jira/browse/CASSANDRA-79">June of 2009</a>).
</p><p>
<span style="font-weight: bold;">Fiction</span>: Cassandra cannot support Hadoop, or supporting tools such as Pig.
</p><p>
<span style="font-weight: bold;">Fact</span>: It has always been straightforward to send the output of Hadoop jobs to Cassandra, and Facebook, Digg, and others have been using Hadoop like this as a Cassandra bulk-loader for over a year. For 0.6, I contributed a Hadoop InputFormat and related code to let Hadoop jobs <a href="http://wiki.apache.org/cassandra/HadoopSupport">process data <span style="font-style: italic;">from</span> Cassandra</a> as well, while cooperating with Hadoop to keep processing on the nodes that actually hold the data. Stu Hood then contributed a Pig LoadFunc, also in 0.6.
</p><p>
<span style="font-weight: bold;">Fiction</span>: Cassandra achieves its high performance by sacrificing reliability (alternately phrased: Cassandra is only good for data you can afford to lose)
</p><p>
<span style="font-weight: bold;">Fact</span>: unlike some NoSQL databases (notably <a href="http://blog.mongodb.org/post/381927266/what-about-durability">MongoDB</a> and <a href="http://www.slideshare.net/cloudera/hbase-user-group-9-hbase-and-hdfs">HBase</a>), Cassandra offers <a href="http://wiki.apache.org/cassandra/Durability">full single-server durability</a>. Relying on replication is not sufficient for can't-afford-to-lose-data scenarios; if your data center loses power, you are highly likely to lose data if you are not syncing to disk no matter how many replicas you have, and if you run large systems in production long enough, you will realize that power outages through some combination of equipment failure and human error are not occurrences you can ignore. But with its <a href="http://linux.die.net/man/2/fsync">fsync</a>'d <a href="http://wiki.apache.org/cassandra/ArchitectureCommitLog">commitlog</a> design, Cassandra can protect you against that scenario too.
</p><p>
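The fsync-before-acknowledge idea behind the commitlog can be sketched in a few lines of Python (the class, file format, and names here are illustrative only, not Cassandra's actual implementation):

```python
import os
import tempfile

class CommitLog:
    """Append-only log; a mutation is acknowledged only after fsync,
    so an acknowledged write survives power loss on this node."""
    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def append(self, mutation: bytes) -> None:
        # Length-prefixed record, forced to disk before we ack the client.
        record = len(mutation).to_bytes(4, "big") + mutation
        os.write(self.fd, record)
        os.fsync(self.fd)

    def close(self):
        os.close(self.fd)

path = os.path.join(tempfile.mkdtemp(), "commitlog.bin")
log = CommitLog(path)
log.append(b"row-key:column=value")   # returns only once the data is synced
log.close()
```

The point is the `os.fsync` call: relying on the OS page cache alone means a power outage can discard writes the database already acknowledged.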
What to do after your data is saved, e.g. backups and snapshots, is outside of my scope here but covered in the <a href="http://wiki.apache.org/cassandra/Operations">operations wiki page</a>.</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com16tag:blogger.com,1999:blog-11683713.post-21351072286060485212010-03-30T07:48:00.001-07:002011-11-01T09:19:19.137-07:00Cassandra in Google Summer of Code 2010<p>
Cassandra is participating in the Google Summer of Code, which <a href="http://google-opensource.blogspot.com/2010/03/students-apply-now-for-google-summer-of.html">opened for proposal submission today</a>. Cassandra is part of the Apache Software Foundation, which has <a href="http://community.apache.org/gsoc.html">its own page of guidelines up</a> for students and mentors.
</p><p>
We have a good mix of project ideas involving both core and non-core areas, from straightforward code bashing to some pretty tricky stuff, depending on your appetite. Core tickets aren't necessarily harder than non-core, but they will require reading and understanding more existing code.
</p><p>
<span style="font-weight: bold;font-size:130%;" >Non-core</span>
</p><ul><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-918">Create a web ui for cassandra</a>: we have a (fairly minimal) command line interface, but a web gui is more user-friendly. There are the beginnings of such a beast in the Cassandra source tree at contrib/cassandra_browser [pretty ugly Python code] and a gtk-based one at http://github.com/driftx/chiton [also Python, less ugly].</li><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-912">First-class commandline interface</a>: if you prefer to kick things old-school, improving the cli itself would also be welcome.</li><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-873">Create a Cassandra demo application</a>: we have <a href="http://twissandra.com/">Twissandra</a>, but we can always use more examples to introduce people to "thinking in Cassandra," which is the hardest part of using it. This one seems to be the most popular with students so far. (So stand out from the crowd, and submit something else too. :)
</li></ul>
<p>
<span style="font-weight: bold;font-size:130%;" >Almost-core
</span></p><ul><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-875">Performance regression tests</a>: pretty self-explanatory?</li><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-874">System tests against multiple nodes</a>: If GSOC were a wish-granting fairy I would probably choose this with my first wish. There's a couple different ways you can approach this; scripting VMs is one, or you could explore the <a href="https://issues.apache.org/jira/browse/CASSANDRA-561">Cassandra simulator</a> that was contributed a while ago (some TLC required).</li><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-913">Hive support</a>: <a href="http://hadoop.apache.org/hive/">Hive</a> is a project that runs SQL queries against Hadoop map/reduce clusters. (For analytics; it is too high-latency to run applications against Hive directly). <a href="https://issues.apache.org/jira/browse/HIVE-705" title="Let Hive can analyse hbase's tables"><strike>HIVE-705</strike></a> added support for backends other than HDFS, with HBase as the first. Cassandra support should be doable too now. The Hive storage backends are described in <a href="http://wiki.apache.org/hadoop/Hive/StorageHandlers">http://wiki.apache.org/hadoop/Hive/StorageHandlers</a> and the HBase backend specifically in <a href="http://wiki.apache.org/hadoop/Hive/HBaseIntegration">http://wiki.apache.org/hadoop/Hive/HBaseIntegration</a>.
</li></ul>
<p>
<span style="font-weight: bold;font-size:130%;" >Core</span>
</p><ul><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-926">Avro RPC support</a>: currently Cassandra's client layer is the Thrift RPC framework, which sucks for reasons outside our scope here. We're moving to Avro, the new hotness from Doug Cutting (creator of Lucene and Hadoop, you may have heard of those). Basically this means porting org.apache.cassandra.thrift.CassandraServer to org.apache.cassandra.avro.CassandraServer; some examples are already done by Eric Evans.</li><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-876">Session-level consistency</a>: In <a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">one</a> and <a href="http://www.allthingsdistributed.com/2008/12/eventually_consistent.html">two</a> Amazon discusses the concept of "eventual consistency." Cassandra uses eventual consistency in a design similar to Dynamo. Supporting session consistency would be useful and relatively easy to add: we already have the concept of a <a href="http://wiki.apache.org/cassandra/MemtableSSTable">Memtable</a> to "stage" updates in before flushing to disk; if we applied mutations to a session-level memtable on the coordinator machine (that is, the machine the client is connected to), and then did a final merge from that table against query results before handing them to the client, we'd get it almost for free.</li><li><a href="https://issues.apache.org/jira/browse/CASSANDRA-622">Optimize commitlog performance</a>: this is about as low-level as you'll find in Cassandra's code base. fsync, CAS, it's all here. <a href="http://wiki.apache.org/cassandra/ArchitectureCommitLog">http://wiki.apache.org/cassandra/ArchitectureCommitLog</a> describes the current CommitLog design. </li></ul>
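The session-level consistency idea in the list above (stage a session's own writes in a coordinator-side memtable, then merge that memtable into query results before returning them) can be sketched in Python. All names here are hypothetical, not Cassandra code:

```python
class Session:
    """Read-your-writes sketch: stage this session's mutations locally
    and overlay them on whatever the replicas return."""
    def __init__(self, replicas):
        self.replicas = replicas     # a dict standing in for the cluster
        self.memtable = {}           # session-level staging area

    def write(self, key, value):
        self.memtable[key] = value   # replication to the cluster may lag

    def read(self, key):
        # Final merge: the session's own writes win over replica results.
        if key in self.memtable:
            return self.memtable[key]
        return self.replicas.get(key)

replicas = {"k1": "old"}
session = Session(replicas)
assert session.read("k1") == "old"   # nothing staged yet
session.write("k1", "new")           # replicas not yet updated...
assert session.read("k1") == "new"   # ...but this session sees its write
```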
<p>
You can comment directly on the JIRA tickets after creating an account (it's open to the public) if you're interested or have other questions. And of course feel free to propose other ideas!</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com3tag:blogger.com,1999:blog-11683713.post-64342895440162636402010-03-24T08:41:00.001-07:002011-03-14T06:25:21.361-07:00Cassandra in action<p>
There's been a lot of new articles about <a href="http://cassandra.apache.org/">Cassandra </a>deployments in the past month, enough that I thought it would be useful to summarize in a post.
</p>
<p>
Ryan King explained in <a href="http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king">an interview with Alex Popescu</a> why Twitter is moving to Cassandra for tweet storage, and why they selected Cassandra over the alternatives. My experience is that the more someone understands large systems and the problems you can run into with them from an operational standpoint, the more likely they are to choose Cassandra when doing this kind of evaluation. Ryan's list of criteria is worth checking out.
</p>
<p>
Digg followed up their <a href="http://about.digg.com/blog/looking-future-cassandra">earlier announcement</a> that they had taken part of their site live on Cassandra with <a href="http://about.digg.com/node/564">another</a> saying that they've now "reimplemented most of Digg's functionality using Cassandra as our primary datastore." Digg engineer Ian Eure also gave <a href="http://news.ycombinator.com/item?id=1184603">some more details on Digg's cassandra data model</a> in a Hacker News thread.</p>
<p>Om Malik <a href="http://gigaom.com/2010/03/11/digg-cassandara/">quoted extensively</a> from the Digg announcement and from Rackspace engineer Stu Hood, who explained Cassandra's appeal: "Over the Bigtable clones, Cassandra has huge high-availability advantages, and no single point of failure. When compared to the Dynamo adherents, Cassandra has the advantage of a more advanced datamodel, allowing for a single row to contain billions of column/value pairs: enough to fill a machine. You also get efficient range queries for the top level key, and even within your values."</p>
<p>
The Twitter and Digg news kicked off <a href="http://blogsearch.google.com/blogsearch?q=twitter+cassandra">a lot of publicity</a>, including a lot of "me too" articles but some interesting ones, including a highscalability post wondering if this was <a href="http://highscalability.com/blog/2010/2/26/mysql-and-memcached-end-of-an-era.html">the end of the mysql + memcached era</a>. If not quite yet the end, then the beginning of it. As Ian Eure from Digg <a href="http://www.rackspacecloud.com/blog/2010/02/25/should-you-switch-to-nosql-too/">said</a>, "If you're deploying memcache on top of your database, you're inventing your own ad-hoc, difficult to maintain NoSQL system." Possibly the best commentary on this idea is <a href="http://www.25hoursaday.com/weblog/2010/03/10/BuildingScalableDatabasesAreRelationalDatabasesCompatibleWithLargeScaleWebsites.aspx">Dare Obasanjo's</a>, who explained "Digg's usage of Cassandra actually serves as a rebuttal to [an article claiming SQL scales just fine] since they couldn't feasibly get what they want with either horizontal or vertical scaling of their relational database-based solution."
</p>
<p>
<a href="http://blog.reddit.com/2010/03/she-who-entangles-men.html">Reddit also migrated to Cassandra</a> from memcachedb, in only 10 days, the fastest migration to Cassandra I've seen. More comments from the engineer doing the migration, ketralnis, in the <a href="http://www.reddit.com/r/programming/comments/bcqhi/reddits_now_running_on_cassandra/">reddit discussion thread</a>.
</p>
<p>
CloudKick <a href="https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/">blogged about how they use Cassandra for time series data</a>, including a sketch of their data model. CloudKick migrated from PostgreSQL, skewering the theory you will sometimes see proffered that "only MySQL users are migrating to NoSQL, not people who use [my favorite vendor's relational database]."
</p>
<p>
Jake Luciani wrote about <a href="http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-based-lucene-backend/">how Lucandra, the Cassandra Lucene back-end works</a>, and how he's using it to power <a href="http://sparse.ly">the Twitter search app sparse.ly</a>. IMO, <a href="http://github.com/tjake/Lucandra">Lucandra</a> is one of Cassandra's killer apps.
</p>
<p>
The FightMyMonster team <a href="http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-why-we-moved/">switched from HBase to Cassandra</a> after concluding that "HBase is more suitable for data warehousing, and large scale data processing and analysis... and Cassandra is more suitable for real time transaction processing and the serving of interactive data." Dominic covers CAP, architecture considerations, benchmarks, map/reduce, and durability in explaining his conclusion.
</p>
<p>
<a href="http://www.startupmonkeys.com/2010/03/cassandra-frugal-mechanic/">Eric Peters gave a talk on Cassandra</a> use at his company, Frugal Mechanic, at the Seattle Tech Startups Meetup. This was interesting not because Frugal Mechanic is a big name but because it's not. I haven't seen Eric's name on the Cassandra mailing lists at all, but there he was deploying it and giving a talk on it, showing that Cassandra is starting to move beyond early adopters. (And, just maybe, that our documentation is improving. :)</p>
<p>
Finally, <a href="http://www.eflorenzano.com/">Eric Florenzano</a> has a live demo up now of Cassandra running a Twitter clone at <a href="http://twissandra.com/">twissandra.com</a>, with <a href="http://github.com/ericflo/twissandra">source</a> at github, as an example of how to use Cassandra's data model. If you're interested in the nuts and bolts of how to build an app on Cassandra, you should check it out.</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com10tag:blogger.com,1999:blog-11683713.post-38443091628946708112010-03-15T15:19:00.000-07:002010-03-17T08:43:49.079-07:00Why your data may not belong in the cloud<p>
<a href="http://pl.atyp.us/wordpress/?p=2742">Several</a> of the <a href="http://groups.csail.mit.edu/haystack/blog/2010/03/12/notes-from-nosql-live-boston/">reports</a> of the recently-concluded NoSQL Live event mentioned that I took a contrarian position on the "NoSQL in the Cloud" panel, arguing that traditional, bare metal servers usually make more sense. Here's why.
<p>
There are two reasons to use cloud infrastructure (and by cloud I mean here "commodity VMs" such as those provided by Rackspace Cloud Servers or Amazon EC2):
<ol><li>You only need a fraction of the capacity of a single machine</li><li>Your demand is highly elastic; you want to be able to quickly spin up many new instances, then drop them when you are done</li></ol>Most people looking at NoSQL solutions are doing it because their data is larger than a traditional solution can handle, or will be, so (1) is not a very strong motivation. But what about (2)? At first glance, cloud is a great fit for adding capacity to a database cluster painlessly. But there's an important difference between load like web traffic that bounces up and down frequently, and databases: with few exceptions, databases only get larger with time. You won't have 20 TB of data this week, and 2 next.
<p>
When capacity only grows in one direction it makes less sense to pay a premium for the flexibility of being able to reduce your capacity nearly instantly, especially when you also get reduced I/O performance (the most common bottleneck for databases) in the bargain because of the virtualization layer. That's why, despite working for a <a href="http://rackspacecloud.com/">cloud provider</a>, I don't think it's always a good fit for databases. (It doesn't hurt that Rackspace also offers classic bare metal hosting in the same data centers, so you can have the best of both worlds.)Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com8tag:blogger.com,1999:blog-11683713.post-19289828148437567372010-02-08T10:06:00.000-08:002010-02-25T20:39:33.258-08:00Distributed deletes in the Cassandra database<p>
Handling deletes in a distributed, <a href="http://www.allthingsdistributed.com/2008/12/eventually_consistent.html">eventually consistent</a> system is a little tricky, as demonstrated by the fairly frequent recurrence of the question, "<a href="http://wiki.apache.org/cassandra/FAQ#i_deleted_what_gives">Why doesn't disk usage immediately decrease when I remove data in Cassandra</a>?"
<p>
As background, recall that a <a href="http://incubator.apache.org/cassandra/">Cassandra </a>cluster defines a ReplicationFactor that determines how many nodes each key and associated columns are written to. In Cassandra (as in <a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">Dynamo</a>), the client controls how many replicas to block for on writes, which includes deletions. In particular, the client may (and typically will) specify a ConsistencyLevel of less than the cluster's ReplicationFactor, that is, the coordinating server node should report the write successful even if some replicas are down or otherwise not responsive to the write.
<p>
(Thus, the "eventual" in eventual consistency: if a client reads from a replica that did not get the update with a low enough ConsistencyLevel, it will potentially see old data. Cassandra uses <a href="http://wiki.apache.org/cassandra/HintedHandoff">Hinted Handoff</a>, <a href="http://wiki.apache.org/cassandra/ReadRepair">Read Repair</a>, and <a href="http://wiki.apache.org/cassandra/AntiEntropy">Anti Entropy</a> to reduce the inconsistency window, as well as offering higher consistency levels such as ConsistencyLevel.QUORUM, but it's still something we have to be aware of.)
<p>
Thus, a delete operation can't just wipe out all traces of the data being removed immediately: if we did, and a replica did not receive the delete operation, when it becomes available again it will treat the replicas that <span style="font-style: italic;">did</span> receive the delete as having missed a write update, and repair them! So, instead of wiping out data on delete, Cassandra replaces it with a special value called a tombstone. The tombstone can then be propagated to replicas that missed the initial remove request.
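The tombstone mechanics can be sketched with a toy last-write-wins replica in Python (illustrative only, not Cassandra's actual classes):

```python
TOMBSTONE = object()   # special marker written in place of a deleted value

class Replica:
    """Toy last-write-wins replica: newer timestamps overwrite older ones."""
    def __init__(self):
        self.store = {}              # key -> (timestamp, value or TOMBSTONE)

    def write(self, key, ts, value):
        # Only apply mutations newer than what this replica already has.
        if key not in self.store or ts > self.store[key][0]:
            self.store[key] = (ts, value)

    def delete(self, key, ts):
        # A delete writes a tombstone rather than erasing the entry, so a
        # replica that missed it can't "repair" the old data back later.
        self.write(key, ts, TOMBSTONE)

a, b = Replica(), Replica()
a.write("row", 1, "v1")
b.write("row", 1, "v1")
a.delete("row", 2)                   # b is down and misses the delete
# When b recovers, repair streams a's newest entry -- the tombstone --
# instead of b resurrecting the deleted row onto a:
ts, value = a.store["row"]
b.write("row", ts, value)
assert b.store["row"][1] is TOMBSTONE
```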
<p>
There's one more piece to the problem: how do we know when it's safe to remove tombstones? In a fully distributed system, we can't. We could add a coordinator like <a href="http://hadoop.apache.org/zookeeper/">ZooKeeper</a>, but that would pollute the simplicity of the design, as well as complicating ops -- then you'd essentially have two systems to monitor, instead of one. (This is not to say ZK is bad software -- I believe it is best in class at what it does -- only that it solves a problem that we do not wish to add to our system.)
<p>
So, Cassandra does what distributed systems designers frequently do when confronted with a problem we don't know how to solve: define some additional constraints that turn it into one that we do. Here, we defined a constant, <em>GCGraceSeconds</em>, and had each node track tombstone age locally. Once it has aged past the constant, it can be GC'd. This means that if you have a node down for longer than <em>GCGraceSeconds</em>, you should treat it as a failed node and replace it as described in <a href="http://wiki.apache.org/cassandra/Operations">Cassandra Operations</a>. The default setting is very conservative, at 10 days; you can reduce that once you have Anti Entropy configured to your satisfaction. And of course if you are only running a single Cassandra node, you can reduce it to zero, and tombstones will be GC'd at the first <a href="http://wiki.apache.org/cassandra/MemtableSSTable">compaction</a>.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com2tag:blogger.com,1999:blog-11683713.post-5408134620441385432010-01-25T14:23:00.000-08:002010-10-13T06:38:26.199-07:00Cassandra 0.5.0 released<p>
<a href="http://cassandra.apache.org/">Apache Cassandra</a> 0.5.0 was released over the weekend, four months after 0.4. (<a href="https://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.5.0/NEWS.txt">Upgrade notes</a>; <a href="https://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.5.0/CHANGES.txt">full changelog</a>.) We're excited about releasing 0.5 because it makes life even better for people using Cassandra as their primary data source -- as opposed to a replica, possibly denormalized, of data that exists somewhere else.
</p><p>
The Cassandra distributed database has always had a commitlog to provide durable writes, and in 0.4 we added an option to wait for commitlog sync before acknowledging writes, for cases where even a few seconds of potential data loss was not an option. But what if a node goes down temporarily? 0.5 adds proactive repair (what Dynamo calls "anti-entropy") to synchronize any updates <a href="http://wiki.apache.org/cassandra/HintedHandoff">Hinted Handoff</a> or read repair didn't catch across all replicas for a given piece of data.
</p><p>
0.5 also adds load balancing and significantly improves bootstrap (adding nodes to a running cluster). We've also been busy adding documentation on <a href="http://wiki.apache.org/cassandra/Operations">operations in production</a> and <a href="http://wiki.apache.org/cassandra/ArchitectureInternals">system internals</a>.
</p><p>
Finally, in 0.5 we've improved concurrency across the board, boosting insert speed by over 50% on the stress.py benchmark (from contrib/) on a relatively modest 4-core system with 2GB of RAM. We've also added a [row] key cache, enabling similar relative improvements in reads:
</p><p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOUPwO26PMd1LChphTiKlOpx9-z8PfWVvBwNUpMhIgRglk3bfiV1j4mtkpAezpHZDNOnd6nZ2wL_p2wmNFMcPuVVGCp-dV-gut6b2VAc_6uXnNjCxRrR9FbfvptaIHL2fFqEL5Lw/s1600-h/cassandra+04+vs+05+single+machine.png"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 160px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOUPwO26PMd1LChphTiKlOpx9-z8PfWVvBwNUpMhIgRglk3bfiV1j4mtkpAezpHZDNOnd6nZ2wL_p2wmNFMcPuVVGCp-dV-gut6b2VAc_6uXnNjCxRrR9FbfvptaIHL2fFqEL5Lw/s400/cassandra+04+vs+05+single+machine.png" alt="" id="BLOGGER_PHOTO_ID_5430823145830768562" border="0" /></a>(You will note that unlike most systems, Cassandra <a href="http://wiki.apache.org/cassandra/FAQ#reads_slower_writes">reads are usually slower than writes</a>. 0.6 will narrow this gap with full row caching and mmap'd I/O, but fundamentally we think optimizing for writes is the right thing to do since writes have always been harder to scale.)
</p><p>
Log replay, flush, compaction, and range queries are also faster.
</p><p>
0.5 also brings new tools, including JSON-based data export and import, an improved command-line interface, and new JMX metrics.
</p><p>
One final note: like all distributed systems, Cassandra is designed to maximize throughput when under load from many clients. Benchmarking with a single thread or a small handful will not give you numbers representative of production (unless you only ever have four or five users at a time in production, I suppose). Please don't ask "why is Cassandra so slow" and offer up a single-threaded benchmark as evidence; that makes me sad inside. Here's 1000 words:
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE2o8rsTYSUWwaD9OcBMi4Smv6YVsJ3_zceIPT0sq2yA8jTz057l3W-GLZv06kSHonoyCloyPQvY3M4GTCp2zrJ5NQSCbW4HHmE7sPbtcvPc1tkEUluDK9kL6fPUmVAoXpRchzTA/s1600-h/cassandra-inserts-vs-threads.png"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 160px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE2o8rsTYSUWwaD9OcBMi4Smv6YVsJ3_zceIPT0sq2yA8jTz057l3W-GLZv06kSHonoyCloyPQvY3M4GTCp2zrJ5NQSCbW4HHmE7sPbtcvPc1tkEUluDK9kL6fPUmVAoXpRchzTA/s400/cassandra-inserts-vs-threads.png" alt="" id="BLOGGER_PHOTO_ID_5430823381437231426" border="0" /></a>
</p><p>
(Thanks to <a href="http://twitter.com/faltering">Brandon Williams</a> for the graphs.)</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com13tag:blogger.com,1999:blog-11683713.post-39520484965016137992010-01-21T08:22:00.000-08:002010-05-17T05:17:31.953-07:00Linux performance basics<p>
I want to write about <a href="http://incubator.apache.org/cassandra/">Cassandra</a> performance tuning, but first I need to cover some basics: how to use vmstat, iostat, and top to understand what part of your system is the bottleneck -- not just for Cassandra but for any system.
</p><p>
</p><div style="font-size: 130%;"><b>vmstat</b></div>
You will typically run vmstat with "vmstat sampling-period", e.g., "vmstat 5". The output looks like this:
<p></p><pre class="code">
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
20  0 195540  32772   6952 576752    0    0    11    12   38    43  1  0 99  0
22  2 195536  35988   6680 575132    6    0  2952    14  959 16375 72 21  4  3
</pre>
The first line is your total system average since boot; typically this will not be very useful, since you are interested in what is causing problems NOW. Then you will get one line per sample period; most of the output is self-explanatory. The reason to start with vmstat is the "swap" section: si and so are swap in (memory read from disk) and swap out (memory written to disk). Remember that a little swapping is normal, particularly during application startup: by default, Linux will swap infrequently used pages of application memory to disk to free up more room for disk caching, <a href="http://lwn.net/Articles/83588/">even if there is enough ram</a> to accommodate all applications.
<p>
</p><div style="font-size: 130%;"><b>iostat</b></div>
To get more detail on I/O, use iostat -x. Again, you want to give it a sampling interval, and ignore the first set of output. iostat also gives you some cpu information but top does that better; let's focus on the Device section:
<p></p><pre class="code">
Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        9.80    0.20  36.60   0.40  5326.40     4.80   144.09     0.06   1.62   1.41   5.20
</pre>
There are 3 easy ways to tell if a disk is a probable bottleneck here, and none of them show up without the -x flag, so get in the habit of using that. "avgqu-sz" is the size of the io request queue; if it is large, there are lots of requests waiting in line. "await" is how long (in ms) the average request took to be satisfied (including time enqueued); recall that on non-SSDs, a single seek is between 5 and 10ms. Finally, "%util" is Linux's guess at how fully saturated the device is.
<p>
</p><div style="font-size: 130%;"><b>top</b></div>
To learn more about per-process CPU and memory usage, use "top." I won't paste top output here because everyone is so familiar with it, but I will mention a few useful things to know:
<p></p><ul><li>"P" and "M" toggle between sorting by cpu usage and sorting by memory usage</li><li>"1" toggles breaking down the CPU summary by CPU core</li><li>SHR (shared memory) is included in RES (resident memory)</li><li>Amount of memory belonging to a process that has been swapped out is VIRT - RES</li><li>a state (S column) of D means the process (or thread, see below) is waiting for disk or network i/o
</li><li>"steal" is how much CPU the hypervisor is giving to another VM in a virtual environment; as virtual provisioning becomes more common, <a href="http://alan.blog-city.com/has_amazon_ec2_become_over_subscribed.htm">avoiding noisy neighbors</a> is increasingly important</li></ul>
<p>
"top -H" will split out individual threads into their own lines; both per-process and per-thread views are useful. The per-thread view is particularly useful when dealing with Java applications since you can easily <a href="http://publib.boulder.ibm.com/infocenter/javasdk/tools/index.jsp?topic=/com.ibm.java.doc.igaa/_1vg0001475cb4a-1190e2e0f74-8000_1007.html">correlate them with thread names from the JVM</a> to see which threads are consuming your CPU. Briefly, you take the PID (thread ID) from top, convert it to hex -- e.g., "python -c 'print hex(12345)'" -- and match it with the corresponding thread ID from jstack.</p><p>Now you can troubleshoot with a process like: "Am I swapping? If so, what processes are using all the memory? If my application makes a lot of disk read requests, are my reads being cached or are they actually hitting the disk? If I am hitting the disk, is it saturated? How much 'hot data' can I have before I run out of cache room? Are any/all of my cpu cores maxed? Which threads are actually using the CPU? Which threads spend most of their time waiting for i/o?" Then if you go to ask for help tuning something, you can <a href="http://www.catb.org/%7Eesr/faqs/smart-questions.html">show that you've done your homework</a>.
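</p><p>Here is that TID-to-jstack correlation as a quick shell sketch (the TID 12345 and the $JPID variable are just illustrative):</p><pre class="code">
# The PID column in "top -H" is really the thread ID (TID).
TID=12345
# jstack labels each thread with its TID in hex, as nid=0x...
printf 'nid=0x%x\n' "$TID"    # prints nid=0x3039
# So, for a Java process with pid $JPID, find the matching thread:
# jstack "$JPID" | grep "$(printf 'nid=0x%x' "$TID")"
</pre><p>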
</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com11tag:blogger.com,1999:blog-11683713.post-31556959438360054972009-12-15T14:01:00.001-08:002010-01-19T05:49:59.849-08:00Cassandra reading listI put together this list for a co-worker who wants to learn more about Cassandra: (<a href="https://svn.apache.org/repos/asf/incubator/cassandra/branches/cassandra-0.5/CHANGES.txt">0.5</a> beta 2 out now!)<ul><li><a href="http://wiki.apache.org/cassandra/GettingStarted">Getting Started</a>: Cassandra is surprisingly easy to try out. This walks you through both single-node and clustered setup.
</li><li><a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">The Dynamo paper</a> and <a href="http://www.allthingsdistributed.com/2008/12/eventually_consistent.html">Amazon's related article on eventual consistency</a>: Cassandra's replication model is strongly influenced by Dynamo's. Almost everything you read here also applies to Cassandra. (The major exceptions are vector clocks, though even that <a href="https://issues.apache.org/jira/browse/CASSANDRA-580">may change</a>, and Cassandra's <a href="http://spyced.blogspot.com/2009/05/consistent-hashing-vs-order-preserving.html">support for order-preserving partitioning</a> with active load balancing.)</li><li><a href="http://arin.me/code/wtf-is-a-supercolumn-cassandra-data-model">WTF is a SuperColumn</a>? Arin Sarkissian from Digg explains the Cassandra data model.</li><li><a href="http://wiki.apache.org/cassandra/Operations">Operations</a>: stuff you will want to know when you run Cassandra in production</li><li><a href="http://n2.nabble.com/Cassandra-users-survey-td4040068.html">Cassandra users survey from Nov 09</a>: What Twitter, Mahalo, Ooyala, SimpleGeo, and others are using Cassandra for
</li><li><a href="http://wiki.apache.org/cassandra/ArticlesAndPresentations">More articles here</a> (Cassandra on OS X seems to be a particularly popular topic)
</li></ul>If you want to know more about the internals, also see these:
<ul><li><a href="http://wiki.apache.org/cassandra/ArchitectureInternals">Internals documentation</a></li><li><a href="http://www.facebook.com/video/video.php?v=540974400803">Facebook presentation</a> and <a href="http://vimeo.com/5185526">NoSQL SF presentation</a>, by Avinash Lakshman (the second picks up almost where the first leaves off)</li><li><a href="http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf">LADIS 2009 paper</a> by Avinash Lakshman and Prashant Malik
</li></ul>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com2tag:blogger.com,1999:blog-11683713.post-23423210154274284202009-07-29T10:34:00.000-07:002009-11-02T05:36:16.641-08:00Cassandra hackfest and OSCON reportThe best part of OSCON for me wasn't actually part of OSCON. The guys at Twitter put together a <a href="http://incubator.apache.org/cassandra/">Cassandra</a> <a href="http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200907.mbox/%3Cb6f68fc60907161612t5469a76ds6175f846ce29a05a@mail.gmail.com%3E">hackfest</a> on Wednesday night, with <a href="http://www.flickr.com/photos/_evan/3751880113/in/set-72157621750026309/">much awesomeness</a> resulting. Thanks to <a href="http://twitter.com/evan">Evan</a> for organizing!
<p>
<a href="http://twitter.com/stuhood">Stu Hood</a> flew up from Rackspace's Virginia offices just for the night, which normally probably wouldn't have been worth it, but <a href="http://twitter.com/moonpolysoft">Cliff Moon</a>, author of <a href="http://github.com/cliffmoon/dynomite/tree/master">dynomite</a>, showed up (thanks, Cliff!) and was able to give Stu a lot of pointers on <a href="https://issues.apache.org/jira/browse/CASSANDRA-193">implementing merkle trees</a>. Cliff and I also had a good discussion with Jun Rao about hinted handoff--Cliff and Jun are not fans, and I tend to agree with them--and <a href="http://www.allthingsdistributed.com/2008/12/eventually_consistent.html">eventual consistency</a>.
<p>
I also met <a href="http://blog.lostlake.org/">David Pollack</a> and got to talk a little about persistence for <a href="http://blog.lostlake.org/index.php?/archives/94-Lift,-Goat-Rodeo-and-Such.html">Goat Rodeo</a>, and talked to a ton of people from Twitter and Digg. I think those two, with Rackspace and IBM Research, constituted the companies with more than one engineer attending. The rest was "long tail."
<p>
Back at OSCON, my Cassandra talk was standing room only. Slides:
<div style="width: 425px; text-align: left;" id="__ss_1786870"><a style="margin: 12px 0pt 3px; font-family: Helvetica,Arial,Sans-serif; font-style: normal; font-variant: normal; font-weight: normal; font-size: 14px; line-height: normal; font-size-adjust: none; font-stretch: normal; display: block; text-decoration: underline;" href="http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo" title="Cassandra: Open Source Bigtable + Dynamo">Cassandra: Open Source Bigtable + Dynamo</a><object style="margin: 0px;" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=cassandraopensourcebigtabledynamopresentation-090729134121-phpapp01&stripped_title=cassandra-open-source-bigtable-dynamo"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=cassandraopensourcebigtabledynamopresentation-090729134121-phpapp01&stripped_title=cassandra-open-source-bigtable-dynamo" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object><div style="font-size: 11px; font-family: tahoma,arial; height: 26px; padding-top: 2px;">View more <a style="text-decoration: underline;" href="http://www.slideshare.net/">documents</a> from <a style="text-decoration: underline;" href="http://www.slideshare.net/jbellis">jbellis</a>.</div></div>
My second talk is the one I would have preferred to give first, on "What Every Developer Should Know About Database Scalability". (I would have preferred to give it first so that I could have just said "come to my Cassandra talk for more details" instead of trying to cram that in at the end. But, it was in my proposal outline!) Slides: <div style="width: 425px; text-align: left;" id="__ss_1786869"><a style="margin: 12px 0pt 3px; font-family: Helvetica,Arial,Sans-serif; font-style: normal; font-variant: normal; font-weight: normal; font-size: 14px; line-height: normal; font-size-adjust: none; font-stretch: normal; display: block; text-decoration: underline;" href="http://www.slideshare.net/jbellis/what-every-developer-should-know-about-database-scalability" title="What Every Developer Should Know About Database Scalability">What Every Developer Should Know About Database Scalability</a><object style="margin: 0px;" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=scalingdatabases-090729134110-phpapp01&stripped_title=what-every-developer-should-know-about-database-scalability"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=scalingdatabases-090729134110-phpapp01&stripped_title=what-every-developer-should-know-about-database-scalability" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object><div style="font-size: 11px; font-family: tahoma,arial; height: 26px; padding-top: 2px;">View more <a style="text-decoration: underline;" href="http://www.slideshare.net/">documents</a> from <a style="text-decoration: underline;" href="http://www.slideshare.net/jbellis">jbellis</a>.</div></div>
Other OSCON talks I liked (that have slides available):
<ul><li><a href="http://en.oreilly.com/oscon2009/public/schedule/detail/8198">Gearman: Bringing the Power of Map/Reduce to Everyday Applications</a></li><li><a href="http://en.oreilly.com/oscon2009/public/schedule/detail/8230">High Performance SQL with PostgreSQL [8.4]
</a></li><li><a href="http://en.oreilly.com/oscon2009/public/schedule/detail/8432">Linux Filesystem Performance for Databases</a> (reiserfs blows everyone away for random writes, by a factor of > 2!?)</li><li><a href="http://en.oreilly.com/oscon2009/public/schedule/detail/8364">Neo4j - The Benefits of Graph Databases</a></li><li><a href="http://en.oreilly.com/oscon2009/public/schedule/detail/7823">Release Mismanagement: How to Alienate Users and Frustrate Developers</a></li></ul>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com0tag:blogger.com,1999:blog-11683713.post-20871584129996909762009-07-06T10:56:00.000-07:002009-07-06T11:20:33.833-07:00Cassandra 0.3 update<p>
Two months after <a href="http://spyced.blogspot.com/2009/05/cassandra-03-release-candidate-and.html">the first release candidate</a>, <a href="http://incubator.apache.org/cassandra/">Cassandra</a> 0.3 is still not out. But, we're close!
<p>
We had two more bug-fix release candidates, and it's virtually certain that 0.3-final will be the same exact code as <a href="http://people.apache.org/%7Ejbellis/cassandra/cassandra-0.3.0-rc3.tar.gz">0.3-rc3</a>. (If you're using rc1, you do want to upgrade; see <a href="https://svn.apache.org/repos/asf/incubator/cassandra/tags/cassandra-0.3.0-rc3/CHANGES.txt">CHANGES.txt</a>.) But, we got stuck in the <a href="http://twitter.com/spyced/status/2497990811">ASF bureaucracy</a> and it's going to take at least <a href="http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200907.mbox/%3Ce06563880907060804p731964d5k6cb6d7ab73d92767@mail.gmail.com%3E">one more round-trip</a> before the crack Release Prevention Team grudgingly lets us call it official.
<p>
In the meantime, <a href="https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&pid=12310865&status=5">work continues apace</a> on trunk for 0.4.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com2tag:blogger.com,1999:blog-11683713.post-21148707550954676712009-06-23T09:40:00.000-07:002009-06-23T11:07:36.548-07:00Patch-oriented development made sane with git-svnOne of the drawbacks to working on <a href="http://incubator.apache.org/cassandra/">Cassandra</a> is that unlike every other OSS project I have ever worked on, we are using a patch-oriented development process rather than post-commit review. It's really quite painfully slow. Somehow this became sort of the default for ASF projects, but <a href="http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg00244.html">there is precedent</a> for switching to post-commit review eventually.
<p>
In the meantime, there is git-svn.
</p><p>
(The ASF does have <a href="http://wiki.apache.org/general/GitAtApache">a git mirror set up</a>, but I'm going to ignore that because (a) its reliability has been questionable and (b) sticking with git-svn hopefully makes this more useful for non-ASF projects.)
</p>
<p>
Disclaimer: <a href="http://spyced.blogspot.com/2008/12/frustrated-with-git.html">
I am not a git expert</a>, and probably some of this will make you cringe if you are. Still, I hope it will be useful for some others fumbling their way towards enlightenment. As background, I suggest the <a href="http://git.or.cz/course/svn.html">git crash course for svn users</a>. Just the parts up to the Remote section.
</p>
<p>
Checkout:
</p><ol>
<li>git-svn init https://svn.apache.org/repos/asf/cassandra/trunk cassandra
</li><li>cd cassandra; git-svn fetch <i>(init only writes the config; fetch actually pulls the history)</i>
</li></ol>
Once that's done, the only git-svn commands you need to know about are dcommit, to push the changes <i>in the current git branch</i> back to svn, and rebase, to pull changes from svn and re-apply your uncommitted patches on top of that (much like svn up).
<p>
Creating new code:
</p><ol>
<li>git checkout -b [ticket number]
</li><li>[edit stuff, maybe git add or git rm new or obsolete files]
</li><li>git commit -a -m 'commit'
</li><li>repeat 2-3 as necessary
</li><li><a href="http://github.com/dreiss/git-jira-attacher/tree/master">git-jira-attacher</a> [revision] (usually some variant of HEAD^^^)
</ol>
[after review]
<ol>
<li>git log <i>(just to make sure I'm about to commit what I think I'm about to commit)</i>
</li><li>git-svn dcommit
</li><li>git checkout master
</li><li>git-svn rebase -l <i>(this will put the changes you just committed into master)</i>
</li><li>git branch -d [ticket number]
</li></ol>
When I'm reviewing code it looks similar:
<ol>
<li>git checkout -b [ticket number]
</li><li>wget patches and git-apply, or <a href="http://github.com/eevans/git-jira-attacher/blob/b002ab0a0cd4d7c9f3801df4e8664e9fc3711053/jira-apply">jira-apply</a> CASSANDRA-[ticket-number]
</li><li>review in gitk/qgit and/or IDE (the intellij git plugin is quite decent)
</li><li>commit .. branch -d as above
</li></ol>
The last operation is "see who I need to bug to get reviews moving." This is just a list of the branches I haven't merged into master and deleted yet:
<ol>
<li>git branch
</li></ol>
Git-svn takes a lot of the pain out of the ASF's patch-and-jira workflow. In particular, you can easily break changes for a ticket up into multiple patches that are easily reviewed, and the latency of waiting for patch review doesn't kill your throughput so badly since you can just leave that branch alone and start a new one for your next piece of functionality. And of course you get git commit --amend and git rebase -i for massaging patches during the review process.
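<p>Here's a self-contained sketch of that squash step, using GIT_SEQUENCE_EDITOR as a non-interactive stand-in for the editor you'd normally get from git rebase -i (the throwaway repo and the "wip" commits are purely illustrative):</p><pre class="code">
# Build a throwaway repo with three exploratory commits on top of a base.
cd "$(mktemp -d)" && git init -q .
git config user.email you@example.com && git config user.name you
echo base > f && git add f && git commit -qm 'base'
for n in 1 2 3; do echo "wip $n" >> f && git commit -qam "wip $n"; done

# Interactively you'd run "git rebase -i HEAD~3" and mark the later
# commits as "fixup"; here a sed one-liner edits the todo list for us.
GIT_SEQUENCE_EDITOR='sed -i "2,\$s/^pick/fixup/"' git rebase -i HEAD~3

git log --oneline    # now just "base" plus a single combined "wip 1"
</pre>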
<p>
One fairly common complication is if you finish a ticket A, then start on ticket B (that depends on A) while waiting for A to be reviewed. So you checkout -b from your branch A rather than master and build some patches on that. As sometimes happens, the reviewer finds something you need to improve in your patch set for A, so you make those changes. Now you need to rebase your B patches on top of the changes you made to A. The best way to do this is to branch A to B-2, then git cherry-pick your commits from B onto it, resolving conflicts as necessary.
</p><p>
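That dance, end to end (branch names A and B as above; the repo here is a throwaway one so the example stands alone):</p><pre class="code">
# Setup: a base commit, ticket A, and ticket B built on top of A.
cd "$(mktemp -d)" && git init -q .
git config user.email you@example.com && git config user.name you
echo 0 > file && git add file && git commit -qm 'base'
git checkout -qb A && echo A1 >> file && git commit -qam 'ticket A'
git checkout -qb B && echo B1 > other && git add other && git commit -qm 'ticket B'

# Review feedback arrives: revise A.
git checkout -q A && echo A2 >> file && git commit -qam 'ticket A, revised'

# Branch A to B-2, then cherry-pick B's work onto it.
git checkout -qb B-2 A
git cherry-pick B    # resolve conflicts here if there are any
</pre><p>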
Final note: I often like to create lots of small commits as I am exploring a solution and combine them into larger units with git rebase -i for patch submission. (It's easier to combine small patches than to pull apart large ones.) So my early commit messages are often terse and need editing. You can change commit messages with edit mode in rebase, then using commit --amend and rebase --continue, but that is tedious. I complained about this to my friend <a href="http://twitter.com/zwily">Zach Wily</a> and he made this git amend-message command (place in [alias] in your .gitconfig):
</p><pre>
amend-message = "!bash -c ' \
c=$0; \
if [ $c == \"bash\" ]; then echo \"Usage: git amend-message <commit>\"; exit 1; fi; \
saved_head=$(git rev-parse HEAD); \
commit=$(git rev-parse $c); \
commits=$(git log --reverse --pretty=format:%H $commit..HEAD); \
echo \"Rewinding to $commit...\"; \
git reset --hard $commit; \
git commit --amend; \
for X in $commits; do \
echo \"Applying $X...\"; \
git cherry-pick $X >> /dev/null; \
if [ $? -ne 0 ]; then \
echo \" apply failed (is this a merge?), rolling back all changes\"; \
git reset --hard $saved_head; \
echo \" ** AMEND-MESSAGE FAILED, sorry\"; \
exit 1; \
fi; \
done; \
echo \"Done\"'"
</pre>
(Zach would like the record to show that he knows this is pretty hacky. "For instance, it won't work if one of the commits after the one you're changing is a merge, since cherry-pick can't handle those." But it's quite useful, all the same.)
<p>
For what it's worth, the rest of my aliases are
</p><pre>
st = status
ci = commit
co = checkout
br = branch
cp = cherry-pick
</pre>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com7tag:blogger.com,1999:blog-11683713.post-72566994474147242062009-05-27T06:59:00.000-07:002009-05-27T08:21:14.068-07:00Why you won't be building your killer app on a distributed hash tableI ran across <a href="http://seattleweb.intel-research.net/people/lamarca/pubs/paper-ChaRam.pdf">A case study in building layered DHT applications</a> while doing some research on <a href="https://issues.apache.org/jira/browse/CASSANDRA-192">implementing load-balancing in Cassandra</a>. The question addressed is, "Are DHTs a general-purpose tool that you can build more sophisticated services on?"
<p>
Short version: no. A few specialized applications can and have been built on a plain DHT, but most applications built on DHTs have ended up having to customize the DHT's internals to achieve their functional or performance goals.
<p>
This paper describes the results of attempting to build a relatively complex datastructure (prefix hash trees, for range queries) on top of OpenDHT. The result was mostly failure:
<blockquote style="font-style: italic;">A simple put-get interface was not quite enough. In particular, OpenDHT relies on timeouts to invalidate entries and has no support for atomicity primitives... In return for ease of implementation and deployment, we sacrificed performance. With the OpenDHT implementation, a PHT query operation took a median of 2–4 seconds. This is due to the fact that layering entirely on top of a DHT service inherently implies that applications must perform a sequence of put-get operations to implement higher level semantics with limited opportunity for optimization within the DHT.</blockquote>In other words, there are two primary problems with the DHT approach:
<ul><li>Most DHTs will require a second locking layer to achieve correctness when implementing a more complex data structure on top of the DHT semantics. In particular, this will certainly apply to eventually-consistent systems in the Dynamo mold.</li><li>Advanced functionality like range queries needs to be supported natively to be at all efficient.
</li></ul>While they spin this in a positive manner -- "hey, at least it didn't take much code" -- the reality is that for most of us, query latency of two to four seconds is several orders of magnitude away from acceptable.
<p>
This is one reason why I think <a href="http://incubator.apache.org/cassandra/">Cassandra</a> is the most promising of the open-source distributed databases -- you get a relatively rich data model and a distribution model that supports efficient range queries. These are not things that can be grafted on top of a simpler DHT foundation, so Cassandra will be useful for a wider variety of applications.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com14tag:blogger.com,1999:blog-11683713.post-50497729543194795992009-05-18T15:19:00.002-07:002012-04-16T07:14:06.230-07:00Belated 2009 Introduction to SQLAlchemy slidesI was asked to put my slides up again -- sorry it took so long. The slides and code samples are now up <a href="http://people.apache.org/%7Ejbellis/sqla2009/">here</a>. Video of the tutorial <a href="http://blip.tv/file/1998818">is also up</a>. (3 parts, first is linked). There's definitely audio problems in parts but at least some is watchable.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com3tag:blogger.com,1999:blog-11683713.post-62196649890872209162009-05-13T20:18:00.000-07:002010-07-29T21:34:16.724-07:00Cassandra 0.3 release candidate and progressWe have a release candidate out for <a href="http://incubator.apache.org/cassandra/">Cassandra</a> 0.3. Grab the <a href="http://people.apache.org/%7Ejbellis/cassandra/cassandra-0.3-rc.tgz">download</a> and check out <a href="http://wiki.apache.org/cassandra/GettingStarted">how to get started</a>. The <a href="http://www.facebook.com/video/video.php?v=540974400803">facebook presentation</a> from almost a year ago now is also still a good intro to some of the features and data model.
<p>
<span style="font-weight: bold;">Cassandra in a nutshell</span>:
</p><ul><li>Scales writes very, very well: just add more nodes!</li><li>Has a much richer data model than vanilla key/value stores -- closer to what you'd be used to in a relational db.</li><li>Is pretty bleeding edge -- to my knowledge, Facebook is the only group running Cassandra in production. (Their largest cluster is <a href="http://groups.google.com/group/cassandra-user/msg/85a83621d07ff165">120 machines and 40TB of data</a>.) At Rackspace we are working on a Cassandra-based app now that 0.3 has the extra features we need.</li><li>Moved to the Apache Incubator about 40 days ago, at which point development greatly accelerated.
</li></ul><span style="font-weight: bold;">Changes in 0.3 include</span>
<ul><li>Range queries on keys, including user-defined key collation.</li><li>Support for removes (deletes), which is nontrivial in an eventually consistent world.
</li><li>Workaround for a weird bug in JDK select/register that seems particularly common on VM environments. Cassandra should deploy fine on EC2 now. (Oddly, it never had problems on Slicehost / Cloud Servers, which is also Xen-based.)</li><li>Much improved infrastructure: the beginnings of a decent test suite ("ant test" for unit tests; "nosetests" for system tests), code coverage reporting, etc.</li><li>Expanded node status reporting via JMX
</li><li>Improved error reporting/logging on both server and client
</li><li>Reduced memory footprint in default configuration</li><li>and plenty of bug fixes.</li></ul>For those of you just joining us, Cassandra already had
<ul><li>An advanced on-disk storage engine that never does random writes</li><li>Transaction log-based data integrity</li><li>P2P gossip failure detection
</li><li>Read repair</li><li>Hinted handoff</li><li>Bootstrap (adding new nodes to a running cluster)</li></ul>(Read repair and hinted handoff are discussed in more detail in the <a href="http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf">Dynamo paper</a>.)
<p>
The cassandra development and user community is also growing at an exciting pace. Besides the original two developers from Facebook, we now have five developers regularly contributing improvements and fixes, and many others on a more ad-hoc basis.
</p><p>
<span style="font-weight: bold;">How fast is it?</span>
</p><p>
In a nutshell, Cassandra is much faster than relational databases, and much slower than memory-only systems or systems that don't sync each update to disk. Actual benchmarks are <a href="http://blog.oskarsson.nu/2009/05/vpork.html">in the works</a>. We plan to start performance tuning with the next release, but if you want to benchmark it, here are some suggestions to get numbers closer to what you'll see in the wild (and about 10x more throughput than if you don't do these):
</p><ul><li>Do enough runs of your benchmark first that each operation tested by your suite runs 20k times before timing it for real. This will allow the JVM jit to compile down to machine code; otherwise you'll just be getting the interpreted version.</li><li>Change the root logger level in conf/log4j.properties from DEBUG to INFO; we do a LOT of logging for debuggability and for small column values the logging has more overhead than the actual workload. (It would be even faster if we were to <a href="http://surguy.net/articles/removing-log-messages.xml">remove them entirely</a> but that didn't make this release.)</li></ul>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com5tag:blogger.com,1999:blog-11683713.post-67320101630995722502009-05-01T13:54:00.000-07:002009-05-04T17:00:46.230-07:00A better analysis of Cassandra than mostVladimir Sedach wrote a three-part dive into <a href="http://incubator.apache.org/cassandra/">Cassandra</a>. (Almost two months ago now. Guess I need to set up Google Alerts. Trouble is there's a surprising amount of noise around the word `cassandra.`)
<ul><li><a href="http://carcaddar.blogspot.com/2009/03/cassandra-of-facebook-or-tale-of.html">Part 0
</a></li><li><a href="http://carcaddar.blogspot.com/2009/03/cassandra-of-facebook-or-tale-of_10.html">Part 1</a>
</li><li><a href="http://carcaddar.blogspot.com/2009/03/cassandra-of-facebook-or-tale-of_1895.html">Part 2</a></li></ul>A few notes:
<ul><li>We now have an <a href="http://spyced.blogspot.com/2009/05/consistent-hashing-vs-order-preserving.html">order-preserving partitioner</a> as well as the hash-based one</li><li>Yes, if you tell Cassandra to wait for all replicas to be ack'd before calling a write a success, then you would have traditional consistency (as opposed to "eventual"), but you'd also have no tolerance for hardware failures, and tolerating those is a main point of this kind of system.</li><li>Zookeeper is not currently used by Cassandra, although we have plans to use it in the future.
</li><li>Load balancing is not implemented yet.
</li><li>The move to Apache is <a href="http://incubator.apache.org/cassandra/">finished</a> and development is active there now.
</li></ul>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com2tag:blogger.com,1999:blog-11683713.post-73359938426811499752009-05-01T07:44:00.000-07:002010-10-11T20:54:44.189-07:00Consistent hashing vs order-preserving partitioning in distributed databases<p>
The <a href="http://incubator.apache.org/cassandra/">Cassandra</a> distributed database supports two partitioning schemes now: the traditional <a href="http://en.wikipedia.org/wiki/Consistent_hashing">consistent hashing</a> scheme, and an order-preserving partitioner.
</p><p>
The reason that almost all similar systems use consistent hashing (the <a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">Dynamo paper</a> has the best description; see sections 4.1-4.3) is that it provides a kind of brain-dead load balancing from the hash algorithm spreading keys across the ring. But the Dynamo authors go into some detail about how this by itself doesn't actually give good results in practice; their solution was to assign multiple tokens to each node in the cluster, and they describe several approaches to that. But Cassandra's original designer considers this <a href="http://groups.google.com/group/cassandra-user/msg/b0e9eed9116f0337">a hack</a> and prefers <a href="http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41">real load balancing</a>.
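The hash-based scheme can be sketched in a few lines of Python — one token per node, MD5 as the hash, and none of the multiple-tokens-per-node machinery the Dynamo authors describe (a toy model, not Cassandra's actual partitioner):

```python
import hashlib
from bisect import bisect

class HashRing:
    """Toy consistent-hash ring: each node gets one token, and a key
    belongs to the first node whose token is >= the key's hash,
    wrapping around at the top of the ring."""
    def __init__(self, nodes):
        self.tokens = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # bisect finds the first token >= the key's hash; the modulo
        # wraps keys past the last token back to the first node.
        i = bisect(self.tokens, (self._hash(key),)) % len(self.tokens)
        return self.tokens[i][1]
```

Because keys land wherever the hash sends them, load spreads out for free — but any notion of key ordering is destroyed, which is exactly what the order-preserving partitioner trades back.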
</p><p>
An order-preserving partitioner, where keys are distributed to nodes in their natural order, has huge advantages over consistent hashing, particularly the ability to do range queries across the keys in the system (which has also been committed to Cassandra now). This is important, because the corollary of "the partitioner uses the key to determine what node the data is on" is, "each key should only have a relatively small amount of data associated with it, compared to a node's capacity (see the <a href="http://cwiki.apache.org/confluence/display/CSDR/Data+Model">data model explanation</a>)." Cassandra column families will often have more columns in them than you'd see in a traditional database, but "millions" is pushing it (depending on column size) and "billions" is a bad idea. So you'll want to model things such that you spread data across multiple keys, and if you then pick an appropriate key naming convention, range queries will let you slice and dice that data as needed.
</p><p>
Cassandra is still in the process of implementing load balancing, but in the meantime order-preserving partitioning is still useful without it <span style="font-style: italic;">if</span> you know what your key distribution will look like in advance and can pick your node tokens accordingly. Otherwise, there's always the old-school hash-based partitioner until we get that done (for the release after the one we'll have in the next week or so).
</p><p>
See the <a href="http://cwiki.apache.org/confluence/display/CSDR/Index">introduction</a> and <a href="http://cwiki.apache.org/confluence/display/CSDR/GettingStarted">getting started</a> pages of the Cassandra wiki for more on Cassandra, and drop us a line on the mailing list or in IRC if you have questions; we're actively trying to improve our docs.</p>Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com3tag:blogger.com,1999:blog-11683713.post-54484877372875979312009-04-30T14:58:00.000-07:002009-11-20T05:29:06.176-08:00Automatic project structure inference<p>
David MacIver has an interesting blog entry up about <a href="http://www.drmaciver.com/2009/04/determining-logical-project-structure-from-commit-logs">determining logical project structure via commit logs</a>. I was very interested because <a href="https://issues.apache.org/jira/browse/CASSANDRA-27">one of Cassandra's oldest issues</a> is creating categories for our JIRA instance. (I've never been a big fan of JIRA, but you work with the tools you have. Or the ones the ASF inflicts on you, in this case.)
<p>
The desire to add extra work to issue reporting for a young project like Cassandra strikes me as slightly misguided in the first place. I have what may be an excessive aversion to overengineering, and I like to see a very clear benefit before adding complexity to anything, even an issue tracker. Still, I was curious to see what David's clustering algorithm made of things. And after pestering him to show me how to run his code, I figure I owe it to him to <a href="http://people.apache.org/%7Ejbellis/maciver-clusters.txt">show my results</a>.
<p>
In general it did a pretty good job, particularly with the mid-sized groups of files. The large groups are just noise; the small groups, well, it's not exactly a revelation that Filter and FilterTest go together. I'd be tempted to play with it more but with only about two months and 250 commits in the apache repo there's not really all that much data there. (Cassandra's first two years were in an internal Facebook repository.) Working with data that exists as a side effect of natural activity is fascinating.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com1tag:blogger.com,1999:blog-11683713.post-27206993009175168402009-04-11T06:52:00.000-07:002011-11-04T20:26:33.846-07:00The best PyCon talk you didn't seeThere were a lot of good talks at PyCon, but I humbly submit that the best one you haven't seen yet is Robert Brewer's <a href="http://us.pycon.org/2009/conference/schedule/event/70/">talk on DejaVu</a>. Robert describes how his <a href="http://www.aminus.net/geniusql/chrome/common/doc/trunk/">Geniusql</a> layer <span style="font-style: italic;">disassembles and parses Python bytecode</span> to let his ORM turn Python lambdas into SQL. Microsoft got a lot of press for doing something similar for .NET with <a href="http://msdn.microsoft.com/en-us/netframework/aa904594.aspx">LINQ</a>, but Bob was there first.
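You can see the raw material Geniusql works from with the standard library's dis module (a toy illustration of the idea, assuming nothing about Geniusql's internals; Comic, Title, and Views mirror the example below):

```python
import dis

# The kind of lambda a bytecode-walking ORM would translate to SQL:
f = lambda c: 'Hob' in c.Title or c.Views > 0

# Every attribute access and comparison in the expression shows up as
# an opcode (LOAD_ATTR, COMPARE_OP, ...), which is what lets the ORM
# rebuild the expression as a WHERE clause instead of executing it:
ops = [ins.opname for ins in dis.get_instructions(f)]
```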
<pre>box = store.new_sandbox()
print [c.Title for c in box.recall(
    Comic, lambda c: 'Hob' in c.Title or c.Views > 0)]
</pre>This is cool as hell. The Geniusql part starts about 15 minutes in.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com4tag:blogger.com,1999:blog-11683713.post-63584651332078642672009-04-07T07:57:00.000-07:002009-04-07T08:03:02.030-07:00Credit where credit is dueI'm starting to conclude that <a href="http://spyced.blogspot.com/2008/12/frustrated-with-git.html">git just doesn't fit my brain</a>. Several months in, I'm still confused when things don't work the way they "should." My co-worker says I should start a wiki for weird-ass things to do with git: "You keep coming up with use cases that would never occur to me."
But I have to give the git community credit: I've never gone into #git on freenode and gotten less than fantastic help. Even with git-svn.Jonathan Ellishttp://www.blogger.com/profile/11003648392946638242noreply@blogger.com2