
Patch-oriented development made sane with git-svn

One of the drawbacks to working on Cassandra is that unlike every other OSS project I have ever worked on, we are using a patch-oriented development process rather than post-commit review. It's really quite painfully slow. Somehow this became sort of the default for ASF projects, but there is precedent for switching to post-commit review eventually.

In the meantime, there is git-svn.

(The ASF does have a git mirror set up, but I'm going to ignore that because (a) its reliability has been questionable and (b) sticking with git-svn hopefully makes this more useful for non-ASF projects.)

Disclaimer: I am not a git expert, and probably some of this will make you cringe if you are. Still, I hope it will be useful for some others fumbling their way towards enlightenment. As background, I suggest the git crash course for svn users. Just the parts up to the Remote section.

Checkout:

  1. git-svn clone https://svn.apache.org/repos/asf/cassandra/trunk cassandra
(clone runs init and then fetch; init by itself only sets up an empty repository and does not pull down any history.)

Once that's done, the only git-svn commands you need to know about are dcommit, to push the commits on the current git branch back to svn, and rebase, to pull changes from svn and re-apply your not-yet-dcommitted patches on top of them (basically exactly like svn up).

Creating new code:

  1. git checkout -b [ticket number]
  2. [edit stuff, maybe git add or git rm new or obsolete files]
  3. git commit -a -m 'commit'
  4. repeat 2-3 as necessary
  5. git-jira-attacher [revision] (usually some variant of HEAD^^^)
[after review]
  1. git log (just to make sure I'm about to commit what I think I'm about to commit)
  2. git-svn dcommit
  3. git checkout master
  4. git-svn rebase -l (this will put the changes you just committed into master)
  5. git branch -d [ticket number]
When I'm reviewing code it looks similar:
  1. git checkout -b [ticket number]
  2. wget patches and git-apply, or jira-apply CASSANDRA-[ticket-number]
  3. review in gitk/qgit and/or IDE (the intellij git plugin is quite decent)
  4. commit .. branch -d as above
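
If helper scripts like git-jira-attacher and jira-apply aren't available, the patch hand-off in both lists above can be approximated with stock git. What follows is only a sketch in a throwaway repository: the ticket name CASSANDRA-000, the review branch name, and the file names are made up, and I'm assuming that one patch file per commit is what you want to attach to the ticket.

```shell
#!/bin/sh
# Sketch: export a ticket branch as patch files, then apply them as a reviewer would.
# CASSANDRA-000 and all file names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Author side: two small commits on a ticket branch.
git checkout -q -b CASSANDRA-000
echo one > one.txt
git add one.txt
git commit -q -m "Add one"
echo two > two.txt
git add two.txt
git commit -q -m "Add two"

# One numbered patch file per commit that is on this branch but not on master.
patchdir=$(mktemp -d)
git format-patch -q -o "$patchdir" master
ls "$patchdir"

# Reviewer side: fresh branch off master, apply the patches as commits.
git checkout -q master
git checkout -q -b review-CASSANDRA-000
git am -q "$patchdir"/*.patch
git log --pretty=format:%s
```

git am keeps the author and commit message from each patch, whereas git-apply only changes the working tree and leaves the commit to you.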
The last operation is "see who I need to bug to get reviews moving." This is just a list of the branches I haven't merged into master and deleted yet:
  1. git branch
Git-svn takes a lot of the pain out of the ASF's patch-and-jira workflow. In particular, you can easily break changes for a ticket up into multiple patches that are easily reviewed, and the latency of waiting for patch review doesn't kill your throughput so badly since you can just leave that branch alone and start a new one for your next piece of functionality. And of course you get git commit --amend and git rebase -i for massaging patches during the review process.

One fairly common complication is if you finish a ticket A, then start on ticket B (that depends on A) while waiting for A to be reviewed. So you checkout -b from your branch A rather than master and build some patches on that. As sometimes happens, the reviewer finds something you need to improve in your patch set for A, so you make those changes. Now you need to rebase your patches to B on top of the changes you made to A. The best way to do this is to branch A to B-2, then git cherry-pick from B and resolve conflicts as necessary.
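
Here is a minimal sketch of that recovery in a throwaway repository. The branch names A, B, and B-2 follow the description above; the file names and commit messages are invented, and B has only a single commit to keep things short.

```shell
#!/bin/sh
# Sketch: replay ticket B's patches on top of a revised ticket A.
# File names and commit messages are illustrative only.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Ticket A, then ticket B stacked on top of it.
git checkout -q -b A
echo "a v1" > a.txt
git add a.txt
git commit -q -m "A: first version"

git checkout -q -b B
echo "b" > b.txt
git add b.txt
git commit -q -m "B: depends on A"

# Review feedback arrives: revise A (here by amending its only commit).
git checkout -q A
echo "a v2" > a.txt
git commit -q -a --amend -m "A: revised after review"

# Branch the revised A to B-2 and cherry-pick B's own commit onto it.
git checkout -q -b B-2
git cherry-pick B > /dev/null

git log --pretty=format:%s B-2
```

With several commits on B you would name them explicitly, for example git cherry-pick B~2..B for the last two. A range like A..B would not do, because once A has been amended the old version of A's commit is reachable only from B and would be picked up as well.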

Final note: I often like to create lots of small commits as I am exploring a solution and combine them into larger units with git rebase -i for patch submission. (It's easier to combine small patches than to pull apart large ones.) So my early commit messages are often terse and need editing. You can change commit messages by using edit mode in rebase, then commit --amend and rebase --continue, but that is tedious. I complained about this to my friend Zach Wily and he made this git amend-message command (place in [alias] in your .gitconfig):

   amend-message = "!bash -c ' \
       c=$0; \
       if [ $c == \"bash\" ]; then echo \"Usage: git amend-message <commit>\"; exit 1; fi; \
       saved_head=$(git rev-parse HEAD); \
       commit=$(git rev-parse $c); \
       commits=$(git log --reverse --pretty=format:%H $commit..HEAD); \
       echo \"Rewinding to $commit...\"; \
       git reset --hard $commit; \
       git commit --amend; \
       for X in $commits; do \
           echo \"Applying $X...\"; \
           git cherry-pick $X >> /dev/null; \
           if [ $? -ne 0 ]; then \
               echo \"  apply failed (is this a merge?), rolling back all changes\"; \
               git reset --hard $saved_head; \
               echo \" ** AMEND-MESSAGE FAILED, sorry\"; \
               exit 1; \
           fi; \
       done; \
       echo \"Done\"'"
(Zach would like the record to show that he knows this is pretty hacky. "For instance, it won't work if one of the commits after the one you're changing is a merge, since cherry-pick can't handle those." But it's quite useful, all the same.)
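
To make the alias less of a black box, here are the same steps run by hand in a throwaway repository. This is only a sketch: the commit messages and file names are invented, and it passes the new message with -m instead of opening an editor the way the alias does.

```shell
#!/bin/sh
# Sketch: the rewind / amend / cherry-pick sequence the alias performs.
# Commit messages and file names are illustrative only.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Three terse exploratory commits.
for n in 1 2 3; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "wip $n"
done

# The commit whose message should change, and everything that came after it.
target=$(git rev-parse HEAD~1)
later=$(git log --reverse --pretty=format:%H "$target"..HEAD)

# Rewind to the target, rewrite its message, then replay the later commits.
git reset -q --hard "$target"
git commit -q --amend -m "Add file2 with a proper message"
for c in $later; do
    git cherry-pick "$c" > /dev/null
done

git log --reverse --pretty=format:%s
```

Every commit from the amended one onward gets a new hash, so this is only safe on commits that have not been dcommitted yet.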

For what it's worth, the rest of my aliases are

 st = status
 ci = commit
 co = checkout
 br = branch
 cp = cherry-pick

Comments

Unknown said…
Yikes dude. Did Zach tell you about Gerrit? Sounds like you have a sort of institutional need for JIRA, but our Gerrit system is all kinds of awesome.
Jonathan Ellis said…
We're pretty much stuck with what the ASF gives us. They're picky that way.

(In particular, patches from people without a contributor license agreement on file need to go through JIRA to make the lawyers happy.)
Zach said…
On that note, Gerrit has the option of requiring submitters to sign contributor agreements before it will allow patches. (I know that doesn't help you, but oh well.)
Jonathan,

You would probably be interested in this discussion:
http://markmail.org/thread/2vtyrx56jwsloxhn