I have been using Arch to package my Debian packages since 2003, which means that Arch has had a good long run as my SCM of choice. I had been using CVS for a few years before I moved to Arch, and the migration took me about six months, since it involved a whole new philosophy of packaging; I am hoping that migrating to Git will not involve such a major paradigm shift, and will thus be less disruptive and time consuming. What follows is a narrative of my efforts to get educated about Git.
This article is meant to be an annotated, selective, organized set of links to information about Git. How does it differ from the myriad other link collections about Git proliferating on the web? Well, the added value is in the annotations and the organization: while not quite a narrative of my exploration, this is an idealized version of what I think my discovery process should have been, to be most effective. Staging the information is important; Google finds one lots of information that is incomprehensible to someone just coming to Git. This selection of links is actually selective; I have included only pointers to resources that fed me information at the level that I could handle at that stage, and I have eliminated links to information that was not new at that point. I have tried to select the best of breed (in terms of information and clarity) for each kind of information source I have come across so far.
There is a caveat: as a beginner, I am better able to judge what is confusing to a beginner now than I shall be once I have become more familiar with the system, but I am still enough of a novice not to trust my judgement on what really is best practice. I can fix the latter as I gain experience, but then I'll need to be careful not to overload on complexity too early in the learning curve.
On the down side, this selection is subjective, and probably shall be even in the long term: I include what appealed to me, and will probably miss loads of pointers to information that I have not yet come across. However, I hope this will make it easier for other people to reach the same goal: use git for their version control needs.
A good place to start is the Wikipedia entry for Git – especially the external links at the bottom of the page. I like the fact that Git seems to support my usage of source code versioning systems, in that I usually build a lot of scaffolding around the core system – the grand-daddy of all the various tools for building packages out of source control, cvs-buildpackage, was my porcelain around CVS plumbing. I like that the history of development is cryptographically authenticated – and that this is distinct from explicitly signing tagged versions. I am intrigued by the idea of a content addressable file system, and am still trying to wrap my head around that. I am also looking forward to the speed increases.
The next stop seems to be to listen to the practitioners – these are the authors and the early adopters, and they give us a feel for what the system was designed to do, and the work flows that it is suited for. In complex software, style sometimes matters: there are right ways to do things, and there are things the software is not suited for. Going with the comfort zone of the software, and not learning bad habits to start with, can make adoption far easier.
So the first place I decided to start with was to go listen to Linus giving a talk at Google. I figured that this was probably the best way to get the vision thang. It is important to know what the underlying vision is, I think, to not get into bad habits and into modes of operation that the underlying system is not well suited for. I know I'll change things – but I should know how things are supposed to work before I start tweaking. This is just like cooking. Follow the recipe at least once – before you start throwing in random Indian spices.
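The authenticated-history idea can be sketched as a chain of hashes. What follows is a toy model in Python, not Git's actual commit format: real commits also record author, committer, and timestamps, and the helper names and sample snapshots here are my own assumptions. The point is only that each commit id covers its parent's id, so changing anything in the past changes every id that comes after it.

```python
import hashlib

def object_id(kind: str, body: bytes) -> str:
    """Hash a typed object: a 'kind size NUL' header followed by the body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def commit_id(tree_id: str, parent_id, message: str) -> str:
    """Simplified commit (assumption): a tree, an optional parent, a message."""
    lines = [f"tree {tree_id}"]
    if parent_id is not None:
        lines.append(f"parent {parent_id}")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return object_id("commit", body.encode("utf-8"))

def build_history(snapshots):
    """Return commit ids for (content, message) pairs, oldest first."""
    ids, parent = [], None
    for content, message in snapshots:
        # Stand-in for a real tree object: hash the snapshot content directly.
        tree = object_id("blob", content)
        parent = commit_id(tree, parent, message)
        ids.append(parent)
    return ids

# Hypothetical three-commit history.
original = build_history([
    (b"version 1\n", "initial import"),
    (b"version 2\n", "fix typo"),
    (b"version 3\n", "new upstream release"),
])

# The same history with only the very first snapshot altered.
tampered = build_history([
    (b"version 1 (altered)\n", "initial import"),
    (b"version 2\n", "fix typo"),
    (b"version 3\n", "new upstream release"),
])

for before, after in zip(original, tampered):
    print(before[:12], after[:12], "same" if before == after else "DIFFERENT")
```

Every line of the output is marked DIFFERENT, even though only the oldest snapshot was touched. Knowing the id of the latest commit is therefore enough to vouch for the whole history behind it, without anyone having signed anything.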
What follows are my notes on the talk (I steal his slides shamelessly).
I first saw mention of this talk by Randal Schwartz in a blog entry by John Goerzen. This is more of a talk about Git, as opposed to the vision thing.
Entities in a repository:
Git makes branches easy. Working with Git means you branch at the drop of a hat. No complex commit rules, no huge test suites to pass before committing, etc. In SVN, branches are global – everyone sees them, and an svn up is humongous. Git, by contrast, encourages making trial branches and experimental branches.
Why is Git useful for non-distributed development? Almost everyone who uses Git, or any other distributed source code management system, harps on how useful it is for distributed development. Keith Packard, in his blog, persuasively argues that off-line access, private development, and distributed backups make Git a compelling tool even in centralized development environments.
Sam Vilain's introduction to git-svn is another positive testimonial. While ostensibly about git-svn, it is an impassioned introduction to the clear strengths of Git.
I had heard it before. Git is a content addressable file system, really. Git does not track files and filenames. It manages changes to a tree of files over time. It does not track file owners and permissions. It tracks content. It does not do renames. History is the history of the project. It can track function moves from one file to another file. How does it store things internally?
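To make the content-addressing idea concrete, here is a minimal Python sketch of how a blob gets its name. It assumes the default SHA-1 object format, and the file names are made up for illustration. The id is computed from a small header plus the file content; the file name plays no part, which is why identical content under two names ends up as one and the same object.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 id of a blob: sha1('blob <size>' + NUL + content)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Hypothetical working tree: two names share the same content.
files = {
    "README": b"hello world\n",
    "docs/README.copy": b"hello world\n",
    "NEWS": b"something else\n",
}

ids = {name: git_blob_id(data) for name, data in files.items()}
for name, oid in ids.items():
    print(oid, name)

# The name never enters the hash, so a "rename" leaves the blob id unchanged.
print(ids["README"] == ids["docs/README.copy"])
```

If you have Git installed, the ids printed here should match what `git hash-object` reports for files with the same content in a SHA-1 repository.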
This is a quick introduction to Git internals, complete with diagrams and a cogent explanation. It even manages to explain rebasing, and why people want to do that. There is an LWN article that covers similar ground, without the pretty pictures, and less succinctly.
As to why Git does not track renames, or why that is a good thing, there is this email from Linus on the Git list that gives a great explanation. Indeed, tracking files is fundamentally broken for an SCM, as he puts it. There is another email which covers renames as well, but the points in this one are less compelling. Also this. And, then, this. Here is another example of how Git can follow content around.
A link between two git repositories has four parameters:
So far, this has all been a longish introduction, and nothing that one can sink one's teeth into. If you are like me, and have been following along, you are now itching to get started.
There is a wonderful introduction to Git created by Carl Worth, who ported Chapter 2: Tour of Mercurial from the Mercurial book, since he found it to be quite well written. The result is A very easy introduction to git.
There is an official tutorial, in Part I, covering the basics (porcelain), and then comes the more advanced Part II, which gets into the plumbing.
Then there is the Debian home grown Guide to using Git on Alioth. (While you are there, check out the Alioth Packaging Project.)
Then there is another page very similar to this one, based on someone else's learning experience. He has some other pages as well, worth reading.
While those were designed for newcomers, gently leading them into the intricacies of Git, the following are going to be useful in the long haul. Firstly, there is the Git User's Manual, always handy. Then there is the Git FAQ. And there is the Kernel Hackers' Guide to Git.
One of the things I am interested in is hierarchical stitching together of repositories into a build tree. For Git, this is called Superproject/Submodules, and was detailed in an LWN article by Linus. If you are interested in submodules, then this is the place to start. There is a tutorial on the subject.
Sam Vilain's introduction to git-svn. Yes, this is a duplicate. But this is still a great HOWTO for people coming over from subversion. This is another one. And not just for subversion folks either.
Here is the Git maintainer's documentation for how to manage a distributed project, and how to set policies for branches to enhance collaboration.
The XMMS2 howto on using Git has information on how to handle merging from upstream when they have cherry picked from your branch. This is fairly advanced.
There is a HOWTO for packaging software for Debian. I really like this one, based as it is on a work-flow similar to mine. Then there is a case history of converting a Debian package into git.
Well, this is not a HOWTO, but it explains 'inexplicable failure to merge recursively across cherry-picks'. This is related to the Debian HOWTOs above.
These are little one-off tips on how to do things that one only occasionally encounters.
If you are still here, you now have a fair understanding of how Git works. This section is a retrospective on the birth of Git – the days of BitKeeper and tridge and free readers for a proprietary version control system. Here is some historical background. And some additional detail.
Date: <2008-04-01 Tue>