This is a collection of random infrastructure notes based on the work I'm doing at any given time. Most of the technical notes here assume an infrastructure similar to the one I'm working on (which I will not describe in detail, and which is subject to change). I can't be responsible if you do something that's documented here and bad things happen.

Thursday, May 29, 2008

SVN and Yum Repositories

We use "yum" in our environment, not just for OS updates, but for managing third-party software and our own production software. It's a great way to roll releases out to a large number of servers while ensuring that every server is running the same code. The ideal is that, if it isn't in one of our yum repositories, it's not going on one of our servers. The reality is considerably more nuanced, of course, but it's still a good idea.

So here's the challenge: If we're, say, mirroring Dag Wieers's excellent repository of third-party apps and we refresh the mirror to get updated packages, new server builds will be out of sync with production until all of the existing systems have been updated. If we get partway through that update process and find a problem that requires a roll-back, it becomes difficult to unwind to the previous state. Enter Subversion.

If you can make it through a whole post on this blog without falling asleep, chances are you already know what Subversion is. What we're doing is combining it with Apache and mod_dav_svn to allow our hosts to update directly from the Subversion repository. We have a "production" and a "qa" branch for each repo, and we just point yum on the servers to the appropriate branch. Since yum uses HTTP for its transport, this requires no trickery at all with yum. It's simple, elegant, and manageable, and it saves a ton of disk space over manually managing multiple versions of a repo. Subversion only stores one copy of a file if it's identical across branches, and most repository updates will consist of just adding some RPM files and changing the metadata file.
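
To make the "point yum at a branch" idea concrete, here is a minimal sketch in Python that builds a .repo stanza for one branch. The host name, the URL layout, and the repo_stanza helper are all hypothetical and only there for illustration; the actual layout isn't described in this post.

```python
# Sketch only: the base URL and path layout below are assumptions,
# not the real environment described in the post.

def repo_stanza(repo, branch, base="http://svn.example.com/yum"):
    """Return the text of a yum .repo stanza for one branch of a repo."""
    if branch not in ("production", "qa"):
        raise ValueError("unknown branch: %r" % branch)
    repo_id = "%s-%s" % (repo, branch)
    lines = [
        "[%s]" % repo_id,                      # repo id used by yum
        "name=%s (%s branch)" % (repo, branch),
        "baseurl=%s/%s/%s/" % (base, repo, branch),
        "enabled=1",
        "gpgcheck=1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A QA host would get the "qa" branch; production hosts the other one.
    print(repo_stanza("dag", "qa"))
```

Under these assumptions, moving a host from QA to production is just a matter of changing the branch name in its baseurl.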

Update: Well, that didn't work. It turns out that yum uses an HTTP/1.1 byte-range request to get the headers out of RPM files for dependency checking. Unfortunately, mod_dav_svn doesn't seem to support this type of request, so it's back to the drawing board.
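
For anyone who wants to check a server before going down this road, a rough probe might look like the following Python sketch. It asks for the first 100 bytes of a file and looks for a 206 Partial Content reply with a Content-Range header; a server that ignores the Range header will typically answer 200 with the whole file. The function names and the URL in the usage comment are hypothetical.

```python
import urllib.request


def honors_range(status, content_range):
    """True if a response looks like a proper partial-content reply."""
    return (
        status == 206
        and content_range is not None
        and content_range.startswith("bytes ")
    )


def probe(url, first=0, last=99):
    """Send a byte-range request to url and report whether it was honored."""
    req = urllib.request.Request(
        url, headers={"Range": "bytes=%d-%d" % (first, last)}
    )
    with urllib.request.urlopen(req) as resp:
        return honors_range(resp.status, resp.headers.get("Content-Range"))


# Example use (hypothetical URL):
#   probe("http://svn.example.com/yum/dag/qa/some-package.rpm")
```

This only tests one file and one range, so treat a True result as a hint rather than proof that yum will be happy.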


pierats said...

Mike, good post. Have you looked at the stuff rPath is doing? It is a similar concept: combining SCM functionality with distro creation/maintenance.

Mike Merideth said...

I don't know much about rPath, but it seems like a great approach for a virtualized datacenter if you have some money to spend on licenses. I haven't seen a great open-source solution yet, though I think there are a few out there.

Dreams said...

Mike, funny, I thought about the same thing but didn't want to pollute svn with binaries.

Instead, what I'm doing now is defining additional yum channels:
# holds approved patches/updates for OEL4/5 chosen from ULN channels
name=My Security Rollup 1 - OEL$releasever - $basearch

At a certain point in time, I copy the chosen RPMs into the channel and start testing. Once testing is complete, I enable the channel or force an update using --enablerepo=my_sr1

I just have to experiment with hard/soft copies or symlinks so I don't duplicate whole channels and waste GBs of storage. ;)
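
One way the hard-link idea could be sketched in Python: link each RPM from a source directory into the channel directory so that both names share the same data on disk. The link_rpms helper and the directory names in the usage comment are hypothetical, and hard links only work when both directories are on the same filesystem. The channel's metadata would still need to be regenerated afterwards (for example with createrepo).

```python
import os


def link_rpms(src_dir, channel_dir):
    """Hard-link every .rpm in src_dir into channel_dir.

    Returns the list of file names that were newly linked.
    Files already present in channel_dir are left alone.
    """
    os.makedirs(channel_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(src_dir)):
        if not name.endswith(".rpm"):
            continue  # skip metadata and anything else that isn't a package
        dest = os.path.join(channel_dir, name)
        if not os.path.exists(dest):
            os.link(os.path.join(src_dir, name), dest)  # no data is copied
            linked.append(name)
    return linked


# Example use (hypothetical paths):
#   link_rpms("/srv/mirror/uln/el5", "/srv/yum/my_sr1")
```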