Buildbot Scheduler and Builder graphing

One of the most important systems I work on is the release automation for Firefox and Thunderbird. The process behind the automation long predates me, but I’ve been deeply involved in automating, refining, and optimizing it. It shouldn’t come as any surprise that one of the biggest challenges of working on such a complex system is understanding how the smaller pieces fit together to make the whole. For the release automation we have an advantage though: the smaller pieces are generally Buildbot Builders, and the things that fit them together are generally Buildbot Schedulers. A while ago I was improving parallelism for l10n repacks and found it extremely difficult to reason about whether or not my changes would actually create the desired Builders and string them together correctly. I threw together some (terrible) code that spat out a digraph of the release automation’s Builders and Schedulers. By comparing the before and after graphs I was able to iterate on some parts of my code without spending hours and hours testing.
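To give a rough idea of what “spitting out a digraph” and comparing two of them can look like, here is a minimal sketch. It is not the code described above, and it doesn’t read a real Buildbot config: the dict layout (scheduler name mapped to a pair of upstream Builder names and triggered Builder names), the function names, and the Builder names in the usage note are all made up for illustration.

```python
def edges(schedulers):
    """Return the set of (source, destination) edges in the graph.

    `schedulers` is a hypothetical structure: it maps a scheduler name to
    a tuple of (upstream builder names, builder names it triggers).
    """
    result = set()
    for name, (upstream, builders) in schedulers.items():
        for builder in upstream:
            result.add((builder, name))   # upstream builder feeds the scheduler
        for builder in builders:
            result.add((name, builder))   # scheduler triggers the builder
    return result


def to_dot(schedulers):
    """Render the graph as Graphviz DOT text, with schedulers drawn as boxes."""
    lines = ["digraph schedulers {"]
    for name in sorted(schedulers):
        lines.append('    "%s" [shape=box];' % name)
    for src, dst in sorted(edges(schedulers)):
        lines.append('    "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)
```

With two such dicts, one built before a change and one after, `edges(after) - edges(before)` lists the edges the change introduced, and `to_dot(...)` gives text that Graphviz’s `dot` can render for a visual comparison.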

This week I finally got around to tidying up and packaging this code as a more general purpose tool. It’s not nearly complete and has many rough edges, but as a very basic tool to help you understand non-trivial Buildbot installations, I think it’s wonderful. It’s pip installable (“buildbot-scheduler-graph”) and available on GitHub. Once you’ve got it, try it out with “buildbot-scheduler-graph /path/to/your/master.cfg /path/to/output-dir”. Here’s what Mozilla’s scheduler graphs look like. What do yours look like?

Loading Python modules from arbitrary files

tl;dr: Use imp.load_source.

I’ve been hacking on a tool on and off that needs to load Python code from badly named files (e.g., “master.cfg”). To my surprise, there wasn’t an obvious way to do this. My “go to” method for this is execfile. For example, this will load the contents of master.cfg into “m”, with each top level object as a key:

m = {}
execfile("master.cfg", m)

This works well enough for simple cases, but what happens when you try to load a module that loads other modules? It turns out that execfile has a nasty limitation: modules that aren’t in sys.path must be in the same directory as the file that calls execfile. You can’t even chdir your way around this; you have to copy the files you need to the caller’s directory. (We actually have some production code that does this.)

Someone in #python on Freenode suggested using importlib. That seemed like a fine idea, especially after recently watching Brett Cannon’s “How Import Works” talk. Unfortunately, Python 2.7’s importlib only has a single function, import_module, which can only load a module by name.

Eventually I came across a Stack Overflow post that pointed me at imp.load_source. This function is similar to execfile in that it loads Python code from a named file. However, it properly handles imports without the need to copy files around. It also has the nice added bonus of returning a module rather than throwing objects into a dict. I ended up with code like this, to load the contents of “foo/bar/master.cfg”:

>>> import imp, os, sys
>>> os.chdir("foo/bar")
>>> sys.path.insert(0, "") # Needed to ensure that the current directory is looked at when importing
>>> m = imp.load_source("buildbot.master.cfg", "master.cfg")

Problem solved!
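A note for anyone on Python 3: imp is deprecated there and was removed in Python 3.12, so the snippet above won’t run on current interpreters. Here is a sketch of the same idea built on importlib instead. This is a swapped-in technique, not what I actually used; the `load_source` wrapper and the “master_cfg” module name are just names picked for this example. The loader has to be passed explicitly because “.cfg” is not a suffix the import machinery recognizes.

```python
import importlib.machinery
import importlib.util
import os
import sys


def load_source(module_name, path):
    """Load Python code from an arbitrarily named file and return a module.

    A hypothetical stand-in for imp.load_source on Python 3.
    """
    path = os.path.abspath(path)
    # Explicit loader: a ".cfg" file would otherwise not be treated as source.
    loader = importlib.machinery.SourceFileLoader(module_name, path)
    spec = importlib.util.spec_from_file_location(module_name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    # Put the file's directory on sys.path so that modules sitting next to
    # it can be imported, much like the sys.path.insert(0, "") above.
    sys.path.insert(0, os.path.dirname(path))
    # Register the module before executing it, as a normal import would.
    sys.modules[module_name] = module
    loader.exec_module(module)
    return module
```

Usage would be along the lines of `m = load_source("master_cfg", "foo/bar/master.cfg")`, with no chdir needed because the path is made absolute first. Like the original snippet, this leaves the extra sys.path entry in place afterwards.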