As many of you know, Tinderbox has become unresponsive numerous times lately, sometimes to the point of going down completely for a period of time. This post summarizes the issues and what's being done about them.
The biggest issue is load (surprise!). Over a few years we've gone from a few active trees with tens of columns between them to tens of active trees with hundreds of columns between them. Unsurprisingly, this has made the Tinderbox server a lot busier. The biggest contributors to load are:
- showlog.cgi - Shows a log file for a specific build
- showbuilds.cgi - Shows the main page for a tree
- processbuilds.pl - Processes incoming "build complete" mail
A bit of profiling has also been done in bug 585814 to try to find specific hotspots.
We've already done a few things to help with Tinderbox load:
Other ways we're looking at improving the situation:
- bug 585691 - Split up Tinderbox data processing from display. This wouldn't reduce overall load, but it should isolate the processing load well enough to keep the Tinderbox display up.
- bug 390341 - Pregenerate brief and full logs. This would eliminate the need for showlog.cgi to uncompress logs in most cases.
- bug 530318 - Put full logs on FTP server; stop serving them from Tinderbox.
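To make the pregeneration idea from bug 390341 concrete, here is a minimal sketch in Python. It is not Tinderbox's actual code: the function name `pregenerate_logs`, the output file naming, and the rule that a "brief" log is simply every line containing the word "error" are all assumptions made for illustration. The point is only that the decompression cost is paid once, when a build log arrives, rather than on every page view.

```python
import gzip
from pathlib import Path


def pregenerate_logs(compressed_log: Path, out_dir: Path, marker: str = "error"):
    """Write plain-text full and brief copies of a gzip-compressed build log.

    Hypothetical helper: the ".full.txt"/".brief.txt" names and the
    "lines containing `marker`" definition of a brief log are assumptions.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = compressed_log.stem  # "build.log.gz" -> "build.log"
    full_path = out_dir / f"{stem}.full.txt"
    brief_path = out_dir / f"{stem}.brief.txt"

    # Decompress once, streaming line by line so large logs stay out of memory.
    with gzip.open(compressed_log, "rt", encoding="utf-8", errors="replace") as src, \
            full_path.open("w", encoding="utf-8") as full, \
            brief_path.open("w", encoding="utf-8") as brief:
        for line in src:
            full.write(line)
            # Assumed brief-log rule: keep only lines mentioning the marker.
            if marker in line.lower():
                brief.write(line)

    return full_path, brief_path
```

If something like this ran at processing time, the display side could serve the two resulting files as static text and would only need to fall back to on-demand decompression for logs that had not been pregenerated yet.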