The past month has seen some significant and exciting improvements to Balrog. We've had the usual flow of feature work, but also a lot of improvements to the infrastructure and Docker image structure. Let's have a look at all the great work that's been done!
Most recently, two big changes have landed that allow us to use multifile updates for SystemAddons. This type of update configuration let us simplify configuration of Rules and Releases, which is one of the main design goals of Balrog. Part of this project involved implementing "fallback" Releases - which are used if an incoming request fails a throttle dice roll. This benefits Firefox updates as as well, because it will allow us to continue serving updates to older users while we're in the middle of a throttled rollout.
Some house cleaning work has been done to remove attributes that Firefox no longer supports, and add support for the new "backgroundInterval" attribute. The latter will give us server-side control of how fast Firefox downloads updates, which we hope will help speed up uptake of new Releases.
There's been some significant refactoring of our Domain Whitelist system. This doesn't change the way it works at all, but cleaning up the implementation has paved the way for some future enhancements.
E-mails are now sent to a mailing list for some types of changes to Balrog's database. This serves as an alert system that ensures unexpected changes don't go unnoticed. In a similar vein, we also started aggregating exceptions to CloudOps' Sentry instance, which has already covered numerous production-only errors that have gone unnoticed for months.
Significant improvements have been made to the way our Docker images are structured. Instead of sharing one single Dockerfile for production and local dev, we've split them out. This has allowed the production image to get a lot smaller (mostly thanks to Benson's changes). On the dev side, it has let us improve the local development workflow - all code (including frontend) is now automatically rebuilt and reloaded when changed on the host machine. And thanks to Stefan, we even support development on Windows now!
We now have a script that extracts what the "active data" from the production database. When imported into a fresh database (ie: local database, or stage), it will serve exactly the same updates as production without all of the unnecessary history. This should make it much easier to reproduce issues locally, and verify that stage is functionally correctly.
Finally, something that's been on my wishlist for a long time finally happened as well: the Balrog Admin API Client code has been moved into the Balrog repo! Because it is so closely linked with the server side API, integrating them makes it much easier to keep them in sync.
The work above was only possible because of all the great contributors to Balrog. A big thanks goes to Ninad, Johan, Varun, Stefan, Simon, Meet, Benson, and Njira for all their time and hard work on Balrog!