This week in Mozilla RelEng – June 13th, 2014 - *double edition*
I spaced and forgot to post this last week, so here's a double edition covering everything so far this month. I'll also be away for the next 3 Fridays, and Kim volunteered to take the reins in my stead. Now, on with it!

Major highlights:
- We had a busy couple of weeks of important but relatively boring work. We shipped 5 releases (2 betas, Firefox 30.0 & 24.6.0esr, Thunderbird 24.6.0), did a series of uplifts, and continued moving machines from scl1 to scl3.
- Catlee did some investigation into our CDN bandwidth and found that on average, 20% of users who have a partial update available still end up downloading the complete update.
- I finished switching B2G updates to aus4.mozilla.org. The updates should now be more reliable, and we'll be able to kill update.boot2gecko.org in the near future. Because we're abiding by standard update channel names now, this also means that the Socorro folks are able to process crashes for most B2G builds.
- Mike's patch that allows RelEng builds to be built with mach is ready to land. This is a step along the path to making automation steps reproducible by developers and to enabling parallelization of post-build steps.
- Ed had us disable coalescing for changes that land within a few minutes of each other. During tree closures for bustage, this should help get the tree open faster because fewer builds will be needed for bisection.
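
To illustrate why turning off coalescing cuts down on bisection, here is a minimal sketch. It is not Buildbot's actual scheduling code: the `Push` type, `schedule_builds`, `suspects`, and the 180-second window are all hypothetical names and values chosen for the example. The idea is that when several pushes share one build and that build breaks, every push in the group is a suspect; with one build per push, the first red build points at a single change.

```python
from dataclasses import dataclass


@dataclass
class Push:
    rev: str   # hypothetical short revision label
    time: int  # seconds since an arbitrary start point


def schedule_builds(pushes, merge_window=0):
    """Group pushes into builds (illustrative only).

    A push joins the previous build when it lands within `merge_window`
    seconds of the push before it. merge_window=0 means no coalescing:
    every push gets its own build.
    """
    groups = []
    for push in sorted(pushes, key=lambda p: p.time):
        previous = groups[-1][-1] if groups else None
        if previous and merge_window > 0 and push.time - previous.time <= merge_window:
            groups[-1].append(push)
        else:
            groups.append([push])
    return groups


def suspects(groups, first_bad_build):
    """Revisions that could have caused the first failing build."""
    return [p.rev for p in groups[first_bad_build]]


pushes = [Push("a", 0), Push("b", 60), Push("c", 120), Push("d", 600)]

coalesced = schedule_builds(pushes, merge_window=180)  # assumed 3-minute window
separate = schedule_builds(pushes)                     # coalescing disabled

print(suspects(coalesced, 0))  # ['a', 'b', 'c'] -> extra builds needed to bisect
print(suspects(separate, 1))   # ['b'] -> the culprit is known immediately
```

The trade-off is more builds up front in exchange for not having to trigger extra builds on intermediate revisions while the tree is closed.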
- Balrog: Backend
- Support new attributes for update.xml in Balrog
- add support for "isOSUpdate" attribute
- need a way to delete releases in balrog
- Buildduty
- Large set of t-snow-r4 slaves is disabled (broken in slave-health)
- Pending queue for tegras > 1000 and time between jobs per tegra is > 6 hours
- All Trees Closed -> building backlog of linux jobs because of issues with dynamic jacuzzi allocation
- Trees closed due to "Error: Cannot retrieve repository metadata (repomd.xml) for repository: centos6. Please verify its path and try again"
- Deploy trychooser to production from tools repo tip to pick up bug 1012689
- put tegras that were on loan back onto a foopy and into production
- General Automation
- Run Android 2.3 tests against armv6 builds, on Ash only
- Bumper Bot doesn't seem to be always seem to be in sync when updating sources.xml and gaia.json for Gaia changes on Aurora
- Tooltool upload request for GCC with plugin headers to use for B2G hazard analysis
- bm84 spamming about Unauthorized Logins
- kill b2g26_v1_2 on june 9 merge day
- Tooltool upload request for version of sixgill compatible with b2g gcc
- Schedule Mn on opt Linux/Linux64 on trunk trees
- B2G nonunified builds are running across all release branches
- set-up initial balrog rules for b2g updates
- Add tooltool support to Windows builds
- Triggering arbitrary jobs gets branch wrong
- [Flame] Get seccomp enabled on jellybean based flame builds
- 06/10/2014 No Tarako 1.3t build available to smoketest
- no new hamachi/helix/nexus-4 updates since may 30th
- servo builder git clean should clean ignored files too
- rm_old_symbols step is failing for win64-ff64 nightlies
- Disable buildbot-master69 (bm69) until needed
- Need Tarako 1.3t FOTA updates for testing purposes
- switch b2g builds to use aus4.mozilla.org as their update server
- nextSlave should take into account retries and spot instances
- Add 'uz' to the Firefox build
- [Dolphin] Create Dolphin builds for 1.4
- Run Flame builds per-push instead of periodically
- B2G non-unified builds are falling way behind
- Handle reverting fake branch values in build-{running, pending}.js and builds-4hr.js.gz
- Remove the need to create Puppet changes for BuildSlaves*.py.erb, production_config.py and production-master.json
- Mn tests on Windows 8 fail with: WindowsError: [Error 740] The requested operation requires elevation
- Intermittent command timed out: 3600 seconds without output, attempting to kill fetching b2g bits from gitmo
- Tracking bug for 09-jun-2014 migration work
- B2G and Android builds failing with yum errors
- [mozharness] Move structured logging support code into mozharness proper
- Disable MOZ_AUTOMATION for Hf builds
- include device in fota mar filenames
- Clean up b2g names in our configs
- Give Jetpack tests a shorter maxTime than 2 hours
- Upload the list of all functions from hazard analysis
- Schedule (mostly) B2G tests on Mozilla-B2g28-v1.3t
- lock reporepo to a tag - b2g builds failing with AttributeError: 'list' object has no attribute 'values' | caught OS error 2: No such file or directory while running ['./gonk-misc/add-revision.py', '-o', 'sources.xml', '--force', '.repo/manifest.xml']
- Add 'dsb' to the Firefox build
- emulator-kk builds crash make half the time (which it describes as "failed to build")
- [mozharness] Allow web-platform-tests to be split by test type and into chunks.
- Add support for webapprt-test-chrome test jobs & enable them per push on Cedar
- Loan Requests
- Request for OS X 10.6 test machine for jchen
- Loan :kmoir talos-linux64-ix-005
- Please loan shu OS X 10.6 test runner
- Requesting a loaner machine bld-linux64-ec2 to diagnose bug 887761
- Slave loan request for a OSX 10.6 (snow leopard) machine to glandium
- Other
- Add b2g-inbound, fx-team, and mozilla-central to regression archive
- Switch in house try builds from ceph to reverse-proxied S3
- Stop merging builds for pushes within 3 minutes of each other
- sign Thunderbird hot-fix testing addon
- Platform Support
- New spot test images don't have v4l2loopback configured
- Most Win64 Windows 8 Debug tests are failing -- mozjs.dll issue?
- cancelled 2.3 mochitest jobs put ix slaves into weird state (and so need rebooting)
- scl1 Move Train A releng config Work
- Intermittent Android x86 We have not been able to establish a telnet connection with the emulator
- add r3.xlarge to our bids
- Create a Windows-native alternate to msys rm.exe to avoid common problems deleting files
- disable selected tests on tegras
- release-automation
- release l10n repacks failed due to failed "rm"
- Stop automatically pushing ESR deliverables to mirrors
- B2G device image builds failing with "error: packfile <...> does not match index" (followed by: "Output exceeded 52428800 bytes, remaining output has been truncated (output was 52467067 bytes)")
- No 'ready for releasetest testing' email for Firefox 30.0 build2
- Releases
- Show /whatsnew tour URL to users updating to 30.0 and 31.0 (from all past versions of Firefox)
- Disable Aurora 30 daily updates until merge to mozilla-aurora has stabilized
- Remove sw during beta-release migration for Firefox 30
- Tracking bug for 28-april-2014 migration work
- Disable Aurora daily updates until merge to mozilla-aurora has stabilized
- Remove sw during beta-release migration for Firefox 29
- Releases: Custom Builds
- Modify AOL Repack Configuration
- Yandex partner repack changes for Fx 30 release
- Version bumps for Yahoo FF 30
- Repos and Hooks
- Fix missing trailing slashes for repositories in push_printurls.py
- Please move https://github.com/eamsen/node-gonzales to the Mozilla github org
- Disable try_gcc45 hook
- Add an exception to the WebIDL hook for code uplifts
- Request for a new repository in /gaia-l10n: son
- Request to mirror darwinstreamingserver for FFOS emulator builds
- [Flame] Adding qcom prima wlan git mirror
- Need branches of mozilla-b2g/codeaurora_kernel_msm for mako/hammerhead
- [RTSP] Request for a new repository: darwinstreamingserver
- external/sprd-aosp/platform/system/core not accepting a non-fast-forward change
- New git repositories to mirror for Flame
- Request for a new repository in /gaia-l10n: mai
- Tools
- slave_health needs to be updated to manage b-linux64-hp boxes for "build" slaveclass
- b2g tagging script
- Panda tests retrying more than necessary
- Update trychooser for Android 2.3
- implement "disable" action in slaveapi
- Trychooser should not select opt/debug by default and leave the user to choose
- Please add treeherder to allowed origins response headers for BuildAPI self-serve
- Link to TBPL in trychooser job result emails
- https://secure.pub.build.mozilla.org/builddata/reports/slave_health/buildduty_report.html is 404
- Balrog release submissions should adjust productName as needed
- kill b2g18 + b2g18_v1_1_0_hd
- Balrog: Backend
- update balrog blob schema to support multiple partials
- support comparison operators for matching version & buildID in rules
- Balrog: Frontend
- Buildduty
- Stale slaverebooter lockfile
- Investigate Windows 8 machines that are still out of action
- [tracking] Eliminate buildduty
- General Automation
- Make blobber uploads discoverable
- [mozharness] Make web-platform-tests output match TEST-UNEXPECTED-.* regexp for test_end messages.
- Figure out the correct path setup and mozconfigs for automation-driven MSVC2013 builds
- disable uploading to update.boot2gecko.org for mozilla-central/mozilla-aurora/1.4
- timestamps for build directories are off by timezone offset (7 hours)
- Partial update generation service
- AWS region-local caches for https stuff
- Remove config for ubuntu64_hw-b2g-dt platform
- [Meta] Some "Android 4.0 debug" tests fail
- [tracker] run Android 2.3 test jobs on EC2
- Schedule all Android 2.3 armv6 tests, except mochitest-gl, on all trunk trees and make them ride the trains
- Move emulator gaia-ui-tests on cedar from AWS to IX slaves
- Add the build step or else process name to buildbot's generic command timed out failure strings
- Make b2g_emulator_unittest.py easier to run outside of automation
- [Dolphin] Need a way to build Dolphin builds for 1.4
- Tooltool doesn't work on (at least) windows c-c
- Monitor aws_stop_idle.py hungs
- Please add non-unified builds to mozilla-central
- Don't generate nightly builds on a tree if no new changes have landed since the previous nightly
- Remove spot instances from inventory
- revamp b2g upload configs
- Race condition between builders that push updates to in-tree files
- Add hazards builds to ash branch
- FlatFish: Integrate boot.img and recovery.img into the build system
- Schedule Mnw on cedar on emulator-jb and emulator-kk
- Don't require puppet or DNS to launch new instances
- Start doing mulet builds
- reduce EBS writes by removing journal, tweaking writeout
- create in-tree CA pinning preload list
- Split web-platform-tests into two testsuites by type
- Loan Requests
- Slave loan request for a talos-r4-snow machine
- Loan an ami-6a395a5a instance to Aaron Klotz
- Slave loan request for a t-mavericks-r5 machine
- Please loan Dan Glastonbury an OSX 10.9 test slave
- Other
- Platform Support
- Deploy new clang when available to fix ASan: Intermittent crashes [@ NS_IsMainThread] with heap-buffer-overflow
- evaluate mac cloud options
- slaves should always have tools checked out and up to date
- slave pre-flight tasks
- Deploy hg.m.o/build/buildbot production-0.8 to buildslaves to pick up bug 961075
- address high pending count in in-house Linux64 test pool
- scl1 Move Train C releng config Work
- release-automation
- Update channels for single local Beta and Release builds of Firefox for Android 30 (and beyond)
- Figure out how to offer release build to beta users
- Releases
- tracking bug for build and release of Firefox and Fennec 30.0
- tracking bug for build and release of Firefox 24.6.0 ESR
- tracking bug for build and release of Thunderbird 24.6.0
- Releases: Custom Builds
- Repos and Hooks
- Tools
- cut over gecko.git to the new vcs-sync system
- vcs-sync needs to populate mapper db once it's live
- port b2g branching script to mozharness, with revision locking
- Figure out tools versioning for partial generation
- tegra/panda health checks (verify.py) should not swallow exceptions
- Deploy relengapi 0.2.1 and mapper 0.2.1 into production https://api.pub.build.mozilla.org/
- end_to_end_reconfig.sh should store logs from manage_foopies.py
- cut over l10n repos to the new vcs-sync system
- Move mqext into the new https://hg.mozilla.org/hgcustom/version-control-tools repo
- Possible bug in end_to_end_reconfig.sh when using -p option?
- implement 'aws_create_instance' action in slaveapi
- slaveapi disable - comment in bug while disabling is in progress and add a reason dep bug option
- buildfarm/maintenance/manage_foopies.py not executable
- Make mozharness use structured logging for marionette tests
- db-based mapper on web cluster
- implement 'enable' in slaveapi.
- Create a Comprehensive Slave Loan tool
- update_maintenance_wiki.sh is truncating text content
- AWS Sanity Check lies about how long an instance was shut down for...