Imagine this for a second.  You work on fixing something, get the required reviews, and run your patch on the try server.  Everything looks good.  Then you set a flag or something on the bug, and go home, and enjoy your evening.  The next morning, you're reading your bugmail while enjoying your coffee, and you see a message from landingbot@mozilla.org saying that the patch has successfully landed on mozilla-central, you smile, and wonder how people used to spend 4 hours watching the tree (and possibly getting yelled at by me or philor in the meantime) when they wanted to land something on mozilla-central.  This, my friend, is what we need to move towards, I think.

Now, I'm not entirely delusional here.  We have a very large number of tests covering all aspects of our code, including correctness and performance.  We usually use these tests to judge the quality of a patch, and their results can just as easily be used by a machine to make that judgement.  So, we can put these tests to use for automating this process.  Here is roughly what I have in mind:

  1. We would have a bot which constantly watches Bugzilla for automated landing requests.  Once such a request is found, it gets added to a queue.
  2. To land each change, the bot looks at the head of the queue and imports the patch (or hg bundle, in the case of bugs with multiple patches) into a clone of mozilla-central.  If the import is not successful (for example, because the patch has bit-rotted), the bot aborts and reports the problem on the bug.  Otherwise, the bot pushes the changes to the try server.
  3. The bot would watch the try server for the results of the push.  If the push has more than one orange job, the bot aborts the landing process and reports the problem on the bug.  If the push has exactly one orange job, the bot retriggers that job, notes on the bug that it may have hit an intermittent orange, and goes back to watching the try push.  If the push is all green, the bot transplants the change onto a fresh clone of mozilla-central (and aborts if the patch has bit-rotted since step 2) and pushes it to mozilla-central.
  4. The bot would watch the mozilla-central push.  For any orange job, the bot retriggers it.  If the second run of the job goes green, the bot reports the orange as intermittent on the bug.  Otherwise, the bot backs out the change.  When the push gets one green run for every job, the bot reports success on the bug and marks the bug as RESOLVED FIXED.
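The orange-handling rule in step 3 is the heart of the bot, so here is a minimal sketch of that decision logic in Python.  Everything here is hypothetical (the function name, the `"green"`/`"orange"` result strings, the shape of the results dict); it just encodes the rules above, not any real automation API.

```python
def judge_try_push(job_results):
    """Decide what to do with a try push, per step 3 above.

    job_results maps job name -> "green" or "orange".
    Returns "abort" (more than one orange), "retrigger" (exactly one
    orange, possibly intermittent), or "land" (all green).
    """
    oranges = [job for job, result in job_results.items()
               if result == "orange"]
    if len(oranges) > 1:
        return "abort"        # report the failures on the bug and stop
    if len(oranges) == 1:
        return "retrigger"    # maybe an intermittent orange; run it again
    return "land"             # all green: transplant and push to m-c

print(judge_try_push({"mochitest": "green", "reftest": "green"}))   # land
print(judge_try_push({"mochitest": "orange", "reftest": "green"}))  # retrigger
```

A real bot would loop: after a retrigger it goes back to watching the push, and only an all-green set of results moves the change on to mozilla-central.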

“Yeah, right!”, you might say, “like there's ever a push which doesn't see any intermittent orange!”  But if you've been watching the tree closely during the past week or so, you would have noticed that there are a lot of pushes which do not see any intermittent oranges (I'm not talking about oranges, or worse, reds, caused by people landing untested stuff; those will be reliably caught by our good robot).  These pushes are still not the majority of pushes, but we're getting there.  Slowly, but surely.  Take a look at this image, which can be found here.

Orange Factor going down

The situation is not improving on its own.  It's improving because of all of the wonderful developers who are working on fixing intermittent orange bugs in the area of their expertise (and some brilliant people who even go one step further and fix oranges in areas of the code unfamiliar to them)!  You can help too.  But more on that in a future post.

Once we reach an average of 1 intermittent orange per push, we could make such a plan work for real.  I don't know about you, but this makes me really excited.  I think we all have better stuff to do than watching the tree for hours after we land something.