Blog Archives

Blog entries related to the Mozilla project

Porting an OpenGL application to the web

Emscripten is a tool which compiles C/C++ applications to Javascript, so that they can then run inside a web page in a browser.  I have started working on adding an OpenGL translation layer based on WebGL.  The goal of this project is to make it possible to compile OpenGL C/C++ applications into Javascript which uses WebGL to draw the 3D scenes.
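
To give a feel for what this layer needs to handle, here is a hypothetical OpenGL ES 2.0 fragment of the sort such an application might contain (made-up code, not actual es2gears source).  WebGL is itself modeled on OpenGL ES 2.0, so most of these calls have near one-to-one WebGL equivalents (glClearColor becomes gl.clearColor, glDrawArrays becomes gl.drawArrays, and so on), which is what makes such a translation layer feasible:

#include <GLES2/gl2.h>

// Hypothetical frame-drawing function; "program" and "vbo" are assumed to
// have been created during initialization.
void draw_frame(GLuint program, GLuint vbo) {
  glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

  glUseProgram(program);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
  glEnableVertexAttribArray(0);
  glDrawArrays(GL_TRIANGLES, 0, 3);  // one triangle, three vertices
}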

My first demo is a port of the es2gears application to the web.  es2gears is an OpenGL ES 2.0 port of the well-known glxgears application.  You can see the web port of es2gears in action if you’re using a WebGL-enabled browser (Firefox, Chrome or Opera Next).  For some extra fun, press the arrow keys as the gears are animating!

Screenshot of the es2gears application

This port has been automatically generated from this version of es2gears.  If you want to play with this locally, you can fork the emscripten repository.

A note about the demo: this is not supposed to be a performance benchmark of any kind.  My GLUT implementation uses the requestAnimationFrame API where available, which means that your rendering speed should be capped at about 60FPS, the same cap you would get if you compiled es2gears directly into a native application.  But this application doesn’t push either the CPU or the GPU to its limits, so it is only useful as a proof of concept, and should not be used to compare the graphics/Javascript performance of browsers!
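
For the curious, this is roughly the shape of the GLUT main loop in such an application (a minimal sketch with placeholder draw and animate callbacks, not the actual es2gears source).  In a native build, glutMainLoop drives the callbacks itself; in the web port, my GLUT implementation schedules each frame through requestAnimationFrame instead:

#include <GL/glut.h>

// Placeholder callbacks standing in for the application's real ones.
static void draw() {
  // ... issue the GL calls for one frame ...
  glutSwapBuffers();
}

static void animate() {
  // ... advance the gear angles ...
  glutPostRedisplay();  // ask for another frame
}

int main(int argc, char** argv) {
  glutInit(&argc, argv);
  glutInitWindowSize(300, 300);
  glutCreateWindow("es2gears");
  glutDisplayFunc(draw);
  glutIdleFunc(animate);
  glutMainLoop();  // native: a tight loop; web: driven by requestAnimationFrame
  return 0;
}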

I’m very excited about this project, and this is only the beginning.  If you’re interested in this work, watch my blog for further posts about future demos!


Updating Firefox in the Background

The dialog below should look familiar. It displays while Firefox completes the update process after a new version is installed and the browser is restarted.

Firefox Update Dialog

In order to update itself, Firefox first downloads an update in the background. When the download is finished, Firefox stages the update in a directory, ready to be applied. The next time that Firefox is about to start up, it checks the staging directory. If an update ready to be applied is found, Firefox launches the updater program, which applies the update on top of the existing installation (showing that progress bar as it’s doing its job). When the update process is finished, the updater program restarts Firefox. All of this happens while you’re waiting for your browser to start up in order to do what you wanted to do. This is clearly less than ideal.

For the past few weeks, I have been working on a project to improve this process. The goal of my project is to minimize the amount of time it takes for Firefox to launch after downloading an update. The technical details of how I’m fixing this problem can be found in this document. Here’s a short version of how the fix works. When Firefox finishes downloading an update, it launches the updater application in the background without displaying any UI, and applies the update in a new directory that is completely separate from the existing installation directory. Instead of staging just the update, an entire updated copy of Firefox is staged. The next time that Firefox starts up, the existing installation is swapped with the updated installation, which is ready to be used. In this scenario, you likely won’t notice that Firefox has applied an update, as no UI is shown.

Now, the reason that this approach fixes the problem is that swapping the directories, unlike the actual process of applying the update, is really fast. We are effectively moving the cost of applying the update to right after the update has been downloaded, while the browser is still running. This leaves only the really fast swap operation to be performed the next time that the browser starts up.
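
To make this concrete, here is a rough sketch of why the swap is cheap: it is essentially a pair of directory renames, whose cost does not depend on the size of the installation. This is purely illustrative (using C++17's std::filesystem for brevity); the function and path names are made up, and the real updater also has to worry about locked files, permissions and rollback:

#include <filesystem>
namespace fs = std::filesystem;

// Hypothetical sketch of the startup swap step.
void swap_installations(const fs::path& install_dir,
                        const fs::path& updated_dir) {
  fs::path old_dir = install_dir;
  old_dir += ".old";
  fs::rename(install_dir, old_dir);      // move the current installation aside
  fs::rename(updated_dir, install_dir);  // move the staged installation into place
  fs::remove_all(old_dir);               // clean up the old installation
}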

I have some experimental builds with this feature ready on a temporary channel called Ash. The implementation is now at a stage where it can benefit from testing by the community. You can download the latest builds here. I will trigger a few nightly builds on this branch every day so that you will get updates if you’re running an Ash build.

In order to help with testing this new update process, all you need to do is download the latest build from Ash, wait a few hours until a new nightly build becomes available, and then update to that build. Updating can be triggered manually by opening the About dialog, or by the background update checker if you leave the build running for a few hours. If everything works correctly, when you restart Firefox, you should get the new build without seeing any progress bar as Firefox starts up. In order to verify that you have indeed been updated to a new build, go to about:buildconfig before the update and copy its contents, then compare them with the contents of about:buildconfig after Firefox restarts with the update applied.

It would be extremely useful if you could test this with different types of security and anti-virus software running. If you observe any problems or warnings, or if you see that the update did not change the contents of about:buildconfig, please let me know so that I can try to fix those problems.

For people who are curious to see the code, I’m doing my development on this branch, and I’m regularly posting patches on bug 307181.

Please note that this is still in the testing stage, and at this point, we’re not quite sure which version of Firefox this will land in (we’re working to land it as soon as is safely possible). No matter which version of Firefox includes this feature for the first time, we believe that this will be a very positive change in making the Firefox update experience more streamlined for all of our users.


Why you should switch to clang today, and how

Clang is a new C/C++/Objective-C/Objective-C++ compiler being developed on top of LLVM.  Clang is open-source, and its development is sponsored by Apple.  I’m writing this post to try to convince you that you should switch to using it by default for your local development, at least if you’re targeting Mac or Linux.

Clang tries to act as a drop-in replacement for gcc by imitating its command line argument syntax and semantics, which means that in most cases you can switch from gcc to clang just by changing the name of the compiler you’re using.  So switching to clang is really easy, and it also provides at least two features which make it genuinely better than gcc for local development:

  • Compilation speed.  Clang usually compiles a lot faster than gcc.  It’s been quite a while since I did measurements, but I’ve seen compile times up to twice as fast with clang compared to gcc.  Yes.  You read that right.  That’s twice as fast!
  • Better compiler diagnostics.  Clang usually provides much better diagnostics when your code fails to compile, which means that you spend less time figuring out how to fix your code.  It even goes further and suggests the most likely fix.  I’ll give you two examples!

Consider the following program:

void foobar();

int main() {
  foobaz();
}

Here is the output of clang on this program:

test.cpp:4:3: error: use of undeclared identifier ‘foobaz’; did you mean ‘foobar’?
  foobaz();
  ^~~~~~
  foobar
test.cpp:1:6: note: ‘foobar’ declared here
void foobar();
     ^
1 error generated.

Here’s another program, followed by clang’s output:

#define MIN(a,b) (((a) < (b)) ? (a) : (b))
struct X {}

int main() {
  int x = MIN(2,X());
}

test.cpp:2:12: error: expected ‘;’ after struct
struct X {}
           ^
           ;
test.cpp:5:11: error: invalid operands to binary expression (‘int’ and ‘X’)
  int x = MIN(2,X());
          ^~~~~~~~~~
test.cpp:1:24: note: instantiated from:
#define MIN(a,b) (((a) < (b)) ? (a) : (b))
                   ~~~ ^ ~~~
2 errors generated.

Now if that has not made you drool yet, you can check out this page for more reasons why clang provides better diagnostics than gcc does.

For the impatient, here is how you can build and install clang locally on Mac and Linux.  You can check out this page for more comprehensive documentation.  Note that the Windows port of clang is not ready for everyday use yet, so I don’t recommend switching to clang if you’re on Windows.

mkdir /path/to/clang-build
cd /path/to/clang-build
svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
cd llvm/tools
svn co http://llvm.org/svn/llvm-project/cfe/trunk clang
cd ../..
mkdir build
cd build
../llvm/configure --enable-optimized --disable-assertions
make && sudo make install

At this point, clang should be installed to /usr/local. In order to use it, you should add the following two lines to your mozconfig file:

export CC=clang
export CXX=clang++

Note that I’m assuming that /usr/local/bin is in your $PATH.

I’ve been using clang locally with very few problems for the past few months.  There has been a lot of effort to make Firefox build successfully with clang (mostly thanks to the heroic work done by Rafael), and he is now working hard to get us clang builders so that we will know when somebody lands code which breaks the clang build.  But you can switch to clang locally today and start benefiting from it right now.

Switching to clang will soon enable you to do one more cool thing.  But I won’t talk about that today; that’s the topic of another blog post!


Upcoming changes to absolute positioning in tables and table margin collapsing in Firefox 10

Last week I landed a number of patches which I’ve been working on that fix two very old (5-digit!) bugs in Gecko (bug 10209 and bug 87277) which affect the rendering of web content.  This post summarizes the resulting changes to the behavior of Firefox.

The first behavior change concerns absolute positioning of elements inside positioned tables.  When you specify the CSS position: absolute style on an element in a web page, it is taken out of the flow of the web page, and its position is calculated relative to its nearest positioned ancestor in the DOM (by positioned, we mean an element which has a non-static position, i.e., one of fixed, relative or absolute as its position computed style).  See this test case for a simple example.

For a long time, Gecko only looked for inline or block positioned elements in the parent chain.  So, for example, if you had a positioned table element inside a positioned block element, any position: absolute element inside the table used to be positioned relative to the outer block element, as opposed to the table element (which is the nearest positioned ancestor).  This test case shows the bug if you load it in Firefox 7.  In the correct rendering, the div with the red border should be placed 10 pixels below the element with the green border, but in Firefox 7, it is positioned 10 pixels below the element with the red border.

The other behavior change is a fix to margin collapsing on table elements.  Firefox used to have a bug which caused margins on table elements not to be collapsed with those of other adjacent elements.  Take a look at this test case for example.  Here we have an outer div with a height of 180 pixels, in which there are 4 div elements, each with a height of 20 pixels, and with 20 pixel margins on each side.  The correct rendering for this test case is for the 4 inner divs to be laid out evenly spaced in the vertical direction in the outer div.  This happens because the bottom margin of the first inner div is collapsed with the top margin of the second div, which causes the content of the second inner div to be laid out 20 pixels below the content of the first inner div (four 20 pixel divs, three collapsed 20 pixel gaps, and the 20 pixel top and bottom margins add up to exactly 180 pixels).

Now, if you make the inner divs tables instead (as in this test case), you can see the bug in Firefox 7.  What’s happening is that the vertical margins between the tables are not collapsed, which effectively means that the content of each table will be 40 pixels below the previous one, making the 4 tables overflow the height of their container.  If you try this test case in Firefox trunk, you will see that the rendering is identical to the rendering of the test case using inner divs instead of inner tables.

Note that this fix is not specific to the case of adjacent tables.  This test case interleaves divs and tables.  The rendering should be identical to the previous two test cases in Firefox trunk now.

It should be obvious that since these two changes affect the rendering of web pages, they may break existing web sites.  Indeed, today we had our first bug report about this behavior change.  The good news is that these two changes make us more compliant with the CSS specification, and all other web browser engines already handle these cases correctly, so web sites which are affected by these two changes have been relying on a bug in Gecko, and have probably already been broken in other web browsers for a long time.  These fixes bring Gecko on par with other browser engines.

You can test this fix in Firefox Nightly right now.  If you see this change affecting a website you own, please let me know if you need help in fixing it.  If you see this change affecting websites you do not own, please let me know and I’ll try to contact the website and help them fix the problem.  If you see a behavior change which you think is not intentional and is a regression from these changes, please file a bug and CC me on it.


Future of editing on the web

Last week, I met with Aryeh Gregor (the editor of the HTML Editing APIs specification, among other specs), Ryosuke Niwa and Annie Sullivan (both of WebKit editing fame) and Jonas Sicking (prolific Gecko hacker) to discuss the future of HTML editing APIs on the web, and also exchange ideas on how Gecko and WebKit handle editing.  The meetings were extremely productive, and we managed to discuss a lot of stuff.  I’m trying to provide a summary of everything we discussed.  You can see the meeting agenda and minutes for more information.

HTML Editing Specification

We all agreed that Aryeh has been doing a great job in specifying how web browsers should behave in response to editing APIs.  The current situation is less than ideal.  Currently, every browser engine behaves differently when those APIs are invoked, and because of the complexity of handling editing commands, in a lot of situations no browser engine gets it right.  Aryeh has been doing black-box testing of how browser engines behave, and he’s been designing generic algorithms which behave sanely.  Gecko and WebKit are planning to gradually move towards implementing the newly specified behavior for these APIs.  This can be tricky in cases where modifying the engine’s behavior would break web compatibility, but we are planning to handle the safer cases first and watch closely for web content which makes the wrong assumptions about how things are supposed to behave.

The spec itself is not finished yet, but it’s generally in good shape.  We discussed some future improvements to the spec, and we’ll iterate on it to improve it even more as we move ahead.

Mutation Events Replacement, and New Editing Events

Mutation events in their current form behave very badly.  They are usually a source of bad performance, they are extremely painful to handle correctly, and they are responsible for a lot of the stability problems that we face in Gecko (and I’m sure other browser engines have been experiencing the same types of problems).  Jonas has proposed a replacement for these events.  There has been lots of feedback on the proposal, and it’s not yet in a final form, but the discussions seem to be proceeding quite well.

Traditionally, one of the reasons that people have needed mutation events has been to modify how browsers handle editing commands.  We discussed a possible better alternative for this use case.  We could provide two new events, beforeeditaction and aftereditaction (note that I’m just using these names as placeholders for better names that we will hopefully think of in the future!).  The beforeeditaction event would fire before the browser initiates handling of an editing command.  It would be cancelable, so that web applications which do not want certain commands enabled can just cancel the respective events.  The aftereditaction event would fire when the browser is done performing the editing action.  This event would provide semantic information about the editing operation performed, and would allow web content to alter the DOM generated as a result of the editing action.  Think of a web application which wants to use strong tags instead of b tags for bold text generated by the bold command.  With aftereditaction, the web application can just fix up the DOM to use strong tags after the browser has modified it.

This idea can even be taken one step further.  The browser can fire beforeeditaction and aftereditaction events even for commands which it does not support natively.  Handling these events will give web applications a new way of providing custom editing commands!  The more we talked about these two events, the more we thought that they are going to be useful!

UndoManager Specification

Ryosuke has been working on the UndoManager specification.  It is a new interface which allows web applications to hook into the browser’s native undo/redo transaction manager.  It is an awesome way to enable web applications to put custom transactions into the browser’s undo/redo stack, and also to enable undo/redo for a host of new applications on the web (think of having undo/redo working for a drawing tool built on top of the HTML5 canvas, for example!).  We discussed some of the details of the spec, and managed to agree on everything that we discussed.

The good news here is that this is something which is going to come to a browser engine near you really soon!  William Chen, an awesome intern at Mozilla, is working on implementing it for Gecko, and Ryosuke told me that he’s planning to implement it for WebKit too, once he finishes some of the refactoring that needs to happen first.

Selection APIs

We talked about the Selection specification, which is currently part of the DOM Range specification.  We agreed that we need a way for web developers to create new selection objects, with the hope that some day we would allow editing commands to operate on parts of the web page that are not visually selected by the user.  The selection story is really complicated for Gecko, because we allow multi-range selections, but we fail to handle them correctly everywhere, especially in the case of editable content.  I have some plans to address this problem, but those plans probably deserve their own blog/mailing-list post.

Collaborative Editing

Collaborative editing seems to be one of the topics which is currently hot among web developers.  Think of web applications like Etherpad which allow several users to edit the same document at the same time.  Annie gave us an overview of some of the challenges in developing collaborative editing applications.  We then discussed some of the ideas which might make the lives of people developing such applications easier.  These ideas included the beforeeditaction and aftereditaction events, an event to fire when the selection changes in a document, and adding a helper method to change the tag for an element in the DOM (which is now being discussed here).

Clipboard APIs

We discussed possible ways to provide clipboard access to web applications.  Opera has proposed a Clipboard API specification, which seems like a good starting point.  We talked about some of the security implications of allowing web pages to put data on the clipboard (which might be an annoyance, or worse, if the page puts a malicious shell command on the clipboard hoping that the user will paste it into a terminal) and to read data from the clipboard (with the possibility of reading sensitive user data).  WebKit currently fires the clipboard events for user initiated copy/paste operations, providing access to the clipboard contents to the web application.  We would want to support this in Gecko too.  But the general problem of how to handle the security implications is mostly a UX issue, and we decided to ask our UX teams for feedback on this.

We also discussed how we could prevent potentially bad content from being pasted into an editable area.  One notable example is absolutely positioned elements.  There is no good way for us to determine what styles would apply to an element before the actual pasted document fragment gets injected into the target page’s DOM.  We concluded that allowing authors to handle this case in their aftereditaction handlers might be the best approach here.

Implementation Issues

We also spent some time describing the Gecko and WebKit implementation of contentEditable, and tried to get familiar with some of the upsides and downsides of each implementation.  One of the interesting things that was discussed here was the possibility of borrowing tests from each other.  It seems like WebKit should at least be able to borrow a subset of the crashtests and reftests from Gecko.  Borrowing WebKit tests for Gecko might be a bit trickier, since WebKit’s LayoutTests depend on WebKit’s internals.  But we all agreed that it would be a very good idea for us to adopt cross-browser unit tests.

Future Ideas

We also took some time to go over some future ideas.

IME APIs for the web

There is a proposal to enable web pages to implement IME in Javascript.  This seems to be an interesting proposal, and we decided to put the Gecko and WebKit IME experts in touch with each other to make some progress on this.

Keyboard events interoperability

Currently, there are some interoperability problems with keyboard events.  One category of problems stems from the differences between the modifier keys on Mac and on Windows/Linux.  Another source is some browsers not sending the correct keyCode and charCode in keyboard events for some keyboard layouts.  Determining what needs to be done in all of these cases is tricky, and this was not the area of expertise of any of us.

Keyboard shortcuts

We might be able to leverage the accessKeyLabel property to inform web pages about the default keyboard accelerator modifier key on the platform on which the browser runs (though there are some privacy concerns with exposing this information).  We agreed that we need a platform-independent way of defining shortcut keys, in order not to require authors to handle low level keyboard events to implement keyboard shortcuts.  We might be able to use the accesskey attribute on command elements for this purpose.  We would also need a way to assign a command element to a certain contentEditable element; we might be able to use the for attribute on command elements to point to the ID of a contentEditable element.  Also, we thought that we should automatically handle the well-known keyboard shortcuts for operations such as undo, redo, select all, bold, italicize, underline, cut, copy and paste.

Spell checking APIs

There has been a proposal to enable web applications to provide their own spell checking facilities in Javascript.  Both Gecko and WebKit are interested in implementing this API in the future.

Exposing editing algorithms to web content

It might be a good idea for us to expose some of the base editing algorithms to web application authors so that they can use them in their own applications.  We all agreed that we should probably wait for the editing spec to stabilize a bit before we go ahead and consider exposing some of the base algorithms.  We also need feedback from web developers on what kinds of algorithms they would find useful to have access to.

Accessibility concerns

Currently, browser engines do not behave in a consistent way when navigating through editable regions using the keyboard.  We need to specify how browsers should move the caret when the user initiates moving commands using the keyboard.  While some of this behavior might be platform dependent, we would really prefer the keyboard navigation to work similarly across different platforms, if possible.

Resizing and table editing UI

One of the things that people really want is resizing and table editing UI.  Gecko tries to provide this UI, but the existing UI is buggy and may not be the ideal one.  WebKit currently doesn’t provide any UI for this, at all.  For resizing, we’re currently thinking of using the CSS resize property.  Table editing UI is not going to be that easy though, and we agreed that we should discuss that in the future.

Overall, I’m really excited about the direction we’re heading in the world of editing on the web.  There seems to be a lot of interest in both Gecko and WebKit in improving the current situation, and I hope that in the future we will be able to have this discussion with other browser vendors too.


Resisting saying yes

We’ve become too used to saying yes.  I’d like to remind everyone about this, and ask them to reconsider this old habit of ours.

Recently, what happened with bug 656120 made me feel warm and fuzzy inside.  This bug potentially solves the Javascript memory usage problems introduced in Firefox 4.  The feedback from the users on the Nightly branch was positive after this patch landed on trunk, so it was suggested that the patch be backported to the Aurora branch.

I mentioned that I thought this was a bad idea, but we decided to let the drivers make the call, and I think they made the right call on this: the patch should wait for Firefox 7, and should not be backported to Aurora (which would mean that it would make it into Firefox 6).

I’d like to take a moment and bring our channel rules to everyone’s attention.  According to these rules, the code changes that we should accept on Aurora include backouts or disabling of features, security fixes, and fixes necessary to get the product into a shippable state.  Although I too would really like our users to get this fix as soon as possible, rushing things in at the last minute is one of the reasons why we used to slip our schedules in the previous release model.  We were too lenient.  We used to say yes to a lot of things.  I can understand why.  In the old release model, if a fix slipped a release, nobody could tell when it would ship in the next version of Firefox.  That has changed now.  Look at things this way: rushing something onto a branch where it doesn’t belong could potentially cause all of the other fixes on that branch to reach our users with a delay.  Let’s be more patient, and let’s say no more often.  As a wise man once said, "it’s good to know that there’s going to be another train in 6 weeks".


Building Firefox with Address Sanitizer

Address Sanitizer is a new project based on clang which aims to provide relatively cheap memory access checking.  It is capable of detecting errors such as out-of-bounds access or use-after-free at runtime.  Although its functionality is a subset of what Valgrind supports, running applications built with Address Sanitizer is noticeably faster than running them under Valgrind, which can simplify the testing process.

I recently got a build of Firefox with Address Sanitizer working.  Getting such builds is relatively simple.  First, you should build Address Sanitizer yourself.  Then, you can use a mozconfig like this:

export CC=/path/to/address-sanitizer/asan_clang_Linux/bin/clang
export CXX=/path/to/address-sanitizer/asan_clang_Linux/bin/clang++

export CFLAGS='-fasan -Dxmalloc=myxmalloc'
export CXXFLAGS='-fasan -Dxmalloc=myxmalloc'
export LDFLAGS=-ldl

. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir-ff-asan
mk_add_options MOZ_MAKE_FLAGS="-j4"
ac_add_options --enable-application=browser
ac_add_options --enable-debug
ac_add_options --disable-optimize
ac_add_options --disable-jemalloc
ac_add_options --disable-crashreporter

Once your mozconfig is set up, just build and run Firefox as you normally would.  I have not yet found any memory safety issues in Firefox using Address Sanitizer, but that is not really surprising, since so far I have mostly run our unit tests with this build.  The build is fast enough, though, that you can even use it as your main browser.  If you do see an issue, please file a bug.
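
If you want to convince yourself that the instrumentation is active in your build, a tiny made-up test program like the following (not Firefox code) should make an Address Sanitizer build abort with a detailed heap-use-after-free report instead of silently misbehaving:

// use-after-free.cpp: compile with the asan clang from the mozconfig above.
int main() {
  int* array = new int[10];
  delete[] array;
  return array[5];  // reads freed memory; ASan points directly at this line
}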


Automated landing on mozilla-central

My previous post about assisted landing of patches on mozilla-central was very well received.  Apparently, all you have to do to get something awesome like that working is to blog about it.  I chatted with Chris AtLee at the office about the idea, what needs to be done, and who needs to own it, and shortly after, Lukas Blakk informed me that she has an intern for this job, and invited me to a meeting about the project.  All this happened without me lifting a finger, which is amazing!

The result of that meeting is a wiki page on the topic, which is a combination of minutes of the meeting and the goals and steps for the project.  Feel free to review it and let us know if we’re missing something very big!

Marc Jessome is the real force behind this project, so please send all of the thank-you notes his way!  :-)  And don’t forget to answer the try server usage survey which Lukas designed in order to gather much-needed feedback about this project.

Last, but not least, it’s time to get more serious about the good fight against intermittent oranges, because that is a prerequisite for actually starting to use these facilities.  I’ll keep writing more about that in the near future, so stay tuned.


Assisted landing of patches on mozilla-central

Imagine this for a second.  You work on fixing something, get the required reviews, and run your patch on the try server.  Everything looks good.  Then you set a flag or something on the bug, and go home, and enjoy your evening.  The next morning, you’re reading your bugmail while enjoying your coffee, and you see a message from landingbot@mozilla.org saying that the patch has successfully landed on mozilla-central, you smile, and wonder how people used to spend 4 hours watching the tree (and possibly getting yelled at by me or philor in the meantime) when they wanted to land something on mozilla-central.  This, my friend, is what we need to move towards, I think.

Now, I’m not entirely delusional here.  We have a very large number of tests covering all aspects of our code, including correctness and performance, and we usually use these tests to judge the quality of a patch.  Since the results of these tests can just as well be read by machines, we can put the tests to work in automating this process.  Here is roughly what I have in mind:

  1. We would have a bot which constantly watches Bugzilla for automated landing requests.  Once such a request is found, it gets added to a queue.
  2. For landing each change, the bot looks at the head of the queue, imports the patch (or hg bundle, in case of bugs with multiple patches) into a clone of mozilla-central.  If the import process is not successful (because the patch has been bit-rotten for example), the bot aborts and reports the problem on the bug.  Otherwise, the bot pushes the changes to the try server.
  3. The bot would watch the try server for the results of the push.  If the push has more than one orange job, the bot aborts the landing process and reports the problem on the bug.  If the push has exactly one orange job, the bot retriggers that job, reports the possibility of having hit an intermittent orange on the bug, and goes back to watching the try server push.  If the push is all green, the bot takes the change, transplants it onto a fresh clone of mozilla-central (and aborts if the patch has been bit-rotten since step 1) and pushes it to mozilla-central.  (A rough sketch of this decision logic appears after this list.)
  4. The bot would watch the mozilla-central push.  For any orange job, the bot retriggers it.  If the second run of the job goes green, the bot reports the orange as intermittent on the bug.  Otherwise, the bot backs out the change.  When the push gets one green run for every job, the bot reports success on the bug, and marks the bug as RESOLVED FIXED.
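
Here is a rough sketch of the try server decision logic from step 3, with made-up types and names, just to show how mechanical the judgement can be: tolerate at most one orange job per push, retrigger it once, and abort on anything worse:

#include <vector>

enum class JobResult { Green, Orange, Red };
enum class Decision { Land, Retrigger, Abort };

// Hypothetical sketch: decide what to do with a try push based on its jobs.
Decision judge_try_push(const std::vector<JobResult>& jobs,
                        bool already_retriggered) {
  int oranges = 0;
  for (JobResult job : jobs) {
    if (job == JobResult::Red)
      return Decision::Abort;  // a red is a real failure, give up
    if (job == JobResult::Orange)
      ++oranges;
  }
  if (oranges == 0)
    return Decision::Land;       // all green, safe to land
  if (oranges == 1 && !already_retriggered)
    return Decision::Retrigger;  // might just be an intermittent orange
  return Decision::Abort;        // multiple oranges, or still orange after a retry
}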

"Yeah, right!", you would say, "Like there’s ever a push which doesn’t see any intermittent orange!".  But if you’ve been watching the tree closely during the past week or so, you would have noticed that there are a lot of jobs which do not see any intermittent oranges (I’m not talking about oranges, or worse, reds, caused by people landing untested stuff, those will be reliably caught by our good robot).  These pushes are still not the majority of pushes, but we’re getting there.  Slowly, but surely.  Take a look at this image, which can be found here.

Orange Factor going down

The situation is not improving on its own.  It’s improving because of all of the wonderful developers who are working on fixing intermittent orange bugs in their areas of expertise (and some brilliant people who even go one step further and fix oranges in areas of the code unfamiliar to them)!  You can help too.  But more on that in a future post.

Once we reach an average of 1 intermittent orange per push, we could make such a plan work for real.  I don’t know about you, but this makes me really excited.  I think we all have better stuff to do than watching the tree for hours after we land something.


Avoiding intermittent oranges

Writing tests which are resilient against intermittent failures is hard.  In the process of trying to fix a large number of intermittent orange bugs, I’ve found that a large portion of them are caused simply by mistakes in writing the tests, and almost all of those mistakes fall into commonly recurring patterns.  It’s hard to avoid those mistakes unless you know how they lead to intermittent oranges, and how to avoid them.

In order to share my experience about what types of patterns can cause a test to fail intermittently, I’ve gathered a list of common intermittent failure patterns on MDN, and I urge everybody who writes tests for the Mozilla project to go and take a look at that list.  I think if test writers and reviewers keep those items in mind when creating or reviewing a test, we can dramatically reduce the number of new tests which are susceptible to intermittent failures.
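
To give a taste of what is on that list, here is a generic illustration of probably the most common mistake: waiting for a fixed amount of time instead of waiting for the event you actually care about.  Our tests are written in Javascript, but the shape of the mistake is the same in any language, so this sketch uses C++ threads; a timeout which is long enough on your machine will occasionally be too short on a heavily loaded test slave:

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex lock;
std::condition_variable cv;
bool done = false;  // set to true (and cv notified) when the work finishes

// Bad: assumes the background work always finishes within 100ms, so the
// test intermittently fails on slow or heavily loaded machines.
bool wait_with_timeout() {
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  std::unique_lock<std::mutex> guard(lock);
  return done;
}

// Good: block until the completion event actually fires, however long
// the machine takes to get there.
bool wait_for_event() {
  std::unique_lock<std::mutex> guard(lock);
  cv.wait(guard, [] { return done; });
  return true;
}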

And please make sure to add to the list if you know of other such patterns that I’ve missed.
