Last June, I wrote about enabling building Firefox with clang-cl. We didn't get these builds up on the infrastructure, and things regressed on both the Mozilla and LLVM sides, until we got to a state where clang-cl either wouldn't compile Firefox any more, or the resulting build would be severely broken. It took us months, but earlier today we finally managed to get a full x86-64 Firefox build with clang-cl! The build works for basic browsing (except that it crashes on yahoo.com for reasons we haven't diagnosed yet) and just for extra fun, I ran all of our service worker tests (which I happen to run many times a day on other platforms) and they all passed.
This time, we got to an impressive milestone. Previously, we were building Firefox with the help of clang-cl's fallback mode (which falls back to MSVC when clang fails to build a file for whatever reason) but this time we have a build of Firefox that is fully produced with clang, without even using the MSVC compiler once. And that includes all of our C++ tests too. (Note that we still use the rest of the Microsoft toolchain, such as the linker, resource compiler, etc. to produce the ultimate binaries; I'm only focusing on C and C++ compilation in this post.)
We should now try to keep these builds working. I believe that this is a big opportunity for Firefox to be able to leverage, on Windows, where we have most of our users, the modern toolchain that we have been enjoying on other platforms. An open source compilation toolchain has long been needed on Windows, and clang-cl is the first open source alternative that is designed to be a drop-in replacement for MSVC, which makes it an ideal target for Firefox. Also, Microsoft's recent integration of clang as a front-end for the MSVC code generator opens up the prospect of switching to clang/C2 in the future as our default compiler on Windows (assuming that the performance of our clang-cl builds doesn't end up being on par with the MSVC PGO compiler.)
My next priority for this project would be to stand up Windows static analysis builds on TreeHerder. That requires getting our clang-plugin to work on Windows, fixing the issues that it may find (since that would be the first time we would be running our static analyses on Windows!), and trying to get them up and running on TaskCluster. That way we would be able to leverage our static analysis on Windows as the first fruit of this effort, and also keep these builds working in the future. Since clang-cl is still being heavily developed, we will be preparing periodic updates to the compiler, potentially fixing the issues that may have been uncovered in either Firefox or LLVM and therefore we will keep up with the development in both projects.
Some of the future things that I think we should look into, sorted by priority:
- Get DXR to work on Windows. Once we port our static analysis clang plugin to Windows, the next logical step would be to get the DXR indexer clang plugin to work on Windows, so that we can get a DXR that works for Windows specific code too.
- Look into getting these builds to pass our test suites. So far we are at a stage where clang-cl can understand all of our C and C++ code on Windows, but the generated code is still not completely correct. Right now we're at such an early stage that one can find crashes etc. within minutes of browsing. But we need to get all of our tests to pass in these builds in order to get a rock-solid browser built with clang-cl. This will also help the clang-cl project, since we have been reporting clang-cl bugs that the Firefox code has continually uncovered, and on occasion also contributing fixes for those bugs. Although the rate of the LLVM side issues has decreased dramatically as the compiler has matured, I expect to find LLVM bugs occasionally, and hope to continue to work with the LLVM project to resolve them.
- Get the clang-based sanitizers to work on Windows, to extend their coverage to Windows. This includes things such as AddressSanitizer, ThreadSanitizer, LeakSanitizer, etc. The security team has been asking for AddressSanitizer for a long time. We could definitely benefit from the Windows specific coverage of all of these tools. Obviously the first step is to get a solid build that works. We have previously attempted to use AddressSanitizer on Windows but we have run into a lot of AddressSanitizer specific issues that we had to fix on both sides. These efforts have been halted for a while since we have been focusing on getting the basic compilation to work again.
- Start to do hybrid builds, where we randomly pick either clang-cl or MSVC to build each individual file. This is required to improve clang-cl's ABI compatibility. The reason that is important is that compiler bugs in a lot of cases manifest themselves as hard to explain artifacts in the functionality, anything from a random crash somewhere where everything should be working fine, to parts of the rendering going black for unclear reasons. A cost effective way to diagnose such issues is getting us to a point where we can mix and match object files produced by either compiler to get a correct build, and then bisect between the compilers to find the translation unit that is being miscompiled, and then keep bisecting to find the function (and sometimes the exact part of the code) that is being miscompiled. This is basically impractical without a good story for ABI compatibility in clang-cl. We have recently hit a few of these issues.
- Improving support for debug information generation in LLVM to a point that we can use Breakpad to generate crash stacks for crash-stats. This should enable us to advertise these builds to a small community of dogfooders.
- Start to look at a performance comparison between clang-cl and MSVC builds. This plus the bisection infrastructure I touched on above should generate a wealth of information on performance issues in clang-cl, and then we can fix them with the help of the LLVM community. Also, in case the numbers aren't too far apart, maybe even ship one Firefox Nightly for Windows built with clang-cl, as a milestone! :-)
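To make the bisection idea from the hybrid builds item a bit more concrete, here is a minimal sketch in Python. It is not how our build system works today; it only illustrates the search. The function name `find_miscompiled_unit`, the `is_good` callback and the file names are all hypothetical. The sketch assumes that exactly one translation unit is miscompiled by clang-cl, that a build using MSVC for every file is correct, and that object files from both compilers can be linked together (which is exactly the ABI compatibility requirement described above).

```python
from typing import Callable, Sequence, Set


def find_miscompiled_unit(
    units: Sequence[str],
    is_good: Callable[[Set[str]], bool],
) -> str:
    """Bisect over translation units to find the one clang-cl miscompiles.

    `is_good(clang_units)` is a hypothetical callback: it is expected to
    build the files in `clang_units` with clang-cl and every other file
    with MSVC, link the result, and return True if the build behaves
    correctly. Assumes exactly one unit in `units` is the culprit.
    """
    if not units:
        raise ValueError("need at least one translation unit")

    candidates = list(units)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Build only the first half of the suspects with clang-cl.
        if is_good(set(first_half)):
            # The build is fine, so the culprit is in the other half.
            candidates = candidates[middle:]
        else:
            # The build is broken, so the culprit is in this half.
            candidates = first_half
    return candidates[0]


if __name__ == "__main__":
    # Simulated build: pretend that "nsFoo.cpp" (a made-up file name) is
    # the only file that clang-cl miscompiles.
    all_units = ["nsFoo.cpp", "nsBar.cpp", "nsBaz.cpp", "nsQux.cpp"]
    simulated = lambda clang_units: "nsFoo.cpp" not in clang_units
    print(find_miscompiled_unit(all_units, simulated))
```

Each round halves the number of suspects, so the number of hybrid builds needed grows with the logarithm of the number of files rather than with the number of files itself. The same approach can in principle be repeated inside the offending file to narrow things down to a function, although that usually involves manually splitting the file.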
Longer term, we can look into issues such as helping to add full debug information support, with the goal of making it possible to use Visual Studio on Windows with these builds. Right now, we basically debug at the assembly level. That said, better debugging will probably help speed up development too, so perhaps we should start on it earlier. There is also LLDB on Windows, which should in theory be able to consume the DWARF debug information that clang-cl can generate, similar to how it does on Linux, so that is worth looking into as well. I'm sure there are other things that I'm not currently thinking of that we can do as well.
Last but not least, this has been a collaboration between quite a few people: on the Mozilla side, Jeff Muizelaar, David Major, Nathan Froyd, Mike Hommey, Raymond Forbes and myself; and on the LLVM side, many members of the Google compiler and Chromium teams: Reid Kleckner, David Majnemer, Hans Wennborg, Richard Smith, Nico Weber, Timur Iskhodzhanov, and the rest of the LLVM community who made clang-cl possible. I'm sure I'm forgetting some important names. I would like to thank all of these people for their help and effort.