It has been a few weeks since I last gave an update about our progress on reducing the number of slow synchronous IPC messages that we send across our processes.  This isn't because there hasn't been a lot to talk about.  Quite to the contrary, so much great work has happened here that for a while I decided it may be better to highlight other ongoing work instead.  But now as the development cycle of Firefox 55 draws to a close, it's time to have another look at where we stand on this issue.

I've prepared a new Sync IPC Analysis for today including data from both JS and C++ initiated sync IPCs.  The first bit of unfortunate news is that the historical data in the spreadsheet is lost because the server hosting the data had a few hiccups and Google Spreadsheets seems to really not like that.  The second bit of unfortunate news is that our hope that disabling the non-multiprocess compatible add-ons by default in Nightly would reduce some of the noise in this data doesn't seem to have panned out.  The data still shows a lot of synchronous IPC triggered from JS as before, and the lion's share of it is messages that are clearly coming from add-ons, judging from their names.  My guess about why is that Nightly users have probably turned these add-ons back on manually.  So we will have to live with the noise in the data for now (this is an issue that we unfortunately have to struggle with when dealing with a lot of telemetry data; here is another recent example that wasted some time and energy).

This time I won't give out a percentage-based breakdown, because now that many of these bugs have been fixed, the impact of very commonly occurring IPC messages such as the one we have for document.cookie makes the earlier method of exploring the data pointless (you can explore the pie chart to get a quick sense of why; I'll just say that this message alone is now 55% of the chart, and together with the second one it forms 75% of the data).  This is a great problem to have, of course: it means that we're now starting to get to the “long tail” part of this issue.

The current top offenders, besides the mentioned bug (which BTW is still seeing great progress!), are add-on/browser CPOW messages, two graphics initialization messages that we send at content process startup, NotifyIMEFocus, which is in the process of being fixed, and window.open(), which I've spent weeks on but have yet to fix all of our tests for so that I can land my fixes (and which I've also temporarily set aside in order to work on something that isn't this bug for a little while!).  Besides those, if you look at the dependency list of the tracker bug, there are many other bugs that are very close to being fixed.  Firefox 55 is going to be much better from this perspective, and I hope future releases will improve on that!

The other effort that is moving ahead quite fast is optimizing for Speedometer V2.  See the chart of our progress on AreWeFastYet.com:

Last week, our score on this chart was about 84.  Now we are at about 91.  Not bad for a week's worth of work!  If you're curious to follow along, see our tracker bug.  Also, Speedometer is a very JS-heavy benchmark, so a lot of the bugs that are filed and fixed for it happen inside SpiderMonkey, so watching the SpiderMonkey-specific tracker bug is probably a good idea as well.

It's time for a short performance story!  This one is about technical debt.  I've looked at many performance bugs over the past few months of the Quantum Flow project, and in many cases the solution has turned out to be just deleting the slow code.  That's it!  It turns out that as a large code base ages, it accumulates a lot of code that isn't really serving any purpose any more, but nobody discovers this because it's impractical to audit every single line of code with scrutiny.  Some of this unnecessary code is bound to have severe performance issues, and when it does, your software ends up carrying that cruft for years!  Here are a few examples: a function call taking 2.7 seconds on a cold startup doing something that became unnecessary once we dropped support for Windows XP and Vista, some migration code that was doing synchronous IO during all startups to migrate users of Firefox 34 and older to a newer version, and an outdated telemetry probe that turned out not to be in use any more but was scheduling many unnecessary timers, causing unneeded jank.

I've been thinking about what to do about these issues.  The first step is to fix them, which is what we are busy doing now, but finding these issues typically requires some work, and it would be nice if we had a systematic way of dealing with some of them.  For example, wouldn't it be nice if we had a MINIMUM_WINDOWS macro that controlled all Windows-specific code in the tree?  In the case of my earlier example, perhaps the original code would have checked that macro against the minimum version (7 or higher), and when we bumped MINIMUM_WINDOWS up to 7 along with bumping our release requirements, such code would turn itself into preprocessor waste (hurray!).  But of course, the hard part is finding all the code that needs to abide by this macro, and the harder part is enforcing this consistently going forward!  Some of the other issues aren't possible to deal with this way, so we need to work on getting better at detecting them.  I'm not sure how yet, but it's definitely some food for thought!
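To make the MINIMUM_WINDOWS idea a bit more concrete, here is a minimal sketch of what such a guard could look like.  Everything in it is an assumption for illustration: no such macro exists in the tree today, the encoding of versions as marketing numbers (7 for Windows 7, anything lower for XP and Vista) is made up, and StartupSteps() is a hypothetical stand-in for real startup code.

```cpp
#include <cassert>
#include <string>

// Hypothetical macro: the oldest Windows version we still support, encoded
// here (as an assumption) by marketing number, so 7 means Windows 7.
// A build system could override it, e.g. with -DMINIMUM_WINDOWS=7.
#ifndef MINIMUM_WINDOWS
#define MINIMUM_WINDOWS 7
#endif

// Hypothetical startup routine that records which steps it runs.
std::string StartupSteps() {
  std::string steps = "init;";

#if MINIMUM_WINDOWS < 7
  // Only needed while XP/Vista are supported.  Once MINIMUM_WINDOWS is
  // bumped to 7, the preprocessor drops this branch and the slow code
  // is no longer compiled at all.
  steps += "legacy_workaround;";
#endif

  steps += "run;";
  return steps;
}
```

With MINIMUM_WINDOWS at 7, the legacy step is never compiled in, so nobody has to remember to go and delete it.  The sketch does nothing for the hard parts mentioned above: finding all of the code that should carry such a guard, and enforcing its use going forward.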

I'll stop here, and move on to acknowledge the great work of all of you who helped make Firefox faster this past week!  As per usual, apologies to those who I'm forgetting to mention here: