Quantum Flow Engineering Newsletter #19
As usual, I have some quick updates to share about what we’ve been up to on improving the performance of the browser in the past week or so. Let’s first look at our progress on the Speedometer benchmark. Our performance goal for Firefox 57 was to get within 20% of Chrome’s benchmark score on our Acer reference hardware on Win64. Those of you who watch the Firefox Health Dashboards every once in a while may have noticed that now we are well within that target:
It’s nice to see the smiley face on this chart, finally! You can see the more detailed downward slope on the AWFY graph that shows the progress in the past couple of weeks or so (dark red dots are PGO builds, orange dots are non-PGO builds, and of course green is Chrome):
The situation on Win32 is a bit worse, due to Chrome’s recent switch to clang-cl on Windows instead of MSVC, which gave them a roughly 30% boost on the 32-bit Speedometer score, but we have made progress nonetheless. Such is the nature of tracking moving targets!
The other performance aspect to look at again is our progress at eliminating slow synchronous IPC calls. I last wrote about this roughly three weeks ago, and since then at least one major change happened: the infamous document.cookie synchronous IPC call was eliminated, so I figured it may be a good time to look at the data again.
Telemetry data is laggy since it includes data from older versions of Nightly, but if you compare this to the previous chart, there should be a stark difference visible: PCookieService::Msg_GetCookieString is now a much smaller part of the overall data (at around 26.1%). Looking at the list of the top ten messages, the next ones in order are the usual suspects for those who have followed these newsletters for a while: some JS-initiated IPC, PAPZCTreeManager::Msg_ReceiveMouseInputEvent, more JS IPC, PBrowser::Msg_NotifyIMEFocus, and even more JS IPC. After those come two new messages that are now surfacing as we’ve fixed the worst offenders. The first is PDocAccessible::Msg_SyncTextChangeEvent, which is related to accessibility; the data shows it affects a relatively small number of sessions, given its low submission rate. The second is PContent::Msg_ClassifyLocal, which probably comes from making the Flash plugin click-to-play by default.
Now let’s look at the breakdown of synchronous IPC messages initiated from JS:
The story here remains unchanged: most of the sync IPC messages we’re seeing come from legacy extensions, and there is also the contextmenu sync IPC, which has a patch pending review. However, the picture here may start changing quite soon. You may have seen the recent announcement about legacy extensions being disabled on Nightly starting from tomorrow, so hopefully this data (and the C++ sync IPC data) will soon start to shift to reflect more of the performance characteristics that our users on the release channel will experience for Firefox 57.
Now please let me acknowledge the great work of those who made Firefox faster last week. I hope I’m not forgetting any names!
- Ting-Yu Chou removed some needless copying from SpiderMonkey HashTable::lookupForAdd().
- Jim Chen fixed a hang during text selection which happened as a result of a recent regression.
- Marco Castelluccio made it so that we don’t use Preferences.jsm at all before first paint, which should help improve first paint time, and gets us closer to removing that module entirely.
- Marco Bonardo improved the performance of loading the preferences used by UnifiedComplete.js.
- Kirk Steuber made it so that we preload and cache strings for DOM errors when idle. He also moved the handling of the Browser:Thumbnail:CheckState message to the idle queue.
- Paolo Amadini reduced Promise overhead in DownloadLegacy.js progress events, which used to slow down file downloads in some situations.
- Adam Gashlin created a Windows background thread for kicking off a readahead for a few DLLs that can take a significant amount of time to load on the main thread during startup.
- David Keeler made us load the loadable roots PKCS#11 module asynchronously on a background thread during startup. This module provides the built-in CA root store, and as such is only needed when issuing a certificate or querying the trust of a certificate, which is hopefully something that startup doesn’t need to be blocked on for most users.
- Doug Thayer moved the generation of exponential telemetry histogram buckets from startup to compile-time. The computation of these buckets involved some math-heavy code that showed up in profiles, and such code is better run on our build machines and not on our users’ machines! Furthermore, Doug ensured early calls to setExperimentActive() during telemetry initialization don’t have undesirable side effects such as forcing initialization of our graphics stack.
- Ryan Hunt enabled, behind a pref, asynchronous keyboard scrolling on pages which register passive event listeners. This lets web pages help the browser scroll asynchronously when their event listeners don’t need to call preventDefault(). See the documentation for how a similar idea is used to improve the performance of touch-based scrolling if the web page cooperates with the browser.
- Jan de Mooij ensured we don’t interrupt regex JIT code for non-urgent interrupts arriving from background threads such as the IonMonkey compilation thread. He also inlined the constructor and destructor of AutoGeckoProfilerEntry and removed some debug-only code from them.
- Perry Jiang made it so that we attempt to capture page thumbnails off of an idle callback when the browser is less likely to be busy.
- Henry Chang moved the main-thread portion of HTML parsing on behalf of background tabs to happen within idle periods when possible. Previously this would run asynchronously off of a 50ms timer.
- Jonathan Kew used a flag to ensure that expensive property accesses on text nodes when their character data is modified only happen when needed.
- Dão Gottwald ensured we use instant scroll behavior when doing pixel scrolling. This fixed a regression from last week’s landing of using smooth scrolling to scroll the tab bar.
- Alexandre Poirot made sure we only forward the console API calls to the parent process when the web console (or browser console) is open. This avoids the overhead of forwarding these calls when their result is completely invisible.
- Felipe Gomes landed a large set of patches to move various initialization tasks to the idle queue instead of running them off of random timeouts.
- Kannan Vijayan optimized Array.prototype.join for empty and single-item arrays.
- André Bargull optimized GetElemBaseForLambda.
- Jan Varga moved the localStorage API to use the PBackground protocol instead of PContent. This is an important optimization to speed up preloading of the localStorage data (which is a synchronous API) in the content process upon page load.
- Kris Maglione removed the old unused add-on SDK modules. These modules, which were used in some legacy extensions, were the source of various performance issues.
- Jim Chen ensured that we properly compare node to traversal range under different modes. This fixed a severe performance regression which could render a tab unresponsive by getting Gecko into an infinite loop.
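
To make the compile-time bucket generation from Doug Thayer's item a bit more concrete, here is a minimal Python sketch of the general idea: compute exponentially spaced histogram bucket boundaries once, in a build step, and emit them as a static table so nothing needs to be calculated at startup. This is only an illustration under my own assumptions and not the actual Firefox code. The function names, the rounding scheme, and the generated C++ array format are all hypothetical.

```python
import math


def exponential_buckets(dmin, dmax, n_buckets):
    """Return n_buckets lower bounds, spaced roughly evenly in log space.

    Hypothetical helper for illustration: bucket 0 starts at 0, bucket 1
    starts at dmin, and the remaining boundaries grow towards dmax.
    """
    if dmin < 1 or dmax <= dmin or n_buckets < 3:
        raise ValueError("need dmin >= 1, dmax > dmin and n_buckets >= 3")

    buckets = [0] * n_buckets
    current = dmin
    buckets[1] = current
    log_max = math.log(dmax)

    for index in range(2, n_buckets):
        log_current = math.log(current)
        # Spread the remaining log-distance over the remaining buckets.
        log_ratio = (log_max - log_current) / (n_buckets - index)
        candidate = int(math.floor(math.exp(log_current + log_ratio) + 0.5))
        # Boundaries must be strictly increasing integers.
        current = candidate if candidate > current else current + 1
        buckets[index] = current

    return buckets


def render_cpp_table(name, buckets):
    """Render the boundaries as a C++ array literal (format is an assumption)."""
    values = ", ".join(str(b) for b in buckets)
    return f"static const int {name}[] = {{ {values} }};"


if __name__ == "__main__":
    # A build script could write this line into a generated header file.
    print(render_cpp_table("kExampleBuckets", exponential_buckets(1, 10000, 10)))
```

The logarithm and exponential calls here are the kind of math-heavy work that is cheap to do once on a build machine, and the program that ships then only reads a constant array.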