Quantum Flow Engineering Newsletter #10

Let's start this week's updates by looking at the ongoing efforts to improve the usefulness of the background hang reports (BHR) data. With Ben Miroglio's help, we confirmed that sending native stack traces for BHR hangs isn't blowing up telemetry ping sizes yet, so we can now capture deeper call stacks, which means the resulting data will be easier to analyze. Doug Thayer has also been hard at work creating a new BHR dashboard based on the perf-html UI. You can see a sneak peek here, but do note that this is a work in progress! The raw BHR data is still available for your inspection.

Kannan Vijayan has been adding some low-level instrumentation to SpiderMonkey, using the rdtsc instruction on Windows, in order to get detailed information on the relative runtime costs of various builtin intrinsic operations inside the JS engine in various workloads. He now has a working setup that allows him to take a real-world JS workload and get detailed data on which builtin intrinsics were the most costly in that workload. This is extremely valuable because it allows us to focus our optimization efforts first on the builtins where the biggest gains are to be had. He already has some initial results from running this tool on the Speedometer benchmark and on a general browsing workload, and some optimization work has already started.

Dominik Strohmeier has been helping with running startup measurements on the reference Acer machine, using an HDMI video capture card, to track the progress of the ongoing startup improvements. For these measurements, we are tracking two numbers: the first paint time (the time at which we paint the first frame from the browser window) and the hero element time (the time at which we paint the “hero element”, which is the search box in about:home in this case). The baseline build here is the Nightly of Apr 1st, chosen as a date before active work on startup optimizations started. At that time, our median first paint time was 1232.84ms (with a standard deviation of 16.58ms) and our hero element time was 1849.26ms (with a standard deviation of 28.58ms). On the Nightly of May 18, our first paint time is 849.66ms (with a standard deviation of 11.78ms) and our hero element time is 1616.02ms (with a standard deviation of 24.59ms).
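To put those startup numbers in perspective, here is a small sketch that works out the absolute and relative savings between the Apr 1st baseline and the May 18 Nightly. The function name and the dictionary layout are just for illustration; only the four timings come from the measurements above.

```python
def improvement(baseline_ms: float, current_ms: float) -> tuple[float, float]:
    """Return (milliseconds saved, percentage saved relative to the baseline)."""
    saved = baseline_ms - current_ms
    return saved, 100.0 * saved / baseline_ms


# (baseline, current) timings in milliseconds, taken from the text above.
measurements = {
    "first paint": (1232.84, 849.66),
    "hero element": (1849.26, 1616.02),
}

for name, (baseline, current) in measurements.items():
    saved, pct = improvement(baseline, current)
    print(f"{name}: {saved:.2f} ms faster ({pct:.1f}% lower)")
```

In other words, first paint is roughly 383ms (about 31%) faster than the baseline, and the hero element shows up roughly 233ms (about 13%) sooner.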

Next week we're going to have a small work week with some people from the DOM, JS, Layout, Graphics and Perf teams here in Toronto. I expect to be fully busy at the work week, so you should expect the next issue of this newsletter in two weeks! With that, it is time to acknowledge the hard work of those who helped make Firefox faster this past week. I hope I'm not dropping any names by accident!

Olli Pettay wrote a patch to have us coalesce scroll wheel events more aggressively, in the hopes of improving scrolling performance on (at least) Google Spreadsheets.
Dão Gottwald got rid of some nasty layout flushes in our tab strip code when adding and removing tabs.
Kris Maglione made the content process also use the script precompiler in order to speed up the startup by compiling scripts on a background thread. This resulted in some decent improvements in Talos sessionrestore tests.
Kris also got rid of some directory scans that the Add-ons Manager was doing on start-up, which should result in improved start-up times.
Mike Conley suppressed the window opening animation for the first window, buying us some more time during startup (about 270ms on the reference hardware!)
Florian Quèze switched many uses of Task.jsm / yield to async functions / await instead, which allows us to avoid cross-compartment wrapper overhead in certain situations. This was a huge undertaking - big kudos to Florian!
Dave Townsend lazified loading of the certificate database in the add-on manager.
Patrick McManus made it so that POST’ing files with an async XHR no longer does main thread IO.
Botond Ballo made scrollbar dragging start asynchronously when using APZ, even if the main thread of the content process is janking. This ensures that users who drag the scrollbar with the mouse in order to scroll web pages get a smooth scrolling experience no matter how busy the content process is. He also sped up painting on particularly complex pages with many hit regions.
Thomas Nguyen removed another sync reflow when opening the AwesomeBar panel in some cases.
Ben Kelly made us choose whether to run multiple consecutive timeouts in one shot of the event loop based on a time budget of 4ms rather than a fixed maximum number of timeouts.
Edouard Oger delayed the initialization of the Weave service until after the first browser window has finished loading, which should improve start-up time.
Thomas Nguyen avoided a synchronous layout flush which would happen when the awesomebar popup was either not displaying or already showing the maximum number of results.
Valentin Gosu made us dispatch an asynchronous job to check for captive portals when a network connection becomes available.
Nika Layzell added information about whether a hang reported to BHR was observed when the user was interacting with the browser.
Andreas Farre added telemetry about the average load caused by timeouts in foreground and background tabs.
André Bargull optimized Function.prototype.bind().
Mathieu Leplatre made the blocklist updater module load more lazily, which should help to avoid some random UI jank.
Doug Thayer built a visualization tool for BHR stacks that borrows heavily from perf.html! Source code is here.
Chris AtLee disabled omni.ja compression which, unintuitively, resulted in a smaller download for users and faster start-up time.
William Chen made us reuse StackNodes in TreeBuilder in order to avoid memory allocation overhead.
Chris Pearce moved telemetry collection about which media decoders are present to happen when the user is idle. Background hang reports had shown that some users would experience hangs of 8 seconds or more when this collection previously ran shortly after the first browser window opened.
Daniel Holbert made us react to some CSS overflow changes on body/html elements without reframing the entire document.
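
For readers curious about what a time budget for timeouts (Ben Kelly's change above) looks like in practice, here is a minimal sketch in Python. It is not the Gecko implementation: the function name, the queue type and the injectable clock are all made up for illustration, and only the 4ms budget comes from the item above. The idea is to keep running queued timeout callbacks back to back until the budget is spent, instead of stopping after a fixed count.

```python
import time
from collections import deque
from typing import Callable, Deque

BUDGET_SECONDS = 0.004  # the 4ms budget mentioned above


def run_due_timeouts(
    queue: Deque[Callable[[], None]],
    budget: float = BUDGET_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run queued callbacks until the queue is empty or the budget is used up.

    At least one callback runs whenever the queue is non-empty, so a single
    slow timeout cannot starve the queue forever. Returns how many ran.
    (Hypothetical helper, not an actual Gecko API.)
    """
    start = clock()
    ran = 0
    while queue:
        callback = queue.popleft()
        callback()
        ran += 1
        # Check the budget after each callback: cheap callbacks get batched
        # together, while expensive ones yield back to the event loop sooner.
        if clock() - start >= budget:
            break
    return ran


if __name__ == "__main__":
    pending = deque([lambda: None] * 100)
    print(run_due_timeouts(pending), "callbacks ran in this batch")
```

Compared with a fixed maximum number of timeouts, a budget like this adapts to how expensive the callbacks actually are: many trivial timeouts can be drained in one go, while a few heavy ones leave the rest for a later turn of the event loop.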