Keeping track of ongoing progress and how much more work different areas need is important. I've found the new Firefox Health Dashboards that Harald Kirschner and the team have worked on extremely helpful. You can see the status of the ongoing work in the various areas of the Quantum project there, including the Quantum Flow project.
The other charts that I've been watching are the Speedometer v2 benchmark charts (32-bit and 64-bit Windows). We are actively working on improving our score on this benchmark. In case you haven't heard about this benchmark before, it runs a number of simulated user actions in the TodoMVC app written in a number of popular JS frameworks/dialects and computes a score based on that. It mostly examines the JavaScript engine of the browser but since it runs the code in the real browser environment it ends up examining other parts of the browser (DOM, layout, graphics, etc.) to certain extents too. If you would like to keep up with our progress on this benchmark, one nice way is to follow the tracker bug for the optimization work around this.
One important project that is now starting to bear fruit is the refactoring of Gecko's event queue. See the following diagram, which I borrowed from this brownbag talk about Quantum DOM.
The basic idea is that we want to process events in a different order depending on their type, but we don't want to change the relative ordering of events of the same type. This will allow Gecko to serve requests to paint pages and handle user input events first, then process other things like scripts, DOM events, timer callbacks, network events, etc., and finally, when there is no other immediate work of higher priority to be done, run low priority work such as garbage and cycle collection and idle callbacks. This is now mostly finished (except for high priority input events), and we have already started porting a few things to run as idle callbacks! This part of Quantum DOM will hopefully be part of Firefox 55.
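To make the ordering rule concrete, here is a minimal sketch in TypeScript. It is not Gecko's implementation: the three priority levels and all of the names (`PrioritizedEventQueue`, `HIGH`, `NORMAL`, `IDLE`, `drain`) are assumptions made up for illustration. It only shows the property described above: higher priority events run first, and events of the same priority keep the order in which they were queued.

```typescript
// Hypothetical priority levels, highest first. These are illustrative
// names, not the ones used inside Gecko.
type Priority = 0 | 1 | 2;
const HIGH: Priority = 0;   // e.g. painting and user input
const NORMAL: Priority = 1; // e.g. scripts, DOM events, timers, network
const IDLE: Priority = 2;   // e.g. GC/CC and idle callbacks

interface QueuedEvent {
  name: string;
  priority: Priority;
}

class PrioritizedEventQueue {
  // One FIFO queue per priority level, indexed by priority.
  private readonly queues: QueuedEvent[][] = [[], [], []];

  enqueue(event: QueuedEvent): void {
    this.queues[event.priority].push(event);
  }

  // Returns the oldest event of the highest non-empty priority level,
  // or undefined when every queue is empty.
  dequeue(): QueuedEvent | undefined {
    for (const queue of this.queues) {
      if (queue.length > 0) {
        return queue.shift();
      }
    }
    return undefined;
  }
}

// Drains the queue and returns the names in the order they would run.
function drain(queue: PrioritizedEventQueue): string[] {
  const order: string[] = [];
  for (let e = queue.dequeue(); e !== undefined; e = queue.dequeue()) {
    order.push(e.name);
  }
  return order;
}
```

Note that this sketch always prefers the highest priority level, so lower levels only run when the higher ones are empty. A real event loop has to be more careful than that to avoid starving lower priority work.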
Idle callbacks are a great programming utility for asynchronous work that you would like to run with a low priority, without interfering with more important work. They are already exposed to web pages. In C++, you can schedule such callbacks using the NS_IdleDispatchToCurrentThread() API. In chrome JS, the equivalent is the nsIThread.idleDispatch() method. We will soon write some more thorough documentation on how this can be used in both Gecko and browser code. We now have a lot of bugs on file which can use idle dispatch as a solution or mitigation, so this will be a useful trick for people to get familiar with.
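For the web-exposed flavor, a common pattern is to split low priority work into small tasks and run as many as fit into each idle period. The following TypeScript sketch shows that pattern under some assumptions: `runWhenIdle`, `DeadlineLike` and `IdleScheduler` are hypothetical names invented for this example, and the scheduler is passed in as a parameter so that the helper does not depend on a browser environment. In a web page you would pass a wrapper around `requestIdleCallback`, whose deadline object provides `timeRemaining()`.

```typescript
// The subset of the web's IdleDeadline object that this sketch relies on.
interface DeadlineLike {
  timeRemaining(): number;
}

// Anything that can schedule a callback for a later idle period.
// In a page: (cb) => { window.requestIdleCallback(cb); }
type IdleScheduler = (callback: (deadline: DeadlineLike) => void) => void;

// Runs the given tasks during idle periods. Each idle period runs at least
// one task (so the list always makes progress) and keeps going only while
// the deadline reports time remaining. Leftover tasks are rescheduled.
function runWhenIdle(tasks: Array<() => void>, schedule: IdleScheduler): void {
  const pending = [...tasks];

  const step = (deadline: DeadlineLike): void => {
    do {
      const task = pending.shift();
      if (task) {
        task();
      }
    } while (pending.length > 0 && deadline.timeRemaining() > 0);

    if (pending.length > 0) {
      schedule(step);
    }
  };

  if (pending.length > 0) {
    schedule(step);
  }
}
```

Nothing runs synchronously inside `runWhenIdle()` itself. All of the work happens in callbacks handed to the scheduler, which is what keeps it out of the way of higher priority events.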
Last week, we had a small work week at the Toronto office. The week was more focused on bug fixing than long-term planning (we only have around 16 more weeks of development time on mozilla-central for Firefox 57 after all!) so you can probably see more of the progress on the burndown chart. We had a big triage on the last day where we went through most of the [qf:p1] bugs and removed a number of them from the set, either because they were no longer high priority issues or because we had no realistic hope of fixing them in the 57 timeframe given the staffing available.
Now as always it's time to recognize the efforts of many of you who have helped make Firefox faster in the past 2 weeks. Apologies in advance to those whose contributions I'm forgetting to acknowledge here.
Olli Pettay made it so that we can avoid a layout flush if JavaScript attempts to focus() an element that is already focused. This provides some speedups on the Speedometer V2 benchmark! He also made us free the memory consumed by DOM elements only at idle priority of the main thread event loop.
Nika Layzell removed the PPrinting::ShowProgress synchronous IPC message. It’s great to see the whitelist of sync IPC messages shrinking! She also landed a telemetry probe for measuring how long we spend processing synchronous IPC messages. This will be helpful for those remaining sync IPC messages where we’re wondering whether the processing side of the IPC message is taking a long time to finish or whether the overhead of the message dispatch is the slowing factor.
Shih-Chiang Chien lazified the loading of UserAgentOverrides.jsm until the first network connection is made in order to improve startup speed.
Florian removed some main thread IO during early startup that was caused by inadvertently loading a DLL to get the localized display name of our crash reporting folder. He also wrote a test to whitelist and blacklist JavaScript code that is run early during start-up.
André Bargull improved the performance of initializing default JS Intl objects. He also made some improvements to the performance of RegExp.prototype.@@split and RegExp.prototype.@@replace.
Boris Zbarsky made calling Element.scrollTop = 0; a lot cheaper by avoiding an unnecessary synchronous layout flush.
Dão Gottwald changed the tab strip scrolling code to flush once instead of once per ‘scroll’ event. He also removed a layout flush that used to occur when clicking on the “List all tabs” button in the tab bar.
Alessio Placitelli ensured Telemetry doesn’t initialize the search service before first paint.
Felipe Gomes made us skip nsURLClassifier initialization for about: loads, causing a start-up perf improvement.
Ting-Yu Chou made it far less expensive to clean up closed tabs and windows.
Will Wang reduced the cost of initializing SessionCookies for improved start-up time.
Mats Palmgren improved our reflow performance on pastebin.com. He also devirtualized various nsIFrame members (including nsIFrame::IsLeaf(), which took a lot of effort), which contributed some Speedometer improvements.
Makoto Kato removed support for undoing setting HTMLInputElement.value and HTMLTextAreaElement.value through JavaScript. This improves the performance of doing so in script, and isn’t something that other browser engines support doing. He also utilized an existing fast path in HTMLInputElement further improving the speed of setting the value property.
David Baron made the hidden window inactive by default, which should help improve start-up time.
Marco Bonardo made it so that the favicon database isn’t opened on startup to check for corrupt-ness, which should help improve start-up time for existing profiles.
Andreas Farre exposed requestIdleCallback to non-DOM JS execution contexts! This will help our front-end engineers schedule main thread jobs using a more intelligent mechanism. He also, with some help from Olli Pettay, ensured that idle callbacks do not run when we are about to fire a timer. This is important to ensure that those callbacks cannot interfere with higher priority timers that may be about to fire. He also enabled throttling of timeouts scheduled by tracking scripts by default. This should help reduce the overhead of pages left open in background tabs.
Bas Schouten avoided some display list construction overhead for backgrounds of elements without rounded borders.
Morris Tseng made table frames get their own display items.
Mike Conley got rid of some unnecessary focus changes during tab switching.
Nihanth Subramanya moved the initialization of Captive Portal detection so that it no longer blocks first paint, which should help to improve start-up time.
Dana Keeler disabled OCSP verification for DV TLS certificates on Nightly. Currently Firefox is the only major browser that supports OCSP verification by contacting the CA’s OCSP server during the TLS handshake, and our telemetry data suggests that this is a major source of slowness during the initial TLS handshake.
Doug Thayer made PDF.js propagate its enabled state asynchronously instead of using synchronous IPC to query it from the content process.
Nicholas Nethercote reduced the locking overhead introduced by the Gecko Profiler macros used for profiler labels and the like, which makes these macros cheaper even when the profiler isn’t running.
Jan de Mooij generalized baseline JIT type update stubs, allowing us to deal better with polymorphic code. In addition, he made type monitor stubs work with unknown objects/values. He also ensured most RegExp objects are allocated in the incremental GC nursery to avoid requiring a full GC to collect them. He also made Array.prototype.splice() be O(1).
Thomas Nguyen removed a delay to loading web page content when safebrowsing data files are being downloaded in the background.
Jonathan Kew made each frame’s properties be attached to itself rather than all be stored in a shared hashtable.
Kershaw Chang lowered the priority of HTTP requests coming from tracking scripts.
Felipe Gomes made Flash click to play by default! I have lost count of how many years this has been in the works!
Masayuki Nakano removed the PBrowser::Msg_RequestNativeKeyBindings synchronous IPC message.
Kris Maglione took his off-thread script decoding infrastructure designed to improve startup performance to a new level! You may remember this setup was introduced just recently, and now Kris has taught it how to decode multiple scripts at a time, which saves paying the off-thread decoding setup cost for each script. It should be obvious in the graph below from ts_paint measurements when the patches landed!