Quantum Flow Engineering Newsletter #13
I’m back with some more updates on another week’s worth of work on improving various performance aspects of Firefox.
Similar to the past weeks, Speedometer remains a big focus area for performance work. In addition to the many already identified bugs to work on, we are also still actively measuring the benchmark to look for more optimization opportunities.
Another item worthy of an update is Background Hang Reports. Michael Layzell earlier today enabled collection of native stack traces on Win64 (and Mac) using the Gecko Profiler stack walking backend (Linux support soon to follow). Because we are now using the Gecko Profiler backend for BHR, we can soon get interleaved native and pseudo-stacks from BHR similar to the ones that we have come to know and love in Gecko Profiler for a long time now! Also, Doug Thayer has made a lot of progress on hangs.html, his front-end for exploring the native stack traces uploaded from BHR. This is a nice and super fast tool for exploring the hangs that our users are experiencing on the Nightly channel, and it shows you the corresponding pseudo-stacks, which are extremely helpful if, for example, the hang is coming from chrome-privileged JS (where we get full call stack information through telemetry). Please have a look, and send him feedback.
This edition is exceptionally short, but the most interesting part of these is probably the last part anyway, the credits section, where I acknowledge the hard work of the people who worked on improving the performance of Firefox in the past week. So let’s get to that, and I do hope I’m not dropping any names:
- Dão Gottwald made it so that XUL scrollboxes don’t always cause a style flush when checking whether they can scroll to an element.
- Junior Hsu got rid of some main thread IO that we do when closing the DB connection to the cookies database. This helps with startup performance.
- Florian Quèze deferred the construction and initialization of some search components, UnifiedComplete (for the AwesomeBar), and the sidebar browser until after first paint, which should all help to improve start-up speed. Similarly, he deferred the initialization of Places to after first paint. Additionally, he sped up the appearance of the about:home search bar placeholder text, which is the last part of painting the initial UI in a new browser profile. He also made GetShellFolderPath() call the correct Windows API, making startup quite a bit faster as a result.
- David Baron made it so that we cache font lookups on Windows! Among other things, this should help speed up rendering long <select> dropdowns with e10s enabled.
- Wei-Cheng Pan made it so that updating Windows “jump lists” no longer causes main thread IO!
- Robert Strong removed nsWindowsShellService’s ShortcutMaintenance method, which was slow during first startup and is no longer needed.
- Andrea Marchesini delayed initialization of the ContextualIdentityService module until after first paint.
- Stephen A Pohl made GMPInstallManager extract GMP plugins off the main thread.
- Gabor Krizsanits made it so that short-lived content processes are cached and re-used, which caused some nice performance wins on some of our benchmarks.
- Milan Sreckovic made it so that the prefs file is written to off of the main thread!
- Michael Layzell enabled reporting of native stack traces to BHR on Win64 thanks to David Major’s help in avoiding a deadlock issue. He also fixed a recent performance regression caused by adding a telemetry probe. He furthermore enabled loading pages opened from rel=noopener links in a new process (while some follow-up work is in the process of landing). This is important to enable us to leverage multi-e10s better if sites opt into using this feature.
- Olli Pettay moved GC and CC slices plus the forgetSkippable phase to the idle queue. This is a huge part of the cleanup work that Gecko needs to regularly run as pages keep running. In the old world it could interfere with more important work such as processing an input event or a timeout handler, and now it runs with a lower priority!
- Kyle Machulis removed the dependency of PluginProvider.jsm on NSS, which caused a startup slowdown.
- Mats Palmgren lifted up the nsClassHashtable::LookupForAdd() API into nsBaseHashtable. This API allows you to look up something inside a hashtable and, if the lookup doesn’t find anything, insert a new entry with that key, all without incurring two lookups. He then followed up with a whole bunch of fixes to various code which could use this new more efficient API. (Please consider using it for this use case as well!) Oh, and he also fixed the original bug which was the inspiration for all of this impressive work!
- Evelyn Hung relanded her improvements for spell checker heuristics for a multi-process browser after fixing the existing bug in the spell checker code which got her patch backed out previously.
- Nicholas Nethercote enabled using PROFILER_LABEL macros outside of libxul. This allows profiles that show jank caused by expensive DLL loads to include the name of the DLL being loaded.
- Jon Coppeard parallelised the start of the GC marking phase.
- Jan de Mooij made string matching faster. He also made some specific improvements to the performance of String.replace() that help with SpiderMonkey rope strings.
- Masayuki Nakano made some improvements to the speed of the WindowConstructor() function on Windows that is central to the performance of opening a new window!
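
For anyone wondering why the lookup-or-insert pattern from Mats Palmgren's item above is worth adopting, here is a minimal sketch of the idea. It is not the Gecko API: it uses a plain Python dict as a stand-in for the hashtable, and CountingKey and both helper functions are hypothetical names invented for this illustration. The key counts how often it gets hashed, which serves as a rough proxy for the number of lookups.

```python
class CountingKey:
    """Hypothetical key type that counts how often the table hashes it."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def add_listener_two_lookups(table, key, listener):
    """Check-then-insert: the membership test and the follow-up access each do a lookup."""
    if key in table:            # lookup 1
        listeners = table[key]  # lookup 2 (hit path)
    else:
        listeners = []
        table[key] = listeners  # lookup 2 (miss path)
    listeners.append(listener)
    return listeners


def add_listener_one_lookup(table, key, listener):
    """Lookup-or-insert: setdefault returns the existing entry or inserts the default."""
    listeners = table.setdefault(key, [])
    listeners.append(listener)
    return listeners
```

Assuming CPython, the first helper should hash the key twice per call and the second only once. One difference from the Gecko API is that setdefault always builds its default value up front, so this stand-in only makes sense when that default is cheap to create, as the empty list is here.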