As promised (with a day of delay), here is an update on what happened in the last two weeks on making Firefox faster as part of the Quantum Flow project.
Last week we had a big work week at the Mozilla Toronto office. Many members of the various teams attended, and the week was packed with planning around the performance issues that have been identified in each area so far and what we are planning to do in each area for Firefox 57 and beyond. I tried to attend as many of the discussions as I could, but of course many of them were happening concurrently, so I'm sure a lot of details are going to be missing. Still, here is a super high-level overview of some of the plans that were being discussed.
- Layout. In the Layout team, we are focusing on improving our reflow performance. One challenge that we have in this area is finding which reflow issues are the important ones. We have done some profiling and measurement and have identified some issues so far, and we can definitely find more, but it's very hard to know how much optimization is enough, which issues are the important ones, and whether we even know about the important problems. The nature of the reflow algorithm makes it really difficult to get great data about this problem without doing a lot of investigation and analysis work. We talked about some ideas on what we can do to improve our workflows, but nobody seemed to have any million dollar ideas. So in the absence of that, we won't be waiting for the perfect data to arrive, and we'll start acting on what we know about for now. Through looking at many reflow profiles, we have also developed some “intuitions” about the kinds of expensive things that typically show up in layout profiles, which we are working on improving. There are also some really bad performance cliffs that we need to try to eliminate.
- Graphics. In the Graphics team, we are planning to make some performance improvements to display list construction by retaining display lists and incrementally updating them instead of reconstructing them every time. This is an extremely nice optimization to have since in my experience display list construction is the bottleneck in many of the cases where we suffer from expensive paints, and it seems like we have telemetry data that confirms this. The Graphics team is also looking into doing some optimizations around frame layer building and display list building based on measurements highlighting places where things could be improved.
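To make the idea of retaining and incrementally updating display lists a bit more concrete, here is a toy sketch in Python. It is only an illustration of the general technique and says nothing about how Gecko actually implements it: `Frame`, `RetainedDisplayList`, and the item strings are all made-up names, and a real display list tracks far more state than a dirty flag.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a layout frame that can be painted."""
    name: str
    dirty: bool = True  # a new frame needs its display items built once


class RetainedDisplayList:
    """Toy retained display list: rebuilds items only for dirty frames."""

    def __init__(self):
        self._items = {}  # frame name -> cached list of display items
        self.builds = 0   # how many times we ran the expensive build step

    def _build_items(self, frame):
        # Stand-in for the expensive per-frame display item construction.
        self.builds += 1
        return [f"background({frame.name})", f"text({frame.name})"]

    def update(self, frames):
        # Drop cached items for frames that no longer exist.
        live = {frame.name for frame in frames}
        for name in list(self._items):
            if name not in live:
                del self._items[name]

        # Rebuild only what changed; reuse the cached items for the rest.
        for frame in frames:
            if frame.dirty or frame.name not in self._items:
                self._items[frame.name] = self._build_items(frame)
                frame.dirty = False

        # The full list is assembled from cached pieces in frame order.
        return [item for frame in frames for item in self._items[frame.name]]
```

With three frames, the first `update()` call builds items for all of them, and a later call after marking one frame dirty rebuilds only that one, whereas rebuilding from scratch would pay for all three on every paint.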
Another thing that happened last week was that I gave a talk on Friday as an introduction on how to use the Gecko Profiler to find performance issues in Firefox. During the week a few people had expressed interest in sitting down with me and looking over my shoulder as I use the profiler to analyze some performance problems, and since there wasn't enough time to sit down with people 1:1, we decided to do a recorded talk. This was decided a few minutes before the talk happened. :-) So, I didn't really have anything prepared in advance, which was both good and bad, since the talk is basically me live profiling Firefox, and it shows how you can start from scratch and go all the way to a bug report. The recording is here: <https://air.mozilla.org/gecko-profiler-introduction/>. If you're interested in learning how to use the profiler and/or how to read and analyze a profile, you may want to check it out! BTW if you ever felt like you could use documentation and/or training material on how to use a profiler (or how to do that more effectively), please feel free to contact me privately with ideas on what you would like to learn. We haven't been so great on the documentation front, and many people have been pointing this out to me, and I hear and acknowledge your feedback! I'd like to try to improve this situation, and your feedback on what you'd like to learn about will help prioritize what's more important.
In hindsight, I wish we had more people from the front-end teams attend the work week as well. The planning of the work week happened a couple of months or so earlier, but due to the nature of this project, the measurements led us to various areas of the code base, and now we have a fair number of issues in the front-end code, so I think perhaps the work week was a bit too focused on Gecko. But on the flip side, I also found the breadth of topics to cover in one week a bit too much, so perhaps adding more people and more meetings to the agenda wouldn't have been a net benefit. A lesson to learn for the next time!
On the bug fixing front, we had a great couple of weeks. There is a long list of really great work that happened, and a list of really amazing individuals to be recognized and appreciated for their contributions. As always I know I'll be missing a few people, so apologies in advance for that!
- Mike Conley made some improvements to our perceived session restore performance by making us smarter about when we update the UI.
- Mike Conley also instrumented our tab closing times so that we can have some telemetry data on some huge tab closing improvements that he is planning to work on.
- Mike Conley also fixed a recent tab switching performance regression.
- And as if none of this was enough, Mike also removed a sync IPC message used when restoring a session!
- Markus Stange made sure (really!) that the compositor process on Windows shows up in the profiles captured by the Gecko Profiler.
- Shih-Chiang Chien implemented retargeting of Necko data delivery notifications to background threads for the content process. This is an important optimization that avoids a round trip to the main thread for things like feeding data to the HTML parser, which runs off the main thread, when loading a web page. We had this optimization before e10s and it's nice to have it again for the content process now!
- Bill McCloskey removed an old expiring telemetry probe which wasn't needed any more and was slow. I don't think anybody is going to miss this old probe!
- Florian Quèze improved the performance of the ctrl+tab switcher code.
- Dão Gottwald and Jared Wein switched our tab throbbers to use CSS animations. This is super nice since if the main thread of the parent process janks during page load, now the animation of the tab throbber will run on the compositor and can proceed smoothly. Also thanks to Jonathan Watt and Daniel Holbert for providing some SVG help on that bug.
- Gijs Kruitbosch massively improved the performance of importing data from Chrome. The importance of changes like this can't be overstated: importing data is precisely the moment when you don't want to turn potential users away from using Firefox. :-)
- David Major found and fixed a quadratic algorithm for insertion of generated content in Gecko.
- Felipe Gomes created a tool to help find code that reads the same preferences over and over again which could benefit from being ported to use a preference observer.
- Tooru Fujisawa optimized the creation of really short strings in SpiderMonkey. As it turns out, really short strings are super common on the Web. Some further investigation is also ongoing.
- Jan de Mooij improved the performance of atomizing strings in SpiderMonkey.
- Chris Pearce moved the media cache away from sync IPC.
- Kris Maglione optimized some of the core Add-on SDK modules. And some of the code thereabouts as well.
- Kartikaya Gupta added some telemetry measures for compositor frame throughput during scrolling/animations.
- David Teller reduced the allocation overhead of the performance monitoring code in SpiderMonkey.
- Nika Layzell also removed the sync IPC that the permission manager used for its initialization, improving our page navigation speed with multi-e10s, while winning some privacy benefits by reducing the set of permissions that the content process knows about to only those of the web pages the content process has loaded.
- Kan-Ru Chen removed all of the sync IPCs used by the screen manager API. This was one of the biggest sync IPC problems that we had and one of the largest sources of pauses on the content process main thread, so it's great to see it finally fixed!
- Nicholas Hurley switched a couple of usages of UUIDs in Necko to integers. It turns out that generating UUIDs and converting them to strings in hot code paths can be expensive and should be avoided at all costs!
- David Anderson improved some of the sync IPC situation with the compositor. The overall issue is difficult to address and it's great to see the low-hanging fruit being fixed here!
- Greg Tatum and Julien Wajsberg improved the profiler UI by adding a context menu that assists in copying information out of the UI, making it unnecessary to have to use devtools to delve into the DOM to copy out information from there! :-)
- The Firefox Screenshots team were very responsive to feedback about assessing any performance issues with the upcoming Firefox Screenshots feature.
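As a footnote to Felipe's item above about code that reads the same preference over and over: the usual fix is to read the value once, cache it, and let a preference observer keep the cache up to date. The Python sketch below only illustrates that pattern. `PrefService`, `CachedPref`, and the preference name are hypothetical and are not the Gecko preferences API.

```python
class PrefService:
    """Hypothetical preference store that counts how often it is read."""

    def __init__(self, values):
        self._values = dict(values)
        self._observers = {}  # pref name -> list of callbacks
        self.reads = 0        # number of (assumed expensive) lookups

    def get(self, name):
        self.reads += 1
        return self._values[name]

    def add_observer(self, name, callback):
        self._observers.setdefault(name, []).append(callback)

    def set(self, name, value):
        self._values[name] = value
        # Push the new value to everyone watching this preference.
        for callback in self._observers.get(name, []):
            callback(value)


class CachedPref:
    """Reads a preference once, then relies on an observer for updates."""

    def __init__(self, prefs, name):
        self.value = prefs.get(name)  # the only lookup we ever perform
        prefs.add_observer(name, self._on_change)

    def _on_change(self, value):
        self.value = value
```

Hot code then checks `cached.value` instead of calling `prefs.get()` each time, so the lookup cost is paid once at setup rather than on every call, while changes made through `prefs.set()` are still picked up.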