CSS runtime performance

Nolan Lawson, 2022

1 / 125

Hi, my name is Nolan Lawson and today I'd like to talk to you about CSS runtime performance.

Quick note: these slides will be available online.

The speaker notes have lots of links.

Photo of Nolan Lawson on a bike

  • nolanlawson.com
  • Microsoft Edge 2016-2018
  • Salesforce 2018-
2 / 125

So who am I?

If you know me from the internet, it's probably from my blog where I talk about performance, accessibility, web components, etc.

I was on the Microsoft Edge performance team for a couple years, then moved to the performance team at Salesforce, and now I work on our JavaScript framework, Lightning Web Components.

Car stick shift photo

3 / 125

I'd like to start off with a story. When I was learning to drive as a teenager, my car was a stick shift (manual transmission). These are much more common in Europe than in the U.S., but growing up in the Seattle area, this is what I had.

I kinda like stick shifts. You feel more in tune with what the car is doing. By listening to the sounds of the engine, I developed a feel for when to shift from one gear to another, or how to do things like "engine braking," which is actually an efficient use of fuel.

Internal combustion engine photo

4 / 125

Of course, I have no idea how an internal combustion engine actually works. It's a hugely complex thing.

But by listening to the engine and seeing how it reacted to my actions, I learned how to use the engine efficiently.

Composite browser logo of several browsers

5 / 125

This is kind of what I like about web performance. A browser engine is a very complex thing, and I'm not a C/C++ developer.

But through observation of how the engine responds to my inputs, I can become a better web developer, and write more efficient web apps.

And if you know just a bit about how the engine works, you can be an even better web developer.

Chrome DevTools yellow JS part and purple style/layout part

6 / 125

Let's look at a browser perf trace like this one (from the Chrome DevTools). This is the main thread.

There are two main parts: the yellow (JavaScript) part, and the purple (style/layout) part.

Chrome DevTools yellow JS part and purple style/layout part

JavaScript (yellow part)

7 / 125

If you're an experienced web dev, you might look at the JavaScript side and feel pretty comfortable with it.

We see names of functions we wrote. We see libraries we installed from npm. We see frameworks like React.

Chrome DevTools yellow JS part and purple style/layout part

JavaScript (yellow part)

Style/Layout (purple part)

8 / 125

But many people look at the purple part and see a black box. "That's just the browser doing browser things. I couldn't possibly understand that."

Or we say "the purple part doesn't matter." As if the user cares whether their click was delayed by the yellow part or purple part!

This is kind of a "learned helplessness." We feel helpless, so we try to tell ourselves it doesn't matter.

In this talk, I'd like to convince you that you can understand what's going on in there, if you know a little bit about how browsers work.

Three news sites

Three news sites pie chart, mostly JS but 2.6, 3.5, and 4 seconds style/layout in each

9 / 125

To put the purple part in context, I tested three news sites. I used WebPageTest with a simulated low-end Android phone.

Then I categorized the time spent on the main thread.

As you can see, the purple part is not the most important part, but it can be quite big. For the third site in particular, it's worth looking into.

And even for the others, in absolute terms, 2.6s and 3.5s are pretty big! If we can find some quick wins here, that would be great.

How browsers render

Pixel pipeline - JS, style/layout, paint/composite

10 / 125

To understand the purple part, we first need to start with how browsers render content. This process is called "updating the rendering" in the HTML spec. Let's call it "the render loop."

The first step, JavaScript, is where we run some JavaScript that modifies the DOM. Typically, this will be your JavaScript framework rendering, such as React doing its virtual DOM diffing and then eventually putting elements into the DOM.

The next two steps, style and layout, involve applying your CSS to those DOM elements and then laying them out on the page. This is my focus.

The last two steps, paint and composite, are about actually writing pixels to the screen. This is where we have layers, GPU animations, opacity, etc. This is out of scope for this talk.

How browsers render

Style layout

11 / 125

Let's focus on the style/layout part.

Style layout

12 / 125

So let's break down style and layout calculation first. These are two separate steps.

Style layout Style illustration by Lin Clark showing CSS rules being matched with DOM nodes

13 / 125

With style, we're figuring out which CSS rules apply to which elements and computing styles.

Style layout Style illustration by Lin Clark showing CSS rules being matched with DOM nodes Layout illustration by Lin Clark showing measurements and geometry on a page

14 / 125

With layout, we're figuring out how to place those elements geometrically on the page.

Style

h1 {
  padding: 5px;
}
h2 {
  padding: 10px;
}
<h1>Hello</h1>
<h2>World</h2>
15 / 125

So style calculation is about figuring out which elements have which CSS rules. The output of this is called the "layout tree" (or "render tree").

Let's take a simple example. In this case, we have a 5px-padding h1 and a 10px-padding h2. So style calculation is the process of figuring out that

Style

16 / 125

the h1 rule (padding: 5px) applies to the <h1> element, and

Style

17 / 125

the h2 rule (padding: 10px) applies to the <h2> element.

In this case, style calculation is (basically) about applying the CSS selectors, and figuring out that h1 refers to the <h1> element, and h2 refers to the <h2> element.

So in a sense, it's almost as if the browser is taking this page, and turning it into this one:

Style

<h1 style="padding: 5px;" >Hello</h1>
<h2 style="padding: 10px;">World</h2>
18 / 125

Conceptually, this is what style calculation is: it's giving us the same page we would have had if we had used inline styles.

It's also computing the ems, rems, etc. and turning them into pxs, as well as resolving custom properties, cascade, inheritance, etc, but this has less of an impact on perf in my experience.
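To make the unit-resolution step concrete, here is a minimal sketch of turning em and rem lengths into px. The function name toPx and the 16px root font size are assumptions for illustration; real engines handle many more units and edge cases.

```javascript
// Toy length resolver. For non-font-size properties, "em" is relative to the
// element's own font size and "rem" to the root element's font size.
// toPx and the 16px default are illustrative assumptions, not a browser API.
function toPx(value, elementFontSizePx, rootFontSizePx = 16) {
  const match = /^(-?\d*\.?\d+)(px|em|rem)$/.exec(value);
  if (!match) throw new Error(`Unsupported length: ${value}`);
  const amount = Number(match[1]);
  switch (match[2]) {
    case 'px': return amount;
    case 'em': return amount * elementFontSizePx;
    case 'rem': return amount * rootFontSizePx;
  }
}

console.log(toPx('2em', 20));    // 40
console.log(toPx('1.5rem', 20)); // 24
console.log(toPx('5px', 20));    // 5
```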

Layout

<h1 style="padding: 5px;" >Hello</h1>
<h2 style="padding: 10px;">World</h2>
19 / 125

Now let's move on to layout. Conceptually, the output of style is the "inline styles," and that's the input to layout.

Hello world layout with padding/margin highlighted

20 / 125

Now we finally get to the geometry of the page. Layout calculation is where the styles, which have been associated with each element, actually get applied.

In this case, the browser figures out the margin, padding, text wrapping, widths, heights, positions, etc.

Style vs layout performance

Two traces, one with lots of style and the other with lots of layout

21 / 125

Now first off, when you're looking at a perf trace, it's important to understand whether you primarily have a problem with style calculation, layout calculation, or both. Because these two traces are not the same!

Style vs layout performance

Two traces, annotated showing style and layout sections

22 / 125

These two look similar because they're both purple. But in one trace, we have huge style costs, and in the other, we have huge layout costs. The causes of slowness in these two cases are very different!

If you don't remember anything else from my talk, please remember this: style and layout are not the same thing!

The biggest mistake I see people make is seeing one and thinking it's the other. For instance, they have a lot of style costs, but they think it has something to do with the geometry of the page.

Three news sites

Three news sites, style vs layout, mostly layout but one has more style than layout

23 / 125

Just to give an idea of what the style vs layout breakdown might look like, here are those three news sites from earlier. Layout is usually the biggest part, but style is occasionally pretty big. In fact, for the first site, it's spending slightly more time in style than in layout.

I've seen traces where it's almost all style, and where it's almost all layout, and everything in between. It depends.

What slows down style/layout

                                   Style   Layout
Complexity of CSS                  🐌
Complexity of layout                       🐌
DOM size                           🐌      🐌
Repeated re-renders (thrashing)    🐌      🐌
24 / 125

At a high level, if you're seeing a large amount of time spent in style or layout, it usually comes down to one of these things.

What slows down style/layout

25 / 125

Typically either your CSS selectors are too complex, or there are a lot of them, which slows down style calculation. Note this has no effect on layout calculation.

What slows down style/layout

26 / 125

Or your layout itself, i.e. the geometry of the page, is very large or complex, which slows down layout calculation. Note this has no effect on style calculation.

What slows down style/layout

27 / 125

Or your DOM is very large. A bigger DOM just means more work for the browser to do. This affects both style and layout.

What slows down style/layout

28 / 125

Or you are doing repeated re-renders over time, also called thrashing, which slows down both style and layout.

Style performance

29 / 125

To understand style vs layout performance a bit more, we need to go into detail on how each one works. Let's go into style.

Style illustration by Lin Clark showing CSS rules being matched with DOM nodes

30 / 125

Remember, this is about matching up CSS rules with DOM nodes, and computing styles.

Selector performance

"For most websites I would posit that selector performance is not the best area to spend your time trying to find performance optimizations."

– Greg Whitworth, via Enduring CSS by Ben Frain (2016)

31 / 125

When we talk about style performance, we're mostly talking about selector performance. This needs some clarification.

There's a common refrain in the web perf community that CSS selector performance "doesn't matter" or you shouldn't worry about it. Here is one representative quote from Greg Whitworth. I'm picking on Greg because I work with him, but if you Google "CSS performance," you'll see this repeated in multiple places.

I think this is true at the micro level: if you try to micro-optimize your CSS selectors, you're probably wasting your time. But it's less true at the macro level: if you're building a framework or design system, or if you're writing selector patterns that are repeated multiple times on a page, then selector performance can really matter.

I've seen perf regressions that were almost entirely driven by selector performance. And if you see high style calculation costs, then you very likely have a selector perf problem too. So we need to discover the "lost art" of selector performance.

Naïve style calculation

for (const element of page) {
  for (const rule of cssRules) {
    if (rule.matches(element)) {
      /* ... */
    }
  }
}
32 / 125

To understand style performance, first it's important to note how browsers actually implement their style engines, and what they've already optimized.

To illustrate, let's imagine we're building a browser. Here is a naive implementation of style calculation that we might have.

Naïve style calculation

O(n * m)

33 / 125

Unfortunately this naive implementation has a big problem: this is an O(n * m) operation, where n is the number of elements and m is the number of CSS rules. On any reasonably-sized page, the browser would slow to a crawl. So browsers try to avoid this naive case wherever possible.
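To see the O(n * m) cost in action, here is a runnable toy version of the naive loop. The element objects, the matches helper, and the rule list are all made up for illustration; this only supports single tag, #id, and .class selectors and is not how any real engine is written.

```javascript
// Toy model: an "element" is a plain object, and a selector is a single
// tag, #id, or .class. Every name here is a hypothetical stand-in.
function matches(selector, element) {
  if (selector.startsWith('#')) return element.id === selector.slice(1);
  if (selector.startsWith('.')) return element.classes.includes(selector.slice(1));
  return element.tag === selector;
}

function naiveStyleCalculation(elements, rules) {
  let checks = 0;
  const matched = new Map(); // element -> list of matching selectors
  for (const element of elements) {
    for (const rule of rules) {
      checks++; // one check per (element, rule) pair
      if (matches(rule.selector, element)) {
        if (!matched.has(element)) matched.set(element, []);
        matched.get(element).push(rule.selector);
      }
    }
  }
  return { matched, checks };
}

const elements = [
  { tag: 'h1', id: '', classes: [] },
  { tag: 'h2', id: '', classes: ['foo'] },
  { tag: 'span', id: 'bar', classes: [] },
];
const rules = [
  { selector: 'h1' },
  { selector: 'h2' },
  { selector: '.foo' },
  { selector: '#bar' },
];

const naiveResult = naiveStyleCalculation(elements, rules);
console.log(naiveResult.checks); // 12, i.e. 3 elements * 4 rules
```

Doubling either the number of elements or the number of rules doubles the number of checks, which is why real engines work so hard to skip most of them.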

34 / 125

For example, let's look at a simple DOM tree. In this case, we have multiple selectors in our stylesheet, and we need to match them with DOM nodes.

35 / 125

If we were doing the naive approach, then the browser would have to walk through the entire DOM and check every selector against every DOM node.

You can see how this would be inefficient, especially if it runs every time the DOM changes!

Style optimization 1: hash maps

  • Tags: span → span, a → a:last-child
  • IDs: bar → #bar
  • Classes: foo → .foo
36 / 125

So let's add a simple optimization to our toy browser. For every tag name, ID, and class, we'll create a hashmap mapping those strings to the list of selectors for that string.

This is pretty reasonable, because tag names for an element never change, and IDs and classes are pretty small and simple most of the time.

37 / 125

As you can see, this has a big impact on the efficiency of our algorithm. Rather than checking all selectors, we can short-circuit to only those selectors that could possibly match.
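Here is a rough sketch of that hashmap idea for our toy browser, using the same four selectors as the slide. The function names buildIndex and candidateSelectors are invented for this example, and keying each selector by the part before its first pseudo-class is a simplifying assumption.

```javascript
// Build one lookup table each for tags, IDs, and classes.
// buildIndex and candidateSelectors are hypothetical names for this sketch.
function buildIndex(selectors) {
  const index = { tags: new Map(), ids: new Map(), classes: new Map() };
  for (const selector of selectors) {
    const base = selector.split(':')[0]; // drop pseudo-classes such as :last-child
    let bucket;
    let key;
    if (base.startsWith('#')) {
      bucket = index.ids;
      key = base.slice(1);
    } else if (base.startsWith('.')) {
      bucket = index.classes;
      key = base.slice(1);
    } else {
      bucket = index.tags;
      key = base;
    }
    if (!bucket.has(key)) bucket.set(key, []);
    bucket.get(key).push(selector);
  }
  return index;
}

// Only selectors filed under this element's tag, ID, or classes are returned.
// They are candidates: a:last-child still needs a full match afterwards.
function candidateSelectors(index, element) {
  const candidates = [...(index.tags.get(element.tag) ?? [])];
  if (element.id) candidates.push(...(index.ids.get(element.id) ?? []));
  for (const cls of element.classes) {
    candidates.push(...(index.classes.get(cls) ?? []));
  }
  return candidates;
}

const index = buildIndex(['span', 'a:last-child', '#bar', '.foo']);
const candidates = candidateSelectors(index, { tag: 'a', id: 'bar', classes: [] });
console.log(candidates); // [ 'a:last-child', '#bar' ]
```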

38 / 125

As you can see, this has a big impact on the efficiency of our algorithm. Rather than checking all selectors, we can short-circuit to only those selectors that could possibly match.

39 / 125

Now, there's still a problem with our algorithm. What about descendant selectors? In this case, we need to find all .bar elements inside of a .foo.

40 / 125

So we have to traverse the descendants of .foo to try to find all the .bar elements.

The hashmap doesn't really help us here, because it's more about the relationship between the two nodes. So this is pretty inefficient. We're walking a lot of DOM nodes just to find the .bar elements.

Style optimization 2: right-to-left

Foo pointing at bar, bar pointing at foo

41 / 125

So here's another optimization we can do. How about instead of walking from the left to the right, we evaluate the selector from right to left?

So instead of going foo then bar, we would go bar then foo.

42 / 125

Here is our same DOM tree from before.

43 / 125

It turns out we check a lot fewer DOM nodes this way.

You may have heard that browser engines evaluate CSS selectors from right to left. If you've ever wondered why, this is the reason! Any given node in the DOM tree tends to have fewer ancestors than descendants, so this optimization works out really well for most DOM trees.
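As a small sketch of right-to-left matching for a selector like .foo .bar: test the node itself against the right-hand side first, and only walk up the ancestor chain if that succeeds. The toy node shape and the matchesDescendant helper are assumptions for illustration.

```javascript
// Toy DOM: each node has a classes array and a parent pointer (null at the root).
function node(classes, parent = null) {
  return { classes, parent };
}

// Match a two-part descendant selector such as ".foo .bar" right to left.
function matchesDescendant(el, ancestorClass, targetClass) {
  if (!el.classes.includes(targetClass)) return false; // cheap early exit
  for (let p = el.parent; p !== null; p = p.parent) {
    if (p.classes.includes(ancestorClass)) return true;
  }
  return false;
}

const root = node([]);
const foo = node(['foo'], root);
const inner = node([], foo);
const barInside = node(['bar'], inner);
const barOutside = node(['bar'], root);

console.log(matchesDescendant(barInside, 'foo', 'bar'));  // true
console.log(matchesDescendant(barOutside, 'foo', 'bar')); // false
console.log(matchesDescendant(inner, 'foo', 'bar'));      // false
```

Nodes that are not .bar are rejected without touching any other node, and the ones that are only cost a walk up their (usually short) ancestor chain.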

Problem: generic descendants

.foo div
44 / 125

This right-to-left technique works out pretty well. But we have another problem. What about selectors like this one?

The right-hand-side (i.e. the descendant) is pretty generic. Most DOM trees have a lot of divs.

45 / 125

Consider this DOM tree, where we have a lot of divs and want to find .foo div.

46 / 125

With the right-to-left technique, we have to traverse a lot of ancestor chains just to find .foo.

So how can we solve this?

Style optimization 3: Bloom filter

Bloom filter illustration from Wikipedia

47 / 125

Enter the Bloom filter. WebKit began using this technique in 2011. All browsers have it now.

You can think of a Bloom filter as a Hash Set that may give false positives, but never gives false negatives. It's fast and has low memory overhead.

Here, we hash the strings x, y, and z, and insert them into the Bloom filter. Then we check if w is in there by hashing it.

48 / 125

How does this work in the DOM tree? Well basically, the browser keeps a little Bloom filter for each node, holding its ancestors' tag names, IDs, and classes.

This means that if we're on div, and we want to figure out if .foo is an ancestor, then we don't have to walk up the tree – we know instantly, because .foo is in the Bloom filter.

This particular optimization was first applied by WebKit in 2011 and now all browsers have it.

49 / 125

So now we can quickly filter all the divs based on the Bloom filter, "fast rejecting" any that couldn't possibly have .foo as an ancestor.

Note that, because we could have false positives, we still need to walk the ancestor chain to check that it really has .foo as an ancestor, but we are still eliminating a lot of work.
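The fast-reject idea can be sketched with a tiny Bloom filter. Everything here (TinyBloomFilter, the two hash functions, the 64-bit size, the makeNode and hasAncestor helpers) is made up for illustration; real engines use their own hashing and build the filter differently.

```javascript
// A minimal Bloom filter: may report false positives, never false negatives.
class TinyBloomFilter {
  constructor(bits = 64) {
    this.bits = bits;
    this.slots = new Uint8Array(bits);
  }
  // Two simple string hashes, chosen only for brevity.
  hashes(str) {
    let h1 = 0;
    let h2 = 5381;
    for (let i = 0; i < str.length; i++) {
      const c = str.charCodeAt(i);
      h1 = (h1 * 31 + c) >>> 0;
      h2 = ((h2 * 33) ^ c) >>> 0;
    }
    return [h1 % this.bits, h2 % this.bits];
  }
  add(str) {
    for (const h of this.hashes(str)) this.slots[h] = 1;
  }
  mightContain(str) {
    return this.hashes(str).every((h) => this.slots[h] === 1);
  }
}

// Each toy node gets a filter holding the keys of all of its ancestors.
function makeNode(keys, parent = null) {
  const ancestorFilter = new TinyBloomFilter();
  for (let p = parent; p; p = p.parent) {
    p.keys.forEach((key) => ancestorFilter.add(key));
  }
  return { keys, parent, ancestorFilter };
}

function hasAncestor(el, key) {
  if (!el.ancestorFilter.mightContain(key)) return false; // fast reject
  // The filter said "maybe", so confirm by walking the ancestor chain.
  for (let p = el.parent; p; p = p.parent) {
    if (p.keys.includes(key)) return true;
  }
  return false; // it was a false positive
}

const main = makeNode(['main', '.foo']);
const div = makeNode(['div'], main);
const orphanDiv = makeNode(['div']);

console.log(hasAncestor(div, '.foo'));       // true
console.log(hasAncestor(orphanDiv, '.foo')); // false
```

Because of the confirming walk, a false positive only costs extra time; it never produces a wrong match.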

What's in the Bloom filter?

Supported?   Type                 Example
✅           ID                   #id div
✅           Class                .foo div
✅           Tag                  main div
✅           Attribute 🆕         [foo] div
⚠️           Attribute value 🆕   [foo="bar"] div
❌           Other stuff          :nth-child(2) div
50 / 125

So what's in the Bloom filter?

What's in the Bloom filter?

51 / 125

Originally it was only IDs, classes, and tags. WebKit invented this in 2011.

What's in the Bloom filter?

52 / 125

In 2018 WebKit added attributes, and Firefox and Chrome added them in 2021 after I filed bugs on them.

Note that the attribute optimization only applies to attribute names, not values, but attribute value selectors can kind of piggyback off of them because the browser will quickly check if any ancestors even have the attribute name, before checking the value.

What's in the Bloom filter?

53 / 125

Other stuff could be optimized in theory, but last I checked (late 2022), no browsers have expanded the Bloom filter to anything else.

Bloom filter source:

Browser style optimizations

54 / 125

Now, there are many more browser style optimizations than what I've mentioned here. Here are a few more.

Browser style optimizations

  • WebKit CSS JIT (2014)
55 / 125

WebKit has a JIT where they actually compile some selectors directly to assembly. Pretty impressive!

Browser style optimizations

  • WebKit CSS JIT (2014)
  • Firefox Stylo (2017)
56 / 125

Firefox has a multithreaded style calculation engine called Stylo.

Browser style optimizations

  • WebKit CSS JIT (2014)
  • Firefox Stylo (2017)
  • WebKit :has (2022)
57 / 125

And recently both WebKit and Chromium implemented :has(), which can be thought of as an ancestor selector. (How did they make this fast? You guessed it... another Bloom filter. Like the other one, this one has classes, IDs, tags, and attributes, but it also adds :hover, and they hint that they may add other pseudo-classes later.)

My goal in telling you this is not to say that you need to memorize whatever optimizations browsers have.

I just want to give you an appreciation for all the work a browser has to do to make it so that, 9 times out of 10, you don't have to worry about selector performance.

Notes:

Improving style performance

58 / 125

So now, knowing a bit more about how browsers work under the hood, what can we as web developers do if we see high style calculation costs?

Remove unused CSS

Unused CSS in Chrome coverage tool

59 / 125

Well, one thing you can do to reduce style calculation costs is to remove unused CSS. (The example shows a screenshot from the Chrome DevTools "Coverage" tool.)

After all, the browser doesn't know your CSS is unused until it runs the algorithm to determine that it's unused. And it may have to run this algorithm every time something on the page changes. So unused CSS can cost you on the main thread throughout the lifetime of your web app.

So trim that unused CSS!

Avoid excessive complexity in selectors

[class*="foo"] :nth-child(2) > p ~ *
60 / 125

These kinds of selectors are cute, but if we imagine matching right-to-left, we can see why they might be expensive.

Now, you can probably have a few of these on your page and you'll be fine. It's more of a problem at the macro level, e.g. if you're building a framework or design system that repeats these selectors multiple times.

Some folks use preprocessors like Sass or Less, and it's easy to put something like this in a for-loop.

Rough selector cost estimate

~Cost   Type              Examples
✅      ID, class, tag    #id   .cls   a
⚠️      Descendant        .foo *   .foo > *
⚠️      Attribute name    [foo]
🌶️      Attribute value   [foo="bar"]   [foo*="bar"]
🌶️      Sibling           .foo ~ *   .foo + *
🌶️      Pseudo-class      :nth-child()   :nth-of-type()
61 / 125

This is my extremely rough estimate of selector costs.

Rough selector cost estimate

62 / 125

In general, browsers have heavily optimized for things like tag names, IDs, and classes.

Rough selector cost estimate

63 / 125

Descendant selectors have the Bloom filter optimization, but it doesn't always work.

Attribute names have historically been less optimized. They're still typically slower than classes, but WebKit and Firefox made optimizations recently (in addition to the Bloom filter optimizations), after I began writing about this on my blog.

Rough selector cost estimate

64 / 125

Unlike attribute names, attribute values don't go in the hashmap or Bloom filter. Watch out especially for slow "search" selectors such as [foo*="bar"].

Sibling selectors tend to be less optimized. A non-adjacent sibling combinator (~) combined with a generic right-hand side can cause a lot of matching.

Pseudos like :nth-child() and :nth-of-type() tend to be less optimized, although browsers have specific optimizations for common ones like :hover and :focus.

Rough selector cost estimate

IT DEPENDS

65 / 125

But remember: this varies from browser to browser, it could change tomorrow, and there are always exceptions.

chrome tracing, edit categories, blink.debug

66 / 125

To find out which selectors are actually slow, there is a new tool that was released in Chromium earlier this year. If you enable blink.debug when using Chrome tracing...

chrome tracing, document update style, selector stats slice, table of data, elapsed, fast reject count, match attempts, match count, list of selectors

67 / 125

Then you can get this view of the "selector stats." If you sort by elapsed time, you can actually see your most expensive CSS rules ranked from most to least expensive.

The time shown is in microseconds. We also have match attempts, match count, and fast reject count. Fast reject count is the number of Bloom filter rejections.

Note this tool may be more useful at the macro level than the micro level. Look for repeated patterns coming from a framework or CSS processor.

This tool may land in DevTools soon!

Use shadow DOM

<my-component>
  #shadow-root
    <style> div { color: red } </style>
    <div>Hello!</div>
</my-component>
68 / 125

Shadow DOM is interesting because it encapsulates styles. They don't bleed in and out of this component.

This scoping reduces both the n and the m in our O(n * m) algorithm from earlier. The browser doesn't need to check as many rules against as many elements: it's clear which ones can apply to each other.
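A back-of-the-envelope calculation shows why shrinking both n and m matters. The page dimensions below are invented, and the counts are worst-case figures for the naive algorithm only; real browsers skip most of these checks thanks to the optimizations covered earlier.

```javascript
// Hypothetical page: these numbers are made up for illustration.
const components = 100;
const elementsPerComponent = 10;
const rulesPerComponent = 10;

// Unscoped: in the naive model, every rule is checked against every element.
const unscopedChecks =
  (components * elementsPerComponent) * (components * rulesPerComponent);

// Scoped (e.g. shadow DOM): each component's rules only meet its own elements.
const scopedChecks = components * elementsPerComponent * rulesPerComponent;

console.log(unscopedChecks); // 1000000
console.log(scopedChecks);   // 10000
```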

Use scoped styles

:nth-child(2) * /* Input */
.xyz:nth-child(2) .xyz /* Svelte output */
69 / 125

An alternative to shadow DOM is to use style scoping from frameworks like Vue or Svelte. In a sense, these "polyfill" shadow DOM style scoping by modifying the selectors with unique classes, tags, or attributes.

In this case, you can see how Svelte effectively turns an inefficient selector into an efficient selector that benefits from both the hashmap and Bloom filter optimization.
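To give a feel for what such a scoping transform does, here is a toy version that reproduces the slide's input and output. This is not Svelte's actual algorithm: scopeSelector is a hypothetical helper that assumes whitespace-separated parts and ignores attribute selectors, commas, and other edge cases.

```javascript
// Toy selector scoper: adds a scope class to every compound selector.
// scopeSelector is an illustrative helper, not a real framework API.
function scopeSelector(selector, scopeClass) {
  const combinators = new Set(['>', '~', '+']);
  return selector
    .trim()
    .split(/\s+/)
    .map((part) => {
      if (combinators.has(part)) return part; // leave combinators alone
      if (part === '*') return `.${scopeClass}`; // universal selector becomes the class
      const pseudoStart = part.indexOf(':');
      // Insert the class before any pseudo-class, or append it otherwise.
      return pseudoStart === -1
        ? `${part}.${scopeClass}`
        : `${part.slice(0, pseudoStart)}.${scopeClass}${part.slice(pseudoStart)}`;
    })
    .join(' ');
}

console.log(scopeSelector(':nth-child(2) *', 'xyz')); // .xyz:nth-child(2) .xyz
console.log(scopeSelector('ul > li', 'xyz'));         // ul.xyz > li.xyz
```

After the transform, every part of the selector carries a class, so both the rightmost part and its ancestor can be looked up by class instead of being checked against every element.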

Benchmark data, see blog post for full details

70 / 125

This chart shows my benchmark results (median of 25 iterations) on a 2014 Mac Mini. The benchmark generates some random CSS, and compares running it through various scoping algorithms vs using shadow DOM vs just leaving the styles unscoped. It uses 1,000 generated "components," with 10 styles per component.

Note though that this is a microbenchmark, and this is not exactly comparing apples-to-apples. But the point is to show that shadow DOM and scoped styles do have a perf impact.

If you're writing a framework yourself, you may be interested in the differences between the scoping strategies, which is covered in my blog post.

Concatenate stylesheets (Chromium-only)

Benchmark data showing Chrome getting slower for more stylesheets

71 / 125

I hesitated to mention this one, because it only applies to Chromium, but it's a big optimization.

It turns out that in Chromium, one big stylesheet is faster than multiple small stylesheets for style calculation. It has no impact on Safari or Firefox, and the Chromium devs may fix it eventually, but it's something to be aware of.

This benchmark shows the median of 25 iterations. The styles are exactly the same; it's just the number of individual <style>s that's different. The fastest is to concatenate everything into one big stylesheet.

So the advice would be to concatenate stylesheets as much as possible across pages, similar to how we do concatenation and chunking for JavaScript modules in bundlers like Webpack. There are implications for cache performance here as well, though, so don't over-concatenate. Also try to avoid modifying the stylesheets after inserting them; if you do modify them, it's better to keep them separate according to the Chromium devs.

Layout performance

72 / 125

OK, so now that I've covered all the bases on style performance, I want to move on to layout performance.

Layout illustration by Lin Clark showing measurements and geometry on a page

73 / 125

Remember, this is about the geometry of the page.

74 / 125

So let's say we have a simple layout like this. We've got a header, a sidebar, and the main content.

Each of these boxes contains other boxes, but the browser already knows which elements have which styles, so it's just a matter of laying them out.

75 / 125

This can get very complicated, because some of these boxes may have absolute/relative positioning, others may use flexbox, others may use grid, etc.

And the browser has to calculate all the boxes for these things relative to each other.

76 / 125

So we have our simple layout.

77 / 125

But let's say our main content suddenly takes up a bit more space, so now the sidebar has to shrink.

78 / 125

So now the browser has to recalculate the geometry of all these boxes.

79 / 125

Why might this happen? Well maybe we ended up with some super long text that pushed the size of the box out. If that happens, then the browser might have to recalculate the layout for everything inside the sidebar.

Now as the author of the page, you might know this is never going to happen. And even if it did, you'd want to clip the text or something anyway. But is there some simpler way we can reassure the browser that these boxes are never going to change size?

CSS containment

80 / 125

Is there a way we can reassure the browser that a change in one box won't affect the other boxes? Yes, it's called CSS containment.

81 / 125

If we apply contain: strict to each of these boxes, then the browser can calculate their sizes independently of each other. This has the potential to speed up layout performance.

82 / 125

Now, this has some downsides. If there's a dropdown or something that you want to peek out of one box and into another one...

83 / 125

...then it'll get cut off if we use CSS containment, because we promised the browser that one box wouldn't bleed into another.

CSS containment

  • contain: content
  • contain: strict
84 / 125

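Now there are two (main) values: content and strict. content is a little laxer than strict and is easier to apply broadly, because it leaves out size containment: a box with contain: content is still sized by its descendants. strict adds size containment, so the box needs a size that doesn't depend on its contents, which makes it a bit harder to pull off.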

My recommendation would be to try applying these to logically separate parts of your page (sidebars, modals, individual items in a list, etc.), and then measure and see if it improves layout performance.
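If you'd rather experiment from JavaScript than edit your stylesheets, a tiny sketch like this can toggle containment on a set of elements so you can compare traces before and after. `applyContainment` and the selector in the comment are made up for illustration.

```javascript
// Hypothetical helper: set the CSS `contain` property on each element.
// `value` would typically be 'content' or 'strict'.
function applyContainment(elements, value) {
  for (const el of elements) {
    el.style.contain = value;
  }
}

// In a browser (this selector is just an example):
// applyContainment(document.querySelectorAll('.sidebar, .modal, .list-item'), 'content');

// Outside a browser, plain objects with a `style` property stand in for elements.
const fakeElements = [{ style: {} }, { style: {} }];
applyContainment(fakeElements, 'content');
console.log(fakeElements.map((el) => el.style.contain)); // [ 'content', 'content' ]
```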

Containment benchmark, Chrome improves for contain content and contain strict, Firefox only for strict, Safari not at all

85 / 125

It's a bit hard to predict what optimizations a browser will apply, but here is a benchmark I put together based on one built by Manuel Rego Casasnovas. It renders 100 items, changes the text 100 times, and measures the result. (Median of 25 iterations, 100 items.)

As you can see, in Chrome you get most of the benefit with contain: content and don't need to go as far as contain: strict. With Firefox, the benefit only comes with contain: strict. In Safari there isn't any effect.

Keep in mind that browsers may implement multiple different optimizations for CSS containment, or none at all. The spec doesn't require that a browser implement any optimizations – only that the observable effects of containment apply. Also note that this is just one benchmark, and you may see different results in your own page.

(Also note that I tested in Firefox Nightly, and they seem to have the same perf optimization as Chrome now.)

Encapsulation

                      Shadow DOM   CSS containment
Encapsulates          Style        Layout
Improves first calc?  Yes          No
86 / 125

Now, CSS containment is a form of encapsulation. You might recall I also referred to shadow DOM as a form of encapsulation. What's the difference?

Well, shadow DOM encapsulates your styles and improves style calculation. Whereas CSS containment encapsulates your layout and improves layout performance.

So if you have high style costs, CSS containment can't help you. And if you have high layout costs, shadow DOM can't help.

Also note that CSS containment can only make subsequent layouts faster. It provides a hint that if one part of the DOM changed, then another part doesn't need to be invalidated.

Some browsers like Firefox have experimented with doing layouts in parallel, but since no browser actually implements parallel layout, CSS containment cannot speed up the first layout pass. Whereas for style calculation, shadow DOM can actually improve style performance for the first style calculation because it's just about reducing the n and m in that O(n*m) algorithm I mentioned.

Improving layout performance

89 / 125

Other than CSS containment, I can only share a few general tips on improving layout performance.

Improving layout performance

  • Explicit is better than implicit
90 / 125

First off, explicitly telling the browser the sizes of things will always be less work than asking it to run its layout algorithm. If you know the exact width/height of something, you can set the explicit size rather than letting the browser calculate it.

Improving layout performance

  • Explicit is better than implicit
  • Use fewer DOM nodes (e.g. virtualization)
91 / 125

Also, of course, use fewer DOM nodes. If you have an infinite-scrolling list, use virtualization so that you're not rendering a bunch of DOM nodes that are off-screen.

Improving layout performance

  • Explicit is better than implicit
  • Use fewer DOM nodes (e.g. virtualization)
  • Use display:none and content-visibility
92 / 125

If you use something like display:none, it will also avoid paying the layout cost for everything that is currently being hidden.

There is also a new property that you can use called content-visibility, that allows the browser to skip rendering large portions of the page while still allowing them to be searchable with Cmd-F/Ctrl-F.

So I'd say if you have high layout costs, try CSS containment first, then try these techniques.
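For the virtualization tip above, the core of it is just working out which items are actually on screen. Here's a minimal sketch, assuming every item has the same fixed height; `visibleRange` and its parameter names are hypothetical, not from any particular library.

```javascript
// Hypothetical helper: given the scroll position, work out which fixed-height
// items need real DOM nodes. `overscan` keeps a few extra items rendered on
// each side of the viewport.
function visibleRange({ scrollTop, viewportHeight, itemHeight, itemCount, overscan = 2 }) {
  const first = Math.floor(scrollTop / itemHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / itemHeight) - 1;
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount - 1, last + overscan),
  };
}

const range = visibleRange({
  scrollTop: 1000,
  viewportHeight: 600,
  itemHeight: 50,
  itemCount: 10000,
});
console.log(range); // { start: 18, end: 33 }
```

In this example only 16 of the 10,000 items would need DOM nodes, so the browser has far fewer boxes to lay out.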

Invalidation

93 / 125

OK, so now that I've covered the principles of style and layout, and how they're different, I want to move on to topics that affect both style and layout calculation.

Up to now, we've mostly talked about what happens to a page that calculates style/layout once. But of course, a lot of us are building very dynamic pages that are constantly changing, so the browser calculates style/layout more than once. This process is called "invalidation."

Invalidation

94 / 125

Basically it means we are going from one layout state to another state.

This can typically happen for two different reasons: either 1) the DOM changed, and/or 2) the CSS rules changed.

Invalidation

element.style.width = '200px';
95 / 125

Invalidation could be as simple as this – changing the width of an element.

Invalidation

Pixel pipeline, requestAnimationFrame between JS and style

96 / 125

When the browser detects this, it will automatically re-render during the next style/layout pass, which happens on the next frame.

That's why, when you call requestAnimationFrame, you get the point in time right before the next style/layout operation.

Forcing style/layout calculation

element.style.width = '200px'; // Invalidate
element.getBoundingClientRect(); // Force style/layout
97 / 125

Now, normally this is fine. But it gets dangerous if you're explicitly telling the browser that you want style and layout to be calculated immediately, rather than waiting for the next frame.

Here we're invalidating by setting the width of the element to 200px. This doesn't actually cause the browser to do any style/layout work yet. Normally it would happen in the next frame.

But instead, we immediately call element.getBoundingClientRect(). This forces the browser to immediately and synchronously calculate both style and layout.

APIs that force style/layout recalc

  • getBoundingClientRect
  • offsetWidth
  • getComputedStyle
  • innerText 🤯
  • etc.
100 / 125

Paul Irish has a complete list of APIs that force style/layout recalc.

Some APIs are a bit surprising (like innerText). Some force style and layout, whereas others only force style (e.g. getComputedStyle).

Layout thrashing

for (const el of elements) {
  const width = el.parentElement.offsetWidth; // Read: force style/layout
  el.style.width = width + 'px'; // Write: invalidate
}
101 / 125

This leads us to layout thrashing.

Layout thrashing is a situation where, in a loop, you're both writing to the DOM (invalidating) and reading from the DOM (forcing style/layout). This forces the browser to re-run style and layout repeatedly.

So in this case, the offsetWidth access is where we're reading from the DOM, and the el.style.width assignment is where we're writing to the DOM.

Layout thrashing, many small purple slices

104 / 125

The telltale sign of layout thrashing is these repeated sections of purple in the DevTools, and the warning about "forced reflow." (Reflow is another name for layout.)

Solving layout thrashing

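// Assumes elements is an array (a NodeList has no map method)
// Read phase: all the reads happen together
const widths = elements.map(el => el.parentElement.offsetWidth);
// Write phase: all the writes happen together
elements.forEach((el, i) => {
  el.style.width = widths[i] + 'px';
});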
105 / 125

In these cases, it's better to batch your reads and writes together.

So that you do all the reads at once, followed by all the writes.

This ensures you pay for style calculation at most twice: once during the reads, and again during the writes.

Solving layout thrashing

One big purple slice in dev tools

108 / 125

If you do this correctly, then you should see one big style/layout cost (or at most two) rather than multiple. This allows the browser to be more efficient because it's doing all the calculations at once rather than piece by piece.
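If your reads and writes are scattered across different parts of your code, one way to batch them is a small scheduler that queues them separately. This is just a sketch under my own assumptions: `createBatcher` is a made-up name, and the plain objects below stand in for real DOM elements so the code can run outside a browser.

```javascript
// Hypothetical scheduler: callers queue DOM reads and writes separately, and
// flush() runs every queued read before any queued write.
function createBatcher() {
  let reads = [];
  let writes = [];
  return {
    read(fn) { reads.push(fn); },
    write(fn) { writes.push(fn); },
    flush() {
      const pendingReads = reads;
      const pendingWrites = writes;
      reads = [];
      writes = [];
      for (const fn of pendingReads) fn();
      for (const fn of pendingWrites) fn();
    },
  };
}

// Stand-ins for real elements; in a page these would come from the DOM.
const elements = [
  { parentElement: { offsetWidth: 100 }, style: {} },
  { parentElement: { offsetWidth: 250 }, style: {} },
];

const batcher = createBatcher();
for (const el of elements) {
  let width;
  batcher.read(() => { width = el.parentElement.offsetWidth; });
  batcher.write(() => { el.style.width = width + 'px'; });
}
batcher.flush();
console.log(elements.map((el) => el.style.width)); // [ '100px', '250px' ]
```

In a real page you'd probably call flush() from requestAnimationFrame, so that all the queued work lands right before the browser's own style/layout pass.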

Demo

Don't be misled

Chrome warning about forced reflow on big purple style

109 / 125

Note that the DevTools can be misleading. They warn you about "forced reflow" anytime you use one of the APIs that force style/layout, such as getBoundingClientRect or offsetWidth.

But if you're only reading from the DOM once, then it's useless to eliminate that call; you're just moving the cost around.

Don't be misled

Big purple slice of same size, no warning

110 / 125

Here we've gone through a lot of effort to remove that getBoundingClientRect call. And the Chrome DevTools have rewarded us! Our "Recalculate style" doesn't have a little red triangle with a warning anymore.

But the result is the same. All we did was move the style/layout costs from the getBoundingClientRect to the next style/layout calc on the next frame. The total time spent is the same. So this DevTools warning can be very misleading.

Demo

Invalidation optimizations

  • Invalidation sets (Chromium)
  • Rule Tree (Firefox)
  • LayoutNG (Chromium)
  • Etc.
111 / 125

Browsers have a lot of invalidation optimizations. They try to avoid unnecessary work.

You change a class, and they try to only check rules related to that class. You change one part of the DOM, and they try to only redo layout for that part of the DOM (although containment can help).

Typical page flow

One big upfront style/layout cost, many small subsequent ones

112 / 125

This is why, in the typical flow for a web app, we have a lot of high upfront style/layout costs, and very small residual style/layout costs on every interaction. This is a good thing – this is what we want.

Sometimes though, these residual costs are surprisingly high. And it can be really tricky to figure out exactly why something was invalidated and is causing these high residual costs.

Invalidation tracking option in devtools settings

113 / 125

One tool you can use to inspect this is "invalidation tracking" in the Chrome DevTools.

Note that this only really works for Chrome – other browsers have different heuristics and different performance characteristics when it comes to invalidation.

List of CSS selectors corresponding to invalidated elements

114 / 125

If you do this, and then you click on the "Recalculate style" or "Recalculate Layout" slice in DevTools, then it will show you which CSS rules were invalidated for which elements (in the case of style recalc), or which elements needed layout (in "recalculate layout"). This can be invaluable in debugging high invalidation costs!

For instance, you might use this to find out that you have an expensive animation that you're paying for, even though it's off-screen or otherwise invisible in the DOM.

Avoid invalidating global CSS

Chrome dev tools showing steadily increasing style costs

115 / 125

Another thing to be aware of with invalidation: not all invalidations are created equal. In general, invalidating CSS rules is more expensive than invalidating DOM elements. In particular, when you insert new CSS rules at the global level, the browser may recalculate all styles for every existing element on the page.

In this benchmark, I'm inserting <style> tags in each rAF, and you can see the style costs just steadily increasing, even though I'm not inserting any new DOM nodes.

So the best practice is to batch your CSS rule insertions.
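Here's a minimal sketch of what batching could look like, assuming you control the code that inserts styles. `createStyleBatcher` and the `insert` callback are hypothetical names; the callback is injected so the example can run without a browser.

```javascript
// Hypothetical batcher: collect CSS text from many callers and hand it to
// `insert` as one string, so the page gets one new <style> instead of many.
function createStyleBatcher(insert) {
  let pending = [];
  return {
    add(cssText) { pending.push(cssText); },
    flush() {
      if (pending.length === 0) return false;
      insert(pending.join('\n'));
      pending = [];
      return true;
    },
  };
}

// In a browser, `insert` might create a single <style> element:
// const browserBatcher = createStyleBatcher((css) => {
//   const style = document.createElement('style');
//   style.textContent = css;
//   document.head.appendChild(style);
// });

const inserted = [];
const styleBatcher = createStyleBatcher((css) => inserted.push(css));
styleBatcher.add('.a { color: red; }');
styleBatcher.add('.b { color: blue; }');
styleBatcher.flush();
console.log(inserted.length); // 1
```

The idea is that components call add() whenever they like, and flush() runs once (say, once per frame), so the global styles get invalidated once rather than once per component.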

Conclusion

116 / 125

New CSS features

  • Container queries
  • :has selector
  • Cascade layers
  • Scoping
  • Nesting
117 / 125

CSS has been getting a lot of new features recently. Here are some new and draft specs.

New layout features

  • Subgrid
  • Masonry
  • Multi-column layout
118 / 125

Layout has been getting new features too.

All of these features are cool, and you should be using them. CSS is great.

Chrome DevTools yellow JS part and purple style/layout part

JavaScript (yellow part)

Style/Layout (purple part)

119 / 125

The more work you do in the purple part and the less in the yellow part, the more performant your page will tend to be.

But as we lean more on CSS and try to build more ambitious apps, I worry that we'll see more time spent in the purple part. And it's not good enough to treat it as a black box.

Unfortunately it's harder to understand the purple than the yellow, because JavaScript is imperative, whereas CSS is declarative.

With JavaScript, there's a 1-to-1 mapping between the algorithm we write and the perf trace.

Whereas with CSS, we give a big declarative blob to the browser and tell the browser to implement the algorithm.

SQL

SELECT height, weight FROM pokemon
INNER JOIN pokemon_types ON pokemon.id = pokemon_types.id;
120 / 125

I think there's an interesting analogy here with SQL. Like CSS, SQL is declarative.

And like CSS, SQL has performance considerations. A slow query can result in time spent waiting on the database.

SQL EXPLAIN

EXPLAIN
SELECT height, weight FROM pokemon
INNER JOIN pokemon_types ON pokemon.id = pokemon_types.id;
121 / 125

But unlike CSS, SQL has an EXPLAIN.

SQL explain output showing what made query slow

122 / 125

If we ask Postgres to EXPLAIN, it'll tell us how many rows it scanned, which indexes it used, how it joined the tables, etc.

So we can map this back to the declarative SQL we wrote.

Photo of a classic car dashboard

123 / 125

Going back to my original car metaphor…

Sure, I can listen to the engine's noises and rely on intuition. But what I'd really like is a dashboard, to give me insight into what the engine is doing.

So my pitch to browser vendors is: please give us a CSS EXPLAIN! SelectorStats and Invalidation Tracking are great, but I want so much more.

Task                           ms
Style                          400
  ├── Bloom filter misses      200
  ├── :nth-child selectors     130
  ├── Sibling selectors        50
  └── Custom properties        20
Layout                         600
  ├── <nav> (grid)             300
  ├── .sidebar (flexbox)       200
  └── <main> (normal flow)     100
124 / 125

Here is a mockup.

Thank you

📃 nolanlawson.github.io/css-talk-2022

🌎 nolanlawson.com

Thanks to Emilio Cobos Álvarez, Manuel Rego Casasnovas, and Daniel Libby for help with research for this talk.

Also thanks to Robert Flack, Steinar Gunderson, and Rune Lillesveen for answering my Chromium style bug questions.

This work is licensed under the Creative Commons Attribution Share-Alike License v3.0.

125 / 125

So that's my talk on style/layout performance. I hope you enjoyed it and learned something about how browsers work and how to optimize style/layout calculation.

Thank you to all the people who helped with research for this talk.

These slides are available online, and you can pop open the speaker notes to find links.
