How to Make Webpack Caching Actually Work

Although Webpack is a popular tool that’s generally pretty decent at gluing together the mess of “modern” web development, fixing its browser caching mechanism has become a rite of passage fraught with uncertainty and frustration. This post combines insights from the infamous 100+ comment GitHub issue with my own experience of taming Webpack at Duolingo.

The goal

JavaScript files generated by Webpack (a.k.a. “bundles”) can be configured to include a hash of their contents in their filenames, e.g. app-89fcb80e.js. This forces the browser to download an updated bundle instead of going to cache whenever we modify our source code.
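To make the idea concrete, here is a minimal sketch of content-hashed filenames in plain Node. The helper name hashedFilename, the choice of MD5, and the 8-character truncation are assumptions for illustration only, not a description of how Webpack computes its hashes.

```javascript
const crypto = require("crypto");

// Hypothetical helper: embed a short hash of the bundle's contents in its filename.
function hashedFilename(name, contents) {
  const hash = crypto
    .createHash("md5")
    .update(contents)
    .digest("hex")
    .slice(0, 8); // keep the filename short
  return `${name}-${hash}.js`;
}

// Same bundle name, different contents: the filename changes, so the browser
// cannot reuse its cached copy.
const before = hashedFilename("app", "console.log('v1');");
const after = hashedFilename("app", "console.log('v2');");
console.log(before, after);
```

Identical contents always map to the same filename, which is what lets an unchanged bundle stay cached across deploys.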

Isn’t it wasteful, though, to make the browser download all of our JS again even if we only make a small code change? That’s why two bundles are typically used: a vendor bundle that contains all library code (which rarely changes) and an app bundle that contains code we write ourselves.

Webpack’s official CommonsChunkPlugin makes it easy to select what should go into the vendor bundle. Generating the necessary content hashes should be just as simple, right? Webpack’s long-term caching guide seems straightforward enough.

It’s a trap

To preserve module scope when combining modules (i.e. source files) into a handful of output bundles, Webpack wraps each module’s source code inside a function. Wrapped modules reference each other by integer IDs that Webpack generates automatically. For example, Webpack might assign an ID of 23 to React and then convert all of our codebase’s require('react') statements into __webpack_require__(23). Unfortunately, these IDs can vary from compilation to compilation. The IDs are written into the bundle’s contents, so a changed ID means a changed bundle, which in turn causes unnecessary hash churn in our bundles’ filenames.

The Webpack docs recommend OccurrenceOrderPlugin, an official plugin that promises deterministic module IDs by sorting modules in descending order of usage frequency before assigning sequential IDs. A nice side effect of this is that commonly imported modules like Underscore get short IDs, reducing output bundle size.

To test this out, we try compiling our JS a few times in a row. The output bundle hashes are the same every time. Next we try modifying an app module. Only the app bundle’s hash changes; the vendor bundle’s hash remains unchanged. It looks like caching is fixed. Not so fast!

The problem is that adding a new vendor import inside our app source will often change the vendor modules’ relative frequencies of appearance, leading to shuffled module IDs inside the vendor bundle and more accidental cache invalidation.
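A toy model shows how this shuffling happens. The function assignIds below is hypothetical and far simpler than the plugin itself: it only counts imports and hands out sequential IDs in descending order of frequency, breaking ties by name.

```javascript
// Toy model: count how often each vendor module is imported, then assign
// sequential IDs in descending order of frequency (ties broken by name).
function assignIds(importsByFile) {
  const counts = {};
  for (const imports of Object.values(importsByFile)) {
    for (const name of imports) counts[name] = (counts[name] || 0) + 1;
  }
  const ordered = Object.keys(counts).sort(
    (a, b) => counts[b] - counts[a] || a.localeCompare(b)
  );
  const ids = {};
  ordered.forEach((name, index) => {
    ids[name] = index;
  });
  return ids;
}

const beforeIds = assignIds({
  "app/a.js": ["react", "underscore"],
  "app/b.js": ["react"],
});

// Two more underscore imports in app code; no vendor code was touched.
const afterIds = assignIds({
  "app/a.js": ["react", "underscore"],
  "app/b.js": ["react", "underscore"],
  "app/c.js": ["underscore"],
});

console.log(beforeIds); // { react: 0, underscore: 1 }
console.log(afterIds); // { underscore: 0, react: 1 }
```

In this model the vendor modules trade IDs purely because of edits to app files, which is the kind of accidental invalidation described above.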

The solution

NamedModulesPlugin is another official (but undocumented) plugin that actually produces deterministic module IDs by telling Webpack to use modules’ filepaths as IDs instead of integers. The example above then becomes:

__webpack_require__("./node_modules/react/index.js")

Do we finally have reliable bundle hashing now? Nope, one last thing. Webpack throws a small bit of its own code (e.g. the definition for that __webpack_require__ function) into the mix. By default, this runtime code is included in the vendor bundle. That’s fine, except for one problem: the Webpack runtime includes a manifest containing the filenames of all generated bundles. Changing the app source changes the app bundle’s hash, which changes the manifest, which changes the Webpack runtime, which finally (you guessed it) changes the vendor bundle’s hash.

The solution is to extract the Webpack runtime into its own bundle, which we either send as a third JS file or inline directly into the HTML. We can create this manifest bundle by adding “manifest” to the names passed to CommonsChunkPlugin:

new webpack.optimize.CommonsChunkPlugin({ names: ["vendor", "manifest"] })
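The effect of moving the runtime out of the vendor bundle can be simulated without Webpack. Everything in this sketch is made up for illustration (the build function, the extractRuntime flag, the MD5-based hash helper). It only models the dependency chain: the runtime embeds the app bundle’s filename, so whichever bundle contains the runtime changes whenever the app does.

```javascript
const crypto = require("crypto");

const hash = (text) =>
  crypto.createHash("md5").update(text).digest("hex").slice(0, 8);

// Toy build: returns the output filenames for a given app source.
function build(appSource, { extractRuntime }) {
  const vendorModules = "/* library code that did not change */";
  const appFile = `app-${hash(appSource)}.js`;
  // The runtime embeds a manifest of generated filenames.
  const runtime = `var manifest = ${JSON.stringify({ app: appFile })};`;
  const vendorContents = extractRuntime
    ? vendorModules
    : runtime + vendorModules;
  const files = { app: appFile, vendor: `vendor-${hash(vendorContents)}.js` };
  if (extractRuntime) files.manifest = `manifest-${hash(runtime)}.js`;
  return files;
}

// Runtime inside the vendor bundle: an app change busts the vendor cache.
const embedded1 = build("console.log(1);", { extractRuntime: false });
const embedded2 = build("console.log(2);", { extractRuntime: false });
console.log(embedded1.vendor === embedded2.vendor); // false

// Runtime extracted: only the app and manifest filenames change.
const extracted1 = build("console.log(1);", { extractRuntime: true });
const extracted2 = build("console.log(2);", { extractRuntime: true });
console.log(extracted1.vendor === extracted2.vendor); // true
```

The manifest bundle still changes on every app change, but it is tiny compared to the vendor bundle, which is why inlining it into the HTML is a reasonable option.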

At long last, we’re done.

Takeaway

Use NamedModulesPlugin instead of OccurrenceOrderPlugin, and add “manifest” as a CommonsChunkPlugin name.
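Put together, a configuration along these lines might look like the sketch below. It assumes a Webpack version that ships CommonsChunkPlugin and NamedModulesPlugin. The entry paths, the vendor module list, and the dist output directory are hypothetical placeholders, and [chunkhash] is the filename placeholder that embeds each bundle’s hash.

```javascript
// webpack.config.js (sketch; paths and entry names are hypothetical)
const path = require("path");
const webpack = require("webpack");

module.exports = {
  entry: {
    app: "./src/index.js",
    vendor: ["react", "underscore"], // library code that rarely changes
  },
  output: {
    path: path.resolve(__dirname, "dist"),
    filename: "[name]-[chunkhash].js", // e.g. app-89fcb80e.js
  },
  plugins: [
    // Use module file paths as IDs instead of integers.
    new webpack.NamedModulesPlugin(),
    // "vendor" collects library code; "manifest" receives the Webpack runtime.
    new webpack.optimize.CommonsChunkPlugin({ names: ["vendor", "manifest"] }),
  ],
};
```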