Better tree shaking with deep scope analysis
Here’s my GSoC 2018 project: improving tree-shaking for webpack, a widely used JS code bundler.
Introduction
Tree-shaking extends DCE (dead code elimination) across module boundaries, and it is a crucial feature for a bundler. This is especially true for JS: a smaller bundle means a lower network cost each time the bundle is delivered to a web application.
Without this plugin
Before this plugin, webpack used a very simple approach to DCE. For example:
In the above example, webpack finds the references to the imported variables. isNumber is never referenced in the module, so it can be eliminated from the final bundle if no other module uses it.
The above example is contrived: you would not import something you do not need unless you forgot to remove it, and modern editors and lint tools already remind you to remove unused imports. We therefore need a more powerful solution.
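As a rough sketch of that reference check (not webpack’s actual code), assume a module has already been summarized as a list of imported names and a list of identifiers referenced in its body. The `moduleInfo` shape, the `isNull` import, and `findUnusedImports` are all hypothetical names made up for this illustration.

```javascript
// Hypothetical summary of one module: the names it imports and the
// identifiers its body actually references. Not webpack's real data model.
const moduleInfo = {
  imports: ["isNumber", "isNull"],
  references: ["isNull"],
};

// Return the imported names that the module body never references.
function findUnusedImports({ imports, references }) {
  const referenced = new Set(references);
  return imports.filter((name) => !referenced.has(name));
}

const unusedImports = findUnusedImports(moduleInfo);
console.log(unusedImports); // [ 'isNumber' ]
```

A flat check like this only answers “is the name mentioned anywhere in the module?”, which is why it cannot tell whether the code that mentions it is itself used.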
Motivation
The above issue illustrates that webpack’s tree-shaking mechanism still has room for improvement. We can find out the relationships between the exported variables and the imported variables across modules.
If an exported variable is not imported by another module, it will be eliminated along with its “children” (other variables that are referenced only by it).
Because function2 is not imported by the entry, function2 and the entire external2 can be eliminated. Unfortunately, webpack’s previous mechanism could not catch such cases before this plugin was introduced.
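The idea can be sketched as a reachability walk over exports. This is only an illustration under assumptions: it treats external1 and external2 as modules, and the module name `lib`, the export names `main`, `helper1`, and `helper2`, and the `"module#export"` string format are invented here rather than taken from webpack.

```javascript
// Hypothetical module graph. Each export lists the imports its body uses
// as "module#export" strings. The data shape is an assumption for this
// sketch, not webpack's internal representation.
const modules = {
  entry: { main: ["lib#function1"] },
  lib: {
    function1: ["external1#helper1"],
    function2: ["external2#helper2"],
  },
  external1: { helper1: [] },
  external2: { helper2: [] },
};

// Walk from the root exports and collect every "module#export" that is
// reachable through import references.
function collectUsedExports(graph, roots) {
  const used = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const id = pending.pop();
    if (used.has(id)) continue;
    used.add(id);
    const [moduleName, exportName] = id.split("#");
    pending.push(...graph[moduleName][exportName]);
  }
  return used;
}

const usedExports = collectUsedExports(modules, ["entry#main"]);
const usedModules = new Set([...usedExports].map((id) => id.split("#")[0]));
// Modules with no reachable export can be dropped entirely.
const deadModules = Object.keys(modules).filter((m) => !usedModules.has(m));
```

With this data, `function2` is never reached from the entry, so `external2` ends up in `deadModules`.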
Feasibility
Think about the role of webpack: it traverses the graph of all modules from the entry and bundles them together, and along the way it learns which exports are used. What if it also traversed the scopes and bundled only the ones that are reachable? That is how it can eliminate unused scopes and modules. In effect, we can regard each scope as a node in the graph.
In the above code, deepEqual is tied to fun5 and equal is tied to fun6. If fun6 is not imported by another module, the imported function equal can be eliminated.
What is a scope?
In computer programming, the scope of a name binding — an association of a name to an entity, such as a variable — is the region of a computer program where the binding is valid: where the name can be used to refer to the entity. Such a region is referred to as a scope block. In other parts of the program the name may refer to a different entity (it may have a different binding), or to nothing at all (it may be unbound). — Wikipedia
When you are coding, the closest thing to a scope that you can see is a block in the code, though the two are not quite equivalent. Here are the types of scopes in ECMAScript:
For an ES6 module, the module scope is regarded as the root scope. Within the module, only class and function scopes can be exported, and they are children of the module scope, so not every scope is regarded as a node in the graph.
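A small runnable illustration of nested scopes may help here. All names are made up for this sketch. Only the function and the class declared at the top level are direct children of the root scope; the block and the method are nested deeper.

```javascript
const label = "module";          // bound in the root scope of this file

function innermostLabel() {      // function scope, child of the root scope
  const label = "function";      // shadows the root binding
  if (label) {                   // block scope, nested inside the function
    const label = "block";       // shadows the function binding
    return label;
  }
  return label;
}

class Reporter {                 // class scope, child of the root scope
  report() {                     // method: a function scope inside the class
    return label;                // resolves outward to the root binding
  }
}

const results = [label, innermostLabel(), new Reporter().report()];
console.log(results); // [ 'module', 'block', 'module' ]
```

The same name resolves to a different binding depending on the scope it is used in, which is exactly the information a scope analyzer records.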
How this plugin works
This plugin contains a scope analyzer, which extracts all the scopes from modules. The output of the analyzer is then used to find the references to variables in the module.
Some scopes, such as class and function scopes, can be bound to variables. The exported ones are the atoms of the traversal.
When webpack gives inputs about which exports are used, this plugin returns which imports could be eliminated.
Implementation Details
If I have seen further it is only by standing on the shoulders of giants. — Isaac Newton
We need a scope analysis to find the references between variables. An existing tool like https://github.com/estools/escope could help.
This scope analysis is based on the ASTs. The plugin hooks into webpack to acquire the ASTs of modules and analyzes them.
Then the analyzer finds all the scopes that belong to the module scope (the root scope), together with the reference data for the imports. Once we know which scopes are used by other modules, the analyzer can traverse all their child scopes and tag every variable that comes from another module. Imports that end up with no tag can be regarded as unused.
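The tagging step can be sketched as a recursive walk over a scope tree. This is a simplified model, not the plugin’s code or escope’s API: the tree shape, the nested arrow scope, the `noop` import, and the function names are assumptions, while fun5, fun6, deepEqual, and equal are borrowed from the earlier example.

```javascript
// Hypothetical analyzer output for one module: each top-level scope lists
// the imported names it references directly and its nested child scopes.
const moduleScope = {
  imports: ["deepEqual", "equal", "noop"],
  childScopes: [
    {
      name: "fun5",
      importRefs: [],
      childScopes: [
        { name: "(arrow)", importRefs: ["deepEqual"], childScopes: [] },
      ],
    },
    { name: "fun6", importRefs: ["equal"], childScopes: [] },
  ],
};

// Tag every import referenced by a scope or by any scope nested inside it.
function tagImports(scope, tagged) {
  scope.importRefs.forEach((name) => tagged.add(name));
  scope.childScopes.forEach((child) => tagImports(child, tagged));
}

// Imports that stay untagged after visiting the used exports are unused.
function unusedImportsFor(root, usedExports) {
  const tagged = new Set();
  root.childScopes
    .filter((scope) => usedExports.includes(scope.name))
    .forEach((scope) => tagImports(scope, tagged));
  return root.imports.filter((name) => !tagged.has(name));
}

console.log(unusedImportsFor(moduleScope, ["fun5"])); // [ 'equal', 'noop' ]
```

This sketch deliberately ignores references made directly from module scope and calls between sibling scopes; a real analyzer has to follow those as well.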
Edge cases
To improve is to change; to be perfect is to change often. — Churchill
There are many edge cases in JavaScript analysis; some of them are listed below:
Here’s a simple demo to try:
Local References by Module Scope
If a scope or an imported variable is referenced in module scope, it will not be eliminated.
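One way to model this rule is to treat the module scope as a root that is always visited, next to the exports webpack reports as used. The reference table below, the `"(module)"` key, and every name in it are hypothetical and exist only for this sketch.

```javascript
// Hypothetical reference table for one module. Keys are top-level scopes,
// plus "(module)" for code that runs directly in module scope. Values are
// the local scopes and the imported names each one references.
const refs = {
  "(module)": { scopes: ["setup"], imports: ["logger"] },
  setup: { scopes: [], imports: ["polyfill"] },
  exportedFn: { scopes: [], imports: ["heavyLib"] },
};

// Module scope is always a root, alongside the used exports.
function keptImports(table, usedExports) {
  const kept = new Set();
  const seen = new Set();
  const pending = ["(module)", ...usedExports];
  while (pending.length > 0) {
    const scope = pending.pop();
    if (seen.has(scope)) continue;
    seen.add(scope);
    table[scope].imports.forEach((name) => kept.add(name));
    pending.push(...table[scope].scopes);
  }
  return kept;
}

// No export is used, yet the imports reached from module scope survive.
const kept = keptImports(refs, []);
```

Here `logger` is kept because module scope references it directly, and `polyfill` is kept because module scope calls `setup`, while `heavyLib` is dropped as long as `exportedFn` is not used.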