Imagine running a Node.js process that watches the current working directory for file changes and passes the filename of any updated file to a callback. Now imagine changing the implementation of that callback and having subsequent file updates execute the new callback on save, without exiting the Node.js process. Furthermore, imagine that calls to require() always load the newest version of the required module (normally they're cached), so that when the callback executes again, it uses the newest version of the modules.

I wanted a system like that when I was thinking about how I would write the static site generator for this very blog. I wanted it so that whenever I saved a markdown file in a specific directory, it would run a function to convert the markdown into HTML and write it out to disk. And since I’ll be using React, I wanted it so that whenever I update a layout component, the system would rebuild the entire site.

It’s actually pretty easy to implement such a system with only the Node.js APIs. At a high level, the system works like this:

  1. The entry script watches the current working directory via fs.watch(process.cwd(), { recursive: true }, (event, filename) => { ... });

  2. The callback to fs.watch() clears local modules from the require.cache so that subsequent calls to require() will load the new implementation of the required module.

  3. The callback then calls require('./handler'), a module that exports a function, and passes the handler the event and filename arguments. Since we've cleared the require cache, the ./handler module is hot swapped with its new version.

In other words, we have a directory with index.js and handler.js:

.
├── handler.js
└── index.js

index.js looks like this:

const fs = require('fs');
 
fs.watch(process.cwd(), { recursive: true }, (event, filename) => {
  Object.keys(require.cache).forEach(module => {
    if (!module.match(/node_modules/)) {
      delete require.cache[module];
    }
  });
 
  try {
    require('./handler')(event, filename);
  } catch (err) {
    console.log(err);
  }
});

and handler.js exports a function that takes (event, filename):

module.exports = (event, filename) => {
  // do something with event and filename 
  console.log(event, filename);
};

In this example, saving a file anywhere in the process's working directory deletes every cached module that does not live in node_modules. This assumes that the implementations of the modules in node_modules don't change during the lifetime of the process. Any subsequent calls to require() for local modules will now return the new implementation. Then the callback executes the handler, which just logs out its arguments.

What’s cool is that we can change the implementation of handler.js and subsequent file changes will execute the new version without exiting the Node.js process.

Module cache busting optimization

In the fs.watch() callback we looped through the entire require.cache and deleted every module that doesn’t live in node_modules. This operation is pretty fast on my machine, but it’s not optimal. A better approach is to delete only the updated module, plus all of that module’s dependents, since the dependents would otherwise keep referencing an older version of the updated module.

To achieve something like that we would need to maintain a dependency graph of our modules during the lifetime of the process. For this I had to dig into how Node.js implements its module system. For example, have you ever wondered how require() actually works? Frank K. Schott has an excellent blog post on that very topic.

My idea was to hook into the require() calls and build the dependency graph that way. This should be possible as long as we can get the module that called require() and the resolved requested module. It turns out that the require function that we see in a module is just a wrapper around the module.require function. module.require is implemented on Module.prototype.require, so we can monkey patch that to build the dependency graph.

Once we have the dependency graph, we can query it for the dependents of the updated module, and delete the affected modules from the module cache. With that, I present to you the optimized cache busting code. It relies on the dependency-graph library.

const fs = require('fs');
const Module = require('module');
const path = require('path');
const { DepGraph } = require('dependency-graph');
 
const graph = new DepGraph();
const __require = Module.prototype.require;
 
Module.prototype.require = function(p) {
  const module = __require.call(this, p);
  const moduleName = Module._resolveFilename(p, this);
  graph.addNode(this.filename);
  graph.addNode(moduleName);
  graph.addDependency(this.filename, moduleName);
  return module;
};
 
fs.watch(process.cwd(), { recursive: true }, (event, filename) => {
  const absFilename = path.resolve(filename);
 
  if (graph.hasNode(absFilename)) {
    graph.dependantsOf(absFilename).concat([absFilename]).forEach(module => {
      delete require.cache[module];
      graph.removeNode(module);
    });
  }
 
  try {
    require('./handler')(event, filename);
  } catch (err) {
    console.log(err);
  }
});

This way of invalidating a module has been extracted to my invalidate-module library. Using that, the above will look like this:

const fs = require('fs');
const invalidate = require('invalidate-module');
const path = require('path');
 
fs.watch(process.cwd(), { recursive: true }, (event, filename) => {
  const absFilename = path.resolve(filename);
 
  invalidate(absFilename);
 
  try {
    require('./handler')(event, filename);
  } catch (err) {
    console.log(err);
  }
});

Caveats

fs.watch() is not 100% consistent across platforms, and the recursive option is only supported on OS X and Windows. See the Node.js documentation for fs.watch. You can use something like chokidar instead, which should work better across platforms.

The dependency graph in the optimized module cache busting will throw an error if it detects a cycle. I haven’t bothered to handle that case yet.

Closing

Be creative with what you can do with such a system, and let me know what you come up with! Like I said, I’m using it as a static site generator with React and live reloading, and it's been working pretty well.