redfin/npm-bazel

Name: npm-bazel

Owner: Redfin

Description: Bazel build rules and workspace generators for building node modules

Created: 2016-05-25 21:48:14.0

Updated: 2018-05-16 06:14:07.0

Pushed: 2016-10-11 04:51:44.0

Homepage: null

Size: 2831

Language: Python

README

This project provides a generator tool, npm-bazel-gen.py, to generate WORKSPACE rules and BUILD files for npm modules. It also provides Skylark rules in build_tools/npm.bzl to build npm modules in Bazel's sandbox.

Stability: prototype

This is a public clone of code that we're using internally at Redfin. The version we actually use seems to be working, but we cannot guarantee that this repository will be stable/maintained. (But hope springs eternal, and if people seem to like it, we could stabilize it further.)

Getting Started

Generate the WORKSPACE and BUILD files using the Python script, then build the example projects.

python npm-bazel-gen.py
bazel build foo

Background

Required knowledge
What does npm install do?

npm install with no arguments does two-and-a-half things.

  1. “Install dependencies”
    1. “Download” dependencies declared in package.json into ./node_modules (de-duping dependencies and resolving circular dependencies as needed)

    2. “Rebuild” those downloaded dependencies. (You can npm rebuild to re-run this.)

      • Compile native code: if any of those deps had native code in them, they need to be recompiled. There are no limitations on what this build step is allowed to do.
      • Symlink scripts: if any of the deps declare executable scripts in the “bin” section of their package.json (“gulp” is an especially important one), then npm will symlink them into node_modules/.bin

      For example, node_modules/.bin/gulp will be a symlink to node_modules/gulp/bin/gulp.js

  2. “Prepublish” current project: if package.json declares a scripts.prepublish step, it'll run that. A common scripts.prepublish step is: “gulp prepublish”. (npm automatically adds node_modules/.bin to the path when running this script.)
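
The “symlink scripts” step above can be illustrated with a minimal Python sketch. This is not npm's implementation, only an approximation under stated assumptions: the function name link_bin_scripts is hypothetical, scoped packages are ignored, and every dependency is assumed to sit directly under node_modules.

```python
import json
import os


def link_bin_scripts(node_modules):
    """Symlink each dependency's "bin" entries into node_modules/.bin.

    Hypothetical helper for illustration; it only approximates what npm does.
    """
    bin_dir = os.path.join(node_modules, ".bin")
    os.makedirs(bin_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(node_modules)):
        manifest = os.path.join(node_modules, name, "package.json")
        # Skip .bin itself and anything that is not an installed package.
        if name.startswith(".") or not os.path.isfile(manifest):
            continue
        with open(manifest) as f:
            bins = json.load(f).get("bin", {})
        if isinstance(bins, str):
            # A bare string means a single script named after the package.
            bins = {name: bins}
        for script, target in bins.items():
            link = os.path.join(bin_dir, script)
            # Relative link target, e.g. ../gulp/bin/gulp.js
            rel = os.path.normpath(os.path.join("..", name, target))
            if os.path.lexists(link):
                os.remove(link)
            os.symlink(rel, link)
            linked.append(script)
    return linked
```

Calling link_bin_scripts("node_modules") on a tree containing gulp would create node_modules/.bin/gulp pointing at ../gulp/bin/gulp.js, matching the example above.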

To generate a .tgz output file, we can run npm pack, which does two things:

  1. Prepublish: runs (re-runs) the "prepublish" step
  2. Tar: It tars up everything in the current directory, except for anything explicitly ignored by `.npmignore`.
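
As a rough illustration of the “tar” step, here is a minimal Python sketch. It is an assumption-laden simplification, not what npm actually runs: the names read_ignore_patterns and pack are hypothetical, and .npmignore patterns are matched with simple shell-style globs rather than npm's full gitignore-style rules.

```python
import fnmatch
import os
import tarfile


def read_ignore_patterns(project_dir):
    """Return the non-empty, non-comment lines of .npmignore, if present."""
    path = os.path.join(project_dir, ".npmignore")
    if not os.path.isfile(path):
        return []
    with open(path) as f:
        return [line.strip() for line in f
                if line.strip() and not line.startswith("#")]


def pack(project_dir, out_path):
    """Tar project_dir into out_path, skipping ignored files.

    out_path should live outside project_dir so the archive does not
    try to include itself.
    """
    patterns = read_ignore_patterns(project_dir)

    def keep(info):
        # Member names look like "package/lib/index.js"; drop the prefix.
        rel = info.name.partition("/")[2]
        if not rel:
            return info  # the top-level "package" directory itself
        parts = rel.split("/")
        for pattern in patterns:
            p = pattern.rstrip("/")
            if fnmatch.fnmatch(rel, p) or any(fnmatch.fnmatch(part, p) for part in parts):
                return None  # excluded; directories are not descended into
        return info

    with tarfile.open(out_path, "w:gz") as tar:
        # npm tarballs place the project under a top-level "package/" folder.
        tar.add(project_dir, arcname="package", filter=keep)
    return out_path
```

With an .npmignore containing `*.log` and `test/`, the resulting archive would keep package/index.js but leave out log files and the test directory.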

Some of these steps violate Bazel's sandbox.

How we implemented that in Bazel
  1. Install phase: build_tools/npm_installer.sh
    1. Download:
      • npm-bazel-gen.py scans all package.json files, finding the full dependency tree (tree of trees? forest?) and declares all external dependencies as rules in the WORKSPACE file
      • build_tools/install-npm-dependencies.py is a script that simulates what npm install would have done, including de-duping dependencies and handling circular dependencies. (npm allows circular dependencies!) BEWARE: this script is currently imperfect. It doesn't do exactly what npm would have done.
    2. Rebuild:
      • We run npm rebuild directly after we run install_npm_dependencies.
      • To avoid sandbox violations, we set HOME=/tmp and pre-install /tmp/.node-gyp
    3. We do not run prepublish during this “install” phase, because we plan to run it during packaging.
    4. We then tar up the generated node_modules folder.
  2. Pack phase: build_tools/npm_packer.sh
    1. Setup: We're not allowed to modify source files directly, so we set up by:
      • rsyncing source files to a work directory
      • untarring the generated node_modules folder into the work directory
    2. We run npm pack which runs the prepublish script, if any, and generates the final output.
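
The setup step of the pack phase can be sketched in Python as follows. The real tool is the shell script build_tools/npm_packer.sh, so this is only an illustration: prepare_work_dir is a hypothetical name, shutil.copytree stands in for rsync, and the tarball is assumed to contain a top-level node_modules folder.

```python
import os
import shutil
import tarfile


def prepare_work_dir(source_dir, node_modules_tar, work_dir):
    """Assemble a scratch copy of the project so sources stay untouched."""
    # Copy the sources instead of modifying them in place (stands in for rsync).
    # dirs_exist_ok requires Python 3.8 or newer.
    shutil.copytree(source_dir, work_dir, dirs_exist_ok=True)
    # Unpack the install phase's output next to the copied sources.
    # Assumes the archive holds a top-level "node_modules" entry.
    with tarfile.open(node_modules_tar) as tar:
        tar.extractall(work_dir)
    return work_dir
```

After this, running npm pack with the work directory as the current directory would produce the final .tgz without touching the original source tree.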
npm.bzl

npm.bzl defines two rules, external_npm_module and internal_npm_module. “external” means “third party,” as opposed to “internal” modules being built by Bazel.

The internal_npm_module rule is responsible for running npm_installer.sh and then npm_packer.sh as ctx.actions.

The majority of npm.bzl is a bunch of code to marshal the correct list of inputs and outputs for the “install” phase and then the “pack” phase. Both types of rules return a struct() containing three fields:

In addition, internal modules return these two fields:

Skylark calls this system of a rule returning a struct() a “provider”. See http://bazel.io/docs/skylark/rules.html (Cmd-F for the “Providers” section)

Skylark's documentation on providers is pretty light, but all it means is: rules can return a struct() of data to provide it to dependent rules.

The top of npm.bzl takes the list of deps and the list of dev_deps, sorts them into internal and external dependencies, and assembles four lists:

It then creates an internal_modules.json file using the two “internal” lists, so install_npm_dependencies.py knows where to look for internal modules.
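
The sorting and manifest-writing steps might look roughly like the Python sketch below. The actual logic lives in Skylark inside npm.bzl and the real file layout is not documented here, so everything in this sketch is an assumption: the function names, the guess that the four lists are internal/external crossed with deps/dev_deps, and the name-to-path shape of internal_modules.json are all hypothetical.

```python
import json


def split_deps(deps, dev_deps, internal_names):
    """Partition deps and dev_deps into internal and external lists."""
    def partition(names):
        internal = sorted(n for n in names if n in internal_names)
        external = sorted(n for n in names if n not in internal_names)
        return internal, external

    internal_deps, external_deps = partition(deps)
    internal_dev_deps, external_dev_deps = partition(dev_deps)
    return internal_deps, external_deps, internal_dev_deps, external_dev_deps


def write_internal_modules(path, internal_deps, internal_dev_deps, locations):
    """Write a manifest telling the installer where internal modules live.

    Hypothetical layout: module name -> path of that module's built output.
    """
    manifest = {name: locations[name]
                for name in internal_deps + internal_dev_deps}
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)
    return manifest
```

Only the two “internal” lists feed the manifest; external modules are resolved through the rules declared in the WORKSPACE file instead.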

It then runs two ctx.action commands:

  1. the “install” phase, which runs build_tools/npm_installer.sh to generate foo_node_modules.tar.gz with these inputs:
    1. All dependencies
    2. package.json files for all internal dependencies
    3. internal_modules.json
  2. the “pack” phase, which runs build_tools/npm_packer.sh to generate foo.tgz with these inputs:
    1. foo_node_modules.tar.gz
    2. All of the source files in the current working directory
Non-standard projects

In our project at Redfin, we have a bunch of projects that do weird/non-standard stuff, usually during the “pack” phase, but sometimes during the “install” phase.

The internal_npm_module rule has install_tool and pack_tool attributes, which default to build_tools:npm_installer and build_tools:npm_packer, but you can override them with anything you want, including a sh_binary with arbitrary dependencies. (If you want npm-bazel-gen.py to add those overrides for you, you have to hack special cases into it.)

In addition, the default packer tool looks for a ./extra-bazel-script file in the current directory, and if it finds one, it just runs whatever it sees there. In some cases, that's enough pluggability.
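
The hook described above amounts to “run the file if it exists.” A minimal Python sketch of that idea follows. The real check lives in the shell-based packer tool, so the function name run_extra_script and the choice to invoke the file through sh are assumptions, as is the point in the pack phase at which it runs.

```python
import os
import subprocess


def run_extra_script(work_dir, name="extra-bazel-script"):
    """Run the optional hook script in work_dir; return True if it ran."""
    script = os.path.abspath(os.path.join(work_dir, name))
    if not os.path.isfile(script):
        return False  # no hook present, nothing to do
    # Assumption: the hook is a shell script; a failing hook raises
    # CalledProcessError and so fails the build step.
    subprocess.check_call(["sh", script], cwd=work_dir)
    return True
```

Because the hook is looked up by file name only, a project opts in simply by adding the file, with no change to its BUILD rules.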

Next steps
