A Deeper Look into Node.js Docker Images: Help, My Node Image Has Python!

Picking the right base image for your Node.js application can be challenging. Variants of the official Docker node image differ by more than 5x in size and more than 10x in the number of reported CVEs, but the smallest image is not always the best. There is also bitnami/node, which is similar but not identical to the "official" Docker image. And, of course, don't forget about the distroless options such as cgr.dev/chainguard/node and gcr.io/distroless/nodejs. So, which Node.js image should you use and when?

Let's dig in!

50 shades of Node.js images.

Picking The Candidates For Inspection

Node.js recommends using LTS releases for production applications, so, where possible, we'll use the current LTS release for our research. However, the majority of considerations in this post will be applicable to any Node.js release line.

Node.js supported release lines and their maintenance status.

Here is a diverse image selection that represents the most distinct Node.js 22 image variants (the current LTS release):

Official Docker Image(s)

docker pull node:22
docker pull node:22-slim
docker pull node:22-alpine

Re-packaged image by Bitnami

docker pull bitnami/node:22

GoogleContainerTools distroless

docker pull gcr.io/distroless/nodejs22-debian12

Chainguard's distroless

The cgr.dev/chainguard/node:22 tag is not available for free, so we'll use cgr.dev/chainguard/node:latest instead:

docker pull cgr.dev/chainguard/node:latest

Simply listing the pulled images can already give us some initial food for thought:

docker images
REPOSITORY                             TAG          SIZE
node                                   22           1.12GB
bitnami/node                           22           974MB
node                                   22-slim      220MB
node                                   22-alpine    155MB
cgr.dev/chainguard/node                latest       145MB
gcr.io/distroless/nodejs22-debian12    latest       141MB

First of all, the size difference between these images is stunning. Also, we can probably estimate the size of the Node.js installation itself - it should be around 140MB, given that the distroless and alpine-based images have very little overhead. But if Node.js is only 140MB, what makes up the rest of the huge node:22 and bitnami/node:22 images?
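As a rough back-of-the-envelope check, the overhead of each image over that estimated Node.js baseline can be computed from the listing above. This is only a sketch: the sizes are copied by hand from the docker images output (1.12GB is taken as 1120MB), and the 140MB baseline is the estimate from the text, not a measured value.

```shell
# Sizes in MB, copied from the `docker images` listing; 1.12GB is taken as 1120MB.
# BASELINE_MB is the rough Node.js estimate from the text, not a measured value.
BASELINE_MB=140

image_overhead() {
  awk -v base="$BASELINE_MB" \
    '{ printf "%-40s %5d MB over baseline\n", $1, $2 - base }' <<EOF
node:22 1120
bitnami/node:22 974
node:22-slim 220
node:22-alpine 155
cgr.dev/chainguard/node:latest 145
gcr.io/distroless/nodejs22-debian12 141
EOF
}

image_overhead
```

Under these assumptions, everything beyond a few dozen megabytes is the base OS and tooling, not Node.js itself.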

It's time to take a closer look at these Node.js images, starting with the largest one - node:22.

Help, My Node Image Has Python!

Here is a little-known fact about the node:22 image that might blow your mind - this image has a full-fledged Python installation inside!

docker run --entrypoint bash node:22 -c 'python3 --version'
Python 3.11.2

Interestingly, Python is not the only "unexpected" package in this image - for instance, it also includes the entire GNU Compiler Collection:

docker run --entrypoint bash node:22 -c 'gcc --version'
gcc (Debian 12.2.0-14) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

No wonder its size is more than 1GB - node:22 has a grand total of 661 packages in it 🤯

syft node:22
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged contents
   ├── ✔ Packages                        [661 packages]
   ├── ✔ File digests                    [19,730 files]
   ├── ✔ File metadata                   [19,730 locations]
   └── ✔ Executables                     [1,331 executables]

And, of course, with so many packages come a lot of CVEs:

trivy image -q node:22
node:22 (debian 12.7)

Total: 996 (UNKNOWN: 3, LOW: 492, MEDIUM: 419, HIGH: 76, CRITICAL: 6)

Are all these packages needed to run Node.js? Definitely not - otherwise, the much smaller node:22-slim and the distroless Node.js images wouldn't exist.

But then why does the node:22 image have all this "bloat" in it? Its Docker Hub page might have a hint (which also happens to be a piece of poor advice on base image selection):

As a typical representative of the node:<version> family, the node:22 image is based on a mysterious buildpack-deps image, and that's where the "bloat" comes from... We need to go deeper!

node:22 image composition.

Screenshot of the Docker Hub node:22 tag page.

As its name suggests, buildpack-deps is a build-tailored image that "includes a large number of "development header" packages needed by various things like Ruby Gems, PyPI modules, etc." The node:<version> image is not the only one that depends on buildpack-deps. Popular language images such as python, golang, rust, ruby, openjdk, and many others rely on this image as a base. But very few people actually know about that, so buildpack-deps is a true éminence grise of the container image world!

At first sight, basing a large number of language runtime images on a common "toolset" image seems like a good idea because Docker can reuse the common layers between these images. However, it assumes that "the average user of Docker" works with multiple languages at the same time, and all these languages also happen to rely on exactly the same version of the buildpack-deps base image underneath. Only then will the layer sharing actually happen, leading to lower overall disk usage.

In practice, though, this often results in N slightly different minor versions of the buildpack-deps image being pulled to satisfy the version requirements of different language runtime images (or different versions of the same language image being in concurrent use). But the buildpack-deps image is a general-purpose image! It means that every particular language runtime needs only a small fraction of the packages it provides.

⚠️  Docker Hub's advice to use the buildpack-deps-based node:<version> image by default is rather misguided - it leads to unnecessarily bloated Node.js images and increased security risks.

node:22 image composition explained.

Summarizing, here is what we can tell about the node:<version> image so far:

  • Its large size is caused by lots of dev tools inside, many of which aren't even Node.js-specific.
  • The node:<version> image still may be a valid choice for development and build stages.
  • Because of all these dev tools, it should never be used to run production applications!

Now, let's try to understand which Node.js development and/or build tasks actually require Python and GCC.

Do You Compile Your Node.js Modules From C++?

Most Node.js modules are written in JavaScript (or TypeScript and then transpiled). To npm install a package written entirely in JavaScript, it's enough to fetch its source code from the package repository and place it into the local node_modules folder.

However, Node.js also supports "native" modules, a.k.a. addons, that are written in C++. Installing a package that has one or more modules written in C++ may require compilation. For instance, a package can provide pre-built binaries for popular platforms such as linux/amd64 and darwin/arm64 but require compilation from source on other platforms. And this is when you will need GCC and... Python in your Node.js image.

Some prominent Node.js packages written in C++ are:

  • bcrypt - a library for hashing passwords
  • node-sqlite3 - a SQLite3 binding for Node.js
  • sharp - a high-performance Node.js module for processing images

For example, installing bcrypt using the node:<version> image for a less mainstream platform (linux/arm/v7) takes a while but succeeds:

docker run --privileged --rm tonistiigi/binfmt --install all

docker run \
  --platform linux/arm/v7 \
  --entrypoint bash \
  node:22 \
  -c 'npm install -g bcrypt'
added 59 packages in 1m

4 packages are looking for funding
  run `npm fund` for details

However, the same npm install -g bcrypt command fails in the node:<version>-slim image:

docker run \
  --platform linux/arm/v7 \
  --entrypoint bash \
  node:22-slim \
  -c 'npm install -g bcrypt'

The (abridged) error message makes it clear that the failure is caused by the missing Python installation, which in turn was needed for node-gyp to prepare the compilation environment for GCC:

npm error code 1
npm error path /usr/local/lib/node_modules/bcrypt
npm error command failed
npm error command sh -c node-pre-gyp install --fallback-to-build
npm error Failed to execute '/usr/local/bin/node /usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js configure ...
...
npm error node-pre-gyp info check checked for "/usr/local/lib/node_modules/bcrypt/lib/binding/napi-v3/bcrypt_lib.node" (not found)
npm error node-pre-gyp http GET https://github.com/kelektiv/node.bcrypt.js/releases/download/v5.1.1/bcrypt_lib-v5.1.1-napi-v3-linux-arm-glibc.tar.gz
npm error node-pre-gyp ERR! install response status 404 Not Found on https://github.com/kelektiv/node.bcrypt.js/releases/download/v5.1.1/bcrypt_lib-v5.1.1-napi-v3-linux-arm-glibc.tar.gz
npm error node-pre-gyp WARN Pre-built binaries not installable for bcrypt@5.1.1 and node@22.10.0 (node-v127 ABI, glibc) (falling back to source compile with node-gyp)
npm error node-pre-gyp WARN Hit error response status 404 Not Found on https://github.com/kelektiv/node.bcrypt.js/releases/download/v5.1.1/bcrypt_lib-v5.1.1-napi-v3-linux-arm-glibc.tar.gz
...
npm error gyp ERR! find Python
npm error gyp ERR! find Python Python is not set from command line or npm configuration
...
npm error gyp ERR! stack Error: Could not find any Python installation to use

At the same time, bcrypt can be installed in the linux/amd64 variant of the node:<version>-slim image just fine, thanks to the pre-compiled binaries it provides:

docker run \
  --platform linux/amd64 \
  --entrypoint bash \
  node:22-slim \
  -c 'npm install -g bcrypt'
added 59 packages in 40s

4 packages are looking for funding
  run `npm fund` for details

Two important takeaways from the above experiment are:

  • Applications that depend on native Node.js modules may need to compile them from C++ during the npm install step.
  • However, the compilation is only required if these modules don't provide pre-built binaries (and the most popular modules often do, at least for the mainstream platforms).

Thus, the dependency on node-gyp (hence, Python and GCC) occurs only during the build step and only for some fraction of npm install (or npm ci) runs. So, the rule of thumb should probably be:

Always try building your Node.js applications using the node:<version>-slim image first and switch to the "fat" node:<version> variant only if you see some node-gyp-related failures.
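One way to tell up front whether an application is likely to hit node-gyp is to look for native addon markers in an already installed dependency tree. The sketch below assumes two common conventions: addon sources ship a binding.gyp file, and compiled addons end in .node (as in the bcrypt_lib.node path from the log above). The find_native_addons name is purely illustrative, and packages that fetch their binaries in some other way may not be caught.

```shell
# List top-level packages under a node_modules directory that look like
# native addons. Assumption: an addon either ships a binding.gyp file
# (the node-gyp build recipe) or contains a compiled *.node binary.
find_native_addons() {
  dir="${1:-node_modules}"
  [ -d "$dir" ] || return 0
  find "$dir" \( -name binding.gyp -o -name '*.node' \) -type f 2>/dev/null |
    sed -e "s|^$dir/||" |
    awk -F/ '{ if ($1 ~ /^@/) print $1 "/" $2; else print $1 }' |
    sort -u
}

find_native_addons node_modules
```

An empty result suggests that node:<version>-slim is probably enough for the build stage; a non-empty one only means compilation might be needed on platforms without pre-built binaries.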

Additionally, if the application does have valid reasons to be built in a node:<version> image, it should also employ a multi-stage build process that uses a different and much smaller Node.js image for the runtime stage.
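A minimal sketch of such a multi-stage setup might look like the following. The file name Dockerfile.multistage, the /app directory, and the server.js entry point are hypothetical placeholders, and the sketch assumes both stages share the same Debian release so that compiled addons keep working after being copied.

```shell
# Write a two-stage Dockerfile: the "fat" image compiles native modules,
# the slim image runs the result. server.js is a hypothetical entry point.
cat > Dockerfile.multistage <<'EOF'
# Build stage: has Python and GCC available for node-gyp.
FROM node:22 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Runtime stage: no compilers and far fewer packages.
FROM node:22-slim
WORKDIR /app
COPY --from=build /app /app
USER node
CMD ["node", "server.js"]
EOF

# Build it on a machine where Docker is available:
# docker build -f Dockerfile.multistage -t my-app .
```

Only the second stage ends up in the final image, so Python, GCC, and the rest of buildpack-deps stay behind in the build stage.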

Which Node.js Image Do Pragmatic People Choose?

Now, let's take a look at the "slim" variant of the official Docker Node.js image - node:<version>-slim. Unlike the "fat" node:<version> image, it has a much simpler structure - a regular debian:stable-slim base and the Node.js layer on top of it.

As we saw in the previous section, this image is a good choice for the build stage when you don't need to compile your Node.js modules from C++ (i.e., 99% of the time). And it can also be an acceptable choice for the runtime stage - up-to-date versions of the debian:stable base image tend to be well maintained, and most of the high and critical CVEs in them are quickly patched.

trivy image -q node:22-slim
node:22-slim (debian 12.7)

Total: 72 (UNKNOWN: 0, LOW: 57, MEDIUM: 13, HIGH: 1, CRITICAL: 1)

...which is a pretty good result compared to 996 CVEs in the node:<version> image.
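To put a number on that difference, the two Trivy summary lines can be compared directly. This is a small sketch that only parses the "Total:" lines quoted in this post; the trivy_total and trivy_severe helper names are made up for illustration, and the parsing assumes the summary format shown above.

```shell
# Extract the overall count from a Trivy summary line ("Total: N (...)").
trivy_total() {
  printf '%s\n' "$1" | sed -n 's/^Total: \([0-9][0-9]*\) .*/\1/p'
}

# Sum the HIGH and CRITICAL counts from the same summary line.
trivy_severe() {
  printf '%s\n' "$1" |
    sed -n 's/.*HIGH: \([0-9][0-9]*\), CRITICAL: \([0-9][0-9]*\)).*/\1 \2/p' |
    awk '{ print $1 + $2 }'
}

# Summary lines copied from the scans of node:22 and node:22-slim above.
fat='Total: 996 (UNKNOWN: 3, LOW: 492, MEDIUM: 419, HIGH: 76, CRITICAL: 6)'
slim='Total: 72 (UNKNOWN: 0, LOW: 57, MEDIUM: 13, HIGH: 1, CRITICAL: 1)'

fat_total=$(trivy_total "$fat")
slim_total=$(trivy_total "$slim")
echo "total: $fat_total -> $slim_total ($(( (fat_total - slim_total) * 100 / fat_total ))% fewer)"
echo "high+critical: $(trivy_severe "$fat") -> $(trivy_severe "$slim")"
```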

Of course, using the node:<version>-slim image has a few downsides:

  • The debian base is ten times thinner than buildpack-deps, but its overhead is still noticeable (88 extra packages worth 75MB out of the total 220MB).
  • Application images based on node:<version>-slim will require periodic rebuilds to catch security patches from its debian base.
  • Even with few to no high & critical CVEs in this image, it has a few hundred executables, including a shell, that increase the attack surface and can simplify lateral movement.

However, the ease of using this image and its universality often outweigh all the downsides:

  • It's a regular Debian-based image, so you can apt-get install any system dependencies you might need.
  • Because fresh versions of debian:stable-slim usually have almost no high or critical CVEs in them, the vuln scans of the base image won't be too noisy.
  • You can use node:<version>-slim for both build- and run-time stages (or even use a single-stage build if it doesn't leave traces of development dependencies in the final image).
  • You can debug your containers "as usual" - the shell is there, and you can also install extra tools when needed.

Yes, node:<version>-slim is not an ideal image, but the pragmatic choice award is well deserved here.

Which Node.js Image Do Brave People Choose?

If Debian's overhead in the node:<version>-slim image sounds too big for you, you may consider the node:<version>-alpine variant. This image has a similar two-layer structure, but its OS layer is ten times thinner.

The node:<version>-alpine image may or may not be a good alternative Node.js image for your application. Its alpine base is very small, and the image usually has no (reported) CVEs:

trivy image -q node:22-alpine
node:22-alpine (alpine 3.20.3)

Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)

However, Node.js has no officially supported builds for Alpine Linux - the only available builds still have the experimental status. Additionally, since Alpine relies on an alternative C standard library implementation, musl, the behavior and/or performance of some applications may differ from the more widespread glibc-based distributions like Debian.

Nevertheless, the node:<version>-alpine image is a valid choice for a build- and/or run-time Node.js image. However, if you choose this image for a mission-critical application, you should have at least some awareness of the musl and glibc discrepancies.
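If you're not sure which C library a given image (or your own machine) uses, a quick probe can help. The sketch below relies on two assumptions: glibc systems answer getconf GNU_LIBC_VERSION, and musl's ldd mentions "musl" in its version output. The detect_libc name is illustrative, and any other environment is simply reported as unknown.

```shell
# Report which C library the current environment uses.
# Assumptions: glibc answers `getconf GNU_LIBC_VERSION`; musl's `ldd`
# mentions "musl" in its version/usage output. Anything else is "unknown".
detect_libc() {
  if getconf GNU_LIBC_VERSION >/dev/null 2>&1; then
    echo glibc
  elif ldd --version 2>&1 | grep -qi musl; then
    echo musl
  else
    echo unknown
  fi
}

detect_libc
```

Running the same checks inside a container started from node:22-alpine versus node:22-slim should make the difference between the two bases visible.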

On Docker's Misleading Tagging Schema

It's rant o'clock! 🙈

Without any prior knowledge of what's inside these images, which Node.js image would you choose, judging solely by its name: node:22 or node:22-slim?

Clearly, node:22 sounds like a safer bet than node:22-slim, whose name hints at reduced functionality.

In actuality, though, it's the opposite! As we now know, the node:<version> image is full of development dependencies and, in most cases, should be avoided, while the full-fledged node:<version>-slim image is a good default choice for the majority of Node.js applications.

The situation gets even worse when we add the existing tag aliases to the picture:

  • node:latest is an alias for the "fat" node:23 image (the highest released version)
  • node:lts is an alias for the "fat" node:22 image (the current LTS release)

What could be a safer choice than an image called node:lts? 🤦 And node, a.k.a. node:latest, is probably the most frequently used demo Node.js image - gigabytes of data flying around for absolutely no reason.

Here is what a less misleading tagging schema could look like:

  • The node:<version> tag could become node:<version>-sdk, emphasizing its development orientation.
  • The node:<version>-slim tag could replace the node:<version> tag, promoting its usage as the default.
  • The node:latest and node:lts aliases would then start pointing to the "slim" variants of the image.

Alas, such a change sounds unrealistic due to the large-scale reliance on the existing naming convention.
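Until then, a simple check can at least flag the misleading tags in your own Dockerfiles. The sketch below is a heuristic, not a complete linter: it matches FROM lines that use bare node, node:latest, node:lts, or a purely numeric node:<version> tag, and it won't catch other "fat" tags that carry a distro suffix. The find_fat_node_tags name is made up for this example.

```shell
# Flag FROM lines that resolve to the "fat" image: bare `node`,
# `node:latest`, `node:lts`, or a numeric tag without a variant suffix.
# Reads the files given as arguments, or stdin when none are given.
find_fat_node_tags() {
  grep -niE '^[[:space:]]*FROM[[:space:]]+(--platform=[^[:space:]]+[[:space:]]+)?node(:(latest|lts|[0-9]+(\.[0-9]+)*))?([[:space:]]|$)' "$@"
}

# Demo on an inline sample: only the first line is reported.
find_fat_node_tags <<'EOF'
FROM node:lts AS build
FROM node:22-slim
EOF
```

Since grep exits with a non-zero status when nothing matches, the function can also be used in a CI step that fails whenever a match is found.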

Bitnami vs. Official Docker Images

You've probably already guessed why the bitnami/node image is so large, but for the sake of completeness, let's examine it more closely, too.

The bitnami/node:<version> image is almost as big as the "official" Docker node:<version> image, and a quick check reveals that it's due to a set of packages very similar to buildpack-deps: Python, GCC, and other build tools and "development header" packages. But its actual layer composition is opaque - the bitnami/node image is (supposedly) based on Bitnami's own Debian base image minideb, but it's hard to tell for sure what layers it has on top of it because the final Node.js image is flattened.

Unlike the official Docker images for Node.js, the bitnami/node image lacks the slim or alpine variants, which is a bit strange. Thus, similarly to the node:<version> image, you should probably only use it if you're building some of your Node.js modules from C++.

trivy image -q bitnami/node:22
bitnami/node:22 (debian 12.7)

Total: 626 (UNKNOWN: 0, LOW: 230, MEDIUM: 349, HIGH: 45, CRITICAL: 2)

However, the motivation to use the Bitnami image over the node:<version> image is not very clear. Bitnami's image is slightly smaller and has ~37% fewer reported CVEs - but 626 vs. 996 CVEs is not a game changer (it's still too many CVEs for a production image, while build-only images tend to receive less scrutiny). Perhaps if you rely on some other Bitnami images like bitnami/java, bitnami/python, or bitnami/mongodb, you may want to use bitnami/node to stick with just one ecosystem 🤷🏼

GoogleContainerTools Distroless

The GoogleContainerTools/distroless project provides a hierarchy of extremely small images, with a number of language runtimes at the very top:

GoogleContainerTools distroless images hierarchy.

None of these distroless images have a shell or a package manager inside, so they are absolutely unsuitable for installing extra system packages (unless you want to learn Bazel and derive your own distroless variant).

The gcr.io/distroless/nodejs image comes only with Node.js and a handful of system libraries needed for it to run (libc, libgcc, libstdc++, libssl3, etc). There isn't even npm in this image, so the only way to install your application dependencies will be to copy them from the build stage, which uses a more developer-friendly Node.js image (like node:<version>-slim or node:<version>).

Of course, with so little stuff inside, this image should have very few CVEs:

trivy image -q gcr.io/distroless/nodejs22-debian12
gcr.io/distroless/nodejs22-debian12 (debian 12.7)

Total: 15 (UNKNOWN: 0, LOW: 11, MEDIUM: 4, HIGH: 0, CRITICAL: 0)

However, there are a number of caveats to consider before using the gcr.io/distroless/nodejs image:

  • The Node.js images are available only for the last three LTS releases (18, 20, 22 at the moment).
  • Installing extra system packages is (close to) impossible.
  • Installing Node.js dependencies is only possible by copying them over from some other place.
  • Application debugging is more difficult since there will be no shell in the container.
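  • The lack of a shell also means that child_process.exec() won't work, because it runs commands through a shell (so if your JavaScript code needs to execute other binaries you copied to the image, you'll have to use shell-free calls such as child_process.execFile() or child_process.spawn()).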

Thus, the gcr.io/distroless/nodejs image is a good choice for running production Node.js applications in a security-hardened environment at the cost of (quite) some extra operational challenges.
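Here is a rough sketch of how the "copy dependencies from the build stage" approach could look. The file name, the /app directory, and the server.js entry point are hypothetical. The sketch also relies on my understanding that the distroless Node.js image already sets node as its entrypoint, so CMD only names the script; double-check this against the image documentation before relying on it.

```shell
# Write a two-stage Dockerfile: dependencies are installed where npm exists,
# then copied into the distroless runtime image. server.js is hypothetical.
cat > Dockerfile.distroless <<'EOF'
# Build stage: npm is available here.
FROM node:22-slim AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Runtime stage: no shell, no npm, only Node.js and its libraries.
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=build /app /app
# Assumption: the image's entrypoint is the node binary, so CMD is the script.
CMD ["server.js"]
EOF

# Build it on a machine where Docker is available:
# docker build -f Dockerfile.distroless -t my-app:distroless .
```

Note the exec-form CMD: with no shell in the image, the shell form of CMD or RUN would have nothing to run it in the final stage.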

Chainguard's Distroless

Chainguard is known for its minimal, hardened container images that are based on their own Linux (un)distro, Wolfi.

The cgr.dev/chainguard/node image has a very small overhead (less than 30 packages installed) and usually has few to no (reported) CVEs:

trivy image -q cgr.dev/chainguard/node
cgr.dev/chainguard/node (wolfi 20230201)

Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)

Similarly to gcr.io/distroless/nodejs, this image doesn't have a package manager, so installing extra system dependencies might be tricky. But unlike the GoogleContainerTools image, it does come with a shell (via busybox) and npm inside, so you'll be able to npm install packages and child_process.exec() whatever you like from your JavaScript code:

docker run \
  --entrypoint sh \
  cgr.dev/chainguard/node:latest \
  -c 'npm --version'
10.9.0

Judging by its completeness and the minimal footprint, the cgr.dev/chainguard/node image resembles the official Docker node:<version>-alpine variant. However, unlike it, Chainguard's node image is shipped with glibc, so there should be fewer operational surprises.

The main gotcha - you'll have to pay for using this image unless you want to stick with the (ever-changing) latest tag, which, at the time of writing, is not even pointing to an LTS release:

docker run --rm chainguard/node:latest --version
v23.0.0
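A cheap guard against this surprise is to check the major version that the latest tag currently resolves to. The sketch below assumes the Node.js convention that only even-numbered release lines are promoted to LTS, so an odd major can be rejected outright, while an even major may still be in its pre-LTS "Current" phase. The is_lts_candidate name is made up for this example.

```shell
# Succeeds if the given Node.js version string has an even major number.
# Assumption: only even-numbered release lines are promoted to LTS, so an
# odd major (like v23) is never LTS; an even major may still be pre-LTS.
is_lts_candidate() {
  major=${1#v}          # strip the leading "v"
  major=${major%%.*}    # keep only the part before the first dot
  [ $(( major % 2 )) -eq 0 ]
}

for v in v22.10.0 v23.0.0; do
  if is_lts_candidate "$v"; then
    echo "$v: even major, possible LTS line"
  else
    echo "$v: odd major, not an LTS line"
  fi
done
```

In a build pipeline, the version string could come from running the image with --version, as in the command above.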

Conclusion

Based on the insights from our research, here's a simple heuristic to help you choose the right base image for your Node.js application:

Node.js image choices.

For Development or Build Stages:

  • Use node:<version> or bitnami/node:<version> only if you're compiling native Node.js modules from C++.
  • For most other cases, the slimmer and more efficient choice is node:<version>-slim.

For Production Environments:

  • If security vulnerabilities (CVEs) and image size aren't your top priorities, go with node:<version>-slim for a smoother experience.
  • If your application needs several additional OS packages, node:<version>-slim is again the most practical option.
  • For highly secure or regulated environments:
    • Opt for gcr.io/distroless/nodejs as a free option (but be cautious with child_process.exec() limitations).
    • Use cgr.dev/chainguard/node if you're okay with the pricing or with always sticking to the latest tag.

What to Avoid:

  • Avoid using node:latest outside of demo or experimental scenarios.
  • Don't use the "fat" node:<version> image for production unless absolutely necessary.
  • Skip node:<version>-alpine if you aren't familiar with the implications of musl vs. glibc.

Lastly, always utilize multi-stage builds for more efficient, secure containers - and stay safe!
