A Deeper Look into Node.js Docker Images: Help, My Node Image Has Python!
Picking the right base image for your Node.js application can be challenging.
Different variants of the official Docker node image differ 5x in size and 10x in the number of reported CVEs,
but the smallest image is not always the best.
There is also bitnami/node, which is similar but not identical to the "official" Docker image.
And, of course, don't forget about the distroless options such as cgr.dev/chainguard/node and gcr.io/distroless/nodejs.
So, which Node.js image should you use and when?
Let's dig in!
Picking The Candidates For Inspection
Node.js recommends using Active LTS or Maintenance LTS releases for production applications, so, where possible, we'll use the active LTS version for our research. However, the majority of considerations in this post will be applicable to any Node.js release line.
Here is a diverse image selection that represents the most distinct Node.js 22 image variants (the soon-to-be Active LTS release):
docker pull node:22
docker pull node:22-slim
docker pull node:22-alpine
docker pull bitnami/node:22
GoogleContainerTools distroless
docker pull gcr.io/distroless/nodejs22-debian12
The cgr.dev/chainguard/node:22
tag is not available for free,
so we'll use cgr.dev/chainguard/node:latest
instead:
docker pull cgr.dev/chainguard/node:latest
Simply listing the pulled images can already give us some initial food for thought:
docker images
REPOSITORY                            TAG         SIZE
node                                  22          1.12GB
bitnami/node                          22          974MB
node                                  22-slim     220MB
node                                  22-alpine   155MB
cgr.dev/chainguard/node               latest      145MB
gcr.io/distroless/nodejs22-debian12   latest      141MB
First of all, the size difference between these images is stunning.
Also, we can probably estimate the size of the Node.js installation itself -
it should be around 140MB, given that the distroless and alpine-based images have very little overhead.
But if Node.js is only 140MB, what makes up the rest of the huge node:22 and bitnami/node:22 images?
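To put a rough number on "the rest", here is a back-of-the-envelope sketch. It assumes the ~140MB estimate for Node.js itself and treats 1.12GB as 1120MB, so the results are approximations rather than measurements:

```shell
#!/bin/sh
# Assumed size of the Node.js installation itself (the rough estimate from the text).
NODE_MB=140

# Print the approximate non-Node.js overhead of every pulled image.
# Sizes (in MB) are copied from the `docker images` listing; 1.12GB is taken as 1120MB.
print_overhead() {
  while read -r image size_mb; do
    echo "$image: ~$((size_mb - NODE_MB))MB on top of Node.js"
  done <<'EOF'
node:22 1120
bitnami/node:22 974
node:22-slim 220
node:22-alpine 155
cgr.dev/chainguard/node:latest 145
gcr.io/distroless/nodejs22-debian12 141
EOF
}

print_overhead
```

By this estimate, the two largest images carry several hundred megabytes of content that isn't Node.js.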
It's time to take a closer look at these Node.js images,
starting with the largest one - node:22
.
Help, My Node Image Has Python!
Here is a little-known fact about the node:22 image that might blow your mind -
this image has a full-fledged Python installation inside!
docker run --entrypoint bash node:22 -c 'python3 --version'
Python 3.11.2
Interestingly, Python is not the only "unexpected" package in this image - for instance, it also includes the entire GNU Compiler Collection:
docker run --entrypoint bash node:22 -c 'gcc --version'
gcc (Debian 12.2.0-14) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
No wonder its size is more than 1GB - node:22 has a grand total of 661 packages in it 🤯
syft node:22
✔ Loaded image
✔ Parsed image
✔ Cataloged contents
├── ✔ Packages [661 packages]
├── ✔ File digests [19,730 files]
├── ✔ File metadata [19,730 locations]
└── ✔ Executables [1,331 executables]
And, of course, with so many packages come a lot of CVEs:
trivy image -q node:22
node:22 (debian 12.7)
Total: 996 (UNKNOWN: 3, LOW: 492, MEDIUM: 419, HIGH: 76, CRITICAL: 6)
Are all these packages needed to run Node.js?
Definitely not - otherwise, the much smaller node:22-slim
and the distroless Node.js images wouldn't exist.
But then why does the node:22
image have all this "bloat" in it?
Its Docker Hub page might have a hint (which also happens to be a piece of poor advice on base image selection):
Excerpt of the Docker Hub node
image summary page.
As a typical representative of the node:<version>
family,
the node:22
image is based on a mysterious buildpack-deps
image,
and that's where the "bloat" comes from... We need to go deeper!
Screenshot of the Docker Hub node:22
tag page.
As its name suggests,
buildpack-deps
is a build-tailored image that
"includes a large number of "development header" packages needed by various things like Ruby Gems, PyPI modules, etc."
The node:<version> image is not the only one that depends on buildpack-deps.
Popular language images such as python, golang, rust, ruby, openjdk, and many others rely on this image as a base.
But very few people actually know about that,
so buildpack-deps
is a true éminence grise of the container image world!
At first sight, basing a large number of language runtime images on a common "toolset" image seems like a good idea
because Docker can reuse the common layers between these images.
However, it assumes that "the average user of Docker" works with multiple languages at the same time,
and all these languages also happen to rely on exactly the same version of the buildpack-deps
base image underneath.
Only then will the layer sharing actually happen, leading to lower overall disk usage.
In practice, though, this often results in N slightly different minor versions of the buildpack-deps
image being pulled
to satisfy the version requirements of different language runtime images (or different versions of the same language image being in concurrent use).
But the buildpack-deps
image is a general-purpose image!
It means that every particular language runtime needs only a small fraction of the packages it provides.
⚠️ Docker Hub's advice to use the buildpack-deps-based node:<version> image by default is rather misguided -
it leads to unnecessarily bloated Node.js images and increased security risks.
Summarizing, here is what we can tell about the node:<version> image so far:
- Its large size is caused by lots of dev tools inside, many of which aren't even Node.js-specific.
- The node:<version> image still may be a valid choice for development and build stages.
- Because of all these dev tools, it should never be used to run production applications!
Now, let's try to understand which Node.js development and/or build tasks actually require Python and GCC.
Do You Compile Your Node.js Modules From C++?
Most Node.js modules are written in JavaScript (or TypeScript and then transpiled).
To npm install
a package written entirely in JavaScript,
it's enough to fetch its source code from the package repository and place it into the local node_modules
folder.
However, Node.js also supports "native" modules, a.k.a. addons, that are written in C++. Installing a package that has one or more modules written in C++ may require compilation. For instance, a package can provide pre-built binaries for popular platforms such as linux/amd64 and darwin/arm64 but require compilation from the source on other platforms. And this is when you will need GCC and... Python in your Node.js image.
Some prominent Node.js packages written in C++ are:
- bcrypt - a library for hashing passwords
- node-sqlite3 - a SQLite3 binding for Node.js
- sharp - a high-performance Node.js module for processing images
For example, installing bcrypt using the node:<version> image for a less mainstream platform (linux/arm/v7) takes a while but succeeds:
docker run --privileged --rm tonistiigi/binfmt --install all
docker run \
--platform linux/arm/v7 \
--entrypoint bash \
node:22 \
-c 'npm install -g bcrypt'
added 59 packages in 1m
4 packages are looking for funding
run `npm fund` for details
However, the same npm install -g bcrypt
command fails in the node:<version>-slim
image:
docker run \
--platform linux/arm/v7 \
--entrypoint bash \
node:22-slim \
-c 'npm install -g bcrypt'
The (abridged) error message makes it clear that the failure is caused by the missing Python installation,
which in turn was needed for node-gyp
to prepare the compilation environment for GCC:
npm error code 1
npm error path /usr/local/lib/node_modules/bcrypt
npm error command failed
npm error command sh -c node-pre-gyp install --fallback-to-build
npm error Failed to execute '/usr/local/bin/node /usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js configure ...
...
npm error node-pre-gyp info check checked for "/usr/local/lib/node_modules/bcrypt/lib/binding/napi-v3/bcrypt_lib.node" (not found)
npm error node-pre-gyp http GET https://github.com/kelektiv/node.bcrypt.js/releases/download/v5.1.1/bcrypt_lib-v5.1.1-napi-v3-linux-arm-glibc.tar.gz
npm error node-pre-gyp ERR! install response status 404 Not Found on https://github.com/kelektiv/node.bcrypt.js/releases/download/v5.1.1/bcrypt_lib-v5.1.1-napi-v3-linux-arm-glibc.tar.gz
npm error node-pre-gyp WARN Pre-built binaries not installable for bcrypt@5.1.1 and node@22.10.0 (node-v127 ABI, glibc) (falling back to source compile with node-gyp)
npm error node-pre-gyp WARN Hit error response status 404 Not Found on https://github.com/kelektiv/node.bcrypt.js/releases/download/v5.1.1/bcrypt_lib-v5.1.1-napi-v3-linux-arm-glibc.tar.gz
...
npm error gyp ERR! find Python
npm error gyp ERR! find Python Python is not set from command line or npm configuration
...
npm error gyp ERR! stack Error: Could not find any Python installation to use
At the same time, bcrypt
can be installed in the linux/amd64
variant of the node:<version>-slim
image just fine,
thanks to the pre-compiled binaries it provides:
docker run \
--platform linux/amd64 \
--entrypoint bash \
node:22-slim \
-c 'npm install -g bcrypt'
added 59 packages in 40s
4 packages are looking for funding
run `npm fund` for details
Two important takeaways from the above experiment are:
- Applications that depend on native Node.js modules may need to compile them from C++ during the npm install step.
- However, the compilation is only required if these modules don't provide pre-built binaries (and the most popular modules often do, at least for the mainstream platforms).
Thus, the dependency on node-gyp (hence, Python and GCC) occurs only during the build step and only for some fraction of npm install (or npm ci) runs.
So, the rule of thumb should probably be:
Always try building your Node.js applications using the node:<version>-slim image first and switch to the "fat" node:<version> variant only if you see some node-gyp-related failures.
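If you'd rather check up front what node-gyp would find in a given image, a small probe can help. The sketch below is only an illustration: the check_build_tools helper is a made-up name, and the tool list (python3, gcc, g++, make) is an assumption based on node-gyp's usual Unix requirements rather than something measured in this post.

```shell
#!/bin/sh
# Shell snippet that reports which build tools are on the PATH.
# The tool list is an assumption (node-gyp's typical Unix prerequisites).
PROBE='for tool in python3 gcc g++ make; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done'

# check_build_tools IMAGE (hypothetical helper): run the probe inside the image.
# The image must contain a POSIX shell, so this won't work for shell-less images.
check_build_tools() {
  docker run --rm --entrypoint sh "$1" -c "$PROBE"
}

# Usage (requires Docker):
#   check_build_tools node:22
#   check_build_tools node:22-slim
```

If everything is reported missing for the image you plan to build in, any dependency without a pre-built binary for your platform will likely fail to install there.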
Additionally, if the application does have valid reasons to be built in a node:<version>
image,
it should also employ a multi-stage build process
that uses a different and much smaller Node.js image for the runtime stage.
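A minimal sketch of such a multi-stage build could look like this. The /app layout, the server.js entry point, and the my-node-app tag are placeholders, not something prescribed by the images themselves:

```shell
#!/bin/sh
# Write a two-stage Dockerfile: compile in the "fat" image, run in the slim one.
# server.js and the /app directory are placeholders for illustration.
cat > Dockerfile.multistage <<'EOF'
# Build stage: Python and GCC are available here for node-gyp
FROM node:22 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .

# Runtime stage: Debian-based as well, but without the build toolchain
FROM node:22-slim
WORKDIR /app
COPY --from=build /app ./
USER node
CMD ["node", "server.js"]
EOF

# Build it (requires Docker and a Node.js project in the current directory):
#   docker build -f Dockerfile.multistage -t my-node-app .
```

Since both stages are Debian-based, native modules compiled in the first stage should load fine in the second one. It's also worth adding node_modules to .dockerignore so that the COPY . . step doesn't overwrite the freshly installed dependencies with a local copy.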
Which Node.js Image Do Pragmatic People Choose?
Now, let's take a look at the "slim" variant of the official Docker Node.js image - node:<version>-slim.
Unlike the "fat" node:<version> image, it has a much simpler structure -
a regular debian:stable-slim base and the Node.js layer on top of it:
Screenshot of the Docker Hub node:22-slim
tag page.
As we saw in the previous section, this image is a good choice for the build stage
when you don't need to compile your Node.js modules from C++ (i.e., 99% of the time).
And it can also be an acceptable choice for the runtime stage -
up-to-date versions of the debian:stable
base image tend to be well maintained,
and most of the high and critical CVEs in them are quickly patched.
trivy image -q node:22-slim
node:22-slim (debian 12.7)
Total: 72 (UNKNOWN: 0, LOW: 57, MEDIUM: 13, HIGH: 1, CRITICAL: 1)
...which is a pretty good result compared to 996 CVEs in the node:<version>
image.
Of course, using the node:<version>-slim image has a few downsides:
- The debian base is ten times thinner than buildpack-deps, but its overhead is still noticeable (88 extra packages worth 75MB out of the total 220MB).
- Application images based on node:<version>-slim will require periodic rebuilds to catch security patches from its debian base.
- Even with few to no high and critical CVEs in this image, it has a few hundred executables, including a shell, that increase the attack surface and can simplify lateral movement.
However, the ease of using this image and its universality often outweigh all the downsides:
- It's a regular Debian-based image, so you can apt-get install any system dependencies you might need.
- Because fresh versions of debian:stable-slim usually have almost no high or critical CVEs in them, the vuln scans of the base image won't be too noisy.
- You can use node:<version>-slim for both build- and run-time stages (or even use a single-stage build if it doesn't leave traces of development dependencies in the final image).
- You can debug your containers "as usual" - the shell is there, and you can also install extra tools when needed.
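For instance, adding a system dependency on top of the slim image takes a single RUN instruction. In this sketch, curl is just a stand-in for whatever package your application actually needs:

```shell
#!/bin/sh
# Write a Dockerfile that extends node:22-slim with an extra system package.
# curl is only a placeholder package for illustration.
cat > Dockerfile.slim <<'EOF'
FROM node:22-slim
# Install the package and drop the apt lists to keep the layer small
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
EOF

# Build it (requires Docker):
#   docker build -f Dockerfile.slim -t my-node-base .
```

Keeping the update, install, and cleanup steps in one RUN instruction avoids baking the package index into a separate layer.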
Yes, node:<version>-slim
is not an ideal image, but the pragmatic choice award is well deserved here.
Which Node.js Image Do Brave People Choose?
If the overhead of the debian base in the node:<version>-slim image sounds too big for you,
you may consider the node:<version>-alpine variant.
This image has a similar two-layer structure, but its OS layer is ten times thinner.
Screenshot of the Docker Hub node:22-alpine
tag page.
The node:<version>-alpine
image may or may not be a good alternative Node.js image for your application.
Its alpine
base is very small, and the image usually has no (reported) CVEs:
trivy image -q node:22-alpine
node:22-alpine (alpine 3.20.3)
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
However, Node.js has no officially supported builds for Alpine Linux -
the only available builds still have the experimental status.
Additionally, since Alpine relies on musl, an alternative C standard library implementation,
the behavior and/or performance of some applications may differ from the more widespread glibc-based distributions like Debian:
Excerpt of the Docker Hub node
image summary page.
Nevertheless, the node:<version>-alpine image is a valid choice for a build- and/or run-time Node.js image.
However, if you choose this image for a mission-critical application,
you should have at least some awareness of the musl and glibc discrepancies.
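If you're not sure which C library a particular Node.js image runs on, you can ask Node.js itself. The sketch below relies on my understanding that the Node.js diagnostic report exposes a glibcVersionRuntime header field on glibc systems only. Treat that as an assumption and verify it against your Node.js version. The libc_of helper name is made up for this example.

```shell
#!/bin/sh
# JavaScript expression evaluated by `node -p`.
# Assumption: the diagnostic report header has glibcVersionRuntime only on glibc systems.
LIBC_EXPR='process.report.getReport().header.glibcVersionRuntime || "no glibc detected (likely musl)"'

# libc_of IMAGE (hypothetical helper): print the glibc version Node.js sees, if any.
libc_of() {
  docker run --rm "$1" node -p "$LIBC_EXPR"
}

# Usage (requires Docker):
#   libc_of node:22-slim
#   libc_of node:22-alpine
```

A glibc version in the output means native modules built on Debian-like systems have a chance of loading; the fallback message suggests they'd need musl-specific builds.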
On Docker's Misleading Tagging Schema
It's rant o'clock! 🙈
Without any prior knowledge of what's inside these images,
which Node.js image would you choose, judging solely by its name: node:22 or node:22-slim?
Clearly, node:22 sounds like a safer bet than node:22-slim, whose name hints at reduced functionality.
In actuality, though, it's the opposite!
As we now know, the node:<version>
image is full of development dependencies and, in most cases, should be avoided,
while the full-fledged node:<version>-slim
image is a good default choice for the majority of Node.js applications.
The situation gets even worse when we add the existing tag aliases to the picture:
- node:latest is an alias for the "fat" node:23 image (the highest released version)
- node:lts is an alias for the "fat" node:22 image (the Active LTS release)
What could be a safer choice than an image called node:lts? 🤦
And node, a.k.a. node:latest, is probably the most frequently used demo Node.js image -
gigabytes of data flying around for absolutely no reason.
Here is what a less misleading tagging schema could look like:
- The node:<version> tag could become node:<version>-sdk, emphasizing its development orientation.
- The node:<version>-slim tag could replace the node:<version> tag, promoting its usage as the default.
- The node:latest and node:lts aliases would then start pointing to the "slim" variants of the image.
Alas, such a change sounds unrealistic due to the large-scale reliance on the existing naming convention.
Bitnami vs. Official Docker Images
You've probably already guessed why the bitnami/node
image is so large,
but for the sake of completeness, let's examine it closer, too.
The bitnami/node:<version> image is almost as big as the "official" Docker node:<version> image,
and a quick check reveals that it's due to a set of packages very similar to buildpack-deps:
Python, GCC, and other build tools and "development header" packages.
But its actual layer composition is opaque -
the bitnami/node image is (supposedly) based on minideb, Bitnami's own Debian base image,
but it's hard to tell for sure what layers it has on top of it because the final Node.js image is flattened.
Unlike the official Docker images for Node.js, the bitnami/node image lacks the slim or alpine variants,
which is a bit strange.
Thus, similarly to the node:<version> image, you should probably only use it if you're building some of your Node.js modules from C++.
trivy image -q bitnami/node:22
bitnami/node:22 (debian 12.7)
Total: 626 (UNKNOWN: 0, LOW: 230, MEDIUM: 349, HIGH: 45, CRITICAL: 2)
However, the motivation to use the Bitnami image over the node:<version> image is not very clear.
Bitnami's image is slightly smaller and has ~35% fewer reported CVEs -
but 626 vs. 996 CVEs is not a game changer (it's still too many CVEs for a production image,
while build-only images tend to receive less scrutiny).
Perhaps if you rely on some other Bitnami images like bitnami/java, bitnami/python, or bitnami/mongodb,
you may want to use bitnami/node to stick with just one ecosystem 🤷🏼
GoogleContainerTools Distroless
The GoogleContainerTools/distroless project provides a hierarchy of extremely small images, with a number of language runtimes at the very top.
None of these distroless images have a shell or a package manager inside, so they are absolutely unsuitable for installing extra system packages (unless you want to learn Bazel and derive your own distroless variant).
The gcr.io/distroless/nodejs image comes only with Node.js and a handful of system libraries needed for it to run
(libc, libgcc, libstdc++, libssl3, etc.).
There isn't even npm in this image,
so the only way to install your application dependencies will be to copy them from the build stage,
which uses a more developer-friendly Node.js image (like node:<version>-slim or node:<version>).
Of course, with so little stuff inside, this image should have very few CVEs:
trivy image -q gcr.io/distroless/nodejs22-debian12
gcr.io/distroless/nodejs22-debian12 (debian 12.7)
Total: 15 (UNKNOWN: 0, LOW: 11, MEDIUM: 4, HIGH: 0, CRITICAL: 0)
However, there are a number of caveats to consider before using the gcr.io/distroless/nodejs image:
- The Node.js images are available only for the last three LTS releases (18, 20, 22 at the moment).
- Installing extra system packages is (close to) impossible.
- Installing Node.js dependencies is only possible by copying them over from some other place.
- Application debugging is more difficult since there will be no shell in the container.
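- The lack of a shell also means that child_process.exec() won't work, since it runs its command through a shell (so if your JavaScript code needs to execute some other binaries you copied to the image, it will have to start them directly, for example with child_process.execFile() or child_process.spawn(), which don't need a shell by default).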
Thus, the gcr.io/distroless/nodejs image is a good choice for running production Node.js applications in a security-hardened environment at the cost of (quite) some extra operational challenges.
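Here's a rough sketch of what "copying dependencies from the build stage" can look like with the distroless image. The /app layout and server.js are placeholders, and the sketch assumes the distroless image's entrypoint is already node, so CMD only names the script to run:

```shell
#!/bin/sh
# Write a two-stage Dockerfile: install dependencies with npm, run on distroless.
# server.js and the /app directory are placeholders for illustration.
cat > Dockerfile.distroless <<'EOF'
# Build stage: npm is available here
FROM node:22-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .

# Runtime stage: no shell and no npm; the entrypoint is assumed to be node
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=build /app ./
CMD ["server.js"]
EOF

# Build it (requires Docker and a Node.js project in the current directory):
#   docker build -f Dockerfile.distroless -t my-node-app:distroless .
```

Note that RUN instructions won't work in the final stage (there is no shell to execute them), so everything the application needs has to arrive via COPY.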
Chainguard's Distroless
Chainguard is known for its minimal, hardened container images that are based on their own Linux (un)distro, Wolfi.
The cgr.dev/chainguard/node image has a very small overhead (fewer than 30 packages installed) and usually has few to no (reported) CVEs:
trivy image -q cgr.dev/chainguard/node
cgr.dev/chainguard/node (wolfi 20230201)
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
Similarly to gcr.io/distroless/nodejs, this image doesn't have a package manager,
so installing extra system dependencies might be tricky.
But unlike the GoogleContainerTools image, it does come with a shell (via busybox) and npm inside,
so you'll be able to npm install packages and child_process.exec() whatever you like from your JavaScript code:
docker run \
--entrypoint sh \
cgr.dev/chainguard/node:latest \
-c 'npm --version'
10.9.0
Judging by its completeness and the minimal footprint,
the cgr.dev/chainguard/node image resembles the Docker official node:alpine variant.
However, unlike it, Chainguard's node image is shipped with glibc, so there should be fewer operational surprises.
The main gotcha - you'll have to pay for using this image unless you want to stick with the (ever-changing) latest tag,
which, at the time of writing this, is not even pointing to an LTS release:
docker run --rm chainguard/node:latest --version
v23.0.0
Conclusion
Based on the insights from our research, here's a simple heuristic to help you choose the right base image for your Node.js application:
For Development or Build Stages:
- Use node:<version> or bitnami/node:<version> only if you're compiling native Node.js modules from C++.
- For most other cases, the slimmer and more efficient choice is node:<version>-slim.
For Production Environments:
- If security vulnerabilities (CVEs) and image size aren't your top priorities, go with node:<version>-slim for a smoother experience.
- If your application needs several additional OS packages, node:<version>-slim is again the most practical option.
- For highly secure or regulated environments:
  - Opt for gcr.io/distroless/nodejs as a free option (but be cautious with child_process.exec() limitations).
  - Use cgr.dev/chainguard/node if you're okay with the pricing or always sticking to the latest tag.
What to Avoid:
- Avoid using node:latest outside of demo or experimental scenarios.
- Don't use the "fat" node:<version> image for production unless absolutely necessary.
- Skip node:<version>-alpine if you aren't familiar with the implications of musl vs. glibc.
Lastly, always utilize multi-stage builds for more efficient, secure containers - and stay safe!