Always Pin Your Versions

2017-11-12 tech programming

I got thinking about version pinning yesterday after reading the HN discussion of this article. This post isn’t about that article, but rather about something that I like to do which seems to be very unconventional: I always pin version numbers. I would argue that, in any serious project, you should never use a wildcard version for a dependency: the downsides far outweigh the benefits.

Why could it be a good thing to point at version 1.* or ^1.2.0 or whatever? A few reasons occur to me:

You get the latest library bugfixes and features with every build
It forces your hand regarding upgrades, which can reduce maintenance cost when it’s time to upgrade in the future (e.g. an urgent security patch)
It reduces friction when developing

I don’t think any of them stands up to scrutiny.

You get the latest library bugfixes and features with every build

If the library’s bugfix is correcting an issue that was not impacting your project in a noticeable way, then pulling in the fix adds negligible value.
If the library’s bugfix is correcting a known issue in your project, or addresses a critical security vulnerability, then pulling in the bugfix warrants a mention in source control which would be absent if the upgrade is done as a result of a version wildcard.
If you’re starting to rely on a new feature of the library, there should be a corresponding library version bump in source control.
You always run the risk of introducing new library bugs alongside the purported improvements. You are better off if you can git bisect to the commit where you upgraded the library, rather than having to cross-reference your build history with your library’s version history.
Silent upgrades also introduce ambiguity about when and why behavior changed:

A: Hey, when did we fix that bug caused by the upstream library?

B: Oh, there’s no commit in git, it was just on Tuesday when they rolled out a new minor version and our builds magically stopped having the issue.

In my view, builds should be as deterministic and reproducible as possible. Using a wildcard version undermines reproducibility by introducing date/time as a build variable.
In general, it’s hard to tell when breaking changes are introduced. Semantic versioning is good, but it’s really just a social norm, not enforced in any way (and even then it only signals breaking changes for versions >= 1.0.0, since anything may change in a 0.x release). You can lean on your tests, but even a great test suite won’t catch everything, and some libraries may be stubbed out in tests altogether. The bugs that do get past this filter are likely to be subtle. All the more reason to approach every upgrade thoughtfully.
This all holds just as true for transitive dependencies. You depend on library A which depends on library B, and you want library A to get the latest and greatest version of library B. That should really only happen as a conscious change on the part of library A for the reasons just listed.
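
To make the date/time point concrete, here is a minimal sketch in Python of how a caret range can resolve to different versions depending on when the build runs. It assumes npm-style caret semantics (the range stops before the next bump of the left-most non-zero component) and plain MAJOR.MINOR.PATCH versions with no pre-release tags. The function names and the release history are made up for illustration; this is not how any real package manager is implemented.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: Version) -> Version:
    """Exclusive upper bound of ^base: bump the left-most non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def resolve(spec: str, published: List[str]) -> Optional[str]:
    """Return the highest published version satisfying spec, or None."""
    if spec.startswith("^"):
        low = parse(spec[1:])
        high = caret_upper_bound(low)
        candidates = [v for v in published if low <= parse(v) < high]
    else:
        # Anything without a caret is treated as an exact pin in this sketch.
        candidates = [v for v in published if v == spec]
    return max(candidates, key=parse, default=None)


# Hypothetical release history, before and after upstream publishes new versions.
monday = ["1.2.0", "1.2.1"]
tuesday = monday + ["1.3.0", "2.0.0"]

print(resolve("^1.2.0", monday))   # 1.2.1
print(resolve("^1.2.0", tuesday))  # 1.3.0
print(resolve("1.2.1", monday))    # 1.2.1
print(resolve("1.2.1", tuesday))   # 1.2.1
```

The same source tree builds against 1.2.1 on one day and 1.3.0 the next when the range is used, while the pinned specifier gives the same answer both days.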

It forces your hand regarding upgrades

If an upgrade is trivial enough that it requires no changes in your codebase, then the maintenance cost of upgrading later is also trivial.
If an upgrade is not trivial, then it deserves a commit along the lines of "bumping version of library x, fixing breaking changes" accompanying the code change. I am definitely not advocating leaving all your dependencies to rot forever without upgrading them. However, these upgrades should be made intentionally.

It reduces friction when developing

Sometimes you just don’t care, and that’s okay. I build some personal Docker images based on a :latest tag. I import “whatever’s latest” for the libraries I need when I’m just starting a new project. Laziness makes perfect sense sometimes, but in cases when you do care, e.g. anything that will be deployed to end users or anything developed by another person, this approach is sloppy, and the project should eventually be updated to pin the versions you’re actually using.
Wildcard dependencies are a vector for environments to get out of sync in unpredictable ways, be they teammates’ development environments, or the various staging and production environments of a deployed project.
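
When it is time to tighten up a project that started out with "whatever's latest", a small check can list the specifiers that still need pinning. The following is a rough sketch, assuming the dependencies are available as a name-to-specifier mapping shaped like the "dependencies" object in package.json. The `unpinned` helper and the package names are hypothetical, and "pinned" is taken here to mean a bare MAJOR.MINOR.PATCH string and nothing else.

```python
import re
from typing import Dict, List

# In this sketch an exact version is a bare MAJOR.MINOR.PATCH with nothing around it.
EXACT_VERSION = re.compile(r"\d+\.\d+\.\d+")


def unpinned(dependencies: Dict[str, str]) -> List[str]:
    """Return the sorted names of dependencies whose specifier is not an exact version."""
    return sorted(
        name
        for name, spec in dependencies.items()
        if not EXACT_VERSION.fullmatch(spec.strip())
    )


# Hypothetical dependency table.
deps = {
    "library-a": "2.0.6",
    "library-b": "^1.4.3",
    "library-c": "1.*",
    "library-d": "latest",
}

print(unpinned(deps))  # ['library-b', 'library-c', 'library-d']
```

Something like this could run in CI and fail the build when the list is non-empty, though where to draw the line (for example, whether pre-release versions count as pinned) is a judgment call.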

Summing up

Fundamentally, the purpose of pointing at a wildcard version is to let the version drift over time, with an implication that this will be a net benefit to you. It’s not true that every project gets monotonically better as the version gets higher. There also isn’t any intrinsic value in staying current with a library, though it is often a good idea in practice (and can be done as an intentional, manual process while keeping dependencies pinned). The benefits of pointing at a wildcard version are questionable, and there are several clear drawbacks.

In my experience of always specifying versions explicitly, it hasn’t made development any harder or caused problems with bringing dependencies up to date when necessary. Unfortunately it’s not a panacea because of wildcard transitive dependencies: I rely on library A version 2.0.6, but library A relies on library B version ^1.4.3. Lock files (package-lock.json, Gemfile.lock, etc.) are important and solve this problem, but the problem lock files solve is caused by people not pinning their dependencies. If library authors pinned their dependencies (or were forced to by the package manager), then lock files would be unnecessary.
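
As a toy illustration of that last point, the sketch below models an install step in Python. It is not how npm or Bundler actually work: the `install` function is hypothetical, and the input is assumed to already be filtered down to the published versions that satisfy each specifier, ordered oldest to newest. It only shows that pinning library A does nothing for library B until a recorded resolution (the lock) is consulted.

```python
from typing import Dict, List, Optional


def install(
    allowed: Dict[str, List[str]],
    lock: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Choose one version per package.

    `allowed` maps a package name to the published versions that satisfy its
    specifier, ordered oldest to newest. A locked version wins whenever it is
    still allowed; otherwise the newest allowed version is taken.
    """
    lock = lock or {}
    chosen = {}
    for name, versions in allowed.items():
        locked = lock.get(name)
        chosen[name] = locked if locked in versions else versions[-1]
    return chosen


# Hypothetical registry state on two different days.
# library-a is pinned by us to 2.0.6; library-b is requested by library-a as ^1.4.3.
monday = {"library-a": ["2.0.6"], "library-b": ["1.4.3"]}
friday = {"library-a": ["2.0.6"], "library-b": ["1.4.3", "1.5.0"]}

lock = install(monday)        # recorded once and committed alongside the code
print(install(friday))        # {'library-a': '2.0.6', 'library-b': '1.5.0'}
print(install(friday, lock))  # {'library-a': '2.0.6', 'library-b': '1.4.3'}
```

If library A had declared library B as exactly 1.4.3, the `friday` entry for library-b would contain a single version and the lock would have nothing left to do.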