Vendor by default

This isn’t a recommendation. It’s more of a habit that I’ve acquired recently. If this makes its way to Hacker News and people think how could he, my only response will be 🤷, why not.

I’ve been building a thing, in JavaScript, an application sort of like things I’ve built in the past. With the same basic goal of making something useful. How I get to that goal is flexible. But it ends up being a lot of using boring technology, trying not to overthink the easy parts, trying to properly-think the hard parts, and so on. So there’s not that much to write home about that would surprise your usual React/Next.js/JavaScript engineer person.

But one thing that I do think is sort of unusual is: I’m vendoring a lot of stuff.

Vendoring, in the programming sense, means “copying the source code of another project into your project.” It’s in contrast to the practice of using dependencies, which would be adding another project’s name to your package.json file and having npm or yarn download and link it up for you.

Back in 2016, I wrote an article about optimistic and pessimistic versioning, which mentioned vendoring as the ‘most pessimistic’ approach, and that’s roughly true. Vendoring means that you aren’t going to get automatic bugfixes, or new bugs, from dependencies. It means that you don’t have to trust dependency authors at all.

The downsides of vendoring - no automatic updates, a bigger source tree, more code to maintain, it seems weird and unconventional - are probably pretty clear to JavaScripters, and most members of most language families. Languages like C did vendoring by default, in that people would copy header-only libraries into their projects. Go, for a while, was a little like that - you’d copy files in, whether automatically or manually. But nowadays Go modules (to my untrained eye) look a lot like the norms of other languages.

So what are the reasons for doing this? Here’s what’s on my mind.

I read dependencies anyway

I read a lot of other people’s code. I highly recommend it. One of my golden rules is that you shouldn’t blackbox things you don’t need to. I like to “use dependencies for efficiency, not ignorance.”

When I’m vendoring code - copying it into the project and making it pass my basic ESLint & testing standards - I’ll do light rewrites and refactors of the new code, which gives me a deeper understanding of how it works and where its limits lie.

Obviously, other people’s code is their code. I didn’t do 100% of the thinking that led to it - I’m probably doing 5% of it. But absorbing that bit into my mind, instead of seeing only the external API surface, pays dividends.

And sometimes, sure - I’ll read through a dependency, start refactoring, and realize that it’s going to be simpler to write it myself, or I should find another option. It doesn’t matter if something is a dependency or my code: when you ship a product, it’s all your responsibility.

You don’t really use most of your dependencies

Vendoring makes it very obvious that many dependencies have lots of surface area - API calls and methods - that you don’t use. This hooks into the debate about tiny modules (less surface area! less waste! way harder to maintain!) versus big ones, but for this article, the point is that you can pare down dependencies into their ideal form for your application.

Dead code elimination helps! But it can only do so much, and the dead code still exists in your project. Even the most advanced DCE isn’t going to pare down methods of a class that you’re using. Only some DCE systems can handle CommonJS, and most work at the level of individual exports and top-level variables, not the properties or methods inside them.
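Here’s a rough sketch of what that looks like in practice. The Color class and the rgbToHex function are made up for illustration and don’t come from any real package. Assume the application only ever calls toHex.

```javascript
// Hypothetical dependency, as it might look when first copied into the project.
class Color {
  constructor(r, g, b) {
    this.r = r;
    this.g = g;
    this.b = b;
  }

  // The only method the application actually calls.
  toHex() {
    const hex = (n) => n.toString(16).padStart(2, "0");
    return `#${hex(this.r)}${hex(this.g)}${hex(this.b)}`;
  }

  // Never called by the application. Because these hang off a class that
  // is in use, a bundler will generally keep them in the output.
  mix(other, t) {
    const lerp = (a, b) => Math.round(a + (b - a) * t);
    return new Color(
      lerp(this.r, other.r),
      lerp(this.g, other.g),
      lerp(this.b, other.b)
    );
  }

  luminance() {
    return (0.2126 * this.r + 0.7152 * this.g + 0.0722 * this.b) / 255;
  }
}

// The vendored copy after paring it down to the part that is used.
function rgbToHex(r, g, b) {
  const hex = (n) => n.toString(16).padStart(2, "0");
  return `#${hex(r)}${hex(g)}${hex(b)}`;
}

console.log(new Color(255, 128, 0).toHex()); // "#ff8000"
console.log(rgbToHex(255, 128, 0)); // "#ff8000"
```

The trimmed version does less, and that’s the point: there’s nothing left for a bundler to guess about.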

To go a step further, you can trim the test suite down to the API surface you use, run coverage, and comment out or remove the parts that are dead.

Old projects stabilize

Old projects might not be stable. They might have bugs - they usually do. In every issue tracker, there’s at least one multi-year bug that is so wicked that it might never go away.

But old projects have stabilized. They don’t change that often, and when they do the changes are minor. So the value that you got out of that fast release cycle two years ago doesn’t apply. And worse - you have a lot of old patterns that the project probably can’t remove! Do you want IE10 workarounds in your project? If you’re using a lot of third-party dependencies, you’re probably inheriting polyfills for stuff like Array.prototype.includes.

In my brief rewrites, I can pluck out old-fashioned polyfills for Array.isArray or code that checks that JSON.parse really exists in this environment. In 2021, it does.
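As a sketch of the kind of thing I mean: the function names below are hypothetical, not taken from a real library, but the defensive patterns are the sort you find in older code.

```javascript
// Before: roughly what an older dependency might ship (hypothetical code).
var isArrayCompat =
  Array.isArray ||
  function (value) {
    return Object.prototype.toString.call(value) === "[object Array]";
  };

function parseListCompat(text) {
  // Guard for environments that predate native JSON support.
  if (typeof JSON === "undefined" || typeof JSON.parse !== "function") {
    throw new Error("JSON.parse is not available in this environment");
  }
  var parsed = JSON.parse(text);
  return isArrayCompat(parsed) ? parsed : [parsed];
}

// After: the vendored copy, assuming a modern runtime.
function parseList(text) {
  const parsed = JSON.parse(text);
  return Array.isArray(parsed) ? parsed : [parsed];
}

console.log(parseList("[1, 2]")); // [ 1, 2 ]
console.log(parseList('{"a": 1}')); // [ { a: 1 } ]
```

Both versions behave the same on any runtime you’re likely to target today. The second one is just shorter and has fewer branches to read.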

It’s sort of good for open source

This is a mixed bag. My vendored copy might float far away from the original, and contributing back the changes in full isn’t going to be useful. A lot of the changes that I make are going to be somewhat cosmetic or things like adding TypeScript support for things that didn’t have it.

But on the other hand, porting a fix from my vendored copy to the original is going to be pretty quick, and having dependencies in the source tree itself is an ideal scenario for fixing bugs in vendored code quickly.

Changes to dependencies that are GPL’ed (there aren’t many, but the license still lingers) will get released when the thing is released, because the license requires it - though those changes won’t be especially exciting. Things that are MIT/BSD/ISC licensed will mostly get upstream fixes, for the good of the community.

Not everything

I don’t vendor React or Webpack or other large, incredibly complex dependencies. They’re going to change quickly and there’s little I can really contribute to those projects. I’m happy to keep lodash as a dependency for a few extremely battle-tested utilities. But for small to medium-sized dependencies, vendoring makes a lot of sense.

March 11, 2021 Tom MacWright (@tmcw, @tmcw@mastodon.social)