modularity in the age of antisocial Shell
Since late 2018, I've been accumulating the knowledge that comes from shaving a forbidden yak: I started a substantial project in Bash. As with any good yak-shave, one thing led to another.
As more of this work is coming to fruition, I'd like to dissect some of that forbidden knowledge. I'll set aside most of the well-aired complaints about Bash/Shell as languages to focus on some ecosystem issues (which, of course, have roots in problems with the language).
Some of you might be thinking "what Shell ecosystem?!" That's more or less what we're here for:
Shell would be less of a drag with better code re-use, but the language has some interlocking problems that hamper it. I think (some of) these problems are tractable.
There's a simple version of this argument, and a subtler one.
The simple version is roughly the same as the one for not writing it at all: Shell can be hard to read, write and maintain, so writing less of it is usually better. Re-use enables us to build compounding levers and write less Shell in the long run.
The subtler version of this argument is that one of Shell's strengths is the ability to quickly build low-furniture, expressive, near-natural-language DSLs. If you aren't sure what I mean, take a quick look at shakedown (a Bash DSL for HTTP testing).
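To make "low-furniture, near-natural-language" a bit more concrete, here's a minimal sketch of a tiny assertion DSL. This is not shakedown's API; the names `it`, `expect`, `to_equal`, and `to_contain` are hypothetical and exist only for illustration.

```shell
# Hypothetical mini-DSL for assertions. All names here are made up for this sketch.
failures=0

# Label the behaviour being checked.
it() { printf 'it %s\n' "$*"; }

# matches ACTUAL MATCHER EXPECTED: succeeds when the matcher holds.
matches() {
    case $2 in
        to_equal)   [ "$1" = "$3" ] ;;
        to_contain) case $1 in *"$3"*) true ;; *) false ;; esac ;;
        *)          echo "unknown matcher: $2" >&2; false ;;
    esac
}

# expect ACTUAL MATCHER EXPECTED: report the result and count failures.
expect() {
    if matches "$1" "$2" "$3"; then
        echo "  ok"
    else
        failures=$((failures + 1))
        echo "  FAIL: '$1' $2 '$3'"
    fi
}

# The "script" itself reads close to a sentence.
it "greets politely"
expect "$(echo 'hello world')" to_contain "hello"
expect "$(printf '%s' 42)"     to_equal   "42"
```

Because commands and arguments are just words, the few function definitions at the top are all the furniture the last three lines need.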
When our goal is to write as little Shell as possible, I think an imperative, direct approach to the task at hand is natural. This makes sense for true one-offs, but I think it closes us off to opportunities for re-use and leads us to forfeit one of the best parts of the language.
This section has been a little hard to write because I think many of these factors reinforce or exacerbate each other. I'll run through what I see as the list of issues first, and then reflect a little on their consequences collectively.
Shell is imperative and fragile, so it tends to have a lot of checking and polling. This is fine when all of the checks are cheap, or when there's just one party checking. But it's a headwind against narrow modules, and it pushes toward monoliths (which end up under a lot of user pressure to support every possible use-case). Imagine, for example, the performance penalty of having three distinct profile modules that each invoke git to see if $PWD is a git repo with unpushed changes.
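Here's a rough sketch of one way three such modules could share a single check instead of each paying for it. Everything named here (`ensure_repo_state`, `new_prompt_cycle`, the `module_*` functions) is hypothetical, and it only works if all three modules agree to use the same helper, which is exactly the kind of coordination the ecosystem lacks.

```shell
# Hypothetical shared probe: run the expensive check once per prompt cycle
# and let every module read the cached result. All names are illustrative.

probe_runs=0          # counts how often the expensive probe actually runs
repo_state=           # cached result shared by all modules
repo_state_fresh=no   # reset to "no" at the start of each prompt cycle

expensive_probe() {
    probe_runs=$((probe_runs + 1))
    # Branch and working-tree summary; falls back when git is missing
    # or $PWD is not inside a repository.
    repo_state=$(git status --porcelain --branch 2>/dev/null) || repo_state="not a repo"
}

ensure_repo_state() {
    # Only the first caller in a cycle pays for the probe.
    if [ "$repo_state_fresh" != yes ]; then
        expensive_probe
        repo_state_fresh=yes
    fi
}

# Three independent "modules" that all need the same fact. They read
# $repo_state directly: calling the helper via $(...) would run it in a
# subshell and throw the cache away.
module_prompt() { ensure_repo_state; prompt_segment=$repo_state; }
module_title()  { ensure_repo_state; title_segment=$repo_state; }
module_notify() { ensure_repo_state; notify_segment=$repo_state; }

new_prompt_cycle() { repo_state_fresh=no; }

new_prompt_cycle
module_prompt; module_title; module_notify
echo "probe ran $probe_runs time(s) for three modules"
```

Without the shared helper, each module would call git itself and the same prompt would cost three invocations.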
The language can be tricky, and its affordances for working with scarce resources such as traps or PROMPT_COMMAND default to clobbering existing users and monopolizing the resource (this applies to more than just the shell's interfaces--it can be just as true between libraries).
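As a small illustration of the clobbering default, here's a sketch with two hypothetical libraries that each want to clean up on EXIT. The helper names (`add_exit_hook`, `run_exit_hooks`, `add_prompt_hook`) are my own inventions for this example, not an established convention, and the cooperative version only helps if every library opts in.

```shell
# Two hypothetical libraries that each want cleanup on EXIT.
lib_a_cleanup() { echo "lib_a cleanup"; }
lib_b_cleanup() { echo "lib_b cleanup"; }

# Naive version: each library calls trap directly. The subshell body "( ... )"
# keeps the demo's traps away from the calling shell.
clobber_demo() (
    trap lib_a_cleanup EXIT
    trap lib_b_cleanup EXIT   # silently replaces lib_a's handler
)

# Cooperative version: libraries append to a shared list of function names,
# and a single handler runs them all.
exit_hooks=""
run_exit_hooks() {
    for hook in $exit_hooks; do "$hook"; done
}
add_exit_hook() {
    exit_hooks="$exit_hooks $1"
    trap run_exit_hooks EXIT
}
cooperative_demo() (
    add_exit_hook lib_a_cleanup
    add_exit_hook lib_b_cleanup
)

# Same idea for PROMPT_COMMAND: append to the existing value instead of
# assigning over it. (Ignores edge cases such as a trailing semicolon.)
add_prompt_hook() {
    PROMPT_COMMAND="${PROMPT_COMMAND:+$PROMPT_COMMAND; }$1"
}

clobber_demo       # only lib_b's cleanup runs
cooperative_demo   # both cleanups run, in registration order
```

Nothing in the language nudges a library author toward the second form; the one-line `trap ... EXIT` is the path of least resistance.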
Shell is hard to package because the language is a big ball of string(s) with a lot of subtle run-time dependencies on the filesystem and execution environment. It's easy to zip up some scripts and ship them off, but it's hard to answer table-stakes questions like whether the packagers have correctly identified all of the external dependencies and provided the right versions/variants.
Portability problems exacerbate all of the other factors here. I've kept this section brief because portability is well-covered in its own right. I'll look at portability from another angle in the next post.
Some consequences of these impediments are:
- Modular Shell is stuck in a gravity well:
  - It's hard to justify the extra work to write easy-to-reuse Shell without a leverage-compounding ecosystem.
  - Stitching together small modules that do similar work is prone (but not doomed) to performance penalties from overlapping work. In some cases the performance penalty isn't enough to sweat, but it can price them out of some use-cases.
- The headwinds tend to favor monolithic modules over narrow, focused ones:
  - The monoliths also end up under pressure from users to support every use-case.
  - A lot of the modularity/re-use I see is plugin/extension stuff clustered around individual projects (particularly profile and testing frameworks).
- Perversely, Shell written for distribution accumulates compensation measures--warts like hardcoded executable paths and extensive environment/dependency sanity-checking--that can make scripts harder to read, reason about, and adapt for environments where the same assumptions don't hold.
- Poor dependency management:
  - It's hard to develop projects that depend on Shell without vendoring it (or using a Shell/Bash package manager, which solves only parts of the problem).
  - The user is often on their own to install the appropriate dependencies.
  - If the package manager pulls in dependencies, they usually pollute the user or system environment.
  - Regardless, the PATH approach is prone to the dependency conflicts that plague global installs of Python or Ruby--but without the tooling and practices those ecosystems have developed to cope.
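For a sense of what the dependency sanity-checking mentioned in the list above tends to look like, here's a minimal sketch of a preflight check. The function name `require_commands` is hypothetical. Note that it only confirms that something with the right name is on PATH; it says nothing about whether it's the right version or variant.

```shell
# Hypothetical preflight check of the kind distributed scripts accumulate.
# It reports every missing command rather than stopping at the first.
require_commands() {
    missing=""
    for cmd in "$@"; do
        # command -v is the POSIX way to ask whether a name resolves.
        command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
    done
    if [ -n "$missing" ]; then
        echo "missing required commands:$missing" >&2
        return 1
    fi
}

# A script would typically run this before doing any real work.
if require_commands sh ls; then
    echo "preflight ok"
fi
```

Multiply this by every script in a project, each with its own slightly different list, and the noise adds up quickly.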
The next two parts of this series will focus on two ways I think we can get traction on the problems and consequences discussed here. Part 2 will focus on Shell packaging, and part 3 will daydream a little about how we could make the leverage-compounding Shell ecosystem a reality.