the missing comprehensive package manager for Shell
I posted a comment on the red site (a few months ago, in a thread about GNU and POSIX) that ended with:
I’m noticing a lot of wind-shaped trees, all deformed by the vacuum where the missing comprehensive package manager for shell could be.
Some of these trees are people with Strong Opinions about/for/against things like POSIX and extensions, portability, shells, Shell, userland utilities, and so on. Some of these are real-world shell programs, which can be extensively deformed by having to negotiate this reality to work for a living.
This isn't an ahistorical argument that the developers of the initial Shell ecosystems really screwed the pooch by not co-evolving with a good package manager. It also isn't arguing that portable Shell isn't essential in some domains.
Instead, it suspects that most Shell just needs to run. In these cases, ~portability concerns are implementation details--they wouldn't be necessary if the script's authors/packagers could assert the interpreter and dependencies. In other words, I suspect some small corners of the world could look a lot different if, a few decades ago, time-travelling aliens had dropped off a comprehensive package manager that could run all non-portable Shell with the correct interpreter and dependencies.
By comprehensive package manager for shell, I mean roughly: a general-purpose package manager that can supply Shell scripts and all of the dependencies (other Shell libraries, executables, and so on) needed to just run them. It should not fall prey to the dependency-management problems I mentioned in part 1: it should automatically install the script's dependencies without polluting the user/system environment (and causing related dependency conflicts).
Strictly speaking, Shell's dynamism makes this package manager impossible. But, this is Shell we're talking about--an 80-90% solution is fitting.
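To make "dynamism" concrete, here is a minimal sketch of two patterns that no static tool can fully resolve ahead of time. The function `run_tool` and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: two ways a script can hide which command it will run.

# 1. The command name arrives as data. A static analyzer sees "$tool",
#    not the executable that ends up being invoked.
run_tool() {
  local tool=$1
  shift
  "$tool" "$@"
}
run_tool printf '%s\n' 'chosen at run time'

# 2. The command is assembled as a string and handed to eval.
name='ec'
name+='ho'
eval "$name assembled-at-run-time"
```

Cases like these are why the realistic target is most scripts rather than all of them.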
Not only is such a solution tractable, but it's already cutting its teeth. In early 2020 I started writing resholve (which helps discover external dependencies and resolve them to absolute paths) to enable Nix/nixpkgs to meet this need. resholve landed in nixpkgs in early 2021 and, after gaining some important features over the course of the year, it can now handle a significant fraction of Shell packages.
Note: I hope you'll stick with me even if you avoid Nix. I develop resholve as a general utility. It can do the same job (and inevitably has other uses) in other contexts. I tried leaving Nix out, but I think it's important to communicate that this is already coming to fruition. It's a real thing people can start experimenting with today (even if they're just researching to do it better elsewhere :)
Below is some pseudocode that hopefully illustrates the concept, minus Nix. (If you want a literal Nix example, I discuss a simple/contrived one in "write a simple shell package with resholve".)
# file: back_up_my_git_repos.bash
source find_git_repos.bash # ①

while read -r repo; do
    : # placeholder: back up "$repo" here
done < <(find_git_repos ~/projects)

# file: back_up_my_git_repos.package
  scripts back_up_my_git_repos.bash
  inputs find_git_repos # ②

# file: find_git_repos.package
  scripts find_git_repos.bash
  inputs git findutils grep
The two most important takeaways for now are about abstraction boundaries:
① The script source can ignore concrete ~deployment details like:
- whether this dependency is available (resholve will block the build if not)
- where this library is located (resholve will rewrite it to an absolute path at "build" time)
This applies to external executables as well.

② There's actually a package abstraction boundary.
find_git_repos.package is separately resolved with its own dependencies. Packages or users that depend on it don't need to supply its dependencies (or even know what they are).
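The "block the build or rewrite to an absolute path" idea can be sketched in a few lines of bash. This is a toy, not how resholve works: it only rewrites a command when it is the first word on a line, whereas a real tool has to actually parse the script. The helpers `find_on_path` and `resolve_script` are hypothetical names, and the sketch assumes bash 4 or newer for associative arrays.

```shell
#!/usr/bin/env bash
# Toy sketch only: resolve declared commands against a declared search path.

# find_on_path NAME SEARCH_PATH
# Print the absolute path of NAME from the colon-separated SEARCH_PATH,
# or return 1 if no directory on it provides an executable NAME.
find_on_path() {
  local name=$1 dir
  local -a dirs
  IFS=: read -r -a dirs <<< "$2"
  for dir in "${dirs[@]}"; do
    if [[ -n $dir && -f $dir/$name && -x $dir/$name ]]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}

# resolve_script SEARCH_PATH NAME... < script > resolved-script
# Fail (that is, "block the build") if any NAME cannot be found; otherwise
# rewrite each line whose first word is a NAME to use its absolute path.
resolve_script() {
  local search_path=$1 name abs line indent rest
  shift
  local -A resolved=()
  for name in "$@"; do
    if ! abs=$(find_on_path "$name" "$search_path"); then
      printf 'error: unresolved dependency: %s\n' "$name" >&2
      return 1
    fi
    resolved[$name]=$abs
  done
  while IFS= read -r line || [[ -n $line ]]; do
    indent=${line%%[![:space:]]*}   # leading whitespace, if any
    rest=${line#"$indent"}
    name=${rest%%[[:space:]]*}      # first word on the line
    if [[ -n $name && -n ${resolved[$name]+set} ]]; then
      line=$indent${resolved[$name]}${rest#"$name"}
    fi
    printf '%s\n' "$line"
  done
}
```

Something like `resolve_script /usr/bin git grep < script.bash` would then either print a copy of the script with those commands pinned, or exit non-zero and name the missing dependency. The search path here stands in for "the package's declared inputs", so nothing outside those inputs can leak in.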
Before I move on, I want to clarify that the current packaging is a start and not an end. Ways to improve on what we're doing will inevitably emerge as more people do it.
I'll sketch out some thoughts for illustration (inspiration?):
1. It might be possible to find some run-time performance gains by inlining the target of a safe subset of source invocations.
2. I still find myself writing guards against sourcing a library more than once.
   - If it's feasible to detect the presence of these, the tooling might be able to automagically inject them whenever they're missing.
   - If inlining as in #1 above worked out, it might be possible to just optimize-out some subset of nth inclusions entirely.
3. The tooling might be able to address namespace collisions between separate libraries at build time.
4. Whenever the Shell is sufficiently static, it may be possible to optimize out unused functions.
5. It may be possible to fix some subset of conflicts over shared resources at build time.
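For anyone who hasn't written one, this is roughly what a hand-rolled guard against repeat sourcing looks like. It's a minimal sketch: the library is a throwaway temp file, and `__DEMO_LIB_LOADED` and `load_count` are hypothetical names chosen for the demo.

```shell
#!/usr/bin/env bash
# Sketch: an include guard, demonstrated by sourcing the same file twice.
lib=$(mktemp)

cat > "$lib" <<'EOF'
# Include guard: stop here if this file was already sourced in this shell.
[[ -n ${__DEMO_LIB_LOADED:-} ]] && return 0
__DEMO_LIB_LOADED=1

# Everything below the guard runs at most once per shell.
load_count=$(( ${load_count:-0} + 1 ))
EOF

source "$lib"
source "$lib"   # second source returns early because of the guard
echo "library body ran $load_count time(s)"
rm -f "$lib"
```

The guard is pure boilerplate that every library has to repeat with a unique variable name, which is what makes it a plausible candidate for tooling to detect or inject.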
The comprehensive package manager for Shell solves the dependency-management issues I mentioned in part 1, laying a technical foundation for ergonomic re-use. Good re-use in turn makes it easier to build the leverage-compounding ecosystem that modular Shell needs to escape the gravity well.
In the next part, I'll sketch out my ~vision for how we can bring about that leverage-compounding ecosystem.