A systematic approach to modern Unix-like command-line interfaces
1. Goals
- Provide a solid base for more consistent command-line interfaces
- Simplify parsing by facilitating the creation of easy-to-use
"grammar-based" parsers
- I.e., no more getopt(s), no more fiddling around with case statements
in loops. Instead:
- Define a concrete grammar based on the generic one.
- Hand that grammar and the command-line arguments to the
parser.
- Have the parser provide the results to the program or error
out.
This approach to parsing command lines also lends itself well to providing
command completion and interactive help.
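The three steps above (define a grammar, hand it to the parser together with the arguments, get results or an error) could look roughly like the following minimal sketch. The grammar format, the `GRAMMAR` table, `ParseError`, and `parse()` are all hypothetical names invented for illustration; they are not part of any existing library or of the specification discussed here. Option grouping and long options are deliberately left out.

```python
class ParseError(Exception):
    """Raised when the arguments do not match the grammar."""


# Hypothetical concrete grammar: each option maps to True if it
# requires an option-argument, False otherwise.
GRAMMAR = {
    "options": {"-v": False, "-o": True},
    "min_operands": 1,
}


def parse(grammar, args):
    """Return (options, operands) for args, or raise ParseError."""
    options = {}
    operands = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # End-of-options indicator: everything after it is an operand.
            operands.extend(args[i + 1:])
            break
        if arg.startswith("-") and arg != "-":
            if arg not in grammar["options"]:
                raise ParseError(f"unknown option: {arg}")
            if grammar["options"][arg]:
                if i + 1 >= len(args):
                    raise ParseError(f"option {arg} requires an argument")
                options[arg] = args[i + 1]
                i += 2
            else:
                options[arg] = True
                i += 1
            continue
        # First operand: stop looking for options from here on.
        operands.extend(args[i:])
        break
    if len(operands) < grammar["min_operands"]:
        raise ParseError("too few operands")
    return options, operands
```

A program would call `parse(GRAMMAR, sys.argv[1:])` once and either use the returned values or report the `ParseError` to the user, instead of looping over the arguments itself.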
2. Task-specific tool sets
Going from what we already have – the POSIX standard and different approaches
to Unix-like command-line interfaces in the wild – the first question that
needs to be answered for a systematic approach is what design approach to
choose for task-specific tool sets.
Historically, task-specific tool sets do not seem to have existed (explicitly)
on Unix. They appear to be a relatively new development in command-line
utilities on Unix-like systems, apparently rooted in the need for more powerful
and complex tools that do more complex jobs well, such as package management or
managing network interfaces. A general desire for a more orderly, structured
tool box, where utilities for particular classes of tasks are arranged in some
meaningful way, has probably played an equally important role here and is, in
fact, intrinsically connected to managing increased complexity.
There are essentially three ways to implement task-specific tool sets:
- structured command-line options
- non-hierarchical command sets
- sub-commands (which seem to be newer)
2.1 Structured command-line options
A tool that needs to perform several distinct tasks, e.g., installing and
removing software packages, could simply utilize command-line options to denote
the respective operation (“sub-command”). Doing that would, however, require
dividing its options into three distinct categories:
- global options, which manipulate the mode of operation independently of any
particular operation
- options denoting a specific operation
- operation-specific options
Example: pacman, which uses capital letters for the options that denote
operations.
2.1.1 Advantages
2.1.2 Disadvantages
Implementing sub-commands as options, i.e., not keeping sub-commands and
options strictly separate syntactically, creates a user interface that is
slightly harder to use: options cannot be given in arbitrary order anymore
because some of them denote commands. One way to partly mitigate this would be
to allow global options to occur among command-specific ones. E.g., if a tool
has the option `-v` for verbose output, then
<command> -v <option denoting operation>
<operation-specific options>
and
<command> <option denoting operation>
<operation-specific options> -v
should yield the same result, i.e., the meaning of `-v` should be
exactly the same in both cases: Generally be verbose. This would also be
advisable for the approach of using actual sub-commands. Non-hierarchical
command sets need to have that ability per se, if there are any global options.
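The ordering rule described above can be sketched as follows. This is only an illustration under stated assumptions: the operation options `-I` and `-R`, the operation-specific options `-f` and `-k`, and the function `parse_structured()` are hypothetical and do not describe pacman or any other real tool. The global option `-v` is accepted anywhere, while an operation-specific option is only accepted after the option that denotes its operation.

```python
# Hypothetical option tables for an imaginary tool.
GLOBAL_OPTIONS = {"-v"}
OPERATIONS = {"-I": {"-f"}, "-R": {"-k"}}  # operation -> its own options


def parse_structured(args):
    """Parse args where options denote operations (simplified sketch)."""
    verbose = False
    operation = None
    op_options = []
    operands = []
    for arg in args:
        if arg in GLOBAL_OPTIONS:
            # The global option means the same wherever it occurs.
            verbose = True
        elif arg in OPERATIONS:
            if operation is not None:
                raise ValueError("only one operation may be given")
            operation = arg
        elif arg.startswith("-"):
            # Operation-specific options must follow their operation.
            if operation is None or arg not in OPERATIONS[operation]:
                raise ValueError(f"option {arg} is not valid here")
            op_options.append(arg)
        else:
            operands.append(arg)
    if operation is None:
        raise ValueError("no operation given")
    return {
        "verbose": verbose,
        "operation": operation,
        "options": op_options,
        "operands": operands,
    }
```

With these tables, `["-v", "-I", "-f", "pkg"]` and `["-I", "-f", "-v", "pkg"]` produce the same result, whereas `["-f", "-I", "pkg"]` is rejected because `-f` precedes the operation it belongs to.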
2.2 Non-hierarchical command sets
One example of this design approach is OpenBSD's tool set for package
management. It consists of a collection of four independent utilities (i.e.,
independent from the user's perspective):
pkg_add
pkg_check
pkg_delete
pkg_info
There are two prevalent naming schemes for such command sets:
<toolset_prefix>_<command_name>
<toolset_prefix>-<command_name>
2.2.1 Advantages
- It's POSIX-compliant, which sub-commands are not (see below).
- It doesn't add complexity to the utility syntax and thus parsing.
- It lends itself very well to splitting things up into several binaries, which
might or might not be desirable.
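Because each tool in such a set is a separate command, the only thing that ties the set together is the naming scheme. As a small sketch, the two prevalent schemes mentioned earlier in this section can be recognized by splitting a tool name at the first underscore or hyphen. The function `split_tool_name()` is a hypothetical helper written for this illustration only.

```python
import re


def split_tool_name(name):
    """Split 'pkg_add' or 'pkg-add' into (toolset_prefix, command_name)."""
    # Split at the first underscore or hyphen; both parts must be non-empty.
    match = re.fullmatch(r"([^_-]+)[_-](.+)", name)
    if match is None:
        raise ValueError(f"{name!r} does not follow either naming scheme")
    return match.group(1), match.group(2)


# Group the tools from the OpenBSD example by their prefix.
tools = ["pkg_add", "pkg_check", "pkg_delete", "pkg_info"]
prefixes = {split_tool_name(tool)[0] for tool in tools}
```

Here `prefixes` contains only `"pkg"`, which shows that the grouping exists purely by convention in the names.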
2.2.2 Disadvantages
- It's a form of pseudo-grouping or pseudo-name-spacing via prefixes.
- This yields some problems with respect to organizing documentation in a
reasonable way.
- It might well be appropriate to have an umbrella or "root" manual page
for `toolset_prefix`. But that would effectively mean creating a
manual page for a command that doesn't exist, like, e.g., git(1). (There's
no pkg(1) on OpenBSD, by the way.) So, this doesn't seem like an elegant
solution, also because it probably wouldn't make sense to make such a
manual page a requirement.
- When a tool set has shared options, they need to be repeated in each
tool's manual page. This should even be done when a "root" manual page
exists. Not doing that would just be really inconvenient.
- The underscore variant of naming the tools in a non-hierarchical command
set (`<toolset_prefix>_<command_name>`) is rather inconvenient to type.
2.3 Sub-commands
An example of the sub-command approach can be found in Alpine Linux's package
manager, apk. Contrary to OpenBSD's `pkg_*` command set, it offers a set of
sub-commands, such as `add`, `del`, `update`, and `upgrade`, that have to be
entered as arguments to the `apk` command, e.g., `apk add nano` to install the
Nano text editor.
2.3.1 Advantages
- It's hierarchical, which is more logical and thus provides some
structural advantages, e.g., a clear separation of options into ones that are
"global" and ones that are specific to a sub-command. It's also more elegant
than pseudo-name-spacing via prefixes.
- It lends itself well to providing a tool set in a single binary but isn't
bound to that approach.
- It doesn't share the documentation issues of non-hierarchical command sets,
except for having to repeat "global" options for convenience, depending on
whether one or several manual pages are used to document the tool set.
2.3.2 Disadvantages
- It's not POSIX-conformant.
- The concept of sub-commands does not exist in the POSIX Utility
Conventions and related documents. A POSIX-conformant command-line
parser would interpret a sub-command as either an operand or an
option-argument.
- That said, sub-commands don't strictly collide with POSIX either.
Practically, their implementation merely requires that "operands" to the
main command be treated as commands, the arguments to which can then
again simply be parsed according to POSIX rules. So, the added
complexity is arguably negligible.
- Still, this remains a hack on top of an existing specification.
Integrating the idea of sub-commands into a POSIX-based specification
instead would increase complexity, but it would be the right thing to
do.
- Documentation-wise, this approach runs a risk of creating rather
large manual pages because there would normally be one manual page documenting
the whole tool set. But then, this also means all documentation for the
tool set is in one place. Additionally, a strong focus on keeping a tool set's
complexity to the necessary minimum will help a lot with preventing unwieldy
manuals.
- Another documentation-related problem is that accessing manual pages for
sub-commands requires replacing the whitespace between the command and
sub-command with a hyphen, e.g., `man apk-add` instead of `man apk add`.
(Non-hierarchical command sets don't have this problem.)
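The point made above, that sub-commands only require treating the first "operand" of the main command as a command and then parsing the rest by the same rules again, can be sketched with Python's standard `getopt` module, whose `getopt.getopt()` function stops option processing at the first non-option argument. The sub-command table, its option letters, and `parse_with_subcommand()` are hypothetical; in particular, the option letters are invented and are not meant to reflect apk's real options.

```python
import getopt

# Hypothetical tool set: sub-command name -> its short-option string.
SUBCOMMANDS = {"add": "f", "del": "r"}


def parse_with_subcommand(args):
    """Return (global_opts, sub-command, sub_opts, operands)."""
    # Step 1: parse global options; parsing stops at the first operand.
    global_opts, rest = getopt.getopt(args, "v")
    if not rest:
        raise getopt.GetoptError("missing sub-command")
    # Step 2: treat the first "operand" as the sub-command.
    name, sub_args = rest[0], rest[1:]
    if name not in SUBCOMMANDS:
        raise getopt.GetoptError(f"unknown sub-command: {name}")
    # Step 3: parse the remaining arguments with the same rules again.
    sub_opts, operands = getopt.getopt(sub_args, SUBCOMMANDS[name])
    return global_opts, name, sub_opts, operands
```

The two calls to `getopt.getopt()` are the whole of the added complexity in this sketch, which is in line with the argument that the hack itself is cheap even though it sits outside the specification.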
3. Utility grammar
3.1 Token classes
- Arguments
- Commands(?) -- Really necessary?
- Sub-commands(?) -- depends on design choice for task-specific tool
sets
- Options
- Binary options: Invert a Boolean-type default value.
- Incremental options: Increment or decrement an integer-type default
value. (POSIX-compliant?)
- Key-value/associative options: Require a value as an argument.
- Option-arguments
- The end-of-options indicator: `--`
- Operands
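As a rough illustration of how these token classes might be told apart, the sketch below labels each argument and then applies the three option kinds listed above. The option table `OPTION_KINDS`, the option letters, and both functions are hypothetical; sub-commands are left out because their status is still open in the list above.

```python
# Hypothetical option table: option -> kind.
OPTION_KINDS = {"-q": "binary", "-v": "incremental", "-o": "key-value"}


def classify(args):
    """Label each argument with its token class."""
    tokens = []
    options_ended = False
    expect_value = False
    for arg in args:
        if expect_value:
            tokens.append(("option-argument", arg))
            expect_value = False
        elif options_ended:
            tokens.append(("operand", arg))
        elif arg == "--":
            tokens.append(("end-of-options", arg))
            options_ended = True
        elif arg in OPTION_KINDS:
            tokens.append(("option", arg))
            # Only key-value options take an option-argument.
            expect_value = OPTION_KINDS[arg] == "key-value"
        elif arg.startswith("-") and arg != "-":
            raise ValueError(f"unknown option: {arg}")
        else:
            # The first operand ends option processing.
            tokens.append(("operand", arg))
            options_ended = True
    if expect_value:
        raise ValueError("missing option-argument")
    return tokens


def apply_options(tokens, defaults):
    """Compute option values from classified tokens and defaults."""
    values = dict(defaults)
    pending = None
    for token_class, text in tokens:
        if token_class == "option":
            kind = OPTION_KINDS[text]
            if kind == "binary":
                values[text] = not defaults[text]  # invert the default
            elif kind == "incremental":
                values[text] += 1  # increment the integer default
            else:
                pending = text  # value arrives as the next token
        elif token_class == "option-argument":
            values[pending] = text
    return values
```

For example, the arguments `-v -v -o out -- -q` classify `-q` as an operand because it follows the end-of-options indicator, and applying the options to the defaults `{"-q": False, "-v": 0, "-o": None}` leaves `-q` unchanged while setting `-v` to 2 and `-o` to `out`.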
Last changed: 2022-01-02
Copyright © 2021, 2022 Michael Siegel