The Command Lines
© 17 Aug 2011 Luther Tychonievich
Licensed under Creative Commons: CC BY-NC-ND 3.0

A practical application of programming.

 

Matt Crook, a friend of mine, suggested two weeks ago that I speak about command lines. I delayed until I had posted on subroutines because command-line commands can be seen as an example of subroutines.

What are Command Lines?

In every operating system I know, to run a program you have to supply two things: the executable file to run and a list of zero or more “‍strings,‍” or sequences of textual symbols. Also supplied, though usually without your notice, are two “‍output streams‍” and one “‍input stream.‍” Each of these parts is explained below.

Command lines are the “under the hood” way of running programs: simple text-based programs that let you specify some or all of these parts directly. Instead of clicking on an icon that represents one standard way of starting the program, you type in the name of the file, the list of strings, and, if you want, some stuff about the streams too.

What a Program Wants

When you start a program, there needs to be a file somewhere that tells the operating system things like how to lay out memory for the program, what data needs to be available to the program before it starts, and, of course, the actual “code” or set of instructions to follow. Each system uses a different format for this file, but the information is pretty consistent across different computers.

In addition to the executable itself, every program gets a list of strings and three streams each time it starts up.

A List of Strings

Every program since C was invented as a language starts in a subroutine that looks like this:

(how to) run given a list of strings

So we can say “run given ‘Gertie’, ‘fl@ Tv’” or “run given ‘Nestor’, ‘ate’, ‘my’, ‘sandwich’” or even just “run given”: the list can be empty. The list itself is typically called “the command-line arguments”, “the arguments”, “args”, or “argv”. Why “v” in “argv”? It stands for “vector” (another word for a list of values), as opposed to “c” for “count”. Some languages are designed such that the length of the list and the list itself appear to be two separate parameters.
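Here is a rough sketch of the “run given a list of strings” idea. Python is just a convenient choice for illustration, and the function name run is made up. The one real name is sys.argv, Python's version of the argument list, whose first entry is by convention the program's own name.

```python
import sys

def run(args):
    """Describe a list of command-line strings; the list may be empty."""
    argc = len(args)  # the "count" that some languages pass separately
    if argc == 0:
        return "run given nothing"
    return "run given " + ", ".join(repr(a) for a in args)

# The strings carry no inherent meaning; the program just receives them.
print(run(["Gertie", "fl@ Tv"]))
print(run(["Nestor", "ate", "my", "sandwich"]))
print(run([]))

# When launched for real, Python exposes the list as sys.argv.
# sys.argv[0] is the script name, so the arguments proper start at index 1.
print(run(sys.argv[1:]))
```

Saved to a file and started from a command line with a few words after the file name, the last line would report whatever words were typed.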

What the program decides to do with that list is entirely up to the programmer. None of the strings have any inherent meaning. However, there is tradition in each major operating system to have certain strings be flags or options.

The exact syntax for flags varies quite a bit. In Windows, a flag starts with a /, like /all or /silent. In *nix systems (*nix means Unix and its flavors and spin-offs: Linux (including Android), Darwin (including OS X), Solaris, BSD, BeOS, etc.) flags start with a - and are treated as single letters, not words, so -vx means the same thing as -v -x. But *nix also allows flags with two leading -s, which are treated as words with spaces replaced by hyphens, like --real-fun. Flags are sometimes treated as a key-value pair, like --word=nonsense or -k myKey, though it is fairly case-by-case whether this is allowed or not.
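To make the *nix conventions concrete, here is a minimal sketch in Python of how a program might pick one argument apart. The helper split_flag is hypothetical and handles only the forms described above; real programs differ in which forms they accept.

```python
def split_flag(arg):
    """Split one *nix-style argument into a list of (name, value) pairs.

    Hypothetical helper for illustration only. It returns an empty list
    for anything that is not a flag.
    """
    if arg in ("-", "--"):
        # A bare - or -- names no flag at all.
        return []
    if arg.startswith("--"):
        # Long flag: a whole word, optionally followed by =value.
        name, sep, value = arg[2:].partition("=")
        return [(name, value if sep else None)]
    if arg.startswith("-"):
        # Short flags: each letter after the single - is its own flag.
        return [(letter, None) for letter in arg[1:]]
    # Anything else is an ordinary string, not a flag.
    return []

print(split_flag("-vx"))
print(split_flag("--real-fun"))
print(split_flag("--word=nonsense"))
print(split_flag("sandwich"))
```

The -k myKey form is left out on purpose: whether the string after a short flag is its value depends on the particular flag, so no general rule can decide it.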

The idea behind flags is that they specify how a task should be handled while the other strings in the list specify what to do the task to. Thus, for a math program I might have --decimal-places=2 be a flag while the other strings might be things like “‍4 + 5 ÷ 6‍”. Of course, no one is forcing a program to treat the strings that way; it’s just a common convention.
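The math-program example might be sketched like this, using argparse from Python's standard library. The program name calc and the parameter name expressions are invented for the illustration; only the --decimal-places flag comes from the example above.

```python
import argparse

def parse(argv):
    """Separate the 'how' (flags) from the 'what' (the remaining strings)."""
    parser = argparse.ArgumentParser(prog="calc")  # "calc" is a made-up name
    parser.add_argument("--decimal-places", type=int, default=0)
    parser.add_argument("expressions", nargs="*")
    return parser.parse_args(argv)

opts = parse(["--decimal-places=2", "4 + 5 ÷ 6"])
print(opts.decimal_places)  # how the task should be handled
print(opts.expressions)     # what to do the task to
```

Nothing here evaluates the arithmetic. The sketch only shows the sorting of strings into flags and everything else.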

Streams

For many years computers were pretty heavily dependent on text. Your programs would display text to the screen, you’d type text back to them, text would be saved to files and printed to paper. Everything was text.

Because everything was text, people came up with a lovely abstraction that allowed programs to talk to each other, or to files, or to people, all without even needing to know which one they were doing. That abstraction was the “stream”: this magical thing that accepts text in on one side and spits it out on the other side. From a program’s point of view, each stream was either an input stream or an output stream, depending on which end the program was holding. Whether the program’s input was coming from a file, a user, or another program, it was just an input stream to the program.

At first each program just expected two streams: one in, one out. Pretty soon, though, a second output stream was added: one for what you were actually producing and one for error messages, warnings, and other commentary on the process. In the old-school way, these three streams were named 0, 1, and 2: 0, or stdin, was where you collected input; 1, or stdout, was where you sent results; and 2, or stderr, was the place to put error messages and the like.
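A small Python sketch of one input stream and two output streams follows. The function halve and its sample data are made up. In-memory buffers stand in for the real streams so that the example behaves the same wherever it is run.

```python
import io

def halve(source, results, errors):
    """Read numbers from one input stream; write to two output streams."""
    for line in source:
        text = line.strip()
        try:
            # The actual product goes to the results stream.
            results.write(f"{float(text) / 2}\n")
        except ValueError:
            # Commentary on the process goes to the errors stream.
            errors.write(f"skipping {text!r}\n")

# The function cannot tell what is on the other end of each stream.
# Here buffers play the part of a file, a user, or another program.
out, err = io.StringIO(), io.StringIO()
halve(io.StringIO("10\nsandwich\n3\n"), out, err)
print(out.getvalue(), end="")
print(err.getvalue(), end="")
```

In a real program the same call would be halve(sys.stdin, sys.stdout, sys.stderr) after an import sys, which hands over streams 0, 1, and 2 in that order.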

To this day, every program on every major operating system, from Windows to your cell phone, still gets these three streams every time it runs. A program may ignore them, it may shut them down so the operating system can forget about them, or it may use them in any way it chooses. When you run programs using icons, typically the input stream never gets any input and the output streams are either routed to some hidden log file (for example, in OS X stderr is usually routed to a log called “the console”) or simply ignored. But whatever is done with them by the program and the operating system, they exist.

When you are running programs from the command line, there are some nifty tricks you can do with these streams. For example, you can say “‍don’t ask me what to do, read input from a file instead‍” by adding <filename after the list of strings. You can send output to a file too, using >filename (for stdout) or 2>filename (for stderr). You can send the output to the end of a file that already exists instead of replacing that file’s contents with the output by using >>filename. You can also say “‍Have the output from program A be the input to program B‍” using what is called a “‍pipe‍”: stuff to run A | stuff to run B. There are more tricks available (Ts and junctions and so forth) but files and pipes are by far the most common.
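For the curious, here is roughly what a command line does on your behalf when you use <, >, >>, and |, sketched with Python's subprocess module. The two stand-in programs SHOUT and COUNT and the file names are invented for the illustration, and a real shell is considerably more careful than this.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Two tiny stand-in programs, run with the current Python interpreter.
SHOUT = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
COUNT = [sys.executable, "-c",
         "import sys; print(len(sys.stdin.read().split()))"]

def demo(workdir):
    src = Path(workdir) / "in.txt"
    dst = Path(workdir) / "out.txt"
    src.write_text("nestor ate my sandwich\n")

    # Like: shout <in.txt >out.txt   (replace the file's contents)
    with open(src) as fin, open(dst, "w") as fout:
        subprocess.run(SHOUT, stdin=fin, stdout=fout, check=True)

    # Like: shout <in.txt >>out.txt  (add to the end of the file)
    with open(src) as fin, open(dst, "a") as fout:
        subprocess.run(SHOUT, stdin=fin, stdout=fout, check=True)

    # Like: shout <in.txt | count    (A's output becomes B's input)
    with open(src) as fin:
        a = subprocess.Popen(SHOUT, stdin=fin, stdout=subprocess.PIPE)
        b = subprocess.run(COUNT, stdin=a.stdout, stdout=subprocess.PIPE,
                           text=True, check=True)
        a.stdout.close()
        a.wait()
    return dst.read_text(), b.stdout.strip()

with tempfile.TemporaryDirectory() as workdir:
    text, count = demo(workdir)
print(text, end="")
print(count)
```

Neither stand-in program knows whether it is talking to a file or to the other program, which is the whole point of the stream abstraction.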

How Do I Do X on System Y?

If you have specific questions on particular operating systems, this is really not the blog for you. I find the details of particular systems to be rather boring. There are plenty of tutorials online for using DOS or ksh or bash or whatever. If you are really stuck, send me a @private comment and I’ll see if I can point you toward a solution.

Why Should I Care?

In a common programmer’s cliché, “it’s more powerful” to use the command line. That means that there are things you can make happen using the command line that you can’t make happen using icons. Of course, the only reason for that added power is that software engineers didn’t bother making all of their programs’ functionality available through graphical interfaces. And there’s a (fairly) good reason they didn’t: designing and implementing a graphical interface takes a lot of time, time they’d rather spend adding more functionality to the program. Nothing’s free, and programmers often assume clients would rather have a program that can do a lot in a confusing way than a program that can just do a few things in an obvious way.

That aside, this really is what’s happening under the hood in every operating system of which I am aware. Even if you never use it at all, truth is a beautiful thing to acquire.



