The A-Z of Programming Languages: Bourne shell, or sh

An in-depth interview with Steve Bourne, creator of the Bourne shell, or sh

Steve Bourne

Comments

Computerworld is undertaking a series of investigations into the most widely-used programming languages. Previously we have spoken to Alfred v. Aho of AWK fame, S. Tucker Taft on the Ada 1995 and 2005 revisions, Microsoft about its server-side script engine ASP, Chet Ramey about his experiences maintaining Bash, Bjarne Stroustrup of C++ fame and to Charles H. Moore about the design and development of Forth. We've also had a chat with the irreverent Don Woods about the development and uses of INTERCAL, as well as Stephen C. Johnson on YACC, Luca Cardelli on Modula-3, Walter Bright on D, Simon Peyton-Jones on Haskell and more recently, Larry Wall, creator of the Perl programming language.

On this occasion we speak to Steve Bourne, creator of the Bourne shell, or sh. In the early 1970s Bourne was at the Computer Laboratory in Cambridge, England working on a compiler for ALGOL68 as part of his PhD work in dynamical astronomy. This work paved the way for him to travel to IBM’s T.J. Watson Research Center in New York in 1973, in part to undertake research into compilers. Through this work, and a series of connections and circumstance, Bourne got to know people at Bell Labs who then offered him a job in the Unix group in 1975. It was during this time Bourne developed sh.

What prompted the creation of the Bourne shell?

The original shell wasn’t really a language; it was a recording -- a way of executing a linear sequence of commands from a file, the only control flow primitive being goto a label. These limitations to the original shell that Ken Thompson wrote were significant. You couldn’t, for example, easily use a command script as a filter because the command file itself was the standard input. And in a filter the standard input is what you inherit from your parent process, not the command file.

The original shell was simple but as people started to use Unix for application development and scripting, it was too limited. It didn’t have variables, it didn’t have control flow, and it had very inadequate quoting capabilities.

My own interest, before I went to Bell Labs, was in programming language design and compilers. At Cambridge I had worked on the language ALGOL68 with Mike Guy. A small group of us wrote a compiler for ALGOL68 that we called ALGOL68C. We also made some additions to the language to make it more usable. As an aside we boot strapped the compiler so that it was also written in ALGOL68C.

When I arrived at Bell Labs a number of people were looking at ways to add programming capabilities such as variables and control flow primitives to the original shell. One day [mid 1975?] Dennis [Ritchie] and I came out of a meeting where somebody was proposing yet another variation by patching over some of the existing design decisions that were made in the original shell that Ken wrote. And so I looked at Dennis and he looked at me and I said “you know we have to re-do this and re-think some of the original design decisions that were made because you can’t go from here to there without changing some fundamental things”. So that is how I got started on the new shell.

Was there a particular problem that the language aimed to solve?

The primary problem was to design the shell be a fully programmable scripting language that could also serve as the interface to users typing commands interactively at a terminal.

First of all, it needed to be compatible with the existing usage that people were familiar with. There were two usage modes. One was scripting and even though it was very limited there were already many scripts people had written. Also, the shell or command interpreter reads and executes the commands you type at the terminal. And so it is constrained to be both a command line interpreter and a scripting language. As the Unix command line interpreter, for example, you wouldn’t want to be typing commands and have all the strings quoted like you would in C, because most things you type are simply uninterpreted strings. You don’t want to type ls directory and have the directory name in string quotes because that would be such a royal pain. Also, spaces are used to separate arguments to commands. The basic design is driven from there and that determines how you represent strings in the language, which is as un-interpreted text. Everything that isn’t a string has to have something in front of it so you know it is not a string. For example, there is $ sign in front of variables. This is in contrast to a typical programming language, where variables are names and strings are in some kind of quote marks. There are also reserved words for built-in commands like for loops but this is common with many programming languages.

So that is one way of saying what the problem was that the Bourne Shell was designed to solve. I would also say that the shell is the interface to the Unix system environment and so that’s its primary function: to provide a fully functional interface to the Unix system environment so that you could do anything that the Unix command set and the Unix system call set will provide you. This is the primary purpose of the shell.

One of the other things we did, in talking about the problems we were trying to solve, was to add environment variables to Unix system. When you execute a command script you want to have a context for that script to operate in. So in the old days, positional parameters for commands were the primary way of passing information into a command. If you wanted context that was not explicit then the command could resort to reading a file. This is very cumbersome and in practice was only rarely used. We added environment variables to Unix. These were named variables that you didn’t have to explicitly pass down from the parent to the child process. They were inherited by the child process. As an example you could have a search path set up that specifies the list of directories to used when executing commands. This search path would then be available to all processes spawned by the parent where the search path was set. It made a big difference to the way that shell programming was done because you could now see and use information that is in the environment and the guy in the middle didn’t have to pass it to you. That was one of the major additions we made to the operating system to support scripting.

How did it improve on the Thompson shell?

I did change the shell so that command scripts could be used as filters. In the original shell this was not really feasible because the standard input for the executing script was the script itself. This change caused quite a disruption to the way people were used to working. I added variables, control flow and command substitution. The case statement allowed strings to be easily matched so that commands could decode their arguments and make decisions based on that. The for loop allowed iteration over a set of strings that were either explicit or by default the arguments that the command was given.

I also added an additional quoting mechanism so that you could do variable substitutions within quotes. It was a significant redesign with some of the original flavor of the Thompson shell still there. Also I eliminated goto in favour of flow control primitives like if and for. This was also considered rather radical departure from the existing practice.

Command substitution was something else I added because that gives you very general mechanism to do string processing; it allows you to get strings back from commands and use them as the text of the script as if you had typed it directly. I think this was a new idea that I, at least, had not seen in scripting languages, except perhaps LISP.