BCPL to B to C: other articles by Alan Watson, Clive Feather, and Dennis Ritchie
[The text of this article has been slightly edited since its original posting.]

Newsgroups: comp.lang.c
From: msb@sq.sq.com (Mark Brader)
Subject: B (was: int main() (was SUMMARY: CW poll: exit vs. return))
Message-ID: <1993Sep6.062940.11969@sq.sq.com>
Organization: SoftQuad Inc., Toronto, Canada
Date: Mon, 6 Sep 93 06:29:40 GMT

> >  It may also be worth noting in this context that C's predecessor language,
> >  B, did not require ...

> I honestly never knew how C got [its] name. ...
> According to a textbook of mine C was derived from
> BCPL. I did not know a language called B ever existed. ...

To go back to the beginning, once upon a time in England there was a language called CPL.  I've heard this acronym explained both as Cambridge Programming Language and as Combined..., and seen a suggestion that both names may have been used at different times.  Anyway, this gave rise to a simpler language called BCPL, the B apparently standing for Basic.  A paper about BCPL (which I have not read) can be found in the proceedings of the 1969 AFIPS Spring Joint Computer Conference.  For further information about these languages, I suggest asking in comp.lang.misc or alt.folklore.computers.

Also in 1969, the system that Brian Kernighan would later name Unix was being developed by Ken Thompson "with some assistance from" Dennis Ritchie.  The original system was implemented in PDP-7 assembler.  Once they had it more or less working, the need for a high-level language was felt.  Doug McIlroy implemented a language called TMG, and then, as Ritchie later wrote...

...Thompson decided that we could not pretend to offer a real computing service without Fortran, so he sat down to write a Fortran in TMG.  As I recall, the intent to handle Fortran lasted about a week.  What he produced instead was a definition of and a compiler for the new language B. B was much influenced by the BCPL language; other influences were Thompson's taste for spartan syntax, and the very small space into which the compiler had to fit.

[from the Oct 1984 Bell Labs Tech Journal special issue on UNIX]

This apparently happened in 1970.  The same year a PDP-11 arrived to replace the PDP-7, and UNIX began to be moved to it, still in assembler.  B was soon implemented on the PDP-11.  A few years' experience with B showed that it was not entirely satisfactory, and C was developed from it by Ritchie soon afterward, after which most of Unix, of course, was reimplemented in C.  I understand that during the transition from B to C there was also a short-lived intermediate language NB (new B).

> Tell me more about this predecessor language. ...
> it seems that this 'B' was even worse than old-style C on type-checking.

B didn't believe in type-checking, period.  There was only one type, the machine word, and the programmer was responsible for applying to a variable only such operators as made sense.  I never used B on a UNIX system, but I used it on GCOS on the Honeywell 6000 series; the first version of B there, and I presume the first one on UNIX, did not support floating point.  Later it was added by means of separate floating-point operators: #+, #* and so on.
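To make the one-type model concrete, here is a minimal sketch in C (not B, and not taken from any real B implementation).  A `word` holds raw bits, and the function applied to it, not the variable, decides whether those bits are treated as an integer or as a float.  The names `word`, `word_from_float`, `float_from_word`, `int_add` and `float_add` are all hypothetical, and a 32-bit word holding a 32-bit float is assumed purely for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t word;          /* assumption: 32-bit word, 32-bit float */

/* Store a float's bit pattern in a word, much as a floating constant
 * would end up in an ordinary variable. */
word word_from_float(float x)
{
        word w;
        assert(sizeof w == sizeof x);
        memcpy(&w, &x, sizeof w);
        return w;
}

/* Read a word's bits back as a float. */
float float_from_word(word w)
{
        float x;
        memcpy(&x, &w, sizeof x);
        return x;
}

/* Plays the role of + : both words are taken to be integers. */
word int_add(word a, word b)
{
        return a + b;
}

/* Plays the role of #+ : the same kind of word is taken to be a float. */
word float_add(word a, word b)
{
        return word_from_float(float_from_word(a) + float_from_word(b));
}
```

Passing two words that hold 1.0 to `int_add` compiles and runs, but the result is not the word for 2.0.  Nothing checks that the operator matches the data, which is the situation described above.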

Here is a bit of C code and its B equivalent:

    /* infact -- initializes elements from fact[0] = 0! up to
     * fact[n] = n!.  Returns n!. */

    float infact (n) int n;
    /* or, of course, the newer float infact (int n) */
    {
            float f = 1;
            int i;
            extern float fact[];

            fact[0] = f;    /* 0! = 1; multiplying by i = 0 would zero every entry */
            for (i = 1; i <= n; ++i)
                    fact[i] = f *= i;

            return f;
    }

    #define TOPFACT 10
    float fact[TOPFACT+1];

And now in B:

    infact (n)
    {
            auto f, i, j;   /* no initialization for auto variables */
            extrn fact;     /* "What would I do differently if designing
                             *  UNIX today?  I'd spell creat() with an e."
                             *  -- Ken Thompson, approx. wording */

            f = 1.;         /* floating point constant */
            j = 1.;
            fact[0] = f;    /* 0! = 1; a multiplier of 0. would zero every entry */
            for (i = 1; i <= n; ++i) {
                    fact[i] = f =#* j;      /* note spelling =#* not #*= */
                    j =#+ 1.;               /* #+ for floating add */
            }

            return (f);     /* at least, I think the () were required */
    }

    TOPFACT = 10;   /* equivalent of #define, allows numeric values only */
    fact[TOPFACT];

The last line is of particular interest because it actually declares 12, not 10, words of storage.  In B the subscripts run from 0 to the declared value, so [0] denoted a 1-element array.  The extra word was a pointer initialized to the first element of the array, so the "fact" of the B program was equivalent to this in C:

        float _unnamed[11], *fact = _unnamed;

The C concept of an array reference decaying to a pointer descends from this.  There were no structs, no arrays of arrays, no enums, no unions -- but you could simulate all of them yourself.  (There was an equivalent of malloc() -- actually, at least on the implementation I used, it was more powerful, because you could free things whether they were allocated by it or not.  I miss that sometimes.)
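As a rough illustration of "simulate them yourself", here is a sketch in C of records built from nothing but words and numeric offsets.  It is an assumption about the general style, not a reconstruction of any particular B program; the names `word`, `PT_X`, `PT_Y`, `PT_SIZE`, `point_at` and `point_set` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef long word;      /* assumption: stand-in for B's single word type */

/* Field offsets take the place of member names.  In B these would be
 * numeric constants declared in the style of TOPFACT. */
#define PT_X    0
#define PT_Y    1
#define PT_SIZE 2       /* words per simulated record */

/* Record i of a simulated "array of structs" starts at word i * PT_SIZE. */
word *point_at(word *pool, size_t i)
{
        assert(pool != NULL);
        return pool + i * PT_SIZE;
}

/* Fill in both "members" of one simulated record. */
void point_set(word *p, word x, word y)
{
        p[PT_X] = x;
        p[PT_Y] = y;
}
```

With a pool of six words, `point_set(point_at(pool, 2), 5, 6)` writes words 4 and 5.  The layout exists only in the programmer's head and in the offset constants; nothing stops code from indexing a record with the wrong offset.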

There was also none of the char/short/int/long hierarchy, and no unsigned operations.  If you wanted to deal with character strings, you could either store one character per word in an array and index them directly, or store one character per byte and access them with library functions.
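The second option, packing characters into words and reaching them through functions, can be sketched in C as follows.  This is only an illustration: the helper names `get_packed_char` and `put_packed_char` are hypothetical rather than the actual library routines, and the 32-bit word with four 8-bit characters is an assumption that does not match every machine B ran on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t word;          /* assumption: 32-bit word */
#define CHARS_PER_WORD 4        /* assumption: four 8-bit characters per word */

/* Fetch character i from a packed string (hypothetical helper). */
unsigned get_packed_char(const word *s, size_t i)
{
        unsigned shift = 8 * (unsigned)(i % CHARS_PER_WORD);
        return (s[i / CHARS_PER_WORD] >> shift) & 0xFFu;
}

/* Store character c at position i of a packed string (hypothetical helper). */
void put_packed_char(word *s, size_t i, unsigned c)
{
        unsigned shift = 8 * (unsigned)(i % CHARS_PER_WORD);
        assert(c <= 0xFFu);
        s[i / CHARS_PER_WORD] &= ~((word)0xFFu << shift);  /* clear the old character */
        s[i / CHARS_PER_WORD] |= (word)c << shift;         /* drop in the new one */
}
```

The one-character-per-word form needs no helpers at all, since `s[i]` already is the character; the packed form trades that convenience for a quarter of the storage under these assumptions.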

The library functions performed some of the same sorts of things that the ones now standard in C did, but the set of functions was a good deal smaller, and generally they were not 100% compatible with the UNIX ones. One thing that I miss was that the same function covered both printf() and fprintf() -- if its first argument was a small number, it took it to be a file descriptor.  In the implementation that I used, it was possible to stack open file descriptors, a programmatic equivalent of the way you can say

        (cmd1; (cmd2a; cmd2b >xx; cmd2c) >yy; cmd3) >zz

in UNIX shell language, and strings could be opened as files.

> Was there ever an A ;).

Not spelled that way, but some time after B and C had arrived at the University of Waterloo, some people there did create languages called Eh and Zed -- in that order, I believe.

Finally, as the Jargon File says, "Before Bjarne Stroustrup settled the question by designing C++, there was a humorous debate over whether C's successor should be named `D' or `P'."


Mark Brader  utzoo!sq!msb, msb@sq.com
"We did not try to keep writing until things got full." -- Ritchie

This article is in the public domain.
