[After Dennis Ritchie had written the following critique of type qualifiers in the draft ANSI standard of January 1988, the bar on assignments to previously const-qualified lvalues was removed, and noalias did go.]

Article 7844 of comp.lang.c:
From: dmr@alice.UUCP
Newsgroups: comp.lang.c
Subject: noalias comments to X3J11
Message-ID: <7753@alice.UUCP>
Date: 20 Mar 88 08:37:58 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 333

Reproduced below is the long essay I sent as an official comment to X3J11.  It is in two parts; the first points out some problems in the current definition of `const,' and the second is a diatribe about `noalias'.

By way of introduction, the important thing about `const' is that the current wording says, in section 3.3.4, that a pointer to a const-qualified object may be cast to a pointer to the plain object, but "If an attempt is made to modify the pointed-to object by means of the converted pointer, the behavior is undefined." Because function prototypes tend to convert your pointers to const-qualified pointers, difficulties arise.

In discussion with various X3J11 members, I learned that this section is now regarded as an inadvertent error, and no one thinks that it will last in its current form.  Nevertheless, it seemed wisest to keep my comments in their original strong form.  The intentions of the committee are irrelevant; only their document matters.

The second part of the essay is about noalias as such.  It seems likely that even the intentions of the committee on this subject are confused.

Here's the jeremiad.

Dennis Ritchie
research!dmr
dmr@research.att.com


This is an essay on why I do not like X3J11 type qualifiers. It is my own opinion; I am not speaking for AT&T.

      Let me begin by saying that I'm not convinced that even the pre-December qualifiers (`const' and `volatile') carry their weight; I suspect that what they add to the cost of learning and using the language is not repaid in greater expressiveness.
`Volatile', in particular, is a frill for esoteric applications, and much better expressed by other means.  Its chief virtue is that nearly everyone can forget about it.  `Const' is simultaneously more useful and more obtrusive; you can't avoid learning about it, because of its presence in the library interface.  Nevertheless, I don't argue for the extirpation of qualifiers, if only because it is too late.

      The fundamental problem is that it is not possible to write real programs using the X3J11 definition of C.  The committee has created an unreal language that no one can or will actually use.  While the problems of `const' may owe to careless drafting of the specification, `noalias' is an altogether mistaken notion, and must not survive.

1.  The qualifiers create an inconsistent language

      A substantial fraction of the library cannot be expressed in the proposed language.

One of the simplest routines,

    char *strchr(const noalias char *s, int c);
can return its first parameter.  This first parameter must be declared with `const noalias;' otherwise, it would be illegal (by the constraints on assignment, 3.3.16.1) to pass the address of a const or noalias object.  That is, the type qualifiers in the prototype are not merely an optional pleasantry of the interface; they are required, if one is to pass some kinds of data to this or most other library routines.

      Unfortunately, there is no way in X3J11's language for strchr to return the value it promises to, because of the semantics of return (3.6.6.4) and casts (3.3.4).  Whether the stripping of the const and noalias qualifiers is done by cast inside strchr or implicitly by its return statement, strchr returns a pointer that (because of `const') cannot be stored through, and (because of `noalias') cannot even be dereferenced; by the rules, it is useless.  (Incidentally, I think this observation was made by Tom Plum several years ago; it's disconcerting that the inconsistency remains.)

      Although the plain words of the Standard deny it, plastering the appropriate `non-const' cast on an expression to silence a compiler is sometimes safe, because the most probable implementation of `const' objects will allow them to be read through any access path, and will diagnose attempts to change them by generating an access violation fault at run time.  That is, in common implementations, adding or taking away the `const' qualifier of a pointer can never create any bugs not implicit in the rule `do not modify a genuine const object through any access path.'

      Nevertheless, I must emphasize that this is not the rule that X3J11 has written, and that its library is inconsistent with its language.  Someone writing an interpreter using X3J11/88-001 is perfectly at liberty to (indeed, is advised to) carry with each pointer a `modifiable' bit, that (following 3.3.4) remains off when a pointer to a const-qualified object is cast into a plain pointer.  This implementation will prevent many of the real uses of strchr, for example.  I'm thinking of things like

        if (p = strchr(q, '/'))
                *p = ' ';
which are common and innocuous in C, but undefined by X3J11's language.
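
[Editor's illustration, not part of the original comment.  The interpreter described above can be imitated in ordinary C with a small "fat pointer."  This is only a sketch under stated assumptions: the names checked_ptr, to_const, to_plain, checked_strchr and checked_store are hypothetical and appear in no draft or library.  It shows why the result of strchr would be useless for stores under the draft's wording.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: an address plus the `modifiable' bit. */
struct checked_ptr {
    char *addr;
    bool modifiable;
};

/* Converting to pointer-to-const turns the bit off. */
static struct checked_ptr to_const(struct checked_ptr p)
{
    p.modifiable = false;
    return p;
}

/* Following the draft's 3.3.4, casting back to a plain pointer
   does not turn the bit on again. */
static struct checked_ptr to_plain(struct checked_ptr p)
{
    return p;
}

/* strchr as the prototype forces it to behave in this model: the
   argument is converted to const on entry, so the result carries
   a cleared bit. */
static struct checked_ptr checked_strchr(struct checked_ptr s, int c)
{
    assert(s.addr != NULL);
    s = to_const(s);
    for (;; s.addr++) {
        if (*s.addr == (char)c)
            return to_plain(s);
        if (*s.addr == '\0') {
            s.addr = NULL;
            return s;
        }
    }
}

/* A store succeeds only when the bit is on; a real interpreter
   would trap instead of returning false. */
static bool checked_store(struct checked_ptr p, char c)
{
    if (p.addr == NULL || !p.modifiable)
        return false;
    *p.addr = c;
    return true;
}
```

[In this model, a store through the result of checked_strchr is refused even when the caller's own array is perfectly writable, while a store through the caller's original pointer still succeeds.]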

      A related observation is that string literals are not of type `array of const char.' Indeed, the Rationale (88-004 version) says, `However, string literals do not have [this type], in order to avoid the problems of pointer type checking, particularly with library functions....' Should this bald statement be considered anything other than an admission that X3J11's rules are screwy?  It is ludicrous that the committee introduces the `const' qualifier, and also makes strings unwritable, yet is unable to connect the two conceptions.

2.  Noalias is an abomination

      `Noalias' is much more dangerous; the committee is planting timebombs that are sure to explode in people's faces.  Assigning an ordinary pointer to a pointer to a `noalias' object is a license for the compiler to undertake aggressive optimizations that are completely legal by the committee's rules, but make hash of apparently safe programs.  Again, the problem is most visible in the library; parameters declared `noalias type *' are especially problematical.

      In order to write such a library routine using the new parameter declarations, it is in practice necessary to violate 3.3.4: `A pointer to a noalias-qualified type ... may be converted to ... the non-noalias-qualified type.  If the pointed to object is referred to by means of the converted pointer, the behavior is undefined.' Thus, the problem that occurs with `const' is now much worse; there are no interesting and legal uses of strchr.

      How do you code a routine whose prototype specifies a noalias pointer?  If you fail to violate 3.3.4, but instead try to rewrite the declarations of temporary variables to make them agree in type with parameters, it becomes hard to be sure that the routine works.  Consider the specification of strtok:

    char *strtok(noalias char *s1, noalias const char *s2);
It retains a static pointer to its writable, `noalias' first argument.  Can you be sure that this routine can be made safe under the rules?  I have studied it, and the answer is conditionally yes, provided one accepts certain parts of the Standard as gospel (for example that `noalias' handles will not be synchronized at certain times) while ignoring other parts.  It is a very dodgy thing.  For other routines, it is certain that complete rewriting is necessary: qsort, for example, is full of pointers that rove the argument array and change it here and there.  If these local pointers are qualified with `noalias,' they may all be pointing to different virtual copies of parts of the array; in any event, the argument itself may have a virtual object that might be completely untouched by the attempt to sort it.

      The `noalias' rules have the assignment and cast restrictions backwards.  Assigning a plain pointer to a const-qualified pointer (pc = p) is well-defined by the rules and is safe, in that it restricts what you can do with pc.  The other way around (p = pc) is forbidden, presumably because it creates a writable access path to an unwritable object.  With `noalias,' the rules are the same (pna = p is OK, p = pna is forbidden), but the realistic safety requirements are completely different.  Both of these assignments are equally suspicious, in that both create two access paths to an object, one manifestation of which might be virtual.
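
[Editor's illustration, not part of the original comment.  The two directions of assignment for `const' can be sketched in C as it was finally standardized, where the p = pc direction needs an explicit cast.  The function name roundtrip is hypothetical.  The store is well-defined here only because the underlying object is an ordinary writable array.]

```c
#include <assert.h>
#include <stddef.h>

/* Exercise both directions of conversion on a writable buffer.
   Returns 1 if the store made through the plain pointer is
   visible through the const-qualified one. */
static int roundtrip(char *p)
{
    const char *pc;
    char *back;

    assert(p != NULL);
    pc = p;             /* pc = p: allowed, and only narrows what pc may do */
    back = (char *)pc;  /* p = pc: needs a cast to drop the qualifier */
    back[0] = 'X';      /* defined, because the object itself is not const */
    return pc[0] == 'X';
}
```

[Both pointers name the same single object, so the read through pc sees the store.  It is exactly this guarantee that `noalias' handles and virtual objects would take away.]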

      Here is another way of observing the asymmetry: the presence of `const type *' in a parameter list is a useful piece of interface information, but `noalias type *' most assuredly is not.  Given the declaration

    memcpy(noalias void *s1, const noalias void *s2, size_t n);
what information can one glean from it?  Some committee members apparently believe that it conveys either to the reader or to the compiler that the routine is safe, provided that the strings do not overlap.  They are mistaken. Perhaps the committee's intent is not reflected in the current words of the Standard, but I can find nothing there that justifies their belief.  The rules (page 65, lines 19-20) specify `all objects accessible by these [noalias] lvalues,' which is the entirety of both array arguments.
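
[Editor's illustration, not part of the original comment.  The non-overlap requirement that some members apparently read into the declaration is the familiar one sketched below: a copier that walks forward gives a surprising result when the destination overlaps the source.  The names forward_copy, shift_naive and shift_safe are hypothetical; memmove is the standard routine specified to handle overlap.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte copier that walks forward, as a simple memcpy might. */
static void forward_copy(char *dst, const char *src, size_t n)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Shift the first n bytes of buf up by one with the naive copier. */
static void shift_naive(char *buf, size_t n)
{
    forward_copy(buf + 1, buf, n);
}

/* The same shift with memmove, which copes with overlapping ranges. */
static void shift_safe(char *buf, size_t n)
{
    memmove(buf + 1, buf, n);
}
```

[Starting from "abcdef" with n = 5, shift_naive leaves "aaaaaa" and shift_safe leaves "aabcde".  Nothing in either parameter list expresses the difference, which must be stated in the routine's documentation.]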

      More generally, suppose I see a prototype

    char *magicfunction(noalias char *, noalias char *);
Is there anything at all I can conclude about the requirements of magicfunction?  Is there anything at all I can conclude about things it promises to do or not to do?  All I learn from the Rationale (page 52) is that such a routine enjoins me from letting the arguments overlap, but this is at variance with the Standard, which gives a stronger injunction.

      Within the function itself, things are equally bad.  A `const type *' parameter, though it presents problems for strchr and other routines, does usefully constrain the function: it's not allowed to store through the pointer.  However, within a function with a `noalias type *' parameter, nothing is gained except bizarre restrictions: it can't cast the parameter to a plain pointer, and it can't assign the parameter to another noalias pointer without creating unwanted handles and potential virtual objects.  The interface must say noalias, or at any rate does say noalias, so the author of the routine has all the grotesque inventions of 3.5.3 (handles, virtual objects) rubbed in his face, like it or not.

      The utter wrongness of `noalias' is that the information it seeks to convey is not a property of an object at all. `Const,' for all its technical faults, is at least a genuine property of objects; `noalias' is not, and the committee's confused attempt to improve optimization by pinning a new qualifier on objects spoils the language. `Noalias' is a bogus invention that is not necessary, and not in any case sufficient for its apparent purpose.

      Earlier languages flirted with gizmos intended to help optimization, and generally abandoned them.  The original Fortran, for example, had a FREQUENCY statement that didn't help much, confused people, and was dropped.  PL/1 had `normal/abnormal' and `uses/sets' attributes that suffered a similar fate.  Today, these are generally looked on as adolescent experiments.

      On the other hand, the insufficiency of `noalias' is suggested by Cray's Fortran compiler, which has 20 separate keywords that control various details of optimization.  They are specified by an equivalent of #pragma and thus, despite their oddness, can be ignored when trying to understand the meaning of a program.

      Perhaps there is some reason to provide a mechanism for asserting, in a particular patch of code, that the compiler is free to make optimistic assumptions about the kinds of aliasing that can occur.  I don't know any acceptable way of changing the language specification to express the possibility of this kind of optimization, and I don't know how much performance improvement is likely to result.  I would encourage compiler-writers to experiment with extensions, by #pragma or otherwise, to see what ideas and improvements they can come up with, but I am certain that nothing resembling the noalias proposal should be in the Standard.

3. The cost of inconsistency

      K&R C has one important internal contradiction (variadic functions are forbidden, yet printf exists) and one important divergence between rule and reality (common vs. ref/def external data definitions).  These contradictions have been an embarrassment to me throughout the years, and resolving them was high on X3J11's agenda.  X3J11 did manage to come up with an adequate, if awkward, solution to the first problem.  Their solution to the second was the same as mine (make a rule, then issue a blanket license to violate it).

      I'm aware that there are distinctions to be made between `conforming' and `strictly conforming' programs. Although the X3J11 rules for qualifiers are inconsistent, and therefore most nominally X3J11 compilers will ignore, or only warn about, casts and assignments that X3J11 says are undefined, people will somehow survive.  C has, after all, survived the vararg and the extern problems.

      Nevertheless, I advise strongly against sanctifying a language specification that no one can possibly embody in a useful compiler.  This advice is based on bitter experience.

4. What to do?

Noalias must go.  This is non-negotiable.

      It must not be reworded, reformulated or reinvented.  The draft's description is badly flawed, but that is not the problem.  The concept is wrong from start to finish.  It negates every brave promise X3J11 ever made about codifying existing practices, preserving the existing body of code, and keeping (dare I say it?) `the spirit of C.'

      Const has two virtues: putting things in read-only memory, and expressing interface restrictions.  For example, saying

    char *strchr(const char *s, int c);
is a reasonable way of expressing that the routine cannot change the object referred to by its first argument.  I think that minor changes in wording preserve the virtues, yet eliminate the contradictions in the current scheme.

  1. Reword page 47, lines 3-5 of 3.3.4 (Cast operators), to remove the undefinedness of modifying pointed-to objects, or remove these lines altogether (since casting non-qualified to qualified isn't discussed explicitly either).

  2. Rewrite the constraint on page 54, lines 14-15, to say that pointers may be assigned without taking qualifiers into account.

  3. Preserve all current constraints against modifying non-modifiable lvalues, that is things of manifestly const-qualified type.

  4. String literals have type `const char []'.

  5. Add a constraint (or discussion or example) to assignment that makes clear the illegality of assigning to an object whose actual type is const-qualified, no matter what access path is used.  There is a manifest constraint that is easy to check (left side is not const-qualified), but also a non-checkable constraint (left side is not secretly const-qualified).  The effect should be that converting between pointers to const-qualified and plain objects is legal and well-defined; avoiding assignment through pointers that derive ultimately from `const' objects is the programmer's responsibility.

      These rules give up a certain amount of checking, but they save the consistency of the language.
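
[Editor's illustration, not part of the original comment.  Under rules of the kind proposed above, which are close to what the final Standard adopted, strchr can be written with a const-qualified parameter and still return a usable pointer.  A minimal sketch follows; the names sketch_strchr and blank_slashes are hypothetical.  Whether the returned pointer may be stored through depends on the object the caller passed, which remains the caller's responsibility.]

```c
#include <assert.h>
#include <stddef.h>

/* The parameter promises that this routine stores nothing through s.
   The cast on return drops the qualifier, which is well-defined. */
static char *sketch_strchr(const char *s, int c)
{
    assert(s != NULL);
    for (;; s++) {
        if (*s == (char)c)
            return (char *)s;
        if (*s == '\0')
            return NULL;
    }
}

/* The common idiom: replace each '/' in a writable string by a blank.
   Returns the number of replacements made. */
static int blank_slashes(char *q)
{
    int n = 0;
    char *p;

    while ((p = sketch_strchr(q, '/')) != NULL) {
        *p = ' ';
        n++;
    }
    return n;
}
```

[Calling blank_slashes on a writable array holding "usr/local/bin" makes two replacements and leaves "usr local bin".  Passing a genuinely const object to it would be the non-checkable violation described in item 5.]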