Article 7844 of comp.lang.c:
Subject: noalias comments to X3J11
Date: 20 Mar 88 08:37:58 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Reproduced below is the long essay I sent as an official comment to X3J11. It is in two parts; the first points out some problems in the current definition of `const,' and the second is a diatribe about `noalias'.
By way of introduction, the important thing about `const' is that the current wording says, in section 3.3.4, that a pointer to a const-qualified object may be cast to a pointer to the plain object, but "If an attempt is made to modify the pointed-to object by means of the converted pointer, the behavior is undefined." Because function prototypes tend to convert your pointers to const-qualified pointers, difficulties arise.
In discussion with various X3J11 members, I learned that this section is now regarded as an inadvertent error, and no one thinks that it will last in its current form. Nevertheless, it seemed wisest to keep my comments in their original strong form. The intentions of the committee are irrelevant; only their document matters.
The second part of the essay is about noalias as such. It seems likely that even the intentions of the committee on this subject are confused.
Here's the jeremiad.
This is an essay on why I do not like X3J11 type qualifiers. It is my own opinion; I am not speaking for AT&T.
Let me begin by saying that I'm not convinced that even
the pre-December qualifiers (`const' and `volatile') carry
their weight; I suspect that what they add to the cost of
learning and using the language is not repaid in greater
expressiveness.
`Volatile', in particular, is a frill for esoteric applications, and much better expressed by other means. Its chief virtue is that nearly everyone can forget about it. `Const' is simultaneously more useful and more obtrusive; you can't avoid learning about it, because of its presence in the library interface. Nevertheless, I don't argue for the extirpation of qualifiers, if only because it is too late.
The fundamental problem is that it is not possible to write real programs using the X3J11 definition of C. The committee has created an unreal language that no one can or will actually use. While the problems of `const' may owe to careless drafting of the specification, `noalias' is an altogether mistaken notion, and must not survive.
A substantial fraction of the library cannot be expressed in the proposed language.
One of the simplest routines,
    char *strchr(const noalias char *s, int c);
can return its first parameter. This first parameter must be declared with `const noalias;' otherwise, it would be illegal (by the constraints on assignment, 3.3.16.1) to pass the address of a const or noalias object. That is, the type qualifiers in the prototype are not merely an optional pleasantry of the interface; they are required, if one is to pass some kinds of data to this or most other library routines.
Unfortunately, there is no way in X3J11's language for
strchr to return the value it promises to, because of the
semantics of return (3.6.6.4) and casts (3.3.4). Whether
the stripping of the const and noalias qualifiers is done by
casts inside strchr, or implicitly by its return statement,
strchr returns a pointer that (because of `const') cannot be
stored through, and (because of `noalias') cannot even be
dereferenced; by the rules, it is useless. (Incidentally, I
think this observation was made by Tom Plum several years
ago; it's disconcerting that the inconsistency remains.)
Although the plain words of the Standard deny it, plastering the appropriate `non-const' cast on an expression to silence a compiler is sometimes safe, because the most probable implementation of `const' objects will allow them to be read through any access path, and will diagnose attempts to change them by generating an access violation fault at run time. That is, in common implementations, adding or taking away the `const' qualifier of a pointer can never create any bugs not implicit in the rule `do not modify a genuine const object through any access path.'
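To make the point concrete, here is a minimal sketch of how such a routine is typically written once `noalias' is set aside. The name my_strchr is hypothetical, and the code is only one plausible implementation, not the text of any particular library. The cast in the return statement is the `non-const' cast discussed above: harmless on common implementations when the caller's object was not defined const, but undefined by the wording of 3.3.4 if the caller later stores through the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for strchr: find the first occurrence of c
 * (converted to char) in s; the terminating null counts as part of
 * the string. */
char *my_strchr(const char *s, int c)
{
    assert(s != NULL);
    for (;; s++) {
        if (*s == (char)c)
            return (char *)s;   /* qualifier stripped by a cast */
        if (*s == '\0')
            return NULL;        /* end of string, no match */
    }
}
```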
Nevertheless, I must emphasize that this is not the
rule that X3J11 has written, and that its library is inconsistent
with its language. Someone writing an interpreter
using X3J11/88-001 is perfectly at liberty to (indeed, is
advised to) carry with each pointer a `modifiable' bit, that
(following 3.3.4) remains off when a pointer to
a const-qualified object is cast into a plain
pointer. This implementation will prevent many
of the real uses of strchr, for
example. I'm thinking of things like
    if (p = strchr(q, '/')) *p = ' ';
which are common and innocuous in C, but undefined by X3J11's language.
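Written out as a complete function, the idiom looks like the following sketch. The function name slash_to_space is made up for illustration; the library call is the ordinary strchr, declared without `noalias'.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: replace the first '/' in a writable string
 * with a space.  Returns 1 if a replacement was made, 0 otherwise. */
int slash_to_space(char *q)
{
    char *p;

    assert(q != NULL);
    /* strchr's parameter is const-qualified, yet the pointer it
     * returns is stored through on the next line. */
    if ((p = strchr(q, '/')) != NULL) {
        *p = ' ';
        return 1;
    }
    return 0;
}
```

The caller's array is an ordinary writable object, so on common implementations nothing can go wrong here; it is only the draft's wording that makes the store undefined.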
A related observation is that string literals are not of type `array of const char.' Indeed, the Rationale (88-004 version) says, `However, string literals do not have [this type], in order to avoid the problems of pointer type checking, particularly with library functions....' Should this bald statement be considered anything other than an admission that X3J11's rules are screwy? It is ludicrous that the committee introduces the `const' qualifier, and also makes strings unwritable, yet is unable to connect the two conceptions.
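A small sketch of the disconnect, with hypothetical function names: because a string literal has type `array of char,' it can be handed to a function whose parameter is a plain `char *' without a cast and without any required diagnostic, even though storing through that pointer would be undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function with an unqualified parameter; it only
 * reads the string, but its prototype does not say so. */
size_t count_chars(char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (*s++ != '\0')
        n++;
    return n;
}

size_t literal_length(void)
{
    /* No cast is needed: the literal is not const-qualified.  The
     * type system offers no protection if count_chars were changed
     * to write through s. */
    return count_chars("hello");
}
```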
`Noalias' is much more dangerous; the committee is planting timebombs that are sure to explode in people's faces. Assigning an ordinary pointer to a pointer to a `noalias' object is a license for the compiler to undertake aggressive optimizations that are completely legal by the committee's rules, but make hash of apparently safe programs. Again, the problem is most visible in the library; parameters declared `noalias type *' are especially problematical.
In order to write such a library routine using the new
parameter declarations, it is in practice necessary to
violate 3.3.4: `A pointer to a noalias-qualified type ...
may be converted to ... the non-noalias-qualified type. If
the pointed to object is referred to by means of the converted
pointer, the behavior is undefined.' Thus, the problem
that occurs with `const' is now much worse; there are no
interesting and legal uses of strchr.
How do you code a routine whose prototype specifies
a noalias pointer? If you fail to violate 3.3.4,
but instead try to rewrite the declarations of temporary variables to
make them agree in type with parameters, it becomes hard to
be sure that the routine works. Consider the specification
    char *strtok(noalias char *s1, noalias const char *s2);
It retains a static pointer to its writable, `noalias' first argument. Can you be sure that this routine can be made safe under the rules? I have studied it, and the answer is conditionally yes, provided one accepts certain parts of the Standard as gospel (for example that `noalias' handles will not be synchronized at certain times) while ignoring other parts. It is a very dodgy thing. For other routines, it is certain that complete rewriting is necessary: qsort, for example, is full of pointers that rove the argument array and change it here and there. If these local pointers are qualified with `noalias,' they may all be pointing to different virtual copies of parts of the array; in any event, the argument itself may have a virtual object that might be completely untouched by the attempt to sort it.
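To illustrate what roving pointers look like, here is a sketch of a much simpler routine than qsort: an insertion sort over an array of int. The name sort_ints is hypothetical, and this is not how any particular qsort is written. The point is only that base, i and j all refer to the same array at once, and that the routine depends on a store through one being visible through the others.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sort n ints in place, ascending. */
void sort_ints(int *base, size_t n)
{
    int *i, *j;

    if (n < 2)
        return;                 /* nothing to do */
    assert(base != NULL);
    for (i = base + 1; i < base + n; i++) {
        int key = *i;

        /* j walks back over elements that i has already passed,
         * shifting larger ones up; every store lands in the
         * caller's array, which i and base also point into. */
        for (j = i; j > base && j[-1] > key; j--)
            *j = j[-1];
        *j = key;
    }
}
```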
The `noalias' rules have the assignment and cast restrictions
backwards. Assigning a plain pointer to
a const-qualified pointer (pc = p) is well-defined by the
rules and is safe, in that it
restricts what you can do with pc.
The other way around (p = pc) is
forbidden, presumably because it creates a writable access path to
an unwritable object.
With `noalias,' the rules are the same (pna = p is OK,
p = pna is forbidden), but the realistic safety
requirements are completely different. Both of these assignments
are equally suspicious, in that both create two access paths to an
object, one manifestation of which might be virtual.
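The `const' half of this comparison can be written out in code; the `noalias' half cannot be shown in a compilable example, since the qualifier exists only in the draft. The function and variable names below are chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example of the two directions of assignment. */
int read_then_write(int *p)
{
    const int *pc;
    int *q;

    assert(p != NULL);
    pc = p;          /* plain to const-qualified: allowed, and it only
                      * restricts what can be done through pc */
    q = (int *)pc;   /* the other direction is rejected without a cast */
    *q = *pc + 1;    /* harmless here only because *p is not a
                      * genuinely const object */
    return *pc;
}
```

Note that after the first assignment there are already two access paths to the same object; with `const' that is harmless, which is the asymmetry at issue.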
Here is another way of observing the asymmetry: the presence of `const type *' in a parameter list is a useful piece of interface information, but `noalias type *' most assuredly is not. Given the declaration
    memcpy(noalias void *s1, const noalias void *s2, size_t n);
what information can one glean from it? Some committee members apparently believe that it conveys either to the reader or to the compiler that the routine is safe, provided that the strings do not overlap. They are mistaken. Perhaps the committee's intent is not reflected in the current words of the Standard, but I can find nothing there that justifies their belief. The rules (page 65, lines 19-20) specify `all objects accessible by these [noalias] lvalues,' which is the entirety of both array arguments.
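A sketch of why the distinction matters, using ordinary C without `noalias' and two hypothetical helpers. Whether the regions overlap depends on the arguments of each individual call, not on the array they point into. On the reading given above, even the first helper would seemingly be ruled out, since both of its memcpy arguments give access to the same array.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: duplicate the first half of buf into its
 * second half.  Source and destination are disjoint regions of one
 * array, so memcpy's no-overlap requirement is met. */
void copy_first_half(char *buf, size_t half)
{
    assert(buf != NULL);
    memcpy(buf + half, buf, half);
}

/* Hypothetical helper: shift the first len bytes up by one position.
 * Here the regions do overlap, so memmove is used instead. */
void shift_up_one(char *buf, size_t len)
{
    assert(buf != NULL);
    memmove(buf + 1, buf, len);
}
```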
More generally, suppose I see a prototype
    char *magicfunction(noalias char *, noalias char *);
Is there anything at all I can conclude about the requirements of magicfunction? Is there anything at all I can conclude about things it promises to do or not to do? All I learn from the Rationale (page 52) is that such a routine enjoins me from letting the arguments overlap, but this is at variance with the Standard, which gives a stronger injunction.
Within the function itself, things are equally bad. A
`const type *' parameter, though it presents problems for
strchr and other routines, does usefully constrain the function:
it's not allowed to store through the pointer. However,
within a function with a `noalias type *' parameter,
nothing is gained except bizarre restrictions: it can't cast
the parameter to a plain pointer, and it can't assign the
parameter to another noalias pointer without creating
unwanted handles and potential virtual objects. The interface
must say noalias, or at any
rate does say noalias, so
the author of the routine has all the grotesque inventions
of 3.5.3 (handles, virtual objects) rubbed in his face, like
it or not.
The utter wrongness of `noalias' is that the information it seeks to convey is not a property of an object at all. `Const,' for all its technical faults, is at least a genuine property of objects; `noalias' is not, and the committee's confused attempt to improve optimization by pinning a new qualifier on objects spoils the language. `Noalias' is a bogus invention that is not necessary, and not in any case sufficient for its apparent purpose.
Earlier languages flirted with gizmos intended to help
optimization, and generally abandoned them. The original
Fortran, for example, had a
FREQUENCY statement that didn't
help much, confused people, and was dropped. PL/1 had
`normal/abnormal' and `uses/sets' attributes that suffered a
similar fate. Today, these are generally looked on as
adolescent experimentation.
On the other hand, the insufficiency of `noalias' is
suggested by Cray's Fortran compiler, which has 20 separate
keywords that control various details of optimization. They
are specified by an equivalent of
#pragma, and thus, despite
their oddness, can be ignored when trying to understand the
meaning of a program.
Perhaps there is some reason to provide a mechanism for asserting, in a particular patch of code, that the compiler is free to make optimistic assumptions about the kinds of aliasing that can occur. I don't know any acceptable way of changing the language specification to express the possibility of this kind of optimization, and I don't know how much performance improvement is likely to result. I would encourage compiler-writers to experiment with extensions, by #pragma or otherwise, to see what ideas and improvements they can come up with, but I am certain that nothing resembling the noalias proposal should be in the Standard.
K&R C has one important internal contradiction (variadic functions are forbidden, yet printf exists) and one important divergence between rule and reality (common vs. ref/def external data definitions). These contradictions have been an embarrassment to me throughout the years, and resolving them was high on X3J11's agenda. X3J11 did manage to come up with an adequate, if awkward, solution to the first problem. Their solution to the second was the same as mine (make a rule, then issue a blanket license to violate it).
I'm aware that there are distinctions to be made between `conforming' and `strictly conforming' programs. Although the X3J11 rules for qualifiers are inconsistent, and therefore most nominally X3J11 compilers will ignore, or only warn about, casts and assignments that X3J11 says are undefined, people will somehow survive. C has, after all, survived the vararg and the extern problems.
Nevertheless, I advise strongly against sanctifying a language specification that no one can possibly embody in a useful compiler. This advice is based on bitter experience.
Noalias must go. This is non-negotiable.
It must not be reworded, reformulated or reinvented. The draft's description is badly flawed, but that is not the problem. The concept is wrong from start to finish. It negates every brave promise X3J11 ever made about codifying existing practices, preserving the existing body of code, and keeping (dare I say it?) `the spirit of C.'
Const has two virtues: putting things in read-only memory, and expressing interface restrictions. For example, saying
    char *strchr(const char *s, int c);
is a reasonable way of expressing that the routine cannot change the object referred to by its first argument. I think that minor changes in wording preserve the virtues, yet eliminate the contradictions in the current scheme.
These rules give up a certain amount of checking, but they save the consistency of the language.