strcpy() revisited

hgm · Post by **hgm** » Thu Dec 12, 2013 5:40 pm

wgarvin wrote: It's not a "reserved word", just a function whose semantics are intrinsically known to the compiler.

Yes, this is the point that puzzled me. When I first learned C, all these functions were just external functions, in no way different from or privileged with respect to functions you could declare and define yourself. And as system headers contain just prototypes, there was nothing against using a system header to declare strcpy, but then provide your own definition of it (conforming to the prototype). And in fact this is still exactly how it works in the gcc I use (3.4.4). In the quoted standard, however, this seems to invoke UB.

I don't like that very much, as this apparently requires an exhaustive list of new reserved identifiers, which the programmer should know. It would not be so bad if the compiler warned against redefinition of such a reserved identifier, but gcc 3.4.4 doesn't, not even with -Wall. A better design, IMO, would be to add a single new keyword 'standard' (or 'library', or perhaps '__standard__') that could be added to prototypes for which it is an error to provide a definition. There would have been no need for any UB in that case, as it would just be forbidden to do the things that now are defined to cause UB.

hgm · Post by **hgm** » Thu Dec 12, 2013 6:01 pm

syzygy wrote:I don't know what you are talking about, but I am talking about the example I gave. You can compile it yourself with "gcc -O3 -S" and inspect the assembly code.

I was talking about Bob's example, which was more relevant, as it actually was done on an Apple. I suppose you are referring to this one:


#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
  char *a = argv[1];
  char b[256];
  strcpy(b, a);
  strcpy(b+1, b);
  printf("strlen(\"%s\") = %d\n", b, strlen(b));
  return 0;
}

?

Well, I compiled it, and I get:


Makro@Makro-PC ~
$ gcc -O3 test2.c

Makro@Makro-PC ~
$ ./a.exe
Segmentation fault (core dumped)

Makro@Makro-PC ~
$

As I would have expected. The generated assembler contains no surprises:


Makro@Makro-PC ~
$ gcc -O3 -S test2.c

Makro@Makro-PC ~
$ cat test2.s
        .file   "test2.c"
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section .rdata,"dr"
LC0:
        .ascii "strlen(\"%s\") = %d\12\0"
        .text
        .p2align 4,,15
.globl _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
        pushl   %ebp
        movl    $16, %eax
        movl    %esp, %ebp
        pushl   %ebx
        subl    $276, %esp
        andl    $-16, %esp
        call    __alloca
        call    ___main
        movl    12(%ebp), %ecx
        leal    -264(%ebp), %ebx
        movl    4(%ecx), %edx
        movl    %ebx, (%esp)
        movl    %edx, 4(%esp)
        call    _strcpy
        movl    %ebx, 4(%esp)
        leal    -263(%ebp), %eax
        movl    %eax, (%esp)
        call    _strcpy
        movl    %ebx, %ecx
        .p2align 4,,15
L2:
        movl    (%ecx), %eax
        addl    $4, %ecx
        leal    -16843009(%eax), %edx
        notl    %eax
        andl    %eax, %edx
        movl    %edx, %eax
        andl    $-2139062144, %eax
        je      L2
        andl    $32896, %edx
        jne     L4
        shrl    $16, %eax
        addl    $2, %ecx
L4:
        movl    %ebx, 4(%esp)
        addb    %al, %al
        sbbl    $3, %ecx
        subl    %ebx, %ecx
        movl    %ecx, 8(%esp)
        movl    $LC0, (%esp)
        call    _printf
        movl    -4(%ebp), %ebx
        xorl    %eax, %eax
        leave
        ret
        .def    _printf;        .scl    3;      .type   32;     .endef
        .def    _strcpy;        .scl    3;      .type   32;     .endef

Makro@Makro-PC ~
$

I cannot know what the assembler was that the optimizer produced in your case. The most efficient, of course, would be to determine the length of the string at compile time, and make use of the known length to inline a dedicated move. Calling strlen() at run time for a string of known length already seems poor code.
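For readers who do not read AT&T syntax: the loop at label L2 in the listing above is an inlined strlen() that examines four bytes per iteration. The constants -16843009 and -2139062144 are -0x01010101 and 0x80808080, and the test they implement can be rendered in C roughly as below. This is only a sketch of that one test; the name has_zero_byte is made up for illustration and does not appear in the compiler output.

```c
#include <stdint.h>

/* Illustrative name, not from the listing.
 * Returns nonzero if any of the four bytes of x is 0x00.
 * Mirrors: leal -16843009(%eax),%edx ; notl %eax ; andl %eax,%edx ;
 *          andl $-2139062144,...                                      */
int has_zero_byte(uint32_t x)
{
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}
```

A byte can only end up with its top bit set in the result if it was zero (or received a borrow from a zero byte below it), so the loop keeps fetching words until this test fires and then works out which byte it was.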

Of course one could redefine strcpy() to not have UB in case of overlapping regions. What should be clear is that this would not only have advantages.

This was clear from the beginning: the disadvantage is that you have to test if the implementation would break on the overlap at hand.

But this advantage is wholly annihilated in the Apple implementation, as they obviously do such a run-time test. And so far not a single other advantage has been shown here.

So the bottom line seems to be that they needlessly break backward compatibility by introducing a new, unnatural kind of UB, and have all correct programs pay for their desire to pester old-time code in terms of performance...
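To make concrete what kind of run-time test is meant here, a minimal sketch follows. It is not Apple's code, which is not shown in this thread; regions_overlap and checked_strcpy are hypothetical names, and refusing the copy by returning NULL is just one possible reaction to a detected overlap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if [p, p+n) and [q, q+n) share at least one byte.
 * The pointers are compared as integers, since relational comparison of
 * pointers into different objects is not defined by the C standard. */
bool regions_overlap(const void *p, const void *q, size_t n)
{
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    return (a <= b) ? (b - a < n) : (a - b < n);
}

/* Hypothetical checked copy: returns NULL and copies nothing on overlap,
 * otherwise behaves like strcpy(). */
char *checked_strcpy(char *dst, const char *src)
{
    size_t n = strlen(src) + 1;   /* bytes to copy, including the '\0' */
    if (regions_overlap(dst, src, n))
        return NULL;
    memcpy(dst, src, n);
    return dst;
}
```

Note that the check needs the length of the source before any byte is copied, so every call pays for it, including the correct, non-overlapping ones.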

wgarvin · Post by **wgarvin** » Thu Dec 12, 2013 6:15 pm

hgm wrote:
wgarvin wrote: It's not a "reserved word", just a function whose semantics are intrinsically known to the compiler.
Yes, this is the point that puzzled me. When I first learned C, all these functions were just external functions, in no way different from or privileged with respect to functions you could declare and define yourself. And as system headers contain just prototypes, there was nothing against using a system header to declare strcpy, but then provide your own definition of it (conforming to the prototype). And in fact this is still exactly how it works in the gcc I use (3.4.4). In the quoted standard, however, this seems to invoke UB.

I think you can still do that if you want to, although if you're replacing a C library function then you can't realistically change its API. But GCC does offer -fno-builtin as a way to turn off these optimizations, and then it will call your replacement strcpy / memcpy / whatever the same way it would call any regular function. I think GCC does support a bunch of weird embedded platforms and microcontrollers, etc., where these kinds of 'builtin function' optimizations could easily just get in your way.

I don't like that very much, as this apparently requires an exhaustive list of new reserved identifiers, which the programmer should know. It would not be so bad if the compiler warned against redefinition of such a reserved identifier, but gcc 3.4.4 doesn't, not even with -Wall. A better design, IMO, would be to add a single new keyword 'standard' (or 'library', or perhaps '__standard__') that could be added to prototypes for which it is an error to provide a definition. There would have been no need for any UB in that case, as it would just be forbidden to do the things that now are defined to cause UB.

The programmer probably already should know them, if they are including a standard header that defines them... they can't then also define their own different function that collides with the standard one, right?
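If what you want is different semantics rather than a drop-in replacement, one way to stay clear of the collision is to give the function its own name. A minimal sketch, assuming an overlap-tolerant copy is the goal; overlap_safe_strcpy is a made-up name, not a library function:

```c
#include <string.h>

/* Hypothetical name. Copies src, including its terminating '\0', to dst.
 * Unlike strcpy(), the two regions are allowed to overlap, because
 * memmove() is specified to handle that case. */
char *overlap_safe_strcpy(char *dst, const char *src)
{
    return memmove(dst, src, strlen(src) + 1);
}
```

With this, something like overlap_safe_strcpy(b + 1, b) has a defined result, and the standard strcpy keeps its standard meaning for the compiler.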

wgarvin · Post by **wgarvin** » Thu Dec 12, 2013 6:24 pm

hgm wrote:
Of course one could redefine strcpy() to not have UB in case of overlapping regions. What should be clear is that this would not only have advantages.
This was clear from the beginning: the disadvantage is that you have to test if the implementation would break on the overlap at hand.

But this advantage is wholly annihilated in the Apple implementation, as they obviously do such a run-time test. And so far not a single other advantage has been shown here.

I think maybe you missed the point... The optimization that Ronald reported, where his compiler replaced the two strcpy calls with a stpcpy and a memcpy -- THAT optimization, and probably some similar other optimizations it has, only works if the two strings don't overlap. THAT is what you're giving up if you decide to change the API specification of strcpy so that strings may now overlap. That optimization becomes unsafe and has to be disabled, or at least gated behind a run-time test and alternate codepath, with significant extra costs.

Several times during the debate, you or Bob have claimed that there was no possible performance benefit to forbidding overlapping copies, because of Linus's argument that memmove could be implemented as efficiently as memcpy (at least for the non-overlapping cases). This example from Ronald convincingly refutes that argument. It's a nice performance optimization that is only possible because the length of the string is known not to change, which is only easy to know because the two strings don't overlap. So 25 years ago, the C spec was written to forbid overlapping copies, by declaring them as undefined behavior. And here we see an actual clever compiler optimization that can actually make real-world programs faster, and is only possible because of that restriction. It seems to me to be a clear and convincing demonstration of the value that such restrictions contribute to the possible performance.

OTOH, the must-not-overlap restriction and similar other UB restrictions (signed overflow etc.) do also come with a real-world cost: they confuse programmers, or programmers forget about them, or just accidentally violate them without noticing. And then we get UB and broken programs, which is obviously bad. I'm not trying to claim the optimization benefit necessarily outweighs these bad costs. But I do think the argument that "there is zero benefit" has been convincingly refuted now.
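To show the shape of that transformation, here is a hand-written sketch of it. It is not the compiler's actual output, and it uses three separate buffers rather than Ronald's exact program; the function names are made up. The point is that once the first copy has found the end of the string, the length can be reused for the second copy and for the strlen, but only if none of the buffers overlap.

```c
#define _POSIX_C_SOURCE 200809L   /* stpcpy() is POSIX, not ISO C */
#include <stddef.h>
#include <string.h>

/* As written by the programmer: the string is scanned three times. */
size_t copy_twice_naive(char *b, char *c, const char *a)
{
    strcpy(b, a);
    strcpy(c, b);
    return strlen(b);
}

/* Hand-applied version of the optimization: one scan, one block copy.
 * Valid only under the assumption that a, b and c do not overlap, so
 * that writing c cannot change the contents or length of b. */
size_t copy_twice_fast(char *b, char *c, const char *a)
{
    char *end = stpcpy(b, a);        /* points at b's terminating '\0' */
    size_t len = (size_t)(end - b);  /* length is now known for free */
    memcpy(c, b, len + 1);           /* no second scan needed */
    return len;                      /* and no strlen() call either */
}
```

If overlap were allowed, the memcpy could clobber the string it is reading from and the cached length could be stale, which is why the second form would need a run-time check in front of it.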

bob · Post by **bob** » Thu Dec 12, 2013 6:51 pm

Rein Halbersma wrote:
hgm wrote:
mvk wrote: Once you say "#include <string.h>" the compiler can know the semantics of strcpy and strlen in the code that follows, because <string.h> is standardised. There is no obligation to implement <string.h> with a file for example. (if you want that, you have to say #include "string.h").
Semantics? Do you mean that the library file string.h actually contains the complete definition of strcpy, rather than just a prototype? When I do that for my own header files, it usually leads to "multiply defined symbol" error messages from the linker, when such a header is #included in multiple .c files. Does the new standard now allow multiple definitions of the same routine, provided the definitions are identical? Would that also work for definitions I write myself, or just for those from header files?
You can put function definitions in headers if you prefix them with the inline keyword.

That's irrelevant to the current discussion. There are no macros or anything else in the string.h file on my mac, just the usual prototypes...

hgm · Post by **hgm** » Thu Dec 12, 2013 7:00 pm

These headers contain lots of definitions, of which I only use some. Others I had never heard of (e.g. isxdigit()). The chances of inadvertently picking a name that happens to be reserved certainly are not negligible; the names are often the obvious choice (i.e. it is not like they all begin with __ or zzzzz).

I would rather have the compiler generate an error message in such a case, than having to debug the UB...

I once spent a week debugging a Pascal program that was supposed to plot things, but kept mysteriously crashing. Turned out I had defined a function XCO that also existed in some plot library it linked against, but which required other parameter types. I was not very happy afterwards!

bob · Post by **bob** » Thu Dec 12, 2013 7:04 pm

Rein Halbersma wrote:
hgm wrote: So the 'context' in which strcpy would be a reserved identifier would be any file that does an #include <string.h>, and this allows the optimizer to be certain about what strcpy will do? I guess then there isn't any need for putting a definition (and inlining it) in the header file.
Unlike C++, the C Standard Library does not have templates, so it can have headers that only contain declarations and the Runtime Library will contain the already compiled function definitions that you link against. Inside those definitions, the optimizations can be done (and even at link-time, additional optimizations).

I still don't understand why the Apple compiler generates external calls to inlined functions that do not exist independently. Are Word and Match accidentally standard library functions?
You mentioned error messages containing _Match and _Word, but your code shows Match and Word. If you in fact have _Match and _Word in your code, then it's undefined behavior:

7.1.3 first bullet:
-All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.

You DO realize that ALL procedure names have the _ prepended to their name automatically?

Here is a simple example:

int main(int argc, char* argv[]) {
    char a[16];

    strcpyx(a, "123");   /* strcpyx is not defined anywhere, so linking fails */
}

Here's what happens when I compile and link:

scrappy% gcc -o tst4 tst4.c
Undefined symbols for architecture x86_64:
"_strcpyx", referenced from:
_main in ccdHuRGL.o
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status
scrappy% more tst4.c

This has been the case for as long as I can remember.

bob · Post by **bob** » Thu Dec 12, 2013 7:06 pm

hgm wrote:
Rein Halbersma wrote:You mentioned error messages containing _Match and _Word, but your code shows Match and Word. If you in fact have _Match and _Word in your code, then it's undefined behavior:

7.1.3 first bullet:
-All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
Well, I am doing this from memory now, as the e-mail with the bug report is on another machine. But many C compilers prefix all user-defined identifiers with an underscore, to prevent name clashes with symbols they generate themselves (e.g. jump labels) or assembler mnemonics. The linker doesn't know about C, and its error messages for undefined symbols would use the name of the routines in the compiler-generated assembler. So the C code does use Word and Match, (that I am sure of), but in assembler this becomes _Word and _Match. From what I remember the Apple C compiler also had this habit (like my own (Cygwin) gcc). I don't think the Linux gcc does such prefixing, however.

It has to, and I verified that my gcc 4.7.3 does this. If it did not, you could not use a common library for two different compilers if one prepended _ and one did not. Ditto for inter-language procedure calls which I have done in the past many times (calling C from Fortran, for example).

syzygy · Post by **syzygy** » Thu Dec 12, 2013 7:14 pm

hgm wrote:
syzygy wrote:I don't know what you are talking about, but I am talking about the example I gave. You can compile it yourself with "gcc -O3 -S" and inspect the assembly code.
I was talking about Bob's example, which was more relevant

Not really, because you were responding to a post by Wylie that was definitely about my example.

I cannot know what the assembler was that the optimizer produced in your case.

With an ancient compiler you will indeed get a quite different result. But I have already explained twice in great detail how modern versions of gcc compile this code.

The most efficient, of course, would be to determine the length of the string at compile time, and make use of the known length to inline a dedicated move. Calling strlen() at run time for a string of known length already seems poor code.

What known length? The compiler cannot know with what command line argument the user will invoke the program.

syzygy · Post by **syzygy** » Thu Dec 12, 2013 7:17 pm

hgm wrote:So the bottom line seems to be that they needlessly break backward compatibility by introducing a new, unnatural kind of UB, and have all correct programs pay for their desire to pester old-time code in terms of performance...

This is not a new kind of UB at all.

You are aligning yourself more and more with Bob, I'm afraid.
