syzygy wrote:I don't know what you are talking about, but I am talking about the example I gave. You can compile it yourself with "gcc -O3 -S" and inspect the assembly code.
I was talking about Bob's example, which was more relevant, as it actually was done on an Apple. I suppose you are referring to this one:
Code:
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *a = argv[1];
    char b[256];
    strcpy(b, a);
    strcpy(b+1, b);
    printf("strlen(\"%s\") = %d\n", b, strlen(b));
    return 0;
}
?
Well, I compiled it, and I get:
Code:
Makro@Makro-PC ~
$ gcc -O3 test2.c
Makro@Makro-PC ~
$ ./a.exe
Segmentation fault (core dumped)
Makro@Makro-PC ~
$
As I would have expected. The generated assembly contains no surprises:
Code:
Makro@Makro-PC ~
$ gcc -O3 -S test2.c
Makro@Makro-PC ~
$ cat test2.s
.file "test2.c"
.def ___main; .scl 2; .type 32; .endef
.section .rdata,"dr"
LC0:
.ascii "strlen(\"%s\") = %d\12\0"
.text
.p2align 4,,15
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl $16, %eax
movl %esp, %ebp
pushl %ebx
subl $276, %esp
andl $-16, %esp
call __alloca
call ___main
movl 12(%ebp), %ecx
leal -264(%ebp), %ebx
movl 4(%ecx), %edx
movl %ebx, (%esp)
movl %edx, 4(%esp)
call _strcpy
movl %ebx, 4(%esp)
leal -263(%ebp), %eax
movl %eax, (%esp)
call _strcpy
movl %ebx, %ecx
.p2align 4,,15
L2:
movl (%ecx), %eax
addl $4, %ecx
leal -16843009(%eax), %edx
notl %eax
andl %eax, %edx
movl %edx, %eax
andl $-2139062144, %eax
je L2
andl $32896, %edx
jne L4
shrl $16, %eax
addl $2, %ecx
L4:
movl %ebx, 4(%esp)
addb %al, %al
sbbl $3, %ecx
subl %ebx, %ecx
movl %ecx, 8(%esp)
movl $LC0, (%esp)
call _printf
movl -4(%ebp), %ebx
xorl %eax, %eax
leave
ret
.def _printf; .scl 3; .type 32; .endef
.def _strcpy; .scl 3; .type 32; .endef
Makro@Makro-PC ~
$
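For anyone puzzling over the loop at label L2 in the listing above: the constants -16843009 and -2139062144 are -0x01010101 and 0x80808080 (as a signed 32-bit value), so this looks like the well-known word-at-a-time test for a zero byte, i.e. an inlined strlen(). A minimal C sketch of that test follows; the name has_zero_byte is made up for the illustration and is not anything gcc emits.

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero if any of the four bytes of x
   is zero. Corresponds to the leal/notl/andl/andl sequence in the loop. */
uint32_t has_zero_byte(uint32_t x)
{
    return (x - 0x01010101u)   /* leal -16843009(%eax), %edx */
           & ~x                /* notl %eax ; andl %eax, %edx */
           & 0x80808080u;      /* andl $-2139062144, %eax    */
}
```

For example, has_zero_byte(0x41424344) is zero, while has_zero_byte(0x41420044) is not, because the second-lowest byte is 0.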
I cannot know what assembly the optimizer produced in your case. The most efficient, of course, would be to determine the length of the string at compile time, and use that known length to inline a dedicated move. Calling strlen() at run time for a string of known length already seems poor code.
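To illustrate what using the known length could mean: when the source is a string literal, sizeof gives its size (including the terminating NUL) at compile time, so no strlen() call is needed and the copy can be a fixed-size memcpy(), which a compiler is free to inline. A sketch under that assumption; GREETING and copy_greeting are hypothetical names, not taken from anyone's example in this thread.

```c
#include <string.h>

#define GREETING "hello, world"   /* hypothetical string literal */

/* Copies the literal into dst and returns its length.
   dst must have room for at least sizeof GREETING bytes. */
size_t copy_greeting(char *dst)
{
    memcpy(dst, GREETING, sizeof GREETING);  /* size is a compile-time constant, NUL included */
    return sizeof GREETING - 1;              /* length without the NUL, no strlen() call */
}
```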
Of course one could redefine strcpy() to not have UB in case of overlapping regions. What should be clear is that this would not only have advantages.
This was clear from the beginning: the disadvantage is that you have to test whether the implementation would break on the overlap at hand.
But the advantage of not having to make that test is wholly annihilated in the Apple implementation, as they obviously do such a run-time test anyway. And so far not a single other advantage has been shown here.
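To make that cost concrete, a run-time overlap test for a strcpy()-like function could look roughly like the sketch below. This is not Apple's code, which I have not seen; checked_strcpy is a made-up name, and the comparison goes through uintptr_t on the assumption of a flat address space. Once the test has been paid for, the overlapping case could just as well be handed to memmove() instead of aborting.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical strcpy() replacement that tolerates overlapping regions. */
char *checked_strcpy(char *dst, const char *src)
{
    size_t n = strlen(src) + 1;              /* bytes to copy, NUL included */
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (d < s + n && s < d + n)              /* the run-time overlap test */
        memmove(dst, src, n);                /* defined for overlapping regions */
    else
        memcpy(dst, src, n);                 /* no overlap: plain copy */
    return dst;
}
```

With this, the strcpy(b+1, b) case from the test program becomes well defined: for b holding "abc", checked_strcpy(b + 1, b) leaves "aabc" in b.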
So the bottom line seems to be that they needlessly break backward compatibility by introducing a new, unnatural kind of UB, and make all correct programs pay, in terms of performance, for their desire to pester old-time code...