gcc / clib bug?

Discussion of chess software programming and technical issues.

Moderator: Ras

syzygy
Posts: 5747
Joined: Tue Feb 28, 2012 11:56 pm

Re: gcc / clib bug?

Post by syzygy »

Evert wrote:
hgm wrote:The error disappears when I replace name by name[2] in the struct defenition, which suggest the allocated space is one byte short. Is strcpy somehow using a byte it should not use, at the end of the string?
Have you tried name[1]? That seems more reasonable to me (and avoids the ugly & when passing the element), and I suppose it might just be possible that the compiler gets confused because it assumes that name is a scalar rather than the first element of an array (no idea what type of optimisation it would do to get confused by though).
It shouldn't make a difference if the code is not compiled with funny bound-checking options, but I agree name[1] would make the code look cleaner.

But why not use a "flexible array member" (name[]) as defined by ISO C99:
As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.
I am guessing this would even work with the funny bound checking introduced by -D_FORTIFY_SOURCE.
User avatar
hgm
Posts: 28393
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: gcc / clib bug?

Post by hgm »

Sven Schüle wrote:If you declare q as a pointer to 'struct y' then, according to typing concept, the expression q->name should be impossible without casting q to 'struct x *' since 'struct y' has no member 'name'. So I don't know what you intended to write.

I think that crashing is really the best that can happen here, if it compiles at all. Anything else would be disappointing for me ;-)

Fighting bad programming practice regards,
Sven
Oh sorry, this is again my bad. The code as I gave it would of course draw compile-time warnings for assigning a pointer of one type to that of another type. The casts I wrote in the assignments to p and q served no other purpose than to avoid such warnings.

In fact they serve no other purpose. Casting one pointer type to another before assigning does not translate to any actual code, it is just a syntactic device to satisfy the type checking. In all cases the pointer would be copied unchanged to its destination.

So what I really intended to write is the following:

Code: Select all

struct x { char name; };
struct y { char longName[1000]; }

struct x *p;
struct y *q;

p = (struct x *) malloc(sizeof(struct x) + 7);
q = (struct y *) p;
strcpy(&(q->name), "blablah"); 
I consider the case for this NOT violating any standards ironclad:

What malloc returns can be casted to (char*) for which the first N (=sizeof(struct x)+7) elements should be accessible. &(q->name) is a (char*) with a value that is precisely defined in the layout rules for structs. These define it to be a pointer to a hypothetical array of char starting at the same address. So what is passed to strcpy is merely a pointer to an array element of a bigger array (i.e. with enough elements behind it to allow the pointer to be incremented inside strcpy).
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc / clib bug?

Post by Sven »

hgm wrote:So what I really intended to write is the following:

Code: Select all

struct x { char name; };
struct y { char longName[1000]; }

struct x *p;
struct y *q;

p = (struct x *) malloc(sizeof(struct x) + 7);
q = (struct y *) p;
strcpy(&(q->name), "blablah"); 
I consider the case for this NOT violating any standards ironclad:

What malloc returns can be casted to (char*) for which the first N (=sizeof(struct x)+7) elements should be accessible. &(q->name) is a (char*) with a value that is precisely defined in the layout rules for structs. These define it to be a pointer to a hypothetical array of char starting at the same address. So what is passed to strcpy is merely a pointer to an array element of a bigger array (i.e. with enough elements behind it to allow the pointer to be incremented inside strcpy).
Still q->name would not compile IMO (I did not test but it looks wrong). 'q' is of type 'pointer to struct y' and 'struct y' does not have a member 'name'. You would need to write

Code: Select all

strcpy(&(((struct x *)q)->name), "blablah");
which would add another infinite number of levels of unreadability.

Of course my example from my other post does not compile as well, and if it did then it would go wrong since it would do pointer arithmetic ... I guess everyone can see what I meant, it "only" needs some additional type casts. I also did not intend to say that this would be "the optimal" solution, just "a possibly slightly better one". From design perspective "close to optimal" might be to explicitly store the address of the additional string in a "char * name" struct member, and malloc() the string itself. There is some small runtime cost for this but if that is not an issue (which I expect for such an application) then it should be done that way, instead of "pasting" the string directly behind the struct and allocating both at once.

I know that maintaining existing software which does those "dirty" tricks is another issue, there it is sometimes necessary to keep the dirt and only make it as safe as possible, in order not to break the program.

Sven
User avatar
hgm
Posts: 28393
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: gcc / clib bug?

Post by hgm »

Sven Schüle wrote:Still q->name would not compile IMO (I did not test but it looks wrong). 'q' is of type 'pointer to struct y' and 'struct y' does not have a member 'name'.
You are of course perfectly right; this would actually be a compile error than a warning. I seem to have a lot of trouble getting this right... :cry:

What I intended was really this:

Code: Select all

struct x { char name; };
struct y { char longName[1000]; }

struct x *p;
struct y *q;

p = (struct x *) malloc(sizeof(struct x) + 7);
q = (struct y *) p;
strcpy(&(q->longName), "blablah"); 
where I do the cast from p to q solely to exclude any possibility that strcpy could know it is working on a data structure with only a single char (e.g. if it would somehow be in-lined, or if strcpy would really be a macro, so that the optimizer can see both the code that is inside strcpy and the way the pointer to the destination is derived). This was in fact the sole reason for introducing the (struct y) type and q pointer in the first place: so that I could syntactically 'disguise' the (char*) passed to strcpy as a (char*) into a (char[1000]) rather than a (char*) to a (char). (As if the standard would somhow allow you to think these are different, despite the fact that they are both (char*).)
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: gcc / clib bug?

Post by AlvaroBegue »

Please, pass your code through a compiler before posting it. It is annoying to keep having to ignore little mistakes and try to figure out what you really meant to write.

Code: Select all

#include <string.h>
#include <stdlib.h>

struct x { char name; }; 
struct y { char longName[1000]; };

struct x *p; 
struct y *q; 

int main() {
  p = (struct x *) malloc(sizeof(struct x) + 7); 
  q = (struct y *) p; 
  /* strcpy(&p->name, "blablah"); <--- This reproduces the "fortify" error */
  strcpy(q->longName, "blablah");
  
  return 0;
}
That code is probably conformant.
bnemias
Posts: 373
Joined: Thu Aug 14, 2008 3:21 am
Location: Albuquerque, NM

Re: gcc / clib bug?

Post by bnemias »

hgm wrote:

Code: Select all

 struct t_tree {
    struct t_tree *left, *right;
    char name;                    /* Not just 1 char - space for whole name */
  };                              /* is allocated.  Maybe a little cheesy? */
  
  struct t_dirs {
    struct t_dirs *left, *right;
    time_t mtime;                 /* dir's modification time */
    struct t_tree *files;
    char name;                    /* ditto */
  };

// in searc_directory():

      if (!*i) {                          /* if dir isn't in dir tree, add him */
        *i = malloc(sizeof(struct t_dirs) + strlen(dir));
        (*i)->left = (*i)->right = NULL;
        (*i)->files = NULL;
        strcpy(&(*i)->name, dir);
      }

The ICS code, when compiled with gcc 4.6.3 on Ubuntu 12.04 crashes. I could trace the problem to the above code. Although the code is very ugly, using a field defined as scalar char as if it is a string, it doesn't seem really in error: the struct as declared could hold the terminating 0-byte, and it is allocated in such a way that it is guaranteed there is enough empty space behind it to hold the entire string (which happened to be "./players/m" at the point where it crashed).

So how come the code crashes, with the OS complaint of "buffer overflow"?

Code: Select all

*** buffer overflow detected ***: bin/chessd terminated
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x45)[0xb770c045]
/lib/i386-linux-gnu/libc.so.6(+0x102e1a)[0xb770ae1a]
/lib/i386-linux-gnu/libc.so.6(+0x10214d)[0xb770a14d]
./lib/chessd.so(search_directory+0xe5)[0xb75b8385]
./lib/chessd.so(+0x17b14)[0xb7585b14]
./lib/chessd.so(process_input+0x62b)[0xb7586eab]
./lib/chessd.so(select_loop+0x9fb)[0xb75a9abb]
bin/chessd(main+0x3ea)[0x8048eba]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb76214d3]
bin/chessd[0x8048fa1]
The error disappears when I replace name by name[2] in the struct defenition, which suggest the allocated space is one byte short. Is strcpy somehow using a byte it should not use, at the end of the string?
I don't need to know the specific reason it crashed. It's poor coding practice. Why not just have name be a pointer, and allocate it separately as has been done since forever for this type of thing? There's even a string duplication function in that project iirc.

The comment about it being cheesy is right... you've got to wonder why someone went to that trouble instead of just doing it correctly. Make a cover function for creating a new node that includes the directory as an argument, and then make another one that properly frees the node along with its copied string. Then the code that calls those functions is significantly clearer too.
syzygy
Posts: 5747
Joined: Tue Feb 28, 2012 11:56 pm

Re: gcc / clib bug?

Post by syzygy »

bnemias wrote:I don't need to know the specific reason it crashed. It's poor coding practice. Why not just have name be a pointer, and allocate it separately as has been done since forever for this type of thing? There's even a string duplication function in that project iirc.
Separately allocating the string is less efficient and the reason to code in C is often to not have to put up with such inefficiencies. Why should we need to call malloc() twice AND have an extra layer of indirection if the obvious solution at the machine level is to place the string right after the other members of the struct?

The fact that the code uses a kind of language hack and seems to rely on compiler behaviour not dictated by the standard (but which is otherwise entirely reasonable) is a bit more serious. This also appears to be the reason for the crash: the -D_FORTIFY_SOURCE compilation option does some range checking and intentionally crashes the program because it writes past the end of a struct.

The proper solution imho is to be ISO C99 compliant and use a flexible array member. Flexible array members were designed exactly for this purpose.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: gcc / clib bug?

Post by AlvaroBegue »

That sounds like the way I would have thought about it 12 years ago, before I started working with C++. I am not a C++ fan boy, as there are lots of things I dislike in that language. But I don't think I could live without std::string and some of the standard-library containers.

My guess is that all that code could be replaced with a few lines of C++ using std::map and std::string, with no performance problems and with fewer chances to make mistakes.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc / clib bug?

Post by Sven »

syzygy wrote:
bnemias wrote:I don't need to know the specific reason it crashed. It's poor coding practice. Why not just have name be a pointer, and allocate it separately as has been done since forever for this type of thing? There's even a string duplication function in that project iirc.
Separately allocating the string is less efficient and the reason to code in C is often to not have to put up with such inefficiencies. Why should we need to call malloc() twice AND have an extra layer of indirection if the obvious solution at the machine level is to place the string right after the other members of the struct?
Strictly speaking, you are right. But that is, as you say, thinking on "machine level", it is "assembler thinking". And it is also in fact a "micro-optimization": you avoid to explicitly store the address of a string by storing the string directly beyond the end of other data, and therefore you are able to allocate those data and the string together with one malloc(). The whole solution saves a couple of CPU cycles in total.

Such optimizations have a benefit but also a cost. The cost depends on many factors, where one of them is whether that code remains "private" or needs to be maintained by different programmers in the future. No need to further explain that. Whether these costs shall be accepted or not depends on the benefit: if tiny performance losses are very critical in the given application then you will probably choose the "risk" and try to squeeze out as much of speed as you can. If, however, the performance difference is neglible then I would ALWAYS go for the safer way, safer in the sense of being easier to understand for everyone (not just myself) and therefore easier to be maintained and possibly changed in the future.

Focussing only on efficiency is not the optimal strategy in my opinion.

Sven
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: gcc / clib bug?

Post by Evert »

Sven Schüle wrote: Strictly speaking, you are right. But that is, as you say, thinking on "machine level", it is "assembler thinking". And it is also in fact a "micro-optimization": you avoid to explicitly store the address of a string by storing the string directly beyond the end of other data, and therefore you are able to allocate those data and the string together with one malloc(). The whole solution saves a couple of CPU cycles in total.
This is a poor example, but in general there is an entirely different reason for allocating memory in one big chunk rather than calling malloc() several times: having all the data in a single continuous block of memory, which can be important for caching reasons.

This specific example doesn't fall under that heading though and looks more like being clever for the sake of being clever...