c/c++ question: comparing/sorting sctructs

Discussion of chess software programming and technical issues.

Moderator: Ras

Dann Corbit
Posts: 12808
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: c/c++ question: comparing/sorting sctructs

Post by Dann Corbit »

trojanfoe wrote:As far as comparison is concerned in C/C++, this has always worked and will generate a call to the compiler's internal 'memcmp' library function:

Code: Select all

struct typex a, b;

initialise a and b;

if (a == b)
{
    do something;
}
The only reason to override the equality operator in C++ is if this comparison is meaningless, which would only be case if you store pointers to sibling, parent or child structs (which is fairly common).
From the C-FAQ:
2.8: Is there a way to compare structures automatically?

A: No. There is not a good way for a compiler to implement
structure comparison (i.e. to support the == operator for
structures) which is consistent with C's low-level flavor.
A simple byte-by-byte comparison could founder on random bits
present in unused "holes" in the structure (see question 2.12).
A field-by-field comparison might require unacceptable amounts
of repetitive code for large structures.

If you need to compare two structures, you'll have to write your
own function to do so, field by field.

References: K&R2 Sec. 6.2 p. 129; Rationale Sec. 3.3.9; H&S
Sec. 5.6.2 p. 133.
trojanfoe

Re: c/c++ question: comparing/sorting sctructs

Post by trojanfoe »

All you need to do it make sure you initialise your structures before use, then the padding bytes are in a known state (0), and the equality operator will work without problem. This is much more efficient than field-by-field comparison as the internal memcmp will generally have been implemented in assembler.
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: c/c++ question: comparing/sorting sctructs

Post by Dirt »

trojanfoe wrote:All you need to do it make sure you initialise your structures before use, then the padding bytes are in a known state (0), and the equality operator will work without problem. This is much more efficient than field-by-field comparison as the internal memcmp will generally have been implemented in assembler.
What would be the best way to initialize the padding bytes?
trojanfoe

Re: c/c++ question: comparing/sorting sctructs

Post by trojanfoe »

memset. I coded a test program, but it looks like gcc doesn't like doing a structure compare, so instead I had to use the memcmp explicitly:

Code: Select all

#include <stdio.h>
#include <string.h>

struct typex
{
    char c;
    const char *s;
};

int main(int argc, char **argv)
{
    struct typex a, b;
    char buffer[] = "Hello";

    printf("sizeof(struct typex)=%u\n", sizeof(struct typex));

    memset(&a, 0, sizeof(struct typex));
    memset(&b, 0, sizeof(struct typex));

    a.c = 'A';
    a.s = buffer;
    b.c = 'A';
    b.s = buffer;

    printf("first compare: %d\n", memcmp(&a, &b, sizeof(struct typex)));

    memset(&a, 8, sizeof(struct typex));
    memset(&b, 4, sizeof(struct typex));

    a.c = 'A';
    a.s = buffer;
    b.c = 'A';
    b.s = buffer;

    printf("second compare: %d\n", memcmp(&a, &b, sizeof(struct typex)));
    return 0;
}
Under my Linux x86_64 platform, this produced:

Code: Select all

sizeof(struct typex)=16
first compare: 0
second compare: 4
I found the value 4 surprising, but whatever.
Last edited by trojanfoe on Fri Sep 05, 2008 10:30 pm, edited 1 time in total.
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: c/c++ question: comparing/sorting sctructs

Post by Dirt »

trojanfoe wrote:memset. In the above example that would be:

Code: Select all

memset(&a, 0, sizeof(typex));
...
Thanks. I knew that, but I got a little confused.
Dann Corbit
Posts: 12808
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: c/c++ question: comparing/sorting sctructs

Post by Dann Corbit »

trojanfoe wrote:All you need to do it make sure you initialise your structures before use, then the padding bytes are in a known state (0), and the equality operator will work without problem. This is much more efficient than field-by-field comparison as the internal memcmp will generally have been implemented in assembler.
The == comparison simply does not work in the C or C++ language for a struct. It will work in C++ if you define a == operator, but that will require the field by field comparison. For instance:

C:\tmp>type t.c
#include <stdio.h>
#include <string.h>

/*
* Concept source: "Calendrical Calculations"
* by Nachum Dershowitz and Edward M. Reingold
*/
typedef struct datetime {
long long njdate;
int year;
int month;
char *localized_month_name;
int day_of_month;
char *localized_weekday_name;
int hours;
int minutes;
int seconds;
int nanoseconds;
} datetime_type;

int main(void)
{
datetime_type d0 =
{733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
datetime_type d1 =
{733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
if (d0 == d1)
puts("same");
else
puts("different");
if (memcmp(&d0, &d1, sizeof (datetime_type)) == 0)
puts("same");
else
puts("different");
return 0;
}

C:\tmp>type t.cpp
#include <stdio.h>
#include <string.h>

/*
* Concept source: "Calendrical Calculations"
* by Nachum Dershowitz and Edward M. Reingold
*/
typedef struct datetime {
long long njdate;
int year;
int month;
char *localized_month_name;
int day_of_month;
char *localized_weekday_name;
int hours;
int minutes;
int seconds;
int nanoseconds;
} datetime_type;

int main(void)
{
datetime_type d0 =
{733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
datetime_type d1 =
{733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
if (d0 == d1)
puts("same");
else
puts("different");
if (memcmp(&d0, &d1, sizeof (datetime_type)) == 0)
puts("same");
else
puts("different");
return 0;
}

C:\tmp>cl t.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 14.00.50727.42 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

t.cpp
t.cpp(27) : error C2676: binary '==' : 'datetime_type' does not define this operator or a conversion to a type acceptable to the predefined operator

C:\tmp>cl t.c
Microsoft (R) C/C++ Optimizing Compiler Version 14.00.50727.42 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

t.c
t.c(27) : error C2088: '==' : illegal for struct

C:\tmp>


Whereas this version (which is legal C or C++) may output "same" or may output "different" depending upon whether constant folding is in action:

Code: Select all

#include <stdio.h>
#include <string.h>

/* 
*  Concept source: "Calendrical Calculations" 
*  by Nachum Dershowitz and Edward M. Reingold 
*/
typedef struct datetime {
    long long       njdate;
    int             year;
    int             month;
    char           *localized_month_name;
    int             day_of_month;
    char           *localized_weekday_name;
    int             hours;
    int             minutes;
    int             seconds;
    int             nanoseconds;
}   datetime_type;

int             main(void)
{
    datetime_type   d0 =
    {733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
    datetime_type   d1 =
    {733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
    if (memcmp(&d0, &d1, sizeof (datetime_type)) == 0)
        puts("same");
    else
        puts("different");
    return 0;
}
Dann Corbit
Posts: 12808
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: c/c++ question: comparing/sorting sctructs

Post by Dann Corbit »

Dirt wrote:
trojanfoe wrote:memset. In the above example that would be:

Code: Select all

memset(&a, 0, sizeof(typex));
...
Thanks. I knew that, but I got a little confused.
Of course, memset should not be used on a class in C++ (or a struct in C++ that has a constructor).
Further, memset is not guaranteed to initialize pointers to NULL or floating point values to zero in C or C++ {though assignment of zero will do that}.
The idea of using memcmp() on structs to test for equality is a bad idea, as shown by the C-FAQ reference I posted before. Let's examine that citation:
From the C-FAQ:
2.8: Is there a way to compare structures automatically?

A: No. There is not a good way for a compiler to implement
structure comparison (i.e. to support the == operator for
structures) which is consistent with C's low-level flavor.
A simple byte-by-byte comparison could founder on random bits
present in unused "holes" in the structure (see question 2.12).
A field-by-field comparison might require unacceptable amounts
of repetitive code for large structures.

If you need to compare two structures, you'll have to write your
own function to do so, field by field.

References: K&R2 Sec. 6.2 p. 129; Rationale Sec. 3.3.9; H&S
Sec. 5.6.2 p. 133.

K&R2, Sec. 6.2 p. 129 says {in part} "Structures may not be compared."

The Rationale says:
"3.3.9 Equality operators
The Committee considered, on more than one occasion, permitting comparison of structures for equality. Such proposals foundered on the problem of holes in structures. A byte-wise comparison of two structures would require that the holes assuredly be set to zero so that all holes would compare equal, a difficult task for automatic or dynamically allocated variables. (The possibility of union-type elements in a structure raises insuperable problems with this approach.) Otherwise the implementation would have to be prepared to break a structure comparison into an arbitrary number of member comparisons; a seemingly simple expression could thus expand into a substantial stretch of code, which is contrary to the spirit of C.

In pointer comparisons, one of the operands may be of type void *. In particular, this allows NULL, which can be defined as (void *)0, to be compared to any object pointer."

I can't find my H&S right now.

See also:
http://learningcppisfun.blogspot.com/20 ... emcmp.html
https://www.securecoding.cert.org/confl ... B7EB5C3D6A
trojanfoe

Re: c/c++ question: comparing/sorting sctructs

Post by trojanfoe »

Dann Corbit wrote: The == comparison simply does not work in the C or C++ language for a struct.
You are right - I must admit I had confused the equality operator with the assignment operator, which does work fine. :oops:

However if the structures are large and performance is important (which it is in most parts of a chess engine) and there are no pointers within the structure and you are able to guarentee that you always initialise the structure before use, then I have already demonstrated that memcmp can be used. Quite alot of conditions, I know.

If your structure is large and does contain pointers then you are forced to implement an equality function to perform the test (operator == in C++), but this can still use memcmp if the struct elements are large.

'goto' is considered bad but people still use it (not me I might add) - you are free to write code any way you please regardless of what a group of experts consider to be bad or not. The only time this will cause a problem is during peer review within a company or by an instructor in an academic environment. Everywhere else you are home free.

Vive la difference!
trojanfoe

Re: c/c++ question: comparing/sorting sctructs

Post by trojanfoe »

Dann Corbit wrote:

Code: Select all

#include <stdio.h>
#include <string.h>

/* 
*  Concept source: "Calendrical Calculations" 
*  by Nachum Dershowitz and Edward M. Reingold 
*/
typedef struct datetime {
    long long       njdate;
    int             year;
    int             month;
    char           *localized_month_name;
    int             day_of_month;
    char           *localized_weekday_name;
    int             hours;
    int             minutes;
    int             seconds;
    int             nanoseconds;
}   datetime_type;

int             main(void)
{
    datetime_type   d0 =
    {733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
    datetime_type   d1 =
    {733290, 2008, 9, "September", 5, "Friday", 17, 56, 23, 1234567};
    if (memcmp(&d0, &d1, sizeof (datetime_type)) == 0)
        puts("same");
    else
        puts("different");
    return 0;
}
Indeed, but this falls outside of the conditions I have mentioned:
1. The structure hasn't been initialised to zero.
2. It contains pointers.

However for the last point it will probably work if the compiler is set to merge identical string contants only to then probably fail if stuctures are initialised in different c modules (object files) as this merging is most likely done on a per-module basis.

I don't think I would advocate the use of memcmp in this case.
User avatar
Bo Persson
Posts: 262
Joined: Sat Mar 11, 2006 8:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Re: c/c++ question: comparing/sorting sctructs

Post by Bo Persson »

trojanfoe wrote:
Dann Corbit wrote: The == comparison simply does not work in the C or C++ language for a struct.
You are right - I must admit I had confused the equality operator with the assignment operator, which does work fine. :oops:

However if the structures are large and performance is important (which it is in most parts of a chess engine) and there are no pointers within the structure and you are able to guarentee that you always initialise the structure before use, then I have already demonstrated that memcmp can be used. Quite alot of conditions, I know.

If your structure is large and does contain pointers then you are forced to implement an equality function to perform the test (operator == in C++), but this can still use memcmp if the struct elements are large.
Why don't you trust your compiler?

Just implement an operator== as a memberwise compare. If memcmp is possible, and actually more efficient, don't you think the compiler will recognize that and optimize the code?


Bo Persson