UCI options case insensitive ?


mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: UCI options case insensitive ?

Post by mcostalba »

Don wrote: I cannot believe you posted this
Yes this is really incredible :-)

I didn't read your other post, and I noticed this Stockfish _bug_ only this afternoon. It is really amazing because this case-insensitivity rule has been out there for years, and it just so happens that only this afternoon I noticed it and only this afternoon you posted your very strict related comment.

I guess if someone calculated the probability of such an event, it would be really, really low ;-)

Anyhow, in SF the options are stored in a standard std::map, so I needed to add something like the following:

Code:

// Custom comparator because UCI options should not be case sensitive
struct CaseInsensitiveLess {
    bool operator() (const std::string&, const std::string&) const;
};

typedef std::map<std::string, Option, CaseInsensitiveLess> OptionsMap;
and here is the actual implementation, it seems correct after a quick test:

Code:

// Our case insensitive Less() function as required by UCI protocol
bool CaseInsensitiveLess::operator() (const string& s1, const string& s2) const {

    int c1, c2;
    size_t i = 0;

    while (i < s1.size() && i < s2.size())
    {
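        // Cast to unsigned char: tolower() is undefined for negative char values
        c1 = tolower(static_cast<unsigned char>(s1[i]));
        c2 = tolower(static_cast<unsigned char>(s2[i++]));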

        if (c1 != c2)
            return c1 < c2;
    }
    return s1.size() < s2.size();
}

This is not a full general-purpose solution because it handles only the ASCII character set. A really complete solution should handle locales properly, but that becomes really complex... I don't know if it is required. What does the UCI protocol mandate regarding the character set used by options?
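To see what the comparator changes in practice, here is a minimal sketch of a lookup through such a map. The `Option` struct below is a hypothetical stand-in (not the real Stockfish class), and `lookup()` and `make_options()` are names invented for this illustration. The comparator is the same ASCII-only ordering as above, inlined so the example compiles on its own.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Same ASCII-only ordering as in the post, with the unsigned char cast
// that std::tolower() needs for well-defined behaviour.
struct CaseInsensitiveLess {
    bool operator()(const std::string& s1, const std::string& s2) const {
        std::size_t i = 0;
        while (i < s1.size() && i < s2.size()) {
            int c1 = std::tolower(static_cast<unsigned char>(s1[i]));
            int c2 = std::tolower(static_cast<unsigned char>(s2[i]));
            ++i;
            if (c1 != c2)
                return c1 < c2;
        }
        return s1.size() < s2.size();
    }
};

// Hypothetical stand-in for the engine's real Option class.
struct Option {
    std::string value;
};

typedef std::map<std::string, Option, CaseInsensitiveLess> OptionsMap;

// Returns the option's value, or an empty string when the name is unknown.
std::string lookup(const OptionsMap& options, const std::string& name) {
    OptionsMap::const_iterator it = options.find(name);
    return it == options.end() ? std::string() : it->second.value;
}

// Builds a tiny map for illustration.
OptionsMap make_options() {
    OptionsMap options;
    options["Hash"].value = "128";
    options["hash"].value = "64";   // equivalent key: overwrites, does not insert
    assert(options.size() == 1);    // still a single entry
    return options;
}
```

With this ordering, "Hash", "hash" and "HASH" are equivalent keys, so `lookup(make_options(), "HASH")` finds the single stored entry. The key keeps the spelling it was first inserted with.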
ilari
Posts: 750
Joined: Mon Mar 27, 2006 7:45 pm
Location: Finland

Re: UCI options case insensitive ?

Post by ilari »

mcostalba wrote:This is not general purpose full solution because it handles only ASCII character set. A real complete solution should properly handle locales, but it becomes really complex.....I don't know if it is required....what mandates UCI protocol regarding character set to be used by options ?
I don't think Unicode was even a consideration when UCI was designed, so you're probably safe with ASCII. And in many languages there is no concept of uppercase or lowercase.
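If ASCII really is enough, one option is to fold only 'A'..'Z' by hand, since the standard tolower() consults the current C locale. The sketch below assumes an ASCII-compatible execution character set (letters contiguous); `ascii_lower()`, `ascii_iequals()` and `example()` are hypothetical names, not taken from any engine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Folds only 'A'..'Z'; every other byte, including non-ASCII bytes,
// is returned unchanged, whatever the current locale is.
inline char ascii_lower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Case-insensitive equality for ASCII option names.
bool ascii_iequals(const std::string& a, const std::string& b) {
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (ascii_lower(a[i]) != ascii_lower(b[i]))
            return false;
    return true;
}

// Small usage example.
void example() {
    assert(ascii_iequals("Hash", "hASH"));
    assert(!ascii_iequals("Hash", "Hash "));  // trailing space is significant
}
```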
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: UCI options case insensitive ?

Post by Houdini »

mcostalba wrote:

Code:

// Our case insensitive Less() function as required by UCI protocol
bool CaseInsensitiveLess::operator() (const string& s1, const string& s2) const {

    int c1, c2;
    size_t i = 0;

    while (i < s1.size() && i < s2.size())
    {
        c1 = tolower(s1[i]);
        c2 = tolower(s2[i++]);

        if (c1 != c2)
            return c1 < c2;
    }
    return s1.size() < s2.size();
}
Is there any reason why you don't use the standard stricmp() function?
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: UCI options case insensitive ?

Post by michiguel »

ilari wrote:
Don wrote:It's not what we are used to seeing, but it's a complete correct specification because the delimiters are the tokens "option", "name" and "value" so it's fully specified. What IS broken is there is no mention of whitespace at the beginning or end of tokens. The standard should state that the tokens be trimmed of whitespace at the start and end. I hate that many spaces could be inside of tokens but that is not an issue for graphical user interfaces. I sometimes use UCI manually and it can be awkward because of the whitespace issue.
I agree that the specification is complete. But it's still very annoying and inefficient from a GUI developer's perspective. If quotes were required around strings that contain whitespace, parsing "info" and "option" lines would be a lot easier and faster. Now the GUI has to compare every whitespace-delimited token (eg. every move in a principal variation) to each of the keywords, which in the case of "info", are many.
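The keyword-delimited parsing described here can be sketched roughly as follows for "setoption". This is only an illustration under stated assumptions: `SetOption`, `parse_setoption()` and `example()` are hypothetical names, the keywords are matched case-sensitively, and runs of whitespace inside a name are collapsed to a single space, which also takes care of leading and trailing whitespace.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch.
struct SetOption {
    std::string name;
    std::string value;
};

// Parses "setoption name <id> [value <x>]" by comparing every
// whitespace-delimited token against the keywords.
bool parse_setoption(const std::string& line, SetOption& out) {
    std::istringstream is(line);
    std::string token;

    if (!(is >> token) || token != "setoption")
        return false;
    if (!(is >> token) || token != "name")
        return false;

    out.name.clear();
    out.value.clear();

    // Everything up to the keyword "value" belongs to the name.
    while (is >> token && token != "value")
        out.name += (out.name.empty() ? "" : " ") + token;

    // Everything after "value" belongs to the value.
    while (is >> token)
        out.value += (out.value.empty() ? "" : " ") + token;

    return !out.name.empty();
}

// Small usage example.
void example() {
    SetOption opt;
    assert(parse_setoption("  setoption  name Clear   Hash value 128 ", opt));
    assert(opt.name == "Clear Hash" && opt.value == "128");
}
```

Note the limitation that comes with this grammar: an option whose name contains the word "value" cannot be expressed, which is the kind of problem quoting would avoid.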

In my view it would have been better for UCI to have 3 tokens, one as a description for display only, one with no white space allowed to represent the internal name of the token and then the value. OR perhaps the description is to be used only as an ADDITIONAL element for GUI's, something that can be used for help or for any purpose the GUI designer wants or is ignored. It could be one of those boxes that come to life when you drag the mouse over it.
Yes, it would be SO much better if there was a separate token for the description. I guess UCI was designed to be easy to read by humans, despite the fact that it's meant to be used for communication between a GUI and an engine. UCI is the COBOL of chess engine communication protocols.
Absolutely! I was trying to push this idea for the WinBoard protocol before it was specified, but it did not happen. In addition (though not strictly necessary), it would have been more robust if the grammar had followed what is used for command-line arguments in the setoption library. A GUI writer could then have parsed the whole thing with those available libraries, if desired.
You send
feature option --display "Memory allotted" --variable "hash" --help "Value in Megabytes" --default 64
and you get back
option hash 64

And when you pass the mouse over "Memory allotted" you see "Value in Megabytes". Of course, how this is treated is up to the GUI.

Miguel
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: UCI options case insensitive ?

Post by mcostalba »

Houdini wrote: Is there any reason why you don't use the standard stricmp() function?
Yes, I missed that ;-)

Thanks for the hint !


.....wait...correction....it seems stricmp() is _not_ standard, so I cannot use that :-(


P.S: Standard string functions I think are only the following:
http://www.elook.org/programming/c/stdstring.html
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: UCI options case insensitive ?

Post by Don »

mcostalba wrote:
Houdini wrote: Is there any reason why you don't use the standard stricmp() function?
Yes, I missed that ;-)

Thanks for the hint !


.....wait...correction....it seems stricmp() is _not_ standard, so I cannot use that :-(


P.S: Standard string functions I think are only the following:
http://www.elook.org/programming/c/stdstring.html
I thought strcasecmp() is standard.
uaf
Posts: 98
Joined: Sat Jul 31, 2010 8:48 pm
Full name: Ubaldo Andrea Farina

Re: UCI options case insensitive ?

Post by uaf »

AFAIK stricmp() is neither ANSI nor POSIX while strcasecmp() is POSIX
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: UCI options case insensitive ?

Post by Sven »

Houdini wrote:
mcostalba wrote:

Code:

// Our case insensitive Less() function as required by UCI protocol
bool CaseInsensitiveLess::operator() (const string& s1, const string& s2) const {

    int c1, c2;
    size_t i = 0;

    while (i < s1.size() && i < s2.size())
    {
        c1 = tolower(s1[i]);
        c2 = tolower(s2[i++]);

        if (c1 != c2)
            return c1 < c2;
    }
    return s1.size() < s2.size();
}
Is there any reason why you don't use the standard stricmp() function?
This is also what I thought when reading Marco's implementation. However, stricmp() is not part of the C++ standard so portability is affected. For instance, POSIX defines strcasecmp() but not stricmp(). MSVC++ 2005 and later versions even declare stricmp() as "deprecated" and propose the (also Microsoft specific) _stricmp() instead.

The following could be an acceptable alternative to writing your own code:

Code:

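#include <cstring>
#ifdef _MSC_VER
#define strcasecmp _stricmp   // Microsoft-specific equivalent
#else
#include <strings.h>          // POSIX declares strcasecmp() here
#endif /* _MSC_VER */

// Our case insensitive Less() function as required by UCI protocol
bool CaseInsensitiveLess::operator() (const string& s1, const string& s2) const {
    // Compare the result with 0 outside the call, not inside the argument list
    return strcasecmp(s1.c_str(), s2.c_str()) < 0;
}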
But since we are outside the standard here, one would have to live with any future changes in the behaviour of proprietary implementations such as Microsoft's _stricmp(), which could be an argument for maintaining your own code.
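For completeness, a variant that stays entirely inside the standard library might look like the sketch below. It swaps the hand-written loop for std::lexicographical_compare with a per-character predicate. `char_less_nocase()` and `example()` are hypothetical names, and the comparison is still ASCII-oriented because it relies on tolower() in the default locale.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Per-character "less than" that ignores case. The cast to unsigned char
// keeps std::tolower() well defined for bytes above 127.
static bool char_less_nocase(char a, char b) {
    return std::tolower(static_cast<unsigned char>(a))
         < std::tolower(static_cast<unsigned char>(b));
}

struct CaseInsensitiveLess {
    bool operator()(const std::string& s1, const std::string& s2) const {
        // A shorter string that is a prefix of the other sorts first,
        // matching the behaviour of the hand-written loop.
        return std::lexicographical_compare(s1.begin(), s1.end(),
                                            s2.begin(), s2.end(),
                                            char_less_nocase);
    }
};

// Small usage example.
void example() {
    CaseInsensitiveLess cmp;
    assert(!cmp("Hash", "hash") && !cmp("hash", "Hash"));  // equivalent keys
    assert(cmp("Hash", "Ponder"));
}
```

No platform #ifdef is needed this way, at the cost of not reusing an existing C library routine.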

Sven
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: UCI options case insensitive ?

Post by bob »

ilari wrote:
Houdini wrote:In my opinion the specification makes sense, it would be silly to allow "setoption name Hash value 128" and not "setoption name hash value 128".
Houdini follows the specification and is not case sensitive for UCI options.
Case-insensitivity would make sense if the engines were meant to be operated directly by humans. But for communication with another process it just complicates things.
Not to mention mixing apples and oranges. Clearly, SAN is case-sensitive, as it should be. If a program wants to accept Nf3 or nf3, that is OK, but output should always be Nf3. Why have case sensitivity in the moves, but not in the commands?
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: UCI options case insensitive ?

Post by Don »

bob wrote:
ilari wrote:
Houdini wrote:In my opinion the specification makes sense, it would be silly to allow "setoption name Hash value 128" and not "setoption name hash value 128".
Houdini follows the specification and is not case sensitive for UCI options.
Case-insensitivity would make sense if the engines were meant to be operated directly by humans. But for communication with another process it just complicates things.
Not to mention mixing apples and oranges. Clearly, SAN is case-sensitive, as it should be. If a program wants to accept Nf3 or nf3, that is OK, but output should always be Nf3. Why have case sensitivity in the moves, but not in the commands?
That is a good point. I think people's viewpoints on this are probably strongly affected by the OS they develop software with (or use themselves). People with Unix backgrounds are used to consistent, regular behavior and clean abstractions and would expect this, but Windows users are more used to having to do everything differently.