Possibly massive time management bugs in Arena

mvanthoor · Post by **mvanthoor** » Wed Nov 04, 2020 9:37 pm

Hi

The last few days I've been testing some stuff regarding my engine's time management. I've extended the simple time management with some of the things HGM mentioned in Rustic's progress topic (thanks again HGM) and some of my own ideas.

Then I set to testing the engine in Arena, at the time control of 2 minutes per game, +1 second increment, which CCRL currently uses for Blitz matches.

Let's just say it didn't go well. I've been testing against Dreamer 0.3, because it's on the CCRL-list at around 1600 Elo, and my engine keeps falling between 1600-1700 Elo in short matches against other engines, so I intend to start testing around the 1600-1650 mark.

Rustic's been losing almost every game on time.... except... in the end, it turns out it actually doesn't. Dreamer switches to a super-fast time control when it gets under 1 minute or so, almost moving instantly. I've been delaying doing this. For example (these things can change in the next few days):

1. I use about 0.3x of the allotted move time in iterative deepening. After that time, I abort the ID run.
2. The a/b search function can overshoot up to 3x the allotted time, if it happens to start running because 1. didn't cut it off. The less time the search has, the lower this factor becomes.
3. When a certain MIN_TIME on the clock has been hit (10 seconds for now), the search stops exceeding the allotted time ("overshoot factor" = 1).
4. If no "moves to go" is given by UCI, the engine assumes that the game is 60 moves long.
5. It calculates its own moves to go as 60 - moves_played + 10 (to avoid spending too much time on the last few moves).
6. If the game is still going after 60 moves, the engine will always return 100 "moves to go", so the time/move will shorten with each move, and it'll be very fast.
7. If a critical time has been hit (2 seconds for now), and "increment" is being set, the search doesn't even consider "moves to go" and the clock time anymore. The moves available base-time is set to 0, and only the given increment will be used for searching. So Rustic should never drop under 2 seconds on the clock.
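
As a rough sketch of points 4-7 above (this is not Rustic's actual code; the constant and function names are made up for illustration, and the "clock / moves to go + increment" formula is an assumption about how the budget is combined):

```rust
/// Assumed game length when the GUI sends no "movestogo" (point 4).
const GAME_LENGTH: u64 = 60;
/// Safety margin added to the estimate (point 5).
const MOVES_BUFFER: u64 = 10;
/// Fixed estimate once the game outlasts GAME_LENGTH (point 6).
const LATE_GAME_MTG: u64 = 100;
/// Clock time in ms below which only the increment is used (point 7).
const CRITICAL_TIME: u64 = 2_000;

/// Estimate how many moves are still to be played.
pub fn moves_to_go(uci_mtg: Option<u64>, moves_played: u64) -> u64 {
    match uci_mtg {
        // Trust the GUI if it sent a usable "movestogo".
        Some(mtg) if mtg > 0 => mtg,
        // Otherwise count down towards the assumed game length.
        _ if moves_played < GAME_LENGTH => GAME_LENGTH - moves_played + MOVES_BUFFER,
        // Past the assumed length: a large constant keeps each move short.
        _ => LATE_GAME_MTG,
    }
}

/// Time budget for one move, in milliseconds.
pub fn allotted_ms(clock: u64, increment: u64, mtg: u64) -> u64 {
    if clock <= CRITICAL_TIME && increment > 0 {
        // Critical time reached: base time is 0, live on the increment.
        increment
    } else {
        clock / mtg.max(1) + increment
    }
}
```

With these assumptions, the first move of a 2m+1s game gets 120000 / 70 + 1000 = 2714 ms, and a move played with 1.5 seconds left on the clock gets exactly the 1000 ms increment.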

Still, Rustic is losing on time in Arena. I've dialed back the max_time factor in iterative deepening to 0.1x, so it only uses 0.1 times the amount of time allotted for the move. I've disabled the overshoot factor. I've set "moves to go" permanently to 100. Because Dreamer switches to moving almost instantly when it gets under a minute (and it has a hash table, which I don't), if it hasn't lost the game already by then, it only has to hold out on time and it'll win.

I got suspicious, and I put several "info string" prints into the code, to inform me about the allotted time per move, moves_to_go, moves_made, elapsed time where Rustic either aborts the search or stops iterative deepening, and so on... and everything checks out.

Very often, the "info string" comes in: "Iterative deepening aborted after 852 ms", or "search time up at 2572 ms (allotted 2572 ms)". Even so, I can still SEE Arena's clock ticking for 1-2 seconds before it actually finishes making the move and switches to the other engine. (It could also be specific to UCI-engines: Rustic is UCI for now, while Dreamer is XBoard.)

So today, I have restored my code to the point where it was two days ago, roughly as outlined in points 1-7 above, and installed CuteChess 1.20 (upgrading from my old 1.0). I've run a 20 game match between Dreamer and Rustic: GM2001 opening book (only 4 moves per side), no adjudication whatsoever except call it a draw if a game exceeds 250 moves.

As it turns out, Rustic completely obliterates Dreamer 0.3 in most games. It can keep up in search depth despite not having a hash table, and it often plays the entire game up to and including mating Dreamer in less than 45 seconds, so it still has 1 minute 15s on the clock when the game is over. (So I can actually slow down the time control management a lot, it seems...) If a game goes beyond 60 moves, where Rustic starts returning 100 moves to go all the time, or reaches the point where it stops draining its clock by using only the increment, it can actually GAIN TIME on the clock, by getting that 1 second increment and then not using all of it.

In fact, Rustic hasn't lost a single game on time. It wins most of them with a HUGE plus margin under CuteChess... the only games it loses against Dreamer 0.3 are the ones where it is up +0.05, and then does something exceedingly stupid like giving the opponent a passed pawn (which it doesn't know about yet) trying to avoid a draw... which then costs it the game. I have to make a provision for that. I'll think about that later. (But I can live with this: it is expected behavior for now.)

In the end, it turns out that Rustic behaves _exactly_ as designed when running under CuteChess, and in Arena, it doesn't, because the GUI subtracts time even after the engine has already stopped searching and provided its move, eventually getting the engine into time trouble.

Sorry. This has become something of a rant, because I feel I have wasted three evenings in tests that weren't necessary. I'm not going to use Arena for testing in matches. (I was planning on CuteChess anyway, as it can run many games at the same time.)

mar · Post by **mar** » Wed Nov 04, 2020 10:40 pm

I've never actually used increment under Arena, but I think you should use another "GUI" for engine-engine tests
Also don't forget you get the increment after the move, not before

speaking of Arena bugs, it doesn't really support Chess960 (rejects legal castling in some positions) and doesn't even honor UCI spec wrt Chess960 castling, so I've recently removed Arena-related hacks in my engine completely

mvanthoor · Post by **mvanthoor** » Wed Nov 04, 2020 11:09 pm

mar wrote: ↑Wed Nov 04, 2020 10:40 pm I've never actually used increment under Arena, but I think you should use another "GUI" for engine-engine tests
Also don't forget you get the increment after the move, not before

speaking of Arena bugs, it doesn't really support Chess960 (rejects legal castling in some positions) and doesn't even honor UCI spec wrt Chess960 castling, so I've recently removed Arena-related hacks in my engine completely

Thanks. I'm not going to provide any hacks in my engine with regard to chess playing. The only things I'd consider would be to tinker with the UCI or XBoard protocols to make an engine work in a GUI where it wouldn't if sending/receiving commands to specification.

I didn't plan to use Arena. CuteChess is much smoother (with regard to the GUI when showing games), and it can run 4 games at once on my computer.

At this point, it seems I can attribute losses on time to Arena. I've played some matches as short as 1 minute +1, and Rustic behaves exactly as I intended with regard to time management, and it doesn't lose on time... as of yet. Same in 2m+1s games. Rustic can easily hold its own against engines in the 1600-1700 range, sometimes stronger, IF those engines don't employ one-trick ponies that Rustic doesn't know about. For example, it can draw a 20 game match against Shallow Blue (CCRL 1714), wipe down Pulse (CCRL 1650) and Dreamer 0.3 (1592) with 70%+ scores, but it gets completely toasted by "Got a passed pawn, will Boogie" TSCP. (But I have seen that it can defeat TSCP, if that engine doesn't manage to lure Rustic into giving it a passed pawn.) And yes, I know, 20 games is not enough, but I have to test this time management stuff and some engines first, before I start a 3000 game tournament.

I'll first fix the time management to more fully use the time. Now it's often already at move 50 by the time it's used 30 seconds.

Then my next goal will be to try and defeat TSCP on brute force alone, *without* teaching my engine about pawn stuff and open files. Thus, killers, heuristics, transposition table, null move, search extensions, etc... that sort of thing. In short, I'm going to try and go the PeSTO way and see how far I can get with material, PSQT, and a tapered evaluation only, before I start tinkering with the eval.

mvanthoor · Post by **mvanthoor** » Wed Nov 04, 2020 11:20 pm

PS: I personally think Fritz 11, with the next-to-last update, was/is the best chess GUI I ever used, at least for playing against and simple engine tournaments.

Everything works. It has no cloud-stuff. It has a decent tournament feature (which got overly simplistic somewhere between Fritz 11 and 17). If you refrain from installing the last update, which tries to backport some Fritz 12 stuff onto 11 in a hugely buggy fashion that actually seriously breaks the program, Fritz 11 is a great program to use... IF you don't have a high-resolution monitor that needs scaling set to anything other than 100%, and don't increase your font size by more than 20%. If you do one of those things (or god forbid, both), the GUI graphics and controls fall apart.

I still have it, and it still works on my desktop. On the laptop, where I need scaling set to 150% and fonts to 115% to be able to use the hi-DPI screen, the GUI is either broken, or, if bitmap scaled by the OS, unsharp.

hgm · Post by **hgm** » Wed Nov 04, 2020 11:21 pm

Well, let me put it this way: Arena is not famous for its lack of bugs. WinBoard/XBoard should be pretty accurate w.r.t. time control, though.

TSCP might be tough, because it is relentlessly pushing Pawns. If your engine doesn't recognize 6th and 7th-rank Pawns as a very grave danger, it will get overwhelmed. Micro-Max only started beating TSCP after I added code to increase the value of a Pawn from its normal value around 1 to 1.8 on 6th, and 2.6 on 7th rank. This could have been done through PST (but micro-Max doesn't have these): the 6th-rank bonus is also given to non-passers (because micro-Max also knows nothing about that).
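
A minimal sketch of that rank-dependent Pawn value, expressed in centipawns (100, 180 and 260 for the values 1, 1.8 and 2.6 mentioned above). This is not micro-Max's code; the function names and the 0 = a1 ... 63 = h8 square numbering are assumptions made for illustration:

```rust
/// Pawn value in centipawns by relative rank (1-based, seen from the owner's side).
/// As described above, the bonus applies to every Pawn, passed or not.
pub fn pawn_value(relative_rank: u8) -> i32 {
    match relative_rank {
        6 => 180, // one step from the 7th: already a serious threat
        7 => 260, // about to promote
        _ => 100, // normal Pawn value
    }
}

/// Relative rank of a square (assumed numbering: 0 = a1, 63 = h8) for the given side.
pub fn relative_rank(square: u8, white: bool) -> u8 {
    let rank = square / 8 + 1;
    if white {
        rank
    } else {
        9 - rank
    }
}
```

In an engine that does have piece-square tables, the same effect could be had by adding 80 and 160 to the Pawn table entries on the 6th and 7th rank.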

maksimKorzh · Post by **maksimKorzh** » Wed Nov 04, 2020 11:23 pm

One important thing to consider is that testers from CCRL might be using Arena as a GUI to run tests.
I've been suffering from loss-on-time issues in BBC for quite some time. Without getting into the implementation details I would rather list some "features" that other engines have:

1. When time control is 0 sec + 0 increment Stockfish and many other engines manage to finish a game while say VICE immediately loses on time.
2. When the engine is playing any time control with, say, a +1 sec increment, it should switch to a "one second per move" scenario once the main time has been exhausted (BBC was horribly losing on time before I handled this)
3. The GUI might add only a small lag, but that's enough to lose on time, so having, say, a 50 ms time buffer is a good idea. I was also changing the buffer's behavior depending on the particular time control circumstances.

Maybe it's all wrong and should be done differently but at least it works.
Here's my snippet on timing:


    // init start time
    starttime = get_time_ms();

    // init search depth
    depth = depth;

    // if time control is available
    if(time != -1)
    {
        // flag we're playing with time control
        timeset = 1;

        // set up timing
        time /= movestogo;
        
        // lag compensation
        time -= 50;
        
        // if time is up
        if (time < 0)
        {
            // restore negative time to 0
            time = 0;
            
            // inc lag compensation on 0+inc time controls
            inc -= 50;
            
            // timing for 0 seconds left and no inc
            if (inc < 0) inc = 1;
        }
        
        // init stoptime
        stoptime = starttime + time + inc;        
    }

mvanthoor · Post by **mvanthoor** » Wed Nov 04, 2020 11:30 pm

hgm wrote: ↑Wed Nov 04, 2020 11:21 pm Well, let me put it this way: Arena is not famous for its lack of bugs. WinBoard/XBoard should be pretty accurate w.r.t. time control, though.

I've noticed... one of the reasons I want to implement the xboard protocol is to be able to use Winboard and its native protocol. (I don't like adapters... if something doesn't work, I can never be really sure where the problem is without debugging other people's code.) I never used Winboard (or Arena) much, because I *NEEDED* Fritz for proper support of my DGT Board and I preferred 11, which I stuck with for over 13 years.

Now that I have my own (created by myself) PicoChess image which runs splendidly for playing against, I don't necessarily need Fritz anymore, or any other Chessbase product for that matter, so I'm looking into other GUIs, including Winboard.

TSCP might be tough, because it is relentlessly pushing Pawns. If your engine doesn't recognize 6th and 7th-rank Pawns as a very grave danger, it will get overwhelmed. Micro-Max only started beating TSCP after I added code to increase the value of a Pawn from its normal value around 1 to 1.8 on 6th, and 2.6 on 7th rank. This could have been done through PST (but micro-Max doesn't have these).

I've noticed. I wonder if brute force speed, a transposition table, and some "warnings" in the PSQT's are enough to defeat TSCP. I love this engine. It's small. It's old. It is quite strong despite the lack of features, and it has a VERY distinctive playing style. I like those kinds of engines that play chess I can actually humanly understand.

mvanthoor · Post by **mvanthoor** » Wed Nov 04, 2020 11:42 pm

maksimKorzh wrote: ↑Wed Nov 04, 2020 11:23 pm One important thing to consider is that testers from CCRL might be using Arena as a GUI to run tests.
I've been suffering from loss-on-time issues in BBC for quite some time. Without getting into the implementation details I would rather list some "features" that other engines have:

1. When time control is 0 sec + 0 increment Stockfish and many other engines manage to finish a game while say VICE immediately loses on time.

Hi Maksim, thanks for posting.

How does an engine finish a game when the time control is 0m+0s ?

2. When the engine is playing any time control with, say, a +1 sec increment, it should switch to a "one second per move" scenario once the main time has been exhausted (BBC was horribly losing on time before I handled this)

I do have that already:


    pub fn time_for_move(refs: &SearchRefs) -> u128 {
        let gt = &refs.search_params.game_time;
        let mtg = Search::moves_to_go(refs);
        let white = refs.board.us() == Sides::WHITE;
        let clock = if white { gt.wtime } else { gt.btime };
        let increment = if white { gt.winc } else { gt.binc };

        // Spread 80% of the clock over the remaining moves, unless we are
        // at or below MIN_TIME and have an increment to live on.
        let base_time = if clock > MIN_TIME || (increment == 0) {
            ((clock as f64 * 0.8) / (mtg as f64)).round() as u128
        } else {
            0
        };

        // Saturate at 0 so a budget smaller than OVERHEAD can't underflow.
        (base_time + increment).saturating_sub(OVERHEAD)
    }

Thus: if clock time is larger than MIN_TIME (right now, 2.5 seconds or thereabouts) OR increment is not given, then it will set a move base time of 80% of the available clock time, divided by the moves to go to the end of the game. (If moves to go is not given, the engine will assume that the game is 60 moves long, and count towards that. If the game is still going, it will give 50 or 100 for moves to go, yielding a very short base time.)

If the engine drops under MIN_TIME but it DOES have an increment, the move base time is 0, and it will only use the increment minus an overhead.
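
To illustrate why the clock can even grow in that increment-only mode, here is a small simulation. It is only a sketch under assumptions: MIN_TIME = 2500 ms and OVERHEAD = 100 ms are taken as example values, the engine is assumed to spend exactly its budget on every move, and `budget` and `simulate` are hypothetical names.

```rust
const MIN_TIME: u128 = 2_500; // ms, assumed ("2.5 seconds or thereabouts")
const OVERHEAD: u128 = 100; // ms, assumed communication buffer

/// Per-move budget in ms, following the rule described above.
pub fn budget(clock: u128, increment: u128, mtg: u128) -> u128 {
    let base = if clock > MIN_TIME || increment == 0 {
        ((clock as f64 * 0.8) / (mtg.max(1) as f64)).round() as u128
    } else {
        0
    };
    (base + increment).saturating_sub(OVERHEAD)
}

/// Play `moves` moves, spending the whole budget each time, and
/// return the clock (in ms) afterwards.
pub fn simulate(mut clock: u128, increment: u128, mtg: u128, moves: u32) -> u128 {
    for _ in 0..moves {
        let spent = budget(clock, increment, mtg).min(clock);
        // The increment is credited after the move has been made.
        clock = clock - spent + increment;
    }
    clock
}
```

Starting at 2000 ms with a 1000 ms increment, each move costs 900 ms and earns 1000 ms, so the clock climbs by the 100 ms overhead per move (2100, 2200, ...) until it is back above MIN_TIME.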

3. The GUI might add only a small lag, but that's enough to lose on time, so having, say, a 50 ms time buffer is a good idea. I was also changing the buffer's behavior depending on the particular time control circumstances.

I have a buffer of 100ms already...

Maybe it's all wrong and should be done differently but at least it works.
Here's my snippet on timing:


    // init start time
    starttime = get_time_ms();

    // init search depth
    depth = depth;

    // if time control is available
    if(time != -1)
    {
        // flag we're playing with time control
        timeset = 1;

        // set up timing
        time /= movestogo;
        
        // lag compensation
        time -= 50;
        
        // if time is up
        if (time < 0)
        {
            // restore negative time to 0
            time = 0;
            
            // inc lag compensation on 0+inc time controls
            inc -= 50;
            
            // timing for 0 seconds left and no inc
            if (inc < 0) inc = 1;
        }
        
        // init stoptime
        stoptime = starttime + time + inc;        
    }

Seems to be good, at first glance. If you compare my function and yours, they essentially do the same thing.

You init a start time and then calculate a stop time, and your search should be in between.
I calculate "time_for_move", and then start a timer with search_info.timer_start(), and I check it against search_info.timer_elapsed() > time_for_move. I call it "starting a timer", but it actually isn't; that function just creates an Instant struct, and this has an "elapsed()" function that gives you the time that elapsed since you created the struct. It essentially is Rust's version of "starttime = get_time_ms();"
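
A minimal, self-contained sketch of that Instant-based "timer" follows. It is not Rustic's actual search_info code; the `SearchTimer` type and its method names are made up for illustration, and only `std::time::Instant` and its `elapsed()` method come from the standard library.

```rust
use std::time::Instant;

/// Remembers when the search started; there is no background timer thread.
pub struct SearchTimer {
    started: Option<Instant>,
}

impl SearchTimer {
    pub fn new() -> Self {
        Self { started: None }
    }

    /// "Starting the timer" just records the current instant.
    pub fn start(&mut self) {
        self.started = Some(Instant::now());
    }

    /// Milliseconds since `start()` was called, or 0 if it never was.
    pub fn elapsed_ms(&self) -> u128 {
        self.started.map_or(0, |t| t.elapsed().as_millis())
    }

    /// True once the search has used more than its budget.
    pub fn out_of_time(&self, time_for_move: u128) -> bool {
        self.elapsed_ms() > time_for_move
    }
}
```

The search would call `out_of_time(time_for_move)` every few thousand nodes and unwind as soon as it returns true, which is the same check as comparing the current time against a precomputed stoptime.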

Ras · Post by **Ras** » Wed Nov 04, 2020 11:57 pm

I didn't have such issues with Arena when I was still using Windows, and my engine is also UCI. No time losses even with just 1 second for the whole game without increments. However, that was under Win 7, maybe it's different with Win 10 now.

mvanthoor · Post by **mvanthoor** » Thu Nov 05, 2020 12:40 am

The user interface is the problem. The arrows and animations slow Arena down.

Sending UCI information also slows it down (it actually says so in the help).

I've implemented (almost) the entire UCI standard, so Rustic sends everything: info with everything in it, currmove/currmovenumber each time the analyzed move changes, and an update on speed (time, nps), every 4 million nodes or thereabouts.

Let's just say Rustic hits depth 7 in the console in under one second and it processes 5 million nps at the moment, so that is a LOT of information, and Arena can't handle it; especially not if its animations are also enabled.

I have disabled the animations and enabled Arena's "UCI filter". The help describes that this will ignore all non-important UCI input for 5 seconds, but it will still show important information such as mainlines.

Well, it does solve the problem with regard to Arena being too slow to handle the information my engine puts out in the first 7 or so depths, but now it also doesn't show the mainline anymore, except for a flash of something at depth 9 or up, if my engine manages to reach it.

Time behaviour is now more like it is in CuteChess, but Rustic still uses more time than it does in CuteChess.

I have recently implemented a "quiet mode" (command line parameter -q), where I suppress everything except the info string, which is useful when I'm implementing new features and testing them directly in the console. Maybe I should implement an "Arena mode" (command line -a, UCI-setting) where the engine doesn't send things such as currmove/currmovenumber and slows down time/node/nps reports.
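
One possible shape for such an "Arena mode", as a sketch only: the `InfoFilter` type, its fields and the choice of which lines to keep are all assumptions made for illustration, not an existing Rustic feature. Only the line prefixes (`info currmove`, `info string`, `bestmove`, and the `pv` field) come from the UCI protocol.

```rust
/// Decides which outgoing UCI lines are worth sending to a slow GUI.
pub struct InfoFilter {
    arena_mode: bool,
    min_interval_ms: u128,
    last_stats_ms: Option<u128>,
}

impl InfoFilter {
    pub fn new(arena_mode: bool, min_interval_ms: u128) -> Self {
        Self { arena_mode, min_interval_ms, last_stats_ms: None }
    }

    /// `now_ms` is the time elapsed since the search started.
    pub fn should_send(&mut self, line: &str, now_ms: u128) -> bool {
        if !self.arena_mode {
            return true; // normal mode: send everything
        }
        if line.starts_with("info currmove") {
            return false; // per-move progress is dropped entirely
        }
        // Non-info lines (bestmove etc.), info strings and lines carrying
        // a principal variation always go through.
        if !line.starts_with("info")
            || line.starts_with("info string")
            || line.contains(" pv ")
        {
            return true;
        }
        // Remaining info lines (time, nodes, nps) are rate-limited.
        match self.last_stats_ms {
            Some(last) if now_ms.saturating_sub(last) < self.min_interval_ms => false,
            _ => {
                self.last_stats_ms = Some(now_ms);
                true
            }
        }
    }
}
```

With `InfoFilter::new(true, 1000)`, a burst of nps updates is reduced to at most one per second, while every new main line still reaches the GUI.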
