Possibly massive time management bugs in Arena

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
Ras
Posts: 2727
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Possibly massive time management bugs in Arena

Post by Ras »

mvanthoor wrote: Thu Nov 05, 2020 12:40 amThe user interface is the problem. The arrows and animations slow Arena down.
Sure. Now that you mention it, I had them disabled, and the UCI tournament filter enabled.
I've implemented (almost) the entire UCI standard, so Rustic sends everything: info with everyting in it, curr_move/curr_move number each time the analyzed move changes, and an update on speed (time, nps), every 4 million nodes or thereabouts.
The UCI standard suggests that you should start sending currmove only after one second specifically to avoid too much traffic. I've implemented it as per that suggestion, and on top of that, I update currmove, currmovenumber and hashfull only once per second. Let's be realistic, the user wouldn't be able to read it anyway if it came faster. Of course, then currmovenumber in the GUI may jump, but that's still better than unreadable flashes. Actually, I got the idea for that because the LC display in my embedded system version is too slow and would just blur.
Rasmus Althoff
https://www.ct800.net
Dann Corbit
Posts: 12814
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Possibly massive time management bugs in Arena

Post by Dann Corbit »

I suspect Arena may buffer I/O.
At least in the case of the log file output, it occasionally becomes disconnected from the data stream and wrong output is written to the file.
You will see output from the previous iteration written to subsequent output records.
Or other bad things

Fast time control exacerbates the problem enormously
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
jdart
Posts: 4420
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Possibly massive time management bugs in Arena

Post by jdart »

> 3. When a certain MIN_TIME on the clock has been hit (10 seconds for now), the search stops exceeding the alotted time ("overshoot factor" = 1)
May make sense but 10 seconds is actually a lot of time - engines can make many moves in that time.
>4. If no "moves to go" is given by UCI, the engine assumes that the game is 60 moves long.
Most engines assume something smaller, on the order of 40-50
> 7. If a critical time has been hit (2 seconds for now), and "increment" is being set, the search doesn't even consider "moves to go" and the clock time anymore.
This means you will not be able to handle hyper-bullet time controls, like the 1:0+0.6 games Stockfish uses for testing.

Time management is tricky. To do it right you need to consider all cases:
1. fixed time no increment
2. fixed time per game with increment
3. moves/time, like "40/40" time control

and in addition the ponder/no-ponder distinction (you can allocate more search time with pondering on, because you can use some of it during your opponent's time).
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Possibly massive time management bugs in Arena

Post by mvanthoor »

Ras wrote: Thu Nov 05, 2020 1:03 am The UCI standard suggests that you should start sending currmove only after one second specifically to avoid too much traffic.
I have this implemented; that's the reason why I chose around 4 million nodes/s for updating. I may change this to use actual time though. I'll look into the curr_move/curr_movenumber updates on slowing them down.
jdart wrote: Thu Nov 05, 2020 4:23 am Time management is tricky. To do it right you need to consider all cases...
Thanks for the suggestions :) The time management seems to work as intended, but I still have to tinker with the numbers such as the length of the game, MIN_TIME, and the critical time. And... 1m+0.6s? And here I was, thinking that 2m+1s was already an idiotically fast time control....
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
User avatar
maksimKorzh
Posts: 775
Joined: Sat Sep 08, 2018 5:37 pm
Location: Ukraine
Full name: Maksim Korzh

Re: Possibly massive time management bugs in Arena

Post by maksimKorzh »

mvanthoor wrote: Wed Nov 04, 2020 11:42 pm
maksimKorzh wrote: Wed Nov 04, 2020 11:23 pm One important thing to consider is that testers from CCRL might be using arena as a GUI to run tests.
I've been suffering with lose on time issue in BBC for quite a bit of time. Without getting into the implementation details I would rather list some "features" that other engines have:

1. When time control is 0 sec + 0 increment Stockfish and many other engines manage to finish a game while say VICE immediately loses on time.
Hi Maksim, thanks for posting.

How does an engine finish a game when the time control is 0m+0s ?
2. When engine is playing whatever time control with say + 1 sec increment it should use "one second per move" scenario when main time has been exhausted (BBC was horribly loosing on time before I've handled this)
I do have that already:

Code: Select all

    pub fn time_for_move(refs: &SearchRefs) -> u128 {
        let gt = &refs.search_params.game_time;
        let mtg = Search::moves_to_go(refs);
        let white = refs.board.us() == Sides::WHITE;
        let clock = if white { gt.wtime } else { gt.btime };
        let increment = if white { gt.winc } else { gt.binc };
        let base_time = if clock > MIN_TIME || (increment == 0) {
            ((clock as f64 * 0.8) / (mtg as f64)).round() as u128
        } else {
            0
        };

        base_time + increment - OVERHEAD
    }
Thus: if clock time is larger than MIN_TIME (right now, 2.5 seconds or therebout) OR increment is not given, then it will set a move base time of 80% of the available clock time, divided by the moves to go to the end of the game. (If moves to go is not given, the engine will assume that the game is 60 moves long, and count towards that. If the game is still going, it will give 50 or 100 for moves to go, yielding a very short base time.)

If the engine drops under MIN_TIME but it DOES have an increment, the move base time is 0, and it will only use the increment minus an overhead.

3. GUI might have a miserable lag, but that's enough to lose on time so having say 50ms time buffer is good idea. Also I was changing buffer's behavior depending on the particular time control circumstances.
I have a buffer of 100ms already...
Maybe it's all wrong and should be done differently but at least it works.
Here's my snippet on timing:

Code: Select all

// init start time
    starttime = get_time_ms();

    // init search depth
    depth = depth;

    // if time control is available
    if(time != -1)
    {
        // flag we're playing with time control
        timeset = 1;

        // set up timing
        time /= movestogo;
        
        // lag compensation
        time -= 50;
        
        // if time is up
        if (time < 0)
        {
            // restore negative time to 0
            time = 0;
            
            // inc lag compensation on 0+inc time controls
            inc -= 50;
            
            // timing for 0 seconds left and no inc
            if (inc < 0) inc = 1;
        }
        
        // init stoptime
        stoptime = starttime + time + inc;        
    }
Seems to be good, at first glance. If you compare my function and yours, they essentially do the same thing.

You init a start time and then calculate a stop time, and your search should be in between.
I calculate "time_for_move", and then start a timer with search_info.timer_start(), and I check it against search_info.timer_elapsed() > time_for_move. I call it "starting a timer", but it actually isn't; that function just creates an Instant struct, and this has an "elapsed()" function that gives you the time that elapsed since you created the struct. It essentially is Rusts version of "starttime = get_time_ms();"
How does an engine finish a game when the time control is 0m+0s ?
Either checkmates or loses on time at move... say 45...
Anyway it manages to reach the endgame phase.
User avatar
Ras
Posts: 2727
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Possibly massive time management bugs in Arena

Post by Ras »

mvanthoor wrote: Thu Nov 05, 2020 9:31 amAnd... 1m+0.6s? And here I was, thinking that 2m+1s was already an idiotically fast time control....
I can even test my engine at 1s+0, with a typical midgame depth around 6-7. There are sporadic losses on time, but I allocate a minimum thinking time depending on the game phase, and that buys more Elo than I lose on occasional time forfeits.

Interestingly, there were no time forfeits in Arena, only with c-chess-cli. Arena seems to subtract some time to account for its own GUI delays, and that's also how a game with 0m+0 is possible. c-chess-cli is more exact because it doesn't have such lags in the first place.
Rasmus Althoff
https://www.ct800.net
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Possibly massive time management bugs in Arena

Post by mvanthoor »

Ras wrote: Fri Nov 06, 2020 7:06 pm
mvanthoor wrote: Thu Nov 05, 2020 9:31 amAnd... 1m+0.6s? And here I was, thinking that 2m+1s was already an idiotically fast time control....
I can even test my engine at 1s+0, with a typical midgame depth around 6-7. There are sporadic losses on time, but I allocate a minimum thinking time depending on the game phase, and that buys more Elo than I lose on occasional time forfeits.

Interestingly, there were no time forfeits in Arena, only with c-chess-cli. Arena seems to subtract some time to account for its own GUI delays, and that's also how a game with 0m+0 is possible. c-chess-cli is more exact because it doesn't have such lags in the first place.
I see.

I've been tweaking my time management to make the engine use its time more fully. Also I've changed sending stats and curr-move a bit. It's not possible to check for elapsed time in every node. If I do, search speed drops by 75%. Now I do this:

- Check for TimeUp every 2047 nodes.
- Check if 2 seconds have elapsed, since the last +/- 500 million nodes. If so, send stats.
- If in the root, check if 1 second has elapsed after making a legal move, and if so, send "curr_move" and "curr_movenumber".

Now, the engine basically only sends the "info" string for the first 6-8 depths (it reaches depth 6-9 in under 1 second depending on the position; no hash table yet) before it even starts sending curr_move and stats, and they will be sent at a speed of once every 1 or 2 seconds. Seems to work fine. Arena is not slammed with incoming data anymore, and the search speed is basically the same. If there are any differences (there have to be, because the search is now doing more checking), it's impossible to see when watching the knps counter.

In the process, I have also implemented a "-q / -quiet" command line feature. When I start Rustic with this option, the search only outputs the "info" string and leaves out the stats and curr_move/curr_movenumber. Very useful when quickly analyzing a position in the console, when you want to paste the results into a forum, or save them for some reason.

This chess engine thing is coming together rather nicely, surely but slowly :) Tips and assistance given by people on this forum are certainly of great help.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL