The .ini file vs options dilemma


Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

The .ini file vs options dilemma

Post by Daniel Shawul »

Currently Scorpio uses three forms of input to get settings and commands. I am in a dilemma over which of them to keep.

a) INI file for settings executed at start up
b) Command line mainly for commands (go/quit) but also settings
c) Options feature of winboard/uci protocol (local for an engine or common for all)

Now that I wanted to remove the ini file and put all things in options, I am having some difficulties.

a) First the options feature works only in GUI mode that is able to save settings and send the saved ones
to the engine at start up. So for execution on the command line, one needs local ini file. This is especially
important for EGBBs, Hashtable, book or other things that are done at start-up. If used in GUI mode as options,
these features must be re-loadable. EGBBs for example leak cache memory when they are detached because I lacked
the bullet-proof type of programming skills at the time I wrote it. Anyway this will be fixed but the question
is if such features should be added as an option.

b) Having a feature in both Ini file and Options means it will be called twice. If I remove the Ini file Scorpio
will be unusable from the command line. I don't want that because Scorpio needs to be started with proper defaults
from the ini instead of passing all the parameters from the command line.

c) The command line is basically for commands but it can be used to set options. The ini file is parsed first and we
wait until EGBBs are loaded for example. This could take for example 100 secs to load all 5 men in RAM, so it is important
that it is fully loaded before we search/analyze positions from a go command in the command line.

With MPI the command line is passed to the main process (rank=0) only, so whatever is passed on the command line is parsed by
that processor alone. For example, I used this trick to turn logging on only for the main processor while turning it off
for the other processors via the ini file.

d) Common engine settings may sometimes be too general. The 'memory' command of Winboard, for example, is not enough to
divide memory between the main hash table, eval cache, pawn cache, egbb cache etc. Also, one may want different settings
for each engine, such as multiple threads for the weaker engine, instead of relying on 'cores'.

About egtpath. Arena seems to have a separate path for scorpio,nalimov,gaviota and others. Winboard only has one edit box
but it may be that all can be concatenated with a semicolon. Arena seems to send 'egtpath scorpio d:/egbb/' twice same as other
commands such as 'new', so EGBBs are being loaded twice which currently leaks the cache memory (This is my fault anyway).

I guess all I am saying is this is getting very messy, and I wonder if there is some philosophy of which to have for engines.
I have all of them now but local ini files are thought to be backwards, but I can't remove them because of some things I mentioned above.
velmarin
Posts: 1600
Joined: Mon Feb 21, 2011 9:48 am

Re: The .ini file vs options dilemma

Post by velmarin »

Daniel, maybe you have your reasons, I don't know.

Why not UCI?
Many users, myself included, like the comfort of UCI.
:)

Greetings, Jose.
hgm
Posts: 28454
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The .ini file vs options dilemma

Post by hgm »

UCI would be far worse than WB in this respect. It is virtually impossible to do anything meaningful with UCI engines when you run them from the command line. You still would have to set all the options at startup, and you would have to do it with extremely verbose commands.

@Daniel:
If you want to be able to run from the command line, and default settings are in general not good enough, then I think an ini file would be the only option. But I don't think it would be wise to make reading of the ini file trigger immediate actions in the engine (such as loading EGBB). It should just set the parameters. Interpreting the parameters should be done at a point where you are sure you have the final option settings.

The most logical design would be to have a list of compiled-in defaults for all parameters, and if there is an ini file, first read that to overwrite that list. Then use the list to send the option features (so the GUI knows what the settings would be if it took no further action to change them). Then send feature done=1 to indicate no more features will be coming.
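As a rough illustration of the "use the list to send the option features" step, here is a minimal C sketch that formats one spin option in the feature option="NAME -spin VALUE MIN MAX" form. The helper name format_spin_feature and the option name and range in the usage note are assumptions made for the example, not taken from any engine's source.

```c
#include <stdio.h>

/* Hypothetical helper: write one WinBoard spin-option feature line into buf.
   value is the current setting (compiled-in default, possibly overwritten by
   the ini file); min and max are the allowed range.
   Returns the number of characters snprintf wanted to write. */
static int format_spin_feature(char *buf, size_t n, const char *name,
                               int value, int min, int max)
{
    return snprintf(buf, n, "feature option=\"%s -spin %d %d %d\"\n",
                    name, value, min, max);
}
```

An engine would call something like format_spin_feature(buf, sizeof buf, "EgbbCache", 32, 1, 1024) once per option, print each line, and only then print feature done=1.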

The GUI will then proceed by setting options for the next game (like memory, cores, egtpath). They will immediately be followed by a 'new' command to start the game. At that point the engine should start to take the actions specified by the current settings. E.g. allocate its hash table, load the EGBBs, etc.

To prevent this from being redone on every game (or every 'new' command), the engine should know which settings it is in fact using, compare those with the settings that are requested, and only take action when they differ. E.g. for the hash table my engines use two variables: currentHashSize and hashSize, the latter set by the memory command. Every time it receives a memory command it compares hashSize to currentHashSize (which starts at 0), ignores the command if they are still the same, and reallocates the hash table at the new size when they are different. I could also have done all that on reception of 'new', but as I know WinBoard only sends memory commands directly before new, it really doesn't matter much. Same with egtpath: it only comes just before new. You would only have to act on it if the path is different from what it was.
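The compare-before-reallocating idea can be sketched in C roughly as follows. Only the variable names hashSize and currentHashSize come from the post; the functions, the reallocCount counter and the megabyte unit handling are assumptions added for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

static size_t hashSize = 0;         /* MB requested by the last "memory" command */
static size_t currentHashSize = 0;  /* MB actually allocated; 0 means none yet */
static void  *hashTable = NULL;
static int    reallocCount = 0;     /* illustration only: counts real reallocations */

/* Returns 1 if the table was reallocated, 0 if the request matched the
   current size and was ignored, -1 if the allocation failed. */
static int apply_hash_size(void)
{
    if (hashSize == currentHashSize)
        return 0;                   /* same as before: nothing to do */
    void *fresh = NULL;
    if (hashSize > 0) {
        fresh = calloc(hashSize, (size_t)1 << 20);
        if (fresh == NULL)
            return -1;              /* keep the old table on failure */
    }
    free(hashTable);
    hashTable = fresh;
    currentHashSize = hashSize;
    reallocCount++;
    return 1;
}

/* Handler for a line such as "memory 64"; returns -1 for any other line. */
static int on_memory_command(const char *line)
{
    unsigned long mb;
    if (sscanf(line, "memory %lu", &mb) != 1)
        return -1;
    hashSize = (size_t)mb;
    return apply_hash_size();
}
```

With this shape, a GUI that repeats "memory 64" before every 'new' costs one comparison per game, and the same pattern would apply to a string comparison on the egtpath argument.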

As to division of memory between different tables: in this respect WB protocol is not different from UCI. It relies on engine-defined options to set things like EGTB cache, Pawn hash and what have you. The engine can have compiled-in defaults for them, or read them from an ini file. As long as the GUI does not send it other settings (e.g. because the user changed them through the Engine Settings dialog), it should use the defaults or ini-file settings. Of course it isn't really necessary to have options for that at all; the engine could use its own algorithm to decide how to split up the total memory. Like 80% hash, 10% EGTB cache, 10% Pawn hash.
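A fixed split like the 80/10/10 example could be computed along these lines. This is only a sketch; the struct and function names are hypothetical, and giving the rounding remainder to the main hash table is a choice made here, not something stated in the thread.

```c
#include <stddef.h>

/* Hypothetical container for the three table sizes, in MB. */
typedef struct {
    size_t hash_mb;
    size_t egtb_cache_mb;
    size_t pawn_hash_mb;
} MemorySplit;

/* Divide the total from the "memory" command: 10% EGTB cache, 10% pawn hash,
   and the rest (about 80%, plus any rounding remainder) to the main hash. */
static MemorySplit split_memory(size_t total_mb)
{
    MemorySplit s;
    s.egtb_cache_mb = total_mb / 10;
    s.pawn_hash_mb  = total_mb / 10;
    s.hash_mb       = total_mb - s.egtb_cache_mb - s.pawn_hash_mb;
    return s;
}
```

The three parts always add up to the requested total, so the engine never exceeds what the GUI granted.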
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: The .ini file vs options dilemma

Post by Daniel Shawul »

I have all of them now, and just took the easy way out by making EGBBs re-loadable without any leaks. Heap memory allocated by a module is not freed when the DLL is detached, because the OS keeps a record of which process allocated it, not which module. Anyway, I did all the clean-up I could with atexit() and automatic class destructors, and now it seems all the memory is reclaimed. It wouldn't matter much if I removed the ini, as the user may decide to change the EGBB cache size many times in the GUI, so it has to be reloadable. If the ini is present, it will overwrite the defaults, which are then reported via 'feature option=' to the GUI, which populates its settings panel. Then the GUI sends back the settings it has saved to the engine. So the priority in decreasing order is Options->Cmd line->Ini file. This seems to be working well.
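The priority order Options->Cmd line->Ini file falls out naturally if the sources are applied in reverse priority and each one simply overwrites what came before. Below is a minimal C sketch of that idea; the Settings fields, the option names (egbb_path, egbb_cache, log) and the default values are all assumptions for the example and are not Scorpio's actual names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings record. */
typedef struct {
    char egbb_path[256];
    int  egbb_cache_mb;
    int  log_on;
} Settings;

/* Compiled-in defaults (values are placeholders). */
static Settings default_settings(void)
{
    Settings s = { "egbb/", 16, 0 };
    return s;
}

/* Set one option by name; returns 0 on success, -1 for an unknown name. */
static int set_option(Settings *s, const char *name, const char *value)
{
    if (strcmp(name, "egbb_path") == 0)
        snprintf(s->egbb_path, sizeof s->egbb_path, "%s", value);
    else if (strcmp(name, "egbb_cache") == 0)
        s->egbb_cache_mb = (int)strtol(value, NULL, 10);
    else if (strcmp(name, "log") == 0)
        s->log_on = strcmp(value, "on") == 0;
    else
        return -1;
    return 0;
}

/* Apply "name value" lines from one source (ini file, command line or GUI).
   Later calls overwrite earlier ones. Returns the number of lines applied. */
static int apply_source(Settings *s, const char *const lines[], size_t n)
{
    int applied = 0;
    for (size_t i = 0; i < n; i++) {
        char name[64], value[256];
        if (sscanf(lines[i], "%63s %255s", name, value) == 2 &&
            set_option(s, name, value) == 0)
            applied++;
    }
    return applied;
}
```

Calling apply_source first with the ini lines, then the command-line settings, then the GUI options leaves the GUI value in place wherever the same option appears in more than one source.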

The egbbs are loaded by a separate thread after the INI file is read, or after the GUI options are set. I do not reload them for every change that affects loading (currently EgbbPath, EgbbCache, EgbbLoadType) but wait until all options have been sent, using a sort of hack that checks whether there is no pending input (bios_key()) and whether any options have changed. Then I make sure loading is finished at the critical points: a) before sending 'feature done=1' after receiving protover, or b) in the command line case, before any search (go/analyze) is executed. So I have a barrier wait_for_egbb() right before starting the search. It is complicated but I made my way through it and everything seems fine now.
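A simplified, single-threaded C sketch of the "collect changes, load once, barrier before search" idea is shown below. It is not the scheme described above: the loader thread and the bios_key() check are left out, and the load happens inside the barrier itself. Only the name wait_for_egbb() comes from the post; the setters, the dirty flag and the load counter are assumptions.

```c
#include <stdio.h>
#include <string.h>

static char egbbPath[256] = "";
static int  egbbCacheMb   = 16;  /* placeholder default */
static int  egbbDirty     = 0;   /* a load-affecting option changed since the last load */
static int  egbbLoads     = 0;   /* illustration only: counts (simulated) loads */

/* Setters only record the new value and mark the state dirty when it differs. */
static void set_egbb_path(const char *path)
{
    if (strcmp(egbbPath, path) != 0) {
        snprintf(egbbPath, sizeof egbbPath, "%s", path);
        egbbDirty = 1;
    }
}

static void set_egbb_cache(int mb)
{
    if (mb != egbbCacheMb) {
        egbbCacheMb = mb;
        egbbDirty = 1;
    }
}

/* Placeholder for the real (slow) loading code. */
static void load_egbb(void)
{
    egbbLoads++;
}

/* Barrier: call before 'feature done=1' and before any search. */
static void wait_for_egbb(void)
{
    if (egbbDirty) {
        load_egbb();
        egbbDirty = 0;
    }
}
```

With this arrangement a GUI that sends the same egtpath twice, as Arena is reported to do earlier in the thread, triggers one load rather than two.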

If a setting is common to all engines, I would prefer to use it rather than add a local engine setting. The 'egtpath' command works in both Arena and Winboard, but the latter has only one edit box. I think you should provide separate edit boxes for engines that use Nalimov+Scorpio, for example. The 'cores' command is working properly. For 'memory' I guess I can divide the total based on the ratio set in the local engine settings (mht, pawnht, evalht, egtbcache). That should take care of it. Are there any other common engine settings I forgot about?
jdart
Posts: 4420
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: The .ini file vs options dilemma

Post by jdart »

Arasan has all three methods, like Scorpio, but like you, I am mulling removing the .ini file. Already most options are settable via UCI or Winboard commands, so it is not a large step to make them all settable that way and remove the ini file.

I think most of your issues are fixable. For EGBBs I don't think there is a Winboard standard for this. If you want to define custom options to enable them and set the path, etc. you could do that.

Re hash memory I think it is understood that the global setting is approximate and may not include all memory allocated to the program. If you really need to tune each cache individually, again, custom options work for that.

Winboard and UCI protocols already have provision for things like tablebases and bitbases that may require initialization time. For UCI you send "readyok" when you have initialized everything. For Winboard you use the feature command (Winboard protocol v. 2) and send
"feature done=1" when initialization is done.

Running from the command line w/o GUI is not usually necessary. If you want to analyze positions from the command line you can use the "epd2wb" tool (on Windows), or polyglot.

--Jon
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

fgetc vs fread

Post by Daniel Shawul »

Boy was I stupid back then? I was using fgetc() to read mega bytes of data instead of using fread()


fread(table,1,cmpsize,pf);
//for(i = 0;i < cmpsize;i++) {
//	table[i] = fgetc(pf);
//}
This has decreased loading time by orders of magnitude. What took almost 2 minutes is now loaded in less than a second!!! Can you believe it? Calling fgetc() on 211mb of data (211*10^6 times) has much more overhead than calling fread() on 145 files (145 times). The idea at the time was to always remember that the egbbs use little-endian byte order for integers such as table indexes, but I forgot that the data is just bytes and I could have used fread() anyway. Problem solved now!
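For reference, here is a small C sketch of the two points in this post: reading a whole block with one fread() call while checking the byte count it returns, and decoding a little-endian integer from raw bytes so that byte order does not depend on the host. The function names read_block and read_le32 are hypothetical and are not taken from the egbb code.

```c
#include <stdio.h>
#include <stdint.h>

/* Read exactly cmpsize bytes from pf into table with a single fread() call.
   Returns 0 on success, -1 on a short read or error. */
static int read_block(FILE *pf, unsigned char *table, size_t cmpsize)
{
    size_t got = fread(table, 1, cmpsize, pf);
    return got == cmpsize ? 0 : -1;
}

/* Decode a 32-bit little-endian integer from four raw bytes. This gives the
   same result on little-endian and big-endian hosts. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Table data that is just bytes can go straight through read_block; only multi-byte integers such as indexes need a decoder like read_le32.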
jdart wrote: Running from the command line w/o GUI is not usually necessary. If you want to analyze positions from the command line you can use the "epd2wb" tool (on Windows), or polyglot.
There are many internal commands in Scorpio that I use to test evaluation, parallel search etc. on a set of positions without epd2wb. I don't want to lose that capability.
hgm
Posts: 28454
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The .ini file vs options dilemma

Post by hgm »

jdart wrote:Winboard and UCI protocols already have provision for things like tablebases and bitbases that may require initialization time. For UCI you send "readyok" when you have initialized everything. For Winboard you use the feature command (Winboard protocol v. 2) and send
"feature done=1" when initialization is done.
Note that "feature done=1" is really the equivalent of "uciok", while UCI's "isready"/"readyok" is the equivalent of "ping"/"pong". Both "isready" and "ping" can be used anytime during the session, rather than just at startup.

One problem raised by Daniel before is that the engine cannot load tablebases or bitbases before sending done=1, because it does not know the path to their directory yet. The GUI sends the "egtpath" command only after it receives "done=1", before every game (because it is conceivable the user has changed it interactively). If the engine needs to do something time-consuming at that point, it will automatically throttle the GUI by delaying the "pong". The GUI should send a "ping" after every "new", and only start the clock when it has received the "pong" reply from both engines.
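Throttling the GUI through "pong" can be pictured with a short C sketch: the ping handler first finishes whatever initialization is still pending and only then produces the reply. The pendingInit flag and both function names are assumptions for illustration; a real engine would run its tablebase loading where the placeholder comment sits.

```c
#include <stdio.h>

static int pendingInit = 0;  /* set when e.g. a new egtpath still has to be acted on */

/* Placeholder for the slow work (loading bitbases, clearing hash, ...). */
static void finish_pending_init(void)
{
    if (pendingInit) {
        /* ... time-consuming initialization would run here ... */
        pendingInit = 0;
    }
}

/* If line is "ping N", complete pending work, write "pong N\n" into reply
   and return 1. Returns 0 for any other line. */
static int handle_ping(const char *line, char *reply, size_t n)
{
    int id;
    if (sscanf(line, "ping %d", &id) != 1)
        return 0;
    finish_pending_init();   /* the GUI waits for the pong, so it waits for this */
    snprintf(reply, n, "pong %d\n", id);
    return 1;
}
```

Because the reply is only written after finish_pending_init() returns, a GUI that waits for the pong before starting the clock never charges the engine for load time.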

The normal sequence for starting UCI engines is to collect options until you receive uciok, then set all the options, then send "isready" to allow the engine to acknowledge all options are set. The WB equivalent (treated by Polyglot as such) is: collect options until done=1 (or timeout), then set options, and finally send "ping" to give the engine opportunity to report it is done setting them, and ready to play.

I think that WinBoard currently responds to option features for which a value was defined (e.g. -firstOptions "name=value" was on the XBoard command line or on the engine line when the engine was loaded) immediately when it receives feature option="name -spin 5 0 10", by sending option name=value after accepted option, without waiting for done=1. There was little reason to sync here: the engine usually won't look at input before it is done printing its features, which usually is done in an instant.

It could be debatable if it would not be better to reply to standard features (such as memory=1, smp=1, egtformats="scorpio,nalimov") immediately with a memory, cores and egtpath command, in addition to sending those before every "new". That would give the engine the opportunity to delay done=1 until after initializing. (E.g. clearing a large hash table, loading a tablebase index map. For "cores" this is probably never needed.) But I think it would really be better if engines just used "pong" to throttle the GUI. They would have to do that anyway if the user changed the tablebase path interactively during the session.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: The .ini file vs options dilemma

Post by Daniel Shawul »

I think I now see why Winboard chose 'memory' only and does not care about divisions. Even for EGBBs I am in a dilemma over whether to use the memory for pre-loading compressed tables or for caching decompressed tables. The 5-men are fully loadable in 211mb, but some users might complain about that being too much. So if I am only given a total memory, I would first load index tables => then compressed tables (selectively, KRPKP for instance) => then caching. It is indeed a mess for the GUI to be involved in such matters of the engine.

Btw now the 6-man index tables also load in less than a second, including loading all 5-men in RAM, for a total of 650mb! I don't know if I am making a mistake or if the SSD drive I have is being used, but this feels too good to be true. It has no problems analyzing positions or anything. Now I wonder why the Nalimov 6-man index tables took a long time to load...
jdart
Posts: 4420
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: The .ini file vs options dilemma

Post by jdart »

Yes, you have more accurately described the initialization.

I have a routine "delayedInitIfNeeded" in my Winboard/xboard driver module (arasanx.cpp). It is run during "new" processing but also runs before some internal non-standard commands such as "test" and "eval". So to run a test suite with tablebases I can issue an egtpath command, etc. then "test" and the tb init will occur before the test suite runs.

--Jon