IWB wrote:
lkaufman wrote: ... I think that the decision was correct, but that now you (and CCRL and CEGT) should switch to Ordo. ...
My biggest problem with Ordo is the missing error bar. I like the concept of uncertainty, and these absolute values (with decimals?) somehow look too precise.
Is there a way to switch on an error bar in Ordo? I did not check, so maybe ...
Bye
Ingo
ordo -W -p TOPRES.pgn -a2800 -s1000
where
-W automatic white advantage
-a2800 average set to 2800
-s1000 simulate the ranking 1000 times to calculate standard deviations
Each engine's error is the error relative to the average of the pool.
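Ordo's own -s implementation is not shown in the post; the sketch below only illustrates the general idea of simulation-based error bars, using a simple bootstrap over one engine's game results (the function names and the resampling scheme are illustrative, not Ordo's actual algorithm):

```python
import math
import random

def elo_diff(score):
    # Convert a score fraction (0 < score < 1) to a rating difference
    # on the classic Elo scale (400-point logistic curve).
    return -400.0 * math.log10(1.0 / score - 1.0)

def simulated_error(games, n_sims=1000, seed=42):
    # Bootstrap: resample the game results n_sims times, recompute the
    # rating difference each time, and return its mean and standard deviation.
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        sample = [rng.choice(games) for _ in games]
        diffs.append(elo_diff(sum(sample) / len(sample)))
    mean = sum(diffs) / len(diffs)
    var = sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)
    return mean, math.sqrt(var)

# 3300 games scoring 74.9% (Stockfish 5's line in the table below);
# draws are folded into an equivalent win/loss split for brevity.
games = [1.0] * 2473 + [0.0] * 827
mean, err = simulated_error(games)
```

With these numbers the simulated standard deviation comes out in the same ballpark as the single-digit Elo errors Ordo reports, though Ordo's figures are relative to the pool average rather than to a single raw score.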
Code:
# PLAYER : RATING ERROR POINTS PLAYED (%)
1 Stockfish 5 : 2996.1 9.3 2473.0 3300 74.9%
2 Houdini 4 : 2992.0 9.2 2458.5 3300 74.5%
3 Komodo 7a : 2970.4 9.3 2379.0 3300 72.1%
4 Gull 3 : 2935.9 8.9 2245.5 3300 68.0%
5 Critter 1.4a : 2849.9 8.3 1882.0 3300 57.0%
6 Equinox 2.02 : 2844.8 8.3 1859.5 3300 56.3%
7 Deep Rybka 4.1 : 2826.6 8.2 1778.5 3300 53.9%
8 Deep Fritz 14 : 2756.7 8.2 1464.5 3300 44.4%
9 Chiron 2 : 2750.4 8.4 1436.5 3300 43.5%
10 Protector 1.6.0 : 2731.1 8.3 1351.0 3300 40.9%
11 Hannibal 1.4b : 2729.3 8.1 1343.0 3300 40.7%
12 Texel 1.04 : 2697.4 8.7 1204.5 3300 36.5%
13 Naum 4.2 : 2696.5 8.6 1200.5 3300 36.4%
14 Senpai 1.0 : 2695.9 8.6 1198.0 3300 36.3%
15 HIARCS 14 WCSC 32b : 2671.6 8.9 1096.0 3300 33.2%
16 Jonny 6.00 : 2655.4 8.6 1030.0 3300 31.2%
or
ordo -p TOPRES.pgn -a3100 -A "Stockfish 5" -W -s1000
where Stockfish 5 is fixed at 3100, so it shows no error of its own.
Then each engine's error is the error relative to Stockfish. Of course, the errors are bigger (they now implicitly include Stockfish's error too), whereas in the previous example each engine's error was relative to the average of the pool.
This is better if you want to compare one engine to the rest.
Code:
# PLAYER : RATING ERROR POINTS PLAYED (%)
1 Stockfish 5 : 3100.0 ---- 2473.0 3300 74.9%
2 Houdini 4 : 3095.9 13.5 2458.5 3300 74.5%
3 Komodo 7a : 3074.3 13.0 2379.0 3300 72.1%
4 Gull 3 : 3039.8 12.9 2245.5 3300 68.0%
5 Critter 1.4a : 2953.8 13.0 1882.0 3300 57.0%
6 Equinox 2.02 : 2948.7 12.8 1859.5 3300 56.3%
7 Deep Rybka 4.1 : 2930.5 12.6 1778.5 3300 53.9%
8 Deep Fritz 14 : 2860.6 12.5 1464.5 3300 44.4%
9 Chiron 2 : 2854.3 13.4 1436.5 3300 43.5%
10 Protector 1.6.0 : 2835.1 13.1 1351.0 3300 40.9%
11 Hannibal 1.4b : 2833.2 12.6 1343.0 3300 40.7%
12 Texel 1.04 : 2801.3 13.1 1204.5 3300 36.5%
13 Naum 4.2 : 2800.4 13.5 1200.5 3300 36.4%
14 Senpai 1.0 : 2799.8 13.4 1198.0 3300 36.3%
15 HIARCS 14 WCSC 32b : 2775.5 13.8 1096.0 3300 33.2%
16 Jonny 6.00 : 2759.4 13.4 1030.0 3300 31.2%
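The remark that anchored errors implicitly include Stockfish's error can be sanity-checked by combining the two pool-relative errors from the first table in quadrature (this check assumes the errors are roughly independent, which is only approximately true):

```python
import math

# Pool-relative errors from the first table: Houdini 4 and Stockfish 5.
houdini_err = 9.2
stockfish_err = 9.3

# Rough estimate of Houdini's error relative to Stockfish, assuming
# the two pool-relative errors are (approximately) independent.
combined = math.sqrt(houdini_err**2 + stockfish_err**2)  # ≈ 13.1
```

The result (≈13.1) lands close to the 13.5 the anchored run reports for Houdini 4; the remaining gap is expected, since pool-relative errors are correlated through the shared average, so simple quadrature is only a first-order check.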
You can also save a matrix of errors (each engine against every other engine) with the switch -e.
Help output of ordo:
Code:
quick example: ordo -a 2500 -p input.pgn -o output.txt
- Processes input.pgn (PGN file) to calculate ratings to output.txt.
- The general pool will have an average of 2500
usage: ordo [-OPTION]
-h print this help
-H print just the switches
-v print version number and exit
-L display the license information
-q quiet mode (no screen progress updates)
-a <avg> set rating for the pool average
-A <player> anchor: rating given by '-a' is fixed for <player>, if provided
-m <file> multiple anchors: file contains rows of "AnchorName",AnchorRating
-w <value> white advantage value (default=0.0)
-W white advantage, automatically adjusted
-z <value> scaling: set rating for winning expectancy of 76% (default=202)
-T display winning expectancy table
-p <file> input file in PGN format
-c <file> output file (comma separated value format)
-o <file> output file (text format), goes to the screen if not present
-g <file> output file with group connection info (no rating output on screen)
-s # perform # simulations to calculate errors
-e <file> saves an error matrix, if -s was used
-F <value> confidence (%) to estimate error margins. Default is 95.0
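The -z switch defines the rating scale through the winning-expectancy curve: a difference of z points maps to a 76% expected score. A minimal reconstruction of such a curve (the exact functional form Ordo uses is an assumption here; a base-10 logistic curve is shown):

```python
import math

def expectancy(diff, z=202.0):
    # Expected score for a rating difference `diff`, on a base-10 logistic
    # curve scaled so that a difference of exactly z yields 76%.
    # math.log10(0.76 / 0.24) is the base-10 logit of 76%.
    return 1.0 / (1.0 + 10.0 ** (-diff * math.log10(0.76 / 0.24) / z))
```

With the default z=202 this is numerically very close to the classic Elo curve 1/(1 + 10^(-diff/400)), since log10(0.76/0.24)/202 ≈ 1/403.5.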