Ordoprep 0.8

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordoprep 0.8

Post by michiguel »

Version 0.9.9
https://github.com/michiguel/Ordoprep/r ... tag/v0.9.9

The traditional way to use ordoprep is

ordoprep -p games.pgn -o out.pgn

But if the following switch is added

ordoprep -p games.pgn -o out.pgn --major-only

out.pgn will contain only the games from the major group of engines that are well connected. This will guarantee a robust ranking when ordo is used afterwards.
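As a rough illustration of what keeping only the "major group" could look like, here is a minimal Python sketch. It assumes, for simplicity, that two players are connected whenever they have played at least one game against each other; Ordoprep's actual criterion for "well connected" is not described in this post and may be stricter. The function names group_players and major_only are hypothetical and are not part of Ordoprep.

```python
def find(parent, x):
    # Follow parent links to the group representative, shortening the path as we go.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_players(games):
    """Group players with union-find; games is an iterable of (white, black) pairs.

    Assumption: any game between two players links them into one group.
    Returns a list of sets of player names, largest group first.
    """
    parent = {}
    for white, black in games:
        parent.setdefault(white, white)
        parent.setdefault(black, black)
        root_w, root_b = find(parent, white), find(parent, black)
        if root_w != root_b:
            parent[root_w] = root_b
    groups = {}
    for player in parent:
        groups.setdefault(find(parent, player), set()).add(player)
    return sorted(groups.values(), key=len, reverse=True)


def major_only(games):
    """Keep only the games played inside the largest group."""
    games = list(games)
    groups = group_players(games)
    if not groups:
        return []
    major = groups[0]
    return [g for g in games if g[0] in major and g[1] in major]


if __name__ == "__main__":
    sample = [("A", "B"), ("B", "C"), ("D", "E"), ("C", "A")]
    print(group_players(sample))
    print(major_only(sample))
```

In this toy input, D and E only ever played each other, so their game is dropped and only the games among A, B and C remain.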

This version can also split the pgn file into separate files, each of which contains one well-connected group.

ordoprep -p games.pgn --group-games basename

will output

basename.000000001.pgn
basename.000000002.pgn
basename.000000003.pgn

if there are three groups.
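To get a quick overview of how the games were distributed across those files, something like the following Python sketch could be used. It only assumes the basename.NNNNNNNNN.pgn naming shown above and that every game carries a [Result "..."] tag, as standard PGN requires; the helper names count_games and summarize_groups are hypothetical.

```python
import glob
import os


def count_games(path):
    # Every PGN game has exactly one [Result "..."] tag line, so count those.
    with open(path, encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if line.startswith("[Result "))


def summarize_groups(basename):
    """Return (file name, game count) for each basename.*.pgn file, in sorted order."""
    paths = sorted(glob.glob(glob.escape(basename) + ".*.pgn"))
    return [(os.path.basename(p), count_games(p)) for p in paths]


if __name__ == "__main__":
    for name, n_games in summarize_groups("basename"):
        print(f"{name}: {n_games} games")
```

A file with only a handful of games usually points to a small, isolated group of players.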

Other switches like --group-players BASENAME will divide the player names into groups in different files.

Miguel
drj4759
Posts: 89
Joined: Mon Nov 17, 2014 10:05 am

Re: Ordoprep 0.8

Post by drj4759 »

I experienced instability with ordoprep. The -g and --major-only options either caused a segmentation fault or seemed to freeze, even after 30 minutes of processing. The other regular options work.

The preprocessed pgn file size is 430MB with 6.5 million records and 1,638 players.
I tried with only 540 players and it finished successfully in a couple of minutes. Maybe it can't cope with so many players, or the newer options are not yet properly tested.

The computer used is an Athlon 4 core with 8MB RAM running under Linux 4.4.6 kernel.
drj4759
Posts: 89
Joined: Mon Nov 17, 2014 10:05 am

Re: Ordoprep 0.8

Post by drj4759 »

Sorry, 8GB RAM instead of 8MB.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordoprep 0.8

Post by michiguel »

If you do
ordoprep -p input.pgn -o output.pgn
with the problematic file and you zip output.pgn, how big is it? Is there a chance I can get that file by email (mballicora account on gmail) so I can reproduce the problem? Generally, files compress really well after being ordoprepped.
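For anyone without a zip tool at hand, the size check can be sketched in Python with the standard zipfile module. The file name output.pgn comes from the command above; the function name zipped_size is hypothetical.

```python
import os
import zipfile


def zipped_size(src, dest=None):
    """Zip src with DEFLATE and return (original size, zipped size) in bytes."""
    dest = dest or src + ".zip"
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name inside the archive, not the full path.
        zf.write(src, arcname=os.path.basename(src))
    return os.path.getsize(src), os.path.getsize(dest)


if __name__ == "__main__" and os.path.exists("output.pgn"):
    raw, packed = zipped_size("output.pgn")
    print(f"output.pgn: {raw} bytes, zipped: {packed} bytes")
```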

Thanks for testing it,
Miguel
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordoprep 0.8

Post by michiguel »

Massive optimization of the grouping algorithm (~120x faster)
https://github.com/michiguel/Ordoprep/r ... g/v0.9.9.4

Miguel
drj4759
Posts: 89
Joined: Mon Nov 17, 2014 10:05 am

Re: Ordoprep 0.8

Post by drj4759 »

The -g option worked without segmentation fault and is very fast indeed.
The -i option does not work: it produced an empty file, whereas the -o option worked using the same include/exclude file.

The next thing to be optimized is the Ordo convergence calculation, which takes forever to complete on two separate preprocessed PGN databases combined: 1GB in file size, 13 million records and 260,000+ players in total. The last test, which extracted 25% of the data from this combined database, took 15 hours to produce the rating list.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ordoprep 0.8

Post by michiguel »

Last modifications:
v0.9.9.7
https://github.com/michiguel/Ordoprep/releases

Changes

The functions for discarding games were optimized significantly.
The --timelog switch was added. It outputs a time stamp (in seconds) for each step.

The -i switch is working fine for me. I wonder whether the player names in your include file are written exactly as they appear in the database. Do you surround the names with quotes in the include file?
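One way to check for such a mismatch is to compare the names in the include file against the White and Black tags of the PGN. The Python sketch below does that under two assumptions that are not confirmed anywhere in this thread: the include file holds one name per line, and any surrounding double quotes can be stripped for the comparison. The helper names pgn_players and unmatched_names are hypothetical.

```python
import re

# Matches PGN tag lines such as: [White "Stockfish 7"]
TAG = re.compile(r'^\[(White|Black)\s+"(.*)"\]\s*$')


def pgn_players(path):
    """Collect every distinct White/Black name found in a PGN file."""
    names = set()
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            match = TAG.match(line)
            if match:
                names.add(match.group(2))
    return names


def unmatched_names(include_path, pgn_path):
    """List include-file names that do not exactly match any player in the PGN.

    Assumption: one name per line, optionally wrapped in double quotes.
    """
    players = pgn_players(pgn_path)
    missing = []
    with open(include_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            name = line.strip()
            if len(name) >= 2 and name[0] == name[-1] == '"':
                name = name[1:-1]
            if name and name not in players:
                missing.append(name)
    return missing
```

Any name it reports differs from the database in case, spacing or spelling, which would be worth ruling out before looking for a bug in -i.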

See my other post about Ordo. It now takes me 25 min to process your Megabase with Ordo after it has been ordoprepped with --major-only. The convergence is unchanged, but the pre- and post-calculations were horrendously slow for a gigantic number of players. Those bottlenecks have been fixed.

Still, I have an idea for further optimization of the convergence, but it may take some experimentation.

Miguel